Martin Häusler

Posted on Apr 19, 2018

Java Enterprise 101

#java #javaenterprise #jee #spring

There are a lot of ways to create software. In fact, there are even a lot of ways to create good software. When it comes to application server development, one of them has stood the test of time, and for good reason: Java Enterprise Edition.

Java EE is much more than a software library. It is also an architecture and a philosophy. JEE is not for the faint of heart; if you are building a throw-away prototype, you are headed the wrong way. But if you came to read about an architecture that can carry major enterprises and support massive scale applications, then you're right on track. JEE is the heavy artillery of software development. And it's freaking awesome.

In this article, we will explore the architectural side of JEE. An article about the implementation will follow soon.

A little side note. When I talk about Java Enterprise in this article, I'm not specifically talking about the library, but the architecture and the philosophy. The official Java EE libraries are just one way of implementing this stack.

The JEE Stack

So you are ready for some serious software development? Cast aside any untyped languages, fancy scripting, hype-driven development and hipster-tech, let's get serious and write software that runs for the next 20+ years. Here is the Java EE architecture in a nutshell:

Okay, granted, you need a big nutshell. Let's start with some initial observations:

Every communication with the outside world is strictly request-response based.
The incoming request passes through several layers before it reaches your application code. Each layer can refuse, redirect or alter the request. The motivation behind these layers is separation of concerns.
Many of the layers are already implemented and you just need to make use of them.
JEE is all about allowing developers to focus on business functionality only. 99% of everything else has been taken care of for you.

Application Containers

When a request reaches our server machine over the network, it is first passed to the operating system. The OS will determine to which application to forward the request, based on the port it has been sent to. In this case, the application is the Java Virtual Machine. The JVM internally runs an application container (such as Apache Tomcat or Glassfish). An application container implements the Java Servlet API. An application container has several responsibilities:

It manages one or more applications which are delivered as servlets. In practice, most containers only hold a single servlet, but in theory one Tomcat can hold arbitrarily many servlets.
It provides an implementation of the servlet API. This allows the contained applications to talk to the container. The most prominent use of this feature is to establish a filter chain (more on that later).
It provides integration with the Services API of the operating system. That way, an application running inside the container can be started, terminated and rebooted as an OS-level service. For that reason, even though Java is multi-platform, many application containers contain platform-specific code (so the contained applications remain platform-independent).
It redirects incoming requests to the correct application based on the path mapping. It is not uncommon to see one registered servlet for static content (bound to /static) and one for the dynamic API of your application server (bound to /api).
It manages the thread pool for requests and binds each request to a thread. As the thread holds the request context, it is discouraged to manually start new threads in a JEE environment (unless you know exactly what you are doing).
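To make the path mapping a bit more tangible, here is a minimal sketch in plain Java. It is not the Servlet API: `PathDispatcher`, `register` and `dispatch` are names I made up for illustration, and real containers apply richer matching rules than the simple "longest prefix wins" assumed here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, simplified stand-in for a container's path mapping.
// This is NOT the Servlet API; real containers apply richer matching rules.
class PathDispatcher {
    // Maps a path prefix such as "/api" to the application registered for it.
    private final Map<String, Function<String, String>> applications = new LinkedHashMap<>();

    void register(String prefix, Function<String, String> application) {
        applications.put(prefix, application);
    }

    // Forwards the request to the application with the longest matching prefix.
    String dispatch(String path) {
        String best = null;
        for (String prefix : applications.keySet()) {
            boolean matches = path.equals(prefix) || path.startsWith(prefix + "/");
            if (matches && (best == null || prefix.length() > best.length())) {
                best = prefix;
            }
        }
        if (best == null) {
            return "404 Not Found";
        }
        // The application only sees the part of the path below its own prefix.
        return applications.get(best).apply(path.substring(best.length()));
    }
}

class PathDispatcherDemo {
    public static void main(String[] args) {
        PathDispatcher container = new PathDispatcher();
        container.register("/static", rest -> "static file " + rest);
        container.register("/api", rest -> "API call " + rest);
        System.out.println(container.dispatch("/static/logo.png"));
        System.out.println(container.dispatch("/api/users"));
        System.out.println(container.dispatch("/unknown"));
    }
}
```

The point of the sketch is only that the applications themselves never deal with routing between each other; the container decides who gets the request.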

Traditionally, Java EE applications are deployed via archive files known as WAR files (for Web ARchive) or EAR files (for Enterprise ARchive). The internal file structure of these archives is standardized. The application containers extract the contained files on startup and launch the contained servlet(s). While doing so, the container binds your servlet to the specified port (either specified in code or a configuration file).

The JEE Framework

Usually when working in the JEE architecture, you will not implement everything from scratch. Many tasks are exactly the same from one JEE application to the next, so it makes a lot of sense to use a suitable framework. The two predominant options are the actual JEE reference implementation and the Spring framework. I can't say much about "vanilla" JEE, as I have exclusively used the Spring framework so far. We will discuss it in more detail in the next article.

The Filter Chain

Every incoming request, before it is handed to your application, must pass a series of so-called Servlet Filters, which form a filter chain. Once a request passes the first filter, the second filter kicks in, and so on. Each filter has the option to block a request. Application containers allow you to customize the filter chain via the Servlet API. The JEE framework implementations use the filter chain for many tasks, including session management and security. Filters can also have side-effects; if there is a task you want to perform per request, you will often see the implementation in the form of a servlet filter. Also, if you need to bind some information to the request itself, servlet filters are a common place to do so.
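The mechanics of such a chain can be sketched in a few lines of plain Java. Please note that this deliberately does not use the real Servlet API: `Request`, `Response`, `Filter`, `Chain` and `Handler` are simplified stand-ins, and the `X-User` header is an assumption made purely for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified, hypothetical stand-ins for the Servlet API types (not javax.servlet).
class Request {
    final Map<String, String> headers = new HashMap<>();
    final Map<String, Object> attributes = new HashMap<>();
}

class Response {
    int status = 200;
    String body = "";
}

interface Handler {
    void handle(Request request, Response response);
}

interface Filter {
    // A filter passes the request on via chain.proceed(...), or blocks it by returning early.
    void doFilter(Request request, Response response, Chain chain);
}

class Chain {
    private final List<Filter> filters;
    private final Handler application;
    private final int position;

    Chain(List<Filter> filters, Handler application) {
        this(filters, application, 0);
    }

    private Chain(List<Filter> filters, Handler application, int position) {
        this.filters = filters;
        this.application = application;
        this.position = position;
    }

    void proceed(Request request, Response response) {
        if (position < filters.size()) {
            Chain rest = new Chain(filters, application, position + 1);
            filters.get(position).doFilter(request, response, rest);
        } else {
            application.handle(request, response); // all filters passed
        }
    }
}

// Side effect per request: counts every request that reaches this filter.
class CountingFilter implements Filter {
    final AtomicInteger seen = new AtomicInteger();

    public void doFilter(Request request, Response response, Chain chain) {
        seen.incrementAndGet();
        chain.proceed(request, response);
    }
}

// Blocks anonymous requests and binds the user name to the request otherwise.
class AuthFilter implements Filter {
    public void doFilter(Request request, Response response, Chain chain) {
        String user = request.headers.get("X-User"); // header name is an assumption
        if (user == null) {
            response.status = 401; // blocked: the rest of the chain never runs
            return;
        }
        request.attributes.put("user", user);
        chain.proceed(request, response);
    }
}

class FilterChainDemo {
    public static void main(String[] args) {
        Handler application = (req, res) -> res.body = "hello " + req.attributes.get("user");
        Chain chain = new Chain(Arrays.asList(new CountingFilter(), new AuthFilter()), application);

        Request request = new Request();
        request.headers.put("X-User", "alice");
        Response response = new Response();
        chain.proceed(request, response);
        System.out.println(response.status + " " + response.body);
    }
}
```

In the real Servlet API the idea is the same: a filter either calls the chain to continue or writes a response itself, and request attributes are the usual place to attach per-request information.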

The Presentation Layer

The presentation layer is where your actual application code meets an incoming request for the first time. This request has passed the servlet filter chain, so the user session is set up and ready to go, and all authentication has already been taken care of. In the early days of JEE, the presentation layer was the place where server-side generation of HTML pages occurred. Nowadays, the presentation layer consists of a collection of REST controllers that offer various endpoints that make up your REST API. If you are faced with older applications, you will also encounter XML webservices in the presentation layer. A common thing to do in the presentation layer is server-side validation of user input and general request validation. In the same way as you should never write SQL queries in your GUI code, the presentation layer must not attempt to access the database directly. A class in the presentation layer is only allowed to talk to another presentation layer class, a service layer class, or an element of the data model which was returned by the service layer.
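As a rough illustration of "validate, then delegate", here is a framework-free sketch. `UserController`, `UserService` and `ApiResponse` are hypothetical names, and the email check is intentionally naive; in a real project a framework would provide the HTTP mapping and the validation.

```java
// All names below are hypothetical; a real application would rely on
// framework annotations for routing and validation instead.
class ApiResponse {
    final int status;
    final String body;

    ApiResponse(int status, String body) {
        this.status = status;
        this.body = body;
    }
}

// The only thing the controller knows about the layers below it.
interface UserService {
    String register(String email); // returns the ID of the new user
}

class UserController {
    private final UserService userService;

    UserController(UserService userService) {
        this.userService = userService;
    }

    // Would handle something like POST /api/users in a real REST API.
    ApiResponse register(String email) {
        // Request validation happens here, before any business logic runs.
        if (email == null || email.trim().isEmpty() || !email.contains("@")) {
            return new ApiResponse(400, "invalid email");
        }
        // No database access here: the controller only talks to the service layer.
        String id = userService.register(email.trim());
        return new ApiResponse(201, id);
    }
}
```

Because the controller only depends on the `UserService` interface, it can be exercised with a stub such as `new UserController(email -> "id-1")`, without any server running.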

The Service Layer

The service layer is where your actual application code resides. This is the place to put your business rules into code. The service layer is where you move data around in your data model, create new elements, delete old ones, etc. Depending on your use case, the service layer may be as small as "forward this call to the repository layer", or an extremely involved process. Classes from the service layer may only talk to other services, or to classes from the repository layer.
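A small sketch of what such a service might look like, under the assumption of a made-up business rule ("an email address may only be registered once"). All names are hypothetical, and `InMemoryUserRepository` only exists so that the example runs without a database.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.UUID;

// Hypothetical repository contract; the service does not know what is behind it.
interface UserRepository {
    boolean existsByEmail(String email);

    void save(String id, String email);
}

// Stand-in for a real data store so that the sketch is runnable.
class InMemoryUserRepository implements UserRepository {
    private final Map<String, String> idsByEmail = new HashMap<>();

    public boolean existsByEmail(String email) {
        return idsByEmail.containsKey(email);
    }

    public void save(String id, String email) {
        idsByEmail.put(email, id);
    }
}

class UserService {
    // The only field is a reference to another layer; the service holds no data of its own.
    private final UserRepository repository;

    UserService(UserRepository repository) {
        this.repository = repository;
    }

    // Business rule (an assumption for this example): emails are unique, ignoring case.
    String register(String email) {
        String normalized = email.toLowerCase(Locale.ROOT);
        if (repository.existsByEmail(normalized)) {
            throw new IllegalStateException("email already registered");
        }
        String id = UUID.randomUUID().toString();
        repository.save(id, normalized);
        return id;
    }
}
```

Note that the check-then-save sequence is only safe here because the sketch is single-threaded; in a real application the surrounding database transaction (or a unique constraint) would have to guarantee this.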

The Repository Layer

This is the last layer in your code that modifies your data before it hits the database. The predominant elements in this layer are Repositories (also known as Data Access Objects, or DAOs). These classes simply offer a number of methods that allow you to persist, load, delete and query your data in the database. What is important here is that you must never let any specifics of your data store escape the repository layer - its very purpose is to make sure that you can exchange the data store with a different one (potentially even an SQL database with a NoSQL store!). Internally, your repository methods will contain the actual query statements. If you are working with a standard JEE stack, then you will have a Java Persistence API (JPA) Provider such as Hibernate in place. JPA allows you to convert your domain model to SQL tables and back with relative ease. It still has a lot of pitfalls and would be deserving of its own article. As you probably already guessed, the repository layer classes do not call any other classes outside of their own layer, except for JPA classes.
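To illustrate the "nothing store-specific escapes" rule, here is a minimal sketch of a repository contract with an in-memory implementation. The names are hypothetical, and the map merely plays the role of the database; a JPA-backed implementation would implement the same interface without the callers noticing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Minimal domain class for the example (hypothetical).
class User {
    private final String id;
    private final String name;

    User(String id, String name) {
        this.id = id;
        this.name = name;
    }

    String getId() { return id; }

    String getName() { return name; }
}

// The contract speaks only in terms of the domain model:
// no SQL strings, connections or result sets appear in any signature.
interface UserRepository {
    void save(User user);

    Optional<User> findById(String id);

    List<User> findByName(String name);

    void delete(String id);
}

// Stand-in implementation; the map plays the role of the data store.
class InMemoryUserRepository implements UserRepository {
    private final Map<String, User> rows = new LinkedHashMap<>();

    public void save(User user) {
        rows.put(user.getId(), user);
    }

    public Optional<User> findById(String id) {
        return Optional.ofNullable(rows.get(id));
    }

    public List<User> findByName(String name) {
        List<User> result = new ArrayList<>();
        for (User user : rows.values()) {
            if (user.getName().equals(name)) {
                result.add(user);
            }
        }
        return result;
    }

    public void delete(String id) {
        rows.remove(id);
    }
}
```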

The Data Model

The data model represents the data in your domain. It is the only architectural element that will be used by all three layers of your application. It is therefore crucial that the domain model classes have NO references to any other classes, except for classes that reside within the domain model themselves. In contrast to the presentation-, service- and repository-layer classes, the domain model is stateful. Typically, you will not want to have a lot of logic in the domain model; it mostly exists to hold your data and provide a clean API, while the actual complex modifications are done in the service layer. The domain model, while not explicitly required in JEE, almost always follows the Java Bean pattern. Proper getters and setters are not negotiable here if you want to make use of standard frameworks for easily handling your domain model, such as Bean Validation and JPA (more on that later). A domain model element is your typical POJO - private fields, a constructor, and getters and setters. Usually, frameworks like JPA, Jackson and JAXB will in addition force you to give each class a default constructor, because these classes need to be instantiable via Java reflection. In contrast to almost all other classes in the JEE architecture, having a clean implementation of equals() and hashCode() is crucial for domain model POJOs. Usually, each domain model element has a unique ID for this purpose, which also coincides with its ID in the database tables.
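A domain model element along these lines could look like the following sketch. `Customer` and its fields are made up for the example; the sketch also assumes that the ID is assigned before the object is put into a hash-based collection, since the hash code depends on it.

```java
import java.util.Objects;

// Hypothetical domain model element following the Java Bean pattern.
class Customer {
    private String id;
    private String name;

    // Default constructor, as reflection-based frameworks typically require one.
    public Customer() {
    }

    public Customer(String id, String name) {
        this.id = id;
        this.name = name;
    }

    public String getId() { return id; }

    public void setId(String id) { this.id = id; }

    public String getName() { return name; }

    public void setName(String name) { this.name = name; }

    // Identity is defined by the unique ID only, not by the other fields.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Customer)) {
            return false;
        }
        Customer that = (Customer) other;
        // Two objects without an ID are never considered equal.
        return id != null && id.equals(that.id);
    }

    // Consistent with equals(): equal IDs always produce equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hashCode(id);
    }
}
```

With this definition, two `Customer` instances loaded in two different requests are equal as long as they carry the same ID, even if one of them has been renamed in the meantime.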

Threads in JEE

A request is always bound to a thread in JEE which is instantiated and managed by the Application Container (typically in a Thread Pool). This means that a JEE server application is always inherently concurrent; you cannot avoid that. As we all know, properly dealing with concurrency is hard. Thankfully, the JEE architecture has you covered when it comes to concurrency. If you look at the picture above, you see four users working with the application in parallel, each being represented by a request/response bound to a thread. There is one particular detail worth noting: the threads never intersect. The application performs no synchronization, and instead leaves it to a component which is really good at doing so: the database.

How is this possible? How can we have all these layers above the database without having to consider multi-threading? Recall when concurrency becomes an issue: when several threads access the same data. You want to avoid this case at all costs in a JEE application (there are exceptions, such as application-level caches). In order to do so, all classes that belong to the repository layer and the service layer are stateless in JEE. They have no fields, neither private nor public ones, which hold mutable state. So what about the data? The data is loaded per user and per request. When a request arrives at the service layer (the presentation layer is a bit of an exception here) then a new database transaction is opened for exclusive use by this user. The services then gather the requested data and/or perform the requested modifications, all inside this single transaction. Before the result is passed to the presentation layer, the transaction is committed and closed.
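Here is a toy model of this idea in plain Java. Everything in it is an assumption for illustration: `Database` is a stand-in whose `inTransaction` method simply serializes access with a lock (real databases isolate transactions in far more refined ways and can roll back), and the thread pool plays the role of the container's request threads, which you would not create yourself in a real JEE application.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Stand-in for the database: the only component that synchronizes anything.
class Database {
    private final Map<String, Integer> visits = new HashMap<>();

    // Crude model of an isolated transaction: one unit of work at a time.
    synchronized <T> T inTransaction(Function<Map<String, Integer>, T> work) {
        return work.apply(visits);
    }
}

// Stateless service: no mutable fields, so it can be shared by all request threads.
class VisitService {
    private final Database database;

    VisitService(Database database) {
        this.database = database;
    }

    int recordVisit(String page) {
        // All reads and writes of this request happen inside one transaction.
        Integer total = database.inTransaction(rows -> rows.merge(page, 1, Integer::sum));
        return total;
    }
}

class ThreadPerRequestDemo {
    // Simulates the container: each request is bound to a thread from the pool.
    static int simulate(int requests, int threads) {
        Database database = new Database();
        VisitService service = new VisitService(database);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> responses = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                Callable<Integer> request = () -> service.recordVisit("/home");
                responses.add(pool.submit(request));
            }
            for (Future<Integer> response : responses) {
                response.get(); // wait until every request has been answered
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        Integer total = database.inTransaction(rows -> rows.getOrDefault("/home", 0));
        return total;
    }

    public static void main(String[] args) {
        System.out.println(simulate(1000, 4));
    }
}
```

The service contains no `synchronized` block and no shared mutable field, yet no visit is lost, because every modification goes through the one component that handles concurrency.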

This architecture has two big advantages:

The server is stateless, which is a nice property to have, e.g. for testing. It helps to keep business logic very simple and works well with a more functional programming style.
The only place where concurrent modifications ever meet is at the database, and databases are specifically engineered to handle that.

The cost is of course that each thread builds its own (partial) view of the data model. So if two users request the same piece of data, it will be held in memory twice.

Closing Words

There would be so much more to say about JEE. I often feel that it gets a lot of undeserved criticism simply because it is misunderstood. It plays really well with modern programming styles and languages and it helps to build very stable applications. In a way, JEE is not so much about what it provides to you as a programmer, but rather what it protects you from (concurrency issues, data integrity issues, ...). The JEE architecture is a prime example of defensive programming in this regard - it is all about safety first. This architecture has a proven track record of being well-suited for large projects and teams.

In the next article, we will take a closer look at the actual implementation of this architecture using a concrete example - it will take a lot less code on our part to make all of this happen than you think.

Top comments (16)

Flo Roform • Apr 22 '18

Hi there, nice article!

However, one thing I don’t agree with is

Typically, you will not want to have a lot of logic in the domain model; it mostly exists to hold your data

as it pretends that JEE implies an anemic domain model. Of course, this approach makes sense for simple CRUD applications but it is also possible to follow an object-oriented approach, e.g. domain-driven design.

Martin Häusler • Apr 22 '18

Glad you like it, Flo!

So maybe I wasn't very clear in that regard in the article. You can of course add whichever methods you want or need in your domain model; JPA will not care about them (unless they have the signature of a getter/setter but that's another story). The main point that I was trying to make here is: never, ever try to reference one of your services (no matter from which layer) inside the domain model. Things will go south very very fast from there, I lived to regret it myself. One of the "golden rules" of the JEE architecture is: once you are outside the Dependency Injection Container context, do not try to get back into it.

I totally agree that a data model that does literally nothing else than containing the data isn't necessarily something you want in an object-oriented system. I also agree that this is troublesome from an OOP perspective because you eliminate a large portion of the benefit that OOP provides (in particular polymorphism). The folks who do react-redux in the oh-so-object-oriented javascript world do just that all day long: they do not even bother to assign classes (prototypes) to their data structures anymore, it's just data. JEE doesn't force you to do anything like that; as long as whatever your method is going to do will not access services. In particular, you would have:

userRepository.save(user)

... and not...

user.save()

The latter is the "solution" that OOP actually promises: functions and data go together. And I could not agree more that this is very appealing and very nice. However, in practice, you hit a wall here. save() - well, save to where exactly? The database? An XML document? There is an excellent talk by Robert Martin on a related issue. Sometimes you need to separate the algorithm from the data, because there may be more than one algorithm operating on it (save to XML, save to database, save to I/O stream...) and you cannot possibly foresee any future algorithm you will require. In the end, it all boils down to one thing: managing the dependencies between your objects. In particular in a JEE-style environment, objects move from the persistent state (stored in the DB) to the transient state back and forth all the time, and there are multiple representations of each object, due to the way requests are handled and isolated from one another. Thus, the separation of services (for "heavyweight" operations) and methods on the domain model itself (for local changes) makes sense. But I do understand your point. Frankly, I've been pondering on this very issue myself a lot.
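To illustrate the "more than one algorithm on the same data" argument, here is a small sketch. `UserStore`, `XmlUserStore` and `MapUserStore` are names invented for this example, and the XML output is naive (no escaping); the point is only that `User` stays unaware of where it is saved.

```java
import java.util.HashMap;
import java.util.Map;

// The data: knows nothing about how or where it is stored.
class User {
    private final String id;
    private final String name;

    User(String id, String name) {
        this.id = id;
        this.name = name;
    }

    String getId() { return id; }

    String getName() { return name; }
}

// The algorithm lives behind an interface, separate from the data.
interface UserStore {
    void save(User user);
}

// One possible target: a (naively built, unescaped) XML document.
class XmlUserStore implements UserStore {
    private final StringBuilder document = new StringBuilder();

    public void save(User user) {
        document.append("<user id=\"").append(user.getId()).append("\">")
                .append(user.getName()).append("</user>\n");
    }

    String contents() {
        return document.toString();
    }
}

// Another target: a map standing in for a database table.
class MapUserStore implements UserStore {
    private final Map<String, String> namesById = new HashMap<>();

    public void save(User user) {
        namesById.put(user.getId(), user.getName());
    }

    String nameOf(String id) {
        return namesById.get(id);
    }
}
```

A third target (say, an I/O stream) can be added later as another `UserStore` without touching `User` at all, which would not be possible with a single `user.save()` method.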

Flo Roform • Apr 23 '18

Thanks a lot for taking the time to answer!

I am still not sure I agree :D Again, my point was that JEE simply makes no assumptions about how to model the domain. While I agree with many of the points in your post these have nothing to do with JEE in particular but with software architecture in general. So I think it is a bit misleading as people might connect these ideas to enterprise Java.

JEE doesn't force you to do anything like that; as long as whatever your method is going to do will not access services.

Uhm, JEE also does not force anything even if a rich model does reference a service. Again, I think you are mixing JEE and general advice about software architecture. As for the advice part: not so sure this statement holds in any case. For example, think about domain-driven design. There each layer explicitly makes space for own services and, of course, this can be modelled in JEE.

P.S. Uncle Bob’s talk indeed is great. You should also check out his blog post about Clean Architecture if you have not already read it anyway :-)

Martin Häusler • Apr 23 '18

Thanks for the response. I think this discussion is important and quite interesting.

As you stated, domain model classes accessing or not accessing services isn't strictly about the JEE architecture anymore. Indeed, the JEE architecture is agnostic to this decision. However, if we look a bit deeper into the technical details, when you are dealing with a JEE-like architecture, you will likely also want some kind of dependency injection (DI) framework. Otherwise, you will end up with a lot of singletons and hard-wired dependencies which in turn make your code very difficult to handle in unit and integration tests (been there, done that, and lived to regret it).
So what you usually find is either some flavor of Google Guice, Spring's IoC Container or some other implementation of JSR 330. All of these containers help you to implement de-facto singletons without resorting to static variables in your code, such that you can easily interchange them (or even mock them) for testing. However, this technique only works for singletons (and a selected handful of other 'scopes'), but it will not work for domain model elements - which is: objects you pass around and you instantiate as required with new (or a factory) on the fly.
In such a scenario, any service you define becomes a "singleton-scoped bean" in the container. Every bean in the container can request the container to inject other container beans into itself, usually via the constructor, via annotated setters or directly in fields. Now, if you have a method in a domain model element, there are only two ways of how this method could ever gain access to a bean from the container: either by passing the bean directly as a parameter (which is ugly; who passes around singletons?) or by having some static reference to the application container itself. And no matter which JSR 330 implementation you choose, every single one of them will tell you on the first page of the instructions manual: making the application context publicly available in a static variable is an anti-pattern.
So how do we implement a method that requires access to a service method then? Well, that method becomes a service method of its own. Does it contradict the principles of OOP? Yes, it certainly does. But it helps a lot in keeping your call hierarchies clean. Some junior programmer next door might otherwise come up with the glorious idea of sending an HTTP request (and blocking the thread while waiting for the response, because why not!) in a regular bean setter. Limiting the capabilities of the domain model a bit helps programmers to estimate the impact of their method calls. What I would not want to have is that a regular setter method internally calls X, X calls Y, Y calls Z, and Z withdraws money from my bank account ;-) Trust me, I've had it all. I've seen people implementing hibernate entities as observables.
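A tiny sketch of what I mean, with the container replaced by plain constructor calls so that it runs without any framework. `Invoice`, `InvoiceService` and `MailSender` are hypothetical names; in a real project the container would create and wire the service.

```java
// A dependency that would normally be a container-managed bean (hypothetical).
interface MailSender {
    void send(String to, String text);
}

// Domain model element: created with "new" wherever needed, never injected.
class Invoice {
    private final String customerEmail;
    private final long amountCents;
    private boolean paid;

    Invoice(String customerEmail, long amountCents) {
        this.customerEmail = customerEmail;
        this.amountCents = amountCents;
    }

    String getCustomerEmail() { return customerEmail; }

    long getAmountCents() { return amountCents; }

    boolean isPaid() { return paid; }

    // A local change on the object itself is fine as a domain method.
    void markPaid() {
        paid = true;
    }
}

// De-facto singleton: receives its collaborators through the constructor.
class InvoiceService {
    private final MailSender mailSender;

    InvoiceService(MailSender mailSender) {
        this.mailSender = mailSender;
    }

    // Needs a service (mail), so it is a service method and not invoice.remind().
    void remind(Invoice invoice) {
        if (!invoice.isPaid()) {
            mailSender.send(invoice.getCustomerEmail(),
                    "Open amount (cents): " + invoice.getAmountCents());
        }
    }
}
```

In a test, the mail sender can simply be swapped for a recording fake, e.g. `new InvoiceService((to, text) -> sent.add(to))`, which is exactly the interchangeability the container gives you for free.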

I guess this will become a bit more clear when we talk about actual implementations. I've got the feeling that we are actually aiming for similar things but express them in a different vocabulary.

Regarding Uncle Bob: I really enjoy his talks and books (I'm currently reading "Clean Architecture", his latest book). I am usually very much in agreement with what he says (except for his strange love for "write the test first" practices). However, even though in contrast to other "architecture gurus" he tries to keep his arguments close to reality and make them actionable, he remains quite abstract in his conclusions all too often. He tells people what to avoid, but rarely what to go for. At the end of the day, we have to put it into code, not lawsuits. My personal position is usually somewhere in the middle between Uncle Bob, Martin Fowler and Bertrand Meyer.

Flo Roform • Apr 23 '18

Hi Martin,

this is a great answer and I especially like that it now draws a clear line between JEE and architectural best practices. You should consider writing another blog post about this!

It's funny that we are both interested in similar philosophies about programming. I have read Clean Architecture as well (even pre-ordered :D) and I can also agree with the majority of things that Uncle Bob suggests (yep, even the TDD part; although I as well do not practice it constantly but this is more of an organizational problem in the company I work for). Anyway, if you’d like to talk about actual implementations feel free to send a link to a GitHub repository (or something) that reflects your approach. Although by now, I am quite confident that we probably won’t find much to discuss ;-)

Martin Häusler • Apr 23 '18

The next article I have planned is a follow up on this one on how to actually implement this architecture with spring-boot. Time is a bit short at the moment though. I'm looking forward to your comments!

Herdy Handoko • Apr 20 '18

It's great that it provides an opinionated architecture, such that it might be a good choice in systems integration projects. But it's hardly efficient or scalable compared to existing alternatives (JVM-based or otherwise).

Martin Häusler • Apr 20 '18

As already stated in the text, it is a defensive choice. Will you end up with the best possible and most efficient solution? Most likely not. But you will create a solution that works. Also, if you tell someone that your application uses a standard JEE stack, then that person will already know a whole lot about the code without ever having seen it. It's not perfect, but it has a lot of merits.

Herdy Handoko • Apr 21 '18

Yes, I'm fine with opinionated architectures. However, I just felt many of the claims in the article are exaggerated.

In addition, the landscape is different from 10 years ago. JVM devs are spoilt for choice of frameworks or stack that just works, so I don't see how this is a key differentiator.

Martin Häusler • Apr 21 '18

Feel free to suggest alternatives then :) I'm not aware of anything comparable on the JVM.

Herdy Handoko • Apr 21 '18

Sure :)

Let's use Spring as a baseline. It's a good framework and Spring Boot has made it effortless to get a project up and running and to plug in new functionality.

In the same breath, I do like Ninja Framework and their focused approach. Reminds me of the Java-based Play 1.x with their hot reload.

If it's simply services you're after (no web UI), a lot of Java shops are really happy with DropWizard.

But of course, they are all servlet-based. Those who seek better performance (throughput and stability under load) would usually go for Netty-based frameworks. It's a step up from the servlet's thread-per-request model.

Spring has support for both servlet (Web MVC) and Netty (WebFlux), I've never used WebFlux so I can't really comment on it yet.

With those in mind, I'd go for the akka-http based ones. Either akka-http for service only or Play Framework for full web framework.

Another framework worthy of mention is vert.x. I haven't used it but I know of people who had a lot of success with it.

Martin Häusler • Apr 21 '18

Okay, thanks! I've heard about the play framework - mostly rants though. Still, I'll have to check those technologies out to see what they do and how they work. Thank you for the input!

Herdy Handoko • Apr 22 '18 • Edited

No worries, hope you'll have fun learning about these different approaches. I sure did :)

Apart from frameworks or stacks, another completely different paradigm that I think you might find interesting is eventually consistent systems using EventSourcing + CQRS (Command Query Responsibility Segregation).

It's completely different (and overkill for most use cases), but it shifts your mindset from the behaviours of an object to event-driven thinking, which, I think, is how things happen in real life, i.e. we respond to events.