Data Engineering with Java & Apache Spark
Hypertext Transfer Protocol, or HTTP, is a common standard for delivering content across the internet, especially between clients and servers. An HTTP request includes a destination URL, an HTTP method, and content: metadata carried in its headers and an optional payload carried in its body. The server then sends back a response with a status code and its own headers and body.
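To make the parts of a request and response concrete, here is a minimal sketch that uses only the JDK (Java 11 or later), not the Servlet API. It starts a throwaway local server, sends one POST to it, and returns the response. The class name HttpRoundTrip, the /echo path, and the "ping" payload are assumptions made up for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class HttpRoundTrip {

    /** Starts a temporary local server, sends one POST to it, and returns the response. */
    public static HttpResponse<String> roundTrip() {
        HttpServer server = null;
        try {
            // Port 0 asks the operating system for any free port.
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/echo", exchange -> {
                // Read the request body and echo it back, prefixed with the HTTP method.
                String body = new String(exchange.getRequestBody().readAllBytes(), StandardCharsets.UTF_8);
                byte[] reply = (exchange.getRequestMethod() + ":" + body).getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Content-Type", "text/plain");
                exchange.sendResponseHeaders(200, reply.length); // status code and body length
                exchange.getResponseBody().write(reply);
                exchange.close();
            });
            server.start();

            // The request: a URL, a method (POST), a header, and a body.
            URI url = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/echo");
            HttpRequest request = HttpRequest.newBuilder(url)
                    .header("Content-Type", "text/plain")
                    .POST(HttpRequest.BodyPublishers.ofString("ping"))
                    .build();
            return HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        HttpResponse<String> response = roundTrip();
        System.out.println(response.statusCode());
        System.out.println(response.headers().firstValue("Content-Type").orElse("(none)"));
        System.out.println(response.body());
    }
}
```

The servlet examples later in this document cover the server side of the same exchange, with Tomcat taking the place of the JDK's built-in server.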
Java Enterprise Edition (Java EE, or JEE) is a community-driven collection of specifications, APIs, and frameworks that provide enterprise functionality. We will start with Java Servlets, a Java EE API for handling HTTP requests and responses from a Java code base.
Found in javax.servlet
and javax.servlet.http
the Servlet API provides a Servlet interface, which is implemented by the GenericServlet abstract class. GenericServlet is in turn extended by the HttpServlet abstract class, which you extend to write your own servlets.
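In a normal servlet process, the Tomcat server handles the control flow of your application through the servlet lifecycle:
When the servlet is first needed, the container instantiates it and calls its init() method.
For each incoming request, the container calls service() to process the request.
When the servlet is taken out of service, the container calls the destroy() method of your servlet.
tl;dr: The container calls init() once, service() many times, and eventually destroy() once.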
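The lifecycle order can be sketched with a toy model in plain Java. This is not the real javax.servlet API: ToyServlet, RecordingServlet, and ToyContainer are hypothetical names invented here, and a real container such as Tomcat does far more (threading, class loading, configuration). The sketch only shows the once / many times / once calling pattern.

```java
import java.util.ArrayList;
import java.util.List;

public class LifecycleSketch {

    /** Hypothetical stand-in for a servlet; NOT the real javax.servlet.Servlet interface. */
    interface ToyServlet {
        void init();
        void service(String request);
        void destroy();
    }

    /** Records every lifecycle call so the order can be inspected afterwards. */
    static class RecordingServlet implements ToyServlet {
        final List<String> calls = new ArrayList<>();

        public void init() { calls.add("init"); }
        public void service(String request) { calls.add("service:" + request); }
        public void destroy() { calls.add("destroy"); }
    }

    /** Hypothetical container: initializes once, serves many times, destroys once. */
    static class ToyContainer {
        private final ToyServlet servlet;
        private boolean initialized = false;

        ToyContainer(ToyServlet servlet) {
            this.servlet = servlet;
        }

        void handle(String request) {
            if (!initialized) {        // init() runs once, before the first request
                servlet.init();
                initialized = true;
            }
            servlet.service(request);  // service() runs for every request
        }

        void shutdown() {
            if (initialized) {         // destroy() runs once, at shutdown
                servlet.destroy();
                initialized = false;
            }
        }
    }

    /** Drives two requests through the toy container and returns the recorded calls. */
    public static List<String> run() {
        RecordingServlet servlet = new RecordingServlet();
        ToyContainer container = new ToyContainer(servlet);
        container.handle("GET /a");
        container.handle("GET /b");
        container.shutdown();
        return servlet.calls;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [init, service:GET /a, service:GET /b, destroy]
    }
}
```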
JEE web applications are usually packaged as .war
files, a web archive similar to a .jar
but with some minor changes to the folder hierarchy. The src/main/webapp
folder becomes the root directory, and the archive is usually hosted on a Java web server such as Apache Tomcat. Be sure to set the packaging
property in your pom.xml
to war
:
...
<packaging>war</packaging>
...
To begin using Java Servlets, add the following dependency to your Maven pom.xml
:
...
<groupId>javax.servlet</groupId>
<artifactId>servlet-api</artifactId>
<version>2.5</version>
...
Servlet
is a JEE specification that is not included in Java’s standard library, so it must be added as a dependency, or the project must run on a platform that already includes the JEE libraries.
A Java server like Tomcat will deploy a web archive and register all servlets according to the deployment descriptor found in src/main/webapp/WEB-INF/web.xml
:
<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://java.sun.com/xml/ns/javaee" xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" version="2.5">
...
</web-app>
Servlets, filters, context parameters, and other configuration are declared and defined in the web.xml
file. Servlet classes are registered in the web.xml
by assigning a name to both a URL mapping and the fully qualified class name of the servlet:
...
<servlet>
<servlet-name>myServlet</servlet-name>
<servlet-class>servlets.MyServlet</servlet-class>
</servlet>
<servlet-mapping>
<servlet-name>myServlet</servlet-name>
<url-pattern>/myServlet</url-pattern>
</servlet-mapping>
...
This provides access to the servlet through its URL mapping, in the above case at http://[hostname]:[port]/[app-context]/myServlet
The Servlet API provides the HttpServlet
abstract class which can be extended to define your own custom behavior:
package servlets;
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
public class MyServlet extends HttpServlet {
@Override
protected void doGet(HttpServletRequest req, HttpServletResponse resp)
throws ServletException, IOException {
// Implements GET behavior
}
@Override
protected void doPost(HttpServletRequest req, HttpServletResponse resp)
throws ServletException, IOException {
// Implements POST behavior
}
}
The HttpServletRequest
object provides useful methods such as getParameter(String name)
which returns the value of a query string or POST form parameter, or null if the parameter is absent. HttpServletResponse
provides a getWriter()
method which returns a PrintWriter
object for writing text to the response body.
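As a rough illustration of what a query-parameter lookup involves, the sketch below parses a raw query string with the JDK only (Java 10 or later). It is not how any particular container implements getParameter. The class QueryParamSketch and the sample query are assumptions made up for this example, and the sketch keeps only the first value for a repeated name.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryParamSketch {

    /** Parses "a=1&b=two" into a map, keeping the first value seen for each name. */
    public static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String rawName = eq >= 0 ? pair.substring(0, eq) : pair;
            String rawValue = eq >= 0 ? pair.substring(eq + 1) : "";
            // URL decoding turns '+' into a space and %XX sequences into characters.
            String name = URLDecoder.decode(rawName, StandardCharsets.UTF_8);
            String value = URLDecoder.decode(rawValue, StandardCharsets.UTF_8);
            params.putIfAbsent(name, value);
        }
        return params;
    }

    /** Returns the value for the given name, or null when the parameter is absent. */
    public static String getParameter(String query, String name) {
        return parse(query).get(name);
    }

    public static void main(String[] args) {
        String query = "name=Ada+Lovelace&lang=java";
        System.out.println(getParameter(query, "name"));    // Ada Lovelace
        System.out.println(getParameter(query, "missing")); // null
    }
}
```

Inside a real servlet the container does this work, so doGet only needs to call req.getParameter("name") and check the result for null.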
More recent versions of the Servlet API (3.0 and later) offer convenience annotations that allow configuration directly on the servlet class itself, leaving the web.xml
mostly empty:
...
<groupId>javax.servlet</groupId>
<artifactId>javax.servlet-api</artifactId>
<version>3.0.1</version>
<scope>provided</scope>
...
package servlets;
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
@WebServlet("/myServlet")
public class MyServlet extends HttpServlet {
protected void doGet(HttpServletRequest req, HttpServletResponse resp)
throws ServletException, IOException {
// Implements GET behavior
}
}
A servlet container like Tomcat creates a single instance of each servlet. All servlets in an application share one ServletContext, which holds application-wide initialization parameters. These can be set in the deployment descriptor with the context-param
tag:
...
<welcome-file-list>
<welcome-file>index.html</welcome-file>
</welcome-file-list>
<context-param>
<param-name>key</param-name>
<param-value>value</param-value>
</context-param>
<servlet>
<servlet-name>myServlet</servlet-name>
<servlet-class>servlets.MyServlet</servlet-class>
</servlet>
...
The parameter can be accessed through a servlet’s getServletContext() method:
getServletContext().getInitParameter("key");
Each servlet, meanwhile, has its own configuration object, its ServletConfig, which can be given initialization parameters with the init-param
tag:
...
<servlet>
<servlet-name>myServlet</servlet-name>
<servlet-class>servlets.MyServlet</servlet-class>
<init-param>
<param-name>key</param-name>
<param-value>value</param-value>
</init-param>
</servlet>
...
This parameter can be accessed directly from the servlet:
getInitParameter("key");
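The difference between the two scopes can be sketched with a toy model in plain Java. This is not the Servlet API: ToyServlet, getContextParameter, and the parameter names supportEmail and pageSize are hypothetical, chosen only to show that context parameters are shared by every servlet while init parameters belong to a single servlet.

```java
import java.util.Map;

public class ParamScopeSketch {

    /** Hypothetical model of one servlet's view of its parameters; NOT the real Servlet API. */
    static class ToyServlet {
        private final Map<String, String> contextParams; // shared, like <context-param>
        private final Map<String, String> initParams;    // per servlet, like <init-param>

        ToyServlet(Map<String, String> contextParams, Map<String, String> initParams) {
            this.contextParams = contextParams;
            this.initParams = initParams;
        }

        String getContextParameter(String name) {
            return contextParams.get(name);
        }

        String getInitParameter(String name) {
            return initParams.get(name);
        }
    }

    /** Returns the shared value as seen by both servlets, then each servlet's own value. */
    public static String[] lookups() {
        Map<String, String> shared = Map.of("supportEmail", "help@example.com");
        ToyServlet a = new ToyServlet(shared, Map.of("pageSize", "10"));
        ToyServlet b = new ToyServlet(shared, Map.of("pageSize", "50"));
        return new String[] {
            a.getContextParameter("supportEmail"),
            b.getContextParameter("supportEmail"),
            a.getInitParameter("pageSize"),
            b.getInitParameter("pageSize")
        };
    }

    public static void main(String[] args) {
        // Prints: help@example.com, help@example.com, 10, 50
        System.out.println(String.join(", ", lookups()));
    }
}
```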
Download Tomcat and extract the archive somewhere.
Package the project into a war
file, then move the file into your Tomcat server’s webapps folder. Tomcat should unpack and deploy the application automatically.
Eclipse and other IDEs can deploy an application onto a debug server for testing. Use the Eclipse Servers view to register a Tomcat installation, then add the project to that server.
Add the tomcat7-maven-plugin
to the pom.xml
:
<build>
<pluginManagement>
<plugins>
<plugin>
<groupId>org.apache.tomcat.maven</groupId>
<artifactId>tomcat7-maven-plugin</artifactId>
<version>2.2</version>
<configuration>
<path>/${project.build.finalName}</path>
<finalName>executable.jar</finalName>
</configuration>
</plugin>
</plugins>
</pluginManagement>
</build>
Then run the project with Maven:
mvn tomcat7:run
If Tomcat is installed, with the CATALINA_HOME environment variable set, the same plugin can be used to deploy the application to the server with:
mvn tomcat7:deploy
Tomcat embed is a library that allows for programmatic configuration and deployment of an embedded server within a Java application. In the pom.xml
:
<dependency>
<groupId>org.apache.tomcat.embed</groupId>
<artifactId>tomcat-embed-core</artifactId>
<version>8.5.23</version>
</dependency>
Then in a main method:
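import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.catalina.Context;
import org.apache.catalina.startup.Tomcat;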
public class App {
public static void main(String[] args) throws Exception {
Tomcat tomcat = new Tomcat();
tomcat.setPort(8080);
// Set context path and root folder
String contextPath = "/";
String docBase = new File(".").getAbsolutePath();
Context context = tomcat.addContext(contextPath, docBase);
// Declare, define, and map servlets
HttpServlet helloServlet = new HttpServlet(){
@Override
protected void doGet(HttpServletRequest req, HttpServletResponse resp)
throws ServletException, IOException {
PrintWriter out = resp.getWriter();
out.write("<html><title>Example</title><body><h1>Hello, World!</h1></body></html>");
out.close();
}
};
String servletName = "HelloServlet";
String urlPattern = "/hello";
// Register servlets with Tomcat
Tomcat.addServlet(context, servletName, helloServlet);
context.addServletMappingDecoded(urlPattern, servletName);
// Start the server
tomcat.start();
tomcat.getServer().await();
}
}