Revature 200413

Data Engineering with Java & Apache Spark

Project 1

A Java ETL batch-processing application. It should parse data from a CSV or JSON file, send that data to an Apache Spark cluster for analysis using Spark's RDD API for Java, and persist the result to a SQL database. The results of prior queries should be served over HTTP by a Java Servlet.
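As a rough illustration of the parse-and-analyze stage, the sketch below counts how often each value appears in one CSV column. It uses plain Java streams as a local stand-in for the cluster step; on Spark, the same shape would typically be expressed with `JavaRDD.mapToPair` followed by `JavaPairRDD.reduceByKey`. The class name `EtlSketch`, the method `countByColumn`, and the sample rows are hypothetical and not part of the project brief, and the naive comma split assumes no quoted fields.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical sketch: local stand-in for the Spark analysis step.
class EtlSketch {

    // Skips the header row, extracts one column, and counts each distinct value.
    // Assumes simple CSV with no quoted or escaped commas.
    static Map<String, Long> countByColumn(List<String> csvLines, int column) {
        return csvLines.stream()
                .skip(1)                                   // drop the header row
                .map(line -> line.split(",", -1))          // naive field split
                .filter(fields -> column < fields.length)  // ignore short rows
                .map(fields -> fields[column].trim())
                .collect(Collectors.groupingBy(
                        value -> value, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        List<String> lines = List.of("id,city", "1,Austin", "2,Reston", "3,Austin");
        Map<String, Long> counts = countByColumn(lines, 1);
        counts.forEach((city, n) -> System.out.println(city + "=" + n));
    }
}
```

The resulting map is the kind of small aggregate that could then be written to the SQL database and later returned by the servlet; those two stages are left out here because the brief does not specify a schema or endpoint.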

Features

Tech Stack

Presentation