Revature 200413

Data Engineering with Java & Apache Spark

Big Data with Apache Spark

Welcome to the docs repository for Revature’s 200413 Big Data/Spark cohort. Here you will find weekly topics, useful resources, and project requirements.

Weekly Topics

Every week, we will focus on a particular technology or theme to add to our repertoire of competencies. These topics will feature heavily in the weekly assessments and QC meetings, so expect to supplement them with self-study and hands-on practice.

Each week may include a list of topic-based questions, which you should be prepared to answer in an assessment, whether in a meeting or a quiz. Associates are expected to answer at least five of them on the weekly discussion board and to respond to other posts with suggestions that improve or clarify them.

Process

Google Doc - Contains our standard schedule, QC assessments overview and links, and a list of important contacts.

Projects

This cohort will prioritize individual and group-based project work.

Each project comes with a list of required features, both functional and operational. We strongly suggest finishing your MVP (minimum viable product) as early as possible and only then iterating on new features. Plan ahead, reach out whenever you need guidance, and offer your own help to those who need it.

Development Environment

To maximize resources and minimize troubleshooting, please perform a clean install or refresh of your operating system. Update your system, enable VT-x in the BIOS if possible (AMD-V on AMD processors), and uninstall all unnecessary programs. Set up your development environment for Java, Git, and Maven as soon as possible. In later weeks we will also require PostgreSQL, Docker, SSH, curl, and of course Apache Spark. Refer to this Readme or the links provided in each week’s topic and resources document to stay up to date on the tools and programs needed for project work. You are responsible for maintaining your environment throughout the program.

Tools

Installers

Package Managers

Java SE 8

Command-line tools

Editors

Installing Git, Java, Maven, and an IDE with Chocolatey (Windows only)

Install Chocolatey:

  1. Open PowerShell as an administrator.
  2. Run:

    Set-ExecutionPolicy AllSigned

  3. Agree to all changes when prompted.
  4. Run:

    Set-ExecutionPolicy Bypass -Scope Process -Force; iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))

Install using Chocolatey

  1. Open a new PowerShell window as an administrator and run the following commands:
  2. Install Git for Windows:

    choco install git

  3. Install OpenJDK 8:

    choco install adoptopenjdk8

  4. Install Apache Maven:

    choco install maven

  5. Install an IDE of your choice:
    • Visual Studio Code:

      choco install vscode

    • Eclipse:

      choco install eclipse

    • IntelliJ IDEA Community Edition:

      choco install intellijidea-community

Summary

To confirm all tools are properly installed and configured, be sure the following commands return no errors:

git --version
java -version
javac -version
mvn -v

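Both java and javac should report version 1.8. If either reports a different version, another Java installation is likely taking precedence on your PATH.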
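The Java 1.8 requirement can also be checked from inside the JVM, which doubles as a first compile-and-run test of your setup. The following is a minimal sketch, not part of the official setup: the class name `JavaVersionCheck` and the helper `isJava8` are hypothetical, and the check assumes that Java 8 runtimes report a `java.version` system property beginning with `1.8`.

```java
public class JavaVersionCheck {

    // Returns true when a version string such as "1.8.0_252" denotes Java 8.
    // Assumption: Java 8 runtimes report a version starting with "1.8".
    static boolean isJava8(String version) {
        if (version == null) {
            return false;
        }
        return version.equals("1.8") || version.startsWith("1.8.");
    }

    public static void main(String[] args) {
        // Standard system properties describing the JVM that is running this class.
        String version = System.getProperty("java.version");
        String home = System.getProperty("java.home");

        System.out.println("java.version = " + version);
        System.out.println("java.home    = " + home);

        if (!isJava8(version)) {
            System.err.println("Expected Java 1.8 but found " + version);
            System.exit(1);
        }
        System.out.println("OK: running on Java 8");
    }
}
```

Save it as `JavaVersionCheck.java`, then run `javac JavaVersionCheck.java` followed by `java JavaVersionCheck`. Because it exercises both `javac` and `java`, a failure at either step points to the tool that needs attention.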

For convenience, all of the above tools (with Visual Studio Code as the IDE) can be installed at once using the following command:

choco install -y git adoptopenjdk8 maven vscode
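If a tool is reported as missing after installation, or an unexpected Java version shows up, it can help to see which copies of each tool are visible on your PATH. The sketch below is one illustrative way to do that and is not part of the official setup: the class `PathToolScan` and its methods are hypothetical names, and the Windows extension list (`.exe`, `.cmd`, `.bat`) is a simplifying assumption rather than a full reading of `PATHEXT`.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

public class PathToolScan {

    // File names to look for. On Windows, executables and scripts carry an
    // extension; this short list is an assumption, not the full PATHEXT.
    static List<String> candidateNames(String tool, boolean windows) {
        if (!windows) {
            return Arrays.asList(tool);
        }
        List<String> names = new ArrayList<>();
        for (String ext : Arrays.asList(".exe", ".cmd", ".bat")) {
            names.add(tool + ext);
        }
        return names;
    }

    // Returns every matching file in PATH order; the first hit is the one
    // the shell would normally run.
    static List<File> findAllOnPath(String tool, String path, boolean windows) {
        List<File> hits = new ArrayList<>();
        if (path == null) {
            return hits;
        }
        for (String dir : path.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            for (String name : candidateNames(tool, windows)) {
                File candidate = new File(dir, name);
                if (candidate.isFile()) {
                    hits.add(candidate);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        boolean windows = System.getProperty("os.name")
                .toLowerCase(Locale.ROOT).startsWith("windows");
        String path = System.getenv("PATH");

        for (String tool : Arrays.asList("git", "java", "javac", "mvn")) {
            List<File> hits = findAllOnPath(tool, path, windows);
            if (hits.isEmpty()) {
                System.out.println(tool + ": not found on PATH");
            }
            for (File hit : hits) {
                System.out.println(tool + ": " + hit.getAbsolutePath());
            }
        }
    }
}
```

More than one line for `java` or `javac` suggests several Java installations are present. In that case, make sure the Java 8 entry comes first on your PATH, or remove the others.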