1. There are 4x more Data Engineering jobs posted on Indeed than Data Science jobs
2. Data Engineers earn on average $132,680 per year according to Indeed
3. In 2019, Data Engineering positions saw a growth of 50% according to the Dice Tech Job Report
Data Engineering Bootcamp
Are you a...
- Recent college graduate
- Early-career developer
- Self-taught programmer
Understanding the Fundamentals of ETL Pipelines by Ingesting Historical Flight and Passenger Data
This section starts it all! You will learn the fundamentals of simple data exploration and cleaning using Python Pandas. From local data cleanup, you will move on to loading CSV reference flight data into GCP BigQuery for further exploration using SQL. You will then use GCP Dataflow (Apache Beam) to extract, transform, and load (ETL) 4 years of historical flight data in parallel. To advance your distributed computing knowledge, you will then use GCP Cloud Dataproc (Apache Spark) to transform and load millions of rows of passenger data into GCP BigQuery.
Chapter 1: Loading a Reference Dataset into BigQuery
Chapter 2: Loading Flight Data using Apache Beam (Google Dataflow)
Chapter 3: Processing Passengers using Apache Spark (Google Dataproc)
Tech: Pandas, SQL, Google Cloud Storage, Google Cloud BigQuery, Google Cloud Dataflow (Apache Beam), Google Cloud Dataproc (Apache Spark)
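To give a feel for the extract, transform, and load pattern this section covers, here is a minimal local sketch. It is an illustration under stated assumptions, not course material: the sample rows and column names are hypothetical, Python's built-in `csv` module stands in for Pandas, and an in-memory SQLite database stands in for BigQuery so the example runs without a GCP account.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a flights CSV file.
# The column names and values are assumptions made for this sketch.
RAW_CSV = """flight_id,origin,destination,dep_delay
AA100, jfk ,LAX,12
AA100, jfk ,LAX,12
UA200,SFO,ord,
DL300,ATL,SEA,-3
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize airport codes, parse delays, drop exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        delay = row["dep_delay"].strip()
        record = (
            row["flight_id"].strip(),
            row["origin"].strip().upper(),
            row["destination"].strip().upper(),
            int(delay) if delay else None,  # keep missing delays as NULL
        )
        if record not in seen:
            seen.add(record)
            cleaned.append(record)
    return cleaned


def load(records, conn):
    """Load: write cleaned records into a SQL table (SQLite here, not BigQuery)."""
    conn.execute(
        "CREATE TABLE flights "
        "(flight_id TEXT, origin TEXT, destination TEXT, dep_delay INTEGER)"
    )
    conn.executemany("INSERT INTO flights VALUES (?, ?, ?, ?)", records)
    conn.commit()


def run_pipeline(text):
    """Run extract, transform, and load in order; return the open connection."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    return conn


if __name__ == "__main__":
    conn = run_pipeline(RAW_CSV)
    # Explore the loaded data with plain SQL.
    query = "SELECT flight_id, origin, destination, dep_delay FROM flights"
    for row in conn.execute(query):
        print(row)
```

The same three stages apply at larger scale; services such as Dataflow and Dataproc run the transform step in parallel across many workers rather than in a single loop.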
Designing and Monitoring Real-time Ticket Purchase Data
Put on your Architect hat and learn the best practices behind developing a logical Data Architecture that will be used throughout the rest of the course. You will then use GCP Dataflow (Apache Beam) to stream process real-time flight queries from GCP Pub/Sub (Apache Kafka). Using GCP Dataproc and BigTable, you will develop an online transaction processing (OLTP) platform to monitor ticket sales.
Chapter 4: Putting on our Data Architect Hat!
Chapter 5: Real-time Stream Processing of Live Flight Queries with Cloud Pub/Sub
Chapter 6: Registering Ticket Sales with Google BigTable
Tech: Google Cloud Pub/Sub (Apache Kafka), Google Cloud Dataflow (Apache Beam), Google Cloud BigTable (Apache HBase)
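The idea behind stream processing, handling each message as it arrives rather than waiting for a complete batch, can be sketched locally in a few lines. This is only an illustration under stated assumptions: the `message_stream` generator is a hypothetical stand-in for a Pub/Sub subscription, the message fields are made up for the example, and no GCP or Apache Beam APIs are used.

```python
import json
from collections import Counter


def message_stream():
    """Hypothetical stand-in for a subscription: yields JSON-encoded flight queries."""
    queries = [
        {"origin": "JFK", "destination": "LAX"},
        {"origin": "SFO", "destination": "ORD"},
        {"origin": "JFK", "destination": "LAX"},
    ]
    for query in queries:
        yield json.dumps(query).encode("utf-8")


def count_routes(messages):
    """Process messages one at a time, yielding the running count for each route."""
    counts = Counter()
    for payload in messages:
        query = json.loads(payload.decode("utf-8"))
        route = f'{query["origin"]}-{query["destination"]}'
        counts[route] += 1
        # Emit an updated result immediately instead of waiting for the stream to end.
        yield route, counts[route]


if __name__ == "__main__":
    for route, total in count_routes(message_stream()):
        print(route, total)
```

Because `count_routes` is a generator, results are available after every message, which is the property a real-time monitoring pipeline relies on.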
Automating Processes and Analytics to Determine Ticket Prices
Artificial Intelligence (AI) and Machine Learning (ML) will be leveraged in this section to create advanced analytics built on top of your existing data pipeline. You will further automate pipeline processes using GCP Cloud Composer (Apache Airflow). Finally, you will complete your data pipeline by creating a Data Hub to expose all the AI data via a REST API.
Chapter 7: Advanced Analytics using BigQuery
Chapter 8: Building an AI with BigQuery ML (Machine Learning)
Chapter 9: Pipeline Automation with Cloud Composer (Apache Airflow)
Chapter 10: Creating a Data Hub, Exporting Data via Google App Engine (Python Flask)
Tech: SQL, Machine Learning, Google Cloud BigQuery, Google Cloud BigQuery ML, Google Cloud Composer (Apache Airflow), Flask (Python) REST API, Google App Engine
Why choose TuraLabs?
Real world examples and datasets
End to end data pipeline based project
Curriculum developed by industry experts
Convenient online self-paced course
Free-to-Join Discord Community
Weekly Office Hours with TuraLabs Engineers
Pre-register Before June 1st, 2021
Be a part of our early adopter community: pre-register before June 1st, 2021, help us improve the course, and enjoy all the course content for FREE.
Free communication with course developers via Discord