You've landed in the right place.
This course will teach you how to build distributed data processing pipelines using open-source frameworks.
For each component of a typical big data system, you'll first learn the underlying principles, then apply them using Python and SQL-based frameworks to construct pipelines from scratch.
The course covers:
- Evolution of Data Processing
- Types of Big Data Systems
- Using Docker and Google Cloud Platform
- Distributed Cache with Redis
- Parallel Processing with Hadoop
- Batch Processing with Spark
- Batch Storage with HBase
- Data Collection with Flume
- Messaging with Kafka
- Stream Processing with Spark & Kafka
- Real-time Storage with Cassandra
- Distributed Search & Indexing with the ELK stack
- Presentation with Grafana & Prometheus
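Several of the topics above (Hadoop, Spark) build on the same map → shuffle → reduce pattern. As a minimal in-process sketch of that idea in plain Python, here is a toy word count; the function names are illustrative only and are not part of any Hadoop or Spark API:

```python
from collections import defaultdict
from multiprocessing.dummy import Pool  # thread pool standing in for cluster workers

def map_phase(line):
    # Like a Hadoop mapper: emit (word, 1) pairs for one input record
    return [(word.lower(), 1) for word in line.split()]

def word_count(lines, workers=4):
    with Pool(workers) as pool:
        mapped = pool.map(map_phase, lines)  # map step, run in parallel
    # Shuffle step: group all emitted values by key
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    # Reduce step: sum the counts for each word
    return {word: sum(values) for word, values in groups.items()}

lines = ["big data systems", "big data pipelines"]
print(word_count(lines))  # {'big': 2, 'data': 2, 'systems': 1, 'pipelines': 1}
```

A real framework distributes the map and reduce steps across machines and handles the shuffle over the network; the pattern itself is the same.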
By the end of this Data Engineering course, you'll have everything you need to build robust data pipelines.