Beginner data engineering project - batch edition
Updated Jan 22, 2025 - HTML
Airflow Deployment on AWS ECS Fargate Using Cloudformation
Integrating Apache Airflow, dbt, Great Expectations, and Apache Superset to build a modern open-source data stack.
The Earthquake Emergency Response Robots project aims to design, build, and deploy systems for post-earthquake situations. Its main focus is adaptable robots equipped with sensors and communication capabilities.
This ETL (Extract, Transform, Load) project uses several Python libraries, including Airflow, Soda, Polars, YData Profiling, DuckDB, Requests, Loguru, and Google Cloud, to streamline the extraction, transformation, and loading of CSV datasets from the U.S. government's data repository at https://catalog.data.gov.
As a Data Engineer for a fictional E-commerce startup, this project addresses the task of analyzing the web server logs to find the number of product pages visited and the number of items in the cart.
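The core of that log-analysis task can be sketched with the standard library alone: parse each request line and tally the paths of interest. The URL patterns (`/product/`, `/cart/add`) and the log format are assumptions for illustration; the actual project defines its own schemas and likely runs at a larger scale.

```python
# Sketch: count product-page visits and add-to-cart requests in
# common-log-format lines. Path prefixes below are illustrative assumptions.
import re
from collections import Counter

# Match the request path inside the quoted request, e.g. "GET /product/42 HTTP/1.1"
LOG_LINE = re.compile(r'"(?:GET|POST) (?P<path>\S+) HTTP/[\d.]+"')


def count_events(lines):
    """Return a Counter of product-page visits and cart additions."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.search(line)
        if not m:
            continue
        path = m.group("path")
        if path.startswith("/product/"):
            counts["product_page_visits"] += 1
        elif path.startswith("/cart/add"):
            counts["cart_additions"] += 1
    return counts
```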
Live COVID-19 tracker with Airflow
A data pipeline that extracts news articles from BBC News, stores them in MongoDB, performs NLP analysis (topic modeling, sentiment analysis), and orchestrates everything with Airflow.
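A pipeline with that shape reduces to three stages chained in order. The sketch below uses plain Python callables with toy stand-ins (the article texts, the lexicon-based sentiment score, and the dict-backed store are all illustrative, not the project's code); in the real project each stage would be an Airflow task and the store would be a MongoDB `insert_many`.

```python
# Sketch of the extract -> analyze -> store pipeline stages.
# All data and scoring rules below are illustrative placeholders.
def extract_articles():
    # Stand-in for scraping BBC News.
    return ["Markets rally on good news", "Storm causes severe damage"]


def analyze_sentiment(articles):
    # Toy lexicon-based score; the project uses real NLP models.
    positive, negative = {"good", "rally"}, {"storm", "damage", "severe"}
    results = []
    for text in articles:
        words = set(text.lower().split())
        score = len(words & positive) - len(words & negative)
        results.append({"text": text, "sentiment": score})
    return results


def store(results, db):
    # Stand-in for a MongoDB insert_many call.
    db.setdefault("articles", []).extend(results)


def run_pipeline(db):
    # In Airflow the same ordering would be declared as
    # extract_task >> analyze_task >> store_task.
    store(analyze_sentiment(extract_articles()), db)
```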
I generated the list of tables using SchemaSpy and host it on my GitHub. It definitely helps in understanding Airflow's backend tables.
End-to-end ML pipeline with model and drift monitoring for loan eligibility classification.
White and Red Wine classification using logistic regression
Ask Ubuntu Logs analysis with PySpark on GCP | Pipeline with Airflow (Cloud Composer)