Example end-to-end data engineering project.
Updated Dec 8, 2022 - Python
Replicate data from MySQL, Postgres and MongoDB to ClickHouse®
Nyc_Taxi_Data_Pipeline - DE Project
Ecommerce Realtime Data Pipeline (Data Modeling, Workflow Orchestration, Change Data Capture, Analytical Database and Dashboarding)
Enables Python developers to leverage Debezium's CDC capabilities with custom event handlers and seamless integration.
Repo for a blog post on CDC with Debezium
Built a real-time streaming pipeline to extract stock data using Apache NiFi, Debezium, Kafka, and Spark Streaming. Loaded the transformed data into an AWS Glue database and created real-time dashboards using Power BI and Tableau with Athena. The pipeline is orchestrated using Airflow.
Data Streaming with Debezium, Kafka, Spark Streaming, Delta Lake, and MinIO
Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming using Amazon MSK Serverless and MSK Connect (Debezium)
A Smart Traffic Management System for Ho Chi Minh City, Vietnam leveraging batch and real-time data processing, intuitive dashboards, and monitoring tools to optimize traffic flow, enhance safety, and support sustainable urban mobility through advanced analytics and user-friendly applications.
Guardian for your Kafka Connect connectors. It checks the status of connectors and tasks and restarts any that have failed.
Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming using Amazon MSK and MSK Connect (Debezium)
Outbox pattern using Debezium and Protobuf serialization
Example of setting up CDC with Debezium
Data Pipeline for CDC data from MySQL DB to Amazon S3 through Amazon MSK Serverless using Amazon MSK Connect (Debezium).