A cloud-based ETL pipeline on AWS for automating airline flight data ingestion, transformation, and storage using S3, Glue, Redshift, EventBridge, Step Functions, and SNS.

aws s3-bucket sns redshift step-functions etl-pipeline glue-job eventbridge

Updated Feb 17, 2025
Python

AJAbe / DE_AWS_Maven

A first experiment in AWS serverless data engineering.

aws aws-lambda aws-s3 grafana data-engineering serverless-framework glue-job

Updated Dec 26, 2024
Python

heitorkobayashi / action-movies-tmdb-analysis

An analysis of the relationships between budget, popularity, and quality in action movies, using TMDB data and AWS tools.

api aws aws-lambda aws-s3 data-visualization python3 pyspark data-engineering data-analysis sparksql boto3 tmdb-api quicksight glue-job quicksight-dashboard

Updated Jun 13, 2025
Python

Sanjay-dev-ds / aws-serverless-data-pipeline

This project creates a serverless data pipeline to extract data from the Colombo Stock Market ASI Index API using AWS Lambda, Kinesis Firehose, and S3. An AWS Glue workflow processes and transforms the data, storing it in an Apache Iceberg table via Athena and Glue ETL jobs.

aws aws-lambda athena apache s3-bucket iceberg glue-job

Updated Jul 2, 2024
Python

vidupriya / AWS-Glue--Data-Copy

A function for copying data in formats such as CSV, Parquet, and Avro from a source S3 bucket to a destination S3 bucket using AWS Glue. It includes the necessary Glue job setup, logging, reading data from the source bucket, and writing it to the destination bucket.

aws data spark s3 glue s3-bucket python3 pyspark s3-storage s3-buckets awss3 glue-job awsglue data-copying

Updated Apr 20, 2025
Python

Hanagojiv / End-to-End-Reddit-Data-Processing-Pipeline-with-AWS-Services

This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data warehouse. The pipeline leverages a combination of tools and services including Apache Airflow, Celery, PostgreSQL, Amazon S3, AWS Glue, Amazon Athena, and Amazon Redshift.

aws sql postgresql s3-bucket python3 celery redshift amazon-athena apache-airflow glue-job

Updated Feb 7, 2024
Python

DieGit0 / data_realtime_-_batch_analytics

Data streaming and batch processing using AWS services.

athena json-api lambda-functions s3-bucket python3 data-catalog kinesis-stream parquet sns-notifications glue-job glue-crawler eventbridge-scheduler

Updated Aug 18, 2024
Python

NSVpriya / Youtube_Data_ETL_Project

This project analyzes the popularity of YouTube content across different regions using datasets sourced from Kaggle. It takes a systematic approach to data preprocessing, cleaning, and analysis, using AWS services including S3, Lambda, Glue, and others to build an automated ETL pipeline.

python shell aws sql aws-lambda powerbi glue-job dataanlytics

Updated Apr 26, 2024
Python

elmezianech / AutoInventory

This project is an end-to-end, fully automated warehouse management solution designed to tackle real-world inventory challenges in the FMCG sector. Spanning real-time data ingestion, predictive analytics, and interactive dashboards, it uses an event-driven architecture to simulate a business-ready system.

Updated Dec 28, 2024
Python

muhd-minhaz / AWS-Glue--Data-Copy

A function for copying data in formats such as CSV, Parquet, and Avro from a source S3 bucket to a destination S3 bucket using AWS Glue. It includes the necessary Glue job setup, logging, reading data from the source bucket, and writing it to the destination bucket.

aws data spark s3 glue s3-bucket pyspark s3-storage s3-buckets awss3 glue-job awsglue data-copying

Updated Jul 18, 2025
Python

GustavoGuarany / projeto-engenharia-dados-tv-jornalismo

The project was designed to establish an architecture on AWS, originating from the migration of an existing database hosted in a local (on-premise) environment.

python docker aws crawler sql sql-server spark athena terraform s3 glue jupyter-notebook s3-bucket dms powerbi glue-job database-migration-service rds-postgres

Updated Aug 17, 2023
Python

ShikhaYadav123 / AWS-Glue-IMDB-Data-Quality-ETL-Pipeline

IMDB Movie Data ETL Pipeline using S3, Glue, Redshift, EventBridge, SNS

crawler amazon-web-services amazon-s3 redshift-database redshift-cluster glue-job glue-etl

Updated Aug 6, 2024
Python

glue-job

Here are 24 public repositories matching this topic...

awslabs / athena-glue-service-logs

vincentclaes / datajob

miztiik / s3-to-rds-with-glue

g-lorena / aws_streaming_pipeline

camposvinicius / aws-snowflake-etl

GabrielDan92 / AWS_Terraform_PySpark-ETL_Job

phaniteja5789 / Event-Driven-Data-Processing-and-Workflow-Orchestration-on-AWS

PATRICIAJUNQUEIRA / DataLake_PipelineAWS

Kaushik-Puttaswamy / Airline-Data-Ingestion-Processing-on-AWS

AJAbe / DE_AWS_Maven

heitorkobayashi / action-movies-tmdb-analysis

Sanjay-dev-ds / aws-serverless-data-pipeline

vidupriya / AWS-Glue--Data-Copy

Hanagojiv / End-to-End-Reddit-Data-Processing-Pipeline-with-AWS-Services

DieGit0 / data_realtime_-_batch_analytics

NSVpriya / Youtube_Data_ETL_Project

elmezianech / AutoInventory

muhd-minhaz / AWS-Glue--Data-Copy

GustavoGuarany / projeto-engenharia-dados-tv-jornalismo

ShikhaYadav123 / AWS-Glue-IMDB-Data-Quality-ETL-Pipeline
