Glue scripts for converting AWS Service Logs for use in Athena
Updated Feb 1, 2024 - Python
Build and deploy a serverless data pipeline on AWS with no effort.
Extract, transform, and load data for analytic processing using AWS Glue
This is a data pipeline built with the purpose of serving a business team.
Terraform configuration that creates several AWS services, uploads data to S3, and starts the Glue Crawler and Glue Job.
ETL pipeline on AWS
A cloud-based ETL pipeline on AWS for automating airline flight data ingestion, transformation, and storage using S3, Glue, Redshift, EventBridge, Step Functions, and SNS.
First experimentation into the world of AWS Serverless Data Engineering
Analysis of the relationships between budget, popularity, and quality in action films, using TMDB data and AWS tools.
This project creates a serverless data pipeline to extract data from the Colombo Stock Market ASI Index API using AWS Lambda, Kinesis Firehose, and S3. An AWS Glue workflow processes and transforms the data, storing it in an Apache Iceberg table via Athena and Glue ETL jobs.
A function for copying data in formats such as CSV, Parquet, and Avro from a source S3 bucket to a destination S3 bucket using AWS Glue. It includes the necessary setup for the Glue job, logging, reading data from the source bucket, and writing it to the destination bucket.
This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data warehouse. The pipeline leverages a combination of tools and services including Apache Airflow, Celery, PostgreSQL, Amazon S3, AWS Glue, Amazon Athena, and Amazon Redshift.
Data Streaming and Batch processing using AWS Services
This project aims to analyze the popularity of YouTube content across different regions by leveraging datasets sourced from Kaggle. It employs a systematic approach to data preprocessing, cleaning, and analysis using various AWS (Amazon Web Services) services including S3, Lambda, Glue, and others, to build an automated ETL pipeline.
This project is an end-to-end, fully automated warehouse management solution designed to tackle real-world inventory challenges in the FMCG sector. From real-time data ingestion and predictive analytics to interactive dashboards, this project combines cutting-edge technologies and an event-driven architecture to simulate a business-ready system.
This project was designed to establish an architecture on AWS, originating from the migration of an existing database hosted in a local (on-premise) environment.
IMDB Movie Data ETL Pipeline using S3, Glue, Redshift, EventBridge, SNS