lanterno/ocw-tracker

OCW course tracker

This project tracks new courses on the MIT OCW platform.
It checks for the latest 10 courses across all departments and stores their general information in the database.
It also provides a GraphQL API to query and filter the stored courses.
More details on how to use the application follow.

Usage

To set up and use the project, please refer to this document.

Architecture

ocw-tracker.png

As the diagram shows, there are three main components, one message bus, and one database.

Scraper

On a cron schedule (e.g. at midnight), a new job runs to scrape ocw.mit.edu and check for new courses.
For each new course, a message containing the course data is published on the message bus.
More details are in the scraper's document.
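
The publish step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Course` fields, the `publish_new_courses` helper, and the use of `queue.Queue` as a stand-in for the message bus are all assumptions made for the example, and the course values are placeholders.

```python
import json
import queue
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Course:
    # Hypothetical fields; the real scraper's payload may differ.
    course_id: str
    title: str
    department: str


def publish_new_courses(latest, seen_ids, bus):
    """Publish one JSON message per course not seen before; return the count."""
    published = 0
    for course in latest:
        if course.course_id in seen_ids:
            continue  # already known, nothing to publish
        bus.put(json.dumps(asdict(course)))
        seen_ids.add(course.course_id)
        published += 1
    return published


# queue.Queue stands in for the real message bus in this sketch.
bus = queue.Queue()
seen = {"course-a"}
latest = [
    Course("course-a", "Example Course A", "Example Department"),
    Course("course-b", "Example Course B", "Example Department"),
]
count = publish_new_courses(latest, seen, bus)
```

Tracking already-seen IDs means a repeated cron run over the same "latest 10" list publishes nothing new.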

Parsers

These are workers that listen on the message bus for new messages.
For each new message, they attempt to parse the content into a structured form.

The result is then stored in a SQL database.
More details are in the parsers' document.
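
As a rough sketch of what one parser worker does with a single message, assuming JSON payloads and using the standard library's `sqlite3` as a stand-in for the project's SQL database. The table layout and the `parse_message` and `store_course` helpers are hypothetical, not taken from the repository.

```python
import json
import sqlite3


def parse_message(raw):
    """Turn a raw JSON message into a (course_id, title, department) row."""
    data = json.loads(raw)
    return (data["course_id"], data["title"].strip(), data["department"].strip())


def store_course(conn, row):
    # INSERT OR IGNORE keeps the worker idempotent if a message is redelivered.
    conn.execute(
        "INSERT OR IGNORE INTO courses (course_id, title, department) VALUES (?, ?, ?)",
        row,
    )
    conn.commit()


# In-memory database with an assumed schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE courses (course_id TEXT PRIMARY KEY, title TEXT, department TEXT)"
)
raw = '{"course_id": "course-b", "title": " Example Course B ", "department": "Example Department"}'
store_course(conn, parse_message(raw))
store_course(conn, parse_message(raw))  # a redelivered message is a no-op
rows = conn.execute("SELECT course_id, title, department FROM courses").fetchall()
```

Making the insert idempotent is one possible design choice; it lets several workers share the bus without creating duplicate rows.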

Course API

The user interface: a RESTful API that users can query, providing a basic, direct interface to the database. This component is mainly a search engine over the database.
More details are in the Course API's document.
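
The "search engine over the database" idea might look something like the sketch below. It only illustrates filtering with bound parameters: the `search_courses` function, the schema, and the sample rows are assumptions for the example, and `sqlite3` again stands in for the real database. The project's actual API layer is not shown.

```python
import sqlite3


def search_courses(conn, department=None, title_contains=None, limit=10):
    """Return courses matching the optional filters, using bound parameters."""
    clauses, params = [], []
    if department is not None:
        clauses.append("department = ?")
        params.append(department)
    if title_contains is not None:
        clauses.append("title LIKE ?")
        params.append(f"%{title_contains}%")
    where = f" WHERE {' AND '.join(clauses)}" if clauses else ""
    sql = (
        "SELECT course_id, title, department FROM courses"
        f"{where} ORDER BY course_id LIMIT ?"
    )
    return conn.execute(sql, (*params, limit)).fetchall()


# Assumed schema and placeholder rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE courses (course_id TEXT PRIMARY KEY, title TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO courses VALUES (?, ?, ?)",
    [
        ("course-a", "Example Course A", "Dept X"),
        ("course-b", "Example Course B", "Dept Y"),
        ("course-c", "Another Topic", "Dept X"),
    ],
)
results = search_courses(conn, department="Dept X", title_contains="Example")
```

Only the fixed clause text is assembled into the SQL string; user-supplied values always travel as bound parameters.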

Roadmap

  • Improve the database layer abstraction
  • Parser instances should be spawned only when needed, with one container per message
  • Currently, a scraper container is spawned whenever make up is run. It should be spawned only at the scheduled cron intervals.
  • Improve the GraphQL capabilities

About

Tracks new MIT OCW courses for my personal amusement
