
nimbus

Nimbus is a blog recommendation system designed to help users stay updated with the latest insights in technology. It aggregates blog updates from various sources, processes the data, and provides personalized recommendations through natural language queries.

Features

  • Automated Blog Fetching: Periodically collects blog updates from RSS/Atom feeds
  • Content Processing: Generates summaries and embeddings using OpenAI API
  • Vector Search: Fast similarity search using DuckDB with VSS extension
  • REST API: Real-time blog recommendations based on natural language queries
  • CLI Tool: Command-line interface for searching blogs
  • Interactive Mode: User-friendly interface with search history

Architecture Overview

Nimbus is designed as a serverless and cost-effective system deployed on Google Cloud Platform (GCP). The system consists of three main services:

1. Blog Fetcher

  • Purpose: Fetches blog updates from RSS/Atom feeds
  • Technology: Python with feedparser, BeautifulSoup
  • Storage: BigQuery for structured data storage
  • Deployment: Cloud Run Job triggered by Cloud Scheduler
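At its core, the fetch step parses each feed and strips markup from entry summaries. A minimal standard-library sketch of that idea (the actual service uses feedparser and BeautifulSoup, which handle Atom feeds, encodings, and malformed markup far more robustly):

```python
import html
import re
import xml.etree.ElementTree as ET

def parse_rss(xml_text: str) -> list[dict]:
    """Extract title/link/summary from a plain RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    entries = []
    for item in root.iter("item"):
        def text(tag: str) -> str:
            node = item.find(tag)
            return node.text.strip() if node is not None and node.text else ""
        # Strip HTML tags from the description, as BeautifulSoup would
        summary = re.sub(r"<[^>]+>", "", html.unescape(text("description")))
        entries.append({"title": text("title"), "link": text("link"), "summary": summary})
    return entries

feed = """<rss version="2.0"><channel><title>Example Blog</title>
<item><title>Post 1</title><link>https://example.com/1</link>
<description>&lt;p&gt;Hello &lt;b&gt;world&lt;/b&gt;&lt;/p&gt;</description></item>
</channel></rss>"""

print(parse_rss(feed))
```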

2. Blog Preprocessor

  • Purpose: Processes blog entries to generate summaries and embeddings
  • Technology:
    • OpenAI API for summarization (GPT-4o-mini)
    • OpenAI Embeddings API (text-embedding-3-small) for vectorization
    • DuckDB with VSS extension for vector storage
  • Features: Async processing, supports both local and GCS storage
  • Deployment: Cloud Run Job
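"Async processing" here typically means fanning out many API calls while capping concurrency to stay within rate limits. A minimal sketch of that pattern with asyncio (`process_entry` is a stand-in; the real job would call the OpenAI summarization and embedding APIs inside it):

```python
import asyncio

async def process_entry(entry: dict) -> dict:
    # Placeholder for the real work: summarize the entry with GPT-4o-mini
    # and embed the summary with text-embedding-3-small.
    await asyncio.sleep(0)  # simulate an I/O-bound API call
    return {**entry, "summary": entry["body"][:60], "embedding": [0.0] * 4}

async def process_all(entries: list[dict], max_concurrency: int = 5) -> list[dict]:
    # A semaphore caps in-flight API calls, a common pattern for rate limits.
    sem = asyncio.Semaphore(max_concurrency)
    async def bounded(e: dict) -> dict:
        async with sem:
            return await process_entry(e)
    return await asyncio.gather(*(bounded(e) for e in entries))

entries = [{"id": i, "body": f"post {i} text"} for i in range(3)]
results = asyncio.run(process_all(entries))
print([r["summary"] for r in results])
```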

3. Blog Recommender

  • Purpose: Provides REST API for blog recommendations
  • Technology:
    • FastAPI for REST API
    • DuckDB VSS for vector similarity search
    • HNSW index with cosine similarity
  • Features:
    • Real-time search with natural language queries
    • Complete metadata retrieval without BigQuery dependency
    • CLI tool with multiple output formats
  • Deployment: Cloud Run Service with auto-scaling
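Under the hood, a recommendation is a nearest-neighbor search over embeddings: the query is embedded, then blogs are ranked by cosine similarity. A pure-Python sketch of that ranking (the actual service delegates it to DuckDB's HNSW index, which answers the same query approximately and much faster):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query_vec: list[float], blogs: list[dict], k: int = 2) -> list[dict]:
    # Rank blog embeddings by cosine similarity to the query embedding.
    scored = sorted(blogs, key=lambda b: cosine_similarity(query_vec, b["embedding"]),
                    reverse=True)
    return scored[:k]

blogs = [
    {"title": "K8s security", "embedding": [1.0, 0.0, 0.1]},
    {"title": "Cooking pasta", "embedding": [0.0, 1.0, 0.0]},
    {"title": "Container hardening", "embedding": [0.9, 0.1, 0.2]},
]
print([b["title"] for b in top_k([1.0, 0.0, 0.0], blogs)])
```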

Quick Start

Prerequisites

  • Python 3.13+
  • uv (Python package manager)
  • Task (Taskfile)
  • Google Cloud SDK (for deployment)

Local Development

  1. Clone the repository
git clone https://github.com/uu64/nimbus.git
cd nimbus
  2. Set up each service
# Blog Fetcher
cd blog_fetcher
task install
task test

# Blog Preprocessor
cd ../blog_preprocessor
task install
cp .env.local.example .env.local  # Configure your OpenAI API key
task test

# Blog Recommender
cd ../blog_recommender
task install
task test
  3. Run the services
# In separate terminals:

# Terminal 1: Run blog fetcher (one-time)
cd blog_fetcher
task run

# Terminal 2: Run blog preprocessor (one-time)
cd blog_preprocessor
task run

# Terminal 3: Run blog recommender API
cd blog_recommender
task run
  4. Use the CLI tool
cd blog_recommender

# Search for blogs
uv run nimbus-cli search "Kubernetes security best practices"

# Interactive mode
uv run nimbus-cli interactive

# Check API health
uv run nimbus-cli health

CLI Usage

Basic Commands

# Search with natural language query
nimbus-cli search "Python async programming tips"

# Search with options
nimbus-cli search "Docker tutorials" --limit 5 --days 30

# Get different output formats
nimbus-cli search "React hooks" --format json
nimbus-cli search "Vue.js" --format simple

# Open first result in browser
nimbus-cli search "TypeScript" --open

# Get blog details
nimbus-cli detail <blog-id>

Interactive Mode

nimbus-cli interactive

# In interactive mode:
> Kubernetes security          # Search
> 1                           # Show details for result #1
> open 2                      # Open result #2 in browser
> help                        # Show available commands
> exit                        # Exit

Configuration

Environment Variables

Each service uses environment variables for configuration:

  • blog_fetcher: Uses the ENV variable to switch between local and production settings
  • blog_preprocessor: Requires .env file with OpenAI API key
  • blog_recommender: Optional NIMBUS_API_URL for CLI tool
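For blog_preprocessor, a minimal .env.local could contain just the key below. OPENAI_API_KEY is the variable name the official OpenAI SDK reads; copy any further settings from .env.local.example.

```
# Required: read by the OpenAI SDK for summarization and embeddings
OPENAI_API_KEY=sk-...
```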

Data Flow

  1. blog_fetcher → BigQuery (feed entries)
  2. BigQuery → blog_preprocessor → DuckDB (embeddings + metadata)
  3. DuckDB → blog_recommender → REST API/CLI

Deployment

Google Cloud Platform Setup

  1. Create a GCP project
  2. Enable required APIs: Cloud Run, BigQuery, Cloud Storage, Cloud Scheduler
  3. Set up service accounts with appropriate permissions

Deploy Services

# Deploy each service
cd blog_fetcher && task deploy
cd blog_preprocessor && task deploy
cd blog_recommender && task deploy

Schedule Jobs

  • blog_fetcher: Schedule with Cloud Scheduler (e.g., daily)
  • blog_preprocessor: Schedule after blog_fetcher completes
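Assuming the Cloud Run Jobs are named blog-fetcher and blog-preprocessor (names illustrative, as are PROJECT_ID, REGION, and SA_EMAIL), a daily trigger can be created with Cloud Scheduler's HTTP target against the Cloud Run Admin API:

```
# Illustrative: run the fetcher job daily at 06:00
gcloud scheduler jobs create http nimbus-fetch-daily \
  --location=REGION \
  --schedule="0 6 * * *" \
  --uri="https://REGION-run.googleapis.com/apis/run.googleapis.com/v1/namespaces/PROJECT_ID/jobs/blog-fetcher:run" \
  --http-method=POST \
  --oauth-service-account-email=SA_EMAIL
```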

Technology Stack

  • Languages: Python 3.13+
  • Frameworks: FastAPI, Typer, Rich
  • Databases: BigQuery, DuckDB with VSS
  • AI/ML: OpenAI API (GPT-4o-mini, text-embedding-3-small)
  • Cloud: Google Cloud Platform (Cloud Run, BigQuery, Cloud Storage)
  • Tools: uv, Task, pytest, ruff

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests: task test
  5. Format code: task fmt
  6. Submit a pull request

License

MIT License


Nimbus: Keeping you updated with the latest in tech, effortlessly.
