Memora 🧠

A distributed hybrid vector and graph database designed as an LLM long-term memory backend with human visibility layers. Built in Zig with Raft consensus for reliability, MCP integration for LLM access, and Cypher-like queries for human exploration.

🎯 Bifocal Memory System

Memora serves as a dual-purpose memory architecture:

🤖 For LLMs: High-performance semantic memory store via Model Context Protocol (MCP)
👨‍💻 For Humans: Transparent, queryable knowledge graph with web UI and audit capabilities

💡 Why This Is Powerful

Feature                         Description
🔁 Feedback Loop                LLMs read/write memories with purpose; humans inspect, audit, and correct
🔍 Visibility + Explainability  Trace answers back to source paragraphs, events, relationships
🔗 Shared World Model           Graph shows how concepts connect, not just what is stored
🧠 Long-Term Memory API         LLM stores observations, experiences, decisions
🔍 Memory Audit                 Devs and users can query what the LLM "knows"
🧩 Cross-Session Coherence      Persistent memory survives across prompts/sessions
🔗 Traceable Retrieval          Responses tied to graph+vector provenance
🧑‍💻 Human/LLM Co-curation        Humans can shape or clean memory alongside the model

πŸ—οΈ Architecture: Observability Dashboard for Machine Cognition

┌────────────────────────────────────────────────────────────────┐
│                         Memora System                          │
├────────────────────────────────────────────────────────────────┤
│  Access Layer                                                  │
│  ┌─────────────────┐  ┌──────────────────┐ ┌─────────────────┐ │
│  │   MCP Server    │  │   Web UI         │ │ Cypher-like QL  │ │
│  │ (LLM Interface) │  │ (Human Insight)  │ │ (Dev Queries)   │ │
│  └─────────────────┘  └──────────────────┘ └─────────────────┘ │
├────────────────────────────────────────────────────────────────┤
│  Consensus Layer (Raft Protocol)                               │
│  ┌─────────────────┐  ┌──────────────────┐ ┌─────────────────┐ │
│  │ Leader Election │  │ Log Replication  │ │ Fault Tolerance │ │
│  │ (150-300ms)     │  │ (TCP + CRC32)    │ │ (Majority)      │ │
│  └─────────────────┘  └──────────────────┘ └─────────────────┘ │
├────────────────────────────────────────────────────────────────┤
│  Memory Layer                                                  │
│  ┌─────────────────┐  ┌──────────────────┐ ┌─────────────────┐ │
│  │ Concept Graph   │  │ Semantic Vectors │ │ Memory Cache    │ │
│  │ (Knowledge Web) │  │ (Similarity)     │ │ (LRU + LFU)     │ │
│  └─────────────────┘  └──────────────────┘ └─────────────────┘ │
├────────────────────────────────────────────────────────────────┤
│  Storage Layer                                                 │
│  ┌─────────────────┐  ┌──────────────────┐ ┌─────────────────┐ │
│  │  Graph Index    │  │  Vector Index    │ │ Experience Log  │ │
│  │ (Memory-Mapped) │  │ (HNSW Structure) │ │ (Replicated)    │ │
│  └─────────────────┘  └──────────────────┘ └─────────────────┘ │
├────────────────────────────────────────────────────────────────┤
│  Persistence Layer                                             │
│  ┌─────────────────┐  ┌──────────────────┐ ┌─────────────────┐ │
│  │ Memory Snapshots│  │ Network Protocol │ │ S3 Sync         │ │
│  │ (Event Sourcing)│  │ (Binary + CRC32) │ │ (Snapshots)     │ │
│  └─────────────────┘  └──────────────────┘ └─────────────────┘ │
└────────────────────────────────────────────────────────────────┘

🌐 API Access

HTTP REST API

Method  Endpoint                      Description
GET     /api/v1/health                Health check and cluster status
GET     /api/v1/metrics               Database metrics and statistics
POST    /api/v1/nodes                 Insert a new node
GET     /api/v1/nodes/:id             Get node information
GET     /api/v1/nodes/:id/related     Get related nodes (graph traversal)
POST    /api/v1/edges                 Insert a new edge
POST    /api/v1/vectors               Insert a new vector
GET     /api/v1/vectors/:id/similar   Get similar vectors
POST    /api/v1/batch                 Batch insert nodes, edges, and vectors
POST    /api/v1/query/hybrid          Execute hybrid graph+vector queries
POST    /api/v1/snapshot              Create a database snapshot

# Health check and memory stats
curl http://localhost:8080/api/v1/health

# Store data
curl -X POST http://localhost:8080/api/v1/nodes \
  -H "Content-Type: application/json" \
  -d '{"id": 100, "label": "UserPreference"}'

# Query relationships
curl http://localhost:8080/api/v1/nodes/1/related

# Vector similarity
curl http://localhost:8080/api/v1/vectors/1/similar

# Hybrid queries
curl -X POST http://localhost:8080/api/v1/query/hybrid \
  -H "Content-Type: application/json" \
  -d '{"node_id": 1, "depth": 2, "top_k": 5}'

MCP Server Endpoints

# Start MCP server
zig build mcp-server --port 9090

# LLMs connect via MCP protocol
# Supports all MCP v1.0 capabilities:
# - Resource discovery
# - Tool invocation  
# - Streaming responses
# - Bidirectional communication

🚀 Quick Start

Prerequisites

  • Zig 0.14+ - Install Zig
  • MCP-compatible LLM - Claude, GPT-4, or custom implementation

Setup

# Clone Memora
git clone <repo-url> memora
cd memora

# Build and start HTTP server
zig build http-server

# Or start MCP server for LLM integration
zig build mcp-server

# Servers run on localhost:8080 (HTTP) and localhost:9090 (MCP)

Basic Usage

const std = @import("std");
const Memora = @import("src/main.zig").Memora;
const types = @import("src/types.zig");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const config = Memora.MemoraConfig{
        .data_path = "memora_data",
        .enable_persistent_indexes = true,
    };

    var db = try Memora.init(allocator, config);
    defer db.deinit();

    // Store knowledge as graph nodes and edges (vectors are added the same way; see the sketch below)
    try db.insertNode(types.Node.init(1, "UserPreference"));
    try db.insertEdge(types.Edge.init(1, 2, types.EdgeKind.related));
    
    const related = try db.queryRelated(1, 2);
    defer related.deinit();
}
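
Embeddings go through the same handle. The lines below continue the example above (inside main, after the edge insert) as a hedged sketch: db.querySimilar is the similarity query shown later in this README, while types.Vector.init and db.insertVector are assumed to mirror the node/edge calls above; check src/types.zig and src/main.zig for the exact signatures.

    // Assumed API (see note above): Vector.init(id, dims) builds a 128-dim
    // embedding record and insertVector stores it alongside the graph.
    const dims = [_]f32{0.1} ** 128; // placeholder embedding values
    try db.insertVector(types.Vector.init(3, &dims));

    // Top-5 nearest neighbours of vector 3
    const similar = try db.querySimilar(3, 5);
    defer similar.deinit();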

📊 Data Types

Core Types

// Node: Graph vertex with ID and label
const Node = packed struct {
    id: u64,
    label: [32]u8,
};

// Edge: Graph connection with relationship type
const Edge = packed struct {
    from: u64,
    to: u64,
    kind: u8, // EdgeKind enum value
};

// Vector: 128-dimensional embedding
const Vector = packed struct {
    id: u64,
    dims: [128]f32,
};

Edge Types

const EdgeKind = enum(u8) {
    owns = 0,
    links = 1,
    related = 2,
    child_of = 3,
    similar_to = 4,
};
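
Labels live in a fixed 32-byte field rather than a heap-allocated string, so shorter labels are zero-padded and longer ones are truncated. A small, self-contained sketch of that packing (illustration only, not Memora's own code; types.Node.init presumably does the equivalent):

const std = @import("std");

// Pack a short label into the fixed 32-byte field used by Node.label.
// Shorter labels are zero-padded; longer labels are truncated.
fn makeLabel(text: []const u8) [32]u8 {
    var buf = [_]u8{0} ** 32;
    const n = @min(text.len, buf.len);
    @memcpy(buf[0..n], text[0..n]);
    return buf;
}

pub fn main() void {
    const label = makeLabel("UserPreference");
    // sliceTo stops at the first zero byte, recovering the original text
    std.debug.print("{s}\n", .{std.mem.sliceTo(&label, 0)});
}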

πŸ” Query Capabilities

Graph Queries

// Find nodes within N hops
const related = try db.queryRelated(start_node_id, depth);

// Hybrid graph+vector queries
const hybrid_result = try db.queryHybrid(start_node_id, depth, top_k);
defer hybrid_result.deinit();

Vector Queries

// Find K most similar vectors
const similar = try db.querySimilar(vector_id, top_k);
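
The roadmap below notes that similarity search uses cosine similarity over the 128-dimensional embeddings. For reference, a self-contained sketch of that metric (not Memora's internal implementation):

const std = @import("std");

// Cosine similarity between two 128-dimensional embeddings:
// dot(a, b) / (|a| * |b|); ranges over [-1, 1], higher means more similar.
fn cosineSimilarity(a: [128]f32, b: [128]f32) f32 {
    var dot: f32 = 0;
    var norm_a: f32 = 0;
    var norm_b: f32 = 0;
    for (a, b) |x, y| {
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    return dot / (@sqrt(norm_a) * @sqrt(norm_b));
}

pub fn main() void {
    const a = [_]f32{1.0} ** 128;
    const b = [_]f32{0.5} ** 128;
    std.debug.print("similarity = {d:.3}\n", .{cosineSimilarity(a, b)});
}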

💾 Storage System

Iceberg-Style File Layout

memora/
├── metadata/
│   ├── snapshot-000001.json
│   ├── snapshot-000002.json
│   └── snapshot-000003.json
├── vectors/
│   ├── vec-000001.blob       # Binary vector data
│   ├── vec-000002.blob
│   └── vec-000003.blob
├── nodes/
│   ├── node-000001.json      # JSON node data
│   ├── node-000002.json
│   └── node-000003.json
├── edges/
│   ├── edge-000001.json      # JSON edge data
│   ├── edge-000002.json
│   └── edge-000003.json
└── memora.log               # Append-only binary log

☁️ Distributed Memory Architecture

Multi-Node Memory Clusters

# Start 3-node cluster
# Node 1 (Leader)
zig build run-distributed -- --node-id=1 --port=8001

# Node 2 (Follower)
zig build run-distributed -- --node-id=2 --port=8002

# Node 3 (Follower)
zig build run-distributed -- --node-id=3 --port=8003

Memory Replication & Consistency

  • Strong Consistency: All writes replicated via Raft consensus
  • Read Scaling: Can read from any node for performance
  • Partition Tolerance: Continues operating with majority of nodes
  • Automatic Recovery: Failed nodes catch up automatically when rejoining

🧪 Testing

Test Commands

# Run all tests
zig build test-all

# Test specific components
zig build test-raft          # Distributed consensus tests
zig build test-partitioning  # Data partitioning and consistent hashing tests
zig build test               # Core database tests  
zig build test-http-api      # HTTP REST API tests

# Fuzzing campaigns
zig build fuzz-quick         # Quick fuzzing (50 iterations)
zig build fuzz-stress        # Stress testing with large datasets

# Run distributed demo
zig build demo-distributed   # Interactive cluster demonstration
zig build gossip-demo        # Automatic node discovery demo

🎯 Performance Benchmarks

Operation Latency

Operation        Single Node  3-Node Cluster
Node Insert      ~200μs       ~2ms
Vector Query     ~500μs       ~500μs
Graph Traversal  ~1ms         ~1ms
Hybrid Query     ~2ms         ~2ms

Capacity

  • Nodes/Vectors: Tested with 100K+ items
  • Concurrent Connections: HTTP server handles 1000+ connections
  • Memory Usage: Efficient memory-mapped indexes
  • Storage: Compressed snapshots with S3 sync

🚀 Development Roadmap

✅ Foundation Systems (COMPLETE)

Core database infrastructure with deterministic, append-only architecture

  • ✅ Graph Database Core - Node/edge storage with adjacency lists and HNSW vector indexing
  • ✅ Vector Search Engine - 128-dimensional embeddings with cosine similarity and O(log n) performance
  • ✅ Deterministic Database - Append-only WAL, atomic transactions, snapshot consistency
  • ✅ Query Optimization Engine - Intelligent query planning and caching
  • ✅ Caching System - High-performance memory access with LRU/LFU policies
  • ✅ Parallel Processing System - Multi-threaded operations with load balancing
  • ✅ Memory-Mapped Persistent Indexes - Instant startup via disk-backed indexes
  • ✅ HTTP REST API - Production-ready web API for programmatic access
  • ✅ Monitoring & Metrics - Comprehensive operation observability
  • ✅ Advanced Configuration System - Production-ready configuration management

✅ LLM Memory Integration (COMPLETE) - Priority 1

Semantic memory types optimized for LLM workflows instead of generic nodes (sketched in code below)

  • ✅ LLM Memory Data Models - 10 semantic types: experience, concept, fact, decision, observation, preference, context, skill, intention, emotion
  • ✅ Confidence & Importance Tracking - 5-level granular metadata for memory relevance and reliability
  • ✅ Memory Sources & Provenance - Track origin: user_input, llm_inference, system_observation, external_api, computed
  • ✅ Semantic Relationships - 8 relationship types: similar_to, caused_by, supports, contradicts, co_occurred, sequence, contains, derives_from
  • ✅ LLM Session Management - Group memories by conversation/context with metadata tracking
  • ✅ Advanced Memory Querying - Filter by type, confidence, importance, session, time ranges
  • ✅ Model Context Protocol (MCP) Server - Native MCP v2.0 implementation with semantic memory tools
  • ✅ Memory Statistics & Analytics - Distribution tracking and usage patterns

Demo: Run zig build llm-memory-demo to see semantic memory storage and retrieval
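
The names above map naturally onto small tagged enums. The sketch below is hypothetical and uses only the type, relationship, and source names listed in this section; Memora's actual declarations and tag values may differ:

// Hypothetical sketch of the semantic memory model described above;
// field layout and tag values in Memora's source may differ.
const MemoryType = enum(u8) {
    experience, concept, fact, decision, observation,
    preference, context, skill, intention, emotion,
};

const MemoryRelation = enum(u8) {
    similar_to, caused_by, supports, contradicts,
    co_occurred, sequence, contains, derives_from,
};

const MemorySource = enum(u8) {
    user_input, llm_inference, system_observation, external_api, computed,
};

// 5-level confidence/importance metadata (level names are placeholders)
const Confidence = enum(u8) { very_low, low, medium, high, very_high };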

✅ Gossip Protocol & Node Discovery (COMPLETE) - Latest Addition

Automatic node discovery and cluster formation without manual configuration

  • ✅ Gossip Protocol Implementation - Epidemic-style node discovery with failure detection
  • ✅ Automatic Bootstrap - Zero-configuration cluster formation using seed nodes
  • ✅ Node Health Monitoring - Heartbeat-based failure detection and recovery
  • ✅ Dynamic Membership - Nodes can join/leave clusters automatically
  • ✅ Raft Integration - Discovered nodes automatically form Raft consensus clusters
  • ✅ Raft Protocol Implementation - Leader election, log replication, membership changes
  • ✅ Node Discovery - Automatic cluster formation and health monitoring
  • ✅ Data Partitioning - Consistent hashing for horizontal scaling with virtual nodes, load balancing, and automatic rebalancing (see the sketch after this list)
  • ✅ Failover & Recovery - Automatic leader failover, node recovery, state synchronization, and data repair
  • ✅ Conflict Resolution - Split-brain protection, vector clock conflict detection, and network partition handling
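
The Data Partitioning item above references consistent hashing with virtual nodes; the sketch below shows the general technique on a hash ring. It is illustrative only, not Memora's partitioner:

const std = @import("std");

// Minimal consistent-hashing sketch: each physical node owns several
// virtual points on a 64-bit hash ring, and a key belongs to the node
// whose virtual point comes first when walking clockwise from the key.
fn ringHash(bytes: []const u8) u64 {
    return std.hash.Wyhash.hash(0, bytes);
}

fn ownerOf(key: []const u8, node_ids: []const u8, virtual_per_node: u8) u8 {
    const key_hash = ringHash(key);
    var best_node = node_ids[0];
    var best_distance: u64 = std.math.maxInt(u64);
    for (node_ids) |node| {
        var v: u8 = 0;
        while (v < virtual_per_node) : (v += 1) {
            const point = ringHash(&[_]u8{ node, v });
            const distance = point -% key_hash; // clockwise distance on the ring
            if (distance < best_distance) {
                best_distance = distance;
                best_node = node;
            }
        }
    }
    return best_node;
}

pub fn main() void {
    const nodes = [_]u8{ 1, 2, 3 };
    std.debug.print("key 'UserPreference' -> node {d}\n", .{ownerOf("UserPreference", &nodes, 16)});
}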

🚧 Strategic Priority Systems

Priority 1: Human Visibility (Q1 2025) - Next Priority

Tools for humans to understand and manage LLM memory systems

  • MemQL Query Language - Cypher-like syntax for memory exploration and debugging
  • Web UI Memory Dashboard - Visual memory timeline, concept graphs, decision audit trails
  • Memory Audit & Curation Tools - Human interfaces for inspecting and correcting LLM memories
  • LLM Decision Provenance Tracking - Trace responses back to specific memory evidence

Priority 2: Production Operations (Q2 2025)

Enterprise-grade memory management and monitoring

  • Structured Memory Logging - Professional debugging and audit trails for memory operations
  • Memory Lifecycle Management - Automatic cleanup, archival, and importance-based retention
  • Multi-Tenant Memory - Isolated memory spaces for different LLMs/users/projects
  • Memory Analytics & Insights - Understanding LLM learning patterns and memory utilization

Priority 3: Advanced Memory Features (Q2-Q3 2025)

Next-generation semantic memory capabilities

  • Memory Compression & Summarization - Intelligent memory consolidation for long-term storage
  • Cross-Model Memory Sharing - Secure memory exchange between different LLM instances
  • Temporal Memory Reasoning - Time-aware memory retrieval and concept evolution tracking
  • Memory Contradiction Detection - Identify and resolve conflicting memories automatically

Priority 4: Scale & Reliability (Q3-Q4 2025)

Massive scale deployment and bulletproof reliability

  • Horizontal Memory Sharding - Distribute massive memory datasets across nodes
  • Real-time Memory Replication - Writer-reader replication with memory consistency guarantees
  • Memory Backup & Recovery - Point-in-time memory restoration and disaster recovery
  • Advanced Memory Security - Encryption, access control, and memory privacy protection

LLM Memory Use Cases 🤖

  • 📚 Long-term Conversational Memory - Remember user preferences, context, and history across sessions
  • 🧠 Knowledge Accumulation - Build persistent knowledge from multiple interactions and sources
  • πŸ” Contextual Decision Making - Access relevant past experiences for better current responses
  • πŸ“ˆ Learning & Adaptation - Track what works, what doesn't, and evolve interaction patterns
  • πŸ”— Cross-Domain Knowledge Transfer - Apply insights from one domain to related problems
  • 🎯 Personalization - Adapt communication style and content based on stored user models

🤝 Contributing to LLM Memory Infrastructure

  1. Fork the Memora repository
  2. Create a feature branch focused on LLM memory capabilities
  3. Add comprehensive tests including integration scenarios
  4. Run zig build test-all
  5. Submit a pull request with performance benchmarks

Development Guidelines

  • Memory First: All features should enhance LLM memory capabilities
  • Human Debuggable: Ensure human developers can inspect and understand stored data
  • Deterministic: Memory operations must be reproducible for testing
  • Performance Critical: Memory retrieval should be sub-millisecond
  • Privacy Aware: Design with multi-tenant memory isolation in mind

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • TigerBeetle: Inspiration for deterministic, high-performance memory backend design
  • Model Context Protocol (MCP): Standard for LLM tool integration and memory access
  • Apache Iceberg: Inspiration for immutable, time-travel capable memory snapshots
  • Zig Community: For the amazing language perfect for system-level LLM infrastructure

Memora: Building the memory layer for the age of AI 🧠✨

Vibe-coded with ❤️ by AIs and humans, for AIs and humans.
