BobAi 5f42751e9a 🛡️ Add comprehensive LLM safeguards and dual-mode demo scripts

🛡️ SMART MODEL SAFEGUARDS:
- Implement runaway prevention with pattern detection (repetition, thinking loops, rambling)
- Add context length management with optimal parameters per model size
- Quality validation prevents problematic responses before reaching users
- Helpful explanations when issues occur with recovery suggestions
- Model-specific parameter optimization (qwen3:0.6b vs 1.7b vs 3b+)
- Timeout protection and graceful degradation
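
The repetition check named above could look roughly like the following. This is a minimal sketch, not the project's actual detector: the function name `looks_like_runaway` and the `ngram` and `max_repeats` thresholds are hypothetical, chosen only for illustration.

```python
from collections import Counter


def looks_like_runaway(text: str, ngram: int = 4, max_repeats: int = 3) -> bool:
    """Return True if any run of `ngram` consecutive words occurs more than `max_repeats` times."""
    words = text.lower().split()
    if len(words) < ngram:
        return False  # too short to contain a repeated phrase
    # Count every sliding window of `ngram` words.
    grams = Counter(
        tuple(words[i:i + ngram]) for i in range(len(words) - ngram + 1)
    )
    return max(grams.values()) > max_repeats


looping = "the cat sat on the mat " * 5
normal = "Authentication is handled by the login middleware."
print(looks_like_runaway(looping), looks_like_runaway(normal))
```

A response flagged this way could be cut off and replaced with an explanation plus a recovery suggestion, in line with the "helpful explanations" bullet.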

⚡ OPTIMAL PERFORMANCE SETTINGS:
- Context window: 32k tokens for good balance
- Repeat penalty: 1.15 for 0.6b, 1.1 for 1.7b, 1.05 for larger models
- Presence penalty: 1.5 for quantized models to prevent repetition
- Smart output limits: 1500 tokens for 0.6b, 2000+ for larger models
- Top-p/top-k tuning based on research best practices
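
One way to express these per-model settings is a small lookup keyed on the parameter size in the model tag. The sketch below is an assumption-laden illustration: the function name `generation_params` is hypothetical, the option keys follow Ollama-style names (`num_ctx`, `repeat_penalty`, `num_predict`, `presence_penalty`), "32k" is taken to mean 32 * 1024, and "2000+" is represented by its floor of 2000.

```python
import re


def generation_params(model: str, quantized: bool = False) -> dict:
    """Pick sampling options from the size in a model tag such as 'qwen3:0.6b'."""
    match = re.search(r"(\d+(?:\.\d+)?)b\b", model.lower())
    # Unknown size: assume a larger model (an illustrative choice).
    size_b = float(match.group(1)) if match else float("inf")

    if size_b <= 0.6:
        repeat_penalty, max_tokens = 1.15, 1500
    elif size_b <= 1.7:
        repeat_penalty, max_tokens = 1.1, 2000
    else:
        repeat_penalty, max_tokens = 1.05, 2000

    params = {
        "num_ctx": 32 * 1024,          # "32k tokens"; exact figure assumed
        "repeat_penalty": repeat_penalty,
        "num_predict": max_tokens,     # output limit
    }
    if quantized:
        params["presence_penalty"] = 1.5  # only for quantized models
    return params


print(generation_params("qwen3:0.6b", quantized=True))
```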

🎬 DUAL-MODE DEMO SCRIPTS:
- create_synthesis_demo.py: Shows fast search with AI synthesis workflow
- create_exploration_demo.py: Interactive thinking mode with conversation memory
- Realistic typing simulation and response timing for quality GIFs
- Clear demonstration of when to use each mode

Perfect for creating compelling demo videos showing both RAG experiences!

2025-08-12 19:07:48 +10:00

| Name | Last commit | Date |
|---|---|---|
| assets | Initial release: FSS-Mini-RAG - Lightweight semantic code search system | 2025-08-12 16:38:28 +10:00 |
| claude_rag | 🛡️ Add comprehensive LLM safeguards and dual-mode demo scripts | 2025-08-12 19:07:48 +10:00 |
| docs | 🎓 Complete beginner-friendly polish with production reliability | 2025-08-12 18:59:24 +10:00 |
| examples | 🎓 Complete beginner-friendly polish with production reliability | 2025-08-12 18:59:24 +10:00 |
| recordings | Initial release: FSS-Mini-RAG - Lightweight semantic code search system | 2025-08-12 16:38:28 +10:00 |
| reports | Complete two-mode architecture documentation and testing | 2025-08-12 18:22:19 +10:00 |
| tests | Complete two-mode architecture documentation and testing | 2025-08-12 18:22:19 +10:00 |
| .gitignore | Initial release: FSS-Mini-RAG - Lightweight semantic code search system | 2025-08-12 16:38:28 +10:00 |
| asciinema_to_gif.py | Initial release: FSS-Mini-RAG - Lightweight semantic code search system | 2025-08-12 16:38:28 +10:00 |
| create_demo_script.py | Integrate LLM synthesis across all interfaces and update demo | 2025-08-12 17:13:21 +10:00 |
| create_exploration_demo.py | 🛡️ Add comprehensive LLM safeguards and dual-mode demo scripts | 2025-08-12 19:07:48 +10:00 |
| create_synthesis_demo.py | 🛡️ Add comprehensive LLM safeguards and dual-mode demo scripts | 2025-08-12 19:07:48 +10:00 |
| GET_STARTED.md | Initial release: FSS-Mini-RAG - Lightweight semantic code search system | 2025-08-12 16:38:28 +10:00 |
| install_mini_rag.sh | Initial release: FSS-Mini-RAG - Lightweight semantic code search system | 2025-08-12 16:38:28 +10:00 |
| rag-mini | Initial release: FSS-Mini-RAG - Lightweight semantic code search system | 2025-08-12 16:38:28 +10:00 |
| rag-mini-enhanced | Integrate LLM synthesis across all interfaces and update demo | 2025-08-12 17:13:21 +10:00 |
| rag-mini.py | 🎓 Complete beginner-friendly polish with production reliability | 2025-08-12 18:59:24 +10:00 |
| rag-tui | Initial release: FSS-Mini-RAG - Lightweight semantic code search system | 2025-08-12 16:38:28 +10:00 |
| rag-tui.py | Complete two-mode architecture documentation and testing | 2025-08-12 18:22:19 +10:00 |
| README.md | Complete two-mode architecture documentation and testing | 2025-08-12 18:22:19 +10:00 |
| record_demo.sh | Initial release: FSS-Mini-RAG - Lightweight semantic code search system | 2025-08-12 16:38:28 +10:00 |
| requirements-full.txt | Initial release: FSS-Mini-RAG - Lightweight semantic code search system | 2025-08-12 16:38:28 +10:00 |
| requirements.txt | Initial release: FSS-Mini-RAG - Lightweight semantic code search system | 2025-08-12 16:38:28 +10:00 |
| run_mini_rag.sh | Initial release: FSS-Mini-RAG - Lightweight semantic code search system | 2025-08-12 16:38:28 +10:00 |

README.md

FSS-Mini-RAG

A lightweight, educational RAG system that actually works
Built for beginners who want results, and developers who want to understand how RAG really works

How It Works

graph LR
    Files[📁 Your Code] --> Index[🔍 Index]
    Index --> Chunks[✂️ Smart Chunks]
    Chunks --> Embeddings[🧠 Semantic Vectors]
    Embeddings --> Database[(💾 Vector DB)]
    
    Query["❓ 'user auth'"] --> Search[🎯 Hybrid Search]
    Database --> Search
    Search --> Results[📋 Ranked Results]
    
    style Files fill:#e3f2fd
    style Results fill:#e8f5e8
    style Database fill:#fff3e0
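
The flow in the diagram can be sketched end to end in a few lines of standard-library Python. This is an illustration under stated assumptions, not the project's implementation: `chunk_python`, `embed`, `search`, and `DIM` are hypothetical names, the "vector DB" is a plain list, the embedding is a simple token-hashing stand-in (it matches shared words, not meaning), and the search is vector-only rather than hybrid.

```python
import ast
import hashlib
import math
import re
from typing import List, Tuple

DIM = 256  # illustrative vector size


def chunk_python(source: str) -> List[str]:
    """Split Python source into one chunk per top-level function or class."""
    tree = ast.parse(source)
    return [
        ast.get_source_segment(source, node)
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]


def embed(text: str) -> List[float]:
    """Hash each token into a bucket, then scale the vector to unit length."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def search(query: str, index: List[Tuple[str, List[float]]], top_k: int = 3):
    """Rank chunks by cosine similarity (all vectors are already unit length)."""
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, vec)), chunk) for chunk, vec in index]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


source = '''
def login_user(name, password):
    return check_password(name, password)

def render_chart(points):
    return draw(points)
'''

# "Vector DB": a list of (chunk, vector) pairs kept in memory.
index = [(chunk, embed(chunk)) for chunk in chunk_python(source)]
best_score, best_chunk = search("user login password", index)[0]
print(best_chunk.splitlines()[0])
```

Swapping `embed` for a real embedding model is what turns this word-overlap ranking into the semantic search the diagram describes.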

What This Is

FSS-Mini-RAG is a distilled, lightweight implementation of a production-quality RAG (Retrieval-Augmented Generation) search system. It grew out of two years of building, refining, and tuning RAG systems, ranging from enterprise-scale solutions handling 14,000 queries/second to lightweight implementations that anyone can install and understand.

The Problem This Solves: Most RAG implementations are either too simple (poor results) or too complex (impossible to understand and modify). FSS-Mini-RAG bridges that gap.

Two Powerful Modes

FSS-Mini-RAG offers two distinct experiences optimized for different use cases:

🚀 Synthesis Mode - Fast & Consistent

./rag-mini search ~/project "authentication logic" --synthesize

Perfect for: Quick answers, code discovery, fast lookups
Speed: Lightning-fast responses (no thinking overhead)
Quality: Consistent, reliable results

🧠 Exploration Mode - Deep & Interactive

./rag-mini explore ~/project
> How does authentication work in this codebase?
> Why is the login function slow?
> What security concerns should I be aware of?

Perfect for: Learning codebases, debugging, detailed analysis
Features: Thinking-enabled LLM, conversation memory, follow-up questions
Quality: Deep reasoning with full context awareness
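
Conversation memory of the kind listed above can be approximated by keeping the last few question/answer pairs and prepending them to each new prompt. The sketch below is hypothetical: the class `ConversationMemory`, its `max_turns` limit, and the prompt layout are assumptions for illustration and are not taken from the project's code.

```python
from collections import deque
from typing import Deque, List, Tuple


class ConversationMemory:
    """Keep the most recent question/answer pairs for follow-up questions."""

    def __init__(self, max_turns: int = 5) -> None:
        # A bounded deque silently drops the oldest turn once it is full.
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=max_turns)

    def add(self, question: str, answer: str) -> None:
        self.turns.append((question, answer))

    def build_prompt(self, question: str, context_chunks: List[str]) -> str:
        """Combine earlier turns, retrieved code, and the new question."""
        history = "\n".join(f"Q: {q}\nA: {a}" for q, a in self.turns)
        context = "\n---\n".join(context_chunks)
        return (
            f"Previous conversation:\n{history or '(none)'}\n\n"
            f"Relevant code:\n{context}\n\n"
            f"Question: {question}"
        )


memory = ConversationMemory(max_turns=2)
memory.add("How does authentication work?", "Through the login middleware.")
print(memory.build_prompt("Why is the login function slow?", ["def login(): ..."]))
```

Bounding the history keeps the prompt inside the model's context window, at the cost of forgetting the earliest turns.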

Quick Start (2 Minutes)

# 1. Install everything
./install_mini_rag.sh

# 2. Choose your interface
./rag-tui                         # Friendly interface for beginners
# OR choose your mode:
./rag-mini index ~/my-project     # Index your project first
./rag-mini search ~/my-project "query" --synthesize  # Fast synthesis
./rag-mini explore ~/my-project   # Interactive exploration

That's it. No external services to set up, no configuration required, and no PhD in computer science needed.

What Makes This Different

For Beginners

Just works - Zero configuration required
Multiple interfaces - TUI for learning, CLI for speed
Educational - Shows you CLI commands as you use the TUI
Solid results - Finds code by meaning, not just keywords

For Developers

Hackable - Clean, documented code you can actually modify
Configurable - YAML config for everything, or change the code directly
Multiple embedding options - Ollama, ML models, or hash-based
Production patterns - Streaming, batching, error handling, monitoring

For Learning

Complete technical documentation - How chunking, embedding, and search actually work
Educational tests - See the system in action with real examples
No magic - Every decision explained, every component documented

Usage Examples

Find Code by Concept

./rag-mini search ~/project "user authentication"
# Finds: login functions, auth middleware, session handling, password validation

Natural Language Queries

./rag-mini search ~/project "error handling for database connections"
# Finds: try/catch blocks, connection pool error handlers, retry logic

Development Workflow

./rag-mini index ~/new-project              # Index once
./rag-mini search ~/new-project "API endpoints"   # Search as needed
./rag-mini status ~/new-project            # Check index health

Installation Options

Recommended: Full Installation

./install_mini_rag.sh
# Handles Python setup, dependencies, optional AI models

Experimental: Copy & Run (May Not Work)

# Copy folder anywhere and try to run directly
./rag-mini index ~/my-project
# Auto-setup will attempt to create environment
# Falls back with clear instructions if it fails

Manual Setup

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Note: The experimental copy & run feature is provided for convenience but may fail on some systems. If you encounter issues, use the full installer for reliable setup.

System Requirements

Python 3.8+ (installer checks and guides setup)
Optional: Ollama (for best search quality - installer helps set up)
Fallback: Works without external dependencies (uses built-in embeddings)

Project Philosophy

This implementation prioritizes:

Educational Value - You can understand and modify every part
Practical Results - Actually finds relevant code, not just keyword matches
Zero Friction - Works out of the box, configurable when needed
Real-world Patterns - Production techniques in beginner-friendly code

What's Inside

Hybrid embedding system - Ollama → ML → Hash fallbacks
Smart chunking - Language-aware code parsing
Vector + keyword search - Best of both worlds
Streaming architecture - Handles large codebases efficiently
Multiple interfaces - TUI, CLI, Python API, server mode
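
Two of the ideas above, the embedding fallback chain and the vector + keyword blend, can be sketched as follows. Everything here is illustrative: `embed_with_fallback`, `hybrid_score`, the stub backends, and the `alpha` weight of 0.7 are assumptions, and the README does not state how the project actually combines scores.

```python
from typing import Callable, List, Sequence

Embedder = Callable[[str], List[float]]


def embed_with_fallback(text: str, backends: Sequence[Embedder]) -> List[float]:
    """Try each embedding backend in order and return the first success."""
    errors = []
    for backend in backends:
        try:
            return backend(text)
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(f"{getattr(backend, '__name__', backend)}: {exc}")
    raise RuntimeError("all embedding backends failed: " + "; ".join(errors))


def hybrid_score(vector_score: float, keyword_score: float, alpha: float = 0.7) -> float:
    """Blend semantic and keyword relevance; alpha weights the vector side."""
    return alpha * vector_score + (1.0 - alpha) * keyword_score


def ollama_embed(text: str) -> List[float]:
    """Stub for a preferred backend that is currently unreachable."""
    raise ConnectionError("Ollama is not running")


def hash_embed(text: str) -> List[float]:
    """Stub for the always-available last resort."""
    return [1.0, 0.0]


print(embed_with_fallback("user auth", [ollama_embed, hash_embed]))
print(hybrid_score(vector_score=0.8, keyword_score=0.5))
```

Ordering the backends from best to most available means quality degrades gradually instead of the search failing outright.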

Next Steps

New users: Run ./rag-mini for a guided experience
Developers: Read TECHNICAL_GUIDE.md for implementation details
Contributors: See CONTRIBUTING.md for development setup

Documentation

Quick Start Guide - Get running in 5 minutes
Visual Diagrams - 📊 System flow charts and architecture diagrams
TUI Guide - Complete walkthrough of the friendly interface
Technical Guide - How the system actually works
Configuration Guide - Customizing for your needs
Development Guide - Extending and modifying the code

License

MIT - Use it, learn from it, build on it.

Built by someone who got frustrated with RAG implementations that were either too simple to be useful or too complex to understand. This is the system I wish I'd found when I started.

Languages

Python 84.1%

Shell 9.1%

PowerShell 4.8%

Batchfile 1.8%

Makefile 0.2%