# RAG System - Hybrid Mode Setup
This RAG system can operate in three modes:
## 🚀 **Mode 1: Ollama Only (Recommended - Lightweight)**
```bash
pip install -r requirements-light.txt
# Requires a running Ollama server (`ollama serve`) with the nomic-embed-text model
```
- **Size**: ~426MB total
- **Performance**: Fastest (leverages Ollama)
- **Network**: Uses local Ollama server
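If Ollama is installed but the model is not yet present, the standard Ollama CLI covers both prerequisites:
```bash
# Start the local Ollama server (skip if it is already running).
ollama serve &

# Pull the embedding model used by this system.
ollama pull nomic-embed-text
```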
## 🔄 **Mode 2: Hybrid (Best of Both Worlds)**
```bash
pip install -r requirements-full.txt
# Works with OR without Ollama
```
- **Size**: ~3GB total (includes ML fallback)
- **Resilience**: Automatic fallback if Ollama unavailable
- **Performance**: Ollama speed when available, ML fallback when needed
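To confirm the fallback actually engages, stop the Ollama server and inspect the status (this reuses the API from the Status Check section below):
```python
# With `ollama serve` stopped, hybrid mode should report the ML fallback.
from claude_rag.ollama_embeddings import OllamaEmbedder

embedder = OllamaEmbedder()
status = embedder.get_status()
assert status["fallback_available"], "install requirements-full.txt first"
print(f"Active mode without Ollama: {status['mode']}")
```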
## 🛡️ **Mode 3: ML Only (Maximum Compatibility)**
```bash
pip install -r requirements-full.txt
# Set "provider": "fallback" in the config (see Configuration below)
```
- **Size**: ~3GB total
- **Compatibility**: Works anywhere; no external services required
- **Use case**: Offline environments, embedded systems
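A minimal way to express this, assuming partial configs merge with the defaults (if they don't, copy the full block from the Configuration section and change `provider`):
```json
{
  "embedding": {
    "provider": "fallback"
  }
}
```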
## 🔧 **Configuration**
Edit `.claude-rag/config.json` in your project. JSON does not allow `//` comments, so the options are annotated below the block instead:
```json
{
  "embedding": {
    "provider": "hybrid",
    "model": "nomic-embed-text:latest",
    "base_url": "http://localhost:11434",
    "enable_fallback": true
  }
}
```
- `provider`: one of `"hybrid"`, `"ollama"`, or `"fallback"`
- `enable_fallback`: set to `false` to disable the ML fallback
## 📊 **Status Check**
```python
from claude_rag.ollama_embeddings import OllamaEmbedder
embedder = OllamaEmbedder()
status = embedder.get_status()
print(f"Mode: {status['mode']}")
print(f"Ollama: {'✅' if status['ollama_available'] else '❌'}")
print(f"ML Fallback: {'✅' if status['fallback_available'] else '❌'}")
```
## 🎯 **Automatic Behavior**
1. **Try Ollama first** - fastest and most efficient
2. **Fall back to ML** - if Ollama is unavailable and the ML dependencies are installed
3. **Use hash fallback** - deterministic embeddings as a last resort
The system automatically detects what's available and uses the best option. A minimal sketch of the cascade follows.
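This is illustrative only, not the package's actual code: the helper names are hypothetical, and the fallback model `all-MiniLM-L6-v2` is a placeholder for whatever the ML tier actually loads.
```python
import hashlib


def _try_ollama(text: str):
    """Tier 1: embed via a local Ollama server; None if unreachable."""
    try:
        import requests
        resp = requests.post(
            "http://localhost:11434/api/embeddings",
            json={"model": "nomic-embed-text", "prompt": text},
            timeout=5,
        )
        resp.raise_for_status()
        return resp.json()["embedding"]
    except Exception:
        return None


def _try_ml_model(text: str):
    """Tier 2: embed via a local sentence-transformers model, if installed."""
    try:
        from sentence_transformers import SentenceTransformer
        model = SentenceTransformer("all-MiniLM-L6-v2")
        return model.encode(text).tolist()
    except Exception:
        return None


def _hash_embedding(text: str, dim: int = 64) -> list:
    """Tier 3: deterministic last-resort vector from a SHA-256 digest."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [digest[i % len(digest)] / 255.0 for i in range(dim)]


def embed(text: str) -> list:
    """Return the first embedding that succeeds, best tier first."""
    return _try_ollama(text) or _try_ml_model(text) or _hash_embedding(text)
```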