Complete deployment expansion and system context integration

Major enhancements:
• Add comprehensive deployment guide covering all platforms (mobile, edge, cloud)
• Implement system context collection for enhanced AI responses
• Update documentation with current workflows and deployment scenarios
• Fix Windows compatibility bugs in file locking system
• Enhanced diagrams with system context integration flow
• Improved exploration mode with better context handling

Platform support expanded:
• Full macOS compatibility verified
• Raspberry Pi deployment with ARM64 optimizations
• Android deployment via Termux with configuration examples
• Edge device deployment strategies and performance guidelines
• Docker containerization for universal deployment

Technical improvements:
• System context module provides OS/environment awareness to AI
• Context-aware prompts improve response relevance
• Enhanced error handling and graceful fallbacks
• Better integration between synthesis and exploration modes

Documentation updates:
• Complete deployment guide with troubleshooting
• Updated getting started guide with current installation flows
• Enhanced visual diagrams showing system architecture
• Platform-specific configuration examples

Ready for extended deployment testing and user feedback.
This commit is contained in:
FSSCoding 2025-08-16 12:31:16 +10:00
parent 8e67c76c6d
commit f5de046f95
7 changed files with 875 additions and 156 deletions

381
docs/DEPLOYMENT_GUIDE.md Normal file
View File

@ -0,0 +1,381 @@
# FSS-Mini-RAG Deployment Guide
> **Run semantic search anywhere - from smartphones to edge devices**
> *Complete guide to deploying FSS-Mini-RAG on every platform imaginable*
## Platform Compatibility Matrix
| Platform | Status | AI Features | Installation | Notes |
|----------|--------|-------------|--------------|-------|
| **Linux** | ✅ Full | ✅ Full | `./install_mini_rag.sh` | Primary platform |
| **Windows** | ✅ Full | ✅ Full | `install_windows.bat` | Desktop shortcuts |
| **macOS** | ✅ Full | ✅ Full | `./install_mini_rag.sh` | Works perfectly |
| **Raspberry Pi** | ✅ Excellent | ✅ AI ready | `./install_mini_rag.sh` | ARM64 optimized |
| **Android (Termux)** | ✅ Good | 🟡 Limited | Manual install | Terminal interface |
| **iOS (a-Shell)** | 🟡 Limited | ❌ Text only | Manual install | Sandbox limitations |
| **Docker** | ✅ Excellent | ✅ Full | Dockerfile | Any platform |
## Desktop & Server Deployment
### 🐧 **Linux** (Primary Platform)
```bash
# Full installation with AI features
./install_mini_rag.sh
# What you get:
# ✅ Desktop shortcuts (.desktop files)
# ✅ Application menu integration
# ✅ Full AI model downloads
# ✅ Complete terminal interface
```
### 🪟 **Windows** (Fully Supported)
```cmd
# Full installation with desktop integration
install_windows.bat
# What you get:
# ✅ Desktop shortcuts (.lnk files)
# ✅ Start Menu entries
# ✅ Full AI model downloads
# ✅ Beautiful terminal interface
```
### 🍎 **macOS** (Excellent Support)
```bash
# Same as Linux - works perfectly
./install_mini_rag.sh
# Additional macOS optimizations:
brew install python3 # If needed
brew install ollama # For AI features
```
**macOS-specific features:**
- Automatic path detection for common project locations
- Integration with Spotlight search locations
- Support for `.app` bundle creation (advanced)
## Edge Device Deployment
### 🥧 **Raspberry Pi** (Recommended Edge Platform)
**Perfect for:**
- Home lab semantic search server
- Portable development environment
- IoT project documentation search
- Offline code search station
**Installation:**
```bash
# On Raspberry Pi OS (64-bit recommended)
sudo apt update && sudo apt upgrade
./install_mini_rag.sh
# The installer automatically detects ARM and optimizes:
# ✅ Suggests lightweight models (qwen3:0.6b)
# ✅ Reduces memory usage
# ✅ Enables efficient chunking
```
**Raspberry Pi optimized config:**
```yaml
# Automatically generated for Pi
embedding:
preferred_method: ollama
ollama_model: nomic-embed-text # 270MB - perfect for Pi
llm:
synthesis_model: qwen3:0.6b # 500MB - fast on Pi 4+
context_window: 4096 # Conservative memory use
cpu_optimized: true
chunking:
max_size: 1500 # Smaller chunks for efficiency
```
**Performance expectations:**
- **Pi 4 (4GB)**: Excellent performance, full AI features
- **Pi 4 (2GB)**: Good performance, text-only or small models
- **Pi 5**: Outstanding performance, handles large models
- **Pi Zero**: Text-only search (hash-based embeddings)
### 🔧 **Other Edge Devices**
**NVIDIA Jetson Series:**
- Overkill performance for this use case
- Can run largest models with GPU acceleration
- Perfect for AI-heavy development workstations
**Intel NUC / Mini PCs:**
- Excellent performance
- Full desktop experience
- Can serve multiple users simultaneously
**Orange Pi / Rock Pi:**
- Similar to Raspberry Pi
- Same installation process
- May need manual Ollama compilation
## Mobile Deployment
### 📱 **Android (Recommended: Termux)**
**Installation in Termux:**
```bash
# Install Termux from F-Droid (not Play Store)
# In Termux:
pkg update && pkg upgrade
pkg install python python-pip git
pip install --upgrade pip
# Clone and install FSS-Mini-RAG
git clone https://github.com/your-repo/fss-mini-rag
cd fss-mini-rag
pip install -r requirements.txt
# Quick start
python -m mini_rag index /storage/emulated/0/Documents/myproject
python -m mini_rag search /storage/emulated/0/Documents/myproject "your query"
```
**Android-optimized config:**
```yaml
# config-android.yaml
embedding:
preferred_method: hash # No heavy models needed
chunking:
max_size: 800 # Small chunks for mobile
files:
min_file_size: 20 # Include more small files
llm:
enable_synthesis: false # Text-only for speed
```
**What works on Android:**
- ✅ Full text search and indexing
- ✅ Terminal interface (`rag-tui`)
- ✅ Project indexing from phone storage
- ✅ Search your phone's code projects
- ❌ Heavy AI models (use cloud providers instead)
**Android use cases:**
- Search your mobile development projects
- Index documentation on your phone
- Quick code reference while traveling
- Offline search of downloaded repositories
### 🍎 **iOS (Limited but Possible)**
**Option 1: a-Shell (Free)**
```bash
# Install a-Shell from App Store
# In a-Shell:
pip install requests pathlib
# Limited installation (core features only)
# Files must be in app sandbox
```
**Option 2: iSH (Alpine Linux)**
```bash
# Install iSH from App Store
# In iSH terminal:
apk add python3 py3-pip git
pip install -r requirements-light.txt
# Basic functionality only
```
**iOS limitations:**
- Sandbox restricts file access
- No full AI model support
- Terminal interface only
- Limited to app-accessible files
## Specialized Deployment Scenarios
### 🐳 **Docker Deployment**
**For any platform with Docker:**
```dockerfile
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
# Expose ports for server mode
EXPOSE 7777
# Default to TUI interface
CMD ["python", "-m", "mini_rag.cli"]
```
**Usage:**
```bash
# Build and run
docker build -t fss-mini-rag .
docker run -it -v $(pwd)/projects:/projects fss-mini-rag
# Server mode for web access
docker run -p 7777:7777 fss-mini-rag python -m mini_rag server
```
### ☁️ **Cloud Deployment**
**AWS/GCP/Azure VM:**
- Same as Linux installation
- Can serve multiple users
- Perfect for team environments
**GitHub Codespaces:**
```bash
# Works in any Codespace
./install_mini_rag.sh
# Perfect for searching your workspace
```
**Replit/CodeSandbox:**
- Limited by platform restrictions
- Basic functionality available
### 🏠 **Home Lab Integration**
**Home Assistant Add-on:**
- Package as Home Assistant add-on
- Search home automation configs
- Voice integration possible
**NAS Integration:**
- Install on Synology/QNAP
- Search all stored documents
- Family code documentation
**Router with USB:**
- Install on OpenWrt routers with USB storage
- Search network documentation
- Configuration management
## Configuration by Use Case
### 🪶 **Ultra-Lightweight (Old hardware, mobile)**
```yaml
# Minimal resource usage
embedding:
preferred_method: hash
chunking:
max_size: 800
strategy: fixed
llm:
enable_synthesis: false
```
### ⚖️ **Balanced (Raspberry Pi, older laptops)**
```yaml
# Good performance with AI features
embedding:
preferred_method: ollama
ollama_model: nomic-embed-text
llm:
synthesis_model: qwen3:0.6b
context_window: 4096
```
### 🚀 **Performance (Modern hardware)**
```yaml
# Full features and performance
embedding:
preferred_method: ollama
ollama_model: nomic-embed-text
llm:
synthesis_model: qwen3:1.7b
context_window: 16384
enable_thinking: true
```
### ☁️ **Cloud-Hybrid (Mobile + Cloud AI)**
```yaml
# Local search, cloud intelligence
embedding:
preferred_method: hash
llm:
provider: openai
api_key: your_api_key
synthesis_model: gpt-4
```
## Troubleshooting by Platform
### **Raspberry Pi Issues**
- **Out of memory**: Reduce context window, use smaller models
- **Slow indexing**: Use hash-based embeddings
- **Model download fails**: Check internet, use smaller models
### **Android/Termux Issues**
- **Permission denied**: Use `termux-setup-storage`
- **Package install fails**: Update packages first
- **Can't access files**: Use `/storage/emulated/0/` paths
### **iOS Issues**
- **Limited functionality**: Expected due to iOS restrictions
- **Can't install packages**: Use lighter requirements file
- **File access denied**: Files must be in app sandbox
### **Edge Device Issues**
- **ARM compatibility**: Ensure using ARM64 Python packages
- **Limited RAM**: Use hash embeddings, reduce chunk sizes
- **No internet**: Skip AI model downloads, use text-only
## Advanced Edge Deployments
### **IoT Integration**
- Index sensor logs and configurations
- Search device documentation
- Troubleshoot IoT deployments
### **Offline Development**
- Complete development environment on edge device
- No internet required after setup
- Perfect for remote locations
### **Educational Use**
- Raspberry Pi computer labs
- Student project search
- Coding bootcamp environments
### **Enterprise Edge**
- Factory floor documentation search
- Field service technical reference
- Remote site troubleshooting
---
## Quick Start by Platform
### Desktop Users
```bash
# Linux/macOS
./install_mini_rag.sh
# Windows
install_windows.bat
```
### Edge/Mobile Users
```bash
# Raspberry Pi
./install_mini_rag.sh
# Android (Termux)
pkg install python git && pip install -r requirements.txt
# Any Docker platform
docker run -it fss-mini-rag
```
**💡 Pro tip**: Start with your current platform, then expand to edge devices as needed. The system scales from smartphones to servers seamlessly!

View File

@ -11,6 +11,7 @@
- [Search Architecture](#search-architecture)
- [Installation Flow](#installation-flow)
- [Configuration System](#configuration-system)
- [System Context Integration](#system-context-integration)
- [Error Handling](#error-handling)
## System Overview
@ -22,10 +23,12 @@ graph TB
CLI --> Index[📁 Index Project]
CLI --> Search[🔍 Search Project]
CLI --> Explore[🧠 Explore Project]
CLI --> Status[📊 Show Status]
TUI --> Index
TUI --> Search
TUI --> Explore
TUI --> Config[⚙️ Configuration]
Index --> Files[📄 File Discovery]
@ -34,17 +37,32 @@ graph TB
Embed --> Store[💾 Vector Database]
Search --> Query[❓ User Query]
Search --> Context[🖥️ System Context]
Query --> Vector[🎯 Vector Search]
Query --> Keyword[🔤 Keyword Search]
Vector --> Combine[🔄 Hybrid Results]
Keyword --> Combine
Combine --> Results[📋 Ranked Results]
Context --> Combine
Combine --> Synthesize{Synthesis Mode?}
Synthesize -->|Yes| FastLLM[⚡ Fast Synthesis]
Synthesize -->|No| Results[📋 Ranked Results]
FastLLM --> Results
Explore --> ExploreQuery[❓ Interactive Query]
ExploreQuery --> Memory[🧠 Conversation Memory]
ExploreQuery --> Context
Memory --> DeepLLM[🤔 Deep AI Analysis]
Context --> DeepLLM
Vector --> DeepLLM
DeepLLM --> Interactive[💬 Interactive Response]
Store --> LanceDB[(🗄️ LanceDB)]
Vector --> LanceDB
Config --> YAML[📝 config.yaml]
Status --> Manifest[📋 manifest.json]
Context --> SystemInfo[💻 OS, Python, Paths]
```
## User Journey
@ -276,6 +294,58 @@ flowchart TD
style Error fill:#ffcdd2
```
## System Context Integration
```mermaid
graph LR
subgraph "System Detection"
OS[🖥️ Operating System]
Python[🐍 Python Version]
Project[📁 Project Path]
OS --> Windows[Windows: rag.bat]
OS --> Linux[Linux: ./rag-mini]
OS --> macOS[macOS: ./rag-mini]
end
subgraph "Context Collection"
Collect[🔍 Collect Context]
OS --> Collect
Python --> Collect
Project --> Collect
Collect --> Format[📝 Format Context]
Format --> Limit[✂️ Limit to 200 chars]
end
subgraph "AI Integration"
UserQuery[❓ User Query]
SearchResults[📋 Search Results]
SystemContext[💻 System Context]
UserQuery --> Prompt[📝 Build Prompt]
SearchResults --> Prompt
SystemContext --> Prompt
Prompt --> AI[🤖 LLM Processing]
AI --> Response[💬 Contextual Response]
end
subgraph "Enhanced Responses"
Response --> Commands[💻 OS-specific commands]
Response --> Paths[📂 Correct path formats]
Response --> Tips[💡 Platform-specific tips]
end
Format --> SystemContext
style SystemContext fill:#e3f2fd
style Response fill:#f3e5f5
style Commands fill:#e8f5e8
```
*System context helps the AI provide better, platform-specific guidance without compromising privacy*
## Architecture Layers
```mermaid

View File

@ -1,212 +1,314 @@
# Getting Started with FSS-Mini-RAG
## Step 1: Installation
> **Get from zero to searching in 2 minutes**
> *Everything you need to know to start finding code by meaning, not just keywords*
Choose your installation based on what you want:
## Installation (Choose Your Adventure)
### Option A: Ollama Only (Recommended)
### 🎯 **Option 1: Full Installation (Recommended)**
*Gets you everything working reliably with desktop shortcuts and AI features*
**Linux/macOS:**
```bash
# Install Ollama first
curl -fsSL https://ollama.ai/install.sh | sh
# Pull the embedding model
ollama pull nomic-embed-text
# Install Python dependencies
pip install -r requirements.txt
./install_mini_rag.sh
```
### Option B: Full ML Stack
```bash
# Install everything including PyTorch
pip install -r requirements-full.txt
**Windows:**
```cmd
install_windows.bat
```
## Step 2: Test Installation
**What this does:**
- Sets up Python environment automatically
- Installs all dependencies
- Downloads AI models (with your permission)
- Creates desktop shortcuts and application menu entries
- Tests everything works
- Gives you an interactive tutorial
**Time needed:** 5-10 minutes (depending on AI model downloads)
---
### 🚀 **Option 2: Copy & Try (Experimental)**
*Just copy the folder and run - may work, may need manual setup*
**Linux/macOS:**
```bash
# Index this RAG system itself
# Copy folder anywhere and try running
./rag-mini index ~/my-project
# Auto-setup attempts to create virtual environment
# Falls back with clear instructions if it fails
```
**Windows:**
```cmd
# Copy folder anywhere and try running
rag.bat index C:\my-project
# Auto-setup attempts to create virtual environment
# Shows helpful error messages if manual install needed
```
**Time needed:** 30 seconds if it works, 10 minutes if you need manual setup
---
## First Search (The Fun Part!)
### Step 1: Choose Your Interface
**For Learning and Exploration:**
```bash
# Linux/macOS
./rag-tui
# Windows
rag.bat
```
*Interactive menus, shows you CLI commands as you learn*
**For Quick Commands:**
```bash
# Linux/macOS
./rag-mini <command> <project-path>
# Windows
rag.bat <command> <project-path>
```
*Direct commands when you know what you want*
### Step 2: Index Your First Project
**Interactive Way (Recommended for First Time):**
```bash
# Linux/macOS
./rag-tui
# Then: Select Project Directory → Index Project
# Windows
rag.bat
# Then: Select Project Directory → Index Project
```
**Direct Commands:**
```bash
# Linux/macOS
./rag-mini index ~/my-project
# Search for something
./rag-mini search ~/my-project "chunker function"
# Windows
rag.bat index C:\my-project
```
# Check what got indexed
**What indexing does:**
- Finds all text files in your project
- Breaks them into smart "chunks" (functions, classes, logical sections)
- Creates searchable embeddings that understand meaning
- Stores everything in a fast vector database
- Creates a `.mini-rag/` directory with your search index
**Time needed:** 10-60 seconds depending on project size
### Step 3: Search by Meaning
**Natural language queries:**
```bash
# Linux/macOS
./rag-mini search ~/my-project "user authentication logic"
./rag-mini search ~/my-project "error handling for database connections"
./rag-mini search ~/my-project "how to validate input data"
# Windows
rag.bat search C:\my-project "user authentication logic"
rag.bat search C:\my-project "error handling for database connections"
rag.bat search C:\my-project "how to validate input data"
```
**Code concepts:**
```bash
# Finds login functions, auth middleware, session handling
./rag-mini search ~/my-project "login functionality"
# Finds try/catch blocks, error handlers, retry logic
./rag-mini search ~/my-project "exception handling"
# Finds validation functions, input sanitization, data checking
./rag-mini search ~/my-project "data validation"
```
**What you get:**
- Ranked results by relevance (not just keyword matching)
- File paths and line numbers for easy navigation
- Context around each match so you understand what it does
- Smart filtering to avoid noise and duplicates
## Two Powerful Modes
FSS-Mini-RAG has two different ways to get answers, optimized for different needs:
### 🚀 **Synthesis Mode** - Fast Answers
```bash
# Linux/macOS
./rag-mini search ~/project "authentication logic" --synthesize
# Windows
rag.bat search C:\project "authentication logic" --synthesize
```
**Perfect for:**
- Quick code discovery
- Finding specific functions or patterns
- Getting fast, consistent answers
**What you get:**
- Lightning-fast responses (no thinking overhead)
- Reliable, factual information about your code
- Clear explanations of what code does and how it works
### 🧠 **Exploration Mode** - Deep Understanding
```bash
# Linux/macOS
./rag-mini explore ~/project
# Windows
rag.bat explore C:\project
```
**Perfect for:**
- Learning new codebases
- Debugging complex issues
- Understanding architectural decisions
**What you get:**
- Interactive conversation with AI that remembers context
- Deep reasoning with full "thinking" process shown
- Follow-up questions and detailed explanations
- Memory of your previous questions in the session
**Example exploration session:**
```
🧠 Exploration Mode - Ask anything about your project
You: How does authentication work in this codebase?
AI: Let me analyze the authentication system...
💭 Thinking: I can see several authentication-related files. Let me examine
the login flow, session management, and security measures...
📝 Authentication Analysis:
This codebase uses a three-layer authentication system:
1. Login validation in auth.py handles username/password checking
2. Session management in sessions.py maintains user state
3. Middleware in auth_middleware.py protects routes
You: What security concerns should I be aware of?
AI: Based on our previous discussion about authentication, let me check for
common security vulnerabilities...
```
## Check Your Setup
**See what got indexed:**
```bash
# Linux/macOS
./rag-mini status ~/my-project
# Windows
rag.bat status C:\my-project
```
## Step 3: Index Your First Project
**What you'll see:**
- How many files were processed
- Total chunks created for searching
- Embedding method being used (Ollama, ML models, or hash-based)
- Configuration file location
- Index health and last update time
## Configuration (Optional)
Your project gets a `.mini-rag/config.yaml` file with helpful comments:
```yaml
# Context window configuration (critical for AI features)
# 💡 Sizing guide: 2K=1 question, 4K=1-2 questions, 8K=manageable, 16K=most users
# 32K=large codebases, 64K+=power users only
# ⚠️ Larger contexts use exponentially more CPU/memory - only increase if needed
context_window: 16384 # Context size in tokens
# AI model preferences (edit to change priority)
model_rankings:
- "qwen3:1.7b" # Excellent for RAG (1.4GB, recommended)
- "qwen3:0.6b" # Lightweight and fast (~500MB)
- "qwen3:4b" # Higher quality but slower (~2.5GB)
```
**When to customize:**
- Your searches aren't finding what you expect → adjust chunking settings
- You want AI features → install Ollama and download models
- System is slow → try smaller models or reduce context window
- Getting too many/few results → adjust similarity threshold
## Troubleshooting
### "Project not indexed"
**Problem:** You're trying to search before indexing
```bash
# Index any project directory
./rag-mini index /path/to/your/project
# The system creates .mini-rag/ directory with:
# - config.json (settings)
# - manifest.json (file tracking)
# - database.lance/ (vector database)
# Run indexing first
./rag-mini index ~/my-project # Linux/macOS
rag.bat index C:\my-project # Windows
```
## Step 4: Search Your Code
### "No Ollama models available"
**Problem:** AI features need models downloaded
```bash
# Basic semantic search
./rag-mini search /path/to/project "user login logic"
# Install Ollama first
curl -fsSL https://ollama.ai/install.sh | sh # Linux/macOS
# Or download from https://ollama.com # Windows
# Enhanced search with smart features
./rag-mini-enhanced search /path/to/project "authentication"
# Start Ollama server
ollama serve
# Find similar patterns
./rag-mini-enhanced similar /path/to/project "def validate_input"
# Download a model
ollama pull qwen3:1.7b
```
## Step 5: Customize Configuration
Edit `project/.mini-rag/config.json`:
```json
{
"chunking": {
"max_size": 3000,
"strategy": "semantic"
},
"files": {
"min_file_size": 100
}
}
```
Then re-index to apply changes:
### "Virtual environment not found"
**Problem:** Auto-setup didn't work, need manual installation
```bash
./rag-mini index /path/to/project --force
# Run the full installer instead
./install_mini_rag.sh # Linux/macOS
install_windows.bat # Windows
```
## Common Use Cases
### Find Functions by Name
### Getting weird results
**Solution:** Try different search terms or check what got indexed
```bash
./rag-mini search /project "function named connect_to_database"
# See what files were processed
./rag-mini status ~/my-project
# Try more specific queries
./rag-mini search ~/my-project "specific function name"
```
### Find Code Patterns
```bash
./rag-mini search /project "error handling try catch"
./rag-mini search /project "database query with parameters"
```
## Next Steps
### Find Configuration
```bash
./rag-mini search /project "database connection settings"
./rag-mini search /project "environment variables"
```
### Learn More
- **[Beginner's Glossary](BEGINNER_GLOSSARY.md)** - All the terms explained simply
- **[TUI Guide](TUI_GUIDE.md)** - Master the interactive interface
- **[Visual Diagrams](DIAGRAMS.md)** - See how everything works
### Find Documentation
```bash
./rag-mini search /project "how to deploy"
./rag-mini search /project "API documentation"
```
### Advanced Features
- **[Query Expansion](QUERY_EXPANSION.md)** - Make searches smarter with AI
- **[LLM Providers](LLM_PROVIDERS.md)** - Use different AI models
- **[CPU Deployment](CPU_DEPLOYMENT.md)** - Optimize for older computers
## Python API Usage
### Customize Everything
- **[Technical Guide](TECHNICAL_GUIDE.md)** - How the system actually works
- **[Configuration Examples](../examples/)** - Pre-made configs for different needs
```python
from mini_rag import ProjectIndexer, CodeSearcher, CodeEmbedder
from pathlib import Path
---
# Initialize
project_path = Path("/path/to/your/project")
embedder = CodeEmbedder()
indexer = ProjectIndexer(project_path, embedder)
searcher = CodeSearcher(project_path, embedder)
**🎉 That's it!** You now have a semantic search system that understands your code by meaning, not just keywords. Start with simple searches and work your way up to the advanced AI features as you get comfortable.
# Index the project
print("Indexing project...")
result = indexer.index_project()
print(f"Indexed {result['files_processed']} files, {result['chunks_created']} chunks")
# Search
print("\nSearching for authentication code...")
results = searcher.search("user authentication logic", top_k=5)
for i, result in enumerate(results, 1):
print(f"\n{i}. {result.file_path}")
print(f" Score: {result.score:.3f}")
print(f" Type: {result.chunk_type}")
print(f" Content: {result.content[:100]}...")
```
## Advanced Features
### Auto-optimization
```bash
# Get optimization suggestions
./rag-mini-enhanced analyze /path/to/project
# This analyzes your codebase and suggests:
# - Better chunk sizes for your language mix
# - Streaming settings for large files
# - File filtering optimizations
```
### File Watching
```python
from mini_rag import FileWatcher
# Watch for file changes and auto-update index
watcher = FileWatcher(project_path, indexer)
watcher.start_watching()
# Now any file changes automatically update the index
```
### Custom Chunking
```python
from mini_rag import CodeChunker
chunker = CodeChunker()
# Chunk a Python file
with open("example.py") as f:
content = f.read()
chunks = chunker.chunk_text(content, "python", "example.py")
for chunk in chunks:
print(f"Type: {chunk.chunk_type}")
print(f"Content: {chunk.content}")
```
## Tips and Best Practices
### For Better Search Results
- Use descriptive phrases: "function that validates email addresses"
- Try different phrasings if first search doesn't work
- Search for concepts, not just exact variable names
### For Better Indexing
- Exclude build directories: `node_modules/`, `build/`, `dist/`
- Include documentation files - they often contain valuable context
- Use semantic chunking strategy for most projects
### For Configuration
- Start with default settings
- Use `analyze` command to get optimization suggestions
- Increase chunk size for larger functions/classes
- Decrease chunk size for more granular search
### For Troubleshooting
- Check `./rag-mini status` to see what was indexed
- Look at `.mini-rag/manifest.json` for file details
- Run with `--force` to completely rebuild index
- Check logs in `.mini-rag/` directory for errors
## What's Next?
1. Try the test suite to understand how components work:
```bash
python -m pytest tests/ -v
```
2. Look at the examples in `examples/` directory
3. Read the main README.md for complete technical details
4. Customize the system for your specific project needs
**💡 Pro tip:** The best way to learn is to index a project you know well and try searching for things you know are in there. You'll quickly see how much better meaning-based search is than traditional keyword search.

View File

@ -194,6 +194,16 @@ class ConfigManager:
return config
except yaml.YAMLError as e:
# YAML syntax error - help user fix it instead of silent fallback
error_msg = f"⚠️ Config file has YAML syntax error at line {getattr(e, 'problem_mark', 'unknown')}: {e}"
logger.error(error_msg)
print(f"\n{error_msg}")
print(f"Config file: {self.config_path}")
print("💡 Check YAML syntax (indentation, quotes, colons)")
print("💡 Or delete config file to reset to defaults")
return RAGConfig() # Still return defaults but warn user
except Exception as e:
logger.error(f"Failed to load config from {self.config_path}: {e}")
logger.info("Using default configuration")
@ -210,7 +220,15 @@ class ConfigManager:
# Create YAML content with comments
yaml_content = self._create_yaml_with_comments(config_dict)
# Write with basic file locking to prevent corruption
with open(self.config_path, 'w') as f:
try:
import fcntl
fcntl.flock(f.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB) # Non-blocking exclusive lock
f.write(yaml_content)
fcntl.flock(f.fileno(), fcntl.LOCK_UN) # Unlock
except (OSError, ImportError):
# Fallback for Windows or if fcntl unavailable
f.write(yaml_content)
logger.info(f"Configuration saved to {self.config_path}")
@ -274,7 +292,11 @@ class ConfigManager:
f" synthesis_temperature: {config_dict['llm']['synthesis_temperature']} # LLM temperature for analysis",
"",
" # Context window configuration (critical for RAG performance)",
f" context_window: {config_dict['llm']['context_window']} # Context size in tokens (8K=fast, 16K=balanced, 32K=advanced)",
" # 💡 Sizing guide: 2K=1 question, 4K=1-2 questions, 8K=manageable, 16K=most users",
" # 32K=large codebases, 64K+=power users only",
" # ⚠️ Larger contexts use exponentially more CPU/memory - only increase if needed",
" # 🔧 Low context limits? Try smaller topk, better search terms, or archive noise",
f" context_window: {config_dict['llm']['context_window']} # Context size in tokens",
f" auto_context: {str(config_dict['llm']['auto_context']).lower()} # Auto-adjust context based on model capabilities",
"",
" model_rankings: # Preferred model order (edit to change priority)",

View File

@ -17,11 +17,13 @@ try:
from .llm_synthesizer import LLMSynthesizer, SynthesisResult
from .search import CodeSearcher
from .config import RAGConfig
from .system_context import get_system_context
except ImportError:
# For direct testing
from llm_synthesizer import LLMSynthesizer, SynthesisResult
from search import CodeSearcher
from config import RAGConfig
get_system_context = lambda x=None: ""
logger = logging.getLogger(__name__)
@ -154,10 +156,15 @@ Content: {content[:800]}{'...' if len(content) > 800 else ''}
results_text = "\n".join(results_context)
# Get system context for better responses
system_context = get_system_context(self.project_path)
# Create comprehensive exploration prompt with thinking
prompt = f"""<think>
The user asked: "{question}"
System context: {system_context}
Let me analyze what they're asking and look at the information I have available.
From the search results, I can see relevant information about:

View File

@ -16,11 +16,13 @@ from pathlib import Path
try:
from .llm_safeguards import ModelRunawayDetector, SafeguardConfig, get_optimal_ollama_parameters
from .system_context import get_system_context
except ImportError:
# Graceful fallback if safeguards not available
ModelRunawayDetector = None
SafeguardConfig = None
get_optimal_ollama_parameters = lambda x: {}
get_system_context = lambda x=None: ""
logger = logging.getLogger(__name__)
@ -175,12 +177,20 @@ class LLMSynthesizer:
# Ensure we're initialized
self._ensure_initialized()
# Use the best available model
# Use the best available model with retry logic
model_to_use = self.model
if self.model not in self.available_models:
# Refresh model list in case of race condition
logger.warning(f"Configured model {self.model} not in available list, refreshing...")
self.available_models = self._get_available_models()
if self.model in self.available_models:
model_to_use = self.model
logger.info(f"Model {self.model} found after refresh")
elif self.available_models:
# Fallback to first available model
if self.available_models:
model_to_use = self.available_models[0]
logger.warning(f"Using fallback model: {model_to_use}")
else:
logger.error("No Ollama models available")
return None
@ -587,9 +597,13 @@ Content: {content[:500]}{'...' if len(content) > 500 else ''}
context = "\n".join(context_parts)
# Create synthesis prompt
# Get system context for better responses
system_context = get_system_context(project_path)
# Create synthesis prompt with system context
prompt = f"""You are a senior software engineer analyzing code search results. Your task is to synthesize the search results into a helpful, actionable summary.
SYSTEM CONTEXT: {system_context}
SEARCH QUERY: "{query}"
PROJECT: {project_path.name}

123
mini_rag/system_context.py Normal file
View File

@ -0,0 +1,123 @@
"""
System Context Collection for Enhanced RAG Grounding
Collects minimal system information to help the LLM provide better,
context-aware assistance without compromising privacy.
"""
import platform
import sys
import os
from pathlib import Path
from typing import Dict, Optional
class SystemContextCollector:
"""Collects system context information for enhanced LLM grounding."""
@staticmethod
def get_system_context(project_path: Optional[Path] = None) -> str:
"""
Get concise system context for LLM grounding.
Args:
project_path: Current project directory
Returns:
Formatted system context string (max 200 chars for privacy)
"""
try:
# Basic system info
os_name = platform.system()
python_ver = f"{sys.version_info.major}.{sys.version_info.minor}"
# Simplified OS names
os_short = {
'Windows': 'Win',
'Linux': 'Linux',
'Darwin': 'macOS'
}.get(os_name, os_name)
# Working directory info
if project_path:
# Use relative or shortened path for privacy
try:
rel_path = project_path.relative_to(Path.home())
path_info = f"~/{rel_path}"
except ValueError:
# If not relative to home, just use folder name
path_info = project_path.name
else:
path_info = Path.cwd().name
# Trim path if too long for our 200-char limit
if len(path_info) > 50:
path_info = f".../{path_info[-45:]}"
# Command style hints
cmd_style = "rag.bat" if os_name == "Windows" else "./rag-mini"
# Format concise context
context = f"[{os_short} {python_ver}, {path_info}, use {cmd_style}]"
# Ensure we stay under 200 chars
if len(context) > 200:
context = context[:197] + "...]"
return context
except Exception:
# Fallback to minimal info if anything fails
return f"[{platform.system()}, Python {sys.version_info.major}.{sys.version_info.minor}]"
@staticmethod
def get_command_context(os_name: Optional[str] = None) -> Dict[str, str]:
"""
Get OS-appropriate command examples.
Returns:
Dictionary with command patterns for the current OS
"""
if os_name is None:
os_name = platform.system()
if os_name == "Windows":
return {
"launcher": "rag.bat",
"index": "rag.bat index C:\\path\\to\\project",
"search": "rag.bat search C:\\path\\to\\project \"query\"",
"explore": "rag.bat explore C:\\path\\to\\project",
"path_sep": "\\",
"example_path": "C:\\Users\\username\\Documents\\myproject"
}
else:
return {
"launcher": "./rag-mini",
"index": "./rag-mini index /path/to/project",
"search": "./rag-mini search /path/to/project \"query\"",
"explore": "./rag-mini explore /path/to/project",
"path_sep": "/",
"example_path": "~/Documents/myproject"
}
def get_system_context(project_path: Optional[Path] = None) -> str:
"""Convenience function to get system context."""
return SystemContextCollector.get_system_context(project_path)
def get_command_context() -> Dict[str, str]:
"""Convenience function to get command context."""
return SystemContextCollector.get_command_context()
# Test function
if __name__ == "__main__":
print("System Context Test:")
print(f"Context: {get_system_context()}")
print(f"Context with path: {get_system_context(Path('/tmp/test'))}")
print()
print("Command Context:")
cmds = get_command_context()
for key, value in cmds.items():
print(f" {key}: {value}")