Complete deployment expansion and system context integration

Major enhancements:
• Add comprehensive deployment guide covering all platforms (mobile, edge, cloud)
• Implement system context collection for enhanced AI responses
• Update documentation with current workflows and deployment scenarios
• Fix Windows compatibility bugs in the file-locking system
• Enhance diagrams with the system context integration flow
• Improve exploration mode with better context handling

Platform support expanded:
• Full macOS compatibility verified
• Raspberry Pi deployment with ARM64 optimizations
• Android deployment via Termux with configuration examples
• Edge device deployment strategies and performance guidelines
• Docker containerization for universal deployment

Technical improvements:
• System context module provides OS/environment awareness to the AI
• Context-aware prompts improve response relevance
• Enhanced error handling and graceful fallbacks
• Better integration between synthesis and exploration modes

Documentation updates:
• Complete deployment guide with troubleshooting
• Updated getting-started guide with current installation flows
• Enhanced visual diagrams showing system architecture
• Platform-specific configuration examples

Ready for extended deployment testing and user feedback.
This commit is contained in:

- parent: 8e67c76c6d
- commit: f5de046f95
- docs/DEPLOYMENT_GUIDE.md: 381 additions (new file)
@ -0,0 +1,381 @@
# FSS-Mini-RAG Deployment Guide

> **Run semantic search anywhere - from smartphones to edge devices**
>
> *Complete guide to deploying FSS-Mini-RAG on every platform imaginable*

## Platform Compatibility Matrix

| Platform | Status | AI Features | Installation | Notes |
|----------|--------|-------------|--------------|-------|
| **Linux** | ✅ Full | ✅ Full | `./install_mini_rag.sh` | Primary platform |
| **Windows** | ✅ Full | ✅ Full | `install_windows.bat` | Desktop shortcuts |
| **macOS** | ✅ Full | ✅ Full | `./install_mini_rag.sh` | Works out of the box |
| **Raspberry Pi** | ✅ Excellent | ✅ AI ready | `./install_mini_rag.sh` | ARM64 optimized |
| **Android (Termux)** | ✅ Good | 🟡 Limited | Manual install | Terminal interface |
| **iOS (a-Shell)** | 🟡 Limited | ❌ Text only | Manual install | Sandbox limitations |
| **Docker** | ✅ Excellent | ✅ Full | Dockerfile | Any platform |

## Desktop & Server Deployment

### 🐧 **Linux** (Primary Platform)
```bash
# Full installation with AI features
./install_mini_rag.sh

# What you get:
# ✅ Desktop shortcuts (.desktop files)
# ✅ Application menu integration
# ✅ Full AI model downloads
# ✅ Complete terminal interface
```

### 🪟 **Windows** (Fully Supported)
```cmd
# Full installation with desktop integration
install_windows.bat

# What you get:
# ✅ Desktop shortcuts (.lnk files)
# ✅ Start Menu entries
# ✅ Full AI model downloads
# ✅ Complete terminal interface
```

### 🍎 **macOS** (Excellent Support)
```bash
# Same as Linux - works out of the box
./install_mini_rag.sh

# Additional macOS setup:
brew install python3  # If needed
brew install ollama   # For AI features
```

**macOS-specific features:**
- Automatic path detection for common project locations
- Integration with Spotlight search locations
- Support for `.app` bundle creation (advanced)

## Edge Device Deployment

### 🥧 **Raspberry Pi** (Recommended Edge Platform)

**Perfect for:**
- Home lab semantic search server
- Portable development environment
- IoT project documentation search
- Offline code search station

**Installation:**
```bash
# On Raspberry Pi OS (64-bit recommended)
sudo apt update && sudo apt upgrade
./install_mini_rag.sh

# The installer automatically detects ARM and optimizes:
# ✅ Suggests lightweight models (qwen3:0.6b)
# ✅ Reduces memory usage
# ✅ Enables efficient chunking
```

**Raspberry Pi optimized config:**
```yaml
# Automatically generated for Pi
embedding:
  preferred_method: ollama
  ollama_model: nomic-embed-text  # 270MB - perfect for Pi

llm:
  synthesis_model: qwen3:0.6b  # ~500MB - fast on Pi 4+
  context_window: 4096         # Conservative memory use
  cpu_optimized: true

chunking:
  max_size: 1500  # Smaller chunks for efficiency
```

**Performance expectations:**
- **Pi 4 (4GB)**: Excellent performance, full AI features
- **Pi 4 (2GB)**: Good performance, text-only or small models
- **Pi 5**: Outstanding performance, handles large models
- **Pi Zero**: Text-only search (hash-based embeddings)
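The "hash-based embeddings" fallback mentioned for the Pi Zero is what keeps the text-only tier usable with no model downloads at all. As a rough sketch of the idea (an illustration only - the function name and dimensions here are made up, not the project's actual implementation), each token is hashed into a slot of a fixed-size vector:

```python
import hashlib
import math

def hash_embedding(text: str, dims: int = 256) -> list:
    """Illustrative hash-based embedding: hash each token into one slot of a
    fixed-size vector, then L2-normalize. No model download required."""
    vec = [0.0] * dims
    for token in text.lower().split():
        digest = hashlib.md5(token.encode()).digest()
        index = int.from_bytes(digest[:4], "big") % dims
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[index] += sign
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Identical text always maps to the identical vector, so exact and
# near-exact matches still rank well even without a neural model.
a = hash_embedding("validate user input")
b = hash_embedding("validate user input")
cosine = sum(x * y for x, y in zip(a, b))
print(round(cosine, 3))  # 1.0 - identical texts are identical vectors
```

The trade-off is that hashing captures word overlap, not meaning - "login" and "authentication" share no vector weight - which is why the higher tiers use real embedding models.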
### 🔧 **Other Edge Devices**

**NVIDIA Jetson Series:**
- Far more performance than this use case needs
- Can run the largest models with GPU acceleration
- Perfect for AI-heavy development workstations

**Intel NUC / Mini PCs:**
- Excellent performance
- Full desktop experience
- Can serve multiple users simultaneously

**Orange Pi / Rock Pi:**
- Similar to Raspberry Pi
- Same installation process
- May need manual Ollama compilation

## Mobile Deployment

### 📱 **Android (Recommended: Termux)**

**Installation in Termux:**
```bash
# Install Termux from F-Droid (not the Play Store)
# In Termux:
pkg update && pkg upgrade
pkg install python python-pip git
pip install --upgrade pip

# Clone and install FSS-Mini-RAG
git clone https://github.com/your-repo/fss-mini-rag
cd fss-mini-rag
pip install -r requirements.txt

# Quick start
python -m mini_rag index /storage/emulated/0/Documents/myproject
python -m mini_rag search /storage/emulated/0/Documents/myproject "your query"
```

**Android-optimized config:**
```yaml
# config-android.yaml
embedding:
  preferred_method: hash  # No heavy models needed

chunking:
  max_size: 800  # Small chunks for mobile

files:
  min_file_size: 20  # Include more small files

llm:
  enable_synthesis: false  # Text-only for speed
```

**What works on Android:**
- ✅ Full text search and indexing
- ✅ Terminal interface (`rag-tui`)
- ✅ Project indexing from phone storage
- ✅ Search your phone's code projects
- ❌ Heavy AI models (use cloud providers instead)

**Android use cases:**
- Search your mobile development projects
- Index documentation on your phone
- Quick code reference while traveling
- Offline search of downloaded repositories

### 🍎 **iOS (Limited but Possible)**

**Option 1: a-Shell (Free)**
```bash
# Install a-Shell from the App Store
# In a-Shell (pathlib is in the standard library, so only requests is needed):
pip install requests

# Limited installation (core features only)
# Files must be in the app sandbox
```

**Option 2: iSH (Alpine Linux)**
```bash
# Install iSH from the App Store
# In the iSH terminal:
apk add python3 py3-pip git
pip install -r requirements-light.txt

# Basic functionality only
```

**iOS limitations:**
- Sandbox restricts file access
- No full AI model support
- Terminal interface only
- Limited to app-accessible files

## Specialized Deployment Scenarios

### 🐳 **Docker Deployment**

**For any platform with Docker:**
```dockerfile
# Dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY . .
RUN pip install -r requirements.txt

# Expose port for server mode
EXPOSE 7777

# Default to the TUI interface
CMD ["python", "-m", "mini_rag.cli"]
```

**Usage:**
```bash
# Build and run
docker build -t fss-mini-rag .
docker run -it -v $(pwd)/projects:/projects fss-mini-rag

# Server mode for web access
docker run -p 7777:7777 fss-mini-rag python -m mini_rag server
```
### ☁️ **Cloud Deployment**

**AWS/GCP/Azure VM:**
- Same as the Linux installation
- Can serve multiple users
- Perfect for team environments

**GitHub Codespaces:**
```bash
# Works in any Codespace
./install_mini_rag.sh
# Perfect for searching your workspace
```

**Replit/CodeSandbox:**
- Limited by platform restrictions
- Basic functionality available

### 🏠 **Home Lab Integration**

**Home Assistant Add-on:**
- Package as a Home Assistant add-on
- Search home automation configs
- Voice integration possible

**NAS Integration:**
- Install on Synology/QNAP
- Search all stored documents
- Family code documentation

**Router with USB:**
- Install on OpenWrt routers with USB storage
- Search network documentation
- Configuration management

## Configuration by Use Case

### 🪶 **Ultra-Lightweight (Old hardware, mobile)**
```yaml
# Minimal resource usage
embedding:
  preferred_method: hash
chunking:
  max_size: 800
  strategy: fixed
llm:
  enable_synthesis: false
```

### ⚖️ **Balanced (Raspberry Pi, older laptops)**
```yaml
# Good performance with AI features
embedding:
  preferred_method: ollama
  ollama_model: nomic-embed-text
llm:
  synthesis_model: qwen3:0.6b
  context_window: 4096
```

### 🚀 **Performance (Modern hardware)**
```yaml
# Full features and performance
embedding:
  preferred_method: ollama
  ollama_model: nomic-embed-text
llm:
  synthesis_model: qwen3:1.7b
  context_window: 16384
  enable_thinking: true
```

### ☁️ **Cloud-Hybrid (Mobile + Cloud AI)**
```yaml
# Local search, cloud intelligence
embedding:
  preferred_method: hash
llm:
  provider: openai
  api_key: your_api_key
  synthesis_model: gpt-4
```

## Troubleshooting by Platform

### **Raspberry Pi Issues**
- **Out of memory**: Reduce the context window, use smaller models
- **Slow indexing**: Use hash-based embeddings
- **Model download fails**: Check internet connectivity, use smaller models

### **Android/Termux Issues**
- **Permission denied**: Run `termux-setup-storage`
- **Package install fails**: Update packages first
- **Can't access files**: Use `/storage/emulated/0/` paths

### **iOS Issues**
- **Limited functionality**: Expected due to iOS restrictions
- **Can't install packages**: Use the lighter requirements file
- **File access denied**: Files must be in the app sandbox

### **Edge Device Issues**
- **ARM compatibility**: Ensure you are using ARM64 Python packages
- **Limited RAM**: Use hash embeddings, reduce chunk sizes
- **No internet**: Skip AI model downloads, use text-only mode

## Advanced Edge Deployments

### **IoT Integration**
- Index sensor logs and configurations
- Search device documentation
- Troubleshoot IoT deployments

### **Offline Development**
- Complete development environment on an edge device
- No internet required after setup
- Perfect for remote locations

### **Educational Use**
- Raspberry Pi computer labs
- Student project search
- Coding bootcamp environments

### **Enterprise Edge**
- Factory floor documentation search
- Field service technical reference
- Remote site troubleshooting

---

## Quick Start by Platform

### Desktop Users
```bash
# Linux/macOS
./install_mini_rag.sh

# Windows
install_windows.bat
```

### Edge/Mobile Users
```bash
# Raspberry Pi
./install_mini_rag.sh

# Android (Termux)
pkg install python git && pip install -r requirements.txt

# Any Docker platform
docker run -it fss-mini-rag
```

**💡 Pro tip**: Start with your current platform, then expand to edge devices as needed. The system scales from smartphones to servers.
@ -11,6 +11,7 @@
- [Search Architecture](#search-architecture)
- [Installation Flow](#installation-flow)
- [Configuration System](#configuration-system)
- [System Context Integration](#system-context-integration)
- [Error Handling](#error-handling)

## System Overview
@ -22,10 +23,12 @@ graph TB

    CLI --> Index[📁 Index Project]
    CLI --> Search[🔍 Search Project]
    CLI --> Explore[🧠 Explore Project]
    CLI --> Status[📊 Show Status]

    TUI --> Index
    TUI --> Search
    TUI --> Explore
    TUI --> Config[⚙️ Configuration]

    Index --> Files[📄 File Discovery]
@ -34,17 +37,32 @@ graph TB
    Embed --> Store[💾 Vector Database]

    Search --> Query[❓ User Query]
    Search --> Context[🖥️ System Context]
    Query --> Vector[🎯 Vector Search]
    Query --> Keyword[🔤 Keyword Search]
    Vector --> Combine[🔄 Hybrid Results]
    Keyword --> Combine
    Context --> Combine
    Combine --> Synthesize{Synthesis Mode?}

    Synthesize -->|Yes| FastLLM[⚡ Fast Synthesis]
    Synthesize -->|No| Results[📋 Ranked Results]
    FastLLM --> Results

    Explore --> ExploreQuery[❓ Interactive Query]
    ExploreQuery --> Memory[🧠 Conversation Memory]
    ExploreQuery --> Context
    Memory --> DeepLLM[🤔 Deep AI Analysis]
    Context --> DeepLLM
    Vector --> DeepLLM
    DeepLLM --> Interactive[💬 Interactive Response]

    Store --> LanceDB[(🗄️ LanceDB)]
    Vector --> LanceDB

    Config --> YAML[📝 config.yaml]
    Status --> Manifest[📋 manifest.json]
    Context --> SystemInfo[💻 OS, Python, Paths]
```
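The hybrid step in the diagram merges vector and keyword scores into one ranked list. A minimal sketch of that kind of score fusion (the `hybrid_rank` helper, field names, and 0.7 weighting are illustrative assumptions, not the project's actual code):

```python
def hybrid_rank(vector_hits: dict, keyword_hits: dict, vector_weight: float = 0.7) -> list:
    """Merge per-file vector and keyword scores into one ranking.
    Files found by both signals accumulate both weighted scores."""
    combined = {}
    for path, score in vector_hits.items():
        combined[path] = vector_weight * score
    for path, score in keyword_hits.items():
        combined[path] = combined.get(path, 0.0) + (1 - vector_weight) * score
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

ranked = hybrid_rank(
    {"auth.py": 0.92, "utils.py": 0.40},   # semantic similarity scores
    {"auth.py": 1.00, "readme.md": 0.50},  # keyword match scores
)
print(ranked[0][0])  # auth.py - it scores on both signals
```

Weighting the two signals is what lets semantic matches surface files that share no keywords with the query, while exact identifier matches still rank highly.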
## User Journey
@ -276,6 +294,58 @@ flowchart TD
    style Error fill:#ffcdd2
```

## System Context Integration

```mermaid
graph LR
    subgraph "System Detection"
        OS[🖥️ Operating System]
        Python[🐍 Python Version]
        Project[📁 Project Path]

        OS --> Windows[Windows: rag.bat]
        OS --> Linux[Linux: ./rag-mini]
        OS --> macOS[macOS: ./rag-mini]
    end

    subgraph "Context Collection"
        Collect[🔍 Collect Context]
        OS --> Collect
        Python --> Collect
        Project --> Collect

        Collect --> Format[📝 Format Context]
        Format --> Limit[✂️ Limit to 200 chars]
    end

    subgraph "AI Integration"
        UserQuery[❓ User Query]
        SearchResults[📋 Search Results]
        SystemContext[💻 System Context]

        UserQuery --> Prompt[📝 Build Prompt]
        SearchResults --> Prompt
        SystemContext --> Prompt

        Prompt --> AI[🤖 LLM Processing]
        AI --> Response[💬 Contextual Response]
    end

    subgraph "Enhanced Responses"
        Response --> Commands[💻 OS-specific commands]
        Response --> Paths[📂 Correct path formats]
        Response --> Tips[💡 Platform-specific tips]
    end

    Format --> SystemContext

    style SystemContext fill:#e3f2fd
    style Response fill:#f3e5f5
    style Commands fill:#e8f5e8
```

*System context helps the AI provide better, platform-specific guidance without compromising privacy*
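The collect-format-truncate flow in the diagram can be sketched in a few lines of stdlib Python. The `collect_system_context` name and the exact fields are assumptions for illustration; only the 200-character cap comes from the diagram:

```python
import platform
import sys

def collect_system_context(project_path: str, limit: int = 200) -> str:
    """Gather OS, Python version, and project path into one short string,
    truncated so the context stays a tiny fraction of the LLM prompt."""
    parts = [
        f"OS: {platform.system()} {platform.release()}",
        f"Python: {sys.version_info.major}.{sys.version_info.minor}",
        f"Project: {project_path}",
    ]
    return "; ".join(parts)[:limit]

ctx = collect_system_context("/home/user/my-project")
print(len(ctx) <= 200)  # True - always fits the cap
```

Keeping the context this small is a deliberate choice: it is enough for the model to suggest `rag.bat` vs `./rag-mini` and the right path separators, without sending anything sensitive.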
## Architecture Layers
@ -1,212 +1,314 @@
# Getting Started with FSS-Mini-RAG

> **Get from zero to searching in 2 minutes**
>
> *Everything you need to know to start finding code by meaning, not just keywords*

## Installation (Choose Your Adventure)

### 🎯 **Option 1: Full Installation (Recommended)**
*Gets you everything working reliably with desktop shortcuts and AI features*

**Linux/macOS:**
```bash
./install_mini_rag.sh
```

**Windows:**
```cmd
install_windows.bat
```

**What this does:**
- Sets up the Python environment automatically
- Installs all dependencies
- Downloads AI models (with your permission)
- Creates desktop shortcuts and application menu entries
- Tests that everything works
- Gives you an interactive tutorial

**Time needed:** 5-10 minutes (depending on AI model downloads)

---

### 🚀 **Option 2: Copy & Try (Experimental)**
*Just copy the folder and run - may work, may need manual setup*

**Linux/macOS:**
```bash
# Copy the folder anywhere and try running
./rag-mini index ~/my-project
# Auto-setup attempts to create a virtual environment
# Falls back with clear instructions if it fails
```

**Windows:**
```cmd
# Copy the folder anywhere and try running
rag.bat index C:\my-project
# Auto-setup attempts to create a virtual environment
# Shows helpful error messages if manual install is needed
```

**Time needed:** 30 seconds if it works, 10 minutes if you need manual setup

---

## First Search (The Fun Part!)

### Step 1: Choose Your Interface

**For Learning and Exploration:**
```bash
# Linux/macOS
./rag-tui

# Windows
rag.bat
```
*Interactive menus, shows you CLI commands as you learn*

**For Quick Commands:**
```bash
# Linux/macOS
./rag-mini <command> <project-path>

# Windows
rag.bat <command> <project-path>
```
*Direct commands when you know what you want*

### Step 2: Index Your First Project

**Interactive Way (Recommended for First Time):**
```bash
# Linux/macOS
./rag-tui
# Then: Select Project Directory → Index Project

# Windows
rag.bat
# Then: Select Project Directory → Index Project
```

**Direct Commands:**
```bash
# Linux/macOS
./rag-mini index ~/my-project

# Windows
rag.bat index C:\my-project
```

**What indexing does:**
- Finds all text files in your project
- Breaks them into smart "chunks" (functions, classes, logical sections)
- Creates searchable embeddings that understand meaning
- Stores everything in a fast vector database
- Creates a `.mini-rag/` directory with your search index

**Time needed:** 10-60 seconds depending on project size
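The "smart chunks" step above is structure-aware in the real system; a deliberately simplified sketch (the `chunk_by_blank_lines` helper is hypothetical, not the project's chunker) shows the basic shape - split on logical boundaries, then pack pieces up to a size limit:

```python
def chunk_by_blank_lines(text: str, max_size: int = 1500) -> list:
    """Naive illustration of chunking: split on blank lines, then pack
    consecutive blocks into chunks of at most max_size characters.
    (A single oversize block still becomes its own chunk.)"""
    chunks, current = [], ""
    for block in text.split("\n\n"):
        if current and len(current) + len(block) + 2 > max_size:
            chunks.append(current)
            current = block
        else:
            current = f"{current}\n\n{block}" if current else block
    if current:
        chunks.append(current)
    return chunks

sample = "def a():\n    pass\n\ndef b():\n    pass"
print(len(chunk_by_blank_lines(sample, max_size=10)))  # 2
```

The real chunker goes further by respecting function and class boundaries, which is why search results tend to land on whole, meaningful units of code rather than arbitrary slices.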
### Step 3: Search by Meaning

**Natural language queries:**
```bash
# Linux/macOS
./rag-mini search ~/my-project "user authentication logic"
./rag-mini search ~/my-project "error handling for database connections"
./rag-mini search ~/my-project "how to validate input data"

# Windows
rag.bat search C:\my-project "user authentication logic"
rag.bat search C:\my-project "error handling for database connections"
rag.bat search C:\my-project "how to validate input data"
```

**Code concepts:**
```bash
# Finds login functions, auth middleware, session handling
./rag-mini search ~/my-project "login functionality"

# Finds try/catch blocks, error handlers, retry logic
./rag-mini search ~/my-project "exception handling"

# Finds validation functions, input sanitization, data checking
./rag-mini search ~/my-project "data validation"
```

**What you get:**
- Results ranked by relevance (not just keyword matching)
- File paths and line numbers for easy navigation
- Context around each match so you understand what it does
- Smart filtering to avoid noise and duplicates

## Two Powerful Modes

FSS-Mini-RAG has two different ways to get answers, optimized for different needs:

### 🚀 **Synthesis Mode** - Fast Answers
```bash
# Linux/macOS
./rag-mini search ~/project "authentication logic" --synthesize

# Windows
rag.bat search C:\project "authentication logic" --synthesize
```

**Perfect for:**
- Quick code discovery
- Finding specific functions or patterns
- Getting fast, consistent answers

**What you get:**
- Fast responses (no thinking overhead)
- Reliable, factual information about your code
- Clear explanations of what code does and how it works

### 🧠 **Exploration Mode** - Deep Understanding
```bash
# Linux/macOS
./rag-mini explore ~/project

# Windows
rag.bat explore C:\project
```

**Perfect for:**
- Learning new codebases
- Debugging complex issues
- Understanding architectural decisions

**What you get:**
- An interactive conversation with an AI that remembers context
- Deep reasoning with the full "thinking" process shown
- Follow-up questions and detailed explanations
- Memory of your previous questions in the session

**Example exploration session:**
```
🧠 Exploration Mode - Ask anything about your project

You: How does authentication work in this codebase?

AI: Let me analyze the authentication system...

💭 Thinking: I can see several authentication-related files. Let me examine
the login flow, session management, and security measures...

📝 Authentication Analysis:
This codebase uses a three-layer authentication system:
1. Login validation in auth.py handles username/password checking
2. Session management in sessions.py maintains user state
3. Middleware in auth_middleware.py protects routes

You: What security concerns should I be aware of?

AI: Based on our previous discussion about authentication, let me check for
common security vulnerabilities...
```

## Check Your Setup

**See what got indexed:**
```bash
# Linux/macOS
./rag-mini status ~/my-project

# Windows
rag.bat status C:\my-project
```

**What you'll see:**
- How many files were processed
- Total chunks created for searching
- The embedding method in use (Ollama, ML models, or hash-based)
- Configuration file location
- Index health and last update time

## Configuration (Optional)

Your project gets a `.mini-rag/config.yaml` file with helpful comments:

```yaml
# Context window configuration (critical for AI features)
# 💡 Sizing guide: 2K=1 question, 4K=1-2 questions, 8K=manageable, 16K=most users
#    32K=large codebases, 64K+=power users only
# ⚠️ Larger contexts use sharply more CPU/memory - only increase if needed
context_window: 16384  # Context size in tokens

# AI model preferences (edit to change priority)
model_rankings:
  - "qwen3:1.7b"  # Excellent for RAG (~1.4GB, recommended)
  - "qwen3:0.6b"  # Lightweight and fast (~500MB)
  - "qwen3:4b"    # Higher quality but slower (~2.5GB)
```

**When to customize:**
- Your searches aren't finding what you expect → adjust chunking settings
- You want AI features → install Ollama and download models
- The system is slow → try smaller models or reduce the context window
- Getting too many/few results → adjust the similarity threshold
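The sizing guide above is in tokens, not characters. A common rule of thumb (an assumption - roughly four characters per token for English text and code, and the helper name is made up) lets you sanity-check a `context_window` value against a real prompt:

```python
def rough_token_count(text: str) -> int:
    """Crude heuristic: ~4 characters per token for English text and code.
    Good enough for capacity planning, not for exact accounting."""
    return max(1, len(text) // 4)

# A query plus a handful of retrieved chunks, repeated to simulate bulk
prompt = "Explain the authentication flow in auth.py" * 100
needed = rough_token_count(prompt)
print(needed <= 4096)  # True - fits even the conservative 4K Pi-class default
```

If the estimate approaches your configured window, the model will start truncating retrieved chunks, which usually shows up as vaguer answers - reduce chunk sizes or raise `context_window` if your hardware allows.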
## Troubleshooting

### "Project not indexed"
**Problem:** You're trying to search before indexing
```bash
# Run indexing first
./rag-mini index ~/my-project    # Linux/macOS
rag.bat index C:\my-project      # Windows
```

### "No Ollama models available"
**Problem:** AI features need models downloaded
```bash
# Install Ollama first
curl -fsSL https://ollama.ai/install.sh | sh   # Linux/macOS
# Or download from https://ollama.com           # Windows

# Start the Ollama server
ollama serve

# Download a model
ollama pull qwen3:1.7b
```

### "Virtual environment not found"
**Problem:** Auto-setup didn't work; manual installation is needed
```bash
# Run the full installer instead
./install_mini_rag.sh   # Linux/macOS
install_windows.bat     # Windows
```

### Getting weird results
**Solution:** Try different search terms or check what got indexed
```bash
# See what files were processed
./rag-mini status ~/my-project

# Try more specific queries
./rag-mini search ~/my-project "specific function name"
```

## Next Steps

### Learn More
- **[Beginner's Glossary](BEGINNER_GLOSSARY.md)** - All the terms explained simply
- **[TUI Guide](TUI_GUIDE.md)** - Master the interactive interface
- **[Visual Diagrams](DIAGRAMS.md)** - See how everything works

### Advanced Features
- **[Query Expansion](QUERY_EXPANSION.md)** - Make searches smarter with AI
- **[LLM Providers](LLM_PROVIDERS.md)** - Use different AI models
- **[CPU Deployment](CPU_DEPLOYMENT.md)** - Optimize for older computers

### Customize Everything
- **[Technical Guide](TECHNICAL_GUIDE.md)** - How the system actually works
- **[Configuration Examples](../examples/)** - Pre-made configs for different needs

---

**🎉 That's it!** You now have a semantic search system that understands your code by meaning, not just keywords. Start with simple searches and work your way up to the advanced AI features as you get comfortable.

## Python API Usage

```python
from mini_rag import ProjectIndexer, CodeSearcher, CodeEmbedder
from pathlib import Path

# Initialize
project_path = Path("/path/to/your/project")
embedder = CodeEmbedder()
indexer = ProjectIndexer(project_path, embedder)
searcher = CodeSearcher(project_path, embedder)

# Index the project
print("Indexing project...")
result = indexer.index_project()
print(f"Indexed {result['files_processed']} files, {result['chunks_created']} chunks")

# Search
print("\nSearching for authentication code...")
results = searcher.search("user authentication logic", top_k=5)

for i, result in enumerate(results, 1):
    print(f"\n{i}. {result.file_path}")
    print(f"   Score: {result.score:.3f}")
    print(f"   Type: {result.chunk_type}")
    print(f"   Content: {result.content[:100]}...")
```
## Advanced Features

### Auto-optimization
```bash
# Get optimization suggestions
./rag-mini-enhanced analyze /path/to/project

# This analyzes your codebase and suggests:
# - Better chunk sizes for your language mix
# - Streaming settings for large files
# - File filtering optimizations
```

### File Watching
```python
from mini_rag import FileWatcher

# Watch for file changes and auto-update the index
watcher = FileWatcher(project_path, indexer)
watcher.start_watching()

# Now any file change automatically updates the index
```

### Custom Chunking
```python
from mini_rag import CodeChunker

chunker = CodeChunker()

# Chunk a Python file
with open("example.py") as f:
    content = f.read()

chunks = chunker.chunk_text(content, "python", "example.py")
for chunk in chunks:
    print(f"Type: {chunk.chunk_type}")
    print(f"Content: {chunk.content}")
```

## Tips and Best Practices

### For Better Search Results
- Use descriptive phrases: "function that validates email addresses"
- Try different phrasings if the first search doesn't work
- Search for concepts, not just exact variable names

### For Better Indexing
- Exclude build directories: `node_modules/`, `build/`, `dist/`
- Include documentation files - they often contain valuable context
- Use the semantic chunking strategy for most projects

### For Configuration
- Start with the default settings
- Use the `analyze` command to get optimization suggestions
- Increase chunk size for larger functions/classes
- Decrease chunk size for more granular search

### For Troubleshooting
- Check `./rag-mini status` to see what was indexed
- Look at `.mini-rag/manifest.json` for file details
- Run with `--force` to completely rebuild the index
- Check the logs in the `.mini-rag/` directory for errors

## What's Next?

1. Try the test suite to understand how the components work:
   ```bash
   python -m pytest tests/ -v
   ```

2. Look at the examples in the `examples/` directory

3. Read the main README.md for complete technical details

4. Customize the system for your specific project needs

**💡 Pro tip:** The best way to learn is to index a project you know well and try searching for things you know are in there. You'll quickly see how much better meaning-based search is than traditional keyword search.
@@ -194,6 +194,16 @@ class ConfigManager:

            return config

        except yaml.YAMLError as e:
            # YAML syntax error - help the user fix it instead of a silent fallback
            error_msg = f"⚠️ Config file has YAML syntax error at line {getattr(e, 'problem_mark', 'unknown')}: {e}"
            logger.error(error_msg)
            print(f"\n{error_msg}")
            print(f"Config file: {self.config_path}")
            print("💡 Check YAML syntax (indentation, quotes, colons)")
            print("💡 Or delete the config file to reset to defaults")
            return RAGConfig()  # Still return defaults, but warn the user

        except Exception as e:
            logger.error(f"Failed to load config from {self.config_path}: {e}")
            logger.info("Using default configuration")
@@ -210,7 +220,15 @@ class ConfigManager:

        # Create YAML content with comments
        yaml_content = self._create_yaml_with_comments(config_dict)

        # Write with basic file locking to prevent corruption
        with open(self.config_path, 'w') as f:
            try:
                import fcntl
                fcntl.flock(f.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)  # Non-blocking exclusive lock
                f.write(yaml_content)
                fcntl.flock(f.fileno(), fcntl.LOCK_UN)  # Unlock
            except (OSError, ImportError):
                # Fallback for Windows, where fcntl is unavailable
                f.write(yaml_content)

        logger.info(f"Configuration saved to {self.config_path}")
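The `except (OSError, ImportError)` branch above means Windows simply writes without a lock. If cross-platform locking is actually needed, one possible approach (a sketch, not what this project ships) pairs `fcntl` on POSIX with `msvcrt.locking` on Windows:

```python
import os

def write_locked(path: str, data: str) -> None:
    """Write `data` to `path` under an exclusive lock where possible;
    if no lock can be taken, write anyway (graceful fallback)."""
    with open(path, "w") as f:
        fd = f.fileno()
        unlock = None
        try:
            if os.name == "nt":
                import msvcrt
                msvcrt.locking(fd, msvcrt.LK_NBLCK, 1)  # lock the first byte
                unlock = lambda: (f.seek(0), msvcrt.locking(fd, msvcrt.LK_UNLCK, 1))
            else:
                import fcntl
                fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                unlock = lambda: fcntl.flock(fd, fcntl.LOCK_UN)
        except (ImportError, OSError):
            unlock = None  # lock unavailable; fall through and write
        f.write(data)
        if unlock is not None:
            try:
                unlock()
            except OSError:
                pass
```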
@@ -274,7 +292,11 @@ class ConfigManager:
            f"  synthesis_temperature: {config_dict['llm']['synthesis_temperature']}  # LLM temperature for analysis",
            "",
            "  # Context window configuration (critical for RAG performance)",
            f"  context_window: {config_dict['llm']['context_window']}  # Context size in tokens (8K=fast, 16K=balanced, 32K=advanced)",
            "  # 💡 Sizing guide: 2K=1 question, 4K=1-2 questions, 8K=manageable, 16K=most users",
            "  #    32K=large codebases, 64K+=power users only",
            "  # ⚠️ Larger contexts use exponentially more CPU/memory - only increase if needed",
            "  # 🔧 Low context limits? Try smaller topk, better search terms, or archive noise",
            f"  auto_context: {str(config_dict['llm']['auto_context']).lower()}  # Auto-adjust context based on model capabilities",
            "",
            "  model_rankings:  # Preferred model order (edit to change priority)",
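The sizing guide above can be made concrete with a back-of-the-envelope budget. Assuming roughly 4 characters per token (a common approximation; real tokenizers vary) and about 2000 characters per retrieved chunk:

```python
def chunks_that_fit(context_window: int, chunk_chars: int = 2000,
                    reserved: int = 1024) -> int:
    """Estimate how many search-result chunks fit in a model context,
    reserving `reserved` tokens for the question and the answer."""
    chunk_tokens = chunk_chars // 4  # ~4 chars per token (rough heuristic)
    usable = max(0, context_window - reserved)
    return usable // chunk_tokens
```

So an 8K window holds roughly (8192 - 1024) / 500 ≈ 14 chunks, which is why lowering the result count helps on small-context models.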
@@ -17,11 +17,13 @@ try:
    from .llm_synthesizer import LLMSynthesizer, SynthesisResult
    from .search import CodeSearcher
    from .config import RAGConfig
    from .system_context import get_system_context
except ImportError:
    # For direct testing
    from llm_synthesizer import LLMSynthesizer, SynthesisResult
    from search import CodeSearcher
    from config import RAGConfig
    get_system_context = lambda x=None: ""

logger = logging.getLogger(__name__)
@@ -154,10 +156,15 @@ Content: {content[:800]}{'...' if len(content) > 800 else ''}

        results_text = "\n".join(results_context)

        # Get system context for better responses
        system_context = get_system_context(self.project_path)

        # Create comprehensive exploration prompt with thinking
        prompt = f"""<think>
The user asked: "{question}"

System context: {system_context}

Let me analyze what they're asking and look at the information I have available.

From the search results, I can see relevant information about:
@@ -16,11 +16,13 @@ from pathlib import Path

try:
    from .llm_safeguards import ModelRunawayDetector, SafeguardConfig, get_optimal_ollama_parameters
    from .system_context import get_system_context
except ImportError:
    # Graceful fallback if safeguards are not available
    ModelRunawayDetector = None
    SafeguardConfig = None
    get_optimal_ollama_parameters = lambda x: {}
    get_system_context = lambda x=None: ""

logger = logging.getLogger(__name__)
@@ -175,12 +177,20 @@ class LLMSynthesizer:
        # Ensure we're initialized
        self._ensure_initialized()

        # Use the best available model with retry logic
        model_to_use = self.model
        if self.model not in self.available_models:
            # Refresh the model list in case of a race condition
            logger.warning(f"Configured model {self.model} not in available list, refreshing...")
            self.available_models = self._get_available_models()

            if self.model in self.available_models:
                model_to_use = self.model
                logger.info(f"Model {self.model} found after refresh")
            elif self.available_models:
                # Fall back to the first available model
                model_to_use = self.available_models[0]
                logger.warning(f"Using fallback model: {model_to_use}")
            else:
                logger.error("No Ollama models available")
                return None
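The retry logic above boils down to a small selection rule: use the configured model if it is listed, refresh the list once in case it was stale, then fall back to the first available model or give up. A standalone sketch (hypothetical function, extracted for clarity):

```python
from typing import Callable, List, Optional

def pick_model(preferred: str, available: List[str],
               refresh: Callable[[], List[str]]) -> Optional[str]:
    """Return the preferred model if available; otherwise refresh the
    model list once, then fall back to the first listed model."""
    if preferred in available:
        return preferred
    available = refresh()  # the cached list may be stale (race condition)
    if preferred in available:
        return preferred
    return available[0] if available else None
```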
@@ -587,9 +597,13 @@ Content: {content[:500]}{'...' if len(content) > 500 else ''}

        context = "\n".join(context_parts)

        # Get system context for better responses
        system_context = get_system_context(project_path)

        # Create synthesis prompt with system context
        prompt = f"""You are a senior software engineer analyzing code search results. Your task is to synthesize the search results into a helpful, actionable summary.

SYSTEM CONTEXT: {system_context}
SEARCH QUERY: "{query}"
PROJECT: {project_path.name}
mini_rag/system_context.py (new file, 123 lines)

@@ -0,0 +1,123 @@
"""
System Context Collection for Enhanced RAG Grounding

Collects minimal system information to help the LLM provide better,
context-aware assistance without compromising privacy.
"""

import platform
import sys
import os
from pathlib import Path
from typing import Dict, Optional


class SystemContextCollector:
    """Collects system context information for enhanced LLM grounding."""

    @staticmethod
    def get_system_context(project_path: Optional[Path] = None) -> str:
        """
        Get concise system context for LLM grounding.

        Args:
            project_path: Current project directory

        Returns:
            Formatted system context string (max 200 chars for privacy)
        """
        try:
            # Basic system info
            os_name = platform.system()
            python_ver = f"{sys.version_info.major}.{sys.version_info.minor}"

            # Simplified OS names
            os_short = {
                'Windows': 'Win',
                'Linux': 'Linux',
                'Darwin': 'macOS'
            }.get(os_name, os_name)

            # Working directory info
            if project_path:
                # Use a relative or shortened path for privacy
                try:
                    rel_path = project_path.relative_to(Path.home())
                    path_info = f"~/{rel_path}"
                except ValueError:
                    # If not relative to home, just use the folder name
                    path_info = project_path.name
            else:
                path_info = Path.cwd().name

            # Trim the path if it is too long for the 200-char limit
            if len(path_info) > 50:
                path_info = f".../{path_info[-45:]}"

            # Command style hints
            cmd_style = "rag.bat" if os_name == "Windows" else "./rag-mini"

            # Format concise context
            context = f"[{os_short} {python_ver}, {path_info}, use {cmd_style}]"

            # Ensure we stay under 200 chars
            if len(context) > 200:
                context = context[:197] + "...]"

            return context

        except Exception:
            # Fall back to minimal info if anything fails
            return f"[{platform.system()}, Python {sys.version_info.major}.{sys.version_info.minor}]"
    @staticmethod
    def get_command_context(os_name: Optional[str] = None) -> Dict[str, str]:
        """
        Get OS-appropriate command examples.

        Returns:
            Dictionary with command patterns for the current OS
        """
        if os_name is None:
            os_name = platform.system()

        if os_name == "Windows":
            return {
                "launcher": "rag.bat",
                "index": "rag.bat index C:\\path\\to\\project",
                "search": "rag.bat search C:\\path\\to\\project \"query\"",
                "explore": "rag.bat explore C:\\path\\to\\project",
                "path_sep": "\\",
                "example_path": "C:\\Users\\username\\Documents\\myproject"
            }
        else:
            return {
                "launcher": "./rag-mini",
                "index": "./rag-mini index /path/to/project",
                "search": "./rag-mini search /path/to/project \"query\"",
                "explore": "./rag-mini explore /path/to/project",
                "path_sep": "/",
                "example_path": "~/Documents/myproject"
            }
def get_system_context(project_path: Optional[Path] = None) -> str:
    """Convenience function to get system context."""
    return SystemContextCollector.get_system_context(project_path)


def get_command_context() -> Dict[str, str]:
    """Convenience function to get command context."""
    return SystemContextCollector.get_command_context()


# Test function
if __name__ == "__main__":
    print("System Context Test:")
    print(f"Context: {get_system_context()}")
    print(f"Context with path: {get_system_context(Path('/tmp/test'))}")
    print()
    print("Command Context:")
    cmds = get_command_context()
    for key, value in cmds.items():
        print(f"  {key}: {value}")