Complete deployment expansion and system context integration

Major enhancements:
• Add comprehensive deployment guide covering all platforms (mobile, edge, cloud)
• Implement system context collection for enhanced AI responses
• Update documentation with current workflows and deployment scenarios
• Fix Windows compatibility bugs in the file locking system
• Enhance diagrams with the system context integration flow
• Improve exploration mode with better context handling

Platform support expanded:
• Full macOS compatibility verified
• Raspberry Pi deployment with ARM64 optimizations
• Android deployment via Termux with configuration examples
• Edge device deployment strategies and performance guidelines
• Docker containerization for universal deployment

Technical improvements:
• System context module provides OS/environment awareness to the AI
• Context-aware prompts improve response relevance
• Enhanced error handling and graceful fallbacks
• Better integration between synthesis and exploration modes

Documentation updates:
• Complete deployment guide with troubleshooting
• Updated getting started guide with current installation flows
• Enhanced visual diagrams showing system architecture
• Platform-specific configuration examples

Ready for extended deployment testing and user feedback.
parent 8e67c76c6d
commit f5de046f95

docs/DEPLOYMENT_GUIDE.md: 381 lines added (new file)
# FSS-Mini-RAG Deployment Guide

> **Run semantic search anywhere - from smartphones to edge devices**
>
> *Complete guide to deploying FSS-Mini-RAG on every platform imaginable*

## Platform Compatibility Matrix

| Platform | Status | AI Features | Installation | Notes |
|----------|--------|-------------|--------------|-------|
| **Linux** | ✅ Full | ✅ Full | `./install_mini_rag.sh` | Primary platform |
| **Windows** | ✅ Full | ✅ Full | `install_windows.bat` | Desktop shortcuts |
| **macOS** | ✅ Full | ✅ Full | `./install_mini_rag.sh` | Works perfectly |
| **Raspberry Pi** | ✅ Excellent | ✅ AI ready | `./install_mini_rag.sh` | ARM64 optimized |
| **Android (Termux)** | ✅ Good | 🟡 Limited | Manual install | Terminal interface |
| **iOS (a-Shell)** | 🟡 Limited | ❌ Text only | Manual install | Sandbox limitations |
| **Docker** | ✅ Excellent | ✅ Full | Dockerfile | Any platform |

## Desktop & Server Deployment

### 🐧 **Linux** (Primary Platform)
```bash
# Full installation with AI features
./install_mini_rag.sh

# What you get:
# ✅ Desktop shortcuts (.desktop files)
# ✅ Application menu integration
# ✅ Full AI model downloads
# ✅ Complete terminal interface
```

### 🪟 **Windows** (Fully Supported)
```cmd
# Full installation with desktop integration
install_windows.bat

# What you get:
# ✅ Desktop shortcuts (.lnk files)
# ✅ Start Menu entries
# ✅ Full AI model downloads
# ✅ Beautiful terminal interface
```

### 🍎 **macOS** (Excellent Support)
```bash
# Same as Linux - works perfectly
./install_mini_rag.sh

# Additional macOS optimizations:
brew install python3  # If needed
brew install ollama   # For AI features
```

**macOS-specific features:**
- Automatic path detection for common project locations
- Integration with Spotlight search locations
- Support for `.app` bundle creation (advanced)
## Edge Device Deployment

### 🥧 **Raspberry Pi** (Recommended Edge Platform)

**Perfect for:**
- Home lab semantic search server
- Portable development environment
- IoT project documentation search
- Offline code search station

**Installation:**
```bash
# On Raspberry Pi OS (64-bit recommended)
sudo apt update && sudo apt upgrade
./install_mini_rag.sh

# The installer automatically detects ARM and optimizes:
# ✅ Suggests lightweight models (qwen3:0.6b)
# ✅ Reduces memory usage
# ✅ Enables efficient chunking
```

**Raspberry Pi optimized config:**
```yaml
# Automatically generated for Pi
embedding:
  preferred_method: ollama
  ollama_model: nomic-embed-text  # 270MB - perfect for Pi

llm:
  synthesis_model: qwen3:0.6b  # 500MB - fast on Pi 4+
  context_window: 4096         # Conservative memory use
  cpu_optimized: true

chunking:
  max_size: 1500  # Smaller chunks for efficiency
```

**Performance expectations:**
- **Pi 4 (4GB)**: Excellent performance, full AI features
- **Pi 4 (2GB)**: Good performance, text-only or small models
- **Pi 5**: Outstanding performance, handles large models
- **Pi Zero**: Text-only search (hash-based embeddings)
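The tiers above can be turned into a quick self-check before downloading models. A minimal sketch (the memory thresholds and the `hash-only` label are illustrative assumptions; the model names come from this guide):

```shell
# Suggest a synthesis model from available memory (in MB).
pick_model() {
    mem_mb="$1"
    if [ "$mem_mb" -ge 3500 ]; then
        echo "qwen3:1.7b"      # 4GB+ boards: room for full AI features
    elif [ "$mem_mb" -ge 1500 ]; then
        echo "qwen3:0.6b"      # 2GB boards: lightweight model
    else
        echo "hash-only"       # Pi Zero class: text-only search
    fi
}

# Print a suggestion for this machine when free(1) is available.
if command -v free >/dev/null 2>&1; then
    pick_model "$(free -m | awk '/^Mem:/ {print $2}')"
fi
```

Run it once after first boot; whatever it prints is only a starting point, since swap and other workloads also matter.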
### 🔧 **Other Edge Devices**

**NVIDIA Jetson Series:**
- Overkill performance for this use case
- Can run the largest models with GPU acceleration
- Perfect for AI-heavy development workstations

**Intel NUC / Mini PCs:**
- Excellent performance
- Full desktop experience
- Can serve multiple users simultaneously

**Orange Pi / Rock Pi:**
- Similar to Raspberry Pi
- Same installation process
- May need manual Ollama compilation
## Mobile Deployment

### 📱 **Android (Recommended: Termux)**

**Installation in Termux:**
```bash
# Install Termux from F-Droid (not Play Store)
# In Termux:
pkg update && pkg upgrade
pkg install python python-pip git
pip install --upgrade pip

# Clone and install FSS-Mini-RAG
git clone https://github.com/your-repo/fss-mini-rag
cd fss-mini-rag
pip install -r requirements.txt

# Quick start
python -m mini_rag index /storage/emulated/0/Documents/myproject
python -m mini_rag search /storage/emulated/0/Documents/myproject "your query"
```

**Android-optimized config:**
```yaml
# config-android.yaml
embedding:
  preferred_method: hash  # No heavy models needed

chunking:
  max_size: 800  # Small chunks for mobile

files:
  min_file_size: 20  # Include more small files

llm:
  enable_synthesis: false  # Text-only for speed
```

**What works on Android:**
- ✅ Full text search and indexing
- ✅ Terminal interface (`rag-tui`)
- ✅ Project indexing from phone storage
- ✅ Search your phone's code projects
- ❌ Heavy AI models (use cloud providers instead)

**Android use cases:**
- Search your mobile development projects
- Index documentation on your phone
- Quick code reference while traveling
- Offline search of downloaded repositories
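The hash embedding method in the config above is what lets phones skip model downloads entirely. Conceptually it is feature hashing of tokens into a fixed-size vector; a toy sketch of the idea (not the project's actual implementation):

```python
import hashlib

def hash_embed(text: str, dims: int = 256) -> list[float]:
    """Feature-hash tokens into a fixed-size unit vector (illustrative only)."""
    vec = [0.0] * dims
    for token in text.lower().split():
        digest = hashlib.md5(token.encode()).digest()
        index = int.from_bytes(digest[:4], "little") % dims  # bucket for this token
        sign = 1.0 if digest[4] % 2 == 0 else -1.0           # signed to reduce collisions
        vec[index] += sign
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]

print(len(hash_embed("error handling for database connections")))
```

Identical text always maps to the same vector, so offline search stays deterministic; the trade-off is that semantic similarity is much weaker than with real embedding models like `nomic-embed-text`.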
### 🍎 **iOS (Limited but Possible)**

**Option 1: a-Shell (Free)**
```bash
# Install a-Shell from App Store
# In a-Shell:
pip install requests  # pathlib is already in the standard library

# Limited installation (core features only)
# Files must be in app sandbox
```

**Option 2: iSH (Alpine Linux)**
```bash
# Install iSH from App Store
# In iSH terminal:
apk add python3 py3-pip git
pip install -r requirements-light.txt

# Basic functionality only
```

**iOS limitations:**
- Sandbox restricts file access
- No full AI model support
- Terminal interface only
- Limited to app-accessible files
## Specialized Deployment Scenarios

### 🐳 **Docker Deployment**

**For any platform with Docker:**
```dockerfile
# Dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY . .
RUN pip install -r requirements.txt

# Expose port for server mode
EXPOSE 7777

# Default to TUI interface
CMD ["python", "-m", "mini_rag.cli"]
```

**Usage:**
```bash
# Build and run
docker build -t fss-mini-rag .
docker run -it -v "$(pwd)/projects:/projects" fss-mini-rag

# Server mode for web access
docker run -p 7777:7777 fss-mini-rag python -m mini_rag server
```
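For repeatable deployments, the same run commands can be captured in Compose form. A sketch (the service name and host paths are illustrative; the port and server command mirror the Dockerfile above):

```yaml
# docker-compose.yml (illustrative)
services:
  fss-mini-rag:
    build: .
    ports:
      - "7777:7777"           # server mode
    volumes:
      - ./projects:/projects  # projects to index
    command: python -m mini_rag server
```

With this in place, `docker compose up` replaces the manual build/run pair.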
### ☁️ **Cloud Deployment**

**AWS/GCP/Azure VM:**
- Same as Linux installation
- Can serve multiple users
- Perfect for team environments

**GitHub Codespaces:**
```bash
# Works in any Codespace
./install_mini_rag.sh
# Perfect for searching your workspace
```

**Replit/CodeSandbox:**
- Limited by platform restrictions
- Basic functionality available

### 🏠 **Home Lab Integration**

**Home Assistant Add-on:**
- Package as a Home Assistant add-on
- Search home automation configs
- Voice integration possible

**NAS Integration:**
- Install on Synology/QNAP
- Search all stored documents
- Family code documentation

**Router with USB:**
- Install on OpenWrt routers with USB storage
- Search network documentation
- Configuration management
## Configuration by Use Case

### 🪶 **Ultra-Lightweight (Old hardware, mobile)**
```yaml
# Minimal resource usage
embedding:
  preferred_method: hash
chunking:
  max_size: 800
  strategy: fixed
llm:
  enable_synthesis: false
```

### ⚖️ **Balanced (Raspberry Pi, older laptops)**
```yaml
# Good performance with AI features
embedding:
  preferred_method: ollama
  ollama_model: nomic-embed-text
llm:
  synthesis_model: qwen3:0.6b
  context_window: 4096
```

### 🚀 **Performance (Modern hardware)**
```yaml
# Full features and performance
embedding:
  preferred_method: ollama
  ollama_model: nomic-embed-text
llm:
  synthesis_model: qwen3:1.7b
  context_window: 16384
  enable_thinking: true
```

### ☁️ **Cloud-Hybrid (Mobile + Cloud AI)**
```yaml
# Local search, cloud intelligence
embedding:
  preferred_method: hash
llm:
  provider: openai
  api_key: your_api_key
  synthesis_model: gpt-4
```
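The four profiles map onto a simple hardware check. A hypothetical helper (the detection heuristic is an assumption, not logic shipped with FSS-Mini-RAG; the profile names and model choices come from the configs above):

```python
import platform

def pick_profile(machine: str, ram_gb: float, has_api_key: bool = False) -> str:
    """Map hardware to one of the four config profiles above (heuristic sketch)."""
    if has_api_key and ram_gb < 4:
        return "cloud-hybrid"        # hash search locally, cloud LLM for synthesis
    if ram_gb < 2:
        return "ultra-lightweight"   # hash embeddings, no synthesis
    if machine.startswith(("arm", "aarch64")) or ram_gb < 8:
        return "balanced"            # nomic-embed-text + qwen3:0.6b
    return "performance"             # qwen3:1.7b with a 16K context

print(pick_profile(platform.machine(), 16.0))
```

Treat the result as a starting point and adjust after watching real memory use during indexing.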
## Troubleshooting by Platform

### **Raspberry Pi Issues**
- **Out of memory**: Reduce context window, use smaller models
- **Slow indexing**: Use hash-based embeddings
- **Model download fails**: Check internet, use smaller models

### **Android/Termux Issues**
- **Permission denied**: Use `termux-setup-storage`
- **Package install fails**: Update packages first
- **Can't access files**: Use `/storage/emulated/0/` paths

### **iOS Issues**
- **Limited functionality**: Expected due to iOS restrictions
- **Can't install packages**: Use lighter requirements file
- **File access denied**: Files must be in app sandbox

### **Edge Device Issues**
- **ARM compatibility**: Ensure using ARM64 Python packages
- **Limited RAM**: Use hash embeddings, reduce chunk sizes
- **No internet**: Skip AI model downloads, use text-only

## Advanced Edge Deployments

### **IoT Integration**
- Index sensor logs and configurations
- Search device documentation
- Troubleshoot IoT deployments

### **Offline Development**
- Complete development environment on edge device
- No internet required after setup
- Perfect for remote locations

### **Educational Use**
- Raspberry Pi computer labs
- Student project search
- Coding bootcamp environments

### **Enterprise Edge**
- Factory floor documentation search
- Field service technical reference
- Remote site troubleshooting
---

## Quick Start by Platform

### Desktop Users
```bash
# Linux/macOS
./install_mini_rag.sh

# Windows
install_windows.bat
```

### Edge/Mobile Users
```bash
# Raspberry Pi
./install_mini_rag.sh

# Android (Termux)
pkg install python git && pip install -r requirements.txt

# Any Docker platform
docker run -it fss-mini-rag
```

**💡 Pro tip**: Start with your current platform, then expand to edge devices as needed. The system scales from smartphones to servers seamlessly!
````diff
@@ -11,6 +11,7 @@
 - [Search Architecture](#search-architecture)
 - [Installation Flow](#installation-flow)
 - [Configuration System](#configuration-system)
+- [System Context Integration](#system-context-integration)
 - [Error Handling](#error-handling)
 
 ## System Overview
@@ -22,10 +23,12 @@ graph TB
 
     CLI --> Index[📁 Index Project]
     CLI --> Search[🔍 Search Project]
+    CLI --> Explore[🧠 Explore Project]
     CLI --> Status[📊 Show Status]
 
     TUI --> Index
     TUI --> Search
+    TUI --> Explore
     TUI --> Config[⚙️ Configuration]
 
     Index --> Files[📄 File Discovery]
@@ -34,17 +37,32 @@ graph TB
     Embed --> Store[💾 Vector Database]
 
     Search --> Query[❓ User Query]
+    Search --> Context[🖥️ System Context]
     Query --> Vector[🎯 Vector Search]
     Query --> Keyword[🔤 Keyword Search]
     Vector --> Combine[🔄 Hybrid Results]
     Keyword --> Combine
-    Combine --> Results[📋 Ranked Results]
+    Context --> Combine
+    Combine --> Synthesize{Synthesis Mode?}
+
+    Synthesize -->|Yes| FastLLM[⚡ Fast Synthesis]
+    Synthesize -->|No| Results[📋 Ranked Results]
+    FastLLM --> Results
+
+    Explore --> ExploreQuery[❓ Interactive Query]
+    ExploreQuery --> Memory[🧠 Conversation Memory]
+    ExploreQuery --> Context
+    Memory --> DeepLLM[🤔 Deep AI Analysis]
+    Context --> DeepLLM
+    Vector --> DeepLLM
+    DeepLLM --> Interactive[💬 Interactive Response]
 
     Store --> LanceDB[(🗄️ LanceDB)]
     Vector --> LanceDB
 
     Config --> YAML[📝 config.yaml]
     Status --> Manifest[📋 manifest.json]
+    Context --> SystemInfo[💻 OS, Python, Paths]
 ```
 
 ## User Journey
@@ -276,6 +294,58 @@ flowchart TD
     style Error fill:#ffcdd2
 ```
+
+## System Context Integration
+
+```mermaid
+graph LR
+    subgraph "System Detection"
+        OS[🖥️ Operating System]
+        Python[🐍 Python Version]
+        Project[📁 Project Path]
+
+        OS --> Windows[Windows: rag.bat]
+        OS --> Linux[Linux: ./rag-mini]
+        OS --> macOS[macOS: ./rag-mini]
+    end
+
+    subgraph "Context Collection"
+        Collect[🔍 Collect Context]
+        OS --> Collect
+        Python --> Collect
+        Project --> Collect
+
+        Collect --> Format[📝 Format Context]
+        Format --> Limit[✂️ Limit to 200 chars]
+    end
+
+    subgraph "AI Integration"
+        UserQuery[❓ User Query]
+        SearchResults[📋 Search Results]
+        SystemContext[💻 System Context]
+
+        UserQuery --> Prompt[📝 Build Prompt]
+        SearchResults --> Prompt
+        SystemContext --> Prompt
+
+        Prompt --> AI[🤖 LLM Processing]
+        AI --> Response[💬 Contextual Response]
+    end
+
+    subgraph "Enhanced Responses"
+        Response --> Commands[💻 OS-specific commands]
+        Response --> Paths[📂 Correct path formats]
+        Response --> Tips[💡 Platform-specific tips]
+    end
+
+    Format --> SystemContext
+
+    style SystemContext fill:#e3f2fd
+    style Response fill:#f3e5f5
+    style Commands fill:#e8f5e8
+```
+
+*System context helps the AI provide better, platform-specific guidance without compromising privacy*
+
 ## Architecture Layers
 
 ```mermaid
````
````diff
@@ -1,212 +1,314 @@
 # Getting Started with FSS-Mini-RAG
 
-## Step 1: Installation
+> **Get from zero to searching in 2 minutes**
+> *Everything you need to know to start finding code by meaning, not just keywords*
 
-Choose your installation based on what you want:
+## Installation (Choose Your Adventure)
 
-### Option A: Ollama Only (Recommended)
+### 🎯 **Option 1: Full Installation (Recommended)**
+*Gets you everything working reliably with desktop shortcuts and AI features*
+
+**Linux/macOS:**
 ```bash
-# Install Ollama first
-curl -fsSL https://ollama.ai/install.sh | sh
-
-# Pull the embedding model
-ollama pull nomic-embed-text
-
-# Install Python dependencies
-pip install -r requirements.txt
+./install_mini_rag.sh
 ```
 
-### Option B: Full ML Stack
-```bash
-# Install everything including PyTorch
-pip install -r requirements-full.txt
+**Windows:**
+```cmd
+install_windows.bat
 ```
 
-## Step 2: Test Installation
+**What this does:**
+- Sets up Python environment automatically
+- Installs all dependencies
+- Downloads AI models (with your permission)
+- Creates desktop shortcuts and application menu entries
+- Tests everything works
+- Gives you an interactive tutorial
+
+**Time needed:** 5-10 minutes (depending on AI model downloads)
+
+---
+
+### 🚀 **Option 2: Copy & Try (Experimental)**
+*Just copy the folder and run - may work, may need manual setup*
+
+**Linux/macOS:**
 ```bash
-# Index this RAG system itself
+# Copy folder anywhere and try running
+./rag-mini index ~/my-project
+# Auto-setup attempts to create virtual environment
+# Falls back with clear instructions if it fails
+```
+
+**Windows:**
+```cmd
+# Copy folder anywhere and try running
+rag.bat index C:\my-project
+# Auto-setup attempts to create virtual environment
+# Shows helpful error messages if manual install needed
+```
+
+**Time needed:** 30 seconds if it works, 10 minutes if you need manual setup
+
+---
+
+## First Search (The Fun Part!)
+
+### Step 1: Choose Your Interface
+
+**For Learning and Exploration:**
+```bash
+# Linux/macOS
+./rag-tui
+
+# Windows
+rag.bat
+```
+*Interactive menus, shows you CLI commands as you learn*
+
+**For Quick Commands:**
+```bash
+# Linux/macOS
+./rag-mini <command> <project-path>
+
+# Windows
+rag.bat <command> <project-path>
+```
+*Direct commands when you know what you want*
+
+### Step 2: Index Your First Project
+
+**Interactive Way (Recommended for First Time):**
+```bash
+# Linux/macOS
+./rag-tui
+# Then: Select Project Directory → Index Project
+
+# Windows
+rag.bat
+# Then: Select Project Directory → Index Project
+```
+
+**Direct Commands:**
+```bash
+# Linux/macOS
 ./rag-mini index ~/my-project
 
-# Search for something
-./rag-mini search ~/my-project "chunker function"
-
-# Check what got indexed
-./rag-mini status ~/my-project
+# Windows
+rag.bat index C:\my-project
 ```
 
-## Step 3: Index Your First Project
+**What indexing does:**
+- Finds all text files in your project
+- Breaks them into smart "chunks" (functions, classes, logical sections)
+- Creates searchable embeddings that understand meaning
+- Stores everything in a fast vector database
+- Creates a `.mini-rag/` directory with your search index
+
+**Time needed:** 10-60 seconds depending on project size
+
+### Step 3: Search by Meaning
+
+**Natural language queries:**
 ```bash
-# Index any project directory
-./rag-mini index /path/to/your/project
-
-# The system creates .mini-rag/ directory with:
-# - config.json (settings)
-# - manifest.json (file tracking)
-# - database.lance/ (vector database)
+# Linux/macOS
+./rag-mini search ~/my-project "user authentication logic"
+./rag-mini search ~/my-project "error handling for database connections"
+./rag-mini search ~/my-project "how to validate input data"
+
+# Windows
+rag.bat search C:\my-project "user authentication logic"
+rag.bat search C:\my-project "error handling for database connections"
+rag.bat search C:\my-project "how to validate input data"
 ```
 
-## Step 4: Search Your Code
+**Code concepts:**
 
 ```bash
-# Basic semantic search
-./rag-mini search /path/to/project "user login logic"
-
-# Enhanced search with smart features
-./rag-mini-enhanced search /path/to/project "authentication"
-
-# Find similar patterns
-./rag-mini-enhanced similar /path/to/project "def validate_input"
+# Finds login functions, auth middleware, session handling
+./rag-mini search ~/my-project "login functionality"
+
+# Finds try/catch blocks, error handlers, retry logic
+./rag-mini search ~/my-project "exception handling"
+
+# Finds validation functions, input sanitization, data checking
+./rag-mini search ~/my-project "data validation"
 ```
 
-## Step 5: Customize Configuration
+**What you get:**
+- Ranked results by relevance (not just keyword matching)
+- File paths and line numbers for easy navigation
+- Context around each match so you understand what it does
+- Smart filtering to avoid noise and duplicates
 
-Edit `project/.mini-rag/config.json`:
+## Two Powerful Modes
 
-```json
-{
-  "chunking": {
-    "max_size": 3000,
-    "strategy": "semantic"
-  },
-  "files": {
-    "min_file_size": 100
-  }
-}
-```
-
-Then re-index to apply changes:
+FSS-Mini-RAG has two different ways to get answers, optimized for different needs:
+
+### 🚀 **Synthesis Mode** - Fast Answers
 ```bash
-./rag-mini index /path/to/project --force
+# Linux/macOS
+./rag-mini search ~/project "authentication logic" --synthesize
+
+# Windows
+rag.bat search C:\project "authentication logic" --synthesize
 ```
 
-## Common Use Cases
+**Perfect for:**
+- Quick code discovery
+- Finding specific functions or patterns
+- Getting fast, consistent answers
 
-### Find Functions by Name
-```bash
-./rag-mini search /project "function named connect_to_database"
-```
+**What you get:**
+- Lightning-fast responses (no thinking overhead)
+- Reliable, factual information about your code
+- Clear explanations of what code does and how it works
 
-### Find Code Patterns
-```bash
-./rag-mini search /project "error handling try catch"
-./rag-mini search /project "database query with parameters"
-```
-
-### Find Configuration
+### 🧠 **Exploration Mode** - Deep Understanding
+
 ```bash
-./rag-mini search /project "database connection settings"
-./rag-mini search /project "environment variables"
+# Linux/macOS
+./rag-mini explore ~/project
+
+# Windows
+rag.bat explore C:\project
 ```
 
-### Find Documentation
+**Perfect for:**
+- Learning new codebases
+- Debugging complex issues
+- Understanding architectural decisions
+
+**What you get:**
+- Interactive conversation with AI that remembers context
+- Deep reasoning with full "thinking" process shown
+- Follow-up questions and detailed explanations
+- Memory of your previous questions in the session
+
+**Example exploration session:**
+```
+🧠 Exploration Mode - Ask anything about your project
+
+You: How does authentication work in this codebase?
+
+AI: Let me analyze the authentication system...
+
+💭 Thinking: I can see several authentication-related files. Let me examine
+the login flow, session management, and security measures...
+
+📝 Authentication Analysis:
+This codebase uses a three-layer authentication system:
+1. Login validation in auth.py handles username/password checking
+2. Session management in sessions.py maintains user state
+3. Middleware in auth_middleware.py protects routes
+
+You: What security concerns should I be aware of?
+
+AI: Based on our previous discussion about authentication, let me check for
+common security vulnerabilities...
+```
+
+## Check Your Setup
+
+**See what got indexed:**
 ```bash
-./rag-mini search /project "how to deploy"
-./rag-mini search /project "API documentation"
+# Linux/macOS
+./rag-mini status ~/my-project
+
+# Windows
+rag.bat status C:\my-project
 ```
 
-## Python API Usage
+**What you'll see:**
+- How many files were processed
+- Total chunks created for searching
+- Embedding method being used (Ollama, ML models, or hash-based)
+- Configuration file location
+- Index health and last update time
 
-```python
-from mini_rag import ProjectIndexer, CodeSearcher, CodeEmbedder
-from pathlib import Path
-
-# Initialize
-project_path = Path("/path/to/your/project")
-embedder = CodeEmbedder()
-indexer = ProjectIndexer(project_path, embedder)
-searcher = CodeSearcher(project_path, embedder)
-
-# Index the project
-print("Indexing project...")
-result = indexer.index_project()
-print(f"Indexed {result['files_processed']} files, {result['chunks_created']} chunks")
-
-# Search
-print("\nSearching for authentication code...")
-results = searcher.search("user authentication logic", top_k=5)
-
-for i, result in enumerate(results, 1):
-    print(f"\n{i}. {result.file_path}")
-    print(f"   Score: {result.score:.3f}")
-    print(f"   Type: {result.chunk_type}")
-    print(f"   Content: {result.content[:100]}...")
-```
+## Configuration (Optional)
+
+Your project gets a `.mini-rag/config.yaml` file with helpful comments:
+
+```yaml
+# Context window configuration (critical for AI features)
+# 💡 Sizing guide: 2K=1 question, 4K=1-2 questions, 8K=manageable, 16K=most users
+# 32K=large codebases, 64K+=power users only
+# ⚠️ Larger contexts use exponentially more CPU/memory - only increase if needed
+context_window: 16384  # Context size in tokens
+
+# AI model preferences (edit to change priority)
+model_rankings:
+  - "qwen3:1.7b"  # Excellent for RAG (1.4GB, recommended)
+  - "qwen3:0.6b"  # Lightweight and fast (~500MB)
+  - "qwen3:4b"    # Higher quality but slower (~2.5GB)
+```
 
-## Advanced Features
+**When to customize:**
+- Your searches aren't finding what you expect → adjust chunking settings
+- You want AI features → install Ollama and download models
+- System is slow → try smaller models or reduce context window
+- Getting too many/few results → adjust similarity threshold
 
-### Auto-optimization
+## Troubleshooting
+
+### "Project not indexed"
+**Problem:** You're trying to search before indexing
 ```bash
-# Get optimization suggestions
-./rag-mini-enhanced analyze /path/to/project
-
-# This analyzes your codebase and suggests:
-# - Better chunk sizes for your language mix
-# - Streaming settings for large files
-# - File filtering optimizations
+# Run indexing first
+./rag-mini index ~/my-project   # Linux/macOS
+rag.bat index C:\my-project     # Windows
 ```
 
-### File Watching
-```python
-from mini_rag import FileWatcher
-
-# Watch for file changes and auto-update index
-watcher = FileWatcher(project_path, indexer)
-watcher.start_watching()
-
-# Now any file changes automatically update the index
+### "No Ollama models available"
+**Problem:** AI features need models downloaded
+```bash
+# Install Ollama first
+curl -fsSL https://ollama.ai/install.sh | sh   # Linux/macOS
+# Or download from https://ollama.com          # Windows
+
+# Start Ollama server
+ollama serve
+
+# Download a model
+ollama pull qwen3:1.7b
 ```
 
-### Custom Chunking
-```python
-from mini_rag import CodeChunker
-
-chunker = CodeChunker()
-
-# Chunk a Python file
-with open("example.py") as f:
-    content = f.read()
-
-chunks = chunker.chunk_text(content, "python", "example.py")
-for chunk in chunks:
+### "Virtual environment not found"
+**Problem:** Auto-setup didn't work, need manual installation
+```bash
+# Run the full installer instead
+./install_mini_rag.sh   # Linux/macOS
+install_windows.bat     # Windows
````
||||||
print(f"Type: {chunk.chunk_type}")
|
|
||||||
print(f"Content: {chunk.content}")
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## Tips and Best Practices
|
### Getting weird results
|
||||||
|
**Solution:** Try different search terms or check what got indexed
|
||||||
|
```bash
|
||||||
|
# See what files were processed
|
||||||
|
./rag-mini status ~/my-project
|
||||||
|
|
||||||
### For Better Search Results
|
# Try more specific queries
|
||||||
- Use descriptive phrases: "function that validates email addresses"
|
./rag-mini search ~/my-project "specific function name"
|
||||||
- Try different phrasings if first search doesn't work
|
```
|
||||||
- Search for concepts, not just exact variable names
|
|
||||||
|
|
||||||
### For Better Indexing
|
## Next Steps
|
||||||
- Exclude build directories: `node_modules/`, `build/`, `dist/`
|
|
||||||
- Include documentation files - they often contain valuable context
|
|
||||||
- Use semantic chunking strategy for most projects
|
|
||||||
|
|
||||||
### For Configuration
|
### Learn More
|
||||||
- Start with default settings
|
- **[Beginner's Glossary](BEGINNER_GLOSSARY.md)** - All the terms explained simply
|
||||||
- Use `analyze` command to get optimization suggestions
|
- **[TUI Guide](TUI_GUIDE.md)** - Master the interactive interface
|
||||||
- Increase chunk size for larger functions/classes
|
- **[Visual Diagrams](DIAGRAMS.md)** - See how everything works
|
||||||
- Decrease chunk size for more granular search
|
|
||||||
|
|
||||||
### For Troubleshooting
|
### Advanced Features
|
||||||
- Check `./rag-mini status` to see what was indexed
|
- **[Query Expansion](QUERY_EXPANSION.md)** - Make searches smarter with AI
|
||||||
- Look at `.mini-rag/manifest.json` for file details
|
- **[LLM Providers](LLM_PROVIDERS.md)** - Use different AI models
|
||||||
- Run with `--force` to completely rebuild index
|
- **[CPU Deployment](CPU_DEPLOYMENT.md)** - Optimize for older computers
|
||||||
- Check logs in `.mini-rag/` directory for errors
|
|
||||||
|
|
||||||
## What's Next?
|
### Customize Everything
|
||||||
|
- **[Technical Guide](TECHNICAL_GUIDE.md)** - How the system actually works
|
||||||
|
- **[Configuration Examples](../examples/)** - Pre-made configs for different needs
|
||||||
|
|
||||||
1. Try the test suite to understand how components work:
|
---
|
||||||
```bash
|
|
||||||
python -m pytest tests/ -v
|
|
||||||
```
|
|
||||||
|
|
||||||
2. Look at the examples in `examples/` directory
|
**🎉 That's it!** You now have a semantic search system that understands your code by meaning, not just keywords. Start with simple searches and work your way up to the advanced AI features as you get comfortable.
|
||||||
|
|
||||||
3. Read the main README.md for complete technical details
|
**💡 Pro tip:** The best way to learn is to index a project you know well and try searching for things you know are in there. You'll quickly see how much better meaning-based search is than traditional keyword search.
|
||||||
|
|
||||||
4. Customize the system for your specific project needs
|
|
||||||
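The `result.score` values shown in the replaced Python API example come from vector similarity between the query embedding and each chunk embedding. As a rough illustration only (this is not the project's actual scoring code), cosine similarity behaves like this:

```python
import math

def cosine(a, b):
    # Cosine similarity: 1.0 for vectors pointing the same direction,
    # ~0.0 for unrelated ones, regardless of vector magnitude.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Parallel vectors score 1.0 even though their magnitudes differ
print(round(cosine([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]), 3))
```

This is why semantic search can rank a paraphrase highly: nearby embeddings score close to 1.0 even with zero keyword overlap.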
```diff
@@ -194,6 +194,16 @@ class ConfigManager:

                 return config

+        except yaml.YAMLError as e:
+            # YAML syntax error - help user fix it instead of silent fallback
+            error_msg = f"⚠️ Config file has YAML syntax error at line {getattr(e, 'problem_mark', 'unknown')}: {e}"
+            logger.error(error_msg)
+            print(f"\n{error_msg}")
+            print(f"Config file: {self.config_path}")
+            print("💡 Check YAML syntax (indentation, quotes, colons)")
+            print("💡 Or delete config file to reset to defaults")
+            return RAGConfig()  # Still return defaults but warn user
+
         except Exception as e:
             logger.error(f"Failed to load config from {self.config_path}: {e}")
             logger.info("Using default configuration")
```
```diff
@@ -210,8 +220,16 @@ class ConfigManager:
         # Create YAML content with comments
         yaml_content = self._create_yaml_with_comments(config_dict)

+        # Write with basic file locking to prevent corruption
         with open(self.config_path, 'w') as f:
-            f.write(yaml_content)
+            try:
+                import fcntl
+                fcntl.flock(f.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)  # Non-blocking exclusive lock
+                f.write(yaml_content)
+                fcntl.flock(f.fileno(), fcntl.LOCK_UN)  # Unlock
+            except (OSError, ImportError):
+                # Fallback for Windows or if fcntl unavailable
+                f.write(yaml_content)

         logger.info(f"Configuration saved to {self.config_path}")
```
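The locking change in this hunk can be exercised standalone. The sketch below (the function name is illustrative, not from the codebase) reproduces the same pattern: attempt a POSIX advisory lock via `fcntl`, and fall back to a plain write on platforms where `fcntl` does not exist (Windows) or the lock cannot be taken:

```python
import os
import tempfile

def write_with_lock(path, text):
    # Acquire a non-blocking exclusive advisory lock where fcntl exists (POSIX).
    # On Windows fcntl is unavailable, so ImportError drops through to a plain write;
    # OSError covers a lock already held by another process.
    with open(path, "w") as f:
        try:
            import fcntl
            fcntl.flock(f.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
            f.write(text)
            fcntl.flock(f.fileno(), fcntl.LOCK_UN)
        except (OSError, ImportError):
            f.write(text)

config_path = os.path.join(tempfile.mkdtemp(), "config.yaml")
write_with_lock(config_path, "context_window: 16384\n")
print(open(config_path).read().strip())
```

Note that `flock` locks are advisory: they only coordinate writers that also take the lock, which is sufficient to stop two instances of the tool clobbering the same config file.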
```diff
@@ -274,7 +292,11 @@ class ConfigManager:
             f"  synthesis_temperature: {config_dict['llm']['synthesis_temperature']}  # LLM temperature for analysis",
             "",
             "  # Context window configuration (critical for RAG performance)",
-            f"  context_window: {config_dict['llm']['context_window']}  # Context size in tokens (8K=fast, 16K=balanced, 32K=advanced)",
+            "  # 💡 Sizing guide: 2K=1 question, 4K=1-2 questions, 8K=manageable, 16K=most users",
+            "  # 32K=large codebases, 64K+=power users only",
+            "  # ⚠️ Larger contexts use exponentially more CPU/memory - only increase if needed",
+            "  # 🔧 Low context limits? Try smaller topk, better search terms, or archive noise",
+            f"  context_window: {config_dict['llm']['context_window']}  # Context size in tokens",
             f"  auto_context: {str(config_dict['llm']['auto_context']).lower()}  # Auto-adjust context based on model capabilities",
             "",
             "  model_rankings:  # Preferred model order (edit to change priority)",
```
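This hunk builds the commented YAML line by line from a config dict. One detail worth noticing is the `str(...).lower()` step, which turns Python's `True` into YAML's `true`. A minimal standalone sketch of the same technique (values are illustrative):

```python
# Hand-rolled commented-YAML generation, mirroring _create_yaml_with_comments:
# plain strings for comment lines, f-strings for values pulled from the dict.
config_dict = {"llm": {"context_window": 16384, "auto_context": True}}

lines = [
    "  # Context window configuration (critical for RAG performance)",
    f"  context_window: {config_dict['llm']['context_window']}  # Context size in tokens",
    # str(True).lower() -> "true", the boolean literal YAML expects
    f"  auto_context: {str(config_dict['llm']['auto_context']).lower()}  # Auto-adjust context",
]
print("\n".join(lines))
```

Generating the text directly (rather than via `yaml.dump`) is what lets the file keep inline comments, which a round-trip through a plain YAML serializer would discard.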
```diff
@@ -17,11 +17,13 @@ try:
     from .llm_synthesizer import LLMSynthesizer, SynthesisResult
     from .search import CodeSearcher
     from .config import RAGConfig
+    from .system_context import get_system_context
 except ImportError:
     # For direct testing
     from llm_synthesizer import LLMSynthesizer, SynthesisResult
     from search import CodeSearcher
     from config import RAGConfig
+    get_system_context = lambda x=None: ""

 logger = logging.getLogger(__name__)
```
```diff
@@ -154,10 +156,15 @@ Content: {content[:800]}{'...' if len(content) > 800 else ''}

         results_text = "\n".join(results_context)

+        # Get system context for better responses
+        system_context = get_system_context(self.project_path)
+
         # Create comprehensive exploration prompt with thinking
         prompt = f"""<think>
 The user asked: "{question}"

+System context: {system_context}
+
 Let me analyze what they're asking and look at the information I have available.

 From the search results, I can see relevant information about:
```
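Because the import fallback stubs `get_system_context` to return an empty string, the exploration prompt degrades gracefully when the system-context module is missing. A toy version of that contract (function and strings are illustrative, not the project's exact prompt):

```python
def build_exploration_prompt(question, system_context=""):
    # Inject the system-context line only when the collector returned something;
    # with the stubbed fallback ("" return value) the prompt simply omits it.
    ctx_line = f"\nSystem context: {system_context}\n" if system_context else "\n"
    return f'<think>\nThe user asked: "{question}"{ctx_line}</think>'

with_ctx = build_exploration_prompt("where is auth?", "[Linux 3.11, ~/proj, use ./rag-mini]")
without_ctx = build_exploration_prompt("where is auth?")
print("System context:" in with_ctx, "System context:" in without_ctx)
```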
```diff
@@ -16,11 +16,13 @@ from pathlib import Path

 try:
     from .llm_safeguards import ModelRunawayDetector, SafeguardConfig, get_optimal_ollama_parameters
+    from .system_context import get_system_context
 except ImportError:
     # Graceful fallback if safeguards not available
     ModelRunawayDetector = None
     SafeguardConfig = None
     get_optimal_ollama_parameters = lambda x: {}
+    get_system_context = lambda x=None: ""

 logger = logging.getLogger(__name__)
```
```diff
@@ -175,12 +177,20 @@ class LLMSynthesizer:
         # Ensure we're initialized
         self._ensure_initialized()

-        # Use the best available model
+        # Use the best available model with retry logic
         model_to_use = self.model
         if self.model not in self.available_models:
-            # Fallback to first available model
-            if self.available_models:
+            # Refresh model list in case of race condition
+            logger.warning(f"Configured model {self.model} not in available list, refreshing...")
+            self.available_models = self._get_available_models()
+
+            if self.model in self.available_models:
+                model_to_use = self.model
+                logger.info(f"Model {self.model} found after refresh")
+            elif self.available_models:
+                # Fallback to first available model
                 model_to_use = self.available_models[0]
+                logger.warning(f"Using fallback model: {model_to_use}")
             else:
                 logger.error("No Ollama models available")
                 return None
```
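The refresh-then-fallback logic in this hunk is easy to test in isolation. A stripped-down sketch (function and model names are illustrative): if the preferred model is missing, re-query the list once (covering a race with a concurrent `ollama pull`), then fall back to the first available model, and only fail when nothing is installed:

```python
def pick_model(preferred, available, refresh):
    # Preferred model already present: use it directly.
    if preferred in available:
        return preferred
    # Missing: refresh the list once in case it appeared since startup.
    available = refresh()
    if preferred in available:
        return preferred
    # Still missing: fall back to the first available model, or None.
    return available[0] if available else None

# Preferred model shows up only after the refresh
print(pick_model("qwen3:1.7b", [], lambda: ["qwen3:1.7b"]))
# Preferred model never appears, so fall back to the first entry
print(pick_model("qwen3:4b", ["qwen3:0.6b"], lambda: ["qwen3:0.6b"]))
```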
```diff
@@ -587,9 +597,13 @@ Content: {content[:500]}{'...' if len(content) > 500 else ''}

         context = "\n".join(context_parts)

-        # Create synthesis prompt
+        # Get system context for better responses
+        system_context = get_system_context(project_path)
+
+        # Create synthesis prompt with system context
         prompt = f"""You are a senior software engineer analyzing code search results. Your task is to synthesize the search results into a helpful, actionable summary.

+SYSTEM CONTEXT: {system_context}
 SEARCH QUERY: "{query}"
 PROJECT: {project_path.name}
```

`mini_rag/system_context.py` (new file, 123 lines):

```python
"""
System Context Collection for Enhanced RAG Grounding

Collects minimal system information to help the LLM provide better,
context-aware assistance without compromising privacy.
"""

import platform
import sys
import os
from pathlib import Path
from typing import Dict, Optional


class SystemContextCollector:
    """Collects system context information for enhanced LLM grounding."""

    @staticmethod
    def get_system_context(project_path: Optional[Path] = None) -> str:
        """
        Get concise system context for LLM grounding.

        Args:
            project_path: Current project directory

        Returns:
            Formatted system context string (max 200 chars for privacy)
        """
        try:
            # Basic system info
            os_name = platform.system()
            python_ver = f"{sys.version_info.major}.{sys.version_info.minor}"

            # Simplified OS names
            os_short = {
                'Windows': 'Win',
                'Linux': 'Linux',
                'Darwin': 'macOS'
            }.get(os_name, os_name)

            # Working directory info
            if project_path:
                # Use relative or shortened path for privacy
                try:
                    rel_path = project_path.relative_to(Path.home())
                    path_info = f"~/{rel_path}"
                except ValueError:
                    # If not relative to home, just use folder name
                    path_info = project_path.name
            else:
                path_info = Path.cwd().name

            # Trim path if too long for our 200-char limit
            if len(path_info) > 50:
                path_info = f".../{path_info[-45:]}"

            # Command style hints
            cmd_style = "rag.bat" if os_name == "Windows" else "./rag-mini"

            # Format concise context
            context = f"[{os_short} {python_ver}, {path_info}, use {cmd_style}]"

            # Ensure we stay under 200 chars
            if len(context) > 200:
                context = context[:197] + "...]"

            return context

        except Exception:
            # Fallback to minimal info if anything fails
            return f"[{platform.system()}, Python {sys.version_info.major}.{sys.version_info.minor}]"

    @staticmethod
    def get_command_context(os_name: Optional[str] = None) -> Dict[str, str]:
        """
        Get OS-appropriate command examples.

        Returns:
            Dictionary with command patterns for the current OS
        """
        if os_name is None:
            os_name = platform.system()

        if os_name == "Windows":
            return {
                "launcher": "rag.bat",
                "index": "rag.bat index C:\\path\\to\\project",
                "search": "rag.bat search C:\\path\\to\\project \"query\"",
                "explore": "rag.bat explore C:\\path\\to\\project",
                "path_sep": "\\",
                "example_path": "C:\\Users\\username\\Documents\\myproject"
            }
        else:
            return {
                "launcher": "./rag-mini",
                "index": "./rag-mini index /path/to/project",
                "search": "./rag-mini search /path/to/project \"query\"",
                "explore": "./rag-mini explore /path/to/project",
                "path_sep": "/",
                "example_path": "~/Documents/myproject"
            }


def get_system_context(project_path: Optional[Path] = None) -> str:
    """Convenience function to get system context."""
    return SystemContextCollector.get_system_context(project_path)


def get_command_context() -> Dict[str, str]:
    """Convenience function to get command context."""
    return SystemContextCollector.get_command_context()


# Test function
if __name__ == "__main__":
    print("System Context Test:")
    print(f"Context: {get_system_context()}")
    print(f"Context with path: {get_system_context(Path('/tmp/test'))}")
    print()
    print("Command Context:")
    cmds = get_command_context()
    for key, value in cmds.items():
        print(f"  {key}: {value}")
```
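The privacy-motivated path shortening in `get_system_context` can be tried on its own. This sketch (the helper name is illustrative) isolates the rule: show a `~/...` form for paths under the home directory, otherwise expose only the final folder name:

```python
from pathlib import Path

def shorten_for_privacy(p: Path) -> str:
    # Same rule as SystemContextCollector: relative_to(Path.home()) raises
    # ValueError for paths outside home, which we catch to keep only the name.
    try:
        return f"~/{p.relative_to(Path.home())}"
    except ValueError:
        return p.name

print(shorten_for_privacy(Path.home() / "projects" / "demo"))
print(shorten_for_privacy(Path("/opt/shared/tools")))
```

Using `relative_to` with exception handling avoids string-prefix comparisons, so symlink-free paths like `/home/user2/...` from another account are also reduced to a bare folder name rather than leaked in full.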