Compare commits


No commits in common. "11639c82377e66d43fe741f2b122deaf22a8e646" and "9eb366f414ed18fdb87a549e7c394cbea1cf38a7" have entirely different histories.

11 changed files with 41 additions and 1040 deletions

LICENSE
View File

@ -1,21 +0,0 @@
MIT License
Copyright (c) 2025 Brett Fox
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View File

@ -1,264 +0,0 @@
# 🤖 LLM Provider Setup Guide
This guide shows how to configure FSS-Mini-RAG with different LLM providers for synthesis and query expansion features.
## 🎯 Quick Provider Comparison
| Provider | Cost | Setup Difficulty | Quality | Privacy | Internet Required |
|----------|------|------------------|---------|---------|-------------------|
| **Ollama** | Free | Easy | Good | Excellent | No |
| **LM Studio** | Free | Easy | Good | Excellent | No |
| **OpenRouter** | Low ($0.10-0.50/M) | Medium | Excellent | Fair | Yes |
| **OpenAI** | Medium ($0.15-2.50/M) | Medium | Excellent | Fair | Yes |
| **Anthropic** | Medium-High | Medium | Excellent | Fair | Yes |
## 🏠 Local Providers (Recommended for Beginners)
### Ollama (Default)
**Best for:** Privacy, learning, no ongoing costs
```yaml
llm:
  provider: ollama
  ollama_host: localhost:11434
  synthesis_model: llama3.2
  expansion_model: llama3.2
  enable_synthesis: false
  synthesis_temperature: 0.3
  cpu_optimized: true
  enable_thinking: true
```
**Setup:**
1. Install Ollama: `curl -fsSL https://ollama.ai/install.sh | sh`
2. Start service: `ollama serve`
3. Download model: `ollama pull llama3.2`
4. Test: `./rag-mini search /path/to/project "test" --synthesize`
**Recommended Models:**
- `qwen3:0.6b` - Ultra-fast, good for CPU-only systems
- `llama3.2` - Balanced quality and speed
- `llama3.1:8b` - Higher quality, needs more RAM
### LM Studio
**Best for:** GUI users, model experimentation
```yaml
llm:
  provider: openai
  api_base: http://localhost:1234/v1
  api_key: "not-needed"
  synthesis_model: "any"
  expansion_model: "any"
  enable_synthesis: false
  synthesis_temperature: 0.3
```
**Setup:**
1. Download [LM Studio](https://lmstudio.ai)
2. Install any model from the catalog
3. Start local server (default port 1234)
4. Use config above
## ☁️ Cloud Providers (For Advanced Users)
### OpenRouter (Best Value)
**Best for:** Access to many models, reasonable pricing
```yaml
llm:
  provider: openai
  api_base: https://openrouter.ai/api/v1
  api_key: "your-api-key-here"
  synthesis_model: "meta-llama/llama-3.1-8b-instruct:free"
  expansion_model: "meta-llama/llama-3.1-8b-instruct:free"
  enable_synthesis: false
  synthesis_temperature: 0.3
  timeout: 30
```
**Setup:**
1. Sign up at [openrouter.ai](https://openrouter.ai)
2. Create API key in dashboard
3. Add $5-10 credits (goes far with efficient models)
4. Replace `your-api-key-here` with actual key
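Before wiring the key into FSS-Mini-RAG, you can smoke-test it with a one-token request. This is a hedged example: it relies on OpenRouter's OpenAI-compatible chat endpoint and the free Llama model listed below; adjust the model name if the free tier changes.
```bash
# Assumes OPENROUTER_API_KEY is exported in your shell
curl -s https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/llama-3.1-8b-instruct:free", "max_tokens": 1, "messages": [{"role": "user", "content": "ping"}]}'
```
A JSON response containing a `choices` array means the key and credits are working.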
**Budget Models:**
- `meta-llama/llama-3.1-8b-instruct:free` - Free tier
- `openai/gpt-4o-mini` - $0.15 per million tokens
- `anthropic/claude-3-haiku` - $0.25 per million tokens
### OpenAI (Premium Quality)
**Best for:** Reliability, advanced features
```yaml
llm:
  provider: openai
  api_key: "your-openai-api-key"
  synthesis_model: "gpt-4o-mini"
  expansion_model: "gpt-4o-mini"
  enable_synthesis: false
  synthesis_temperature: 0.3
  timeout: 30
```
**Setup:**
1. Sign up at [platform.openai.com](https://platform.openai.com)
2. Add payment method
3. Create API key
4. Start with `gpt-4o-mini` for cost efficiency
### Anthropic Claude (Code Expert)
**Best for:** Code analysis, thoughtful responses
```yaml
llm:
  provider: anthropic
  api_key: "your-anthropic-api-key"
  synthesis_model: "claude-3-haiku-20240307"
  expansion_model: "claude-3-haiku-20240307"
  enable_synthesis: false
  synthesis_temperature: 0.3
  timeout: 30
```
**Setup:**
1. Sign up at [console.anthropic.com](https://console.anthropic.com)
2. Add credits to account
3. Create API key
4. Start with Claude Haiku for budget-friendly option
## 🧪 Testing Your Setup
### 1. Basic Functionality Test
```bash
# Test without LLM (should always work)
./rag-mini search /path/to/project "authentication"
```
### 2. Synthesis Test
```bash
# Test LLM integration
./rag-mini search /path/to/project "authentication" --synthesize
```
### 3. Interactive Test
```bash
# Test exploration mode
./rag-mini explore /path/to/project
# Then ask: "How does authentication work in this codebase?"
```
### 4. Query Expansion Test
Enable `expand_queries: true` in config, then:
```bash
./rag-mini search /path/to/project "auth"
# Should automatically expand to "auth authentication login user session"
```
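The toggle lives in the `search` section of your config, the same place it appears in `examples/config-llm-providers.yaml`:
```yaml
search:
  expand_queries: true   # let the LLM add related terms before searching
```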
## 🛠️ Configuration Tips
### For Budget-Conscious Users
```yaml
llm:
synthesis_model: "gpt-4o-mini" # or claude-haiku
enable_synthesis: false # Manual control
synthesis_temperature: 0.1 # Factual responses
max_expansion_terms: 4 # Shorter expansions
```
### For Quality-Focused Users
```yaml
llm:
synthesis_model: "gpt-4o" # or claude-sonnet
enable_synthesis: true # Always on
synthesis_temperature: 0.3 # Balanced creativity
enable_thinking: true # Show reasoning
max_expansion_terms: 8 # Comprehensive expansion
```
### For Privacy-Focused Users
```yaml
# Use only local providers
embedding:
  preferred_method: ollama # Local embeddings
llm:
  provider: ollama # Local LLM
# Never use cloud providers
```
## 🔧 Troubleshooting
### Connection Issues
- **Local:** Ensure Ollama/LM Studio is running: `ps aux | grep ollama`
- **Cloud:** Check API key and internet: `curl -H "Authorization: Bearer $API_KEY" https://api.openai.com/v1/models`
### Model Not Found
- **Ollama:** `ollama pull model-name`
- **Cloud:** Check provider's model list documentation
### High Costs
- Use mini/haiku models instead of full versions
- Set `enable_synthesis: false` and use `--synthesize` selectively
- Reduce `max_expansion_terms` to 4-6
### Poor Quality
- Try higher-tier models (gpt-4o, claude-sonnet)
- Adjust `synthesis_temperature` (0.1 = factual, 0.5 = creative)
- Enable `expand_queries` for better search coverage
### Slow Responses
- **Local:** Try smaller models (qwen3:0.6b)
- **Cloud:** Increase `timeout` or switch providers
- **General:** Reduce `max_size` in chunking config
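For reference, `max_size` sits under the `chunking` section; a smaller value means less text per chunk and less context sent to the LLM (1500 is the figure suggested in the provider example file, versus the 2000 default):
```yaml
chunking:
  max_size: 1500   # smaller chunks keep synthesis prompts short
```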
## 📋 Environment Variables (Alternative Setup)
Instead of putting API keys in config files, use environment variables:
```bash
# In your shell profile (.bashrc, .zshrc, etc.)
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
export OPENROUTER_API_KEY="your-openrouter-key"
```
Then in config:
```yaml
llm:
  api_key: "${OPENAI_API_KEY}" # Reads from environment
```
## 🚀 Advanced: Multi-Provider Setup
You can create different configs for different use cases:
```bash
# Fast local analysis
cp examples/config-beginner.yaml .mini-rag/config-local.yaml
# High-quality cloud analysis
cp examples/config-llm-providers.yaml .mini-rag/config-cloud.yaml
# Edit to use OpenAI/Claude
# Switch configs as needed
ln -sf config-local.yaml .mini-rag/config.yaml # Use local
ln -sf config-cloud.yaml .mini-rag/config.yaml # Use cloud
```
## 📚 Further Reading
- [Ollama Model Library](https://ollama.ai/library)
- [OpenRouter Pricing](https://openrouter.ai/docs#models)
- [OpenAI API Documentation](https://platform.openai.com/docs)
- [Anthropic Claude Documentation](https://docs.anthropic.com/claude)
- [LM Studio Getting Started](https://lmstudio.ai/docs)
---
💡 **Pro Tip:** Start with local Ollama for learning, then upgrade to cloud providers when you need production-quality analysis or are working with large codebases.

View File

@ -47,7 +47,6 @@ search:
  expand_queries: false # Keep it simple for now
# 🤖 AI explanations (optional but helpful)
# 💡 WANT DIFFERENT LLM? See examples/config-llm-providers.yaml for OpenAI, Claude, etc.
llm:
  synthesis_model: auto # Pick best available model
  enable_synthesis: false # Turn on manually with --synthesize

View File

@ -1,233 +0,0 @@
# 🌐 LLM PROVIDER ALTERNATIVES - OpenRouter, LM Studio, OpenAI & More
# Educational guide showing how to configure different LLM providers
# Copy sections you need to your main config.yaml
#═════════════════════════════════════════════════════════════════════════════════
# 🎯 QUICK PROVIDER SELECTION GUIDE:
#
# 🏠 LOCAL (Best Privacy, No Internet Needed):
# - Ollama: Great quality, easy setup, free
# - LM Studio: User-friendly GUI, works with many models
#
# ☁️ CLOUD (Powerful Models, Requires API Keys):
# - OpenRouter: Access to many models with one API
# - OpenAI: High quality, reliable, but more expensive
# - Anthropic: Excellent for code analysis
#
# 💰 BUDGET FRIENDLY:
# - OpenRouter (Qwen, Llama models): $0.10-0.50 per million tokens
# - Local Ollama/LM Studio: Completely free
#
# 🚀 PERFORMANCE:
# - Local: Limited by your hardware
# - Cloud: Fast and powerful, costs per use
#═════════════════════════════════════════════════════════════════════════════════
# Standard FSS-Mini-RAG settings (copy these to any config)
chunking:
  max_size: 2000
  min_size: 150
  strategy: semantic
streaming:
  enabled: true
  threshold_bytes: 1048576
files:
  min_file_size: 50
  exclude_patterns:
    - "node_modules/**"
    - ".git/**"
    - "__pycache__/**"
    - "*.pyc"
    - ".venv/**"
    - "build/**"
    - "dist/**"
  include_patterns:
    - "**/*"
embedding:
  preferred_method: ollama # Use Ollama for embeddings (works with all providers below)
  ollama_model: nomic-embed-text
  ollama_host: localhost:11434
  batch_size: 32
search:
  default_limit: 10
  enable_bm25: true
  similarity_threshold: 0.1
  expand_queries: false
#═════════════════════════════════════════════════════════════════════════════════
# 🤖 LLM PROVIDER CONFIGURATIONS
#═════════════════════════════════════════════════════════════════════════════════
# 🏠 OPTION 1: OLLAMA (LOCAL) - Default and Recommended
# ✅ Pros: Free, private, no API keys, good quality
# ❌ Cons: Uses your computer's resources, limited by hardware
llm:
  provider: ollama # Use local Ollama
  ollama_host: localhost:11434 # Default Ollama location
  synthesis_model: llama3.2 # Good all-around model
  # alternatives: qwen3:0.6b (faster), llama3.2:3b (balanced), llama3.1:8b (quality)
  expansion_model: llama3.2
  enable_synthesis: false
  synthesis_temperature: 0.3
  cpu_optimized: true
  enable_thinking: true
  max_expansion_terms: 8
# 🖥️ OPTION 2: LM STUDIO (LOCAL) - User-Friendly Alternative
# ✅ Pros: Easy GUI, drag-drop model installation, OpenAI-compatible local server
# ❌ Cons: Another app to manage, similar hardware limitations
#
# SETUP STEPS:
# 1. Download LM Studio from lmstudio.ai
# 2. Install a model (try "microsoft/DialoGPT-medium" or "TheBloke/Llama-2-7B-Chat-GGML")
# 3. Start local server in LM Studio (usually port 1234)
# 4. Use this config:
#
# llm:
#   provider: openai # LM Studio uses OpenAI-compatible API
#   api_base: http://localhost:1234/v1 # LM Studio default port
#   api_key: "not-needed" # LM Studio doesn't require real API key
#   synthesis_model: "any" # Use whatever model you loaded in LM Studio
#   expansion_model: "any"
#   enable_synthesis: false
#   synthesis_temperature: 0.3
#   cpu_optimized: true
#   enable_thinking: true
#   max_expansion_terms: 8
# ☁️ OPTION 3: OPENROUTER (CLOUD) - Many Models, One API
# ✅ Pros: Access to many models, good prices, no local setup
# ❌ Cons: Requires internet, costs money, less private
#
# SETUP STEPS:
# 1. Sign up at openrouter.ai
# 2. Get API key from dashboard
# 3. Add credits to account ($5-10 goes a long way)
# 4. Use this config:
#
# llm:
#   provider: openai # OpenRouter uses OpenAI-compatible API
#   api_base: https://openrouter.ai/api/v1
#   api_key: "your-openrouter-api-key-here" # Replace with your actual key
#   synthesis_model: "meta-llama/llama-3.1-8b-instruct:free" # Free tier model
#   # alternatives: "openai/gpt-4o-mini" ($0.15/M), "anthropic/claude-3-haiku" ($0.25/M)
#   expansion_model: "meta-llama/llama-3.1-8b-instruct:free"
#   enable_synthesis: false
#   synthesis_temperature: 0.3
#   cpu_optimized: false # Cloud models don't need CPU optimization
#   enable_thinking: true
#   max_expansion_terms: 8
#   timeout: 30 # Longer timeout for internet requests
# 🏢 OPTION 4: OPENAI (CLOUD) - Premium Quality
# ✅ Pros: Excellent quality, very reliable, fast
# ❌ Cons: More expensive, requires OpenAI account
#
# SETUP STEPS:
# 1. Sign up at platform.openai.com
# 2. Add payment method (pay-per-use)
# 3. Create API key in dashboard
# 4. Use this config:
#
# llm:
#   provider: openai
#   api_key: "your-openai-api-key-here" # Replace with your actual key
#   synthesis_model: "gpt-4o-mini" # Affordable option (~$0.15/M tokens)
#   # alternatives: "gpt-4o" (premium, ~$2.50/M), "gpt-3.5-turbo" (budget, ~$0.50/M)
#   expansion_model: "gpt-4o-mini"
#   enable_synthesis: false
#   synthesis_temperature: 0.3
#   cpu_optimized: false
#   enable_thinking: true
#   max_expansion_terms: 8
#   timeout: 30
# 🧠 OPTION 5: ANTHROPIC CLAUDE (CLOUD) - Excellent for Code
# ✅ Pros: Great at code analysis, very thoughtful responses
# ❌ Cons: Premium pricing, separate API account needed
#
# SETUP STEPS:
# 1. Sign up at console.anthropic.com
# 2. Get API key and add credits
# 3. Use this config:
#
# llm:
#   provider: anthropic
#   api_key: "your-anthropic-api-key-here" # Replace with your actual key
#   synthesis_model: "claude-3-haiku-20240307" # Most affordable option
#   # alternatives: "claude-3-sonnet-20240229" (balanced), "claude-3-opus-20240229" (premium)
#   expansion_model: "claude-3-haiku-20240307"
#   enable_synthesis: false
#   synthesis_temperature: 0.3
#   cpu_optimized: false
#   enable_thinking: true
#   max_expansion_terms: 8
#   timeout: 30
#═════════════════════════════════════════════════════════════════════════════════
# 🧪 TESTING YOUR CONFIGURATION
#═════════════════════════════════════════════════════════════════════════════════
#
# After setting up any provider, test with these commands:
#
# 1. Test basic search (no LLM needed):
# ./rag-mini search /path/to/project "test query"
#
# 2. Test LLM synthesis:
# ./rag-mini search /path/to/project "test query" --synthesize
#
# 3. Test query expansion:
# Enable expand_queries: true in search section and try:
# ./rag-mini search /path/to/project "auth"
#
# 4. Test thinking mode:
# ./rag-mini explore /path/to/project
# Then ask: "explain the authentication system"
#
#═════════════════════════════════════════════════════════════════════════════════
# 💡 TROUBLESHOOTING
#═════════════════════════════════════════════════════════════════════════════════
#
# ❌ "Connection refused" or "API error":
# - Local: Make sure Ollama/LM Studio is running
# - Cloud: Check API key and internet connection
#
# ❌ "Model not found":
# - Local: Install model with `ollama pull model-name`
# - Cloud: Check model name matches provider's API docs
#
# ❌ "Token limit exceeded" or expensive bills:
# - Use cheaper models like gpt-4o-mini or claude-haiku
# - Enable shorter contexts with max_size: 1500
#
# ❌ Slow responses:
# - Local: Try smaller models (qwen3:0.6b)
# - Cloud: Increase timeout or try different provider
#
# ❌ Poor quality results:
# - Try higher-quality models
# - Adjust synthesis_temperature (0.1 for factual, 0.5 for creative)
# - Enable expand_queries for better search coverage
#
#═════════════════════════════════════════════════════════════════════════════════
# 📚 LEARN MORE
#═════════════════════════════════════════════════════════════════════════════════
#
# Provider Documentation:
# - Ollama: https://ollama.ai/library (model catalog)
# - LM Studio: https://lmstudio.ai/docs (getting started)
# - OpenRouter: https://openrouter.ai/docs (API reference)
# - OpenAI: https://platform.openai.com/docs (API docs)
# - Anthropic: https://docs.anthropic.com/claude/reference (Claude API)
#
# Model Recommendations:
# - Code Analysis: claude-3-sonnet, gpt-4o, llama3.1:8b
# - Fast Responses: gpt-4o-mini, claude-haiku, qwen3:0.6b
# - Budget Friendly: OpenRouter free tier, local Ollama
# - Best Privacy: Local Ollama or LM Studio only
#
#═════════════════════════════════════════════════════════════════════════════════

View File

@ -162,72 +162,22 @@ check_ollama() {
print_warning "Ollama not found"
echo ""
echo -e "${CYAN}Ollama provides the best embedding quality and performance.${NC}"
echo -e "${YELLOW}To install Ollama:${NC}"
echo " 1. Visit: https://ollama.ai/download"
echo " 2. Download and install for your system"
echo " 3. Run: ollama serve"
echo " 4. Re-run this installer"
echo ""
echo -e "${BOLD}Options:${NC}"
echo -e "${GREEN}1) Install Ollama automatically${NC} (recommended)"
echo -e "${YELLOW}2) Manual installation${NC} - Visit https://ollama.com/download"
echo -e "${BLUE}3) Continue without Ollama${NC} (uses ML fallback)"
echo -e "${BLUE}Alternative: Use ML fallback (requires more disk space)${NC}"
echo ""
echo -n "Choose [1/2/3]: "
read -r ollama_choice
case "$ollama_choice" in
1|"")
print_info "Installing Ollama using official installer..."
echo -e "${CYAN}Running: curl -fsSL https://ollama.com/install.sh | sh${NC}"
if curl -fsSL https://ollama.com/install.sh | sh; then
print_success "Ollama installed successfully"
print_info "Starting Ollama server..."
ollama serve &
sleep 3
if curl -s http://localhost:11434/api/version >/dev/null 2>&1; then
print_success "Ollama server started"
echo ""
echo -e "${CYAN}💡 Pro tip: Download an LLM for AI-powered search synthesis!${NC}"
echo -e " Lightweight: ${GREEN}ollama pull qwen3:0.6b${NC} (~400MB, very fast)"
echo -e " Balanced: ${GREEN}ollama pull qwen3:1.7b${NC} (~1GB, good quality)"
echo -e " Excellent: ${GREEN}ollama pull qwen3:3b${NC} (~2GB, great for this project)"
echo -e " Premium: ${GREEN}ollama pull qwen3:8b${NC} (~5GB, amazing results)"
echo ""
echo -e "${BLUE}Creative possibilities: Try mistral for storytelling, or qwen3-coder for development!${NC}"
echo ""
return 0
else
print_warning "Ollama installed but failed to start automatically"
echo "Please start Ollama manually: ollama serve"
echo "Then re-run this installer"
exit 1
fi
else
print_error "Failed to install Ollama automatically"
echo "Please install manually from https://ollama.com/download"
exit 1
fi
;;
2)
echo ""
echo -e "${YELLOW}Manual Ollama installation:${NC}"
echo " 1. Visit: https://ollama.com/download"
echo " 2. Download and install for your system"
echo " 3. Run: ollama serve"
echo " 4. Re-run this installer"
print_info "Exiting for manual installation..."
exit 0
;;
3)
print_info "Continuing without Ollama (will use ML fallback)"
return 1
;;
*)
print_warning "Invalid choice, continuing without Ollama"
return 1
;;
esac
echo -n "Continue without Ollama? (y/N): "
read -r continue_without
if [[ $continue_without =~ ^[Yy]$ ]]; then
return 1
else
print_info "Install Ollama first, then re-run this script"
exit 0
fi
fi
}
@ -321,8 +271,8 @@ get_installation_preferences() {
echo ""
echo -e "${BOLD}Installation options:${NC}"
echo -e "${GREEN}L) Light${NC} - Ollama + basic deps (~50MB) ${CYAN}← Best performance + AI chat${NC}"
echo -e "${YELLOW}F) Full${NC} - Light + ML fallback (~2-3GB) ${CYAN}← RAG-only if no Ollama${NC}"
echo -e "${GREEN}L) Light${NC} - Ollama + basic deps (~50MB)"
echo -e "${YELLOW}F) Full${NC} - Light + ML fallback (~2-3GB)"
echo -e "${BLUE}C) Custom${NC} - Configure individual components"
echo ""
@ -599,125 +549,35 @@ show_completion() {
read -r run_test
if [[ ! $run_test =~ ^[Nn]$ ]]; then
run_quick_test
echo ""
show_beginner_guidance
else
show_beginner_guidance
fi
}
# Create sample project for testing
create_sample_project() {
local sample_dir="$SCRIPT_DIR/.sample_test"
rm -rf "$sample_dir"
mkdir -p "$sample_dir"
# Create a few small sample files
cat > "$sample_dir/README.md" << 'EOF'
# Sample Project
This is a sample project for testing FSS-Mini-RAG search capabilities.
## Features
- User authentication system
- Document processing
- Search functionality
- Email integration
EOF
cat > "$sample_dir/auth.py" << 'EOF'
# Authentication module
def login_user(username, password):
    """Handle user login with password validation"""
    if validate_credentials(username, password):
        create_session(username)
        return True
    return False
def validate_credentials(username, password):
    """Check username and password against database"""
    # Database validation logic here
    return check_password_hash(username, password)
EOF
cat > "$sample_dir/search.py" << 'EOF'
# Search functionality
def semantic_search(query, documents):
    """Perform semantic search across document collection"""
    embeddings = generate_embeddings(query)
    results = find_similar_documents(embeddings, documents)
    return rank_results(results)
def generate_embeddings(text):
    """Generate vector embeddings for text"""
    # Embedding generation logic
    return process_with_model(text)
EOF
echo "$sample_dir"
}
# Run quick test with sample data
# Run quick test
run_quick_test() {
print_header "Quick Test"
print_info "Creating small sample project for testing..."
local sample_dir=$(create_sample_project)
echo "Sample project created with 3 files for fast testing."
print_info "Testing on this project directory..."
echo "This will index the FSS-Mini-RAG system itself as a test."
echo ""
# Index the sample project (much faster)
print_info "Indexing sample project (this should be fast)..."
if ./rag-mini index "$sample_dir" --quiet; then
print_success "Sample project indexed successfully"
# Index this project
if ./rag-mini index "$SCRIPT_DIR"; then
print_success "Indexing completed"
# Try a search
echo ""
print_info "Testing search with sample queries..."
echo -e "${BLUE}Running search: 'user authentication'${NC}"
./rag-mini search "$sample_dir" "user authentication" --limit 2
print_info "Testing search functionality..."
./rag-mini search "$SCRIPT_DIR" "embedding system" --limit 3
echo ""
print_success "Test completed successfully!"
echo -e "${CYAN}Ready to use FSS-Mini-RAG on your own projects!${NC}"
# Offer beginner guidance
echo ""
echo -e "${YELLOW}💡 Beginner Tip:${NC} Try the interactive mode with pre-made questions"
echo " Run: ./rag-tui for guided experience"
# Clean up sample
rm -rf "$sample_dir"
echo -e "${CYAN}You can now use FSS-Mini-RAG on your own projects.${NC}"
else
print_error "Sample test failed"
echo "This might indicate an issue with the installation."
rm -rf "$sample_dir"
print_error "Test failed"
echo "Check the error messages above for troubleshooting."
fi
}
# Show beginner-friendly first steps
show_beginner_guidance() {
print_header "Getting Started - Your First Search"
echo -e "${CYAN}FSS-Mini-RAG is ready! Here's how to start:${NC}"
echo ""
echo -e "${GREEN}🎯 For Beginners (Recommended):${NC}"
echo " ./rag-tui"
echo " ↳ Interactive interface with sample questions"
echo ""
echo -e "${BLUE}💻 For Developers:${NC}"
echo " ./rag-mini index /path/to/your/project"
echo " ./rag-mini search /path/to/your/project \"your question\""
echo ""
echo -e "${YELLOW}📚 What can you search for in FSS-Mini-RAG?${NC}"
echo " • Technical: \"chunking strategy\", \"ollama integration\", \"indexing performance\""
echo " • Usage: \"how to improve search results\", \"why does indexing take long\""
echo " • Your own projects: any code, docs, emails, notes, research"
echo ""
echo -e "${CYAN}💡 Pro tip:${NC} You can drag ANY text-based documents into a folder"
echo " and search through them - emails, notes, research, chat logs!"
}
# Main installation flow
main() {
echo -e "${CYAN}${BOLD}"

View File

@ -72,21 +72,13 @@ class SearchConfig:
@dataclass
class LLMConfig:
"""Configuration for LLM synthesis and query expansion."""
# Core settings
ollama_host: str = "localhost:11434"
synthesis_model: str = "auto" # "auto", "qwen3:1.7b", "qwen2.5:1.5b", etc.
expansion_model: str = "auto" # Usually same as synthesis_model
max_expansion_terms: int = 8 # Maximum additional terms to add
enable_synthesis: bool = False # Enable by default when --synthesize used
synthesis_temperature: float = 0.3
enable_thinking: bool = True # Enable thinking mode for Qwen3 models
cpu_optimized: bool = True # Prefer lightweight models
# Provider-specific settings (for different LLM providers)
provider: str = "ollama" # "ollama", "openai", "anthropic"
ollama_host: str = "localhost:11434" # Ollama connection
api_key: Optional[str] = None # API key for cloud providers
api_base: Optional[str] = None # Base URL for API (e.g., OpenRouter)
timeout: int = 20 # Request timeout in seconds
enable_thinking: bool = True # Enable thinking mode for Qwen3 models (production: True, testing: toggle)
@dataclass

View File

@ -81,36 +81,16 @@ class OllamaEmbedder:
def _verify_ollama_connection(self):
"""Verify Ollama server is running and model is available."""
try:
# Check server status
response = requests.get(f"{self.base_url}/api/tags", timeout=5)
response.raise_for_status()
except requests.exceptions.ConnectionError:
print("🔌 Ollama Service Unavailable")
print(" Ollama provides AI embeddings that make semantic search possible")
print(" Start Ollama: ollama serve")
print(" Install models: ollama pull nomic-embed-text")
print()
raise ConnectionError("Ollama service not running. Start with: ollama serve")
except requests.exceptions.Timeout:
print("⏱️ Ollama Service Timeout")
print(" Ollama is taking too long to respond")
print(" Check if Ollama is overloaded: ollama ps")
print(" Restart if needed: killall ollama && ollama serve")
print()
raise ConnectionError("Ollama service timeout")
# Check server status
response = requests.get(f"{self.base_url}/api/tags", timeout=5)
response.raise_for_status()
# Check if our model is available
models = response.json().get('models', [])
model_names = [model['name'] for model in models]
if self.model_name not in model_names:
print(f"📦 Model '{self.model_name}' Not Found")
print(" Embedding models convert text into searchable vectors")
print(f" Download model: ollama pull {self.model_name}")
if model_names:
print(f" Available models: {', '.join(model_names[:3])}")
print()
logger.warning(f"Model {self.model_name} not found. Available: {model_names}")
# Try to pull the model
self._pull_model()

View File

@ -117,21 +117,11 @@ class CodeSearcher:
"""Connect to the LanceDB database."""
try:
if not self.rag_dir.exists():
print("🗃️ No Search Index Found")
print(" An index is a database that makes your files searchable")
print(f" Create index: ./rag-mini index {self.project_path}")
print(" (This analyzes your files and creates semantic search vectors)")
print()
raise FileNotFoundError(f"No RAG index found at {self.rag_dir}")
self.db = lancedb.connect(self.rag_dir)
if "code_vectors" not in self.db.table_names():
print("🔧 Index Database Corrupted")
print(" The search index exists but is missing data tables")
print(f" Rebuild index: rm -rf {self.rag_dir} && ./rag-mini index {self.project_path}")
print(" (This will recreate the search database)")
print()
raise ValueError("No code_vectors table found. Run indexing first.")
self.table = self.db.open_table("code_vectors")

View File

@ -15,29 +15,11 @@ import logging
# Add the RAG system to the path
sys.path.insert(0, str(Path(__file__).parent))
try:
from mini_rag.indexer import ProjectIndexer
from mini_rag.search import CodeSearcher
from mini_rag.ollama_embeddings import OllamaEmbedder
from mini_rag.llm_synthesizer import LLMSynthesizer
from mini_rag.explorer import CodeExplorer
except ImportError as e:
print("❌ Error: Missing dependencies!")
print()
print("It looks like you haven't installed the required packages yet.")
print("This is a common mistake - here's how to fix it:")
print()
print("1. Make sure you're in the FSS-Mini-RAG directory")
print("2. Run the installer script:")
print(" ./install_mini_rag.sh")
print()
print("Or if you want to install manually:")
print(" python3 -m venv .venv")
print(" source .venv/bin/activate")
print(" pip install -r requirements.txt")
print()
print(f"Missing module: {e.name}")
sys.exit(1)
from mini_rag.indexer import ProjectIndexer
from mini_rag.search import CodeSearcher
from mini_rag.ollama_embeddings import OllamaEmbedder
from mini_rag.llm_synthesizer import LLMSynthesizer
from mini_rag.explorer import CodeExplorer
# Configure logging for user-friendly output
logging.basicConfig(
@ -86,25 +68,7 @@ def index_project(project_path: Path, force: bool = False):
if not (project_path / '.mini-rag' / 'last_search').exists():
print(f"\n💡 Try: rag-mini search {project_path} \"your search here\"")
except FileNotFoundError:
print(f"📁 Directory Not Found: {project_path}")
print(" Make sure the path exists and you're in the right location")
print(f" Current directory: {Path.cwd()}")
print(" Check path: ls -la /path/to/your/project")
print()
sys.exit(1)
except PermissionError:
print("🔒 Permission Denied")
print(" FSS-Mini-RAG needs to read files and create index database")
print(f" Check permissions: ls -la {project_path}")
print(" Try a different location with write access")
print()
sys.exit(1)
except Exception as e:
# Connection errors are handled in the embedding module
if "ollama" in str(e).lower() or "connection" in str(e).lower():
sys.exit(1) # Error already displayed
print(f"❌ Indexing failed: {e}")
print()
print("🔧 Common solutions:")

View File

@ -15,7 +15,6 @@ class SimpleTUI:
def __init__(self):
self.project_path: Optional[Path] = None
self.current_config: Dict[str, Any] = {}
self.search_count = 0 # Track searches for sample reminder
def clear_screen(self):
"""Clear the terminal screen."""
@ -279,37 +278,8 @@ class SimpleTUI:
print(f"Project: {self.project_path.name}")
print()
# Show sample questions for beginners - relevant to FSS-Mini-RAG
print("💡 Not sure what to search for? Try these questions about FSS-Mini-RAG:")
print()
sample_questions = [
"chunking strategy",
"ollama integration",
"indexing performance",
"why does indexing take long",
"how to improve search results",
"embedding generation"
]
for i, question in enumerate(sample_questions[:3], 1):
print(f" {i}. {question}")
print(" 4. Enter your own question")
print()
# Let user choose a sample or enter their own
choice_str = self.get_input("Choose a number (1-4) or press Enter for custom", "4")
try:
choice = int(choice_str)
if 1 <= choice <= 3:
query = sample_questions[choice - 1]
print(f"Selected: '{query}'")
print()
else:
query = self.get_input("Enter your search query", "").strip()
except ValueError:
query = self.get_input("Enter your search query", "").strip()
# Get search query
query = self.get_input("Enter search query", "").strip()
if not query:
return
@ -384,70 +354,6 @@ class SimpleTUI:
if len(results) > 1:
print("💡 To see more context or specific results:")
print(f" Run: ./rag-mini search {self.project_path} \"{query}\" --verbose")
# Suggest follow-up questions based on the search
print()
print("🔍 Suggested follow-up searches:")
follow_up_questions = self.generate_follow_up_questions(query, results)
for i, question in enumerate(follow_up_questions, 1):
print(f" {i}. {question}")
# Ask if they want to run a follow-up search
print()
choice = input("Run a follow-up search? Enter number (1-3) or press Enter to continue: ").strip()
if choice.isdigit() and 1 <= int(choice) <= len(follow_up_questions):
# Recursive search with the follow-up question
follow_up_query = follow_up_questions[int(choice) - 1]
print(f"\nSearching for: '{follow_up_query}'")
print("=" * 50)
# Run another search
follow_results = searcher.search(follow_up_query, top_k=5)
if follow_results:
print(f"✅ Found {len(follow_results)} follow-up results:")
print()
for i, result in enumerate(follow_results[:3], 1): # Show top 3
try:
rel_path = result.file_path.relative_to(self.project_path)
except:
rel_path = result.file_path
print(f"{i}. {rel_path} (Score: {result.score:.3f})")
print(f" {result.content.strip()[:100]}...")
print()
else:
print("❌ No follow-up results found")
# Track searches and show sample reminder
self.search_count += 1
# Show sample reminder after 2 searches
if self.search_count >= 2 and self.project_path.name == '.sample_test':
print()
print("⚠️ Sample Limitation Notice")
print("=" * 30)
print("You've been searching a small sample project.")
print("For full exploration of your codebase, you need to index the complete project.")
print()
# Show timing estimate if available
try:
with open('/tmp/fss-rag-sample-time.txt', 'r') as f:
sample_time = int(f.read().strip())
# Rough estimate: multiply by file count ratio
estimated_time = sample_time * 20 # Rough multiplier
print(f"🕒 Estimated full indexing time: ~{estimated_time} seconds")
except:
print("🕒 Estimated full indexing time: 1-3 minutes for typical projects")
print()
choice = input("Index the full project now? [y/N]: ").strip().lower()
if choice == 'y':
# Switch to full project and index
parent_dir = self.project_path.parent
self.project_path = parent_dir
print(f"\nSwitching to full project: {parent_dir}")
print("Starting full indexing...")
# Note: This would trigger full indexing in real implementation
print(f" Or: ./rag-mini-enhanced context {self.project_path} \"{query}\"")
print()
@ -458,48 +364,6 @@ class SimpleTUI:
print()
input("Press Enter to continue...")
def generate_follow_up_questions(self, original_query: str, results) -> List[str]:
"""Generate contextual follow-up questions based on search results."""
# Simple pattern-based follow-up generation
follow_ups = []
# Based on original query patterns
query_lower = original_query.lower()
# FSS-Mini-RAG specific follow-ups
if "chunk" in query_lower:
follow_ups.extend(["chunk size optimization", "smart chunking boundaries", "chunk overlap strategies"])
elif "ollama" in query_lower:
follow_ups.extend(["embedding model comparison", "ollama server setup", "nomic-embed-text performance"])
elif "index" in query_lower or "performance" in query_lower:
follow_ups.extend(["indexing speed optimization", "memory usage during indexing", "file processing pipeline"])
elif "search" in query_lower or "result" in query_lower:
follow_ups.extend(["search result ranking", "semantic vs keyword search", "query expansion techniques"])
elif "embed" in query_lower:
follow_ups.extend(["vector embedding storage", "embedding model fallbacks", "similarity scoring"])
else:
# Generic RAG-related follow-ups
follow_ups.extend(["vector database internals", "search quality tuning", "embedding optimization"])
# Based on file types found in results (FSS-Mini-RAG specific)
if results:
file_extensions = set()
for result in results[:3]: # Check first 3 results
ext = result.file_path.suffix.lower()
file_extensions.add(ext)
if '.py' in file_extensions:
follow_ups.append("Python module dependencies")
if '.md' in file_extensions:
follow_ups.append("documentation implementation")
if 'chunker' in str(results[0].file_path).lower():
follow_ups.append("chunking algorithm details")
if 'search' in str(results[0].file_path).lower():
follow_ups.append("search algorithm implementation")
# Return top 3 unique follow-ups
return list(dict.fromkeys(follow_ups))[:3]
def explore_interactive(self):
"""Interactive exploration interface with thinking mode."""
if not self.project_path:
@ -818,12 +682,6 @@ class SimpleTUI:
status = "✅ Indexed" if rag_dir.exists() else "❌ Not indexed"
print(f"📁 Current project: {self.project_path.name} ({status})")
print()
else:
# Show beginner tips when no project selected
print("🎯 Welcome to FSS-Mini-RAG!")
print(" Search through code, documents, emails, notes - anything text-based!")
print(" Start by selecting a project directory below.")
print()
options = [
"Select project directory",

View File

@ -1,124 +0,0 @@
#!/usr/bin/env python3
"""
Test script to validate all config examples are syntactically correct
and contain required fields for FSS-Mini-RAG.
"""
import yaml
import sys
from pathlib import Path
from typing import Dict, Any, List

def validate_config_structure(config: Dict[str, Any], config_name: str) -> List[str]:
    """Validate that config has required structure."""
    errors = []
    # Required sections
    required_sections = ['chunking', 'streaming', 'files', 'embedding', 'search']
    for section in required_sections:
        if section not in config:
            errors.append(f"{config_name}: Missing required section '{section}'")
    # Validate chunking section
    if 'chunking' in config:
        chunking = config['chunking']
        required_chunking = ['max_size', 'min_size', 'strategy']
        for field in required_chunking:
            if field not in chunking:
                errors.append(f"{config_name}: Missing chunking.{field}")
        # Validate types and ranges
        if 'max_size' in chunking and not isinstance(chunking['max_size'], int):
            errors.append(f"{config_name}: chunking.max_size must be integer")
        if 'min_size' in chunking and not isinstance(chunking['min_size'], int):
            errors.append(f"{config_name}: chunking.min_size must be integer")
        if 'strategy' in chunking and chunking['strategy'] not in ['semantic', 'fixed']:
            errors.append(f"{config_name}: chunking.strategy must be 'semantic' or 'fixed'")
    # Validate embedding section
    if 'embedding' in config:
        embedding = config['embedding']
        if 'preferred_method' in embedding:
            valid_methods = ['ollama', 'ml', 'hash', 'auto']
            if embedding['preferred_method'] not in valid_methods:
                errors.append(f"{config_name}: embedding.preferred_method must be one of {valid_methods}")
    # Validate LLM section (if present)
    if 'llm' in config:
        llm = config['llm']
        if 'synthesis_temperature' in llm:
            temp = llm['synthesis_temperature']
            if not isinstance(temp, (int, float)) or temp < 0 or temp > 1:
                errors.append(f"{config_name}: llm.synthesis_temperature must be number between 0-1")
    return errors

def test_config_file(config_path: Path) -> bool:
    """Test a single config file."""
    print(f"Testing {config_path.name}...")
    try:
        # Test YAML parsing
        with open(config_path, 'r') as f:
            config = yaml.safe_load(f)
        if not config:
            print(f"{config_path.name}: Empty or invalid YAML")
            return False
        # Test structure
        errors = validate_config_structure(config, config_path.name)
        if errors:
            print(f"{config_path.name}: Structure errors:")
            for error in errors:
                print(f"{error}")
            return False
        print(f"{config_path.name}: Valid")
        return True
    except yaml.YAMLError as e:
        print(f"{config_path.name}: YAML parsing error: {e}")
        return False
    except Exception as e:
        print(f"{config_path.name}: Unexpected error: {e}")
        return False

def main():
    """Test all config examples."""
    script_dir = Path(__file__).parent
    project_root = script_dir.parent
    examples_dir = project_root / 'examples'
    if not examples_dir.exists():
        print(f"❌ Examples directory not found: {examples_dir}")
        sys.exit(1)
    # Find all config files
    config_files = list(examples_dir.glob('config*.yaml'))
    if not config_files:
        print(f"❌ No config files found in {examples_dir}")
        sys.exit(1)
    print(f"🧪 Testing {len(config_files)} config files...\n")
    all_passed = True
    for config_file in sorted(config_files):
        passed = test_config_file(config_file)
        if not passed:
            all_passed = False
    print(f"\n{'='*50}")
    if all_passed:
        print("✅ All config files are valid!")
        print("\n💡 To use any config:")
        print(" cp examples/config-NAME.yaml /path/to/project/.mini-rag/config.yaml")
        sys.exit(0)
    else:
        print("❌ Some config files have issues - please fix before release")
        sys.exit(1)

if __name__ == '__main__':
    main()