Compare commits

...

6 Commits

Author SHA1 Message Date
11639c8237 Add Ollama auto-installation and educational LLM model suggestions
 Features:
- One-click Ollama installation using official script
- Educational LLM model recommendations after successful install
- Smart 3-option menu: auto-install, manual, or skip
- Clear performance vs quality guidance for model selection

🛡 Safety & UX:
- Uses official ollama.com/install.sh script
- Shows exact commands before execution
- Graceful fallback to manual installation
- Auto-starts Ollama server and verifies health
- Educational approach with size/performance trade-offs

🎯 Model Recommendations:
- qwen3:0.6b (lightweight, 400MB)
- qwen3:1.7b (balanced, 1GB)
- qwen3:3b (excellent for this project, 2GB)
- qwen3:8b (premium results, 5GB)
- Creative suggestions: mistral for storytelling, qwen3-coder for development

Transforms installation from a multi-step manual process into guided automation.
2025-08-14 19:50:12 +10:00
2f2dd6880b Add comprehensive LLM provider support and educational error handling
 Features:
- Multi-provider LLM support (OpenAI, Claude, OpenRouter, LM Studio)
- Educational config examples with setup guides
- Comprehensive documentation in docs/LLM_PROVIDERS.md
- Config validation testing system

🎯 Beginner Experience:
- Friendly error messages for common mistakes
- Educational explanations for technical concepts
- Step-by-step troubleshooting guidance
- Clear next-steps for every error condition

🛠 Technical:
- Extended LLMConfig dataclass for cloud providers
- Automated config validation script
- Enhanced error handling in core components
- Backward-compatible configuration system

📚 Documentation:
- Provider comparison tables with costs/quality
- Setup instructions for each LLM provider
- Troubleshooting guides and testing procedures
- Environment variable configuration options

All configs pass validation tests. Ready for production use.
2025-08-14 16:39:12 +10:00
3fe26ef138 Address PR feedback: Better samples and realistic search examples
Based on feedback in the PR comment, implemented:

Installer improvements:
- Added choice between code/docs sample testing
- Created FSS-Mini-RAG specific sample files (chunker.py, ollama_integration.py, etc.)
- Timing-based estimation for full project indexing
- Better sample content that actually relates to this project

TUI enhancements:
- Replaced generic searches with FSS-Mini-RAG relevant questions:
  * "chunking strategy"
  * "ollama integration"
  * "indexing performance"
  * "why does indexing take long"
- Added search count tracking and sample limitation reminder
- Intelligent transition to full project after 2 sample searches
- FSS-Mini-RAG specific follow-up question patterns

Key fixes:
- No more dead search results (removed auth/API queries that don't exist)
- Sample questions now match actual content that will be found
- User gets timing estimate for full indexing based on sample performance
- Clear transition path from sample to full project exploration

This prevents the "installed malware" feeling when searches return no results.
2025-08-14 08:55:53 +10:00
e6d5f20f7d Improve installer experience and beginner-friendly features
- Replace slow full-project test with fast 3-file sample
- Add beginner guidance and welcome messages
- Add sample questions to combat prompt paralysis
- Add intelligent follow-up question suggestions
- Improve TUI with contextual next steps

Installer improvements:
- Create minimal sample project (3 files) for testing
- Add helpful tips and guidance for new users
- Better error messaging and progress indicators

TUI enhancements:
- Welcome message for first-time users
- Sample search questions (authentication, error handling, etc.)
- Pattern-based follow-up question generation
- Contextual suggestions based on search results

These changes address user feedback about installation taking too long
and beginners not knowing what to search for.
2025-08-14 08:26:22 +10:00
29abbb285e Merge branch 'main' of https://github.com/FSSCoding/Fss-Mini-Rag 2025-08-12 22:17:09 +10:00
e16451b060 Initial commit 2025-08-12 20:03:50 +10:00
11 changed files with 1040 additions and 41 deletions

21
LICENSE Normal file
View File

@ -0,0 +1,21 @@
MIT License
Copyright (c) 2025 Brett Fox
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

264
docs/LLM_PROVIDERS.md Normal file
View File

@ -0,0 +1,264 @@
# 🤖 LLM Provider Setup Guide
This guide shows how to configure FSS-Mini-RAG with different LLM providers for synthesis and query expansion features.
## 🎯 Quick Provider Comparison
| Provider | Cost | Setup Difficulty | Quality | Privacy | Internet Required |
|----------|------|------------------|---------|---------|-------------------|
| **Ollama** | Free | Easy | Good | Excellent | No |
| **LM Studio** | Free | Easy | Good | Excellent | No |
| **OpenRouter** | Low ($0.10-0.50/M) | Medium | Excellent | Fair | Yes |
| **OpenAI** | Medium ($0.15-2.50/M) | Medium | Excellent | Fair | Yes |
| **Anthropic** | Medium-High | Medium | Excellent | Fair | Yes |
## 🏠 Local Providers (Recommended for Beginners)
### Ollama (Default)
**Best for:** Privacy, learning, no ongoing costs
```yaml
llm:
  provider: ollama
  ollama_host: localhost:11434
  synthesis_model: llama3.2
  expansion_model: llama3.2
  enable_synthesis: false
  synthesis_temperature: 0.3
  cpu_optimized: true
  enable_thinking: true
```
**Setup:**
1. Install Ollama: `curl -fsSL https://ollama.ai/install.sh | sh`
2. Start service: `ollama serve`
3. Download model: `ollama pull llama3.2`
4. Test: `./rag-mini search /path/to/project "test" --synthesize`
**Recommended Models:**
- `qwen3:0.6b` - Ultra-fast, good for CPU-only systems
- `llama3.2` - Balanced quality and speed
- `llama3.1:8b` - Higher quality, needs more RAM
### LM Studio
**Best for:** GUI users, model experimentation
```yaml
llm:
  provider: openai
  api_base: http://localhost:1234/v1
  api_key: "not-needed"
  synthesis_model: "any"
  expansion_model: "any"
  enable_synthesis: false
  synthesis_temperature: 0.3
```
**Setup:**
1. Download [LM Studio](https://lmstudio.ai)
2. Install any model from the catalog
3. Start local server (default port 1234)
4. Use config above
## ☁️ Cloud Providers (For Advanced Users)
### OpenRouter (Best Value)
**Best for:** Access to many models, reasonable pricing
```yaml
llm:
  provider: openai
  api_base: https://openrouter.ai/api/v1
  api_key: "your-api-key-here"
  synthesis_model: "meta-llama/llama-3.1-8b-instruct:free"
  expansion_model: "meta-llama/llama-3.1-8b-instruct:free"
  enable_synthesis: false
  synthesis_temperature: 0.3
  timeout: 30
```
**Setup:**
1. Sign up at [openrouter.ai](https://openrouter.ai)
2. Create API key in dashboard
3. Add $5-10 credits (goes far with efficient models)
4. Replace `your-api-key-here` with actual key
**Budget Models:**
- `meta-llama/llama-3.1-8b-instruct:free` - Free tier
- `openai/gpt-4o-mini` - $0.15 per million tokens
- `anthropic/claude-3-haiku` - $0.25 per million tokens
### OpenAI (Premium Quality)
**Best for:** Reliability, advanced features
```yaml
llm:
  provider: openai
  api_key: "your-openai-api-key"
  synthesis_model: "gpt-4o-mini"
  expansion_model: "gpt-4o-mini"
  enable_synthesis: false
  synthesis_temperature: 0.3
  timeout: 30
```
**Setup:**
1. Sign up at [platform.openai.com](https://platform.openai.com)
2. Add payment method
3. Create API key
4. Start with `gpt-4o-mini` for cost efficiency
### Anthropic Claude (Code Expert)
**Best for:** Code analysis, thoughtful responses
```yaml
llm:
  provider: anthropic
  api_key: "your-anthropic-api-key"
  synthesis_model: "claude-3-haiku-20240307"
  expansion_model: "claude-3-haiku-20240307"
  enable_synthesis: false
  synthesis_temperature: 0.3
  timeout: 30
```
**Setup:**
1. Sign up at [console.anthropic.com](https://console.anthropic.com)
2. Add credits to account
3. Create API key
4. Start with Claude Haiku for budget-friendly option
## 🧪 Testing Your Setup
### 1. Basic Functionality Test
```bash
# Test without LLM (should always work)
./rag-mini search /path/to/project "authentication"
```
### 2. Synthesis Test
```bash
# Test LLM integration
./rag-mini search /path/to/project "authentication" --synthesize
```
### 3. Interactive Test
```bash
# Test exploration mode
./rag-mini explore /path/to/project
# Then ask: "How does authentication work in this codebase?"
```
### 4. Query Expansion Test
Enable `expand_queries: true` in config, then:
```bash
./rag-mini search /path/to/project "auth"
# Should automatically expand to "auth authentication login user session"
```
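For reference, the flag lives under the `search` section of your config file. A minimal sketch using the same field names shown in the example configs (values here are illustrative):

```yaml
search:
  expand_queries: true   # let the LLM add related terms before searching
  default_limit: 10      # unchanged; shown only for context
```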
## 🛠️ Configuration Tips
### For Budget-Conscious Users
```yaml
llm:
synthesis_model: "gpt-4o-mini" # or claude-haiku
enable_synthesis: false # Manual control
synthesis_temperature: 0.1 # Factual responses
max_expansion_terms: 4 # Shorter expansions
```
### For Quality-Focused Users
```yaml
llm:
synthesis_model: "gpt-4o" # or claude-sonnet
enable_synthesis: true # Always on
synthesis_temperature: 0.3 # Balanced creativity
enable_thinking: true # Show reasoning
max_expansion_terms: 8 # Comprehensive expansion
```
### For Privacy-Focused Users
```yaml
# Use only local providers
embedding:
  preferred_method: ollama  # Local embeddings
llm:
  provider: ollama          # Local LLM
# Never use cloud providers
```
## 🔧 Troubleshooting
### Connection Issues
- **Local:** Ensure Ollama/LM Studio is running: `ps aux | grep ollama`
- **Cloud:** Check API key and internet: `curl -H "Authorization: Bearer $API_KEY" https://api.openai.com/v1/models`
### Model Not Found
- **Ollama:** `ollama pull model-name`
- **Cloud:** Check provider's model list documentation
### High Costs
- Use mini/haiku models instead of full versions
- Set `enable_synthesis: false` and use `--synthesize` selectively
- Reduce `max_expansion_terms` to 4-6
### Poor Quality
- Try higher-tier models (gpt-4o, claude-sonnet)
- Adjust `synthesis_temperature` (0.1 = factual, 0.5 = creative)
- Enable `expand_queries` for better search coverage
### Slow Responses
- **Local:** Try smaller models (qwen3:0.6b)
- **Cloud:** Increase `timeout` or switch providers
- **General:** Reduce `max_size` in chunking config
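A minimal sketch of that chunking tweak, using the `max_size`/`min_size` fields from the example configs (1500 is the value suggested in the troubleshooting notes of the provider example config; tune for your hardware):

```yaml
chunking:
  max_size: 1500   # smaller chunks mean less text per LLM call
  min_size: 150
```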
## 📋 Environment Variables (Alternative Setup)
Instead of putting API keys in config files, use environment variables:
```bash
# In your shell profile (.bashrc, .zshrc, etc.)
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
export OPENROUTER_API_KEY="your-openrouter-key"
```
Then in config:
```yaml
llm:
api_key: "${OPENAI_API_KEY}" # Reads from environment
```
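The same pattern works for the other keys exported above. For example, an OpenRouter setup that keeps the key out of the file (this assumes the config loader expands `${...}` references as shown in the snippet above):

```yaml
llm:
  provider: openai                   # OpenRouter uses the OpenAI-compatible API
  api_base: https://openrouter.ai/api/v1
  api_key: "${OPENROUTER_API_KEY}"   # read from the environment, never hard-coded
```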
## 🚀 Advanced: Multi-Provider Setup
You can create different configs for different use cases:
```bash
# Fast local analysis
cp examples/config-beginner.yaml .mini-rag/config-local.yaml
# High-quality cloud analysis
cp examples/config-llm-providers.yaml .mini-rag/config-cloud.yaml
# Edit to use OpenAI/Claude
# Switch configs as needed
ln -sf config-local.yaml .mini-rag/config.yaml # Use local
ln -sf config-cloud.yaml .mini-rag/config.yaml # Use cloud
```
## 📚 Further Reading
- [Ollama Model Library](https://ollama.ai/library)
- [OpenRouter Pricing](https://openrouter.ai/docs#models)
- [OpenAI API Documentation](https://platform.openai.com/docs)
- [Anthropic Claude Documentation](https://docs.anthropic.com/claude)
- [LM Studio Getting Started](https://lmstudio.ai/docs)
---
💡 **Pro Tip:** Start with local Ollama for learning, then upgrade to cloud providers when you need production-quality analysis or are working with large codebases.

View File

@ -47,6 +47,7 @@ search:
  expand_queries: false          # Keep it simple for now

# 🤖 AI explanations (optional but helpful)
# 💡 WANT DIFFERENT LLM? See examples/config-llm-providers.yaml for OpenAI, Claude, etc.
llm:
  synthesis_model: auto          # Pick best available model
  enable_synthesis: false        # Turn on manually with --synthesize

View File

@ -0,0 +1,233 @@
# 🌐 LLM PROVIDER ALTERNATIVES - OpenRouter, LM Studio, OpenAI & More
# Educational guide showing how to configure different LLM providers
# Copy sections you need to your main config.yaml
#═════════════════════════════════════════════════════════════════════════════════
# 🎯 QUICK PROVIDER SELECTION GUIDE:
#
# 🏠 LOCAL (Best Privacy, No Internet Needed):
# - Ollama: Great quality, easy setup, free
# - LM Studio: User-friendly GUI, works with many models
#
# ☁️ CLOUD (Powerful Models, Requires API Keys):
# - OpenRouter: Access to many models with one API
# - OpenAI: High quality, reliable, but more expensive
# - Anthropic: Excellent for code analysis
#
# 💰 BUDGET FRIENDLY:
# - OpenRouter (Qwen, Llama models): $0.10-0.50 per million tokens
# - Local Ollama/LM Studio: Completely free
#
# 🚀 PERFORMANCE:
# - Local: Limited by your hardware
# - Cloud: Fast and powerful, costs per use
#═════════════════════════════════════════════════════════════════════════════════
# Standard FSS-Mini-RAG settings (copy these to any config)
chunking:
  max_size: 2000
  min_size: 150
  strategy: semantic
streaming:
  enabled: true
  threshold_bytes: 1048576
files:
  min_file_size: 50
  exclude_patterns:
    - "node_modules/**"
    - ".git/**"
    - "__pycache__/**"
    - "*.pyc"
    - ".venv/**"
    - "build/**"
    - "dist/**"
  include_patterns:
    - "**/*"
embedding:
  preferred_method: ollama  # Use Ollama for embeddings (works with all providers below)
  ollama_model: nomic-embed-text
  ollama_host: localhost:11434
  batch_size: 32
search:
  default_limit: 10
  enable_bm25: true
  similarity_threshold: 0.1
  expand_queries: false
#═════════════════════════════════════════════════════════════════════════════════
# 🤖 LLM PROVIDER CONFIGURATIONS
#═════════════════════════════════════════════════════════════════════════════════
# 🏠 OPTION 1: OLLAMA (LOCAL) - Default and Recommended
# ✅ Pros: Free, private, no API keys, good quality
# ❌ Cons: Uses your computer's resources, limited by hardware
llm:
  provider: ollama              # Use local Ollama
  ollama_host: localhost:11434  # Default Ollama location
  synthesis_model: llama3.2     # Good all-around model
  # alternatives: qwen3:0.6b (faster), llama3.2:3b (balanced), llama3.1:8b (quality)
  expansion_model: llama3.2
  enable_synthesis: false
  synthesis_temperature: 0.3
  cpu_optimized: true
  enable_thinking: true
  max_expansion_terms: 8
# 🖥️ OPTION 2: LM STUDIO (LOCAL) - User-Friendly Alternative
# ✅ Pros: Easy GUI, drag-drop model installation, compatible with Ollama
# ❌ Cons: Another app to manage, similar hardware limitations
#
# SETUP STEPS:
# 1. Download LM Studio from lmstudio.ai
# 2. Install a model (try "microsoft/DialoGPT-medium" or "TheBloke/Llama-2-7B-Chat-GGML")
# 3. Start local server in LM Studio (usually port 1234)
# 4. Use this config:
#
# llm:
# provider: openai # LM Studio uses OpenAI-compatible API
# api_base: http://localhost:1234/v1 # LM Studio default port
# api_key: "not-needed" # LM Studio doesn't require real API key
# synthesis_model: "any" # Use whatever model you loaded in LM Studio
# expansion_model: "any"
# enable_synthesis: false
# synthesis_temperature: 0.3
# cpu_optimized: true
# enable_thinking: true
# max_expansion_terms: 8
# ☁️ OPTION 3: OPENROUTER (CLOUD) - Many Models, One API
# ✅ Pros: Access to many models, good prices, no local setup
# ❌ Cons: Requires internet, costs money, less private
#
# SETUP STEPS:
# 1. Sign up at openrouter.ai
# 2. Get API key from dashboard
# 3. Add credits to account ($5-10 goes a long way)
# 4. Use this config:
#
# llm:
# provider: openai # OpenRouter uses OpenAI-compatible API
# api_base: https://openrouter.ai/api/v1
# api_key: "your-openrouter-api-key-here" # Replace with your actual key
# synthesis_model: "meta-llama/llama-3.1-8b-instruct:free" # Free tier model
# # alternatives: "openai/gpt-4o-mini" ($0.15/M), "anthropic/claude-3-haiku" ($0.25/M)
# expansion_model: "meta-llama/llama-3.1-8b-instruct:free"
# enable_synthesis: false
# synthesis_temperature: 0.3
# cpu_optimized: false # Cloud models don't need CPU optimization
# enable_thinking: true
# max_expansion_terms: 8
# timeout: 30 # Longer timeout for internet requests
# 🏢 OPTION 4: OPENAI (CLOUD) - Premium Quality
# ✅ Pros: Excellent quality, very reliable, fast
# ❌ Cons: More expensive, requires OpenAI account
#
# SETUP STEPS:
# 1. Sign up at platform.openai.com
# 2. Add payment method (pay-per-use)
# 3. Create API key in dashboard
# 4. Use this config:
#
# llm:
# provider: openai
# api_key: "your-openai-api-key-here" # Replace with your actual key
# synthesis_model: "gpt-4o-mini" # Affordable option (~$0.15/M tokens)
# # alternatives: "gpt-4o" (premium, ~$2.50/M), "gpt-3.5-turbo" (budget, ~$0.50/M)
# expansion_model: "gpt-4o-mini"
# enable_synthesis: false
# synthesis_temperature: 0.3
# cpu_optimized: false
# enable_thinking: true
# max_expansion_terms: 8
# timeout: 30
# 🧠 OPTION 5: ANTHROPIC CLAUDE (CLOUD) - Excellent for Code
# ✅ Pros: Great at code analysis, very thoughtful responses
# ❌ Cons: Premium pricing, separate API account needed
#
# SETUP STEPS:
# 1. Sign up at console.anthropic.com
# 2. Get API key and add credits
# 3. Use this config:
#
# llm:
# provider: anthropic
# api_key: "your-anthropic-api-key-here" # Replace with your actual key
# synthesis_model: "claude-3-haiku-20240307" # Most affordable option
# # alternatives: "claude-3-sonnet-20240229" (balanced), "claude-3-opus-20240229" (premium)
# expansion_model: "claude-3-haiku-20240307"
# enable_synthesis: false
# synthesis_temperature: 0.3
# cpu_optimized: false
# enable_thinking: true
# max_expansion_terms: 8
# timeout: 30
#═════════════════════════════════════════════════════════════════════════════════
# 🧪 TESTING YOUR CONFIGURATION
#═════════════════════════════════════════════════════════════════════════════════
#
# After setting up any provider, test with these commands:
#
# 1. Test basic search (no LLM needed):
# ./rag-mini search /path/to/project "test query"
#
# 2. Test LLM synthesis:
# ./rag-mini search /path/to/project "test query" --synthesize
#
# 3. Test query expansion:
# Enable expand_queries: true in search section and try:
# ./rag-mini search /path/to/project "auth"
#
# 4. Test thinking mode:
# ./rag-mini explore /path/to/project
# Then ask: "explain the authentication system"
#
#═════════════════════════════════════════════════════════════════════════════════
# 💡 TROUBLESHOOTING
#═════════════════════════════════════════════════════════════════════════════════
#
# ❌ "Connection refused" or "API error":
# - Local: Make sure Ollama/LM Studio is running
# - Cloud: Check API key and internet connection
#
# ❌ "Model not found":
# - Local: Install model with `ollama pull model-name`
# - Cloud: Check model name matches provider's API docs
#
# ❌ "Token limit exceeded" or expensive bills:
# - Use cheaper models like gpt-4o-mini or claude-haiku
# - Enable shorter contexts with max_size: 1500
#
# ❌ Slow responses:
# - Local: Try smaller models (qwen3:0.6b)
# - Cloud: Increase timeout or try different provider
#
# ❌ Poor quality results:
# - Try higher-quality models
# - Adjust synthesis_temperature (0.1 for factual, 0.5 for creative)
# - Enable expand_queries for better search coverage
#
#═════════════════════════════════════════════════════════════════════════════════
# 📚 LEARN MORE
#═════════════════════════════════════════════════════════════════════════════════
#
# Provider Documentation:
# - Ollama: https://ollama.ai/library (model catalog)
# - LM Studio: https://lmstudio.ai/docs (getting started)
# - OpenRouter: https://openrouter.ai/docs (API reference)
# - OpenAI: https://platform.openai.com/docs (API docs)
# - Anthropic: https://docs.anthropic.com/claude/reference (Claude API)
#
# Model Recommendations:
# - Code Analysis: claude-3-sonnet, gpt-4o, llama3.1:8b
# - Fast Responses: gpt-4o-mini, claude-haiku, qwen3:0.6b
# - Budget Friendly: OpenRouter free tier, local Ollama
# - Best Privacy: Local Ollama or LM Studio only
#
#═════════════════════════════════════════════════════════════════════════════════

View File

@ -162,22 +162,72 @@ check_ollama() {
print_warning "Ollama not found" print_warning "Ollama not found"
echo "" echo ""
echo -e "${CYAN}Ollama provides the best embedding quality and performance.${NC}" echo -e "${CYAN}Ollama provides the best embedding quality and performance.${NC}"
echo -e "${YELLOW}To install Ollama:${NC}" echo ""
echo " 1. Visit: https://ollama.ai/download" echo -e "${BOLD}Options:${NC}"
echo -e "${GREEN}1) Install Ollama automatically${NC} (recommended)"
echo -e "${YELLOW}2) Manual installation${NC} - Visit https://ollama.com/download"
echo -e "${BLUE}3) Continue without Ollama${NC} (uses ML fallback)"
echo ""
echo -n "Choose [1/2/3]: "
read -r ollama_choice
case "$ollama_choice" in
1|"")
print_info "Installing Ollama using official installer..."
echo -e "${CYAN}Running: curl -fsSL https://ollama.com/install.sh | sh${NC}"
if curl -fsSL https://ollama.com/install.sh | sh; then
print_success "Ollama installed successfully"
print_info "Starting Ollama server..."
ollama serve &
sleep 3
if curl -s http://localhost:11434/api/version >/dev/null 2>&1; then
print_success "Ollama server started"
echo ""
echo -e "${CYAN}💡 Pro tip: Download an LLM for AI-powered search synthesis!${NC}"
echo -e " Lightweight: ${GREEN}ollama pull qwen3:0.6b${NC} (~400MB, very fast)"
echo -e " Balanced: ${GREEN}ollama pull qwen3:1.7b${NC} (~1GB, good quality)"
echo -e " Excellent: ${GREEN}ollama pull qwen3:3b${NC} (~2GB, great for this project)"
echo -e " Premium: ${GREEN}ollama pull qwen3:8b${NC} (~5GB, amazing results)"
echo ""
echo -e "${BLUE}Creative possibilities: Try mistral for storytelling, or qwen3-coder for development!${NC}"
echo ""
return 0
else
print_warning "Ollama installed but failed to start automatically"
echo "Please start Ollama manually: ollama serve"
echo "Then re-run this installer"
exit 1
fi
else
print_error "Failed to install Ollama automatically"
echo "Please install manually from https://ollama.com/download"
exit 1
fi
;;
2)
echo ""
echo -e "${YELLOW}Manual Ollama installation:${NC}"
echo " 1. Visit: https://ollama.com/download"
echo " 2. Download and install for your system" echo " 2. Download and install for your system"
echo " 3. Run: ollama serve" echo " 3. Run: ollama serve"
echo " 4. Re-run this installer" echo " 4. Re-run this installer"
echo "" print_info "Exiting for manual installation..."
echo -e "${BLUE}Alternative: Use ML fallback (requires more disk space)${NC}"
echo ""
echo -n "Continue without Ollama? (y/N): "
read -r continue_without
if [[ $continue_without =~ ^[Yy]$ ]]; then
return 1
else
print_info "Install Ollama first, then re-run this script"
exit 0
fi
;;
3)
print_info "Continuing without Ollama (will use ML fallback)"
return 1
;;
*)
print_warning "Invalid choice, continuing without Ollama"
return 1
;;
esac
fi
}
@ -271,8 +321,8 @@ get_installation_preferences() {
echo "" echo ""
echo -e "${BOLD}Installation options:${NC}" echo -e "${BOLD}Installation options:${NC}"
echo -e "${GREEN}L) Light${NC} - Ollama + basic deps (~50MB)" echo -e "${GREEN}L) Light${NC} - Ollama + basic deps (~50MB) ${CYAN}← Best performance + AI chat${NC}"
echo -e "${YELLOW}F) Full${NC} - Light + ML fallback (~2-3GB)" echo -e "${YELLOW}F) Full${NC} - Light + ML fallback (~2-3GB) ${CYAN}← RAG-only if no Ollama${NC}"
echo -e "${BLUE}C) Custom${NC} - Configure individual components" echo -e "${BLUE}C) Custom${NC} - Configure individual components"
echo "" echo ""
@ -549,35 +599,125 @@ show_completion() {
read -r run_test
if [[ ! $run_test =~ ^[Nn]$ ]]; then
run_quick_test
echo ""
show_beginner_guidance
else
show_beginner_guidance
fi
}
# Create sample project for testing
create_sample_project() {
local sample_dir="$SCRIPT_DIR/.sample_test"
rm -rf "$sample_dir"
mkdir -p "$sample_dir"
# Create a few small sample files
cat > "$sample_dir/README.md" << 'EOF'
# Sample Project
This is a sample project for testing FSS-Mini-RAG search capabilities.
## Features
- User authentication system
- Document processing
- Search functionality
- Email integration
EOF
cat > "$sample_dir/auth.py" << 'EOF'
# Authentication module
def login_user(username, password):
"""Handle user login with password validation"""
if validate_credentials(username, password):
create_session(username)
return True
return False
def validate_credentials(username, password):
"""Check username and password against database"""
# Database validation logic here
return check_password_hash(username, password)
EOF
cat > "$sample_dir/search.py" << 'EOF'
# Search functionality
def semantic_search(query, documents):
"""Perform semantic search across document collection"""
embeddings = generate_embeddings(query)
results = find_similar_documents(embeddings, documents)
return rank_results(results)
def generate_embeddings(text):
"""Generate vector embeddings for text"""
# Embedding generation logic
return process_with_model(text)
EOF
echo "$sample_dir"
}
# Run quick test with sample data
run_quick_test() {
print_header "Quick Test"
print_info "Creating small sample project for testing..."
local sample_dir=$(create_sample_project)
echo "Sample project created with 3 files for fast testing."
echo ""
# Index the sample project (much faster)
print_info "Indexing sample project (this should be fast)..."
if ./rag-mini index "$sample_dir" --quiet; then
print_success "Sample project indexed successfully"
echo ""
print_info "Testing search with sample queries..."
echo -e "${BLUE}Running search: 'user authentication'${NC}"
./rag-mini search "$sample_dir" "user authentication" --limit 2
echo ""
print_success "Test completed successfully!"
echo -e "${CYAN}Ready to use FSS-Mini-RAG on your own projects!${NC}"
# Offer beginner guidance
echo ""
echo -e "${YELLOW}💡 Beginner Tip:${NC} Try the interactive mode with pre-made questions"
echo " Run: ./rag-tui for guided experience"
# Clean up sample
rm -rf "$sample_dir"
else
print_error "Sample test failed"
echo "This might indicate an issue with the installation."
rm -rf "$sample_dir"
fi
}
# Show beginner-friendly first steps
show_beginner_guidance() {
print_header "Getting Started - Your First Search"
echo -e "${CYAN}FSS-Mini-RAG is ready! Here's how to start:${NC}"
echo ""
echo -e "${GREEN}🎯 For Beginners (Recommended):${NC}"
echo " ./rag-tui"
echo " ↳ Interactive interface with sample questions"
echo ""
echo -e "${BLUE}💻 For Developers:${NC}"
echo " ./rag-mini index /path/to/your/project"
echo " ./rag-mini search /path/to/your/project \"your question\""
echo ""
echo -e "${YELLOW}📚 What can you search for in FSS-Mini-RAG?${NC}"
echo " • Technical: \"chunking strategy\", \"ollama integration\", \"indexing performance\""
echo " • Usage: \"how to improve search results\", \"why does indexing take long\""
echo " • Your own projects: any code, docs, emails, notes, research"
echo ""
echo -e "${CYAN}💡 Pro tip:${NC} You can drag ANY text-based documents into a folder"
echo " and search through them - emails, notes, research, chat logs!"
}
# Main installation flow
main() {
echo -e "${CYAN}${BOLD}"

View File

@ -72,13 +72,21 @@ class SearchConfig:
@dataclass
class LLMConfig:
"""Configuration for LLM synthesis and query expansion."""
# Core settings
synthesis_model: str = "auto" # "auto", "qwen3:1.7b", "qwen2.5:1.5b", etc.
expansion_model: str = "auto" # Usually same as synthesis_model
max_expansion_terms: int = 8 # Maximum additional terms to add
enable_synthesis: bool = False # Enable by default when --synthesize used
synthesis_temperature: float = 0.3
enable_thinking: bool = True # Enable thinking mode for Qwen3 models
cpu_optimized: bool = True # Prefer lightweight models
# Provider-specific settings (for different LLM providers)
provider: str = "ollama" # "ollama", "openai", "anthropic"
ollama_host: str = "localhost:11434" # Ollama connection
api_key: Optional[str] = None # API key for cloud providers
api_base: Optional[str] = None # Base URL for API (e.g., OpenRouter)
timeout: int = 20 # Request timeout in seconds
@dataclass

View File

@ -81,16 +81,36 @@ class OllamaEmbedder:
def _verify_ollama_connection(self):
"""Verify Ollama server is running and model is available."""
try:
# Check server status
response = requests.get(f"{self.base_url}/api/tags", timeout=5)
response.raise_for_status()
except requests.exceptions.ConnectionError:
print("🔌 Ollama Service Unavailable")
print(" Ollama provides AI embeddings that make semantic search possible")
print(" Start Ollama: ollama serve")
print(" Install models: ollama pull nomic-embed-text")
print()
raise ConnectionError("Ollama service not running. Start with: ollama serve")
except requests.exceptions.Timeout:
print("⏱️ Ollama Service Timeout")
print(" Ollama is taking too long to respond")
print(" Check if Ollama is overloaded: ollama ps")
print(" Restart if needed: killall ollama && ollama serve")
print()
raise ConnectionError("Ollama service timeout")
# Check if our model is available
models = response.json().get('models', [])
model_names = [model['name'] for model in models]
if self.model_name not in model_names:
print(f"📦 Model '{self.model_name}' Not Found")
print(" Embedding models convert text into searchable vectors")
print(f" Download model: ollama pull {self.model_name}")
if model_names:
print(f" Available models: {', '.join(model_names[:3])}")
print()
# Try to pull the model
self._pull_model()

View File

@ -117,11 +117,21 @@ class CodeSearcher:
"""Connect to the LanceDB database.""" """Connect to the LanceDB database."""
try: try:
if not self.rag_dir.exists(): if not self.rag_dir.exists():
print("🗃️ No Search Index Found")
print(" An index is a database that makes your files searchable")
print(f" Create index: ./rag-mini index {self.project_path}")
print(" (This analyzes your files and creates semantic search vectors)")
print()
raise FileNotFoundError(f"No RAG index found at {self.rag_dir}")
self.db = lancedb.connect(self.rag_dir)
if "code_vectors" not in self.db.table_names():
print("🔧 Index Database Corrupted")
print(" The search index exists but is missing data tables")
print(f" Rebuild index: rm -rf {self.rag_dir} && ./rag-mini index {self.project_path}")
print(" (This will recreate the search database)")
print()
raise ValueError("No code_vectors table found. Run indexing first.")
self.table = self.db.open_table("code_vectors")

View File

@ -15,11 +15,29 @@ import logging
# Add the RAG system to the path
sys.path.insert(0, str(Path(__file__).parent))
try:
from mini_rag.indexer import ProjectIndexer
from mini_rag.search import CodeSearcher
from mini_rag.ollama_embeddings import OllamaEmbedder
from mini_rag.llm_synthesizer import LLMSynthesizer
from mini_rag.explorer import CodeExplorer
except ImportError as e:
print("❌ Error: Missing dependencies!")
print()
print("It looks like you haven't installed the required packages yet.")
print("This is a common mistake - here's how to fix it:")
print()
print("1. Make sure you're in the FSS-Mini-RAG directory")
print("2. Run the installer script:")
print(" ./install_mini_rag.sh")
print()
print("Or if you want to install manually:")
print(" python3 -m venv .venv")
print(" source .venv/bin/activate")
print(" pip install -r requirements.txt")
print()
print(f"Missing module: {e.name}")
sys.exit(1)
# Configure logging for user-friendly output
logging.basicConfig(
@ -68,7 +86,25 @@ def index_project(project_path: Path, force: bool = False):
if not (project_path / '.mini-rag' / 'last_search').exists():
print(f"\n💡 Try: rag-mini search {project_path} \"your search here\"")
except FileNotFoundError:
print(f"📁 Directory Not Found: {project_path}")
print(" Make sure the path exists and you're in the right location")
print(f" Current directory: {Path.cwd()}")
print(" Check path: ls -la /path/to/your/project")
print()
sys.exit(1)
except PermissionError:
print("🔒 Permission Denied")
print(" FSS-Mini-RAG needs to read files and create index database")
print(f" Check permissions: ls -la {project_path}")
print(" Try a different location with write access")
print()
sys.exit(1)
except Exception as e:
# Connection errors are handled in the embedding module
if "ollama" in str(e).lower() or "connection" in str(e).lower():
sys.exit(1) # Error already displayed
print(f"❌ Indexing failed: {e}") print(f"❌ Indexing failed: {e}")
print() print()
print("🔧 Common solutions:") print("🔧 Common solutions:")

View File

@ -15,6 +15,7 @@ class SimpleTUI:
def __init__(self):
self.project_path: Optional[Path] = None
self.current_config: Dict[str, Any] = {}
self.search_count = 0 # Track searches for sample reminder
def clear_screen(self):
"""Clear the terminal screen."""
@ -278,8 +279,37 @@ class SimpleTUI:
print(f"Project: {self.project_path.name}") print(f"Project: {self.project_path.name}")
print() print()
# Get search query # Show sample questions for beginners - relevant to FSS-Mini-RAG
query = self.get_input("Enter search query", "").strip() print("💡 Not sure what to search for? Try these questions about FSS-Mini-RAG:")
print()
sample_questions = [
"chunking strategy",
"ollama integration",
"indexing performance",
"why does indexing take long",
"how to improve search results",
"embedding generation"
]
for i, question in enumerate(sample_questions[:3], 1):
print(f" {i}. {question}")
print(" 4. Enter your own question")
print()
# Let user choose a sample or enter their own
choice_str = self.get_input("Choose a number (1-4) or press Enter for custom", "4")
try:
choice = int(choice_str)
if 1 <= choice <= 3:
query = sample_questions[choice - 1]
print(f"Selected: '{query}'")
print()
else:
query = self.get_input("Enter your search query", "").strip()
except ValueError:
query = self.get_input("Enter your search query", "").strip()
if not query:
return
@ -354,6 +384,70 @@ class SimpleTUI:
if len(results) > 1:
print("💡 To see more context or specific results:")
print(f" Run: ./rag-mini search {self.project_path} \"{query}\" --verbose")
# Suggest follow-up questions based on the search
print()
print("🔍 Suggested follow-up searches:")
follow_up_questions = self.generate_follow_up_questions(query, results)
for i, question in enumerate(follow_up_questions, 1):
print(f" {i}. {question}")
# Ask if they want to run a follow-up search
print()
choice = input("Run a follow-up search? Enter number (1-3) or press Enter to continue: ").strip()
if choice.isdigit() and 1 <= int(choice) <= len(follow_up_questions):
# Recursive search with the follow-up question
follow_up_query = follow_up_questions[int(choice) - 1]
print(f"\nSearching for: '{follow_up_query}'")
print("=" * 50)
# Run another search
follow_results = searcher.search(follow_up_query, top_k=5)
if follow_results:
print(f"✅ Found {len(follow_results)} follow-up results:")
print()
for i, result in enumerate(follow_results[:3], 1): # Show top 3
try:
rel_path = result.file_path.relative_to(self.project_path)
except:
rel_path = result.file_path
print(f"{i}. {rel_path} (Score: {result.score:.3f})")
print(f" {result.content.strip()[:100]}...")
print()
else:
print("❌ No follow-up results found")
# Track searches and show sample reminder
self.search_count += 1
# Show sample reminder after 2 searches
if self.search_count >= 2 and self.project_path.name == '.sample_test':
print()
print("⚠️ Sample Limitation Notice")
print("=" * 30)
print("You've been searching a small sample project.")
print("For full exploration of your codebase, you need to index the complete project.")
print()
# Show timing estimate if available
try:
with open('/tmp/fss-rag-sample-time.txt', 'r') as f:
sample_time = int(f.read().strip())
# Rough estimate: multiply by file count ratio
estimated_time = sample_time * 20 # Rough multiplier
print(f"🕒 Estimated full indexing time: ~{estimated_time} seconds")
except:
print("🕒 Estimated full indexing time: 1-3 minutes for typical projects")
print()
choice = input("Index the full project now? [y/N]: ").strip().lower()
if choice == 'y':
# Switch to full project and index
parent_dir = self.project_path.parent
self.project_path = parent_dir
print(f"\nSwitching to full project: {parent_dir}")
print("Starting full indexing...")
# Note: This would trigger full indexing in real implementation
print(f" Or: ./rag-mini-enhanced context {self.project_path} \"{query}\"") print(f" Or: ./rag-mini-enhanced context {self.project_path} \"{query}\"")
print()
@ -364,6 +458,48 @@ class SimpleTUI:
print()
input("Press Enter to continue...")
def generate_follow_up_questions(self, original_query: str, results) -> List[str]:
"""Generate contextual follow-up questions based on search results."""
# Simple pattern-based follow-up generation
follow_ups = []
# Based on original query patterns
query_lower = original_query.lower()
# FSS-Mini-RAG specific follow-ups
if "chunk" in query_lower:
follow_ups.extend(["chunk size optimization", "smart chunking boundaries", "chunk overlap strategies"])
elif "ollama" in query_lower:
follow_ups.extend(["embedding model comparison", "ollama server setup", "nomic-embed-text performance"])
elif "index" in query_lower or "performance" in query_lower:
follow_ups.extend(["indexing speed optimization", "memory usage during indexing", "file processing pipeline"])
elif "search" in query_lower or "result" in query_lower:
follow_ups.extend(["search result ranking", "semantic vs keyword search", "query expansion techniques"])
elif "embed" in query_lower:
follow_ups.extend(["vector embedding storage", "embedding model fallbacks", "similarity scoring"])
else:
# Generic RAG-related follow-ups
follow_ups.extend(["vector database internals", "search quality tuning", "embedding optimization"])
# Based on file types found in results (FSS-Mini-RAG specific)
if results:
file_extensions = set()
for result in results[:3]: # Check first 3 results
ext = result.file_path.suffix.lower()
file_extensions.add(ext)
if '.py' in file_extensions:
follow_ups.append("Python module dependencies")
if '.md' in file_extensions:
follow_ups.append("documentation implementation")
if 'chunker' in str(results[0].file_path).lower():
follow_ups.append("chunking algorithm details")
if 'search' in str(results[0].file_path).lower():
follow_ups.append("search algorithm implementation")
# Return top 3 unique follow-ups
return list(dict.fromkeys(follow_ups))[:3]
def explore_interactive(self):
"""Interactive exploration interface with thinking mode."""
if not self.project_path:
@ -682,6 +818,12 @@ class SimpleTUI:
status = "✅ Indexed" if rag_dir.exists() else "❌ Not indexed" status = "✅ Indexed" if rag_dir.exists() else "❌ Not indexed"
print(f"📁 Current project: {self.project_path.name} ({status})") print(f"📁 Current project: {self.project_path.name} ({status})")
print() print()
else:
# Show beginner tips when no project selected
print("🎯 Welcome to FSS-Mini-RAG!")
print(" Search through code, documents, emails, notes - anything text-based!")
print(" Start by selecting a project directory below.")
print()
options = [
"Select project directory",

124
scripts/test-configs.py Executable file
View File

@ -0,0 +1,124 @@
#!/usr/bin/env python3
"""
Test script to validate all config examples are syntactically correct
and contain required fields for FSS-Mini-RAG.
"""
import yaml
import sys
from pathlib import Path
from typing import Dict, Any, List
def validate_config_structure(config: Dict[str, Any], config_name: str) -> List[str]:
"""Validate that config has required structure."""
errors = []
# Required sections
required_sections = ['chunking', 'streaming', 'files', 'embedding', 'search']
for section in required_sections:
if section not in config:
errors.append(f"{config_name}: Missing required section '{section}'")
# Validate chunking section
if 'chunking' in config:
chunking = config['chunking']
required_chunking = ['max_size', 'min_size', 'strategy']
for field in required_chunking:
if field not in chunking:
errors.append(f"{config_name}: Missing chunking.{field}")
# Validate types and ranges
if 'max_size' in chunking and not isinstance(chunking['max_size'], int):
errors.append(f"{config_name}: chunking.max_size must be integer")
if 'min_size' in chunking and not isinstance(chunking['min_size'], int):
errors.append(f"{config_name}: chunking.min_size must be integer")
if 'strategy' in chunking and chunking['strategy'] not in ['semantic', 'fixed']:
errors.append(f"{config_name}: chunking.strategy must be 'semantic' or 'fixed'")
# Validate embedding section
if 'embedding' in config:
embedding = config['embedding']
if 'preferred_method' in embedding:
valid_methods = ['ollama', 'ml', 'hash', 'auto']
if embedding['preferred_method'] not in valid_methods:
errors.append(f"{config_name}: embedding.preferred_method must be one of {valid_methods}")
# Validate LLM section (if present)
if 'llm' in config:
llm = config['llm']
if 'synthesis_temperature' in llm:
temp = llm['synthesis_temperature']
if not isinstance(temp, (int, float)) or temp < 0 or temp > 1:
errors.append(f"{config_name}: llm.synthesis_temperature must be number between 0-1")
return errors
def test_config_file(config_path: Path) -> bool:
"""Test a single config file."""
print(f"Testing {config_path.name}...")
try:
# Test YAML parsing
with open(config_path, 'r') as f:
config = yaml.safe_load(f)
if not config:
print(f"{config_path.name}: Empty or invalid YAML")
return False
# Test structure
errors = validate_config_structure(config, config_path.name)
if errors:
print(f"{config_path.name}: Structure errors:")
for error in errors:
print(f"{error}")
return False
print(f"{config_path.name}: Valid")
return True
except yaml.YAMLError as e:
print(f"{config_path.name}: YAML parsing error: {e}")
return False
except Exception as e:
print(f"{config_path.name}: Unexpected error: {e}")
return False
def main():
"""Test all config examples."""
script_dir = Path(__file__).parent
project_root = script_dir.parent
examples_dir = project_root / 'examples'
if not examples_dir.exists():
print(f"❌ Examples directory not found: {examples_dir}")
sys.exit(1)
# Find all config files
config_files = list(examples_dir.glob('config*.yaml'))
if not config_files:
print(f"❌ No config files found in {examples_dir}")
sys.exit(1)
print(f"🧪 Testing {len(config_files)} config files...\n")
all_passed = True
for config_file in sorted(config_files):
passed = test_config_file(config_file)
if not passed:
all_passed = False
print(f"\n{'='*50}")
if all_passed:
print("✅ All config files are valid!")
print("\n💡 To use any config:")
print(" cp examples/config-NAME.yaml /path/to/project/.mini-rag/config.yaml")
sys.exit(0)
else:
print("❌ Some config files have issues - please fix before release")
sys.exit(1)
if __name__ == '__main__':
main()