Compare commits
6 Commits
9eb366f414 ... 11639c8237

Commits:
- 11639c8237
- 2f2dd6880b
- 3fe26ef138
- e6d5f20f7d
- 29abbb285e
- e16451b060
21 LICENSE Normal file
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 Brett Fox

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
264 docs/LLM_PROVIDERS.md Normal file
@@ -0,0 +1,264 @@
# 🤖 LLM Provider Setup Guide

This guide shows how to configure FSS-Mini-RAG with different LLM providers for synthesis and query expansion features.

## 🎯 Quick Provider Comparison

| Provider | Cost | Setup Difficulty | Quality | Privacy | Internet Required |
|----------|------|------------------|---------|---------|-------------------|
| **Ollama** | Free | Easy | Good | Excellent | No |
| **LM Studio** | Free | Easy | Good | Excellent | No |
| **OpenRouter** | Low ($0.10-0.50/M) | Medium | Excellent | Fair | Yes |
| **OpenAI** | Medium ($0.15-2.50/M) | Medium | Excellent | Fair | Yes |
| **Anthropic** | Medium-High | Medium | Excellent | Fair | Yes |

## 🏠 Local Providers (Recommended for Beginners)

### Ollama (Default)

**Best for:** Privacy, learning, no ongoing costs

```yaml
llm:
  provider: ollama
  ollama_host: localhost:11434
  synthesis_model: llama3.2
  expansion_model: llama3.2
  enable_synthesis: false
  synthesis_temperature: 0.3
  cpu_optimized: true
  enable_thinking: true
```

**Setup:**
1. Install Ollama: `curl -fsSL https://ollama.ai/install.sh | sh`
2. Start service: `ollama serve`
3. Download model: `ollama pull llama3.2`
4. Test: `./rag-mini search /path/to/project "test" --synthesize`

**Recommended Models:**
- `qwen3:0.6b` - Ultra-fast, good for CPU-only systems
- `llama3.2` - Balanced quality and speed
- `llama3.1:8b` - Higher quality, needs more RAM

### LM Studio

**Best for:** GUI users, model experimentation

```yaml
llm:
  provider: openai
  api_base: http://localhost:1234/v1
  api_key: "not-needed"
  synthesis_model: "any"
  expansion_model: "any"
  enable_synthesis: false
  synthesis_temperature: 0.3
```

**Setup:**
1. Download [LM Studio](https://lmstudio.ai)
2. Install any model from the catalog
3. Start local server (default port 1234)
4. Use config above

## ☁️ Cloud Providers (For Advanced Users)

### OpenRouter (Best Value)

**Best for:** Access to many models, reasonable pricing

```yaml
llm:
  provider: openai
  api_base: https://openrouter.ai/api/v1
  api_key: "your-api-key-here"
  synthesis_model: "meta-llama/llama-3.1-8b-instruct:free"
  expansion_model: "meta-llama/llama-3.1-8b-instruct:free"
  enable_synthesis: false
  synthesis_temperature: 0.3
  timeout: 30
```

**Setup:**
1. Sign up at [openrouter.ai](https://openrouter.ai)
2. Create API key in dashboard
3. Add $5-10 credits (goes far with efficient models)
4. Replace `your-api-key-here` with actual key

**Budget Models:**
- `meta-llama/llama-3.1-8b-instruct:free` - Free tier
- `openai/gpt-4o-mini` - $0.15 per million tokens
- `anthropic/claude-3-haiku` - $0.25 per million tokens

### OpenAI (Premium Quality)

**Best for:** Reliability, advanced features

```yaml
llm:
  provider: openai
  api_key: "your-openai-api-key"
  synthesis_model: "gpt-4o-mini"
  expansion_model: "gpt-4o-mini"
  enable_synthesis: false
  synthesis_temperature: 0.3
  timeout: 30
```

**Setup:**
1. Sign up at [platform.openai.com](https://platform.openai.com)
2. Add payment method
3. Create API key
4. Start with `gpt-4o-mini` for cost efficiency

### Anthropic Claude (Code Expert)

**Best for:** Code analysis, thoughtful responses

```yaml
llm:
  provider: anthropic
  api_key: "your-anthropic-api-key"
  synthesis_model: "claude-3-haiku-20240307"
  expansion_model: "claude-3-haiku-20240307"
  enable_synthesis: false
  synthesis_temperature: 0.3
  timeout: 30
```

**Setup:**
1. Sign up at [console.anthropic.com](https://console.anthropic.com)
2. Add credits to account
3. Create API key
4. Start with Claude Haiku for budget-friendly option

## 🧪 Testing Your Setup

### 1. Basic Functionality Test
```bash
# Test without LLM (should always work)
./rag-mini search /path/to/project "authentication"
```

### 2. Synthesis Test
```bash
# Test LLM integration
./rag-mini search /path/to/project "authentication" --synthesize
```

### 3. Interactive Test
```bash
# Test exploration mode
./rag-mini explore /path/to/project
# Then ask: "How does authentication work in this codebase?"
```

### 4. Query Expansion Test
Enable `expand_queries: true` in config, then:
```bash
./rag-mini search /path/to/project "auth"
# Should automatically expand to "auth authentication login user session"
```

## 🛠️ Configuration Tips

### For Budget-Conscious Users
```yaml
llm:
  synthesis_model: "gpt-4o-mini" # or claude-haiku
  enable_synthesis: false # Manual control
  synthesis_temperature: 0.1 # Factual responses
  max_expansion_terms: 4 # Shorter expansions
```

### For Quality-Focused Users
```yaml
llm:
  synthesis_model: "gpt-4o" # or claude-sonnet
  enable_synthesis: true # Always on
  synthesis_temperature: 0.3 # Balanced creativity
  enable_thinking: true # Show reasoning
  max_expansion_terms: 8 # Comprehensive expansion
```

### For Privacy-Focused Users
```yaml
# Use only local providers
embedding:
  preferred_method: ollama # Local embeddings
llm:
  provider: ollama # Local LLM
# Never use cloud providers
```

## 🔧 Troubleshooting

### Connection Issues
- **Local:** Ensure Ollama/LM Studio is running: `ps aux | grep ollama`
- **Cloud:** Check API key and internet: `curl -H "Authorization: Bearer $API_KEY" https://api.openai.com/v1/models`

### Model Not Found
- **Ollama:** `ollama pull model-name`
- **Cloud:** Check provider's model list documentation

### High Costs
- Use mini/haiku models instead of full versions
- Set `enable_synthesis: false` and use `--synthesize` selectively
- Reduce `max_expansion_terms` to 4-6

### Poor Quality
- Try higher-tier models (gpt-4o, claude-sonnet)
- Adjust `synthesis_temperature` (0.1 = factual, 0.5 = creative)
- Enable `expand_queries` for better search coverage

### Slow Responses
- **Local:** Try smaller models (qwen3:0.6b)
- **Cloud:** Increase `timeout` or switch providers
- **General:** Reduce `max_size` in chunking config

## 📋 Environment Variables (Alternative Setup)

Instead of putting API keys in config files, use environment variables:

```bash
# In your shell profile (.bashrc, .zshrc, etc.)
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
export OPENROUTER_API_KEY="your-openrouter-key"
```

Then in config:
```yaml
llm:
  api_key: "${OPENAI_API_KEY}" # Reads from environment
```
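For readers who want to see what the `${OPENAI_API_KEY}` placeholder amounts to in practice, here is a minimal sketch of expanding such placeholders when loading a YAML config. It uses only the standard library plus PyYAML and an illustrative path; it is not necessarily how FSS-Mini-RAG's own config loader does it.

```python
# Minimal sketch: expand ${VAR} placeholders in a YAML config before use.
# Assumes PyYAML is installed; the file path is illustrative.
import os
import yaml

with open(".mini-rag/config.yaml") as f:
    raw = f.read()

# os.path.expandvars replaces "${OPENAI_API_KEY}" with the environment value
# (placeholders for unset variables are left unchanged).
config = yaml.safe_load(os.path.expandvars(raw))
api_key = config.get("llm", {}).get("api_key")
print("API key present:", bool(api_key))
```
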
## 🚀 Advanced: Multi-Provider Setup

You can create different configs for different use cases:

```bash
# Fast local analysis
cp examples/config-beginner.yaml .mini-rag/config-local.yaml

# High-quality cloud analysis
cp examples/config-llm-providers.yaml .mini-rag/config-cloud.yaml
# Edit to use OpenAI/Claude

# Switch configs as needed
ln -sf config-local.yaml .mini-rag/config.yaml # Use local
ln -sf config-cloud.yaml .mini-rag/config.yaml # Use cloud
```

## 📚 Further Reading

- [Ollama Model Library](https://ollama.ai/library)
- [OpenRouter Pricing](https://openrouter.ai/docs#models)
- [OpenAI API Documentation](https://platform.openai.com/docs)
- [Anthropic Claude Documentation](https://docs.anthropic.com/claude)
- [LM Studio Getting Started](https://lmstudio.ai/docs)

---

💡 **Pro Tip:** Start with local Ollama for learning, then upgrade to cloud providers when you need production-quality analysis or are working with large codebases.
@@ -47,6 +47,7 @@ search:
  expand_queries: false # Keep it simple for now

# 🤖 AI explanations (optional but helpful)
# 💡 WANT DIFFERENT LLM? See examples/config-llm-providers.yaml for OpenAI, Claude, etc.
llm:
  synthesis_model: auto # Pick best available model
  enable_synthesis: false # Turn on manually with --synthesize
233 examples/config-llm-providers.yaml Normal file
@@ -0,0 +1,233 @@
# 🌐 LLM PROVIDER ALTERNATIVES - OpenRouter, LM Studio, OpenAI & More
# Educational guide showing how to configure different LLM providers
# Copy sections you need to your main config.yaml

#═══════════════════════════════════════════════════════════════════════════════
# 🎯 QUICK PROVIDER SELECTION GUIDE:
#
# 🏠 LOCAL (Best Privacy, No Internet Needed):
#   - Ollama: Great quality, easy setup, free
#   - LM Studio: User-friendly GUI, works with many models
#
# ☁️ CLOUD (Powerful Models, Requires API Keys):
#   - OpenRouter: Access to many models with one API
#   - OpenAI: High quality, reliable, but more expensive
#   - Anthropic: Excellent for code analysis
#
# 💰 BUDGET FRIENDLY:
#   - OpenRouter (Qwen, Llama models): $0.10-0.50 per million tokens
#   - Local Ollama/LM Studio: Completely free
#
# 🚀 PERFORMANCE:
#   - Local: Limited by your hardware
#   - Cloud: Fast and powerful, costs per use
#═══════════════════════════════════════════════════════════════════════════════

# Standard FSS-Mini-RAG settings (copy these to any config)
chunking:
  max_size: 2000
  min_size: 150
  strategy: semantic

streaming:
  enabled: true
  threshold_bytes: 1048576

files:
  min_file_size: 50
  exclude_patterns:
    - "node_modules/**"
    - ".git/**"
    - "__pycache__/**"
    - "*.pyc"
    - ".venv/**"
    - "build/**"
    - "dist/**"
  include_patterns:
    - "**/*"

embedding:
  preferred_method: ollama # Use Ollama for embeddings (works with all providers below)
  ollama_model: nomic-embed-text
  ollama_host: localhost:11434
  batch_size: 32

search:
  default_limit: 10
  enable_bm25: true
  similarity_threshold: 0.1
  expand_queries: false

#═══════════════════════════════════════════════════════════════════════════════
# 🤖 LLM PROVIDER CONFIGURATIONS
#═══════════════════════════════════════════════════════════════════════════════

# 🏠 OPTION 1: OLLAMA (LOCAL) - Default and Recommended
# ✅ Pros: Free, private, no API keys, good quality
# ❌ Cons: Uses your computer's resources, limited by hardware
llm:
  provider: ollama # Use local Ollama
  ollama_host: localhost:11434 # Default Ollama location
  synthesis_model: llama3.2 # Good all-around model
  # alternatives: qwen3:0.6b (faster), llama3.2:3b (balanced), llama3.1:8b (quality)
  expansion_model: llama3.2
  enable_synthesis: false
  synthesis_temperature: 0.3
  cpu_optimized: true
  enable_thinking: true
  max_expansion_terms: 8

# 🖥️ OPTION 2: LM STUDIO (LOCAL) - User-Friendly Alternative
# ✅ Pros: Easy GUI, drag-drop model installation, compatible with Ollama
# ❌ Cons: Another app to manage, similar hardware limitations
#
# SETUP STEPS:
# 1. Download LM Studio from lmstudio.ai
# 2. Install a model (try "microsoft/DialoGPT-medium" or "TheBloke/Llama-2-7B-Chat-GGML")
# 3. Start local server in LM Studio (usually port 1234)
# 4. Use this config:
#
# llm:
#   provider: openai # LM Studio uses OpenAI-compatible API
#   api_base: http://localhost:1234/v1 # LM Studio default port
#   api_key: "not-needed" # LM Studio doesn't require real API key
#   synthesis_model: "any" # Use whatever model you loaded in LM Studio
#   expansion_model: "any"
#   enable_synthesis: false
#   synthesis_temperature: 0.3
#   cpu_optimized: true
#   enable_thinking: true
#   max_expansion_terms: 8

# ☁️ OPTION 3: OPENROUTER (CLOUD) - Many Models, One API
# ✅ Pros: Access to many models, good prices, no local setup
# ❌ Cons: Requires internet, costs money, less private
#
# SETUP STEPS:
# 1. Sign up at openrouter.ai
# 2. Get API key from dashboard
# 3. Add credits to account ($5-10 goes a long way)
# 4. Use this config:
#
# llm:
#   provider: openai # OpenRouter uses OpenAI-compatible API
#   api_base: https://openrouter.ai/api/v1
#   api_key: "your-openrouter-api-key-here" # Replace with your actual key
#   synthesis_model: "meta-llama/llama-3.1-8b-instruct:free" # Free tier model
#   # alternatives: "openai/gpt-4o-mini" ($0.15/M), "anthropic/claude-3-haiku" ($0.25/M)
#   expansion_model: "meta-llama/llama-3.1-8b-instruct:free"
#   enable_synthesis: false
#   synthesis_temperature: 0.3
#   cpu_optimized: false # Cloud models don't need CPU optimization
#   enable_thinking: true
#   max_expansion_terms: 8
#   timeout: 30 # Longer timeout for internet requests

# 🏢 OPTION 4: OPENAI (CLOUD) - Premium Quality
# ✅ Pros: Excellent quality, very reliable, fast
# ❌ Cons: More expensive, requires OpenAI account
#
# SETUP STEPS:
# 1. Sign up at platform.openai.com
# 2. Add payment method (pay-per-use)
# 3. Create API key in dashboard
# 4. Use this config:
#
# llm:
#   provider: openai
#   api_key: "your-openai-api-key-here" # Replace with your actual key
#   synthesis_model: "gpt-4o-mini" # Affordable option (~$0.15/M tokens)
#   # alternatives: "gpt-4o" (premium, ~$2.50/M), "gpt-3.5-turbo" (budget, ~$0.50/M)
#   expansion_model: "gpt-4o-mini"
#   enable_synthesis: false
#   synthesis_temperature: 0.3
#   cpu_optimized: false
#   enable_thinking: true
#   max_expansion_terms: 8
#   timeout: 30

# 🧠 OPTION 5: ANTHROPIC CLAUDE (CLOUD) - Excellent for Code
# ✅ Pros: Great at code analysis, very thoughtful responses
# ❌ Cons: Premium pricing, separate API account needed
#
# SETUP STEPS:
# 1. Sign up at console.anthropic.com
# 2. Get API key and add credits
# 3. Use this config:
#
# llm:
#   provider: anthropic
#   api_key: "your-anthropic-api-key-here" # Replace with your actual key
#   synthesis_model: "claude-3-haiku-20240307" # Most affordable option
#   # alternatives: "claude-3-sonnet-20240229" (balanced), "claude-3-opus-20240229" (premium)
#   expansion_model: "claude-3-haiku-20240307"
#   enable_synthesis: false
#   synthesis_temperature: 0.3
#   cpu_optimized: false
#   enable_thinking: true
#   max_expansion_terms: 8
#   timeout: 30

#═══════════════════════════════════════════════════════════════════════════════
# 🧪 TESTING YOUR CONFIGURATION
#═══════════════════════════════════════════════════════════════════════════════
#
# After setting up any provider, test with these commands:
#
# 1. Test basic search (no LLM needed):
#    ./rag-mini search /path/to/project "test query"
#
# 2. Test LLM synthesis:
#    ./rag-mini search /path/to/project "test query" --synthesize
#
# 3. Test query expansion:
#    Enable expand_queries: true in search section and try:
#    ./rag-mini search /path/to/project "auth"
#
# 4. Test thinking mode:
#    ./rag-mini explore /path/to/project
#    Then ask: "explain the authentication system"
#
#═══════════════════════════════════════════════════════════════════════════════
# 💡 TROUBLESHOOTING
#═══════════════════════════════════════════════════════════════════════════════
#
# ❌ "Connection refused" or "API error":
#    - Local: Make sure Ollama/LM Studio is running
#    - Cloud: Check API key and internet connection
#
# ❌ "Model not found":
#    - Local: Install model with `ollama pull model-name`
#    - Cloud: Check model name matches provider's API docs
#
# ❌ "Token limit exceeded" or expensive bills:
#    - Use cheaper models like gpt-4o-mini or claude-haiku
#    - Enable shorter contexts with max_size: 1500
#
# ❌ Slow responses:
#    - Local: Try smaller models (qwen3:0.6b)
#    - Cloud: Increase timeout or try different provider
#
# ❌ Poor quality results:
#    - Try higher-quality models
#    - Adjust synthesis_temperature (0.1 for factual, 0.5 for creative)
#    - Enable expand_queries for better search coverage
#
#═══════════════════════════════════════════════════════════════════════════════
# 📚 LEARN MORE
#═══════════════════════════════════════════════════════════════════════════════
#
# Provider Documentation:
# - Ollama: https://ollama.ai/library (model catalog)
# - LM Studio: https://lmstudio.ai/docs (getting started)
# - OpenRouter: https://openrouter.ai/docs (API reference)
# - OpenAI: https://platform.openai.com/docs (API docs)
# - Anthropic: https://docs.anthropic.com/claude/reference (Claude API)
#
# Model Recommendations:
# - Code Analysis: claude-3-sonnet, gpt-4o, llama3.1:8b
# - Fast Responses: gpt-4o-mini, claude-haiku, qwen3:0.6b
# - Budget Friendly: OpenRouter free tier, local Ollama
# - Best Privacy: Local Ollama or LM Studio only
#
#═══════════════════════════════════════════════════════════════════════════════
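Because only one `llm:` block in this example file is meant to be active at a time, a quick sanity check after copying a section into your own config is to load it and print the provider-related keys. The sketch below assumes PyYAML and an illustrative config path; it simply reads the YAML shown above.

```python
# Minimal sketch: report which LLM provider a copied config ends up selecting.
# Assumes PyYAML; the config path is illustrative.
from pathlib import Path
import yaml

config = yaml.safe_load(Path(".mini-rag/config.yaml").read_text())
llm = config.get("llm", {})

print("provider:       ", llm.get("provider", "ollama"))
print("synthesis model:", llm.get("synthesis_model", "auto"))
print("api_base:       ", llm.get("api_base", "(provider default)"))
print("synthesis on:   ", llm.get("enable_synthesis", False))
```
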
@@ -162,22 +162,72 @@ check_ollama() {
        print_warning "Ollama not found"
        echo ""
        echo -e "${CYAN}Ollama provides the best embedding quality and performance.${NC}"
        echo -e "${YELLOW}To install Ollama:${NC}"
        echo " 1. Visit: https://ollama.ai/download"
        echo ""
        echo -e "${BOLD}Options:${NC}"
        echo -e "${GREEN}1) Install Ollama automatically${NC} (recommended)"
        echo -e "${YELLOW}2) Manual installation${NC} - Visit https://ollama.com/download"
        echo -e "${BLUE}3) Continue without Ollama${NC} (uses ML fallback)"
        echo ""
        echo -n "Choose [1/2/3]: "
        read -r ollama_choice

        case "$ollama_choice" in
            1|"")
                print_info "Installing Ollama using official installer..."
                echo -e "${CYAN}Running: curl -fsSL https://ollama.com/install.sh | sh${NC}"

                if curl -fsSL https://ollama.com/install.sh | sh; then
                    print_success "Ollama installed successfully"

                    print_info "Starting Ollama server..."
                    ollama serve &
                    sleep 3

                    if curl -s http://localhost:11434/api/version >/dev/null 2>&1; then
                        print_success "Ollama server started"

                        echo ""
                        echo -e "${CYAN}💡 Pro tip: Download an LLM for AI-powered search synthesis!${NC}"
                        echo -e " Lightweight: ${GREEN}ollama pull qwen3:0.6b${NC} (~400MB, very fast)"
                        echo -e " Balanced: ${GREEN}ollama pull qwen3:1.7b${NC} (~1GB, good quality)"
                        echo -e " Excellent: ${GREEN}ollama pull qwen3:3b${NC} (~2GB, great for this project)"
                        echo -e " Premium: ${GREEN}ollama pull qwen3:8b${NC} (~5GB, amazing results)"
                        echo ""
                        echo -e "${BLUE}Creative possibilities: Try mistral for storytelling, or qwen3-coder for development!${NC}"
                        echo ""

                        return 0
                    else
                        print_warning "Ollama installed but failed to start automatically"
                        echo "Please start Ollama manually: ollama serve"
                        echo "Then re-run this installer"
                        exit 1
                    fi
                else
                    print_error "Failed to install Ollama automatically"
                    echo "Please install manually from https://ollama.com/download"
                    exit 1
                fi
                ;;
            2)
                echo ""
                echo -e "${YELLOW}Manual Ollama installation:${NC}"
                echo " 1. Visit: https://ollama.com/download"
                echo " 2. Download and install for your system"
                echo " 3. Run: ollama serve"
                echo " 4. Re-run this installer"
                echo ""
                echo -e "${BLUE}Alternative: Use ML fallback (requires more disk space)${NC}"
                echo ""
                echo -n "Continue without Ollama? (y/N): "
                read -r continue_without
                if [[ $continue_without =~ ^[Yy]$ ]]; then
                    return 1
                else
                    print_info "Install Ollama first, then re-run this script"
                    print_info "Exiting for manual installation..."
                    exit 0
                fi
                ;;
            3)
                print_info "Continuing without Ollama (will use ML fallback)"
                return 1
                ;;
            *)
                print_warning "Invalid choice, continuing without Ollama"
                return 1
                ;;
        esac
    fi
}
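The installer probes Ollama with `curl -s http://localhost:11434/api/version` above. When scripting around the same setup from Python, the equivalent health check is a one-liner against that endpoint; a minimal sketch (using only the endpoint shown above, nothing project-specific):

```python
# Minimal sketch: the same Ollama health check the installer does with curl.
import requests

try:
    requests.get("http://localhost:11434/api/version", timeout=5).raise_for_status()
    print("Ollama server is up")
except requests.exceptions.RequestException:
    print("Ollama not reachable; start it with: ollama serve")
```
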
@@ -271,8 +321,8 @@ get_installation_preferences() {

    echo ""
    echo -e "${BOLD}Installation options:${NC}"
    echo -e "${GREEN}L) Light${NC} - Ollama + basic deps (~50MB)"
    echo -e "${YELLOW}F) Full${NC} - Light + ML fallback (~2-3GB)"
    echo -e "${GREEN}L) Light${NC} - Ollama + basic deps (~50MB) ${CYAN}← Best performance + AI chat${NC}"
    echo -e "${YELLOW}F) Full${NC} - Light + ML fallback (~2-3GB) ${CYAN}← RAG-only if no Ollama${NC}"
    echo -e "${BLUE}C) Custom${NC} - Configure individual components"
    echo ""
@@ -549,35 +599,125 @@ show_completion() {
    read -r run_test
    if [[ ! $run_test =~ ^[Nn]$ ]]; then
        run_quick_test
        echo ""
        show_beginner_guidance
    else
        show_beginner_guidance
    fi
}

# Run quick test
# Create sample project for testing
create_sample_project() {
    local sample_dir="$SCRIPT_DIR/.sample_test"
    rm -rf "$sample_dir"
    mkdir -p "$sample_dir"

    # Create a few small sample files
    cat > "$sample_dir/README.md" << 'EOF'
# Sample Project

This is a sample project for testing FSS-Mini-RAG search capabilities.

## Features

- User authentication system
- Document processing
- Search functionality
- Email integration
EOF

    cat > "$sample_dir/auth.py" << 'EOF'
# Authentication module
def login_user(username, password):
    """Handle user login with password validation"""
    if validate_credentials(username, password):
        create_session(username)
        return True
    return False

def validate_credentials(username, password):
    """Check username and password against database"""
    # Database validation logic here
    return check_password_hash(username, password)
EOF

    cat > "$sample_dir/search.py" << 'EOF'
# Search functionality
def semantic_search(query, documents):
    """Perform semantic search across document collection"""
    embeddings = generate_embeddings(query)
    results = find_similar_documents(embeddings, documents)
    return rank_results(results)

def generate_embeddings(text):
    """Generate vector embeddings for text"""
    # Embedding generation logic
    return process_with_model(text)
EOF

    echo "$sample_dir"
}

# Run quick test with sample data
run_quick_test() {
    print_header "Quick Test"

    print_info "Testing on this project directory..."
    echo "This will index the FSS-Mini-RAG system itself as a test."
    print_info "Creating small sample project for testing..."
    local sample_dir=$(create_sample_project)
    echo "Sample project created with 3 files for fast testing."
    echo ""

    # Index this project
    if ./rag-mini index "$SCRIPT_DIR"; then
        print_success "Indexing completed"
    # Index the sample project (much faster)
    print_info "Indexing sample project (this should be fast)..."
    if ./rag-mini index "$sample_dir" --quiet; then
        print_success "Sample project indexed successfully"

        # Try a search
        echo ""
        print_info "Testing search functionality..."
        ./rag-mini search "$SCRIPT_DIR" "embedding system" --limit 3
        print_info "Testing search with sample queries..."
        echo -e "${BLUE}Running search: 'user authentication'${NC}"
        ./rag-mini search "$sample_dir" "user authentication" --limit 2

        echo ""
        print_success "Test completed successfully!"
        echo -e "${CYAN}You can now use FSS-Mini-RAG on your own projects.${NC}"
        echo -e "${CYAN}Ready to use FSS-Mini-RAG on your own projects!${NC}"

        # Offer beginner guidance
        echo ""
        echo -e "${YELLOW}💡 Beginner Tip:${NC} Try the interactive mode with pre-made questions"
        echo " Run: ./rag-tui for guided experience"

        # Clean up sample
        rm -rf "$sample_dir"
    else
        print_error "Test failed"
        echo "Check the error messages above for troubleshooting."
        print_error "Sample test failed"
        echo "This might indicate an issue with the installation."
        rm -rf "$sample_dir"
    fi
}

# Show beginner-friendly first steps
show_beginner_guidance() {
    print_header "Getting Started - Your First Search"

    echo -e "${CYAN}FSS-Mini-RAG is ready! Here's how to start:${NC}"
    echo ""
    echo -e "${GREEN}🎯 For Beginners (Recommended):${NC}"
    echo " ./rag-tui"
    echo " ↳ Interactive interface with sample questions"
    echo ""
    echo -e "${BLUE}💻 For Developers:${NC}"
    echo " ./rag-mini index /path/to/your/project"
    echo " ./rag-mini search /path/to/your/project \"your question\""
    echo ""
    echo -e "${YELLOW}📚 What can you search for in FSS-Mini-RAG?${NC}"
    echo " • Technical: \"chunking strategy\", \"ollama integration\", \"indexing performance\""
    echo " • Usage: \"how to improve search results\", \"why does indexing take long\""
    echo " • Your own projects: any code, docs, emails, notes, research"
    echo ""
    echo -e "${CYAN}💡 Pro tip:${NC} You can drag ANY text-based documents into a folder"
    echo " and search through them - emails, notes, research, chat logs!"
}

# Main installation flow
main() {
    echo -e "${CYAN}${BOLD}"
@@ -72,13 +72,21 @@ class SearchConfig:
@dataclass
class LLMConfig:
    """Configuration for LLM synthesis and query expansion."""
    ollama_host: str = "localhost:11434"
    # Core settings
    synthesis_model: str = "auto" # "auto", "qwen3:1.7b", "qwen2.5:1.5b", etc.
    expansion_model: str = "auto" # Usually same as synthesis_model
    max_expansion_terms: int = 8 # Maximum additional terms to add
    enable_synthesis: bool = False # Enable by default when --synthesize used
    synthesis_temperature: float = 0.3
    enable_thinking: bool = True # Enable thinking mode for Qwen3 models (production: True, testing: toggle)
    enable_thinking: bool = True # Enable thinking mode for Qwen3 models
    cpu_optimized: bool = True # Prefer lightweight models

    # Provider-specific settings (for different LLM providers)
    provider: str = "ollama" # "ollama", "openai", "anthropic"
    ollama_host: str = "localhost:11434" # Ollama connection
    api_key: Optional[str] = None # API key for cloud providers
    api_base: Optional[str] = None # Base URL for API (e.g., OpenRouter)
    timeout: int = 20 # Request timeout in seconds


@dataclass
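The provider-specific fields added to `LLMConfig` above mirror the `llm:` keys used in the YAML examples earlier in this diff. As a rough illustration of how a parsed `llm:` section could be turned into the dataclass, here is a minimal sketch; it assumes the `LLMConfig` class shown above is importable and that ignoring unknown keys is acceptable, which is an assumption, not a description of the project's actual loader.

```python
# Minimal sketch: populate LLMConfig from a parsed YAML "llm:" section.
# Assumes the LLMConfig dataclass above is in scope; dropping unknown keys
# is an illustrative choice, not necessarily the project's behavior.
from dataclasses import fields

def llm_config_from_yaml(section: dict) -> "LLMConfig":
    known = {f.name for f in fields(LLMConfig)}
    return LLMConfig(**{k: v for k, v in section.items() if k in known})

# Example: an OpenRouter-style section (hypothetical values)
cfg = llm_config_from_yaml({
    "provider": "openai",
    "api_base": "https://openrouter.ai/api/v1",
    "api_key": "${OPENROUTER_API_KEY}",
    "timeout": 30,
})
```
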
@@ -81,16 +81,36 @@ class OllamaEmbedder:

    def _verify_ollama_connection(self):
        """Verify Ollama server is running and model is available."""
        try:
            # Check server status
            response = requests.get(f"{self.base_url}/api/tags", timeout=5)
            response.raise_for_status()
        except requests.exceptions.ConnectionError:
            print("🔌 Ollama Service Unavailable")
            print(" Ollama provides AI embeddings that make semantic search possible")
            print(" Start Ollama: ollama serve")
            print(" Install models: ollama pull nomic-embed-text")
            print()
            raise ConnectionError("Ollama service not running. Start with: ollama serve")
        except requests.exceptions.Timeout:
            print("⏱️ Ollama Service Timeout")
            print(" Ollama is taking too long to respond")
            print(" Check if Ollama is overloaded: ollama ps")
            print(" Restart if needed: killall ollama && ollama serve")
            print()
            raise ConnectionError("Ollama service timeout")

        # Check if our model is available
        models = response.json().get('models', [])
        model_names = [model['name'] for model in models]

        if self.model_name not in model_names:
            logger.warning(f"Model {self.model_name} not found. Available: {model_names}")
            print(f"📦 Model '{self.model_name}' Not Found")
            print(" Embedding models convert text into searchable vectors")
            print(f" Download model: ollama pull {self.model_name}")
            if model_names:
                print(f" Available models: {', '.join(model_names[:3])}")
            print()
            # Try to pull the model
            self._pull_model()
@@ -117,11 +117,21 @@ class CodeSearcher:
        """Connect to the LanceDB database."""
        try:
            if not self.rag_dir.exists():
                print("🗃️ No Search Index Found")
                print(" An index is a database that makes your files searchable")
                print(f" Create index: ./rag-mini index {self.project_path}")
                print(" (This analyzes your files and creates semantic search vectors)")
                print()
                raise FileNotFoundError(f"No RAG index found at {self.rag_dir}")

            self.db = lancedb.connect(self.rag_dir)

            if "code_vectors" not in self.db.table_names():
                print("🔧 Index Database Corrupted")
                print(" The search index exists but is missing data tables")
                print(f" Rebuild index: rm -rf {self.rag_dir} && ./rag-mini index {self.project_path}")
                print(" (This will recreate the search database)")
                print()
                raise ValueError("No code_vectors table found. Run indexing first.")

            self.table = self.db.open_table("code_vectors")
46 rag-mini.py
@@ -15,11 +15,29 @@ import logging
# Add the RAG system to the path
sys.path.insert(0, str(Path(__file__).parent))

from mini_rag.indexer import ProjectIndexer
from mini_rag.search import CodeSearcher
from mini_rag.ollama_embeddings import OllamaEmbedder
from mini_rag.llm_synthesizer import LLMSynthesizer
from mini_rag.explorer import CodeExplorer
try:
    from mini_rag.indexer import ProjectIndexer
    from mini_rag.search import CodeSearcher
    from mini_rag.ollama_embeddings import OllamaEmbedder
    from mini_rag.llm_synthesizer import LLMSynthesizer
    from mini_rag.explorer import CodeExplorer
except ImportError as e:
    print("❌ Error: Missing dependencies!")
    print()
    print("It looks like you haven't installed the required packages yet.")
    print("This is a common mistake - here's how to fix it:")
    print()
    print("1. Make sure you're in the FSS-Mini-RAG directory")
    print("2. Run the installer script:")
    print(" ./install_mini_rag.sh")
    print()
    print("Or if you want to install manually:")
    print(" python3 -m venv .venv")
    print(" source .venv/bin/activate")
    print(" pip install -r requirements.txt")
    print()
    print(f"Missing module: {e.name}")
    sys.exit(1)

# Configure logging for user-friendly output
logging.basicConfig(
@@ -68,7 +86,25 @@ def index_project(project_path: Path, force: bool = False):
        if not (project_path / '.mini-rag' / 'last_search').exists():
            print(f"\n💡 Try: rag-mini search {project_path} \"your search here\"")

    except FileNotFoundError:
        print(f"📁 Directory Not Found: {project_path}")
        print(" Make sure the path exists and you're in the right location")
        print(f" Current directory: {Path.cwd()}")
        print(" Check path: ls -la /path/to/your/project")
        print()
        sys.exit(1)
    except PermissionError:
        print("🔒 Permission Denied")
        print(" FSS-Mini-RAG needs to read files and create index database")
        print(f" Check permissions: ls -la {project_path}")
        print(" Try a different location with write access")
        print()
        sys.exit(1)
    except Exception as e:
        # Connection errors are handled in the embedding module
        if "ollama" in str(e).lower() or "connection" in str(e).lower():
            sys.exit(1)  # Error already displayed

        print(f"❌ Indexing failed: {e}")
        print()
        print("🔧 Common solutions:")
146 rag-tui.py
@@ -15,6 +15,7 @@ class SimpleTUI:
    def __init__(self):
        self.project_path: Optional[Path] = None
        self.current_config: Dict[str, Any] = {}
        self.search_count = 0  # Track searches for sample reminder

    def clear_screen(self):
        """Clear the terminal screen."""
@@ -278,8 +279,37 @@
        print(f"Project: {self.project_path.name}")
        print()

        # Get search query
        query = self.get_input("Enter search query", "").strip()
        # Show sample questions for beginners - relevant to FSS-Mini-RAG
        print("💡 Not sure what to search for? Try these questions about FSS-Mini-RAG:")
        print()
        sample_questions = [
            "chunking strategy",
            "ollama integration",
            "indexing performance",
            "why does indexing take long",
            "how to improve search results",
            "embedding generation"
        ]

        for i, question in enumerate(sample_questions[:3], 1):
            print(f" {i}. {question}")
        print(" 4. Enter your own question")
        print()

        # Let user choose a sample or enter their own
        choice_str = self.get_input("Choose a number (1-4) or press Enter for custom", "4")

        try:
            choice = int(choice_str)
            if 1 <= choice <= 3:
                query = sample_questions[choice - 1]
                print(f"Selected: '{query}'")
                print()
            else:
                query = self.get_input("Enter your search query", "").strip()
        except ValueError:
            query = self.get_input("Enter your search query", "").strip()

        if not query:
            return
@@ -354,6 +384,70 @@
        if len(results) > 1:
            print("💡 To see more context or specific results:")
            print(f" Run: ./rag-mini search {self.project_path} \"{query}\" --verbose")

            # Suggest follow-up questions based on the search
            print()
            print("🔍 Suggested follow-up searches:")
            follow_up_questions = self.generate_follow_up_questions(query, results)
            for i, question in enumerate(follow_up_questions, 1):
                print(f" {i}. {question}")

            # Ask if they want to run a follow-up search
            print()
            choice = input("Run a follow-up search? Enter number (1-3) or press Enter to continue: ").strip()
            if choice.isdigit() and 1 <= int(choice) <= len(follow_up_questions):
                # Recursive search with the follow-up question
                follow_up_query = follow_up_questions[int(choice) - 1]
                print(f"\nSearching for: '{follow_up_query}'")
                print("=" * 50)
                # Run another search
                follow_results = searcher.search(follow_up_query, top_k=5)

                if follow_results:
                    print(f"✅ Found {len(follow_results)} follow-up results:")
                    print()
                    for i, result in enumerate(follow_results[:3], 1):  # Show top 3
                        try:
                            rel_path = result.file_path.relative_to(self.project_path)
                        except:
                            rel_path = result.file_path
                        print(f"{i}. {rel_path} (Score: {result.score:.3f})")
                        print(f" {result.content.strip()[:100]}...")
                        print()
                else:
                    print("❌ No follow-up results found")

        # Track searches and show sample reminder
        self.search_count += 1

        # Show sample reminder after 2 searches
        if self.search_count >= 2 and self.project_path.name == '.sample_test':
            print()
            print("⚠️ Sample Limitation Notice")
            print("=" * 30)
            print("You've been searching a small sample project.")
            print("For full exploration of your codebase, you need to index the complete project.")
            print()

            # Show timing estimate if available
            try:
                with open('/tmp/fss-rag-sample-time.txt', 'r') as f:
                    sample_time = int(f.read().strip())
                # Rough estimate: multiply by file count ratio
                estimated_time = sample_time * 20  # Rough multiplier
                print(f"🕒 Estimated full indexing time: ~{estimated_time} seconds")
            except:
                print("🕒 Estimated full indexing time: 1-3 minutes for typical projects")

            print()
            choice = input("Index the full project now? [y/N]: ").strip().lower()
            if choice == 'y':
                # Switch to full project and index
                parent_dir = self.project_path.parent
                self.project_path = parent_dir
                print(f"\nSwitching to full project: {parent_dir}")
                print("Starting full indexing...")
                # Note: This would trigger full indexing in real implementation
            print(f" Or: ./rag-mini-enhanced context {self.project_path} \"{query}\"")
            print()
@@ -364,6 +458,48 @@
        print()
        input("Press Enter to continue...")

    def generate_follow_up_questions(self, original_query: str, results) -> List[str]:
        """Generate contextual follow-up questions based on search results."""
        # Simple pattern-based follow-up generation
        follow_ups = []

        # Based on original query patterns
        query_lower = original_query.lower()

        # FSS-Mini-RAG specific follow-ups
        if "chunk" in query_lower:
            follow_ups.extend(["chunk size optimization", "smart chunking boundaries", "chunk overlap strategies"])
        elif "ollama" in query_lower:
            follow_ups.extend(["embedding model comparison", "ollama server setup", "nomic-embed-text performance"])
        elif "index" in query_lower or "performance" in query_lower:
            follow_ups.extend(["indexing speed optimization", "memory usage during indexing", "file processing pipeline"])
        elif "search" in query_lower or "result" in query_lower:
            follow_ups.extend(["search result ranking", "semantic vs keyword search", "query expansion techniques"])
        elif "embed" in query_lower:
            follow_ups.extend(["vector embedding storage", "embedding model fallbacks", "similarity scoring"])
        else:
            # Generic RAG-related follow-ups
            follow_ups.extend(["vector database internals", "search quality tuning", "embedding optimization"])

        # Based on file types found in results (FSS-Mini-RAG specific)
        if results:
            file_extensions = set()
            for result in results[:3]:  # Check first 3 results
                ext = result.file_path.suffix.lower()
                file_extensions.add(ext)

            if '.py' in file_extensions:
                follow_ups.append("Python module dependencies")
            if '.md' in file_extensions:
                follow_ups.append("documentation implementation")
            if 'chunker' in str(results[0].file_path).lower():
                follow_ups.append("chunking algorithm details")
            if 'search' in str(results[0].file_path).lower():
                follow_ups.append("search algorithm implementation")

        # Return top 3 unique follow-ups
        return list(dict.fromkeys(follow_ups))[:3]

    def explore_interactive(self):
        """Interactive exploration interface with thinking mode."""
        if not self.project_path:
@@ -682,6 +818,12 @@
            status = "✅ Indexed" if rag_dir.exists() else "❌ Not indexed"
            print(f"📁 Current project: {self.project_path.name} ({status})")
            print()
        else:
            # Show beginner tips when no project selected
            print("🎯 Welcome to FSS-Mini-RAG!")
            print(" Search through code, documents, emails, notes - anything text-based!")
            print(" Start by selecting a project directory below.")
            print()

        options = [
            "Select project directory",
124 scripts/test-configs.py Executable file
@@ -0,0 +1,124 @@
#!/usr/bin/env python3
"""
Test script to validate all config examples are syntactically correct
and contain required fields for FSS-Mini-RAG.
"""

import yaml
import sys
from pathlib import Path
from typing import Dict, Any, List

def validate_config_structure(config: Dict[str, Any], config_name: str) -> List[str]:
    """Validate that config has required structure."""
    errors = []

    # Required sections
    required_sections = ['chunking', 'streaming', 'files', 'embedding', 'search']
    for section in required_sections:
        if section not in config:
            errors.append(f"{config_name}: Missing required section '{section}'")

    # Validate chunking section
    if 'chunking' in config:
        chunking = config['chunking']
        required_chunking = ['max_size', 'min_size', 'strategy']
        for field in required_chunking:
            if field not in chunking:
                errors.append(f"{config_name}: Missing chunking.{field}")

        # Validate types and ranges
        if 'max_size' in chunking and not isinstance(chunking['max_size'], int):
            errors.append(f"{config_name}: chunking.max_size must be integer")
        if 'min_size' in chunking and not isinstance(chunking['min_size'], int):
            errors.append(f"{config_name}: chunking.min_size must be integer")
        if 'strategy' in chunking and chunking['strategy'] not in ['semantic', 'fixed']:
            errors.append(f"{config_name}: chunking.strategy must be 'semantic' or 'fixed'")

    # Validate embedding section
    if 'embedding' in config:
        embedding = config['embedding']
        if 'preferred_method' in embedding:
            valid_methods = ['ollama', 'ml', 'hash', 'auto']
            if embedding['preferred_method'] not in valid_methods:
                errors.append(f"{config_name}: embedding.preferred_method must be one of {valid_methods}")

    # Validate LLM section (if present)
    if 'llm' in config:
        llm = config['llm']
        if 'synthesis_temperature' in llm:
            temp = llm['synthesis_temperature']
            if not isinstance(temp, (int, float)) or temp < 0 or temp > 1:
                errors.append(f"{config_name}: llm.synthesis_temperature must be number between 0-1")

    return errors

def test_config_file(config_path: Path) -> bool:
    """Test a single config file."""
    print(f"Testing {config_path.name}...")

    try:
        # Test YAML parsing
        with open(config_path, 'r') as f:
            config = yaml.safe_load(f)

        if not config:
            print(f" ❌ {config_path.name}: Empty or invalid YAML")
            return False

        # Test structure
        errors = validate_config_structure(config, config_path.name)

        if errors:
            print(f" ❌ {config_path.name}: Structure errors:")
            for error in errors:
                print(f" • {error}")
            return False

        print(f" ✅ {config_path.name}: Valid")
        return True

    except yaml.YAMLError as e:
        print(f" ❌ {config_path.name}: YAML parsing error: {e}")
        return False
    except Exception as e:
        print(f" ❌ {config_path.name}: Unexpected error: {e}")
        return False

def main():
    """Test all config examples."""
    script_dir = Path(__file__).parent
    project_root = script_dir.parent
    examples_dir = project_root / 'examples'

    if not examples_dir.exists():
        print(f"❌ Examples directory not found: {examples_dir}")
        sys.exit(1)

    # Find all config files
    config_files = list(examples_dir.glob('config*.yaml'))

    if not config_files:
        print(f"❌ No config files found in {examples_dir}")
        sys.exit(1)

    print(f"🧪 Testing {len(config_files)} config files...\n")

    all_passed = True
    for config_file in sorted(config_files):
        passed = test_config_file(config_file)
        if not passed:
            all_passed = False

    print(f"\n{'='*50}")
    if all_passed:
        print("✅ All config files are valid!")
        print("\n💡 To use any config:")
        print(" cp examples/config-NAME.yaml /path/to/project/.mini-rag/config.yaml")
        sys.exit(0)
    else:
        print("❌ Some config files have issues - please fix before release")
        sys.exit(1)

if __name__ == '__main__':
    main()
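Since `validate_config_structure()` above works on plain dictionaries, it can also be exercised without touching the examples directory. A small sketch, assuming the functions above are importable (or pasted into the same session) and using a deliberately broken, hypothetical config:

```python
# Minimal sketch: exercise validate_config_structure() on an in-memory config.
# The dict below is deliberately malformed to show how errors are reported.
broken = {
    "chunking": {"max_size": "big", "strategy": "random"},  # wrong type, bad strategy, min_size missing
    "embedding": {"preferred_method": "gpu"},               # not a valid method
    "llm": {"synthesis_temperature": 1.5},                  # out of the 0-1 range
}

for err in validate_config_structure(broken, "in-memory-config"):
    print("•", err)
# Also reports the missing required sections: streaming, files, search.
```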