Fss-Rag-Mini/examples/config.yaml
BobAi 4925f6d4e4 Add comprehensive testing suite and documentation for new features
📚 DOCUMENTATION
- docs/QUERY_EXPANSION.md: Complete beginner guide with examples and troubleshooting
- Updated config.yaml with proper LLM settings and comments
- Clear explanations of when features are enabled/disabled

🧪 NEW TESTING INFRASTRUCTURE
- test_ollama_integration.py: 6 comprehensive tests with helpful error messages
- test_smart_ranking.py: 6 tests verifying ranking quality improvements
- troubleshoot.py: Interactive tool for diagnosing setup issues
- Enhanced system validation with new features coverage

⚙️ SMART DEFAULTS
- Query expansion disabled by default (CLI speed)
- TUI enables expansion automatically (exploration mode)
- Clear user feedback about which features are active
- Graceful degradation when Ollama unavailable

🎯 BEGINNER-FRIENDLY APPROACH
- Tests explain what they're checking and why
- Clear solutions provided for common problems
- Educational output showing system status
- Offline testing with gentle mocking

Run 'python3 tests/troubleshoot.py' to verify your setup\!
2025-08-12 17:36:32 +10:00

53 lines
1.8 KiB
YAML

# FSS-Mini-RAG Configuration
# Edit this file to customize indexing and search behavior
# See docs/GETTING_STARTED.md for detailed explanations
# Text chunking settings
chunking:
max_size: 2000 # Maximum characters per chunk
min_size: 150 # Minimum characters per chunk
strategy: semantic # 'semantic' (language-aware) or 'fixed'
# Large file streaming settings
streaming:
enabled: true
threshold_bytes: 1048576 # Files larger than this use streaming (1MB)
# File processing settings
files:
min_file_size: 50 # Skip files smaller than this
exclude_patterns:
- "node_modules/**"
- ".git/**"
- "__pycache__/**"
- "*.pyc"
- ".venv/**"
- "venv/**"
- "build/**"
- "dist/**"
include_patterns:
- "**/*" # Include all files by default
# Embedding generation settings
embedding:
preferred_method: ollama # 'ollama', 'ml', 'hash', or 'auto'
ollama_model: nomic-embed-text
ollama_host: localhost:11434
ml_model: sentence-transformers/all-MiniLM-L6-v2
batch_size: 32 # Embeddings processed per batch
# Search behavior settings
search:
default_limit: 10 # Default number of results
enable_bm25: true # Enable keyword matching boost
similarity_threshold: 0.1 # Minimum similarity score
expand_queries: false # Enable automatic query expansion (TUI auto-enables)
# LLM synthesis and query expansion settings
llm:
ollama_host: localhost:11434
synthesis_model: auto # 'auto', 'qwen3:1.7b', etc.
expansion_model: auto # Usually same as synthesis_model
max_expansion_terms: 8 # Maximum terms to add to queries
enable_synthesis: false # Enable synthesis by default
synthesis_temperature: 0.3 # LLM temperature for analysis