Fss-Rag-Mini/docs/QUERY_EXPANSION.md
BobAi a84ff94fba Improve UX with streaming tokens, fix model references, and add icon integration
This comprehensive update enhances user experience with several key improvements:

## Enhanced Streaming & Thinking Display
- Implement real-time streaming with gray thinking tokens that collapse after completion
- Fix thinking token redisplay bug with proper content filtering
- Add clear "AI Response:" headers to separate thinking from responses
- Enable streaming by default for better user engagement
- Keep thinking visible for exploration, collapse only for suggested questions

## Natural Conversation Responses
- Convert clunky JSON exploration responses to natural, conversational format
- Improve exploration prompts for friendly, colleague-style interactions
- Update summary generation with better context handling
- Eliminate double response display issues

## Model Reference Updates
- Remove all llama3.2 references in favor of qwen3 models
- Fix non-existent qwen3:3b references, replace with proper model names
- Update model rankings to prioritize working qwen models across all components
- Ensure consistent model recommendations in docs and examples

## Cross-Platform Icon Integration
- Add desktop icon setup to Linux installer with .desktop entry
- Add Windows shortcuts for desktop and Start Menu integration
- Improve installer user experience with visual branding

## Configuration & Navigation Fixes
- Fix "0" option in configuration menu to properly go back
- Improve configuration menu user-friendliness
- Update troubleshooting guides with correct model suggestions

These changes significantly improve the beginner experience while maintaining
technical accuracy and system reliability.
2025-08-15 12:20:06 +10:00

114 lines
3.1 KiB
Markdown

# Query Expansion Guide
## What Is Query Expansion?
Query expansion automatically adds related terms to your search to find more relevant results.
**Example:**
- You search: `"authentication"`
- System expands to: `"authentication login user verification credentials security"`
- Result: 2-3x more relevant matches!
## How It Works
```mermaid
graph LR
A[User Query] --> B[LLM Expands]
B --> C[Enhanced Search]
C --> D[Better Results]
style A fill:#e1f5fe
style D fill:#e8f5e8
```
1. **Your query** goes to a small, fast LLM (like qwen3:1.7b)
2. **LLM adds related terms** that people might use when writing about the topic
3. **Both semantic and keyword search** use the expanded query
4. **You get much better results** without changing anything
## When Is It Enabled?
-**CLI commands**: Disabled by default (for speed)
-**TUI interface**: Auto-enabled (when you have time to explore)
- ⚙️ **Configurable**: Can be enabled/disabled in config.yaml
## Configuration
### Easy Configuration (TUI)
Use the interactive Configuration Manager in the TUI:
1. **Start TUI**: `./rag-tui` or `rag.bat` (Windows)
2. **Select Option 6**: Configuration Manager
3. **Choose Option 2**: Toggle query expansion
4. **Follow prompts**: Get explanation and easy on/off toggle
The TUI will:
- Explain benefits and requirements clearly
- Check if Ollama is available
- Show current status (enabled/disabled)
- Save changes automatically
### Manual Configuration (Advanced)
Edit `config.yaml` directly:
```yaml
# Search behavior settings
search:
expand_queries: false # Enable automatic query expansion
# LLM expansion settings
llm:
max_expansion_terms: 8 # How many terms to add
expansion_model: auto # Which model to use
ollama_host: localhost:11434 # Ollama server
```
## Performance
- **Speed**: ~100ms on most systems (depends on your hardware)
- **Caching**: Repeated queries are instant
- **Model Selection**: Automatically uses fastest available model
## Examples
**Code Search:**
```
"error handling" → "error handling exception try catch fault tolerance recovery"
```
**Documentation Search:**
```
"installation" → "installation setup install deploy configuration getting started"
```
**Any Content:**
```
"budget planning" → "budget planning financial forecast cost analysis spending plan"
```
## Troubleshooting
**Query expansion not working?**
1. Check if Ollama is running: `curl http://localhost:11434/api/tags`
2. Verify you have a model installed: `ollama list`
3. Check logs with `--verbose` flag
**Too slow?**
1. Disable in config.yaml: `expand_queries: false`
2. Or use faster model: `expansion_model: "qwen3:0.6b"`
**Poor expansions?**
1. Try different model: `expansion_model: "qwen3:1.7b"`
2. Reduce terms: `max_expansion_terms: 5`
## Technical Details
The QueryExpander class:
- Uses temperature 0.1 for consistent results
- Limits expansions to prevent very long queries
- Handles model selection automatically
- Includes smart caching to avoid repeated calls
Perfect for beginners because it "just works" - enable it when you want better results, disable when you want maximum speed.