
# Query Expansion Guide

## What Is Query Expansion?

Query expansion automatically adds related terms to your search to find more relevant results.

Example:

- You search: "authentication"
- The system expands it to: "authentication login user verification credentials security"
- Result: 2-3x more relevant matches!

## How It Works

```mermaid
graph LR
    A[User Query] --> B[LLM Expands]
    B --> C[Enhanced Search]
    C --> D[Better Results]

    style A fill:#e1f5fe
    style D fill:#e8f5e8
```

1. Your query goes to a small, fast LLM (like qwen3:1.7b)
2. The LLM adds related terms that people might use when writing about the topic
3. Both semantic and keyword search use the expanded query
4. You get much better results without changing anything
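The flow above can be sketched in Python. `expand_query` and `ollama_llm` are illustrative names, not this project's actual API; the HTTP call follows Ollama's documented `/api/generate` request shape:

```python
import json
import urllib.request

def expand_query(query, llm, max_terms=8):
    """Append LLM-suggested related terms to a query.

    `llm` is any callable that takes a prompt string and returns text,
    so the network call can be swapped out for testing.
    """
    prompt = (
        f"List up to {max_terms} single-word search terms related to: {query}. "
        "Respond with space-separated words only."
    )
    extra = llm(prompt).split()[:max_terms]
    # Keep the original query first and drop duplicate terms.
    seen = set(query.lower().split())
    terms = [t for t in extra if t.lower() not in seen]
    return " ".join([query] + terms)

def ollama_llm(prompt, model="qwen3:1.7b", host="localhost:11434"):
    """Minimal non-streaming call to Ollama's /api/generate endpoint."""
    body = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": 0.1},  # low temperature for consistent terms
    }).encode()
    req = urllib.request.Request(
        f"http://{host}/api/generate", data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because the `llm` callable is injected, the expansion logic itself can be exercised without a running Ollama server.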

## When Is It Enabled?

- CLI commands: Disabled by default (for speed)
- TUI interface: Auto-enabled (when you have time to explore)
- ⚙️ Configurable: Can be enabled/disabled in `config.yaml`

## Configuration

### Easy Configuration (TUI)

Use the interactive Configuration Manager in the TUI:

1. Start the TUI: `./rag-tui` or `rag.bat` (Windows)
2. Select Option 6: Configuration Manager
3. Choose Option 2: Toggle query expansion
4. Follow the prompts: you get an explanation and an easy on/off toggle

The TUI will:

- Explain benefits and requirements clearly
- Check if Ollama is available
- Show the current status (enabled/disabled)
- Save changes automatically

### Manual Configuration (Advanced)

Edit `config.yaml` directly:

```yaml
# Search behavior settings
search:
  expand_queries: false         # Set to true to enable automatic query expansion

# LLM expansion settings
llm:
  max_expansion_terms: 8        # How many terms to add
  expansion_model: auto         # Which model to use
  ollama_host: localhost:11434  # Ollama server
```

## Performance

- Speed: ~100ms on most systems (depends on your hardware)
- Caching: Repeated queries are instant
- Model selection: Automatically uses the fastest available model

## Examples

Code Search:

"error handling" → "error handling exception try catch fault tolerance recovery"

Documentation Search:

"installation" → "installation setup install deploy configuration getting started"

Any Content:

"budget planning" → "budget planning financial forecast cost analysis spending plan"

## Troubleshooting

**Query expansion not working?**

1. Check that Ollama is running: `curl http://localhost:11434/api/tags`
2. Verify you have a model installed: `ollama list`
3. Check the logs with the `--verbose` flag
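The first two checks can also be done programmatically. This sketch hits Ollama's documented `/api/tags` endpoint; the function name and the injectable `fetch` parameter are illustrative, not part of this project:

```python
import json
import urllib.request

def ollama_models(fetch=None, host="localhost:11434"):
    """Return installed model names, or [] if the server is unreachable.

    `fetch` takes a URL and returns the response body; it is injectable
    so the function can be tested without a live server.
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url, timeout=2) as resp:
                return resp.read()
    try:
        data = json.loads(fetch(f"http://{host}/api/tags"))
    except OSError:
        return []
    return [m["name"] for m in data.get("models", [])]
```

An empty list means either the server is down or no models are installed; in both cases query expansion cannot run.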

**Too slow?**

1. Disable it in `config.yaml`: `expand_queries: false`
2. Or use a faster model: `expansion_model: "qwen3:0.6b"`

**Poor expansions?**

1. Try a different model: `expansion_model: "qwen3:1.7b"`
2. Reduce the number of added terms: `max_expansion_terms: 5`

## Technical Details

The `QueryExpander` class:

- Uses temperature 0.1 for consistent results
- Limits expansions to prevent very long queries
- Handles model selection automatically
- Includes smart caching to avoid repeated calls
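The caching and term-limit behaviour can be sketched as follows; this is a minimal illustration, not the class's real interface, and `llm` is again an injected callable:

```python
class QueryExpander:
    """Illustrative sketch of caching plus a cap on added terms."""

    def __init__(self, llm, max_terms=8):
        self.llm = llm              # callable: prompt -> text (e.g. an Ollama call)
        self.max_terms = max_terms  # cap expansions to keep queries short
        self._cache = {}            # query -> expanded query

    def expand(self, query):
        if query in self._cache:
            return self._cache[query]  # repeated queries are instant
        prompt = (
            f"List up to {self.max_terms} search terms related to: {query}. "
            "Respond with space-separated words only."
        )
        terms = self.llm(prompt).split()[:self.max_terms]
        expanded = " ".join([query] + terms)
        self._cache[query] = expanded
        return expanded
```

The cache is keyed on the raw query string, so the LLM is called at most once per distinct query in a session.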

Query expansion is perfect for beginners because it "just works": enable it when you want better results, disable it when you want maximum speed.