# Query Expansion Guide
## What Is Query Expansion?
Query expansion automatically adds related terms to your search to find more relevant results.
**Example:**
- You search: `"authentication"`
- System expands to: `"authentication login user verification credentials security"`
- Result: 2-3x more relevant matches!
## How It Works
```mermaid
graph LR
    A[User Query] --> B[LLM Expands]
    B --> C[Enhanced Search]
    C --> D[Better Results]
    style A fill:#e1f5fe
    style D fill:#e8f5e8
```
1. Your query goes to a small, fast LLM (like qwen3:1.7b)
2. The LLM adds related terms that people might use when writing about the topic
3. Both semantic and keyword search use the expanded query
4. You get much better results without changing anything
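Under the hood, this is a single round trip to the local Ollama API. Here is a minimal sketch of that call (assuming the standard Ollama `/api/generate` endpoint; the `expand_query` helper and prompt wording are illustrative, not the tool's actual code):

```python
import requests

def expand_query(query: str, model: str = "qwen3:1.7b") -> str:
    """Ask a small local LLM for related search terms (illustrative sketch)."""
    prompt = (
        f"List search terms related to '{query}'. "
        "Reply with one space-separated line of terms, nothing else."
    )
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": model,
            "prompt": prompt,
            "stream": False,
            "options": {"temperature": 0.1},  # low temperature keeps output consistent
        },
        timeout=30,
    )
    resp.raise_for_status()
    extra_terms = resp.json()["response"].strip()
    return f"{query} {extra_terms}"  # both semantic and keyword search receive this

# e.g. "authentication" -> "authentication login user verification credentials ..."
print(expand_query("authentication"))
```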
## When Is It Enabled?
- ❌ **CLI commands**: Disabled by default (for speed)
- ✅ **TUI interface**: Auto-enabled (when you have time to explore)
- ⚙️ **Configurable**: Can be enabled/disabled in `config.yaml`
## Configuration
### Easy Configuration (TUI)
Use the interactive Configuration Manager in the TUI:
1. Start the TUI: `./rag-tui` (or `rag.bat` on Windows)
2. Select Option 6: Configuration Manager
3. Choose Option 2: Toggle query expansion
4. Follow the prompts: you get an explanation and an easy on/off toggle
The TUI will:
- Explain benefits and requirements clearly
- Check if Ollama is available
- Show current status (enabled/disabled)
- Save changes automatically
### Manual Configuration (Advanced)
Edit `config.yaml` directly:
```yaml
# Search behavior settings
search:
  expand_queries: false  # Set to true to enable automatic query expansion

# LLM expansion settings
llm:
  max_expansion_terms: 8        # How many terms to add
  expansion_model: auto         # Which model to use (auto = fastest available)
  ollama_host: localhost:11434  # Ollama server
```
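If you script against the same file, the settings read back as ordinary nested keys. A quick sanity check (assuming PyYAML is installed and the `config.yaml` layout above):

```python
import yaml

# Load the tool's config and inspect the expansion settings
with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

print(cfg["search"]["expand_queries"])    # False until you enable it
print(cfg["llm"]["max_expansion_terms"])  # 8
print(cfg["llm"]["expansion_model"])      # "auto"
```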
## Performance
- **Speed**: ~100 ms per expansion (depends on your hardware)
- **Caching**: Repeated queries are instant
- **Model selection**: Automatically uses the fastest available model
## Examples
**Code Search:**
`"error handling"` → `"error handling exception try catch fault tolerance recovery"`

**Documentation Search:**
`"installation"` → `"installation setup install deploy configuration getting started"`

**Any Content:**
`"budget planning"` → `"budget planning financial forecast cost analysis spending plan"`
## Troubleshooting
**Query expansion not working?**
1. Check that Ollama is running: `curl http://localhost:11434/api/tags`
2. Verify you have a model installed: `ollama list`
3. Check the logs with the `--verbose` flag
**Too slow?**
- Disable it in `config.yaml`: `expand_queries: false`
- Or use a faster model: `expansion_model: "qwen3:0.6b"`
**Poor expansions?**
- Try a different model: `expansion_model: "qwen3:1.7b"`
- Reduce the number of terms: `max_expansion_terms: 5`
## Technical Details
The `QueryExpander` class:
- Uses temperature 0.1 for consistent results
- Limits expansions to prevent very long queries
- Handles model selection automatically
- Includes smart caching to avoid repeated LLM calls
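A rough sketch of how those pieces could fit together (hypothetical code, assuming the Ollama HTTP API; the real class's interface may differ):

```python
import requests

class QueryExpander:
    """Illustrative sketch: low temperature, capped terms, cached results."""

    def __init__(self, model="auto", max_terms=8, host="localhost:11434"):
        self.model = model
        self.max_terms = max_terms
        self.host = host
        self._cache: dict[str, str] = {}  # repeated queries return instantly

    def _pick_model(self) -> str:
        # "auto" picks an installed model from Ollama (assumed fallback logic)
        if self.model != "auto":
            return self.model
        tags = requests.get(f"http://{self.host}/api/tags", timeout=5).json()
        names = [m["name"] for m in tags.get("models", [])]
        return names[0] if names else "qwen3:1.7b"

    def expand(self, query: str) -> str:
        key = query.strip().lower()
        if key in self._cache:  # smart caching: skip the LLM entirely
            return self._cache[key]
        resp = requests.post(
            f"http://{self.host}/api/generate",
            json={
                "model": self._pick_model(),
                "prompt": f"Give related search terms for: {query}",
                "stream": False,
                "options": {"temperature": 0.1},  # consistent results
            },
            timeout=30,
        )
        resp.raise_for_status()
        terms = resp.json()["response"].split()[: self.max_terms]  # cap query length
        expanded = " ".join([query, *terms])
        self._cache[key] = expanded
        return expanded
```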
Perfect for beginners because it "just works": enable it when you want better results, disable it when you want maximum speed.