Fss-Rag-Mini

Author	SHA1	Message	Date
BobAi	a84ff94fba	Improve UX with streaming tokens, fix model references, and add icon integration This comprehensive update enhances user experience with several key improvements: ## Enhanced Streaming & Thinking Display - Implement real-time streaming with gray thinking tokens that collapse after completion - Fix thinking token redisplay bug with proper content filtering - Add clear "AI Response:" headers to separate thinking from responses - Enable streaming by default for better user engagement - Keep thinking visible for exploration, collapse only for suggested questions ## Natural Conversation Responses - Convert clunky JSON exploration responses to natural, conversational format - Improve exploration prompts for friendly, colleague-style interactions - Update summary generation with better context handling - Eliminate double response display issues ## Model Reference Updates - Remove all llama3.2 references in favor of qwen3 models - Fix non-existent qwen3:3b references, replace with proper model names - Update model rankings to prioritize working qwen models across all components - Ensure consistent model recommendations in docs and examples ## Cross-Platform Icon Integration - Add desktop icon setup to Linux installer with .desktop entry - Add Windows shortcuts for desktop and Start Menu integration - Improve installer user experience with visual branding ## Configuration & Navigation Fixes - Fix "0" option in configuration menu to properly go back - Improve configuration menu user-friendliness - Update troubleshooting guides with correct model suggestions These changes significantly improve the beginner experience while maintaining technical accuracy and system reliability.	2025-08-15 12:20:06 +10:00
BobAi	c201b3badd	Fix critical deployment issues and improve system reliability Major fixes: - Fix model selection to prioritize qwen3:1.7b instead of qwen3:4b for testing - Correct context length from 80,000 to 32,000 tokens (proper Qwen3 limit) - Implement content-preserving safeguards instead of dropping responses - Fix all test imports from claude_rag to mini_rag module naming - Add virtual environment warnings to all test entry points - Fix TUI EOF crash handling with proper error handling - Remove warmup delays that were causing startup lag and unwanted model calls - Fix command mappings between bash wrapper and Python script - Update documentation to reflect qwen3:1.7b as primary recommendation - Improve TUI box alignment and formatting - Make language generic for any documents, not just codebases - Add proper folder names in user feedback instead of generic terms Technical improvements: - Unified model rankings across all components - Better error handling for missing dependencies - Comprehensive testing and validation of all fixes - All tests now pass and system is deployment-ready All major crashes and deployment issues resolved.	2025-08-15 09:47:15 +10:00
BobAi	3363171820	🎓 Complete beginner-friendly polish with production reliability ✨ BEGINNER-FRIENDLY ENHANCEMENTS: - Add comprehensive glossary explaining RAG, embeddings, chunks in plain English - Create detailed troubleshooting guide covering installation, search issues, performance - Provide preset configs (beginner/fast/quality) with extensive helpful comments - Enhanced error messages with specific solutions and next steps 🔧 PRODUCTION RELIABILITY: - Add thread-safe caching with automatic cleanup in QueryExpander - Implement chunked processing for large batches to prevent memory issues - Enhanced concurrent embedding with intelligent batch size management - Memory leak prevention with LRU cache approximation 🏗️ ARCHITECTURE COMPLETENESS: - Maintain two-mode system (synthesis fast, exploration thinking + memory) - Preserve educational value while removing intimidation barriers - Complete testing coverage for mode separation and context memory - Full documentation reflecting clean two-mode architecture Perfect balance: genuinely beginner-friendly without compromising technical sophistication	2025-08-12 18:59:24 +10:00
BobAi	a7e3e6f474	Add interactive exploration mode with thinking and context memory - Create separate explore mode with thinking enabled for debugging/learning - Add lazy loading with LLM warmup using 'testing, just say "hi" <no_think>' - Implement context-aware conversation memory across questions - Add interactive CLI with help, summary, and session management - Enable Qwen3 thinking mode toggle for experimentation - Support multi-turn conversations for better debugging workflow - Clean separation between fast synthesis and deep exploration modes	2025-08-12 18:06:08 +10:00
BobAi	16199375fc	Add CPU-only deployment support with qwen3:0.6b model - Update model rankings to prioritize ultra-efficient CPU models (qwen3:0.6b first) - Add comprehensive CPU deployment documentation with performance benchmarks - Configure CPU-optimized settings in default config - Enable 796MB total model footprint for standard systems - Support Raspberry Pi, older laptops, and CPU-only environments - Maintain excellent quality with 522MB qwen3:0.6b model	2025-08-12 17:49:02 +10:00
BobAi	4925f6d4e4	Add comprehensive testing suite and documentation for new features 📚 DOCUMENTATION - docs/QUERY_EXPANSION.md: Complete beginner guide with examples and troubleshooting - Updated config.yaml with proper LLM settings and comments - Clear explanations of when features are enabled/disabled 🧪 NEW TESTING INFRASTRUCTURE - test_ollama_integration.py: 6 comprehensive tests with helpful error messages - test_smart_ranking.py: 6 tests verifying ranking quality improvements - troubleshoot.py: Interactive tool for diagnosing setup issues - Enhanced system validation with new features coverage ⚙️ SMART DEFAULTS - Query expansion disabled by default (CLI speed) - TUI enables expansion automatically (exploration mode) - Clear user feedback about which features are active - Graceful degradation when Ollama unavailable 🎯 BEGINNER-FRIENDLY APPROACH - Tests explain what they're checking and why - Clear solutions provided for common problems - Educational output showing system status - Offline testing with gentle mocking Run 'python3 tests/troubleshoot.py' to verify your setup!	2025-08-12 17:36:32 +10:00
BobAi	4166d0a362	Initial release: FSS-Mini-RAG - Lightweight semantic code search system 🎯 Complete transformation from 5.9GB bloated system to 70MB optimized solution ✨ Key Features: - Hybrid embedding system (Ollama + ML fallback + hash backup) - Intelligent chunking with language-aware parsing - Semantic + BM25 hybrid search with rich context - Zero-config portable design with graceful degradation - Beautiful TUI for beginners + powerful CLI for experts - Comprehensive documentation with 8+ Mermaid diagrams - Professional animated demo (183KB optimized GIF) 🏗️ Architecture Highlights: - LanceDB vector storage with streaming indexing - Smart file tracking (size/mtime) to avoid expensive rehashing - Progressive chunking: Markdown headers → Python functions → fixed-size - Quality filtering: 200+ chars, 20+ words, 30% alphanumeric content - Concurrent batch processing with error recovery 📦 Package Contents: - Core engine: claude_rag/ (11 modules, 2,847 lines) - Entry points: rag-mini (unified), rag-tui (beginner interface) - Documentation: README + 6 guides with visual diagrams - Assets: 3D icon, optimized demo GIF, recording tools - Tests: 8 comprehensive integration and validation tests - Examples: Usage patterns, config templates, dependency analysis 🎥 Demo System: - Scripted demonstration showing 12 files → 58 chunks indexing - Semantic search with multi-line result previews - Complete workflow from TUI startup to CLI mastery - Professional recording pipeline with asciinema + GIF conversion 🛡️ Security & Quality: - Complete .gitignore with personal data protection - Dependency optimization (removed python-dotenv) - Code quality validation and educational test suite - Agent-reviewed architecture and documentation Ready for production use - copy folder, run ./rag-mini, start searching!	2025-08-12 16:38:28 +10:00

7 Commits