- Create global wrapper script in /usr/local/bin/rag-mini
- Automatically handles virtual environment activation
- Suppress virtual environment warnings when using global wrapper
- Update installation scripts to install global wrapper automatically
- Add comprehensive timing documentation (2-3 min fast, 5-10 min slow internet)
- Add agent warnings for background process execution
- Update all documentation with realistic timing expectations
- Fix README commands to use correct syntax (rag-mini init -p .)
Major improvements:
- Users can now run 'rag-mini' from anywhere without activation
- Installation creates transparent global command automatically
- No more virtual environment complexity for end users
- Comprehensive agent/CI/CD guidance with timeout warnings
- Complete documentation consistency across all files
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Applied Black formatter and isort across entire codebase for professional consistency
- Moved implementation scripts (rag-mini.py, rag-tui.py) to bin/ directory for cleaner root
- Updated shell scripts to reference new bin/ locations maintaining user compatibility
- Added comprehensive linting configuration (.flake8, pyproject.toml) with dedicated .venv-linting
- Removed development artifacts (commit_message.txt, GET_STARTED.md duplicate) from root
- Consolidated documentation and fixed script references across all guides
- Relocated test_fixes.py to proper tests/ directory
- Enhanced project structure following Python packaging standards
All user commands work identically while improving code organization and beginner accessibility.
Major enhancements:
• Add comprehensive deployment guide covering all platforms (mobile, edge, cloud)
• Implement system context collection for enhanced AI responses
• Update documentation with current workflows and deployment scenarios
• Fix Windows compatibility bugs in file locking system
• Enhanced diagrams with system context integration flow
• Improved exploration mode with better context handling
Platform support expanded:
• Full macOS compatibility verified
• Raspberry Pi deployment with ARM64 optimizations
• Android deployment via Termux with configuration examples
• Edge device deployment strategies and performance guidelines
• Docker containerization for universal deployment
Technical improvements:
• System context module provides OS/environment awareness to AI
• Context-aware prompts improve response relevance
• Enhanced error handling and graceful fallbacks
• Better integration between synthesis and exploration modes
Documentation updates:
• Complete deployment guide with troubleshooting
• Updated getting started guide with current installation flows
• Enhanced visual diagrams showing system architecture
• Platform-specific configuration examples
Ready for extended deployment testing and user feedback.
This comprehensive update enhances user experience with several key improvements:
## Enhanced Streaming & Thinking Display
- Implement real-time streaming with gray thinking tokens that collapse after completion
- Fix thinking token redisplay bug with proper content filtering
- Add clear "AI Response:" headers to separate thinking from responses
- Enable streaming by default for better user engagement
- Keep thinking visible for exploration, collapse only for suggested questions
## Natural Conversation Responses
- Convert clunky JSON exploration responses to natural, conversational format
- Improve exploration prompts for friendly, colleague-style interactions
- Update summary generation with better context handling
- Eliminate double response display issues
## Model Reference Updates
- Remove all llama3.2 references in favor of qwen3 models
- Fix non-existent qwen3:3b references, replace with proper model names
- Update model rankings to prioritize working qwen models across all components
- Ensure consistent model recommendations in docs and examples
## Cross-Platform Icon Integration
- Add desktop icon setup to Linux installer with .desktop entry
- Add Windows shortcuts for desktop and Start Menu integration
- Improve installer user experience with visual branding
## Configuration & Navigation Fixes
- Fix "0" option in configuration menu to properly go back
- Improve configuration menu user-friendliness
- Update troubleshooting guides with correct model suggestions
These changes significantly improve the beginner experience while maintaining
technical accuracy and system reliability.
Major fixes:
- Fix model selection to prioritize qwen3:1.7b instead of qwen3:4b for testing
- Correct context length from 80,000 to 32,000 tokens (proper Qwen3 limit)
- Implement content-preserving safeguards instead of dropping responses
- Fix all test imports from claude_rag to mini_rag module naming
- Add virtual environment warnings to all test entry points
- Fix TUI EOF crash handling with proper error handling
- Remove warmup delays that were causing startup lag and unwanted model calls
- Fix command mappings between bash wrapper and Python script
- Update documentation to reflect qwen3:1.7b as primary recommendation
- Improve TUI box alignment and formatting
- Make language generic for any documents, not just codebases
- Add proper folder names in user feedback instead of generic terms
Technical improvements:
- Unified model rankings across all components
- Better error handling for missing dependencies
- Comprehensive testing and validation of all fixes
- All tests now pass and system is deployment-ready
All major crashes and deployment issues resolved.
- Changed primary model recommendation from qwen3:1.7b to qwen3:4b
- Added Q8 quantization info in technical docs for production users
- Fixed method name error: get_embedding_info() -> get_status()
- Updated all error messages and test files with new recommendations
- Maintained beginner-friendly options (1.7b still very good, 0.6b surprisingly good)
- Added explanation of why small models work well with RAG context
- Comprehensive testing completed - system ready for clean release
Complete rebrand to eliminate any Claude/Anthropic references:
Directory Changes:
- claude_rag/ → mini_rag/ (preserving git history)
Content Changes:
- Replaced 930+ Claude references across 40+ files
- Updated all imports: from claude_rag → from mini_rag
- Updated all file paths: .claude-rag → .mini-rag
- Updated documentation and comments
- Updated configuration files and examples
Testing Changes:
- All tests updated to use mini_rag imports
- Integration tests verify new module structure
This ensures complete independence from Claude/Anthropic
branding while maintaining all functionality and git history.
- Update model rankings to prioritize ultra-efficient CPU models (qwen3:0.6b first)
- Add comprehensive CPU deployment documentation with performance benchmarks
- Configure CPU-optimized settings in default config
- Enable 796MB total model footprint for standard systems
- Support Raspberry Pi, older laptops, and CPU-only environments
- Maintain excellent quality with 522MB qwen3:0.6b model
📚 DOCUMENTATION
- docs/QUERY_EXPANSION.md: Complete beginner guide with examples and troubleshooting
- Updated config.yaml with proper LLM settings and comments
- Clear explanations of when features are enabled/disabled
🧪 NEW TESTING INFRASTRUCTURE
- test_ollama_integration.py: 6 comprehensive tests with helpful error messages
- test_smart_ranking.py: 6 tests verifying ranking quality improvements
- troubleshoot.py: Interactive tool for diagnosing setup issues
- Enhanced system validation with new features coverage
⚙️ SMART DEFAULTS
- Query expansion disabled by default (CLI speed)
- TUI enables expansion automatically (exploration mode)
- Clear user feedback about which features are active
- Graceful degradation when Ollama unavailable
🎯 BEGINNER-FRIENDLY APPROACH
- Tests explain what they're checking and why
- Clear solutions provided for common problems
- Educational output showing system status
- Offline testing with gentle mocking
Run 'python3 tests/troubleshoot.py' to verify your setup\!