3 Commits

Author SHA1 Message Date
01ecd74983 Complete GitHub issue implementation and security hardening
Major improvements from comprehensive technical and security reviews:

🎯 GitHub Issue Fixes (All 3 Priority Items):
• Add headless installation flag (--headless) for agents/CI automation
• Implement automatic model name resolution (qwen3:1.7b → qwen3:1.7b-q8_0)
• Add prominent copy-paste instructions for fresh Ubuntu/Windows/Mac systems

🔧 CI/CD Pipeline Fixes:
• Fix virtual environment activation in GitHub workflows
• Add comprehensive test execution with proper dependency context
• Resolve test pattern matching for safeguard preservation methods
• Eliminate CI failure emails with robust error handling

🔒 Security Hardening:
• Replace unsafe curl|sh patterns with secure download-verify-execute
• Add SSL certificate validation with retry logic and exponential backoff
• Implement model name sanitization to prevent injection attacks
• Add network timeout handling and connection resilience
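
As an illustration of the model name sanitization item above, a minimal sketch might validate names against an allow-list before they reach a shell command or URL. The function name `sanitize_model_name`, the pattern, and the 128-character limit are assumptions made for this example, not the code from this commit.

```python
import re

# Hypothetical allow-list: the first character must be alphanumeric, and the
# rest may only use the separators seen in names such as "qwen3:1.7b-q8_0".
_MODEL_NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._:/-]{0,127}")


def sanitize_model_name(name: str) -> str:
    """Return the trimmed name, or raise ValueError if it is not allow-listed."""
    candidate = name.strip()
    if not _MODEL_NAME_RE.fullmatch(candidate):
        raise ValueError(f"unsafe model name: {name!r}")
    return candidate
```

Rejecting anything outside the allow-list, rather than trying to escape dangerous characters, keeps the check simple: a value such as `qwen3; rm -rf /` fails because of the semicolon and spaces.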

Enhanced Features:
• Robust model resolution with fuzzy matching for quantization variants
• Cross-platform headless installation for automation workflows
• Comprehensive error handling with graceful fallbacks
• Analysis directory gitignore protection for scan results
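
The model resolution described above (qwen3:1.7b → qwen3:1.7b-q8_0) could be sketched roughly as follows. This version uses plain prefix matching, not the fuzzy matching the commit mentions, and it assumes the list of installed model names has already been fetched; `resolve_model_name` is a hypothetical helper, not the project's API.

```python
from typing import Iterable, Optional


def resolve_model_name(requested: str, installed: Iterable[str]) -> Optional[str]:
    """Map a requested model name to an installed one, if any.

    An exact match wins. Otherwise "<requested>-<suffix>" is treated as a
    quantization variant of the request (an assumption of this sketch).
    """
    names = list(installed)
    if requested in names:
        return requested
    variants = sorted(n for n in names if n.startswith(requested + "-"))
    # Pick the alphabetically first variant so the result is deterministic.
    return variants[0] if variants else None
```

For example, `resolve_model_name("qwen3:1.7b", ["llama3:8b", "qwen3:1.7b-q8_0"])` returns `"qwen3:1.7b-q8_0"`, and a request with no exact or variant match returns `None` so the caller can fall back gracefully.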

🧪 Testing & Quality:
• All test suites passing (4/4 tests successful)
• Security validation preventing injection attempts
• Model resolution tested with real Ollama instances
• CI workflows validated across Python 3.10/3.11/3.12

📚 Documentation:
• Security-hardened installation maintains beginner-friendly approach
• Copy-paste instructions work on completely fresh systems
• Progressive complexity preserved (TUI → CLI → advanced)
• Step-by-step explanations for all installation commands
2025-09-02 17:15:21 +10:00
683ba9d51f Update .gitignore to exclude user-specific folders
- Add .mini-rag/ to gitignore (user-specific index data, 1.6MB)
- Add .claude/ to gitignore (personal Claude Code settings)
- Keep repo lightweight and focused on source code
- Users can quickly create their own index with: ./rag-mini index .
2025-08-15 10:13:01 +10:00
4166d0a362 Initial release: FSS-Mini-RAG - Lightweight semantic code search system
🎯 Complete transformation from 5.9GB bloated system to 70MB optimized solution

Key Features:
- Hybrid embedding system (Ollama + ML fallback + hash backup)
- Intelligent chunking with language-aware parsing
- Semantic + BM25 hybrid search with rich context
- Zero-config portable design with graceful degradation
- Beautiful TUI for beginners + powerful CLI for experts
- Comprehensive documentation with 8+ Mermaid diagrams
- Professional animated demo (183KB optimized GIF)

🏗️ Architecture Highlights:
- LanceDB vector storage with streaming indexing
- Smart file tracking (size/mtime) to avoid expensive rehashing
- Progressive chunking: Markdown headers → Python functions → fixed-size
- Quality filtering: 200+ chars, 20+ words, 30% alphanumeric content
- Concurrent batch processing with error recovery
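
The quality filtering thresholds listed above (200+ chars, 20+ words, 30% alphanumeric content) can be illustrated with a small sketch. It assumes the alphanumeric figure is the share of alphanumeric characters in the stripped chunk text; the function name and parameters are hypothetical and may differ from what claude_rag/ does.

```python
def passes_quality_filter(
    text: str,
    min_chars: int = 200,
    min_words: int = 20,
    min_alnum_ratio: float = 0.30,
) -> bool:
    """Return True if a chunk meets all three thresholds."""
    stripped = text.strip()
    if not stripped or len(stripped) < min_chars:
        return False
    # Words are counted by splitting on whitespace.
    if len(stripped.split()) < min_words:
        return False
    # Assumption: the ratio is alphanumeric characters over all characters.
    alnum = sum(ch.isalnum() for ch in stripped)
    return alnum / len(stripped) >= min_alnum_ratio
```

Checking the cheap length test first means short fragments are rejected before the per-character scan runs.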

📦 Package Contents:
- Core engine: claude_rag/ (11 modules, 2,847 lines)
- Entry points: rag-mini (unified), rag-tui (beginner interface)
- Documentation: README + 6 guides with visual diagrams
- Assets: 3D icon, optimized demo GIF, recording tools
- Tests: 8 comprehensive integration and validation tests
- Examples: Usage patterns, config templates, dependency analysis

🎥 Demo System:
- Scripted demonstration showing 12 files → 58 chunks indexing
- Semantic search with multi-line result previews
- Complete workflow from TUI startup to CLI mastery
- Professional recording pipeline with asciinema + GIF conversion

🛡️ Security & Quality:
- Complete .gitignore with personal data protection
- Dependency optimization (removed python-dotenv)
- Code quality validation and educational test suite
- Agent-reviewed architecture and documentation

Ready for production use - copy folder, run ./rag-mini, start searching!
2025-08-12 16:38:28 +10:00