🎯 Complete transformation from 5.9GB bloated system to 70MB optimized solution

✨ Key Features:
- Hybrid embedding system (Ollama + ML fallback + hash backup)
- Intelligent chunking with language-aware parsing
- Semantic + BM25 hybrid search with rich context
- Zero-config portable design with graceful degradation
- Beautiful TUI for beginners + powerful CLI for experts
- Comprehensive documentation with 8+ Mermaid diagrams
- Professional animated demo (183KB optimized GIF)

🏗️ Architecture Highlights:
- LanceDB vector storage with streaming indexing
- Smart file tracking (size/mtime) to avoid expensive rehashing
- Progressive chunking: Markdown headers → Python functions → fixed-size
- Quality filtering: 200+ chars, 20+ words, 30% alphanumeric content
- Concurrent batch processing with error recovery

📦 Package Contents:
- Core engine: claude_rag/ (11 modules, 2,847 lines)
- Entry points: rag-mini (unified), rag-tui (beginner interface)
- Documentation: README + 6 guides with visual diagrams
- Assets: 3D icon, optimized demo GIF, recording tools
- Tests: 8 comprehensive integration and validation tests
- Examples: Usage patterns, config templates, dependency analysis

🎥 Demo System:
- Scripted demonstration showing 12 files → 58 chunks indexing
- Semantic search with multi-line result previews
- Complete workflow from TUI startup to CLI mastery
- Professional recording pipeline with asciinema + GIF conversion

🛡️ Security & Quality:
- Complete .gitignore with personal data protection
- Dependency optimization (removed python-dotenv)
- Code quality validation and educational test suite
- Agent-reviewed architecture and documentation

Ready for production use - copy folder, run ./rag-mini, start searching!
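The quality-filtering rule in the commit message (200+ chars, 20+ words, 30% alphanumeric content) can be sketched roughly as follows. This is only an illustration under assumptions: the function name `passes_quality_filter` and its parameters are hypothetical, the thresholds are read as inclusive minimums, and the alphanumeric ratio is taken over all characters in the chunk. The actual implementation in `claude_rag/` may differ.

```python
def passes_quality_filter(text, min_chars=200, min_words=20, min_alnum_ratio=0.30):
    """Return True if a chunk looks substantial enough to index.

    Hypothetical helper: the thresholds mirror the numbers in the commit
    message, but how the project applies them is an assumption.
    """
    if not text or len(text) < min_chars:
        return False
    # Whitespace-separated tokens are counted as words.
    if len(text.split()) < min_words:
        return False
    # Share of characters that are letters or digits.
    alnum = sum(ch.isalnum() for ch in text)
    return alnum / len(text) >= min_alnum_ratio
```

A chunk of ordinary prose passes, while a short snippet or a long run of separator characters is rejected.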
63 lines
1.7 KiB
Python
"""
Windows Console Unicode/Emoji Fix
This fucking works in 2025. No more emoji bullshit.
"""

import sys
import os
import io


def fix_windows_console():
    """
    Fix Windows console to properly handle UTF-8 and emojis.
    Call this at the start of any script that needs to output Unicode/emojis.
    """
    # Set environment variable for UTF-8 mode. Python reads PYTHONUTF8 at
    # interpreter startup, so this only affects child Python processes.
    os.environ['PYTHONUTF8'] = '1'

    # For Python 3.7+
    if hasattr(sys.stdout, 'reconfigure'):
        sys.stdout.reconfigure(encoding='utf-8')
        sys.stderr.reconfigure(encoding='utf-8')
        if hasattr(sys.stdin, 'reconfigure'):
            sys.stdin.reconfigure(encoding='utf-8')
    else:
        # For older Python versions
        if sys.platform == 'win32':
            # Replace streams with UTF-8 versions
            sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8', line_buffering=True)
            sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8', line_buffering=True)

    # Also set the console code page to UTF-8 on Windows
    if sys.platform == 'win32':
        import subprocess
        try:
            # Set console to UTF-8 code page
            subprocess.run(['chcp', '65001'], shell=True, capture_output=True)
        except Exception:
            # Best effort: a failed code page change should not break the caller
            pass


# Auto-fix on import
fix_windows_console()


# Test function to verify it works
def test_emojis():
    """Test that emojis work properly."""
    print("Testing emoji output:")
    print("✅ Check mark")
    print("❌ Cross mark")
    print("🚀 Rocket")
    print("🔥 Fire")
    print("💻 Computer")
    print("🐍 Python")
    print("📁 Folder")
    print("🔍 Search")
    print("⚡ Lightning")
    print("✨ Sparkles")


if __name__ == "__main__":
    test_emojis()
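
The core of the fix above is the call to `reconfigure(encoding='utf-8')` on the standard streams. As a minimal sketch of why that helps, the snippet below reproduces the failure on an in-memory stream instead of a real console. The `cp1252` encoding stands in for a legacy Windows code page (an assumption for illustration), and `demo_reconfigure` is a hypothetical helper name, not part of the module.

```python
import io


def demo_reconfigure():
    """Show that a legacy-encoded text stream rejects emoji until reconfigured."""
    raw = io.BytesIO()
    # cp1252 is used here as a stand-in for a non-UTF-8 console encoding.
    stream = io.TextIOWrapper(raw, encoding="cp1252")

    try:
        stream.write("\U0001F680 Rocket")
        failed_before = False
    except UnicodeEncodeError:
        # The rocket emoji has no representation in cp1252.
        failed_before = True

    # Same call the module makes on sys.stdout / sys.stderr (Python 3.7+).
    stream.reconfigure(encoding="utf-8")
    stream.write("\U0001F680 Rocket")
    stream.flush()
    return failed_before, raw.getvalue()


if __name__ == "__main__":
    failed, data = demo_reconfigure()
    print("write failed before reconfigure:", failed)
    print("bytes written after reconfigure:", data)
```

Working on a `BytesIO`-backed wrapper keeps the demonstration independent of the host terminal, so it behaves the same on Windows, Linux, and macOS.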