✨ BEGINNER-FRIENDLY ENHANCEMENTS: - Add comprehensive glossary explaining RAG, embeddings, chunks in plain English - Create detailed troubleshooting guide covering installation, search issues, performance - Provide preset configs (beginner/fast/quality) with extensive helpful comments - Enhanced error messages with specific solutions and next steps 🔧 PRODUCTION RELIABILITY: - Add thread-safe caching with automatic cleanup in QueryExpander - Implement chunked processing for large batches to prevent memory issues - Enhanced concurrent embedding with intelligent batch size management - Memory leak prevention with LRU cache approximation 🏗️ ARCHITECTURE COMPLETENESS: - Maintain two-mode system (synthesis fast, exploration thinking + memory) - Preserve educational value while removing intimidation barriers - Complete testing coverage for mode separation and context memory - Full documentation reflecting clean two-mode architecture Perfect balance: genuinely beginner-friendly without compromising technical sophistication
7.8 KiB
📚 Beginner's Glossary - RAG Terms Made Simple
Confused by all the technical terms? Don't worry! This guide explains everything in plain English.
🤖 RAG - Retrieval Augmented Generation
What it is: A fancy way of saying "search your code and get AI explanations"
Simple explanation: Instead of just searching for keywords (like Google), RAG finds code that's similar in meaning to what you're looking for, then has an AI explain it to you.
Real example:
- You search for "user authentication"
- RAG finds code about login systems, password validation, and user sessions
- AI explains: "This code handles user logins using email/password, stores sessions in cookies, and validates users on each request"
🧩 Chunks - Bite-sized pieces of your code
What it is: Your code files broken into smaller, searchable pieces
Simple explanation: RAG can't search entire huge files efficiently, so it breaks them into "chunks" - like cutting a pizza into slices. Each chunk is usually one function, one class, or a few related lines.
Why it matters:
- Too small chunks = missing context ("this variable" but what variable?)
- Too big chunks = too much unrelated stuff in search results
- Just right = perfect context for understanding what code does
Real example:
# This would be one chunk:
def login_user(email, password):
"""Authenticate user with email and password."""
user = find_user_by_email(email)
if user and check_password(user, password):
create_session(user)
return True
return False
🧠 Embeddings - Code "fingerprints"
What it is: A way to convert your code into numbers that computers can compare
Simple explanation: Think of embeddings like DNA fingerprints for your code. Similar code gets similar fingerprints. The computer can then find code with similar "fingerprints" to what you're searching for.
The magic: Code that does similar things gets similar embeddings, even if the exact words are different:
login_user()andauthenticate()would have similar embeddingscalculate_tax()andlogin_user()would have very different embeddings
You don't need to understand the technical details - just know that embeddings help find semantically similar code, not just exact word matches.
🔍 Vector Search vs Keyword Search
Keyword search (like Google): Finds exact word matches
- Search "login" → finds code with the word "login"
- Misses: authentication, signin, user_auth
Vector search (the RAG way): Finds similar meaning
- Search "login" → finds login, authentication, signin, user validation
- Uses those embedding "fingerprints" to find similar concepts
FSS-Mini-RAG uses both for the best results!
📊 Similarity Score - How relevant is this result?
What it is: A number from 0.0 to 1.0 showing how closely your search matches the result
Simple explanation:
- 1.0 = Perfect match (very rare)
- 0.8+ = Excellent match
- 0.5+ = Good match
- 0.3+ = Somewhat relevant
- 0.1+ = Might be useful
- Below 0.1 = Probably not what you want
In practice: Most useful results are between 0.2-0.8
🎯 BM25 - The keyword search boost
What it is: A fancy algorithm that finds exact word matches (like Google search)
Simple explanation: While embeddings find similar meaning, BM25 finds exact words. Using both together gives you the best of both worlds.
Example:
- You search for "password validation"
- Embeddings find: authentication functions, login methods, user security
- BM25 finds: code with the exact words "password" and "validation"
- Combined = comprehensive results
Keep it enabled unless you're getting too many irrelevant results.
🔄 Query Expansion - Making your search smarter
What it is: Automatically adding related terms to your search
Simple explanation: When you search for "auth", the system automatically expands it to "auth authentication login signin user validate".
Pros: Much better, more comprehensive results
Cons: Slower search, sometimes too broad
When to use:
- Turn ON for: Complex searches, learning new codebases
- Turn OFF for: Quick lookups, very specific searches
🤖 LLM - Large Language Model (The AI Brain)
What it is: The AI that reads your search results and explains them in plain English
Simple explanation: After finding relevant code chunks, the LLM reads them like a human would and gives you a summary like: "This code handles user registration by validating email format, checking for existing users, hashing passwords, and saving to database."
Models you might see:
- qwen3:0.6b - Ultra-fast, good for most questions
- llama3.2 - Slower but more detailed
- auto - Picks the best available model
🧮 Synthesis vs Exploration - Two ways to get answers
🚀 Synthesis Mode (Fast & Consistent)
What it does: Quick, factual answers about your code Best for: "What does this function do?" "Where is authentication handled?" "How does the database connection work?" Speed: Very fast (no "thinking" overhead)
🧠 Exploration Mode (Deep & Interactive)
What it does: Detailed analysis with reasoning, remembers conversation Best for: "Why is this function slow?" "What are the security issues here?" "How would I add a new feature?" Features: Shows its reasoning process, you can ask follow-up questions
⚡ Streaming - Handling huge files without crashing
What it is: Processing large files in smaller batches instead of all at once
Simple explanation: Imagine trying to eat an entire cake at once vs. eating it slice by slice. Streaming is like eating slice by slice - your computer won't choke on huge files.
When it kicks in: Files larger than 1MB (that's about 25,000 lines of code)
🏷️ Semantic vs Fixed Chunking
Semantic chunking (RECOMMENDED): Smart splitting that respects code structure
- Keeps functions together
- Keeps classes together
- Respects natural code boundaries
Fixed chunking: Simple splitting that just cuts at size limits
- Faster processing
- Might cut functions in half
- Less intelligent but more predictable
For beginners: Always use semantic chunking unless you have a specific reason not to.
❓ Common Questions
Q: Do I need to understand embeddings to use this?
A: Nope! Just know they help find similar code. The system handles all the technical details.
Q: What's a good similarity threshold for beginners?
A: Start with 0.1. If you get too many results, try 0.2. If you get too few, try 0.05.
Q: Should I enable query expansion?
A: For learning new codebases: YES. For quick specific searches: NO. The TUI enables it automatically when helpful.
Q: Which embedding method should I choose?
A: Use "auto" - it tries the best option and falls back gracefully if needed.
Q: What if I don't have Ollama installed?
A: No problem! The system will automatically fall back to other methods that work without any additional software.
🚀 Quick Start Recommendations
For absolute beginners:
- Keep all default settings
- Use the TUI interface to start
- Try simple searches like "user login" or "database connection"
- Gradually try the CLI commands as you get comfortable
For faster results:
- Set
similarity_threshold: 0.2 - Set
expand_queries: false - Use synthesis mode instead of exploration
For learning new codebases:
- Set
expand_queries: true - Use exploration mode
- Ask "why" and "how" questions
Remember: This is a learning tool! Don't be afraid to experiment with settings and see what works best for your projects. The beauty of FSS-Mini-RAG is that it's designed to be beginner-friendly while still being powerful.