fss-mini-rag-github/docs/GETTING_STARTED.md

# Getting Started with FSS-Mini-RAG

## Step 1: Installation

Choose your installation based on what you want:

### Option A: Ollama Only (Recommended)
```bash
# Install Ollama first
curl -fsSL https://ollama.ai/install.sh | sh

# Pull the embedding model
ollama pull nomic-embed-text

# Install Python dependencies
pip install -r requirements.txt
```

### Option B: Full ML Stack
```bash
# Install everything including PyTorch
pip install -r requirements-full.txt
```

## Step 2: Test Installation

```bash
# Index this RAG system itself
./rag-mini index ~/my-project

# Search for something
./rag-mini search ~/my-project "chunker function"

# Check what got indexed
./rag-mini status ~/my-project
```

## Step 3: Index Your First Project

```bash
# Index any project directory
./rag-mini index /path/to/your/project

# The system creates .claude-rag/ directory with:
# - config.json (settings)
# - manifest.json (file tracking)
# - database.lance/ (vector database)
```

## Step 4: Search Your Code

```bash
# Basic semantic search
./rag-mini search /path/to/project "user login logic"

# Enhanced search with smart features
./rag-mini-enhanced search /path/to/project "authentication"

# Find similar patterns
./rag-mini-enhanced similar /path/to/project "def validate_input"
```

## Step 5: Customize Configuration

Edit `project/.claude-rag/config.json`:

```json
{
  "chunking": {
    "max_size": 3000,
    "strategy": "semantic"
  },
  "files": {
    "min_file_size": 100
  }
}
```

Then re-index to apply changes:
```bash
./rag-mini index /path/to/project --force
```

## Common Use Cases

### Find Functions by Name
```bash
./rag-mini search /project "function named connect_to_database"
```

### Find Code Patterns
```bash
./rag-mini search /project "error handling try catch"
./rag-mini search /project "database query with parameters"
```

### Find Configuration
```bash
./rag-mini search /project "database connection settings"
./rag-mini search /project "environment variables"
```

### Find Documentation
```bash
./rag-mini search /project "how to deploy"
./rag-mini search /project "API documentation"
```

## Python API Usage

```python
from claude_rag import ProjectIndexer, CodeSearcher, CodeEmbedder
from pathlib import Path

# Initialize
project_path = Path("/path/to/your/project")
embedder = CodeEmbedder()
indexer = ProjectIndexer(project_path, embedder)
searcher = CodeSearcher(project_path, embedder)

# Index the project
print("Indexing project...")
result = indexer.index_project()
print(f"Indexed {result['files_processed']} files, {result['chunks_created']} chunks")

# Search
print("\nSearching for authentication code...")
results = searcher.search("user authentication logic", limit=5)

for i, result in enumerate(results, 1):
    print(f"\n{i}. {result.file_path}")
    print(f"   Score: {result.score:.3f}")
    print(f"   Type: {result.chunk_type}")
    print(f"   Content: {result.content[:100]}...")
```

## Advanced Features

### Auto-optimization
```bash
# Get optimization suggestions
./rag-mini-enhanced analyze /path/to/project

# This analyzes your codebase and suggests:
# - Better chunk sizes for your language mix
# - Streaming settings for large files
# - File filtering optimizations
```

### File Watching
```python
from claude_rag import FileWatcher

# Watch for file changes and auto-update index
watcher = FileWatcher(project_path, indexer)
watcher.start_watching()

# Now any file changes automatically update the index
```

### Custom Chunking
```python
from claude_rag import CodeChunker

chunker = CodeChunker()

# Chunk a Python file
with open("example.py") as f:
    content = f.read()

chunks = chunker.chunk_text(content, "python", "example.py")
for chunk in chunks:
    print(f"Type: {chunk.chunk_type}")
    print(f"Content: {chunk.content}")
```

## Tips and Best Practices

### For Better Search Results
- Use descriptive phrases: "function that validates email addresses"
- Try different phrasings if first search doesn't work
- Search for concepts, not just exact variable names

### For Better Indexing
- Exclude build directories: `node_modules/`, `build/`, `dist/`
- Include documentation files - they often contain valuable context
- Use semantic chunking strategy for most projects

### For Configuration
- Start with default settings
- Use `analyze` command to get optimization suggestions
- Increase chunk size for larger functions/classes
- Decrease chunk size for more granular search

### For Troubleshooting
- Check `./rag-mini status` to see what was indexed
- Look at `.claude-rag/manifest.json` for file details
- Run with `--force` to completely rebuild index
- Check logs in `.claude-rag/` directory for errors

## What's Next?

1. Try the test suite to understand how components work:
   ```bash
   python -m pytest tests/ -v
   ```

2. Look at the examples in `examples/` directory

3. Read the main README.md for complete technical details

4. Customize the system for your specific project needs