# Batch LLM Classifier - Quick Start

## Prerequisite Check

```bash
python tools/batch_llm_classifier.py check
```

Expected: `✓ vLLM server is running and ready`

If it is not running, start the vLLM server at rtx3090.bobai.com.au first.

---

## Basic Usage

```bash
python tools/batch_llm_classifier.py ask \
    --source enron \
    --limit 50 \
    --question "YOUR QUESTION HERE" \
    --output results.txt
```

---

## Example Questions

### Find Urgent Emails

```bash
--question "Is this email urgent or time-sensitive? Answer yes/no and explain."
```

### Extract Financial Data

```bash
--question "List any dollar amounts, budgets, or financial numbers in this email."
```

### Meeting Detection

```bash
--question "Does this email mention a meeting? If yes, extract date/time/location."
```

### Sentiment Analysis

```bash
--question "What is the tone? Professional/Casual/Urgent/Frustrated? Explain."
```

### Custom Classification

```bash
--question "Should this email be archived or kept active? Why?"
```

---

## Performance

- **Throughput**: 4.65 requests/sec
- **Batch size**: 4 (requests are pooled into true batches; see the sketch in the appendix below)
- **Reliability**: 100% success rate
- **Example**: 500 requests in 108 seconds

---

## When To Use

✅ **Use Batch LLM for:**

- Custom questions on 50-500 emails
- One-off exploratory analysis
- Flexible classification criteria
- Data extraction tasks

❌ **Use RAG instead for:**

- Searching a 10k+ email corpus
- Semantic topic search
- Multi-document reasoning

❌ **Use Main ML Pipeline for:**

- Regular ongoing classification
- High-volume processing (10k+ emails)
- Consistent categories
- Maximum speed

---

## Quick Test

```bash
# Check server
python tools/batch_llm_classifier.py check

# Process 10 emails
python tools/batch_llm_classifier.py ask \
    --source enron \
    --limit 10 \
    --question "Summarize this email in one sentence." \
    --output test.txt

# Check results
cat test.txt
```

---

## Files Created

- `tools/batch_llm_classifier.py` - Main tool (executable)
- `tools/README.md` - Full documentation
- `test_llm_concurrent.py` - Performance testing script (repository root)

**No files in `src/` were modified - the existing ML pipeline is untouched.**

---

## Configuration

Edit `VLLM_CONFIG` in `batch_llm_classifier.py`:

```python
VLLM_CONFIG = {
    'base_url': 'https://rtx3090.bobai.com.au/v1',
    'api_key': 'rtx3090_foxadmin_10_8034ecb47841f45ba1d5f3f5d875c092',
    'model': 'qwen3-coder-30b',
    'batch_size': 4,  # Don't increase - causes 503 errors
}
```

---

## Troubleshooting

**Server not available:**

```bash
curl https://rtx3090.bobai.com.au/v1/models -H "Authorization: Bearer rtx3090_..."
```

**503 errors:** Lower `batch_size` to 2 in the config (4 is the normal optimum), or add retries (see the appendix below).

**Slow processing:** Check vLLM server load - it may be handling other requests.

---

**Done!** Ready to ask custom questions across email batches.
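
---

## Appendix: How Batch Pooling Works (Sketch)

The tool pools requests into groups of `batch_size` against the vLLM server's OpenAI-compatible `/v1` endpoint. The sketch below illustrates that pattern; it is not the tool's actual code. It assumes the `openai` Python package (`pip install openai`), and the `ask_one` / `classify_batch` names and prompt layout are hypothetical.

```python
# Minimal sketch of batch pooling against a vLLM OpenAI-compatible API.
# VLLM_CONFIG mirrors the config in batch_llm_classifier.py; `ask_one` and
# `classify_batch` are illustrative names, not the tool's real API.
from concurrent.futures import ThreadPoolExecutor

from openai import OpenAI

VLLM_CONFIG = {
    'base_url': 'https://rtx3090.bobai.com.au/v1',
    'api_key': 'rtx3090_...',  # use the real key from batch_llm_classifier.py
    'model': 'qwen3-coder-30b',
    'batch_size': 4,  # don't increase - causes 503 errors
}

client = OpenAI(base_url=VLLM_CONFIG['base_url'], api_key=VLLM_CONFIG['api_key'])

def ask_one(email_text: str, question: str) -> str:
    """Send a single email plus the question to the model, return its answer."""
    response = client.chat.completions.create(
        model=VLLM_CONFIG['model'],
        messages=[{'role': 'user', 'content': f"{question}\n\n---\n{email_text}"}],
    )
    return response.choices[0].message.content

def classify_batch(emails: list[str], question: str) -> list[str]:
    """Pool requests so at most `batch_size` are in flight at once."""
    with ThreadPoolExecutor(max_workers=VLLM_CONFIG['batch_size']) as pool:
        return list(pool.map(lambda e: ask_one(e, question), emails))
```

Capping in-flight requests at 4 is what keeps the server from returning 503s while still reaching the ~4.65 requests/sec quoted above.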
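
---

## Appendix: Handling 503s with Retries (Sketch)

If occasional 503s persist even at a lower `batch_size`, retrying with backoff is one option. This sketch builds on the hypothetical `ask_one()` above and assumes the `openai>=1.x` client, which raises `APIStatusError` subclasses for non-2xx responses; it is not behavior the tool currently implements.

```python
# Sketch: retry a request on transient 503s with exponential backoff.
# Assumes ask_one() from the previous appendix and the openai>=1.x client.
import time

import openai

def ask_with_retry(email_text: str, question: str, retries: int = 3) -> str:
    """Retry on 503 (server overloaded), sleeping 1s, 2s, ... between tries."""
    for attempt in range(retries):
        try:
            return ask_one(email_text, question)
        except openai.APIStatusError as err:
            if err.status_code != 503 or attempt == retries - 1:
                raise  # not a 503, or out of retries
            time.sleep(2 ** attempt)
```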