2 Commits

Author SHA1 Message Date
50ddaa4b39 Fix calibration workflow - LLM now generates categories/labels correctly
Root cause: A pre-trained model was loading successfully, causing the CLI to skip
calibration entirely; the system went straight to classification with the 35%-accuracy model.

Changes:
- config: Set calibration_model to qwen3:8b-q4_K_M (larger model for better instruction following)
- cli: Create separate calibration_llm provider with 8b model
- llm_analyzer: Improved prompt to force exact email ID copying
- workflow: Merge discovered categories with predefined ones
- workflow: Add detailed error logging for label mismatches
- ml_classifier: Fixed model path checking (was checking None parameter)
- ml_classifier: Add dual API support (sklearn predict_proba vs LightGBM predict)
- ollama: Fixed model list parsing (use m.model not m.get('name'))
- feature_extractor: Switch to Ollama embeddings (instant vs 90s load time)
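The dual-API change in ml_classifier could look something like the sketch below. This is a hypothetical helper (the commit only says both APIs are supported, not how): scikit-learn estimators, including lightgbm.LGBMClassifier, expose predict_proba, while a native lightgbm.Booster returns probabilities directly from predict().

```python
import numpy as np

def predict_proba_dual(model, X):
    """Return class probabilities from either a scikit-learn estimator
    or a native LightGBM Booster (hypothetical helper name)."""
    if hasattr(model, "predict_proba"):
        # scikit-learn API (incl. lightgbm.LGBMClassifier)
        return model.predict_proba(X)
    # Native lightgbm.Booster: predict() already yields probabilities,
    # shaped (n,) for binary and (n, n_classes) for multiclass.
    raw = np.asarray(model.predict(X))
    if raw.ndim == 1:
        # Binary case: expand to two columns [P(neg), P(pos)]
        return np.column_stack([1.0 - raw, raw])
    return raw
```

Dispatching on hasattr rather than isinstance keeps the helper usable even when LightGBM is not installed.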

Result: Calibration now runs and generates 16 categories + 50 labels correctly.
Next: Investigate calibration sampling to reduce overfitting on small samples.
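The ollama fix above (use m.model, not m.get('name')) reflects a real client change: recent ollama-python releases return typed response objects from ollama.list(), so entries expose a .model attribute instead of behaving like dicts. A minimal illustration of the access pattern, using a stand-in dataclass so it runs without a live Ollama server:

```python
from dataclasses import dataclass

@dataclass
class Model:
    # Stand-in for the typed entries newer ollama-python returns in
    # ollama.list().models; the real objects expose the same attribute.
    model: str  # e.g. "qwen3:8b-q4_K_M"

def model_names(models):
    # Old dict-based responses allowed m.get('name'); typed responses
    # require attribute access instead:
    return [m.model for m in models]

# With the real client: names = model_names(ollama.list().models)
print(model_names([Model("qwen3:8b-q4_K_M"), Model("nomic-embed-text")]))
```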
2025-10-23 13:51:09 +11:00
0a301da0ff Add comprehensive next steps and action plan
- Created NEXT_STEPS.md with three clear deployment paths
- Path A: Framework validation (5 minutes)
- Path B: Real model integration (30-60 minutes)
- Path C: Full production deployment (2-3 hours)
- Decision tree for users
- Common commands reference
- Troubleshooting guide
- Success criteria checklist
- Timeline estimates

Enables users to:
1. Quickly validate framework with mock model
2. Choose their model integration approach
3. Understand full deployment path
4. Have clear next-steps documentation

Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-21 12:13:35 +11:00