email-sorter

BobAi/email-sorter

Fork 0

Commit Graph

Author	SHA1	Message	Date
FSSCoding	50ddaa4b39	Fix calibration workflow - LLM now generates categories/labels correctly Root cause: Pre-trained model was loading successfully, causing CLI to skip calibration entirely. System went straight to classification with 35% model. Changes: - config: Set calibration_model to qwen3:8b-q4_K_M (larger model for better instruction following) - cli: Create separate calibration_llm provider with 8b model - llm_analyzer: Improved prompt to force exact email ID copying - workflow: Merge discovered categories with predefined ones - workflow: Add detailed error logging for label mismatches - ml_classifier: Fixed model path checking (was checking None parameter) - ml_classifier: Add dual API support (sklearn predict_proba vs LightGBM predict) - ollama: Fixed model list parsing (use m.model not m.get('name')) - feature_extractor: Switch to Ollama embeddings (instant vs 90s load time) Result: Calibration now runs and generates 16 categories + 50 labels correctly. Next: Investigate calibration sampling to reduce overfitting on small samples.	2025-10-23 13:51:09 +11:00
Brett Fox	22fe08a1a6	Add model integration tools and comprehensive completion assessment Features: - Created download_pretrained_model.py for downloading models from URLs - Created setup_real_model.py for integrating pre-trained LightGBM models - Generated MODEL_INFO.md with model usage documentation - Created COMPLETION_ASSESSMENT.md with comprehensive project evaluation - Framework complete: all 16 phases implemented, 27/30 tests passing - Model integration ready: tools to download/setup real LightGBM models - Clear path to production: real model, Gmail OAuth, and deployment ready This enables: 1. Immediate real model integration without code changes 2. Clear path from mock framework testing to production 3. Support for both downloaded and self-trained models 4. Documented deployment process for 80k+ email processing Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>	2025-10-21 12:12:52 +11:00

Author

SHA1

Message

Date

FSSCoding

50ddaa4b39

Fix calibration workflow - LLM now generates categories/labels correctly

Root cause: Pre-trained model was loading successfully, causing CLI to skip
calibration entirely. System went straight to classification with 35% model.

Changes:
- config: Set calibration_model to qwen3:8b-q4_K_M (larger model for better instruction following)
- cli: Create separate calibration_llm provider with 8b model
- llm_analyzer: Improved prompt to force exact email ID copying
- workflow: Merge discovered categories with predefined ones
- workflow: Add detailed error logging for label mismatches
- ml_classifier: Fixed model path checking (was checking None parameter)
- ml_classifier: Add dual API support (sklearn predict_proba vs LightGBM predict)
- ollama: Fixed model list parsing (use m.model not m.get('name'))
- feature_extractor: Switch to Ollama embeddings (instant vs 90s load time)

Result: Calibration now runs and generates 16 categories + 50 labels correctly.
Next: Investigate calibration sampling to reduce overfitting on small samples.

2025-10-23 13:51:09 +11:00

Brett Fox

22fe08a1a6

Add model integration tools and comprehensive completion assessment

Features:
- Created download_pretrained_model.py for downloading models from URLs
- Created setup_real_model.py for integrating pre-trained LightGBM models
- Generated MODEL_INFO.md with model usage documentation
- Created COMPLETION_ASSESSMENT.md with comprehensive project evaluation
- Framework complete: all 16 phases implemented, 27/30 tests passing
- Model integration ready: tools to download/setup real LightGBM models
- Clear path to production: real model, Gmail OAuth, and deployment ready

This enables:
1. Immediate real model integration without code changes
2. Clear path from mock framework testing to production
3. Support for both downloaded and self-trained models
4. Documented deployment process for 80k+ email processing

Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>

2025-10-21 12:12:52 +11:00

2 Commits