- Rewrote CLAUDE.md with comprehensive development guide - Archived 20 old docs to docs/archive/ - Added PROJECT_ROADMAP_2025.md with research learnings - Added CLASSIFICATION_METHODS_COMPARISON.md - Added SESSION_HANDOVER_20251128.md - Added tools for analysis (brett_gmail/microsoft analyzers) - Updated .gitignore for archive folders - Config changes for local vLLM endpoint
129 lines
3.8 KiB
Markdown
129 lines
3.8 KiB
Markdown
# Session Handover Report - Email Sorter
|
|
**Date:** 2025-11-28
|
|
**Session ID:** eb549838-a153-48d1-ae5d-891e0e83108f
|
|
|
|
---
|
|
|
|
## What Was Done This Session
|
|
|
|
### 1. Classified 801 emails from brett-gmail using three methods:
|
|
|
|
| Method | Accuracy | Time | Output Location |
|
|
|--------|----------|------|-----------------|
|
|
| ML-Only | 54.9% | ~5 sec | `/home/bob/Documents/Email Manager/emails/brett-gm-md/` |
|
|
| ML+LLM | 93.3% | ~3.5 min | `/home/bob/Documents/Email Manager/emails/brett-gm-llm/` |
|
|
| Manual Agent | 99.8% | ~25 min | Same as ML-only + analysis files |
|
|
|
|
### 2. Created/Modified Files
|
|
|
|
**New Files:**
|
|
- `tools/generate_html_report.py` - HTML report generator
|
|
- `tools/brett_gmail_analyzer.py` - Custom dataset analyzer
|
|
- `data/brett_gmail_analysis.json` - Analysis output
|
|
- `docs/REPORT_FORMAT.md` - Report system documentation
|
|
- `docs/CLASSIFICATION_METHODS_COMPARISON.md` - Method comparison
|
|
- `docs/PROJECT_ROADMAP_2025.md` - Full roadmap and learnings
|
|
- `/home/bob/Documents/Email Manager/emails/brett-gm-md/BRETT_GMAIL_ANALYSIS_REPORT.md` - Analysis report
|
|
- `/home/bob/Documents/Email Manager/emails/brett-gm-md/report.html` - HTML report (ML-only)
|
|
- `/home/bob/Documents/Email Manager/emails/brett-gm-llm/report.html` - HTML report (ML+LLM)
|
|
|
|
**Modified Files:**
|
|
- `src/cli.py` - Added `--force-ml` flag, enriched results.json with email metadata
|
|
- `src/llm/openai_compat.py` - Removed API key requirement for local vLLM
|
|
- `config/default_config.yaml` - Changed LLM to openai provider on localhost:11433
|
|
|
|
### 3. Key Configuration Changes
|
|
|
|
```yaml
|
|
# config/default_config.yaml - LLM now uses vLLM endpoint
|
|
llm:
|
|
provider: "openai"
|
|
openai:
|
|
base_url: "http://localhost:11433/v1"
|
|
api_key: "not-needed"
|
|
classification_model: "qwen3-coder-30b"
|
|
```
|
|
|
|
---
|
|
|
|
## Key Findings
|
|
|
|
1. **ML pipeline overkill for <5000 emails** - Agent analysis gives better accuracy in similar time
|
|
2. **Sender domain is strongest signal** - Top 5 senders = 47.5% of emails
|
|
3. **Categories should serve downstream routing** - Not human labels, but processing decisions
|
|
4. **Risk-based accuracy** - Personal emails need high accuracy, junk can tolerate errors
|
|
5. **This tool = triage** - Sorts into buckets for other specialized tools
|
|
|
|
---
|
|
|
|
## Project Scope (Agreed with User)
|
|
|
|
**Email Sorter IS:**
|
|
- Bulk classification/triage tool
|
|
- Router to downstream specialized tools
|
|
- Part of larger email processing ecosystem
|
|
|
|
**Email Sorter IS NOT:**
|
|
- Complete email management solution
|
|
- Spam filter (trust Gmail/Outlook)
|
|
- Final destination for emails
|
|
|
|
---
|
|
|
|
## Recommended Dataset Size Routing
|
|
|
|
| Size | Method |
|
|
|------|--------|
|
|
| <500 | Agent-only |
|
|
| 500-5000 | Agent pre-scan + ML |
|
|
| >5000 | ML pipeline |
|
|
|
|
---
|
|
|
|
## Background Processes
|
|
|
|
There are stale background bash processes (f8678e, 0a3549, 0d150e) from classification runs. These completed successfully and can be ignored.
|
|
|
|
---
|
|
|
|
## What Needs Doing Next
|
|
|
|
1. **Review docs/** - All learnings are in PROJECT_ROADMAP_2025.md
|
|
2. **Phase 1 development** - Dataset size routing, sender-first classification
|
|
3. **Agent pre-scan module** - 10-15 min discovery phase before ML
|
|
|
|
---
|
|
|
|
## User Preferences (from CLAUDE.md)
|
|
|
|
- NO emojis in commits
|
|
- NO "Generated with Claude" attribution
|
|
- Use tools (Read/Edit/Grep) not bash commands for file ops
|
|
- Virtual environment required for Python
|
|
- TTS available via `fss-speak` (single line messages only, no newlines)
|
|
|
|
---
|
|
|
|
## Quick Start for Next Agent
|
|
|
|
```bash
|
|
cd /MASTERFOLDER/Tools/email-sorter
|
|
source venv/bin/activate
|
|
|
|
# Read the roadmap
|
|
cat docs/PROJECT_ROADMAP_2025.md
|
|
|
|
# Run classification
|
|
python -m src.cli run --source local \
|
|
--directory "/path/to/emails" \
|
|
--output "/path/to/output" \
|
|
--force-ml --llm-provider openai
|
|
|
|
# Generate HTML report
|
|
python tools/generate_html_report.py --input /path/to/results.json
|
|
```
|
|
|
|
---
|
|
|
|
*Session ended: 2025-11-28 ~03:30 AEDT*
|