From 183b12c9b4332a15a4c31211916eb471516928bb Mon Sep 17 00:00:00 2001
From: FSSCoding
Date: Thu, 23 Oct 2025 14:15:17 +1100
Subject: [PATCH] Improve LLM prompts with proper context and purpose

Both discovery and consolidation prompts now explain:
- What the system does (train ML classifier for auto-sorting)
- What makes good categories (broad, timeless, learnable)
- Why this matters (user needs, ML training requirements)
- How to think about the task (user-focused, functional)

Discovery prompt changes:
- Explains goal of identifying natural categories for ML training
- Lists guidelines for good categories (broad, user-focused, learnable)
- Provides concrete examples of functional categories
- Emphasizes PURPOSE over topic

Consolidation prompt changes:
- Explains full system context (LightGBM, auto-labeling, user search)
- Defines what makes categories effective for ML and users
- Provides user-centric thinking framework
- Emphasizes reusability and timelessness

Prompts now give the brilliant 8b model proper context to deliver
excellent category decisions instead of lazy generic categorization.
---
 src/calibration/llm_analyzer.py | 65 +++++++++++++++++++++++++++------
 1 file changed, 54 insertions(+), 11 deletions(-)

diff --git a/src/calibration/llm_analyzer.py b/src/calibration/llm_analyzer.py
index e76fc2f..685bc48 100644
--- a/src/calibration/llm_analyzer.py
+++ b/src/calibration/llm_analyzer.py
@@ -105,16 +105,36 @@ class CalibrationAnalyzer:
         # Use first email ID as example
         example_id = batch[0].id if batch else "maildir_example__sent_1"
 
-        prompt = f"""Categorize these emails. You MUST copy the exact ID string for each email.
+        prompt = f"""You are analyzing emails to discover natural categories for an automatic classification system.
 
-EMAILS:
+GOAL: Identify broad, reusable categories that will help train a machine learning model to sort thousands of emails automatically.
+
+GUIDELINES FOR GOOD CATEGORIES:
+- BROAD & TIMELESS: "Financial" not "Q3 Budget Review"
+- USER-FOCUSED: Think "what would help someone find this email later?"
+- LEARNABLE: ML model needs consistent patterns (sender domains, keywords, structure)
+- FUNCTIONAL: Each category serves a distinct purpose
+- 3-10 categories ideal: Too many = noise, too few = useless
+
+EMAILS TO ANALYZE:
 {email_summary}
 
-CRITICAL: Copy the EXACT ID from each email above. For example, if email #1 has ID "{example_id}", you must write exactly "{example_id}" in the labels array, not "email1" or anything else.
+TASK:
+1. Identify natural groupings based on PURPOSE, not just topic
+2. Create SHORT (1-3 word) category names
+3. Assign each email to exactly one category
+4. CRITICAL: Copy EXACT email IDs - if email #1 shows ID "{example_id}", use exactly "{example_id}" in labels
+
+EXAMPLES OF GOOD CATEGORIES:
+- "Work Communication" (daily business emails)
+- "Financial" (invoices, budgets, reports)
+- "Urgent" (time-sensitive requests)
+- "Technical" (system alerts, dev discussions)
+- "Administrative" (HR, policies, announcements)
 
 Return JSON:
 {{
-  "categories": {{"category_name": "description", ...}},
+  "categories": {{"category_name": "what user need this serves", ...}},
   "labels": [["{example_id}", "category"], ...]
 }}
 
@@ -257,28 +277,51 @@ JSON:
         rules_text = "\n".join(rules)
 
         # Build prompt
-        prompt = f"""Consolidate email categories by merging duplicates and overlaps.
+        prompt = f"""You are helping build an email classification system that will automatically sort thousands of emails.
+
+TASK: Consolidate the discovered categories below into a lean, effective set for training a machine learning classifier.
+
+WHY THIS MATTERS:
+These categories will be used to:
+1. Train a LightGBM classifier on email features (embeddings, patterns, structure)
+2. Automatically label thousands of emails without human intervention
+3. Help users quickly find emails by category (like Gmail labels)
+
+WHAT MAKES GOOD CATEGORIES:
+- BROAD & REUSABLE: "Meetings" not "Q3 Planning Meeting" - applies to many emails
+- FUNCTIONALLY DISTINCT: Each category serves a different user need
+- BALANCED: Avoid 1 huge category + many tiny ones
+- LEARNABLE: ML model needs clear patterns to distinguish categories
+- TIMELESS: "Financial Reports" not "2023 Budget Review"
+- ACTION-ORIENTED: Users ask "show me all X" - what is X?
 
 DISCOVERED CATEGORIES (sorted by email count):
 {category_list}
 
-{context_section}CONSOLIDATION RULES:
+{context_section}CONSOLIDATION STRATEGY:
 {rules_text}
 
+THINK LIKE A USER: If you had to sort 10,000 emails, what categories would help you find things fast?
+- "Work Communication" catches daily business emails
+- "Urgent" flags time-sensitive items
+- "Financial" groups all money-related emails
+- "Technical" vs "Administrative" serves different workflows
+
 OUTPUT FORMAT - Return JSON with consolidated categories and mapping:
 {{
   "consolidated": {{
-    "FinalCategoryName": "Clear, generic description of what emails fit here"
+    "FinalCategoryName": "Clear description of what user need this serves"
   }},
   "mappings": {{
     "OldCategoryName": "FinalCategoryName"
   }}
 }}
 
-IMPORTANT:
-- consolidated dict should have {target_categories} or fewer entries
-- mappings dict must map EVERY old category name to a final category
-- Final category names should be present in both consolidated and mappings
+CRITICAL REQUIREMENTS:
+- Maximum {target_categories} final categories (strict limit)
+- Map EVERY old category to exactly one final category
+- Final category names must be SHORT (1-3 words), GENERIC, and REUSABLE
+- Think: "Would this category still make sense in 5 years?"
 
 JSON:
 """