Logging & Feedback

Run Records

Every pipeline execution logs a RunRecord to logs/analytics/runs.jsonl:

{
  "run_id": "uuid-string",
  "timestamp": "2025-01-15T10:30:00Z",
  "user_input": "Classify support tickets by priority",
  "mode": "generate",
  "pattern": "classification",
  "pattern_confidence": 0.85,
  "provider": "openai",
  "iteration_count": 1,
  "final_verdict": "PASS",
  "quality_scores": {
    "correctness": 5,
    "completeness": 4,
    "format_validity": 5,
    "efficiency": 4,
    "reusability": 4
  },
  "estimated_human_edit_minutes": 5,
  "confidence": 0.9,
  "node_count": 5,
  "edge_count": 6,
  "node_types": ["llm", "decision_tree", "variable_transform"],
  "validation_errors": [],
  "critique_issues": []
}

Feedback Records

Human feedback is linked to runs via run_id and stored in logs/analytics/feedback.jsonl:

{
  "run_id": "uuid-string",
  "timestamp": "2025-01-15T11:00:00Z",
  "edit_minutes": 8,
  "rating": 4,
  "notes": "Good structure, needed minor prompt tweaks"
}

The rating field (1-5) is compared against the evaluator's correctness score during calibration checks.

Reading Logs

The AnalyticsReader provides filtered access to log data:

from autoflow.analytics.reader import AnalyticsReader

reader = AnalyticsReader()

# All runs
runs = list(reader.iter_runs())

# Filter by pattern
classification_runs = list(reader.iter_runs(pattern="classification"))

# Filter by verdict
passed = list(reader.iter_runs(verdict="PASS"))

# Filter by mode
generated = list(reader.iter_runs(mode="generate"))

# Get feedback for a run
feedback = reader.get_feedback(run_id="uuid-string")

Analysis Reports

The PatternAnalyzer computes aggregate statistics:

from autoflow.analytics.analyzer import PatternAnalyzer

analyzer = PatternAnalyzer()
report = analyzer.generate_report()

# Per-pattern stats
for pattern, stats in report.pattern_stats.items():
    print(f"{pattern}: {stats.pass_rate:.0%} pass rate, "
          f"{stats.avg_correctness:.1f} avg correctness")

# Failure modes
for mode in report.failure_modes:
    print(f"Issue: {mode.description} ({mode.frequency} occurrences)")

Run the report from the CLI:

python -m autoflow --analyze

Pattern Updates

The PatternUpdater analyzes run history and suggests improvements:

from autoflow.analytics.pattern_updater import PatternUpdater

updater = PatternUpdater()
suggestions = updater.suggest_updates()

# Confidence adjustments
for adj in suggestions.confidence_adjustments:
    print(f"{adj.pattern}: {adj.current} → {adj.suggested}")

# Keyword additions
for kw in suggestions.keyword_additions:
    print(f"{kw.pattern}: add '{kw.keyword}'")

# Deprecations
for dep in suggestions.deprecations:
    print(f"Consider deprecating: {dep.pattern}")