Architecture¶
Four Layers¶
Breadcrumb is built on four well-separated layers, each with a clear responsibility:
┌─────────────────────────────────────────────────────────────────┐
│ Test Code (pytest / raw Playwright) │
├─────────────────────────────────────────────────────────────────┤
│ Layer 4 — Reporting & Intelligence │
│ TestTracker · TestAnalyzer · QuarantineManager │
│ ReportConsole · ReportHTML · ReportJSON │
│ PageCrawler · ElementClassifier · TestCodeGenerator │
│ MCP Server (7 tools) │
├─────────────────────────────────────────────────────────────────┤
│ Layer 3 — Playwright Wrapper │
│ HealablePage · HealableLocator · crumb() · heal_page fixture │
├─────────────────────────────────────────────────────────────────┤
│ Layer 2 — Similarity Engine │
│ Healer · SimilarityScorer │
│ Jaccard · Levenshtein · LCS · Euclidean │
├─────────────────────────────────────────────────────────────────┤
│ Layer 1 — Fingerprint + Storage │
│ ElementFingerprint · BoundingBox · FingerprintStore (SQLite) │
└─────────────────────────────────────────────────────────────────┘
Package Structure¶
breadcrumb/
├── __init__.py # Public API: crumb, HealablePage, HealableLocator
├── core/
│ ├── fingerprint.py # ElementFingerprint, BoundingBox (frozen dataclasses)
│ ├── similarity.py # Six scoring algorithms (pure Python)
│ ├── healer.py # Scoring pipeline + candidate selection
│ └── storage.py # FingerprintStore (SQLite, WAL mode)
├── playwright/
│ ├── extractor.py # JS-based DOM extraction
│ └── page_wrapper.py # HealablePage, HealableLocator
├── plugins/
│ └── pytest_plugin.py # heal_page fixture + --breadcrumb flags
├── flaky/
│ ├── tracker.py # TestTracker, schema migration v1→v2
│ ├── analyzer.py # TestAnalyzer: flip-rate, EWMA, classification
│ └── quarantine.py # QuarantineManager: auto-quarantine/release
├── report/
│ ├── console.py # ReportConsole
│ ├── html.py # ReportHTML (interactive dashboard)
│ └── json.py # ReportJSON
├── generate/
│ ├── crawler.py # PageCrawler (static + Playwright)
│ ├── classifier.py # ElementClassifier (heuristic roles)
│ └── codegen.py # TestCodeGenerator (POM + pytest)
├── mcp/
│ ├── __init__.py
│ └── server.py # MCP server with 7 tools
└── cli/
└── main.py # Click CLI: report/doctor/generate/init/mcp
Data Schema¶
Schema v1 (core)¶
CREATE TABLE schema_meta (
key TEXT PRIMARY KEY,
value TEXT NOT NULL
);
CREATE TABLE fingerprints (
test_id TEXT NOT NULL,
locator TEXT NOT NULL,
fingerprint_json TEXT NOT NULL, -- JSON-serialised ElementFingerprint
updated_at REAL NOT NULL, -- Unix timestamp
PRIMARY KEY (test_id, locator)
);
CREATE TABLE healing_events (
id INTEGER PRIMARY KEY AUTOINCREMENT,
test_id TEXT NOT NULL,
locator TEXT NOT NULL,
confidence REAL NOT NULL,
original_json TEXT NOT NULL, -- fingerprint before heal
healed_json TEXT NOT NULL, -- fingerprint after heal
timestamp REAL NOT NULL
);
Schema v2 additions (flaky tracking)¶
CREATE TABLE test_runs (
id INTEGER PRIMARY KEY AUTOINCREMENT,
test_id TEXT NOT NULL,
status TEXT NOT NULL, -- passed/failed/error/skipped
duration_ms REAL,
healing_occurred INTEGER NOT NULL DEFAULT 0,
error_type TEXT,
environment TEXT,
timestamp REAL NOT NULL
);
CREATE TABLE quarantine (
test_id TEXT PRIMARY KEY,
reason TEXT NOT NULL,
quarantined_at REAL NOT NULL,
auto_unquarantine INTEGER NOT NULL DEFAULT 1
);
Migration from v1 → v2 is idempotent and handled automatically by migrate_schema().
Similarity Scoring Detail¶
Final score = weighted_mean(
tag_match × 0.25,
id_match × 0.25,
text_similarity × 0.20, # Levenshtein
class_similarity × 0.10, # Jaccard
attr_similarity × 0.05, # Jaccard
path_similarity × 0.10, # LCS
sibling_similarity × 0.03, # LCS
position_score × 0.02, # 1 − norm_euclidean_distance
)
All algorithms are implemented in breadcrumb/core/similarity.py with zero
external dependencies — the only imports are from the Python standard library.
Performance Benchmarks¶
Measured on Windows 11, Python 3.12, i7 processor.
| Operation | Result |
|---|---|
| Single-pair similarity score | ~0.009 ms |
| Heal over 100 candidates | ~2 ms |
| Heal over 1,000 candidates | ~14 ms |
| Fingerprint INSERT (SQLite WAL) | ~0.29 ms/op |
| Fingerprint SELECT by key | ~0.006 ms/op |
Healing a typical page (30–100 elements) adds under 15 ms per broken locator.
Design Principles¶
- Type safety is non-negotiable — pyright strict + mypy strict on every commit
- No external runtime dependencies — similarity algorithms are pure Python
- Never silently heal — confidence threshold must be exceeded or the test fails normally
- Local-only by design — no cloud calls, no API keys required for core functionality
- Append-only heal log — every heal event is preserved for auditing
- Idempotent migrations — schema changes are safe to run multiple times