Flaky Tests API
The flaky tracking layer records test outcomes, computes flip-rates, and manages quarantine.
TestTracker
breadcrumb.flaky.tracker.TestTracker
Records test execution runs into the SQLite database.
Usage::

    tracker = TestTracker(store)
    tracker.record_run("test_login", "passed", duration_ms=120.5)
    tracker.record_run("test_login", "failed", error_type="AssertionError")
    runs = tracker.get_runs("test_login")
record_run
record_run(test_id: str, status: str, duration_ms: float | None = None, healing_occurred: bool = False, error_type: str | None = None, environment: str | None = None) -> None
Record a test execution result.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `test_id` | `str` | Unique test identifier (e.g. pytest node ID). | required |
| `status` | `str` | One of 'passed', 'failed', 'error', 'skipped'. | required |
| `duration_ms` | `float \| None` | Test duration in milliseconds. | `None` |
| `healing_occurred` | `bool` | Whether self-healing was triggered. | `False` |
| `error_type` | `str \| None` | Exception class name if the test errored. | `None` |
| `environment` | `str \| None` | Optional environment tag (e.g. 'ci', 'local'). | `None` |
get_runs
Return recent runs for a test, ordered by timestamp descending.
get_all_test_ids
Return distinct test_ids that have at least one recorded run.
migrate_schema
breadcrumb.flaky.tracker.migrate_schema
Migrate a v1 database to v2 by adding the test_runs and quarantine tables.
Safe to call multiple times: the tables are created with CREATE TABLE IF NOT EXISTS guards.
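The idempotent pattern described above can be sketched with the standard `sqlite3` module. This is only an illustration: the function name `migrate_sketch` and every column shown are assumptions, since the actual test_runs and quarantine schemas are not documented on this page.

```python
import sqlite3

# Hypothetical column layouts: the real test_runs and quarantine schemas
# are not documented here, so these columns are illustrative only.
SCHEMA_V2 = """
CREATE TABLE IF NOT EXISTS test_runs (
    id INTEGER PRIMARY KEY,
    test_id TEXT NOT NULL,
    status TEXT NOT NULL,
    duration_ms REAL,
    healing_occurred INTEGER NOT NULL DEFAULT 0,
    error_type TEXT,
    environment TEXT,
    timestamp TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS quarantine (
    test_id TEXT PRIMARY KEY,
    reason TEXT,
    auto_unquarantine INTEGER NOT NULL DEFAULT 1
);
"""


def migrate_sketch(conn: sqlite3.Connection) -> None:
    """Create the v2 tables if missing; tables that already exist are left untouched."""
    conn.executescript(SCHEMA_V2)
    conn.commit()


conn = sqlite3.connect(":memory:")
migrate_sketch(conn)
migrate_sketch(conn)  # the second call changes nothing and raises no error

# List the user tables that now exist.
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

Because each statement carries its own IF NOT EXISTS guard, the migration can run on every startup without first checking a schema version.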
TestAnalyzer
breadcrumb.flaky.analyzer.TestAnalyzer
Analyses test run history to detect and rank flaky tests.
Classifications:
- Stable: fliprate == 0.0
- Intermittent: 0.0 < fliprate <= 0.2
- Flaky: 0.2 < fliprate <= 0.5
- Chronic: fliprate > 0.5
Usage::

    analyzer = TestAnalyzer(tracker)
    fliprate = analyzer.compute_fliprate("test_login")
    classification = analyzer.classify("test_login")
compute_fliprate
Standard flip-rate: fraction of consecutive outcome changes.
For n runs, there are n-1 consecutive pairs. Each pair where the outcome changes counts as a flip. Returns flips / (n-1).
Returns 0.0 if fewer than 2 runs are available.
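The definition above can be written out as a small standalone function. This is a minimal sketch, not the library's source: the name `fliprate` and the plain list-of-statuses input are assumptions, and how the real method treats 'skipped' or 'error' outcomes is not stated here.

```python
def fliprate(outcomes: list[str]) -> float:
    """Fraction of consecutive pairs whose outcome differs (flips / (n - 1))."""
    if len(outcomes) < 2:
        return 0.0  # no consecutive pair exists, so there is nothing to compare
    # Pair each outcome with its successor and count the changes.
    flips = sum(1 for prev, curr in zip(outcomes, outcomes[1:]) if prev != curr)
    return flips / (len(outcomes) - 1)


# Four runs give three consecutive pairs; two of them change outcome.
example = fliprate(["passed", "failed", "passed", "passed"])
```

For the four runs in the example, the pairs are (passed, failed), (failed, passed), and (passed, passed), so the result is 2/3.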
compute_ewma_fliprate
EWMA-weighted flip-rate (more recent flips weighted more heavily).
Implements the exponentially weighted moving average approach from the paper "Modeling and Ranking Flaky Tests at Apple". Alpha controls how much weight recent flips receive relative to older ones (a higher alpha weights recent flips more heavily).
Returns 0.0 if fewer than 2 runs are available.
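One way to apply an EWMA to flip indicators is sketched below. It is an assumption-laden illustration: the function name `ewma_fliprate`, the default `alpha=0.5`, the starting value of 0.0, and smoothing per transition (rather than per time window) are all choices made for this example, and the library's actual implementation may differ. The input is expected oldest-first, so runs returned in descending timestamp order would need to be reversed first.

```python
def ewma_fliprate(outcomes: list[str], alpha: float = 0.5) -> float:
    """EWMA over per-transition flip indicators; outcomes are ordered oldest first."""
    if len(outcomes) < 2:
        return 0.0
    ewma = 0.0  # assumed starting value for the moving average
    for prev, curr in zip(outcomes, outcomes[1:]):
        flip = 1.0 if prev != curr else 0.0
        # Each step blends the newest indicator with the decayed history.
        ewma = alpha * flip + (1.0 - alpha) * ewma
    return ewma


# Both histories contain exactly one flip in three transitions (plain rate 1/3),
# but the flip's position changes the weighted score.
recent_flip = ewma_fliprate(["passed", "passed", "passed", "failed"])
old_flip = ewma_fliprate(["passed", "failed", "failed", "failed"])
```

With alpha at 0.5, the history whose flip happened on the latest transition scores 0.5, while the one whose flip happened three runs ago decays to 0.125.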
classify
Classify a test by its flip-rate into a stability tier.
Returns one of: 'Stable', 'Intermittent', 'Flaky', 'Chronic'.
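The tier boundaries listed for TestAnalyzer map onto a simple threshold chain. The sketch below is illustrative only: `classify_fliprate` is a hypothetical helper that takes a flip-rate directly, whereas the real `classify` takes a test ID, and which flip-rate variant it uses internally is not stated here.

```python
def classify_fliprate(fliprate: float) -> str:
    """Map a flip-rate in [0.0, 1.0] to one of the four stability tiers."""
    if fliprate == 0.0:
        return "Stable"
    if fliprate <= 0.2:
        return "Intermittent"
    if fliprate <= 0.5:
        return "Flaky"
    return "Chronic"


# Upper bounds are inclusive, so 0.2 is still Intermittent and 0.5 is still Flaky.
tiers = [classify_fliprate(rate) for rate in (0.0, 0.2, 0.5, 0.7)]
```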
get_all_classifications
Return {test_id: classification} for all known tests.
QuarantineManager
breadcrumb.flaky.quarantine.QuarantineManager
Manages the quarantine list for flaky tests.
Quarantined tests:
- Are still executed, so run data keeps accumulating.
- Should not block CI when they fail (callers are responsible for enforcing this; the manager only tracks quarantine state).
- Are automatically released when their classification improves to Stable or Intermittent.
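Because enforcement is left to the caller, a CI wrapper needs its own gating step. The sketch below shows one possible shape for it; `blocking_failures` and its inputs are hypothetical and not part of the breadcrumb API. In practice the quarantined set would come from something like `get_all_quarantined()`.

```python
def blocking_failures(results: dict[str, str], quarantined: set[str]) -> list[str]:
    """Return the failed or errored test_ids that should still fail the build."""
    return sorted(
        test_id
        for test_id, status in results.items()
        if status in {"failed", "error"} and test_id not in quarantined
    )


# Illustrative data: one quarantined failure, one ordinary failure, one pass.
results = {
    "test_checkout": "failed",
    "test_login": "failed",
    "test_search": "passed",
}
blockers = blocking_failures(results, quarantined={"test_checkout"})
exit_code = 1 if blockers else 0  # only non-quarantined failures affect the exit code
```

Here `test_checkout` still runs and its failure is still recorded, but only `test_login` blocks the build.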
Usage::

    manager = QuarantineManager(store, analyzer)
    manager.quarantine("test_checkout", "auto: Chronic fliprate 0.7")
    if manager.is_quarantined("test_checkout"):
        ...
    report = manager.auto_update()
is_quarantined
Return True if the test is currently quarantined.
quarantine
Add a test to the quarantine list.
unquarantine
Remove a test from the quarantine list.
auto_update
Auto-quarantine Flaky/Chronic tests; release Stable/Intermittent ones.
Only tests with auto_unquarantine=1 are candidates for automatic release.
Returns:
| Type | Description |
|---|---|
| `dict[str, list[str]]` | `{"quarantined": [list of newly quarantined test_ids], "unquarantined": [list of released test_ids]}` |
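The quarantine and release rules described for auto_update can be sketched as a pure function over two mappings. This is an assumption-based illustration, not the method's source: `plan_auto_update` and its argument shapes are hypothetical, and the real method also persists the changes.

```python
def plan_auto_update(
    classifications: dict[str, str],
    quarantined: dict[str, bool],
) -> dict[str, list[str]]:
    """Decide which tests to quarantine or release.

    classifications maps test_id -> tier name.
    quarantined maps currently quarantined test_id -> auto_unquarantine flag.
    """
    # Flaky or Chronic tests that are not yet quarantined get added.
    newly_quarantined = sorted(
        test_id
        for test_id, tier in classifications.items()
        if tier in {"Flaky", "Chronic"} and test_id not in quarantined
    )
    # Only entries flagged for automatic release are candidates for removal.
    released = sorted(
        test_id
        for test_id, auto_release in quarantined.items()
        if auto_release and classifications.get(test_id) in {"Stable", "Intermittent"}
    )
    return {"quarantined": newly_quarantined, "unquarantined": released}


report = plan_auto_update(
    classifications={
        "test_checkout": "Chronic",
        "test_login": "Stable",
        "test_search": "Stable",
    },
    quarantined={"test_login": True, "test_search": False},
)
```

In this example `test_search` stays quarantined even though it is Stable, because its auto_unquarantine flag is off.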
get_all_quarantined
Return the list of all currently quarantined test_ids.