Holistic Progress Tracking.
Before BETA, proof generation, project documentation, and metrics lived in silos. BETA unifies them into a single, reliable pane of glass where real metrics drive decisions. It ingests report.json files from proof runs, scans codebases, manages SQLite histories, and acts as the nervous system for the OMEGA suite.
Authentic Evidence
BETA strictly separates real field/bench evidence from mock/demo data. If a report is flagged as synthetic, it is quarantined and prevented from artificially inflating progress metrics.
Stdlib HTTP Server
A dependency-free Python server (built on http.server / ThreadingTCPServer) serves JSON data and generated HTML from local SQLite workspaces. No web framework, no external runtime dependencies.
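As a rough illustration of the stdlib-only approach, the sketch below serves one JSON feed from a SQLite file using only `http.server`, `sqlite3`, and `json`. The `metrics` table, the `/data.json` path, and the function names are assumptions made for this example, not BETA's actual schema or routes. It uses `ThreadingHTTPServer`, the threaded server class that ships in `http.server`.

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def load_metrics(db_path):
    """Read (name, value) rows from a hypothetical `metrics` table."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name, value FROM metrics ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [{"name": name, "value": value} for name, value in rows]


def make_handler(db_path):
    """Build a request handler class bound to one SQLite workspace file."""

    class DashboardHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Only one illustrative route is served; everything else is a 404.
            if self.path != "/data.json":
                self.send_error(404, "Not found")
                return
            body = json.dumps(load_metrics(db_path)).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # silence per-request logging in this sketch

    return DashboardHandler


def serve(db_path, port=8000):
    """Serve the feed on localhost until interrupted."""
    server = ThreadingHTTPServer(("127.0.0.1", port), make_handler(db_path))
    server.serve_forever()
```

Calling `serve("workspace.db")` (a hypothetical file name) would expose the feed at `http://127.0.0.1:8000/data.json`. Binding to `127.0.0.1` keeps the dashboard local to the machine.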
Ollama Integration
Optionally uses local, privacy-preserving Ollama models (e.g. Llama 3.2, Gemma, Qwen) as an advisory analyst and project manager. The deterministic tracking engine remains the source of truth; AI output is marked advisory and gated by real evidence.
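One way to picture the advisory gating is the sketch below. It calls Ollama's local REST endpoint (`/api/generate` on the default port 11434) with the standard library only, and refuses to ask the model at all when no real evidence exists. The function names, the returned dictionary shape, and the "zero real reports means blocked" rule are assumptions for illustration; BETA's actual gating logic may differ.

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_advisor(model, prompt, real_report_count, timeout=60):
    """Return an advisory record; never consult the model without real evidence."""
    if real_report_count == 0:
        # Assumed gate: with no accepted real reports there is nothing to analyse.
        return {"advisory": True, "status": "blocked_missing_real_data", "text": None}
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        answer = json.loads(resp.read().decode("utf-8"))
    # The output is always labelled advisory, whatever the model says.
    return {
        "advisory": True,
        "status": "ok",
        "model": model,
        "text": answer.get("response"),
    }
```

Every result carries `advisory: True` and the model name, so downstream code can display the suggestion without treating it as measured evidence.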
Evolving the Dashboard.
The CLI Limits
Initially conceived as just the "OMEGA proof generator", the tool was a simple command-line script. When the ecosystem grew to encompass hardware and team coordination, the CLI became an untameable beast.
Generated Web UI
Moving to a local web dashboard was required to visualize the sprawling data. BETA generates static HTML plus a JSON data feed served by a small stdlib server, which let the tool track itself (using BETA on BETA) alongside unrelated hardware projects.
Unified Ingestion Pipeline
Data Ingestion                         Analytics Engine                 Dashboard UI
──────────────                         ────────────────                 ────────────
report.json files  ──┐                 Quarantine & Sanitize            Serve Local Dashboard
Project Scans        │                 Calculate Deltas                 (stdlib http.server)
GitHub issues + CI  ─┼─► SQLite DB ──► Daily Snapshot + Trend ─► JSON ► Action Tracker
AI Sessions        ──┘                 Generate Reports                 Trend Lines
One Command Interface.
A single PowerShell helper (dev.ps1) wraps the Python CLI behind a validated set of verbs, so the day-to-day loop does not require remembering raw module flags. It bootstraps and configures workspaces, ingests external report.json artifacts and project info, records measured evidence and work sessions, runs the test suite, serves the dashboard, and invokes the local AI analyst.
.\dev.ps1 doctor                                      # environment + workspace checks
.\dev.ps1 init-project                                # bootstrap a tracked workspace
.\dev.ps1 evidence-template / record-evidence         # capture measured runs
.\dev.ps1 record-work                                 # log a work session to the ledger
.\dev.ps1 test / run-tests                            # BETA suite, or allowlisted per-project test commands
.\dev.ps1 github-import / import-tests                # pull issues + milestones, CI / test results
.\dev.ps1 snapshot-project                            # persist a dated snapshot for trend lines
.\dev.ps1 refresh / serve                             # rebuild + serve the local dashboard
.\dev.ps1 ai / ask / manage-ai -Model gemma4:latest   # advisory Ollama
Raw python -m beta <command> calls still work; dev.ps1 is a convenience layer over the same engine. The Python package itself ships with zero third-party dependencies.
Metrics, Not Vibes.
The deterministic engine derives a fixed set of tracked metrics from ingested evidence, then consolidates them into a single prioritized backlog. None of these is a vanity number: each metric maps back to a signal in the data.
Evidence Quality
Correctness, effectiveness, efficiency, evidence coverage, backlog health, and regression load, plus per-claim coverage and data-quality checks (baseline availability, repeatability, scenario diversity, sample size).
Run Throughput
Ingest throughput, write p95, query p95, station coverage, bathymetry, mesh routes, portal payload, database bytes per observation, and wire savings — compared run-over-run against the same scenario.
Project Velocity
Evidence velocity, repeat depth, connected-source coverage, active-measurement coverage, and AI availability, alongside PM metrics for runs/scenarios tracked, improved vs regressed metrics, and open P0/P1 work.
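The run-over-run comparison mentioned under Run Throughput can be sketched as follows: each run is compared only against the previous run of the same scenario, and a metric missing on either side is skipped rather than treated as zero. The record layout (`scenario`, `timestamp`, `metrics`) and the metric names in the usage example are hypothetical.

```python
def run_over_run_deltas(runs):
    """Compare each run with the previous run of the same scenario.

    `runs` is a list of dicts with hypothetical keys:
    'scenario' (str), 'timestamp' (sortable), 'metrics' (name -> float or None).
    """
    latest_by_scenario = {}
    deltas = []
    for run in sorted(runs, key=lambda r: r["timestamp"]):
        previous = latest_by_scenario.get(run["scenario"])
        if previous is not None:
            for name, current in run["metrics"].items():
                before = previous["metrics"].get(name)
                # An unmeasured value on either side yields no delta.
                if current is None or before is None:
                    continue
                deltas.append({
                    "scenario": run["scenario"],
                    "metric": name,
                    "previous": before,
                    "current": current,
                    "delta": current - before,
                })
        latest_by_scenario[run["scenario"]] = run
    return deltas


# Made-up example data, for illustration only.
example_runs = [
    {"scenario": "s1", "timestamp": "2024-01-01",
     "metrics": {"write_p95_ms": 12.0, "query_p95_ms": None}},
    {"scenario": "s2", "timestamp": "2024-01-02",
     "metrics": {"write_p95_ms": 50.0}},
    {"scenario": "s1", "timestamp": "2024-01-03",
     "metrics": {"write_p95_ms": 9.0, "query_p95_ms": 4.0}},
]
example_deltas = run_over_run_deltas(example_runs)
```

Whether a negative delta counts as improved or regressed depends on the metric (lower is better for a latency, higher for coverage), so this sketch reports the signed difference and leaves the judgment to a per-metric rule.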
Metrics → Prioritized Backlog
The Plan, Manager, and AI screens consolidate the operating plan, deterministic work guidance, measurement plan, manager risks, missing data sources, metric backlog, and advisory Ollama suggestions into statused action records. Each action carries a source, priority, metric, next step, success signal, and evidence requirement. Actions can be active, blocked by missing real data, gated by manager risks, or flagged as advisory AI items — AI can feed the tracker, but status stays gated by real project evidence.
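A minimal sketch of such an action record is shown below. The field names follow the list above, while the status names, the `"ollama"` source label, the precedence of the checks, and the string-based priority sort are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ActionStatus(Enum):
    ACTIVE = "active"
    BLOCKED_MISSING_DATA = "blocked_missing_data"
    GATED_BY_RISK = "gated_by_risk"
    ADVISORY_AI = "advisory_ai"


@dataclass
class Action:
    source: str                # e.g. "metric_backlog" or "ollama" (hypothetical labels)
    priority: str              # "P0", "P1", ...
    metric: str
    next_step: str
    success_signal: str
    evidence_requirement: str


def resolve_status(action, has_real_evidence, open_manager_risks):
    """Derive status from project evidence first, AI origin last (assumed order)."""
    if not has_real_evidence:
        return ActionStatus.BLOCKED_MISSING_DATA
    if open_manager_risks > 0:
        return ActionStatus.GATED_BY_RISK
    if action.source == "ollama":
        return ActionStatus.ADVISORY_AI
    return ActionStatus.ACTIVE


def prioritize(actions):
    """Order the backlog so P0 comes before P1, and so on."""
    return sorted(actions, key=lambda a: a.priority)
```

Because the evidence check runs first, an AI-sourced suggestion can enter the backlog but cannot become active on its own.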
The Blank-Chart Trade-off.
BETA's whole point is to not lie about progress. That choice has a deliberate, visible cost: until real field or bench evidence is provided, the charts can be entirely blank.
Blank Until Proven
If no accepted real report exists, project pages show Needs data and keep all proof charts empty. An empty chart is treated as the honest answer, not a bug to paper over with placeholder data. Likewise, a software metric with no measurement is shown as not measured — never a misleading 0.0.
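The distinction between "no measurement" and "measured zero" can be kept explicit by representing a missing value as `None` all the way to the display layer. The helper names and the report keys (`accepted`, `value`) below are hypothetical; only the `Needs data` and `not measured` labels come from the text above.

```python
def format_metric(value, unit=""):
    """Render a metric; a missing measurement is never shown as 0.0."""
    if value is None:
        return "not measured"
    return f"{value:.1f}{unit}"


def chart_points(reports):
    """Return chartable values from accepted real reports only.

    An empty list means the chart stays blank.
    """
    return [r["value"] for r in reports
            if r.get("accepted") and r.get("value") is not None]


def page_banner(reports):
    """Show 'Needs data' when no accepted real report exists, else nothing."""
    return "Needs data" if not any(r.get("accepted") for r in reports) else ""
```

Here `format_metric(None)` yields `not measured`, while a genuinely measured `0.0` still renders as `0.0`.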
Marker Set
Reports carrying demo, synthetic, simulated, fixture, mock, dummy, fake, example, placeholder, template, or draft markers, as well as the OMEGA-specific local-proof and coastal-demo tags, are quarantined before metrics run; smoke is treated as a soft marker. Quarantined reports stay visible as audit inputs but never drive charts, scores, or AI analysis.
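A marker-based quarantine pass could look like the sketch below. The marker names are taken from the list above. Everything else is assumed for illustration: that markers live in a `tags` list on each report, that matching is case-insensitive, and that a soft marker produces a warning without quarantining the report.

```python
HARD_MARKERS = frozenset({
    "demo", "synthetic", "simulated", "fixture", "mock", "dummy", "fake",
    "example", "placeholder", "template", "draft",
    "local-proof", "coastal-demo",  # OMEGA-specific tags
})
SOFT_MARKERS = frozenset({"smoke"})


def classify(report):
    """Return (status, hard_hits, soft_hits) for one report dict."""
    tags = {str(tag).lower() for tag in report.get("tags", [])}
    hard_hits = sorted(tags & HARD_MARKERS)
    soft_hits = sorted(tags & SOFT_MARKERS)
    status = "quarantined" if hard_hits else "real"
    return status, hard_hits, soft_hits


def partition(reports):
    """Split reports into (real, quarantined); quarantined ones are kept for audit."""
    real, quarantined = [], []
    for report in reports:
        status, _, _ = classify(report)
        (quarantined if status == "quarantined" else real).append(report)
    return real, quarantined
```

Only the `real` list would feed charts, scores, and AI analysis; the `quarantined` list is retained so the audit view can still show what was excluded and why.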
Data & Model Lineage
Every dashboard JSON and exported report records the BETA version, analysis schema, data-version ID, project config versions, real vs quarantined report counts, and the Ollama model used for advisory review — so you can tell which model and which data shaped a result without treating model output as proof.
Tracking Its Own Drift
When dashboards or reports are generated, BETA appends compact workspace/project records to version-history.json and compares the latest scoped record against the previous one, calling out material changes such as a different AI model, new real evidence, fewer quarantined reports, or a changed config version. BETA also persists a dated project snapshot each day, so every tracked metric carries a trend line over time rather than just a latest value. Each tracked project (for example omega and beta itself) gets its own scoped page and data feed; the workspace dashboard is only a project selector and does not combine metrics across projects.
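The append-and-compare step can be sketched as follows. The file layout (a JSON list of records) and the record keys (`scope`, `ai_model`, `real_reports`, `quarantined_reports`, `config_version`) are assumptions for this example; only the file name and the kinds of change being flagged come from the text above.

```python
import json
from pathlib import Path


def append_record(history_path, record):
    """Append one compact record to a JSON list on disk and return the full list."""
    path = Path(history_path)
    history = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    history.append(record)
    path.write_text(json.dumps(history, indent=2), encoding="utf-8")
    return history


def material_changes(history, scope):
    """Compare the latest record for `scope` against the one before it."""
    scoped = [r for r in history if r.get("scope") == scope]
    if len(scoped) < 2:
        return []  # nothing to compare against yet
    previous, latest = scoped[-2], scoped[-1]
    changes = []
    if latest.get("ai_model") != previous.get("ai_model"):
        changes.append(
            f"AI model changed: {previous.get('ai_model')} -> {latest.get('ai_model')}"
        )
    if latest.get("real_reports", 0) > previous.get("real_reports", 0):
        changes.append("new real evidence")
    if latest.get("quarantined_reports", 0) < previous.get("quarantined_reports", 0):
        changes.append("fewer quarantined reports")
    if latest.get("config_version") != previous.get("config_version"):
        changes.append("config version changed")
    return changes
```

Filtering by `scope` before comparing keeps each project's drift separate, in line with the rule that metrics are never combined across projects.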