Phase 0 · Foundation Assessment — BEA-27 Beats GM AI Engine
The permanent base layer (FAD) for all future work. Everything downstream — the 30 cross-domain patterns, the three architectures (A · Scoped Digest Router, B · Fusion & Per-Owner Compilation, C · Pre-Launch Radar + ACH), the Monte Carlo (A 68, B 61, C 31), the pre-mortem kill criteria, and the SPLIT-but-converging validation — is anchored here. The single load-bearing fault line carried through every phase is the ground-truth oracle: a reliable, owned way to later confirm whether a surfaced call was right ("did the launch actually happen on date X? was the sentiment inflection real?"). It is defined here in plain language so it is not first introduced as a metaphor in Phase 3 (Mills, Tier-3 validator, flagged exactly this).
1. Success criteria (measurable, time-bound)
- [ ] Adoption precondition (the real go/no-go). ≥ 5 named Beats product owners (GMs) pre-commit, in writing, a specific still-open decision (a launch-timing, pricing, or counter-positioning call not yet frozen) they will route through the engine — captured before any build begins, and ≥ 5 of them remain active (at least one decision routed) at the 6-month mark. Metric: count of pre-committed owners with a named open decision; target ≥ 5 by end of Phase 0 / decision-gate, ≥ 5 still active by month 6. If this is not met, every architecture is premature (Lindqvist + Real converged on this from opposite directions).
- [ ] Decision-attribution, not engagement. At least 3 distinct GM decisions are demonstrably influenced by an engine-surfaced item within 6 months of a shipped architecture, evidenced by a monthly 3-question attribution survey ("did this change anything you did?") plus a logged decision-trace — not open-rate or alert-volume. Metric: ≥ 3 attributed decisions by month 6; zero attributed decisions = innovation theater = kill that architecture.
- [ ] Oracle backfill feasibility (gates B; pre-conditions C). Within the first 90 days of any B/C kickoff, demonstrate that ≥ 2 years of resolved historical events can be backfilled to seed the source-credibility ledger (Venkataraman's re-frame: accumulating resolved events is the slow constraint, not standing up a source — the consumer domain prints only ~6–20 resolvable events per competitor per year). Metric: count of backfillable resolved events per priority competitor; ≥ 2 years assembled, or B is re-classified as a multi-year program rather than an 8-month build.
- [ ] Legal/CIL clearance as a hard gate (all architectures). A Critical Information List (enumerating exactly what is collected and why) and per-source ToS-for-monitoring clearance are signed off by Apple/Beats brand & competitive-conduct counsel before any collection or repackaging runs — including for A, whose reuse of licensed social/news feeds into a standing "competitor monitoring" surface may breach use terms (Ferreira's veto applies to repackaging, not only new collection). Metric: CIL approved + ToS-for-monitoring clearance on 100% of in-scope sources before go-live; any unresolved veto on a core source = no-go for that architecture.
- [ ] Calibration before confidence (gates any surfaced confidence band). No numeric confidence digit is surfaced to a GM until it is backed by a Brier-style calibration score against resolved outcomes; until then, only source-count is shown. Metric: zero uncalibrated numeric bands shipped to GMs; calibration curve published before the first confidence number goes live.
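The calibration gate in the last criterion can be sketched as follows. This is a minimal illustration, assuming a Brier score computed over resolved binary outcomes. The names `SurfacedItem` and `render_for_gm`, and the thresholds `min_resolved` and `max_brier`, are hypothetical placeholders and not part of the engine's design.

```python
from dataclasses import dataclass
from typing import Sequence


def brier_score(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared gap between forecast probabilities and resolved 0/1 outcomes.

    0.0 is perfect; always answering 0.5 scores 0.25.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


@dataclass
class SurfacedItem:
    headline: str
    source_count: int
    raw_confidence: float  # model probability; hidden until calibrated


def render_for_gm(
    item: SurfacedItem,
    forecasts: Sequence[float],
    outcomes: Sequence[int],
    min_resolved: int = 30,   # assumption: placeholder, to be agreed with the sponsor
    max_brier: float = 0.20,  # assumption: placeholder, to be agreed with the sponsor
) -> str:
    """Show a confidence number only once calibration evidence exists."""
    calibrated = (
        len(outcomes) >= min_resolved
        and brier_score(forecasts, outcomes) <= max_brier
    )
    if calibrated:
        return (
            f"{item.headline} (sources: {item.source_count}, "
            f"confidence: {item.raw_confidence:.0%})"
        )
    # Until calibrated, only the source count is shown.
    return f"{item.headline} (sources: {item.source_count})"
```

With no resolved outcomes on file, `render_for_gm` falls back to the source-count-only form, which matches the "zero uncalibrated numeric bands" metric.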
2. Decision-maker profile (Three Ledgers)
- Public ledger (what we pitch): A "Per-product-owner AI intelligence layer" that gives every Beats GM a scoped, trustworthy, decision-shaped view of competitor launches, sentiment, partnerships, and market movement — so no GM is surprised by a competitor move they could have seen coming. Framed as modernization and competitive readiness.
- Shadow ledger (what the decision-maker actually optimizes for): The owner/sponsor (Buku) is optimizing for a visible, defensible "we have an AI competitive-intelligence capability" win that survives exec scrutiny and a finance-skeptic (Marcus Real archetype) review — without owning a multi-year science project, a legal/brand incident, or a tool that demos well and changes zero decisions. The shadow incentive favors shipping something credible and cheap that can be honestly described, and quietly fears the "AI Engine" branding inviting scrutiny it can't survive (Phase 3.5 row 6).
- True ledger (what will actually get built): On the evidence, the realistic build is Architecture A (Scoped Digest Router) deployed deliberately as a cheap, resilient adoption-and-oracle probe — instrumented from day one for decision-attribution and for backfilling resolved events — with B built on top only once the oracle is proven assemblable, and C confined to a separately-budgeted, kill-fast corpus-feasibility + certification-filings spike, never a committed program. The honest near-term artifact is closer to "an internal Feedly-equivalent we control, instrumented to learn whether anyone's decision changes," not a predictive engine.
- Owner / sponsor: Buku (per manifest owner).
3. Constraint inventory
| Constraint | Type (hard / soft / assumed) | Notes |
|---|---|---|
| Apple/Beats brand & competitive-conduct legal review (CIL + per-source ToS-for-monitoring) is a gating artifact, not advisory — for ALL architectures including A | hard | Ferreira (Tier-2 counsel) is a veto-holder. Repackaging licensed social/news feeds into a "monitoring" surface can breach use terms regardless of pipe ownership (Phase 3.5 row 12). C presumptively starts in the no-go column. |
| Ground-truth oracle: outcomes resolve only as fast as reality emits them (~6–20 events per competitor per year, often ambiguous) | hard | The deepest systemic single point of failure (Phases 2.5→4.5). Cannot be manufactured faster than reality supplies it; denies B/C calibration on the promised timeline. |
| Use existing Apple-internal model gateway (Claude via internal gateway) and already-licensed feeds; no new external data contracts on A's critical path | hard | Phase 3 A reuses governed contracts to keep procurement/legal off the critical path. New sources (B/C) re-open procurement + legal. |
| Headcount is contended Apple/Beats engineering talent; ML/data specialists are the scarcest, most-contended resource | hard | A ≈ 1 eng + 0.4 PM; B ≈ 2.5 eng (1 ML) + 1 PM; C ≈ 3.5 eng + 1 PM + recurring legal. Lower-ceiling tools lose the staffing-priority fight (Phase 3.5 row 7). |
| Single-tenant internal tool — no external revenue, no cross-customer network effect | hard | Value is internal cost-avoidance + decision-quality only (Phase 4). "Data network effect" exists only within Beats; must not be overclaimed (Phase 2.5). |
| GM launch/pricing/counter-positioning windows are frozen 2–3 quarters out, limiting the actionability of lead time | assumed (untested, high-stakes) | Lindqvist (the user archetype) flagged this as the row she'd circle in red (Phase 3.5 row 35). If lead time is not actionable, B/C's core value never lands. Must be validated with owners up front. |
| Engine success must be measured on calibration / decision-attribution, not alert volume | soft (organizational, contestable) | The calibration-vs-volume incentive landmine (Phase 2.5/3.5 rows 17 and 29). A correctly-quiet engine reads as "broken" to execs expecting a busy dashboard; the KPI must be contracted with the sponsor up front. |
| Competitor OPSEC: Apple-tier audio peers (Sony, Bose, Samsung) run real product OPSEC; most C indicators are suppressible | hard (for C) | Okafor (Tier-1): only legally-mandated certification filings (FCC, Bluetooth SIG, CE) are non-suppressible. Outside that subset, C is structurally a confident-garbage generator. |
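The oracle constraint is easy to underestimate, so a rough count helps. The sketch below multiplies the stated rate of roughly 6 to 20 resolvable events per competitor per year by a backfill window. The function name and the three-competitor example (Sony, Bose, and Samsung, as named in the OPSEC row) are illustrative assumptions, not a committed coverage universe.

```python
from typing import Tuple

# Low/high estimate of resolvable events per competitor per year,
# taken from the ground-truth oracle row of the constraint inventory.
EVENTS_PER_COMPETITOR_PER_YEAR = (6, 20)


def resolved_event_range(years: int, competitors: int) -> Tuple[int, int]:
    """Return the (low, high) count of resolved events a backfill could yield."""
    low, high = EVENTS_PER_COMPETITOR_PER_YEAR
    return low * years * competitors, high * years * competitors


print(resolved_event_range(years=2, competitors=1))  # (12, 40)
print(resolved_event_range(years=2, competitors=3))  # (36, 120)
```

Even a full two-year backfill yields at most a few dozen resolved events per competitor, many of them ambiguous, which is why calibration for B and C cannot be rushed.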
4. Scope boundary
- In scope: A per-product-owner intelligence layer over four signal domains — launches, sentiment, partnerships, and market — for the ~10–30 Beats GM cohort. The PIR/coverage-universe scoping gate, cadence-tiered dissemination (interrupt / digest / brief), perishability stamping, and (for B) credibility-weighted corroboration with confidence as a first-class field and per-owner recompilation. Decision-attribution instrumentation and oracle-backfill feasibility are in scope from day one. A separately-gated, kill-fast corpus-feasibility + certification-filings spike is the only in-scope form of C.
- Out of scope: Any external/commercial productization or multi-tenant SaaS; live anticipatory competitor predictions outside the certification-filing subset and outside a long shadow-mode period; any numeric confidence band not backed by a Brier calibration score; collection on any source not cleared by the Critical Information List + ToS-for-monitoring review; the "negative-space cancelled-launch" read as anything but advisory-only; treating C's nominal +$2.1M future value as bookable rather than ~8%-probability-gated.
- Recursion budget acknowledged: max 3 per phase. (Manifest recursionCount = 0; Phase 5 explicitly recommended NO full-phase recursion — objections were structural tightenings of existing kill criteria, not paradigm breaks.)
5. Assumption log
| # | Assumption | Confidence (0–1) | Validated? |
|---|---|---|---|
| 1 | The real GM pain is signal-gap / actionable-lead-time, NOT aggregation overload — and ≥ 5 owners will pre-commit a named open decision | 0.35 | No — Phase 1 never validated which pain is real; the explicit go/no-go gate. Lindqvist + Real both doubt it. |
| 2 | A ground-truth oracle can be assembled by backfilling ≥ 2 years of resolved events, not just standing up a source going forward | 0.40 | No — Venkataraman re-framed this as the binding constraint; the original 90-day "stand up a source" criterion tested the easy half. |
| 3 | Existing licensed feeds cover ≥ 70% of signals owners care about (partnership intel especially is often single-source or absent) | 0.45 | No — Phase 3 flagged untested; Phase 3.5 rows 3 & 16 expect partnership coverage gaps. |
| 4 | Beats/Apple legal will clear a Critical Information List + per-source ToS-for-monitoring use for A and B; C clears only the certification-filing subset | 0.40 | No — Ferreira presumptively vetoes C broadly and warns A's repackaging is not automatically clean. |
| 5 | Lead time on a competitor move is actionable given GM windows frozen 2–3 quarters out | 0.30 | No — Lindqvist's red-circle objection (Phase 3.5 row 35); unvalidated and load-bearing for B/C value. |
| 6 | The ≥ 3 independent source types per signal class needed for corroboration genuinely exist (esp. for partnerships) | 0.45 | No — Phase 3 B assumption; Phase 3.5 row 16 expects some classes to stall in staging forever. |
| 7 | A genuinely independent ACH/red-cell pass can be built (separate model lineage + evidence framing), not one LLM grading itself | 0.50 | No — Okafor warns the naive single-model implementation produces a narrow, fake confidence band. |
| 8 | The sponsor will contract on a calibration / decision-attribution KPI and resist demanding alert-volume | 0.45 | No — the calibration-vs-volume landmine (Phase 2.5/3.5); an organizational-physics risk, not technical. |
Gate: success criteria defined (5 measurable, time-bound), constraints mapped (8 rows), decision-maker identified (Buku, three ledgers), assumptions logged with confidence (8 rows) → gate met for 0-foundation.
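Keeping the assumption log as data makes it straightforward to see which assumptions to validate first. A minimal sketch, using the confidence values from the section 5 table; the 0.40 cutoff, the short labels, and the function name `weakest_first` are illustrative choices, not part of the assessment.

```python
# (assumption number, short label, confidence) copied from the assumption log
ASSUMPTIONS = [
    (1, "signal-gap is the real pain; >= 5 owners pre-commit", 0.35),
    (2, "oracle can be backfilled with >= 2 years of events", 0.40),
    (3, "licensed feeds cover >= 70% of signals", 0.45),
    (4, "legal clears CIL + ToS-for-monitoring for A and B", 0.40),
    (5, "lead time is actionable despite frozen windows", 0.30),
    (6, ">= 3 independent source types per signal class exist", 0.45),
    (7, "independent ACH/red-cell pass can be built", 0.50),
    (8, "sponsor contracts on calibration/attribution KPI", 0.45),
]


def weakest_first(assumptions, below=0.40):
    """Return assumptions with confidence strictly under `below`, lowest first."""
    flagged = [a for a in assumptions if a[2] < below]
    return sorted(flagged, key=lambda a: a[2])


for number, label, confidence in weakest_first(ASSUMPTIONS):
    print(f"#{number} ({confidence:.2f}): {label}")
```

With the hypothetical 0.40 cutoff this surfaces assumptions 5 and 1, the two that both Lindqvist and Real doubt and that the go/no-go gate depends on.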