Phase 0 · Foundation Assessment — BEA-01 Pill Character Creative Engine
Establish the foundation before any analysis. The permanent base layer (FAD) for all future work.
Retroactive note: This project was already built and in progress at 6-execute (Mikey LoRA trained on 150 reference images across 3 colorways; Modal H100 + ComfyUI) when this dossier was written. Phase 0 is being reconstructed after the fact to establish the analytical backing that was skipped. Where this assessment surfaces a gap between what should have been validated and what was actually shipped, that gap is flagged explicitly rather than smoothed over; that is the point of the retroactive exercise.
1. Success criteria (measurable, time-bound)
- [ ] Brand-consistent output rate: ≥80% of generated Mikey images pass internal Beats brand/creative review without manual retouching, measured across a 50-image audit set per colorway (Black, Red, one TBD third colorway), within 30 days of inference pipeline going live to the creative team.
- [ ] Turnaround speed vs. baseline: Median time from creative brief to first usable Mikey asset drops from the current illustrator-dependent baseline (assume ~2–5 business days per bespoke character render) to under 30 minutes of operator time per asset, measured over the first 20 production briefs.
- [ ] Cost per asset: Fully-loaded cost per generated asset (Modal H100 GPU-seconds + operator time, excluding the one-time LoRA training cost) lands under $5/asset at a steady-state volume of ≥50 assets/month, tracked monthly for the first quarter of operation.
- [ ] Colorway fidelity: Each of the 3 trained colorways reproduces its target Pill hardware color within an agreed perceptual tolerance (ΔE ≤ 5 against the canonical brand hex/Pantone) in ≥90% of outputs, verified on a per-colorway test prompt set before any asset ships externally.
- [ ] Adoption: At least 3 distinct Beats creative/marketing workflows (e.g. social, retail, deck assets) have used the engine to ship a real deliverable within 60 days of launch — proving it is a tool people reach for, not a demo.
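
The colorway-fidelity criterion above can be checked mechanically once a ΔE formula is agreed. The sketch below is one way to do that, assuming the CIE76 formula (Euclidean distance in CIELAB) and sRGB inputs under a D65 white point. The criterion itself does not name a formula, and CIEDE2000 would give different numbers. The function names are hypothetical, and sampling a representative color from the rendered Pill body is left out of scope.

```python
import math

# D65 reference white (2-degree observer), used to normalize XYZ before Lab.
_D65 = (0.95047, 1.00000, 1.08883)


def hex_to_rgb(hex_color: str) -> tuple:
    """Parse '#RRGGBB' into sRGB components scaled to [0, 1]."""
    h = hex_color.lstrip("#")
    if len(h) != 6:
        raise ValueError(f"expected #RRGGBB, got {hex_color!r}")
    return tuple(int(h[i:i + 2], 16) / 255.0 for i in (0, 2, 4))


def _linearize(c: float) -> float:
    """Undo the sRGB transfer curve for one channel."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _lab_f(t: float) -> float:
    """Piecewise cube-root function from the CIELAB definition."""
    delta = 6 / 29
    return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29


def rgb_to_lab(rgb: tuple) -> tuple:
    """Convert sRGB in [0, 1] to CIELAB (L*, a*, b*) under D65."""
    r, g, b = (_linearize(c) for c in rgb)
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    fx, fy, fz = (_lab_f(v / w) for v, w in zip((x, y, z), _D65))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def delta_e76(hex_a: str, hex_b: str) -> float:
    """CIE76 color difference between two sRGB hex colors."""
    return math.dist(rgb_to_lab(hex_to_rgb(hex_a)), rgb_to_lab(hex_to_rgb(hex_b)))


def colorway_pass_rate(target_hex: str, sampled_hexes: list, tolerance: float = 5.0) -> float:
    """Fraction of sampled output colors within `tolerance` ΔE of the target."""
    if not sampled_hexes:
        raise ValueError("need at least one sampled color")
    passed = sum(delta_e76(target_hex, s) <= tolerance for s in sampled_hexes)
    return passed / len(sampled_hexes)
```

As a usage illustration, `colorway_pass_rate("#FF0000", ["#FF0000", "#FE0101", "#00FF00"])` counts the first two samples as within tolerance and the third as out. A colorway would meet the criterion when this rate is at or above 0.90 on its test prompt set. The hex values here are placeholders, not Beats brand colors.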
2. Decision-maker profile (Three Ledgers)
- Public ledger (what we pitch): A purpose-built generative engine that produces on-brand "Mikey" Pill character art on demand across all official colorways, cutting illustration turnaround from days to minutes and freeing the creative team from repetitive bespoke renders.
- Shadow ledger (what the decision-maker actually optimizes for): Demonstrating that the Beats creative org can stand up a credible, in-house AI generation capability, both to keep proprietary character IP inside Apple/Beats walls (never sent to a third-party model API) and to show internal momentum on AI initiatives. Speed-to-a-working-demo and the "we built our own LoRA" narrative value likely outweighed rigorous cost/quality validation, which is why the project reached 6-execute with no Phase 0–5 backing.
- True ledger (what will actually get built / has been built): A ComfyUI-based inference pipeline running a custom Mikey LoRA on Modal-hosted H100s, trained on 150 reference images spanning 3 colorways. Proprietary, Beats-only, not reusable beyond Beats. Operated by a small creative/technical team rather than self-serve by all designers. The real artifact is a working but unaudited generation tool, not yet a measured, quality-gated production system.
- Owner / sponsor: Alex Bowring (manifest owner). Accountable for both the creative-quality bar (is this "Mikey" enough to ship under the Beats brand?) and the operational call on whether the Modal/ComfyUI stack is worth maintaining.
3. Constraint inventory
| Constraint | Type (hard / soft / assumed) | Notes |
|---|---|---|
| Mikey character IP and the 150 reference images must remain inside Apple/Beats infrastructure; no proprietary assets sent to external/public model APIs | hard | Drives the self-hosted Modal + ComfyUI architecture instead of a hosted SaaS image model. Non-negotiable for Beats-proprietary brand assets. |
| Training corpus is exactly 150 reference images across 3 colorways (~50 per colorway) | hard (as-built) / soft (could expand) | Small for a character LoRA; bounds achievable fidelity and pose/scene diversity. Expanding the corpus is possible but costs new art commissions and retraining. |
| Inference runs on Modal-hosted H100 GPUs | soft | Chosen for on-demand H100 access without owning hardware; cost scales with usage. Could move to other GPU hosts or local hardware if economics shift. |
| Only 3 colorways are trained | soft (assumed to cover near-term need) | Beats ships more than 3 Pill colorways over time; each new colorway likely needs new reference art + retraining or a conditioning scheme. |
| Output must clear Beats brand / creative review before external use | hard | A generative tool cannot self-certify brand compliance; human sign-off is a gate, so review overhead must be added to any "minutes per asset" speed claim. |
| Apple/Beats internal AI governance and legal review for generative imagery | assumed | Assumed to be satisfied or in progress; not documented in any prior phase. A real constraint that was never validated — a live risk given the project already ships. |
| Operated by a small specialist team, not self-serve to all designers | soft | ComfyUI workflows require operator skill; broad self-serve would need a wrapped UI (not yet built). |
| One-time LoRA training cost (compute + 150-image curation/commission) already sunk | hard (sunk) | Should be excluded from go-forward per-asset economics but inflates total project cost and biases toward "keep using it." |
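
The go-forward economics implied by the last row (sunk training cost excluded, GPU-seconds plus operator time included) can be sketched as a small cost model. This is a minimal illustration, not a measured cost model: every field name is hypothetical, and the rates in the usage note are placeholders rather than actual Modal pricing or Beats labor rates. The `images_per_usable_asset` factor is an added assumption that accounts for candidates generated and discarded before one asset is selected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostInputs:
    """Per-asset cost drivers; all values must come from real measurement."""
    gpu_seconds_per_image: float       # H100 seconds to render one candidate
    gpu_usd_per_second: float          # billed GPU rate (placeholder, not a quote)
    images_per_usable_asset: float     # candidates generated per asset kept
    operator_minutes_per_asset: float  # hands-on time to brief, generate, select
    operator_usd_per_hour: float       # fully-loaded operator rate


def cost_per_usable_asset(c: CostInputs) -> float:
    """Go-forward cost of one usable asset, excluding one-time training cost."""
    values = (c.gpu_seconds_per_image, c.gpu_usd_per_second,
              c.images_per_usable_asset, c.operator_minutes_per_asset,
              c.operator_usd_per_hour)
    if any(v < 0 for v in values):
        raise ValueError("cost inputs must be non-negative")
    gpu_cost = c.gpu_seconds_per_image * c.gpu_usd_per_second * c.images_per_usable_asset
    operator_cost = c.operator_minutes_per_asset / 60.0 * c.operator_usd_per_hour
    return gpu_cost + operator_cost


def meets_target(c: CostInputs, target_usd: float = 5.0) -> bool:
    """True when the per-asset cost is under the success-criterion target."""
    return cost_per_usable_asset(c) < target_usd
```

With placeholder inputs such as `CostInputs(60.0, 0.001, 4.0, 30.0, 60.0)`, the operator term dominates the GPU term by a wide margin. That suggests the operator-time figure is worth measuring first when the cost criterion is tested, though the real ratio depends entirely on measured values.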
4. Scope boundary
- In scope: A Mikey-character image generation pipeline (LoRA + ComfyUI + Modal H100 inference) producing on-brand renders in the 3 trained colorways; operator workflow to brief, generate, and select assets; a brand-review handoff for outputs before external use; per-asset cost and quality measurement.
- Out of scope: Generating non-Mikey Beats characters or products (Pill hardware photography, Solo/Studio headphones, etc.); fully self-serve generation for all designers with a polished UI; video/motion generation; colorways beyond the 3 trained; any reuse outside Beats (project is explicitly Beats-proprietary, not a shared SkaFld asset); replacing human creative direction or final brand sign-off.
- Recursion budget acknowledged: max 3 per phase.
5. Assumption log
| # | Assumption | Confidence (0–1) | Validated? |
|---|---|---|---|
| 1 | 150 reference images is enough to train a LoRA that reliably renders a consistent, on-brand Mikey across poses and scenes | 0.45 | No — never tested against a quality bar; this is the single biggest unvalidated risk and should have been a Phase 4/5 gate |
| 2 | Modal H100 on-demand inference is cheaper at expected volume than continuing illustrator-rendered bespoke assets | 0.5 | No — no cost model exists; per-asset economics (criterion 3) are asserted, not measured |
| 3 | Self-hosting (vs. a hosted image API) is required because the character IP and reference art cannot leave Apple/Beats infrastructure | 0.85 | Partially — strongly implied by "proprietary" flag and Apple norms, but no documented governance/legal confirmation |
| 4 | The 3 trained colorways cover the near-term creative demand | 0.55 | No — assumed; Beats' actual colorway roadmap not cross-checked against the trained set |
| 5 | Generated outputs will pass Beats brand/creative review at a high enough rate to actually save time (net of review + retouch overhead) | 0.4 | No — no audit set, no measured pass rate; the time-savings thesis rests entirely on this |
| 6 | A small specialist team operating ComfyUI is an acceptable delivery model (vs. needing self-serve tooling to get adoption) | 0.5 | No — adoption (criterion 5) is unproven; operator bottleneck could cap usage |
| 7 | Apple/Beats AI governance permits production use of generative character imagery for brand assets | 0.6 | No — assumed in scope; a hard blocker if wrong, and the project is already executing |
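
Assumption 5 and the brand-consistency criterion both rest on a pass rate measured over a 50-image audit set per colorway. A sample that small carries real uncertainty, which the sketch below makes visible using a Wilson score interval. Choosing the Wilson interval and a 95% level (z = 1.96) is an assumption made here for illustration; the dossier does not specify a statistical method.

```python
import math


def wilson_interval(passes: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial pass rate.

    passes: images that cleared brand review
    n:      images in the audit set
    z:      normal quantile (1.96 is roughly a 95% interval)
    """
    if n <= 0 or not 0 <= passes <= n:
        raise ValueError("need n > 0 and 0 <= passes <= n")
    p = passes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (center - half_width, center + half_width)


def clears_threshold(passes: int, n: int, threshold: float = 0.80) -> bool:
    """Point-estimate check: observed pass rate at or above the threshold."""
    if n <= 0:
        raise ValueError("need n > 0")
    return passes / n >= threshold
```

For an audit where exactly 40 of 50 images pass, `clears_threshold(40, 50)` is true, while `wilson_interval(40, 50)` spans roughly 0.67 to 0.89. A result sitting right at the 80% bar is therefore consistent with a true pass rate well below it, which is worth keeping in mind when reading a single 50-image audit.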
Foundation assessment conclusion: The build is real and technically plausible, but it advanced to 6-execute on the strength of the shadow ledger (in-house AI capability, IP containment, demo momentum) without ever validating the two claims the public ledger depends on: LoRA fidelity from a 150-image corpus (#1) and favorable per-asset economics (#2, which in turn depends on the review pass rate in #5). This is the signature of a prematurely shipped project: the constraints that make it feasible (IP containment) are well-honored, while the criteria that make it worth it (quality pass rate, cost, adoption) are unmeasured. Subsequent phases should treat criteria 1, 3, and 5 as the load-bearing tests.
Gate: success criteria defined, constraints mapped, decision-maker identified → set phaseGates.0-foundation = passed in manifest.json.
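
The gate update can be scripted so that it is applied consistently. The sketch below assumes that manifest.json holds a top-level `phaseGates` object keyed by phase name, which is inferred from the dotted path above and not confirmed by any schema. The helper name `set_phase_gate` is hypothetical.

```python
import json
from pathlib import Path


def set_phase_gate(manifest_path: str, phase: str, status: str) -> dict:
    """Set phaseGates[phase] = status in the manifest and write it back.

    Assumes the manifest is a JSON object; creates `phaseGates` if missing
    and leaves every other key untouched.
    """
    path = Path(manifest_path)
    manifest = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(manifest, dict):
        raise ValueError("manifest root must be a JSON object")
    gates = manifest.setdefault("phaseGates", {})
    gates[phase] = status
    path.write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")
    return manifest
```

Calling `set_phase_gate("manifest.json", "0-foundation", "passed")` would record the gate described above, under the assumed layout.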