Phase 0 · Foundation Assessment — BEA-09 Amazon Bulk Image Upload
Establish the foundation before any analysis: this document is the permanent base layer (FAD) for all future work. It was written after Phases 1–5 to anchor the decision-gate record; every criterion, constraint, and assumption below is cross-referenced to the downstream dossiers it later governed.
One-liner (manifest): Batch rename/validate product images to ASIN naming convention; feeds the NIS package. · Category: Production Automation · Build type: backend-automation · Owner: Jaclyn Kreidstein.
1. Success criteria (measurable, time-bound)
- [ ] Baseline instrumented before any build. Capture errors-per-batch and operator-minutes-per-batch on the current manual file-first flow across at least 3 (preferably 4) production batches within the first 2 weeks of execution, producing a single documented "displaceable pain" figure that validates or refutes the ~$31k/yr synthetic estimate from Phase 4. This is the master gate: per the Phase 4 innovation-theater check and Phase 5 consensus, no value claim is falsifiable until this number exists. Metric: a written baseline (mean reject/mislabel rate %, mean operator-minutes/batch) signed off by the owner.
- [ ] NIS package contract pinned and contract-tested before go-live. Within the first 2 weeks, pin the NIS package-format version and stand up an automated contract test against a stored NIS schema fixture that passes on every emitted package. Metric: contract test exists in CI and is green; NIS format version recorded in the manifest header (ref KILL-ALL-NIS, Phase 3.5 #6/#27; Iglesias S2, Phase 5).
- [ ] Architecture A in production with measured reject-rate reduction. Ship the manifest-anchored local rename engine (deterministic rename + manifest-vs-disk pre-flight diff + quarantine fan-out + day-one sidecar state flags + one truth-gate step) and demonstrate, within 90 days of go-live, a downstream NIS/Amazon reject/mislabel rate at or below half the measured baseline, on ≥6 production batches. Metric: post-launch reject rate ≤ 50% of baseline, sustained over 6 batches.
- [ ] Operability go-live gate satisfied. Before production cutover, the tool lives in a shared repo with a README, unit + smoke tests, and ≥2 trained operators (bus-factor ≥ 2). Metric: repo + CI + named second operator confirmed (ref DOWNGRADE-ON-BUS-FACTOR, Phase 3.5 #7/#18; Aluko, Phase 5). Timeframe: a precondition of go-live, not a follow-up.
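
The baseline and reject-rate criteria above reduce to a small amount of arithmetic. A minimal sketch follows, assuming each batch is logged as an image count, a reject/mislabel count, and operator minutes. The record fields and function names are hypothetical, and reading "sustained over 6 batches" as a mean over at least 6 post-launch batches is an assumption (a stricter per-batch reading is equally possible).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class BatchRecord:
    """One observed production batch (field names are illustrative)."""
    images: int              # images submitted in the batch
    rejects: int             # images rejected or mislabeled downstream
    operator_minutes: float  # hands-on operator time for the batch


def summarize(batches: list[BatchRecord]) -> dict[str, float]:
    """Return mean per-batch reject rate (%) and mean operator-minutes per batch."""
    if not batches:
        raise ValueError("need at least one batch")
    return {
        "reject_rate_pct": mean(100.0 * b.rejects / b.images for b in batches),
        "operator_minutes": mean(b.operator_minutes for b in batches),
    }


def reject_gate_passes(
    baseline: list[BatchRecord],
    post_launch: list[BatchRecord],
    min_batches: int = 6,    # ">= 6 production batches" from the criterion
    max_ratio: float = 0.5,  # "<= 50% of baseline"
) -> bool:
    """True when enough post-launch batches exist and their mean reject
    rate is at or below max_ratio times the baseline mean (assumed reading)."""
    if len(post_launch) < min_batches:
        return False
    base = summarize(baseline)["reject_rate_pct"]
    post = summarize(post_launch)["reject_rate_pct"]
    return post <= max_ratio * base
```

With no baseline recorded, `summarize` raises rather than returning a number, which mirrors the master-gate intent that nothing downstream is evaluated until the baseline exists.
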
2. Decision-maker profile (Three Ledgers)
- Public ledger (what we pitch): An automation that removes the slow, error-prone manual rename step — product images are batch-renamed to the ASIN convention, validated against a manifest before anything moves, broken files quarantined instead of jamming the batch, and a clean package handed to NIS. Pitched as risk reduction (fewer mislabeled listings) plus recovered operator time.
- Shadow ledger (what the decision-maker actually optimizes for): Speed-to-NIS and not being the cause of a visible listing defect. Per the Phase 2.5 incentive note and Phase 3.5 #3/#11, Jaclyn's team is rewarded for delivery throughput, not for gate rigor — so any conformance gate that is not the path of least resistance will be routed around under deadline. The decision-maker also optimizes to avoid funding a tool whose ROI cannot be demonstrated to a sponsor (the quiet-defunding fear, #11/#32).
- True ledger (what will actually get built): Architecture A only — a local/CLI deterministic rename engine with a manifest pre-flight diff, quarantine, day-one sidecar state flags (S1), and one cheap truth-gate (preferably Amazon ProcessingReport ingest, S3). C contributes a reusable checksum-sealing module and a narrow post-ASIN export-preset, not its transformative name-at-origin north-star. B's perceptual spine is not built absent two validation spikes (SP-API access + recycled-image frequency). This is the validated Phase 4/4.5/5 posture, not the full three-architecture ambition.
- Owner / sponsor: Jaclyn Kreidstein (manifest owner). Note (Phase 5, Borowski): the studio-operations team whose behavior Architecture C would depend on is a separate, unconsulted, unincentivized party — a key reason C is reclassified down rather than treated as a sponsor-backed initiative.
3. Constraint inventory
| Constraint | Type (hard / soft / assumed) | Notes |
|---|---|---|
| ASIN filename naming convention is externally defined by Amazon and can change without notice | hard | The regex/grammar must match Amazon's spec; a drift makes the tool reject valid or pass invalid names (Phase 3.5 #9). Must be externalized to config, not hard-coded. |
| NIS package format is an unowned external consumer contract | hard | BEA-09 controls neither the format nor its change cadence; pinning + contract test is a phase-zero blocker for all architectures (Phase 3.5 #6/#27; Phase 5 S2 / KILL-ALL-NIS). |
| SP-API / Amazon Catalog access requires external security/legal review and a selling-partner relationship | hard (for B) / assumed | Phase 4 modeled P≈0.55; Phase 5 (Venkataraman) revised to ~0.35, and likely limited to the Catalog Items read API — gates Architecture B's entire premise (#12). |
| ASIN is not always known at image-capture time | hard (for C) | Studio estimate (Borowski, Phase 5): 30–40% of shoots are pre-ASIN — worse than the Phase 4 P≈0.50 coin-flip. Structurally invalidates C as a primary path (#23). |
| Apple/Beats data-handling & asset-governance policy on catalog snapshots and any image-hash registry | hard | A perceptual-hash provenance ledger crossing into cross-team data governance can be frozen (#20); the local catalog snapshot is the low-friction, debuggable choice (Iglesias, Phase 5) and keeps data residency trivial. |
| Headcount / ownership: defensive internal tools chronically get bus-factor 1 and no funded second owner | hard (organizational) | Every architecture orphans differently — A=script, B=service, C=process (#7/#18/#28). Drives the ≥2-operator go-live gate. |
| Engineering effort budget ~2–10 engineer-weeks at ~$3k/eng-wk | soft | A≈3 wk / ~$13k; B≈8 wk / ~$33k; C≈4 wk code + ~$18k org-change / ~$33k (Phase 4). Budget is negotiable but A is the only option that clears a positive year-0 net. |
| Batch volume assumed hundreds–low-thousands of images, ~40 batches/yr | assumed | Phase 4 baseline; a 10x expansion blows A's single-machine ceiling and forces sharding/queued-service (#5, Phase 4.5). Unvalidated until baseline is measured. |
| Operators edit files out-of-band between runs | assumed (validated as likely by Phase 5) | Breaks naive pure-function idempotency; Reinhardt (S1) requires sidecar state flags in A's day-one scope, not as future work (#10). |
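
The out-of-band-edit constraint in the last row is the reason sidecar state flags are in day-one scope. One possible shape, sketched under assumptions: each image gets a small JSON sidecar recording its state and a SHA-256 of its bytes at the time the tool last touched it, so a later run can tell an untouched file from one an operator edited in between. The sidecar suffix, the state names, and the function names are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash the file in 1 MiB chunks so large images are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sidecar_path(image: Path) -> Path:
    """Hypothetical convention: IMG_001.jpg -> IMG_001.jpg.state.json."""
    return image.with_name(image.name + ".state.json")


def seal(image: Path, state: str) -> None:
    """Record the state flag and the current content hash next to the image."""
    record = {"state": state, "sha256": sha256_of(image)}
    sidecar_path(image).write_text(json.dumps(record), encoding="utf-8")


def check(image: Path) -> str:
    """Return 'unseen' (no sidecar), 'unchanged', or 'edited' (hash differs)."""
    sidecar = sidecar_path(image)
    if not sidecar.exists():
        return "unseen"
    record = json.loads(sidecar.read_text(encoding="utf-8"))
    return "unchanged" if record["sha256"] == sha256_of(image) else "edited"
```

A rename run would call `check` before acting and route `edited` files to review instead of assuming that a previously processed file is still the same file.
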
4. Scope boundary
- In scope: Architecture A end-to-end — human/spreadsheet-authored manifest, manifest-vs-disk pre-flight diff as the operator's first screen, deterministic idempotent rename to a staging area with a hard regex naming gate, quarantine fan-out for unresolvable/fatal/off-spec files, decoupled NIS package assembly from clean staging, day-one sidecar state flags (S1), and one cheap truth-gate step (S3) — ProcessingReport ingest preferred over a human thumbnail-vs-catalog visual confirm. Also in scope as additive: C's checksum-sealing module (the one piece that survives every C failure mode, Phase 4.5) and a narrow post-ASIN export-preset that auto-stamps an already-known ASIN at export with zero typing (the only piece of C the studio would adopt, Borowski). Pre-build: the baseline-measurement and NIS-contract-test gates.
- Out of scope: Architecture B's perceptual-content integrity spine (pHash registry, SP-API-authored manifest, job-queue service, version-pin hard-fail) — held until SP-API access is secured and recycled-image incident frequency is proven non-trivial. Architecture C's transformative name-at-origin north-star and any mandate to change photographer/vendor capture workflows (reclassified as theater per Borowski's pre-ASIN reality). Any live Amazon API in A's critical path. Cross-team asset-provenance ledger. Auto-applied pixel corrections (resize/recolor) that ship without human approval (#16).
- Recursion budget acknowledged: max 3 per phase. Current recursionCount = 0; Phase 5 reached strong consensus with no full-phase recursion, and objections were absorbed as scope amendments carried into the Decision Gate.
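
The in-scope pre-flight step (manifest-vs-disk diff, regex naming gate, quarantine fan-out) can be sketched as a pure function that classifies every manifest row before any file moves. This is an illustration under assumptions, not the Architecture A implementation: the manifest is modeled as a source-name to target-name mapping, the report fields are hypothetical, and the pattern shown is only a placeholder that would be loaded from config and must be checked against Amazon's current spec.

```python
import re
from collections import Counter
from dataclasses import dataclass, field

# Placeholder grammar, NOT Amazon's authoritative spec; externalize to config.
ILLUSTRATIVE_PATTERN = r"[A-Z0-9]{10}\.(MAIN|PT\d{2})\.(jpg|png|tif)"


@dataclass
class PreflightReport:
    ready: dict[str, str] = field(default_factory=dict)       # safe to rename
    missing_on_disk: list[str] = field(default_factory=list)  # in manifest, no file
    off_spec: dict[str, str] = field(default_factory=dict)    # target fails naming gate
    colliding: dict[str, str] = field(default_factory=dict)   # target claimed twice
    unlisted_on_disk: list[str] = field(default_factory=list) # file with no manifest row


def preflight(manifest: dict[str, str], files_on_disk: set[str], pattern: str) -> PreflightReport:
    """Classify every manifest row and every stray file without touching disk."""
    gate = re.compile(pattern)
    target_counts = Counter(manifest.values())
    report = PreflightReport()
    for source, target in sorted(manifest.items()):
        if source not in files_on_disk:
            report.missing_on_disk.append(source)
        elif gate.fullmatch(target) is None:
            report.off_spec[source] = target
        elif target_counts[target] > 1:
            report.colliding[source] = target
        else:
            report.ready[source] = target
    report.unlisted_on_disk = sorted(set(files_on_disk) - set(manifest))
    return report
```

Only `ready` entries would proceed to staging; every other bucket maps to a quarantine reason the operator sees on the first screen. Because the function takes names rather than paths, it can be unit-tested without a filesystem. Note that this gate checks form only, so a well-formed but wrong ASIN still passes, which is why the truth-gate step remains in scope.
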
5. Assumption log
| # | Assumption | Confidence (0–1) | Validated? |
|---|---|---|---|
| 1 | The current manual flow's displaceable pain is meaningful (~$31k/yr synthetic midpoint), not trivial | 0.45 | No — this is the master unvalidated assumption; baseline measurement is criterion #1 (Garrett/Marcus, Phase 5). |
| 2 | The team can produce a reasonably accurate source→ASIN manifest spreadsheet | 0.70 | Partially — assumed in Phase 3 Architecture A; the format gate validates form, not truth, so a typo still ships well-formed (#1). |
| 3 | A locally-maintained catalog snapshot is fresh enough that stale/re-issued ASINs are rare | 0.55 | No — staleness is a silent-error risk (#2); needs an export-date stamp + age assertion. |
| 4 | The NIS package format is stable and documentable enough to pin and contract-test | 0.60 | No — explicitly the unowned-dependency landmine (Iglesias, S2); must be confirmed in the first 2 weeks. |
| 5 | Batch volume stays in the hundreds–low-thousands; single-machine local run is sufficient | 0.65 | No — depends on baseline measurement; 10x expansion forces re-architecture (#5). |
| 6 | Operators will edit files out-of-band between runs, breaking naive idempotency | 0.75 | Yes (qualitatively) — Reinhardt confirms operators always edit out-of-band; drives day-one sidecar flags (S1). |
| 7 | SP-API/Catalog access for an internal Beats entity will be granted with adequate scope | 0.35 | No — revised down from Phase 4's 0.55 by Venkataraman; gates Architecture B (#12). |
| 8 | A material fraction (30–40%) of shoots occur before an ASIN is assigned | 0.70 | Partially — studio-ops estimate (Borowski); invalidates C-as-primary, needs a formal shoot survey to confirm (#23). |
| 9 | The recycled/byte-duplicate image error class occurs often enough to justify a perceptual spine | 0.30 | No — Venkataraman: the dominant real defect is wrong-variant/colorway mapping, which pHash handles worse; must validate frequency before building B (#15). |
| 10 | An organization that funds the build will also fund a second owner / docs / tests for month-13 maintenance | 0.40 | No — defensive internal tools historically orphan (Aluko); forces the ≥2-operator go-live gate (#7/#18). |
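
Assumption 3 calls for an export-date stamp plus an age assertion on the local catalog snapshot. A minimal sketch of that assertion, assuming the snapshot carries its export date and that a maximum age in days is configured; the 7-day default, the exception name, and the function name are hypothetical.

```python
from datetime import date


class StaleSnapshotError(RuntimeError):
    """Raised when the catalog snapshot is older than the allowed age."""


def assert_snapshot_fresh(export_date: date, today: date, max_age_days: int = 7) -> int:
    """Return the snapshot age in days, or raise if it is stale or future-dated."""
    age = (today - export_date).days
    if age < 0:
        raise ValueError(f"snapshot export date {export_date} is in the future")
    if age > max_age_days:
        raise StaleSnapshotError(
            f"snapshot is {age} days old; limit is {max_age_days} days"
        )
    return age
```

Failing loudly here converts the silent-error risk into a visible stop at the start of a run.
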
Gate: success criteria defined, constraints mapped, decision-maker identified → set phaseGates.0-foundation = passed in manifest.json.
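
The gate update itself is a one-field write. A sketch, assuming `manifest.json` is a JSON object in which `phaseGates` is a nested object keyed by phase name; that layout is inferred from the dotted path above and is not confirmed by this document, and the function name is hypothetical.

```python
import json
from pathlib import Path


def mark_gate_passed(manifest_path: Path, gate: str = "0-foundation") -> dict:
    """Set phaseGates[gate] = "passed" and write the manifest back in place."""
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    # Create the phaseGates object if absent; leave all other keys untouched.
    manifest.setdefault("phaseGates", {})[gate] = "passed"
    manifest_path.write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")
    return manifest
```

Usage would be `mark_gate_passed(Path("manifest.json"))`, run only after the criteria, constraints, and decision-maker sections above are complete.
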