01 Weeks 1–3

Value stream map of S&OP

Mapped twelve queue points in the demand planning cycle. Forecast accuracy baseline measured per SKU category.

02 Weeks 4–10

Forecasting model pilot

Built and instrumented against real SKU history. MAPE measured weekly per category. Planners review every Monday.

03 Weeks 11–14

Tier huddle around S&OP

Tier 2 cadence on the weekly planning board. Demand signal and supply response linked in the same room.

04 Beyond 14

Sustain

Planners run the forecast every Monday morning. MAPE cut by 11 points; $40M of working capital freed. Quarterly advisory only.

The challenge

Warehouse aisle. The downstream effect of every forecast miss.

A top-10 US CPG manufacturer was losing $40M+ in working capital to forecast error that its SAP IBP statistical model could not fix. At SKU-store-week granularity the forecast missed actual demand by 24% on average, week after week, and the error followed structural patterns the existing model never captured.

The manufacturer ran demand planning on SAP IBP across about 1,200 SKUs distributed to 50,000 retail outlets. Forecasts were generated at SKU-store-week. MAPE (mean absolute percentage error) at that granularity sat at 24 points: averaged across SKU-store-week pairs, the forecast missed actual demand by 24%, week after week. The error was not random; it had structural patterns that the existing statistical forecast inside IBP was not equipped to capture.
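For reference, MAPE at this granularity can be computed along the lines sketched below. This is a minimal illustration, not the client's measurement code: the row keys (`category`, `actual`, `forecast`), the sample values, and the choice to skip zero-demand weeks are all assumptions.

```python
from collections import defaultdict


def mape(pairs):
    """Mean absolute percentage error, in percent, over (actual, forecast) pairs.

    Pairs with zero actual demand are skipped because the percentage error
    is undefined there. That is a simplification made for this sketch.
    """
    errors = [abs(a - f) / abs(a) for a, f in pairs if a != 0]
    if not errors:
        raise ValueError("no pairs with non-zero actual demand")
    return 100.0 * sum(errors) / len(errors)


def mape_by_category(rows):
    """Group SKU-store-week rows by category and compute MAPE per group.

    rows: iterable of dicts with hypothetical keys
          'category', 'actual', 'forecast' (one row per SKU-store-week).
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["category"]].append((row["actual"], row["forecast"]))
    return {category: mape(pairs) for category, pairs in grouped.items()}


# Made-up sample rows, one per SKU-store-week.
rows = [
    {"category": "snacks", "actual": 100, "forecast": 80},      # 20% miss
    {"category": "snacks", "actual": 50, "forecast": 64},       # 28% miss
    {"category": "beverages", "actual": 200, "forecast": 230},  # 15% miss
]
result = mape_by_category(rows)  # snacks averages to about 24, beverages to about 15
```

Averaging the per-pair percentage errors, rather than dividing total error by total volume, is what makes low-volume SKU-store-week pairs count as much as high-volume ones.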

The cost was visible in the balance sheet. Safety stock was elevated across the network to cover the error. Working capital tied up in inventory ran $180M above the network-optimised level. Service-level penalties from key retailers averaged $14M a year. Promotional execution suffered because the forecast couldn't reliably separate baseline demand from promotional lift; the planners were running promo math in spreadsheets on the side. The audit committee had noticed the working-capital line in the prior annual report.

A prior consulting engagement had produced an "AI demand model" delivered as a Jupyter notebook. It ran on a workstation. It was not connected to IBP, not deployed to production, and had no monitoring. The data scientist who built it had taken another job. The notebook had been quietly abandoned by the time we walked in.

This CPG demand planning AI case study describes how, in 16 weeks, we productionised a forecasting service that cut MAPE by 11 points at SKU-store-week granularity, integrated with SAP IBP through the standard forecast-import interface, and delivered $40M+ in annualised working-capital impact, independently measured by the client's finance team.

The constraints

The build sat inside an enterprise planning environment with real audit and integration obligations.

  • SAP IBP as the system of record. Planners worked in IBP. Supply, procurement, and finance read forecasts from IBP. The AI model could augment IBP but could not replace it. Integration ran through the standard forecast-import interface, with no custom middleware.
  • SOX control environment. Demand forecasts feed inventory valuation and revenue planning, both SOX-relevant. Model changes, version promotion, and forecast generation runs had to be auditable. The internal controls team had to sign off on the change-governance design.
  • 24 months of clean training data; more history was available, but only in dirty form. Promotional flags were inconsistent across business units. Store-cluster definitions changed twice in the window. Multi-cultural holiday calendars were missing for two key store clusters. Feature engineering had to repair the data, not just consume it.
  • Planner trust deficit from the prior notebook. The data science team had burned credibility with the demand planning organisation. Any new model had to earn adoption through workflow fit, not through accuracy claims in a slide deck.
  • Data engineering team takes ownership at week 16. The client's data engineering team would operate the service after handoff. Architecture decisions had to fit their existing platform (Databricks plus Snowflake), their existing on-call rotation, and their existing deployment pipeline.
  • No new capital spend. The budget was operating expense. We had to use the client's existing platforms, existing licences, and existing data lake.

Our approach

Production-grade from week one. We declined to build a notebook. The model would run as a production service, integrate with IBP via the standard forecast-import interface, run on a defined schedule, and have monitoring. We wrote that architecture decision on a whiteboard in the kickoff and didn't move off it. No modelling work began until the runtime, deployment, and IBP interface were settled. It is the same architecture-first discipline we apply in regulated-industry builds, such as the bank fraud investigation copilot case study where MRM was the constraint; here the constraint was SOX.

Granularity that matched the unit of decision. The client's prior model forecast at SKU-region-week. Inventory is not held at SKU-region-week. It is held at SKU-store-week. We rebuilt at the unit of decision. The accuracy improvement was substantial just from that change, before any feature engineering.

Feature engineering over architecture sophistication. Forty-plus engineered features: hierarchical (brand, category, store cluster, region, season), promotional flags with the correct lead and lag windows, weather signals at store-region level pulled from NOAA, holiday calendars at multi-cultural granularity (the client had been missing Lunar New Year and Diwali lift in two of its store clusters for years), and competitor promotional intelligence where the client had it. The model architecture (LightGBM) is well understood. The features are where the win came from. Two features alone (the multi-cultural holiday calendar and the NOAA weather pull) accounted for about 4 of the 11 MAPE points.
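To make the promotional-window and holiday-calendar features concrete, here is a minimal sketch of how such flags might be derived for one SKU-store-week. It is not the engagement's feature library: the function names, the one-week lead and two-week lag, the `cluster_a` label, and the sample promotion week are all hypothetical.

```python
from datetime import date, timedelta


def week_start(d):
    """Monday of the week containing date d."""
    return d - timedelta(days=d.weekday())


def promo_window_features(week, promo_weeks, lead=1, lag=2):
    """Promotion flags for one SKU-store-week.

    week        : Monday of the week being forecast
    promo_weeks : set of Mondays on which a promotion runs
    lead, lag   : window sizes in weeks (illustrative values only)
    """
    on_promo = week in promo_weeks
    # Lead window: a promotion starts within the next `lead` weeks.
    pre_promo = any(week + timedelta(weeks=k) in promo_weeks
                    for k in range(1, lead + 1))
    # Lag window: a promotion ran within the previous `lag` weeks.
    post_promo = any(week - timedelta(weeks=k) in promo_weeks
                     for k in range(1, lag + 1))
    return {
        "on_promo": int(on_promo),
        "pre_promo": int(pre_promo and not on_promo),
        "post_promo": int(post_promo and not on_promo),
    }


def holiday_features(week, cluster, calendar):
    """Holiday-week flags; calendar maps store cluster -> {holiday name: date}."""
    return {
        f"is_{name}_week": int(week_start(day) == week)
        for name, day in calendar.get(cluster, {}).items()
    }


# Hypothetical inputs: one promotion week and one cluster-specific holiday.
promo_weeks = {date(2024, 2, 12)}
calendar = {"cluster_a": {"lunar_new_year": date(2024, 2, 10)}}

before = promo_window_features(date(2024, 2, 5), promo_weeks)
after = promo_window_features(date(2024, 2, 26), promo_weeks)
holiday = holiday_features(date(2024, 2, 5), "cluster_a", calendar)
other_cluster = holiday_features(date(2024, 2, 5), "cluster_b", calendar)
```

Keying the calendar by store cluster is the point of the sketch: a cluster with no entry gets no holiday flags at all, which is roughly the situation the missing Lunar New Year and Diwali calendars had created.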

Planner workflow integration as a paired workstream. A demand model that improves accuracy but doesn't change planner behaviour captures none of the value. We worked with the demand planning organisation to redesign the weekly cadence: what planners review, what they override, what they trust the model on. The cadence redesign was a separate workstream paired with the technical build, not an afterthought. This is where the Lean Consulting discipline shows up in an AI engagement: the management system around the model matters as much as the model.

SOX governance built into the deployment pipeline. Controlled promotion between dev, staging, and production. Signed approvals on every model version. Full audit log of forecast generation runs. The internal controls team reviewed the design in week eight and accepted it in week ten. By the time finance signed off on the working-capital methodology in week fourteen, the internal controls team had already approved the model-change governance.
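A promotion gate of this kind can be sketched in a few lines. The version below is an illustration under stated assumptions, not the client's pipeline: the approver roles, the in-memory `audit_log`, and every name here are hypothetical, and a recorded approval stands in for a real signed one.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone

STAGES = ["dev", "staging", "production"]
# Hypothetical approver roles required before a version may enter each stage.
REQUIRED_APPROVERS = {
    "staging": {"model_owner"},
    "production": {"model_owner", "internal_controls"},
}

audit_log = []  # in-memory for the sketch; a real log would be durable and append-only


@dataclass
class ModelVersion:
    version: str
    artifact_bytes: bytes
    stage: str = "dev"
    approvals: dict = field(default_factory=dict)  # target stage -> set of roles

    @property
    def artifact_sha256(self):
        # Ties each audit entry to the exact artefact that was approved.
        return hashlib.sha256(self.artifact_bytes).hexdigest()


def _record(event, model, **details):
    audit_log.append(json.dumps({
        "event": event,
        "version": model.version,
        "artifact_sha256": model.artifact_sha256,
        "at": datetime.now(timezone.utc).isoformat(),
        **details,
    }, sort_keys=True))


def approve(model, target_stage, role, approver):
    model.approvals.setdefault(target_stage, set()).add(role)
    _record("approval", model, target_stage=target_stage, role=role, approver=approver)


def promote(model):
    """Move a version one stage forward, or refuse and log the refusal."""
    idx = STAGES.index(model.stage)
    if idx == len(STAGES) - 1:
        raise ValueError("already in production")
    target = STAGES[idx + 1]
    missing = REQUIRED_APPROVERS[target] - model.approvals.get(target, set())
    if missing:
        _record("promotion_rejected", model, target_stage=target, missing=sorted(missing))
        raise PermissionError(f"missing approvals for {target}: {sorted(missing)}")
    model.stage = target
    _record("promotion", model, target_stage=target)


model = ModelVersion("1.4.0", b"serialized-model-bytes")
approve(model, "staging", "model_owner", "planner.lead")
promote(model)  # dev -> staging
try:
    promote(model)  # refused: no production approvals yet
    rejected = False
except PermissionError:
    rejected = True
```

Logging the rejected attempt as well as the successful promotions matters for audit purposes: the log then shows that the gate actually blocked something, not only that versions passed through it.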

What we delivered

A production demand forecasting service that:

  • Generates SKU-store-week forecasts on a weekly schedule
  • Integrates with SAP IBP via the standard forecast-import interface
  • Provides confidence intervals at the forecast level
  • Surfaces feature attributions so planners understand why a forecast moved
  • Logs every forecast generation with model version, feature data, and resulting predictions for audit and back-test purposes
  • Monitors forecast accuracy continuously and flags drift to the planning organisation

The planner workflow piece, including the redesigned weekly cadence, the override protocol, and the management dashboards, was a separate workstream paired with the technical build.

Plus the SOX change-governance pattern: documented promotion gates, signed approvals on every model version, audit logs reviewed by the internal controls team quarterly. The pattern was accepted by the external auditor in the next annual cycle without findings.

Plus the operational runbook for the data engineering team taking ownership. Named on-call rotation, escalation tree for forecast generation failures, retraining cadence (quarterly), drift-alert thresholds with named owners for response.

The result

Metric | Baseline | After 6 months production | Change
MAPE at SKU-store-week | 24 pts | 13 pts | −11 pts
Safety stock requirement (network) | $180M above optimum | $140M above optimum | −$40M
Service-level penalties (annualized) | $14M | $9.1M | −$4.9M
Planner override rate | n/a (no model) | 12% (declining) | adoption healthy
Total annualized working-capital impact | n/a | n/a | $40M+

The financial impact (working capital plus service-level penalty plus obsolescence reduction) was measured by the client's finance team independently of Rockmere. The CFO presented the number to the board in the quarter following the 6-month mark. The audit committee, which had noticed the working-capital line a year earlier, noted the reduction in the next annual review.

Engagement timeline

Week | Workstream
Weeks 1–2 | Architecture decision, data audit, baseline MAPE measurement against the existing IBP forecast. Found the granularity gap on day three. Internal controls team briefed on SOX implications of model change governance.
Weeks 3–6 | Feature engineering and model training. Two engineered features (multi-cultural holiday calendar and NOAA weather) accounted for about 4 of the 11 MAPE points on their own.
Weeks 7–9 | IBP integration, production deployment pipeline, monitoring. The IBP forecast-import interface was older than expected; a planner who had run IBP for nine years walked us through the format quirks in an afternoon.
Weeks 10–13 | Planner workflow redesign with the demand planning organisation. Watched four planners run their weekly cadence. Found that two of them were already overriding the statistical forecast in a spreadsheet they kept hidden from the planning lead. We made that workflow legitimate. SOX change-governance design accepted by internal controls team.
Weeks 14–16 | Hardening, planner training, operational handoff to the client's data engineering team. Finance signed off on the working-capital measurement methodology.

What survived past our engagement

Five artefacts now belong to the client.

  1. The production model service. Operated by the client's data engineering team on the existing Databricks/Snowflake platform. Retrained quarterly. Drift alerts wired to on-call.
  2. The feature engineering library. Now the client's internal demand-modelling standard. Reused on the trade promotion optimisation engagement nine months later.
  3. The planner workflow patterns. Weekly cadence, override protocol, management dashboards. Documented and used to onboard new planners.
  4. The SOX change-governance pattern. Documented promotion gates, signed approvals, audit logs. Reused on subsequent AI initiatives across the supply chain organisation.
  5. A named owner with budget. The VP of Supply Chain Planning owns the service. The data engineering team has a named tech lead. Retraining spend sits in the supply chain operating budget.

A second engagement followed nine months later for a similar build on the client's trade promotion optimisation workflow, drawing on the same feature engineering library and the same SOX change-governance pattern. The credentials that let us deliver to this depth (named senior practitioners, IBP and Databricks production experience) are detailed on our credentials page.

Where this fits

This engagement is canonical for our Manufacturing practice. The model and integration work was AI Transformation; the planner workflow redesign drew on Lean Consulting for the daily and weekly cadence patterns.

The same management-system-around-the-model pattern shows up in our Tier-1 automotive supplier OEE case study, where Lean huddles were how the AI predictions actually changed behaviour. The same regulator-first build discipline shows up in our State Medicaid eligibility AI case study, where NIST AI RMF was the gating regime instead of SOX.

If you're running demand planning on SAP IBP, Anaplan, Blue Yonder, or o9 and your forecast accuracy is limiting working capital efficiency, get in touch. We can usually estimate the achievable MAPE improvement from a 30-minute data overview call.