Single-model fragility
One model fits a steady state and breaks the moment the world stops being steady. Tariffs, geopolitics, supply shocks, demand spikes — they’re not noise to be smoothed; they’re the test the model has to pass.
Most enterprise forecasting still runs on spreadsheets or single-model pipelines that miss every regime change. We build tiered forecasting systems — classical baselines, time-series foundation models for the hard tail, and LLM-driven narration on top — so your operators have forecasts they can defend and replan around.
Forecast error in 2026 usually isn’t a model problem. It’s a regime-change problem, an exogenous-variable problem, and a missing-layer-between-forecasters-and-operators problem. Built right, AI-powered forecasting can reduce error by 20–50%. Built wrong, it adds a confident hallucination on top of an already-broken plan.
Weather, macro indicators, promotions, news, search and social signals — these are first-class inputs in 2026 and they’re missing from most production stacks. The forecast that doesn’t see them is forecasting last year.
LLM numeric prediction is still a weakness, and it is still being papered over. Frontier LLMs narrate forecasts beautifully and predict numeric values poorly. Putting them under the forecast instead of on top of it is a 2024 mistake people are still making.
A number on a dashboard isn’t a forecast that earns its keep. Without integration to S&OP, capacity, replenishment, and finance systems, the lift never lands in the operating result.
A tiered ensemble that uses the right tool for each series — and puts the LLM where it actually adds value, not where it doesn’t.
Three production forecasting patterns — each engineered for a different shape of planning work.
S&OP, replenishment, allocation, assortment planning. The classical use case, modernized — tiered ensemble across the SKU base, exogenous variables treated as first-class, and the integration into ERP and planning systems that turns the lift into shipped inventory.
Cash flow, revenue, expense, headcount, capacity. Built with regime-change resilience — the model holds when the macro environment moves, and the operator can see why the forecast changed.
Workforce, energy load, fraud, credit loss, claims volume. Domain-specific architectures where the cost of a miss is denominated in headcount, capital, or customer impact — not in a dashboard variance.
The model is one piece. These are the layers that make a forecasting system trustworthy in production.
Continuous model comparison per series. Automatic switchover when a challenger demonstrably wins. No once-a-year model bake-off; the right model for each series is a running decision, not a project.
First-class feature engineering for external signals — weather, macro, promotions, calendar effects, search and news embeddings. Lineage and versioning so a forecast can be replayed against the inputs it actually saw.
LLM-driven explanation, anomaly detection, what-if reasoning, analyst Q&A. The forecast becomes legible to the operator without anyone pretending the LLM produced it.
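One way to picture the lineage and versioning layer described above is a fingerprint stored next to every forecast, so a replay can prove it is using the inputs the forecast actually saw. This is a minimal sketch using only the Python standard library; the function names, record fields, and feature keys are hypothetical illustrations, not a description of any particular product.

```python
import hashlib
import json

def input_fingerprint(features):
    """Stable SHA-256 digest of the exact feature values fed to a forecast.

    Keys are sorted so that dictionary ordering does not change the digest.
    """
    canonical = json.dumps(features, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def record_forecast(series_id, forecast, features, model_version):
    """Bundle a forecast with the inputs and model version that produced it."""
    return {
        "series_id": series_id,
        "model_version": model_version,
        "forecast": forecast,
        "features": features,
        "input_hash": input_fingerprint(features),
    }

def can_replay(record, features):
    """True only if `features` match the inputs the forecast actually saw."""
    return input_fingerprint(features) == record["input_hash"]

# Hypothetical exogenous inputs for one SKU-week
features = {"temp_c": 21.5, "promo": 1, "holiday": 0}
record = record_forecast("sku-001", 130.0, features, model_version="v3")
```

Storing the digest rather than trusting a "latest" feature table means a later backfill or correction to the weather or promotion data shows up as a mismatch instead of silently changing the replayed result.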
The strongest 2026 use cases share a shape: high-stakes planning over volatile, exogenous-driven series — where a miss is denominated in inventory, capital, or capacity, not in a dashboard variance.
SKU-level forecasting across the long tail. Up to 65% stockout reduction reported with modern tiered approaches; the lift compounds when exogenous signals are wired in properly.
Continuous re-forecasting replacing the monthly cycle. The plan moves at the speed of the market, not at the speed of the planning cadence.
Revenue, cash flow, expense, headcount. Regime-aware models that hold under macro shifts, with narration the CFO can take into the board meeting.
TSFMs now measurably outperform on imbalance prediction. Exogenous signals (weather, calendar, generation mix) are treated as first-class inputs, not afterthoughts.
Adding exogenous variables (temperature, illness seasonality, mobility) has been measured at +34% accuracy in published 2026 studies. The forecast finally meets the operating reality.
Insurance reserves, credit losses, fraud volume, claims frequency. Domain-tuned ensembles where the cost of being wrong is denominated in capital.
No, and this is the most important question on the page. Frontier LLMs are still poor numeric predictors — they narrate beautifully and forecast badly. The 2026 consensus, including from the teams that build the frontier models, is that LLMs belong on top of a forecast for explanation, scenario reasoning, and analyst Q&A. They do not belong under it as the numeric engine.
Rarely. TSFMs (Chronos-2, TimeGPT, TimesFM, MOIRAI-2) earn their place on cold-start, zero-shot, and the long tail of sparse SKUs — but tuned classical and gradient-boosted models still win on data-rich, well-behaved series. The 2026 consensus is complementary, not substitutive. We pick per series and run champion-challenger continuously.
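The per-series champion-challenger decision mentioned above can be sketched roughly as follows. This is an illustrative sketch only: the error metric (MAE), the 5% switchover margin, and the model names are assumptions chosen for the example, and a production loop would use a proper rolling backtest rather than a single short window.

```python
from statistics import mean

def mae(actuals, forecasts):
    """Mean absolute error over paired actuals and forecasts."""
    return mean(abs(a - f) for a, f in zip(actuals, forecasts))

def pick_champion(actuals, forecasts_by_model, champion, min_gain=0.05):
    """Return the model that should be champion for one series.

    The current champion is replaced only when the best model's MAE on the
    recent window beats it by at least `min_gain` (relative). The margin
    keeps near-equal models from swapping back and forth.
    """
    errors = {name: mae(actuals, f) for name, f in forecasts_by_model.items()}
    best = min(errors, key=errors.get)
    if best != champion and errors[best] < errors[champion] * (1 - min_gain):
        return best
    return champion

# Hypothetical recent window for one series and two candidate models
actuals = [100, 120, 90, 110]
forecasts = {
    "seasonal_naive": [90, 100, 100, 100],
    "gbm": [98, 118, 93, 108],
}
new_champion = pick_champion(actuals, forecasts, champion="seasonal_naive")
```

Running this check per series on every evaluation cycle is what turns model selection into a running decision: a TSFM, a classical baseline, and a gradient-boosted model can all sit in `forecasts_by_model`, and each series keeps whichever one is currently earning its place.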
Through ensemble diversity, exogenous variables, and an evaluation loop that watches for regime shifts. A single model assumes the world stays the same; a tiered ensemble with first-class exogenous inputs catches the shift earlier and re-bases faster. Tariffs, supply shocks, demand jumps — these are the test, not the noise.
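As one illustration of an evaluation loop that watches for regime shifts, the sketch below runs a two-sided CUSUM over standardized forecast residuals. CUSUM is a standard change-detection technique chosen here for the example; the text above does not name a specific detector, and the `slack` and `threshold` values are assumptions that would need tuning per series.

```python
def cusum_alarm(residuals, slack=0.5, threshold=5.0):
    """Return the index of the first regime-shift alarm, or None.

    `residuals` are forecast errors (actual minus forecast) divided by their
    normal-period standard deviation. Errors smaller than `slack` are
    ignored; persistent errors in one direction accumulate until the running
    sum crosses `threshold`.
    """
    up = down = 0.0
    for i, r in enumerate(residuals):
        up = max(0.0, up + r - slack)      # sustained under-forecasting
        down = max(0.0, down - r - slack)  # sustained over-forecasting
        if up > threshold or down > threshold:
            return i
    return None

# Hypothetical residuals: a stable period, then a sustained demand jump
stable = [0.2, -0.3, 0.1, -0.1, 0.4, -0.2]
shifted = stable + [2.0] * 6
alarm_index = cusum_alarm(shifted)
```

An alarm like this would typically trigger a re-base: shortening the training window, re-running the champion-challenger comparison, and flagging the series for the narration layer so the operator sees why the forecast moved.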
As a first-class engineering deliverable, not an export. We integrate directly with SAP IBP, Oracle, o9, Blue Yonder, Kinaxis Maestro, Anaplan PlanIQ, or your custom planning stack — with the forecast, the inputs it used, and the explanation surfaced in the operator’s existing workflow. A forecast that doesn’t reach the plan doesn’t earn its keep.
Yes. The code, the features, the model registry, the evaluation harness, the champion-challenger logic, the integrations — all yours, in your repos. We design every engagement so your team can operate, retrain, and extend the system independently. Enablement is built into the project, not sold back to you on a retainer.
That is the job of the narration layer, and it is engineered, not an afterthought. The LLM on top of the forecast produces a defensible explanation — what changed, which drivers moved, which exogenous signals contributed, and what the counterfactual looks like under different assumptions. Operators get a forecast they can take into a planning meeting, not a number with no story.
Demand on a long tail of SKUs, financial forecasting under volatility, energy load with new exogenous signals, claims volume under regime shift. We benchmark your current approach against a modern tiered architecture, prove out the right model class on your data, and deliver a sequenced build plan with expected error reduction.
What you get: a forecasting-readiness assessment scored against twelve criteria; a series-by-series classification and target model class; a target tiered architecture with champion-challenger logic, exogenous data pipeline, and narration layer; a staged delivery plan with timelines, effort estimates, and expected error reduction per tier; and one workshop with your planning, finance, and engineering leads. Led by a senior consultant — fixed scope, fixed fee.
A 30-minute conversation with a senior consultant. Bring a forecasting problem you can’t crack: long-tail demand, regime-prone revenue, energy load, claims volume. We’ll tell you whether a tiered architecture is the right answer, where a TSFM earns its seat, what exogenous variables you’re missing, and what a Forecasting Review would surface.
Book a Forecasting Review →