Phase-by-Phase Breakdown
True Cost by Implementation Phase
Vendors quote for model development, which is roughly 10–15% of what you'll actually spend over three years. Here's where the rest goes:
| Phase | % of 3-Year Cost | What Insiders Budget | What SMBs Typically Budget |
|---|---|---|---|
| Data preparation & cleaning | 25–35% | $150K–$500K | "We have data already" |
| Model development / fine-tuning | 10–15% | $100K–$300K | Think this is the whole project |
| Integration & MLOps infrastructure | 15–20% | $200K–$400K | $0 (assumed plug-and-play) |
| Ongoing monitoring & retraining | 20–30% | $8K–$25K/month | $0 (assumed one-and-done) |
| Compute / inference costs | 10–15% | Tracked monthly | "We'll use the cloud" |
| Change management & training | 5–10% | $50K–$150K | $0 |
Key insight: The sticker price of an AI project is 15–25% of what you'll actually spend. Post-deployment costs exceed development costs within 12–18 months.
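To make the gap concrete, the short sketch below scales a vendor quote up to an implied 3-year total. It assumes the quote covers only the model development line (10–15% of total spend, per the table). The function name and the $150K example quote are illustrative, not figures from any vendor.

```python
def total_cost_range(vendor_quote, share_low=0.10, share_high=0.15):
    """Scale a model-development quote up to an implied 3-year total.

    share_low / share_high: fraction of total spend the quote represents.
    The 10-15% defaults come from the table above; treat them as
    assumptions and adjust them for your own project.
    """
    if not 0 < share_low <= share_high <= 1:
        raise ValueError("shares must satisfy 0 < share_low <= share_high <= 1")
    # A larger share means the quote covers more of the total, so the
    # implied total is smaller: high share -> low end, low share -> high end.
    return vendor_quote / share_high, vendor_quote / share_low


# Hypothetical $150K quote for model development only.
low, high = total_cost_range(150_000)
print(f"Implied 3-year total: ${low:,.0f} to ${high:,.0f}")
```

Under those assumptions, a $150K quote implies roughly $1.0M to $1.5M over three years. If your quote bundles more than model development, pass larger shares (for example `share_low=0.15, share_high=0.25`) and the implied total shrinks accordingly.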
Inference & Compute
What You'll Actually Pay for Inference
| Deployment Model | Monthly Cost | Best For | Trade-off |
|---|---|---|---|
| Frontier API — GPT-5.4, Claude Opus 4.6 (10K queries/day) | $750–$4,500/mo | Best quality, fast start | Vendor lock-in, data leaves your network |
| Mid-tier API — GPT-5.4-mini, Claude Sonnet 4.6, Gemini Flash (10K queries/day) | $60–$300/mo | Good enough for most tasks | Still usage-based, still external |
| Self-hosted open model (Llama 3.x 70B) | $2,500–$4,500/mo | High volume, data privacy | You own the ops burden |
| Fine-tuned small model (7B–13B) | $400–$800/mo + $30K–$80K upfront | Domain-specific tasks | Expensive to retrain, narrow scope |
| RAG pipeline add-on | +$500–$2,000/mo | Making AI "know" your docs | Cheapest path to domain knowledge |
| Edge/on-prem (NVIDIA Jetson/RTX) | ~$0/mo + $5K–$25K hardware | Manufacturing, air-gapped | You own maintenance forever |
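One way to read the table is to ask at what volume a flat self-hosted bill beats a usage-priced API. The sketch below derives a per-query cost from the API rows (quoted at 10K queries/day) and computes a break-even volume. It assumes API spend scales linearly with volume, a 30-day month, and a self-hosted cost that stays flat as volume grows. All three are simplifications, and the function names are illustrative.

```python
QUERIES_PER_DAY = 10_000  # volume at which the table's API rows are quoted
DAYS_PER_MONTH = 30       # assumption: 30-day billing month


def per_query_cost(api_monthly, queries_per_day=QUERIES_PER_DAY):
    """Implied cost per query for a usage-priced API row."""
    return api_monthly / (queries_per_day * DAYS_PER_MONTH)


def break_even_queries_per_day(api_monthly, self_hosted_monthly,
                               queries_per_day=QUERIES_PER_DAY):
    """Daily volume at which the API bill equals a flat self-hosted bill.

    Assumes linear API pricing and a self-hosted cost that does not
    grow with volume (no extra GPUs), which will not hold indefinitely.
    """
    return queries_per_day * self_hosted_monthly / api_monthly


SELF_HOSTED_LOW = 2_500  # low end of the self-hosted row, $/month

for label, api_monthly in [("frontier, low end", 750),
                           ("frontier, high end", 4_500),
                           ("mid-tier, high end", 300)]:
    volume = break_even_queries_per_day(api_monthly, SELF_HOSTED_LOW)
    print(f"{label}: ${per_query_cost(api_monthly):.4f}/query, "
          f"break-even at about {volume:,.0f} queries/day")
```

Read the result as a rough screen rather than a decision rule: it ignores the ops burden noted in the table, which is often the larger cost of self-hosting.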
Hidden Costs
The Hidden Multipliers
| Cost Item | Range | Notes |
|---|---|---|
| Data labeling (50K examples) | $2,500–$300,000 | Simple classification: $0.05/label. Medical/legal: $2–$6/label |
| Model drift retraining (per cycle) | $5K–$50K | Plan for quarterly minimum |
| Compliance & legal review | $25K–$100K | SOC 2, GDPR, EU AI Act readiness |
| Prompt engineering | $12K–$60K | 2–6 weeks at $150–$250/hr. Not a weekend project |
| Evaluation infrastructure | 20% of project cost | A/B testing, human review, automated eval pipelines |
| Organizational resistance | 10–15% of project cost | #1 killer of AI projects. If operators don't trust it, they'll ignore it |
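The labeling and prompt engineering ranges in the table follow from simple unit-price arithmetic, which is worth rerunning with your own volumes. The sketch below reproduces them. The 40-hour week is an assumption (it is the figure that makes the table's $12K–$60K range work out), and the function names are illustrative.

```python
def labeling_cost(num_examples, price_per_label):
    """Total labeling spend for a dataset at a flat per-label price."""
    return num_examples * price_per_label


def contractor_cost(weeks, hourly_rate, hours_per_week=40):
    """Contractor spend; hours_per_week=40 is an assumption."""
    return weeks * hours_per_week * hourly_rate


EXAMPLES = 50_000  # dataset size used in the table

# Simple classification vs. medical/legal labeling, per the table's rates.
print(f"Simple labels:        ${labeling_cost(EXAMPLES, 0.05):,.0f}")
print(f"Medical/legal labels: ${labeling_cost(EXAMPLES, 2):,.0f} "
      f"to ${labeling_cost(EXAMPLES, 6):,.0f}")

# Prompt engineering: 2-6 weeks at $150-$250/hr.
print(f"Prompt engineering:   ${contractor_cost(2, 150):,.0f} "
      f"to ${contractor_cost(6, 250):,.0f}")
```

The spread in labeling cost comes almost entirely from the per-label price, so pinning down who is qualified to label your data matters more than trimming the example count.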
Timeline
Timeline Reality Check
| Milestone | Vendor Promise | Actual (Median) |
|---|---|---|
| Proof of concept | 2–4 weeks | 4–8 weeks |
| POC → Production | 2–3 months | 6–14 months |
| Full organizational adoption | "Immediate" | 12–24 months |
| Positive ROI | "Within 6 months" | 18–36 months (if ever) |
The POC-to-production cliff: A POC that works at 85% accuracy in a demo costs $20K. Getting that same model to 95% accuracy in production with edge cases, monitoring, and failover costs $200K–$500K. Vendors demo the POC and quote accordingly.
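The figures above imply a multiplier you can apply to any POC quote. The sketch below derives it from the $20K POC and $200K–$500K production numbers, then applies it to a hypothetical quote. Treating the ratio as transferable to other projects is an assumption, and the function names are illustrative.

```python
POC_COST = 20_000                     # POC figure from the text
PRODUCTION_RANGE = (200_000, 500_000)  # production figures from the text


def production_multipliers(poc_cost=POC_COST, production=PRODUCTION_RANGE):
    """Ratio of production cost to POC cost, as a (low, high) pair."""
    low, high = production
    return low / poc_cost, high / poc_cost


def scale_poc_quote(poc_quote):
    """Apply the implied multipliers to a different POC quote.

    Assumes the same ratio holds for your project, which it may not.
    """
    low_mult, high_mult = production_multipliers()
    return poc_quote * low_mult, poc_quote * high_mult


low_mult, high_mult = production_multipliers()
print(f"Implied multiplier: {low_mult:.0f}x to {high_mult:.0f}x")

# Hypothetical $35K POC quote.
low, high = scale_poc_quote(35_000)
print(f"Production estimate: ${low:,.0f} to ${high:,.0f}")
```

A 10x to 25x multiplier is a planning heuristic, useful mainly for checking whether a vendor's production estimate is in a plausible range for the POC they demoed.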