AI Receptionist Cost in 2026: Full Breakdown by Platform

TL;DR

Stack the platform fee + LLM + STT + TTS + telephony and a working AI receptionist runs $0.09–$0.36/min. For a typical SMB doing 300–500 calls/month, that's $90–$360 in raw infrastructure — before any agency build or run cost. See how we deploy AI voice agents end to end at our AI voice agents service page.

$0.09–$0.36 per minute all-in

$90–$360 SMB monthly infra cost

$1,000+ HIPAA add-on (Vapi)

5 platforms benchmarked in 2026

$0.05/min

Vapi entry-only fee — before LLM, STT, TTS & telephony

Source: Vapi.ai/pricing

$1,000/mo

Vapi HIPAA compliance flat surcharge, separate from usage

Source: Vapi billing docs

~50%

ElevenLabs price cut Feb 2025 — Creator & Pro plans dropped to $0.10/min

Source: ElevenLabs blog

1.4–1.7s

Hamming.ai independent p50 end-to-end latency across 4M+ production calls

Source: Hamming.ai, 2026

1m 32s

Median duration of a successful AI voice call (Canonical Chat dataset)

Source: Canonical Chat

42%

Share of AI voice calls that meet their stated objective

Source: Canonical Chat, 2025

The "$0.05 per minute" line on the Vapi pricing page is technically true. It's also useless, because no working AI receptionist actually costs $0.05 a minute. It costs that, plus an LLM, plus speech-to-text, plus text-to-speech, plus a phone number that can receive calls. Stack those and the real number for an SMB looks more like $0.09 to $0.36 a minute, depending on the platform and the voice you pick.

This article tabulates the real 2026 cost of an AI receptionist across five platforms (Vapi, Retell, Synthflow, Bland, and ElevenLabs Conversational AI), every component that goes into a working stack, and three SMB worked examples (dental, law, HVAC) so you can see the monthly bill before you commit. Every number is sourced. Nothing is rounded to make a vendor look better.

What does “AI voice agent cost per minute” actually include?

Answer

“Cost per minute” for an AI voice agent is the sum of five separately billed components: the orchestration platform fee, speech-to-text (STT), the large language model (LLM), text-to-speech (TTS), and telephony termination. Most vendor pricing pages quote only the orchestration fee. The real all-in cost in 2026 is two to six times that headline number once the other four components are added.

A working AI receptionist isn't one product. It's a pipeline. Audio comes in over a phone line, gets transcribed to text, gets reasoned over by an LLM, gets rendered back to audio, and goes out the same phone line. Each of those steps is a paid API call.

Five components, billed on five different clocks:

Orchestration / platform fee: the per-minute rate the voice-agent platform charges to glue the pipeline together (Vapi $0.05, Retell $0.055, Synthflow $0.09 voice engine).
Speech-to-text (STT): what the caller says, transcribed in real time. Deepgram, AssemblyAI, OpenAI, and Google STT are the common choices.
LLM tokens: input tokens (the prompt + caller transcript) and output tokens (the AI's response), billed per million.
Text-to-speech (TTS): the voice you hear back. ElevenLabs, Cartesia, Deepgram Aura, OpenAI TTS, Hume, and PlayHT compete on quality and latency.
Telephony termination (PSTN): Twilio, Telnyx, or SignalWire to connect the call to a real phone number. Inbound and outbound are billed separately, plus a monthly DID rental.

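Bland and ElevenLabs Conversational AI are the two exceptions. Bland quotes a single all-in per-minute rate that bundles all five components. ElevenLabs bundles orchestration, STT, and TTS into its per-minute rate, but LLM tokens are passed through and telephony is bring-your-own. Everything else stacks.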

What are the 5 components of an AI voice agent stack (and what does each cost in 2026)?

Across the platforms covered here, expect roughly these per-minute ranges in 2026:

AI voice agent component cost ranges, 2026
Component	Typical 2026 range	Notes
Orchestration platform fee	$0.05 to $0.09/min	Vapi, Retell, Synthflow voice engine
Speech-to-text (STT)	$0.0025 to $0.016/min	AssemblyAI Universal-2 cheapest; Google STT v2 Chirp standard
LLM (text models)	$0.005 to $0.080/min	Depends on model and tokens-per-minute; GPT-5 nano vs GPT-5.4
TTS	$0.015 to $0.10/min	OpenAI TTS-1 cheap; ElevenLabs Flash mid; PlayHT Premium high
Telephony (US local)	$0.008 to $0.022/min	Twilio local inbound $0.0085, toll-free inbound $0.022
HIPAA / compliance	$0 to $1,000/mo flat	Vapi $1,000/mo; Synthflow Enterprise-only; Retell add-on
Concurrency	First 5 to 20 free	Then $8 to $20/mo per extra reserved line

The cheapest viable stack lands near $0.07/min. A premium stack with a top-tier voice and a frontier LLM lands closer to $0.30/min. Most working SMB deployments sit between $0.10 and $0.20/min.
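To see how the rows of the component table combine, here is a minimal Python sketch that sums the low and high end of each per-minute range. The dictionary keys and the function name `stacked_bounds` are illustrative, not vendor terminology. The result is only an outer bound: no real deployment picks the cheapest (or the most expensive) option in every row at once, and flat monthly items such as HIPAA and concurrency are left out.

```python
# Per-minute ranges copied from the component table above (USD per minute).
# Flat monthly items (HIPAA, concurrency) are intentionally excluded.
COMPONENT_RANGES = {
    "orchestration": (0.05, 0.09),
    "stt": (0.0025, 0.016),
    "llm": (0.005, 0.080),
    "tts": (0.015, 0.10),
    "telephony": (0.008, 0.022),
}


def stacked_bounds(ranges):
    """Sum the low ends and the high ends of every component range."""
    low = sum(lo for lo, _ in ranges.values())
    high = sum(hi for _, hi in ranges.values())
    return round(low, 4), round(high, 4)


low, high = stacked_bounds(COMPONENT_RANGES)
print(f"stacked range: ${low:.4f} to ${high:.4f} per minute")
```

With the table's figures this prints a range of roughly $0.08 to $0.31 per minute, in the same neighborhood as the cheapest and premium stacks described above.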

How much does an AI receptionist cost per minute, all-in?

Answer

An AI receptionist costs $0.09 to $0.36 per minute all-in across the five major platforms in 2026. Vapi stacks to $0.07 to $0.33/min depending on voice and model, Retell to ~$0.11/min on a GPT-4.1 stack, Synthflow $0.15 to $0.24/min PAYG, Bland $0.11 to $0.14/min on its tiered all-in model, and ElevenLabs Conversational AI $0.08 to $0.12/min plus LLM pass-through.

Headline comparison: all-in cost across five platforms at 500 minutes/month, 2026
Platform	Platform fee	STT	TTS	LLM	Telephony	HIPAA add-on	All-in @ 500 min/mo
Vapi	$0.05/min	bring your own ($0.003 to $0.016)	bring your own ($0.015 to $0.10)	bring your own ($0.005 to $0.08)	bring your own (~$0.008+)	$1,000/mo	$35 to $165 + $1,000 if HIPAA
Retell	$0.055/min	included in voice	$0.015/min platform voice	$0.003 to $0.080/min selectable	bring your own (~$0.008+)	PII removal +$0.01/min	~$55 ($0.11/min x 500)
Synthflow	$0.09/min voice engine	included	included	$0.02 to $0.05/min	$0.00 to $0.02/min	Enterprise plan only	$75 to $120 ($0.15 to $0.24/min)
Bland	(bundled)	included	included	included	included (US)	not separately listed	$55 to $70 ($0.11 to $0.14/min)
ElevenLabs Agents	(bundled)	included	included	LLM pass-through extra	bring your own	not separately listed	$40 to $60 + LLM ($0.08 to $0.12/min x 500)

Verified against Vapi pricing, Retell AI pricing, Synthflow pricing, Bland AI billing docs, and the pxlpeak ElevenLabs pricing breakdown. All numbers retrieved 2026-05-03.

What this means for SMB operators: the headline rate on a vendor's pricing page is the floor, not the ceiling. Vapi's “$0.05/min” is true only if you bring zero LLM, zero STT, zero TTS, and zero phone number. The honest comparison number is the all-in 500-minutes-a-month column. That puts Bland and Retell within a few cents of each other for a typical use case, while Vapi swings widely depending on the voice and model you wire in.
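The all-in column can be reproduced from the per-minute ranges in the table. The sketch below is a rough illustration in Python; `monthly_cost` and the `RATES` layout are hypothetical names chosen for this example, and the ElevenLabs row excludes the LLM pass-through, as the table does.

```python
def monthly_cost(rate_per_min, minutes, flat_fees=0.0):
    """Usage-based cost plus any flat monthly fees, in USD."""
    return round(rate_per_min * minutes + flat_fees, 2)


MINUTES = 500

# (low, high) all-in per-minute rates taken from the comparison table above.
RATES = {
    "Vapi": (0.07, 0.33),
    "Retell": (0.11, 0.11),
    "Synthflow": (0.15, 0.24),
    "Bland": (0.11, 0.14),
    "ElevenLabs Agents (before LLM)": (0.08, 0.12),
}

for platform, (lo, hi) in RATES.items():
    print(f"{platform}: ${monthly_cost(lo, MINUTES):.2f} to ${monthly_cost(hi, MINUTES):.2f}")

# Vapi's flat HIPAA surcharge sits on top of usage.
vapi_hipaa_high = monthly_cost(0.33, MINUTES, flat_fees=1000)
print(f"Vapi high end with HIPAA: ${vapi_hipaa_high:.2f}")
```

Changing `MINUTES` shows how quickly a flat fee stops mattering as volume grows, and how much it dominates at SMB volume.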

What does Vapi actually cost (what's hidden in the $0.05/min)?

Answer

Vapi's advertised $0.05/min is the orchestration fee only. STT, LLM, TTS, and telephony are all billed separately as pass-through components. Realistic stacked cost is $0.07 to $0.25/min for typical configurations, climbing to $0.30 to $0.33/min on premium voices and frontier LLMs, per Cloudtalk's Vapi pricing breakdown and Emitrr's analysis. HIPAA compliance is a flat $1,000/month add-on on top.

Vapi includes 10 concurrent call lines by default; additional lines run $10/month each. The platform is genuinely flexible (any STT, any LLM, any TTS), which is also why pricing comparisons keep getting it wrong. There's no single “Vapi price.” There's a Vapi-orchestrated stack price, and you build it.

One caveat worth flagging: the vapi.ai/pricing page is JavaScript-rendered, so automated fetchers (and AI Overviews) often miss it. The $0.05/min platform figure here comes from two independent third-party breakdowns that line up with Vapi's own billing docs.

What does Retell AI actually cost?

Answer

Retell's Voice Infra (platform) fee is $0.055/min, with TTS bundled into a $0.015/min platform voice. LLM is selectable per agent, ranging from $0.003/min on GPT-5 nano to $0.080/min on GPT-5.4. A typical GPT-4.1 stack lands at $0.11/min all-in per Retell's own pricing page. Free trial credits of $10 are included; no monthly minimums.

Retell's pricing structure is the most transparent of the five platforms. Every component is a line item on the live pricing page, and you can build the stack you want from a dropdown. Concurrency is generous: the first 20 concurrent calls are free, then $8/month per additional reserved line. Add-ons include a Knowledge Base at +$0.005/min and PII removal at +$0.01/min. Verified phone numbers are $10/month each.

For most SMB receptionist deployments, Retell is the platform that gives you the smallest gap between your back-of-the-envelope estimate and the actual invoice.

What does Synthflow cost?

Answer

Synthflow moved to a pure pay-as-you-go model in 2026. There are no bundled-minutes subscription plans. PAYG components stack at $0.15 to $0.24/min: voice engine $0.09/min, LLM $0.02 to $0.05/min (selectable), and telephony $0.00 to $0.02/min. 5 concurrent calls are included on PAYG; additional reserved concurrency is $20/month per slot (up to 50 total). HIPAA is restricted to the Enterprise plan, which starts at 10,000 minutes/month.

Because Synthflow charges PAYG across all usage tiers, it rewards variable-volume SMBs: you pay only for minutes you actually use, with no minimum commitment. The tradeoff is that per-minute rates are higher than a well-tuned Retell stack, so at sustained high volume (5,000+ minutes/month) it can run more expensive. For HIPAA-required SMBs, the Enterprise tier is the only path, which means negotiating custom pricing above the 10K-minute threshold.

What does Bland AI cost after the December 2025 tier change?

Answer

On December 5, 2025, Bland moved from a flat $0.09/min to a tiered all-in model: Start at $0.14/min, Build at $0.12/min ($299/month), and Scale at $0.11/min ($499/month). The per-minute rate includes LLM, STT, TTS, and telephony, with no component stacking. Older third-party articles still quote the obsolete $0.09/min flat rate.

Bland's all-in pricing is the cleanest model on the market for SMB buyers who don't want to think about component stacks. Pick a tier, get a number, that's the bill. Transfer time is billed separately and tiered as well: $0.05/min on Start, $0.04/min on Build, $0.03/min on Scale. The free trial includes 2 credits and a free inbound number (a $15/month value).

If you see a 2025-dated article quoting Bland at $0.09/min, treat the rest of that article with skepticism. Pricing isn't the only thing it's missed.

What does ElevenLabs Conversational AI cost (the platform nobody compares)?

This is the platform every multi-vendor comparison leaves out, which is strange given that ElevenLabs cut Conversational AI prices roughly 50% in early 2025 and now sits in the same per-minute range as Bland and Retell.

Answer

ElevenLabs Conversational AI runs $0.08/min on the Standard tier (gpt-3.5-turbo + Multilingual v2), $0.10/min on Turbo (gpt-4o-mini + Flash v2), and $0.12/min on Premium (gpt-4o + Flash v2.5), per the pxlpeak ElevenLabs pricing guide. The annual Business plan locks in $0.08/min. LLM tokens are pass-through and billed separately on top of the per-minute rate.

The ~50% Conversational AI price cut was announced February 11, 2025, with Creator and Pro plans dropping to $0.10/min. Pre-cut, the rate was approximately $0.20/min. ElevenLabs' subscription tiers run Free $0 / Starter $6 / Creator $11 / Pro $99 / Scale $299 / Business $990, and Business plan low-latency TTS gets as low as 5 cents/minute when used standalone.

ElevenLabs is the right call when voice quality is the differentiator (legal intake, premium concierge), not when raw component cost is. The voices are consistently the best in the field, and it's the Hamming.ai TTS latency leader (see the latency benchmarks section below).

STT, LLM, and TTS component pricing reference (2026)

For Vapi, Retell, and Synthflow stacks, you pick the components yourself. Here's the per-minute math in 2026.

Speech-to-text (STT)

2026 STT per-minute pricing
Vendor / model	Rate	Notes
Deepgram Nova-3 streaming, monolingual	$0.0048/min PAYG	$0.0042/min on Growth tier
AssemblyAI Universal-2	$0.0025/min ($0.15/hr)	Cheapest in class
OpenAI Whisper API	$0.006/min
OpenAI gpt-4o-transcribe	$0.006/min
OpenAI gpt-4o-mini-transcribe	$0.003/min
Google Cloud STT v2 Chirp (streaming)	$0.016/min standard, down to $0.004/min at volume	60 min/mo free tier

Sources: Deepgram pricing, AssemblyAI pricing, TokenMix Whisper API guide, Google Cloud Speech-to-Text pricing.

LLMs (text models, $/MTok)

2026 LLM pricing per million tokens (input/output)
Model	Input ($/MTok)	Output ($/MTok)
OpenAI GPT-5	$1.25	$10.00
OpenAI GPT-5.4	$2.50	$15.00
OpenAI GPT-5.4 mini	$0.75	$4.50
OpenAI GPT-5.4 nano	$0.20	$1.25
OpenAI GPT-4o	$2.50	$10.00
Anthropic Claude Sonnet 4 / 4.5 / 4.6	$3.00	$15.00
Anthropic Claude Haiku 4.5	$1.00	$5.00
Anthropic Claude Opus 4.5 / 4.6 / 4.7	$5.00	$25.00
Google Gemini 2.5 Flash (text in)	$0.30	$2.50
Google Gemini 2.5 Pro (up to 200k)	$1.25	$10.00

OpenAI rates verified via BenchLM's API pricing tracker and PricePerToken; Anthropic via Claude pricing docs; Google via the Gemini API pricing page. Voice agents typically burn 200 to 600 input tokens and 100 to 300 output tokens per turn; at 5 to 10 turns per minute, that lands in the $0.005 to $0.080/min range depending on model.
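The conversion from per-million-token prices to a per-minute figure can be sketched as follows. This is a simplified model in Python, not a vendor formula: `llm_cost_per_minute` is a hypothetical helper, and it assumes a fixed token count per turn, whereas a real agent usually resends a growing transcript each turn, which pushes input tokens (and cost) higher as the call goes on.

```python
def llm_cost_per_minute(input_price, output_price,
                        in_tokens_per_turn, out_tokens_per_turn,
                        turns_per_minute):
    """Per-minute LLM cost in USD; prices are USD per million tokens."""
    in_tokens = in_tokens_per_turn * turns_per_minute
    out_tokens = out_tokens_per_turn * turns_per_minute
    return (in_tokens * input_price + out_tokens * output_price) / 1_000_000


# Light case: GPT-5.4 nano prices from the table, low end of the token ranges.
light = llm_cost_per_minute(0.20, 1.25, 200, 100, 5)

# Heavy case: GPT-5.4 prices from the table, high end of the token ranges.
heavy = llm_cost_per_minute(2.50, 15.00, 600, 300, 10)

print(f"light: ${light:.4f}/min, heavy: ${heavy:.4f}/min")
```

Treat the output as an order-of-magnitude check: the result is very sensitive to the assumed tokens per turn and turns per minute, so measuring those on real call transcripts matters more than the choice of formula.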

For real-time audio I/O, OpenAI gpt-4o-realtime bills $100/MTok audio in and $200/MTok audio out, which works out to roughly $0.06/min in and $0.24/min out.

TTS

2026 TTS pricing across major providers
Vendor / model	Rate	Notes
ElevenLabs Flash v2.5 / Turbo v2.5	0.5 to 1 credit/character (tier-dependent)	Business low-latency 'as low as 5 cents/min'
Cartesia Sonic-3	15 credits/sec audio	Pro $4/mo, Startup $39/mo, Scale $239/mo
OpenAI TTS-1	$15/MChar (~$0.015/1K chars)
OpenAI TTS-1-HD	$30/MChar (~$0.030/1K chars)
OpenAI gpt-4o-mini-tts	~$0.015/min
Deepgram Aura-2	$0.030/1K chars PAYG	$0.027/1K chars on Growth
Hume Octave Pro	$70/mo, 1M chars; $0.05/1K chars overage
PlayHT Creator	$49/mo annual ($99 monthly)	Unconfirmed: vendor page was unreachable at retrieval; figure taken from the voice.ai breakdown

Sources: ElevenLabs pricing, Cartesia pricing, TokenMix TTS comparison, Deepgram pricing, Hume pricing, voice.ai PlayHT pricing breakdown.
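Several TTS vendors in the table bill per character rather than per minute, so comparing them needs a conversion. The Python sketch below assumes roughly 1,000 characters of synthesized text per minute of continuous speech. That constant (`CHARS_PER_SPOKEN_MINUTE`) is an assumption made for illustration, not a vendor figure, and the helper name is hypothetical. Because the agent only speaks for part of each call minute, billed TTS per call minute would typically come in lower.

```python
# ASSUMPTION: about 1,000 characters of text per minute of continuous speech.
CHARS_PER_SPOKEN_MINUTE = 1_000


def tts_cost_per_spoken_minute(price_per_million_chars,
                               chars_per_minute=CHARS_PER_SPOKEN_MINUTE):
    """Convert a per-million-character price (USD) to USD per spoken minute."""
    return price_per_million_chars * chars_per_minute / 1_000_000


# Per-character prices from the table above, expressed per million characters.
PRICES_PER_MCHAR = {
    "OpenAI TTS-1": 15.0,      # $15/MChar
    "OpenAI TTS-1-HD": 30.0,   # $30/MChar
    "Deepgram Aura-2": 30.0,   # $0.030 per 1K chars PAYG
}

for model, price in PRICES_PER_MCHAR.items():
    print(f"{model}: ${tts_cost_per_spoken_minute(price):.3f} per spoken minute")
```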

What does telephony termination cost? (Twilio vs Telnyx vs SignalWire)

Answer

US local PSTN termination runs $0.0035 to $0.0085/min inbound and $0.005 to $0.014/min outbound across the three major providers. Twilio is the most expensive on the headline rate but the easiest to integrate; Telnyx and SignalWire undercut on price but need more SIP literacy. Toll-free inbound runs $0.015 to $0.022/min.

US telephony termination pricing, 2026
Provider	Local inbound	Local outbound (US)	Toll-free inbound	DID rental
Twilio Programmable Voice (local)	$0.0085/min	$0.0140/min	$0.0220/min	$1.15/mo local, $2.15/mo toll-free
Telnyx (local)	from $0.0035/min	from $0.005/min	from $0.015/min	verify on Numbers page
SignalWire (10DLC local)	$0.0066/min	$0.0080/min	$0.0147/min	$0.50/mo local, $0.80/mo toll-free
Twilio SIP / WebRTC	$0.004/min	$0.004/min	n/a	n/a
SignalWire SIP / WebRTC	$0.003/min	$0.003/min	n/a	n/a

Verified via Twilio Programmable Voice pricing, Telnyx Elastic SIP pricing, SignalWire voice pricing. Telnyx also offers channel-based billing as an alternative to per-minute: first 10 channels at $12/mo, scaling down to $8/mo above 250.

For an SMB doing under 5,000 minutes a month, the telephony delta between providers is roughly $20 to $50/month. It's not where you should optimize first.
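A quick way to check that delta is to price the same inbound volume against each provider's local rate from the table. This Python sketch assumes every minute is US local inbound and ignores DID rental and outbound legs; the variable names are illustrative only.

```python
# US local inbound rates (USD per minute) from the telephony table above.
LOCAL_INBOUND = {
    "Twilio": 0.0085,
    "Telnyx": 0.0035,      # listed as a "from" rate
    "SignalWire": 0.0066,
}

MINUTES = 5_000  # upper end of the SMB volume discussed above

monthly = {name: rate * MINUTES for name, rate in LOCAL_INBOUND.items()}
spread = max(monthly.values()) - min(monthly.values())

for name, cost in monthly.items():
    print(f"{name}: ${cost:.2f}/month")
print(f"spread between cheapest and priciest: ${spread:.2f}/month")
```

At 5,000 inbound minutes the spread is about $25/month; outbound or toll-free traffic would widen it.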

What does an AI voice agent actually cost an SMB per month? (Three worked examples)

Per-minute rates are abstractions. Here's what three real SMB use cases pay each month at typical 2026 stacks.

Scenario A

Dental practice

300 calls/month, 3.5 min average = 1,050 min/mo. Volume reflects the 40 to 60 calls/day median for solo dental practices (existing-patient calls run 1 to 3 min, new-patient calls 4 to 6 min, blended around 3.5 min).

Retell stack (GPT-4.1, platform voice, Twilio local)

1,050 min x $0.11/min + ~$0.0085/min telephony + $10 number

~$135/month

ElevenLabs Agents stack (Turbo tier, GPT-4o-mini pass-through, Twilio local)

1,050 min x $0.10/min + LLM ~$0.01/min + ~$0.0085/min telephony

~$130/month

Scenario B

Law firm after-hours intake

50 calls/month, 6 min average = 300 min/mo. Unqualified intake calls run up to 5 min; PI / family intake creeps to 8 to 10 min. Six minutes is a reasonable blended average.

Retell stack

300 min x $0.11/min + telephony + $10 number

~$50/month

ElevenLabs Agents stack (Premium tier for voice quality)

300 min x $0.12/min + LLM + telephony

~$55/month

Volume is low enough that the $10 number rental matters as much as the per-minute rate.

Scenario C

HVAC contractor

400 calls/month, 2 min average = 800 min/mo. HVAC inbound calls are short and transactional: address, problem, dispatch window. The bigger problem this stack solves is the approximately 22% annual missed-call rate (35% in peak season).

Retell stack

800 min x $0.11/min + telephony + $10 number

~$105/month

ElevenLabs Agents stack (Standard tier)

800 min x $0.08/min + LLM + telephony

~$80/month

These figures are infrastructure only. They don't include the time to build the agent, write the prompts, integrate with the booking flow or CRM, or run the system day to day.
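The Retell lines in the three scenarios all follow the same formula, which can be sketched in Python as below. The function name `retell_monthly` is hypothetical. The defaults are the figures used in the scenarios ($0.11/min stack, about $0.0085/min Twilio local inbound, $10 number rental). The scenario totals quoted above are rounded, so expect small differences from the exact output.

```python
def retell_monthly(minutes, rate=0.11, telephony=0.0085, number_rental=10.0):
    """Monthly infrastructure cost in USD for the Retell stack used above."""
    return minutes * (rate + telephony) + number_rental


SCENARIOS = {
    "Dental practice": 300 * 3.5,      # 1,050 min/mo
    "Law firm after-hours": 50 * 6,    # 300 min/mo
    "HVAC contractor": 400 * 2,        # 800 min/mo
}

for name, minutes in SCENARIOS.items():
    print(f"{name}: {minutes:.0f} min -> ${retell_monthly(minutes):.2f}/month")
```

Swapping in your own call count, average duration, and per-minute rate gives a first-pass estimate before you ask for a quote.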

What this means for SMB operators: for a typical SMB doing under 1,500 minutes a month, the platform you pick rarely changes the bill by more than $30 to $50/month. What changes the bill 3x to 10x is whether the system is wired into your scheduling and CRM correctly. The infrastructure is cheap. The integration work is where projects succeed or fail. For an outbound counterpart (booking confirmations, reactivation campaigns), see how we structure lead-generation automation.

How long is the average AI voice agent call?

Answer

The median successful AI voice call runs 1m 32s and the median failed call runs 34s, per Canonical Chat's production dataset. The blended median across all calls is 51s. Use-case ranges are wider: 1 to 6 min for dental, 5 to 30 min for law-firm intake, 1 to 3 min for HVAC dispatch.

Typical call duration by use case, 2025-2026
Use case	Typical duration	Source
Dental, existing patient	1 to 3 min	AgentZap dental phone stats
Dental, new patient	4 to 6 min	AgentZap
Law firm, unqualified intake	up to 5 min	Filevine intake KPIs
Law firm, qualified sign-up	~30 min	Filevine
PI / family law intake	8 to 10 min	Alert Communications
AI voice agent (all calls, blended median)	51s	Canonical Chat
AI voice agent (successful calls)	1m 32s	Canonical Chat

Canonical Chat also reports that 42% of AI voice calls meet their stated objective, vs ~70% first-call resolution and 2m 50s avg talk time for human call-center agents. Read those numbers together: AI agents are ~60% as effective as humans on first-call resolution, but they cost $0.11/min instead of $0.50 to $1.20/min.
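The "~60% as effective" figure and the cost gap are both simple ratios of the numbers in the paragraph above. A short Python sketch, with variable names chosen only for this example:

```python
# Figures quoted in the paragraph above.
ai_objective_rate = 0.42        # share of AI calls meeting their objective
human_fcr = 0.70                # human first-call resolution rate
ai_rate_per_min = 0.11          # USD per minute
human_rate_per_min = (0.50, 1.20)

relative_effectiveness = ai_objective_rate / human_fcr
cost_ratio_vs_cheapest_human = ai_rate_per_min / human_rate_per_min[0]
cost_ratio_vs_priciest_human = ai_rate_per_min / human_rate_per_min[1]

print(f"AI effectiveness relative to humans: {relative_effectiveness:.0%}")
print(f"AI cost as a share of human cost: "
      f"{cost_ratio_vs_priciest_human:.0%} to {cost_ratio_vs_cheapest_human:.0%}")
```

Note that the two rates measure slightly different things (objective met versus first-call resolution), so the ratio is a rough comparison rather than a like-for-like benchmark.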

Voice-AI latency benchmarks: vendor self-reported vs Hamming.ai independent

Answer

Vendor-reported voice-agent latency claims (“sub-500ms,” “sub-100ms”) describe component-level performance. The independent Hamming.ai dataset of 4M+ production calls puts real p50 end-to-end latency at 1.4 to 1.7s, with p99 at 8 to 15s. Canonical Chat's median time between a human finishing speaking and the AI starting its response is 1.95s.

Voice AI latency benchmarks: vendor claims vs independent production data
Source	Metric	Value
Hamming.ai (independent, 4M+ calls)	p50 end-to-end latency	1.4 to 1.7s
Hamming.ai	p99 end-to-end latency	8 to 15s
Canonical Chat (production)	Median human-to-AI response gap	1.95s
Canonical Chat	Median time-to-first-AI-word	880ms
Vapi (self-reported)	Target	sub-500ms; real-world 600ms to 1,000ms+
Retell (tested with ElevenLabs v3)	Observed	~600ms; long-turn pauses ~1.1s
ElevenLabs (claimed)	Component-level	sub-100ms
TTS leader (Hamming)	ElevenLabs Flash	75ms
TTS runner-up (Hamming)	Cartesia Sonic	90ms
ITU-T G.114 reference	One-way voice quality threshold	<300ms

What this means for SMB operators: every vendor's marketing page is technically truthful and operationally misleading. ElevenLabs Flash genuinely does deliver ~75ms TTS. Vapi orchestration genuinely can target sub-500ms. The reason real calls land at 1.4 to 1.7s is that you stack STT + network + LLM + TTS + network + return, and none of the sub-500ms claims survive contact with the full pipeline. When you evaluate a platform, ask for end-to-end median (and p99) on a stack that matches what you'll actually deploy, not isolated component benchmarks.

Is it cheaper to build an AI voice agent or buy a platform?

Answer

For an SMB, buy a platform. Custom-built voice agents cost $443K to $767K in Year 1 (engineering, infra, third-party APIs, audits) vs $9K to $15K total for a managed-platform deployment. For agencies serving SMB clients, build-on-platform retainers run $300 to $800 MRR per client at ~80% margin.

Two real-world numbers from the indie-operator world worth keeping in mind:

The "$3,000 to find out the hard way" tax. One operator tested 7 voice-agent platforms before settling on one, spending ~$3K in API credits along the way. If you're shopping, set a budget for the trial phase.
Custom builds that cost more than budgeted. A Reddit-cited insurance agency budgeted $80K for a custom AI build, ended up spending $160K, and then switched to a $600/mo managed platform. Stories like this are anecdotes, not industry-wide averages, but they illustrate why the build-vs-buy math so rarely closes for SMBs.

Agency retainer ranges from operator-published data: $800 to $3,500/month for full-deployment management, with setup fees of $3K to $15K for the build. Custom dev hourly rates run $50 to $150/hr for typical implementation work.

What this means for SMB operators: if you're an SMB, you shouldn't be in the “build” business. The math has been settled for two years. The honest framing is platform-vs-platform, not build-vs-buy. We cover the integration side at our AI integration services page.

What's the cheapest viable AI voice agent stack for an SMB right now?

Answer

The cheapest viable SMB stack in 2026 is either the Bland Start tier ($0.14/min all-in, telephony included, no component stacking) or the ElevenLabs Conversational AI Standard tier ($0.08/min + LLM pass-through, paired with Twilio local telephony). For under 1,000 minutes/month, expect $80 to $140/month in raw infrastructure before agency build or run cost.

Key finding 1: don't optimize on the platform fee.

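Vapi's $0.05/min platform fee saves you $2.50/month at 500 minutes vs Retell's $0.055. The voice you pick (ElevenLabs Premium vs OpenAI TTS-1) swings the bill far more than that: the $0.015 to $0.10/min TTS range in the component table is a spread of up to $42.50/month at the same volume.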

Key finding 2: bundled all-in beats stacked components for predictability.

Bland and ElevenLabs Agents quote one number; Vapi and Retell quote five. SMB operators consistently underestimate stacked costs by 30 to 60% in their own forecasts.

Key finding 3: HIPAA changes the calculus.

Vapi's $1,000/month HIPAA add-on makes it the wrong choice for solo dental practices unless you're also routing 5,000+ minutes/month. For HIPAA-required SMBs at low volume, Retell or a Synthflow Enterprise plan is cheaper net of the surcharge.
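One way to see why volume decides this is to spread the flat surcharge over the minutes actually used. A minimal Python sketch follows; `hipaa_surcharge_per_minute` is a hypothetical helper, and the $1,000 default is the Vapi figure cited in this article.

```python
def hipaa_surcharge_per_minute(minutes, flat_fee=1000.0):
    """Effective per-minute cost of a flat monthly compliance fee, in USD."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return flat_fee / minutes


for minutes in (500, 1_050, 5_000):
    extra = hipaa_surcharge_per_minute(minutes)
    print(f"{minutes:>5} min/month -> +${extra:.2f} per minute")
```

At 500 minutes the surcharge alone adds $2.00 per minute, several times the whole stacked rate; at 5,000 minutes it falls to $0.20 per minute.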

Key finding 4: ElevenLabs Conversational AI is the gap in every other comparison.

It's competitive on price, leads on voice quality, and gets ignored by every vendor-published table because no vendor wants to surface it.

Key finding 5: the integration tax is the real cost.

Across our SMB deployments, the per-minute infrastructure is 20 to 30% of the total project economics. The other 70 to 80% is wiring the agent to your scheduling, CRM, and dispatch systems so it actually books, transfers, or escalates correctly. That work belongs to a build engagement, not a platform subscription.

For dental, salon, and clinic verticals where HIPAA matters, lean Retell or Synthflow Enterprise. For HVAC, contractors, and small law where speed of deploy matters more than HIPAA, lean Bland. For premium-voice intake (boutique law, concierge medical), lean ElevenLabs Conversational AI Premium tier. We deploy across all four for clients via our AI voice agents service; for ongoing customer-side conversation handling beyond the receptionist scope, see customer support automation.

Frequently asked questions

How much does an AI receptionist cost per month for a small business?

For an SMB doing 300 to 500 calls a month at typical 2 to 4 minute average duration, raw platform infrastructure runs $80 to $200/month across the major platforms. Add agency build cost ($1,500 to $5,000 one-time) and run cost ($400 to $1,200/month) if you don't want to manage it yourself. The per-month number stabilizes after the first 30 days of tuning; for what those 30 days actually look like in a real rollout, see our deep-dive on the first 30 days of an AI voice agent for small business at /blog/ai-voice-agent-first-30-days-smb.

Is it cheaper to use Bland or Vapi?

For SMB volumes (under 2,000 minutes/month) Bland is usually cheaper because the all-in $0.11 to $0.14/min beats a typical Vapi stack of platform + LLM + STT + TTS + telephony, especially if you'd otherwise pay Vapi's $1,000/month HIPAA surcharge. For 5,000+ minutes/month with no HIPAA need, a tuned Vapi stack can come in lower.

What's hidden in Vapi's $0.05 per minute price?

The $0.05/min is platform orchestration only. STT, LLM, TTS, and telephony are all billed separately. HIPAA compliance adds a flat $1,000/month. Realistic stacked cost for a working agent runs $0.07 to $0.33/min.

Does ElevenLabs Conversational AI charge for the LLM separately?

Yes. The per-minute rate ($0.08 / $0.10 / $0.12 by tier) covers orchestration, STT, and TTS. LLM tokens are pass-through and billed separately on top, and telephony is bring-your-own.

How does Synthflow's pricing compare to Vapi or Retell pay-per-minute?

Synthflow is pay-as-you-go at $0.15 to $0.24/min depending on the LLM and telephony choices you make. Retell at ~$0.11/min on a typical stack and Vapi at $0.07 to $0.25/min are also pay-per-minute. All three charge only for minutes used, with no unused-capacity penalty. Synthflow's higher PAYG floor reflects its bundled voice engine and LLM handling; Retell's lower floor reflects a more a-la-carte component model.

What happens to the per-minute rate when I'm not on a call?

You're not billed for idle time on any of the five platforms. Concurrency reservations are billed monthly (e.g., $8 to $20/mo per extra reserved line on Retell or Synthflow), but per-minute usage only accrues during active call time. Hold time and transfer time are billed on Bland (transfer at $0.03 to $0.05/min by tier).

How much does it cost to have an agency build and run an AI receptionist?

Operator-published ranges put agency setup at $3K to $15K and ongoing retainer at $200 to $1,200/month for typical SMB deployments, scaling to $3,500/month for full-deployment management on more complex stacks. Build cost varies with how many integrations (CRM, scheduling, dispatch, payment) you need wired in.

Methodology

Methodology and Sources

All pricing data was retrieved on 2026-05-03 from primary vendor pricing pages where reachable, and from independent third-party breakdowns (Cloudtalk, Emitrr, pxlpeak, BenchLM, TokenMix, PricePerToken) where the primary page was JavaScript-rendered, returned 403 to automated fetchers, or refused connection. Specifically:

The vapi.ai/pricing page is JS-rendered; the $0.05/min platform fee was cross-confirmed via two independent third-party breakdowns (Cloudtalk and Emitrr).
The OpenAI pricing page returned 403 to automated retrieval; GPT-5, GPT-5.4, Whisper, and TTS rates were sourced via three independent secondary trackers (BenchLM, TokenMix, PricePerToken) that aligned on the same numbers.
The ElevenLabs Conversational AI tier breakdown ($0.08 / $0.10 / $0.12 by Standard/Turbo/Premium) is sourced from pxlpeak's detailed explainer alongside the official Feb 2025 X announcement of the ~50% price cut. The ElevenLabs help-center page returned 403.
Bland's pricing changed materially on Dec 5, 2025 from a flat $0.09/min to tiered Start/Build/Scale rates. Older articles still quote the obsolete number; this article uses the current Bland docs.
Synthflow moved to a pure PAYG model in 2026. Previously published bundled-plan pricing (Pro/Growth/Agency tiers) is no longer offered. This article reflects the current PAYG structure confirmed on synthflow.ai/pricing as of 2026-05-03.
Latency benchmarks combine vendor self-reported figures with the independent Hamming.ai dataset (4M+ production calls) and Canonical Chat's production telemetry, deliberately presented side by side to surface the gap between marketing claims and field reality.
Vendor-published comparisons were excluded as primary sources because every one we audited produced a self-serving result. Where vendor docs were used, only first-party billing and pricing pages were cited.

We will refresh this article each quarter as platform pricing shifts.

WRITTEN BY

Paul Bendzik

Founder, Scale me AI · 10+ years in software, marketing, and AI automation

Last updated 2026-05-03
