Most small teams lose business not because the work isn't good, but because the phone rings when no one can answer. A caller who reaches voicemail rarely leaves a message. They ring the next company on the list. By the time anyone calls back, the decision is already made. The same pattern plays out on web forms and WhatsApp, where messages sit unread until someone gets back to a screen.

Case Study, not Delivered Project.

How we'd approach this kind of problem.

~47% of initial calls to UK SMBs go unanswered, per a TelePA sample of 142 small businesses

You probably have this problem if…

  • Your phone regularly rings out: in the evenings, at weekends, or whenever the team is on site or in a meeting.
  • Voicemail messages sit until the next morning. Most callers don't leave one.
  • Web form submissions get a generic auto-reply, then nothing until someone checks the inbox.
  • WhatsApp enquiries land in the same chat list as supplier messages and family threads.
  • Quotes go out, and then there is no systematic follow-up. Some get chased, many don't.

Any two of those is a strong signal. All of them and you're handing leads to whichever competitor picked up first.


How the pattern works

Imagine a receptionist who never misses a call, works the same way at 3am as at 3pm, handles WhatsApp and web enquiries in the same voice, and writes you a clean summary of every conversation. They know your services, your coverage area, the questions you always ask before booking work, and the things you'd never want them to say. That's the system.

Five things have to work for it to be useful.

Channels → AI conversation → Structured lead → Office inbox → Timed follow-up
From a missed call or a midnight web form to a qualified lead, delivered and chased on schedule.

1. It picks up the call within seconds

When a call isn't answered within a few rings, Twilio ConversationRelay takes over. It handles the hard parts of phone audio: codec management, interruption detection, and the round trip between caller speech and synthesised response. The latency budget that decides whether the conversation feels human sits inside Twilio's edge rather than on our servers, which is the whole reason we use it. Deepgram does the speech-to-text and ElevenLabs the text-to-speech, both inside ConversationRelay. On our side of the wire there's a small WebSocket server holding the conversation logic, the prompt, and the escalation rules.
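A minimal sketch of the logic that sits behind that WebSocket. It assumes the JSON message shapes Twilio documents for ConversationRelay (`setup`, `prompt`, and `interrupt` inbound; `text` and `end` outbound); field names should be checked against the current Twilio reference before use. `CallSession`, `needs_human`, and `generate_reply` are hypothetical names standing in for our session state, the escalation rules, and the model call. The socket transport is left out so the logic can be read and tested on its own.

```python
import json
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CallSession:
    """Per-call state held by the WebSocket server (hypothetical shape)."""
    call_sid: str = ""
    caller: str = ""
    turns: list = field(default_factory=list)  # (speaker, text) pairs


def needs_human(text: str) -> bool:
    # Hypothetical escalation rule: a crude keyword check standing in for
    # the real rules, which live in the prompt.
    return any(word in text.lower() for word in ("complaint", "refund"))


def handle_message(raw: str, session: CallSession,
                   generate_reply: Callable[[CallSession, str], str]) -> list[str]:
    """Map one inbound ConversationRelay message to zero or more outbound ones."""
    msg = json.loads(raw)
    kind = msg.get("type")

    if kind == "setup":
        # First message on the socket: remember who is calling.
        session.call_sid = msg.get("callSid", "")
        session.caller = msg.get("from", "")
        return []

    if kind == "prompt":
        # A finished caller utterance, already transcribed to text.
        heard = msg.get("voicePrompt", "")
        session.turns.append(("caller", heard))
        if needs_human(heard):
            # Ending the session hands control back to the call flow.
            handoff = json.dumps({"reason": "escalate"})
            return [json.dumps({"type": "end", "handoffData": handoff})]
        reply = generate_reply(session, heard)
        session.turns.append(("assistant", reply))
        return [json.dumps({"type": "text", "token": reply, "last": True})]

    if kind == "interrupt":
        # Caller spoke over the reply: keep only what was actually heard.
        spoken = msg.get("utteranceUntilInterrupt", "")
        if session.turns and session.turns[-1][0] == "assistant":
            session.turns[-1] = ("assistant", spoken)
        return []

    return []  # dtmf, error, and unknown types are ignored in this sketch
```

Keeping the handler a pure function of message and session is a design choice, not a requirement: it lets the same code be driven by a live socket or by a recorded call.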

2. It runs the same conversation on WhatsApp and the web

The same logic and the same prompt handle text channels. WhatsApp arrives through the WhatsApp Business API, also routed via Twilio, which keeps phone numbers, call routing, SMS, and WhatsApp threads under one vendor. Web form submissions trigger an immediate AI response by email or in a chat widget, written for what the person submitted rather than a generic auto-acknowledgement.

Text channels can pause for hours and pick up later. Voice can't. Conversation state lives in the database keyed to the customer's phone number or email, so someone who calls in the morning and follows up on WhatsApp that afternoon is recognised as the same person, and the two threads are stitched into one lead.
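The stitching described above can be sketched as follows. This is an illustration under stated assumptions, not the production schema: `identity_key`, `Lead`, and `LeadStore` are hypothetical names, the dictionary stands in for the Postgres table, and national-format numbers are assumed to be UK numbers.

```python
import re
from dataclasses import dataclass, field


def identity_key(channel: str, address: str) -> str:
    """Normalise a phone number or email into one lookup key."""
    if channel == "web":  # assumption: web forms identify people by email
        return "email:" + address.strip().lower()
    digits = re.sub(r"\D", "", address)  # also drops a "whatsapp:" prefix
    if digits.startswith("00"):
        digits = digits[2:]              # 0044... -> 44...
    elif digits.startswith("0"):
        digits = "44" + digits[1:]       # assumption: national numbers are UK
    return "tel:+" + digits


@dataclass
class Lead:
    key: str
    messages: list = field(default_factory=list)  # (channel, text) in arrival order


class LeadStore:
    """In-memory stand-in for the database table holding conversation state."""

    def __init__(self) -> None:
        self._open: dict[str, Lead] = {}

    def record(self, channel: str, address: str, text: str) -> Lead:
        key = identity_key(channel, address)
        lead = self._open.setdefault(key, Lead(key))
        lead.messages.append((channel, text))
        return lead


store = LeadStore()
morning = store.record("voice", "07700 900123", "Called about a loft conversion")
afternoon = store.record("whatsapp", "whatsapp:+447700900123", "Photos attached")
print(morning is afternoon)  # the call and the WhatsApp thread share one lead
```

This sketch does not link a phone number to an email address for the same person; that would need a field captured during the conversation.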

3. It asks the questions you always ask

The conversation isn't a generic chatbot. For each business we write a prompt that tells the model who the company is, what they sell, where they cover, and the specific questions the owner needs answered before sending a quote or booking a survey. That might be property type, rough dimensions, timeline, budget range, or planning situation: whatever the business needs.

These get asked conversationally, not as a rigid checklist. The prompt also tells the model what not to say. No specific prices unless configured. No commitments about availability. No opinions on competitors. No technical advice that could create liability. It also tells the model when to pass a caller to a human, such as a complaint or anything that sounds like existing work gone wrong. Writing that prompt is where most of the onboarding time goes. Everything after is testing.
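One way to keep such a prompt reviewable is to generate it from a per-business configuration. The sketch below is illustrative only: `BusinessConfig`, `build_system_prompt`, and the wording of the generated lines are assumptions, and "Example Lofts" is a made-up business. A real prompt is longer and hand-tuned.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessConfig:
    """Per-business settings the prompt is built from (hypothetical shape)."""
    name: str
    services: list
    coverage: str
    qualifying_questions: list
    may_quote_prices: bool = False
    escalate_on: list = field(default_factory=lambda: [
        "a complaint", "existing work that has gone wrong"])


def build_system_prompt(cfg: BusinessConfig) -> str:
    # The standing prohibitions listed in the text above.
    never = [
        "commit to availability or dates",
        "give opinions on competitors",
        "give technical advice that could create liability",
    ]
    if not cfg.may_quote_prices:
        never.insert(0, "quote specific prices")
    lines = [
        f"You are the receptionist for {cfg.name}.",
        f"Services: {', '.join(cfg.services)}.",
        f"Coverage area: {cfg.coverage}.",
        "Work these questions into the conversation naturally, not as a checklist:",
        *[f"- {q}" for q in cfg.qualifying_questions],
        "Never:",
        *[f"- {n}" for n in never],
        "Hand over to a human when the caller raises:",
        *[f"- {e}" for e in cfg.escalate_on],
    ]
    return "\n".join(lines)


cfg = BusinessConfig(
    name="Example Lofts",
    services=["loft conversions"],
    coverage="Greater Manchester",
    qualifying_questions=["What type of property is it?", "What is the rough timeline?"],
)
print(build_system_prompt(cfg))
```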

4. It hands the office a structured lead, not a recording

When the conversation ends, the model extracts structured fields: caller name and number, enquiry type, key details from the qualifying questions, urgency, and preferred callback time. These are written as a new lead with the transcript and any recording attached. The office sees one inbox across phone, WhatsApp, and web. Each record sits in a queue with a status of open, handled, assigned, or waiting. Anything urgent or incomplete is flagged, and an SMS goes to the owner within seconds for anything tagged hot.
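A rough sketch of what the structured lead and its triage rules might look like. The field list follows the paragraph above; the `LeadRecord` and `triage` names, the choice of required fields, and the `"normal"`/`"hot"` urgency values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LeadRecord:
    caller_name: Optional[str]
    caller_number: Optional[str]
    enquiry_type: Optional[str]
    details: dict = field(default_factory=dict)  # answers to qualifying questions
    urgency: str = "normal"                      # assumed values: "normal" or "hot"
    preferred_callback: Optional[str] = None
    status: str = "open"                         # open, handled, assigned, or waiting


# Assumption: a lead missing any of these counts as incomplete.
REQUIRED = ("caller_name", "caller_number", "enquiry_type")


def triage(lead: LeadRecord) -> dict:
    """Decide whether the lead is flagged in the inbox and whether to text the owner."""
    missing = [name for name in REQUIRED if not getattr(lead, name)]
    hot = lead.urgency == "hot"
    return {
        "flagged": hot or bool(missing),
        "missing": missing,
        "sms_owner": hot,
    }


lead = LeadRecord("Sam", None, "emergency repair", urgency="hot")
print(triage(lead))
```

Validating the model's extraction against a fixed schema like this, rather than trusting free text, is what lets incomplete leads be flagged mechanically.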

5. It chases the quote on a timer

When a lead is marked as quoted, a follow-up sequence schedules itself: a check-in a few days later, a reminder at a week, and a flag to the owner at two weeks. The messages aren't templates. The model writes each one against the original lead details, so it refers to the customer's actual enquiry. If the customer replies, the remaining sequence is cancelled and the owner is pinged.
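The timer logic is small enough to sketch. The three-day offset for the first check-in is an assumption (the text only says "a few days"), and `FollowUp`, `schedule_follow_ups`, and `on_customer_reply` are hypothetical names. Message writing and owner notification are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed offsets from the moment the lead is marked as quoted.
SEQUENCE = (
    ("check_in", timedelta(days=3)),
    ("reminder", timedelta(days=7)),
    ("flag_owner", timedelta(days=14)),
)


@dataclass
class FollowUp:
    lead_id: str
    step: str
    due: datetime
    cancelled: bool = False


def schedule_follow_ups(lead_id: str, quoted_at: datetime) -> list[FollowUp]:
    """Create one pending follow-up per step in the sequence."""
    return [FollowUp(lead_id, step, quoted_at + offset) for step, offset in SEQUENCE]


def on_customer_reply(follow_ups: list[FollowUp], lead_id: str, now: datetime) -> int:
    """Cancel every step still pending for this lead; return how many were cancelled."""
    cancelled = 0
    for item in follow_ups:
        if item.lead_id == lead_id and not item.cancelled and item.due > now:
            item.cancelled = True
            cancelled += 1
    return cancelled


quoted = datetime(2025, 1, 6, 9, 0)
pending = schedule_follow_ups("lead-1", quoted)
# The customer replies on day four, after the check-in has already gone out.
print(on_customer_reply(pending, "lead-1", quoted + timedelta(days=4)))
```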


The default stack

Voice pipeline: Twilio ConversationRelay (Deepgram STT, ElevenLabs TTS)
Language model: Claude Sonnet (Anthropic)
WhatsApp channel: WhatsApp Business API via Twilio
Web form and chat: Custom widget or existing form, with AI reply by email
Database and state: Neon (serverless Postgres)
Notifications: Twilio SMS plus email digest
Conversation handler: Custom WebSocket and HTTP server

Three choices that matter most.

Twilio ConversationRelay over a managed voice AI platform. Vapi and Retell are convenient if you want a turnkey agent and you're happy inside their abstraction. You give up direct access to the conversation loop, the prompt is mediated by a wrapper, and your database becomes a webhook target rather than the system of record. The per-minute economics bend the wrong way too. Once transcription, model, voice, and telephony are stacked, managed platforms typically land at roughly twice the per-minute cost of a custom Twilio build before model and voice are added back. The cost gap matters less than the control gap. The moment you need to read a CRM during the call, branch on a postcode, or use something the business already knows about the caller, the managed platforms start to fight you.

Claude over a cheaper model for the conversation logic. Instruction-following is the whole job. The model has to hold the qualifying-question flow under a caller who interrupts, stay closed-book on pricing, refuse to commit on availability, escalate cleanly, and never invent a fact about the business. Cheaper models tend to drift into hallucinated pricing or confident scheduling commitments inside a handful of turns. On a customer-facing channel that is the expensive failure mode, and the token-cost difference is rounding error against one bad call.

Custom WebSocket server over a low-code agent builder. The conversation handler is a few hundred lines. Owning it means the prompt lives in version control, the same logic runs across phone and WhatsApp from one configuration, and real call transcripts replay in tests. Low-code builders save a week up front and cost it back the first time a regression has to be debugged through a GUI.
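As an illustration of what replaying transcripts in tests can mean, here is a minimal harness. It is a sketch under assumptions: transcripts are taken to be lists of `{"speaker", "text"}` dictionaries, `respond` is a hypothetical stand-in for the conversation handler, and the price check is a deliberately crude regular expression.

```python
import re
from typing import Callable

PRICE = re.compile(r"£\s?\d")  # crude check for a quoted price


def replay(transcript: list[dict],
           respond: Callable[[list[dict], str], str]) -> list[str]:
    """Feed recorded caller turns to the handler and collect what it says now."""
    history: list[dict] = []
    replies: list[str] = []
    for turn in transcript:
        if turn["speaker"] != "caller":
            continue  # recorded assistant turns are discarded, not replayed
        reply = respond(history, turn["text"])
        history.append({"speaker": "caller", "text": turn["text"]})
        history.append({"speaker": "assistant", "text": reply})
        replies.append(reply)
    return replies


def price_leaks(replies: list[str]) -> list[str]:
    """Return any reply that appears to state a price."""
    return [r for r in replies if PRICE.search(r)]


recorded = [
    {"speaker": "caller", "text": "How much is a loft conversion?"},
    {"speaker": "assistant", "text": "(what was said on the original call)"},
    {"speaker": "caller", "text": "Even roughly?"},
]


def stub_handler(history: list[dict], text: str) -> str:
    # Stand-in for the real handler and model call.
    return "I can't give prices on the phone, but I can arrange a survey."


print(price_leaks(replay(recorded, stub_handler)))
```

A check like this would run against every prompt change, so a regression shows up as a failing test instead of a bad call.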


When this isn't the right fit

The pattern is powerful, but it's the wrong tool for some problems.

High-value, relationship-led calls. If every enquiry is a senior decision-maker who expects a partner on the line, a polished AI receptionist is the wrong first touch. Use the pattern for after-hours and overflow, and have a person answer in office hours.

Volume too low to justify the build. Below roughly twenty inbound enquiries a week, an answering service or a virtual receptionist on retainer usually beats the cost of building. The pattern becomes economic when missed enquiries actually move revenue.

Regulated voice interactions. Financial advice, clinical triage, or anything else where regulation requires a named human on the call is not the place for a generative-AI conversation. The pattern can still handle the unregulated work around the edges: booking, intake, qualifying. It cannot replace the regulated part.

Account-specific transactional work. "What's my order status?" or "can you update my address?" pulls the model into acting on a customer's account, which needs proper authentication and audit logs. Keep the receptionist on first-touch enquiry capture and route account work to a logged-in channel.


What to expect

Implementation time: 3–6 weeks for a typical first build, depending on the number of channels and the depth of the qualifying logic.
Deployment options: Cloud-hosted by default. The prompt, the database, and the integrations all sit in accounts you own, not behind a vendor wrapper.
Infrastructure cost: Indicatively £60–150 per month at SMB call volumes, covering voice minutes, model usage, database, and hosting combined. Costs scale roughly linearly with minutes on the phone.
Voice latency: Twilio publishes ConversationRelay median round-trip under 500 ms and 95th percentile under 725 ms. In our own builds the felt latency is closer to a slightly thoughtful person than to an obvious robot.
Leads recovered: A TelePA sample of 142 UK SMBs found roughly half of initial calls go unanswered, and most of those callers don't ring back. A well-tuned deployment captures the after-hours and overflow share of that loss.
Secondary benefits: Pre-qualified leads instead of voicemail. Consistent quote follow-up. One inbox across phone, WhatsApp, and web. Full transcripts for every conversation.

If this pattern fits your team

A Pare Audit is the way to find out whether it does, and what a delivery would look like in your specific situation. We spend a focused few days with you, listen to real calls, read real WhatsApp threads, and come back with a written recommendation, a scoped build, and a costed plan.