Who this is for
Product teams and founders shipping AI-facing features
Ship a branded chat interface — public or internal — that talks to your product, your docs, and your APIs. SSR-rendered so crawlers can read it, streaming-first so users feel it.
Outcomes
- Chat surface live in 2–4 weeks, not two quarters
- Streaming UX with proper tool-use, function calling, and retries
- SSR so the chat page itself is discoverable by search and AI crawlers
The three shapes
Custom chat comes in three common shapes: a public sales-assistant chat on a marketing site, a gated in-app assistant wired to the user's own data, and an internal team-chat over company docs and tools. Each has different auth, retrieval, and latency constraints — so the answer is almost never 'drop in an off-the-shelf widget'.
What we build
WolfAI chat surfaces ship with model routing (Claude + GPT fallback), prompt caching for cost control, streaming tokens, tool-use support, and a first-party analytics hook. The UI is Next.js app router with a client streaming layer; the backend is Hono on Node, deployed behind a CDN.
What it costs to run
At typical volumes (500–5,000 conversations/month), a custom chat surface runs on Haiku or Sonnet with Opus reserved for hard questions. Per-conversation cost is usually in the single cents, dominated by retrieval and long context, not the model call itself.
Related products
Models typically involved
Frequently asked questions
How long does a custom chat interface take to ship?
A focused chat surface — one audience, one data source, one model — typically ships in 2–4 weeks from scoping to production. Adding multi-data retrieval or tool use extends that by 2–4 weeks per layer.
Can the chat be SEO-friendly?
Yes. WolfAI renders the chat shell and the initial conversation state with SSR so it lands in the raw HTML. The streaming updates happen client-side, but crawlers see the meaningful content on first load.