Custom Chat Interfaces

Who this is for

Product teams and founders shipping AI-facing features

Ship a branded chat interface — public or internal — that talks to your product, your docs, and your APIs. SSR-rendered so crawlers can read it, streaming-first so users feel it.

Outcomes

Chat surface live in 2–4 weeks, not two quarters
Streaming UX with proper tool-use, function calling, and retries
SSR so the chat page itself is discoverable by search and AI crawlers

The three shapes

Custom chat comes in three common shapes: a public sales-assistant chat on a marketing site, a gated in-app assistant wired to the user's own data, and an internal team-chat over company docs and tools. Each has different auth, retrieval, and latency constraints — so the answer is almost never 'drop in an off-the-shelf widget'.

What we build

WolfAI chat surfaces ship with model routing (Claude + GPT fallback), prompt caching for cost control, streaming tokens, tool-use support, and a first-party analytics hook. The UI is Next.js app router with a client streaming layer; the backend is Hono on Node, deployed behind a CDN.

What it costs to run

At typical volumes (500–5,000 conversations/month), a custom chat surface runs on Haiku or Sonnet with Opus reserved for hard questions. Per-conversation cost is usually in the single cents, dominated by retrieval and long context, not the model call itself.

Related products

WolfAI Chat

Claude-grade chat surface on the WolfAI root domain.

WolfAI Systems

Custom AI systems, agent flows, and private toolchains.

Models typically involved

Claude Sonnet 4.6 Claude Opus 4.7 GPT-5.5

Frequently asked questions

How long does a custom chat interface take to ship?

A focused chat surface — one audience, one data source, one model — typically ships in 2–4 weeks from scoping to production. Adding multi-data retrieval or tool use extends that by 2–4 weeks per layer.

Can the chat be SEO-friendly?

Yes. WolfAI renders the chat shell and the initial conversation state with SSR so it lands in the raw HTML. The streaming updates happen client-side, but crawlers see the meaningful content on first load.