Architecture Notes

What We Learned Shipping Production AI Systems

These are the decisions that shape outcomes — latency, governance, system boundaries, and delivery discipline. Each note is drawn from real systems we’ve built and deployed.

Why We Publish These Notes

Every AI project contains hidden architectural decisions that determine whether it succeeds or stalls.

We document what worked, what failed, and what materially changed the outcome — so decision makers and builders can approach similar systems with clarity.

This isn’t theory. These are patterns extracted from systems we’ve shipped.

Architecture · 4 min read

Building a Real-Time Voice Agent with LiveKit Agents SDK

How we architected a sub-300ms real-time voice agent for enterprise deployment — and why those latency decisions directly impact user trust and adoption.

Real-time voice agents have a hard requirement: latency below 300ms, or the conversation feels broken. Getting there requires careful architecture, not just picking the right model.

**The Stack That Works**

We use LiveKit Agents SDK as the real-time transport layer, paired with OpenAI's Realtime API for turn-taking and function calling. Supabase handles session state and any data lookups the agent needs mid-call.

**Key Architectural Decisions**

1. **Keep tool calls tight**: Every function call the agent makes during a live call adds latency. We pre-load context into the session prompt and only call tools when the user explicitly requests data.
2. **Interrupt handling matters**: Users interrupt themselves and each other. LiveKit handles the WebRTC layer, but you need to explicitly design your agent to handle mid-sentence interruptions gracefully — most docs gloss over this.
3. **Warm up the pipeline**: Cold-starting a voice session has a baseline cost. We pre-warm agent instances during business hours for clients with predictable call volumes.
4. **Fallback to text**: Not every environment can guarantee audio quality. We build every voice agent with a text fallback that shares the same tool and context layer.

**What We Learned**

The hardest part of voice agents isn't the AI — it's the plumbing. Audio encoding, silence detection thresholds, and session cleanup are where most production issues come from. LiveKit handles most of this correctly by default; don't fight the defaults until you have a reason to.
LiveKit · Voice Agents · OpenAI · Production

Why This Matters for Decision Makers

  • Improves adoption rates by eliminating perceived latency friction
  • Reduces session instability risk in production environments
  • Prevents latency-driven user drop-off before value is delivered
Governance · 5 min read

Governance-First AI: Why Supabase RLS Matters Before You Scale

Most teams add data access controls after the fact. By then, the refactor is expensive and the risk is real. Here's how we build governance in from day one.

Governance is not a compliance checkbox. It's the architecture decision that determines whether your AI system is deployable in an enterprise environment or a liability.

**Why Teams Skip It**

Row Level Security (RLS) in Supabase adds a layer of thinking that slows down early prototyping. It's easier to give everything full access and sort it out later. The problem is that "later" in an enterprise context means revisiting every query, every API endpoint, and every AI tool call — while a client is waiting, and often while executives and security teams are already reviewing the system.

**The Governance-First Approach**

We define RLS policies at table creation, not as an afterthought. The rule is simple: if a row contains user data, it has a policy from day one. This means:

- Service role keys never leave the server
- Every AI agent's database access is scoped to what it actually needs
- Audit logs are enabled for any table an AI writes to

**RPCs as the AI Interface**

Rather than letting AI agents write raw SQL or call arbitrary endpoints, we wrap all data operations in Postgres RPC functions. The agent calls a function; the function enforces business rules and RLS internally. This gives us a clean audit trail and makes it easy to version-control what the AI can and cannot do.

**The Payoff**

Clients in healthcare and finance can't deploy AI systems that don't have demonstrable data controls. Building governance in from the start means we can hand off a system that passes security review without a refactor. That's a competitive advantage — and it's why we structure every project this way.
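The RPC-as-interface idea can be illustrated with a small dispatcher: the agent can only invoke named operations, the caller's identity is injected server-side, and every call is audit-logged. This is a hypothetical sketch in plain Python — in production the handlers would be Postgres functions running under RLS, and `ALLOWED_RPCS`, `call_rpc`, and the handler names are illustrative, not Supabase API:

```python
from datetime import datetime, timezone

AUDIT_LOG: list[dict] = []

def _get_invoices(user_id: str, params: dict) -> list[dict]:
    # A real handler is a Postgres RPC executing under RLS; here we
    # just show that the user scope is always applied server-side.
    return [{"user_id": user_id, "invoice": params.get("invoice_id")}]

# The only operations the agent can reach — version-controlled,
# reviewable, and deliberately small.
ALLOWED_RPCS = {"get_invoices": _get_invoices}

def call_rpc(name: str, user_id: str, params: dict):
    """Dispatch an agent request to a whitelisted operation."""
    if name not in ALLOWED_RPCS:
        raise PermissionError(f"RPC {name!r} is not exposed to the agent")
    # Audit entry first, so even failed business-rule checks leave a trace.
    AUDIT_LOG.append({
        "rpc": name,
        "user_id": user_id,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    # user_id comes from the authenticated session, never from the agent.
    return ALLOWED_RPCS[name](user_id, params)
```

The key property: an agent asking for `"delete_all_rows"` gets a `PermissionError`, not a query, and the attempt still shows up in the audit trail you hand to a security reviewer.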
Supabase · RLS · Data Governance · Enterprise AI

Why This Matters for Decision Makers

  • Reduces security review friction for enterprise and regulated deployments
  • Accelerates enterprise deployment timelines
  • Prevents costly mid-project governance refactors
Process · 4 min read

From Prototype to Production: How We Ship AI Systems

The gap between a working demo and a production system determines whether AI becomes an asset — or an abandoned experiment.

Most AI demos look like production. Most production failures look like they started as demos. The gap isn't capability — it's the operational layer that keeps the system working when real users do unexpected things.

**The Three Stages We Operate In**

**Prototype** — A working demo that proves the core AI behavior. No auth, no error handling, no monitoring. The only goal is showing the right output for the right input. This takes days, not weeks.

**Govern** — We add the production skeleton: RLS policies, auth, rate limiting, error boundaries, logging, and environment separation. This is where most teams stall because they underestimate how long it takes. We budget for it explicitly.

**Ship** — Edge cases, performance validation, load testing, and client handoff documentation. We don't hand off a system without a runbook that explains how to monitor it and what to do when something breaks.

**What Most Teams Get Wrong**

Skipping from Prototype to Ship. It works until it doesn't — and when it breaks, there's no governance layer to isolate the problem.

**The Honest Timeline**

A production-grade AI system with a well-scoped use case takes 4–8 weeks, not 4–8 days. Anyone promising faster without a clear scope definition is handing you a prototype and calling it production. The prototype takes days. Govern and Ship take the rest of the time. Plan for it.
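One way to keep teams from skipping Govern is to make the stage gates explicit and machine-checkable. A minimal sketch, assuming illustrative gate items (the checklist names below are examples, not our full internal list):

```python
# Prototype -> Govern -> Ship, each with the items that must close
# before the next stage begins. Declared order is delivery order.
STAGE_GATES = {
    "prototype": ["core_behavior_demonstrated"],
    "govern": [
        "rls_policies", "auth", "rate_limiting",
        "error_boundaries", "logging", "env_separation",
    ],
    "ship": ["load_tested", "runbook_delivered"],
}

def missing_items(stage: str, completed: set[str]) -> list[str]:
    """Return the gate items still open for one stage."""
    return [item for item in STAGE_GATES[stage] if item not in completed]

def ready_to_ship(completed: set[str]) -> bool:
    # Shipping requires every earlier gate to be closed too — this is
    # exactly the check that "Prototype straight to Ship" fails.
    return all(not missing_items(stage, completed) for stage in STAGE_GATES)
```

A demo-only project (`{"core_behavior_demonstrated"}`) fails `ready_to_ship` with the entire Govern list still open, which is a far cheaper place to discover that than in production.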
AI Systems · Production · Architecture · Delivery

Why This Matters for Decision Makers

  • Sets realistic timelines before budget is committed
  • Avoids under-scoped AI initiatives that stall after the prototype
  • Protects executive credibility by separating demo success from production readiness

Want to See This Applied to Your Use Case?

If you’re evaluating a similar AI initiative, let’s map your architecture to the right production path — before momentum and budget are wasted.