At Prizmstack, we don't just build AI solutions for clients—we use them ourselves. Prizma is our AI assistant that greets visitors, answers questions about our services, and qualifies leads before they ever speak to a human. This post shares the technical decisions, trade-offs, and lessons learned.
Why Build an In-House AI Assistant?
- Proof of concept: Clients want to see that we practice what we preach.
- Lead qualification: Prizma handles initial conversations, freeing our team for deeper engagements.
- 24/7 availability: Visitors from any timezone get instant responses.
- Data ownership: We control the training data, prompts, and conversation logs.
Architecture Overview
- Frontend: React with Valtio for state, Framer Motion for animations.
- Backend: Next.js API routes proxying to OpenAI's Agents SDK.
- Persistence: Supabase stores session IDs, message history, and rate-limit counters.
- Experience: Streaming text responses and guided starter prompts keep conversations moving.
Key Technical Decisions
1. Streaming Responses
Users expect instant feedback. We stream tokens from OpenAI to the browser using Server-Sent Events (SSE):
```ts
// Simplified streaming handler (Next.js API route)
import OpenAI from 'openai';

const openai = new OpenAI();

export async function POST(req: Request) {
  const { messages } = await req.json();
  const stream = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages,
    stream: true,
  });
  // toReadableStream() pipes chunks to the browser as they arrive,
  // so the first tokens render long before the full reply is done.
  return new Response(stream.toReadableStream(), {
    headers: { 'Content-Type': 'text/event-stream' },
  });
}
```

2. Session Management with JWT
To prevent abuse, we issue a signed JWT on first visit. The token encodes:
- Session ID
- Message count
- Expiration timestamp
Each request validates the token and increments the counter. After 10 messages, we prompt users to book a call.
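The scheme can be sketched as follows. This is an illustrative version using Node's built-in `crypto` for an HMAC-signed payload rather than a full JWT library, and the function names (`sign`, `verify`, `consumeMessage`) are hypothetical, not our production API:

```typescript
// Sketch of the session-token flow: a signed payload carrying the session
// ID, message count, and expiration. A real deployment would use a standard
// JWT library; this stands in with an HMAC over a base64url body.
import { createHmac } from 'crypto';

const SECRET = process.env.SESSION_SECRET ?? 'dev-secret';
const MESSAGE_LIMIT = 10;

interface SessionToken {
  sessionId: string;
  messageCount: number;
  exp: number; // expiration, unix seconds
}

function sign(payload: SessionToken): string {
  const body = Buffer.from(JSON.stringify(payload)).toString('base64url');
  const sig = createHmac('sha256', SECRET).update(body).digest('base64url');
  return `${body}.${sig}`;
}

function verify(token: string): SessionToken | null {
  const [body, sig] = token.split('.');
  const expected = createHmac('sha256', SECRET).update(body ?? '').digest('base64url');
  if (sig !== expected) return null; // signature mismatch: tampered or malformed
  const payload = JSON.parse(Buffer.from(body, 'base64url').toString()) as SessionToken;
  if (payload.exp < Date.now() / 1000) return null; // expired
  return payload;
}

// Per chat request: validate the token, enforce the limit, then re-issue
// it with the counter incremented.
function consumeMessage(token: string): string | 'limit' | 'invalid' {
  const payload = verify(token);
  if (!payload) return 'invalid';
  if (payload.messageCount >= MESSAGE_LIMIT) return 'limit';
  return sign({ ...payload, messageCount: payload.messageCount + 1 });
}
```

Because the counter lives inside the signed token, a client cannot reset it without invalidating the signature.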
3. Guided Chat UX
The fastest way to reduce friction was to open directly into chat and offer a few good starting prompts:
```ts
export const CHAT_STARTER_PROMPTS = [
  'What services does Prizmstack offer?',
  'How do you approach a new project?',
  'What kinds of businesses do you work with?',
  'How quickly could we get started?',
];
```

4. Prompt Engineering
Prizma's personality is defined in a system prompt:
```
You are Prizma, a friendly AI assistant for Prizmstack, a full-spectrum software development agency.

Your goals:
1. Greet visitors warmly and learn their name.
2. Understand their software challenges.
3. Explain how Prizmstack can help (AI, Product Development, Infrastructure).
4. Encourage them to book a discovery call.

Tone: Professional yet approachable. Concise answers. No jargon unless the user is technical.
```
We iterate on this prompt weekly based on conversation logs.
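Wiring the prompt into each request is straightforward. A minimal sketch, assuming the prompt text lives in a `SYSTEM_PROMPT` constant (the helper name is illustrative, not our production API):

```typescript
// Combine the versioned system prompt with the conversation history so the
// model sees Prizma's instructions first on every request.
const SYSTEM_PROMPT = `You are Prizma, a friendly AI assistant for Prizmstack...`;

interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

export function buildMessages(history: ChatMessage[]): ChatMessage[] {
  // System prompt always leads; user/assistant turns follow in order.
  return [{ role: 'system', content: SYSTEM_PROMPT }, ...history];
}
```

Keeping the prompt in a single exported constant is what lets us version and review it like any other code.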
Lessons Learned
| Challenge | Solution |
|---|---|
| Hallucinations about services | Added retrieval step with curated FAQ docs |
| Latency spikes | Moved to streaming + edge functions |
| Users not knowing how to begin | Added starter prompts and a clearer opening message |
| Users gaming the system | JWT-based rate limiting + CAPTCHA fallback |
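The retrieval step from the table above can be sketched like this. A simple keyword-overlap scorer stands in for the embedding search used in production, and the FAQ contents here are illustrative:

```typescript
// Minimal sketch of grounding Prizma's answers in curated FAQ docs:
// score each doc by word overlap with the user query and prepend the
// top hits to the prompt as context.
interface FaqDoc {
  question: string;
  answer: string;
}

const FAQ_DOCS: FaqDoc[] = [
  { question: 'What services does Prizmstack offer?', answer: 'AI, Product Development, and Infrastructure.' },
  { question: 'How quickly could we get started?', answer: 'Discovery calls are usually available within a week.' },
];

function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().match(/[a-z]+/g) ?? []);
}

export function retrieve(query: string, docs: FaqDoc[], k = 2): FaqDoc[] {
  const queryWords = tokenize(query);
  return docs
    .map((doc) => {
      const docWords = tokenize(`${doc.question} ${doc.answer}`);
      let overlap = 0;
      for (const w of queryWords) if (docWords.has(w)) overlap++;
      return { doc, overlap };
    })
    .filter((s) => s.overlap > 0) // drop docs with nothing in common
    .sort((a, b) => b.overlap - a.overlap)
    .slice(0, k)
    .map((s) => s.doc);
}
```

Even this crude version sharply reduced hallucinated service claims; swapping in embeddings later improved recall on paraphrased questions.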
Metrics After 3 Months
- 2,500+ conversations handled autonomously.
- 35% of visitors engage with Prizma.
- 12% conversion to booked discovery calls (up from 4% with static forms).
- Average response time: 1.2 seconds.
What's Next
- Multi-modal input: Allow users to upload screenshots or documents.
- Memory: Persist context across sessions for returning visitors.
- Agent tools: Let Prizma query our CRM or schedule meetings directly.
Key Takeaways
- Dogfooding builds credibility and surfaces real-world edge cases.
- Streaming + clear prompts dramatically improve perceived responsiveness.
- Rate limiting is essential for any public-facing AI.
- Prompt iteration is ongoing—treat it like code, version it, and review it.
Want to build an AI assistant for your business? Reach out and let's explore what's possible.
Written by Prizmstack Team
Full-spectrum software agency