2025 · AI

palav.ai

Conversational AI experiment.

TypeScript · Web Audio API · Whisper · GGUF
palav.ai · case study

Context

I journal. I forget to. I wanted to know if a voice-first journaling assistant — something that just listens and prompts gently — could make the practice stick. No login, no cloud, no subscription. A single page, a microphone, a small model.

Problem

Build a working prototype of a voice-first journaling assistant that runs entirely in the browser; no data leaves the device. The conversation should feel like turn-taking, not interrogation.

Approach

Whisper.cpp compiled to WASM for transcription. A small GGUF-quantised LLM served from a local llama.cpp instance for the response. Web Audio API for capture, Web Speech API for output. The whole loop runs sub-second on an M-series Mac.
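
Roughly, the loop looks like this. A sketch only: `transcribe` stands in for the whisper.cpp WASM wrapper, and the request assumes llama.cpp's OpenAI-compatible chat endpoint; the port, prompt, and sampling settings are illustrative, not the project's actual code.

```ts
// Capture → transcribe → respond → speak, all on-device.
// `transcribe` is an assumed wrapper around the whisper.cpp WASM module.
declare function transcribe(samples: Float32Array): Promise<string>;

type Msg = { role: "system" | "user" | "assistant"; content: string };

const SYSTEM_PROMPT =
  "You are a gentle journaling companion. Reply in one or two sentences " +
  "and end with a short, open question. Never ask yes/no questions.";

const history: Msg[] = [];

async function respond(utterance: string): Promise<string> {
  // llama.cpp's server speaks an OpenAI-compatible chat protocol;
  // port and sampling values here are placeholders.
  const res = await fetch("http://localhost:8080/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      messages: [
        { role: "system", content: SYSTEM_PROMPT },
        ...history,
        { role: "user", content: utterance },
      ],
      temperature: 0.7,
      max_tokens: 120, // short replies keep the turn-taking feel
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}

function speak(text: string): void {
  // Web Speech API output; nothing leaves the page.
  speechSynthesis.speak(new SpeechSynthesisUtterance(text));
}

// Called once voice-activity detection decides the user has finished a turn.
async function onUtterance(samples: Float32Array): Promise<void> {
  const text = await transcribe(samples);  // whisper.cpp WASM, off the main thread
  const reply = await respond(text);       // local llama.cpp
  history.push({ role: "user", content: text }, { role: "assistant", content: reply });
  speak(reply);
}
```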

Build

  • TypeScript SPA, no framework — kept the surface area minimal so the audio pipeline could breathe.
  • Whisper.cpp WASM build for transcription, swapped in/out by quality tier (tier map sketched after this list).
  • Local llama.cpp server with a small instruct-tuned GGUF model; system prompt that nudges toward open questions, never closed.
  • Browser ring buffer + voice-activity detection so the assistant only thinks while you're not talking (buffer sketch after this list).
  • Zero persistence layer; the whole conversation lives in memory and dies when you close the tab.
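
The tier swap is just a lookup from tier to ggml model file. A minimal sketch, assuming the stock whisper.cpp English models served as static files; the tier labels and loader are illustrative, not the project's real API:

```ts
// Tier → whisper.cpp model file. Filenames are the stock ggml English models;
// the tier labels and fetch path are assumptions for illustration.
const WHISPER_TIERS = {
  fast: "ggml-tiny.en.bin",      // lowest latency, roughest transcripts
  balanced: "ggml-base.en.bin",
  quality: "ggml-small.en.bin",  // noticeably slower in WASM
} as const;

type Tier = keyof typeof WHISPER_TIERS;

async function loadWhisperModel(tier: Tier): Promise<ArrayBuffer> {
  // Weights are fetched once and handed to the WASM module.
  const res = await fetch(`/models/${WHISPER_TIERS[tier]}`);
  return res.arrayBuffer();
}
```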

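The capture side, roughly: a fixed-size ring buffer of recent microphone samples plus a crude RMS energy gate. The threshold and the AudioWorklet plumbing that would call `push` are placeholders, not tuned values from the prototype:

```ts
// Fixed-size ring buffer of recent microphone samples, plus a simple
// energy-based speech gate. Sample rate and threshold are assumptions.
class RingBuffer {
  private buf: Float32Array;
  private write = 0;
  private filled = 0;

  constructor(seconds: number, sampleRate = 16_000) {
    this.buf = new Float32Array(seconds * sampleRate);
  }

  push(chunk: Float32Array): void {
    for (const s of chunk) {
      this.buf[this.write] = s;
      this.write = (this.write + 1) % this.buf.length;
    }
    this.filled = Math.min(this.filled + chunk.length, this.buf.length);
  }

  // Oldest-to-newest copy of everything captured so far,
  // handed to the transcriber at the end of a turn.
  snapshot(): Float32Array {
    const out = new Float32Array(this.filled);
    const start = (this.write - this.filled + this.buf.length) % this.buf.length;
    for (let i = 0; i < this.filled; i++) {
      out[i] = this.buf[(start + i) % this.buf.length];
    }
    return out;
  }
}

// True when a chunk's RMS energy crosses a hand-tuned speech threshold.
function isSpeech(chunk: Float32Array, threshold = 0.01): boolean {
  let sum = 0;
  for (const s of chunk) sum += s * s;
  return Math.sqrt(sum / chunk.length) > threshold;
}
```
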
Outcome

Working weekend prototype. Used it on myself for a week — the voice-first surface really does change what you talk about. Convinced me there's a real product here, but not one I want to ship alone.

What I would change

The hardest part wasn't the model — it was the silence handling. Voice-first interfaces need a sense of pacing, and 'when does the assistant know I'm done' is unsolved on the open web.
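
The obvious stopgap is to call it end-of-turn after a run of silent frames long enough to outlast a thinking pause. A sketch, with the timing constants as placeholder guesses rather than tuned values:

```ts
// End-of-turn heuristic: enough consecutive non-speech frames to outlast a
// thinking pause. Frame size and the 1.2 s hang time are placeholder values.
const FRAME_MS = 30;
const END_OF_TURN_MS = 1200;

let silentMs = 0;
let speaking = false;

// Called once per captured audio frame with the VAD verdict for that frame.
function onFrame(frameIsSpeech: boolean, onTurnEnd: () => void): void {
  if (frameIsSpeech) {
    speaking = true;
    silentMs = 0;
    return;
  }
  if (!speaking) return;            // ignore silence before the user starts
  silentMs += FRAME_MS;
  if (silentMs >= END_OF_TURN_MS) {
    speaking = false;
    silentMs = 0;
    onTurnEnd();                    // hand the buffered audio to the transcriber
  }
}
```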
