On the product · 7 min read · 2026-05-20
The first 15 minutes with Koda (as it's being built today).
A minute-by-minute walkthrough from opening the box to a first session — split, every step, into what already works, what's in the next release, and what we're still wiring up on the bench.
We don't want this post to read like a glossy unboxing video. Koda is being built in public, and some of what makes the design interesting (paper-only answers, the spoken welcome, the “Hi Koda” wake word) is on the workbench, not in your hands yet. Below, every step is tagged with one of three labels: Today (works in the build we ship right now), Next release (PR is open or in flight), and On the workbench (in the design notes, not yet wired). We'd rather be specific than smooth.
Minute 0–3: Unboxing and cameras.
Two cameras come in the box. One is an IPEVO document camera on a gooseneck arm that points down at the worksheet. The other is a TOTEM 360 front-facing webcam that sits roughly where a videoconference camera would sit, looking at your child. Both are USB and plug into the same Mac mini that runs the tutor.
Today. Plugging both cameras in and seeing them light up is straightforward. Aiming the IPEVO is the part that takes patience. The overhead camera needs to see the whole worksheet with a small margin, and it needs visible contrast between paper and the surface underneath. A dark placemat under a white worksheet works best; a glossy wood table can be slow because the page detector is looking for a quadrilateral whose edges stand out against the background. Even with our hints, first-time positioning can take more than three minutes on a low-contrast desk. It is the steepest part of the learning curve on day one, and we'd rather warn you than have you feel like you're doing it wrong.
The TOTEM doesn't need careful aim. If it can see your child's face when they're seated, it's fine. A lightweight face-presence check runs at about two frames per second and only hands off to the more expensive recognition step once a face has been visible for a few seconds.
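The hand-off described above is a debounce: count consecutive sampled frames with a face and only escalate once the streak is long enough. A minimal sketch, assuming a three-second hold (the post only says "a few seconds"); the class and field names are illustrative, not Koda's actual code.

```python
from dataclasses import dataclass


@dataclass
class FacePresenceGate:
    """Debounce for a face-presence check sampled at a fixed rate."""

    sample_hz: float = 2.0     # from the post: about two frames per second
    hold_seconds: float = 3.0  # assumption: the post only says "a few seconds"
    _streak: int = 0           # consecutive sampled frames that contained a face

    def update(self, face_visible: bool) -> bool:
        """Feed one sampled frame; return True once the face has been held long enough."""
        self._streak = self._streak + 1 if face_visible else 0
        return self._streak >= round(self.sample_hz * self.hold_seconds)
```

With these defaults the gate opens on the sixth consecutive frame with a face, and a single frame without one resets the count.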
Minute 3–5: Parent setup.
Today. Open the app. The first thing it asks for is a parent password: six characters minimum, hashed for security before it's persisted. There is no email, no account, no server-side anything. The password lives on your machine and gates the parent portal: progress reports, the accommodation toggles, the wake-word setting.
Today, with a caveat. There is no in-app password recovery flow yet. If you lose the parent password, you don't lose your child's data, but you do lose access to the portal until we reset it. The honest answer is: contact us and we'll walk you through it. Don't edit the local database by hand, and please use a password manager from the start.
Today. From the home screen there is a Preflight button that runs a short diagnostic before you go further. It reports whether each camera broker is healthy and how recently each one delivered a frame, whether the TTS engine is reachable, whether the local LLM is importable, and whether the face embedder loaded its real weights or fell back to the development stub. It's a status panel, not an exhaustive bench test, but if anything is red here, it's much less frustrating to fix now than mid-session.
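For readers curious what "hashed before it's persisted" can look like on a single machine, here is a sketch using only Python's standard library. The post does not say which hash Koda uses; PBKDF2-HMAC-SHA256, the iteration count, and the function names are all assumptions made for illustration.

```python
import hashlib
import hmac
import os

MIN_LENGTH = 6        # from the post: six characters minimum
ITERATIONS = 600_000  # assumption: the post does not name an algorithm or cost
SALT_BYTES = 16


def hash_password(password: str) -> bytes:
    """Return salt + derived key, suitable for storing locally."""
    if len(password) < MIN_LENGTH:
        raise ValueError(f"password must be at least {MIN_LENGTH} characters")
    salt = os.urandom(SALT_BYTES)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt + key


def verify_password(password: str, stored: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    salt, expected = stored[:SALT_BYTES], stored[SALT_BYTES:]
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(key, expected)
```

Only the salt and derived key are stored, which is also why a lost password cannot simply be read back out of the local database.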
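A status panel like this boils down to a list of named checks, each either green or red. The sketch below mirrors the items listed above; the `Check` type, the function names, the `"onnx"`/`"stub"` backend strings, and the two-second freshness threshold are assumptions, not Koda's real interface.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Check:
    name: str
    ok: bool
    detail: str


def run_preflight(
    camera_frame_age_s: Dict[str, float],
    tts_reachable: bool,
    llm_importable: bool,
    embedder_backend: str,
    max_frame_age_s: float = 2.0,  # assumption: the post gives no freshness threshold
) -> List[Check]:
    """Build one Check per camera, then one each for TTS, LLM, and face embedder."""
    checks = [
        Check(f"camera:{name}", age <= max_frame_age_s, f"last frame {age:.1f}s ago")
        for name, age in camera_frame_age_s.items()
    ]
    checks.append(Check("tts", tts_reachable, "reachable" if tts_reachable else "unreachable"))
    checks.append(Check("llm", llm_importable, "importable" if llm_importable else "import failed"))
    # The development stub counts as red because recognition cannot match faces with it.
    checks.append(Check("face_embedder", embedder_backend == "onnx", f"backend={embedder_backend}"))
    return checks


def all_green(checks: List[Check]) -> bool:
    return all(check.ok for check in checks)
```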
Minute 5–8: Enroll the child.
Today. Tap Add child. The enrollment wizard is four steps: identity (a required first name and an optional grade level), photo, avatar, and review. We don't ask for a last name, school, or birthday. Grade is unvalidated — you can leave it blank.
Today. The photo step takes a single frame from the webcam. The app computes a face embedding — a fixed-size vector of numbers, not the image itself — and stores both the photo and the embedding locally on the device. The embedding is what lets the home screen suggest the right tile when your child sits down. Face recognition is a hint, never a login gate; tapping the avatar tile always works.
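To make "a hint, never a login gate" concrete: matching usually means comparing a fresh embedding against the enrolled ones and suggesting the closest profile only if it clears a threshold. A minimal sketch, assuming cosine similarity and a 0.6 cutoff; neither is confirmed by the post, and the function names are illustrative.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def suggest_profile(
    probe: Sequence[float],
    enrolled: Dict[str, Sequence[float]],
    threshold: float = 0.6,  # assumption: the real cutoff is not stated in the post
) -> Optional[str]:
    """Return the best-matching profile name, or None when nothing is close enough."""
    best_name, best_score = None, threshold
    for name, embedding in enrolled.items():
        score = cosine_similarity(probe, embedding)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

A `None` result just means no tile is pre-suggested; the child can still tap their avatar.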
Workbench, important. The face embedder ships with two backends: a real ONNX pipeline and a development stub. The default build falls back to the stub if the ONNX weights haven't been fetched, in which case auto-recognition won't actually match faces. It will appear to work, but every “Is this you?” suggestion will be the wrong kid. Preflight tells you which embedder is loaded; if you see the stub, run the model-fetch step before relying on recognition. Tapping the avatar tile is the reliable path either way.
Next release. A custody affirmation step (a checkbox confirming the adult enrolling the child is their parent or legal guardian) is the next thing being added to the wizard. It will sit between photo and avatar. The PR is open; it hasn't shipped yet, so the current build does not gate on it.
Today, but in a different place than you might expect. The per-child accommodation toggles (support flags for ADHD, dyscalculia, dysgraphia, dyslexia, ASD) live in the parent portal, not in the wizard. The defaults are real — turning ADHD on shortens the focus chunk to 10 minutes, disables animations, and switches off variable-reward mystery drops. Putting the toggles in the wizard itself is a later release.
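The ADHD defaults described above amount to a small transformation over session settings. A sketch under stated assumptions: the 20-minute baseline, the setting names, and the `"adhd"` flag string are invented for illustration; only the three ADHD effects come from the post.

```python
from dataclasses import dataclass, replace
from typing import Set


@dataclass(frozen=True)
class SessionSettings:
    focus_chunk_minutes: int = 20  # assumption: the baseline is not stated in the post
    animations: bool = True
    mystery_drops: bool = True


def apply_flags(base: SessionSettings, flags: Set[str]) -> SessionSettings:
    """Return a new settings object with accommodation defaults applied."""
    if "adhd" in flags:
        # From the post: 10-minute focus chunks, no animations, no mystery drops.
        base = replace(
            base,
            focus_chunk_minutes=min(base.focus_chunk_minutes, 10),
            animations=False,
            mystery_drops=False,
        )
    return base
```

Returning a new object instead of mutating keeps the parent's stored defaults separate from what a given session actually uses.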
Minute 8–12: First session.
Sit your kid at the desk with a worksheet. The TOTEM sees their face; the supervisor emits a face-seen signal a couple of seconds later. The IPEVO sees the worksheet; once the page has been still for roughly two seconds (four frames at two frames per second) and the quadrilateral page detector locks onto its corners, the supervisor moves toward a session-ready state.
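The "still for four frames" rule can be sketched as a counter over detected page corners. This assumes stillness means no corner moved more than a few pixels between consecutive frames; the 5-pixel tolerance and the class name are illustrative guesses, not values from Koda.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


class PageStillness:
    """Report True once the page quad has held still for enough consecutive frames."""

    def __init__(self, frames_required: int = 4, max_shift_px: float = 5.0):
        self.frames_required = frames_required  # from the post: four frames at 2 fps
        self.max_shift_px = max_shift_px        # assumption: tolerance is not stated
        self._last: Optional[Sequence[Point]] = None
        self._still = 0

    def update(self, quad: Optional[Sequence[Point]]) -> bool:
        """Feed the four detected corners for one frame, or None if no page was found."""
        if quad is None:
            self._last, self._still = None, 0
            return False
        moved = self._last is None or any(
            math.dist(p, q) > self.max_shift_px for p, q in zip(self._last, quad)
        )
        self._still = 1 if moved else self._still + 1
        self._last = quad
        return self._still >= self.frames_required
```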
Today, with a known rough edge. The ambient handoff is a real source of fragility, not a smooth drift. The supervisor only transitions to “worksheet detected” after a profile has been confirmed: first face-matched, then accepted through the parent-side “Is this you?” confirm. The confirm endpoint is parent-token gated server-side; the kid-facing overlay proceeds optimistically if the call fails. That means in practice the UI can advance into a session even when the supervisor is still in the previous state. We're tightening this contract; today, if the home screen feels stuck after a face match, tapping the avatar tile is the deterministic path.
Workbench, and this is the big one. The whole “kid writes the answer on paper, Koda watches” loop, the part that's the entire point of Koda's design, isn't in the production session UI today. The current session view collects a problem and an answer through two text inputs. The handwriting OCR, worksheet region map, and the Apple Vision capture path are all in flight; they aren't wired into the session view yet. So in today's build, the kid types their answer in. Saying otherwise would be a marketing claim, not a true one. The paper-only flow is the next big thing we're building, and we'll be specific in the release note when it lands.
Workbench. The spoken-answer path (a dysgraphia accommodation where the kid says the answer instead of writing it) is a default-true flag on dysgraphia profiles, but the speech-recognition pipeline that would act on it isn't implemented yet.
Workbench. A custom “Hi Koda” wake word needs about 50 short voice samples to train. We haven't shipped that training step yet, so today's wake word uses a generic preset model as a proxy. The wake word is off by default and is enabled from the parent portal.
Today. The arithmetic is checked deterministically against a symbolic-algebra library, not asked of the language model (more on the why here).
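To illustrate what "checked deterministically" means, here is a stand-in that uses exact rational arithmetic from Python's standard library in place of a symbolic-algebra library. It is not Koda's checker; the function names and the supported operators are assumptions chosen to keep the sketch small.

```python
import ast
import operator
from fractions import Fraction

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expr: str) -> Fraction:
    """Evaluate +, -, *, / over numeric literals exactly, without calling eval()."""

    def walk(node: ast.AST) -> Fraction:
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            # Going through str() keeps 0.1 as exactly 1/10 instead of its binary float.
            return Fraction(str(node.value))
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")

    return walk(ast.parse(expr, mode="eval"))


def is_correct(problem: str, answer: str) -> bool:
    """True when the answer is exactly equal to the problem's value."""
    try:
        return evaluate(problem) == evaluate(answer)
    except (ValueError, SyntaxError, ZeroDivisionError):
        return False
```

The same inputs always give the same verdict, so `is_correct("1/2 + 1/3", "5/6")` and `is_correct("0.1 + 0.2", "0.3")` both hold, with no floating-point drift and no model in the loop.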
Today, with a caveat about voice. The session manager only plays a spoken hint when a TTS engine has been injected into it. The default build constructs the manager without one, so the production app today returns text hints without audio. The TTS engines themselves are real (auto-discovery tries Chatterbox with a voice clone, then Kokoro, then Piper, then macOS's built-in say); what's missing is the wiring that hands the engine to the session manager on app startup. That wiring is the next release.
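The gap described above is easiest to see in code: discovery walks an ordered list of engines, and the session manager speaks only if it was handed one. This is a sketch with hypothetical names (`TTSEngine`, `discover_tts`, `SessionManager.give_hint`); the factories stand in for whatever actually loads Chatterbox, Kokoro, Piper, or say.

```python
from typing import Callable, Optional, Protocol, Sequence, Tuple


class TTSEngine(Protocol):
    def speak(self, text: str) -> None: ...


Factory = Callable[[], Optional[TTSEngine]]


def discover_tts(candidates: Sequence[Tuple[str, Factory]]) -> Optional[Tuple[str, TTSEngine]]:
    """Try each engine in order and return the first one that loads."""
    for name, factory in candidates:
        try:
            engine = factory()
        except Exception:
            continue  # engine not installed or failed to start; try the next one
        if engine is not None:
            return name, engine
    return None


class SessionManager:
    def __init__(self, tts: Optional[TTSEngine] = None):
        self._tts = tts  # None reproduces today's behavior: text hints only

    def give_hint(self, text: str) -> str:
        if self._tts is not None:
            self._tts.speak(text)
        return text
```

The missing wiring is then a single step at startup: pass the discovered engine into the manager's constructor.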
Minute 12–15: What just happened.
What Koda did not try to learn in those first minutes: how engaged your child looked, how much they smiled, how long they hesitated before writing. The supervisor watches for face presence and worksheet stability, and that's it. There is no behavior classifier in the session loop, no affect score, no engagement signal feeding back into pedagogy (more on this in what the camera actually sees).
What Koda did learn: a face embedding (local, never leaves the device), a name, an optional grade, the accommodation flags you toggled, and a few entries in an append-only event log. Reports are built from that log.
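An append-only event log can be as simple as one JSON object per line, written in append mode and never rewritten. The sketch below assumes that file format and invents the event names; the post does not describe Koda's actual storage.

```python
import json
import time
from collections import Counter
from pathlib import Path
from typing import Any, Dict, List


def append_event(log_path: Path, kind: str, **fields: Any) -> None:
    """Append one event as a JSON line; existing lines are never modified."""
    record = {"ts": time.time(), "kind": kind, **fields}
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def read_events(log_path: Path) -> List[Dict[str, Any]]:
    with log_path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def summarize(events: List[Dict[str, Any]]) -> Counter:
    """Count events by kind, the sort of aggregate a report might start from."""
    return Counter(event["kind"] for event in events)
```

Because reports are derived from the log and never written back into it, a report can always be regenerated from the same history.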
Today. Per-child reports are available anytime from the parent portal (PDF or JSON). Next release. An auto-emailed Friday digest — produced on-device, sent through a relay only at the moment of send — is planned (what'll be in it).
If something didn't go right.
The most common first-session snag in our setup tests is the overhead camera not locking onto the worksheet. Open Preflight and watch the camera health line: if the IPEVO is producing fresh frames but no page is detected, the issue is usually contrast or angle. Put a dark sheet of construction paper under the worksheet and re-aim. The page detector tries four progressively looser polygon tolerances before giving up, but it can't see a white worksheet on a white tablecloth.
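The "four progressively looser tolerances" idea is a retry ladder: simplify the detected outline with a tight tolerance first, and loosen it only if the result is not a four-cornered shape. The tolerance values and the injected `approximate` function below are assumptions; the post does not say how the detector simplifies outlines.

```python
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[float, float]
Approximator = Callable[[Sequence[Point], float], Sequence[Point]]

# Assumption: four illustrative tolerances, as fractions of the outline's perimeter.
TOLERANCES = (0.02, 0.03, 0.05, 0.08)


def find_page_quad(
    contour: Sequence[Point],
    approximate: Approximator,
    tolerances: Sequence[float] = TOLERANCES,
) -> Optional[Sequence[Point]]:
    """Return the first four-corner approximation, trying tighter tolerances first."""
    for tolerance in tolerances:
        polygon = approximate(contour, tolerance)
        if len(polygon) == 4:
            return polygon
    return None  # no quadrilateral at any tolerance: report "no page detected"
```

If the detector were built on OpenCV, `approximate` could wrap `cv2.approxPolyDP(contour, tolerance * cv2.arcLength(contour, True), True)`. On a white tablecloth there is no usable outline to begin with, so no tolerance helps.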
Face recognition not matching is usually a lighting mismatch between enrollment and session. Re-take the photo from the parent portal under the same lamp the kid will use. If Preflight reports that the embedder is the development stub rather than the real ONNX model, fetch the face-model weights before relying on auto-recognition. Tapping the avatar tile always works as a fallback.
If you forget the parent password before your first portal visit, get in touch — there is no in-app recovery flow yet. We can walk you through it. Please don't try to edit the local database by hand.
What we deliberately didn't put in the first 15 minutes.
No cloud account. No syncing to a server. No “invite a sibling” nudge. No streak counter on the home screen. No upsell. No telemetry ping. The local-only choice is detailed here; the reason there's no streak counter is here. The trade is that you do a little more setup up front. The thing you get is that nothing about your child ever leaves the kitchen.
If you'd like to know when Koda ships, the waitlist is here. Related notes: what the camera actually sees, why we run on-device, and how Koda decides when to interrupt.