on-device ai: why your phone runs the whole model

every reply ignore writes is generated on your iphone. no upload, no cloud api call, no server. here's what that actually means, and why it matters.

your phone is fast enough now.

that's the short version. a few years ago, running a language model on a phone was a parlor trick — a tiny model that could draft a barely-coherent reply if you were patient. today, apple ships a 3-billion-parameter foundation model on every iphone with apple intelligence. it's small by lab standards. it's enough to write a text message that sounds like you.

ignore uses that model. that's the entire architecture. there is no server. there is no cloud api call. when ignore drafts a suggested reply, the prompt and the model's response never leave the device.

what "on-device" actually means

most apps that say "private ai" still ship your message to a server. they encrypt it on the way there, they encrypt it on the way back, some even attest that the server is using the model they claim. but the data leaves your phone.

on-device means the model runs on your phone's neural engine, a dedicated accelerator that's been built into every iphone chip since the a11 bionic and is now powerful enough to do real work. when ignore reads your incoming text, the data path is:

phone → phone

that's it. there is no second arrow.

why this is non-trivial

people sometimes ask why a small startup can pull this off when the big ai companies seem to need data centers. the answer is that we're not building the model. apple is. we're calling it.

apple exposes its on-device language model to apps through the foundation models framework. you ask the model to write a reply, it does. it's not as smart as gpt-5 or claude opus, and it doesn't need to be. it just needs to know enough about your tone to draft a text that sounds like you. that's a different problem, and a much easier one.
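
for the curious, here is roughly what "calling it" can look like in swift, using apple's foundation models framework. treat it as a minimal sketch rather than ignore's actual code: the function name `draftReply` and the instruction text are made up for illustration, and it assumes ios 26 or later on a device with apple intelligence turned on.

```swift
import FoundationModels

/// Hypothetical helper: drafts a reply to one incoming message on-device.
/// Returns nil when the system model cannot be used (unsupported device,
/// Apple Intelligence turned off, or model assets not ready yet).
@available(iOS 26.0, macOS 26.0, *)
func draftReply(to incoming: String) async throws -> String? {
    // Ask the system whether the on-device model is usable right now.
    guard case .available = SystemLanguageModel.default.availability else {
        return nil
    }

    // Illustrative instructions only; a real app would fold in whatever
    // it has learned about the user's tone.
    let session = LanguageModelSession(
        instructions: "Draft a short, casual text message reply in the user's voice."
    )

    // The prompt is handled by the system's on-device model.
    let response = try await session.respond(to: "Reply to this message: \(incoming)")
    return response.content
}
```

the availability check matters because the model is not guaranteed to be there: older phones, or phones with apple intelligence switched off, simply get `nil` back in this sketch.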

the hard parts, for us, are:

learning what "your tone" means, on-device, without sending your chats anywhere
deciding which messages to surface and which to let sit
shipping all of that as a polished app you'd actually use
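
to make the first item less abstract, here is one way a tone profile could be computed locally from messages you've already sent. this is a hypothetical sketch, not how ignore actually does it: `ToneProfile`, `makeToneProfile`, `styleGuidance`, and the three signals they track are all invented for illustration, and a real system would look at far more than this.

```swift
import Foundation

/// Hypothetical summary of how someone texts. The fields are illustrative.
struct ToneProfile {
    var averageWordCount: Double
    var lowercaseStartRatio: Double  // share of messages that start lowercase
    var exclamationRatio: Double     // share of messages containing "!"
}

/// Counts whitespace-separated words in a message.
func wordCount(_ text: String) -> Int {
    text.split(whereSeparator: \.isWhitespace).count
}

/// Builds a profile from messages the user has sent. Everything happens
/// in memory; nothing in this function touches the network.
func makeToneProfile(from sentMessages: [String]) -> ToneProfile? {
    let messages = sentMessages
        .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
        .filter { !$0.isEmpty }
    guard !messages.isEmpty else { return nil }

    let count = Double(messages.count)
    let totalWords = messages.map(wordCount).reduce(0, +)
    let lowercaseStarts = messages.filter { $0.first?.isLowercase == true }.count
    let exclamations = messages.filter { $0.contains("!") }.count

    return ToneProfile(
        averageWordCount: Double(totalWords) / count,
        lowercaseStartRatio: Double(lowercaseStarts) / count,
        exclamationRatio: Double(exclamations) / count
    )
}

/// Turns the profile into plain-language guidance for a drafting prompt.
/// The thresholds (0.5 and 0.1) are arbitrary choices for this sketch.
func styleGuidance(for profile: ToneProfile) -> String {
    var hints = ["Keep replies to about \(Int(profile.averageWordCount.rounded())) words."]
    if profile.lowercaseStartRatio > 0.5 {
        hints.append("Start messages in lowercase.")
    }
    if profile.exclamationRatio < 0.1 {
        hints.append("Avoid exclamation marks.")
    }
    return hints.joined(separator: " ")
}
```

the point of the design is that the profile is a handful of numbers derived on the phone. the guidance string it produces can be handed to the on-device model as part of its instructions, so neither the raw chats nor the profile ever needs to go anywhere.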

what we don't do

we don't have an account system. we don't have cloud sync. we don't have a "premium tier" that processes things faster on our servers, because we don't have servers. if you delete the app, the model of your tone is deleted with it. there is no backup.

this is annoying in a few ways. if you get a new phone, you start over (we're working on icloud private sync to fix this, and it'll be end-to-end encrypted with no server visibility). if a feature requires a smarter model than your phone can run, we can't ship it.

the trade-off is worth it. your texts to your mom are not training data for somebody's next-quarter model. they're not in a vector database. they're not in a logfile. they exist on your phone, get read by the model on your phone, and produce a reply on your phone.

who this is good for

if you're the kind of person who has 47 unread texts right now: this is for you. ignore reads what people sent, drafts a reply in your voice, shows it to you. you tap to send, or you edit it, or you ignore it (the whole brand, really).

if you're the kind of person who would die before letting a language model see your group chats: this is also for you. by design, no language model sees them except the one running on your phone, and it runs on-device just like the model already auto-correcting your typing.

#privacy #on-device-ai #apple-intelligence