Lecture to flashcards in 10 minutes

It's 11pm. You have a path exam in six days. Your professor just dropped a 45-minute lecture recording into the course portal that they "forgot to upload last week." You haven't slept properly since Sunday. You have neither time nor energy to listen to 45 minutes of audio, transcribe the key concepts, manually build flashcards from your transcription, and then start studying.

This is the actual workflow that medical, law, language, and bar-prep students face every week. The honest math is brutal: a 45-minute lecture, attentively transcribed and turned into a usable review deck, eats two to three hours of preparation time before the first review even happens. By the time the deck exists, you have an hour left, and you spend it reviewing material your brain hasn't yet been primed to encode.

Discito's lecture-capture flow is the response to that workflow. The same 45-minute lecture becomes a reviewed deck in roughly ten minutes. By default, the audio never leaves your iPhone. Here's how it works.

The four steps, end to end

Step 1 — Source

Record live, or import from Voice Memos / Files

Open Discito, tap the AI generation entry point, pick "Lecture audio." You can record the lecture live (the in-app recorder shows live mic-level RMS so you know it's actually picking up the speaker), or import an existing audio file from Voice Memos, the Files app, or the share sheet. Discito accepts long-form audio — there's no five-minute cap, no "this clip is too long" error. We've tested it with 90-minute lectures end to end.

Step 2 — Transcribe

On-device transcription via Apple's `SpeechAnalyzer`

iOS 26 ships a long-form speech-to-text model designed for exactly this case. Once you stop recording or finish importing, Discito transcribes the audio on-device through SpeechAnalyzer on the Neural Engine, and the transcript segments then appear for you to review. Transcription throughput depends on your device. On recent iPhones it runs faster than realtime, so a 45-minute lecture finishes transcribing in well under 45 minutes. Both the audio buffer and the resulting transcript stay on your device the entire time.

Step 3 — Review the transcript

Edit segments, drop irrelevant ranges

Long lectures have dead air. They have student questions you don't care about. They have administrative announcements ("the rubric is on the course page") that don't belong in a flashcard. Discito's transcript-review step shows you every segment, lets you edit speech-recognition mistakes, and lets you drop time ranges you don't want included in card generation. Most lectures yield a workable transcript with maybe three to five edits.
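To make the "drop time ranges" idea concrete, here is a minimal sketch in Python. It is not Discito's actual code: the `Segment` type, the `drop_ranges` helper, and the sample sentences are all hypothetical, and it assumes each transcript segment carries start and end times in seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One transcript segment with its position in the recording (seconds)."""
    start: float
    end: float
    text: str


def drop_ranges(segments, excluded):
    """Keep only segments that do not overlap any excluded (start, end) range."""
    def overlaps(seg, rng):
        rng_start, rng_end = rng
        # Half-open intervals: touching endpoints do not count as overlap.
        return seg.start < rng_end and rng_start < seg.end

    return [s for s in segments if not any(overlaps(s, r) for r in excluded)]


# Made-up sample data standing in for a reviewed transcript.
segments = [
    Segment(0.0, 42.0, "The rubric is on the course page."),
    Segment(42.0, 95.5, "Nephrons filter blood in the glomerulus."),
    Segment(95.5, 130.0, "Will this be on the exam?"),
    Segment(130.0, 190.0, "Filtrate then passes into the proximal tubule."),
]

# Drop the announcement and the student question before card generation.
kept = drop_ranges(segments, excluded=[(0.0, 42.0), (95.5, 130.0)])
transcript_for_generation = " ".join(s.text for s in kept)
print(transcript_for_generation)
```

Only the two content segments survive, so the text handed to card generation contains lecture material and nothing else.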

Step 4 — Generate cards

On-device Foundation Models drafts Q/A pairs

Discito hands the cleaned transcript to Apple's on-device Foundation Models. The model drafts question-and-answer pairs from the lecture content — typically 20 to 50 cards from a 45-minute lecture, depending on density. You see each draft as a card-preview row with swipe-to-delete, tap-to-edit, and one-tap regenerate. Accept what's useful, drop what isn't, edit anything that's close-but-not-quite. The final batch lands in a new deck, scheduled immediately by FSRS-6, ready to review.

The wall-clock math

On an iPhone 17 Pro with a clean 45-minute lecture, the typical wall-clock breakdown looks like this:

Step 1 (Source): ~30 seconds if importing, longer if recording live (obviously)
Step 2 (Transcribe): a few minutes for a 45-minute lecture; runs in the background while you do other things
Step 3 (Review transcript): ~2-3 minutes of skimming + a handful of edits
Step 4 (Generate + review cards): a few minutes for generation, plus however long you want to spend reviewing the drafted Q/A pairs

Total: roughly ten minutes from "I have a 45-minute lecture I haven't listened to" to "I have a reviewed FSRS-scheduled deck in front of me." Old workflow: two to three hours. The math is the point.
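As a quick check on that comparison, the arithmetic can be sketched in Python using only the two figures quoted above: two to three hours for the manual workflow and about ten minutes for the lecture-capture flow. The constant and function names are ours, chosen for illustration.

```python
# Figures from the post: the manual workflow takes two to three hours,
# the lecture-capture flow takes roughly ten minutes.
MANUAL_MINUTES = (120, 180)
DISCITO_MINUTES = 10


def compare(manual_minutes, new_minutes):
    """Return (minutes saved, speed-up factor) for one manual estimate."""
    return manual_minutes - new_minutes, manual_minutes / new_minutes


for manual in MANUAL_MINUTES:
    saved, factor = compare(manual, DISCITO_MINUTES)
    print(f"{manual} min manual -> {saved} min saved, {factor:.0f}x faster")
```

On those numbers the flow saves 110 to 170 minutes per lecture, which works out to a 12x to 18x speed-up.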

Lectures aren't disposable. With source retention turned on, they become permanent assets you can re-generate flashcards from at any time.

The privacy story is the architecture

Every step above runs on your iPhone. Specifically:

The recording stays in Discito's app sandbox (or, if you imported it, in whatever app you imported it from)
The transcription happens on Apple's on-device speech model — no cloud-transcription provider is involved
The card generation happens on Apple's on-device Foundation Models — no third-party AI provider is involved
The resulting cards land in your local Core Data store, which syncs through your iCloud — not Discito's servers, because Discito doesn't have any

We talk more about this in "On-device AI vs. cloud AI for flashcards," but the short version: the lecture pipeline is the most data-sensitive feature in any study app, and we built it specifically so we never have a copy of your audio, your transcript, or your generated cards. Nothing to leak, nothing to subpoena, nothing to accidentally include in a future ML training run.

Source retention is your call

By default, Discito throws the audio away after card generation. The transcript goes too. You keep the cards.

That's the right default for most users — minimum storage footprint, maximum privacy posture. But there's a legitimate use case for keeping the source around: weeks later, you realize you missed an important concept from that lecture and want to re-generate cards from the same audio without re-recording or re-importing. Or you want to verify that a generated card is faithful to what the professor actually said.

For that, Discito Pro users can flip a "Keep source for re-generation" toggle in the lecture flow. With the toggle on, both the audio and the transcript are stored in your iCloud (via CloudKit), not in Discito's storage, because we don't have any. The audio and transcript become part of your iCloud usage, your storage quota, your backup. A 45-minute lecture is around 30-50 MB of audio plus a few dozen KB of transcript, so even the free 5 GB iCloud tier has room for many retained lectures.

If you ever hit your iCloud storage limit, Discito surfaces it cleanly in the source-retention UI, the same pattern as Apple's Mail / Photos quota indicators. You decide what to keep, what to drop. Nothing happens behind your back.
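For a sense of scale, the storage figures above can be turned into a rough estimate. This Python sketch assumes decimal units (1 MB = 1,000,000 bytes, 1 GB = 1,000 MB) and ignores the few dozen KB of transcript. The function names are illustrative, not part of Discito.

```python
LECTURE_SECONDS = 45 * 60          # 45-minute lecture
AUDIO_MB_RANGE = (30, 50)          # size range quoted above


def bitrate_kbps(size_mb, seconds):
    """Average bitrate implied by a file size (1 MB = 1,000,000 bytes)."""
    return size_mb * 8_000 / seconds


def lectures_per_gb(size_mb):
    """Whole lectures that fit in 1 GB (1,000 MB), ignoring the small transcript."""
    return 1_000 // size_mb


for size_mb in AUDIO_MB_RANGE:
    print(
        f"{size_mb} MB -> ~{bitrate_kbps(size_mb, LECTURE_SECONDS):.0f} kbps, "
        f"{lectures_per_gb(size_mb)} lectures per GB"
    )
```

That puts the implied audio bitrate at roughly 89 to 148 kbps, and each gigabyte of iCloud storage holds about 20 to 33 retained lectures.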

What about the cards themselves?

Generated cards land like any other card in Discito. They show a small "Source: Lecture" badge in card detail. Cards from lectures with retained audio get an audio-jump button that takes you back to the specific second of the recording where the concept was discussed — useful when you're reviewing a card a month later and can't remember what the professor's framing was.
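One way to picture the audio-jump button: each lecture card could carry an optional offset into the retained recording. The Python sketch below is an assumption about the shape of that data, not Discito's actual model. The `Card` fields and the `audio_jump_label` helper are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Card:
    question: str
    answer: str
    source: str = "Lecture"
    # Offset into the retained recording, in seconds; None when the
    # audio was discarded after generation (the default).
    source_offset_s: Optional[float] = None


def audio_jump_label(card):
    """Return an 'm:ss' label for the jump button, or None if there is no audio."""
    if card.source_offset_s is None:
        return None
    minutes, seconds = divmod(int(card.source_offset_s), 60)
    return f"{minutes}:{seconds:02d}"


with_audio = Card("Where does filtration occur?", "In the glomerulus.", source_offset_s=1325.4)
without_audio = Card("Where does filtration occur?", "In the glomerulus.")

print(audio_jump_label(with_audio))     # 22:05
print(audio_jump_label(without_audio))  # None
```

Making the offset optional keeps one card type for both cases: cards from discarded audio simply have no jump target.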

FSRS-6 schedules the new cards like any others. They enter your daily review queue as new cards, with the same scheduling treatment as anything you hand-authored or imported from .apkg. The lecture pipeline is an authoring shortcut, not a separate review track.

Caveats and the honest version

On-device transcription on iOS 26 is excellent for clear single-speaker lectures recorded in reasonably quiet environments. It struggles in the same places every speech model struggles: heavy accents, low-quality microphones, overlapping voices, technical vocabulary the model hasn't been exposed to. Pharmacology lecture? Generally clean. Discussion-format law seminar with five people talking over each other? Expect to clean up the transcript in step 3 before generating cards.

On-device card generation is similarly good but not infallible. The Foundation Models pass produces Q/A pairs that are usually faithful and well-formed, but occasionally generates a card that's too vague ("What did the professor say about kidneys?") or too narrow ("In what year was this study published?"). Step 4's swipe-to-delete and tap-to-edit are there because you're meant to be a real editor on the output, not a passive accepter.

And the hardware requirement is real. Apple Intelligence requires an iPhone 15 Pro or later, or an iPad with M1 or later, and Discito's lecture pipeline additionally needs iOS 26 for SpeechAnalyzer. On unsupported devices the lecture entry point shows a clear "requires Apple Intelligence" message. Every non-AI feature in Discito works on every iOS 18.6+ device without exception, but the AI surfaces are gated on the hardware Apple ships the model for.

Why we built this specifically

Lecture-to-flashcards has been one of the most-requested features from the spaced-repetition community on iOS for years. It exists in some form in a handful of cloud-based study apps, but always with a subscription, always with the audio going to a third-party transcription service, often with a per-minute or per-month quota. The combination of on-device transcription and on-device card generation is what lets Discito ship the feature with no recurring cost and no third party in the loop.

It's also the feature that most clearly shows what changes when you commit to a fully on-device AI architecture. There's no equivalent to this in the cloud-AI world — at least not without the subscription, the data-sharing, and the hard cap on audio length that every third-party transcription provider imposes. Apple's models, running locally, are the architectural difference.

Open Discito. Record the lecture, or drop in the audio file your professor uploaded. Ten minutes later, you have a deck. Start reviewing.
