Cantonese (廣東話) · Mandarin (普通話) · Hokkien (福建話) · Shanghainese (上海話)
Cantonese speech-to-text that actually works
Most transcription tools treat Cantonese as an afterthought. Echosy runs Qwen3-ASR, a model with best-in-class recognition for Cantonese and other Chinese dialects, entirely on your Mac and never uploads your audio anywhere.
Free download · macOS 14+ · Apple Silicon · Pro is a one-time $49.80
Why Cantonese is hard — and how Echosy handles it
Generic speech models are trained mostly on Mandarin and English, so Cantonese speech comes out garbled or silently converted into Mandarin phrasing. Echosy ships Qwen3-ASR, a speech model with dedicated support for Cantonese (廣東話), Hokkien (福建話), and Shanghainese (上海話) alongside Mandarin and 50+ other languages — running entirely on your Mac's GPU.
Code-switching is handled naturally too: typical Hong Kong speech that mixes Cantonese and English mid-sentence is transcribed as spoken, not forced into one language.
What you can transcribe
- Live conversations and meetings — system audio + microphone, transcribed in real time with timestamps.
- Audio & video files — drag in MP3, WAV, M4A, MP4, MOV and 20+ other formats (Pro).
- Dictation — speak Cantonese, get text at your cursor in any Mac app, unlimited and free.
- Lectures, interviews, podcasts: transcribe them, then generate AI summaries or translate the transcript with the built-in text enhancement.
See the model guide for choosing between Qwen3-ASR 0.6B (free) and 1.7B (Pro, higher accuracy).
Private by design
Everything happens on-device: recording, recognition, and storage. No cloud account, no upload, no per-minute fees. That matters for client meetings, family recordings, and any conversation you wouldn't hand to a server overseas.
Frequently asked questions
Does Echosy transcribe Cantonese accurately?
Yes. Echosy uses Qwen3-ASR, a speech model with dedicated Cantonese support — widely regarded as best-in-class for Chinese dialects among on-device models. It also handles Mandarin, Hokkien, Shanghainese, and Cantonese-English code-switching.
Can it output Traditional Chinese?
Yes. Transcripts follow the spoken language, and Echosy's text enhancement can convert or translate transcripts — for example to Traditional Chinese — using your configured AI provider.
Does my audio get uploaded to a server?
No. Speech recognition runs entirely on your Mac using local models. Your audio never leaves your device.
Can I transcribe an existing recording or video in Cantonese?
Yes, with Echosy Pro: drag and drop audio or video files (MP3, WAV, M4A, MP4, MOV and 20+ formats) and they are transcribed locally with the same models.
What does it cost?
The free tier includes Qwen3-ASR 0.6B, 15-minute recordings, and unlimited dictation. Pro is a one-time $49.80 — no subscription — and unlocks the larger 1.7B model, 4-hour recordings, and file transcription.
Try Echosy on your Mac
Free to download. Your audio never leaves your Mac — no account, no upload, no subscription.