Cantonese (廣東話) · Mandarin (普通話) · Hokkien (福建話) · Shanghainese (上海話)

Cantonese speech-to-text that actually works

Most transcription tools treat Cantonese as an afterthought. Echosy runs Qwen3-ASR, a model with best-in-class recognition for Cantonese and other Chinese dialects, locally on your Mac and never uploads your audio anywhere.

Download Echosy Free

Free download · macOS 14+ · Apple Silicon · Pro is a one-time $49.80

Why Cantonese is hard — and how Echosy handles it

Generic speech models are trained mostly on Mandarin and English, so Cantonese speech comes out garbled or is silently converted into Mandarin phrasing. Echosy ships Qwen3-ASR, a speech model with dedicated support for Cantonese (廣東話), Hokkien (福建話), and Shanghainese (上海話) alongside Mandarin and 50+ other languages, and runs it entirely on your Mac's GPU.

Code-switching is handled naturally too: typical Hong Kong speech that mixes Cantonese and English mid-sentence is transcribed as spoken, not forced into one language.

What you can transcribe

  • Live conversations and meetings — system audio + microphone, transcribed in real time with timestamps.
  • Audio & video files — drag in MP3, WAV, M4A, MP4, MOV and 20+ other formats (Pro).
  • Dictation — speak Cantonese, get text at your cursor in any Mac app, unlimited and free.
  • Lectures, interviews, podcasts — then generate AI summaries or translate the transcript with the built-in text enhancement.

See the model guide for choosing between Qwen3-ASR 0.6B (free) and 1.7B (Pro, higher accuracy).

Private by design

Everything happens on-device: recording, recognition, and storage. No cloud account, no upload, no per-minute fees. That matters for client meetings, family recordings, and any conversation you wouldn't hand to a server overseas.

Frequently asked questions

Does Echosy transcribe Cantonese accurately?

Yes. Echosy uses Qwen3-ASR, a speech model with dedicated Cantonese support — widely regarded as best-in-class for Chinese dialects among on-device models. It also handles Mandarin, Hokkien, Shanghainese, and Cantonese-English code-switching.

Can it output Traditional Chinese?

Yes. Transcripts follow the spoken language, and Echosy's text enhancement can convert or translate transcripts — for example to Traditional Chinese — using your configured AI provider.

Does my audio get uploaded to a server?

No. Speech recognition runs entirely on your Mac using local models. Your audio never leaves your device.

Can I transcribe an existing recording or video in Cantonese?

Yes, with Echosy Pro: drag and drop audio or video files (MP3, WAV, M4A, MP4, MOV and 20+ other formats) and they are transcribed locally with the same models.

What does it cost?

The free tier includes Qwen3-ASR 0.6B, recordings of up to 15 minutes, and unlimited dictation. Pro is a one-time $49.80 purchase with no subscription, and it unlocks the larger 1.7B model, recordings of up to 4 hours, and file transcription.

Try Echosy on your Mac

Free to download. Your audio never leaves your Mac — no account, no upload, no subscription.
