AI / LLM Settings

Configure the LLM used for summaries, decoration, and Ask AI chat. Echosy supports both an on-device model (MLX, Apple Silicon only) and any OpenAI-compatible cloud provider. You can pick a different model per task: for example, on-device for private summaries and cloud for long-context chat.

On-Device LLM (MLX)

Built into Echosy on Apple Silicon. The model downloads once and runs locally: no API key, no network, no rate limits. Available choices:

| Model | Size | Notes |
| --- | --- | --- |
| Gemma 4 E4B (4-bit) | ~3 GB | Default; fast, solid quality |
| Qwen 3.5 4B (4-bit) | ~3 GB | Strong multilingual, especially Chinese / Japanese |
| Gemma 4 E12B (4-bit) | ~8 GB | Higher quality, slower; needs 16 GB RAM |

Pick a model in Settings → AI / LLM and click Download. Models stream in with a progress bar, and you can cancel and resume the download. Once downloaded, the model stays on disk and is never re-downloaded across launches. You can switch models at any time.

Cloud Provider

Use any OpenAI-compatible endpoint. Each provider auto-fills the API endpoint:

| Provider | Notes |
| --- | --- |
| OpenAI | GPT-4o, GPT-4o-mini, etc. |
| Gemini | Google's Gemini models |
| Claude | Anthropic's Claude models |
| Groq | Fast inference for open-source models |
| DeepSeek | DeepSeek models |
| OpenRouter | Multi-provider gateway; access to many models |
| Ollama | Free, local LLM runner; no API key needed |
| Custom | Any OpenAI-compatible endpoint |
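Every provider above speaks the same OpenAI-compatible chat-completions protocol; only the base URL and API key differ. A minimal sketch of what such a request looks like (the base URLs listed are the providers' commonly documented OpenAI-compatible endpoints and may change, so treat them as assumptions, and the request is only built here, not sent):

```python
import json
import urllib.request

# Commonly documented OpenAI-compatible base URLs (assumptions; verify
# against each provider's own docs before relying on them).
BASE_URLS = {
    "openai": "https://api.openai.com/v1",
    "groq": "https://api.groq.com/openai/v1",
    "deepseek": "https://api.deepseek.com",
    "openrouter": "https://openrouter.ai/api/v1",
    "ollama": "http://localhost:11434/v1",  # local server, no API key
}

def build_chat_request(base_url: str, api_key: str,
                       model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-compatible /chat/completions request."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_chat_request(BASE_URLS["ollama"], "unused", "llama3", "Hello")
print(req.full_url)  # http://localhost:11434/v1/chat/completions
```

Because the request shape is identical everywhere, switching providers in Echosy only changes the endpoint, key, and model name.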

API Key

Your API key for the selected cloud provider. Not needed for on-device or Ollama. The key is stored in your local settings file and is never sent anywhere except to the provider's API endpoint.

API Endpoint

Auto-filled based on your selected provider. You can customize this for self-hosted setups, reverse proxies, or custom OpenAI-compatible servers.

Model

The model name to use (e.g., gpt-4o, gemini-pro, llama3). For on-device, pick from the built-in list. For Ollama, a dropdown shows all locally installed models.
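Ollama exposes its locally installed models over a small HTTP API, so a dropdown like Echosy's can be populated with one request to GET /api/tags on the local server. A sketch under that assumption (not Echosy's actual code):

```python
import json
import urllib.request

def parse_model_names(tags_payload: dict) -> list:
    """Extract model names from an Ollama /api/tags response body."""
    return [m["name"] for m in tags_payload.get("models", [])]

def list_ollama_models(base: str = "http://localhost:11434") -> list:
    """Fetch the locally installed models from a running Ollama server."""
    with urllib.request.urlopen(base.rstrip("/") + "/api/tags", timeout=5) as resp:
        return parse_model_names(json.load(resp))
```

If the server is running, list_ollama_models() returns names like "llama3:latest" that can be used directly as the Model value.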

Test Connection

Sends a test request to verify your API key and endpoint work correctly. A success message confirms the connection, or an error message helps you diagnose the issue.

Custom Summary Prompt (Pro)

Customize how the AI generates summaries. The default prompt produces structured meeting minutes. Examples:

  • "Create meeting minutes with action items and decisions"
  • "Write a brief executive summary in bullet points"
  • "Extract key takeaways and follow-up tasks"
  • "Summarize in the same language as the transcript"

Privacy tip: the on-device model runs entirely on your Mac; nothing is sent to any server. Pick it when the content is sensitive (legal, medical, confidential meetings, personal notes).
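Conceptually, the custom prompt takes the place of the system message that accompanies the transcript. A sketch of how such a request might be assembled (the message layout and the default prompt text are assumptions, not Echosy's exact wiring):

```python
import json

# Placeholder for the built-in default prompt (assumption, for illustration).
DEFAULT_PROMPT = "Create structured meeting minutes."

def build_summary_payload(transcript: str, custom_prompt: str = "",
                          model: str = "gpt-4o") -> dict:
    """Assemble an OpenAI-style chat payload: the summary prompt becomes
    the system message, the transcript becomes the user message."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": custom_prompt or DEFAULT_PROMPT},
            {"role": "user", "content": transcript},
        ],
    }

payload = build_summary_payload(
    "Alice: let's ship Friday. Bob: I'll update the docs.",
    custom_prompt="Extract key takeaways and follow-up tasks",
)
print(json.dumps(payload, indent=2))
```

Leaving the custom prompt empty falls back to the default, which is why clearing the field restores the standard meeting-minutes style.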
