🎙️ Speakr Documentation

note

Speakr is a privacy-first, hotkey-driven dictation utility that turns your speech into typed text entirely on-device. No cloud, no network round-trips, no compromises.

✨ What is Speakr?

Speakr transforms the way you capture thoughts into text. With a single keystroke, record speech, transcribe it locally using Whisper models, and have the text instantly typed into any application. Perfect for developers, writers, and anyone who thinks faster than they type.
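The flow described above (hotkey, record, transcribe locally, type the text) can be pictured as one function that chains three stages. The following is only a minimal sketch: the trait names `Recorder`, `Transcriber`, and `Injector`, the `AudioClip` type, and `dictate_once` are all hypothetical and are not taken from the Speakr codebase.

```rust
// Hypothetical sketch of one dictation cycle. All names are illustrative
// assumptions, not Speakr's actual API.

/// Audio captured from the microphone.
pub struct AudioClip {
    pub samples: Vec<f32>,
    pub sample_rate_hz: u32,
}

/// Stage 1: capture audio while the hotkey is active.
pub trait Recorder {
    fn record(&mut self) -> Result<AudioClip, String>;
}

/// Stage 2: turn audio into text on-device (for example with a Whisper model).
pub trait Transcriber {
    fn transcribe(&self, clip: &AudioClip) -> Result<String, String>;
}

/// Stage 3: type the text into the focused application.
pub trait Injector {
    fn inject(&mut self, text: &str) -> Result<(), String>;
}

/// Runs record -> transcribe -> inject once and returns the trimmed transcript.
/// Nothing is injected when the transcript is empty after trimming.
pub fn dictate_once(
    recorder: &mut dyn Recorder,
    transcriber: &dyn Transcriber,
    injector: &mut dyn Injector,
) -> Result<String, String> {
    let clip = recorder.record()?;
    let text = transcriber.transcribe(&clip)?.trim().to_string();
    if !text.is_empty() {
        injector.inject(&text)?;
    }
    Ok(text)
}
```

In a real application a global hotkey handler would call something like `dictate_once` with concrete implementations. Keeping the stages behind traits also makes it easy to substitute in-memory fakes in tests.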

🔐 Privacy First

100% offline processing – your voice never leaves your device
No cloud dependencies – works in air-gapped environments
Minimal permissions – only microphone and accessibility access

⚡ Built for Speed

End-to-end latency of ≤ 3 seconds for 5-second recordings
Global hotkeys work across all applications
Lightweight universal macOS binary < 20 MB
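One way to keep the latency target above honest is to time a dictation cycle and compare it against the budget. This is a rough sketch using only the standard library. The names `LATENCY_BUDGET`, `timed`, and `within_budget` are assumptions made for illustration, and the 3-second constant simply restates the target for a 5-second recording.

```rust
use std::time::{Duration, Instant};

/// Target from the list above: at most 3 s end to end for a 5 s recording.
/// (Hypothetical constant name.)
pub const LATENCY_BUDGET: Duration = Duration::from_secs(3);

/// Runs `work` and returns its result together with the elapsed wall-clock time.
pub fn timed<T>(work: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let output = work();
    (output, start.elapsed())
}

/// True when the measured time meets the budget (inclusive).
pub fn within_budget(elapsed: Duration) -> bool {
    elapsed <= LATENCY_BUDGET
}
```

A benchmark or debug build could wrap the transcription and injection steps in `timed` and log a warning whenever `within_budget` returns false.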

🧭 Navigate the Documentation

tip

Use the search box (⌘/Ctrl + K) to quickly jump to any topic, or browse by your role below.

📋 Product & Planning

Document	Description	Audience
Product Requirements	Vision, goals, and feature specifications	Product owners, stakeholders
Implementation Plan	Development roadmap and milestones	Project managers, engineers

🏗️ Architecture & Engineering

Document	Description	Audience
Technical Architecture	System design and component overview	Engineers, architects
System Description	Detailed system behaviour and flows	Developers, maintainers
Development Overview	Getting started with development	New contributors

📝 Functional Specifications

Document	Description	Status
FR-1: Global Hotkey	Hotkey registration and handling	✅ Implemented
FR-2: Audio Capture	Microphone access and recording	✅ Implemented
FR-3: Transcription	Local Whisper integration	🔄 In Progress
FR-4: Text Injection	Cross-app text insertion	🔄 In Progress
FR-5: Injection Fallback	Clipboard fallback mechanism	📋 Planned
FR-6: Settings UI	Configuration interface	✅ Implemented
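FR-4 and FR-5 in the table above are related: if direct text injection fails, the clipboard acts as a fallback. A minimal sketch of that control flow might look like the following. The `Delivery` enum and `deliver` function are hypothetical, and the two closures stand in for whatever platform-specific injection and clipboard code the real implementation uses.

```rust
/// How the transcript reached the user. (Hypothetical type.)
#[derive(Debug, PartialEq, Eq)]
pub enum Delivery {
    /// Typed directly into the focused application.
    Injected,
    /// Direct injection failed, so the text was copied to the clipboard.
    Clipboard,
}

/// Tries `inject` first and falls back to `copy` only if injection fails.
/// Returns an error only when both paths fail.
pub fn deliver<I, C>(text: &str, inject: I, copy: C) -> Result<Delivery, String>
where
    I: FnOnce(&str) -> Result<(), String>,
    C: FnOnce(&str) -> Result<(), String>,
{
    match inject(text) {
        Ok(()) => Ok(Delivery::Injected),
        Err(inject_err) => copy(text)
            .map(|()| Delivery::Clipboard)
            .map_err(|copy_err| {
                format!(
                    "injection failed ({}); clipboard fallback failed ({})",
                    inject_err, copy_err
                )
            }),
    }
}
```

Returning which path succeeded lets the caller tell the user to paste manually when the result is `Delivery::Clipboard`.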

warning

See the Specs Overview for the complete set of requirements, including non-functional requirements (NFRs) for security, performance, and accessibility.

🔧 Development & Debugging

Document	Description	Audience
Debug Panel	Development and troubleshooting tools	Developers, QA
Pre-commit Hooks	Code quality and testing setup	Contributors
Tauri Plugins	Plugin architecture and integrations	Backend developers

🚀 Quick Start

note

New to the project? Start with the Development Overview for setup instructions.

For Product People

Read the Product Requirements to understand the vision
Check the Implementation Plan for current progress
Review Functional Specs for detailed features

For Engineers

Study the Technical Architecture for system design
Follow the Development Overview to set up your environment and start coding
Reference System Description for implementation details

For Contributors

Set up pre-commit hooks for code quality
Browse functional requirements to find tasks
Use the Debug Panel for development workflow

📊 Project Status

tip

Current Focus: Core transcription engine and text injection reliability

Component	Status	Notes
Global Hotkeys	✅ Complete	Cross-app hotkey registration working
Audio Capture	✅ Complete	High-quality microphone input
Settings UI	✅ Complete	Leptos-based configuration interface
Transcription	🔄 Active	Whisper integration in progress
Text Injection	🔄 Active	Cross-app compatibility improvements
Model Management	📋 Planned	GGUF model download and validation
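Model Management is still planned, so the following is only an illustration of what a first validation step for a downloaded model could look like. GGUF files start with the four ASCII bytes `GGUF` followed by a 32-bit format version. The sketch assumes the common little-endian layout, and the function name `gguf_version` is hypothetical. A real validator would likely also verify a checksum and the file size.

```rust
use std::io::{self, Read};

/// The four magic bytes at the start of every GGUF file.
pub const GGUF_MAGIC: [u8; 4] = *b"GGUF";

/// Reads the first 8 bytes and returns the GGUF format version if the
/// stream starts with a GGUF header. Returns `Ok(None)` when the magic
/// does not match or the input is shorter than 8 bytes.
/// Assumption: the version field is little-endian.
pub fn gguf_version(mut reader: impl Read) -> io::Result<Option<u32>> {
    let mut header = [0u8; 8];
    match reader.read_exact(&mut header) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => return Ok(None),
        Err(e) => return Err(e),
    }
    if header[..4] != GGUF_MAGIC {
        return Ok(None);
    }
    let version = u32::from_le_bytes([header[4], header[5], header[6], header[7]]);
    Ok(Some(version))
}
```

Because the function accepts any `Read`, it can be called with an opened `std::fs::File` for a downloaded model or with an in-memory byte slice in tests.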

🤝 Contributing

note

This documentation is a living document. Found something unclear or outdated?

📂 Browse specs in the specs directory for implementation tasks
🐛 Report issues via GitHub Issues
📝 Improve docs by opening a pull request
💡 Suggest features in GitHub Discussions

Built with 🦀 Rust, ⚡ Tauri 2, and 🎨 Leptos

Privacy-first dictation for the modern developer
