🎙️ Speakr Documentation
note
Speakr is a privacy-first, hotkey-driven dictation utility that turns your speech into typed text entirely on-device. No cloud, no network latency, no compromises.
✨ What is Speakr?
Speakr transforms the way you capture thoughts into text. With a single keystroke, record speech, transcribe it locally using Whisper models, and have the text instantly typed into any application. Perfect for developers, writers, and anyone who thinks faster than they type.
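The flow described above (hotkey, record, transcribe locally, type the result) can be sketched as three small stages wired together. This is a minimal illustration only: the trait names, `on_hotkey`, and the stand-in types are assumptions made for this example, not Speakr's actual API, and the stand-ins replace the microphone, the Whisper model, and the keystroke injection with in-memory fakes.

```rust
// Hypothetical sketch: these traits and names are illustrative assumptions,
// not Speakr's actual API.

/// Captures audio while a dictation session is active.
pub trait Recorder {
    fn record(&mut self) -> Vec<f32>;
}

/// Turns audio samples into text (a local Whisper model in the real app).
pub trait Transcriber {
    fn transcribe(&self, samples: &[f32]) -> Result<String, String>;
}

/// Types text into the focused application.
pub trait Injector {
    fn inject(&mut self, text: &str) -> Result<(), String>;
}

/// Runs one dictation cycle: record, transcribe, then inject the trimmed text.
pub fn on_hotkey<R, T, I>(
    recorder: &mut R,
    transcriber: &T,
    injector: &mut I,
) -> Result<String, String>
where
    R: Recorder,
    T: Transcriber,
    I: Injector,
{
    let samples = recorder.record();
    if samples.is_empty() {
        return Err("no audio captured".to_string());
    }
    let text = transcriber.transcribe(&samples)?.trim().to_string();
    // Skip injection when the transcript is empty (for example, silence).
    if !text.is_empty() {
        injector.inject(&text)?;
    }
    Ok(text)
}

/// Stand-in recorder that returns a fixed buffer instead of reading a microphone.
pub struct FixedRecorder(pub Vec<f32>);

impl Recorder for FixedRecorder {
    fn record(&mut self) -> Vec<f32> {
        self.0.clone()
    }
}

/// Stand-in transcriber that reports the sample count instead of running a model.
pub struct CountingTranscriber;

impl Transcriber for CountingTranscriber {
    fn transcribe(&self, samples: &[f32]) -> Result<String, String> {
        Ok(format!(" heard {} samples ", samples.len()))
    }
}

/// Stand-in injector that collects text in memory instead of typing it.
#[derive(Default)]
pub struct BufferInjector {
    pub typed: String,
}

impl Injector for BufferInjector {
    fn inject(&mut self, text: &str) -> Result<(), String> {
        self.typed.push_str(text);
        Ok(())
    }
}
```

Keeping each stage behind a trait is one possible design; it lets the pipeline be exercised in tests without audio hardware or a model file.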
Privacy First
- 100% offline processing: your voice never leaves your device
- No cloud dependencies: works in air-gapped environments
- Minimal permissions: only microphone and accessibility access
⚡ Built for Speed
- ≤ 3-second end-to-end latency for 5-second recordings
- Global hotkeys work across all applications
- Lightweight universal macOS binary < 20 MB
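The latency target above implies a processing-to-audio ratio of at most 3 / 5 = 0.6 for the reference recording. The sketch below shows one way such a budget could be checked in a benchmark or test. The constant and function names are hypothetical, and it assumes latency is measured as processing time for a reference-length recording, which the source does not spell out.

```rust
use std::time::Duration;

// Assumed targets taken from the bullet list above; names are hypothetical.
pub const TARGET_LATENCY: Duration = Duration::from_secs(3);
pub const REFERENCE_RECORDING: Duration = Duration::from_secs(5);

/// Ratio of processing time to audio length (lower is faster).
/// Returns `None` for a zero-length recording to avoid dividing by zero.
pub fn real_time_factor(processing: Duration, recording: Duration) -> Option<f64> {
    if recording.is_zero() {
        return None;
    }
    Some(processing.as_secs_f64() / recording.as_secs_f64())
}

/// True when a reference-length recording was processed within the target.
pub fn meets_target(processing: Duration) -> bool {
    processing <= TARGET_LATENCY
}
```

For example, a run that takes 2.5 s for the 5-second reference clip passes `meets_target`, while one that takes 3.001 s does not.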
🧭 Navigate the Documentation
tip
Use the search box (⌘/Ctrl + K) to quickly jump to any topic, or browse by your role below.
Product & Planning
Document | Description | Audience |
---|---|---|
Product Requirements | Vision, goals, and feature specifications | Product owners, stakeholders |
Implementation Plan | Development roadmap and milestones | Project managers, engineers |
🏗️ Architecture & Engineering
Document | Description | Audience |
---|---|---|
Technical Architecture | System design and component overview | Engineers, architects |
System Description | Detailed system behaviour and flows | Developers, maintainers |
Development Overview | Getting started with development | New contributors |
Functional Specifications
Document | Description | Status |
---|---|---|
FR-1: Global Hotkey | Hotkey registration and handling | Implemented |
FR-2: Audio Capture | Microphone access and recording | Implemented |
FR-3: Transcription | Local Whisper integration | In Progress |
FR-4: Text Injection | Cross-app text insertion | In Progress |
FR-5: Injection Fallback | Clipboard fallback mechanism | Planned |
FR-6: Settings UI | Configuration interface | Implemented |
warning
See Specs Overview for the complete functional requirements including non-functional requirements (NFRs) for security, performance, and accessibility.
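To make the relationship between FR-4 (text injection) and FR-5 (clipboard fallback) concrete, here is a minimal sketch of a try-primary-then-fallback delivery step. Everything in it is an assumption for illustration: `TextSink`, `Delivery`, `deliver`, and the stand-in sinks are hypothetical names, and no real clipboard or accessibility API is called. The actual specifications may define the fallback conditions differently.

```rust
// Hypothetical sketch of an injection step with a fallback path.
// None of these names come from the Speakr codebase.

/// Anything that can receive transcribed text (keystrokes, clipboard, ...).
pub trait TextSink {
    fn send(&mut self, text: &str) -> Result<(), String>;
}

/// Which path ended up delivering the text.
#[derive(Debug, PartialEq)]
pub enum Delivery {
    Primary,
    Fallback,
}

/// Tries the primary sink first and uses the fallback only if it fails.
/// If both fail, the error message reports both causes.
pub fn deliver(
    primary: &mut dyn TextSink,
    fallback: &mut dyn TextSink,
    text: &str,
) -> Result<Delivery, String> {
    match primary.send(text) {
        Ok(()) => Ok(Delivery::Primary),
        Err(primary_err) => fallback
            .send(text)
            .map(|()| Delivery::Fallback)
            .map_err(|fallback_err| {
                format!("primary failed: {primary_err}; fallback failed: {fallback_err}")
            }),
    }
}

/// Stand-in sink that always fails, simulating an app that rejects typed input.
pub struct FailingSink;

impl TextSink for FailingSink {
    fn send(&mut self, _text: &str) -> Result<(), String> {
        Err("target rejected input".to_string())
    }
}

/// Stand-in sink that stores text in memory, simulating a clipboard.
#[derive(Default)]
pub struct MemorySink {
    pub received: Vec<String>,
}

impl TextSink for MemorySink {
    fn send(&mut self, text: &str) -> Result<(), String> {
        self.received.push(text.to_string());
        Ok(())
    }
}
```

Returning which path was used lets a caller surface a hint to the user (for example, that the text is on the clipboard) without the delivery function knowing about the UI.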
🔧 Development & Debugging
Document | Description | Audience |
---|---|---|
Debug Panel | Development and troubleshooting tools | Developers, QA |
Pre-commit Hooks | Code quality and testing setup | Contributors |
Tauri Plugins | Plugin architecture and integrations | Backend developers |
Quick Start
note
New to the project? Start with the Development Overview for setup instructions.
For Product People
- Read the Product Requirements to understand the vision
- Check the Implementation Plan for current progress
- Review Functional Specs for detailed features
For Engineers
- Study the Technical Architecture for system design
- Follow Development Setup to get coding
- Reference System Description for implementation details
For Contributors
- Set up pre-commit hooks for code quality
- Browse functional requirements to find tasks
- Use the Debug Panel for development workflow
Project Status
tip
Current Focus: Core transcription engine and text injection reliability
Component | Status | Notes |
---|---|---|
Global Hotkeys | Complete | Cross-app hotkey registration working |
Audio Capture | Complete | High-quality microphone input |
Settings UI | Complete | Leptos-based configuration interface |
Transcription | Active | Whisper integration in progress |
Text Injection | Active | Cross-app compatibility improvements |
Model Management | Planned | GGUF model download and validation |
🤝 Contributing
note
This documentation is a living document. Found something unclear or outdated?
- Browse specs in the specs directory for implementation tasks
- Report issues via GitHub Issues
- Improve docs by opening a pull request
- Suggest features in GitHub Discussions
Built with 🦀 Rust, ⚡ Tauri 2, and 🎨 Leptos
Privacy-first dictation for the modern developer