🎙️ Speakr Documentation

Note: Speakr is a privacy-first, hotkey-driven dictation utility that turns your speech into typed text entirely on-device. No cloud, no network latency, no compromises.


✨ What is Speakr?

Speakr changes how you turn thoughts into text. With a single keystroke, you record speech, transcribe it locally using Whisper models, and have the text typed into any application. It is aimed at developers, writers, and anyone who thinks faster than they type.
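The flow described above (hotkey, record, transcribe, type) can be pictured as a small state machine. The sketch below is illustrative only: the `Stage` and `Event` names and the transition rules are assumptions made for this example, not Speakr's actual types.

```rust
/// Hypothetical stages of one dictation cycle (illustrative, not Speakr's API).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Idle,
    Recording,
    Transcribing,
    Injecting,
}

/// Hypothetical events that move the cycle forward.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    HotkeyPressed,
    RecordingFinished,
    TranscriptReady,
    TextInjected,
    Failed,
}

/// Returns the stage that follows `stage` when `event` arrives.
/// Any failure resets to `Idle`; events that do not apply to the
/// current stage are ignored and leave the stage unchanged.
pub fn next_stage(stage: Stage, event: Event) -> Stage {
    match (stage, event) {
        (_, Event::Failed) => Stage::Idle,
        (Stage::Idle, Event::HotkeyPressed) => Stage::Recording,
        (Stage::Recording, Event::RecordingFinished) => Stage::Transcribing,
        (Stage::Transcribing, Event::TranscriptReady) => Stage::Injecting,
        (Stage::Injecting, Event::TextInjected) => Stage::Idle,
        (unchanged, _) => unchanged,
    }
}
```

Keeping the transitions in one pure function makes the cycle easy to unit-test without a microphone, a model, or a focused application.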

🔐 Privacy First

  • 100% offline processing – your voice never leaves your device
  • No cloud dependencies – works in air-gapped environments
  • Minimal permissions – only microphone and accessibility access

⚡ Built for Speed

  • ≤ 3 seconds end-to-end latency for 5-second recordings
  • Global hotkeys work across all applications
  • Lightweight universal macOS binary (< 20 MB)

🧭 Navigate the Documentation

Tip: Use the search box (⌘/Ctrl + K) to quickly jump to any topic, or browse by your role below.

📋 Product & Planning

| Document | Description | Audience |
| --- | --- | --- |
| Product Requirements | Vision, goals, and feature specifications | Product owners, stakeholders |
| Implementation Plan | Development roadmap and milestones | Project managers, engineers |

🏗️ Architecture & Engineering

| Document | Description | Audience |
| --- | --- | --- |
| Technical Architecture | System design and component overview | Engineers, architects |
| System Description | Detailed system behaviour and flows | Developers, maintainers |
| Development Overview | Getting started with development | New contributors |

📝 Functional Specifications

| Document | Description | Status |
| --- | --- | --- |
| FR-1: Global Hotkey | Hotkey registration and handling | ✅ Implemented |
| FR-2: Audio Capture | Microphone access and recording | ✅ Implemented |
| FR-3: Transcription | Local Whisper integration | 🔄 In Progress |
| FR-4: Text Injection | Cross-app text insertion | 🔄 In Progress |
| FR-5: Injection Fallback | Clipboard fallback mechanism | 📋 Planned |
| FR-6: Settings UI | Configuration interface | ✅ Implemented |
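FR-4 and FR-5 together imply a try-then-fall-back flow: attempt direct injection, and use the clipboard only if that fails. A minimal sketch of that idea follows, assuming a hypothetical `TextSink` trait and an in-memory `MemorySink` stand-in. Neither is part of Speakr's codebase, and the real specs may define the fallback conditions differently.

```rust
/// Hypothetical abstraction over one way of delivering text to the focused app.
pub trait TextSink {
    fn deliver(&mut self, text: &str) -> Result<(), String>;
}

/// Which path ended up delivering the text.
#[derive(Debug, PartialEq, Eq)]
pub enum Delivery {
    Injected,
    ClipboardFallback,
}

/// Tries `primary` first and falls back to `clipboard` only on failure.
/// If both fail, the error message carries both causes.
pub fn inject_with_fallback(
    primary: &mut dyn TextSink,
    clipboard: &mut dyn TextSink,
    text: &str,
) -> Result<Delivery, String> {
    match primary.deliver(text) {
        Ok(()) => Ok(Delivery::Injected),
        Err(primary_err) => clipboard
            .deliver(text)
            .map(|()| Delivery::ClipboardFallback)
            .map_err(|clip_err| {
                format!("injection failed ({primary_err}); clipboard failed ({clip_err})")
            }),
    }
}

/// In-memory stand-in used for illustration and tests; a real sink would
/// talk to the OS accessibility or clipboard APIs instead.
pub struct MemorySink {
    pub fail: bool,
    pub received: Vec<String>,
}

impl TextSink for MemorySink {
    fn deliver(&mut self, text: &str) -> Result<(), String> {
        if self.fail {
            return Err("sink unavailable".to_string());
        }
        self.received.push(text.to_string());
        Ok(())
    }
}
```

Returning which path succeeded lets the caller tell the user when the text landed on the clipboard rather than in the target application.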

Warning: See the Specs Overview for the complete functional requirements, along with the non-functional requirements (NFRs) for security, performance, and accessibility.

🔧 Development & Debugging

| Document | Description | Audience |
| --- | --- | --- |
| Debug Panel | Development and troubleshooting tools | Developers, QA |
| Pre-commit Hooks | Code quality and testing setup | Contributors |
| Tauri Plugins | Plugin architecture and integrations | Backend developers |

🚀 Quick Start

Note: New to the project? Start with the Development Overview for setup instructions.

For Product People

  1. Read the Product Requirements to understand the vision
  2. Check the Implementation Plan for current progress
  3. Review Functional Specs for detailed features

For Engineers

  1. Study the Technical Architecture for system design
  2. Follow the Development Overview to set up your environment and start coding
  3. Reference System Description for implementation details

For Contributors

  1. Set up pre-commit hooks for code quality
  2. Browse functional requirements to find tasks
  3. Use the Debug Panel for development workflow

📊 Project Status

Tip: The current focus is the core transcription engine and text injection reliability.

| Component | Status | Notes |
| --- | --- | --- |
| Global Hotkeys | ✅ Complete | Cross-app hotkey registration working |
| Audio Capture | ✅ Complete | High-quality microphone input |
| Settings UI | ✅ Complete | Leptos-based configuration interface |
| Transcription | 🔄 Active | Whisper integration in progress |
| Text Injection | 🔄 Active | Cross-app compatibility improvements |
| Model Management | 📋 Planned | GGUF model download and validation |
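As one possible first step for the planned model validation, a downloaded file could be checked for the GGUF header: the four ASCII bytes `GGUF` followed by a 32-bit format version. The sketch below assumes a little-endian version field and inspects only the first eight bytes. The function name is hypothetical, and this is a sanity check rather than full validation (a checksum comparison, for example, would still be needed).

```rust
use std::io::{self, Read};

/// The four magic bytes that open a GGUF file.
pub const GGUF_MAGIC: [u8; 4] = *b"GGUF";

/// Reads the first eight bytes from `reader` and returns the GGUF format
/// version if the magic matches. Returns `Ok(None)` when the input is too
/// short or does not start with the GGUF magic; other I/O errors propagate.
/// (Hypothetical helper, assuming a little-endian version field.)
pub fn read_gguf_version<R: Read>(mut reader: R) -> io::Result<Option<u32>> {
    let mut header = [0u8; 8];
    match reader.read_exact(&mut header) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => return Ok(None),
        Err(e) => return Err(e),
    }
    if header[..4] != GGUF_MAGIC[..] {
        return Ok(None);
    }
    let version = u32::from_le_bytes([header[4], header[5], header[6], header[7]]);
    Ok(Some(version))
}
```

Because the function accepts any `Read`, it can be pointed at a `std::fs::File` for a downloaded model or at a byte slice in tests.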

🤝 Contributing

Note: This documentation is a living document. Found something unclear or outdated?

  • 📂 Browse specs in the specs directory for implementation tasks
  • 🐛 Report issues via GitHub Issues
  • 📝 Improve docs by opening a pull request
  • 💡 Suggest features in GitHub Discussions

Built with 🦀 Rust, ⚡ Tauri 2, and 🎨 Leptos

Privacy-first dictation for the modern developer