PyWhisper-Dictation

A simple Python Tkinter application that uses OpenAI's Whisper (https://github.com/openai/whisper) to record speech and transcribe it in 99 languages.

You can control the application using both the graphical user interface (GUI) and keyboard shortcuts.

Notes:

- Assumes a Linux OS.
- Assumes an NVIDIA GPU is present with drivers loaded.
- The default model is English-only for speed.

Installation

Clone the repository and change into its directory:

git clone https://github.com/eddiedunn/pywhisper-dictation
cd pywhisper-dictation

Create a virtual environment and activate it:

python3 -m venv venv
source venv/bin/activate

Install the required Python dependencies:

pip3 install -r requirements.txt

Install the ffmpeg dependency:

# Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg

# Fedora / other Red Hat flavors
sudo dnf install ffmpeg

# Arch Linux
sudo pacman -S ffmpeg
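
The assumptions listed in the notes (Linux, an NVIDIA driver, ffmpeg) can be checked before launching the application. The following is a minimal sketch and is not part of the repository: the function name `check_prerequisites` is hypothetical, and looking for `nvidia-smi` on the PATH is only a rough proxy for a working GPU setup.

```python
import platform
import shutil


def check_prerequisites(which=shutil.which, system=platform.system):
    """Return a list of problems found; an empty list means every check passed.

    `which` and `system` are injectable so the checks can be exercised
    without depending on the machine running them.
    """
    problems = []
    if system() != "Linux":
        problems.append("expected a Linux OS")
    # nvidia-smi ships with the NVIDIA driver, so finding it on PATH suggests
    # the driver is installed. It does not prove the GPU is usable.
    if which("nvidia-smi") is None:
        problems.append("nvidia-smi not found on PATH")
    # Whisper relies on the ffmpeg executable to decode audio files.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("warning:", problem)
```

Running the script prints one warning per failed check and nothing when all three pass.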

Usage

GUI

Run the application:

python3 main.py

Use the following buttons for different actions:

- Record: Start recording audio.
- Stop: Stop recording, transcribe the audio, and copy the result to the clipboard.
- Play: Play the recorded audio.
- Copy to Clipboard: Copy the text to the clipboard. Use this after editing the transcription manually.
- Reset: Clear the textbox and delete the recorded audio file.

Choose a Whisper model from the dropdown menu. The default is small.en. See the Whisper link above for more information about the different models.
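
For readers who want to see what the model choice means in code, here is a minimal sketch of transcribing a file with the `whisper` package directly. It is not taken from this repository's source: the helper names `is_english_only` and `transcribe_file` are hypothetical, while `whisper.load_model` and `model.transcribe` are the calls documented in the Whisper README. Model names ending in `.en` are the English-only variants.

```python
def is_english_only(model_name):
    """Whisper's English-only models carry a '.en' suffix, e.g. 'small.en'."""
    return model_name.endswith(".en")


def transcribe_file(audio_path, model_name="small.en"):
    """Load the named Whisper model and return the transcribed text.

    The import is deferred so this module can be loaded on a machine
    where the whisper package is not installed.
    """
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    # transcribe() returns a dict; the full transcript is under "text".
    return result["text"].strip()


if __name__ == "__main__":
    import sys

    # Usage: python3 this_script.py recording.wav
    print(transcribe_file(sys.argv[1]))
```

To transcribe a language other than English, pass a multilingual model name such as `small` in place of the default.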

Keyboard Shortcuts

- Ctrl-Alt + R: Start recording.
- Ctrl-Alt + S: Stop recording, transcribe the audio, and copy the result to the clipboard. Plays a sound when finished (the sound file can be changed in the code).
- Ctrl-Alt + P: Play the recorded audio.
- Ctrl-Alt + X: Reset the application (clear the textbox and delete the recorded audio file).
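
One way to wire shortcuts like these in a Tkinter application is with `bind` on the root window. The sketch below is an illustration only and may differ from how this project registers its shortcuts: the names `SHORTCUTS`, `tk_sequence`, and `bind_shortcuts` are hypothetical, and bindings made this way fire only while the application window has keyboard focus.

```python
# Map each shortcut letter to the action it triggers.
SHORTCUTS = {"r": "record", "s": "stop", "p": "play", "x": "reset"}


def tk_sequence(key):
    """Build the Tk event pattern for Ctrl-Alt plus a letter key."""
    return f"<Control-Alt-{key.lower()}>"


def bind_shortcuts(root, handlers):
    """Bind every shortcut on `root` to the matching zero-argument handler."""
    for key, action in SHORTCUTS.items():
        # Tk passes an event object to the callback; the default argument
        # freezes the current action name for each lambda.
        root.bind(tk_sequence(key), lambda event, name=action: handlers[name]())


if __name__ == "__main__":
    import tkinter as tk

    root = tk.Tk()
    bind_shortcuts(root, {name: (lambda n=name: print(n)) for name in SHORTCUTS.values()})
    root.mainloop()
```

Run as a script, this opens an empty window and prints the action name each time one of the four shortcuts is pressed.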

License

This project is licensed under the terms of the Apache 2.0 license.
