whisper.cpp
Introduction
What is whisper.cpp?
whisper.cpp is an efficient, lightweight automatic speech recognition (ASR) library developed by ggerganov, the creator of the ggml tensor library. It is a C++ port of OpenAI's Whisper speech recognition model. Its main advantage is that it runs efficiently on ordinary consumer-grade CPUs without requiring expensive GPUs or complex deep learning environments.
Key Features:
Pure CPU Inference: whisper.cpp is heavily optimized for CPU execution on x86 and ARM architectures, with dedicated optimizations for Apple Silicon (M1/M2/M3) chips.
No External Dependencies: The project is implemented in plain C/C++ and does not rely on large deep learning frameworks such as PyTorch or TensorFlow, which keeps compilation and deployment lightweight.
Multiple Model Sizes: Supports all OpenAI Whisper model sizes (tiny / base / small / medium / large), so users can trade accuracy against speed as needed.
High Recognition Accuracy: Maintains the accuracy of the original Whisper models while significantly lowering the hardware requirements, so that it can run even on embedded devices such as the Raspberry Pi.
Rich Output Formats: Supports plain text, timestamped subtitle formats (SRT, VTT), JSON, and more, making it easy to integrate into other applications.
Real-Time Inference Support: Some versions and derivative projects support real-time microphone input, enabling use cases such as live captioning and voice assistants.
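To make the timestamped subtitle output more concrete, here is a minimal Python sketch of how one recognized segment maps onto an SRT cue. It is not whisper.cpp's own code: the function names srt_timestamp and srt_cue and the sample segment are made up for illustration.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Build one SRT cue: a counter line, a time range line, then the text."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


# Hypothetical segment running from 1.5 s to 4.25 s.
print(srt_cue(1, 1.5, 4.25, "Hello from whisper.cpp"))
```

An SRT file is a sequence of such cues separated by blank lines, which is why most video players and editors can load it directly.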
Typical Use Cases:
- Local speech-to-text (offline, privacy-preserving)
- Automatic subtitle generation for videos
- Meeting and lecture transcription
- Speech recognition on embedded devices (Raspberry Pi, NAS, etc.)
- Backend ASR engine for other applications
Example Usage:
# Build the project
make
# Run speech recognition with the base model (recent releases name the binary whisper-cli instead of main)
./main -m models/ggml-base.bin -f audio.wav
Project Repository: https://github.com/ggerganov/whisper.cpp
With its lightweight, efficient, and cross-platform design, whisper.cpp has become a popular choice for local offline speech recognition.
Generated by AI
Get
GitHub: https://github.com/ggml-org/whisper.cpp
Reference tutorial
Generating video subtitles with whisper.cpp (Windows version):
First, use ffmpeg to convert the audio or video into a 16 kHz, mono, 16-bit PCM WAV file:
ffmpeg -i "input.mp4" -ar 16000 -ac 1 -c:a pcm_s16le audio.wav
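Here, the -i parameter specifies the path of the input video, -ar 16000 resamples the audio to 16 kHz, -ac 1 downmixes it to a single (mono) channel, and -c:a pcm_s16le encodes it as 16-bit PCM, which is the WAV format whisper.cpp expects.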
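When converting many files, it can help to script this step and confirm that the result really has the expected format. The following Python sketch uses only the standard library. The helper names ffmpeg_command and is_whisper_ready are hypothetical, the ffmpeg arguments simply mirror the command above, and ffmpeg is assumed to be on the PATH.

```python
import subprocess
import wave


def ffmpeg_command(src: str, dst: str) -> list:
    """Build the same ffmpeg call as above: 16 kHz, mono, 16-bit PCM WAV."""
    return ["ffmpeg", "-i", src, "-ar", "16000", "-ac", "1",
            "-c:a", "pcm_s16le", dst]


def convert(src: str, dst: str) -> None:
    """Run ffmpeg (assumed to be installed) and fail loudly on errors."""
    subprocess.run(ffmpeg_command(src, dst), check=True)


def is_whisper_ready(path: str) -> bool:
    """Return True if the WAV file is 16 kHz, mono, 16-bit PCM."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == 16000   # matches -ar 16000
            and wav.getnchannels() == 1   # matches -ac 1
            and wav.getsampwidth() == 2   # pcm_s16le is 2 bytes per sample
        )
```

A typical use would be convert("input.mp4", "audio.wav") followed by is_whisper_ready("audio.wav") before passing the file on for recognition.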
Next, download a Whisper model in ggml format from https://huggingface.co/ggerganov/whisper.cpp, for example ggml-medium.en-q8_0.bin (the English-only medium model with 8-bit quantization).
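The model can also be fetched from a script instead of the browser. The sketch below is only an illustration: the helper names model_url and download_model are hypothetical, and the /resolve/main/ path is an assumption based on how Hugging Face usually serves files from a repository's main branch, so check the link on the model page if the download fails.

```python
import os
import urllib.request

REPO_URL = "https://huggingface.co/ggerganov/whisper.cpp"


def model_url(filename: str) -> str:
    """Build a direct download URL (assumed /resolve/main/ layout)."""
    return f"{REPO_URL}/resolve/main/{filename}"


def download_model(filename: str, dest_dir: str = ".") -> str:
    """Download the model file into dest_dir and return the local path."""
    dest = os.path.join(dest_dir, filename)
    urllib.request.urlretrieve(model_url(filename), dest)
    return dest
```

For example, download_model("ggml-medium.en-q8_0.bin", "models") would save the model from the example above into an existing models directory.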
After downloading and extracting whisper.cpp, open a terminal in the whisper.cpp directory and run the following command to generate subtitles in SRT format:
whisper-cli.exe -m "/path/to/ggml-medium.en-q8_0.bin" -f "/path/to/audio.wav" -osrt
Here, the -m parameter specifies the location of the whisper model you downloaded, the -f parameter specifies the location of the audio to be recognized, and -osrt indicates outputting an srt subtitle file.
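To transcribe several files in a row, the same command can be driven from a small script. This is a minimal Python sketch rather than an official wrapper: the function names whisper_srt_command and transcribe_to_srt are hypothetical, and it uses only the -m, -f, and -osrt options described above.

```python
import subprocess


def whisper_srt_command(cli: str, model: str, audio: str) -> list:
    """Build the whisper-cli call shown above as an argument list."""
    return [cli, "-m", model, "-f", audio, "-osrt"]


def transcribe_to_srt(cli: str, model: str, audio_files: list) -> None:
    """Run whisper-cli once per audio file; raise if any run fails."""
    for audio in audio_files:
        subprocess.run(whisper_srt_command(cli, model, audio), check=True)
```

Passing the arguments as a list rather than a single string avoids quoting problems when paths contain spaces, which is common on Windows.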