
Running in Docker container results in the process being killed even for <30s audio #1266


Open
catileptic opened this issue Mar 19, 2025 · 0 comments


I am running faster-whisper in a Docker container.

The code I am using is this one (simplified for clarity):

import gc

from faster_whisper import WhisperModel

model_size = "large-v3"

file_path = "..."

model = WhisperModel(model_size, device="cpu", compute_type="int8", cpu_threads=1, num_workers=1)

segments, _ = model.transcribe(file_path, vad_filter=True, beam_size=5, no_speech_threshold=0.6, condition_on_previous_text=False)

for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")

if hasattr(model, 'model'):
    del model.model
if hasattr(model, 'feature_extractor'):
    del model.feature_extractor
if hasattr(model, 'hf_tokenizer'):
    del model.hf_tokenizer

del model

gc.collect()

When I run this on my own machine (Mac M2), it runs to the end for all of the small audio files I tested (both under and over 30s of audio).

However, when I run it in the Docker container, the process is killed (printing only the word Killed in the logs) when processing some short audio files (<30s), and always on longer audio files (for example, 2min). In Docker, I loop through several audio files to transcribe them, which is why I followed the advice in threads describing OOM issues and run del model and gc.collect() explicitly.

Even though I am trying to "clean up" after every transcription attempt, the process is still killed. Sometimes it fails on the first short audio file. Other times, the first short audio file (<30s) works just fine, but the next short audio file (<30s) always fails and the process is killed.
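One variation on the per-file cleanup described above is to run each transcription in its own child process, so that the operating system reclaims all of its memory when the child exits, regardless of what del and gc.collect() manage to free. The following is only a sketch under assumptions: run_in_child, transcribe_isolated and CHILD_SCRIPT are hypothetical names, and the child script simply mirrors the WhisperModel call from the code above. It has not been verified to avoid the kill in this container. On POSIX, a return code of -9 from subprocess means the child received SIGKILL, which is what the kernel OOM killer sends.

```python
import json
import subprocess
import sys

# Hypothetical child script: loads the model, transcribes one file, and
# prints the segments as a single JSON line on stdout.
CHILD_SCRIPT = """
import json, sys
from faster_whisper import WhisperModel

model = WhisperModel("large-v3", device="cpu", compute_type="int8",
                     cpu_threads=1, num_workers=1)
segments, _ = model.transcribe(sys.argv[1], vad_filter=True, beam_size=5)
print(json.dumps([[s.start, s.end, s.text] for s in segments]))
"""


def run_in_child(script, args, timeout=None):
    """Run `script` in a fresh Python interpreter; return (returncode, stdout)."""
    proc = subprocess.run(
        [sys.executable, "-c", script, *args],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout


def transcribe_isolated(file_path):
    """Transcribe one file in a child process and return its segments."""
    returncode, stdout = run_in_child(CHILD_SCRIPT, [file_path])
    if returncode == -9:
        # The child was terminated by SIGKILL; an OOM kill is one likely cause.
        raise MemoryError(f"child was killed while transcribing {file_path}")
    if returncode != 0:
        raise RuntimeError(f"child exited with code {returncode}")
    # Use the last non-empty stdout line in case anything else was printed.
    return json.loads(stdout.strip().splitlines()[-1])
```

The trade-off is that the model is loaded again for every file, which is slow for large-v3, but the parent process stays small and survives a killed child, so the loop can log the failure and continue with the next file.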

In Docker, I could never manage to transcribe the 2min audio file; the process is always killed.

I understand there is a memory leak in ctranslate2 (as per the thread linked above), but I'm surprised to see that this fails on Docker, even on very short audio files.

According to docker stats, the container usually uses around 3.9GiB of memory out of around 7.8GiB available, so around 50%. Even after running del model and gc.collect(), memory use stays at 50%.
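To compare the docker stats figures with what the process itself sees, the memory limit and current usage can also be read from inside the container through the cgroup files. This is a rough sketch, assuming a Linux container that exposes the usual cgroup v2 paths (memory.max, memory.current) or the cgroup v1 equivalents. The helper names are made up for illustration, and every value falls back to None when a file is missing.

```python
from pathlib import Path

# Candidate files: cgroup v2 first, then cgroup v1.
CGROUP_FILES = {
    "limit": [
        "/sys/fs/cgroup/memory.max",
        "/sys/fs/cgroup/memory/memory.limit_in_bytes",
    ],
    "usage": [
        "/sys/fs/cgroup/memory.current",
        "/sys/fs/cgroup/memory/memory.usage_in_bytes",
    ],
}


def parse_cgroup_value(text):
    """Convert a cgroup file's content to bytes; 'max' means no limit (None)."""
    text = text.strip()
    if text == "max":
        return None
    return int(text)


def read_first(paths):
    """Return the parsed value of the first readable file, or None."""
    for path in paths:
        try:
            return parse_cgroup_value(Path(path).read_text())
        except (OSError, ValueError):
            continue
    return None


def memory_snapshot():
    """Return {'limit': bytes or None, 'usage': bytes or None}."""
    return {name: read_first(paths) for name, paths in CGROUP_FILES.items()}


def format_gib(num_bytes):
    """Format a byte count as GiB, or 'unknown' when it is None."""
    return "unknown" if num_bytes is None else f"{num_bytes / 2**30:.2f}GiB"


if __name__ == "__main__":
    snap = memory_snapshot()
    print(f"usage={format_gib(snap['usage'])} limit={format_gib(snap['limit'])}")
```

Printing this snapshot before loading the model, after transcribing, and after del model and gc.collect() would show whether usage spikes toward the limit during transcription, which a periodic docker stats reading can miss.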

What am I doing wrong here, or missing?
