Major features:
- GPU auto-reset on CUDA errors with cooldown protection (handles sleep/wake)
- Async job queue system for long-running transcriptions
- Comprehensive GPU health monitoring with real model tests
- Phase 1 component testing with detailed logging

New modules:
- src/core/gpu_reset.py: GPU driver reset with 5-min cooldown
- src/core/gpu_health.py: Real GPU health checks using model inference
- src/core/job_queue.py: FIFO queue with background worker and persistence
- src/utils/test_audio_generator.py: Test audio generation for GPU checks
- test_phase1.py: Component tests with logging
- reset_gpu.sh: GPU driver reset script

Updates:
- CLAUDE.md: Added GPU auto-reset docs and passwordless sudo setup
- requirements.txt: Updated to PyTorch CUDA 12.4
- Model manager: Integrated GPU health check with reset
- Both servers: Added startup GPU validation with auto-reset
- Startup scripts: Added GPU_RESET_COOLDOWN_MINUTES env var
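The cooldown protection mentioned above can be illustrated with a minimal sketch. This is not the code in src/core/gpu_reset.py; the `ResetCooldown` class, its `try_reset` method, and the injectable `clock` parameter are hypothetical names chosen for illustration. Only the `GPU_RESET_COOLDOWN_MINUTES` variable and the 5-minute default come from the commit message. The sketch keeps the timestamp in memory, whereas a real implementation may persist it across restarts.

```python
import os
import time
from typing import Callable, Mapping, Optional


class ResetCooldown:
    """Allow a reset action at most once per cooldown window (hypothetical helper)."""

    def __init__(self, cooldown_minutes: float = 5.0,
                 clock: Callable[[], float] = time.time) -> None:
        self.cooldown_seconds = cooldown_minutes * 60.0
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_reset: Optional[float] = None

    def try_reset(self, reset_fn: Callable[[], None]) -> bool:
        """Run reset_fn unless a reset was attempted within the cooldown window."""
        now = self._clock()
        if self._last_reset is not None and now - self._last_reset < self.cooldown_seconds:
            return False  # still cooling down, skip the reset
        # Record the attempt first so a failing reset still counts toward the cooldown.
        self._last_reset = now
        reset_fn()
        return True


def cooldown_from_env(env: Mapping[str, str] = os.environ) -> ResetCooldown:
    """Build a cooldown from GPU_RESET_COOLDOWN_MINUTES, assuming a default of 5."""
    return ResetCooldown(float(env.get("GPU_RESET_COOLDOWN_MINUTES", "5")))
```

Recording the attempt before calling `reset_fn` is a deliberate choice in this sketch: it prevents a reset that keeps failing from being retried in a tight loop.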
25 lines
979 B
Plaintext
[program:whisper-api-server]
command=/home/uad/agents/tools/mcp-transcriptor/venv/bin/python /home/uad/agents/tools/mcp-transcriptor/src/servers/api_server.py
directory=/home/uad/agents/tools/mcp-transcriptor
user=uad
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/home/uad/agents/tools/mcp-transcriptor/logs/transcriptor-api.log
stdout_logfile_maxbytes=50MB
stdout_logfile_backups=10
environment=
    PYTHONPATH="/home/uad/agents/tools/mcp-transcriptor/src",
    CUDA_VISIBLE_DEVICES="0",
    API_HOST="0.0.0.0",
    API_PORT="8000",
    WHISPER_MODEL_DIR="/home/uad/agents/tools/mcp-transcriptor/models",
    TRANSCRIPTION_OUTPUT_DIR="/home/uad/agents/tools/mcp-transcriptor/outputs",
    TRANSCRIPTION_BATCH_OUTPUT_DIR="/home/uad/agents/tools/mcp-transcriptor/outputs/batch",
    TRANSCRIPTION_MODEL="large-v3",
    TRANSCRIPTION_DEVICE="auto",
    TRANSCRIPTION_COMPUTE_TYPE="auto",
    TRANSCRIPTION_OUTPUT_FORMAT="txt"
stopwaitsecs=10
stopsignal=TERM
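As a rough illustration of how a server process might consume the variables set in the `environment=` block, the following sketch reads them into a typed settings object. The `ServerSettings` class and its `from_env` method are hypothetical and are not taken from api_server.py; the fallback defaults simply mirror the values in the config above and are an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ServerSettings:
    """Typed view of the supervisor-provided environment (hypothetical)."""
    host: str
    port: int
    model: str
    device: str
    compute_type: str
    output_format: str

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "ServerSettings":
        # Defaults mirror the supervisor config; the real server may differ.
        return cls(
            host=env.get("API_HOST", "0.0.0.0"),
            port=int(env.get("API_PORT", "8000")),
            model=env.get("TRANSCRIPTION_MODEL", "large-v3"),
            device=env.get("TRANSCRIPTION_DEVICE", "auto"),
            compute_type=env.get("TRANSCRIPTION_COMPUTE_TYPE", "auto"),
            output_format=env.get("TRANSCRIPTION_OUTPUT_FORMAT", "txt"),
        )
```

Passing a plain dict instead of `os.environ` makes the parsing easy to check in isolation, for example `ServerSettings.from_env({"API_PORT": "9000"})`.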