Must-have skills:
Strong experience in MLOps / ML Infra / Platform Engineering.
Hands-on with Python, Docker, Kubernetes, Linux.
Experience with GPU inference and model serving (e.g. Triton, vLLM, TensorRT-LLM, FastAPI/gRPC).
Production deployment experience with at least one of: STT, TTS, LLM.
Strong understanding of monitoring (Prometheus/Grafana) and CI/CD.
Good debugging skills for production systems (performance, scaling, reliability).
Good-to-have skills:
Experience with Voice AI pipelines (STT, LLM, TTS).
Knowledge of streaming audio, VAD, diarization, turn detection.
Experience with ONNX / TensorRT optimization.
Cost optimization for GPU clusters.
Exposure to telephony, real-time communications, or enterprise AI workloads.
Education Required:
Bachelor’s degree in CS / IT / AI / ML or a related field.