MLOps Engineer :

About VideoSDK: VideoSDK is a product-based company building real-time communication infrastructure for developers. Our APIs and SDKs help teams add video, audio, and live streaming capabilities to their applications. We work with startups and global product companies across industries.

Role and Responsibilities :

Own the production inference stack for STT, TTS, and LLM (deploy, scale, monitor).

Build low-latency, high-availability serving pipelines for real-time Voice AI.

Manage Kubernetes/GPU workloads, autoscaling, and rollouts.

Improve latency, reliability, and cost (batching, caching, routing, warm pools).

Set up observability (latency, errors, GPU usage, queue delays) and alerts.

Work with ML and backend teams to ship models to production quickly.

Must-have skills :

Strong experience in MLOps / ML Infra / Platform Engineering.

Hands-on with Python, Docker, Kubernetes, Linux.

Experience with GPU inference and model serving (e.g. Triton, vLLM, TensorRT-LLM, FastAPI/gRPC).

Production deployment experience with at least one of: STT, TTS, LLM.

Strong understanding of monitoring (Prometheus/Grafana) and CI/CD.

Good debugging skills for production systems (performance, scaling, reliability).

Good to have skills :

Experience with Voice AI pipelines (STT, LLM, TTS).

Knowledge of streaming audio, VAD, diarization, turn detection.

Experience with ONNX / TensorRT optimization.

Cost optimization for GPU clusters.

Exposure to telephony, real-time communications, or enterprise AI workloads.

Education Required :

Bachelor’s degree in CS / IT / AI / ML or a related field.
