Machine Learning Engineer (Audio & Video Models)

Lexlegis
  • Posted On: 2026-03-18 19:25:30
  • Openings: 10
  • Applicants: 0
Job Description

Key Responsibilities


Design, train, and optimize audio and video ML models, including classification, detection, segmentation, generative models, speech processing, and multimodal architectures.


Develop and maintain data pipelines for large-scale audio/video datasets, ensuring quality, labeling consistency, and efficient ingestion.


Implement model evaluation frameworks that measure robustness, latency, accuracy, and overall performance across real-world conditions.


Work with product teams to transform research prototypes into production-ready models with reliable inference performance.


Optimize models for scalability, low latency, and edge/cloud deployment, including quantization, pruning, and hardware-aware tuning.


Collaborate with cross-functional teams to define technical requirements and experiment roadmaps.


Monitor and troubleshoot production models, ensuring reliability and continuous improvement.


Stay current with trends in deep learning, computer vision, speech processing, and multimodal AI.



Required Qualifications


Bachelor's or Master's degree in Computer Science, Electrical Engineering, Machine Learning, or a related field (PhD a plus).


Strong experience with deep learning frameworks such as PyTorch or TensorFlow.


Proven experience training and deploying audio or video models, such as:

  • Speech recognition, speech enhancement, speaker identification
  • Audio classification, event detection
  • Video classification, action recognition, tracking
  • Video-to-text, lip reading, multimodal fusion models


Solid understanding of neural network architectures (CNNs, RNNs, Transformers, diffusion models, etc.).


Proficiency in Python, along with ML tooling for experimentation and production (e.g., NumPy, OpenCV, FFmpeg, PyTorch Lightning).


Experience working with GPU/TPU environments, distributed training, and model optimization.


Ability to write clean, maintainable, production-quality code.



Preferred Qualifications


Experience with foundation models or multimodal transformers (e.g., audio-language, video-language).


Background in signal processing, feature extraction (MFCCs, spectrograms), or codec-level audio/video understanding.


Experience with MLOps tools (e.g., MLflow, Weights & Biases, Kubeflow, Airflow).


Knowledge of cloud platforms (AWS, GCP, Azure) and scalable model serving frameworks.


Experience with real-time audio/video processing for streaming applications.


Publications, open-source contributions, or competitive ML achievements are a plus.


Experience:


Minimum 2 years

More Info
Full Time
Legal
Not Disclosed
English
Not Disclosed
Education
Any Graduate
Not Disclosed
Required Skills
Electrical engineering, deep learning, GCP, machine learning, cloud, signal processing, transformers, continuous improvement

Contact Details
Lexlegis
+91 987654567
sales@lexlegis.ai
  • Experience: 2 years
  • Salary: Above 10 lakhs annually
  • Location for Hiring: Mumbai