Own your future:
Our culture isn't something people join; it's something they build and shape. We believe that every person deserves to be heard and empowered. If you're on the fence about whether you're a fit, we say go for it. Let’s build something great together.
As an MLOps Engineer at Transcenda, you will work on a cutting-edge platform with minimal legacy dependencies. You'll be part of a highly skilled technical team where experimentation is encouraged and ideas come to life, all while contributing to an organization that places AI and ML at the core of its business.
Key Responsibilities:
- Design, implement, and maintain end-to-end machine learning pipelines to ensure scalability, reliability, and performance.
- Develop automated systems for deploying, monitoring, and retraining machine learning models in production.
- Build and manage CI/CD pipelines specifically tailored for machine learning workflows.
- Architect and oversee cloud-based infrastructure using AWS to support scalable machine learning applications.
- Implement robust monitoring, logging, and alerting systems to ensure the health and performance of ML models and infrastructure.
Must Haves:
- 7+ years of overall experience in tech, with at least 3 years in DevOps or MLOps roles, and a strong technical background.
- Hands-on experience in building and maintaining machine learning pipelines.
- Familiarity with model monitoring and retraining workflows (e.g., MLflow, Seldon Core, Evidently AI, or similar).
- Experience with feature stores (e.g., Feast, Tecton) or similar technologies.
- Strong knowledge of infrastructure-as-code tools (e.g., Terraform, AWS CloudFormation).
- Experience with CI/CD tools (e.g., GitHub Actions, Jenkins, Argo Workflows).
- Expertise in containerization and orchestration (e.g., Docker, Kubernetes, Amazon EKS).
- Advanced proficiency in Python for automation and ML-related tasks.
- Familiarity with scripting languages (e.g., Bash, PowerShell).
- Hands-on experience with a variety of AWS services such as S3, Lambda, SageMaker, ECS, CloudWatch, Step Functions, Athena, and DynamoDB.
- Strong verbal and written communication skills in English (Upper-Intermediate+).
As a Plus:
- Experience with monitoring and observability tools (e.g., Prometheus, Grafana, Datadog, AWS CloudWatch).
- Familiarity with other cloud platforms (e.g., GCP, Azure).
- Experience with machine learning frameworks like TensorFlow, PyTorch, or similar.