An engineer who deploys, manages, monitors, and helps govern machine learning models in production. This often requires adapting code developed by a data scientist.
Added Perspectives
Once the data scientist selects their ML technique, they “train” the algorithm (potentially multiple versions of it) on one or more historical datasets to create the actual model. ML engineers, who serve a dual role of developer and data scientist, also might assist data scientists with the training process. Let’s first examine training for supervised ML, which has labeled outcomes. “Training” in this case means that the data scientist and ML engineer apply the algorithm to combinations of historical features and outcomes (a.k.a. labels) so that it can learn the relationship between them...
Once the data scientist has produced this production-ready model, they hand that model off to the ML engineer to put into production. The ITOps manager, who manages and monitors various IT components, also assists the ML engineer with this step. They store the model, along with various other production and training model versions, in repositories such as Databricks, TensorFlow or the GitHub developer platform. They also catalog those models in ML catalogs or data catalogs, and apply role-based access controls to govern their usage. ML engineers, ITOps managers, and developers can search for and browse models in a catalog, inspect their features and labels, and add various tags to guide colleagues. They can generate reports on these models to assist compliance efforts. Finally, the ML engineer and ITOps manager assign responsibility and accountability to different stakeholders (including developers) for implementing or operationalizing the model in the MLOps phase. They decide where to host a given model (systems, containers, etc.), how to integrate it into production workflows, and how to serve that model into production.
- Kevin Petrie in The Machine Learning Lifecycle and MLOps: Building and Operationalizing ML Models - Part II
May 12, 2021 (Blog)
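The cataloging step described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the `ModelCatalog` class, the role names, and the permission sets are hypothetical, and a real deployment would rely on an actual model registry or data catalog rather than an in-memory dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class ModelVersion:
    """One cataloged model version with the metadata colleagues can inspect."""
    name: str
    version: int
    features: list
    label: str
    tags: set = field(default_factory=set)


class ModelCatalog:
    """Toy in-memory catalog (hypothetical; stands in for a real registry)."""

    # Assumed role-to-permission mapping, for illustration only.
    PERMISSIONS = {
        "ml_engineer": {"read", "register", "tag"},
        "itops_manager": {"read", "tag"},
        "developer": {"read"},
    }

    def __init__(self):
        self._models = {}  # keyed by (name, version)

    def _check(self, role, action):
        # Role-based access control: reject actions the role is not granted.
        if action not in self.PERMISSIONS.get(role, set()):
            raise PermissionError(f"role {role!r} may not {action}")

    def register(self, role, model):
        self._check(role, "register")
        self._models[(model.name, model.version)] = model

    def tag(self, role, name, version, tag):
        self._check(role, "tag")
        self._models[(name, version)].tags.add(tag)

    def search(self, role, tag):
        # Return every model version carrying the tag, in a stable order.
        self._check(role, "read")
        matches = [m for m in self._models.values() if tag in m.tags]
        return sorted(matches, key=lambda m: (m.name, m.version))


catalog = ModelCatalog()
catalog.register("ml_engineer", ModelVersion("churn", 1, ["tenure", "plan"], "churned"))
catalog.register("ml_engineer", ModelVersion("churn", 2, ["tenure", "plan", "usage"], "churned"))
catalog.tag("itops_manager", "churn", 2, "production")
production_models = catalog.search("developer", "production")
```

In this sketch a developer can browse and search but cannot register new versions, which mirrors the idea of governing model usage by role.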
The ML engineer prepares to implement the ML model in production by reviewing each model version’s features, labels, assumptions, training data, change history, and documentation. With the oversight of the data scientist, they select the version they want for production, then validate their selection. For example, they might run final A/B tests to compare the results of different model versions and datasets.
- Kevin Petrie in The Machine Learning Lifecycle and MLOps: Building and Operationalizing ML Models - Part III
June 14, 2021 (Blog)
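As a rough illustration of comparing model versions before release, the sketch below scores two candidate versions on the same labeled holdout set. It is a simplified offline comparison, not a full A/B test: a real test would typically split live traffic and check statistical significance. The two rule-based "models", the field names, and the data are all hypothetical.

```python
def accuracy(predict, rows, labels):
    """Fraction of rows where the model's prediction matches the label."""
    correct = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return correct / len(labels)


# Hypothetical model versions, written as plain rules for illustration.
def model_v1(row):
    return row["tenure_months"] < 6


def model_v2(row):
    return row["tenure_months"] < 6 or row["support_tickets"] > 3


# Made-up holdout data: did the customer churn?
holdout_rows = [
    {"tenure_months": 2, "support_tickets": 0},
    {"tenure_months": 12, "support_tickets": 5},
    {"tenure_months": 24, "support_tickets": 1},
    {"tenure_months": 8, "support_tickets": 4},
    {"tenure_months": 3, "support_tickets": 2},
    {"tenure_months": 18, "support_tickets": 6},
]
holdout_labels = [True, True, False, False, True, True]

candidates = {"v1": model_v1, "v2": model_v2}
scores = {
    name: accuracy(predict, holdout_rows, holdout_labels)
    for name, predict in candidates.items()
}
best_version = max(scores, key=scores.get)
```

Here `v1` gets 4 of 6 rows right and `v2` gets 5 of 6, so `best_version` is `"v2"`. With a sample this small the difference would not justify a decision on its own, which is why the data scientist's oversight remains part of the selection.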
What they lead: The ML engineer serves as the running back that takes the handoff from the quarterback and runs the ball down the field. That is, they take the trained model from the data scientist and put that model into production by integrating it with operational workflows, all under the oversight of the data scientist. For example, they might convert a model from R to the Java code of the production application. Then they monitor and help govern the model to ensure it meets performance, accuracy, cost, and compliance requirements. Finally, the ML engineer huddles with the data scientist to jointly decide when it is time to re-engineer features or retrain the model.
What they support: The ML engineer supports the data scientist and data engineer during the data and feature engineering phase by ensuring their data pipelines, features, and labels align with the production environment. They support the data scientist during the model development phase by helping them understand production requirements.
What they learn: The ML engineer must learn basic aspects of data pipelines to support data engineers effectively. They must learn the fundamentals of DevOps to understand the requirements of their production environment.
- Kevin Petrie in The Machine Learning Lifecycle and MLOps: Building and Operationalizing ML Models - Part IV
July 9, 2021 (Blog)
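The monitoring duty described above can be sketched as a simple threshold check. This is a minimal illustration, assuming hypothetical metric names and limits; the source does not prescribe specific metrics, and compliance checks are left out because they rarely reduce to a single number.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Assumed production requirements for one model (illustrative values)."""
    min_accuracy: float
    max_latency_ms: float
    max_cost_per_1k_predictions: float


def find_violations(metrics, requirements):
    """Return the names of the requirements the observed metrics fail."""
    violations = []
    if metrics["accuracy"] < requirements.min_accuracy:
        violations.append("accuracy")
    if metrics["latency_ms"] > requirements.max_latency_ms:
        violations.append("latency")
    if metrics["cost_per_1k_predictions"] > requirements.max_cost_per_1k_predictions:
        violations.append("cost")
    return violations


requirements = Requirements(
    min_accuracy=0.90,
    max_latency_ms=200.0,
    max_cost_per_1k_predictions=0.50,
)

# Made-up metrics from the latest monitoring window.
observed = {"accuracy": 0.86, "latency_ms": 140.0, "cost_per_1k_predictions": 0.42}

violations = find_violations(observed, requirements)
# An accuracy violation is a cue to review the model with the data scientist,
# not an automatic retraining trigger.
needs_review = "accuracy" in violations
```

In this example only the accuracy requirement fails, so the sketch flags the model for the joint review in which the ML engineer and data scientist decide whether to re-engineer features or retrain.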
Relevant Content
Apr 15, 2021 - The lifecycle of machine learning projects spans data and feature engineering, model development, and ML operations or MLOps.
Nov 11, 2021 - Many machine learning (ML) use cases center on real-time calculations. This article defines streaming ML and its architectural components.
© Datalere, LLC. All rights reserved