The use of Machine Learning (ML) and its operationalization through the Machine Learning Operations (MLOps) paradigm bring many benefits. Business decisions can increasingly be based on, or aided by, data and automated methods that infer information from that data. Tedious and repetitive tasks can be completed without human interaction, and new business cases can be implemented more efficiently. While Machine Learning solves the algorithmic problem of a use-case, MLOps concerns itself with operationalizing such an algorithm (e.g., serving it through an API infrastructure, continuously improving the algorithm, managing the data for the algorithm, etc.). For clarity, we call an algorithm embedded in an MLOps framework a “production-level Machine Learning application”.

However, to reach production, a business must overcome a vast number of challenges before it can successfully leverage the potential of machine learning. In our last blog post (see Effective MLOps in Action), we presented a concrete implementation of an MLOps use-case, predicting breast cancer from patient data. That implementation covers the technical aspects in depth; however, it does not consider other aspects of an end-to-end lifecycle in a real-life scenario, such as planning, operating, and improving the solution to optimize for business value.

Effective MLOps Scope

From an engineer's point of view, the main tasks to develop a production-ready ML application are primarily limited to data modeling and prediction generation. From the perspective of the business that runs and relies on an ML system, the challenges are more intricate. Consider the following questions:

  • Do we have the resources to sustainably make use of the new system?
  • How will the changes be perceived by our employees and/or customers?
  • How can such a method be operationalized in the day-to-day business?
  • How can we guarantee sustainable prediction quality?
  • How can we achieve long-term improvements of the new system?
  • How can we approach cross-functional challenges arising from the new system (e.g., challenges that require technical and business knowledge to be solved)?
  • How can the business value be maximized from a technical implementation of a production-level Machine Learning application?

Essentially, all these challenges fall into three dimensions: Technologies, Culture & Skills, and Operating Model. Considering all three aspects when planning and implementing an ML solution allows for a holistic reflection on the challenges a business might face.

Holistic considerations for a production-ready ML application

Figure 1 - Simplified effective MLOps scope (triangle: Culture, Technologies, and Operation)

Different stakeholders have different interests and prerequisites concerning technological advances within a business. While a shareholder aims for a general improvement in efficiency, an office employee would like to spend as little time on monotonous tasks as possible. The head of sales, on the other hand, is concerned with obtaining deeper insights into the company's customer base in order to target customers with more relevant products.

This diversity of interests underlines the importance of involving all stakeholders when implementing the changes accompanying the introduction of an ML solution (rather than in a top-down fashion). This ensures that improvements are as expedient as possible, but also requires a business to approach challenges in digitalization in a holistic way. In the case of a production-ready Machine Learning application, the relevant aspects for a holistic analysis correspond to the three “Effective MLOps” dimensions.

  • Business decisions can only be based on automation, statistics, and Machine Learning if these methods are guaranteed to work in a production environment and if they generate the expected output reliably.
  • New systems and processes can only be introduced if they are accepted by the target user.
  • Sustainable improvement can only be guaranteed if employees and consumers fulfill the prerequisites for being able to operate, improve, and maintain a new system.

As a business, it is key to understand where improvements must be made to ensure that the aforementioned points are satisfied. Understanding the maturity of each aspect of the MLOps scope also allows for better decision-making when determining the feasibility and requirements of production-level Machine Learning projects. Additionally, it enables measuring progress against solid success criteria. Any digitized business should strive to automate steps of its business processes while retaining or even improving their quality, speed, and sustainability. Furthermore, a digitized business should strive to obtain as many insights from its data as possible, as this leads to more informed and thus better decisions. Both can be achieved by applying mature MLOps in a holistic way. So how can maturity be measured holistically?

Theoretical Frameworks for Maturity

Maturity with regard to MLOps can be measured in many ways. Google published its own model for MLOps maturity in 2020 [1]. This model presents three maturity levels. At the first level (Level 0), each step in the pipeline, from data analysis to serving the model, is performed manually. The second level (Level 1) introduces continuous training: the model is automatically retrained in production. Unlike Level 0, Level 1 deploys a whole training pipeline to production instead of just a trained model with a corresponding service. This level is sufficient when the data changes often but the ML approach does not. Level 2 focuses on improving continuous integration and continuous delivery of the pipeline. This level of maturity is required for a functional production-level Machine Learning application when not only the data but also the ML model changes frequently.
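To make the distinction between the three levels more tangible, the following is a minimal sketch that maps two yes/no questions onto the levels as summarized above. The class `PipelineCapabilities`, its two flags, and the function `google_mlops_level` are hypothetical names introduced for illustration; they are not part of Google's publication, and the real model describes far more characteristics per level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineCapabilities:
    """Hypothetical, heavily simplified description of an ML setup."""

    automated_retraining: bool  # a training pipeline runs in production and retrains the model
    automated_ci_cd: bool       # changes to the pipeline itself are integrated and delivered automatically


def google_mlops_level(caps: PipelineCapabilities) -> int:
    """Return 0, 1, or 2 following the simplified reading of the levels above."""
    if not caps.automated_retraining:
        # Every step from data analysis to serving is still manual.
        return 0
    if not caps.automated_ci_cd:
        # Continuous training exists, but the pipeline is changed by hand.
        return 1
    # Continuous training plus CI/CD for the pipeline.
    return 2


manual_setup = PipelineCapabilities(automated_retraining=False, automated_ci_cd=False)
retraining_only = PipelineCapabilities(automated_retraining=True, automated_ci_cd=False)
print(google_mlops_level(manual_setup), google_mlops_level(retraining_only))
```

In this reading, a setup with CI/CD but without automated retraining still counts as Level 0, because continuous training is treated as the prerequisite for the higher levels. That ordering is an assumption of the sketch.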

Deloitte claims to have created the first pan-organizational digital maturity model in 2018 [4], which considers the five core dimensions “Customer”, “Strategy”, “Technology”, “Operations” and “Organization & Culture”. Their maturity model follows a holistic approach to digitalization and innovation, but it does not specifically consider MLOps.

Microsoft presented its own maturity model for MLOps in 2020 [5]. It defines five distinct levels (“Level 0: No MLOps”, “Level 1: DevOps but no MLOps”, “Level 2: Automated Training”, “Level 3: Automated Model Deployment” and “Level 4: Full MLOps Automated Operations”). Yet again, that model focuses solely on the technological aspect of the three “Effective MLOps” scopes.

Before we present our own maturity model, it is essential to understand what the three dimensions, “Technologies”, “MLOps Culture & Skills” and “MLOps Operating Model”, entail:

Figure 2 - Effective MLOps Scopes

  • MLOps Technologies: This scope describes the usage and maturity of tooling and data. Particularly important here are the technological capabilities that arise from the tooling choices for a specific process and that are involved in the development lifecycle of a production-level Machine Learning application.
  • MLOps Culture & Skills: This scope details the readiness of a company for change in general. Change can come from innovating the business through the use of new technology, but also from enabling employees to become more capable and more aware of important aspects of MLOps (e.g., through trainings or self-study). In the context of MLOps, concrete examples are introducing a new data platform within a business or allocating time for data scientists to attend conferences.
  • MLOps Operating Model: The Operating Model pertains to how technology is operated. It refers to the ability to continuously improve and automate, and to how well end-to-end processes are defined with regard to data, code, and model. In the context of MLOps, an example is how well a company applies the principles of DataOps (data versioning) and ModelOps (model and experiment tracking and continuous training).

Our Model

Figure 3 - MLOps Maturity Model

Our model consists of the three dimensions presented in the sections above: technology, operating model, and culture. Each of these dimensions is split into two sub-dimensions (see Figure 3), and each sub-dimension is rated on four levels: Pre-Machine Learning, level 0, level 1, and level 2. Their meaning is as follows:

Technology

Tooling & Automation:

The Pre-Machine Learning level corresponds to employees using spreadsheets such as Excel to interact with data in order to make predictions or draw insights. At level 0, some automation is in place. For instance, SQL scripts load relevant data from a database and make it available for further usage. At level 1, code collaboration takes place through tools such as Jupyter notebooks or GitHub, machine learning algorithms run in the cloud, and findings and insights are visualized with the help of dedicated tools. Finally, at level 2, these functionalities are integrated within a larger unified platform that simplifies collaboration between teams and reduces friction when tackling cross-functional problems.

Data:

The most rudimentary way to manage data, corresponding to the Pre-Machine Learning level, is to store it locally, either on a computer or on a company-hosted local server. This can lead to difficulties with file versioning, access, data loss, etc. At level 0, data is stored on a data server. While this solves the problem of accessing data and, to a certain degree, file versioning, it still leaves a company vulnerable to data loss, as the data is not stored redundantly. At level 1, a company makes use of cloud infrastructure to store data, as well as high-performance databases, allowing for lower latency when accessing data for training. Finally, at level 2, data is not only stored redundantly, but changes are also recorded through transactions. This allows for easy reproducibility of Machine Learning experiments.
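Why recorded transactions make experiments reproducible can be illustrated with a small sketch. It assumes a toy key-value dataset held in memory; the class `VersionedDataset`, its methods, and the row names are hypothetical and only stand in for what a real transactional data store would provide.

```python
class VersionedDataset:
    """Toy dataset whose changes are kept in an append-only transaction log."""

    def __init__(self) -> None:
        self._log: list[dict] = []  # one dict of changes per committed transaction

    def commit(self, changes: dict) -> int:
        """Record one transaction and return the resulting version number."""
        self._log.append(dict(changes))
        return len(self._log)

    def snapshot(self, version: int) -> dict:
        """Rebuild the dataset exactly as it was at the given version."""
        if not 0 <= version <= len(self._log):
            raise ValueError(f"unknown version: {version}")
        state: dict = {}
        for changes in self._log[:version]:
            for key, value in changes.items():
                if value is None:
                    state.pop(key, None)  # in this sketch, None marks a deletion
                else:
                    state[key] = value
        return state


data = VersionedDataset()
v1 = data.commit({"row_1": 0.7, "row_2": 0.2})
v2 = data.commit({"row_2": 0.4, "row_1": None})

# An experiment that stored "trained on version 1" can be rerun on identical data,
# even though the dataset has changed since.
training_data = data.snapshot(v1)
```

Storing the version number alongside each trained model is enough to recover the exact training data later, which is the property the level 2 description relies on.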

Operating Model

Decision Making:

At the Pre-Machine Learning level, a company tends to take reactive decisions for a particular use-case, since no process is established that predicts the outcome of a given value stream. Instead, information is obtained only after the event has occurred. At level 0, a business might make use of partial digitalization, allowing for a certain level of business process monitoring and therefore for manual, data-based predictions of an outcome. At level 1, a company considers different KPIs that are dynamically calculated by an ML algorithm for specific use-cases, and decision-making is aided by ML. Finally, at level 2, decisions are made predominantly by an algorithm whose output is supervised by a human.

Agility:

At the Pre-Machine Learning level, a company has no awareness of agility and has rigid or no planning structures, which prevents scalability and flexibility. At level 0, a business has separate processes for developing and for operating a solution. The competencies are siloed, and cross-team communication is therefore inhibited. At level 1, a company is able to apply DevOps principles to developing and operating software. Awareness of MLOps, however, is not yet pronounced: the data and the model are treated separately from the code running the actual Machine Learning service. At level 2, DevOps capabilities are optimized by incorporating SRE principles and best practices, which are key to both DevOps and MLOps.

Culture

Skills & Enablement:

At the Pre-Machine Learning level, employees do not possess the skills required to handle MLOps in any way. There is no drive to enable teams or to improve their skills. All operations regarding a production-ready Machine Learning application are outsourced. At level 0, a company has a dedicated IT team that operates its infrastructure and software. However, there is no specialization regarding Machine Learning. At level 1, a company has specialized roles for each aspect of its software landscape, and hence also for Machine Learning. At level 2, a company’s employees are continuously enabled, and they are aware of cutting-edge technology and how to apply it. Skills and knowledge are shared between teams to make sure that silos are broken up and that there is no single point of failure.

Innovation Readiness & Strategy:

The Pre-Machine Learning level is characterized by a traditional approach to innovation and the lack of an automation and ML strategy. At level 0, a company has a plan for innovation; the lack of risk-taking, however, hinders the implementation of new solutions. At level 1, a company has developed a strategy specifically for the development of Machine Learning solutions. Moderate risk-taking allows for the implementation of use-cases that guarantee a return on investment. At level 2, innovation is considered a core value that drives the company. High mistake acceptance and a clear MLOps strategy allow for cutting-edge innovation.

Advantages of assessing MLOps maturity

Our MLOps maturity model sheds light on the various aspects (both technological and non-technological) that should be considered when running MLOps at the production level. Why is it important to understand the MLOps maturity of an organization? Falling far behind in one aspect of MLOps leads to inefficiencies. Having the market-leading unified data platform but no employee who can operate it means wasted potential: the platform's functionality will never be used to its fullest. Similarly, having employees capable of advanced Machine Learning while still taking reactive decisions is a waste of talent, since predictive methods could have been implemented to support business decisions in a more proactive way. The bottom line is that a holistic awareness of the capabilities and requirements of MLOps leads to improvement for any business that aims to digitize. Not only does the improved understanding facilitate planning and implementing new solutions, but it also allows for a concise evaluation of inefficiencies within the scope of MLOps.
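The idea that the weakest aspect holds back the others can be sketched as a small self-assessment. The six sub-dimension names come from our model; everything else is an assumption made for illustration: the numeric encoding of the levels (with Pre-Machine Learning as -1), the helper names `bottlenecks` and `maturity_gap`, and the example ratings.

```python
from enum import IntEnum


class Level(IntEnum):
    """Assumed numeric encoding of the four levels in the maturity model."""

    PRE_ML = -1
    LEVEL_0 = 0
    LEVEL_1 = 1
    LEVEL_2 = 2


# Sub-dimension -> dimension, as described in the sections above.
SUB_DIMENSIONS = {
    "Tooling & Automation": "Technology",
    "Data": "Technology",
    "Decision Making": "Operating Model",
    "Agility": "Operating Model",
    "Skills & Enablement": "Culture",
    "Innovation Readiness & Strategy": "Culture",
}


def bottlenecks(assessment: dict) -> list:
    """Return the sub-dimensions rated lowest, i.e. the likely sources of inefficiency."""
    missing = SUB_DIMENSIONS.keys() - assessment.keys()
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    lowest = min(assessment.values())
    return sorted(name for name, level in assessment.items() if level == lowest)


def maturity_gap(assessment: dict) -> int:
    """Distance between the most and the least mature sub-dimension."""
    return max(assessment.values()) - min(assessment.values())


# Hypothetical company: a unified data platform, but nobody trained to operate it.
example = {
    "Tooling & Automation": Level.LEVEL_2,
    "Data": Level.LEVEL_1,
    "Decision Making": Level.LEVEL_1,
    "Agility": Level.LEVEL_1,
    "Skills & Enablement": Level.PRE_ML,
    "Innovation Readiness & Strategy": Level.LEVEL_1,
}

print(bottlenecks(example), maturity_gap(example))
```

A large gap signals that investment in the most mature area is unlikely to pay off until the lowest-rated sub-dimension catches up. How large a gap is acceptable is a judgment call that this sketch does not attempt to answer.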

Footnotes and References

  [1] MLOps: Continuous delivery and automation pipelines in machine learning (Google - 2020): link
  [2] Effective MLOps Scope (Dr. Houssem Ben Mahfoudh - 2021): link
  [3] Effective MLOps in Action (Dr. Houssem Ben Mahfoudh - 2021): link
  [4] Digital Maturity Model (Deloitte - 2018): link
  [5] Machine Learning operations maturity model (Microsoft - 2020): link