Artificial Intelligence (AI) is a reality that profoundly impacts nearly all facets of business operations. It can augment decision-making processes and fundamentally change how organizations work. Yet to fully harness its potential, businesses must navigate a complex data terrain, operationalize data and model pipelines, and develop predictive AI models. That potential is only unlocked when development, deployment, and operationalization follow a streamlined and holistic approach, which is why a strategic, comprehensive MLOps approach is critical for these digital transformations.

MLOps is the set of practices, processes, and tools that extends the DevOps approach to the domains of data engineering and Machine Learning (ML). It combines aspects of machine learning, data engineering, and DevOps, focusing on reproducibility, automation, and continuous integration and delivery of models to facilitate collaboration between data scientists, data engineers, and IT operations teams. As part of our MLOps journey, we've previously provided extensive insights on many important topics such as data engineering, roles and responsibilities, ML pipelines, monitoring and observability, and many others. For the complete picture, we introduced our end-to-end digital highway approach for reliable and continuous Machine Learning delivery and operations.

Our emphasis at Machine Learning Architects Basel (MLAB) is always on understanding the underlying concepts before delving into the sea of specific tools and technologies. Nevertheless, practical implementations ultimately rely on specific tools or solutions. For example, a company might have decided to migrate to an end-to-end MLOps platform. However, with the MLOps landscape rapidly expanding and evolving, making sense of the various offerings can be incredibly difficult without the right expertise. How do different end-to-end MLOps platforms compare? Which parts of the digital highway do they actually cover?

In this blog post, we aim to simplify this process by outlining our approach to tooling evaluations and offering insights into how we can help you select the right tools and technologies matching your business requirements.

Figure 1 - Overview of our tooling evaluation approach.

The Evaluation Process

Evaluating Data and Machine Learning tooling can initially seem overwhelming, given the broad spectrum of options available. We've distilled it into three fundamental steps with concrete milestones designed to simplify and streamline the decision-making process.

Step 1: Define Requirements

The first stage involves introspection and anticipation: identify your current needs and anticipate potential future requirements. A detailed list of these requirements forms the basis for the following steps. However, not all requirements are created equal. We have found that it helps to group requirements into "low", "medium", and "high" importance. If a requirement doesn't apply (but you still want to track it), categorize it as "not required". This exercise will help prioritize and streamline your efforts in the subsequent steps.
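As a minimal illustration, such a prioritized requirements list can be captured in a simple structure. The requirement names, categories, and numeric importance weights in the sketch below are assumptions for the sake of the example, not part of our methodology.

```python
from dataclasses import dataclass

# Numeric weights for the importance levels; "not required" items are kept
# for tracking but carry no weight. The exact numbers are an assumption and
# should be agreed upon with your stakeholders.
IMPORTANCE_WEIGHTS = {"not required": 0, "low": 1, "medium": 2, "high": 3}

@dataclass
class Requirement:
    name: str
    category: str
    importance: str  # one of the keys in IMPORTANCE_WEIGHTS

# Hypothetical requirements collected during this step.
requirements = [
    Requirement("Native git integration", "DevOps", "high"),
    Requirement("Streaming data ingestion", "Data & ML", "medium"),
    Requirement("Enterprise incident management", "Operations & Business", "not required"),
]
```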

Step 2: Evaluate Tools

The second step is to assess the tools available and determine how well they cover each requirement. Most likely, no single tool will meet all your needs completely. We rate each tool on two axes: coverage and effort. "Coverage" refers to how well the tool meets a particular requirement and can be classified as "not covered", "partially covered", or "fully covered". "Effort" tracks how easy it is (or would be) to implement a requirement using the tool. The five possible ratings are "not supported", "custom extension", "community extension", "configurable", and "out-of-the-box".

Step 3: Identify Optimal Tools

The final step is to identify the solution or set of solutions that best aligns with your unique set of criteria. This is done by calculating rating scores based on the coverage offered by the tools, the effort it takes to achieve full coverage, and the importance of the various requirements.

We will now look at each of these steps in more detail.

Requirements Analysis: Identifying Your Needs

Embarking on the quest to identify your organization's unique needs and aspirations might seem straightforward, yet it often proves to be the most challenging step. Requirements are not "one-size-fits-all," as different businesses will have varying needs depending on factors such as existing infrastructure, scale, and strategic goals. You might want to supplement your existing data platform with enhanced analytics capabilities without requiring enterprise-level incident management capabilities. If you don't have the in-house competencies to evaluate which requirements are essential (or even know what capabilities exist and could be significant), consider getting help for this crucial step and reach out to us to benefit from our expertise and opinions. We will help you decide on the importance of factors such as native git support, OpenTelemetry support, data versioning support, or the general ease of following modern DevOps best practices when relying on the tool.

In our experience, it's beneficial to group requirements into categories which are then broken down into sub-categories that house one or more specific criteria. Consider, for instance, that you're looking for an end-to-end MLOps solution. Because this is a vast topic, we begin by identifying the three broad areas of interest: Data & ML, DevOps, and Operations & Business. We then break these down into categories and sub-categories:

Figure 2 - Overview of areas and capabilities.

Each sub-category is divided into specific criteria. For instance, under Data Integration, we examine capabilities such as integration of disparate sources, streaming and batch capabilities, and bidirectional data integration.
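One lightweight way to keep this hierarchy manageable is to record it as nested data. In the sketch below, only the Data Integration criteria come from the example above; the remaining areas and placeholder comments are purely illustrative.

```python
# A hypothetical slice of the requirements hierarchy: broad areas contain
# categories/sub-categories, which in turn list concrete criteria.
requirements_hierarchy = {
    "Data & ML": {
        "Data Integration": [
            "Integration of disparate sources",
            "Streaming and batch capabilities",
            "Bidirectional data integration",
        ],
        # ... further sub-categories (e.g. feature management, model training)
    },
    "DevOps": {
        # ... sub-categories such as CI/CD or infrastructure automation
    },
    "Operations & Business": {
        # ... sub-categories such as monitoring or incident management
    },
}
```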

A helpful technique we established during this phase is to map these capabilities onto our previously mentioned digital highway. This lets you see how the capabilities align with your current system and identify potential gaps. Figure 3 shows a simple example of this. At the end of this step, you will have a comprehensive list of capabilities, each rated according to its level of importance.

Figure 3 - Example of mapping Data and Machine Learning (one of the large areas) categories onto our digital highway.

An essential concept we would like to emphasize is the notion of total cost of ownership. It is crucial to remember that when implementing a new tool, the initial setup cost is just a fraction of this overall expense. Tools require ongoing support, maintenance, updates, testing, security measures, audits, and eventual replacement. The ease of maintenance can vary significantly between tools. Therefore, it is essential to consider the total cost from the very beginning, starting with the requirements analysis and continuing throughout the tool evaluation phase.

Tooling Evaluation

With your detailed list of requirements prepared, we evaluate a set of tools on the market. We start by compiling a list of potential solutions to include in the evaluation. This selection can be based on numerous factors, such as prior knowledge about the tools, their reputation, or independent reviews and recommendations. If you need help figuring out where to start, maps of the MLOps landscape can be a useful starting point, but they can also be overwhelming due to the sheer number of tools that position themselves as MLOps solutions. Don't hesitate to reach out to Machine Learning Architects Basel to benefit from our experience in developing end-to-end MLOps approaches for customers from a variety of industries and our knowledge about what works well in different circumstances.

It is now time to thoroughly analyze each tool. To ensure a structured and fair evaluation, for each criterion we give two ratings to each tool:

Coverage: We rate the degree to which a solution fulfills the specific requirement:

  • Not Covered: The solution does not provide the capability.
  • Partially Covered: The solution offers the capability to some extent, but not entirely.
  • Fully Covered: The solution fully provides the capability.

Effort: We rate the level of work needed to achieve the coverage. Different solutions might, in principle, fully cover a particular requirement but differ in the amount of work required to achieve it. For example, solution A might fully cover a requirement out of the box, whereas solution B requires an additional extension that needs to be installed and maintained. (Again, consider the total cost of ownership!) We therefore rate the effort according to the following levels (a possible numeric encoding of both rating scales is sketched after the list):

  • Not Supported: The tool does not support the capability.
  • Custom Extension: The tool doesn't natively support the capability, but it can be added with (significant) custom work.
  • Community Extension: The capability is not built into the tool but is available through existing community extensions.
  • Configurable: The tool allows for the capability but requires some configuration.
  • Out-of-the-Box: The tool directly supports the capability; no extra work is needed.
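To make the two scales comparable across tools, we typically convert them to numbers. The mapping below is one possible encoding; the exact values are an assumption and should be calibrated with your stakeholders.

```python
# Possible numeric encodings for the two rating axes. Higher is better in
# both cases; the specific values are an assumption, not a fixed standard.
COVERAGE_SCORES = {
    "not covered": 0.0,
    "partially covered": 0.5,
    "fully covered": 1.0,
}

EFFORT_SCORES = {
    "not supported": 0.0,
    "custom extension": 0.25,
    "community extension": 0.5,
    "configurable": 0.75,
    "out-of-the-box": 1.0,
}
```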

During this evaluation it is important to examine in some detail what the different solutions actually provide, not only what they claim to provide. For example, many solutions claim native git (version control) integration, but in practice they often do not treat it as a first-class citizen. It is then important that the evaluator has enough experience to accurately estimate to what extent this would impact a best-practice MLOps approach.

In conclusion, the tooling evaluation stage involves defining a list of potential solutions and rigorously analyzing each based on their coverage of your requirements and the effort needed to implement them.

Comparison and Decision-Making

The final step is to compare the different tools based on their evaluations and select the best one. This involves calculating a rating score (or multiple scores, e.g., one per category) for each solution, considering importance, coverage, and effort. We convert the ratings for each dimension to numbers and then multiply all three numbers to get a score for each requirement, which can then be summed up.
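A minimal sketch of this calculation is shown below; the numeric rating values and the two hypothetical tools are illustrative assumptions, while the multiply-and-sum logic follows the description above. The same function can of course be applied per category instead of across all requirements.

```python
# Score each requirement as importance * coverage * effort (all numeric),
# then sum the per-requirement scores into an overall score per tool.

def overall_score(ratings):
    """ratings: list of (importance, coverage, effort) numeric triples."""
    return sum(importance * coverage * effort
               for importance, coverage, effort in ratings)

# Hypothetical ratings for two tools across three requirements
# (importance weight, coverage score, effort score).
tool_a = [(3, 1.0, 1.0), (2, 0.5, 0.75), (1, 1.0, 0.5)]
tool_b = [(3, 0.5, 0.25), (2, 1.0, 1.0), (1, 1.0, 1.0)]

print("Tool A:", overall_score(tool_a))  # 3.0 + 0.75 + 0.5 = 4.25
print("Tool B:", overall_score(tool_b))  # 0.375 + 2.0 + 1.0 = 3.375
```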

At this point, it is crucial to present and discuss the results with all stakeholders. Some refinements or adjustments may be needed, such as slight modifications to the original criteria or their criticality. It is also worth noting that, in many instances, conducting a Proof of Concept (POC) with one or two of the top-ranking solutions is beneficial. This allows you to evaluate a tool's effectiveness in your specific environment before committing fully. We are pleased that our POC service offerings from the Swiss Digital Network have a proven track record of delivering valuable insights and aiding in informed decision-making.

Conclusion: The Power of a Systematic Approach

This blog post details our robust and proven methodology for evaluating tooling solutions in Data, ML and MLOps. Our approach is well-defined, adaptable, and considers any organization's nuanced requirements.

To ensure the effectiveness of this approach, a certain level of expertise in understanding and rating tools is essential, as well as an unbiased attitude. In principle, you could implement this approach independently to find the best solution for your organization: it is designed to be easily adaptable and applicable to most business environments. In practice, however, many of our customers would struggle to perform the evaluation completely on their own due to a lack of resources, expertise, or a holistic market overview.

Don't hesitate to reach out if you need further assistance or prefer an objective, third-party perspective, and consulting service!

Machine Learning Architects Basel

Machine Learning Architects Basel (MLAB) is a member of the Swiss Digital Network (SDN). Having pioneered the Digital Highway for End-to-End Machine Learning & Effective MLOps, we have created frameworks and reference models that combine our expertise in DataOps, Machine Learning, MLOps, and our extensive knowledge and experience in DevOps, SRE, and agile transformations. This expertise allows us to successfully perform thorough tooling evaluations as described in this blog post.

If you want to learn more about how MLAB can aid your organization in creating long-lasting benefits by developing and maintaining reliable data and machine learning solutions, don't hesitate to contact us.