The Only Real Answer To Machine Learning Failure?
Stopping It Ever Happening & Embracing MLOps
Most, if not all, technologies go through a phase of initial excitement when they seem to be everywhere, followed by a cooling-off period and perhaps even a bit of a backlash. But over time, if a technology has real value, it eventually becomes accepted and part of BAU (Business As Usual).
This pattern is so familiar that Gartner has a great name for it: the Hype Cycle. But there's no inherent need for it to happen: technologies a business can genuinely benefit from should slot right in. That is what needs to happen, and can happen, with Artificial Intelligence (AI) and Machine Learning (ML). Yet there are some ominous signs that people are struggling, with experts at MIT Sloan Business School warning that, "More and more companies are embracing data science as a function and a capability, but many of them have not been able to consistently derive business value from their investments in big data, Artificial Intelligence, and Machine Learning."
What are the challenges making consistent roll-outs of great ML tougher than they should be, not just in big organisations but in any kind of enterprise?
I think there are four. Fortunately, I think there are four solutions, too.
A team can get very enthusiastic about experimenting with new tech, and that's fine and healthy. But every investment must be framed in terms of how it fits into the organisation's bigger picture. The first thing you need to do is make sure you understand the genuine business problem you're trying to solve. If you always approach development this way, you will not only avoid using machine learning for its own sake; you will also spot early on when you don't currently have the data to support the use case you're targeting. I've seen this happen a lot: people think these algorithms are magical, but forget that if they don't have the right data to answer the question, the system simply can't work.
Prioritise your use cases, and never look at an IT project in isolation: look for the big picture view. At the very least, position the work as a taster to help train your staff and better understand the tech, as opposed to promising the business something that is unlikely to be achieved (and therefore risk alienating stakeholders).
99.9% of ML work starts with the data science team, who tend to be given a lot of freedom over the tools they play with, as the work is seen as requiring specialist tooling, e.g., Python or some kind of open-source framework. But these choices won't necessarily align well with the rest of the technology landscape and your eventual production environment, so there can be issues here: there's a danger of the data science team coming up with great candidate solutions that never make it into production, because there's no agreed, standard approach for getting them there in your environment.
Data science can't sit in an ivory tower of its own; that causes needless disconnect (see below in People). Encourage experimentation, for sure, but every tool or programming-language choice needs to come with a plan for how the work would be ported or embedded into your more mundane, real-world production environment. And as we're about to see, there genuinely is a way to do this that works for both parties (and, more critically, the business).
ML work just isn't the same as embedding a new payroll SaaS. (Or is it? See below!) The differences show up not only during the development, testing, and deployment of the technology, but also in the real need to monitor performance properly once the model is running in production, and potentially to retrain it regularly, or even continuously, to keep it performing as expected. The process also needs governance checkpoints to ensure the technology meets ethical and responsible AI best practice.
The ML lifecycle process needs to be properly scoped out and planned for, so that this new form of organisational activity becomes as predictable and controllable as your other technology work (at least, as far as it can be).
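To make the monitor-and-retrain part of that lifecycle concrete, here is a minimal sketch in Python. The function names and the 5-point tolerance are illustrative assumptions for this article, not the API of any particular MLOps product:

```python
# A minimal sketch of the monitor-and-retrain checkpoint described above.
# All names and thresholds here are illustrative assumptions.

def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def needs_retraining(live_accuracy, baseline_accuracy, tolerance=0.05):
    """Flag the model for retraining when accuracy on live traffic drops
    more than `tolerance` below the accuracy measured at deployment time."""
    return live_accuracy < baseline_accuracy - tolerance

# The model scored 0.92 when deployed; on a recent batch of live traffic
# it gets only 6 of 10 predictions right (0.60), so retraining is flagged.
baseline = 0.92
live = accuracy([1, 0, 1, 1, 0, 0, 1, 0, 1, 1],
                [1, 0, 0, 1, 0, 1, 1, 1, 1, 0])
print(needs_retraining(live, baseline))  # True
```

In practice, the "live accuracy" would come from labelled feedback or a proxy signal such as data drift, and the retraining itself would be an automated pipeline step rather than a manual job.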
It's not very nice to say it, but we do need to address the elephant in the room: for some reason, there's often conflict between the data science team and the IT department. There are various reasons for this, and it's partly because data science work is highly experimental and all about trial and error, whereas IT work is much more well-defined and fully specced. So there's a basic psychological difference here: data science is focused on using data to build an exotic algorithm or ambitious statistical model, with a degree of subjectivity involved, while mainstream IT is all about building to the spec, testing against the spec, and then releasing into production and maintaining what ships.
Solution: Using the ‘four Ps’ to build a workable MLOps solution
There is a solution to this, and indeed to all of the 'four Ps', as I like to call them. But it isn't my special secret solution; you know what it is already. It's what you used to solve the traditional problem with IT.
What am I talking about? Until a few years ago, the IT function had a terrible reputation in business. The waterfall development methodology had resulted in very static, very slow, very wasteful project approaches, and the gap between IT and the business seemed as unbridgeable as the data science–IT gap I'm describing here. It had got to the stage where bad jokes were being made about any change request only happening after a six-month delay, if at all.
But we solved that. We developed Agile and a great thing you almost certainly use now called 'DevOps': a completely new and much more business-friendly way of organising Information Technology work that has broken down the barriers between the users and creators of systems, sped up the delivery of software, and made that software far more immediately useful than it had been before.
The same as DevOps, but for machine learning
So surely the solution to all of our concerns about ML projects is 'MLOps', a version of this that works for this new class of software development? Clearly, the term is a derivative of DevOps, and it means the same kind of thing: a fusion of development and operations, of building software and managing the production environment. What DevOps provides is a methodology, tooling, and processes to help you move software, in the broadest sense, from one stage to the next in an automated way. MLOps is the same, but for machine learning.
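To illustrate that "one stage to the next in an automated way" idea, here is a toy sketch of a staged pipeline in Python. The stage names and the runner are illustrative assumptions for this article, not the API of any real MLOps tool:

```python
# A toy sketch of an automated pipeline: each stage receives the artifact
# produced by the previous one, and a rejected artifact halts the run.
# Everything here is illustrative, not a real MLOps tool's API.

def run_pipeline(stages, artifact):
    """Run each (name, stage) pair in order; a stage returning None halts."""
    for name, stage in stages:
        artifact = stage(artifact)
        if artifact is None:
            return f"stopped at {name}"
    return "deployed"

pipeline = [
    ("train",    lambda data: {"model": "fitted", "data": data}),
    ("validate", lambda art: art if art["model"] == "fitted" else None),
    ("package",  lambda art: {**art, "format": "container image"}),
    ("deploy",   lambda art: art),
]

print(run_pipeline(pipeline, [1, 2, 3]))  # deployed
```

Real MLOps platforms add the pieces this sketch omits, such as versioned artifacts, environment promotion, approval gates, and production monitoring hooks; but the core idea is the same automated hand-off between stages.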
If we can build an MLOps culture quickly, the gap many organisations are experiencing between the data science teams (doing experimentation and seeing what's possible) and the production environment gets bridged. I know this can happen, because it's what we do every day at my company; it's how we help solve the MLOps problem using our four Ps framework. And what we keep finding is that by making MLOps thinking standard through the four Ps, you can realise the business value of your machine learning and analytics investment. Rather than having a data science team that keeps telling you it could do all these wonderful things, you can actually put them into practice.
Avoiding ML backlash
What happens if you don't try to build an MLOps approach? Big organisations with lots of resource and determination will probably be fine, but in smaller, more financially constrained businesses, ML might end up dismissed as mere 'Hype Cycle', and all your initial hard work written off as another failed IT project.
There's a message here for the wider AI world, too: if this happens enough times, it will slow the whole industry down. If we don't help companies get from innovation and ideation into production, it could be a disaster waiting to happen for the entire field. To avoid that, and to make Machine Learning in business a technology that never slips into that notorious Gartner 'Trough of Disillusionment', let's all get serious about MLOps, and as soon as possible.
Need our help now?
Get in touch if you need more immediate support.