Deep learning models are now ubiquitous and analyse massive amounts of audio, image, video, text and graph data, with applications in all industry segments across the value chain. While significant work has been done to standardise and automate training through frameworks and tools, the deployment of these models into production, especially at scale, remains largely bespoke and extremely effort-intensive.
There are 3 major components in the journey to have AI powered business applications successfully deployed in an enterprise.
There is a fourth and final part as well, without which the real benefit of AI will not be fully realised: the continuous learning loop, in which the AI application learns from live data to improve the accuracy and performance of the model. Ideally, the deployment platform and tools should enable anomaly detection on the prediction stream and the provisioning of data for retraining the model. However, as the scale and complexity of deployed models increase, this may have to be considered and planned as a separate workload in itself. We will address this in a separate discussion in the future. For a discussion on data strategy, please see my previous posts on this topic.
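To make the idea of prediction stream anomaly detection a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a description of any particular platform: the class name `PredictionStreamMonitor`, the rolling z-score rule on model confidence, and the in-memory `retraining_queue` are all hypothetical choices made for this example.

```python
from collections import deque
from statistics import mean, stdev


class PredictionStreamMonitor:
    """Hypothetical monitor: flags predictions whose confidence deviates
    sharply from the recent history of the stream (rolling z-score)."""

    def __init__(self, window=100, z_threshold=3.0, min_history=10):
        self.history = deque(maxlen=window)  # most recent confidence scores
        self.z_threshold = z_threshold       # assumed cut-off, tune per model
        self.min_history = min_history       # wait for some data before judging
        self.retraining_queue = []           # flagged inputs set aside for review

    def observe(self, features, confidence):
        """Record one prediction and return True if it looks anomalous."""
        anomalous = False
        if len(self.history) >= self.min_history:
            mu = mean(self.history)
            sigma = stdev(self.history)
            if sigma > 0 and abs(confidence - mu) / sigma > self.z_threshold:
                anomalous = True
                # Keep the input so it can be labelled and used for retraining.
                self.retraining_queue.append((features, confidence))
        self.history.append(confidence)
        return anomalous
```

In use, the serving layer would call `observe(features, confidence)` for every live prediction and periodically hand the contents of `retraining_queue` to whatever labelling and retraining process is in place. A real deployment would likely monitor input distributions as well as confidence, and would persist flagged samples rather than hold them in memory.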
Discussions about AI tend to focus on data (rightly so), feature engineering, algorithms and model development. All of these are no doubt necessary for using AI successfully. However, insufficient attention is being paid to the equally important aspect of deploying AI applications to production. Managing AI applications in production is in fact so important that you could say the actual work only begins after you've deployed the AI in production with live data! Welcome to the emerging field of ML Ops, AI Ops or AI Deployments (not to be confused with using AI/ML to improve IT Operations processes).
Setting up environments for AI pipelines and deploying them can be a long and exhausting process. Some common hurdles that we face in this process are:
So how do we tackle these problems and create a reliable, scalable and repeatable process for deploying AI applications into production?
The simplest solution, and the one used by most companies today, is to go with a hosted provider like AWS (there are several others with more customised PaaS offerings, like Pipeline.ai or H2O.ai), where a significant part of the deployment process is abstracted away and a fairly consistent deployment experience is available across a wide range of load and performance patterns. There are trade-offs, however, in areas like flexibility and the optimisation of hardware for the specific AI models being used, and the cost of running a large enterprise-size load on the cloud may also become a deterrent to the adoption of AI within the enterprise.
For the rest, unfortunately, the path is still largely uncharted. There are a few tools from the major players and some from smaller boutique outfits, but they all tend to be very specific to a particular combination of framework, model and hardware. Even when using such tools, a significant number of bespoke customisations may be required, for which a pool of smart engineers needs to be readily available.
None of these problems is insurmountable, and the situation is not as grim as it may sound. The tools landscape is evolving every day, and considerable work is being done by leaders like Google, Facebook and Microsoft, as well as by a whole bunch of smart start-ups. With awareness of the problem and proper planning in the initial phases (think of a process akin to 'DevOps for AI'), AI applications can be deployed in production and scaled to meet the growing need for better and more reliable enterprise automation.