Tamburri [11] presents the current trends and challenges, focusing on sustainability and explainability.
III. MLOPS
MLOps (machine learning operations) stands for the collection of techniques and tools for the deployment of ML models in production [12]. It combines DevOps and machine learning. DevOps [13] stands for a set of practices whose main purpose is to minimize the time needed for a software release, reducing the gap between software development and operations [14][15]. The two main principles of DevOps are Continuous Integration (CI) and Continuous Delivery (CD). Continuous integration is the practice by which software development organizations try to integrate code written by developer teams at frequent intervals, constantly testing the code and making small improvements each time based on the errors and weaknesses revealed by the tests. This results in a shorter software development cycle [16]. Continuous delivery is the practice according to which there is constantly a new version of the software under development to be installed for testing, evaluation and then production. With this practice, the software releases resulting from continuous integration, with their improvements and new features, reach the end users much faster [17].

Figure 1. MLOps Life-cycle.

After the wide acceptance of DevOps and of "continuous software development" practices in general [18][14], the need to apply the same principles that govern DevOps to machine learning models became imperative [12]. This is how these practices, called MLOps (Machine Learning Operations), came about. MLOps attempts to automate machine learning processes using DevOps practices and approaches. The two main DevOps principles it seeks to serve are Continuous Integration (CI) and Continuous Delivery (CD) [15]. Although this seems simple, in reality it is not, because a machine learning model is not independent but is part of a wider software system, and it consists not only of code but also of data. As the data is constantly changing, the model is constantly called upon to retrain on the new data that emerges. For this reason, MLOps introduces a new practice in addition to CI and CD, that of Continuous Training (CT), which aims to automatically retrain the model where needed. From the above, it becomes clear that compared to DevOps, MLOps is much more complex and incorporates additional procedures involving data and models [19][9][20].
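To make the CT idea concrete, the following minimal sketch (our own illustration, not taken from [12] or from any particular MLOps tool) shows a retraining step that fires only when model quality on fresh, labeled data degrades; the threshold and the scikit-learn toy data are assumptions.

```python
"""Minimal Continuous Training (CT) sketch: retrain only when accuracy on
fresh, newly labeled data drops below a threshold. Illustrative only."""
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

ACCURACY_THRESHOLD = 0.90  # assumed acceptable production accuracy

# Initial training data and model (stand-ins for the production artifacts).
X_train, y_train = make_classification(n_samples=500, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# "New" data arriving in production, here simulated with a different seed.
X_new, y_new = make_classification(n_samples=200, random_state=1)

# CT step: evaluate on the fresh data and retrain only if quality degraded.
if accuracy_score(y_new, model.predict(X_new)) < ACCURACY_THRESHOLD:
    model = LogisticRegression(max_iter=1000).fit(X_new, y_new)
```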
A. MLOps pipeline

While there are several attempts to capture and describe MLOps, the best known is the proposal of ThoughtWorks [21][22], which automates the life cycle of end-to-end machine learning applications (Figure 2). It is "a software engineering approach in which an interoperable team produces machine learning applications based on code, data and models in small, secure new versions that can be replicated and delivered reliably at any time, in short custom cycles". This approach includes three basic procedures: collecting, selecting and preparing the data to be used in model training; finding and selecting the most efficient model after testing and experimenting with different models; and developing and shipping the selected model to production. A simplified form of such a pipeline is shown in Figure 2.

Figure 2. MLOps Pipeline.

After collecting, evaluating and selecting the data that will be used for training, we automate the process of creating and training models. This allows us to produce more than one model, which we can test and experiment with in order to arrive at a more efficient and effective model while recording the results of our tests. Then we have to resolve various issues related to the productization of the model, and submit it to various tests in order to confirm its reliability before deploying it to production. Finally, we can monitor the model and collect the resulting new data, which will be used to retrain the model, thus ensuring its continuous improvement [23].
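As a toy illustration of these three procedures, the following sketch (our own, with scikit-learn standing in for the data, modeling and delivery tooling of a real pipeline) prepares data, trains two candidate models, keeps the better one and persists it for deployment:

```python
"""Sketch of the three pipeline stages described above: data preparation,
model selection, and delivery of the selected model."""
import pickle
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# 1) Data: collect, select and prepare.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 2) Models: train several candidates and keep the most effective one.
candidates = [
    make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    make_pipeline(StandardScaler(), DecisionTreeClassifier(random_state=0)),
]
best = max(candidates, key=lambda m: m.fit(X_train, y_train).score(X_test, y_test))

# 3) Delivery: persist the selected model so the deployment step can ship it.
with open("model.pkl", "wb") as f:
    pickle.dump(best, f)
```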
B. Maturity Levels

Depending on its level of automation, an MLOps system can be classified at a corresponding level [19]. These levels were named maturity levels by the community. Although there is no universal maturity model, the two main ones were created by Google and Microsoft. The Google model [24] consists of three levels, and its structure is presented in Figure 3: MLOps level 0: Manual process; MLOps level 1: ML pipeline automation; MLOps level 2: CI/CD pipeline automation. The Microsoft model [25] consists of five levels, and its structure is presented in Figure 4: Level 1: No MLOps; Level 2: DevOps but no MLOps; Level 3: Automated Training; Level 4: Automated Model Deployment; Level 5: Full MLOps Automated Operations.

Figure 3. Google's Maturity Levels.

Figure 4. Microsoft's Maturity Levels.
IV. TOOLS AND PLATFORMS

In recent years many different tools have emerged to help automate the sequence of machine learning processes [26]. This section provides an overview of the different tools and of the requirements these tools meet. Note that different tools automate different phases of the machine learning workflow. The majority of tools come from the open source community: half of all IT organizations use open source tools for AI and ML, and the percentage is expected to be around two-thirds by 2023. On GitHub alone, there are 65 million developers and 3 million organizations contributing to 200 million projects. It is therefore not surprising that there are advanced sets of open source tools in the machine learning and artificial intelligence landscape. Open source tools focus on specific tasks within MLOps instead of providing end-to-end machine learning life-cycle management, and these tools and platforms typically require a development environment in Python and R. The choice of tools for MLOps is based on the context of the respective ML solution and the operations setup.

A. Data Preprocessing Tools

Data preprocessing tools are divided into two main categories: data labeling tools and data versioning tools. Data labeling tools (also called annotation, tagging or sorting tools) label big data such as text, images or sound. Data labeling tools can in turn be divided into different categories depending on the task they perform: some are designed to annotate specific file types such as videos or images [27], and few of these tools can handle all file types. There are also different types of labels that differ from tool to tool; bounding boxes, polygonal annotations, and semantic segmentation are the most common features on the labeling market. Your choice of data labeling tool will be an essential factor in the success of the machine learning model, so you need to specify the type of data labeling your organization needs [28]. Labeling accuracy is an important aspect of data labeling [29]: high quality data creates better model performance. Data versioning tools (also called data version controls) manage different versions of data sets and store them in an accessible and well-organized way [30]. This allows data science teams to gain knowledge, such as identifying how changes affect model performance and understanding how data sets evolve. The most important data preprocessing tools are listed in Table I.

Table I. Data Preprocessing Tools.

Name                  Status       Launched in  Use
iMerit                Private      2012         Data Preprocessing
Pachyderm             Private      2014         Data Versioning
Labelbox              Private      2017         Data Preprocessing
Prodigy               Private      2017         Data Preprocessing
Comet                 Private      2017         Data Versioning
Data Version Control  Open Source  2017         Data Versioning
Qri                   Open Source  2018         Data Versioning
Weights and Biases    Private      2018         Data Versioning
Delta Lake            Open Source  2019         Data Versioning
Doccano               Open Source  2019         Data Preprocessing
Snorkel               Private      2019         Data Preprocessing
Supervisely           Private      2019         Data Preprocessing
Segments.ai           Private      2020         Data Preprocessing
Dolt                  Open Source  2020         Data Versioning
LakeFS                Open Source  2020         Data Versioning
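The core idea behind the data versioning tools in Table I can be illustrated with a few lines of standard-library Python: address every dataset snapshot by a hash of its content, so experiments can record exactly which version they trained on. This is only a toy sketch of ours; real tools such as DVC or LakeFS do far more.

```python
"""Toy illustration of data versioning: store each dataset snapshot under a
content hash so the exact version used by an experiment is traceable."""
import hashlib
import json
import shutil
from pathlib import Path

def snapshot(dataset_path: str, store: str = "data_store") -> str:
    """Copy the dataset into a content-addressed store; return its version id."""
    data = Path(dataset_path).read_bytes()
    version = hashlib.sha256(data).hexdigest()[:12]  # short content hash
    dest = Path(store) / version
    dest.mkdir(parents=True, exist_ok=True)
    shutil.copy(dataset_path, dest / Path(dataset_path).name)
    # A small manifest lets experiment trackers record the version used.
    (dest / "manifest.json").write_text(json.dumps({"source": dataset_path}))
    return version
```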
B. Modeling Tools

The tools with which we extract features from a raw data set in order to create optimal training data sets are called feature engineering tools. Tools like these can speed up the feature extraction process [31] when applied to common applications and generic problems. To monitor the versions of the data of each experiment and its results, as well as to compare different experiments, we use experiment tracking tools, which store all the necessary information about the different experiments; developing machine learning projects involves running multiple experiments with different models, model parameters, or training data. Hyperparameter tuning or optimization tools automate the process of searching for and selecting hyperparameters that give optimal performance for machine learning models. Hyperparameters are the parameters of machine learning models, such as the size of a neural network or the type of regularization, that model developers can adjust to achieve different results [32]. The most important modeling tools are listed in Table II.

Table II. Modeling Tools.

Name                           Status       Launched in  Use
Hyperopt                       Open Source  2013         Hyperparameter Optimization
SigOpt                         Public       2014         Hyperparameter Optimization
Iguazio Data Science Platform  Private      2014         Feature Engineering
TsFresh                        Private      2016         Feature Engineering
Featuretools                   Private      2017         Feature Engineering
Comet                          Private      2017         Experiment Tracking
Neptune.ai                     Private      2017         Experiment Tracking
TensorBoard                    Open Source  2017         Experiment Tracking
Google Vizier                  Public       2017         Hyperparameter Optimization
Scikit-Optimize                Open Source  2017         Hyperparameter Optimization
dotData                        Private      2018         Feature Engineering
Weights and Biases             Private      2018         Experiment Tracking
CML                            Open Source  2018         Experiment Tracking
MLflow                         Open Source  2018         Experiment Tracking
Optuna                         Open Source  2018         Hyperparameter Optimization
Talos                          Open Source  2018         Hyperparameter Optimization
AutoFeat                       Open Source  2019         Feature Engineering
Feast                          Private      2019         Feature Engineering
GuildAI                        Open Source  2019         Experiment Tracking
Rasgo                          Private      2020         Feature Engineering
ModelDB                        Open Source  2020         Experiment Tracking
Hopsworks                      Private      2021         Feature Engineering
Aim                            Open Source  2021         Experiment Tracking
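As a concrete example of hyperparameter optimization, the following sketch uses scikit-learn's GridSearchCV as a stand-in for the dedicated tuning tools in Table II; the grid and data set are illustrative assumptions.

```python
"""Minimal hyperparameter optimization sketch with GridSearchCV."""
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hyperparameters are knobs such as regularization strength C or kernel type.
param_grid = {"C": [0.1, 1.0, 10.0], "kernel": ["linear", "rbf"]}

search = GridSearchCV(SVC(), param_grid, cv=5)  # exhaustive search with CV
search.fit(X, y)
print(search.best_params_, search.best_score_)  # what a tracker would log
```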
C. Operationalization Tools

Operationalization tools cover the deployment, serving and monitoring of models in production. Several companies have also built their own in-house end-to-end platforms, such as Uber with Michelangelo (2015) [37], Airbnb with Bighead (2017) [38] and Netflix with Metaflow (2020) [39]. The most important operationalization tools are listed in Table III.

Table III. Operationalization Tools.

Name                              Status       Launched in  Use
Google Cloud Platform             Public       2008         End-to-end
Microsoft Azure                   Public       2010         End-to-end
H2O.ai                            Open Source  2012         End-to-end
Unravel Data                      Private      2013         Model Monitoring
Algorithmia                       Private      2014         Model Deployment / Serving
Iguazio                           Private      2014         End-to-end
Databricks                        Private      2015         End-to-end
TensorFlow Serving                Open Source  2016         Model Deployment / Serving
Featuretools                      Private      2017         Feature Engineering
Amazon SageMaker                  Public       2017         End-to-end
Kubeflow                          Open Source  2018         Model Deployment / Serving
OpenVINO                          Open Source  2018         Model Deployment / Serving
Triton Inference Server           Open Source  2018         Model Deployment / Serving
Fiddler                           Private      2018         Model Monitoring
Losswise                          Private      2018         Model Monitoring
Alibaba Cloud ML Platform for AI  Public       2018         End-to-end
MLflow                            Open Source  2018         End-to-end
BentoML                           Open Source  2019         Model Deployment / Serving
Superwise.ai                      Private      2019         Model Monitoring
MLRun                             Open Source  2019         Model Monitoring
DataRobot                         Private      2019         End-to-end
Seldon                            Private      2020         Model Deployment / Serving
TorchServe                        Open Source  2020         Model Deployment / Serving
KFServing                         Open Source  2020         Model Deployment / Serving
Syndicai                          Private      2020         Model Deployment / Serving
Arize                             Private      2020         Model Monitoring
Evidently AI                      Open Source  2020         Model Monitoring
WhyLabs                           Open Source  2020         Model Monitoring
Cloudera                          Public       2020         End-to-end
BodyWork                          Open Source  2021         Model Deployment / Serving
Cortex                            Private      2021         Model Deployment / Serving
Sagify                            Open Source  2021         Model Deployment / Serving
Aporia                            Open Source  2021         Model Monitoring
Deepchecks                        Private      2021         Model Monitoring
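A minimal sketch of the model deployment/serving task these tools address: a REST endpoint that loads a pickled model and answers prediction requests. It assumes the model.pkl artifact from the earlier pipeline sketch and uses Flask only as a convenient stand-in for the production-grade serving tools in Table III.

```python
"""Minimal model serving sketch: a REST prediction endpoint."""
import pickle
from flask import Flask, jsonify, request

app = Flask(__name__)
with open("model.pkl", "rb") as f:  # artifact from the training pipeline
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]  # e.g. [[5.1, 3.5, 1.4, 0.2]]
    return jsonify(prediction=model.predict(features).tolist())

if __name__ == "__main__":
    app.run(port=8080)
```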
Figure 5. AutoML vs. ML.

V. AUTOML

In the last years more and more companies have tried to integrate machine learning models into the production process, and for this reason another software solution was created. AutoML is the process of automating the different tasks that the creation of an ML model requires [41]. Specifically, an AutoML pipeline contains data preparation, model creation, hyperparameter tuning, evaluation and validation. With these techniques a set of models is trained on the same data set, hyperparameter fine-tuning is applied, and finally the models are evaluated and the best model is exported. The process of creating and selecting the appropriate model, as well as the preparation of the data, thus turns into a much simpler and more accessible process [42]. This is the reason why every year more and more companies turn their attention to AutoML. The combination of AutoML and MLOps simplifies the deployment of ML models in production and makes it much more feasible. In this section we give a brief introduction to the most modern AutoML tools and platforms, aiming at the combination of AutoML and MLOps.
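As an illustration of this flow, the sketch below runs TPOT [45] on a toy data set: many candidate pipelines are trained and tuned on the same data and the best one is exported. The tiny search budget is an assumption made to keep the example short, not a recommended setting.

```python
"""AutoML sketch with TPOT: search over pipelines, export the best one."""
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# TPOT searches over preprocessing + model + hyperparameter combinations.
automl = TPOTClassifier(generations=2, population_size=10, random_state=0)
automl.fit(X_train, y_train)
print(automl.score(X_test, y_test))
automl.export("best_pipeline.py")  # the selected model, as runnable code
```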
A. Tools and Platforms

Every year more and more tools and platforms emerge [42]. AutoML platforms are services that are mainly accessible in the cloud, so they are not the preferred option for this task, although when a cloud-based MLOps platform has been selected they may offer better compatibility. There are also libraries and APIs written in Python and C++, which are much more preferable when an end-to-end cloud-based MLOps platform has not been chosen. The ones that stand out are Auto-sklearn [43], Auto-Keras [44], TPOT [45], Auto-PyTorch [46] and BigML [47]. The main platforms are Google Cloud AutoML [48], Akkio [49], H2O [50], Microsoft Azure AutoML [51] and Amazon SageMaker Autopilot [52]. The most important tools are listed in Table IV.

Table IV. AutoML Tools and Platforms.

Name                        Class              Status
Auto-sklearn                Tool               Open Source
Auto-Keras                  Tool               Open Source
TPOT                        Tool               Open Source
Auto-PyTorch                Tool               Open Source
BigML                       Tool and Platform  Commercial
Google Cloud AutoML         Platform           Open Source
Akkio                       Platform           Open Source
H2O                         Platform           Commercial
Microsoft Azure AutoML      Platform           Commercial
Amazon SageMaker Autopilot  Platform           Commercial
B. Combining MLOps and AutoML

It is obvious that the combination of the two techniques can be extremely effective [9], but there are still pros and cons. AutoML requires vast computational power in order to perform. Technology delivers more computational power every year and comes closer and closer to overcoming this kind of challenge, but AutoML will always be more computationally expensive than classic machine learning techniques, mostly because it performs the same tasks in much less time. We are also given much less flexibility: the AutoML tool works as a pipeline, so we have no control over the choices it will make, and AutoML therefore does not qualify for very specialized tasks. On the other hand, with AutoML retraining is a much easier and more straightforward task. As long as the new data are labeled, or the models use unsupervised techniques, we only have to feed the new data to the AutoML tool and deploy the new model. In conclusion, AutoML is a much quicker and more efficient process than the classic ML pipeline [53], which can be extremely beneficial in achieving efficient, high maturity level MLOps systems.

VI. MLOPS CHALLENGES

In the past years, a lot of research has focused on the maturity levels of MLOps and the transition to fully automated pipelines [19]. Several challenges have been detected in this area and it is not always easy to overcome them [54]. A low maturity level system relies on classical machine learning techniques and requires an extremely good connection between the individual working teams, such as data scientists, ML engineers and front end engineers. Lots of technical problems arise from this division of work and the lack of compatibility from one step to another. The first challenge lies in the creation of robust, efficient pipelines with strong compatibility. Constant evolution is another critical point of a high maturity level MLOps platform, so constant retraining shifts to the top of the current challenges.

A. Efficient Pipelines

An MLOps system includes various pipelines [3]. Commonly a data manipulation pipeline, a model creation pipeline and a deployment pipeline are mandatory. Each of these pipelines must be compatible with the others, in a way that optimizes flow and minimizes errors. From this aspect it is critical to choose the right tools for the creation and connection of these pipelines.
The shape of the targets determines the best combination of tools and techniques; there is no ideal combination for every problem, but rather the problem determines the combination to be chosen. Also, it is always critical to use the same data preprocessing libraries in every pipeline, as shown in the sketch below; in this way we prevent the rise of multiple compatibility errors.
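A minimal sketch of this advice, assuming a scikit-learn preprocessing step: the transformer is fitted once in the training pipeline, and the very same fitted object is reused in the serving pipeline instead of being re-implemented there.

```python
"""Share one fitted preprocessor between the training and serving pipelines."""
import pickle
from sklearn.preprocessing import StandardScaler

def fit_and_save_scaler(X_train, path="scaler.pkl"):
    """Training pipeline: fit the shared preprocessor and persist it."""
    scaler = StandardScaler().fit(X_train)
    with open(path, "wb") as f:
        pickle.dump(scaler, f)
    return scaler.transform(X_train)

def load_and_apply_scaler(X_new, path="scaler.pkl"):
    """Serving pipeline: reuse the identical, already-fitted preprocessor."""
    with open(path, "rb") as f:
        scaler = pickle.load(f)
    return scaler.transform(X_new)
```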
B. Re-Training

After monitoring and tracking your model's performance, the next step is retraining your machine learning model [55]. The objective is to ensure that the quality of your model in production stays up to date. However, even if the pipelines are perfect, there are many problems that complicate retraining or even make it impossible. From our point of view, the most important of them is new data manipulation.
1) New Data Manipulation: When a model is deployed in production, we feed it new, raw data to make predictions and use them to extract the final results. However, when we are using supervised learning, we do not have the corresponding labels at our disposal, so it is impossible to measure the accuracy and constantly evaluate the model. It is possible to perceive the robustness of the model only by evaluating the final results, which isn't always an option. Even if we manage to evaluate the model and find low metrics on new data, the same problem arises again: in order to retrain (fine-tune) the model, the labels are a prerequisite. Manually labeling the new data is a solution, but it slows down the process and fails at constant retraining tasks. One approach is to use the trained model to label the new data, or to use unsupervised instead of supervised learning, but this too depends on the type of the problem and the targets of the task. Finally, there are types of data where there is no need for labeling; the most common area that uses this kind of data is time series forecasting.
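One way to realize the "trained model labels the new data" approach is pseudo-labeling: keep only the predictions the current model is confident about and use them as provisional labels for retraining. The sketch below is our own illustration; it assumes a classifier exposing predict_proba, NumPy arrays as inputs, and an arbitrary confidence threshold.

```python
"""Pseudo-labeling sketch: turn confident predictions into training labels."""

CONFIDENCE = 0.95  # assumed minimum confidence to trust a pseudo-label

def pseudo_label(model, X_new):
    """Return the confidently predicted samples and their pseudo-labels.

    X_new is a 2-D NumPy array; model must expose predict_proba().
    """
    proba = model.predict_proba(X_new)           # class probabilities
    confident = proba.max(axis=1) >= CONFIDENCE  # keep high-confidence rows
    return X_new[confident], proba[confident].argmax(axis=1)

# Usage: X_keep, y_pseudo = pseudo_label(model, X_new); then retrain on them.
```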
C. Monitoring

In most papers and articles, monitoring is positioned as one of the most important functions in MLOps [56]. This is because understanding the results helps us understand the shortcomings of the entire system. The previous section showed the importance of monitoring not only the accuracy of the model, but every aspect of the system.

1) Data Monitoring: Monitoring the data can be extremely useful in many ways. Detecting outliers and drift is a way to prevent a failure of the model and to support correct training. Constantly comparing the shape of the incoming data against the training data shows how far the two have drifted apart. There are lots of tools and techniques for data monitoring, and choosing the right ones also depends on the target.
right ones also depends on the target. pipeline design,” Acta Technica Jaurinensis, 2021. [Online].
2) Model Monitoring: Monitoring the accuracy of a Available: https://acta.sze.hu/index.php/acta/article/view/581
model is a way to evaluate the performance in a bunch of
[5] M. Reddy, B. Dattaprakash, S. S. Kammath, S. KN, and
data at a precise moment. For a high maturity level system, S. Manokaran, “Application of mlops in prediction of
we need to monitor more aspects of our model and the lifestyle diseases,” SPAST Abstracts, vol. 1, 2021. [Online].
whole system. In the previous years, lots of research [10][11] Available: https://spast.org/techrep/article/view/942
0458
Authorized licensed use limited to: LIVERPOOL JOHN MOORES UNIVERSITY. Downloaded on August 13,2022 at 06:13:21 UTC from IEEE Xplore. Restrictions apply.
[6] C. Min, A. Mathur, U. G. Acer, A. Montanari, and F. Kawsar, "Sensix++: Bringing mlops and multi-tenant model serving to sensory edge devices," 9 2021. [Online]. Available: https://arxiv.org/abs/2109.03947v1

[7] S. Mäkinen, H. Skogström, E. Laaksonen, and T. Mikkonen, "Who needs mlops: What data scientists seek to accomplish and how can mlops help?" 2021.

[8] C. Renggli, L. Rimanic, N. M. Gürel, B. Karlaš, W. Wu, and C. Zhang, "A data quality-driven view of mlops," 2 2021. [Online]. Available: https://arxiv.org/abs/2102.07750v1

[9] P. Ruf, M. Madan, C. Reich, and D. Ould-Abdeslam, "Demystifying mlops and presenting a recipe for the selection of open-source tools," Applied Sciences, vol. 11, p. 8861, 9 2021. [Online]. Available: https://www.mdpi.com/2076-3417/11/19/8861

[10] J. Klaise, A. V. Looveren, C. Cox, G. Vacanti, and A. Coca, "Monitoring and explainability of models in production," 7 2020. [Online]. Available: https://arxiv.org/abs/2007.06299v1

[11] D. A. Tamburri, "Sustainable mlops: Trends and challenges," Proceedings - 2020 22nd International Symposium on Symbolic and Numeric Algorithms for Scientific Computing, SYNASC 2020, pp. 17–23, 9 2020.

[12] S. Alla and S. K. Adari, "What is mlops?" in Beginning MLOps with MLFlow. Springer, 2021, pp. 79–124.

[13] S. Sharma, The DevOps Adoption Playbook: A Guide to Adopting DevOps in a Multi-Speed IT Enterprise. IBM Press, pp. 34–58.

[14] B. Fitzgerald and K.-J. Stol, "Continuous software engineering: A roadmap and agenda," Journal of Systems and Software, vol. 123, pp. 176–189, 1 2017.

[15] N. Gift and A. Deza, Practical MLOps: Operationalizing Machine Learning Models. O'Reilly Media, Inc, 2020.

[16] E. Raj, MLOps Using Azure Machine Learning: Rapidly Test, Build, and Manage Production-Ready Machine Learning Life Cycles at Scale. Packt Publishing Limited, pp. 45–62, 2021.

[17] I. Karamitsos, S. Albarhami, and C. Apostolopoulos, "Applying devops practices of continuous automation for machine learning," Information, vol. 11, p. 363, 7 2020. [Online]. Available: https://www.mdpi.com/2078-2489/11/7/363

[18] B. Fitzgerald and K.-J. Stol, "Continuous software engineering and beyond: Trends and challenges," Proceedings of the 1st International Workshop on Rapid Continuous Software Engineering - RCoSE 2014, vol. 14, 2014. [Online]. Available: http://dx.doi.org/10.1145/2593812.2593813

[19] M. M. John, H. H. Olsson, and J. Bosch, "Towards mlops: A framework and maturity model," 2021 47th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), pp. 1–8, 9 2021. [Online]. Available: https://ieeexplore.ieee.org/document/9582569/

[20] M. Treveil and the Dataiku team, Introducing MLOps: How to Scale Machine Learning in the Enterprise, p. 185, 2020. [Online]. Available: https://www.oreilly.com/library/view/introducing-mlops/9781492083283/

[21] D. Sato, "ThoughtWorksInc/cd4ml-workshop: Repository with sample code and instructions for 'continuous intelligence' and 'continuous delivery for machine learning: CD4ML' workshops." [Online]. Available: https://github.com/ThoughtWorksInc/cd4ml-workshop

[22] T. Granlund, A. Kopponen, V. Stirbu, L. Myllyaho, and T. Mikkonen, "Mlops challenges in multi-organization setup: Experiences from two real-world cases." [Online]. Available: https://oraviz.io/

[23] D. Sato, A. Wider, and C. Windheuser, "Continuous delivery for machine learning." [Online]. Available: https://martinfowler.com/articles/cd4ml.html

[24] Google, "Mlops: Continuous delivery and automation pipelines in machine learning - Google Cloud." [Online]. Available: https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning

[25] Microsoft, "Machine learning operations maturity model - Azure Architecture Center - Microsoft Docs." [Online]. Available: https://docs.microsoft.com/en-us/azure/architecture/example-scenario/mlops/mlops-maturity-model

[26] A. Felipe and V. Maya, "The state of mlops," 2016.

[27] L. Zhou, S. Pan, J. Wang, and A. V. Vasilakos, "Machine learning on big data: Opportunities and challenges," Neurocomputing, vol. 237, pp. 350–361, 5 2017.

[28] T. G. Dietterich, "Machine learning for sequential data: A review," Lecture Notes in Computer Science, vol. 2396, pp. 15–30, 2002. [Online]. Available: https://link.springer.com/chapter/10.1007/3-540-70659-3_2

[29] T. Fredriksson, D. I. Mattos, J. Bosch, and H. H. Olsson, "Data labeling: An empirical investigation into industrial challenges and mitigation strategies," Lecture Notes in Computer Science, vol. 12562 LNCS, pp. 202–216, 11 2020. [Online]. Available: https://link.springer.com/chapter/10.1007/978-3-030-64148-1_13

[30] M. Armbrust, T. Das, L. Sun, B. Yavuz, S. Zhu, M. Murthy, J. Torres, H. van Hovell, A. Ionescu, A. Łuszczak, M. Świtakowski, M. Szafrański, X. Li, T. Ueshin, M. Mokhtar, P. Boncz, A. Ghodsi, S. Paranjpye, P. Senster, R. Xin, and M. Zaharia, "Delta lake," Proceedings of the VLDB Endowment, vol. 13, pp. 3411–3424, 8 2020. [Online]. Available: https://dl.acm.org/doi/abs/10.14778/3415478.3415560
[31] S. Khalid, T. Khalil, and S. Nasreen, "A survey of feature selection and feature extraction techniques in machine learning," Proceedings of 2014 Science and Information Conference, SAI 2014, pp. 372–378, 10 2014.

[32] R. Bardenet, M. Brendel, B. Kégl, and M. Sebag, "Collaborative hyperparameter tuning," vol. 28, pp. 199–207, 5 2013. [Online]. Available: https://proceedings.mlr.press/v28/bardenet13.html

[33] L. Savu, "Cloud computing: Deployment models, delivery models, risks and research challanges," 2011 International Conference on Computer and Management, CAMAN 2011, 2011.

[34] J. de la Rúa Martínez, "Scalable architecture for automating machine learning model monitoring," 2020. [Online]. Available: http://oatd.org/oatd/record?record=oai

[35] J. Bosch and H. H. Olsson, "Digital for real: A multicase study on the digital transformation of companies in the embedded systems domain," Journal of Software: Evolution and Process, vol. 33, p. e2333, 2021. [Online]. Available: https://onlinelibrary.wiley.com/doi/full/10.1002/smr.2333

[36] "Tensorflow extended (tfx) - ml production pipelines." [Online]. Available: https://www.tensorflow.org/tfx

[37] "Meet michelangelo: Uber's machine learning platform." [Online]. Available: https://eng.uber.com/michelangelo-machine-learning-platform/

[38] "Bighead: Airbnb's end-to-end machine learning platform - Databricks." [Online]. Available: https://databricks.com/session/bighead-airbnbs-end-to-end-machine-learning-platform

[39] "Metaflow." [Online]. Available: https://metaflow.org/

[40] S. Alla and S. K. Adari, Beginning MLOps with MLFlow: Deploy Models in AWS SageMaker, Google Cloud, and Microsoft Azure, 2021. [Online]. Available: https://doi.org/10.1007/978-1-4842-6549-9

[41] S. K. Karmaker, M. M. Hassan, M. J. Smith, L. Xu, C. Zhai, and K. Veeramachaneni, "Automl to date and beyond: Challenges and opportunities," ACM Computing Surveys (CSUR), vol. 54, p. 175, 10 2021. [Online]. Available: https://dl.acm.org/doi/abs/10.1145/3470918

[42] P. Gijsbers, E. LeDell, J. Thomas, S. Poirier, B. Bischl, and J. Vanschoren, "An open source automl benchmark," 7 2019. [Online]. Available: https://arxiv.org/abs/1907.00909v1

[46] L. Zimmer, M. Lindauer, and F. Hutter, "Auto-pytorch tabular: Multi-fidelity metalearning for efficient and robust autodl," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, pp. 3079–3090, 6 2020. [Online]. Available: https://arxiv.org/abs/2006.13799v3

[47] BigML, "Bigml.com." [Online]. Available: https://bigml.com/

[48] Google, "Cloud automl custom machine learning models - Google Cloud." [Online]. Available: https://cloud.google.com/automl

[49] Akkio, "Modern business runs on ai - no code ai - Akkio." [Online]. Available: https://www.akkio.com/

[50] H2O, "H2o.ai - AI cloud platform." [Online]. Available: https://www.h2o.ai/

[51] Microsoft, "What is automated ml? - AutoML - Azure Machine Learning." [Online]. Available: https://docs.microsoft.com/en-us/azure/machine-learning/concept-automated-ml

[52] Amazon, "Amazon sagemaker autopilot - Amazon SageMaker." [Online]. Available: https://aws.amazon.com/sagemaker/autopilot/

[53] M. Feurer and F. Hutter, "Practical automated machine learning for the automl challenge 2018," ICML 2018 AutoML Workshop, 2018.

[54] G. Fursin, "The collective knowledge project: making ml models more portable and reproducible with open apis, reusable best practices and mlops," 6 2020. [Online]. Available: https://arxiv.org/abs/2006.07161v2

[55] S. Schelter, F. Biessmann, T. Januschowski, D. Salinas, S. Seufert, and G. Szarvas, "On challenges in machine learning model management," 2018.

[56] A. Banerjee, C.-C. Chen, C.-C. Hung, X. Huang, Y. Wang, and R. Chevesaran, "Challenges and experiences with mlops for performance diagnostics in hybrid-cloud enterprise software deployments," 2020. [Online]. Available: https://www.vmware.com/solutions/trustvm-

[57] K. D. Apostolidis and G. A. Papakostas, "A survey on adversarial deep learning robustness in medical image analysis," Electronics, vol. 10, p. 2132, 2021.

[58] G. P. Avramidis, M. P. Avramidou, and G. A. Papakostas, "Rheumatoid arthritis diagnosis: Deep learning vs. humane," Applied Sciences, vol. 12, p. 10, 2022.