Automated Machine Learning
Automated Machine Learning (AutoML) is currently a significant trend in industries that use data science. It has been described as a quiet revolution in AI that automates a large portion of the machine learning process. In this text, we explain how AutoML enhances the project workflow and how various companies, including insurers and reinsurers, can benefit from this technology.
Why is AutoML so important?
Standard Machine Learning (ML) has already achieved considerable success in many sectors, such as healthcare, marketing, and finance, including insurance. However, standalone ML cannot be called a fully intelligent system, as it still relies on human experts to perform the following important tasks:
- Data preprocessing and cleaning,
- Selecting important features and feature engineering to create new ones,
- Selecting an appropriate model for our task,
- Optimizing hyperparameters of the model,
- Comparing the performance of the models and goodness of fit.
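To make these manual steps concrete, here is a minimal sketch of how they might look in Python with scikit-learn. Everything specific in it is an assumption made for illustration: the synthetic data standing in for a claim-frequency dataset, the choice of a Poisson GLM, and the hand-written hyperparameter grid. It is not taken from any real pricing project.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.impute import SimpleImputer
from sklearn.linear_model import PoissonRegressor
from sklearn.metrics import mean_poisson_deviance
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a claim-frequency dataset (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))
y = rng.poisson(np.exp(0.3 * X[:, 0] - 0.2 * X[:, 1] - 1.0))
X[rng.random(X.shape) < 0.02] = np.nan  # inject a few missing values

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Tasks 1-3, each chosen by hand: cleaning, feature selection, model choice.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("select", SelectKBest(score_func=f_regression, k=3)),
    ("model", PoissonRegressor(max_iter=300)),
])

# Task 4: hyperparameter tuning over a grid the analyst writes manually.
search = GridSearchCV(
    pipeline,
    param_grid={"select__k": [2, 4, 6], "model__alpha": [0.0, 0.01, 0.1, 1.0]},
    scoring="neg_mean_poisson_deviance",
    cv=5,
)
search.fit(X_train, y_train)

# Task 5: assess goodness of fit on held-out data.
test_deviance = mean_poisson_deviance(y_test, search.predict(X_test))
print(search.best_params_, round(test_deviance, 3))
```

Each choice above (imputation strategy, number of features, model family, grid values) is a decision an expert has to make and revisit, which is where the time goes.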
Therefore, solving most real-world business problems is still time-consuming and requires numerous experts, including skilled data scientists, who can be difficult and costly to hire.
What is AutoML?
Let’s assume we are dealing with a motor insurance pricing problem. The pricing department uses several machine learning algorithms, such as GLMs or neural networks, to model claim frequency. The graph below presents a comparison between the traditional project workflow and one involving AutoML techniques.
AutoML reduces the complexity of the process by using:
- Automatic data processing methods,
- Feature selection and engineering algorithms,
- Model auto-selection,
- Bayesian or other hyperparameter optimization techniques.
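The idea of model auto-selection combined with hyperparameter optimization can be sketched with scikit-learn alone. This is only an illustration under stated assumptions: the data are synthetic, the two candidate models and their search ranges are arbitrary picks, and the search is plain random search (`RandomizedSearchCV`), not a Bayesian method. Dedicated AutoML tools typically cover more candidates and use smarter search strategies.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.linear_model import PoissonRegressor
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic claim counts (hypothetical data, for illustration only).
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 5))
y = rng.poisson(np.exp(0.4 * X[:, 0] - 0.3 * X[:, 2] - 1.0))

pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # automatic data processing
    ("scale", StandardScaler()),
    ("model", PoissonRegressor()),  # placeholder, swapped during the search
])

# The model itself is treated as a hyperparameter: each dict describes one
# candidate model family together with its own search range.
search_space = [
    {
        "model": [PoissonRegressor(max_iter=300)],
        "model__alpha": loguniform(1e-4, 1.0),
    },
    {
        "model": [RandomForestRegressor(random_state=0)],
        "model__n_estimators": [50, 100],
        "model__max_depth": [3, 5, None],
    },
]

# Random search over models and hyperparameters (not Bayesian optimization).
# Mean squared error is used so both model families can be scored the same way.
auto_search = RandomizedSearchCV(
    pipeline,
    param_distributions=search_space,
    n_iter=8,
    scoring="neg_mean_squared_error",
    cv=3,
    random_state=0,
)
auto_search.fit(X, y)

best_model_name = type(auto_search.best_estimator_.named_steps["model"]).__name__
print(best_model_name, auto_search.best_params_)
```

The analyst only declares the search space; the fitting, scoring, and comparison of candidates run without further manual steps.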
Ultimately, it allows data scientists to cut down on repetitive, manual tasks and concentrate on the analysis, explanation, and validation of results, or on deployment. It increases the productivity of the existing data science team without additional hiring, thus helping to close the gap between the high demand for data science talent and its shortage on the job market.
AutoML is also a helpful tool for less experienced data scientists, as it lets them work with advanced machine learning topics without significant programming skills.
For non-experts:
- Simplifies workflow,
- Does not require advanced programming skills,
- Automates machine learning pipeline.
For experts:
- Reduces the number of repetitive tasks,
- Frees time for analysis and validation,
- Allows for more what-if analysis.
Given these benefits, it is predicted that in the near future any data science product will contain some form of AutoML.
Are there any threats? One could argue that AutoML makes the algorithms more of a black box, but this is a misconception. The explainability of a model relies on what-if analysis, diagnostics, and visualization of the results (for instance, using Explainable AI methods). AutoML lets data scientists do this more efficiently by automating a large portion of the workflow.
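As a small illustration of such diagnostics, the sketch below applies permutation importance and a simple what-if scenario to a fitted claim-frequency model. The data are synthetic, the model is a Poisson GLM chosen for brevity, and the scenario (shifting one rating factor by one unit) is hypothetical. The same checks could be run on a model produced by an automated search.

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.linear_model import PoissonRegressor

# Synthetic data in which only feature 0 drives claim frequency (assumption).
rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 3))
y = rng.poisson(np.exp(0.8 * X[:, 0] - 1.0))

model = PoissonRegressor(alpha=0.01, max_iter=300).fit(X, y)

# Diagnosis: how much does the score drop when each feature is shuffled?
importance = permutation_importance(model, X, y, n_repeats=10, random_state=0)
ranking = np.argsort(importance.importances_mean)[::-1]  # most important first

# What-if analysis: shift feature 0 by one unit for every policy and compare
# the average predicted frequency with the baseline.
X_shifted = X.copy()
X_shifted[:, 0] += 1.0
ratio = model.predict(X_shifted).mean() / model.predict(X).mean()

print(ranking, round(ratio, 2))
```

A ratio above 1 means the shifted portfolio is predicted to produce more claims on average, which is the kind of question a pricing analyst can then examine in detail.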