AI News, Machine Learning: An Analytical Invitation to Actuaries


This post highlights the various value-additions that machine learning can provide to actuaries in their analytical work for insurance companies.

Clustering, especially k-means clustering, is a valuable algorithm that exposes the distinct groups operating within a given dataset.[6] It can reveal groupings within claim registers and premium registers; for example, one cluster might show that bodily-injury claims are associated with third parties, which in turn are associated with non-luxury commercial vehicles, and so on.
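As a toy illustration of the idea, here is a minimal k-means sketch over a synthetic two-feature claim register. The data and the cluster interpretations in the comments are invented for the example; a production analysis would more likely use scikit-learn's `KMeans`.

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Minimal k-means: assign each point to its nearest centroid, then
    recompute centroids as cluster means, until assignments stabilise."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Euclidean distance from every point to every centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # recompute means; keep the old centroid if a cluster goes empty
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

# Synthetic claim register: columns are [log claim severity, log vehicle value]
rng = np.random.default_rng(1)
bodily_injury = rng.normal([8.0, 10.0], 0.3, size=(50, 2))  # e.g. third-party / commercial
own_damage = rng.normal([6.0, 11.5], 0.3, size=(50, 2))     # e.g. own-damage / luxury
X = np.vstack([bodily_injury, own_damage])

labels, centroids = kmeans(X, k=2)
```

Each resulting cluster can then be profiled against policy attributes (vehicle type, coverage, parties involved) to produce readings like the bodily-injury example above.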

Decomposition of a time series takes real-world data and breaks it down into (1) a long-term trend, (2) a seasonal (medium-term) component and (3) random movements.[7] Such decomposition has huge potential for understanding the patterns in data.

For instance, claims data show a trend that follows the underwriting cycle and closely mimics the economic cycle. An example of seasonality is higher sales of travel insurance around spring and summer breaks.
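A minimal sketch of classical additive decomposition (series = trend + seasonal + residual) on invented monthly travel-insurance sales data; in practice statsmodels' `seasonal_decompose` implements the same idea.

```python
import numpy as np

def decompose_additive(y, period):
    """Classical additive decomposition: y = trend + seasonal + residual."""
    n = len(y)
    # Trend: centred moving average over one full period
    kernel = np.ones(period) / period
    if period % 2 == 0:  # use a 2x(period)-MA so the window is centred
        kernel = np.convolve(kernel, [0.5, 0.5])
    trend = np.convolve(y, kernel, mode="same")
    half = len(kernel) // 2
    trend[:half] = trend[-half:] = np.nan  # edges lack a full window
    # Seasonal: average detrended value at each position in the cycle
    detrended = y - trend
    seasonal = np.array([np.nanmean(detrended[i::period]) for i in range(period)])
    seasonal -= seasonal.mean()            # centre seasonal effects on zero
    seasonal = np.tile(seasonal, n // period + 1)[:n]
    residual = y - trend - seasonal
    return trend, seasonal, residual

# Synthetic monthly sales: upward trend + summer peak + noise
rng = np.random.default_rng(0)
months = np.arange(48)
sales = 100 + 0.5 * months + 10 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 1, 48)
trend, seasonal, resid = decompose_additive(sales, period=12)
```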

GLMMs are a natural extension of GLMs: the linear predictor now contains random effects as well, which capture heterogeneity between groups of policies and add a stochastic element for enhanced pricing.[8] Predictive modeling with GLMs and GLMMs can also be used to assign a policy to its proper risk category based on its predicted claim likelihood (unacceptable risk, high risk, medium risk, low risk, etc.).

Separate models can then be built for each major risk category so as to expose greater insight in the ratemaking process.[9] The results from the separate models act as a feedback loop on how valid and reliable the risk and underwriting categories are, and promote greater cooperation between the underwriting function and the claims/reserving function, which is vital to generating adequate risk-adjusted premiums.
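The GLM part can be sketched as a Poisson regression fit by iteratively reweighted least squares (IRLS) on synthetic policy data; the coefficients, risk-band cut-offs and data below are invented for illustration, and a GLMM's random effects would need specialised tools (e.g. statsmodels' mixed models or R's lme4) on top of this.

```python
import numpy as np

def poisson_glm_irls(X, y, iters=25):
    """Fit a Poisson GLM with log link by iteratively reweighted least squares."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        eta = X @ beta
        mu = np.exp(eta)                 # inverse log link
        W = mu                           # Poisson: variance equals mean
        z = eta + (y - mu) / mu          # working response
        XtW = X.T * W                    # weighted least squares step:
        beta = np.linalg.solve(XtW @ X, XtW @ z)  # (X'WX) beta = X'Wz
    return beta

# Synthetic policies: intercept, standardised driver age, standardised vehicle power
rng = np.random.default_rng(2)
n = 2000
age = rng.normal(0, 1, n)
power = rng.normal(0, 1, n)
X = np.column_stack([np.ones(n), age, power])
true_beta = np.array([-2.0, -0.3, 0.5])  # base claim rate ~ exp(-2) per year
y = rng.poisson(np.exp(X @ true_beta))

beta_hat = poisson_glm_irls(X, y)

# Assign each policy to an illustrative risk band from its predicted claim rate
rate = np.exp(X @ beta_hat)
bands = np.digitize(rate, [0.05, 0.15, 0.40])  # 0=low .. 3=unacceptable
```

The band labels then feed the per-category models described above, closing the feedback loop with underwriting.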

There have been several backlashes around ratemaking, such as bans on using gender to quote prices, the controversial social image of using credit scores to quote premiums and, most recently, price optimization, where customers and regulators have argued that market dynamics such as price elasticity and consumer preferences should not lead to different premiums, and that only risk factors (not market factors) should drive premium differentiation.[10]

The usual techniques are text parsing, tagging, flagging and natural language processing.[13] Text mining and unstructured data are closely linked, as much unstructured data is qualitative free text: loss adjusters' notes, notes in medical claims, underwriters' notes, critical remarks by claims administration on particular claims, and so on.

Another useful application is text analytics for lines that have little data or are newly introduced, which is our research aim here.[14] Sentiment analysis (opinion mining) over expert judgment on the level of uncertainty in reserves can also prove fruitful.
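A stdlib-only sketch of the tagging/flagging idea over free-text claim notes; the indicator lexicons and the notes themselves are hypothetical, and a real pipeline would use a proper NLP library (spaCy, NLTK) and a curated vocabulary.

```python
# Hypothetical indicator lexicons for illustration only
RISK_TERMS = {"inconsistent", "staged", "prior damage", "no witness", "late report"}
UNCERTAIN_TERMS = {"possibly", "unclear", "estimate", "pending", "disputed"}

def flag_note(note):
    """Tag a free-text claim note with matched indicator terms and a crude score."""
    text = note.lower()
    hits = {t for t in RISK_TERMS | UNCERTAIN_TERMS if t in text}
    return {"risk_flags": sorted(hits & RISK_TERMS),
            "uncertainty_flags": sorted(hits & UNCERTAIN_TERMS),
            "score": len(hits)}

notes = [
    "Claimant report inconsistent with police record; no witness at scene.",
    "Reserve is an estimate, final repair cost pending workshop quote.",
    "Straightforward windscreen replacement, photos on file.",
]
flagged = [flag_note(n) for n in notes]
```

Flagged notes can then be routed to a reviewer or fed as features into the predictive models discussed above.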

As discussed, predictive models can review simulated claim data from agent-based modeling, network theory and the other methods mentioned in this report for similarities and other factors shared by such losses, thereby alerting claims professionals to emerging risks that may have creeping catastrophe (Cat) potential.

In conclusion, by measuring and exposing areas of uncertainty that are traditionally not considered, we can reduce the chance of swapping specific risk for systematic risk in our ratemaking procedures, lessen the fatness of the tails, and handle emerging liabilities in a more resilient manner.

Tree-Based Machine Learning for Insurance Pricing

The goal of this paper is to apply machine learning techniques to insurance pricing, thereby leaving the actuarial comfort zone of generalized linear models ...

Enhanced AML fraud detection solutions with Azure Machine Learning - Ravi Kanth

AML (Anti-Money Laundering) solutions tend to be rule-engine driven and involve significant manual follow-up activities. Using a Machine Learning ...

Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud Prevention


Lecture 07 - The VC Dimension

The VC Dimension - A measure of what it takes a model to learn. Relationship to the number of parameters and degrees of freedom. Lecture 7 of 18 of Caltech's ...

13. Classification

MIT 6.0002 Introduction to Computational Thinking and Data Science, Fall 2016 View the complete course: Instructor: John Guttag ..

How SVM (Support Vector Machine) algorithm works

In this video I explain how SVM (Support Vector Machine) algorithm works to classify a linearly separable binary data set. The original presentation is available ...

Lecture 09 - The Linear Model II

The Linear Model II - More about linear models. Logistic regression, maximum likelihood, and gradient descent. Lecture 9 of 18 of Caltech's Machine Learning ...

Where to get High Frequency Trading Data

Where to get High Frequency Trading Data? How to download High Frequency Trading data? These are questions that haunt the advanced individual trader ...

Interview with a Data Analyst

This video is part of the Udacity course "Intro to Programming". Watch the full course at

Pharmacovigilance via Baseline Regularization with Large-Scale Longitudinal Observational Data

Pharmacovigilance via Baseline Regularization with Large-Scale Longitudinal Observational Data Zhaobin Kuang (University of Wisconsin, Madison) Peggy ...