How Much Telematics Information Do Insurers Need for Claim Classification?

Logistic Regression
Francis Duval, Jean-Philippe Boucher and Mathieu Pigeon
It has been shown several times in the literature that telematics data collected in motor insurance help to better understand an insured’s driving risk. Insurers who use these data reap several benefits, such as a better estimate of the pure premium, more segmented pricing, and less adverse selection. The flip side of the coin is that collected telematics information is often sensitive and can therefore compromise policyholders’ privacy. Moreover, due to their large volume, this type of data is costly to store and hard to manipulate. These factors, combined with the fact that insurance regulators tend to issue more and more recommendations regarding the collection and use of telematics data, make it important for an insurer to determine the right amount of telematics information to collect. In addition to traditional contract information, such as the age and gender of the insured, we have access to a telematics dataset where information is summarized by trip. We first derive several features of interest from these trip summaries before building a claim classification model using both traditional and telematics features. By comparing a few classification algorithms, we find that logistic regression with lasso penalty is the most suitable for our problem. Using this model, we develop a method to determine how much information about policyholders’ driving should be kept by an insurer. Using real data from a North American insurance company, we find that telematics data become redundant afte
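To make the modelling idea in the abstract concrete, here is a minimal sketch of logistic regression with a lasso (L1) penalty, fitted by proximal gradient descent in plain Python. It is an illustration only, not the authors' implementation: the toy data, the four unnamed features, the helper names (`soft_threshold`, `fit_lasso_logistic`, `make_toy_data`) and all parameter values are assumptions made for this example. The point it shows is that the L1 penalty sets some coefficients to exactly zero, which is what lets such a model discard uninformative features.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; exactly zero inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_lasso_logistic(X, y, lam, lr=0.1, n_iter=500):
    """Proximal gradient descent for L1-penalized logistic regression.

    Minimizes mean log-loss + lam * sum(|w_j|); the intercept b is not penalized.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, target in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - target
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * row[j] / n
        b -= lr * grad_b
        # Gradient step on the smooth loss, then the L1 proximal step.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def make_toy_data(n=200, seed=0):
    """Hypothetical data: four features in [-1, 1]; only the first drives claims."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.uniform(-1.0, 1.0) for _ in range(4)]
        prob = sigmoid(-0.5 + 2.0 * row[0])
        X.append(row)
        y.append(1 if rng.random() < prob else 0)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    for lam in (0.0, 0.02, 0.1):
        w, b = fit_lasso_logistic(X, y, lam)
        print(lam, [round(wj, 3) for wj in w], round(b, 3))
```

Running the script prints the fitted coefficients for a few penalty strengths. Larger values of `lam` shrink more coefficients to zero. In practice one would more likely rely on an established library implementation and choose the penalty by cross-validation.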
North American Actuarial Journal, 2022, vol. 26, issue 4, 570-590