Over 97% of businesses worldwide have invested in big data. However, only 24% of these companies claimed they use the collected data to analyze and make informed decisions.¹ Data management is an integral part of running a business, from year-end reporting and tax purposes to compliance with laws and regulations.

Today the insurance industry is experiencing a fundamental shift in how risk is defined, understood, and quantified. Recent technological advancements have led to an explosion of data, which demands new processing and analysis techniques beyond traditional methods to make sense of it all. Consequently, insurers face a new challenge: finding a balance between developing highly accurate models and complying with business and regulatory requirements.

Unconstrained models, those with few limitations, maximize data utility and predictive power by leveraging advanced algorithms. These models are flexible, capture complex relationships, and apply broadly to domains where data may contain deep interdependencies and nuance.

When used strategically, unconstrained models can analyze and enhance traditional models to unlock new insights, even in highly constrained or regulated environments like insurance. For organizations that embrace them, unconstrained models present an opportunity to improve risk management and gain a competitive advantage.

In this paper, we examine the latest advancements in the insurance analytics landscape. We discuss how unconstrained models can strategically complement traditional models, review modeling constraints, and highlight the importance of strong model governance.

THE INSURANCE ANALYTICS LANDSCAPE

UNLOCK NEW DATA SOURCES FOR MORE ACCURATE RISK MODELING

The past decade has seen a data revolution characterized by the emergence of new data types as well as increased volume and velocity. Today, smartphones continuously transmit telemetry data for various applications, vehicles provide diagnostics and receive over-the-air updates, and smart refrigerators are poised to display advertisements. As a result, the volume and growth of data have skyrocketed. Data created, captured, copied, and consumed is expected to surpass 180 zettabytes in 2025, as shown in Exhibit 1. For context, Exhibit 2 shows that one zettabyte (1 trillion gigabytes) is roughly equivalent to 1 million copies of the entire Netflix catalog, underscoring that 180 zettabytes is an enormous amount of data. Most of this data is unstructured, meaning it does not have a predefined format and requires techniques like natural language processing (NLP) and large language models (LLMs) to extract value.

The insurance industry has similarly transformed by leveraging this new data. Datasets now include telematics for scoring driving behaviors, Internet of Things (IoT) data for real-time leak detection, and satellite imagery and climate data for a more refined underwriting framework. Recently, LLMs have illuminated latent insights in claims and policy notes, call transcripts, and policy applications. The industry may even utilize previously untapped data, such as social media posts, online reviews, and research papers. This new data era presents incredible opportunities to redefine how risk is understood.

UNDERSTAND LIMITATIONS OF TRADITIONAL MODELS AND ALGORITHMS IN TODAY'S DATA LANDSCAPE

The industry standard for data analysis, modeling, and product development centers on the versatile generalized linear model (GLM) and its variants. Formally introduced by John Nelder and Robert Wedderburn in 1972,² these models were built on years of mathematical breakthroughs. They proved flexible, robust, and interpretable, entering the insurance industry by the late 1980s.

While GLMs are highly useful analytical tools, their development predates modern datasets, which now contain millions of rows and hundreds of columns. Though modern GLMs handle larger data, they still require additional algorithms to find the most predictive inputs. Other shortcomings include:

• Difficulty capturing nonlinear relationships
• Interactions must be explicitly specified
• The required choice of distribution and link function introduces model specification risk
• Assumption of independent observations

Data scientists use various techniques to tackle these challenges, including grid search, regularization, and quasi-likelihood models. They also use advanced methods and extensions like generalized additive models (GAMs) and generalized linear mixed models (GLMMs), as illustrated in the sketch below.
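To make these limitations and remedies concrete, the following minimal sketch fits a Poisson GLM to synthetic claim-frequency data using Python's statsmodels library. The rating variables, the simulated data, and the penalty settings are illustrative assumptions, not examples taken from this paper. Note that the age-by-vehicle-value interaction must be written out by hand in the formula, and the final call shows elastic-net regularization, one of the techniques named above for taming wide modern datasets.

```python
# A hedged sketch of a frequency GLM on synthetic data; all names and
# coefficients are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 5000

# Hypothetical rating variables for a personal-auto book.
df = pd.DataFrame({
    "driver_age": rng.uniform(18, 80, n),
    "vehicle_value": rng.uniform(5, 80, n),   # in $000s
    "exposure": rng.uniform(0.1, 1.0, n),     # earned car-years
})

# Simulate claim counts whose true rate depends on an interaction that
# the modeler must specify explicitly -- one of the GLM shortcomings
# listed above.
rate = np.exp(
    -2.0
    - 0.02 * df["driver_age"]
    + 0.01 * df["vehicle_value"]
    + 0.0005 * df["driver_age"] * df["vehicle_value"]
)
df["claims"] = rng.poisson(rate * df["exposure"])

# Poisson GLM with a log link; the interaction term is not discovered
# automatically, so it is written into the formula by hand.
glm = smf.glm(
    "claims ~ driver_age + vehicle_value + driver_age:vehicle_value",
    data=df,
    family=sm.families.Poisson(),
    exposure=df["exposure"],
)
print(glm.fit().summary())

# Elastic-net regularization shrinks weak predictors toward zero;
# alpha and L1_wt are illustrative tuning values, in practice chosen
# via grid search or cross-validation.
penalized = glm.fit_regularized(method="elastic_net", alpha=0.01, L1_wt=0.5)
print(penalized.params)
```

The extensions mentioned above follow a similar workflow; statsmodels, for example, provides GLMGam for additive models, while mixed-model formulations are available through its mixed-effects classes or dedicated packages.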
Workers' Compensation Profitability Analysis Challenge

WORK WITHIN MODELING CONSTRAINTS DRIVEN BY BUSINESS AND REGULATORY NEEDS

Constraints are conditions that restrict the scope of a model's inputs and outputs to explain a given phenomenon or process. By design, constraints hinder what we seek to understand; however, because the real-world metrics we strive to understand have natural, intuitive limits and boundaries, constraints will always exist. Having zero constraints is impossible; thus,