An actuary’s guide to Julia: Use cases and performance benchmarking in insurance

January 2024

In terms of data analytics tools, Julia offers a wide variety of notebook utilities and data manipulation and visualization packages. It also supports reading and writing data in Excel format and even doing data analyses in an Excel environment, as does Python.[i]

Here are some common data manipulation procedures in both Julia and Python. Specifically, the first exhibit shows procedures to:

- Compute the number of occurrences of specific values in a data table
- Get unique values from a data table
- Calculate a specific quantile value from a data table
- Extract a specific range of values from a data table
- Filter out specific values from a data table

Both Julia and Python offer a diverse array of data manipulation packages, each tailored to specific applications. For the comparisons above and below we have chosen to focus on DataFrames in Julia and Pandas in Python, as they share similar applications and are both known for their user-friendly interfaces. The DataFrames package in Julia, developed natively, capitalizes on Julia's strengths for seamless integration and optimized performance. It is particularly acclaimed for its efficient columnar data storage, which enhances cache locality. Additionally, DataFrames can be used in conjunction with native Julia modules like Threads and Distributed, enabling parallel and multithreaded data processing. In contrast, Pandas in Python is primarily designed for single-threaded operations by default. Parallelism can be achieved with Pandas through additional packages such as Dask or multiprocessing, or by opting for alternative libraries like Polars (which is inherently designed for multi-threaded data processing), but these approaches typically require more setup and configuration than the parallel capabilities built into the Julia ecosystem.
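
As a rough illustration of the five procedures listed for the first exhibit, the following is a minimal Pandas sketch. The exhibit itself is not reproduced here, so the table, its column names (`policy_id`, `region`, `premium`), and the reading of "extract a range" as a positional row slice are assumptions made for illustration, not the authors' code.

```python
import pandas as pd

# Hypothetical policy table; column names and values are illustrative only.
df = pd.DataFrame({
    "policy_id": [1, 2, 3, 4, 5, 6],
    "region": ["N", "S", "N", "E", "S", "N"],
    "premium": [100.0, 250.0, 175.0, 300.0, 125.0, 225.0],
})

# 1. Count the occurrences of each value in a column.
region_counts = df["region"].value_counts()

# 2. Get the unique values of a column (in order of first appearance).
unique_regions = df["region"].unique().tolist()

# 3. Compute a quantile (here the 0.5 quantile, i.e. the median).
median_premium = df["premium"].quantile(0.5)

# 4. Extract a range of rows by position (rows 1 to 3; the end index is exclusive).
row_slice = df.iloc[1:4]

# 5. Filter out rows whose value belongs to a given set.
without_south = df[~df["region"].isin(["S"])]

print(region_counts)
print(unique_regions)
print(median_premium)
print(row_slice)
print(without_south)
```

Each step returns a new object and leaves `df` unchanged, so the steps can be run in any order. If "range" is meant by value rather than by position, `df[df["premium"].between(low, high)]` would be the corresponding Pandas idiom.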
The second exhibit shows procedures to group a data table by a key value, as well as to join two data tables on a shared key value in common ways, including inner, outer, left, and anti joins.

Julia also offers a wide range of machine learning libraries. Consider the usage of some common machine learning libraries in both Julia and Python.

The above comparisons show a high degree of similarity between the two languages and how easy it is to do data processing and modeling with them.

USE CASES FOR JULIA IN INSURANCE RELATED FIELDS

Insurance is a complex field in which many diverse factors are intertwined when it comes to data science modeling. However, the whole process can generally be divided into three marketing phases: before, during, and after sale. Based on our experience and past client projects, we have identified several relevant models below to focus on for comparisons between various languages.

At the time of writing this paper in 2023, Mojo has gained traction as a new language because it combines the usability of Python with the performance of C, unlocking the performance of Python by adding features including parallelization and vectorization. We have made an initial attempt to implement the use cases in Mojo and will also benchmark its performance on one of the virtual machines. However, at the time of writing, it only provides a subset of the syntax included in Python.
Therefore, we only implemented two use cases in Mojo to illustrate its syntax and capabilities.

SIMILARITY CALCULATION

Applications like risk segmentation, customer classification, and fraud detection are typically modeled using unsupervised learning approaches on structured data, where the distance between a pair of records is determined from a similarity measure. Data fields can generally be categorized into two types: categorical and numerical. Categorical data fields are usually transformed into one-hot encoded format, where all field values are binary, i.e., either 0 or 1. After the transformation, a record becomes essentially a series of bits, or a bit array. It is much more efficient, in both time and space, if those bits are packed together into a series of contiguous bytes, or a byte array. Julia offers very handy built-in data types like BitVector, which allow easy generation of bit arrays or conversion of them into byte arrays, and users can still do bitwise operations on the converted arrays. This not only improves usability but also greatly boosts performance.

We highlight this below, in both Julia and Python, by randomly generating two bit arrays; in practice, such arrays could also be converted from the categorical fields of two records in a real dataset.

JULIA

In most applications we are interested in finding out the degree of overlap bet