Jonathan Sedar

Personal website and new home of The Sampler blog

Build You a Library

Posted at — 2 Jan 2020

When I started to build out the data science / actuarial team at my current job, I was keen to make sure we had a small library of technical reference books at our immediate disposal. It's not unusual for companies to have such a resource [1], but in my opinion a good reference library is underrated. Our line of work is technical, creative, and experimental, so it's wise to have inspiration and a good guide. [2]

The list is far from canonical or exhaustive, and is formed from my technical education & professional experience, with a few crowd-sourced titles for good measure. [3] Maybe it'll help similar practitioners looking to build out a technical in-house library. The common thread is modern, leading, technical references under the general themes of Machine Learning, Bayesian Statistics, Insurance and General Data Science.

Each entry has format:

General Machine Learning

A solid grounding in Un/Supervised Learning from Data (using & developing algorithms that adapt to observations to provide useful representations, infer behaviours or make predictions):

You might also consider:

Bayesian Statistics & Probabilistic Programming

Specialised statistical modelling, esp. where we care about parsimony, parametric and/or functional form, handling uncertainty in a principled way, and learning from what the data-generating and observational processes tell us:

You might also consider:

Further Statistics Reading for the Insurance Domain

As a generalist ML / stats practitioner I'm biased towards solving technical problems with modern reproducible research, and in particular Bayesian inference. I believe I'm not alone in finding standard actuarial techniques somewhat archaic and suffering from ‘professionalisation’, where over-simplified models are taught and learnt by rote, unnecessarily implemented by hand for a written exam, rarely questioned and quickly forgotten [4]. I want to overcome my bias because “Statistics is applied statistics” [5] and it's vital to understand one's domain in detail: the business processes, the nuances of the data-generating processes, and the hard-won lessons of the domain experts. The following texts appear to lead in very much the right direction:

General Data Analysis / Python / R / Software Dev

It's impossible to cover all ground here, but these are good references for day-to-day “data science” work:

General Reading and Data Viz

Inspiration and casual interest - loan these out across the company to help spark ideas and bridge gaps:

Do shout if you have recommendations worth adding!

  1. These purchases were well-supported internally as part of our wider T&D program, and represent a powerful investment for relatively little money.

  2. Proper references can also help to justify the use of non-traditional techniques if you can show that other people (usually smarter than you) also think in the same way. It's dangerous to go alone!

  3. Thanks in particular to Mick Crawford and the folks on the Pandas Arms Slack channel.

  4. Thanks also to Kenny Holms and the folks on the Actuaries Anonymous Slack channel for opinions and recommendations on the actuarial collection.

  5. Gelman usually has an apposite quote.
