Jonathan Sedar

Personal website and new home of The Sampler blog

Build You a Library

Posted at — 2 Jan 2020

When I started to build out the data science / actuarial team at my current job, I was keen to make sure we had a small library of technical reference books at our immediate disposal. It’s not unusual for companies to have such a resource [1], but in my opinion a good reference library is underrated. Our line of work is technical, creative, and experimental, so it’s wise to have inspiration and a good guide. [2]

The list is far from canonical or exhaustive, and is formed from my technical education & professional experience, with a few crowd-sourced titles for good measure. [3] Maybe it’ll help similar practitioners looking to build out a technical in-house library. The common thread is modern, leading technical references under the general themes of Machine Learning, Bayesian Statistics, Insurance and General Data Science.

Each entry has the format:

General Machine Learning

A solid grounding in Un/Supervised Learning from Data (using & developing algorithms that adapt to observations to provide useful representations, infer behaviours or make predictions):

You might also consider:

Bayesian Statistics & Probabilistic Programming

Specialised statistical modelling, esp. where we care about parsimony, parametric and/or functional form, handling uncertainty in a principled way, and learning from what the data-generating and observational processes tell us:

You might also consider:

Further Statistics Reading for the Insurance Domain

As a generalist ML / stats practitioner I’m biased towards solving technical problems with modern reproducible research, and in particular Bayesian inference. I believe I’m not alone in finding standard actuarial techniques somewhat archaic and suffering from ‘professionalisation’, where over-simplified models are taught and learnt by rote, unnecessarily implemented by hand for a written exam, rarely questioned and quickly forgotten [4]. I want to overcome my bias because “Statistics is applied statistics” [5] and it’s vital to understand one’s domain in detail: the business processes, the nuances of the data-generating processes, and the hard-won lessons of the domain experts. The following texts appear to lead in very much the right direction:

General Data Analysis / Python / R / Software Dev

It’s impossible to cover all ground here, but these are good references for day-to-day “data science” work:

General Reading and Data Viz

Inspiration and casual interest - loan these out across the company to help spark ideas and bridge gaps:

Do shout if you have recommendations worth adding!

  1. These purchases were well-supported internally as part of our wider T&D program, and represent a powerful investment for relatively little money.

  2. Proper references can also help to justify the use of non-traditional techniques if you can show that other people (usually smarter than you) also think in the same way. It’s dangerous to go alone!

  3. Thanks in particular to Mick Crawford and the folks on the Pandas Arms Slack channel.

  4. Thanks also to Kenny Holms and the folks on the Actuaries Anonymous Slack channel for opinions and recommendations on the actuarial collection.

  5. Gelman usually has an apposite quote.
