Jonathan Sedar

Personal website and new home of The Sampler blog

Recent posts

2 Jan 2020
Build You a Library
If you run a data science team, buy yourselves a technical reference library
8 Nov 2019
PyMC3 Examples: GLM with Custom Likelihood for Outlier Classification
A worked example of a novel generative model to filter out noisy / erroneous datapoints in a set of observations, compared to alternative methods. Implemented in the probabilistic programming language `pymc3` in a fully reproducible Notebook, open-sourced and submitted to the examples documentation for the PyMC3 project
22 Oct 2019
On contractor day rates
How does an annual salary convert to a day-rate?
18 Oct 2019
Don't call it a comeback
I've been here for years
20 Nov 2015
Approaches to Data Anonymisation
Using Bloom filters, hash tables, obfuscation and aggregation to ensure the information present in datasets is available only to the appropriate level of access
19 Oct 2015
Delivering Value Throughout the Analytical Process
Data science doesn't just lead to insights and products: here we define SPEACS, a generalised analytical process that illustrates the business benefit at every stage.
24 Mar 2015
Survival Analysis: Part1 - A Brief Overview
Survival analysis is long-established within life actuarial work but infrequently used in general data science projects. This series of posts investigates why it's so useful for time-dependent effects, with worked examples.
18 Feb 2015
Tools of the trade (an overview)
We use a variety of software tools for preparing, exploring and modelling data; usually scientific, lightweight and flexible, allowing bespoke insight.
11 Feb 2015
Data science has become a well-established discipline. What is it?
The term 'data science' has been around for several years with many explanations, discussions and breathless over-excitement in the technology and business press. What is it, where did it come from, and who's using it today?