Jonathan Sedar

Personal website and new home of The Sampler blog

Recent posts

8 Nov 2019
PyMC3 Examples: GLM with Custom Likelihood for Outlier Classification
A worked example of a novel generative model for filtering noisy or erroneous data points out of a set of observations, compared with alternative methods. Implemented in the probabilistic programming language `pymc3` in a fully reproducible notebook, open-sourced and submitted to the examples documentation for the PyMC3 project.
22 Oct 2019
On contractor day rates
How does an annual salary convert to a day rate?
18 Oct 2019
Don't call it a comeback
I've been here for years
20 Nov 2015
Approaches to Data Anonymisation
Using Bloom filters, hash tables, obfuscation and aggregation to ensure that the information in a dataset is available only at the appropriate level of access.
24 Mar 2015
Survival Analysis: Part 1 - What is it?
Survival analysis is long-established within life actuarial work but infrequently used in general data science projects. This series of posts investigates why it's so useful for time-dependent effects, with worked examples.
18 Feb 2015
Tools of the trade (an overview)
We use a variety of software tools for preparing, exploring and modelling data; usually scientific, lightweight and flexible, allowing bespoke insight.
11 Feb 2015
Data science has become a well-established discipline. What is it?
The term 'data science' has been around for several years with many explanations, discussions and breathless over-excitement in the technology and business press. What is it, where did it come from, and who's using it today?