PyMC3 Examples: GLM with Custom Likelihood for Outlier Classification
A worked example of a novel generative model to filter out noisy / erroneous datapoints in a set of observations, compared to alternative methods. Implemented with the probabilistic programming library `pymc3` in a fully reproducible Notebook, open-sourced and submitted to the examples documentation for the PyMC3 project.
On contractor day rates
How does an annual salary convert to a day-rate?
Don't call it a comeback
I've been here for years
Approaches to Data Anonymisation
Using Bloom filters, hash tables, obfuscation and aggregation to ensure the information present in datasets is available only at the appropriate level of access.
Survival Analysis: Part 1 - What is it?
Survival analysis is long-established within life actuarial work but infrequently used in general data science projects. This series of posts investigates why it's so useful for time-dependent effects, with worked examples.
Tools of the trade (an overview)
We use a variety of software tools for preparing, exploring and modelling data; usually scientific, lightweight and flexible, allowing bespoke insight.
Data science has become a well-established discipline. What is it?
The term 'data science' has been around for several years with many explanations, discussions and breathless over-excitement in the technology and business press. What is it, where did it come from, and who's using it today?