# Living in Higher Dimensions

Professor Sir David MacKay revolutionised Machine Learning. There's no question about it. His depth of knowledge and his selflessness were clear from his research and his groundbreaking work on Information Theory. The fields of Gaussian Processes and Neural Networks both owe him a lot.

# Sorry, the TensorFlow Developer Certificate is Pointless

Plenty of Better Alternatives Exist to Prove your Skillset

Google's overall openness and investment in AI have been phenomenal. I really think the whole world has a lot to thank them for: academic breakthroughs are published and code is often made freely available on GitHub.

# Top 5 Linux Commands for Beginners

Data Science on the Command Line

As data sets get larger and more prevalent, researchers are having to do a lot more of the legwork when it comes to core programming, and are therefore spending more time with tools like Git and Linux (something we rarely had to do before!).

# How to Derive an OLS Estimator in 3 Easy Steps

A Data Scientist's Must-Know

OLS Estimation was originally derived in 1795 by Gauss. Aged 17 at the time, the genius mathematician was attempting to describe the dynamics of planetary orbits and comets alike and, in the process, derived much of modern-day statistics.

# Flask’s Latest Rival in Data Science

Streamlit Is The Game Changing Python Library That We've Been Waiting For

Developing a user interface is not easy. I've always been a mathematician and, for me, coding was a functional tool for solving an equation or creating a model, rather than for providing the user with an experience.

# The Sampling Distribution of Pearson’s Correlation

Pearson's Correlation reflects the dispersion of a linear relationship (top row), but not the angle of that relationship (middle), nor many aspects of nonlinear relationships (bottom).

How a Data Scientist can get the most out of this statistic

People are quite familiar with the colloquial usage of the term 'correlation': that it tends to resemble a phenomenon
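
The point made in the figure caption above can be checked numerically. The following is a minimal sketch, assuming small hand-made data sets; the helper `pearson_r` and the variable names are illustrative and not taken from the original article. It computes Pearson's correlation for a steep line, a shallow line and a parabola.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


xs = [float(i) for i in range(-5, 6)]

steep = [3.0 * x for x in xs]      # noiseless line with a large slope
shallow = [0.2 * x for x in xs]    # noiseless line with a small slope
parabola = [x * x for x in xs]     # perfect, but nonlinear, dependence

r_steep = pearson_r(xs, steep)
r_shallow = pearson_r(xs, shallow)
r_parabola = pearson_r(xs, parabola)

print(r_steep, r_shallow, r_parabola)
```

Both lines give a coefficient of essentially one despite their very different slopes, while the symmetric parabola gives a coefficient of essentially zero even though `y` is fully determined by `x`.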

# Plotting with Seaborn in Python

Figure 0: Pair Plot using Seaborn

4 Reasons Why and 3 Examples How

import seaborn as sns

Finding a pattern can sometimes be the easy bit when researching. Let's be honest: conveying that pattern to the team or to your customers is sometimes a lot more difficult than it should be.

# The Power-Law Distribution

Pareto's Power-Law Distribution Explaining the Laws of Nature (Including the Golden Ratio)

The laws of nature are complicated and, throughout time, scientists from all corners of the world have attempted to model and re-engineer what they see around them to extract some value from it.

# The Student t-Distribution

Probability Density Function for the Student t-Distribution.

For the Sake of Statistics, forget the Normal Distribution.

To be clear: this is targeted at Data Scientists/Machine Learning Researchers and not at Physicists.

Statistical normality is overused. It is not as common as often assumed and only really occurs in the impractical 'limits' [2][3][4].
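
One way to see why the choice of distribution matters is to compare densities in the tails. The following is a minimal sketch using the standard density formulas; the choice of 3 degrees of freedom and the evaluation points are assumptions made purely for illustration.

```python
from math import exp, gamma, pi, sqrt


def normal_pdf(x):
    """Density of the standard Normal distribution."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def student_t_pdf(x, nu):
    """Density of the Student t-distribution with nu degrees of freedom."""
    coeff = gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2))
    return coeff * (1 + x * x / nu) ** (-(nu + 1) / 2)


NU = 3  # illustrative assumption: a small number of degrees of freedom

for x in (0.0, 2.0, 4.0):
    ratio = student_t_pdf(x, NU) / normal_pdf(x)
    print(f"x={x:.1f}  normal={normal_pdf(x):.6f}  "
          f"t={student_t_pdf(x, NU):.6f}  ratio={ratio:.1f}")
```

Near the centre the two densities are similar, but far from it the t-distribution places much more mass on extreme values, so a Normal assumption would understate how often they occur.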

# Robust Statistical Methods

Anomalies hidden in plain sight. Chart from Liu and Neilson (2016)

Methods that Data Scientists Should Love

A robust statistic is a type of estimator used when the distribution of the data set is not certain, or when egregious anomalies exist.
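
To make the idea concrete, here is a minimal sketch comparing classical estimators (mean, standard deviation) with two common robust ones (median, median absolute deviation). The data values and the helper name `mad` are invented for this example and are not from the original article.

```python
from statistics import mean, median, stdev


def mad(values):
    """Median absolute deviation: median distance from the median."""
    centre = median(values)
    return median([abs(v - centre) for v in values])


# Hypothetical measurements clustered around 10, then one egregious anomaly.
clean = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7]
contaminated = clean + [100.0]

for label, data in (("clean", clean), ("contaminated", contaminated)):
    print(f"{label:>12}: mean={mean(data):.2f}  stdev={stdev(data):.2f}  "
          f"median={median(data):.2f}  mad={mad(data):.2f}")
```

A single anomalous value drags the mean and standard deviation far from the bulk of the data, while the median and the MAD barely move.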