Data Science is still a roaring field, with demand continuing to outstrip supply and many businesses expecting to increase their IT spend drastically over the next few years.
Although there has been a sharp rise in online courses, bootcamps and degrees, and with them an increase in junior talent, it is still a great time to get into Data Science.
There are some amazing resources out there for project ideas, but many of them have been done by most new Data Scientists. Pretty much everyone has done a Twitter sentiment analysis project (myself included!), looked at the Titanic dataset or…
When I first started learning Data Science and looking at projects, I thought you could either do a Deep Learning or regular project. This is not the case.
With powerful models becoming more and more accessible, we can easily leverage some of the power of deep learning without having to optimize a neural network or use a GPU.
In this post, we are going to look at embeddings. This is the way deep learning models represent words as vectors. …
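To make the idea of words as vectors concrete, here is a minimal sketch. The three-dimensional vectors are made up for illustration (real models produce vectors with hundreds of dimensions), and cosine similarity is just one common way to compare them, not necessarily the one used in the post.

```python
import math

# Hypothetical 3-dimensional embeddings, invented for illustration only.
toy_embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Words with similar meanings should end up with similar vectors.
royal = cosine_similarity(toy_embeddings["king"], toy_embeddings["queen"])
fruit = cosine_similarity(toy_embeddings["king"], toy_embeddings["apple"])
```

With these toy values, `royal` comes out much closer to 1 than `fruit`, which is the behaviour we would hope for from real embeddings.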
What I’ve learned after writing 9 articles on Medium — 10 tips for my 10th article.
I’ve been an avid reader on Medium for well over a year now, but I only started writing towards the end of 2020.
While I haven’t enjoyed any considerable success so far, my stories have had over 2000 views and I am on track to pay back my membership.
This article lists what I’ve learnt in the brief few months I’ve been using the platform.
1. Write, and write often
I think this is the number 1 tip in many fields and it is…
A data-driven approach to understanding soft skills in the UK market.
This article uses a web-scraped dataset to analyse text and find the most common skills in data-related jobs. I’m focusing on soft skills for this article.
I’m going to run through the top 3 skills and why they are important.
There is an inherent bias in this dataset because it is scraped. This is largely because Indeed, the platform the data was scraped from, is used extensively by recruiters, who tend to list more technical skills and offer larger salary ranges.
To…
A data-driven approach to understanding technical skills in the UK market.
This article uses a web-scraped dataset to analyse text and find the most common skills in data-related jobs. I’m focusing on technical/hard skills for this article and will tackle soft skills later on.
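As a rough sketch of what finding the most common skills can look like, the snippet below counts how many job descriptions mention each skill. The descriptions and the skill list are hypothetical, and simple keyword matching is an assumption on my part rather than the exact method used on the scraped data.

```python
import re
from collections import Counter

# Hypothetical job descriptions and skill list, invented for illustration;
# they are not taken from the scraped dataset.
job_descriptions = [
    "Data Analyst with strong SQL and Excel skills. Python is a plus.",
    "Data Scientist: Python, SQL and machine learning experience required.",
    "BI Developer experienced in SQL, Tableau and Excel.",
]
skills = ["python", "sql", "excel", "tableau"]


def count_skills(descriptions, skills):
    """Count how many descriptions mention each skill at least once."""
    counts = Counter()
    for text in descriptions:
        # Lowercase and split into word-like tokens, dropping punctuation.
        words = set(re.findall(r"[a-z+#]+", text.lower()))
        for skill in skills:
            if skill in words:
                counts[skill] += 1
    return counts


skill_counts = count_skills(job_descriptions, skills)
most_common_skill = skill_counts.most_common(1)
```

Counting each skill once per description stops a single listing that repeats "SQL" five times from dominating the ranking.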
For each of the 5 most common skills, I’m going to cover:
The dataset consists of 3,015 job titles including salaries.
The mean salary is £49,543.
The median salary is £44,000.
These seem quite high and we’ll discuss why later on.
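The mean sits above the median here. One common cause of that pattern is a handful of very high values pulling the average up, although the figures alone don't prove that this is what is happening in this dataset. The sketch below shows the effect on a hypothetical list of salaries, not on rows from the scraped data.

```python
from statistics import mean, median

# Hypothetical salaries in GBP, invented for illustration;
# they are not rows from the scraped dataset.
salaries = [28_000, 35_000, 40_000, 44_000, 48_000, 60_000, 95_000]

mean_salary = mean(salaries)
median_salary = median(salaries)

# A mean above the median suggests a few high salaries pull the average up.
right_skewed = mean_salary > median_salary
```

In this toy list the single £95,000 salary lifts the mean to £50,000, while the median stays at £44,000.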
There are plenty of fantastic Data Science content creators on YouTube. If I’m trying to understand an algorithm, one of them typically hits the nail on the head and describes it perfectly.
With Logistic Regression, I can’t find that one perfect description. People seem to approach the algorithm from different directions, which I find confusing. It might also be because I don’t have a heavy maths background. Logistic Regression is pretty heavy on mathematical notation, and this might be another reason why I found it hard to understand for quite some time.
This article aims to explain logistic regression in…
If you haven’t already, you should read my article on Decision Trees.
Understanding the Decision Tree and its flaws is paramount to understanding why the Random Forest exists, and why it is powerful.
According to KDnuggets, the top 3 most commonly used algorithms by Data Scientists are Linear/Logistic Regression, Decision Tree or Random Forest, and Gradient Boosting models. …
Tree-based models are some of the most widely used models today; they are very powerful, easy to implement and provide feature importances to help with interpretability. One of the most widely used tree-based models is the Random Forest, which is based on Decision Trees.
In order to understand Random Forest, it is essential to know what the underlying model, the Decision Tree, is doing.
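As a rough illustration of what a Decision Tree does at each node, the sketch below scores two candidate splits using Gini impurity. Gini is one common splitting criterion (entropy is another), so treat it as an assumption rather than the criterion this article settles on. The labels are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 0 for a pure group, higher for mixed groups."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def weighted_impurity(left, right):
    """Impurity of a split, weighting each side by its share of the rows."""
    total = len(left) + len(right)
    left_part = (len(left) / total) * gini_impurity(left)
    right_part = (len(right) / total) * gini_impurity(right)
    return left_part + right_part


# Hypothetical labels, invented for illustration.
parent = ["yes", "yes", "yes", "no", "no", "no"]
good_split = weighted_impurity(["yes", "yes", "yes"], ["no", "no", "no"])
poor_split = weighted_impurity(["yes", "no", "yes"], ["no", "yes", "no"])
```

A tree built this way would prefer `good_split`, because it separates the two classes completely, while `poor_split` leaves both sides almost as mixed as the parent.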
Despite its ease of use, it can be a tricky algorithm to explain…
This article looks at a few techniques that, if mastered, equip the user with the tools to deal with a wide range of data types. This article does not cover anything related to database management, such as table creation or schemas.
If you’d like to follow along, you can set up a local SQL database using SQLite:
If these techniques are too complex, you can get up to speed in less than 10 minutes here:
If you need a more general introduction to SQL and Databases, check out the first part of this tutorial:
In this article, we will look at basic SQL syntax: selecting data from tables, filtering and working with data types. We will then look at joins and aggregate measures.
The syntax in this article is all you really need to work with data in SQL competently. Anything else just makes life easier.
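To give a flavour of that syntax, here is a minimal sketch that runs a filtered SELECT and a join with an aggregate against an in-memory SQLite database, using Python's built-in `sqlite3` module. The tables, columns and rows are hypothetical, and the CREATE and INSERT statements are only there so the example runs on its own.

```python
import sqlite3

# Hypothetical tables and rows, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY, name TEXT, salary REAL, department_id INTEGER
    );
    INSERT INTO departments VALUES (1, 'Data'), (2, 'Finance');
    INSERT INTO employees VALUES
        (1, 'Ana', 52000, 1), (2, 'Ben', 41000, 1), (3, 'Cara', 38000, 2);
    """
)

# Selecting and filtering: one column from one table, restricted by WHERE.
well_paid = conn.execute(
    "SELECT name FROM employees WHERE salary > ? ORDER BY name", (40000,)
).fetchall()

# Join plus aggregate: average salary per department.
avg_by_department = conn.execute(
    """
    SELECT d.name, AVG(e.salary)
    FROM employees AS e
    JOIN departments AS d ON d.id = e.department_id
    GROUP BY d.name
    ORDER BY d.name
    """
).fetchall()
conn.close()
```

The same SELECT, WHERE, JOIN and GROUP BY statements should carry over to other SQL databases with little or no change, although data types and some functions differ between systems.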
SQL remains one of the most…