THE DATA SCIENCE PROCESS - EXPERIMENT, PRODUCE AND EXPLAIN

Author: Luke Frost
Posting date: 6/19/2018 9:36 AM
The Fourth Industrial Age is booming. Data Scientists are the rock stars of the tech world and data science is considered the "sexiest job" of the 21st century. But when you take away all the buzzwords and show, what does it all really mean? If you're just getting started in the field, or know someone who wants to, this is the first in a series of bite-sized articles looking at life as a Data Scientist.

Is Newton a hero of yours? Me too! Were science and maths your favourite subjects? Me too! As a specialist Data Science recruiter working across both research and commercial roles, I've had the pleasure of meeting and learning from thousands of Data Scientists and other professionals within the analytics space, and here's what I’ve learned.

What does a Data Scientist do?

A Data Scientist offers a holistic view of data, with a clear understanding of how data comes together and the relationships between seemingly disconnected features. In practice, the work falls into three distinct areas:

1. Experimentation 
2. Production 
3. Explanation

Testing, Testing, Hypothesise, Prove - Experimentation

As someone interested in Physics and Chemistry throughout school, the word science conjures up images of frogs being dissected, Newton being hit by an apple, and Bunsen burners. Much like a chemist tests for chemical properties, adjusting their experiments to produce different results, a Data Scientist experiments too - only with gigabytes upon gigabytes of data.

The experimentation phase is crucial for a Data Scientist: they test hypotheses, understand the limitations of algorithms and try to establish successful Proofs of Concept (PoC) to both prove and disprove their hypotheses. Once these experiments have proven successful on limited data sets, the process of production begins. It goes without saying that for experimentation to take place, there needs to be a clear structure to the data, an area that my colleague Josh Carter covers in his article Build IT and They Will Come.
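
For readers who like to see an idea in code, here is a minimal sketch of what testing a hypothesis on a limited data set can look like. It is only an illustration, not how any particular team works: the numbers are made up, the function name permutation_test is hypothetical, and a permutation test is just one of many techniques a Data Scientist might choose. The hypothesis here is that a "variant" group has a higher average than a "control" group.

```python
import random
from statistics import mean


def permutation_test(control, variant, n_permutations=5_000, seed=0):
    """Estimate a one-sided p-value for 'variant has a higher mean than control'.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    observed = mean(variant) - mean(control)
    pooled = list(control) + list(variant)
    n_control = len(control)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Shuffle the group labels: if the groups do not really differ,
        # a random split should often look as good as the real one.
        rng.shuffle(pooled)
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one smoothing so the estimate is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up sample values, purely for illustration.
control = [20.1, 22.4, 19.8, 21.0, 20.5, 22.1, 19.5, 21.3]
variant = [23.0, 24.1, 22.8, 25.2, 23.5, 24.8, 22.9, 24.0]

observed, p_value = permutation_test(control, variant)
print(f"observed uplift: {observed:.2f}, p-value: {p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be down to chance alone, which is the kind of evidence a PoC needs before anyone thinks about production.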

Putting it All Together - The Production Puzzle

When it comes to production, a Data Scientist has to juggle all aspects and implement a ‘clean’ solution that can run as efficiently as possible.

An isolated hypothesis is of little use to a business using analytics to shape policy and inform major business decisions. The solution must be rolled out across the complete dataset and continue to achieve results similar to those of the initial PoC in order to have commercial impact. It must also be able to work in harmony with all the other algorithms that are currently deployed. Once these initial PoC algorithms have been put into production and have produced an interesting output, there is one final stage to the process.

Tell Me in Plain Language - Explanation

Data Science has infused every industry, including retail. Much like a retail associate explains the benefits and features of a product to prospective buyers, a Data Scientist must be able to break down a complex concept and translate their findings into non-technical terms.

This is an essential skill when you consider that very few commercial Data Scientists work in isolation, and in order for businesses to completely buy into Data Science, they first need to understand it.

As someone who's worked with Data Scientists and Data Analysts both in the retail industry and now as a recruiter, I find this one of the most fascinating parts of the process. I hope this brief summary provides a topline overview of the way that a Data Scientist works within industry.

Please do take a look at our current vacancies or reach out to me directly. You can reach me at markproud@harnham.com or by calling 0208 408 6070.

Related blog & news

With over 10 years' experience working solely in the Data & Analytics sector, our consultants are able to offer detailed insights into the industry.

Visit our Blogs & News portal or check out the related posts below.

From Idea to Impact: How Charities Use Data

It’s that time of year again. As the festive season draws near and we pull together wish lists, many of us also begin to think about how we can give back. Given that the UK spent over £7 billion this Black Friday and Cyber Monday weekend, it’s not surprising that the idea of Giving Tuesday is becoming more and more popular.

But with 160,000 registered charities in the UK alone, institutions are turning to data to find new ways to stand out and make a greater impact. Far from just running quarterly reports, charities are now utilising the insights they gain from data to inform their strategies, improve their services and plan for the future.

IDEAS

Given that not every charity is lucky enough to go viral with an Ice Bucket Challenge style video, there is a need to find other ways to stand out in such a crowded market. As such, many are looking to the data they have collected to help create a strategy.

Macmillan Cancer Support, one of the UK’s biggest charities, wanted to see more success from one of their main fundraisers, ‘The World’s Biggest Coffee Morning’. The event, which sees volunteers hold coffee and cake-fuelled gatherings across the country, was revolutionised by data. By engaging with their database and researching what motivated fundraisers, they refocused their marketing around how the occasion could create an opportunity for people to meet up and chat, such as swapping ‘send for your free fundraising pack’ for ‘order your free coffee morning kit’. Whilst these amends may seem superficial, they had a major impact, increasing funds raised from £15m to £20m.

Some brands have taken this idea even further, using Data & Analytics tools to engage with potential donors. Homelessness charity Cyrenians’ data told them that there were a number of misconceptions about rough sleepers, including 15% of people believing that they were homeless by choice. To counter this they created an AI chatbot, named Alex, that allowed users to ask questions they may not have been comfortable asking a real person.

Another charity using data tools to counter common misconceptions is Dyslexia Association. Their Moment of Dyslexia campaign saw them utilise facial recognition technology; the longer a person looked at their digital poster, the more jumbled up the words and letters became. By harnessing both insights and the technology made possible by data, they were able to offer an insight into what dyslexia is like for people who previously didn’t understand.

INDIVIDUALS

A big issue facing a number of charities is trust. Following a series of recent scandals, the public are more sceptical than ever of how charities are run, and their use of data is no exception. This ‘trust deficit’ has resulted in a vast number of potential donors staying away, with recent research highlighting that only 11% of people are willing to share their data with a charity, even if it means a better service.

Whilst charities with effective Data Governance are able to use their vast amount of data to enhance their business, those who mismanage it are likely to suffer. Following a cyber-attack that exposed the data of over 400,000 donors, the British and Foreign Bible Society were fined £100,000. As hackers were able to enter the network by exploiting a weak password, this serves as a timely reminder that our data needs not only to be clean, but secure.

Financial implications aside, improper data usage can also do irreversible damage to a charity’s reputation. St Mungo’s has faced criticism for passing information about migrant homeless people to the Home Office, putting them at risk of deportation. Whilst they were cleared of any wrongdoing by the ICO, this controversial use of data has had a negative impact on the charity’s image. With a decline in the number of people donating to charity overall, anything that can put people off further is bad news.

IMPACT

Whilst there is more demand than ever for charities to share their impact data, there is also more opportunity. With Lord Gus O’Donnell urging charities to make data an ‘organisation-wide priority’, many are going beyond publishing annual reports and fully embracing a culture shift.

Youth charity Keyfund have been able to justify how they spend their funds based on their impact data. Having heard concerns from fundraisers regarding whether their leisure projects were effective, they looked at the data they had gathered from the 6,000 young people they were helping. What they found was that not only were their leisure projects effective, they had an even more positive impact than their alternatives, particularly for those from the most deprived areas. This allowed them to continue to support these programs and even increase funding where necessary.

Going one step further are Street League, a charity that use sports programmes to tackle youth unemployment. Rather than share their impact data in quarterly, or even annual, reports, they moved to real-time reporting. Interested parties can visit an ‘Online Impact Dashboard’ and see up-to-the-minute data about how the charity’s work is impacting the lives of the people it is trying to help. This not only allows for the most relevant data to be used strategically, but also supports the business holistically, gaining both donor attention and trust.

To stand out in the charity sector, institutions need to take advantage of data. Not only can this be used to generate campaigns and streamline services but, when used securely and transparently, it can help rebuild trust and offer a competitive edge.

If you want to make the world a better place by harnessing and analysing data, we may have a role for you. Take a look at our latest opportunities or get in touch with one of our expert consultants to see how we can help you.

Using Data & Analytics To Plan Your Perfect Ski Trip

The ski season may be drawing to a close, but it’s never too early to start planning for next year. Born and raised in the mountains of Austria, I have been skiing all of my life. For me, it’s about freedom, enjoying the views and forgetting about everything else.

But, since I’ve stepped into the world of Data & Analytics, I started to ask myself, “What can I learn from my work that I can apply to my skiing?” After having a look around, I began to discover ways in which Data could support my passion. I’ve pulled together some of the most interesting things I’ve discovered and created this handy guide to help you prepare for your next trip. Here’s how you can use data to create the perfect ski trip.

Follow the snow

Anyone who has skied before knows about the uncertainty before a trip. Will there be enough snow? Will the weather be good? Which resort is the most suited to my ability? Fortunately, somebody has already pulled this information together for you.

Two "web spiders" were built via Scrapy, a Python framework used for data extraction. The first extracted ski resort data, while the second extracted daily snowfall data for each resort (2009 - present). After collecting Data from more than 600 ski resorts and splitting it into 7 main regions, the spiders were able to form a conclusion. The framework then pulled out key metrics, including the difficulty of runs, meaning that skiers are now able to decide which resort is most suitable for their ability.

As for the weather, onthesnow.com has recorded snowfall data from all major resorts, every year since 2009. We all know that good snow makes any trip better, so the collected data here will help skiers ensure they are prepared for the right weather, or even plan their trip around where the snow will be best.

Optimise your skis

Ski manufacturing is a refined and complicated process, with each ski requiring many different materials to be built. Unfortunately, this often results in the best skis running out quickly as demand outstrips supply.

To help speed up and improve the process, companies are implementing technologies like IBM Cognos that monitor entire supply chains so that no matter how much demand increases, they have the materials to meet it.

Additionally, since the majority of companies have become more data-driven, production time has been reduced by weeks. Predictions for future demand have also become 50% more accurate, resulting in a 30% drop in idle time on production lines.

Skip the Queue

Tired of queuing for the ski lift? There’s good news. As they begin to make the most of data, ski resorts are introducing RFID (Radio Frequency Identification) systems. These involve visitors purchasing cards with RFID chips included, allowing them to skip queues at the lifts as there is no need to check for fake passes. The data can then be utilised for gamification platforms to turn a skier’s time on the slopes into an interactive experience.

The shift towards Big Data not only has advantages for visitors; resort management are benefiting too. In the past, it has been difficult to analyse skiers’ data. Now, with automated and proper data management, the numbers can be crunched seamlessly and marketing campaigns can be directed at how people actually choose to ski.

Carve a Better Technique

Skiing isn’t always easy, especially if you haven’t grown up with it. Usually, ski instructors are the solution but, in the age of Data & Analytics, there are other solutions. Jamie Grant and co-founder Pruthvikar Reddy have created an app called Carv 2.0, which allows you to be your own teacher. It works by using a robust insert that fits between the shell of your ski boots and the liner. It then gathers data from 48 pressure-sensitive pads and nine motion sensors.

This data is fed to a connected matchbox-sized tracker unit, sitting on the back of your boots, before being relayed via Bluetooth to the Carv App on your phone. Carv can then measure your speed, acceleration and ski orientation a staggering 300 times a second.

Thanks to a complex set of algorithms, this data is then converted into an easy-to-follow graphic display on your phone’s screen as well as verbal feedback from Carvella. The accuracy of this real-time data could make it a better instructor than any individual person.

Data & Analytics are helping streamline every part of our lives. Whilst the above can’t guarantee a perfect ski trip, they can help us minimise risks and optimise our performance and experience.

If you’re able to use data to improve day-to-day living, we may have a role for you. Take a look at our latest opportunities or get in touch with our expert consultants.
