DATA IS THE NEW OIL - CRUDE OIL

Krishen Patel our consultant managing the role
Posting date: 9/2/2013 2:16 PM

When Nasdaq stopped trading this week, it again showed how global firms are at the mercy of a power that created them

"Data is the new oil," declared Clive Humby, a Sheffield mathematician who with his wife, Edwina Dunn, made £90m helping Tesco with its Clubcard system. Though he said it in 2006, the realization that there is a lot of money to be made – and lost – through the careful or careless marshalling of "big data" has only begun to dawn on many business people.

The crash that knocked out the Nasdaq trading system was only one example; in the past week, Amazon, Google and Apple have all suffered breaks in service that have affected their customers, lost sales or caused inconvenience. When Amazon's main shopping site went offline for nearly an hour, estimates suggested millions of dollars of sales were lost. When Google went offline for just four minutes this month, the missed chance to show adverts to searchers could have cost it $500,000.

Michael Palmer, of the Association of National Advertisers, expanded on Humby's quote: "Data is just like crude. It's valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc to create a valuable entity that drives profitable activity; so must data be broken down, analyzed for it to have value."

For Amazon and Google especially, being able to process and store huge amounts of data is essential to their success. But when it goes wrong – as it inevitably does – the effects can be dramatic. And the biggest problem can be data which is "dirty", containing erroneous or garbled entries which can corrupt files and throw systems into a tailspin. That can cause the sort of "software glitch" that brought down the Nasdaq – or lead to servers locking up and a domino effect of overloading.

"Whenever I meet people I ask them about the quality of their data," says Duncan Ross, director of data sciences at Teradata, which provides data warehousing systems for clients including Walmart, Tesco and Apple. "When they tell me that the quality is really good, I assume that they haven't actually looked at it."

That's because the systems businesses use increasingly rely on external data, whether from governments or private companies, which cannot be assumed to be reliable. Ross says: "It's always dirty."
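One common defence against the dirty external data Ross describes is to check every incoming record before it reaches downstream systems, and to set aside anything that fails. The sketch below is a minimal illustration, not a description of how Nasdaq, Teradata or anyone else named here handles data; the record layout (`symbol`, `price`) and both function names are hypothetical.

```python
import math


def find_problems(record):
    """Return a list of problems in one record; an empty list means it looks clean."""
    problems = []

    # Hypothetical rule: a ticker symbol is a non-empty string of letters only.
    symbol = record.get("symbol")
    if not isinstance(symbol, str) or not symbol.isalpha():
        problems.append("symbol missing or garbled")

    # Hypothetical rule: a price must parse as a finite, positive number.
    try:
        price = float(record.get("price"))
    except (TypeError, ValueError):
        problems.append("price is not a number")
    else:
        if not math.isfinite(price) or price <= 0:
            problems.append("price is not a finite positive number")

    return problems


def split_clean_and_dirty(records):
    """Separate records into those that pass every check and those that fail any."""
    clean, dirty = [], []
    for record in records:
        problems = find_problems(record)
        if problems:
            dirty.append((record, problems))
        else:
            clean.append(record)
    return clean, dirty
```

Quarantining bad records together with the reasons they failed, rather than silently dropping them, lets someone actually look at the data quality, which is the step Ross suspects most people skip.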

And that puts businesses at the mercy of the occasional high-pressure data spill. Inject the wrong piece of data and trouble follows. In April, when automatic systems read a tweet from the Associated Press Twitter feed which said the White House had been bombed and Barack Obama injured, they sold stock faster than the blink of an eye, sending the US Dow index down 143 points within seconds. But the data was dirty: AP's Twitter feed had been hacked.

The statistics are stunning: about 90% of all the data in the world has been generated in the past two years (a statistic that is holding roughly true even as time passes). There are about 2.7 zettabytes of data in the digital analytics universe, where 1ZB of data is a billion terabytes (a typical computer hard drive these days can hold about 0.5TB, or 500 gigabytes). IBM predicts that will hit 8ZB by 2015. Facebook alone stores and analyzes more than 50 petabytes (50,000 TB) of data.
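The unit conversions in that paragraph can be checked with a few lines of arithmetic. This is a minimal sketch assuming decimal (SI) units, where each step up is a factor of 1,000; the helper name `drives_needed` is made up for illustration.

```python
# Decimal (SI) storage units, expressed in bytes.
TB = 10**12   # terabyte
PB = 10**15   # petabyte
ZB = 10**21   # zettabyte


def drives_needed(total_bytes, drive_bytes):
    """Number of drives of a given size needed to hold total_bytes (rounded up)."""
    return -(-total_bytes // drive_bytes)  # integer ceiling division


terabytes_per_zettabyte = ZB // TB    # a billion terabytes in one zettabyte
facebook_terabytes = 50 * PB // TB    # 50 petabytes expressed in terabytes

# 2.7 ZB spread across typical 0.5 TB hard drives.
half_terabyte_drives = drives_needed(27 * ZB // 10, TB // 2)

print(f"{terabytes_per_zettabyte:,} TB per ZB")
print(f"{facebook_terabytes:,} TB at Facebook")
print(f"{half_terabyte_drives:,} half-terabyte drives for 2.7 ZB")
```

On those assumptions, 2.7ZB would fill 5.4 billion of the 0.5TB drives mentioned above.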

Data is also moving faster than ever before: by last year, between 50% and 70% of all trades on US stock exchanges were being done by machines which could execute a transaction in less than a microsecond (a millionth of a second). Internet connectivity is run through fibre optic connections where financial companies will seek to shave five milliseconds from a connection so those sub-microsecond transactions can be done even more quickly.

We're also storing and processing more and more of it. But that doesn't mean we're just hoarding data, says Ross: "The pace of change of markets generally is so rapid that it doesn't make sense to retain information for more than a few years.

"If you think about something like handsets or phone calls, go back three or four years and the latest thing was the iPhone 3GS and BlackBerrys were really popular. It's useless for analysis. The only area where you store data for any length of time is regulatory work."

Yet the amount of short-term data being processed is rocketing. Twitter recently rewrote its entire back-end database system because it would not otherwise be able to cope with the 500m tweets, each as long as a text message, arriving each day. (By comparison, the four UK mobile networks together handle about 250m text messages a day, a figure that is falling as people shift to services such as Twitter.)

Raffi Krikorian, Twitter's vice-president for "platform engineering" – that is, in charge of keeping the ship running, and the whale away – admits that the 2010 World Cup was a dramatic lesson, when goals, penalties and free kicks being watched by a global audience made the system creak and quail.

A wholesale rewrite of its back-end systems over the past three years means it can now "withstand" events such as the showing on Japanese television of the film Castle in the Sky, which set a record by generating 143,199 tweets a second on 2 August at 3.21pm BST. "The number of machines involved in serving the site has been decreased anywhere from five to 12 times," he notes proudly. Even better, Twitter has been available for about 99.9999% of the past six months, even with that Japanese peak.
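To put the 99.9999% availability figure in perspective, it can be converted into an amount of permitted downtime. The sketch below assumes "six months" means 182 days and that availability is measured as a simple share of elapsed time; neither detail is given in the article, and the function name is hypothetical.

```python
from fractions import Fraction

SECONDS_PER_DAY = 24 * 60 * 60


def allowed_downtime_seconds(availability_percent, days):
    """Seconds of downtime implied by an availability percentage over a period.

    availability_percent is passed as a string (e.g. "99.9999") so that it
    converts to an exact fraction rather than a rounded float.
    """
    unavailable_share = 1 - Fraction(availability_percent) / 100
    return unavailable_share * days * SECONDS_PER_DAY


six_nines = allowed_downtime_seconds("99.9999", 182)
three_nines = allowed_downtime_seconds("99.9", 182)

print(f"99.9999% over 182 days: {float(six_nines):.1f} seconds of downtime")
print(f"99.9% over 182 days: {float(three_nines) / 3600:.1f} hours of downtime")
```

Under those assumptions, 99.9999% leaves room for roughly 16 seconds of downtime in six months, against more than four hours at 99.9%.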

Yet even while Twitter moved quickly, the concern is that other parts of the information structure will not be resilient enough to deal with inevitable collapses – and that could have unpredictable effects.

"We've had mains power for more than a century, but can have an outage caused by somebody not resetting a switch," says Ross. "The only security companies can have is if they build plenty of redundancy into the systems that affect our lives."



