A data janitor, the sexiest job of the 21st century

Daniel Lewis our consultant managing the role
Posting date: 7/17/2013 3:19 PM

A job invented in Silicon Valley is going mainstream as more industries try to gain an edge from big data.

The job description “data scientist” didn’t exist five years ago. No one advertised for an expert in data science, and you couldn’t go to school to specialize in the field. Today, companies are fighting to recruit these specialists, courses on how to become one are popping up at many universities, and the Harvard Business Review even proclaimed that data scientist is the “sexiest” job of the 21st century.

Data scientists take huge amounts of data and attempt to pull useful information out. The job combines statistics and programming to identify sometimes subtle factors that can have a big impact on a company’s bottom line, from whether a person will click on a certain type of ad to whether a new chemical will be toxic in the human body.

While Wall Street, Madison Avenue, and Detroit have always employed data jockeys to make sense of business statistics, the rise of this specialty reflects the massive expansion in the scope and variety of data now available in some industries, like those that collect data about customers on the Web. There’s more data than individual managers can wrap their minds around—too much of it, changing too fast, to be analyzed with traditional approaches.

As smartphones promise to become a new source of valuable data to retailers, for example, Walmart is competing to bring more data scientists on board and now advertises for dozens of open positions, including “Big Fast Data Engineer.” Sensors in factories and on industrial equipment are also delivering mountains of new data, leading General Electric to hire data scientists to analyze these feeds.

The term “data science” was coined in Silicon Valley in 2008 by two data analysts then working at LinkedIn and Facebook (see “What Facebook Knows”). Now many startups are basing their businesses on their ability to analyze large quantities of data—often from disparate sources. ZestFinance, for example, has a predictive model that uses hundreds of variables to determine whether a lender should offer high-risk credit. The underwriting risk it achieves is 40 percent lower than that borne by traditional lenders, says ZestFinance data scientist John Candido. “All data is credit data to us,” he says.

Data scientist has become a popular job title partly because it has helped pull together a growing number of haphazardly defined and overlapping job roles, says Jake Klamka, who runs a six-week fellowship to place PhDs from fields like math, astrophysics, and even neuroscience in such jobs. “We have anyone who works with a lot of data in their research,” Klamka says. “They need to know how to program, but they also have to have strong communications skills and curiosity.”

The best data scientists are defined as much by their creativity as by their code-writing prowess. The company Kaggle organizes contests where data scientists compete to find the best way to make sense of massive data sets (see “Startup Turns Data Crunching into a High-Stakes Sport”). Many of the top Kagglers (there are 88,000 registered on the site) come from fields like astrophysics or electrical engineering, says CEO Anthony Goldbloom. The top-ranked participant is an actuary in Singapore.

Universities are starting to respond to the job market’s needs. Stanford University plans to launch a data science master’s track in its statistics department, says department chair Guenther Walther. A dozen or so other programs have already been started at schools including Columbia University and the University of California, San Francisco. Cloudera, a company that sells software to process and organize large volumes of data, announced in April that it would work with seven universities to offer undergraduates professional training on how to work with “big data” technologies.

Cloudera’s education program director, Mark Morissey, says a skills shortage is looming and that “the market is not going to grow at the rate it currently wants to.” That has driven salaries up. In Silicon Valley, salaries for entry-level data scientists are around $110,000 to $120,000.

Others think the trend could create a new area of outsourcing. Shashi Godbole, a data scientist in Mumbai, India, who is ranked 20th on Kaggle’s scoreboard, recently completed a Kaggle-arranged hourly consulting gig, a new business the platform is getting into. He did work for a tiny health advocacy nonprofit located in Chicago and is now bidding on more jobs (he earns $200 per hour, and Kaggle collects $300 an hour). His Kaggle work is part time for now, but he says it’s possible that it could be his major source of income one day.

To the data scientists themselves, the job is certainly less sexy than it’s being made out to be. Josh Wills, a senior director of data science at Cloudera, says most of the time it involves cleaning up messy data—for example, by putting it in the right columns and sorting it.

“I’m a data janitor. That’s the sexiest job of the 21st century,” he says. “It’s very flattering, but it’s also a little baffling.”
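The cleanup Wills describes, putting values in the right columns and sorting them, can be illustrated with a minimal sketch. The sample data, the column names, and the `clean_rows` helper are all hypothetical and are not taken from Cloudera or the article; the sketch only assumes a small CSV export with untidy headers and one row whose name and age were entered in the wrong columns.

```python
import csv
import io

# Hypothetical messy export: inconsistent header names, stray whitespace,
# and one row whose name and age landed in the wrong columns.
RAW = """Name , AGE,city
 alice,34, Dublin
29,Bob,london
 carol ,41,Frankfurt
"""


def clean_rows(raw_text):
    """Normalise headers, trim cells, fix swapped columns, and sort by name."""
    reader = csv.reader(io.StringIO(raw_text))
    header = [h.strip().lower() for h in next(reader)]
    rows = []
    for cells in reader:
        if not cells:  # skip blank lines
            continue
        record = dict(zip(header, (c.strip() for c in cells)))
        # If the "name" cell is numeric and the "age" cell is not, swap them.
        if record["name"].isdigit() and not record["age"].isdigit():
            record["name"], record["age"] = record["age"], record["name"]
        record["name"] = record["name"].title()
        record["city"] = record["city"].title()
        record["age"] = int(record["age"])
        rows.append(record)
    return sorted(rows, key=lambda r: r["name"])


cleaned = clean_rows(RAW)
for row in cleaned:
    print(row)
```

Real cleanup jobs are usually larger and messier than this, but the shape is the same: standardise the layout first, then repair individual values, then order the result.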



Related blog & news

With over 10 years' experience working solely in the Data & Analytics sector, our consultants are able to offer detailed insights into the industry.

Visit our Blogs & News portal or check out the related posts below.

How to Succeed in Self-Service BI


Business Intelligence, along with Business Analytics and Big Data, is one of the terms often associated with decision-making processes in organisations. However, there is little discussion of the skills that decision makers in your organisation need in order to use the technology efficiently.

In recent years, user-friendly tools for BI processes, known as Self-Service BI, have become increasingly common. Self-Service BI is an approach to BI where anyone in an organisation can collect and organise data for analysis without the assistance of data specialists. As a result, many businesses have invested in comprehensive storage and information processing tools. However, many are beginning to find that they are not realising the gains they expected from these investments, often because they underestimated the difficulty of introducing these systems into current processes and of transforming existing knowledge into actual actions and decisions.

In a worst-case scenario, if left unplanned, Self-Service BI can sabotage your successful BI deployment by cutting mass user adoption, impairing query performance, failing to reduce report backlogs, and increasing confusion over the “single truth”. To prevent this from happening, here are our top three tips for ensuring the right implementation of Self-Service BI in your company:

UNDERSTAND YOUR USERS’ NEEDS

There are three major user areas for analytics tools: strategic, tactical and operational. Strategic users make few, but important, decisions. Tactical users make many decisions during a week and need updated information daily. Operational users are often closest to the customer, and this group needs data in its own applications in order to carry out a large number of requests and transactions. Understanding the different needs of each group is necessary to know what information should be available, and how frequently, in order to scale the BI solution.

HARNESS THE POWER OF ADVANCED USERS

To ensure a successful BI deployment, utilising advanced users is key. Self-Service BI is not a one-size-fits-all approach. Casual users usually don’t have the time to learn the tool and will often reach out to ‘Power Users’ to create what they need. Hence, these users can become the go-to resource for creating ad-hoc views of data. Power Users are the ideal advocates for your business’s Self-Service BI implementation and should be able to help spur user adoption.

UPGRADE INTERNAL COMPETENCIES

Our final tip for a successful implementation is to communicate the new tool thoroughly to the users. It is highly unlikely that employees who have not been involved in the actual development project will immediately understand what the tool should be used for, who needs it, and what it should replace. By upgrading internal competencies, you can avoid becoming dependent on external assistance. Establishing a cross-organisational BI competence centre of 5-10 members, who meet regularly to share their experiences, will help drive and prioritise future use of the tool. The added benefit of a successful implementation is that it will generate new ideas from users for how the organisation can use data to make better decisions.

If you have the skillset to implement Business Intelligence solutions, we may have a role for you. Take a look at our latest opportunities or get in contact with our team.

Real Time Pricing - Coming to a store near you


Personal shopping is on the brink of taking on a whole new meaning. The advancement of mobile technology and the information held on individuals' shopping histories means product prices could soon adapt as shoppers walk up and down their supermarket aisle.

Gone are the days of retailers only being able to actively manage the price of a small number of products once a week. Algorithmic pricing and real-time competitive pricing data allow product prices to be changed on the fly.

Amazon is at the forefront of such "real-time pricing" initiatives, which have traditionally been the preserve of online-only retailers.

However, brick-and-mortar retailers in the US are showing their UK counterparts the possibilities when it comes to dynamic pricing.

Independent consumer electronics retailer Abt Electronics pipes competitive pricing data gathered by Dynamite Data into its point-of-sale systems to allow staff to negotiate prices at the point of sale, according to Dynamite Data chief executive Diana Schulz.

Meanwhile, another of Dynamite Data’s clients, which it did not name, uses electronic shelf labels and re-prices every product in its stores each morning based on the prices of its rivals.

The ability to change prices dynamically is not simply the preserve of all-powerful brands such as Walmart or Target either.

Schulz explained that her company has "seen these types of technologies in both large and mid-sized retailers" despite the "investment in technology and competitive data that is typically needed".

Commercial sensitivities

Back in the UK things are not quite as close to a Minority Report-style personalized shopping experience.

Even online-only specialists Shop Direct and Ocado claim they do not engage in real-time pricing, while those that do rely heavily on real-time data to adapt their prices, such as the airlines, are reluctant to discuss the issue.

EasyJet declined to comment when contacted because of commercial sensitivities around discussing pricing-related issues.

Grocers Tesco, Asda and Sainsbury’s have all claimed they do not engage in real-time pricing, with the latter two both citing the logistical difficulties in aligning such a strategy across their physical stores and online presence.

A Sainsbury’s spokesman claimed real-time pricing would result in "chaos", while an Asda spokeswoman said such a strategy would be a "nightmare".

Yet, despite such a negative perspective from UK brands, experts are confident real-time pricing will arrive on these shores sooner or later.

Simon Spyer, a partner of VCCP data arm Conduit who began his career working on the Sainsbury's Nectar business, believes the UK will begin to see "more and more" of matching rivals’ prices dynamically, particularly in the grocery and electrical sectors.

He explained that real-time pricing is likely to affect "anything where the product is largely commoditized" and instances where the only way retailers can differentiate that product is by "being really keen on price".

Electronic labels

As it stands, the major barrier to implementing "real-time pricing" in-store is changing the shelf prices to match the online price, a hurdle that could be removed by the electronic shelf labels being pioneered in the US.

In the UK various retailers have dipped their toes into the water when it comes to electronic shelf-labeling, including a Nisa Local store in Shrewsbury that launched a trial in August last year to carry out automatic pricing and timed promotional updates, alongside QR codes and meal deals.

Tesco has also experimented with electronic labeling on various occasions, with trials in 2006 and 2008, but the retail giant has yet to combine real-time pricing with its electronic labels.

Spyer claims "the capability is definitely there both online and offline – it is whether there is a business rationale for investing in it".

However, with major UK supermarkets lacking a pressing reason to implement real-time pricing, that investment may be slow in arriving, argues Kaye Coleman, the founder of price consultancy Ripe Strategic.

Coleman explains: "The supermarkets already do price matching – it is not so sophisticated but price matching is already happening".

Schemes including the Tesco Price Promise, the Asda Price Guarantee and the Sainsbury’s Brand Match currently use real-time data to "price match" by offering money off the next shop.

A cynic could argue the supermarkets should knock money off at the till rather than relying on customers to redeem their vouchers at the next shop, but such an action could hit the companies' bottom line.

Mobile sophistication

The growing sophistication of mobile marketing is also likely to revolutionize the way brands approach their price matching.

"If you can come up with a value proposition where I check-in [on my mobile] when I walk through the store for the first time and that presents me with a personalized experience based on my purchase history then I could see the benefit for a customer and a retailer," said Spyer.

The trick for retailers is persuading customers to adopt such behavior, but the offer of ever-changing personalized price offers and messages delivered in-store is a compelling proposition.

Personalization is already a priority for retailers. Sainsbury’s uses anonymized shopping data gathered from the Nectar card to personalize offers.

The levels of personalization offered by Sainsbury’s are increasingly complex. If a female customer buys folic acid, they will be sent promotions on other pregnancy-related supplements during the pregnancy period and offers on nappies further down the line.

UK retailers are sure to keep a close eye on developments over the Atlantic, with Schulz claiming she knows of clients that are piloting technologies that enable in-store personalized discounts.

The challenges on the high street mean there will inevitably be more casualties, but real-time pricing does not have to be the sole preserve of online-only retailers.

Innovative ways of manipulating real-time data could be the shot in the arm the high-street retail industry so desperately needs.

This article was first published on marketingmagazine.co.uk
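The morning re-pricing described in the article, where shelf prices are reset based on rivals' prices, can be sketched as a simple rule. The rule used here (match the cheapest rival, but never go below a floor price) is an assumption chosen for illustration; the article does not say which rule any retailer or Dynamite Data actually applies, and the `reprice` function, SKUs, and prices are all hypothetical.

```python
def reprice(current_price, rival_prices, floor_price):
    """Return a new shelf price under an assumed rule:
    match the cheapest rival, but never drop below the floor price.
    If no rival prices are known, keep the current price."""
    if not rival_prices:
        return current_price
    target = min(rival_prices)
    return round(max(target, floor_price), 2)


# Hypothetical catalogue: current price, lowest acceptable price, rival prices.
catalogue = {
    "hdmi-cable": {"price": 12.99, "floor": 8.00, "rivals": [11.49, 12.25]},
    "usb-charger": {"price": 19.99, "floor": 15.00, "rivals": [13.99, 16.50]},
    "sd-card": {"price": 24.99, "floor": 18.00, "rivals": []},
}

new_prices = {
    sku: reprice(item["price"], item["rivals"], item["floor"])
    for sku, item in catalogue.items()
}

for sku, price in new_prices.items():
    print(f"{sku}: {price:.2f}")
```

The floor price stands in for whatever business constraint stops a retailer from following a rival all the way down; in practice that constraint, and the source of rival prices, would be specific to each retailer.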
