A data janitor, the sexiest job of the 21st century



A job invented in Silicon Valley is going mainstream as more industries try to gain an edge from big data.

The job description “data scientist” didn’t exist five years ago. No one advertised for an expert in data science, and you couldn’t go to school to specialize in the field. Today, companies are fighting to recruit these specialists, courses on how to become one are popping up at many universities, and the Harvard Business Review even proclaimed that data scientist is the “sexiest” job of the 21st century.

Data scientists take huge amounts of data and attempt to pull useful information out. The job combines statistics and programming to identify sometimes subtle factors that can have a big impact on a company’s bottom line, from whether a person will click on a certain type of ad to whether a new chemical will be toxic in the human body.

While Wall Street, Madison Avenue, and Detroit have always employed data jockeys to make sense of business statistics, the rise of this specialty reflects the massive expansion in the scope and variety of data now available in some industries, like those that collect data about customers on the Web. There’s more data than individual managers can wrap their minds around—too much of it, changing too fast, to be analyzed with traditional approaches.

As smartphones promise to become a new source of valuable data to retailers, for example, Walmart is competing to bring more data scientists on board and now advertises for dozens of open positions, including “Big Fast Data Engineer.” Sensors in factories and on industrial equipment are also delivering mountains of new data, leading General Electric to hire data scientists to analyze these feeds.

The term “data science” was coined in Silicon Valley in 2008 by two data analysts then working at LinkedIn and Facebook (see “What Facebook Knows”). Now many startups are basing their businesses on their ability to analyze large quantities of data—often from disparate sources. ZestFinance, for example, has a predictive model that uses hundreds of variables to determine whether a lender should offer high-risk credit. The underwriting risk it achieves is 40 percent lower than that borne by traditional lenders, says ZestFinance data scientist John Candido. “All data is credit data to us,” he says.

Data scientist has become a popular job title partly because it has helped pull together a growing number of haphazardly defined and overlapping job roles, says Jake Klamka, who runs a six-week fellowship to place PhDs from fields like math, astrophysics, and even neuroscience in such jobs. “We have anyone who works with a lot of data in their research,” Klamka says. “They need to know how to program, but they also have to have strong communications skills and curiosity.”

The best data scientists are defined as much by their creativity as by their code-writing prowess. The company Kaggle organizes contests where data scientists compete to find the best way to make sense of massive data sets (see “Startup Turns Data Crunching into a High-Stakes Sport”). Many of the top Kagglers (there are 88,000 registered on the site) come from fields like astrophysics or electrical engineering, says CEO Anthony Goldbloom. The top-ranked participant is an actuary in Singapore.

Universities are starting to respond to the job market’s needs. Stanford University plans to launch a data science master’s track in its statistics department, says department chair Guenther Walther. A dozen or so other programs have already been started at schools including Columbia University and the University of California, San Francisco. Cloudera, a company that sells software to process and organize large volumes of data, announced in April that it would work with seven universities to offer undergraduates professional training on how to work with “big data” technologies.

Cloudera’s education program director, Mark Morissey, says a skills shortage is looming and that “the market is not going to grow at the rate it currently wants to.” That has driven salaries up. In Silicon Valley, salaries for entry-level data scientists are around $110,000 to $120,000.

Others think the trend could create a new area of outsourcing. Shashi Godbole, a data scientist in Mumbai, India, who is ranked 20th on Kaggle’s scoreboard, recently completed a Kaggle-arranged hourly consulting gig, a new business the platform is getting into. He did work for a tiny health advocacy nonprofit located in Chicago and is now bidding on more jobs (he earns $200 per hour, and Kaggle collects $300 an hour). His Kaggle work is part time for now, but he says it’s possible that it could be his major source of income one day.

To the data scientists themselves, the job is certainly less sexy than it’s being made out to be. Josh Wills, a senior director of data science at Cloudera, says most of the time it involves cleaning up messy data—for example, by putting it in the right columns and sorting it.

“I’m a data janitor. That’s the sexiest job of the 21st century,” he says. “It’s very flattering, but it’s also a little baffling.”



Harnham blog & news

With over 10 years' experience working solely in the Data & Analytics sector, our consultants are able to offer detailed insights into the industry.

Visit our News & Blogs portal or check out our recent posts below.

The Surprising Collaboration of Ada Lovelace, Charles Babbage, and Alan Turing

What do you get when you combine Amelia Earhart with Ada Lovelace? A Data Visualization Engineer ready to work with an aviation industry partner. Reaching new heights and shattering the glass ceiling is the modus operandi for many women, and what better role models than these two? They were creative, free-spirited, pioneering, and well ahead of their time in thought and action.

Ada Lovelace, now credited as the first computer programmer, saw beyond the automatons of her day. She saw beyond the Berullean language of the text in front of her that she was translating. A poet father and a passion for numbers collided in her thoughts, and as we marvel at AI making art, writing stories and music, and winning strategy games, we have one lady to thank: Ada. She might also be called the first Data Visualization Engineer. Don’t you think?

Insightful Business Decisions are Key in Collaboration

Data professionals are no longer siloed from other departments in business, which allows for collaboration between teams. When technical and non-technical employees work in partnership, businesses can be sure their teams have a single vision to help realize business objectives and goals.

The collaboration between Ada Lovelace and Charles Babbage may not have been business-related, but the ideas are the same. He passed her the document and asked her to translate; she made notes, and those notes have made history. Together they created a vision for the Analytical Engine. It exists only on paper, but its design, layout, and potential implementation have been realized in ways unimaginable to most 100 years ago.

Ada’s mathematical prowess was such that she wrote her notes in easily explainable language. She worked closely with Charles Babbage and wrote in earnest to work with Michael Faraday. She reached out to others in her field; some accepted, others didn’t.

How Data Helps Inform the Future

Whether you use predictive modeling, machine learning, natural language processing, or some combination of each, the data you collect helps to inform the future. We may often quote the old adage that those who don’t know their history are doomed to repeat it, but history has a shining light as well: collaboration across the ages.

Consider this. Alan Turing, the man who worked at Bletchley Park on the Enigma machine, used the notes he found to help him solve the problem. Those notes belonged to Ada Lovelace. The information she set to paper informed every stage of computer programming leading to what we know today as Artificial Intelligence: machines that could learn and ‘think,’ not just the automatons of her age, which had been ‘programmed to perform.’

The Enchantress of Numbers

Known as the Enchantress of Numbers, the pioneering Ada Lovelace shares the spotlight with other pioneering women in the sciences. Think Madame Curie, Joan Clarke, even Hedy Lamarr, and of course Amelia Earhart. They weren’t of the same eras, but each of their contributions has added to what we know as Science, Technology, Engineering, and Mathematics (STEM). We have a name for it now, but it’s always been around. And the collaborative efforts of women everywhere are growing, increasing diversity and inclusion in many businesses across the world.

At the heart of it all, in the beginning, a surprising and time-defying collaboration began. It set in motion a spark of business intelligence and insight as men and women mentored and partnered for the sake of their vision of the future. Who will be remembered one hundred years from now?

If you’re interested in Big Data, Web Analytics, Marketing & Insight, Life Science Analytics, and more, check out our current vacancies or contact one of our recruitment consultants to learn more.

For our West Coast Team, contact us at (415) 614 - 4999 or send an email to sanfraninfo@harnham.com.
For our Mid-West and East Coast teams contact us at (212) 796-6070 or send an email to newyorkinfo@harnham.com.  

Computer Vision in Healthcare Beyond Covid-19

2020. It sounds like the name of a futuristic science-fiction movie or TV show, doesn’t it? Maybe it is. And like our favorite sci-fi flicks, there are cutting-edge changes happening in real time. We’re the characters in this story, and the Computer Vision and Artificial Intelligence partnerships in healthcare are moving fast to help us take care of ourselves.

When computers can see what we can’t. When AI can help us make more informed decisions. When the two are combined to help doctors and providers work more efficiently to save lives, that’s when the cutting edge shines.

From the collaboration of Johns Hopkins, the CDC, and the WHO mapping out the data, to contact tracers, to medical professionals on the front lines, we’ve been focused on one thing: saving lives. But what about the other medical issues that affect us? Heart disease. Cancer. Neurological illnesses. What if the latest advances in healthcare could help here, too?

Five Ways Computer Vision Helps Healthcare Providers

- Identifies leading causes of medical illnesses in a time-sensitive manner by creating algorithms for image processing, classification, segmentation, and object detection.
- Develops deep learning models to create neural networks.
- Supports teams of scientists working together to advance projects and present findings to business leaders, stakeholders, and clients.
- Allows providers to spend more time with their patients.
- Optimizes medical diagnoses using deep learning so doctors can spend more time with patients and see and solve the problem faster.

Computer Vision Engineer Meets AI Professional

Artificial Intelligence (AI) offers the real-world answers in healthcare that the world needs today. Computer Vision Engineers build the means by which AI helps providers, patients, and leaders make informed decisions.

Core requirements for both roles include, but aren’t limited to:

- Experience in machine learning and deep learning.
- Knowledge of how to build computer vision algorithms and probability models.
- Problem-solving skills, creativity, ingenuity, and innovation.
- Languages and tools like Python, R, Java, Hadoop, and Spark.
- The ability to see the big picture while at the same time finding the devil in the details.

Both roles are always striving to improve, to make better, to advance the technology within the industry.

The Challenges and the Potential of Technology in Healthcare

At the moment, Computer Vision, AI, and other healthcare technology models are localized to individual placements. The next step is to have these technologies ‘speak’ to each other across hospitals, providers’ offices, telehealth applications, and electronic health records management for a more cohesive benefit of care.

As this year draws to a close, we know the vulnerabilities of our healthcare system, and can find solace in the thought that technology is bringing it forward at lightning speed. Automation and telehealth appointments have made it a breeze to talk to our doctors and get results faster. We can pay our bills with the click of a button and even carve out a payment plan, if need be. All without leaving our homes.

The data now available to us and our providers offers a foundation, a benchmark of information, so our doctors can make more informed decisions. This data goes beyond the individual. It helps set a precedent not only for individuals but also for entire populations, to help us identify future health issues, epidemics, and pandemics. Stored data is private and stays within the hospital or doctor’s office that holds it, but from it we can create models to plan for the future.

Want to make your mark in the healthcare and tech industry? We may have just the role for you. Check out our current vacancies or get in touch with one of our expert consultants to learn more.

A Slam-Dunk Career as a SLAM Engineer

Philadelphia. It’s known for its Philly Cheesesteak, the Liberty Bell, and as the place where the Constitution was signed. Always on the cutting edge, Philadelphia is a land of firsts. You may or may not know this, but one of its firsts was to have the first general-purpose computer in 1946. Is it any wonder, then, that a company there is building robots to navigate GPS-denied environments and was founded by leaders in the Computer Vision space?

Beyond the Roomba

Consider the Roomba, the autonomous vacuum that sweeps up pet hair, dirt, and other unwanted debris. How does it know where to go? How does it know to go under a table or chair, or around a wall to the next room? How does it know to avoid the dog, cat, or you? On nearly the smallest scale, this little round machine is a personal version of simultaneous localization and mapping (SLAM).

However, the computational geometry behind this mapping and localization technique extends to a wide variety of applications. Here are a few to get you thinking:

- GPS Navigation Systems
- Self-driving cars
- Unmanned Aerial Vehicles (UAV)
- Autonomous Underwater Vehicles (AUV)
- Drones
- Robots
- Virtual Reality (VR)
- Augmented Reality (AR)
- Monocular Camera
- ...and more

There’s even a version used in the Life Sciences called RatSLAM. But we’ll visit that in another article. The uses and benefits of this simultaneous localization and mapping technique are extensive, even with some of the challenges posed by Audio-Visual and Acoustic SLAM.

What is SLAM?

Essentially, it is the 21st century version of cartography or mapping. Except in this case, not only can it map the environment, but it can also locate your place in it.

When you want to know where the nearest restaurant is, you simply type in ‘restaurant near me.’ Soon, a list appears on your phone, radiating outward from the nearest location. Imagine you’re lost on a hike: you manage to find signal, and soon your GPS is offering directions on which way to move toward civilization.

This is Simultaneous Localization and Mapping. It locates you, your vehicle, a robot, drone, unmanned aerial vehicle, or self-driving car and points people and things in the direction it thinks they want to go or should go to get to safety.

While mapping is at the epicenter of SLAM Computer Vision Engineering, there are other elements within the field as well. But let’s begin with mapping. Topological maps capture the connectivity of your environment rather than its exact geometry, and can therefore help ensure consistency on a global scale.

Sensor models come in two kinds. Landmark-based approaches, just as humans do when giving directions, use landmarks to make it easier to determine your location within the map’s structure. Raw-data approaches make no assumptions about landmarks. Landmarks such as wifi or radio beacons are some of the easiest to locate, but may not always be correct, which is where the raw-data approach comes in to offer its two cents as a model of location.

Four Challenges of SLAM

- GPS sensors may not function properly in chaotic environments such as military conflict.
- Non-static environments, such as those with pedestrians or high-traffic areas with multiple vehicles, make locations difficult to pinpoint.
- In Acoustic SLAM, challenges include inactivity and environmental noise as well as echo.
- Sound localization requires a robot or machine to be equipped with a microphone in order to go in the requested direction.

Five Additional Forms of SLAM

- Tactile (sensing by touch)
- Radar
- Acoustic
- Audio-Visual (a function of Human-Robot interaction)
- Wifi (sensing strength of nearby access points)

Ready to Explore a Robotics and Computer Vision Career?

Whether you’re interested in a slam-dunk career as a SLAM Engineer or looking for your first or next role in Big Data, Web Analytics, Advanced Analytics & Insight, Life Science Analytics, or Data Science, take a look at our current vacancies or get in touch with one of our expert consultants to learn more.

How Machine Learning and AI Can Help Us See the Forest for the Trees

In the early days of 2020, Johns Hopkins, the CDC, the WHO, and a host of other public organizations banded together in collaboration. They were on a mission to ensure the world had real-time information about a virus that would forever change the course of this year and the years to come. Which is great for those families with a computer in every home or every person with smartphone access. But what about the rest of the world? How do you ensure that the lives of people without access to basic needs can be improved?

A health non-profit using AI and Machine Learning is aiming to do just this. But the data is vast, and the sheer numbers of people need to be corralled by someone into something the computers can read and make decisions on. Who would have thought Public Research and Data Science would come together in such a manner and at such an important time?

Three Benefits of Data Science and Machine Learning in Healthcare

In a seminar given in September 2019, two research scientists explained to the CDC the promises and challenges of using Big Data for public health initiatives. After explaining a few definitions and making correlations, the focus was soon on the benefits.

- The focus of Machine Learning is to learn data patterns.
- From the initial focus, patterns can then be validated to ensure they make sense.
- These patterns and their validation can find links between seemingly uncorrelated factors, such as the relationship between one’s environment and one’s genetics.

To the scientists working with these scenarios, the decisions seem simple. Yet when it comes to explaining them to laymen like policymakers, there can be a shift in understanding. This shift can lead to arbitrary and different findings, which can affect medical decision making. Why? Could it be that using Random Forests to link the data is confusing?

Data Classification is Not as Cut-and-Dried as a Work Flow or Org Chart

If someone shows us a work flow or organizational chart, we understand immediately which task is to be done in which order or who reports to whom. But in trying to link uncorrelated bits of information using decision trees, it can seem more like abstract art, more subjective than direct. Yet it is those correlations which answer the bigger questions brought to bear by Research Scientists, Public Health Researchers, Data Scientists, and AI working together to see the bigger picture.

Decision trees, ultimately, are the great classifier. But there are a few things which need to be in place first. And in the random forest model it’s not just one decision tree, it’s many. This is definitely a case where, if done right, you will see the forest for the trees and at the same time be able to determine patterns in those trees. A bit counter-intuitive, but this is what stretches our minds to see correlations and patterns we might not see otherwise, don’t you think? So, what do you need to help make predictions?

Two Important Needs to Help Make Predictions

- Predictive power. The features you employ should make some sense. For example, without a basic knowledge of cooking, you can’t just throw random items from your refrigerator into a pot and expect it to taste good. Unless, of course, you’re making soup and all you have to do is add water.
- The trees and their predictions should be uncorrelated. If you’ve ever seen M. Night Shyamalan’s Lady in the Water, there’s a little boy who can ‘read’ cereal boxes and tell a coherent story. A predictive, coherent story. This is the layman’s version of random forests, their predictive nature, and ultimately, the scientists who can ‘read’ and explain the patterns.
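To make the two needs above a little more concrete, here is a minimal Python sketch. It is not a real random forest: each "tree" is simulated as a voter that is right with some fixed probability, and the function name `majority_vote_accuracy`, the 101 voters, and the 60 percent figure are all assumptions chosen for illustration. It only shows why a majority vote helps when the voters have some predictive power and make their mistakes independently.

```python
import random


def majority_vote_accuracy(n_trees, p_correct, correlated, n_trials=2000, seed=0):
    """Estimate how often a majority vote of n_trees simulated voters is right.

    Each voter stands in for one tree and is right with probability p_correct.
    This is a toy simulation, not a trained model.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        if correlated:
            # Every tree makes the same call, so the vote is just one guess.
            votes = [rng.random() < p_correct] * n_trees
        else:
            # Each tree makes its own independent call.
            votes = [rng.random() < p_correct for _ in range(n_trees)]
        if sum(votes) > n_trees / 2:
            hits += 1
    return hits / n_trials


# Weak but useful trees whose errors are independent of each other.
independent = majority_vote_accuracy(101, 0.6, correlated=False)
# The same trees, but all making identical predictions.
identical = majority_vote_accuracy(101, 0.6, correlated=True)
# Independent trees with no predictive power at all.
coin_flip = majority_vote_accuracy(101, 0.5, correlated=False)

print(f"independent trees: {independent:.2f}")
print(f"identical trees:   {identical:.2f}")
print(f"coin-flip trees:   {coin_flip:.2f}")
```

In this toy setup, the independent voters should land well above any single voter, the identical voters should do no better than one voter alone, and the coin-flip voters should stay near chance however many are added.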
If you're looking for your first or next role in Big Data, Web Analytics, Marketing & Insight, Life Science Analytics, and more, check out our current vacancies or contact one of our recruitment consultants to learn more.
