Director of Machine Learning
San Francisco, California / $210000 - $250000
Bay Area, CA $210,000 - $250,000 + Competitive Benefits
- COMPANY: A growing start-up in the AI semiconductors space
- TEAM: Grow and lead a team of top PhD research machine learning scientists
- CULTURE: Inclusive and diverse work culture and environment
As the Director of ML, you will…
- Build and lead a strong team of PhD machine learning scientists
- Build advanced end-to-end solutions using techniques in computer vision, deep learning, reinforcement learning, time series analysis, and predictive maintenance
- Lead research and innovation efforts for the company in the ML space
- Drive the vision and strategy for the entire team and deliver high impact solutions using AI
- Work directly with leadership teams, stakeholders, and clients
YOUR SKILLS AND EXPERIENCE
- 6+ years of full-time experience in data science and machine learning
- 3+ years of experience in leadership
- Hands-on experience building end-to-end machine learning products
- Hands-on experience deploying machine learning into real-world applications
- Experience with deep learning, computer vision, reinforcement learning
- ML research experience in both academia and industry
- Huge plus - Experience in semiconductors
- Strong communication skills, experience working with C-level stakeholders and implementing strategy
- Tools: Python, AWS, TensorFlow, PyTorch, OpenCV
As the Director of ML, you can expect a base salary between $210,000 and $250,000 (based on experience) plus competitive benefits.
HOW TO APPLY
Please register your interest by sending your CV to Kristianna Chung via the Apply link on this page
Battle Royale: Computational Biologists vs Machine Learning Engineers | Harnham US Recruitment post
From the first genome sequencing in the second revolution to Life Science Analytics as a growing field in the fourth industrial revolution, change has been both welcomed and fraught with fear. Everyone worries about robots, Artificial Intelligence, and in some cases even professionals who have stayed current by keeping up-to-date with trends. And it’s beginning to affect not only “office politics” within the tech space, but even interviewer and interviewee relationships.
We’ve seen a growing trend of apprehension between Computational Biologists and Machine Learning Engineers. What could be the cause? Aren’t they each working toward a common goal? It seems the answer isn’t quite so cut and dried as we’d like it to be. Here are some thoughts on what could be driving this animosity. But first, a bit of background.
So, What’s the Difference?
Computational Biology and Machine Learning are two sides of the same coin; one sets the framework and the other applies what’s been learned. Both use statistical and computational methods to construct models from existing databases to create new Data.
However, it is in framing biomedical problems as computational problems that there seems to be a bit of a breakdown. It’s one thing to have all the information and all the Data, but it’s quite another to know how the Data might interact or affect the health and medications of people seeking help. This is the job of those in Life Science Analytics: determine through Data what needs to be done, quickly and efficiently, while ensuring the human element is still active.
The kinds of data handled in Computational Biology include concentrations, sequences, and images, and they are used in such areas as Algorithmics, Robotics, and Machine Learning. Machine Learning, for its part, can help to classify spam emails, recognize human speech, and more.
Here’s a good place to start if you’d like to take a deeper dive into the differences between the two or read this article about mindsets and misconceptions.
Office Politics in the Tech Space
Let’s circle back to the concern between Computational Biologists and Data Scientists with a focus on Machine Learning. The latest around the water cooler within the tech space is that those with a biological background who understand Machine Learning are looked upon as dangerous to the status quo. But, as many of our candidates know, it’s important to stay on the cutting edge, and if that means upskilling in Machine Learning so you have both the human element and the mathematical, robotic components, then that is more marketable than having just one or the other.
The learning curve in biology training within the Life Science Analytics space means a Computational Biologist with a Machine Learning skillset is best able to apply Data Science and computer science tools to more organic and biological datasets. Someone with just a computer science background may not have the depth of knowledge to understand how these models, systems, and data affect and impact medicine.
Computational Biologists who are trained simultaneously in computer science and biology, and are a little heavier on the biology side, see Machine Learning Engineers as a threat because utilizing Machine Learning and other cutting-edge tools could mean their job is on the line.
They worry their job will fall by the wayside. If somebody proves Machine Learning is faster and more efficient, the question might become: why hire a Computational Biologist when a Machine Learning Engineer will do?
It’s like when a lot of people joke about how robots are going to take over the world and everybody will be out of a job.
I think the worry with some folks on the Computational Biology side is that maybe they just aren’t up to date with their training or haven’t kept up with the cutting edge of technology.
With a Recruiter’s Eye
What I’ve seen agrees that, yes, Machine Learning is incredibly helpful and fast, and you can get through so much more data. But it’s still that understanding of biology and chemistry that you will need, because you need to be able to understand, for example, how these proteins are going to react with one another, or how DNA and RNA work, how best to analyze them, and what analyzing those things means.
On the other hand, just because you know, “oh, this reaction comes out of it”, if you don’t know why that is or how that could impact a drug or a person, then you don’t really have anything to go on. There’s a caveat there.
Though there may be concerns among Computational Biologists and Machine Learning Engineers, at both the upper and entry levels, it’s still the technical lead who will say, “we really do need somebody with a biological background because if we get all this Data and don’t really know what to do with it, then we’ll need to hire a Project Manager to converse between the two and that’s an inefficient use of time and resources”.
What I hear most often is that a company wants a Computational Biologist but also wants someone who knows Machine Learning. But they don’t want to compromise on either because they don’t understand there are limitations to things. We all want the unicorn employee, but we can’t make them fit into a box with too specific parameters.
It’s a Fact of Life
Any job, whether it’s in the tech industry, the food industry, Ad Optimization, or even recruitment, uses Machine Learning in one way or another. Yet compared to spaces which work on sequencing the human genome, it’s amazing to see how far things have come. It used to take days to process DNA.
Now you can spit in a tube and send it off to 23andMe to learn a little about your health. That’s what Machine Learning enables people to do.
But it doesn’t mean Computational Biologists are going to fall by the wayside. It means there will be times you’ll have to liaise more between the two groups. It means you’ll be more marketable by adding Machine Learning to the work you’re already doing or taking some classes in Computational Science, for example, to keep your skills up to date.
It’s a Transparency Issue
Ultimately, it seems the heart of this apprehension comes down to a transparency issue. For example, let’s say companies begin to bring in AI people and suddenly the staff already in place begins to get worried about the security of their jobs. Even in an industry tense with skills gaps, the fear still abounds.
In coming back to speak with the Hiring Manager, it became clear the animosity is even more prevalent than first imagined. So, it’s important to get input from within the company and develop a unified story, a unified message across departments, and especially within the Life Science Analytics and Data Science teams as well. In other words, “keep people in the loop.”
If it’s happening to this company, it seems other companies may be facing this same issue. However, it’s not going away and is creating a kind of competition between the old guard and the up-and-coming startups. For example, any new company is going to want to integrate AI and will be asking the question how best to integrate it into their structure. They might also ask how best to optimize the ads coming through AI. This is just another way of how companies are catching up, but also how people are catching up to the companies.
Technology is coming whether you like it or not. So, if you want to stay marketable and work on really interesting projects, there’s always going to be the challenge of staying up-to-date and different companies attack this in different ways.
Stay open-minded, and keep an eye and an ear out for ways to stay on top of your game. Even just take a few minutes to watch a YouTube video or listen to a TED Talk or a podcast, so you can talk about it and be informed. These are some really simple ways to stay on the cutting edge and help you figure out where you can grow and improve for better opportunities.
Ready for the next step? Check out our current vacancies or contact one of our recruitment consultants to learn more. For our West Coast Team, call (415) 614 – 4999 or send an email to firstname.lastname@example.org. For our Mid-West and East Coast Teams, call (212) 796 – 6070 or send an email to email@example.com.
Data Science Interview Questions: What The Experts Say | Harnham Recruitment post
Our friends at Data Science Dojo have compiled a list of 101 actual Data Science interview questions that were asked between 2016 and 2019 at some of the largest recruiters in the Data Science industry: Amazon, Microsoft, Facebook, Google, Netflix, Expedia, etc. Data Science is an interdisciplinary field and sits at the intersection of computer science, statistics/mathematics, and domain knowledge. To be able to perform well, one needs to have a good foundation in not one but multiple fields, and this is reflected in the interview. They’ve divided the questions into six categories:
- Machine Learning
- Data Analysis
- Statistics, Probability, and Mathematics
- Programming
- SQL
- Experiential/Behavioural Questions
Once you’ve gone through all the questions, you should have a good understanding of how well you’re prepared for your next Data Science interview.
Machine Learning
As one would expect, Data Science interviews focus heavily on questions that help the company test your grasp of machine learning concepts, applications, and experience. Each question included in this category has been recently asked in one or more actual Data Science interviews at companies such as Amazon, Google, Microsoft, etc. These questions will give you a good sense of what sub-topics appear more often than others. You should also pay close attention to the way these questions are phrased in an interview.
- Explain Logistic Regression and its assumptions.
- Explain Linear Regression and its assumptions.
- How do you split your data between training and validation?
- Describe Binary Classification.
- Explain the working of decision trees.
- What are different metrics to classify a dataset?
- What’s the role of a cost function?
- What’s the difference between a convex and a non-convex cost function?
- Why is it important to know the bias-variance trade-off while modeling?
- Why is regularisation used in machine learning models? What are the differences between L1 and L2 regularisation?
- What’s the problem of exploding gradients in machine learning?
- Is it necessary to use activation functions in neural networks?
- In what aspects is a box plot different from a histogram?
- What is cross validation? Why is it used?
- Can you explain the concept of false positive and false negative?
- Explain how SVM works.
- While working at Facebook, you’re asked to implement some new features. What type of experiment would you run to implement these features?
- What techniques can be used to evaluate a Machine Learning model?
- Why is overfitting a problem in machine learning models? What steps can you take to avoid it?
- Describe a way to detect anomalies in a given dataset.
- What are the Naive Bayes fundamentals?
- What is AUC – ROC Curve?
- What is K-means?
- How does the Gradient Boosting algorithm work?
- Explain advantages and drawbacks of Support Vector Machines (SVM).
- What is the difference between bagging and boosting?
- Before building any model, why do we need the feature selection/engineering step?
- How to deal with unbalanced binary classification?
- What is the ROC curve and the meaning of sensitivity, specificity, confusion matrix?
- Why is dimensionality reduction important?
- What are hyperparameters, how to tune them, how to test and know if they worked for the particular problem?
- How will you decide whether a customer will buy a product today or not given the income of the customer, location where the customer lives, profession, and gender? Define a machine learning algorithm for this.
- How will you inspect missing data and when are they important for your analysis?
- How will you design the heatmap for Uber drivers to provide recommendation on where to wait for passengers? How would you approach this?
- What are time series forecasting techniques?
- How does a logistic regression model know what the coefficients are?
- Explain Principal Component Analysis (PCA) and its assumptions.
- Formulate Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) techniques.
- What are neural networks used for?
- Why is gradient checking important?
- Is random weight assignment better than assigning same weights to the units in the hidden layer?
- How to find the F1 score after a model is trained?
- How many topic modeling techniques do you know of? Explain them briefly.
- How does a neural network with one layer and one input and output compare to a logistic regression?
- Why is Rectified Linear Unit (ReLU) a good activation function?
- When using the Gaussian mixture model, how do you know it’s applicable?
- If a Product Manager says that they want to double the number of ads in Facebook’s Newsfeed, how would you figure out if this is a good idea or not?
- What do you know about LSTM?
- Explain the difference between generative and discriminative algorithms.
- Can you explain what MapReduce is and how it works?
- If the model isn’t perfect, how would you like to select the threshold so that the model outputs 1 or 0 for label?
- Are boosting algorithms better than decision trees? If yes, why?
- What do you think are the important factors in the algorithm Uber uses to assign rides to drivers?
- How does speech synthesis work?
Data Analysis
Machine Learning concepts are not the only area in which you’ll be tested in the interview. Data pre-processing and data exploration are other areas where you can always expect a few questions. We’re grouping all such questions under this category. Data Analysis is the process of evaluating data using analytical and statistical tools to discover useful insights. Once again, all these questions have been recently asked in one or more actual Data Science interviews at the companies listed above.
- What are the core steps of the data analysis process?
- How do you detect if a new observation is an outlier?
- Facebook wants to analyse why the “likes per user and minutes spent on a platform are increasing, but total number of users is decreasing”. How can they do that?
- If you have a chance to add something to Facebook then how would you measure its success?
- If you are working at Facebook and you want to detect bogus/fake accounts, how will you go about that?
- What are anomaly detection methods?
- How do you solve for multicollinearity?
- How to optimise marketing spend between various marketing channels?
- What metrics would you use to track whether Uber’s strategy of using paid advertising to acquire customers works?
- What are the core steps for data preprocessing before applying machine learning algorithms?
- How do you inspect missing data?
- How does caching work and how do you use it in Data Science?
Statistics, Probability and Mathematics
As we’ve already mentioned, Data Science builds its foundation on statistics and probability concepts. A strong grounding in both is a requirement for Data Science, and these topics are always brought up in data science interviews. Here is a list of statistics and probability questions that have been asked in actual Data Science interviews.
- How would you select a representative sample of search queries from 5 million queries?
- Discuss how to randomly select a sample from a product user population.
- What is the importance of Markov Chains in Data Science?
- How do you prove that males are on average taller than females by knowing just gender or height?
- What is the difference between Maximum Likelihood Estimation (MLE) and Maximum A Posteriori (MAP)?
- What does P-Value mean?
- Define the Central Limit Theorem (CLT) and its application.
- There are six marbles in a bag, one is white. You reach in the bag 100 times. After drawing a marble, it is placed back in the bag. What is the probability of drawing the white marble at least once?
- Explain Euclidean distance.
- Define variance.
- How will you cut a circular cake into eight equal pieces?
- What is the law of large numbers?
- How do you weigh nine marbles three times on a balance scale to select the heaviest one?
- You call three random friends who live in Seattle and ask each independently if it’s raining. Each of your friends has a 2/3 chance of telling you the truth and a 1/3 chance of lying. All three say “yes”. What’s the probability it’s actually raining?
- Explain a probability distribution that is not normal and how to apply it.
- You have two dice. What is the probability of getting at least one four? Also find out the probability of getting at least one four if you have n dice.
- Draw the curve log(x+10)
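Several of the probability questions above reduce to the same complement rule: the chance of at least one success in n independent tries is 1 - (1 - p)^n. The following Python sketch is one way to work through the marble, dice, and Seattle questions; it is an illustration, not the answer key from the original article. The function names are hypothetical, and the Seattle question has no single answer unless you pick a prior probability of rain, so the 1/2 prior used here is purely an assumption.

```python
from fractions import Fraction


def at_least_once(p_single, trials):
    """Probability of at least one success in `trials` independent tries."""
    return 1 - (1 - p_single) ** trials


# Marble question: one white marble out of six, 100 draws with replacement.
marble = at_least_once(Fraction(1, 6), 100)

# Dice question: at least one four with two dice, then with n dice.
two_dice = at_least_once(Fraction(1, 6), 2)


def n_dice(n):
    """At least one four when rolling n fair dice."""
    return at_least_once(Fraction(1, 6), n)


def rain_posterior(prior, p_truth=Fraction(2, 3), yes_answers=3):
    """Bayes' rule for the Seattle question.

    `prior` is the assumed probability of rain before calling anyone;
    the question does not supply it, so the caller has to choose one.
    """
    like_rain = p_truth ** yes_answers        # all friends tell the truth
    like_dry = (1 - p_truth) ** yes_answers   # all friends lie
    return prior * like_rain / (prior * like_rain + (1 - prior) * like_dry)


print(float(marble))                    # very close to 1
print(two_dice)                         # 11/36
print(rain_posterior(Fraction(1, 2)))   # 8/9 under the assumed 1/2 prior
```

In an interview it is worth stating the prior assumption out loud; with a different prior p, the same function gives 8p / (7p + 1).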
Programming
When you appear for a data science interview, your interviewers are not expecting you to come up with highly efficient code that uses minimal hardware resources and executes quickly. However, they do expect you to be able to use R, Python, or SQL so that you can access the data sources and at least build prototypes for solutions.
You should expect a few programming/coding questions in your data science interviews. Your interviewer might want you to write a short piece of code on a whiteboard to assess how comfortable you are with coding, as well as get a feel for how many lines of code you typically write in a given week. Here are some programming and coding questions that companies like Amazon, Google, and Microsoft have asked in their Data Science interviews.
- Write a function to check whether a particular word is a palindrome or not.
- Write a program to generate the Fibonacci sequence.
- Explain string parsing in the R language.
- Write a sorting algorithm for a numerical dataset in Python.
- Coding test: moving average. Input: 10, 20, 30, 10, … Output: 10, 15, 20, 17.5, …
- Write Python code to return the count of words in a string.
- How do you find a percentile? Write the code for it.
- What is the difference between (i) Stack and Queue and (ii) Linked list and Array?
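As a rough idea of the level of code these questions call for, here is a minimal Python sketch of two of them: the palindrome check and the moving average test. The function names are hypothetical. Note that the sample output in the question (10, 15, 20, 17.5) matches a running, or cumulative, mean rather than a fixed-window average, so that is the interpretation assumed here.

```python
def is_palindrome(word):
    """Return True if `word` reads the same forwards and backwards.

    Case is ignored; this is an assumption, since the question does not say.
    """
    cleaned = word.lower()
    return cleaned == cleaned[::-1]


def running_average(values):
    """Return the cumulative mean after each element of `values`."""
    averages = []
    total = 0.0
    for count, value in enumerate(values, start=1):
        total += value
        averages.append(total / count)
    return averages


print(is_palindrome("Level"))
print(running_average([10, 20, 30, 10]))
```

If the interviewer instead wants a fixed window of size k, the usual change is to subtract the value that falls out of the window before dividing by k, which is worth clarifying before you start writing.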
Structured Query Language (SQL)
Real-world data is stored in databases and it ‘travels’ via queries. If there’s one language a Data Science professional must know, it’s SQL – or “Structured Query Language”. SQL is widely used across all job roles in Data Science and is often a ‘deal-breaker’. SQL questions are placed early on in the hiring process and used for screening. Here are some SQL questions that top companies have asked in their Data Science interviews.
- How would you handle NULLs when querying a data set?
- How will you explain JOIN function in SQL in the simplest possible way?
- Select all customers who purchased at least two items on two separate days from Amazon.
- What is the difference between DDL, DML, and DCL?
- Why is Database Normalisation Important?
- What is the difference between clustered and non-clustered index?
Situational/Behavioural Questions
Capabilities don’t necessarily guarantee performance. It’s for this reason employers ask you situational or behavioural questions in order to assess how you would perform in a given situation. In some cases, a situational or behavioural question would force you to reflect on how you behaved and performed in a past situation. A situational question can help interviewers in assessing your role in a project you might have included in your resume, can reveal whether or not you’re a team player, or how you deal with pressure and failure. Situational questions are no less important than any of the technical questions, and it will always help to do some homework beforehand. Recall your experience and be prepared! Here are some situational/behavioural questions that large tech companies typically ask:
- What was the most challenging project you have worked on so far? Can you explain your learning outcomes?
- According to your judgement, does Data Science differ from Machine Learning?
- If you’re faced with Selection Bias, how will you avoid it?
- How would you describe Data Science to a Business Executive?
If you’re looking for a new Data Science role, you can find our latest opportunities here. This article was written by Tooba Mukhtar and Rahim Rasool for Data Science Dojo. It has been republished with permission. You can view the original article, which includes answers to the above questions here.
Weekly News Digest: 10th – 14th January 2022 | Harnham Recruitment post
This is Harnham’s weekly news digest, the place to come for a quick breakdown of the week’s top news stories from the world of Data & Analytics.
PYMNTS.com: Fighting fraudulent transactions, by the numbers
How are banks using AI and other tools to curb transaction fraud?
In 2021, PYMNTS interviewed banking executives to determine how acquiring banks use artificial intelligence (AI) and effective merchant monitoring to combat credit, debit, and prepaid card fraud. In this piece, it shares the results of the interviews:
- Most acquiring banks say fraudulent transactions increased between 2020 and 2021. 93 per cent of those surveyed said they saw a year-to-year increase in fraud. 88 per cent said reducing fraud is critical to their ability to increase or maintain merchant processing revenue.
- Most banks that use AI use it for fraud detection. Almost all (98 per cent) of acquirers using AI said they use it for fraud detection. 60 per cent said AI is the best tool for them to detect fraud, while another 15 per cent said it’s an important weapon.
- Most banks outsource this work. Fraud detection is too important for some banks to spend years developing their own complex system. So, 92 per cent of banks that use AI systems for fraud prevention and detection said they outsource the systems.
To read more about this, click here.
Analytics Insight: Top Python machine learning libraries to explore in 2022
What Machine Learning libraries should you be focusing on this year?
Python is the most popular programming language for data science projects, while machine learning is globally trending. According to Analytics Insight, Python, with its machine learning libraries, has become the language for implementing machine learning algorithms. So, to fully understand Data Science and Machine Learning, Python is essential.
Here are the top Python machine learning libraries to help you begin your Python journey, and what they’re most useful for:
- TensorFlow: an open-source numerical computing library for machine learning based on neural networks.
- PyTorch: used for natural language processing, computer vision, and other similar kinds of tasks.
- Keras: machine learning toolset that aids companies such as Square, Yelp and Uber.
- Orange3: includes tools for machine learning, data mining, and data visualisation.
- NumPy: includes robust computing capabilities within the large, high performance programming community.
- SciPy: a core tool for accomplishing mathematical, scientific and engineering computations.
- SciKit-Learn: an indispensable part of the technology stacks of Booking.com, Spotify, OkCupid, and others.
- Pandas: has powerful data frames and flexible data handling.
- Matplotlib: replaces the need to use the proprietary MATLAB statistical language.
- Theano: allows for simultaneous computing, fast execution speed and optimised stability.
To read more about this, click here.
Analytics India Mag: Why should data engineers learn Scala?
Is Scala beneficial to a Data Engineer?
Scala combines object-oriented and functional programming in one concise, high-level language, and its static types help avoid bugs in complex applications. Scala does have some key advantages such as its use of data-parallel operations, simple structure suitable for big data processors, and its high-volume capabilities. On the other hand, the article points out why Scala might not be beneficial to a Data Engineer:
- Difficult to learn
- Not widely adopted
- Only 10 per cent of jobs require Scala knowledge
While Scala does not occupy the same level of importance as other popular languages, it’s certainly a useful language to learn if it matches a data engineer’s career goals.
To read more about this, click here.
Forbes: Data analytics marathon – why your organisation must focus on the finish
In this Forbes piece, the author compares analytics to a marathon: both take commitment, preparedness, and endurance to be successful. A company’s analytics will go through several cycles as business priorities shift and evolve. They are explained here as milestones of the Data & Analytics marathon:
- Data collection
- Data preparation
- Data visualisation
- Data analysis
- Insight communication
- Take action
The author, Brent Dykes, notes that many drop off at the last mile in the race, the action phase where analytics teams perform analysis, share their insights and then implement changes to optimise the business. Most companies have no problem with the start of the data analytics marathon, but many of them aren’t completing the entire race. In order to finish the data analytics race in a strong position, companies and analytics teams must align the data with the business strategy and follow these three steps:
- Automate early-stage tasks
- Narrow the scope
- Foster a stronger data culture
To read more about this, click here.
We’ve loved seeing all the news from Data & Analytics in the past week; it’s a market full of exciting and dynamic opportunities. To learn more about our work in this space, get in touch with us at firstname.lastname@example.org.
CAN’T FIND THE RIGHT OPPORTUNITY?
If you can’t see what you’re looking for right now, send us your CV anyway – we’re always getting fresh new roles through the door.