With over 10 years' experience working solely in the Data & Analytics sector, our consultants are able to offer detailed insights into the industry.
Visit our Blogs & News portal or check out our recent posts below.
Our friends at Data Science Dojo have compiled a list of 101 actual Data Science interview questions that were asked between 2016 and 2019 at some of the largest recruiters in the Data Science industry: Amazon, Microsoft, Facebook, Google, Netflix, Expedia, etc. Data Science is an interdisciplinary field that sits at the intersection of computer science, statistics/mathematics, and domain knowledge. To perform well, one needs a good foundation in not one but multiple fields, and this is reflected in the interview.

They've divided the questions into six categories:
Machine Learning
Data Analysis
Statistics, Probability, and Mathematics
Programming
SQL
Experiential/Behavioural Questions

Once you've gone through all the questions, you should have a good understanding of how well you're prepared for your next Data Science interview.

Machine Learning

As one would expect, Data Science interviews focus heavily on questions that help the company test your concepts, applications, and experience in machine learning. Each question included in this category has been recently asked in one or more actual Data Science interviews at companies such as Amazon, Google, Microsoft, etc. These questions will give you a good sense of which sub-topics appear more often than others. You should also pay close attention to the way these questions are phrased in an interview.

Explain Logistic Regression and its assumptions.
Explain Linear Regression and its assumptions.
How do you split your data between training and validation?
Describe Binary Classification.
Explain the working of decision trees.
What are different metrics to classify a dataset?
What's the role of a cost function?
What's the difference between a convex and a non-convex cost function?
Why is it important to know the bias-variance trade-off while modeling?
Why is regularisation used in machine learning models?
What are the differences between L1 and L2 regularisation?
What's the problem of exploding gradients in machine learning?
Is it necessary to use activation functions in neural networks?
In what aspects is a box plot different from a histogram?
What is cross validation? Why is it used?
Can you explain the concept of false positive and false negative?
Explain how SVM works.
While working at Facebook, you're asked to implement some new features. What type of experiment would you run to implement these features?
What techniques can be used to evaluate a Machine Learning model?
Why is overfitting a problem in machine learning models? What steps can you take to avoid it?
Describe a way to detect anomalies in a given dataset.
What are the Naive Bayes fundamentals?
What is AUC - ROC Curve?
What is K-means?
How does the Gradient Boosting algorithm work?
Explain advantages and drawbacks of Support Vector Machines (SVM).
What is the difference between bagging and boosting?
Before building any model, why do we need the feature selection/engineering step?
How to deal with unbalanced binary classification?
What is the ROC curve and the meaning of sensitivity, specificity, confusion matrix?
Why is dimensionality reduction important?
What are hyperparameters, how to tune them, how to test and know if they worked for the particular problem?
How will you decide whether a customer will buy a product today or not given the income of the customer, location where the customer lives, profession, and gender? Define a machine learning algorithm for this.
How will you inspect missing data and when are they important for your analysis?
How will you design the heatmap for Uber drivers to provide recommendations on where to wait for passengers? How would you approach this?
What are time series forecasting techniques?
How does a logistic regression model know what the coefficients are?
Explain Principal Component Analysis (PCA) and its assumptions.
Formulate Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) techniques.
What are neural networks used for?
Why is gradient checking important?
Is random weight assignment better than assigning the same weights to the units in the hidden layer?
How to find the F1 score after a model is trained?
How many topic modeling techniques do you know of? Explain them briefly.
How does a neural network with one layer and one input and output compare to a logistic regression?
Why is Rectified Linear Unit/ReLU a good activation function?
When using the Gaussian mixture model, how do you know it's applicable?
If a Product Manager says that they want to double the number of ads in Facebook's Newsfeed, how would you figure out if this is a good idea or not?
What do you know about LSTM?
Explain the difference between generative and discriminative algorithms.
Can you explain what MapReduce is and how it works?
If the model isn't perfect, how would you like to select the threshold so that the model outputs 1 or 0 for label?
Are boosting algorithms better than decision trees? If yes, why?
What do you think are the important factors in the algorithm Uber uses to assign rides to drivers?
How does speech synthesis work?

Data Analysis

Machine Learning concepts are not the only area in which you'll be tested in the interview. Data pre-processing and data exploration are other areas where you can always expect a few questions. We're grouping all such questions under this category. Data Analysis is the process of evaluating data using analytical and statistical tools to discover useful insights. Once again, all these questions have been recently asked in one or more actual Data Science interviews at the companies listed above.

What are the core steps of the data analysis process?
How do you detect if a new observation is an outlier?
Facebook wants to analyse why the "likes per user and minutes spent on a platform are increasing, but total number of users are decreasing". How can they do that?
If you have a chance to add something to Facebook then how would you measure its success?
If you are working at Facebook and you want to detect bogus/fake accounts, how will you go about that?
What are anomaly detection methods?
How do you solve for multicollinearity?
How to optimise marketing spend between various marketing channels?
What metrics would you use to track whether Uber's strategy of using paid advertising to acquire customers works?
What are the core steps for data preprocessing before applying machine learning algorithms?
How do you inspect missing data?
How does caching work and how do you use it in Data Science?

Statistics, Probability and Mathematics

As we've already mentioned, Data Science builds its foundation on statistics and probability concepts. A strong grounding in both is a requirement for Data Science, and these topics are always brought up in data science interviews. Here is a list of statistics and probability questions that have been asked in actual Data Science interviews.

How would you select a representative sample of search queries from 5 million queries?
Discuss how to randomly select a sample from a product user population.
What is the importance of Markov Chains in Data Science?
How do you prove that males are on average taller than females by knowing just gender or height?
What is the difference between Maximum Likelihood Estimation (MLE) and Maximum A Posteriori (MAP)?
What does P-Value mean?
Define Central Limit Theorem (CLT) and its application.
There are six marbles in a bag, one is white. You reach in the bag 100 times. After drawing a marble, it is placed back in the bag. What is the probability of drawing the white marble at least once?
Explain Euclidean distance.
Define variance.
How will you cut a circular cake into eight equal pieces?
What is the law of large numbers?
How do you weigh nine marbles three times on a balance scale to select the heaviest one?
You call three random friends who live in Seattle and ask each independently if it's raining. Each of your friends has a 2/3 chance of telling you the truth and a 1/3 chance of lying. All three say "yes". What's the probability it's actually raining?
Explain a probability distribution that is not normal and how to apply it.
You have two dice. What is the probability of getting at least one four? Also find out the probability of getting at least one four if you have n dice.
Draw the curve log(x+10).

Programming

When you appear for a data science interview, your interviewers are not expecting you to come up with highly efficient code that uses minimal computing resources and executes quickly. However, they do expect you to be able to use R, Python, or SQL so that you can access the data sources and at least build prototypes for solutions. You should expect a few programming/coding questions in your data science interviews. Your interviewer might want you to write a short piece of code on a whiteboard to assess how comfortable you are with coding, as well as get a feel for how many lines of code you typically write in a given week. Here are some programming and coding questions that companies like Amazon, Google, and Microsoft have asked in their Data Science interviews.

Write a function to check whether a particular word is a palindrome or not.
Write a program to generate the Fibonacci sequence.
Explain string parsing in the R language.
Write a sorting algorithm for a numerical dataset in Python.
Coding test: moving average. Input: 10, 20, 30, 10, ... Output: 10, 15, 20, 17.5, ...
Write Python code to return the count of words in a string.
How do you find percentile? Write the code for it.
What is the difference between (i) Stack and Queue and (ii) Linked list and Array?

Structured Query Language (SQL)

Real-world data is stored in databases and it ‘travels’ via queries. If there's one language a Data Science professional must know, it's SQL - or “Structured Query Language”. SQL is widely used across all job roles in Data Science and is often a ‘deal-breaker’. SQL questions are placed early on in the hiring process and used for screening. Here are some SQL questions that top companies have asked in their Data Science interviews.

How would you handle NULLs when querying a data set?
How will you explain the JOIN function in SQL in the simplest possible way?
Select all customers who purchased at least two items on two separate days from Amazon.
What is the difference between DDL, DML, and DCL?
Why is Database Normalisation Important?
What is the difference between a clustered and a non-clustered index?

Situational/Behavioural Questions

Capabilities don’t necessarily guarantee performance. It's for this reason that employers ask you situational or behavioural questions in order to assess how you would perform in a given situation. In some cases, a situational or behavioural question will force you to reflect on how you behaved and performed in a past situation. A situational question can help interviewers assess your role in a project you might have included in your resume, and can reveal whether or not you're a team player and how you deal with pressure and failure. Situational questions are no less important than any of the technical questions, and it will always help to do some homework beforehand. Recall your experience and be prepared! Here are some situational/behavioural questions that large tech companies typically ask:

What was the most challenging project you have worked on so far? Can you explain your learning outcomes?
According to your judgement, does Data Science differ from Machine Learning?
If you're faced with Selection Bias, how will you avoid it?
How would you describe Data Science to a Business Executive?

If you're looking for a new Data Science role, you can find our latest opportunities here. This article was written by Tooba Mukhtar and Rahim Rasool for Data Science Dojo. It has been republished with permission. You can view the original article, which includes answers to the above questions, here.
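To give a flavour of the Programming category, here is a minimal Python sketch of the moving-average coding test listed above. The question does not state a window size; the sample output (10, 15, 20, 17.5 for the input 10, 20, 30, 10) matches a cumulative mean over all values seen so far, so that reading is an assumption here. The function name `running_mean` is purely illustrative and is not taken from the original article.

```python
from itertools import accumulate


def running_mean(values):
    """Return, for each position, the mean of all values seen so far.

    Assumption: the test's "moving average" is a cumulative (expanding)
    mean, which is what the sample input and output suggest.
    """
    totals = accumulate(values)  # running sums: 10, 30, 60, 70, ...
    return [total / count for count, total in enumerate(totals, start=1)]


print(running_mean([10, 20, 30, 10]))  # [10.0, 15.0, 20.0, 17.5]
```

In an interview it is worth asking whether a fixed window is expected instead, since that changes the solution.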
22. August 2019
20. August 2019
If you’re lamenting the decline of handmade traditional products, cast your cares aside. There’s a new sheriff in town, and its name is Tech. Just a generation ago, children would leave the farm or the family business, go to school, and then move on to make their place in the world doing their own thing, away from family. Today, the landscape has changed and those who left are coming home. But this time, they’re bringing technology with them to help make things more efficient and more productive.

Is Tech-Assisted Still Handmade?

In a word, yes. Artists still make things “from scratch”, except that technology now lets not only the artists but also their customers see the vision in real time. Have you ever wondered what the image in your head might look like on paper or in metal? What about the design of prosthetic arms and healthcare devices by 3D printers? You’re still designing, still creating. But just like any new technology, there’s a learning curve, even for cutting-edge craftspeople, who find that the line between craftsmanship and high-tech creativity can sometimes be a bit of a blur. Not to mention the expense, whether of the equipment required or of offering art made with traditional tools at technology-assisted prices. Somewhere between the two, there is a trade-off. It’s up to the individual to determine where and what that trade-off is.

Life in the Creative Economy

One of Banksy’s paintings shredded itself upon purchase at an auction recently. AI is making music and writing books. Augmented Reality, Virtual Reality, and Blockchain all have their place in the creative economy, from immersive entertainment to efficient manufacturing processes. Each of these touches the way we live now. In a joint study between McKinsey and the World Economic Forum, 'Creative Disruption: The impact of emerging technologies on the creative economy', the organisations broke down the various technologies used in the creative economy and how they’re driving change.
For example:
AI is being used to distill user preferences when it comes to curating movies and music. The Associated Press has used AI to free up reporters’ time, and the Washington Post has created a tool to help it generate up to 70 articles a month, many of them stories to which it wouldn’t otherwise have dedicated staff.
Machine Learning has begun to create original content.
Virtual Reality and Augmented Reality have come together as a new medium to help move people to get up, get active, and go play, whether it’s a stroll through a virtual art gallery or watching your children play at the playground.

Where else might immersive media play out? Content today could help tell humanitarian stories or offer workplace diversity training. But back to the artisan handicrafts.

Artistry with technology

Whilst publishing firms may be looking to use AI to redefine the creative economy, they are not alone. Other artists utilising these technologies include:
Sculptors
Digital artists
Painters
Jewellery makers
Bourbon distillers

America’s oldest distiller has jumped on the technology bandwagon: while there is no rushing good Bourbon, you can manage the process more efficiently. They’ve even taken things a step further and created an app for aficionados to follow along in the process. Talk about crafted and curated for individual tastes and transparency.

It may seem almost self-explanatory how other artisans are using technology. But what about distilleries? What are they doing? They’re creating efficiency by:
Adding IoT sensors for Data Analytics collection
Adding RFID tags to their barrels
Creating experimental ageing warehouses (AR, anyone?) to refine their craft

Don’t worry, though. These changes won’t affect the spirit itself. After all, according to Mr. Wheatley, Master Distiller, “There’s no way to cheat mother nature or father time.” Ultimately, the idea is not only to understand the history behind the process, but to make it more efficient and repeatable.
A way to preserve the processes of the past while using the advances of the present with an eye to the future.

If you’re interested in using Data & Analytics to drive creativity, we may have a role for you. Take a look at our latest opportunities or get in touch with one of our expert consultants to find out more.
15. August 2019
The financial crisis of 2007-2008 changed banking. The world moved from taking mortgage loans in our dogs’ names to introducing strict regulations for banks, prohibiting them from giving out loans to “anyone” without assessing Risk properly. In 2010 the Basel Committee on Banking Supervision (BCBS) introduced BASEL III, a regulatory framework that builds on BASEL I and BASEL II. This framework changed how banks and financial institutions assess risk. It introduced an Advanced Internal Ratings-Based approach (commonly known as the AIRB approach). Now, the committee has introduced new changes and, by 2022, all banks and institutions will have to implement the revised IRB Framework, as well as revised regulations for the standardised approach, the CVA Framework and new frameworks for Operational Risk and Market Risk. So, what does this mean for those working in Risk?

Change Is Coming

Change is inevitable, no matter what you do. If you work in Risk Management and Compliance, change is something you can expect to happen often. As mentioned above, by 2022 there will be lots of changes. The Basel Committee calls this initiative the “finalised reforms”; it is also known as BASEL IV, and it builds on the current regulatory framework, BASEL III. Quickly summarised, the changes limit the reduction in capital requirements that banks can achieve through their IRB models. This change is predicted to impact banks in Sweden and Denmark the most, with estimates that capital ratios will fall by 2.5-3%, far more than the 0.9% expected for the average European bank. So what does all this mean for Swedish and Danish banks?

What’s Happening Now?

One of the main things that Swedish and Danish banks need to revise for these new regulations is their internal models. The new regulations introduced a new definition of Probability of Default, which is measured through a model commonly known as a PD model. Effectively, this means that every bank must “re-develop” its internal PD Models in the IRB approach.
Consequently, we are already seeing a clear response from the banks in their strategies moving forward. It has already become quite apparent that many banks are looking to make IRB model development their focus for 2019, 2020 and 2021. This has resulted in a hiring boom for developers with experience in IRB Modelling and Credit Risk Modelling in general, and demand for these candidates now far outstrips supply. Understandably aware of this, modellers are looking to negotiate higher salaries.

What You Can Do

For candidates who hold the right experience, there are good opportunities at hand. If so inclined, they can use this chance to finally see whether the grass actually is greener on the other side. However, there are a couple of things worth considering before making a move. Firstly, are you actually keen on switching jobs? Your skills are probably equally in demand at your current employer and, if you are having doubts about moving from the get-go, you may well be able to negotiate a rise without pursuing a new opportunity.

However, if you are serious about finding something new, this is a great time to do so. The majority of banks have found that these new regulations are creating an unsustainable workload, and are now looking for talent externally to expand their teams. This means that the experienced modeller can pretty much have their pick of the litter. Furthermore, if you are a junior modeller, there are now plenty of opportunities for you to enter a niche area known for being exciting and innovative. So, wherever you are in your career, these regulatory changes are likely to have a large impact and open up new avenues for you to explore.

We all agree that regulations in banking and finance are now essential, even if they can be a little frustrating. However, what people often fail to think of are the opportunities new regulatory requirements create.
In the case of BASEL IV, we’re already seeing an increase in demand for strong talent, and a demand for people who are passionate about Risk Management and model development. For businesses, new regulations also provide the chance not only to improve their teams, but to create new models that can be utilised to optimise and automate. A lot of financial institutions are already aware of this and are using these models to gain a competitive advantage, as well as to stay one hundred percent compliant.

If you’re looking to build out your Risk Management team or take on a new Risk opportunity for yourself, we may be able to help. Take a look at our latest opportunities or get in touch with one of our expert consultants to find out more.
08. August 2019