Interview Experience | Amazon | Data Scientist II Intern
Summary
I successfully secured a Data Scientist II Intern offer at Amazon after navigating a rigorous two-round interview process, which included a challenging coding assessment and a comprehensive technical interview heavily focused on machine learning and statistical concepts.
Full Experience
As a master's candidate from a Tier-1 college, I took part in campus hiring for a Data Scientist II Intern role. The internship was for 6 months with a stipend of 1.4 LPM (lakh per month), and the selection process had two rounds.
Round 1: Standard Coding Test
This round comprised 7 debugging questions, 2 coding questions, a behavioral test, and a logical reasoning section. I solved all 7 debugging questions. For the coding questions, I passed all 22 test cases on one and 13 of 16 on the other. I believe my performance in this test was strong, since I was the first candidate interviewed in the next round. One additional note: the coding round appears to carry significant weight. I inferred this when a placement representative mistakenly forwarded HR's email containing the Chime link, which was marked 'Importance: High.'
Round 2: Online Interview
This interview was conducted online via Amazon Chime. It covered a broad spectrum of questions, primarily focusing on machine learning and data science principles, along with one algorithm question. I managed to answer almost all questions, with only one or two partially answered. Following this round, I was selected for the position.
Interview Questions (12)
Assumptions of Linear & Logistic Regression
What are the key assumptions underlying Linear Regression and Logistic Regression models?
Gradient Descent Variations for Different Scenarios
Given various scenarios regarding the number of examples and features, discuss which variation of Gradient Descent (Batch GD, Stochastic GD, Mini-Batch GD) you would use for each:
1. Few examples, few features
2. Few examples, many features
3. Many examples, few features
4. Many examples, many features
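As a rough illustration of what separates the three variants, the sketch below fits a one-feature linear model with a single routine in which only the batch size changes. The function name, learning rate, epoch count, and toy data are assumptions made for this example, not part of the interview.

```python
import random

def gradient_descent(xs, ys, batch_size, lr=0.05, epochs=500, seed=0):
    """Fit y ~ w*x + b by minimising mean squared error.

    batch_size == len(xs) -> batch GD (one update per epoch)
    batch_size == 1       -> stochastic GD (one update per example)
    anything in between   -> mini-batch GD
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the examples in a new order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = grad_b = 0.0
            for i in batch:
                err = (w * xs[i] + b) - ys[i]
                grad_w += 2 * err * xs[i]
                grad_b += 2 * err
            # Average the gradient over the batch before stepping.
            w -= lr * grad_w / len(batch)
            b -= lr * grad_b / len(batch)
    return w, b

# Hypothetical noiseless toy data generated from w = 3, b = 1.
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
ys = [3 * x + 1 for x in xs]

batch_fit = gradient_descent(xs, ys, batch_size=len(xs))
sgd_fit = gradient_descent(xs, ys, batch_size=1)
mini_fit = gradient_descent(xs, ys, batch_size=4)
```

All three runs should land near w = 3 and b = 1 on this data. The practical trade-off is cost per update versus noise in each update: a full batch is affordable when examples are few, while smaller batches keep each step cheap when examples are many.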
Regularization in Linear Regression
How does the regularization parameter in Linear Regression help guard against both underfitting and overfitting?
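One way to make the effect concrete is L2 (ridge) regularization on a one-feature model without an intercept, where the minimiser of sum((y - w*x)^2) + lam * w^2 has the closed form w = sum(x*y) / (sum(x^2) + lam). The sketch below assumes that setup; the data and the name ridge_slope are hypothetical.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form minimiser of sum((y - w*x)^2) + lam * w^2 (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

# Hypothetical data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# lam = 0 is ordinary least squares; larger lam shrinks the slope toward 0.
slopes = {lam: ridge_slope(xs, ys, lam) for lam in (0.0, 1.0, 10.0, 1000.0)}
```

With lam near zero the fit is free to follow the data closely, which risks overfitting on noisy data. A very large lam forces the slope toward zero and underfits. The parameter is therefore tuned between these extremes, typically on held-out data.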
Comparing Binary Classifiers with AUC Scores
Given two binary classification models, one with an Area Under the Curve (AUC) of 0.6 and another with an AUC of 0.3, which would you consider better, and why? (The answer noted is the 0.3 model: inverting its predictions gives an AUC of 1 - 0.3 = 0.7, which beats 0.6.)
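The reasoning behind preferring the lower-scoring model can be checked with a small sketch. It computes AUC as the fraction of positive/negative pairs in which the positive example gets the higher score, then negates the scores. The labels and scores are hypothetical, chosen so that the original AUC is 0.3.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 5 positives, 2 negatives, 3 of 10 pairs ranked correctly.
labels = [1, 1, 1, 1, 1, 0, 0]
scores = [0.9, 0.45, 0.1, 0.2, 0.3, 0.5, 0.4]

original_auc = auc(labels, scores)
# Negating the scores reverses every pairwise ranking.
flipped_auc = auc(labels, [-s for s in scores])
```

Here original_auc is 0.3 and flipped_auc is 0.7. Because reversing the ranking turns an AUC of a into 1 - a, a model scoring well below 0.5 carries more usable signal than one sitting just above it.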
Physical Significance of Precision and Recall
Explain the physical significance and practical implications of Precision and Recall metrics in classification tasks.
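A minimal sketch of the two metrics, computed directly from predicted and true labels. The function name and the toy labels are assumptions for illustration.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Precision: of everything flagged positive, how much really was positive.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: of everything truly positive, how much was found.
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical labels: 4 true positives exist, the model flags 3 items.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
```

On this toy data precision is 2/3 (two of the three flagged items are correct) and recall is 1/2 (two of the four real positives were found). Precision matters most when false alarms are costly, and recall when missed positives are costly.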
Deep Learning Loss Function Discussion
Elaborate on the loss function used in a deep learning project you have worked on, discussing its choice and implications.
Mathematical Expression for SVM Loss
Can you provide the mathematical expression for the Support Vector Machine (SVM) loss function?
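For reference, one common way to write it is the soft-margin formulation with hinge loss. The notation below is an assumption for this sketch: labels $y_i \in \{-1, +1\}$, weight vector $w$, bias $b$, and penalty constant $C > 0$.

```latex
% Hinge loss for a single example
\ell\bigl(y_i, f(x_i)\bigr) = \max\bigl(0,\; 1 - y_i f(x_i)\bigr),
\qquad f(x_i) = w^\top x_i + b

% Soft-margin SVM objective
\min_{w,\,b} \;\; \frac{1}{2}\lVert w \rVert^2
  + C \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i\,(w^\top x_i + b)\bigr)
```

The first term favors a wide margin, and the second penalizes points that fall inside the margin or on the wrong side of it.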
Bias and Variance in Underfitting/Overfitting
Discuss the concepts of bias and variance in the context of underfitting and overfitting scenarios.
Bagging, Bias, and Variance in Decision Trees/Random Forests
Explain the concept of Bagging. Additionally, comment on the bias and variance characteristics when comparing a Decision Tree and a Random Forest, both fitted on the same dataset.
Vanishing/Exploding Gradients Problem
Explain the problem of vanishing and exploding gradients in the context of neural networks and deep learning.
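A simplified numeric sketch of the effect: backpropagation through a chain of layers multiplies one local derivative per layer, so the product shrinks or grows geometrically with depth. The setup is an assumption made for illustration: each layer has one sigmoid unit, and every layer shares the same weight and the same pre-activation value.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def backprop_factor(weight, depth, pre_activation=0.0):
    """Magnitude of the gradient passed back through `depth` identical
    one-unit sigmoid layers (a deliberately simplified model)."""
    s = sigmoid(pre_activation)
    local_grad = weight * s * (1.0 - s)  # the sigmoid derivative is at most 0.25
    return abs(local_grad) ** depth

vanishing = backprop_factor(weight=1.0, depth=20)  # 0.25 ** 20
exploding = backprop_factor(weight=8.0, depth=20)  # 2.0 ** 20
```

With a weight of 1 the per-layer factor is 0.25 and the gradient all but disappears after 20 layers. With a weight of 8 the factor is 2 and the gradient grows past a million.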
K-Means Clustering Steps
What are the two fundamental steps involved in the K-Means clustering algorithm?
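The two steps, assignment and centroid update, can be sketched as follows. This is a minimal version using squared Euclidean distance and a fixed iteration count. The helper names, the toy points, and the initial centroids are assumptions for this example.

```python
def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def mean_point(points):
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, centroids, iterations=10):
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Step 1 (assignment): attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda k: sq_dist(p, centroids[k]))
            clusters[nearest].append(p)
        # Step 2 (update): move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [mean_point(c) if c else centroids[k]
                     for k, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical toy data: two well-separated groups of three points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

In practice the loop usually stops once assignments no longer change, rather than after a fixed number of iterations.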
Find Kth Largest Number in an Array
How would you approach finding the Kth largest number in an unsorted array?
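One reasonable approach, sketched below, keeps a min-heap of the k largest values seen so far, which runs in O(n log k) time and O(k) extra space. This is only one option: sorting takes O(n log n), and quickselect averages O(n). The function name is an assumption, and duplicates are counted separately.

```python
import heapq

def kth_largest(nums, k):
    """Return the kth largest element (k = 1 is the maximum)."""
    if not 1 <= k <= len(nums):
        raise ValueError("k must be between 1 and len(nums)")
    heap = []  # min-heap holding the k largest values seen so far
    for x in nums:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            # x beats the smallest of the current top k, so swap it in.
            heapq.heapreplace(heap, x)
    return heap[0]  # smallest of the top k, i.e. the kth largest

result = kth_largest([3, 2, 1, 5, 6, 4], 2)
```

For the sample array the second largest value is 5. The standard library's heapq.nlargest(k, nums)[-1] gives the same answer in one line.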