Meesho Data Scientist-1 Interview Experience | Bangalore

May 15, 2025

Summary

I participated in the Meesho Data Challenge 2024, won, and was invited to interview for a Data Scientist-1 role in Bangalore. The interview process consisted of four rounds covering DSA, SQL, Statistics, ML/DL, system design, and project discussions. I received a confirmation from HR on the same day as my final interview round.

Full Experience

I participated in the Meesho Data Challenge 2024 (and won!), which earned me the chance to interview for open roles at Meesho.

HR contacted me on 10 April 2025 and shared the interview details:

  • Round 1: DSA, SQL, Statistics & Mathematics, Basic ML/DL/GenAI Knowledge
  • Round 2: ML Depth
  • Round 3: ML Breadth (case study based)
  • Round 4: Overall Discussion

Round 1 (17/04/2025)

  1. What do you understand by p-value?
  2. If we fail to reject the null hypothesis, do we accept the alternative hypothesis?
  3. We want to sample values between [0, 1] uniformly, but the samples should remain within the unit circle. How would you do it?
  4. Explain the bias-variance tradeoff
  5. While training on a large dataset, the training loss keeps decreasing but the validation loss is stagnant; how would you address it?
  6. Coding
    • Given a list of citizens of a country with their birth and death dates, find the maximum population at any point in history

    • Find the max number of libraries that can be installed from ‘libraries_needed’. You are given three lists: Libraries_needed = [‘pandas’, ‘numpy’], Libraries_wd_no_preq = [‘A’, ‘B’, ‘C’], Libraries_wd_preq = [[‘A’, ‘B’], [‘B’]]

      Here the 0th-index elements are the prerequisites for ‘pandas’, and so on.

  7. Explain the ReLU activation function
  8. What is the drawback of using ReLU?
  9. Explain positional encoding in transformers
  10. Why did the authors use sine and cosine? Why can’t we use binary values?
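
For question 3, a standard answer is rejection sampling: draw a point uniformly from the unit square using the uniform [0, 1] sampler we already have, and keep it only if it falls inside the unit circle (here, the quarter-disk in the first quadrant). A minimal sketch:

```python
import random

def sample_in_unit_circle():
    """Rejection sampling: draw (x, y) uniformly from [0, 1] x [0, 1]
    and keep the point only if it lies inside the unit circle.
    Accepted points are uniform over the quarter-disk; the acceptance
    rate is pi/4, so on average ~1.27 draws are needed per sample."""
    while True:
        x = random.uniform(0.0, 1.0)
        y = random.uniform(0.0, 1.0)
        if x * x + y * y <= 1.0:
            return x, y
```

The key point the interviewer usually looks for: naively sampling a radius and an angle uniformly does not give a uniform distribution over the disk, whereas rejection from the square does.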
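
The max-population coding question is the classic sweep-line problem. A sketch, assuming years are integers and a person is not counted as alive in their death year:

```python
def max_population(people):
    """Sweep-line over birth/death events.

    people: list of (birth_year, death_year) tuples. A person counts as
    alive from birth_year up to (but not including) death_year. Sorting
    (year, delta) pairs places deaths (-1) before births (+1) within the
    same year, which matches that convention."""
    events = []
    for birth, death in people:
        events.append((birth, +1))
        events.append((death, -1))
    events.sort()
    alive = best = 0
    for _, delta in events:
        alive += delta
        best = max(best, alive)
    return best
```

This runs in O(n log n) for n people, versus O(n · years) for the naive per-year count.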
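
For questions 9–10: binary position codes change abruptly between neighbouring positions and give the model no easy handle on relative offsets, whereas sinusoids at geometrically spaced frequencies make the encoding of position pos + k a fixed linear function of the encoding of pos. A NumPy sketch of the sinusoidal scheme (assuming an even d_model):

```python
import numpy as np

def sinusoidal_positional_encoding(max_len, d_model):
    """PE[pos, 2i]     = sin(pos / 10000**(2i / d_model))
       PE[pos, 2i + 1] = cos(pos / 10000**(2i / d_model))"""
    positions = np.arange(max_len)[:, None]       # shape (max_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]      # shape (1, d_model // 2)
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)                  # even indices: sine
    pe[:, 1::2] = np.cos(angles)                  # odd indices: cosine
    return pe
```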

Round 2

  1. First (21/04/2025)

    • Explain layer and batch normalization
    • Discussed a potential alternative solution for my Meesho Hackathon project
    • Convex and non-convex loss functions
    • CNNs (my projects are CV-based)
    • Discussion on pooling techniques
    • Segmentation basics and the pooling techniques used there
    • Classification metrics: class imbalance, comparison between ROC-AUC and PR-AUC
    • If the loss becomes NaN, what are the possible reasons?
  2. Second (25/04/2025): This was an additional interview with another panel.

    • Bias-variance tradeoff
    • Regularization techniques
    • Why does L1 regularization create sparsity?
    • Optimal batch size and why?
    • Why do we move in the direction of the negative gradient?
    • Maximum Likelihood Estimation and Maximum a Posteriori Estimation
    • Cross Entropy Loss reasoning and relation to KL divergence
    • If all weights are initialized to the same value, what would happen?
    • Dropout, and what happens at training time
    • Multi GPU training, parallelism, PEFT, QLoRA (mentioned in my resume)
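
The layer vs. batch normalization question comes down to one thing: the axis over which the statistics are computed. A minimal NumPy sketch (omitting the learned scale and shift parameters, gamma and beta, for brevity):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each feature across the batch dimension (axis 0);
    statistics depend on the batch, so inference needs running stats."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def layer_norm(x, eps=1e-5):
    """Normalize each sample across the feature dimension (axis -1);
    statistics are per-sample, so batch size and inference don't matter."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)
```

This also explains why layer norm is preferred in transformers: it behaves identically for batch size 1 and for variable-length sequences.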
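
For the cross-entropy question, the identity to know is H(p, q) = H(p) + KL(p || q): since the target distribution p is fixed, minimizing cross-entropy over the model's distribution q is exactly minimizing the KL divergence from p. A quick numeric check:

```python
import numpy as np

def cross_entropy(p, q):
    return -np.sum(p * np.log(q))

def entropy(p):
    return -np.sum(p * np.log(p))

def kl_divergence(p, q):
    return np.sum(p * np.log(p / q))

p = np.array([0.7, 0.2, 0.1])   # target distribution
q = np.array([0.5, 0.3, 0.2])   # model's predicted distribution

# H(p, q) = H(p) + KL(p || q)
assert np.isclose(cross_entropy(p, q), entropy(p) + kl_divergence(p, q))
```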

Round 3 (30/04/2025)

  • In-depth Q&A on my preferred project (Tip: if you have a project where you built a model from scratch, discuss that)
  • How does ViT work?
  • We want to build a visual search system for Meesho. How would you approach it?
    • Focus on model building
    • Which model would you select, and why?
    • How would you train it?
    • Evaluation and business metrics
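
The retrieval core of such a system is usually embedding-based nearest-neighbour search. A minimal sketch, assuming a pretrained image encoder (ViT/CLIP-style, tying back to the ViT question) has already mapped each catalog image to an embedding; random vectors stand in for real embeddings here, so this illustrates the retrieval mechanics only, not the model:

```python
import numpy as np

def normalize(v):
    """L2-normalize so a dot product equals cosine similarity."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Hypothetical catalog: stand-in for encoder outputs over product images.
rng = np.random.default_rng(0)
catalog = normalize(rng.normal(size=(1000, 128)))   # 1000 products, d=128

def search(query_embedding, catalog, k=5):
    """Return indices of the top-k most cosine-similar catalog items."""
    scores = catalog @ normalize(query_embedding)
    return np.argsort(-scores)[:k]

# A query photo close to product 42 should retrieve product 42 first.
query = catalog[42] + 0.01 * rng.normal(size=128)
top = search(query, catalog)
assert top[0] == 42
```

In production the brute-force dot product would be replaced by an approximate nearest-neighbour index, and offline metrics like recall@k would be paired with business metrics such as click-through and conversion rates.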

Later that day I received confirmation from HR.

