Avaamo | Interview (All rounds) | Senior MLE | Bengaluru
Summary
I interviewed at Avaamo in December 2024 for a Senior Machine Learning Engineer role, with three rounds focused on RAG, BERT, LLMs, and vector databases. The first round was the most challenging, testing foundational knowledge in transformer-based models.
Full Experience
I interviewed at Avaamo in December 2024, and these were the questions around which the interviews revolved. I have 3.5 years of experience as a data scientist; the role was Senior Machine Learning Engineer.
Question List for the 3 rounds
- Explain the basics of RAG architecture and its components.
- What are different parsing and chunking strategies?
- How does chunking impact the quality of retrieval in RAG?
- What is BERT pretraining using MLM and NSP?
- How do bi-encoders and cross-encoders differ in architecture and use cases?
- What are the pros and cons of bi-encoders vs cross-encoders?
- How do you fine-tune embedding models? What loss functions are used (e.g., triplet loss)?
- What are the basics of LLMs and their fine-tuning approaches (PEFT, LoRA, instruction tuning, adapters)?
- What are decoding strategies in LLMs (temperature, top-k, top-p, beam search)?
- What are some techniques for evaluating RAG systems?
- How would you optimize latency in a RAG pipeline?
- What are some ANN (Approximate Nearest Neighbor) algorithms used in vector databases?
- What is the difference between using pretrained models vs fine-tuned models in RAG?
- What are some vector database fundamentals and retrieval configurations?
- What is semantic caching and how is it useful?
- How do encoders function in LLM generation tasks?
- What prompting techniques are used in real-world applications?
- How would you optimize or improve the performance of a GenAI classification system?
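To make the decoding-strategies question above concrete, here is a minimal pure-Python sketch of temperature scaling with top-k and top-p (nucleus) filtering. The logits, function names, and threshold values are illustrative, not taken from any particular framework:

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def top_k_filter(probs, k):
    """Top-k: keep only the k most probable tokens, then renormalize."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def top_p_filter(probs, p):
    """Top-p (nucleus): keep the smallest set of tokens whose
    cumulative probability reaches p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cum = [], 0.0
    for i in order:
        keep.append(i)
        cum += probs[i]
        if cum >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

# Temperature scaling: divide logits by T before the softmax.
# T < 1 sharpens the distribution, T > 1 flattens it.
logits = [2.0, 1.0, 0.5, 0.1]   # toy vocabulary of 4 tokens
temperature = 0.7
probs = softmax([l / temperature for l in logits])
print(sorted(top_k_filter(probs, 2)))   # → [0, 1]
```

In real systems a token is then sampled from the filtered, renormalized distribution; greedy decoding and beam search instead pick the highest-probability token or sequence deterministically.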
The rounds were similar, with the first being the most challenging: it tested the basics of, and hands-on experience with, transformer-based models. Since their work centers on developing a framework for chatbot building, and they solve problems with their own proprietary frameworks, fundamentals are essential.
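Another recurring topic from the list above, semantic caching, can be sketched in a few lines: cache (query embedding, answer) pairs, and on a new query return the cached answer if its embedding is close enough to a previous one. Here `embed` is a placeholder for whatever embedding model is used, and the 0.9 threshold is an illustrative assumption:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

class SemanticCache:
    """Answer repeated or paraphrased queries without re-running the LLM."""

    def __init__(self, embed, threshold=0.9):
        self.embed = embed          # embedding function (assumed provided)
        self.threshold = threshold  # similarity needed for a cache hit
        self.entries = []           # list of (embedding, answer) pairs

    def get(self, query):
        q = self.embed(query)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best is not None and cosine(q, best[0]) >= self.threshold:
            return best[1]          # hit: a similar query was seen before
        return None                 # miss: caller must invoke the LLM

    def put(self, query, answer):
        self.entries.append((self.embed(query), answer))
```

A production cache would store embeddings in a vector index (ANN search) rather than scanning a list, but the hit/miss logic is the same.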
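One of the loss functions asked about for fine-tuning embedding models, triplet loss, has a very small core: pull the anchor toward a positive example and push it away from a negative one by at least a margin. This is a minimal sketch with Euclidean distance; the margin value is illustrative:

```python
import math

def l2(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero when the anchor is already closer to the positive than to the
    negative by at least `margin`; positive (and trainable) otherwise."""
    return max(0.0, l2(anchor, positive) - l2(anchor, negative) + margin)
```

During fine-tuning this value is minimized over batches of (anchor, positive, negative) triplets, e.g. (query, relevant passage, irrelevant passage), which is what shapes the embedding space for retrieval.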