Google | First Round | SWE III ML | Bengaluru
Summary
I recently completed the first round for a SWE III ML position at Google in Bengaluru, which was a Data Structures & Algorithms coding interview. I was presented with two questions, an initial problem and an advanced follow-up, and shared my approaches for both.
Full Experience
Recently I gave round 1 for SWE III ML at Google, this was a dsa coding round. I have 4 years experience as a data scientist in a mid-size company. The interview was 45 minutes and was in the first week of July. I was asked a question and an advanced follow up on the same.
Initial Question
Given a list of events, each with a message and a timestamp. Please extract the most frequent one and give me its timestamps for further analysis.
Example ->
Event: ['error_1: 123', 'error_1: 234', 'error_2: 456'] Return: error_1 [123, 234]
My Answer
Initially I created a dictionary of events with list of timestamps as the values. For the maximum I was using a max heap but soon realised the heap is not needed and proceeded with coding the optimal solution with a max_counter.
class Solution:
def returnMost_frequent_better(self, events):
# O(n)
event_dict = {}
max_counter = 0
max_key = None
for event in events:
key, value = event.split(':')
value = int(value.strip())
if key not in event_dict:
event_dict[key] = [value]
else:
event_dict[key].append(value)
if len(event_dict[key]>max_counter):
max_counter = len(event_dict[key])
max_key = key
return event_dict[max_key]
Time complexity - O(n) for the dictionary creation
Advanced Question
For the follow up, imagine that it's going to become a realtime system, and I will maintain a queue of N such events. When the queue is full, I will remove the oldest event from the queue. Every time a new event comes in, I will update the structure and get the most frequent event among the N events.
Example ->
N=2 (queue size) Event: ['error_1: 123', 'error_1: 345'] event_element = 'error_1: 234' delete- 'error_1: 123' New_list= ['error_1: 345', 'error_1: 234'] return: error_1
My Answer
I actually misunderstood the question a bit and wrote a solution considering we have all the list of events and are not being passed one by one. Still was able to come up with a fair solution as given below. Here the major dilemma was how to update and maintain the max_counter when an element is deleted.
class Solution:
def returnMost_frequent_better(self, events, N):
# O(n)
# We are passed all event_list and we handle >N, <N cases
event_dict = {}
max_counter = 0
max_key = None
time_heap = []
for i, event in enumerate(events):
key, value = event.split(':')
value = int(value.strip())
heapq.heappush(time_heap, (value, key))
size+=1
if size<N:
if key not in event_dict:
event_dict[key] = [value]
else:
event_dict[key].append(value)
if len(event_dict[key]>max_counter):
max_counter = len(event_dict[key])
max_key = key
else:
val_d, key_d = heapq.heappop(time_heap)[1]
event_dict[key].remove(val_d)
size-=1
new_max_key = None
new_max = 0
for key, val in event_dict.items():
if len(val)>new_max:
new_max_key = key
new_max = len(val)
max_key = new_max_key
return max_key
I was later told that I misunderstood the question and was told to write down my assumption. Overall it went well, prepping for the further rounds now.
Interview Questions (2)
Extract Most Frequent Event and Timestamps
Given a list of events, each with a message and a timestamp. Please extract the most frequent one and give me its timestamps for further analysis.
Example ->
Event: ['error_1: 123', 'error_1: 234', 'error_2: 456'] Return: error_1 [123, 234]
Realtime Most Frequent Event in Bounded Queue
For the follow up, imagine that it's going to become a realtime system, and I will maintain a queue of N such events. When the queue is full, I will remove the oldest event from the queue. Every time a new event comes in, I will update the structure and get the most frequent event among the N events.
Example ->
N=2 (queue size) Event: ['error_1: 123', 'error_1: 345'] event_element = 'error_1: 234' delete- 'error_1: 123' New_list= ['error_1: 345', 'error_1: 234'] return: error_1