Google | First Round | SWE III ML | Bengaluru
Summary
I recently completed the first round for a SWE III ML position at Google in Bengaluru, which was a Data Structures & Algorithms coding interview. I was presented with two questions, an initial problem and an advanced follow-up, and shared my approaches for both.
Full Experience
Recently I gave round 1 for SWE III ML at Google, this was a dsa coding round. I have 4 years experience as a data scientist in a mid-size company. The interview was 45 minutes and was in the first week of July. I was asked a question and an advanced follow up on the same.
Initial Question
Given a list of events, each with a message and a timestamp. Please extract the most frequent one and give me its timestamps for further analysis.
Example ->
Event: ['error_1: 123', 'error_1: 234', 'error_2: 456'] Return: error_1 [123, 234]
My Answer
Initially I created a dictionary of events with list of timestamps as the values. For the maximum I was using a max heap but soon realised the heap is not needed and proceeded with coding the optimal solution with a max_counter.
class Solution:
def returnMost_frequent_better(self, events):
# O(n)
event_dict = {}
max_counter = 0
max_key = None
for event in events:
key, value = event.split(':')
value = int(value.strip())
if key not in event_dict:
event_dict[key] = [value]
else:
event_dict[key].append(value)
if len(event_dict[key]>max_counter):
max_counter = len(event_dict[key])
max_key = key
return event_dict[max_key]
Time complexity - O(n) for the dictionary creation
Advanced Question
For the follow up, imagine that it's going to become a realtime system, and I will maintain a queue of N such events. When the queue is full, I will remove the oldest event from the queue. Every time a new event comes in, I will update the structure and get the most frequent event among the N events.
Example ->
N=2 (queue size) Event: ['error_1: 123', 'error_1: 345'] event_element = 'error_1: 234' delete- 'error_1: 123' New_list= ['error_1: 345', 'error_1: 234'] return: error_1
My Answer
I actually misunderstood the question a bit and wrote a solution considering we have all the list of events and are not being passed one by one. Still was able to come up with a fair solution as given below. Here the major dilemma was how to update and maintain the max_counter when an element is deleted.
class Solution:
def returnMost_frequent_better(self, events, N):
# O(n)
# We are passed all event_list and we handle >N, <N cases
event_dict = {}
max_counter = 0
max_key = None
time_heap = []
for i, event in enumerate(events):
key, value = event.split(':')
value = int(value.strip())
heapq.heappush(time_heap, (value, key))
size+=1
if size<N:
if key not in event_dict:
event_dict[key] = [value]
else:
event_dict[key].append(value)
if len(event_dict[key]>max_counter):
max_counter = len(event_dict[key])
max_key = key
else:
val_d, key_d = heapq.heappop(time_heap)[1]
event_dict[key].remove(val_d)
size-=1
new_max_key = None
new_max = 0
for key, val in event_dict.items():
if len(val)>new_max:
new_max_key = key
new_max = len(val)
max_key = new_max_key
return max_key
I was later told that I misunderstood the question and was told to write down my assumption. Overall it went well, prepping for the further rounds now.
Interview Questions (2)
Given a list of events, each with a message and a timestamp. Please extract the most frequent one and give me its timestamps for further analysis.
Example ->
Event: ['error_1: 123', 'error_1: 234', 'error_2: 456'] Return: error_1 [123, 234]
For the follow up, imagine that it's going to become a realtime system, and I will maintain a queue of N such events. When the queue is full, I will remove the oldest event from the queue. Every time a new event comes in, I will update the structure and get the most frequent event among the N events.
Example ->
N=2 (queue size) Event: ['error_1: 123', 'error_1: 345'] event_element = 'error_1: 234' delete- 'error_1: 123' New_list= ['error_1: 345', 'error_1: 234'] return: error_1