Axon | Phone Interview | Audio Segments

Machine Learning Engineer | No Offer
August 10, 2023 | 2 reads

Summary

I had a phone interview with Axon for a Machine Learning Engineer role and was given a challenging audio segmentation coding problem, which I unfortunately couldn't solve within the given time.

Full Experience

I recently had a phone interview for a Machine Learning Engineer position at Axon. The interviewer presented a coding challenge focused on audio segmentation. I was allocated approximately 22 minutes to devise a solution. Despite my efforts, I was unable to complete a successful implementation within the allotted time.

Interview Questions (1)

Q1
Audio Segmentation by Speaker
Data Structures & Algorithms

You have an array of (audio) frames representing an audio file, and a text file with the corresponding transcription. Can you segment the audio into sets of monologues for each speaker?

The transcription file follows the format:
TIMESTAMP(S)\tSpeaker\tText

FPS (audio frames per second): 20

Example:
Input
audio_frames = [f1, f2, f3, ...]
transcription = ['10\tJohn\tHi Kate', '12\tKate\tHi John', ...]

Output
"John" : [f1, ... f11], [f23, ... f25],
"Kate" : ...

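One possible way to approach the question, sketched in Python. The problem statement leaves several details open, so this sketch assumes that each timestamp marks the start of an utterance in seconds, that an utterance lasts until the next line's timestamp (the last one runs to the end of the audio), and that consecutive lines by the same speaker belong to one monologue. The function name `segment_by_speaker` and its parameters are hypothetical and were not part of the interview prompt.

```python
from collections import defaultdict

FPS = 20  # frames per second, as given in the problem statement


def segment_by_speaker(audio_frames, transcription, fps=FPS):
    """Group audio frames into per-speaker monologues.

    Assumptions (not stated in the original problem):
    - each timestamp is the start of an utterance, in seconds;
    - an utterance lasts until the next line's timestamp, and the
      last utterance runs to the end of the audio;
    - consecutive lines by the same speaker form one monologue;
    - frames before the first timestamp are left unassigned.
    """
    # Parse "TIMESTAMP\tSpeaker\tText" lines into (start_frame, speaker).
    starts = []
    for line in transcription:
        timestamp, speaker, _text = line.split("\t", 2)
        # Clamp so a timestamp past the end of the audio yields no frames.
        start = min(int(float(timestamp) * fps), len(audio_frames))
        starts.append((start, speaker))
    starts.sort(key=lambda pair: pair[0])  # ensure chronological order

    monologues = defaultdict(list)
    previous_speaker = None
    for i, (start, speaker) in enumerate(starts):
        end = starts[i + 1][0] if i + 1 < len(starts) else len(audio_frames)
        frames = list(audio_frames[start:end])
        if not frames:
            continue  # skip zero-length utterances
        if speaker == previous_speaker:
            # Same speaker keeps talking: extend the current monologue.
            monologues[speaker][-1].extend(frames)
        else:
            # Speaker changed: start a new monologue for this speaker.
            monologues[speaker].append(frames)
        previous_speaker = speaker
    return dict(monologues)
```

Under these assumptions, a timestamp of t seconds maps to frame index t * 20, and the result is a dictionary from speaker name to a list of monologues, each monologue being a list of frames. Parsing takes one pass over the transcript and slicing takes one pass over the frames, so the work is linear in the input size apart from the sort of the transcript lines.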