Yahoo coding challenge

yahoo logo
yahoo
December 7, 202533 reads

Summary

I was presented with a coding challenge at Yahoo to design a SQL query API that generates SQL statements based on a structured input, supporting various filtering, ordering, and logical operations.

Full Experience

During my coding challenge at Yahoo, I was tasked with implementing a query API for a mail storage system. The primary goal was to define a method that accepts a structured query input and outputs a corresponding SQL statement string, without needing to execute it. The system uses a relational database supporting standard SQL operators like AND, OR, =, !=, <=, >=, and a special SEARCH(<columnName>, <searchString>) function for full-text searches. A predefined schema for the Message table was provided. The API needed to support various use cases including basic filtering, pagination, ORing, ANDing, and complex nested logical operators. I was given an input structure including Columns, WHERE clauses (with FilterCriteria and LogicalOperator), ORDER BY clauses, LIMIT, and OFFSET. The challenge included defining data structures for FilterCriteria and WhereClause to represent complex conditions like FolderId=1 OR (FolderId=2 AND SEARCH(Subject, "test")).

Interview Questions (1)

Q1
SQL Query API Generator
System DesignHard

We need to write a query API to a mail storage system. For practical reasons, we want to define some method that takes some sort of query structure (input) that supports our use cases listed below, and outputs a SQL statement (string). We don't need to execute the statement, we just want to generate SQL statements with this method.

Context:

  • We have a relational database as our underlying data store.
  • Our database engine supports standard SQL operators (AND, OR, =, !=, <=, >=, etc) and a special search function: SEARCH(<columnName>, <searchString>) that allows us to perform full-text search queries against our stored mail.
  • We are using the schema/table below.

Use cases to support:

  • (basic filtering): List messages on a folderId ordered by delivery date(DeliveryTs) in descending.
    ---> SELECT * FROM Message WHERE FolderId=1 ORDER BY DeliveryTs DESC
  • (basic pagination): List 100 messages after the first 100 on a folderId ordered by Subject in ascending.
    ---> SELECT * FROM Message WHERE FolderId=1 ORDER BY Subject ASC LIMIT 100 OFFSET 100;
  • (ORing) List messages that contain either:
    • "Promotion" in the Subject
    • Have fake@yahoo.com as the Sender (From column)
    • Have "Order Number" in the body.
    ---> SELECT * FROM Message WHERE SEARCH(Subject, "Promotion") OR From="fake@yahoo.com" OR SEARCH(Body, "Order Number") ORDER BY DeliveryTs DESC;
  • (ANDing) List messages that contain both of the following:
    • "Promotion" in the Subject
    • Have "Order Number" in the body.
    SELECT * FROM Message WHERE SEARCH(Subject, "Promotion") AND SEARCH(Body, "Order Number") ORDER BY DeliveryTs DESC
  • (complex nested logical operators) - List messages that belong to a folderId, OR have "test" in the subject column AND belong to a different folderId.
    --> SELECT * FROM Message WHERE FolderId=1 OR (FolderId=2 AND SEARCH(Subject, "test")) ORDER BY DeliveryTs DESC

Input Structure:

  • Columns - ALL (*) - List<String>
  • WHERE - Input Data Structure (Column X, Value Y, Operator [=, SEARCH])
    • List<FilterCriteria> (Column, Value, Operator)
    • Logical Operator (AND|OR)
  • ORDER BY (Column X, ColumnOrder ASC|DESC) List<SortOrder>
  • LIMIT (Int)
  • OFFSET (Int)

Method Signature:

public static String query(List<String> column, WhereClause whereClause, List<SortOrder> orders, Integer limit, Integer offset)

Data Structures for WHERE clause:

FilterCriteria

  • String Column
  • String value
  • Operator operator (EQUALS, SEARCH, NOT_EQUALS)

WhereClause

  • List<FilterCriteria> criteria
  • LogicalOperator (OR, AND)
  • FilterCriteria

Example usage with nested WhereClause:

FolderId=1 OR (FolderId=2 AND SEARCH(Subject, "test"))

FilterCriteria fs1 = new FilterCriteria(FolderId, 1, EQUALS);
FilterCriteria fs2 = new FilterCriteria(FolderId, 2, EQUALS);
FilterCriteria fs3 = new FilterCriteria(Subject, "test", SEARCH);

WhereClause ws = new WhereClause(List.of( new WhereClause(List.of(fs1)), new WhereClause(List.of(fs2, fs3), AND)), OR)

Schema:

ColumnType
UserIdInt
MessageIdInt
SubjectString
DeliveryTsInt
FromString
ToString
BodyString
FolderIdInt
ConversationIdString

Discussion (0)

Share your thoughts and ask questions

Join the Discussion

Sign in with Google to share your thoughts and ask questions

No comments yet

Be the first to share your thoughts and start the discussion!