top of page

Breaking into Facebook by Solving Search Ranking SQL Interview

  • Writer: San Data
    San Data
  • May 8, 2024
  • 3 min read

Hello, data enthusiasts! 


sandata engineering digital desk solving sql interview

I'm back with another breakdown of an SQL interview question from Facebook. I have to admit, it took me a few sessions to crack this one (marked as HARD on Interview Query's site). But, with some SQL tricks and a solid understanding of the question, I finally nailed it. Let's dive into it together!


So, our question today is: Calculate the percentage of queries with ratings below 3. I'll write an SQL script that runs directly on the database and produces the expected result.


A quick mention about Interview Query for SQL


For a couple of years, I've subscribed to Interview Query (this post is not sponsored by them). It's an awesome learning platform to prepare for technical interviews. From Amazon to Google, they have a huge database of questions and exercises for you to use. This topic came from their platform and they also provide a database for you to use which is very convenient. You can directly run the queries and see the result.


I just wanted to share with you the platform as they are doing an amazing job of supporting the tech community. Now let's go back to our main problem.


Let's get started for real this time!


Step 1: Get the queries with the best ratings


First, I started filtering out the rows to get the desired scope. I wrote a query to get the rows where all the ratings are below 3. Once that step is taken care of, we must calculate the percentage this scope represents compared to the total number of queries.


with queries_above_three as (
    SELECT query
    FROM search_results
    WHERE rating >= 3 
    GROUP BY query
)
    SELECT 'total' as ratio, count(DISTINCT query) as tt_threshold
    FROM search_results
    WHERE rating < 3 and query not in (SELECT * FROM queries_above_three)

Step 2: Prepare the percentage calculation

Second, we need to calculate the parts of our ratio: our final numerator and denominator. In the previous step, we already calculated our numerator. Now, let's add our denominator: the total queries in the table using a subquery. Lastly, it's crucial to format the values correctly before the final step:


  SELECT cast(tt_threshold as numeric), cast(tt_line as numeric)
  FROM queries_below_three as qbt
  INNER JOIN (
    SELECT 'total' as ratio, COUNT(DISTINCT QUERY) as tt_line
    FROM search_results
  ) sub_tb ON qbt.ratio = sub_tb.ratio

Step 3: Join it all and divide


Now, all that's left is to calculate our division using the previous elements and nest all the queries within a WITH clause for better visibility and maintenance. As for improvements, we can come up with more descriptive table names than "tb," for example.

with queries_above_three as (
    SELECT query
    FROM search_results
    WHERE rating >= 3 
  GROUP BY query
), queries_below_three as (
    SELECT 'total' as ratio, count(DISTINCT query) as tt_threshold
    FROM search_results
    WHERE rating < 3 and query not in (SELECT * FROM queries_above_three)	
)
SELECT round(cast(tt_threshold / tt_line as numeric),2) as percentage_less_than_3
FROM (
  SELECT cast(tt_threshold as numeric), cast(tt_line as numeric)
  FROM queries_below_three as qbt
  INNER JOIN (
    SELECT 'total' as ratio, COUNT(DISTINCT QUERY) as tt_line
    FROM search_results
  ) sub_tb ON qbt.ratio = sub_tb.ratio
) sub_tb2;

It took me several tries to solve this exercise. My advice? Keep trying different approaches, and don't give up!


Resources :


See you at the next practice!


bottom of page