The Problem
Consider the following table in our database:
Table: Views
| Column Name | Type |
|---|---|
| article_id | int |
| author_id | int |
| viewer_id | int |
| view_date | date |
There is no primary key for this table, so it may have duplicate rows. Each row indicates that a viewer has viewed an article (written by some author) on a certain date. Note that the same author_id and viewer_id indicate the same person.
The challenge is to write an SQL query to find all the authors that viewed at least one of their own articles. The result should be returned sorted by author_id in ascending order.
Explanation
For example, consider the following data:
Views table:
| article_id | author_id | viewer_id | view_date |
|---|---|---|---|
| 1 | 3 | 5 | 2019-08-01 |
| 1 | 3 | 6 | 2019-08-02 |
| 2 | 7 | 7 | 2019-08-01 |
| 2 | 7 | 6 | 2019-08-02 |
| 4 | 7 | 1 | 2019-07-22 |
| 3 | 4 | 4 | 2019-07-21 |
| 3 | 4 | 4 | 2019-07-21 |
In this case, the output should be:
| id |
|---|
| 4 |
| 7 |
Both authors with author_id 4 and 7 have viewed their own articles.
The Solution
There are two main approaches to solving this problem, and we'll discuss both of them in this post.
Source Code 1: Using WHERE Clause
The first solution is a simple SELECT statement with a WHERE clause that checks if the author_id equals the viewer_id, which means the author has viewed their own article. The DISTINCT keyword is used to avoid duplicate author_ids in the output.
SELECT DISTINCT author_id [id]
FROM Views
WHERE author_id = viewer_id
This code runs in 1594ms and beats 50.31% of other submissions.
Source Code 2: Adding ORDER BY Clause
The second solution is similar to the first one but with an additional ORDER BY clause. The ORDER BY clause sorts the results in ascending order by author_id.
SELECT DISTINCT author_id [id]
FROM Views
WHERE author_id = viewer_id
ORDER BY author_id
This code runs in 1898ms and beats 19.91% of other submissions.
Conclusion
Both of these methods achieve the desired result. However, the addition of an ORDER BY clause in the second solution slows down the query slightly, as sorting results comes with a performance cost.
The first solution performs better in terms of runtime on LeetCode, but performance may vary depending on the specific
RDBMS and data distribution in real-world scenarios. Therefore, it's essential to consider the characteristics of your own dataset and the capabilities of your RDBMS when choosing a solution.
You can find the original problem at LeetCode.
For more insightful solutions and tech-related content, feel free to connect with me on my Beacons page.


Top comments (0)