The HAVING clause
Contents
import sqlalchemy
engine = sqlalchemy.create_engine('postgresql://bob:1234@localhost:5432/dvdrental')
engine.execution_options(isolation_level="AUTOCOMMIT")
%load_ext sql
%sql $engine.url
'Connected: bob@dvdrental'
The HAVING clause#
The HAVING clause is used in conjuction with the GROUP BY
clause to filter grouped rows that do not satisfy a given condition. The basic syntax looks like this:
SELECT column_1, aggregate_fuction(column_2)
FROM table_name
GROUP BY column_1
HAVING condition;
The HAVING
clause set the condition for the grouped rows created once the GROUP BY
clause has been applied, whereas the WHERE
clause sets the condition to filter the rows before the GROUP BY
statement has been applied.
Example 1#
Let’s say we have a query to find the 10 best customers:
%%sql
SELECT customer_id, sum(amount) AS moneys
FROM payment
GROUP BY customer_id
ORDER BY moneys DESC
LIMIT 10;
* postgresql://bob:***@localhost:5432/dvdrental
10 rows affected.
customer_id | moneys |
---|---|
148 | 211.55 |
526 | 208.58 |
178 | 194.61 |
137 | 191.62 |
144 | 189.60 |
459 | 183.63 |
181 | 167.67 |
410 | 167.62 |
236 | 166.61 |
403 | 162.67 |
But now, we want to find out the customers who have spent a total within a given range, let’s say between \(170 and \)200:
%%sql
SELECT customer_id, sum(amount) AS moneys
FROM payment
GROUP BY customer_id
HAVING sum(amount) BETWEEN 170 AND 200;
* postgresql://bob:***@localhost:5432/dvdrental
4 rows affected.
customer_id | moneys |
---|---|
144 | 189.60 |
178 | 194.61 |
459 | 183.63 |
137 | 191.62 |
Example 2#
Let’s say we have a query to find out the amount of customers if all of our store locations:
%%sql
SELECT store_id, count(*) AS customers
FROM customer
GROUP BY store_id;
* postgresql://bob:***@localhost:5432/dvdrental
2 rows affected.
store_id | customers |
---|---|
1 | 326 |
2 | 273 |
In this case we have only two stores but let’s imagine we have a lot of them, and we want to see only the stores that have 300 customers or more:
%%sql
SELECT store_id, count(*) AS customers
FROM customer
GROUP BY store_id
HAVING count(*) > 300;
* postgresql://bob:***@localhost:5432/dvdrental
1 rows affected.
store_id | customers |
---|---|
1 | 326 |
Example 3#
Let’s say we want to see what’s the average rental rate of only a few movie ratings:
%%sql
SELECT rating, round(avg(rental_rate), 2) AS avg_rate
FROM film
WHERE rating IN ('R', 'G', 'PG')
GROUP BY rating;
* postgresql://bob:***@localhost:5432/dvdrental
3 rows affected.
rating | avg_rate |
---|---|
G | 2.89 |
PG | 3.05 |
R | 2.94 |
But we want to see only the ratings with average rental rate below 3 bucks:
%%sql
SELECT rating, round(avg(rental_rate), 2) AS avg_rate
FROM film
WHERE rating IN ('R', 'G', 'PG')
GROUP BY rating
HAVING round(avg(rental_rate), 2) < 3;
* postgresql://bob:***@localhost:5432/dvdrental
2 rows affected.
rating | avg_rate |
---|---|
G | 2.89 |
R | 2.94 |
Example 4#
We want to get a list of customer ids of those customers who have made at least a total of 40 transactions (we may want to offer them our platinum membership).
%%sql
SELECT customer_id, count(payment_id) AS total_transations
FROM payment
GROUP BY customer_id
HAVING count(payment_id) >= 40;
* postgresql://bob:***@localhost:5432/dvdrental
3 rows affected.
customer_id | total_transations |
---|---|
144 | 40 |
526 | 42 |
148 | 45 |
Example 5#
When grouped by rating
, what movie ratings have an average rental duration of more than 5 days?
%%sql
SELECT rating, round(avg(rental_duration), 2) AS avg_rental_duration
FROM film
GROUP BY rating
HAVING avg(rental_duration) > 5;
* postgresql://bob:***@localhost:5432/dvdrental
3 rows affected.
rating | avg_rental_duration |
---|---|
PG | 5.08 |
PG-13 | 5.05 |
NC-17 | 5.14 |
Example 6#
Return the customer IDs of customers who have spent at least $110 with the staff member who has an ID of 2. Answer: 187 and 148.
%%sql
SELECT customer_id, sum(amount) AS moneys
FROM payment
WHERE staff_id = 2
GROUP BY customer_id
HAVING sum(amount) >= 110;
* postgresql://bob:***@localhost:5432/dvdrental
2 rows affected.
customer_id | moneys |
---|---|
187 | 110.81 |
148 | 110.78 |
Example 7#
How many films begin with the letter J? Answer: 20.
%%sql
SELECT count(*) AS start_with_J
FROM film
WHERE title LIKE 'J%';
* postgresql://bob:***@localhost:5432/dvdrental
1 rows affected.
start_with_j |
---|
20 |
Example 8#
What customer has the highest customer ID whose name starts with E and has an address ID lower than 500? Answer: Eddie Tomlin.
%%sql
SELECT customer_id, first_name, last_name, address_id
FROM customer
WHERE first_name LIKE 'E%' AND address_id < 500
ORDER BY customer_id DESC
LIMIT 1;
* postgresql://bob:***@localhost:5432/dvdrental
1 rows affected.
customer_id | first_name | last_name | address_id |
---|---|---|---|
434 | Eddie | Tomlin | 439 |