import sqlalchemy
engine = sqlalchemy.create_engine('postgresql://bob:1234@localhost:5432/dvdrental')
engine.execution_options(isolation_level="AUTOCOMMIT")
%load_ext sql
%sql $engine.url
'Connected: bob@dvdrental'

The HAVING clause#

The HAVING clause is used in conjuction with the GROUP BY clause to filter grouped rows that do not satisfy a given condition. The basic syntax looks like this:

SELECT column_1, aggregate_fuction(column_2)
FROM table_name
GROUP BY column_1
HAVING condition;

The HAVING clause set the condition for the grouped rows created once the GROUP BY clause has been applied, whereas the WHERE clause sets the condition to filter the rows before the GROUP BY statement has been applied.

Example 1#

Let’s say we have a query to find the 10 best customers:

%%sql
SELECT customer_id, sum(amount) AS moneys
FROM payment
GROUP BY customer_id
ORDER BY moneys DESC
LIMIT 10;
 * postgresql://bob:***@localhost:5432/dvdrental
10 rows affected.
customer_id moneys
148 211.55
526 208.58
178 194.61
137 191.62
144 189.60
459 183.63
181 167.67
410 167.62
236 166.61
403 162.67

But now, we want to find out the customers who have spent a total within a given range, let’s say between \(170 and \)200:

%%sql
SELECT customer_id, sum(amount) AS moneys
FROM payment
GROUP BY customer_id
HAVING sum(amount) BETWEEN 170 AND 200;
 * postgresql://bob:***@localhost:5432/dvdrental
4 rows affected.
customer_id moneys
144 189.60
178 194.61
459 183.63
137 191.62

Example 2#

Let’s say we have a query to find out the amount of customers if all of our store locations:

%%sql
SELECT store_id, count(*) AS customers
FROM customer
GROUP BY store_id;
 * postgresql://bob:***@localhost:5432/dvdrental
2 rows affected.
store_id customers
1 326
2 273

In this case we have only two stores but let’s imagine we have a lot of them, and we want to see only the stores that have 300 customers or more:

%%sql
SELECT store_id, count(*) AS customers
FROM customer
GROUP BY store_id
HAVING count(*) > 300;
 * postgresql://bob:***@localhost:5432/dvdrental
1 rows affected.
store_id customers
1 326

Example 3#

Let’s say we want to see what’s the average rental rate of only a few movie ratings:

%%sql
SELECT rating, round(avg(rental_rate), 2) AS avg_rate
FROM film
WHERE rating IN ('R', 'G', 'PG')
GROUP BY rating;
 * postgresql://bob:***@localhost:5432/dvdrental
3 rows affected.
rating avg_rate
G 2.89
PG 3.05
R 2.94

But we want to see only the ratings with average rental rate below 3 bucks:

%%sql
SELECT rating, round(avg(rental_rate), 2) AS avg_rate
FROM film
WHERE rating IN ('R', 'G', 'PG')
GROUP BY rating
HAVING round(avg(rental_rate), 2) < 3;
 * postgresql://bob:***@localhost:5432/dvdrental
2 rows affected.
rating avg_rate
G 2.89
R 2.94

Example 4#

We want to get a list of customer ids of those customers who have made at least a total of 40 transactions (we may want to offer them our platinum membership).

%%sql
SELECT customer_id, count(payment_id) AS total_transations
FROM payment
GROUP BY customer_id
HAVING count(payment_id) >= 40;
 * postgresql://bob:***@localhost:5432/dvdrental
3 rows affected.
customer_id total_transations
144 40
526 42
148 45

Example 5#

When grouped by rating, what movie ratings have an average rental duration of more than 5 days?

%%sql
SELECT rating, round(avg(rental_duration), 2) AS avg_rental_duration
FROM film
GROUP BY rating
HAVING avg(rental_duration) > 5;
 * postgresql://bob:***@localhost:5432/dvdrental
3 rows affected.
rating avg_rental_duration
PG 5.08
PG-13 5.05
NC-17 5.14

Example 6#

Return the customer IDs of customers who have spent at least $110 with the staff member who has an ID of 2. Answer: 187 and 148.

%%sql
SELECT customer_id, sum(amount) AS moneys
FROM payment
WHERE staff_id = 2
GROUP BY customer_id
HAVING sum(amount) >= 110;
 * postgresql://bob:***@localhost:5432/dvdrental
2 rows affected.
customer_id moneys
187 110.81
148 110.78

Example 7#

How many films begin with the letter J? Answer: 20.

%%sql
SELECT count(*) AS start_with_J
FROM film
WHERE title LIKE 'J%';
 * postgresql://bob:***@localhost:5432/dvdrental
1 rows affected.
start_with_j
20

Example 8#

What customer has the highest customer ID whose name starts with E and has an address ID lower than 500? Answer: Eddie Tomlin.

%%sql
SELECT customer_id, first_name, last_name, address_id
FROM customer
WHERE first_name LIKE 'E%' AND address_id < 500
ORDER BY customer_id DESC
LIMIT 1;
 * postgresql://bob:***@localhost:5432/dvdrental
1 rows affected.
customer_id first_name last_name address_id
434 Eddie Tomlin 439