SQL Lab
Contents
SQL Lab#
This is a setup for learning SQL with the help of Jupyter notebooks. The idea is to be able of executing SQL queries from the notebook against a PostgreSQL server. We need two ingredients to make this happen:
Jupyter Lab, which will be running within a Conda virtual environment.
A PostgreSQL server and pgadmin, which will be running in Docker containers.
The original idea was to run PostgreSQL from within the Conda environment, but couldn’t make it work. Packages installed in such an environment, can’t work as system services, so I could only reach them from within the environment; they weren’t available when trying to reach them from, for example, pgadmin running in a Docker container.
Setting up the virtual environment#
To create a virtual environment using Anaconda we run:
conda create --name sql-lab
Let’s also create a folder to put the files (our database cluster and jupyter notebooks) we’ll be using on this environment:
mkdir sql-lab
Installing Jupyter Lab#
For the Jupyter Lab with PostgreSQL set up, we’ll be needing Jupyter Lab itself, and several dependencies. All the packages below should be installed in the Conda virtual environment, so make sure is activated.
theme-darcula, a cool theme for jupyter lab.
ipython-sql to get SQL magic functions (
%sql
).pgspecial so we can run Postgres metacommands (
\dt
).sqlalchemy to create the connection to the database.
Just follow the instructions in the links above to install the packages.
Additionally, I installed:
jupyter-book, to create the book you’re reading.
ghp-import, to publish our book to GitHub.
Launching the Lab#
Once everything is installed, we can run:
jupyter lab
That will start a local server, where we can reach the Jupyter lab interface. In the next section, we’ll go over how to create a connection to the database.
Launching Postgres and PgAdmin#
In order to launch Postgres and PgAdmin I wrote a couple targets in a Makefile
. Check the compose-postgres.yml
as well, to get familiar with both services.
Something weird I experienced when creating connections to Postgres, was that I had to use
localhost
and not the name of the Docker service (postgres
) in order to create connections to the database 🤔.