Developing with Postgres

#database #postgres

Databases are complex systems with many moving parts that work together to store and retrieve data. There are many database systems, each solving particular problems within its space. Postgres is a popular open source, general-purpose relational database that attempts to solve many different problems for the user.

The best way to understand such a complex system is to break it down into simple, easily understandable parts. Database systems are composed of the following sub-systems (not an exhaustive list):

Query Planner
Optimizer
Executor
Tokenizer/Parser
Storage
Transaction Management
Write-Ahead Logging (WAL), etc.

I recently began going through the internals of Postgres, and I have written an extension, PG_WAL_EXT, that reads a WAL (Write-Ahead Log) file and regenerates the raw SQL statements for Insert/Update/Delete commands.

For projects I'm working on, I usually create a simple way to get them set up and running on my machine, with the help of Google of course :). This lets me tear the environment down and build it up again as fast as possible when I need to test something or when I mess up.

I am going to share my setup for Postgres. The Dockerfile looks like this:

# I have an AMD CPU (the amd64 image covers any x86-64 machine). Use the right image for your case
FROM amd64/ubuntu:latest

# Prepare the directories and files required
ENV APP_HOME=/postgres/ ID=991 USR=postgres USR_HOME=/home/postgres PG_FILES=/usr/local/pgsql/ BASH_PROFILE=/etc/bash.bashrc

# Permissions
RUN groupadd -g ${ID} ${USR} && \
    useradd -r -u ${ID} -g ${USR} ${USR}

# If the Postgres source code is in the current directory, this is enough to copy
# it into the image. You can also download it from the repo directly
# with `RUN git clone https://github.com/misachi/postgres.git`
ADD . ${APP_HOME}
WORKDIR ${APP_HOME}
RUN chown -R ${USR}:${USR} ${APP_HOME} && \
        mkdir -p ${USR_HOME} && \
        chown -R ${USR}:${USR} ${USR_HOME}


# Requirements installations
RUN apt-get update && apt-get install -y g++ gdb \
            zlib1g-dev \
            make curl \
            tar gzip \
            git nano \
            libreadline-dev \
            flex bison libicu-dev

# Build and Install Postgres
# Install a debug build since we want to develop with Postgres
# If a debug build is not required, remove the CFLAGS="-O2 -ggdb3" prefix and the --enable-debug flag
# I had issues installing the ICU library and making it work with PG, which is why I pass the --without-icu flag. It can be removed if ICU is required
# Plain `make` already builds the default `all` target, so a separate `make all` is not needed
RUN CFLAGS="-O2 -ggdb3" ./configure --without-icu --enable-debug && \
        make && \
        make install

RUN echo "export PATH=$PATH:/usr/local/pgsql/bin/" >>  ${BASH_PROFILE} && \
        chown -R ${USR}:${USR} ${PG_FILES}
USER ${USR}

# Post-Installation
RUN ${PG_FILES}/bin/pg_ctl -D /usr/local/pgsql/data initdb
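# The exec form does not read /etc/bash.bashrc, so use the full path to pg_ctl.
# "logfile" and "start" must be separate arguments.
# pg_ctl returns once the server has started, so the `docker run` command shown later overrides this CMD to keep the container alive.
CMD [ "/usr/local/pgsql/bin/pg_ctl", "-D", "/usr/local/pgsql/data", "-l", "logfile", "start" ]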

The file is commented for clarity.

Building the image is as simple as:

docker build -t <name>-<version> .

Once the image build is complete, you can create and run the container with:

docker run -d \
    --cap-add=SYS_PTRACE \
    --init --ulimit core=-1 --mount type=bind,source=/tmp/,target=/tmp/ \
    --name postgres-test -p 5432:5432 <image_name> /bin/bash -c '/usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l logfile start && tail -f /dev/null'

Here we mount the /tmp directory into the container. This is where core files will be dumped in case of a crash. We also remove the size limit on core dump files with --ulimit core=-1, which sets it to unlimited. We set --cap-add=SYS_PTRACE so that we are able to trace processes (for example with gdb) within the container.
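To confirm that the core limit took effect, you can read it from a shell inside the container. The following is a minimal sketch, not part of the original setup: the helper name describe_core_limit is made up for illustration, and it only interprets the value that `ulimit -c` prints.

```shell
# Hypothetical helper: interpret a core-size limit as printed by `ulimit -c`.
# With no argument, it reads the limit of the current shell.
describe_core_limit() {
    limit=${1:-$(ulimit -c)}
    case "$limit" in
        0)         echo "core dumps disabled" ;;
        unlimited) echo "core dumps enabled, no size limit" ;;
        *)         echo "core dumps enabled, limit: $limit" ;;
    esac
}
```

Inside the container (for example after `docker exec -it postgres-test bash`), paste the function and call it with no argument. With the `docker run` flags above, I would expect it to report that there is no size limit.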

By default, core dump files are written to /var/lib/apport/coredump/ on most Ubuntu systems. It might be different on your machine. You can change this to the /tmp directory (which we mounted into the container) by running the following command on the host, since containers share the host's kernel setting:

echo '/tmp/core.%e.%t' | sudo tee /proc/sys/kernel/core_pattern

This also sets the naming format of the core dump file, as specified by core.%e.%t. The format specifiers from the core(5) manual page are as follows:

%%  a single % character
%p  PID of dumped process
%i  TID of thread that triggered core dump
%c  core file size soft resource limit of crashing process (since Linux 2.6.24)
%u  Numeric real UID of dumped process.
%g  Numeric real GID of dumped process.
%s  Number of signal causing dump
%t  Time of dump, expressed as seconds since the Epoch, 1970-01-01 00:00:00 +0000 (UTC).
%h  Hostname (same as nodename returned by uname(2)).
%e  executable filename (without path prefix)
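To see what file name the /tmp/core.%e.%t pattern produces, here is a small sketch that expands a few specifiers by hand. The kernel does the real expansion; the function name expand_core_pattern and the sample PID and timestamp are made up for illustration, and the sketch only handles %e, %p and %t (it does not treat %% specially).

```shell
# Hypothetical helper: expand_core_pattern PATTERN EXE PID EPOCH
# Substitutes only %e, %p and %t. Assumes EXE contains no "/" or "&",
# which holds for %e since it is the file name without a path prefix.
expand_core_pattern() {
    printf '%s\n' "$1" | sed -e "s/%e/$2/g" -e "s/%p/$3/g" -e "s/%t/$4/g"
}

# Sample values, chosen only for the example
expand_core_pattern '/tmp/core.%e.%t' postgres 4242 1700000000
# prints /tmp/core.postgres.1700000000
```

Because /tmp is bind-mounted, a file with that kind of name should show up in /tmp both on the host and inside the container after a crash.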

With this setup, you should be able to run a debugger inside the container, attach it to a Postgres process, and step through the code, e.g. to follow a function call or to help figure out an issue with a core dump file. I'll write about using the debugger with Postgres in another post.
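As a starting point until then, here is a rough sketch for finding a process to attach to. It relies on the fact that a running server writes its main process ID on the first line of postmaster.pid in the data directory. The helper name postmaster_pid is made up for illustration, and the default path is the data directory used in the Dockerfile above.

```shell
# Hypothetical helper: print the PID of the main Postgres process.
# Usage: postmaster_pid [DATADIR]
postmaster_pid() {
    datadir=${1:-/usr/local/pgsql/data}
    pidfile="$datadir/postmaster.pid"
    if [ ! -r "$pidfile" ]; then
        echo "cannot read $pidfile; is the server running?" >&2
        return 1
    fi
    # The first line of postmaster.pid holds the PID
    head -n 1 "$pidfile"
}

# Inside the container, something like this should attach gdb to it:
#   gdb -p "$(postmaster_pid)"
```

In practice you will often want to attach to the backend serving your session rather than the main process; running `SELECT pg_backend_pid();` in psql gives that PID.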
