DEV Community

Jean-Yves Pellé


Configuring Apache Superset 3 in a production environment

Apache Superset is a fantastic Business Intelligence tool: it is open-source and feature-rich (a wide range of charts, integration with many database management systems, etc.), and it holds its own against well-known commercial alternatives (Tableau, PowerBI, BusinessObjects, and the like).

Unfortunately, being fantastic doesn't mean it's without flaws, particularly in terms of documentation, which may seem a bit sparse to some.

To address this shortcoming, let's look together at how to quickly set up a Superset instance tailored for production.

Prerequisites

  • Have a production server running Linux (regardless of the distribution) with:
    • docker
    • git
    • caddy (or another reverse proxy of your choice)
    • ports 443 and 80 open at the firewall level
  • A domain name (e.g., bi.myawesomecompany.com) that points to your server

Installation

Cloning the superset repo

Start by cloning the repository directly on your production server (yes, this is not common):

cd $HOME
git clone https://github.com/apache/superset.git
cd superset

Then select the desired version via its tag.

git checkout tags/3.0.0

Editing configuration files

Now comes the step of customizing certain configuration files. Since checking out a tag leaves you in a detached HEAD state, you can't simply commit your modifications onto a branch, so you have the choice of:

  • documenting all your modifications to reproduce them exactly in case of reinstallation
  • or forking the project and creating a branch to commit your modifications

It's up to you.

The docker/.env-non-dev file (read: non-dev = production) defines a set of environment variables that will be used by the docker containers we will start later.

Add the following:

# We don't want demo data in production
SUPERSET_LOAD_EXAMPLES=no
# A random string used to sign session cookies (generate your own, do not reuse this example value)
SUPERSET_SECRET_KEY=4Sido8BkIjs54Vz2XyVD5GJIvANVIAT399dRESjdmr4vm92n
# To prevent XSS attacks (among other things)
TALISMAN_ENABLED=yes
# Number of workers: the higher the value, the fewer intermittent chart refresh failures you will have in your dashboards (adjust according to your server's power).
SERVER_WORKER_AMOUNT=64
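The two values above that depend on your environment can be produced with a short script. This is only a sketch: the function names are made up for illustration, the 42-byte key length is an arbitrary choice, and the worker formula is Gunicorn's general rule of thumb of (2 x cores) + 1, which assumes the container's start script passes SERVER_WORKER_AMOUNT to Gunicorn. Treat the result as a starting point rather than a recommendation.

```python
import os
import secrets


def generate_secret_key(num_bytes: int = 42) -> str:
    """Return a URL-safe random string usable as SUPERSET_SECRET_KEY."""
    return secrets.token_urlsafe(num_bytes)


def suggested_worker_count(cpu_count=None) -> int:
    """Apply the (2 x cores) + 1 rule of thumb to the given core count.

    Falls back to the number of cores reported by the OS, or 1 if unknown.
    """
    cores = cpu_count or os.cpu_count() or 1
    return 2 * cores + 1


def env_lines() -> str:
    """Format both values as lines ready to paste into docker/.env-non-dev."""
    return (
        f"SUPERSET_SECRET_KEY={generate_secret_key()}\n"
        f"SERVER_WORKER_AMOUNT={suggested_worker_count()}\n"
    )
```

Calling print(env_lines()) on the production server prints two lines that can replace the example values.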

Also, make some adjustments in docker/pythonpath_dev/superset_config.py to enable alerting and the template engine (necessary for creating datasets with dynamic filtering):

FEATURE_FLAGS = {"ALERT_REPORTS": True, "ENABLE_TEMPLATE_PROCESSING": True}
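If you would rather not edit the tracked file, one possible alternative is an override module. This sketch assumes that the superset_config.py of the tag you checked out ends with an optional import of a module named superset_config_docker (check the bottom of the file before relying on it). Under that assumption, the same flags can live in a separate file, docker/pythonpath_dev/superset_config_docker.py:

```python
# docker/pythonpath_dev/superset_config_docker.py
# Assumption: superset_config.py ends with `from superset_config_docker import *`,
# so a name defined here replaces the one defined in the tracked file.

FEATURE_FLAGS = {
    "ALERT_REPORTS": True,  # alerts and scheduled reports
    "ENABLE_TEMPLATE_PROCESSING": True,  # Jinja templating in dataset SQL
}
```

Because the star import replaces the whole dictionary, any flag already set in superset_config.py has to be repeated here.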

Disable telemetry by replacing in the docker-compose-non-dev.yml file:

x-superset-image: &superset-image apachesuperset.docker.scarf.sh/apache/superset:${TAG:-latest-dev}

with

x-superset-image: &superset-image apache/superset:${TAG:-latest-dev}

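Switch to a more recent PostgreSQL release before the first startup (a data volume already initialized by PostgreSQL 14 cannot be opened by version 16 without a dump and restore) by replacing: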

     image: postgres:14

with

     image: postgres:16
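If you chose to document your modifications rather than fork, the two docker-compose replacements above can be captured in a small script so they can be reapplied after a fresh clone. This is a minimal sketch using plain string replacement; the function names are hypothetical, and it assumes the strings appear in the file exactly as shown above.

```python
from pathlib import Path

# (old, new) pairs mirroring the two manual edits described in the text.
REPLACEMENTS = [
    ("apachesuperset.docker.scarf.sh/apache/superset:", "apache/superset:"),
    ("image: postgres:14", "image: postgres:16"),
]


def patch_compose(text: str) -> str:
    """Return the compose file content with both replacements applied."""
    for old, new in REPLACEMENTS:
        text = text.replace(old, new)
    return text


def patch_file(path: str) -> bool:
    """Patch the file in place and report whether anything changed."""
    target = Path(path)
    original = target.read_text()
    patched = patch_compose(original)
    if patched != original:
        target.write_text(patched)
    return patched != original
```

Running patch_file("docker-compose-non-dev.yml") from the repository root applies both changes; a second run returns False because nothing is left to replace.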

Startup

Instantiate and start the docker containers:

docker compose -f docker-compose-non-dev.yml up -d

Superset is accessible on your production server via http://127.0.0.1:8088.
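The containers can take a little while to become ready after the first start. The sketch below polls Superset's /health endpoint until it answers with HTTP 200. The function names, retry count, and delay are illustrative choices, and the fetch and sleep parameters exist only to make the loop easy to test.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, or 0 if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, etc.
        return 0


def wait_until_healthy(url, attempts=30, delay=5.0,
                       fetch=fetch_status, sleep=time.sleep) -> bool:
    """Poll url until it returns 200; give up after the given attempts."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

For example, wait_until_healthy("http://127.0.0.1:8088/health") returns True once the application responds, or False after about two and a half minutes with the default settings.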

Reverse proxy

Configure a reverse proxy to secure the connection, for example, using caddy:

Edit /etc/caddy/Caddyfile

bi.myawesomecompany.com {
        reverse_proxy http://127.0.0.1:8088
}

Then restart caddy:

sudo service caddy restart

First Login

Log in at https://bi.myawesomecompany.com with the username / password: admin / admin


Change this default password immediately.

Backup and Restoration

When editing the docker-compose-non-dev.yml configuration file, you may have noticed that a PostgreSQL database is instantiated.

Superset stores its metadata in this database, so backup and restoration only need to consider it.

You can perform a hot backup with a simple command:

docker exec superset_db pg_dump -U superset superset | xz > backup.sql.xz
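For scheduled backups, it helps to give each dump a timestamped name. The following is a rough sketch of the same pipeline driven from Python. The file naming scheme and function names are assumptions made for illustration, and the container, user, and database names are the ones used in the command above.

```python
import lzma
import shutil
import subprocess
from datetime import datetime


def backup_filename(now: datetime, prefix: str = "superset") -> str:
    """Build a name such as superset-20240102-030405.sql.xz."""
    return f"{prefix}-{now:%Y%m%d-%H%M%S}.sql.xz"


def pg_dump_command(container="superset_db", user="superset",
                    database="superset") -> list:
    """Return the docker exec command that writes a plain SQL dump to stdout."""
    # No -t flag here: a pseudo-TTY can alter line endings in piped output.
    return ["docker", "exec", container, "pg_dump", "-U", user, database]


def run_backup(path: str) -> None:
    """Stream pg_dump output into an xz-compressed file at path."""
    cmd = pg_dump_command()
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        with lzma.open(path, "wb") as out:
            shutil.copyfileobj(proc.stdout, out)
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)
```

A cron job could then call run_backup(backup_filename(datetime.now())). The resulting file can be restored with the same xz -dc pipeline as a dump created by hand.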

For restoration, start only the PostgreSQL container first, so that no Superset instance is connected to the database while it is being restored:

docker compose -f docker-compose-non-dev.yml down
docker compose -f docker-compose-non-dev.yml up -d db
docker exec -t superset_db dropdb -U superset superset
docker exec -t superset_db createdb -U superset superset
xz -dc backup.sql.xz | docker exec -i superset_db psql -U superset -d superset
docker compose -f docker-compose-non-dev.yml up -d
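Since the restoration drops the existing database, it may be worth checking the backup file before running it. The sketch below is one way to do that. It confirms that the file decompresses completely and that its beginning contains the "PostgreSQL database dump" header that pg_dump writes at the top of plain SQL dumps. The function names are hypothetical, and passing these checks does not prove the dump is complete or restorable.

```python
import lzma

PG_DUMP_MARKER = b"PostgreSQL database dump"
_READ_ERRORS = (OSError, lzma.LZMAError, EOFError)


def looks_like_pg_dump(path: str, probe_bytes: int = 4096) -> bool:
    """True if the first decompressed bytes contain the pg_dump header."""
    try:
        with lzma.open(path, "rb") as fh:
            head = fh.read(probe_bytes)
    except _READ_ERRORS:
        return False
    return PG_DUMP_MARKER in head


def decompresses_cleanly(path: str, chunk_size: int = 1 << 20) -> bool:
    """True if the whole file can be decompressed without errors."""
    try:
        with lzma.open(path, "rb") as fh:
            while fh.read(chunk_size):
                pass
    except _READ_ERRORS:
        return False
    return True
```

Running both checks on backup.sql.xz before the dropdb step catches truncated or mislabeled files while the live database is still intact.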
