Saurabh Mahawar

Setting Up Presto with Apache Superset using Docker 🐳 : Hands-On Guide

In previous articles, we explored how to download and install PrestoDB locally on your machine. In this guide, we take it a step further: you'll learn how to set up and run a single-node Presto cluster using Docker, and connect it to Apache Superset. We'll walk through querying data from multiple sources like MySQL and MongoDB via PrestoDB. Whether you're a developer, data engineer, or BI enthusiast, this step-by-step tutorial will help you build a modern analytics stack with open-source tools and Docker.

Pre-Requisites:

  • Docker Application (I am using OrbStack).
  • Knowledge of Basic Docker Commands.

Step -1: Project Structure:

Project Structure
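
As a minimal sketch, the layout can be inferred from the volume mounts in the Compose file of Step -2: a docker-compose.yml at the project root plus two catalog files under presto/etc/catalog/. This is an assumption based on those mounts, not a copy of the author's screenshot, and the `scaffold` helper below is a hypothetical name used only for illustration.

```python
from pathlib import Path

# Files referenced by the Compose file in Step -2 (paths relative to the project root).
# The layout is inferred from the volume mounts, not taken from the original screenshot.
PROJECT_FILES = [
    "docker-compose.yml",
    "presto/etc/catalog/mysql.properties",
    "presto/etc/catalog/mongodb.properties",
]


def scaffold(root: str) -> list[str]:
    """Create the empty project skeleton under `root` and return the relative file paths."""
    base = Path(root)
    for rel in PROJECT_FILES:
        target = base / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch(exist_ok=True)  # existing files keep their content
    return sorted(p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file())
```

Calling `scaffold("presto-superset")` (the folder name is arbitrary) creates the empty files, which the next two steps fill in.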

Step -2: Setting Up Docker Compose:

version: "3.8"

services:
  superset:
    image: apache/superset:latest
    container_name: superset
    ports:
      - "8088:8088"
    environment:
      SUPERSET_SECRET_KEY: 'supersecretkey'
      PYTHONUNBUFFERED: 1
    depends_on:
      - db
    volumes:
      - superset_home:/app/superset_home
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8088/health"]
      interval: 30s
      timeout: 10s
      retries: 5
    command: >
      /bin/bash -c "
      sleep 10 &&
      superset db upgrade &&
      superset fab create-admin --username admin --firstname Admin --lastname User --email admin@superset.com --password admin &&
      superset init &&
      superset run -h 0.0.0.0 -p 8088
      "

  db:
    image: postgres:15
    container_name: superset_db
    environment:
      POSTGRES_DB: superset
      POSTGRES_USER: superset
      POSTGRES_PASSWORD: superset
    volumes:
      - db_data:/var/lib/postgresql/data

  mysql:
    image: mysql:latest
    container_name: mysql
    environment:
      MYSQL_ROOT_PASSWORD: root
      MYSQL_DATABASE: testdb
    ports:
      - "3307:3306"
    volumes:
      - mysql_data:/var/lib/mysql

  mongo:
    image: mongo:latest
    container_name: mongodb
    ports:
      - "27018:27017"
    volumes:
      - mongo_data:/data/db

  presto:
    image: prestodb/presto:latest
    container_name: presto
    ports:
      - "8081:8080"
    volumes:
      - ./presto/etc/catalog/mongodb.properties:/opt/presto-server/etc/catalog/mongodb.properties
      - ./presto/etc/catalog/mysql.properties:/opt/presto-server/etc/catalog/mysql.properties
    depends_on:
      - mysql
      - mongo

volumes:
  superset_home:
  db_data:
  mysql_data:
  mongo_data:

Step -3: Creating Presto Catalog Files:

mysql.properties (To connect MySQL Database)

connector.name=mysql
connection-url=jdbc:mysql://mysql:3306
connection-user=root
connection-password=root

mongodb.properties (To connect MongoDB Database)

connector.name=mongodb
mongodb.seeds=mongodb:27017

Step -4: Start all the Services:

  • Open a terminal and navigate to the directory that contains docker-compose.yml.

Present Working Directory

  • Run the command below. It starts all the services automatically; the first run can take 3-5 minutes because Docker has to pull all the images.
docker-compose up -d
  • Once all the images are pulled, run the command below to check the status of all containers.
docker ps

Docker Running Container's Status

Orbstack

  • You should see output like the snapshot above. Next, confirm that PrestoDB and Apache Superset are reachable on their respective ports.

  • Open a browser and check that Apache Superset is listening on port 8088 (http://localhost:8088/) and Presto on port 8081 (http://localhost:8081/).

Apache Superset is listening on port 8088

Presto is listening on port 8081
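
The same check can be scripted instead of done in the browser. The sketch below polls both services from the host until they answer. The Superset URL reuses the /health path from the healthcheck in docker-compose.yml, and the Presto URL uses the /v1/info endpoint of its REST API. The function names, attempt count, and delay are illustrative assumptions, not part of the original setup.

```python
import time
import urllib.request
from typing import Callable

# Host-side URLs, based on the port mappings in docker-compose.yml.
ENDPOINTS = {
    "superset": "http://localhost:8088/health",
    "presto": "http://localhost:8081/v1/info",
}


def http_ok(url: str) -> bool:
    """Return True if the URL answers with HTTP 200, False on connection or HTTP errors."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except OSError:  # URLError, HTTPError and socket timeouts are all OSError subclasses
        return False


def wait_until_up(
    endpoints: dict[str, str],
    probe: Callable[[str], bool] = http_ok,
    attempts: int = 30,
    delay: float = 5.0,
) -> dict[str, bool]:
    """Poll every endpoint until it responds or the attempts run out."""
    status = {name: False for name in endpoints}
    for attempt in range(attempts):
        for name, url in endpoints.items():
            if not status[name]:  # skip services that already answered
                status[name] = probe(url)
        if all(status.values()):
            break
        if attempt < attempts - 1:
            time.sleep(delay)
    return status
```

Running `print(wait_until_up(ENDPOINTS))` after `docker-compose up -d` reports which services came up. A `False` entry usually means the container is still starting or has exited.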

Step -5: Connecting PrestoDB as a database to Apache Superset:

  • Superset doesn't ship with a Presto driver by default, so the next step is to install it manually. Run the command below to open a shell inside the Superset container.
docker exec -it superset bash
  • After running this command, you will be inside the Superset Docker container.

  • Install pyhive[presto], the Python package Superset needs to connect to PrestoDB. Run the command below.

pip install "pyhive[presto]"
  • Once the installation is complete, leave the Superset container with the exit command and restart the container.
docker restart superset
  • Open Superset in the browser at http://localhost:8088/ and log in with the credentials below.
Username: admin
Password: admin
  • Navigate to Settings -> Database Connections -> Database.

Click on Test Connection to check the status

  • Click on CONNECT once you see "Connection looks good".

  • Congratulations, Presto is now connected to Apache Superset.
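
The connection form asks for a SQLAlchemy URI, which the steps above do not spell out. Superset documents the Presto format as `presto://{hostname}:{port}/{database}`. The sketch below builds such a URI under two assumptions that follow from the Compose file but are not confirmed by the original text: Superset reaches Presto through the service name `presto` on the container port 8080, and `mysql` is used as the default catalog. The `presto_uri` helper is a hypothetical name.

```python
from urllib.parse import quote


def presto_uri(host: str, port: int, catalog: str = "", username: str = "") -> str:
    """Build a SQLAlchemy URI of the form presto://[user@]host:port[/catalog]."""
    user_part = f"{quote(username, safe='')}@" if username else ""
    catalog_part = f"/{quote(catalog, safe='')}" if catalog else ""
    return f"presto://{user_part}{host}:{port}{catalog_part}"


# Superset runs inside the Compose network, so it reaches Presto by service name
# on the container port (8080), not on the host-mapped port (8081).
SUPERSET_URI = presto_uri("presto", 8080, catalog="mysql")
```

Here `SUPERSET_URI` evaluates to `presto://presto:8080/mysql`. Using `localhost:8081` in this field would likely fail, because inside the Superset container `localhost` refers to the container itself.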

Step -6: Run a SQL query and verify that MySQL and MongoDB are visible as catalogs:

Query Executed Successfully with MySQL and MongoDB as catalogs.
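
The query itself is not shown in the text. In Presto, `SHOW CATALOGS` lists every configured catalog, and catalog names are taken from the properties file names, so `mysql` and `mongodb` should appear. As a hedged sketch, the same check can be run from the host with PyHive, assuming `pip install "pyhive[presto]"` has also been run on the host. The function names and the `admin` user name are illustrative; this setup configures no Presto authentication.

```python
def list_catalogs(host: str = "localhost", port: int = 8081) -> list[str]:
    """Run SHOW CATALOGS against Presto and return the catalog names."""
    # Imported lazily so the helper below also works without PyHive installed.
    from pyhive import presto

    cursor = presto.connect(host=host, port=port, username="admin").cursor()
    cursor.execute("SHOW CATALOGS")
    return [row[0] for row in cursor.fetchall()]


def missing_catalogs(
    found: list[str], expected: tuple[str, ...] = ("mysql", "mongodb")
) -> list[str]:
    """Return the expected catalogs that Presto did not report."""
    return [name for name in expected if name not in found]
```

With the stack running, `missing_catalogs(list_catalogs())` should return an empty list. A non-empty result points at a catalog file that was not mounted or failed to load.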

Conclusion:

Conclusion

Follow Presto on the official website, LinkedIn, and YouTube, and join the Slack channel to interact with the community.
