DataHub
DataHub is an open-source metadata platform for the data stack: a modern data catalog built to enable end-to-end data discovery, data observability, and data governance. It supports a range of data sources, including PostgreSQL.
Because YugabyteDB's YSQL API is wire-compatible with PostgreSQL, DataHub can connect to YugabyteDB as a data source using the PostgreSQL plugin.
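As an illustration of using the PostgreSQL plugin against YSQL, the following sketch writes a DataHub ingestion recipe and prints the command that would run it. It assumes the DataHub CLI is installed with its postgres plugin, that YugabyteDB is listening on localhost:5433 with the default yugabyte database and credentials, and that the DataHub metadata service is reachable at http://localhost:8080. The file name yugabyte-recipe.yml is hypothetical; adjust all of these values for your environment.

```shell
#!/usr/bin/env bash
set -eu

# Write an ingestion recipe that points DataHub's postgres source at YSQL.
# Host, port, database, and credentials are assumed YugabyteDB defaults.
cat > yugabyte-recipe.yml <<'EOF'
source:
  type: postgres
  config:
    host_port: localhost:5433
    database: yugabyte
    username: yugabyte
    password: yugabyte
sink:
  type: datahub-rest
  config:
    # Assumed address of the DataHub metadata service.
    server: http://localhost:8080
EOF

echo "Recipe written. To run it: datahub ingest -c yugabyte-recipe.yml"
```

The only YugabyteDB-specific detail in the recipe is the port: YSQL listens on 5433 rather than PostgreSQL's 5432.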
Setup
You can run the Docker Compose quickstart example provided in the DataHub GitHub repository against YugabyteDB with the following changes:
- Replace the MySQL Docker image with the YugabyteDB image.
- Specify the entrypoint command for the YugabyteDB Docker container.
- Change the port from 5432 to 5433.
- Change the username and password to yugabyte.
- Change the driver to org.postgresql.Driver.
Make changes in the following files:
- In docker/quickstart/docker-compose-without-neo4j.quickstart.yml, change the following:
  - Change the EBEAN_DATASOURCE configuration [lines 80-84 and 126-130] as follows:

        EBEAN_DATASOURCE_DRIVER=org.postgresql.Driver
        EBEAN_DATASOURCE_HOST=yugabyte:5433
        EBEAN_DATASOURCE_PASSWORD=yugabyte
        EBEAN_DATASOURCE_URL=jdbc:postgresql://yugabyte:5433/yugabyte
        EBEAN_DATASOURCE_USERNAME=yugabyte

  - Change mysql-setup to postgres-setup [line 123].
  - Replace the mysql and mysql-setup containers [lines 197-231] with yugabyte and postgres-setup containers as follows:

        yugabyte:
          container_name: yugabyte
          hostname: yugabyte
          image: yugabytedb/yugabyte:latest
          command: /bin/bash /home/yugabyte/docker-entrypoint-initdb.d/yb-init.sh
          environment:
            POSTGRES_USER: ${POSTGRES_USER:-yugabyte}
            POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-yugabyte}
          ports:
            - '5433:5433'
          volumes:
            - ./yb-setup/:/home/yugabyte/docker-entrypoint-initdb.d/
          healthcheck:
            test: bin/ysqlsh -h `hostname -i` -U yugabyte -tAc 'select 1' -d yugabyte
            interval: 10s
            timeout: 5s
            retries: 20
        postgres-setup:
          container_name: postgres-setup
          depends_on:
            yugabyte:
              condition: service_healthy
          environment:
            - POSTGRES_HOST=yugabyte
            - POSTGRES_PORT=5433
            - POSTGRES_USERNAME=yugabyte
            - POSTGRES_PASSWORD=yugabyte
            - DATAHUB_DB_NAME=yugabyte
          hostname: yugabyte-setup
          image: ${DATAHUB_POSTGRES_SETUP_IMAGE:-acryldata/datahub-postgres-setup}:${DATAHUB_VERSION:-head}

- Create a directory yb-setup in docker/quickstart/ and a script file named yb-init.sh with the following content, and place it under docker/quickstart/yb-setup/ in the repository. The script runs during container initialization to launch the YugabyteDB cluster.

      bin/yugabyted start
      sleep 5
      bin/ysqlsh -h `hostname -i` -f /home/yugabyte/docker-entrypoint-initdb.d/init.sql
      tail -f /dev/null

- Copy the file docker/postgres/init.sql to docker/quickstart/yb-setup/.
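The two file-preparation steps (creating yb-setup/yb-init.sh and copying init.sql) could be scripted roughly as follows. This is a sketch, not part of the DataHub repository: DATAHUB_REPO is a hypothetical variable holding the path to your local clone, and it defaults to the current directory. The edits to the Compose file itself are not automated here, because the line numbers quoted above may differ between DataHub versions.

```shell
#!/usr/bin/env bash
set -eu

# DATAHUB_REPO is a hypothetical variable: path to a local clone of the
# DataHub repository. Defaults to the current directory.
DATAHUB_REPO="${DATAHUB_REPO:-.}"
SETUP_DIR="$DATAHUB_REPO/docker/quickstart/yb-setup"

mkdir -p "$SETUP_DIR"

# Write the init script. The quoted 'EOF' keeps the backticks literal, so
# `hostname -i` is evaluated inside the container, not on the host.
cat > "$SETUP_DIR/yb-init.sh" <<'EOF'
bin/yugabyted start
sleep 5
bin/ysqlsh -h `hostname -i` -f /home/yugabyte/docker-entrypoint-initdb.d/init.sql
tail -f /dev/null
EOF

# Copy the PostgreSQL schema script next to it, if the clone contains it.
INIT_SQL="$DATAHUB_REPO/docker/postgres/init.sql"
if [ -f "$INIT_SQL" ]; then
  cp "$INIT_SQL" "$SETUP_DIR/"
else
  echo "warning: $INIT_SQL not found; copy it manually" >&2
fi

ls -l "$SETUP_DIR"
```

The script does not need to be marked executable, because the container command invokes it through /bin/bash.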
Run the example
Run the example from the docker/quickstart/ directory using the following command:
docker compose -f docker-compose-without-neo4j.quickstart.yml up -d
After all the containers are running, you can ingest some demo data by running ./datahub/docker/ingestion/ingestion.sh, or head to http://localhost:9002 (username: datahub, password: datahub) to access the UI.
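Startup can take a few minutes, so it may help to wait for the stack before ingesting data or opening the UI. The helper functions below are a hypothetical sketch, not part of DataHub or YugabyteDB: retry re-runs a command until it succeeds, yugabyte_healthy reads the status of the healthcheck defined for the yugabyte container, and ui_up only checks that something answers HTTP on port 9002. The sketch assumes docker and curl are available on the host.

```shell
#!/usr/bin/env bash

# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs have failed.
# The command is always run at least once.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while :; do
    "$@" && return 0
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Succeeds once Docker reports the yugabyte container's healthcheck as passing.
yugabyte_healthy() {
  [ "$(docker inspect --format '{{.State.Health.Status}}' yugabyte 2>/dev/null)" = "healthy" ]
}

# Succeeds once the DataHub UI answers HTTP requests on port 9002.
ui_up() {
  curl -fs -o /dev/null http://localhost:9002
}
```

For example, after sourcing these functions, running retry 30 10 yugabyte_healthy && retry 30 10 ui_up && echo ready waits up to about five minutes for each check before giving up.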