Seed the database tables
Create the tables
Create the tables in a database called bank.
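The bank.ddl file itself is not listed in this section. For reference, here is a minimal sketch consistent with the columns used later in this tutorial; treat it as an assumption, especially the orders definition, and use your actual bank.ddl file.
-- hypothetical bank.ddl sketch; your actual file may differ
CREATE TABLE ref_data (
    acc_no INT NOT NULL PRIMARY KEY,
    external_ref_id UUID,
    created_time TIMESTAMPTZ,
    acc_details VARCHAR
);
-- the orders columns below are purely illustrative assumptions
CREATE TABLE orders (
    acc_no INT NOT NULL,
    id UUID NOT NULL,
    status VARCHAR,
    PRIMARY KEY (acc_no, id)
);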
PostgreSQL
postgres=# CREATE DATABASE bank;
Now, you need to disconnect and reconnect to the bank database.
psql 'postgres://fabio:postgres@localhost:5432/bank?sslmode=disable'
Once at the SQL prompt, import the file, or copy and paste its contents, as you prefer.
bank=# \i bank.ddl
bank=# \d
-- List of relations
-- Schema | Name | Type | Owner
-- --------+----------+-------+-------
-- public | orders | table | fabio
-- public | ref_data | table | fabio
-- (2 rows)
CockroachDB
defaultdb> CREATE DATABASE bank;
defaultdb> USE bank;
bank> \i bank.ddl
bank> SHOW TABLES;
-- schema_name | table_name | type | owner | estimated_row_count | locality
-- --------------+------------+-------+-----------+---------------------+-----------
-- public | orders | table | cockroach | 0 | NULL
-- public | ref_data | table | cockroach | 0 | NULL
-- (2 rows)
Generate datasets
Next, generate some CSV data to seed the database.
dbworkload has two built-in utility functions that can assist with this task:
- dbworkload util yaml, which converts a DDL file into a data generation definition file, structured in YAML.
- dbworkload util csv, which takes the YAML file as input and outputs CSV files.
Let's use the yaml utility with our bank.ddl file.
dbworkload util yaml -i bank.ddl
For this exercise, we will use a simplified definition. Replace the content of the generated bank.yaml file with the following YAML.
ref_data:
- count: 1000
  sort-by:
    - acc_no
  columns:
    acc_no:
      type: sequence
    external_ref_id:
      type: uuid
    created_time:
      type: timestamp
    acc_details:
      type: string
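In this definition, ref_data is the target table and count is the number of rows to generate; sort-by sorts the resulting file by acc_no. Each entry under columns maps a column name to a value generator: judging from the sample rows shown later, sequence yields incrementing integers starting at 0, uuid yields random UUIDs, timestamp yields random timestamps, and string yields random alphanumeric strings of varying length.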
Now let's create a CSV dataset
dbworkload util csv -i bank.yaml -x 1
The CSV files will be located inside a bank directory.
$ ls -1 bank
ref_data.0_0_0.tsv
$ wc -l bank/*
1000 bank/ref_data.0_0_0.tsv
Inspect it
$ head -n5 bank/ref_data.0_0_0.tsv
0 3a2edc9d-a96b-4541-99ae-0098527545f7 2008-03-19 06:20:27.209214 CWUh0FWashpmWCx4LF3kb1
1 829de6d6-103c-4707-9668-c4359ef5373c 2014-02-13 22:04:20.168239 QGspICZBHYpRLnHNcg
2 5dd183af-d728-4e12-8b11-2900b6f6880a 2019-04-01 16:14:40.388236 sEUukccOePdnIbiQyVUSi0HS7rL
3 21f00778-5fca-4302-8380-56fa461adfc8 2003-05-21 19:21:21.598455 OQTNwxoZIAdNmcA6fJM5eGDvMJgKJ
4 035dac61-b4a3-40a4-9e4d-0deb50fef3ae 2011-08-15 06:15:40.405698 RvToVnn20BEXoxFzw9QFpCt
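As a quick sanity check, you can confirm that every row has exactly the four fields we defined. This is plain awk on the tab-delimited file and should print a single value, 4.
$ awk -F'\t' '{ print NF }' bank/ref_data.0_0_0.tsv | sort -u
4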
Import the datasets
Now we are ready to import the CSV file into the ref_data table.
PostgreSQL
For PostgreSQL, at the SQL prompt, just use COPY. Note that COPY FROM a file path reads the file on the server, so the path must be accessible to the PostgreSQL server process.
bank=# COPY ref_data FROM '/path/to/workloads/bank/ref_data.0_0_0.tsv' WITH CSV DELIMITER AS e'\t';
COPY 1000
Time: 8.713 ms
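If the file lives on your client machine rather than on the database server, psql's client-side \copy meta-command is a handy alternative. A minimal sketch, assuming you run psql from the workloads directory:
bank=# \copy ref_data FROM 'bank/ref_data.0_0_0.tsv' WITH CSV DELIMITER E'\t'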
CockroachDB
For CockroachDB, my favorite method is to use a web server to serve the CSV files.
Open a new terminal, then start a simple Python server
cd workloads/bank
python3 -m http.server 3000
If you open your browser at http://localhost:3000, you should see the file ref_data.0_0_0.tsv being served.
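Alternatively, you can verify from another terminal with a plain curl HEAD request, which should return a 200 status.
$ curl -sI http://localhost:3000/ref_data.0_0_0.tsv | head -n1
HTTP/1.0 200 OK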
At the SQL prompt, import the file
bank> IMPORT INTO ref_data CSV DATA ('http://localhost:3000/ref_data.0_0_0.tsv') WITH delimiter = e'\t', nullif = '';
job_id | status | fraction_completed | rows | index_entries | bytes
----------------------+-----------+--------------------+------+---------------+--------
1013454367369822209 | succeeded | 1 | 1000 | 0 | 71401
(1 row)
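You can also review the import afterwards with CockroachDB's SHOW JOBS statement; look for a row with job_type IMPORT and status succeeded.
bank> SHOW JOBS;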
Review imported data
Good stuff! Your dataset has been successfully imported; confirm at the SQL prompt
bank=# SELECT * FROM ref_data LIMIT 5;
acc_no | external_ref_id | created_time | acc_details
--------+--------------------------------------+-------------------------------+--------------------------
0 | a4a689d3-ae12-4ca2-8b74-a9929c89f420 | 2015-06-03 04:03:48.248701-04 | DiKysAoJeZ9hqDehOhE2N
1 | 01abe15c-fcae-4492-812e-37b2212ababa | 2019-06-11 17:57:05.76814-04 | nEYyCwhTJTXXaMfeHeWj
2 | 40cc3ccd-74bf-4688-ae85-7f68cd0e44f2 | 2001-10-10 18:42:21.865742-04 | DFhQ0aUbjqJ9CsczdyWv
3 | 9491509e-0424-4cdf-879f-790cff2d7289 | 2003-08-13 20:26:49.484942-04 | ggpx35EPH698N2MGlUPV13P0
4 | a54b9e47-10c7-4dce-9db8-12c7eaf80e17 | 2022-06-02 22:56:20.445508-04 | JCOhKbB3YMrOOBm
(5 rows)
bank=# SELECT count(*) FROM ref_data;
count
-------
1000
(1 row)
We're now ready to proceed to the next section.