The TPC-H dbgen utility generates, by default, a set of flat files suitable for loading into the TPC-H schema. The size of the data set is determined by the “Scale Factor” argument: a scale factor of 1 produces a complete data set of approximately 1 GB, a scale factor of 10 produces approximately 10 GB, and so on.
Download the zip file http://www.tpc.org/tpch/spec/tpch_2_17_0.zip to a temporary directory and unzip it.
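The download step can be sketched as a small shell function. This is only an illustration: the function name fetch_dbgen and the default working directory /tmp/tpch are hypothetical, and it assumes that curl and unzip are installed and that the URL above can still be downloaded directly.

```shell
#!/bin/sh
# Sketch only: fetch_dbgen and the /tmp/tpch default are illustrative names,
# not part of the TPC-H kit.
fetch_dbgen() {
    workdir=${1:-/tmp/tpch}
    url=http://www.tpc.org/tpch/spec/tpch_2_17_0.zip

    mkdir -p "$workdir" || return 1
    # -f: fail on HTTP errors, -L: follow redirects, -o: output file
    curl -fL -o "$workdir/tpch_2_17_0.zip" "$url" || return 1
    # -q: quiet, -o: overwrite existing files, -d: extraction directory
    unzip -q -o "$workdir/tpch_2_17_0.zip" -d "$workdir"
}
```

Calling fetch_dbgen with no argument uses the default directory; pass a different temporary directory as the first argument if preferred.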
Go to the extracted tpch_2_17_0/dbgen directory and copy makefile.suite to Makefile; within the Makefile, amend the CC, DATABASE, MACHINE and WORKLOAD settings to suit your environment.
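As a rough sketch, the copy and the amendment can be done in one pass with sed. The values below (gcc, INFORMIX, LINUX, TPCH) are assumptions for a Linux host targeting Informix, and the function name configure_dbgen_makefile is hypothetical; check the comments in makefile.suite for the values that apply to your platform.

```shell
#!/bin/sh
# Sketch only: writes Makefile from makefile.suite with four settings filled in.
# The chosen values are assumptions; adjust them for your compiler, database
# and operating system.
configure_dbgen_makefile() {
    dbgen_dir=$1
    sed -e 's/^CC[[:space:]]*=.*/CC = gcc/' \
        -e 's/^DATABASE[[:space:]]*=.*/DATABASE = INFORMIX/' \
        -e 's/^MACHINE[[:space:]]*=.*/MACHINE = LINUX/' \
        -e 's/^WORKLOAD[[:space:]]*=.*/WORKLOAD = TPCH/' \
        "$dbgen_dir/makefile.suite" > "$dbgen_dir/Makefile"
}
```

For example, configure_dbgen_makefile ./tpch_2_17_0/dbgen leaves makefile.suite untouched and writes the amended copy alongside it.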
Then run make, ensuring that it compiles cleanly.
After making sure that adequate disk space is available (i.e. more than 1 GB), run ./dbgen -s 1
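If a .tbl file is produced for each table (customer.tbl, lineitem.tbl, nation.tbl, orders.tbl, part.tbl, partsupp.tbl, region.tbl and supplier.tbl), then dbgen has been built successfully.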
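A quick way to confirm the output is to check for one non-empty .tbl file per TPC-H table. The sketch below assumes the eight standard table file names; the function name check_dbgen_output is hypothetical.

```shell
#!/bin/sh
# Sketch only: returns 0 if every expected .tbl file exists and is non-empty
# in the given directory (default: current directory), 1 otherwise.
check_dbgen_output() {
    dir=${1:-.}
    missing=0
    for t in customer lineitem nation orders part partsupp region supplier; do
        if [ ! -s "$dir/$t.tbl" ]; then
            echo "missing or empty: $dir/$t.tbl" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, run check_dbgen_output . && echo "dbgen output looks complete" from the dbgen directory.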
For completeness and readability, perform the following:
- Remove the just-generated .tbl files under ./tpch_2_17_0/dbgen
- Create a new directory, /home/Informix/dbgen_article (for example)
- Copy ./tpch_2_17_0/dbgen/dists.dss to /home/Informix/dbgen_article/
- Copy ./tpch_2_17_0/dbgen/dbgen to /home/Informix/dbgen_article/
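The clean-up and copy steps above could be scripted along these lines. This is a sketch with a hypothetical function name, stage_dbgen; the source and destination directories are passed as arguments.

```shell
#!/bin/sh
# Sketch only: removes generated .tbl files from the build directory and
# copies the dbgen binary plus dists.dss (which dbgen needs at run time)
# into a separate working directory.
stage_dbgen() {
    src=$1
    dest=$2
    rm -f "$src"/*.tbl
    mkdir -p "$dest" || return 1
    cp "$src/dists.dss" "$src/dbgen" "$dest/"
}
```

With the example paths used here, the call would be stage_dbgen ./tpch_2_17_0/dbgen /home/Informix/dbgen_article.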
The dbgen utility can be run with various options; some examples are detailed below:
- ./dbgen —
- ./dbgen -s 1 -f
  - Force overwrite of existing files
- ./dbgen -s 1 -T c
  - Generate just the customers (there are options for each table)
The following helper script generates all table data in parallel as flat files:
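A minimal sketch of such a helper is shown below; it is not the original script. It assumes the -T table letters reported by dbgen's usage output (c customer, L lineitem, n nation, O orders, P part, r region, s supplier, S partsupp) and starts one background dbgen per table. The DBGEN variable and the function name generate_all_tables are hypothetical.

```shell
#!/bin/sh
# Sketch only: one dbgen process per table, all started in the background.
# DBGEN may be overridden to point at the binary; the -T letters are assumed
# to match the usage text of your dbgen build.
DBGEN=${DBGEN:-./dbgen}

generate_all_tables() {
    sf=${1:-1}                      # scale factor, default 1
    for t in c L n O P r s S; do
        "$DBGEN" -s "$sf" -f -T "$t" &
    done
    wait                            # block until every dbgen has finished
}
```

Run it from the directory containing dbgen and dists.dss, for example generate_all_tables 1.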
The generated data can also be placed, in parallel, on pipes with a slight amendment to the above script:
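One way to sketch the pipe variant is to create a named pipe (FIFO) in place of each .tbl file before starting dbgen. This assumes that dbgen writes to an existing FIFO with the expected file name rather than replacing it, which should be verified on your build; the function name generate_all_tables_to_pipes and the DBGEN variable are hypothetical.

```shell
#!/bin/sh
# Sketch only: replaces each .tbl file with a FIFO of the same name, then
# starts one background dbgen per table. Each dbgen blocks until a reader
# opens its pipe, so the function returns without waiting.
DBGEN=${DBGEN:-./dbgen}

generate_all_tables_to_pipes() {
    sf=${1:-1}
    for name in customer lineitem nation orders part partsupp region supplier; do
        rm -f "$name.tbl"
        mkfifo "$name.tbl" || return 1
    done
    for t in c L n O P r s S; do
        "$DBGEN" -s "$sf" -f -T "$t" &
    done
}
```

The function deliberately omits a final wait, because the writers cannot finish until something reads from the pipes.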
However, the data generation will not proceed until each pipe has started to be read; the following helper script can be used for flushing all data through the pipes:
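A flushing helper might look like the sketch below, which reads every named pipe in the current directory and discards the data. The function name flush_pipes is hypothetical, and regular .tbl files are skipped.

```shell
#!/bin/sh
# Sketch only: drains every FIFO named *.tbl in the current directory to
# /dev/null, one reader per pipe, and waits for all readers to finish.
flush_pipes() {
    for p in ./*.tbl; do
        [ -p "$p" ] || continue     # ignore regular files and unmatched globs
        cat "$p" > /dev/null &
    done
    wait
}
```

This is useful for checking that generation through pipes works end to end before a real loader is attached as the reader.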
Hint: wait until 100% is displayed for each dbgen execution before running this script.
With the above information, there are two approaches to loading the data: from flat files or from pipes.