One major factor to obtaining, and sustaining, good performance within an IBM Informix database is having the appropriate statistics available. This article provides another method and explores the “output to pipe” utility within SQL.
There are several ways that statistics can be maintained, the most common, apart from home grown, are:
- AUS – Auto Update Statistics, driven by the dbscheduler and configured most easily through OAT
- Art Kagel’s dostats package – widely used in the IBM Informix Community and comprehensive
What is focused on here is having a simple method to generate and run efficient update statistics commands providing the recommended set of required statistics without getting into setting up AUS or compilation of dostats – i.e. a simple way to ensure your database has the relevant statistics generated using a single command; ideal for test environments or small production environments. This also makes use of the IBM Informix SQL statement “output to pipe” which is used to pass output to another program, in this example, “awk”.
Some basic premises:
- All tables should have “update statistics low” run against them and distributions are dropped at the beginning of this exercise (with PDQPRIORITY 1 to use parallel fragment scans if present)
- All leading columns in indexes should have “update statistics high” run against them (with PDQPRIORITY set to 1)
- All trailing columns in indexes should have “update statistics medium” run against them (excluding the columns which are also leading columns in indexes) (with PDQPRIORITY set to 1)
- All procedures should have update statistics run against them (with PDQPRIORITY set to 0, unless one wants to saturate PDQ resources!)
- DBUPSPACE 10000:50:0 is in use (but this can be experimented with)
This approach is based on three files:
Identifies all trailing columns in indexes and all leading columns in indexes, outputting to “pipe ‘awk -f jj.awk’ without headings” the mode (medium|high), table name and column name
Outputs “set PDQPRIORITY 1; update statistics low drop distributions; update statistics high for table tabN(col1, col2… coln); update statistics medium for table tabN(col1, col2… coln); set PDQPRIORITY 0; update statistics for procedure;”
Simple driver shell script requiring just the DATABASE NAME
Review of jj_stats_to_awk.sql
When run against the stores_demo database without the “output to pipe ‘awk -f jj.awk’ without headings” line, the following is produced:
So, what does jj.awk do?
Let’s see by running jj_stats_to_awk.sql against the stores_demo database with the “output to pipe ‘awk -f jj.awk’ without headings” line included:
Note we now have the two columns in calendartable, (c_id,c_name) in one update statistics statement meaning a single pass of the calendartable is required.
Note that in the above STDERR is forced to /dev/null (2>/dev/null) for readability only.
Note that we now have a set of statements ready to run against the database; we could just do the following:
But a simple shell script, jj_simple_stats.sh, makes things a bit easier, adding duration, single database statement and setting DBUPSPACE.
Now all that is required is:
Although this is not parallel running, the performance is still pretty impressive under 12.10!
The ability to run a comprehensive set of update statistics in a single command is a genuine requirement, and this can be achieved using the above in a single command. The approach of using output to pipe can be adapted for different requirements, and these may be covered here in the future!
The code fix suggested above is provided “as is” without warranty of any kind, either express or implied, including without limitation any implied warranties of condition, uninterrupted use, merchantability, fitness for a particular purpose, or non-infringement.