LOAD and UNLOAD functions

Updated June 2024

June 2024 change log:

sp_load into temp table handled
sp_load no longer specifies EXPRESS (only caused warning in message log)

September 2023 change log:

Work-around for tables containing hidden specialized columns such as VERCOLS
DBDATE external table option removed for better flexibility via environment
CONSTRAINTS also disabled as necessary for EXPRESS with hidden indexes.

We have raised relevant APARs to be fixed in 14.10.xC11. They are:

IT44526 SQL error: -26190 and -236 on inserting external table from a ‘table having VERCOLS’

https://www.ibm.com/docs/en/informix-servers/14.10?topic=statement-specialized-columns

Worked around in “sp_load” by excluding all types of hidden specialized columns.

IT44527 Inaccurate documentation on conditions forcing DELUXE load from external table

https://www.ibm.com/docs/en/informix-servers/14.10?topic=statement-table-options

DELUXE is in fact forced if the table contains BYTE/TEXT columns or the row size exceeds page size minus 32.

Abstract

The Informix statements LOAD and UNLOAD allow data to be transferred between a flat file and a database, but are in fact only implemented in certain clients such as DB-Access, 4GL variants, and AGS Server Studio. You may therefore need functions in Informix Dynamic Server (IDS) that do this, such as when coding in Java with JDBC. Furthermore, external tables are much lighter and quicker for this purpose, so you might in any case prefer the functions described in this article, which use them in Informix Stored Procedure Language (SPL).

Content

Examples in this article were produced using the provided “stores_demo” database in a Docker container from ibmcom/informix-developer-database.

An EXTERNAL TABLE definition allows a flat file to be read and written as though it were a normal table, but with many restrictions. In summary, an external table cannot have indexes, triggers or replication, and supports only SELECT, INSERT (which always truncates the file first) and DROP statements.
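
As a minimal sketch of those three statements used directly, assuming the “stores_demo” database, the following defines an external table over a file, writes to it, reads it back and drops it. The table name “items_ext” and the file path are illustrative only.

```sql
-- Sketch only: items_ext and the file path are illustrative names.
CREATE EXTERNAL TABLE items_ext
    SAMEAS items
    USING (
        DATAFILES ('DISK:/tmp/items.unl'),
        DELIMITER '|'
    );

-- Writing: replaces the current contents of the flat file.
INSERT INTO items_ext SELECT * FROM items;

-- Reading: the flat file is scanned sequentially.
SELECT COUNT(*) FROM items_ext;

-- Drops the definition only; the flat file stays on disk.
DROP TABLE items_ext;
```

The functions later in this article wrap the same pattern so that it can be called in one statement.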

External tables have many other uses outside the focus of this article, such as permanent archives of data that you would rather not keep in your instance, so that it stays smaller, and which you are happy to have scanned sequentially every time. The files can reside on another system if presented to the Informix server through a Network File System (NFS) mount, which is available on all supported operating systems including Windows.

Because external tables are handled entirely within the database engine, extraction or ingestion of data is much faster than passing rows through a client. If the table is not logged (either type RAW or in an unlogged database), ingestion is especially quick, as it uses Express Mode with “light appends” that bypass the buffer pool.

Before we look at the code and how the functions work, it is easier to see them in action first. Once created, you could do this in a dbaccess session:
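
A minimal sketch of such a session, based on the usage examples in the DOCUMENT sections of the listings further down, might be the following. In practice the target of the load would be empty or would hold rows that do not collide with those in the file.

```sql
-- Unload the whole items table to a pipe-delimited file on the server
EXECUTE FUNCTION sp_unload('items', '/tmp/items.unl');

-- Load that file into a table with the same columns
EXECUTE FUNCTION sp_load('items', '/tmp/items.unl');
```

Each call returns the number of rows unloaded or loaded.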

Used in the simplest form above, the whole table is unloaded or loaded. The file name should be a full path as seen on the database server, not the client.

An optional third parameter can be given to specify the delimiter character if different to the default “|” pipe symbol. For example:
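
A sketch taken from the usage examples in the function's DOCUMENT section:

```sql
-- Comma-delimited output instead of the default pipe
EXECUTE FUNCTION sp_unload('items', '/tmp/items.csv', ',');
```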

Beware that this does not produce true CSV content with strings in double quotes. Any literal occurrence of the delimiter is still preceded by a backslash, as is a backslash itself. For loading into Excel, a tab-delimited text file will be more reliable:
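
Again as a sketch from the function's DOCUMENT section, the tab character can be passed with CHR(9):

```sql
-- Tab-delimited output: CHR(9) is the ASCII tab character
EXECUTE FUNCTION sp_unload('items', '/tmp/items.txt', CHR(9));
```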

When unloading, you can in fact give any SELECT statement instead of the table name, for example:
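
A sketch of that form follows. It assumes the “customer” table of “stores_demo”, which has a “state” column; the file name is illustrative.

```sql
-- The SELECT is passed as a string, so the quotes around CA are doubled
EXECUTE FUNCTION sp_unload(
    'SELECT * FROM customer WHERE state = ''CA''',
    '/tmp/california.unl'
);
```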

Note that two consecutive single quotes inside a quoted string stand for one literal single quote. Do not use double quotes, which might be reserved for SQL identifiers: see environment variable DELIMIDENT.

When loading, external tables require a delimiter after the final column on every line, even when that column is not empty. LOAD tolerates its absence in that case, and CSV files commonly omit it.

Listings and explanations of the two functions follow.

sp_unload

EXECUTE PROCEDURE IFX_ALLOW_NEWLINE('t');

CREATE FUNCTION sp_unload
(
    select_sql LVARCHAR(32000), -- or just table name
    file_path  VARCHAR(255),
    delimit    CHAR(1) DEFAULT '|'
)
    RETURNING INT8 AS rows_unloaded;

    -- Unload data to a delimited text file via an external table
    -- Doug Lawry, Oninit Consulting, July 2023

    DEFINE ext_table VARCHAR(167);
    DEFINE row_count INT8;

    SET DEBUG FILE TO '/tmp/sp_unload.debug';
    TRACE ON;

    IF select_sql NOT MATCHES '* *' THEN

        IF select_sql NOT MATCHES '*[:"]*' THEN

            SELECT COUNT(*)
            INTO row_count
            FROM sysmaster:sysenvses
            WHERE envses_sid = DBINFO('sessionid')
            AND envses_name = 'DELIMIDENT';

            IF row_count > 0 THEN
                LET select_sql = '"' || select_sql || '"';
            END IF

        END IF

        LET select_sql = 'SELECT * FROM ' || select_sql;

    END IF

    IF UPPER(SUBSTR(select_sql,1,7)) != 'SELECT'
    OR select_sql MATCHES '*;*' -- prevent SQL injection
    OR select_sql IS NULL
    THEN
        RAISE EXCEPTION -676, 0, 'Table name or single select required';
    END IF

    IF file_path MATCHES '*;*' THEN -- prevent SQL injection
        RAISE EXCEPTION -676, 0, 'Semi-colon not allowed in file path';
    END IF

    LET ext_table = 'temp_' || DBINFO('sessionid') || '_ext';

    BEGIN -- IF EXISTS not available in IDS 11.50
        ON EXCEPTION IN (-206) END EXCEPTION WITH RESUME;
        EXECUTE IMMEDIATE ' DROP TABLE ' || ext_table;
    END

    EXECUTE IMMEDIATE
        select_sql            ||
        ' INTO EXTERNAL '     || ext_table ||
        ' USING ('            ||
        ' DATAFILES (''DISK:' || file_path || '''),' ||
        ' DELIMITER '''       || delimit   || ''','  ||
        ' ESCAPE'             || -- IDS 11 compatibility
        ' )';

    LET row_count = DBINFO('sqlca.sqlerrd2');

    EXECUTE IMMEDIATE ' DROP TABLE ' || ext_table;

    RETURN row_count;

END FUNCTION

DOCUMENT "
    -- Example usage:
    EXECUTE FUNCTION sp_unload('items', '/tmp/items.unl');
    EXECUTE FUNCTION sp_unload('items', '/tmp/items.csv', ',');
    EXECUTE FUNCTION sp_unload('items', '/tmp/items.txt', CHR(9));
    EXECUTE FUNCTION sp_unload(
        'SELECT * FROM customer WHERE state = ''CA''',
        '/tmp/california.unl'
    );
";

sp_load

EXECUTE PROCEDURE IFX_ALLOW_NEWLINE('t');

CREATE FUNCTION sp_load
(
    table_name VARCHAR(167),
    file_path  VARCHAR(255),
    delimit    CHAR(1) DEFAULT '|',
    max_errors INTEGER DEFAULT 2 -- ignore column headings
)
    RETURNING INT8 AS rows_loaded;

    -- Load data from a delimited text file via an external table
    -- Doug Lawry, Oninit Consulting, September 2023

    DEFINE temp_table, ext_table, reject_file VARCHAR(167);
    DEFINE real_table SMALLINT;
    DEFINE row_count INT8;
    DEFINE no_log INT;

    -- Ignore BEGIN/COMMIT WORK errors if database not logged
    ON EXCEPTION IN (-256) SET no_log END EXCEPTION WITH RESUME;
    LET no_log = 0;

    -- SET DEBUG FILE TO '/tmp/sp_load.debug';
    -- TRACE ON;
/*
    -- May 2024:
    -- Check removed to allow loading into a temp table
    -- Fails appropriately anyway if the table doesn't exist

    SELECT COUNT(*)
    INTO row_count
    FROM systables
    WHERE tabname = table_name;

    IF row_count = 0 THEN
        -- "The specified table is not in the database"
        RAISE EXCEPTION -206, 0, table_name;
    END IF
*/
    -- Need this later instead:

    SELECT COUNT(*)
    INTO real_table
    FROM systables
    WHERE tabname = table_name;

    IF file_path MATCHES '*;*' THEN -- prevent SQL injection
        RAISE EXCEPTION -676, 0, 'Semi-colon not allowed in file path';
    END IF

    LET temp_table  = 'temp_'         || table_name || '_tmp';
    LET ext_table   = 'temp_'         || table_name || '_ext';
    LET reject_file = '/tmp/sp_load.' || table_name || '.rej';

    SELECT COUNT(*)
    INTO row_count
    FROM sysmaster:sysenvses
    WHERE envses_sid = DBINFO('sessionid')
    AND envses_name = 'DELIMIDENT';

    IF row_count > 0 THEN
        LET table_name = '"' || table_name || '"';
        LET ext_table  = '"' || ext_table  || '"';
    END IF

    BEGIN -- IF EXISTS not available in IDS 11.50
        ON EXCEPTION IN (-206) END EXCEPTION WITH RESUME;
        EXECUTE IMMEDIATE ' DROP TABLE ' || ext_table;
    END

    EXECUTE IMMEDIATE -- exclude hidden columns like VERCOLS
        ' SELECT FIRST 1 * FROM ' || table_name  ||
        ' INTO TEMP '             || temp_table  || ';';

    EXECUTE IMMEDIATE
        ' CREATE EXTERNAL TABLE ' || ext_table   ||
        ' SAMEAS '                || temp_table  ||
        ' USING ('                ||
        ' DATAFILES (''DISK:'     || file_path   || '''),' ||
        ' REJECTFILE '''          || reject_file || ''','  ||
        ' MAXERRORS '             || max_errors  || ','    ||
        ' DELIMITER '''           || delimit     || ''','  ||
        ' ESCAPE'                 || -- IDS 11 compatibility
        ' )';

    EXECUTE IMMEDIATE
        ' DROP TABLE ' || temp_table;

    IF real_table = 1 THEN

        EXECUTE IMMEDIATE -- required for EXPRESS
            ' SET CONSTRAINTS, INDEXES FOR ' || table_name || ' DISABLED';

        BEGIN WORK;

        IF no_log = 0 THEN
            EXECUTE IMMEDIATE -- avoid lock structure growth for DELUXE
                ' LOCK TABLE ' || table_name || ' IN EXCLUSIVE MODE';
        END IF

    END IF

    EXECUTE IMMEDIATE
        ' INSERT INTO ' || table_name || ' SELECT * FROM ' || ext_table;

    LET row_count = DBINFO('sqlca.sqlerrd2');

    IF real_table = 1 THEN

        IF no_log = 0 THEN
            COMMIT WORK;
        END IF

        EXECUTE IMMEDIATE
            ' SET CONSTRAINTS, INDEXES FOR ' || table_name || ' ENABLED';

    END IF

    EXECUTE IMMEDIATE
        ' DROP TABLE ' || ext_table;

    RETURN row_count;

END FUNCTION

DOCUMENT "
    -- Example usage:
    EXECUTE FUNCTION sp_load('items', '/tmp/items.unl');
    EXECUTE FUNCTION sp_load('items', '/tmp/items.csv', ',');
    EXECUTE FUNCTION sp_load('items', '/tmp/items.txt', CHR(9));
";

Remove or comment out the DEBUG and TRACE statements as preferred.
Both functions handle table names in double quotes containing special characters or mixed case if DELIMIDENT is set in the environment.
If a user supplies a SQL statement to unload, only SELECT is allowed.
Semi-colons are rejected throughout to prevent multiple statements.
For “sp_unload”, the external table temporarily created contains the session ID for concurrent use.
For “sp_load”, it contains the target table name which is in any case exclusively locked, and indexes are temporarily disabled for speed and as required by Express Mode.
These functions fail unless the user has RESOURCE or DBA database privilege needed to create tables.

You can find the SQL that created external tables in the debug files, as in the following example (with indentation added):
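
Working through the string concatenation in the two listings, calls with default parameters on “items” and “/tmp/items.unl” would build statements along the following lines. This is a sketch derived from the code rather than a copy of a trace file, and the session ID 42 in the first table name is hypothetical.

```sql
-- Built by sp_unload('items', '/tmp/items.unl')
SELECT * FROM items
INTO EXTERNAL temp_42_ext
USING (
    DATAFILES ('DISK:/tmp/items.unl'),
    DELIMITER '|',
    ESCAPE
);

-- Built by sp_load('items', '/tmp/items.unl')
CREATE EXTERNAL TABLE temp_items_ext
SAMEAS temp_items_tmp
USING (
    DATAFILES ('DISK:/tmp/items.unl'),
    REJECTFILE '/tmp/sp_load.items.rej',
    MAXERRORS 2,
    DELIMITER '|',
    ESCAPE
);
```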

ESCAPE is required for compatibility with versions prior to IDS 12.

The “max_errors” parameter of “sp_load”, which is passed to the MAXERRORS external table option, defaults to 2, meaning the load aborts on the second error. Files from other systems often have column headings in the first line, which would otherwise abort the load. You can override the default with the optional fourth function parameter. For example, loading “items.csv” with column headings inserted at the top, with “max_errors” set to 2 and then 1, produces:
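
The two calls described might look like the following sketch. It assumes the comma-delimited “/tmp/items.csv” unloaded earlier, with a heading line added by hand.

```sql
-- Default max_errors of 2: the heading line is rejected and loading continues
EXECUTE FUNCTION sp_load('items', '/tmp/items.csv', ',');

-- max_errors of 1: the heading line is the first error, so the load aborts
EXECUTE FUNCTION sp_load('items', '/tmp/items.csv', ',', 1);
```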

The REJECTFILE in /tmp contains the details of any errors. In the above case, it contained:

sp_load.items.rej

You can even create an external table to read a reject file in SQL with a “~” tilde column delimiter as follows (change the table name and file path to your own):

If you run “sp_load” on the standard “items” table, you will see this in the message log:

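Versions of “sp_load” before June 2024 specified EXPRESS, so that message served as a reminder that the load would be faster if the target table was not logged. The current listing no longer specifies it, as noted in the change log.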

The following does not cause that message:

For a huge data set, that would be massively quicker and avoid this error:

To alter an existing table to RAW, you must first drop referential constraints. The items table was previously created the lazy way with no explicit indexes or constraint names, according to the following beautified output from “dbschema -d stores_demo -t items -nw -q”:

To obtain those names, our article on Forest of Trees Indexes shows the following form of SQL:
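
The query from that article is not reproduced here. As a rough sketch of the same idea, the system catalog tables “sysconstraints”, “sysreferences” and “systables” can be joined as follows; the aliases are arbitrary.

```sql
-- List primary key (P), unique (U) and referential (R) constraints on items,
-- with the referenced table for each foreign key and the supporting index
SELECT
    t.tabname,
    c.constrtype,
    p.tabname AS refers,
    c.constrname,
    c.idxname
FROM sysconstraints c
JOIN systables t ON t.tabid = c.tabid
LEFT OUTER JOIN sysreferences r ON r.constrid = c.constrid
LEFT OUTER JOIN systables p ON p.tabid = r.ptabid
WHERE t.tabname = 'items'
AND c.constrtype IN ('P', 'R', 'U')
ORDER BY c.constrid;
```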

The data returned includes:

tabname  constrtype  refers  constrname  idxname
items    P                   u105_10     105_10
items    R           orders  r105_11     105_11
items    R           stock   r105_12     105_12

SQL to retain indexes but drop constraints might therefore be:

WARNING: dropping a primary key constraint also drops any foreign key constraints on other tables that reference it (none in this case). You need to record those beforehand and recreate them afterwards.

After loading, to reinstate logging, you would run:
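
A minimal sketch of the table type changes on either side of the load, using the ALTER TABLE ... TYPE syntax:

```sql
-- Before the load, once referential constraints have been dropped
ALTER TABLE items TYPE (RAW);

-- After the load, to make the table logged again
ALTER TABLE items TYPE (STANDARD);
```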

Something like this will have been recorded in the message log:

In some circumstances, you might get this error trying to use a table after an express load:

That’s quite hard to reproduce. The following will fix it without delay, though you would still want to run a real level 0 archive as soon as possible afterwards:

Best practice would be to recreate constraints with explicit names, using NOVALIDATE for speed (the indexes will already be there):
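
A sketch of what that could look like for “items” in “stores_demo”. The constraint names are invented for illustration, the key columns are assumed from the standard demo schema, and NOVALIDATE is shown only on the foreign keys.

```sql
-- Hypothetical constraint names; adjust to your own naming standard
ALTER TABLE items ADD CONSTRAINT
    PRIMARY KEY (item_num, order_num)
    CONSTRAINT pk_items;

-- NOVALIDATE skips checking existing rows against the parent table
ALTER TABLE items ADD CONSTRAINT
    FOREIGN KEY (order_num)
    REFERENCES orders (order_num)
    CONSTRAINT fk_items_orders NOVALIDATE;

ALTER TABLE items ADD CONSTRAINT
    FOREIGN KEY (stock_num, manu_code)
    REFERENCES stock (stock_num, manu_code)
    CONSTRAINT fk_items_stock NOVALIDATE;
```

NOVALIDATE is only appropriate if the loaded data is already known to satisfy the constraint.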

To minimise down time, a better alternative on a real system might be to complete all the above on a copy of the table, and switch them round afterwards with RENAME TABLE.
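
The switch itself could be sketched as follows, assuming the loaded copy is called “items_new” (a hypothetical name):

```sql
-- Keep the original under another name until the new table is verified
RENAME TABLE items TO items_old;
RENAME TABLE items_new TO items;
```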

If the table must be available to other sessions during a load and therefore cannot be exclusively locked, the following procedure provides a slower safe alternative even if the table is logged:

sp_dbload

EXECUTE PROCEDURE IFX_ALLOW_NEWLINE('t');

CREATE PROCEDURE sp_dbload
(
    table_name  VARCHAR(167),
    file_path   VARCHAR(255),
    delimit     CHAR(1) DEFAULT '|',
    skip_rows   INTEGER DEFAULT 0,   -- 1 for column headings
    max_errors  INTEGER DEFAULT 0,   -- bad rows before abort
    commit_rows INTEGER DEFAULT 2000 -- rows per commit
);

    -- Ingest a delimited text file into a table using "dbload"
    -- Doug Lawry, Oninit Consulting, February 2023

    DEFINE dbase_name   VARCHAR(167);
    DEFINE temp_file    VARCHAR(255);
    DEFINE pos, columns SMALLINT;

    ON EXCEPTION IN (-668)
        RAISE EXCEPTION -676, 0,
            'System command failed - see files in /tmp for details';
    END EXCEPTION;

    SET DEBUG FILE TO '/tmp/sp_dbload.debug';
    TRACE ON;

    IF table_name MATCHES '*?:?*' THEN

        -- Supplied table name prefixed with database

        -- Avoiding string functions not in older IDS
        FOR pos = 2 TO LENGTH(table_name) - 1
            IF SUBSTR(table_name,pos,1) = ':' THEN
                EXIT FOR;
            END IF
        END FOR

        LET dbase_name = SUBSTR(table_name,1,pos-1);
        LET table_name = SUBSTR(table_name,pos+1,167);

    ELSE

        -- Cannot use DBINFO('dbname') if procedure elsewhere

        SELECT TRIM(odb_dbname) -- main session database
        INTO dbase_name
        FROM sysmaster:sysopendb
        WHERE odb_sessionid = DBINFO('sessionid')
        AND odb_odbno = 0;

    END IF

    LET columns = NULL;

    PREPARE e_ncols FROM
        'SELECT ncols ' ||
        'FROM ' || dbase_name || ':systables ' ||
        'WHERE tabname = ?';

    DECLARE c_ncols CURSOR FOR e_ncols;
    OPEN    c_ncols USING table_name;
    FETCH   c_ncols INTO columns;
    CLOSE   c_ncols;
    FREE    c_ncols;
    FREE    e_ncols;

    IF columns IS NULL THEN
        -- "The specified table is not in the database"
        RAISE EXCEPTION -206, 0, table_name;
    END IF

    IF file_path MATCHES '*[;"''$]*' THEN -- prevent command injection
        RAISE EXCEPTION -676, 0, 'Invalid character in file path';
    END IF

    LET temp_file = '/tmp/sp_dbload.' || table_name;

    SYSTEM
        'umask 0 ; '      || -- create writable files
        'echo '           ||
        '"FILE '''        || file_path   || ''''     ||
        ' DELIMITER '''   || delimit     || ''''     ||
        ' '               || columns     || ';'      ||
        ' INSERT INTO \"' || table_name  || '\";"'   ||
        ' 2> '''          || temp_file   || '''.out' ||
        ' 1> '''          || temp_file   || '''.cmd';

    SYSTEM
        'umask 0 ; '      || -- create writable files
        'DELIMIDENT=y '   || -- quoted table name
        'dbload '         ||
        ' -d '''          || dbase_name  || ''''     ||
        ' -c '''          || temp_file   || '''.cmd' ||
        ' -l '''          || temp_file   || '''.rej' ||
        ' -e '            || max_errors  ||
        ' -n '            || commit_rows ||
        ' -i '            || skip_rows   ||
        ' -r '            || -- do not lock table
        ' 2> '''          || temp_file   || '''.out' ||
        ' 1>&2';

END PROCEDURE

DOCUMENT "
    -- Example usage:
    DATABASE stores_demo;
    EXECUTE PROCEDURE sp_dbload('items', '/tmp/items.unl');
    EXECUTE PROCEDURE sp_dbload('items', '/tmp/items.csv', ',', 1, 50);
    EXECUTE PROCEDURE sp_dbload('items', '/tmp/items.txt', CHR(9), 1);
    EXECUTE PROCEDURE sysadmin:sp_dbload('items', '/tmp/items.unl');
    DATABASE sysadmin;
    EXECUTE PROCEDURE sp_dbload('stores_demo:items', '/tmp/items.unl');
";

Nothing is returned, so this is a procedure, not a function.
The work is done in a sub-process by the dbload command line tool supplied with IDS.
Unlike external tables, there are separate optional procedure parameters “skip_rows” and “max_errors” corresponding to dbload argument options “-i” (ignore) and “-e” (errors tolerated), both defaulting to zero (no column headings, abort on first error).
The number of rows committed per transaction can be tuned, but experience shows you probably won’t need to change it from the default of 2000 rows. A line is output on every commit, providing progress reports when loading a large amount of data.
Unlike the other two functions, it can be run from or on another database.
The number of columns in the file is needed by dbload. That can be fetched from system catalog table systables since we are loading all columns. The procedure is aborted if the table name cannot be found.
File paths are rejected if they contain characters that might enable command injection.
A dbload command file is created and run, creating an output and reject file.
All files created are in “/tmp” and can subsequently be overwritten by other users.
DELIMIDENT is set in the dbload environment and double quotes put around the table name in the command file to handle special characters or mixed case.

The following example loads a file we created earlier, telling it to skip the first row as it contains column headings:
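
A sketch of such a call, assuming the file is the comma-delimited “/tmp/items.csv” with a heading line:

```sql
-- Fourth parameter skip_rows = 1 makes dbload ignore the heading line
EXECUTE PROCEDURE sp_dbload('items', '/tmp/items.csv', ',', 1);
```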

The command file it created was:

sp_dbload.items.cmd

The reject file was empty and the output file contained:

sp_dbload.items.out

To see what happens if there is an error in the file, we can load the same file again, this time telling it not to skip the first line but to tolerate one error:
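
Under the same assumption about the file name, that call could be sketched as:

```sql
-- skip_rows = 0 so the heading line is read as data; max_errors = 1 tolerates it
EXECUTE PROCEDURE sp_dbload('items', '/tmp/items.csv', ',', 0, 1);
```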

The same command file is produced, but the reject file is no longer empty, with the error details also in the output file:

sp_dbload.items.rej

sp_dbload.items.out

Conclusion

The functions in this article provide equivalents of LOAD and UNLOAD statements in clients where that SQL syntax is not available. Moreover, using external tables is much quicker.

Caveats

For all functions:

Using with IDS versions prior to 11.50 has not been tested.
Complex and binary column data types are not supported.
Specifying a comma delimiter does not ensure full CSV file handling.

For “sp_unload” and “sp_load”:

The user must have DBA or RESOURCE database privilege.
The function must be installed in each database where it will be used.

For “sp_load” and “sp_dbload”:

The file must contain all columns in the same order as the table.
A work-around for that could be devised using a VIEW and column DEFAULT values.
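
One possible shape for that work-around with “sp_dbload” is sketched below. It is untested here, and the view name, column order and file name are all hypothetical. It is unlikely to suit “sp_load” as listed, which issues SET CONSTRAINTS and LOCK TABLE against the name it is given.

```sql
-- Hypothetical view exposing only the columns present in the file, in file order
CREATE VIEW items_load (order_num, item_num, stock_num, manu_code, quantity) AS
    SELECT order_num, item_num, stock_num, manu_code, quantity
    FROM items;

-- total_price is not in the view, so each inserted row takes
-- that column's DEFAULT value, or NULL if it has none
EXECUTE PROCEDURE sp_dbload('items_load', '/tmp/items_partial.unl');
```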

For “sp_load” only:

The function will fail if the table is in use and therefore cannot be exclusively locked.
If the table is not logged, there are issues to consider as described above, but loading will be hugely faster.
If the table is logged, “Long Transaction Aborted” may result with larger data sets.

Disclaimer

Suggestions above are provided “as is” without warranty of any kind, either express or implied, including without limitation any implied warranties of condition, uninterrupted use, merchantability, fitness for a particular purpose, or non-infringement.

Contact Us

If you have any questions or would like to find out more about this topic, please contact us.

Author

Doug Lawry
Senior Informix Consultant
