sql delete

Informix Stored Procedure for Mass Delete


IBM Informix Dynamic Server (IDS) needs careful handling of data manipulation language (DML) operations affecting millions of rows. If performed in a single SQL statement on a logged database, the database engine must keep each affected row locked until it completes: even if no transaction has been started with BEGIN WORK, the statement still runs inside an implicit transaction.

Assuming the table has LOCK MODE set to ROW (the norm for OLTP) rather than PAGE (usually avoided due to concurrency problems), the result is millions of locks, particularly as they are required on each index as well as the table. The documentation for the LOCKS configuration parameter, which defines the initial size of the lock structure in the resident shared memory segment, states that each requires 100-200 bytes depending on the platform. Should it run out and need to allocate an extension lock structure in virtual shared memory, it may have to add new segments dynamically totalling gigabytes. Unless the Enterprise Edition is in use, this may cause user sessions to be starved of memory or even rejected.

Following a feature request by the author, configuration parameter SESSION_LIMIT_LOCKS is available from IDS 12.10.xC4 that you can use to prevent an excessive number of locks by any one session, but that will obviously result in the statement being aborted if the limit is reached.

Locks can be avoided entirely by placing an EXCLUSIVE lock on the table or by temporarily disabling logging on the database or altering the type to RAW, but this will most likely interfere with other users, break applications, and would invalidate any replication.

There is a second major problem with very large transactions: if so much data is affected that most of the logical log space has been consumed, depending on configuration parameter LTXHWM, the statement will encounter "Long Transaction Aborted". It can then take at least as long for the transaction to be rolled back, during which time the instance may be unusable if LTXEHWM has also been reached.

The right way to avoid this entirely is to split DML statements into smaller transaction affecting only a few thousand rows each at most. The two most common scenarios are when loading fresh data from a file or deleting a large number of rows. These can be safely achieved committing a few thousand rows per transaction with the dbload tool provided with IDS (see wrapper script at the end of this article) and the dbdelete open source ESQL-C program by Art Kagel (see also a Python version by Andrew Ford). Other scenarios may need to be specifically coded.

This article describes a stored procedure alternative to dbdelete based on it.