Loading data into index organized tables
19 Jun 2016Most of the time you don’t have full control of how your data gets loaded into the database, because it is often imposed by business rules. But in the case you have full control, you should take advantage of it.
I recently came across an index organized table which was kind of a denormalized form of other tables and which was truncated and reloaded every day before some heavy batch jobs queried it.
The data was loaded into the IOT with an insert as select statement. Because the data returned by the select was not ordered (or at least not ordered after the primary key columns of the IOT), the insert caused a lot of 50/50 block splits on the IOT.
I added a simple order by clause to the statement with the effect that only 90/10 block splits were done afterwards. This also resulted in a 50% smaller index size.
Here’s a small test case for demonstration:
Create a heap table which is later used to load data into the IOT
The c1 column will be the pk column of the later created IOT. I used a random function so the values for this column are not ordered in any way.
Remove any duplicate values for c1 as this would lead to unique constraint violations later.
Create the IOT
Check the index split statistics before inserting
Insert without order by clause
About ~3k block splits were done, all of them 50/50. The size of the index is ~23 MB.
Now insert with order by clause
(Open a new database session, so we have “fresh” counters in v$mystat
)
The amount of block splits is cut in half and because of the order by clause, only 90/10 block splits were done. The index size is also considerably smaller (12 MB compared to 23 MB).