One of the quotes I like most I learned from Tom Kyte from Oracle.
It’s one from Artemus Ward:
“It ain’t so much the things we don’t know that get us into trouble.
It’s the things you know that just ain’t so.”
This blog post is about one of those cases.
A customer complained about the performance of the index creation process in one of his process chains.
I had a look at the processes messages and found this:
[...] 28.03.2011 06:10:55 SQL: 28.03.2011 06:10:55 ALE_BWP_010 28.03.2011 06:10:55 CREATE BITMAP INDEX 28.03.2011 06:10:55 "SAPBWP"."/BIC/ECUBENAME~040" ON "/BIC/ECUBENAME" 28.03.2011 06:10:55 ( "KEY_SDIMENS1" ) PCTFREE 10 INITRANS 28.03.2011 06:10:55 20 MAXTRANS 255 STORAGE ( INITIAL 16 K NEXT 28.03.2011 06:10:55 81920 K MINEXTENTS 1 MAXEXTENTS UNLIMITED 28.03.2011 06:10:55 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 28.03.2011 06:10:55 BUFFER_POOL DEFAULT ) LOCAL PARALLEL TABLESPACE 28.03.2011 06:10:55 "PSAPBWP" NOLOGGING COMPUTE STATISTIC
--- > 6 hours !!!!????
28.03.2011 12:47:30 SQL-END: 28.03.2011 12:47:30 06:36:35 [...]
There was one index that actually took 6 hours and 36 minutes to create, although the table scan was done in parallel on Oracle level.
The other indexes on the same table only took some minutes (except one other that took something around 30 minutes).
So what could be the reason for that?
The next thing I looked up was the number of distinct values for the different dimension KEY columns (since these are the only ones getting indexes on fact tables):
OBJECT DIST NUM_ROWS (LF) BLK Table: /BIC/ECUBENAME 783908640 14192040 KEY_SDIMENS1 445965129 <<<<<< KEY_SDIMENS2 1529 KEY_SDIMENS3 3223 KEY_SDIMENS4 795 KEY_SDIMENS5 439 KEY_SDIMENS6 28578 KEY_SDIMENS7 1126484 KEY_SDIMENS8 83675 KEY_SDIMENS9 53296 KEY_SDIMENSA 2465897 KEY_SDIMENSB 170 KEY_SDIMENSC 194594206 <<<<<< KEY_SDIMENSD 1 KEY_SDIMENSP 1 KEY_SDIMENST 631 KEY_SDIMENSU 43
And this was already the hint required to solve this mystery.
Bitmap indexes are extremely efficient during lookup and when used to combine selection criteria.
What they don’t do very good is to handle large numbers of distinct values (always compared to the total number of rows in the table of course).
In this case, the dimension for which the index creation took six hours (column KEY_SDIMENS1) had more than half as much distinct values than the total number of rows in the table.
If you’re used to data modelling in BW you know that it’s recommended not to have that many distinct values in the dimensions and if it cannot be avoided then the dimension should be marked as line item dimension.
This can be done in the Infocube modelling in transaction RSA1 by right-clicking on the dimension and opening the properties dialogue.
Flagging a dimension as line item dimension changes the data model by that it removes the intermediate SID table that is normally put between fact tables and the actual dimension tables (for flexibility reasons) and joins the fact table directly with the dimension table.
It changes this :
[DIMENSION-TAB1] >-----< (SID-TAB1) >---< [[FACT TABLE]] >---< (SID-TAB2) >-----< [DIMENSION-TAB2]
[DIMENSION-TAB1] >-----< [[FACT TABLE]] >---< (SID-TAB2) >-----< [DIMENSION-TAB2]
Besides this change in the data model the index for the line item dimension should not be a BITMAP index anymore, but a B*TREE index instead. Of course, the SAP BW developers know about this and provided a way to avoid the creation of BITMAP indexes on such dimension key columns.
The fallacy the customer and I became victims of was to believe that flagging a dimension as line item automatically includes the setting for the B*Tree index. After all, this totally makes sense, doesn’t it?
Unfortunately, but correctly, the BW developers separated these two functions and provided two flags for infocube dimensions that can be set independently:
[ x ] Line Item
[ x ] High Cardinality
Only setting the High Cardinality flag leads to the creation of B*Tree indexes!
Fortunately, this can be done anytime, even when the infocube is already in use and filled with data.
This is of course documented (here) but as often, things mix up in the memory and we end up believing the wrong stuff.
Before I forget to mention it: as far as I know the High Cardinality flag really only does something on Oracle-based BW systems as the Bitmap Indexes are exclusively available on Oracle.
Maybe there are similar features on other platforms, but I’m not aware of them, so I leave them out for now (and avoid believing the wrong stuff once more…).
So, make sure to check your line item dimensions to avoid long indexing runs.