Q&A forums can be a treasure trove of interesting questions that turn out to have counter-intuitive answers.
In “Why does select statement taking too long when primary key is present?” the OP made an interesting observation:
Given: a table with approx. 89 million records, a primary key column automatically filled via an IDENTITY sequence, and some other columns with application data.
The query looked similar to this – the exact column names do not matter in this case.
    select top 1000
           aaa -- primary key column
         , aai
         , abi
         , _validfrom
         , _validto
    from io_test
Whenever the first 1000 records are selected without the primary key column, this takes approx. 4 ms.
But when the primary key column is included in the selection, the query takes 6721 ms, nearly 7 seconds (about 1750 times the runtime of the other query).
What is going on?
Why would selecting the primary key column make the query slower? Isn’t using the primary key a long-held performance improvement advice? After all, the primary key columns are indexed, aren’t they?
While it is true that the primary key columns have an index to enable the fast lookup of potential duplicate entries in the table, the query here does not have a WHERE clause and therefore no filter condition for which an index could be used.
A lead to follow
When I first saw the question, several pointers to possible causes had already been given, but none of them convinced me. Knowing that the HANA column store uses dictionary encoding for the values in a column, I realized that the primary key column has to allocate a lot more memory than the other columns: every value is unique, so no “dictionary compression” can take place.
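This size difference is visible in HANA’s monitoring views. A quick check along these lines compares the number of dictionary entries and the memory footprint per column (a sketch against M_CS_ALL_COLUMNS; verify the exact column names for your HANA release):

```sql
-- compare dictionary size drivers per column of the test table
-- (io_test is the example table from above)
select column_name
     , distinct_count          -- number of dictionary entries
     , memory_size_in_total    -- bytes allocated for the column
from m_cs_all_columns
where table_name = 'IO_TEST'
order by memory_size_in_total desc;
```

For a primary key column, distinct_count equals the record count, which is why its dictionary dwarfs those of the other columns.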
To return records to the client, HANA has to go and replace the internal pointers (value IDs) that are stored for every record with the actual values from the dictionary. This process, called “materialization”, should be the culprit of the long runtime, I figured and a quick PlanViz trace confirmed this.
The trace showed very clearly that both queries spent time on materialization, and that for the slower query this step accounted for the majority of the runtime.
All this is correct and makes sense, and so I went and wrote an answer based on the understanding that materializing is taking too long here. I added some ideas of how the large dictionary size – after all, 89 million distinct entries – would make the lookup slower.
And the OP accepted the answer. It even received a couple of upvotes. All good I thought.
The cold case haunts the detective
For a while, I was happy to have found an explanation, but then some doubt started to creep into my thoughts. The column value dictionary is not, as I originally put into my answer, a hash table for which the lookup would be an O(1) operation.
In reality, the data structure is a sorted list that is accessed via binary search.
The binary search worst-case effort is O(log n), which means that even a dictionary with two orders of magnitude more entries only needs about log2(100) ≈ 7 additional search steps per lookup.
Neither theory, hash table nor binary search, could explain a 1750-fold increase in runtime.
At this point, it was clear that the actual reason was still in the dark.
I started to do some more or less random poking around and found that the primary key column (aaa in my example) always showed up as UNLOADED in the system view M_CS_ALL_COLUMNS.
Knowing this, I checked the unload-trace file of the indexserver (using system view M_CS_UNLOADS would have been easier) and sure enough, found that the primary key column had been unloaded immediately after the query finished.
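The check via M_CS_UNLOADS would look roughly like this (a sketch; column names as documented for HANA’s monitoring views, so verify them on your release):

```sql
-- recent column unloads for the test table, newest first
select unload_time
     , table_name
     , column_name
     , reason         -- why the unload happened, e.g. low memory
from m_cs_unloads
where table_name = 'IO_TEST'
order by unload_time desc;
```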
The real reason
That was it! The primary key column was not in memory when the query started and HANA had to load it to perform the materialization.
With a LOAD io_test ALL statement I forced the table to be completely loaded into memory before running the TOP 1000 query again, and now the runtime was much better.
In fact, it was pretty much the same as the runtime without the primary key column.
Just one more thing
Great, another case solved, I thought.
But why was the column unloaded at all?
Checking the column statistics again, now with the primary key column loaded, I could see that it allocated a bit more than one GB, which is pretty big alright. But the HANA instance I was testing this on had 30 GB of RAM available and plenty of it was free.
Why would HANA unload a column when there was still more than enough memory free?
A first try to avoid this unloading was to set the UNLOAD PRIORITY of the table to 0, which should exclude the table from automatic unloading.
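Setting the unload priority is a table-level DDL statement; for the example table it would look like this:

```sql
-- priority 0 = exclude the table from automatic unloads
alter table io_test unload priority 0;
```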
To my disappointment this had zero effect on the unloading; after running the test query HANA unloaded the column again.
The next thing I looked at was the automatic memory “cleanup” of HANA that is aimed at avoiding the dreaded out-of-memory (OOM) errors that were so common in the earlier releases of HANA.
One of the parameters for this feature is an upper limit of how much memory is allocated before cleanup is performed.
On my HANA Express system, the parameter global.ini - [memoryobjects] - unload_upper_bound was set to 800 MB.
Clearly, every time the 1 GB-and-a-bit column was loaded, this triggered the cleanup and the subsequent unloading of the column.
Posting this post’s link on LinkedIn proved worthwhile when Robert Waywell (SAP) provided this piece of relevant information (emphasis mine):
It is worth noting that the setting of the global.ini – [memoryobjects] – unload_upper_bound to 800MB is specific to HANA Express (HXE) instances and is one of the pre-configured parameters set to help HXE run with a small memory footprint.
Robert Waywell (SAP), LinkedIn (11.02.2020)
The default value on full HANA instances is “0” which means that HANA automatically manages the unload behaviour, rather than enforcing a pre-set limit.
There is also an unload_upper_bound_rel parameter, which takes a percentage. The default value for that parameter is -1, which disables it.
Thanks a lot, Robert!
Very much appreciated.
After changing this parameter to a much larger value, the unloading stopped and the query with and without primary key column finished within a few milliseconds.
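For reference, such an ini-file parameter is changed via ALTER SYSTEM ALTER CONFIGURATION; a sketch (the value shown is purely illustrative, not a recommendation, and the parameter takes its value in bytes on my system):

```sql
-- raise the cleanup threshold system-wide; 2000000000 bytes is an example value
alter system alter configuration ('global.ini', 'SYSTEM')
    set ('memoryobjects', 'unload_upper_bound') = '2000000000'
    with reconfigure;
```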
While this explains the unloading, it does not mean you should go and change the parameter on your system now.
The parameter represents a choice, a tradeoff between the performance of a single query and the stability of the whole system, so one wants to be careful with it.
There you go, now you know.
Hi Lars, really, we need to go deeper to see what really happens behind the scenes. Thank you!!
Thank you for the insight and analysis. Very useful
Glad you liked it!