A journey of 1000 miles…
We are developers. We make the modern world tick and create the tools for the information workers. We are computer scientists, IT experts, coders, and we call ourselves engineers. We even renamed a part of our industry “software engineering” to get one step closer to this aspirational goal of being professionals when doing our jobs.
For several decades now, generations of IT-folk have come up with ways to measure, analyse, and regulate most aspects of software development. How to plan, how to estimate, how to implement, how to be social during development, how to be a human(e) developer, etc.
But looking at the day-to-day experience of many developers in, say, a “classic corporate environment” (i.e. banks, manufacturers, airlines, retailers, in short, whoever uses software like SAP’s ERP offering), the focus is very much not on engineering, but on compliance and “making it work”.
Now, getting things done™ is very important, no doubt, but just wrestling with the IDE as long as necessary to make it compile without errors or to pass some form of test (maybe just a tick against the requirement description) leaves little opportunity to actually learn anything about the solution we have just reported as “done”.
To me, this comes up often, when I see how DB-related code gets developed. More often than not, the first SQL statement (or HANA model) that looks like it does the job is the one that becomes the solution.
This approach can work if the developer has experience with the database in question. Yes, with the specific database: just knowing SQL syntax and how some of the tables “go together” is not sufficient. One needs to know the data and the workload on the DB to understand what new code will do.
But how to get this knowledge about the specific database and the workload on it? That sounds a lot like testing on the production system. Not sure about your environment, but I have rarely seen a customer where a full performance/scalability testing landscape had been set up for the developers. Instead, when the code “works” on the development and Q/A systems (usually with neither the data volume nor the workload of the production system), it gets deployed to production and any issues will have to be handled there.
Sure enough, the whole simulating data volume and workload business in a separate landscape is terribly expensive, especially in classic SAP scenarios. Even with cloud-based virtualisation, this is not a cheap thing to do.
So, what other options are there?
In this post I will look at three different implementations of a database extraction program. Each of the variants has its own pros and cons, so the main point, I guess, is to show
- there are always alternative ways to implement something
- runtime is not the only relevant aspect to measure when it comes to SQL code
- there is no “the runtime” of your code when it should run on a shared database
- it is not actually that hard to get a better understanding of your code.
If the above is interesting to you, read on. Otherwise, this is going to be a long bore for you…
So, what is this “database extraction” program about? The goal is to generate a list of all SAP HANA SYSTEM PRIVILEGES that are currently assigned to the DB users. This should include all the privileges that are indirectly assigned, i.e. via roles. Since roles can “contain” other roles and privileges, finding all privileges for a given user turns out to be a complicated task. To make life easier, SAP HANA provides the EFFECTIVE_PRIVILEGES system view, that does the job for us.
The tricky bit now is, that the EFFECTIVE_PRIVILEGES view requires a WHERE condition equality filter on USER_NAME. This is what it looks like when we forget to include the “equal predicate”.
SAP DBTech JDBC: : predicates are required in a where clause: M_EFFECTIVE_PRIVILEGES_ needs predicates on columns (connected by AND if more than one): an equal predicate on USER_NAME
If we want to use the view for our program, we need to put every user name into the equal filter.
Now, I hear the SQL-savvy developers already typing in their sub-select (WHERE user_name IN (SELECT user_name FROM users)). Yeah, nah!
This is not supported by the view. It really only deals with straight equal conditions.
Any other ideas?
Three solutions for the problem
Let’s look at three possible solutions. They all deliver the same result, so we consider them equally “correct”.
The first solution can reasonably be called “blunt”. We generate a list of all user names and fill an IN-list with those names.
“Why can you use IN-lists, when a sub-select doesn’t work?” – glad, you asked!
To generate this statement, I used a short Python script.
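The original script is not reproduced here, so the following is only a minimal sketch of how such a generator might look. The function name `build_in_list_statement` and the sample user names are hypothetical, and the `object_type` filter and `ORDER BY` clause are assumptions that mirror the other two variants; only the leading statement comment is taken from the trace output further down.

```python
def build_in_list_statement(user_names):
    """Build one SELECT against EFFECTIVE_PRIVILEGES with a literal IN-list.

    Hypothetical helper: the filter and sort order are assumed to match
    the cursor-loop and MAP_MERGE variants of this post.
    """
    if not user_names:
        raise ValueError("need at least one user name")
    # Double embedded single quotes so every name is a valid SQL string literal.
    literals = ", ".join(
        "'" + name.replace("'", "''") + "'" for name in user_names
    )
    return (
        "SELECT /* user_sys_privs blunt user list */ "
        "user_name, grantee_type, privilege "
        "FROM effective_privileges "
        f"WHERE user_name IN ({literals}) "
        "AND object_type = 'SYSTEMPRIVILEGE' "
        "ORDER BY user_name ASC, grantee_type ASC, privilege ASC"
    )


if __name__ == "__main__":
    # Made-up names; in practice they would come from SELECT user_name FROM users.
    print(build_in_list_statement(["SYSTEM", "DEVDUDE", "O'BRIEN"]))
```

Note that the generated text grows with every user in the system, which is worth keeping in mind when the memory figures come up later.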
The second option is to make the database loop over all user names, execute the SELECT against EFFECTIVE_PRIVILEGES for every single one of them, and carefully keep the results computed so far.
This is what I do with the cursor in this implementation.
The main “trick” is of course the UNION ALL where the current result “t_r” is combined with the overall result so far “result” and then assigned to “result” again. If you have ever written a hierarchical common table expression (WITH clause) in another DBMS, you know the drill here.
    DO BEGIN
        /* user_sys_privs cursor loop */
        DECLARE cursor c_users for
            (SELECT user_name FROM users);

        -- initialize result structure
        result = SELECT user_name, grantee_type, privilege
                 FROM effective_privileges
                 WHERE user_name = '' AND 1=0;

        FOR r_user AS c_users DO
            t_r = SELECT user_name, grantee_type, privilege
                  FROM effective_privileges
                  WHERE user_name = :r_user.user_name
                    AND object_type = 'SYSTEMPRIVILEGE';

            result = SELECT * FROM :t_r
                     UNION ALL
                     SELECT * FROM :result;
        END FOR;

        SELECT user_name, grantee_type, privilege
        FROM :result
        ORDER BY user_name ASC, grantee_type ASC, privilege ASC;
    END;
Lastly, for the SELECT and UNION approach, SAP HANA provides a special feature. Usually, one should follow the advice of the culinary industry and avoid single-use tools, but since the problem we are solving here is HANA specific, we might as well use some proprietary commands. The command I am referring to is the MAP_MERGE function/operator.
The idea with this operator is that if we tell HANA specifically that we want to run the UNION-trick (from above) for a specific function, then HANA can do this more efficiently.
All we need for this is a function to refer to. So this solution first defines a user-defined-table function that returns the data for a single user and then calls this function as part of the MAP_MERGE operator.
In code, this looks as follows:
    CREATE FUNCTION eff_priv (IN user_name NVARCHAR(256))
    RETURNS TABLE ( user_name    NVARCHAR(256)
                  , grantee_type NVARCHAR(4)
                  , PRIVILEGE    NVARCHAR(256))
    AS
    BEGIN
        return SELECT user_name, grantee_type, privilege
               FROM effective_privileges
               WHERE user_name = :user_name
                 AND object_type = 'SYSTEMPRIVILEGE';
    END;

    DO BEGIN
        /* user_sys_privs map_merge function */
        uns = SELECT user_name FROM users;

        res = MAP_MERGE (:uns, eff_priv(:uns.user_name));

        SELECT user_name, grantee_type, privilege
        FROM :res
        ORDER BY user_name ASC, grantee_type ASC, privilege ASC;
    END;
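To make the difference in shape between the cursor loop and MAP_MERGE a bit more tangible, here is a conceptual analogy in Python. It is not how HANA implements either construct. `FAKE_PRIVILEGES` is made-up stand-in data, and the thread pool only illustrates the “map a function over every row, then merge” pattern, not real CPU parallelism.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for EFFECTIVE_PRIVILEGES: user -> system privileges.
FAKE_PRIVILEGES = {
    "ALICE": ["CATALOG READ", "TRACE ADMIN"],
    "BOB": ["CATALOG READ"],
    "CAROL": [],
}


def eff_priv(user_name):
    """Per-user lookup, playing the role of the eff_priv table function."""
    return [(user_name, priv) for priv in FAKE_PRIVILEGES.get(user_name, [])]


def cursor_loop(user_names):
    """Sequential variant: one lookup at a time, result grows step by step."""
    result = []
    for name in user_names:
        # Corresponds to: result = t_r UNION ALL result
        result = eff_priv(name) + result
    return result


def map_merge(user_names, workers=4):
    """Map variant: apply the function to every row, then merge the parts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(eff_priv, user_names)
        return [row for part in parts for row in part]


if __name__ == "__main__":
    users = list(FAKE_PRIVILEGES)
    # Both variants produce the same rows; only the row order differs.
    print(sorted(cursor_loop(users)) == sorted(map_merge(users)))
```

The point of the analogy: in the loop, each step depends on the result of the previous one, whereas the map variant states up front that the per-user calls are independent of each other.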
Thanks for the code dump – which one is the best, you said?
It is time to get to the meat of this post.
While the code examples may be helpful or instructional, the interesting part (to me at least) is now to get a better understanding of how these statements differ from one another.
Which is the fastest? Which uses the most CPU? (if you think this is the same question, think again!) Which uses the least amount of memory? Which one will impact the production HANA instance the least?
HANA experts will probably point to analysis tools like EXPLAIN PLAN or PlanViz (there is an OpenSAP course coming up for that), but frankly, these are not the tools for the typical SAP developer. The crux of those tools is that one already needs to know how HANA works internally to make good use of them. This is not limited to HANA, of course; the same is true for other DBMS.
What I am going to use instead, are tools that are available to everyone with a HANA developer system (e.g. HANA express edition).
To start simple, we will be looking at the CPU usage of our HANA instance when we run our code. This should provide us with a form of visual impression of the kind of workload each code bit produces.
I use htop for this.
To simulate the workload of the “rest of the system”, i.e. whatever is happening in the DB aside from our code, I use Apache JMeter.
To have some queries to run, I created a SAKILA database schema and collected publicly available queries to run against it. This is in no way a proper simulation of actual system workload, but I figure it is closer to the real thing than running on an empty private database.
The following shows what the JMeter test plan looks like:
In addition to that, we will also query HANA to tell us details about the resource usage of the different implementations.
Getting a baseline impression
Looking at some CPU usage bar graphs will likely not be terribly useful without a baseline reference of what it looks like when the “normal” workload runs without the new code.
This is what the following video shows.
Baseline – the “background” workload
The “background” workload chugs along steadily, with relatively even utilisation of all CPU threads. So, this is what “normal” looks like for us.
Checking the code
Next is to run our code (finally!) and keep an eye on how this looks for each variant in htop.
First, let’s try this without background workload.
What have we seen in those three videos?
Both IN-list and CursorLoop lead to a CPU utilisation pattern where a single CPU thread is heavily used, while the remaining CPU threads are more or less idle.
In contrast, the MapMerge code puts the pedal to the metal and uses whatever CPU thread is available.
Unsurprisingly, this burst of computing efforts is over much quicker than the other two options.
Results, the first
The first round of “performance testing” gives us the following table.
|Query Scenario||Query Runtime|
|IN-List||~ 9.2 secs|
|CursorLoop||~ 6.8 secs|
|MapMerge||~ 1.2 secs|
If all we wanted to know is which version is the fastest, we could stop here.
MapMerge runs circles around the other two implementations.
The other two implementations are much slower, and the CursorLoop also seems to ignore the strong warning from the HANA performance guide and many blog posts: do not use CURSORS if you want performance.
While this advice is generally correct, our example is a nice exception to the rule: the CursorLoop (~ 6.8 secs) still beats the single-statement IN-List (~ 9.2 secs).
To a more complete picture
But I didn’t build the whole test setup for nothing, so of course, we run the statements also with the “background” workload in full swing.
These videos are similar to the first set. IN-List and CursorLoop lead to high single thread usage, while the MapMerge version grabs all CPU threads.
But what happened to the “background” queries? You know, the “other” queries that represent the existing, money-making workload of our HANA system?
Have those queries maybe been affected in any way by the new code?
JMeter provides nice, straightforward reporting tools, so let’s have a look at those.
Response Time graphs of the background queries
Again, let’s first look at the baseline response time graph for our four SAKILA test queries.
Baseline – the “background” workload
After some ramp-up, all of the queries dial-in below 50ms response time. It’s very consistent and also very lightweight.
Now, let’s look at what changes when the new code is executed.
To illustrate the effect a bit better, each of the code versions is executed a few times in a row.
For the IN-list it is very easy to spot the spikes in the response times.
For the CursorLoop the effect on our test queries is less extreme than with the IN-list, but the overall response time is still worse than without the new code.
The “red” query is now consistently above 50ms.
Finally, with the MapMerge code we see very clearly, where all the CPU threads went: not to our test queries at all.
The spikes are of very short duration, that is true, but if your organisation is trying to keep the response times of production queries within two standard deviations, the MapMerge code just blew this goal out of the water.
Results, the second
Let’s look at the response time table for the scenario with the “background” noise workload. Note that I included the numbers for the “solo” runs for comparison.
|Query Scenario||Background Throughput||Query Runtime|
|IN-List||–||~ 9.2 secs|
|IN-List + Background||119.3/sec||~10.5 secs|
|CursorLoop||–||~ 6.8 secs|
|CursorLoop + Background||118.7/sec||~ 16.6 secs|
|MapMerge||–||~ 1.2 secs|
|MapMerge + Background||117.3/sec||~ 1.2 secs|
As expected from the graphs above, the worse response times of the test queries are also reflected in the throughput of this workload. Without any of the new code running, the test queries managed 119.7 executions per second.
With the MapMerge code, this number is reduced to 117.3 executions per second – and we only ran our code three times during the 1-minute window of the test workload.
Given this, MapMerge is maybe not the single “best” solution.
But let’s also have a look at how the other implementations did with workload present in the system: both IN-List and CursorLoop slowed down, and CursorLoop took more than twice as long as before.
However, neither of those implementations did impact the workload throughput as badly as the MapMerge code did.
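The relative effects can be worked out from the numbers in the table with a few lines of arithmetic. This is just a sketch; the helper names are made up for illustration, and the inputs are the single measurements reported in this post, not statistically robust figures.

```python
BASELINE_THROUGHPUT = 119.7  # background executions/sec without any new code

# name -> (solo runtime [s], runtime with background [s], background throughput [1/s])
RUNS = {
    "IN-List": (9.2, 10.5, 119.3),
    "CursorLoop": (6.8, 16.6, 118.7),
    "MapMerge": (1.2, 1.2, 117.3),
}


def slowdown_factor(solo, loaded):
    """How many times longer our code runs when the background load is present."""
    return loaded / solo


def throughput_loss_pct(throughput, baseline=BASELINE_THROUGHPUT):
    """Relative drop of the background throughput, in percent."""
    return (baseline - throughput) / baseline * 100


for name, (solo, loaded, tput) in RUNS.items():
    print(
        f"{name:<10} slowdown x{slowdown_factor(solo, loaded):.2f}  "
        f"background throughput -{throughput_loss_pct(tput):.2f}%"
    )
```

Seen this way, MapMerge itself is not slowed down at all, but it costs the background workload roughly 2% of its throughput, compared with about 0.3% for the IN-List and a bit under 1% for the CursorLoop.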
At this point, it should be clear that this is not going to be a “three simple tips to make your SQL fast” post. We have three implementations with different runtime performance characteristics. What else can we learn about those statements with relative ease?
Well, HANA records how much CPU and memory any statement uses.
Expensive Statements and the SQL Plan Cache
Expensive statements trace
    OPERATION           |DUR_SEC|STMT_STRING                                       |MEM_MB|
    --------------------|-------|--------------------------------------------------|------|
    AGGREGATED_EXECUTION|   9.29|SELECT /* user_sys_privs blunt user list */ u     |811.16|
    CALL                |   1.82|DO BEGIN /* user_sys_privs map_merge function */  | 21.07|
    CALL                |   7.58|DO BEGIN /* user_sys_privs cursor loop */ DECLARE |  7.71|
With a small custom query against the M_EXPENSIVE_STATEMENTS view, we can get some fundamental CPU/memory usage key figures as seen above.
A precondition for using this view is that the expensive statements trace is active (and the filter conditions are fine enough to “catch” our code).
What we see now confirms our impression about the statement runtime (duration): MapMerge finishes first, CursorLoop second, and the IN-List comes last.
But looking at the recorded memory usage we see another kind of “cost” for running our code. The simple IN-list statement clocks in with using 811.16 MB(!). This is nearly a GB of memory that cannot be used by any other query while this statement runs.
MapMerge is much better with 21 MB, but it still needs nearly three times as much memory as the CursorLoop (7.71 MB).
Shared SQL Plan Cache
Another and slightly different (very different, in fact, but for our purpose here it’s “slightly”) source of resource usage data for queries is the Shared SQL Plan Cache.
    EXECS|TOTAL_CURSOR_DUR_SEC|TOTAL_EXEC_MEM_MB|PLAN_MEM_MB|STMT_STRING
    -----|--------------------|-----------------|-----------|-----------
        1|             9080.22|           811.10|       4.48|SELECT /* user_sys_privs blunt user list */ user_name, grantee_type, privilege FROM effective_privileges WHERE user_name IN ( 'DEVDUD
    -----
        1|                3.98|             1.26|       0.03|SELECT user_name, grantee_type, privilege FROM "SYS"."_SYS_SS2_TMP_TABLE_0_RESULT_4_BA87DD2EBFEAAC44A33DBDE252A1BF77_2
        1|               14.07|             2.60|       0.03|SELECT user_name, grantee_type, privilege FROM "SYS"."_SYS_SS2_TMP_TABLE_0_RES_2_EEA3FA5F1AB5D84B964C200608D349CF_2" "
     1018|             5660.31|         21453.52|       0.16|/* procedure: "DEVDUDE"."EFF_PRIV" variable: _SYS_SS2_RETURN_VAR_ line: 8 col: 5 (at pos 192) */ SELECT user_name, grantee_type, privilege
        1|                1.80|             3.21|       0.02|SELECT "USER_NAME" FROM "SYS"."_SYS_SS2_TMP_TABLE_0_UNS_2_EEA3FA5F1AB5D84B964C200608D349CF_2"
        1|                1.71|             0.07|       0.03|/* procedure: "DEVDUDE"."(DO statement)" variable: UNS line: 2 col: 6 (at pos 54) */ SELECT user_name FROM users
        1|                0.00|             5.19|       0.02|DO BEGIN /* user_sys_privs map_merge function */ uns = SELECT user_name FROM users; res = MAP_MERGE (:uns, eff_priv(:uns.user_name));
    -----
     1018|             6340.59|          1563.29|       0.17|/* procedure: "DEVDUDE"."(DO statement)" variable: RESULT line: 20 col: 11 (at pos 651), procedure: "DEVDUDE"."(DO statement)" variable: T_R line: 12
        1|                0.16|             7.71|       0.18|/* procedure: "DEVDUDE"."(DO statement)" variable: RESULT line: 6 col: 7 (at pos 161) */ SELECT user_name, grantee_type, privilege
        1|             7551.59|             7.71|       0.03|/* procedure: "DEVDUDE"."(DO statement)" variable: C_USERS line: 11 col: 21 (at pos 350) */ (SELECT user_name FROM users)
        1|                0.00|             5.29|       0.03|DO BEGIN /* user_sys_privs cursor loop */ DECLARE cursor c_users for (SELECT user_name FROM users); -- initialize result struc
Now, this is a lot more complicated and confusing, so let’s unpack this a bit.
Except for the single SELECT IN-List solution, the implementations use several SQL commands in loops to compute the result. Since every SQL statement needs to be processed by the HANA SQL “engine” in order to be executed, we find entries for those executions in the Shared SQL Plan cache. It is possible that statements do not use the Shared SQL Plan Cache, but for our scenario let’s assume that this is not the case here.
I’ve grouped the different statements so that the ones belonging to the same code are in subsequent rows.
What we learn here is that besides runtime and “working memory” that is used during the execution, there is another memory requirement for every statement: the memory required to store the SQL plan itself.
For the IN-List, just storing the plan requires 4.48 MB, which is a lot if you consider how much is possible with that (e.g. the whole Secret of Monkey Island game for the Amiga came on four 880 KB disks = 3,520 KB).
Our other two implementations come in at 0.29 MB (MapMerge) and 0.41 MB (CursorLoop).
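These per-implementation totals are simply the sums of the PLAN_MEM_MB values of the grouped rows in the plan cache listing. A short sketch to re-trace the arithmetic, including the floppy disk comparison (assuming 1 MB = 1024 KB); the variable and function names are only illustrative:

```python
# PLAN_MEM_MB values copied from the plan cache listing, grouped per implementation.
PLAN_MEM_MB = {
    "IN-List": [4.48],
    "MapMerge": [0.03, 0.03, 0.16, 0.02, 0.03, 0.02],
    "CursorLoop": [0.17, 0.18, 0.03, 0.03],
}


def total_plan_mem(values):
    """Sum the plan sizes of one implementation, rounded to two decimals."""
    return round(sum(values), 2)


totals = {name: total_plan_mem(vals) for name, vals in PLAN_MEM_MB.items()}

# Four 880 KB floppy disks versus the IN-List plan, both expressed in KB.
floppy_kb = 4 * 880
in_list_plan_kb = totals["IN-List"] * 1024

for name, mb in totals.items():
    print(f"{name:<10} {mb:.2f} MB plan memory")
print(f"IN-List plan: {in_list_plan_kb:.0f} KB vs. {floppy_kb} KB on four floppies")
```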
Another interesting bit to learn here is the number of executions (EXECS) for each of the statements. Our “loopy” constructs show statements that got executed 1018 times.
Why is that?
So glad, you asked (again).
Our code loops over all users in the system. A quick
    SELECT count(*) FROM users;

    COUNT(*)|
    --------|
        1018|
reveals that there are currently 1018 users in my test system.
And yes, I forgot to mention that I created a number of users, roles, and role assignments for this test scenario so that our code would have something to work with.
Also, this kind of requirement (regularly dumping out current privilege assignments of all users to document the security setup) is more common in larger organisations with stricter compliance responsibilities. Having thousands of users in the HANA system is not outlandish for those organisations.
Which of the implementations would you choose?
Is the fastest (MapMerge) still the best for our scenario?
What about the huge memory demands of the IN-List solution? Also, as soon as the list of users changes, the statement will have to be regenerated and re-compiled. The two other solutions are “stable” in terms of the active code.
The CursorLoop is self-contained and does not require any function to be created in the DB ahead of time, so it may be more readily usable (think of using it to monitor many HANA systems across the landscape).
Of course, it all boils down to the specific requirements and circumstances. And weighing this up and making a decision is exactly the kind of engineering work our industry aspires to do.
There you go, now you know.
Maybe, if there’s interest in it, I’ll publish how to create the scenario presented here, but for now, this is already a long read.