A common requirement for SQL DB modeling is to have something that could be called conditional unique constraint.
The idea is that certain combinations of column values should only occur once if at all, while other combinations are allowed to occur in any number.
That certainly sounds a bit theoretic, so let’s have an example.
Say we have a table
STUFF in which we track the status of, well, stuff.
Each individual item of stuff can be identified by its
For every entry we store a
STATUS value and a transaction number
Now, the rule is that any item can have any status value multiple times but status 100 should only ever occur once per item. Let’s pretend status 100 represents the initial creation of the item or something similarly unique.
The common unique constraint cannot be a solution for this, because it would require all combinations of
STATUS to be unique.
Several RDBMS offer check constraints for this kind of model, but SAP HANA currently (HANA 2 SPS03) only support simple search conditions for those and no sub-queries.
If you’ve been reading any of my comments and posts including triggers, you probably noticed that I try to avoid using them whenever possible.
Tom Kyte of Oracle made really good arguments against triggers https://blogs.oracle.com/oraclemagazine/the-trouble-with-triggers and these hold true on other DBMS.
For the conditional unique constraint on SAP HANA, however, using triggers is one possible solution.
Let’s have a table
create column table stuff (id integer, status integer, tx_seq integer);
Insert some data
create sequence tx_seq; insert into stuff values (1, 10, tx_seq.nextval); insert into stuff values (1, 100, tx_seq.nextval); -- <- final state = 100 insert into stuff values (2, 10, tx_seq.nextval); insert into stuff values (2, 10, tx_seq.nextval); -- <- duplicate state = 10 insert into stuff values (1, 80, tx_seq.nextval);
Create the trigger
The trigger should ensure that the data in the table cannot be changed so that the “conditional unique” rule is broken. That means it has to run before anything is written into the table. It also needs to run for each and every record for both
When the trigger finds that the table already contains an entry for the current
ID and status
100 we want a proper error message for this.
The trigger for that could look like the following:
create trigger stuff_unique_status before update or insert on stuff referencing new row as new for each row begin declare cnt integer; declare non_unique_final_status condition for SQL_ERROR_CODE 10001; declare exit handler for non_unique_final_status resignal; -- only check if status is 100 if :new.status = 100 then select count(*) into cnt from stuff where status = 100 and id = :new.id; if :cnt > 0 then SIGNAL non_unique_final_status set message_text ='Final status 100 must be unique per ID. Affected ID: '||:new.id; end if; end if; end;
Try out if the trigger works
Let’s insert data that would violate the condition:
insert into stuff values (1, 100, tx_seq.nextval); /* Could not execute 'insert into stuff values (1, 100, tx_seq.nextval)' in 53 ms 755 µs . SAP DBTech JDBC: : user-defined error: "DEVDUDE"."STUFF_UNIQUE_STATUS": line 5 col 50 (at pos 274):  (range 3) user-defined error exception: Final status 100 must be unique per ID. Affected ID:1 */
The same happens if we try to update data so that a duplicate would occur:
update stuff set status = 100 where id = 1 and tx_seq = 1; /* Could not execute 'update stuff set status = 100 where id =1 and tx_seq=1' in 37 ms 66 µs . SAP DBTech JDBC: : user-defined error: "DEVDUDE"."STUFF_UNIQUE_STATUS": line 5 col 50 (at pos 274):  (range 3) user-defined error exception: Final status 100 must be unique per ID. Affected ID:1 */
We see that it paid off to create a custom condition and to include the ID value for which the exception condition was raised. It’s easy not to go back and review what caused the faulty data insert.
What about performance ?
When we have the DBMS run code for every record that gets modified in the table, that of course leads to longer processing times.
In order to get a feeling for this difference, let’s just insert some more data into our table and see how long it takes. Then we repeat the insert with a table with no trigger on it.
-- some data insert into stuff (select top 100000 row_number() over() , to_integer(floor (rand()*100) ) , tx_seq.nextval from objects cross join objects ); /* Statement 'insert into stuff (select top 100000 row_number() over() , to_integer(floor (rand()*100) ) , ...' successfully executed in 29.261 seconds (server processing time: 29.259 seconds) - Rows Affected: 100000 */
create column table stuff2 (id integer, status integer, tx_seq integer); insert into stuff2 (select top 100000 row_number() over() , to_integer(floor (rand()*100) ) , tx_seq.nextval from objects cross join objects ); /* Statement 'insert into stuff2 (select top 100000 row_number() over() , to_integer(floor (rand()*100) ) , ...' successfully executed in 6.071 seconds (server processing time: 6.070 seconds) - Rows Affected: 100000 */
That’s 6 seconds vs. 29 seconds with the trigger; quite a difference and if you’re doing mass data loading or high-speed transaction capturing it might not be viable to keep the trigger active while loading the data.
What about adding an index?
Having a trigger active on a table adds runtime in itself and there does not seem to be a way around this. Even if I only load records with a status different from 100, the trigger is executed (to check whether or not the status is 100).
We can, however, allow for a potential speedup of the COUNT query by adding an index.
Since the query always targets one specific combination of ID and
STATUS in an
EQUAL comparison, I use the inverted hash index type here to safe memory.
create inverted hash index unique_final_status on stuff (id, status);
Running the aggregate query in PlanViz confirms that the index is used for this query:
Name: Basic Predicate ID: cs_plan2_hxehost_39003_qoTableSearch1_qoTableSearchNode1 Summary: (ID,STATUS) = (2500,100) Schema: DEVDUDE Execution Time (Inclusive): 5.925 ms Execution Time (Exclusive): 5.925 ms Execution Start Time: 7,127.854 ms Execution End Time: 7,133.791 ms Estimated Result Size: 1 Estimation Time: 0.011 ms Evaluation Time: 5.914 ms Index Usage:: using INDEX ON (ID,STATUS) Inverted Index: used
There you go, now you know!
I will guess this is MySQL because it is what I need it to be right now. So I’m going to try it in a moment and I will came back to confirm.
Nope, this is not MySQL
That’s right – this article is about SAP HANA DB. Best of luck, though, with your MySQL DB!