Analysing the data in SAP systems can be done in many ways. While SAP would like all customers to use the tools from its portfolio, many customers prefer other platforms, and Snowflake is arguably one of the most popular choices.
If getting the SAP tables into a different data platform were all it took to start writing SQL queries and getting answers, I suspect a lot more SAP customers would move their analytics stack. Alas, it is not that easy at all.
SAP’s database design is peculiar and littered with design choices from decades long past, when saving a few bytes of storage could greatly improve the system’s performance or reduce its resource consumption.
SAP’s peculiar designs
One example of such a design choice is the SAP Factory Calendar table TFACS. I have written a bit about this table before – the TL;DR version is this:
The table holds a single(!) row per factory calendar and year to mark each day of the year as “working day” or “non-working day”. This is accomplished by storing strings of 1 and 0 characters, one per day, in twelve columns named MON01 to MON12. A typical row could look like this:
IDENT JAHR MON01                           MON02                        ...
AU    2018 0111100111110011111001111000111 1100111110011111001111100111 ...
In order to find out whether a certain date is a working day, one has to pick the row for the correct factory calendar and year, get the MON... string for the month we’re interested in, and then check whether the nth character in this string is 1 or 0.
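To make this concrete, here is a minimal SQL sketch of that manual lookup, checking whether 2018-01-02 (day 2, so character position 2 of MON01) is a working day in the sample calendar AU:

-- sketch: is 2018-01-02 a working day in factory calendar 'AU'?
-- day 2 of January corresponds to character position 2 of MON01
select substr(MON01, 2, 1) = '1' as IS_WORKING_DAY
from TFACS
where IDENT = 'AU'
  and JAHR  = '2018';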
Obviously, SAP encapsulated this logic in callable ABAP functions, but also in the HANA SQL function WORKDAYS_BETWEEN.
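For reference, calling the HANA built-in looks roughly like this – a sketch from memory, where the last argument names the schema that holds TFACS and 'SAPABAP1' is just a hypothetical placeholder:

-- HANA SQL (sketch): working days between two dates for calendar 'AU'
-- 'SAPABAP1' is a hypothetical name for the schema containing TFACS
SELECT WORKDAYS_BETWEEN('AU', TO_DATE('2018-02-01'), TO_DATE('2018-05-01'), 'SAPABAP1')
FROM DUMMY;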
That is all great and dandy, but what do you do when you cannot use HANA SQL for some reason? In earlier HANA versions, this situation could occur when creating calculation views that did not support SQL functions. Because of this, several workarounds have been created in the SAP Community (for example this or this).
This may have solved the issue for users of the SAP platforms, but for Snowflake users the question is not fully answered yet.
Trying this on Snowflake
As I recently started to learn more about Snowflake, I thought this might be a good exercise to find my way around the platform. This is my take on the problem.
The first observation is that, ideally, I would like to have a calendar table with a record for every day that tells me whether or not any given day is a working day. Looking at TFACS, we are several steps away from that.
Pivot the month columns into rows
Instead of a record for every day, the table has a record for every year and spreads the year out over twelve columns. To change that, a PIVOT-like transformation is needed. Nothing easier than that:
with tfacs_pivot as (
select IDENT, JAHR, MON01 as DAY_FLAGS, '01' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON02 as DAY_FLAGS, '02' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON03 as DAY_FLAGS, '03' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON04 as DAY_FLAGS, '04' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON05 as DAY_FLAGS, '05' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON06 as DAY_FLAGS, '06' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON07 as DAY_FLAGS, '07' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON08 as DAY_FLAGS, '08' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON09 as DAY_FLAGS, '09' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON10 as DAY_FLAGS, '10' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON11 as DAY_FLAGS, '11' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON12 as DAY_FLAGS, '12' as "MONTH" from TFACS),
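As an aside: Snowflake also offers a native UNPIVOT, which could express the same transformation more compactly. A sketch (I have not run this against a full TFACS copy) in which SUBSTR peels the month number out of the generated column name:

-- alternative sketch using Snowflake's UNPIVOT
select IDENT, JAHR, DAY_FLAGS
     , substr(MONTH_COLUMN, 4, 2) as "MONTH"   -- 'MON01' -> '01'
from TFACS
unpivot (DAY_FLAGS for MONTH_COLUMN in
         (MON01, MON02, MON03, MON04, MON05, MON06,
          MON07, MON08, MON09, MON10, MON11, MON12));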
Decode the “bit field”
Next is to decode the “bit field” now available in the DAY_FLAGS column.
This seems to ask for “imperative” code; something like a loop over each character in the string comes to mind. I looked up how Snowflake does functions and stored procedures and realised that I really don’t want to introduce Python or JavaScript code for something simple like this.
Instead, I used the generator() function in Snowflake to create a record for each possible character in DAY_FLAGS. As these flags represent days in a month, there are at most 31 records to be created.
day_point as (
select tfp.*
, length(DAY_FLAGS) AS NUM_FLAGGED_DAYS
, ROW_NUMBER() OVER (PARTITION BY IDENT, "JAHR", "MONTH" ORDER BY "JAHR", "MONTH") as FLAG_POINTER -- partition by IDENT too, so the count restarts per factory calendar
from tfacs_pivot tfp
cross join table ( generator (rowcount => 31)) d
)
In this common table expression, the generator function creates 31 rows for every record of the pivoted table.
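To see the generator in isolation: it simply produces the requested number of (empty) rows, which can then be numbered. For example, this standalone query returns the numbers 1 to 31:

-- standalone generator example: one row per possible day of a month
select row_number() over (order by seq4()) as DAY_OF_MONTH
from table(generator(rowcount => 31));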
What is missing now is to pick out the correct character for each day from the DAY_FLAGS column. This will be done in the next step via SUBSTR(), but for that we need to know the day of the month for each row we just created. For this I use the ROW_NUMBER() function and make sure to restart the count with every new calendar, year, and month. This computation is called FLAG_POINTER for now, even though it really is the day of the month.
Construct date-values
The next and final step is to actually “decode” the flags and to create a proper SQL date column:
select dp.*
, date_from_parts (JAHR, "MONTH", FLAG_POINTER) as FACTORY_DATE
, substr(dp.DAY_FLAGS, FLAG_POINTER , 1) as day_marker
, to_boolean (DAY_MARKER) as IS_WORKING_DAY
from day_point dp
where FLAG_POINTER <= NUM_FLAGGED_DAYS);
Here I used DATE_FROM_PARTS() to assemble a proper date from the JAHR, MONTH, and FLAG_POINTER columns. As planned, I then use SUBSTR() to extract the character associated with the current day as DAY_MARKER.
Finally, I convert this DAY_MARKER into a boolean. A nice surprise – to me at least – was that Snowflake SQL let me reference the DAY_MARKER alias immediately, within the same SELECT list.
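As a minimal standalone illustration of this alias reuse:

-- Snowflake resolves the alias DAY_MARKER within the same SELECT list
select '1' as DAY_MARKER
     , to_boolean(DAY_MARKER) as IS_WORKING_DAY;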
The result
And that is the complete implementation of the SAP Factory Calendar view I wanted:
create view SF_SAP_FACTORY_CALENDAR as (
with tfacs_pivot as (
select IDENT, JAHR, MON01 as DAY_FLAGS, '01' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON02 as DAY_FLAGS, '02' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON03 as DAY_FLAGS, '03' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON04 as DAY_FLAGS, '04' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON05 as DAY_FLAGS, '05' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON06 as DAY_FLAGS, '06' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON07 as DAY_FLAGS, '07' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON08 as DAY_FLAGS, '08' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON09 as DAY_FLAGS, '09' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON10 as DAY_FLAGS, '10' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON11 as DAY_FLAGS, '11' as "MONTH" from TFACS UNION ALL
select IDENT, JAHR, MON12 as DAY_FLAGS, '12' as "MONTH" from TFACS),
day_point as (
select tfp.*
, length(DAY_FLAGS) AS NUM_FLAGGED_DAYS
, ROW_NUMBER() OVER (PARTITION BY IDENT, "JAHR", "MONTH" ORDER BY "JAHR", "MONTH") as FLAG_POINTER -- partition by IDENT too, so the count restarts per factory calendar
from tfacs_pivot tfp
cross join table ( generator (rowcount => 31)) d -- at most 31 days per month
)
select dp.*
, date_from_parts (JAHR, "MONTH", FLAG_POINTER) as FACTORY_DATE
, substr(dp.DAY_FLAGS, FLAG_POINTER , 1) as day_marker
, to_boolean (DAY_MARKER) as IS_WORKING_DAY
from day_point dp
where FLAG_POINTER <= NUM_FLAGGED_DAYS);
Now I can simply run
select FACTORY_DATE, IS_WORKING_DAY from SF_SAP_FACTORY_CALENDAR;
to get the information I am after:
FACTORY_DATE IS_WORKING_DAY
2018-01-01 FALSE
2018-01-02 TRUE
2018-01-03 TRUE
2018-01-04 TRUE
2018-01-05 TRUE
2018-01-06 FALSE
2018-01-07 FALSE
2018-01-08 TRUE
... ...
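The view also lends itself to aggregations. As a quick sketch, counting the working days per month for one calendar is now a plain GROUP BY (COUNT_IF is Snowflake’s conditional count):

-- sketch: working days per month in 2018 for factory calendar 'AU'
select "MONTH", count_if(IS_WORKING_DAY) as WORKING_DAYS
from SF_SAP_FACTORY_CALENDAR
where IDENT = 'AU'
  and JAHR  = '2018'
group by "MONTH"
order by "MONTH";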
With this view set up, and likely materialised in a production environment, it is only a small additional step to create a WORKDAYS_BETWEEN
function that mimics the SAP HANA equivalent:
create function workdays_between (factory_calendar_id varchar(2), start_date date, end_date date)
RETURNS INTEGER
AS $$
    -- count the working days in the inclusive range [start_date, end_date]
    SELECT count(*)
    FROM SF_SAP_FACTORY_CALENDAR
    WHERE
        IDENT = factory_calendar_id
        AND FACTORY_DATE between start_date and end_date
        AND IS_WORKING_DAY = true
$$;
SELECT workdays_between ('AU', '2018-02-01', '2018-05-01');
WORKDAYS_BETWEEN ('AU', '2018-02-01', '2018-05-01')
61
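On the materialisation point from above: as far as I know, Snowflake materialized views do not allow window functions, so a scheduled CREATE TABLE ... AS SELECT is probably the simpler route. A hypothetical sketch (the table name is made up):

-- hypothetical materialisation of the view as a table; refresh on a schedule
create or replace table SF_SAP_FACTORY_CALENDAR_T as
select * from SF_SAP_FACTORY_CALENDAR;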
And this is it!
There you go, now you know.
p.s.
Some other useful posts around SAP data in Snowflake I stumbled over:
Great blog, Lars,
like many others you posted, such as https://www.lbreddemann.org/separate-business-logic-from-tables-and-avoid-dynamic-sql/
Could you give some hints on how to connect Snowflake with SDI or ADBC?
Something like
https://docs.snowflake.com/en/developer-guide/odbc/odbc-linux
Take care,
Luc
Hi Luc,
thanks for the kind comment.
So far, I have not tried to access Snowflake from HANA.
I imagine that using the Camel JDBC connector for SDI could be a way to make that work.
Whether that will work for your use case the way you want it to, I really can’t tell.
Do you have a specific usage scenario in mind where this would be helpful? If so, can you share it?