One of the benefits of answering questions from users is that there usually is a pretty good chance that I will learn something along the way.
Today’s case was no different. In fact, I learned quite a bit.
The question was this…
“Your article https://blogs.sap.com/2013/01/23/hana-quick-note-checking-my-connections-and-using-them-securely/ … is amazing.unnamed LinkedIn user
Can you share the syntax to use the hdbuserstore key from RStudio? At this time I am looking to use the key to open a ODBC connection using ConnectionContext from hana.ml.r.”
If you don’t want to read through the whole sleuth story but jump right to the solution, see the Quick solution recipe-box below
Let me unpack this:
- the referenced blog post is about an SAP HANA client feature called User Store (hdbuserstore). This feature allows us to store login credentials for HANA databases in the local users’ profile (it’s a set of files in the file system of the user
With the stored credentials users can logon to HANA databases without ever having to type in the host, port, username or password.
It’s a bit like 1Password for HANA.
- this HANA specific credential store can also be used with ODBC and JDBC connections.
To use it with an ODBC connection, I showed in the old blog post, that all one has to do is to create the hdbuserstore entry and then put it into the ODBC DSN (Data Source Name) definition where the
To indicate that the HANA ODBC driver should use the hdbuserstore key, a
@has to be put in front of the KEYname.
- RStudio is a graphical UI and development environment for the R programming language and probably one of the most important tools for data scientists today.
It allows accessing many different databases and supports both ODBC and JDBC.
That means the ODBC topic from above applies here.
- the mentioned R library hana.ml.r is a library that is provided with the SAP client software and works as an interface to the machine learning features of HANA for R.
There is a blog post that explains the use of the library. Unfortunately, this blog post only shows how to generally set up an ODBC connection in RStudio to use with HANA, but explicitly not how the hdbuserstore gets used.
This last bit is what the question was about.
Now, it has been a while for me that I last played with R/RStudio and HANA and that was based on my old SAP laptop running MS Windows.
Currently, I use my trusty old 2015 MacBook Pro 13″, which means that some things look and feel different from what I showed in the old blog post.
I don’t know what set up the person who send me the question uses, but before I write “it works just the same as in the blog post” I rather try it out myself.
unixODBC on macOS
> brew install unixodbc
installs the current release (2.3.7).
To make things work, there are two main configuration files that need to be edited.
The brew-formula installs unixODBC without any GUI, so we’re left with text-config files:
/usr/local/etc/.odbcinst.ini-> this file contains information about installed ODBC driver software.
~/.odbc.ini-> this file contains the DSNs configured for the user
As the file locations indicate,
.odbcinst.ini is used by all users of the computer, while
.odbc.ini is user-specific.
Detail documentation about these files is not that easy to find.
man odbc.ini only reports what these files are but not how entries have to look like.
The page “unixODBC without the GUI” contains a lot more information.
For a general overview, “Linux/UNIX ODBC” is pretty comprehensive.
/usr/local/etc/odbcinst.ini looks like this:
[ODBC] TraceFile = /tmp/sql.log Trace = No [HDBODBC] Driver = /Applications/sap/hdbclient/libodbcHDB.dylib Description = SAP HANA ODBC Driver FileUsage = 1
[HDBODBC] section contains all information about the SAP HANA ODBC driver, namely the dynamic library file name on my computer.
~/.odbc.ini contains this:
[DSN_HXE] ServerNode = hxehost:39015 Driver = HDBODBC Description = HXE Tenant DB [DSN_SYSTEMDB] ServerNode = hxehost:39013 Driver = HDBODBC Description = SYSTEM Database
Hidden files in macOS file dialogs
Both of these configuration files are “hidden” files as their names start with an ‘.’ (UNIX convention).
While it’s easy with the command line to show those files in a directory (
ls -la) and to open them (e.g.
vim .odbcinst.ini) as a Mac user I like working with a GUI-editor.
Unfortunately, when using the standard macOS file open dialog, hidden files stay hidden.
To temporarily show hidden files in this dialog one can press
This switches between showing and hiding “hidden” files and folder.
Setting up the hdbuserstore entry
As we want to use hdbuserstore entry for the ODBC connection, we first have to create the entry:
> cd /Applications/sap/hdbclient > hdbuserstore SET HXE_ML hxehost:[email protected] ML Welcome123! > hdbuserstore list DATA FILE : /Users/larsbreddemann/.hdb/Silverfold.local/SSFS_HDB.DAT KEY FILE : /Users/larsbreddemann/.hdb/Silverfold.local/SSFS_HDB.KEYKEY HXE_ML ENV : hxehost:39015 USER: ML DATABASE: HXE
Test the connection “on HANA level”
> hdbsql -U HXE_ML -C "select current_user, current_time from dummy" CURRENT_USER,CURRENT_TIME ML,14:40:07 1 row selected (overall time 8467.022 msec; server time 569 usec)
This shows, that the connection to the HANA DB works when using the HANA client software.
The next step is to configure the ODBC connection for this hdbuserstore entry.
Adding a new DSN
Since the ODBC driver settings are already in my
.odbcinst.ini all that’s missing is the matching entry in
[HXE_SECURE] ServerNode = @HANA_HXE Driver = HDBODBC Description = HXE Tenant DB via hdbuserstore CHAR_AS_UTF8 = TRUE
as the documentation tells us, the old
@+ trick works here, too.
The mentioned blog post about the ML package also referenced an old StackOverflow answer I provided, to make sure that the ODBC-type conversion does not truncate or garble UTF data that is stored in HANA
That’s what the
CHAR_AS_UTF8 is for.
Property or parameter?
Interestingly enough, the documentation states that HANA specific connection parameter names are case sensitive while connection property names are case insensitive.
I tried out different spellings and could not find a difference, just as I am not quite sure what makes an entry a “connection parameter” and what a “connection property“.
Also, note how the
Driver entry ties back to the
.odbcinst.ini entry. This is how both files are connected.
Test the connection “on ODBC level”
Testing the new DSN entry is easy but different from the MS Windows version.
On macOS (and Linux) the command is
> odbcreg HXE_SECURE ODBC Driver test. SQLConnect with DSN: 'HXE_SECURE'. Driver version 02.04.0162 (2019-09-17). Select now(): 2020-01-15 14:53:11.068000000 (29)
Ok, so both the connection via HANA client software as well as via ODBC connect work.
Using the connection in RStudio
Following the ML blog post, I installed the
hana.ml.r-package in RStudio:
> install.packages("/Applications/sap/hdbclient/hana.ml.r_1.0.4.tar.gz", repos = NULL, type = "source") ERROR: dependencies ‘futile.logger’, ‘sets’, ‘uuid’ are not available for package ‘hana.ml.r’ removing ‘/Library/Frameworks/R.framework/Versions/3.6/Resources/library/hana.ml.r’ Warning in install.packages : installation of package ‘/Applications/sap/hdbclient/hana.ml.r_1.0.4.tar.gz’ had non-zero exit status
The first try failed because some dependent libraries were not yet installed. So I quickly added those:
> install.packages(c("futile.logger", "sets", "uuid"))
and after that, the installation worked fine:
> install.packages ("/Applications/sap/hdbclient/hana.ml.r_1.0.4.tar.gz", repos = NULL, type = "source") installing source package ‘hana.ml.r’ … ** using staged installation ** R ** byte-compile and prepare package for lazy loading ** help *** installing help indices ** building package indices ** testing if installed package can be loaded from temporary location ** testing if installed package can be loaded from final location ** testing if installed package keeps a record of temporary installation path DONE (hana.ml.r)
Another dependency is the
RODBC-package, which was already installed.
To set up a new DB connection the current RStudio has a handy UI and with a few clicks, the connection works:
RStudio is even that nice to show the code for creating the connection:
> library(DBI) > con <- dbConnect(odbc::odbc(), "HXE_SECURE", timeout = 10)
Now, let’s quickly try out the connection via the
hana.ml.r-package as shown in the tutorial post and we’re done here:
> library (hana.ml.r) Loading required package: R6 Loading required package: futile.logger Loading required package: sets Loading required package: RODBC Loading required package: uuid > conn <- hanaml.ConnectionContext(dsn="HXE_SECURE") Error in ConnectionContext$new(dsn, username, password) : argument "username" is missing, with no default > conn <- hanaml.ConnectionContext(dsn="HXE_SECURE",username=, password = ) Error in ConnectionContext$new(dsn, username, password) : argument "username" is missing, with no default > conn <- hanaml.ConnectionContext(dsn="HXE_SECURE",username='', password='' ) WARN [2020-01-15 15:04:29] Failed: simpleWarning in RODBC::odbcDriverConnect("DSN=HXE_SECURE"): [RODBC] ERROR: state IM002, code 55767936, message [iODBC][Driver Manager]Data source name not found and no default driver specified. Driver could not be loaded ERROR [2020-01-15 15:04:29] Error: Error in value(cond):
This is rather surprising, given that we were able to connect with this setting just seconds before.
Remember when I mentioned that I use unixODBC at the beginning?
That’s another, different ODBC driver manager on macOS.
As it is, unixODBC is what SAP HANA uses on non-Windows platforms. That’s what is in the documentation, that’s what most examples show.
iODBC has a hard requirement that the ODBC driver comes as a shared library (
.so-files), while the SAP HANA client for macOS only provides dynamically loaded modules (
Put simply, iODBC cannot load the SAP HANA driver library.
There are also other aspects like that iODBC uses slightly different named configuration files (
~/odbc.ini instead of
~/.odbc.ini) that also follow slightly different rules about the content of the files.
These configuration files are similar enough to be easily confused and different enough to be incompatible.
So, it does look like the
hana.ml.r-package really does not work on macOS.
From what I can tell, the dependency on
RODBC is only present in the
dataframe.R file of the package. Maybe the problem is easy to fix.
I don’t know why SAP chose to use
RODBC here, but there is another ODBC package for R available: odbc.
And that package makes use of the
DBI-package and, as shown above, connection via
DBI work like a charm:
> con <- odbc::dbConnect(odbc::odbc(), "HXE_SECURE") > str(con) Formal class 'HDB' [package ".GlobalEnv"] with 4 slots [email protected] ptr : [email protected] quote : chr "\"" [email protected] info :List of 13 .. ..$ dbname : chr "" .. ..$ dbms.name : chr "HDB" .. ..$ db.version : chr "" .. ..$ username : chr "" .. ..$ host : chr "" .. ..$ port : chr "" .. ..$ sourcename : chr "HXE_SECURE" .. ..$ servername : chr "@HANA_HXE" .. ..$ drivername : chr "libodbcHDB.so" .. ..$ odbc.version : chr "03.52" .. ..$ driver.version : chr "02.04.0162 (2019-09-17)" .. ..$ odbcdriver.version : chr "03.51" .. ..$ supports.transactions: logi TRUE .. ..- attr(*, "class")= chr [1:3] "HDB" "driver_info" "list" [email protected] encoding: chr ""
Checking the rsession process
Don’t let yourself be fooled by
drivername: libodbcHDB.so – there is no such file.
Checking with the macOS Activity monitor we can check what libraries are actually loaded by the
Some learnings and yet no good solution
So, I learned a lot about macOS, R, the different ODBC managers and where to look and check.
But if SAP wants to make the
hana.ml.r-package work on macOS then there needs to be some work done on their end.
For MS Windows users, this should not be a problem as there the ODBC connect works just fine as shown in the ML blog post.
The RStudio way of handling DB connections
This post was all about how R can connect to HANA and how to use the HANA specific way to securely store login credentials. Since HANA is just one of many DB platforms used by data scientists, RStudio provides its own methods to deal with login data.
See the very well written “Securing Credentials” and “Setting up ODBC Drivers” for details.
The latter document also recommends using
unixODBC on macOS…
A new hope (Update 16.01.2020)
Glenn Neuber, a former colleague of mine, saw this post and took the time to point to a potential solution.
Followed his pointer to this StackOverflow question and learned that the
RODBC package actually provides configure option flags that allow selecting the odbc manager it should use.
And, of course, all this has been properly documented, in the “vignette” for the RODBC-package. Chapter A – Installation explains that with the configure option
--with-odbc-manager=odbc one can select unixODBC as the odbc manager to be used.
And here we go again
In the search for closure, which is probably a driving force for many problem solvers, I tried this out right away and … it did work!
First, I removed the old package installation:
> remove.packages("RODBC") Removing package from ‘/Library/Frameworks/R.framework/Versions/3.6/Resources/library’ (as ‘lib’ is unspecified)
Next, I reinstalled the package, this time with the config parameter:
> install.packages("RODBC", type = "source", INSTALL_opts="--configure-args='--with-odbc-manager=odbc'") trying URL 'https://cran.rstudio.com/src/contrib/RODBC_1.3-16.tar.gz' Content type 'application/x-gzip' length 1172291 bytes (1.1 MB) downloaded 1.1 MB installing source package ‘RODBC’ … ** package ‘RODBC’ successfully unpacked and MD5 sums checked ** using staged installation checking for odbc_config… /usr/local/bin/odbc_config checking for gcc… clang checking whether the C compiler works… yes checking for C compiler default output file name… a.out checking for suffix of executables… checking whether we are cross compiling… no checking for suffix of object files… o checking whether we are using the GNU C compiler… yes checking whether clang accepts -g… yes checking for clang option to accept ISO C89… none needed checking how to run the C preprocessor… clang -E checking for grep that handles long lines and -e… /usr/bin/grep checking for egrep… /usr/bin/grep -E checking for ANSI C header files… rm: conftest.dSYM: is a directory rm: conftest.dSYM: is a directory yes checking for sys/types.h… yes checking for sys/stat.h… yes checking for stdlib.h… yes checking for string.h… yes checking for memory.h… yes checking for strings.h… yes checking for inttypes.h… yes checking for stdint.h… yes checking for unistd.h… yes checking sql.h usability… yes checking sql.h presence… yes checking for sql.h… yes checking sqlext.h usability… yes checking sqlext.h presence… yes checking for sqlext.h… yes checking for library containing SQLTables… -lodbc checking for SQLLEN… yes checking for SQLULEN… yes checking size of long… 8 configure: creating ./config.status config.status: creating src/Makevars config.status: creating src/config.h ** libs clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I. -DHAVE_UNISTD_H -DHAVE_PWD_H -DHAVE_SYS_TYPES_H -DHAVE_LONG_LONG -DSIZEOF_LONG_INT=8 -I/usr/local/Cellar/unixodbc/2.3.7/include -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -I/usr/local/include -fPIC -Wall -g -O2 -c RODBC.c -o RODBC.o clang -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/usr/local/lib -o RODBC.so RODBC.o -lodbc -L/usr/local/Cellar/unixodbc/2.3.7/lib -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation installing to /Library/Frameworks/R.framework/Versions/3.6/Resources/library/00LOCK-RODBC/00new/RODBC/libs ** R ** inst ** byte-compile and prepare package for lazy loading ** help *** installing help indices ** building package indices ** installing vignettes ** testing if installed package can be loaded from temporary location ** checking absolute paths in shared objects and dynamic libraries ** testing if installed package can be loaded from final location ** testing if installed package keeps a record of temporary installation path DONE (RODBC)
Besides the additional parameter (marked in bold) for the clang call there is nothing that indicates that this time, unixODBC will be used. But let’s try and see if it is:
> library("RODBC") > odbcDataSources() DSN_HXE HXE_SECURE DSN_SYSTEMDB "HDBODBC" "HDBODBC" "HDBODBC"
RODBC now uses unixODBC and picks up the correct DSN definitions.
Would a database connection work now? Let’s see:
> conn <- odbcConnect("HXE_SECURE") > res <- sqlQuery(conn, "SELECT current_user, current_date FROM dummy") > res CURRENT_USER CURRENT_DATE 1 ML 2020-01-16
DSNs get picked up, connections using the hdbuserstore work and SQL statements get executed.
This thing works.
The final test
To round it all off, the final step is to see, if now the hana.ml.r package works.
>library ("hana.ml.r") Loading required package: R6 Loading required package: futile.logger Loading required package: sets Loading required package: uuid > conn <- hanaml.ConnectionContext("HXE_SECURE") Error in ConnectionContext$new(dsn, username, password) : argument "username" is missing, with no default
It does not like to have parameters with no values. That’s ok, let’s provide those:
> conn <- hanaml.ConnectionContext("HXE_SECURE",username='', password='' ) > check_df <- conn$sql("SELECT current_user, current_date FROM dummy") > check_df CURRENT_USER CURRENT_DATE 1 ML 2020-01-16
Finally, it works. Case closed!
What started as a straight-forward question turned into long-winded research into the many different technologies that many of our tools are built upon.
Quick solution recipe
If you’ve ever watched an episode of “Gourmet Makes” on the Bon Appétit youtube channel, then you know that after re-engineering famous food, chef Claire Saffitz summarizes the steps to take.
This section is just that.
In order to use the hdbuserstore secure credentials with R and the hana.ml.r package you need to:
- configure the hdbuserstore entry for the computer user that should use it.
The command is
hdbuserstore set .... Check above or the documentation for details
- With the user store key set up correctly, a DSN using this key needs to be created.
On macOS, you need to install
unixODBCand the SAP HANA client software first.
Once both pieces of software are installed, two text files need to be created and correctly configured:
- At this point, an ODBC connection without providing login credentials should work.
Now, the odbc access in R needs to be setup.
For this, the
RODBC-package needs to be correctly installed and configured. See details above.
- Finally, the
hana.ml.rpackage needs to be installed and loaded.
Now, a DB connection can be created using the configured DSN by providing empty strings for the
Without writing this blog post and the helpful comment from Glenn, I would have probably stopped looking for a solution, since I don’t really need to use the hana.ml.r package.
Instead, my audience spoke back to me and provided the exact right push to find an actual solution.
To me, this is immensely satisfying.
Not only, do I now know that there are people that actually read my posts (come on, every author likes to have this confirmed every now and then). And not only did the problem come to full closure since the original question was answered.
But in the course of writing this, I learned many useful bits and pieces.
There you go, now you (we) know!