From time to time I start to believe that there are certain problems that would not come up anymore, once the topic is documented well enough.
My colleagues from the development know that too well as I’m asking for documentation and notes all the time for everything.
However, the written documentation does help nothing if it’s not read…
BACK IN TIME
Last week I got this message and I believe it will be interesting for many MaxDB users “out there”.
The Server XYZ has been rebooted yesterday morning.
Best regards, …”
Ok, what’s that about?
The server got rebooted, and now the database is back at a previous state?
When I read the description of the customer I already suspected what turned out to be the actual reason for the loss of data.
To verify my assumption I only needed to check the DBM.PRT file.
This is what I found:
[...] command db_admin -f command auto_extend show command auto_update_statistics show command util_execute GET RESTART VERSION command auto_extend show command auto_update_statistics show command util_execute GET RESTART VERSION command util_execute SET LOG WRITER OFF <<<<<< !!!! command util_execute GET RESTART VERSION command auto_extend show command auto_update_statistics show [...]
There is was: the log writer of the database had been turned off.
So why is this something to note?
The short answer is:
As soon as you disable the log writer, no automatic savepoints are written anymore.
If you now ask yourself “Wait a minute – what have savepoints to do with the log writer and why does this cause a huge loss of data?” then you may want to read on.
THE CONVERTER – KEEPER OF THE DATABASE STATES… A LITTLE EXCURSION…
Ok, the first thing to understand here is the way MaxDB stores and retrieves data to and from the disks.
Unlike most other databases, MaxDB does NOT store data at s specific physical location.
Instead, a row is stored on a logical page and that logical page changes its physical location in the data volumes every time the page is changed and written to disk.
This concept is called “COPY-ON-WRITE”.
One advantage of this is that by writing out data to the disks MaxDB automatically can even out the filling of the data volumes and thereby guarantees an equal distribution of Ioverallall data volumes.
Of course, something needs to keep track of this so that the database can actually find the correct location of each page. This something is the so-called converter.
You put a logical page number into it and get a physical block location out (or vice versa).
Basically you could say when a page is in the converter, the database knows it – otherwise, it does not know it.
Seen from that angle, the converter holds the state of the database at all times.
“But, wait a minute, where is the information of the converter stored?”
In fact, it’s the very key point of the customer’s problem here, so pay attention! 😉
Since the converter is so utterly important to the database (without it all the pages in the data volumes are just nonsense), MaxDB needs to save it regularly, so that it’s there the next time the database is started.
Therefore, at the end of each savepoint, MaxDB writes down the converter into the data files to places that are currently not allocated by the actual user data or the old converter.
These places, in turn, are again spread dynamically overall data volumes.
Did you notice something?
We have data volumes full of pages that can only be brought into context and meaning by the converter. And this converter itself is also stored in the same cryptic way in the data volumes.
“How the heck can MaxDB figure out which pages belong to the converter when it starts up?”
Again, good question!
The trick here is the data structure used by the converter.
It’s a B-tree, so all we need to have is the root-page location to find all other pages that belong to the converter.
Unlike usual data pages, converter pages on disk do have pointers to physical locations to the other converter pages, enabling MaxDB to find and read all converter pages before having the converter available for the mapping of logical to physical pages.
Now, the last missing link, the one to the root page of the converter comes in:
The restart record.
This is a ‘magic’ data structure that is always located in the first block of the first data volume.
In that restart record, the link to the converter root page is stored as a combined volume/offset-information.
(Just for those super-techies among you readers: it’s the crConvRootBA value you see in x_diagnose.
It contains the information about the volume in the lower-byte and the offset in the upper bytes. E.g. crConvRootBA: 1164033 = 0x11C301 => Offset 0x11C3, Volume 0x01.)
Fine, to summarize:
- We have a data structure that brings meaning the randomly scattered pages in the data volumes called CONVERTER.
- MaxDB does only know the pages that are in the CONVERTER, since for user data only logical page numbers are used to reference between pages.
- We find this converter during the restart by following the link from the restart record and the B-Tree links of the CONVERTER itself.
So whenever the converter is written down to the data volumes, this link to the converter root needs to be updated. The restart records have to point to the new converter root.
In fact, this is done as the very last action during the savepoint.
If this fails, the old converter will be found the next time the database starts up.
By the way: this is precisely the way that SNAPSHOTS work in MaxDB.
We simply store multiple converters and protect the pages that belong to it from being overwritten.
When we then want to ‘restore’ one of the snapshots all we’ve to do is to change the restart record and load the converter by restarting the database.
Yes, here we have it.
During the month of work where the database was running, there seem to have been no SAVEPOINT.
How can that be?
Isn’t a SAVEPOINT expected to happen AT LEAST every 600 seconds (or in whatever interval is configured by the _RESTART_TIME parameter)?
Indeed it should, but for the SAVEPOINT to be actually triggered based on this timing, another condition has to be met:
There have to be some (5000) LOG-IO-Actions taken place since the last savepoint.
By disabling the log writer, the customer effectively turned off the automatic writing of savepoints!
Of course, he could have triggered savepoints manually or by taking backups, creating indexes or by just stopping the database via a DB_OFFLINE.
All these actions would have written a savepoint and thereby saving the precious converter information in the data area, available for the next restart.
Unfortunately, the customer decided, that the server needed a reboot and just ran “shutdown -r now“.
This command, of course, does know nothing about MaxDB or DB_OFFLINE but just KILLS all still running processes.
As a consequence of this, the restart record of course still pointed to the converter root of the converter that was written to disk a month ago.
It’s no wonder, that the database occurred as “of a month ago” when it was restarted afterwards.
The outcome of this was that the customer simply lost a month worth of work and there was no technical option to rescue his data.
WHAT A BAD MALICIOUS FUNCTION IS THAT?
One could argue that such a function should have no place in a DBMS because one of the main functions of DBMS is to keep the users’ data safe.
Well this feature is in fact not meant for normal operations.
It’s a benchmark feature that allows certain performance tests without the effect of log writing. Also, there are specific situations where you can greatly benefit from this function.
One example is the loading of data via R3Trans during the installation of a system.
Of course the R3Trans application is aware of the effects that switching off the log writer has and triggers savepoints manually.
All in all it’s just another feature that can be used correctly or the wrong way.
And unfortunately the user this time used it the extremely wrong way.
The only morale here is: understand what a command or feature does before using it!
Read the documentation and try things out beforehand!
Important concepts like the CONVERTER are of course explained and documented, so here is a list of links to the relevant resources (for those of you that still cannot get enough of tech-talking):
SAP support note #869267 – FAQ: MaxDB LOG area, 35 + 36
A LITTLE ADVERTISING
I was asked to blog about the TechEd Session I’m going to deliver, so here it is: If you’re not already booked for October 27-29 this year and you happen to stay in Vienna/Austria and you keep asking yourself what to do … then get your ticket for SAP TechED 2009 and make sure to attend my MaxDB Session