Incremental copies, a false sense of security!

In the DB2 for z/OS Utility manual, you can read that an incremental image copy is a copy of the pages that have changed since the last full or incremental image copy. Incremental copies are faster and produce smaller image copy files.

Image Copies play an important role in any DB2 recovery process. A normal DB2 recovery has three major parts:

1. The last full copy is restored.
2. The incremental copies are applied.
3. The log is applied from the last incremental copy up to the recovery point.

We could visualize a recovery like this:

[Figure: inc_pic1]

The recovery cannot start without a full copy available; incremental copies by themselves are useless. When there is a problem with an incremental copy, the system is forced to switch to log apply sooner. If incremental copy #4 in the above example failed, the picture would look like this:

[Figure: inc_pic2]
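The fallback described above can be sketched in a few lines of Python. This is only a simplified model for illustration, not how the RECOVER utility is implemented: the `ImageCopy` class, the `plan_recovery` function and the RBA values are all made up, and the model simply stops at the first copy it cannot use.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ImageCopy:
    kind: str            # "F" = full copy, "I" = incremental copy
    rba: int             # hypothetical log point at which the copy was taken
    usable: bool = True  # False if uncataloged, unreadable, ...


def plan_recovery(copies, recovery_point):
    """Return (copies to restore, log point where log apply has to start)."""
    candidates = sorted(
        (c for c in copies if c.rba <= recovery_point), key=lambda c: c.rba
    )
    # The base is the most recent usable full copy before the recovery point.
    base = None
    for i, c in enumerate(candidates):
        if c.kind == "F" and c.usable:
            base = i
    if base is None:
        raise ValueError("no usable full copy: recovery cannot start")

    # Apply incrementals in order until one of them cannot be used.
    chain = [candidates[base]]
    for c in candidates[base + 1:]:
        if c.kind != "I" or not c.usable:
            break
        chain.append(c)
    return chain, chain[-1].rba


# One full copy followed by five incrementals (made-up log points).
copies = [ImageCopy("F", 100)] + [
    ImageCopy("I", rba) for rba in (200, 300, 400, 500, 600)
]
print(plan_recovery(copies, 650)[1])   # 600: log apply starts at the last incremental

# Incremental #4 (taken at 500) fails, so log apply has to start at #3.
broken = [replace(c, usable=False) if c.rba == 500 else c for c in copies]
print(plan_recovery(broken, 650)[1])   # 400
```

In this model a single bad incremental in the middle of the chain also makes every later incremental worthless, which is exactly why the amount of log to apply suddenly grows.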

So, “what could possibly go wrong?”

The most common reason for failure is that the incremental copy was uncataloged by mistake or lost in an ICF catalog failure, but there are many other reasons for an incremental copy to fail. OK, suppose the system starts to apply the log sooner than expected; “big deal”, you might say. It is indeed a big deal, because your recovery will now take (much) longer or even fail! Why fail? Because the log might not be available anymore. The most common reason for an unavailable archive log is that you are outside the archive log retention period. There are two ways this can happen:

1. The archive logs do not exist because they have expired and have already been scratched or deleted. You cannot keep logs forever. At some point in the past, someone decided how long the DB2 archive logs are kept. If you are outside that period, chances are the logs are gone. Beware: this expiration mechanism might also be influenced by other software components, such as HSM or your tape management system.
2. DB2 no longer knows the old archive logs because they are no longer recorded in the BSDS. During the LOGAPPLY phase of the recovery, the system looks at the SYSIBM.SYSLGRNX table to see which log ranges need to be applied. Then it looks in the BSDS to see which archive logs contain these ranges. The BSDS has only a limited number of slots to record archive logs. Although the number of slots was increased in DB2 9, there will be a point at which DB2 drops a log from the BSDS. It is hard to say when that point is reached, because it depends on how fast your system produces archive log datasets.

Under normal circumstances, #1 kicks in sooner than #2. With some advanced recovery techniques you might be able to work around these two limitations. For #1, gone usually means really gone, although some installations have set up an archival mechanism for their DB2 archive logs and are able to restore old archive logs that are no longer available online. Unfortunately, recovery from #2 is much more difficult. You will need to edit the BSDS using a special utility. Without proper training or experience, I would not advise you to do this, because you could make matters worse!

Now back to the title of this blog: “Incremental copies, a false sense of security!”. I hope you see that if you keep creating incremental copies, you are skating on thin ice. The recovery might work, but it will fail if an incremental copy fails and the preceding copy, from which log apply must then start, is outside the log retention period. Make sure you know how much archive log is available. If in doubt, ask your system programmer about the archive log retention period, and make sure you discuss both scenarios #1 and #2.
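The two scenarios boil down to a simple comparison, sketched below in Python. The function name `log_gaps` and its three parameters are hypothetical; in practice your system programmer would have to supply the oldest log point that still exists on disk or tape and the oldest log point still recorded in the BSDS.

```python
def log_gaps(log_apply_start, oldest_retained_rba, oldest_bsds_rba):
    """List the reasons why log apply from log_apply_start would fail."""
    problems = []
    if log_apply_start < oldest_retained_rba:
        # Scenario #1: the archive log datasets themselves are gone.
        problems.append("#1: archive logs expired or scratched")
    if log_apply_start < oldest_bsds_rba:
        # Scenario #2: the datasets may exist, but DB2 no longer knows them.
        problems.append("#2: archive logs no longer recorded in the BSDS")
    return problems


# Made-up log points: logs are retained from 450, the BSDS goes back to 300.
print(log_gaps(600, 450, 300))   # []
print(log_gaps(400, 450, 300))   # ['#1: archive logs expired or scratched']
```

An empty list means log apply can start where planned. Anything else means the copy that log apply has to fall back to is already too old.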

Your full copy needs to be inside the archive log retention period too, just in case the first incremental fails and log apply has to start from the full copy. You might even consider having at least two full copies inside the log retention period, just in case a full copy fails. Like you do dual logging, just in case…

Is there a way to “resolve” this log retention problem? Yes, it is possible to merge a full copy and its incremental copies into a new full copy using the MERGECOPY utility. Here is an example:

[Figure: inc_pic3]

This process uses a similar amount of resources as a full copy, possibly more. Not only is this scenario resource intensive, it also has a major drawback: the new full image copy is registered in SYSIBM.SYSCOPY with the RBA (LRSN) of the last image copy that was merged, but with an ICDATE equal to the MERGECOPY run date. This is not quite correct and can introduce problems after you run MODIFY. Make sure you read the topic “Avoiding MERGECOPY LOG RBA inconsistencies” in the DB2 Utility manual under MERGECOPY. Altogether you may be better off doing a full copy more often.
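To make the mismatch concrete, here is a small Python sketch of the bookkeeping described above. It is a toy model, not the real catalog layout: the `CopyRow` class and `merge_to_full` function are hypothetical, the field names are only borrowed from SYSIBM.SYSCOPY columns, and all log points and dates are made up.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CopyRow:
    ictype: str     # "F" = full copy, "I" = incremental copy
    start_rba: int  # log point the copy corresponds to
    icdate: date    # date recorded for the copy


def merge_to_full(rows, run_date):
    """Model the row registered when a full copy and its incrementals are merged."""
    rows = sorted(rows, key=lambda r: r.start_rba)
    if not rows or rows[0].ictype != "F":
        raise ValueError("merge needs a full copy as its base")
    # The log point comes from the last copy merged, the date from the merge run.
    return CopyRow("F", rows[-1].start_rba, run_date)


rows = [
    CopyRow("F", 100, date(2020, 1, 1)),
    CopyRow("I", 200, date(2020, 1, 8)),
    CopyRow("I", 300, date(2020, 1, 15)),
]
merged = merge_to_full(rows, run_date=date(2020, 3, 1))
print(merged.start_rba, merged.icdate)   # 300 2020-03-01
```

The merged row claims to be a full copy dated 1 March, while its data is only current up to the log point of 15 January. Any cleanup that selects rows by date will treat it as newer than it really is.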

Is it any different in DB2 for LUW? LUW has more options when it comes to “non-full” backups. It has a delta version (which is very similar to the z/OS incremental copy, with similar problems) and an incremental version that contains all changes since the last full copy (so inc2 has both the changes of inc1 and the new changes since inc1). This “forever incrementing” is a much better solution because you only need a full copy plus the last incremental. This behavior can be mimicked on z/OS by merging all incrementals into a new incremental copy. But once you opt for a merge, I would merge the full copy and the incrementals into a new full copy (see picture above).
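The difference between the two LUW flavors is easiest to see by counting the images a restore needs. The Python sketch below is a simplified model under my own assumptions, not the actual DB2 restore logic; the function `images_needed` and the backup labels are invented for the example.

```python
def images_needed(history):
    """Return the backup images a restore to the latest backup would need.

    history is a time-ordered list of (label, kind) pairs, where kind is
    "full", "incremental" (changes since the last full) or "delta"
    (changes since the last backup of any kind).
    """
    needed = []
    skip_to_full = False
    for label, kind in reversed(history):
        if kind == "full":
            needed.append(label)
            return list(reversed(needed))
        if skip_to_full:
            continue  # already covered by a later cumulative incremental
        needed.append(label)
        if kind == "incremental":
            skip_to_full = True
    raise ValueError("no full backup in history")


deltas = [("full", "full"), ("d1", "delta"), ("d2", "delta"), ("d3", "delta")]
print(images_needed(deltas))        # ['full', 'd1', 'd2', 'd3']

cumulative = [("full", "full"), ("inc1", "incremental"),
              ("inc2", "incremental"), ("inc3", "incremental")]
print(images_needed(cumulative))    # ['full', 'inc3']
```

With deltas every image in the chain is a single point of failure, just like z/OS incrementals. With cumulative incrementals a damaged inc2 does no harm as long as inc3 is readable.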

Only use incremental copies if your staff (and management) understand all the implications of an incremental copy. If you want to keep things simple, use only full copies. Are you in trouble because of incremental copies and have no idea how to solve the problem? Seek help from IBM support or a DB2 consultant!
