Every so often, every DBA will receive from its users the request to “recover” a few tables (or a single table) from one of the databases that it manages… and surely the largest of them all will be the one requested for recover more frequently. Depending on the volume of the database, the location of backups, in backup appliances or on tapes, the allowed recovery window or even the version of the database (with an Enterprise Edition it is possible to “parallelize” the recovery, with a Standard Edition it’s not), the procedure can take hours.
To solve this issue and recover any database in a “express” way I will use a combination of Oracle database and storage system functionalities. These functionalities are found in all modern storage appliances and in advanced file systems (such as Oracle ACFS or ZFS) or even disk volumes in the cloud (such as those found in the Oracle Cloud). The idea is to keep “images” of the database at different points in time thus being able to start them in a few minutes if necessary. We will create a copy of the production database, and from time to time we will collect the changes that have been made in production and apply them to the copy. In other words: the disk copy will “advance” in time following the production in a staggered way (in “jumps” of several minutes or hours). Once the changes are applied we will save an “image” and start again: apply changes and save image, apply changes and save image, etc. We will keep as many “images” as we think necessary. When we need to recover we will go to the “image” (point in time) that best fit our needs and activate it to access the data.
In more technical words, we will start from a “zero” level backup of the production database using a datafilecopy backup. In this type of backup the files that make up the database are fully copied as they actually are (there is no space optimization, compression or deduplication): this copy will occupy in size exactly the same as the database. Then, with the periodicity that we decide, we will launch an incremental copy of level “one”, that is, a copy of all the blocks that are different from those copied in the previous zero level backup. The third step is to apply this incremental copy on the “zero” level backup making this “advance in time”. Finally, we will launch a “snapshot” by means of functionalities of the storage system that will save us the current image of the files on the disk, allowing us to return to them when necessary. To use this technique we will need disk space, in fact as much as that occupied by the database (remember that the base copy is complete, without any optimization), plus the space equivalent to the changes we make in the database during the time that we want to keep (the one of the oldest “image” we want to be able to return to). If we are going to do a “snaphot” every day and we want to keep seven of them, it will be the size of the database plus the changes at block level of one week.
As an example we start using the RMAN incremental functionality and launch a zero level incremental backup with a command similar to this one:
BACKUP INCREMENTAL LEVEL 0 DATABASE TAG ‘BACKUP_BASE’;
Once we have this backup we will use the snapshot functions of the storage system we have. For example for a NetApp storage appliance or for ZFS they would be commands similar to:
cluster1::> volume snapshot create -vserver vs0 -volume datafiles -snapshot day_month_year_hour
zfs snapshot datapool/datafiles@day_month_year_hour
We now have the base image of the backup. Some time after, we launch an incremental backup with a similar command to the following:
BACKUP INCREMENTAL LEVEL 1 FOR RECOVER OF COPY WITH TAG ‘BACKUP_BASE’ DATABASE;
In the filesystem we find the zero level backup files (basically the binary copy of the database) and the incremental one (with the changes made since the the zero level). The next step is to merge them using the “RECOVER COPY OF DATABASE” functionality of RMAN. The command would be similar to this:
RECOVER COPY OF DATABASE WITH TAG ‘BACKUP_BASE’;
The copy of the database has “advanced” until the start of the incremental backup, it’s time to launch another “snapshot” like we have done before. We will continue to carry out this procedure at the intervals that we deem appropriate, keeping the snapshots that we believe convenient, so that at all times we’ll have several snapshots that will show us the database at different times of time.
Suppose we are asked to retrieve a table of the copy of the DB from a few days ago and that we have a “snapshot” that fits the date required (remember that we can recover the DB only at the points in time we’ve launched the copies).
NOTE: With a little more work, and if we have the relevant archives, we could even go to points in between those copies, but we will not go into that level of detail in this post.
We ask the storage system to show us the files from a certain snapshot, in some cases this is “as simple” as going to a subfolder of the main filesystem (normally hidden). In other cases it will be necessary to mount a new file system on the recovery server. In most cases we see the snapshot files in read-only mode, but to open the database we need them in read / write. Copying them to another volume of type read / write implies duplicating the space and it can take a long time for what we’ll choose to make a “clone”. A “clone” is like a “snapshot of the snapshot”, and unlike the “snapshot”, it allows us to modify its contents. For each storage system we will have to use the appropriate commands. For example for a NetApp storage appliance or for ZFS it would be:
cluster1::> volume clone create -vserver vs0 -flexclone datafiles_clone -type RW -parent-volume datafiles
zfs clone datapool/datafiles@day_month_year_hour datapool/datafiles_clone
The creation of the clone is immediate, since no data copy is produced. The “clone” will be based on the snapshot itself. This implies that, initially, the “clone” will not occupy any space, and, as said before, unlike the “snapshot” from which the clone was created, it can be modified. As we begin to modify the recovered database (the files in the “clone” filesystem) it will begin to occupy space corresponding to the modified blocks and therefore different ones between the “clone” and the “snapshot”.
The next step is to make it visible on the machine where we want to start the database used to recover (if it is not already).
NOTE: Using the same machine on which we have the source database or a machine that has access to the file system of the source database is not recommended at all. We must be extremely careful to avoid problems of file overwriting.
Depending on whether we recover the database on the same machine and a different directory tree or on a different machine with the same or different directory tree, we must perform more or less steps. In an ideal case of a different machine and the same directory tree, after copying or recreating the initialization and password files, as well as creating the audit directories, we would be ready to start the database.
We will have to start the database in mount mode and launch the command to recreate the online redo files.
SQL> ALTER DATABASE OPEN RESETLOGS;
It will take a few minutes to do all these operations, especially if we already have everything proceeded. If we use the same machine as the source database (which is not recommended), we must prepare an initialization file pointing to the cloned control files and add the parameter “DB_UNIQUE_NAME”, indicating a new SID so the two databases that are called the same do not collide . We will have to catalog the new files in the cloned controlfile and perform a “SWITCH DATABASE TO COPY;” Also we have to modify some more files, for example the online redo logs (which still point to those of the production database), renaming them with the command “ALTER DATABASE RENAME FILE”.
Once the necessary tasks have been completed, we can open the database as well as in the previous case using “ALTER DATABASE OPEN RESETLOGS”.
After that we have our operational “cloned database”, and remember that the volume occupied in the storage system has practically not changed. We can create a DBLINK from the production to the cloned or use expdp / impdp to transfer the data between them.
BONUS: This operation works with both Standard Edition and Enterprise Edition versions of the database, but if you are the lucky owners of the latter there is a functionality called “Block Change Traking” that will increase the speed of incremental backups significantly. If you activate BTC the database will keep a list of the modified blocks, so that the incremental backup process doesn’t have to go through all the blocks of all the files looking whether hey have changed: it will go “directly” to the modified ones. In very large databases this implies big time savings.
BONUS 2: Whoever says a “cloned database” to retrieve one or more tables says the entire database acting as a DR system, as long as the storage environment and the recovery machine are remote and we can afford to lose the gap of data between copy and copy.