Database back and recovery is a key element in database technology because timely data availability is very important in this information age. However, the database can be subjected to data loss through human error (mismanagement), media failure, and natural disaster. Media failure involves storage disk failure, thus denying your database access to some or all files in the database. While human error may involve poor application coding or accidentally dropping a database schema or object like a table. Database backup, recovery, and restoration procedures provide a safety net that allows the DBA to ensure the availability of consistent data. Current database management systems (DBMS) like Oracle provide special tools like recovery manager (RMAN), data pump, and Oracle Secure Backup (OSB) that allow the DBA to perform routine and special backup procedures.

When data are not readily available, companies face potentially disastrous losses. Data backup and recovery procedures are critical in all database installations and the DBA must ensure that the data in the database can be fully recovered in case of either physical data loss or loss of database integrity. The backup and recovery measures/plans must include at least some of the following: Periodic data and application backup, proper backup identification, convenient and safe backup storage, physical protection of both hardware and software, personal access control to the software of a database installation, and insurance coverage for the data in the database where possible or available. Furthermore, data recovery and contingency plans must be thoroughly tested and evaluation, and practiced frequently. Since a backup and recovery program is not likely to cover all components of the database system, appropriate priorities must be established concerning the nature and extent of the data recovery process (Connolly & Begg, 2002).

Before we proceed, lets take a brief look at the evolution of Oracle’s backup and recovery management system. The history of backup and recovery can be traced from the time of the traditional operating systems’ (O/S) backup and recovery. This applies to all versions of Oracle and you can for example use the dd or tar command in Unix to create O/S backups. Then Oracle’s Enterprise Backup Utility (EDU), which was developed by Unix porting group, was introduced in Oracle7 followed by Recovery Manager (RMAN) in Oracle8. Enterprise Backup Utility (EBU) is a high performance interface for backing up Oracle7 databases. To manage database backups using EDU, a database is registered in the EBU catalog.  EDU also referred to as Oracle Enterprise Backup Utility (OEBU) defined the Oracle Secure Backup (SBT) interface used by RMAN today.

EDU’s disadvantage was that it took long to backup. Oracle also uses the Export/Import Utility for backup and recovery to export data and import it back during database restoration process. RMAN is superior to its predecessors and has its own language to interact with the database and scripts can be written for common operations. Then RMAN parses and executes the language. RMAN generates PL/SQL blocks using templates defined in recover.bsq and execute the blocks using non-blocking Oracle Call Interface (OCI) calls (Cooksey & Nanda).

Other backup and recovery approaches include physical standby and logical databases that used to replace the primary database in seconds during times of data catastrophe. In addition, Oracle has also introduced the more enhanced Oracle Secure Backup (OSB). This oracle backup feature is a highly scalable, centralized tape backup management solution that provides data protection the Oracle database management system. OSB administered through Oracle Enterprise Manager or obtool command-line, and makes it easy to use RMAN to backup directly to tape library (Oracle User Guide). This was probably introduced to offer a free alternative to the costly the media management layer/library (MML).

Other backup and recovery tools that have been integrated to backup and recovery software management systems with Oracle’s RMAN and EDU include: Veritas NetBackup, ManageIT Backup and Recovry, HP OMNIBACK II, EMC NetWorker, and so forth (Kuhn et al 2007, Oracle User Guide). The mean time between failures (MTBF), which is a basic measure of system reliability, can be applied when selecting the most durable and dependable backup and recovery technology with high availability for your company’s database management system. All these tools provide some sort of security policy to ensure that data is secure by enforcing passwords rules, encryption, and signing privileges. 

So what is database backup, restore, and recovery? Database backup is the process that involves making a copy of a database or parts of the database to insure it from data loss in case of a media/hardware failure or natural disaster. Database management systems such as Oracle offer different types of backups can be grouped as logical and physical backups; online and offline backups, whole and partial backups; consistent and inconsistent backups; and so forth. To briefly look at logical, physical, online, and offline, backups: Logical and physical backups are performed cold or hot. Logical backups involve the data pump export utility using the SQL to read database data and then export it into a binary file at the operating system level. You can then import the data back into a database using the data pump import utility. At the logical level, use Oracle Flashback Table work to recover and restore tables and their associated attributes such as current indexes, triggers and constraints to their state at some past SCN or time.

Whereas a physical backup is a backup of an Oracle database file or archived redo log located on the operating system. A cold backup is carried out when all users are disconnected and the database is shut down completely with the commands shutdown immediate, shutdown normal or shutdown transactional. After the shutdown, all files being actively used by the database should be backed up because these files provide a complete image of the database, as it existed before shutdown. All datafiles, control files, and redo log files must be backed up during a cold backup. The database does not have to be in archive log mode, but without archive logging, the database can only be recovered to a point in time when the cold backup was done. Cold backups are reliable and hence, chances of data corruption occurring during backup are highly minimized. To recover the database at the physical-level, Oracle Flashback Database, which is similar to point in time recovery, enables you to take the entire database to a past point in time (using flashback logs).

Meanwhile a hot backup is performed when the database is open and users are still connected to the database. The database is running in archive log mode, and this backup involves setting each tablespace into a backup state to backup the related files and then restoring the tablespace to their normal state. All datafiles, archived redo log files, and control files are backed up. The database can be fully recovered from hot backup when the archived redo logs are rolled forward or backward to any point in time. Another advantage to it is that users don’t have to worry about system shutdown during certain backup periods.

Similar to hot backups, online backups of the database must be in archivelog mode and open for use but restarted in mount mode. If your database is not in archivelog mode, error will be generated during online backup respecially when performing the backup using RMAN. Online redo log files are never backed up by RMAN. RMAN online backup mechanism is automated and has the flexibility to backup a target portion of a tablespace incrementally. RMAN offline backup is similar to cold backups; hence, the database does not have to be in archive log mode but has to be shutdown and restarted in mount mode. All datafiles, control files, and redo log files must be backed up during offline backups. The concept of User-managed backups and RMAN backups are similar. The only difference I realized is that RMAN is more automated and equipped with a variety of task simplifying tools compared to User-managed backup utility.

Database restoration is the process of copying files from a backup copy into an Oracle database. While database recovery is the processes of applying redo log entries from both redo log files and archive log files, back into the datafiles.  That is, recovery management deals with the recovery of the database after a failure, such as loss of datafiles, lost control files, or data block corruption, and so forth; which is critical for preserving the database’s integrity. Complete recovery is a when a database is totally recovered to the most recent status before the database failure occurred. That is, all the committed transactions before the failure occurred can be recovered. For instance, recovering the database to a state before its contents such as tables, views were deleted or dropped. RMAN Complete recovery enables you to automate restore and recovery operations, detects block corruption during, restoration and recovery operations, and can be used to perform tablespace point-in-time recovery etc. Complete recovery can be taken when performing hardware striping reconfiguration, after opening a database with the RESETLOGS option, and when media failure has damaged datafiles or the control file. (Velpuri, 2002).

Incomplete recovery restores a database to its state at some previous target SCN or time. You perform incomplete recovery when it’s the only option left to recovery lost data, a user (DBA) executes the wrong operation (batch job) logically corrupting the data in the application, the archived redo log file needed to perform complete recovery is not available or corrupted, and when online redo log file is lost due to hardware failure, and so forth. Unlike complete recovery, when performing incomplete recovery of a database, only a portion of the redo is used. As a result, incomplete recovery always incurs data loss ( Kuhn et al 2007, Velpuri, 2002).

RMAN’s block level recovery utility is used to catalog corrupt database blocks during an RMAN backup and recovery process. This capability allows you to selectively recover just the affected blocks within the damaged data file rather than whole datafiles. Thus making sure the rest of the data file is available to users. “This advantage to only recover blocks marked corrupt lowers the mean time recovery to recovery (MTTR). Hence, block media recovery minimizes redo application time and avoids I/O overhead during recovery.”

    Database backup and recovery is not only an important part of database system to ensure data is highly available to its users but also to ensure that the data is secure and up-to-date.


References:

Connolly, M. T, & Begg, E. C, (2002). Database Systems: A Practical Approach to Design, Implementation, and Management. Addison-Wesley

Kuhn, D., Alapati, S., & Nanda, A. (2007). RMAN Recipes for Oracle Database 11g A Problem Solution Approach. Apress.

Rajan, C. (2007). Database Administrator II: Backup/Recovery & Network Administration. Thompson.

Freeman, Robert G. (200). Oracle Database 11g, New Features. McGraw Hill-Orsbone

Kuhn, Darl, & Schulze, Scott: Oracle RMAN Pocket Reference
http://www.oreilly.com/catalog/rman/cover.html

Gorman, Tim. IS RMAN REALLY WORTH THE TROUBLE?
http://www.evdbt.com/TD_Rman.pdf

Oracle Database Backup and Recovery Advanced User's Guide 10g
http://www.stanford.edu/dept/itss/docs/oracle/10g/server.101/b10734/rcmconc1.htm

Cooksey, Donna and Nanda, Arup ORACLE RECOVERY MANAGER 10g: Exploring RMAN’s Unique Data Recovery Capabilities
http://www.oracle.com/technology/deploy/availability/pdf/1181_Cooksey_WP.pdf