Saturday, November 26, 2011

Managing the backup process

Managing the backup process

It is important to understand that backing up is a process. As long as new data is being created and changes are being made, backups will need to be updated.
 Individuals and organizations with anything from one computer to thousands (or even millions) of computer systems all have requirements for protecting data. While the scale is different, the objectives and limitations are essentially the same. Likewise, those who perform backups need to know to what extent they were successful, regardless of scale.

[edit] Objectives

Recovery point objective (RPO) 
The point in time that the restarted infrastructure will reflect. Essentially, this is the roll-back that will be experienced as a result of the recovery. The most desirable RPO would be the point just prior to the data loss event. Making a more recent recovery point achievable requires increasing the frequency of synchronization between the source data and the backup repository.[14]
Recovery time objective (RTO) 
The amount of time elapsed between disaster and restoration of business functions.[15]
Data security 
In addition to preserving access to data for its owners, data must be restricted from unauthorized access. Backups must be performed in a manner that does not compromise the original owner's undertaking. This can be achieved with data encryption and proper media handling policies.

[edit] Limitations

An effective backup scheme will take into consideration the limitations of the situation.
Backup window 
The period of time when backups are permitted to run on a system is called the backup window. This is typically the time when the system sees the least usage and the backup process will have the least amount of interference with normal operations. The backup window is usually planned with users' convenience in mind. If a backup extends past the defined backup window, a decision is made whether it is more beneficial to abort the backup or to lengthen the backup window.
Performance impact 
All backup schemes have some performance impact on the system being backed up. For example, for the period of time that a computer system is being backed up, the hard drive is busy reading files for the purpose of backing up, and its full bandwidth is no longer available for other tasks. Such impacts should be analyzed.
Costs of hardware, software, labor 
All types of storage media have a finite capacity with a real cost. Matching the correct amount of storage capacity (over time) with the backup needs is an important part of the design of a backup scheme. Any backup scheme has some labor requirement, but complicated schemes have considerably higher labor requirements. The cost of commercial backup software can also be considerable.
Network bandwidth 
Distributed backup systems can be affected by limited network bandwidth.

[edit] Implementation

Meeting the defined objectives in the face of the above limitations can be a difficult task. The tools and concepts below can make that task more achievable.
Scheduling 
Using a job scheduler can greatly improve the reliability and consistency of backups by removing part of the human element. Many backup software packages include this functionality.
Authentication 
Over the course of regular operations, the user accounts and/or system agents that perform the backups need to be authenticated at some level. The power to copy all data off of or onto a system requires unrestricted access. Using an authentication mechanism is a good way to prevent the backup scheme from being used for unauthorized activity.
Chain of trust 
Removable storage media are physical items and must only be handled by trusted individuals. Establishing a chain of trusted individuals (and vendors) is critical to defining the security of the data.

[edit] Measuring the process

To ensure that the backup scheme is working as expected, the process needs to include monitoring key factors and maintaining historical data.
Backup validation 
(also known as "backup success validation") The process by which owners of data can get information about how their data was backed up. This same process is also used to prove compliance to regulatory bodies outside of the organization, for example, an insurance company might be required under HIPAA to show "proof" that their patient data are meeting records retention requirements.[16] Disaster, data complexity, data value and increasing dependence upon ever-growing volumes of data all contribute to the anxiety around and dependence upon successful backups to ensure business continuity. For that reason, many organizations rely on third-party or "independent" solutions to test, validate, and optimize their backup operations (backup reporting).
Reporting 
In larger configurations, reports are useful for monitoring media usage, device status, errors, vault coordination and other information about the backup process.
Logging 
In addition to the history of computer generated reports, activity and change logs are useful for monitoring backup system events.
Validation 
Many backup programs make use of checksums or hashes to validate that the data was accurately copied. These offer several advantages. First, they allow data integrity to be verified without reference to the original file: if the file as stored on the backup medium has the same checksum as the saved value, then it is very probably correct. Second, some backup programs can use checksums to avoid making redundant copies of files, to improve backup speed. This is particularly useful for the de-duplication process.
Monitored backup 
Backup processes are monitored by a third party monitoring center. This center alerts users to any errors that occur during automated backups. Monitored backup requires software capable of pinging the monitoring center's servers in the case of errors. Some monitoring services also allow collection of historical meta-data, that can be used for Storage Resource Management purposes like projection of data growth, locating redundant primary storage capacity and reclaimable backup capacity. The Wizards Storage Portal is an example of a solution that monitors IBM's well kno

No comments:

Post a Comment