January NPFM Meeting: How to sleep at night knowing your backup and disaster recovery strategies are at work

Russell Greenwald, Vice President, co-leads the delivery and management of professional consulting services at Insource Services. Russell has over 15 years’ experience advising organizations on strategic goals, developing multi-year plans, and seeing implementations through to completion, particularly in the areas of process and technology. Russell has worked across industries, ranging from nonprofits to venture capital, biotechnology, financial services, manufacturing and healthcare.

Russell gave a presentation about how to think about IT backup and data recovery. Backups used to be a magnetic tape in your bag on a Friday night, then USB drives; then came the cloud, and we all assumed our data was being backed up. These days your data lives in various locations, new threats appear every day, and traditional IT may not even know what systems departments are using. Russell spoke about how to put systems in place to get your backup and disaster recovery (DR) strategy under control and implemented. He provided a framework for identifying and prioritizing the systems and dimensions needed in good backup and DR, covering: What is backup and DR?; What are the dimensions of backup and DR?; Map your existing processes; Review processes for potential improvements; Identify all the critical and non-critical systems; Prioritize day 1 applications and data; Look at backup systems that cover cloud and on-premises software and data; Discuss RPO, RTO, and retention; Establish procedures for ongoing use and maintenance of the system; and Look at future technology with an eye towards backup and DR.

IT disasters can have many different causes, such as an employee's action (whether accidental or deliberate), a hardware failure, or damage to or loss of the facility. An IT backup is a copy of the data. Disaster recovery is the ability to get your IT environment up and running in a short period of time so that the agency can resume business. First, you need a retention policy for how long you keep data backups. This policy is different from a record retention policy, which usually runs for 7 years; the backup retention period can be one year. Having a good backup policy does not by itself mean that you can be up and running quickly after an IT disaster. Backup procedures and disaster recovery go hand in hand.
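A backup retention policy can be expressed as a simple cutoff date. The following is a minimal sketch, assuming a one-year retention period as in the example above; the function name `split_by_retention` and the sample dates are hypothetical and not taken from the presentation.

```python
from datetime import date, timedelta

def split_by_retention(backup_dates, today, retention_days=365):
    """Separate backups still inside the retention window from expired ones."""
    cutoff = today - timedelta(days=retention_days)
    keep = sorted(d for d in backup_dates if d >= cutoff)
    expired = sorted(d for d in backup_dates if d < cutoff)
    return keep, expired

# Hypothetical backup dates, for illustration only.
backup_dates = [date(2023, 1, 1), date(2023, 6, 1), date(2024, 1, 1)]
keep, expired = split_by_retention(backup_dates, today=date(2024, 1, 15))
print("keep:", keep)
print("expired:", expired)
```

Only the backup from January 1, 2023 falls outside the one-year window here. Record retention would be tracked separately under its own, longer period.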

There are two types of backup: onsite and offsite (in the cloud). If your server goes down, you can use your onsite backup. If your building burns down, you will need your offsite backup. There are three types of disaster recovery site. A cold site is when it takes several days to get back up and running. A warm site is when you are almost ready to get back to normal. A hot site is when you can just move to another building where you are already set up. Two terms are associated with disaster recovery. RPO (recovery point objective) refers to the point in time you can restore to, that is, when the last usable backup was taken. You are probably backing up every night. Can you restore a backup from one month ago, two months ago, etc.? How long do you keep backups? RTO (recovery time objective) is how long it will take you to get back to normal operations.
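To make RPO and RTO concrete, here is a minimal sketch that checks one incident against both objectives. It assumes nightly backups at 11 p.m., a 24-hour RPO, and a 4-hour RTO; all of these figures, and the function names, are illustrative assumptions rather than numbers from the presentation.

```python
from datetime import datetime, timedelta

def latest_restore_point(backups, incident):
    """Return the most recent backup taken at or before the incident, or None."""
    usable = [b for b in backups if b <= incident]
    return max(usable) if usable else None

def check_objectives(backups, incident, restore_duration, rpo, rto):
    """Compare the data-loss window with the RPO and the restore time with the RTO."""
    point = latest_restore_point(backups, incident)
    data_loss = None if point is None else incident - point
    return {
        "data_loss": data_loss,
        "rpo_met": data_loss is not None and data_loss <= rpo,
        "rto_met": restore_duration <= rto,
    }

# Assumed schedule: one backup every night at 23:00 for a week.
backups = [datetime(2024, 1, day, 23, 0) for day in range(1, 8)]
incident = datetime(2024, 1, 8, 15, 0)  # server fails mid-afternoon

result = check_objectives(
    backups,
    incident,
    restore_duration=timedelta(hours=6),  # assumed time to restore
    rpo=timedelta(hours=24),
    rto=timedelta(hours=4),
)
print(result)
```

In this scenario 16 hours of work since the last backup would be lost, which is within the assumed RPO, but the 6-hour restore misses the assumed RTO. That is the situation described earlier: a good backup alone does not guarantee a fast recovery.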

You need to write up a formal disaster recovery plan. The considerations for backup are: what is backed up; when the system is backed up; how long the backups are retained; and whether the backup is on-site, off-site, or some combination. What data needs to be backed up: donor base, GL, files, email, etc.? If your server fails, how fast can you be back up and running? What data is most critical? Some agencies do a full backup every Sunday night and incremental backups during the week. What systems are critical and need to be restored quickly: GL, phones, email, etc.? If you have an on-premises server, you need to run a virtual server. You need to back up both your data and the server and send the backup off-site; then you can restore it relatively quickly. If you have backed up to the cloud, you need to know that the cloud server is secure, how to recover your data, and how long it will take you to restore your operations. Remember that the cloud does not automatically include a backup; you need to specify and buy a cloud backup service. You need to know how long your information is backed up for and how long it will take to restore your system. There are a number of products that do cloud backups.
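With a weekly full backup plus daily incrementals, restoring to a given day means applying the most recent full backup and then every incremental taken after it. The sketch below assumes a full backup every Sunday night and one incremental on each other night; the function name `restore_chain` is hypothetical, and real backup products may organize their restore chains differently.

```python
from datetime import date, timedelta

def restore_chain(target):
    """List the backups needed to restore to `target`, in the order to apply them."""
    # date.weekday() returns Monday=0 ... Sunday=6, so shift to count days since Sunday.
    days_since_sunday = (target.weekday() + 1) % 7
    full_day = target - timedelta(days=days_since_sunday)
    chain = [("full", full_day)]
    for offset in range(1, days_since_sunday + 1):
        chain.append(("incremental", full_day + timedelta(days=offset)))
    return chain

# Restore to Wednesday, January 10, 2024.
for kind, day in restore_chain(date(2024, 1, 10)):
    print(kind, day.isoformat())
```

The later in the week the failure occurs, the longer the chain, which is one reason restore time (RTO) should be measured in a test rather than assumed.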

Once you have a backup procedure and disaster recovery plan, you need to test it. Your policy is useless unless you test it. Your disaster recovery plan should be written down and well documented. Once you test it, you can see how well it works or doesn't work. Then you can make improvements.

The presentation can be found here:  https://docs.google.com/forms/d/e/1FAIpQLSfiSauZcqUuNlXHfPFjKQMk3t2QXNK2qAEVj0vlgD0YhB5FeA/viewform