Some time ago, some of my colleagues started taking VMware snapshots (as backups) of our Linux server, which was running an Oracle database, without mentioning it to me (I work part-time for them, supporting important Oracle instances).
Almost immediately I got a call saying Oracle was down and asking me to take a look at it.
They were right: Oracle was down. It had crashed and would not restart because of a corrupted file. I asked them if anything had changed, and they mentioned they had started taking VMware snapshots of the VM. VMware clearly states that snapshots are not backups and are not intended for production environments, but this was a production environment.
To make a long story short, the snapshot corrupted the Oracle database to the extent that recovery was necessary. Fortunately I had local RMAN backups, and after a quick restore and recover, with very minimal data loss, we were up and running.
But there is a way to avoid this, other than not taking snapshots at all.
You can download a “freeze & thaw” script that quiesces the Oracle database; it executes just before and just after the snapshot is taken.
Note: there may be other problems when removing snapshots, so don’t put your production database through this, or be prepared to do an RMAN recovery!
When taking the snapshot (refer to image): 1.) uncheck “Snapshot the virtual machine’s memory”, 2.) check “Quiesce guest file system (Needs VMware Tools installed)”, 3.) download the script to /etc/vmware-tools/backupScripts.d, as shown below.
I saved this file to /etc/vmware-tools/backupScripts.d/pre-post-freeze.bach and made the bash script executable. The directory path is fixed and required; the file name is up to you, because all executable scripts in the backupScripts.d directory will be executed.
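The install step can be sketched roughly as follows. This is only a minimal sketch: the helper name `install_backup_script` and the source path in the usage comment are hypothetical, and only the target directory comes from the text above. It assumes the script has already been saved somewhere locally and that you run it as root.

```shell
#!/bin/bash
# Hypothetical helper: copy a freeze/thaw script into the VMware Tools
# backup-scripts directory and mark it executable.
#   $1 = path to the downloaded script
#   $2 = target directory (defaults to the path VMware Tools requires)
install_backup_script() {
    local src="$1"
    local dest_dir="${2:-/etc/vmware-tools/backupScripts.d}"

    # Refuse to continue if the source script does not exist
    if [ ! -f "$src" ]; then
        echo "source script not found: $src" >&2
        return 1
    fi

    # Create the directory if needed, copy the script, make it executable
    mkdir -p "$dest_dir" || return 1
    cp "$src" "$dest_dir/" || return 1
    chmod 755 "$dest_dir/$(basename "$src")"
}

# Example (run as root; the source path is an assumption):
# install_backup_script /tmp/pre-post-freeze.bach
```

The second argument exists only so the helper can be tried against a scratch directory before touching the real one.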
#!/bin/bash
# Put this script in /etc/vmware-tools/backupScripts.d
# Make sure you check the quiesce box on snapshot, and uncheck "Snapshot the virtual machine's memory"
# Note: bash is required because the script uses [[ ]] tests

# set log file and sqlplus location
log="/home/oracle/scripts/vmware_snapshot_freeze_backup.log"
sqlplus="/u01/app/oracle/product/11.2.0/db_1/bin/sqlplus"

if [[ $1 == "freeze" ]]
then
    # set and log start date
    today=$(date '+%Y/%m/%d %H:%M:%S')
    echo "${today}: Start of creation consistent state" >> "${log}"

    # execute freeze command.
    # This command can be modified as per the database command
    echo "alter database begin backup;" | sudo -i -u oracle "${sqlplus}" / as sysdba >> "${log}" 2>&1

    # set and log end date
    today=$(date '+%Y/%m/%d %H:%M:%S')
    echo "${today}: Finished freeze script" >> "${log}"
elif [[ $1 == "thaw" ]]
then
    echo "This section is executed after the snapshot has been taken"

    # set and log start date
    today=$(date '+%Y/%m/%d %H:%M:%S')
    echo "${today}: Release of backup" >> "${log}"

    # execute release command
    echo "alter database end backup;" | sudo -i -u oracle "${sqlplus}" / as sysdba >> "${log}" 2>&1

    # set and log end date
    today=$(date '+%Y/%m/%d %H:%M:%S')
    echo "${today}: Finished thaw script" >> "${log}"
elif [[ $1 == "freezeFail" ]]
then
    echo "**** Quiescing Failed ************." >> "${log}"
else
    echo "No argument was provided"
fi
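Because the script sends all sqlplus output to its log, a quick scan of that log after a snapshot can show whether the freeze and thaw ran cleanly. The following is a minimal sketch, not part of the original script: the function name `check_snapshot_log` is hypothetical, and it assumes failures show up in the log as Oracle (`ORA-`) or SQL*Plus (`SP2-`) error codes, or as the "Quiescing Failed" line written on freezeFail.

```shell
#!/bin/bash
# Hypothetical helper: scan the freeze/thaw log for Oracle (ORA-) or
# SQL*Plus (SP2-) error codes and for the freezeFail marker.
#   $1 = log file (defaults to the path used by the freeze/thaw script)
# Returns 0 if nothing suspicious is found, 1 if problems are found,
# 2 if the log cannot be read.
check_snapshot_log() {
    local log="${1:-/home/oracle/scripts/vmware_snapshot_freeze_backup.log}"

    if [ ! -r "$log" ]; then
        echo "cannot read log: $log" >&2
        return 2
    fi

    # grep prints any matching lines so they can be inspected
    if grep -E 'ORA-[0-9]+|SP2-[0-9]+|Quiescing Failed' "$log"; then
        echo "problems found in $log" >&2
        return 1
    fi

    echo "no errors found in $log"
}

# Example:
# check_snapshot_log
```

A clean log does not prove the database is healthy; it only shows that sqlplus reported no errors while entering and leaving backup mode.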