A cautionary tale about data recovery
Your giant database application is up and running. Hundreds of users from across the globe are managing the data within. Months go by. Productivity is up. Users are happy. They love you. Mission accomplished. Then you get the call…
“We’re in the database and it looks like a couple weeks ago so-and-so accidentally deleted one of the key records in the database. We didn’t notice it until today but now we need the information for a presentation in a few hours. How do we get it back?”
They no longer love you. You didn’t cause this problem but you’d damned well better fix it… and fast.
It’s for this reason that I don’t much care for the SQL DELETE function. Running it will wipe the record in question making it difficult to recover. Sure, you can retrieve it from your backups (you do have a backup plan right?) but that’s often tedious. You’ll have to locate the backup file (possibly offline on a tape), restore it to a separate location on the server so you don’t affect any of the data in the live database, retrieve the record and restore it.
As an alternative to flat out deleting data, I like to use a stored procedure that will transfer the record to a separate DELETED table within the database before removing it from the actual live table. This way the data is removed from the live application but easily retrieved when the client calls in a panic. This also has the added benefit of being able to mark the deleted record with a timestamp and the name of the user who deleted it.
Here’s the stored procedure I use to do this:
CREATE PROCEDURE dbo.DeleteMyRecord
(
@ID int,
@UserSession varchar(20)
)
AS
SET NOCOUNT ON
DECLARE
@new_id int
-- Copy the record to the deleted table
INSERT INTO DELrecords (id, myfield1, myfield2, myfield3)
SELECT id, myfield1, myfield2, myfield3 FROM records WHERE id=@ID
-- Get the new record's ID (the deleted table has a separate primary key called deleted_id)
SELECT @new_id = @@identity
-- Update the record with the date and username
UPDATE DELrecords SET deleted_by=@UserSession, deleted_date=GETDATE() WHERE deleted_id=@new_id
-- Delete from master table
DELETE FROM records WHERE id=@ID
GO
Note that using this method has obvious performance and disk space implications. I don’t advocate it’s use for systems where lightning-fast response time is mission critical or for extremely large systems where this method could cripple the disk space. For most smaller intranet-style applications though, this procedure has worked very well for me. You can experiment with different procedures but the principle is the same: Users will delete things they shouldn’t and you’ll have to fix it. Do yourself a favor and put a plan into place before you get that phone call.


3 Comments
Or how about simply including a “Deleted” bit field in each table? Set a default value of 0 for normal (undeleted) records, but set it to “1″ instead of physically deleting the record. This saves resources from constantly moving records and keeps everything in the same place even though they have been logically deleted.
SPROCs which pull data simply need to filter out records by “Deleted” flag. Seems faster / easier.
CS - January 9, 2006 5:14 PM
Also a great idea CS. I chose to move them to a different table simply because:
From a performace standpoint, it’s a toss-up. If you have a large table and the query optimizer uses a table scan for a search, then you could end up scanning lots of deleted rows needlessly. Smaller tables it may indeed make more sense to use a bit field to flag them.
I’ll leave it up to the user to experiment and find the most optimized way but my point remains the same. Be prepared for when the users call you in a panic, and have an easy mechanism for retrieving their data.
Jay Buys - January 9, 2006 8:42 PM
2 different methods to get to that “log” approach.
Nice blog-
CS - January 10, 2006 4:10 PM