How to recover database from In-Memory table corruption   3 comments


Hello All,

Corruption in your production database!! Always sounds scary isn’t it?

How about corruption in In-Memory OLTP table?? It’s even scarier…..

You may have situation where you have created an In-Memory database which contains both disk based and memory optimized tables. What will happen if you have corruption in one of the memory optimized tables? You will find the database in Restore_Pending state.

Easiest way to come out from this situation to restore from backup. Unfortunately in this situation we do not have backup so somehow we need to bring the database online. Remember, we cannot run DBCC CHECKDB on Memory optimized tables as well. Stuck, isn’t it?

Let’s explore some other options as well.

As I always do, before I start about this situation let me give you a quick background about what is memory optimized tables.

Memory optimized table’s stores data in memory. Unlike disk based tables in, memory optimized tables pages don’t need to be brought into the buffer pool. Also, the In-memory OLTP engine creates a set of checkpoint files on a filestream filegroup that keeps track of changes to the data to make sure data is persistent. These files are in append only mode and is used during recovery and the restore process.

In order to create a memory optimized table, you need to create a database with a filestream filegroup (along with the use of CONTAINS MEMORY_OPTIMIZED_DATA) or you can alter an existing database to add a filestream filegroup.

To read more about In-Memory OLTP, you can refer to BOL.

Now let’s comeback to the situation we are going to discuss in this blog.

I have one database which contains both disk and memory based tables. Now, due to some reason one of my checkpoint file got corrupt or not available, during the recovery it has put the database in to Recovery_Pending state.

Errorlog shows recovery related errors. (Note: I have re-created the scenario by removing one of the checkpoint files for demo prupose. In real word, the file may be corrupt to have the same issue )

7.1


2014-07-29 06:54:59.84 spid21s Recovery of database 'IMO_Corruption_demo' (6) is 8% complete (approximately 30 seconds remain). 
Phase 1 of 3. This is an informational message only. No user action is required.
2014-07-29 06:54:59.86 spid26s [INFO] The SQL Server service does not have the SE_MANAGE_VOLUME_NAME privilege. Memory 
optimized checkpoint file operations may be slower, resulting in significant performance degredation.
2014-07-29 06:54:59.92 spid26s [ERROR] FsStorage::OpenExistingFile(): Database: IMO_Corruption_demo [6]. Cannot re-open file 
'E:\Program Files\Microsoft SQL Server\MSSQL12.INMEMORY\MSSQL\DATA\IMO_Corruption_demo_dir\16bd0713-a382-48cf-a3f4-a2d7ff8103bd\ddfac875-5402-450a-b891-1f0027f03f8c0000020-000000a0-0041'. Error code: 0x80070002. (e:\sql12_main_t\sql\ntdbms\hekaton\sqlhost\sqlmin\fsstgmin.cpp : 1673 - 'FsStorage::OpenExistingFile')
2014-07-29 06:54:59.92 spid26s [ERROR] Database ID: [6]. Cannot find file '16bd0713-a382-48cf-a3f4-a2d7ff8103bd\ddfac875-5402-450a-b891-
1f0027f03f8c0000020-000000a0-0041'. Error code: 0x80070002. 
(e:\sql12_main_t\sql\ntdbms\hekaton\engine\hadr\ckptagent.cpp : 3948 - 'bindAndOpenFileFromExisting')
2014-07-29 06:54:59.92 spid26s [ERROR] Database ID: [6]. Failed to recover DATA file ID 
{F9903A99-1A5B-4CCD-AB89-4F69672B77F6}. Error was 0x80070002. (e:\sql12_main_t\sql\ntdbms\hekaton\engine\hadr\ckptagent.cpp : 4099 
- 'recoverExistingFileAllocationInternal')
2014-07-29 06:54:59.92 spid26s [ERROR] Database ID: [6]. Failure to process file allocation log record in RedoLogRecord. Error code: 0x80070002. (e:\sql12_main_t\sql\ntdbms\hekaton\engine\hadr\recmgr.cpp : 1136 - 'HkRedoMgr::RedoLogRecord')
2014-07-29 06:54:59.92 spid26s [ERROR] Database ID: [6]. Log redo failed with '0x80070002' for operation 9. 
(e:\sql12_main_t\sql\ntdbms\hekaton\engine\hadr\recmgr.cpp : 1319 - 'HkRedoMgr::RedoLogRecords')
2014-07-29 06:54:59.92 spid26s [ERROR] Database ID: [6] 'IMO_Corruption_demo'. Failed to redo log record at 
LSN 00000020:000000C8:0001. Error code: 0x82000018. (e:\sql12_main_t\sql\ntdbms\hekaton\sqlhost\sqlmin\hkhostdb.cpp : 1996 - 'RecoverHkDatabaseApplyTailOfTheLog')
2014-07-29 06:54:59.97 spid20s Filegroup IMO_Corruption_FG in database IMO_Corruption is unavailable because 
it is Offline. Restore or alter the filegroup to be available.
2014-07-29 06:55:00.01 spid8s Recovery completed for database IMO_Corruption (database ID 5) in 3 second(s) 
(analysis 2903 ms, redo 0 ms, undo 114 ms.) This is an informational message only. No user action is required.
2014-07-29 06:55:00.02 spid21s Error: 41316, Severity: 16, State: 134.
2014-07-29 06:55:00.02 spid21s Restore operation failed for database 'IMO_Corruption_demo'.

We do not have database backup to recover this database.

I started my normal troubleshooting where as a first step we put the database in to Emergency mode followed by single_user mode.

7

Then I tried to run DBCC CHECKDB and got following message.

“Object ID 277576027 (object ‘DemoCorrupt_table’): The operation is not supported with memory optimized tables. This object has been skipped and will not be processed.”

8

What does this mean is  we cannot run “Repair_allow_data_loss”. So, what next??

Now one thing is confirmed, we cannot save our memory optimized data. but what about data in the disk based tables? can we do something to save that data?

Yes we can!!

If you go back and check about how memory optimized tables gets created. Memory optimized table’s uses checkpoint files, these files are created into filegroup MEMORY_OPTIMIZED_DATA which is FILESTREAM filegroup act as a container to store these files. You can have one filestream filegroup for memory optimized tables with multiple directories, but you cannot have multiple filestream filegroups.

That’s where I started thinking, since this is a filegroup which contains the data what if we make this filegroup offline (anyway we know we have no way to get memory optimized data back).

I tried to make this file group offline using following command.

alter database IMO_corruption_demo modify file (name = 'IMO_corruption_demo_dir',offline)

alter database IMO_corruption_demo set online

As you can see in the diagram I was able to take the filegroup offline successfully.

 11

Also I could see the message reported in SQL Server ERRORLOG as well.

17

Now let’s try to bring the database online. It comes online successfully.

13

I also tried to access the data in the disk based tables and I was able to see the data.

15

Now just for double check I tried to check if In-Memory tables is also accessible, I get following error:

14

 

Hope this will help you in come out such difficult situations. Stay tuned for some more interesting blogs.

3 responses to “How to recover database from In-Memory table corruption

Subscribe to comments with RSS.

  1. I recently stumbled upon your blog & found it quite fascinating based on posted contents. Few days back, I already faced similar problem and I used Systools SQL Recovery Tool to recover my data without loss of any information.

  2. Thanks Sandra, Glad to know that you liked it.

  3. God bless you!!!!!

Leave a Comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: