Today I came across one more issue which was happening frequently in customer’s environment. Here is the brief background about the same.
DBA’s are complaining that Litespeed backups are going into hung state very often and they need to restart the SQL Server to get this issue fixed and problem is happening often on critical servers . Hence we decided to take a close look on what is happening and why recently this issue happening so often.
When we started troubleshooting first question that came to our mind is, how DBA’s reached to the conclusion that backup is hung. As per DBA’s customer has critical application hosted and they are taking regular log backup on that server. Suddenly they noticed that there is a blocking on the server also they got an alert for TLOG backup failure.
Below is the summary of our investigation and troubleshooting approach to resolve this issue.
We ran query select * from sysprocesses where cmd like ‘%backup%’
Output showed that this backup is getting blocked by spid 61
We ran blow command to see why this is getting blocked and what headblocker is doing?
select * from sysprocesses where spid =61
Spid kpid waittype waittime lastwaittype status program_name cmd
—— —— ——– ——————– —————————————————————————————————-
61 3192 0×0088 1610326653 BACKUPTHREAD suspended SQL Litespeed KILLED/ROLLBACK
61 7340 0×0000 0 PREEMPTIVE_OS_REPORTEVENT runnable SQL Litespeed KILLED/ROLLBACK
(2 row(s) affected)
Here is the input buffer for SPID 61.
DBCC INPUTBUFFER(61)
BACKUP database [Backuphung] TO VIRTUAL_DEVICE=’VDI_201AFBBD-4C
56-4A1A-92BF-72BC88DD4C5A_0′, VIRTUAL_DEVICE=’VDI_201AFBBD-4C56-4A1A-92B
Have a look at the wait time in the sysprocess output it was in KILLED/ROLLBACK state since long time.
So we went through the ERRORLOG from the time when this query started its execution, just to see if we find any error message or some other useful information.
In the ERRORLOG we see the dumps getting generated with same command as mentioned in the inputbuffer output.
***Stack Dump being sent to c:\MSSQL10.MYINSTANCE\MSSQL\LOG\SQLDump0001.txt
* *******************************************************************************
*
* BEGIN STACK DUMP:
* 12/23/12 02:01:16 spid
*
* Location: bckvd.cpp:
* Expression: req.m_Command.bytesTransferred == sizeOfThisChunk || req.m_Command.completionCode != 0
* SPID: 62
* Process ID: 7214
*
* Input Buffer 510 bytes -
* BACKUP database [Backuphung] TO VIRTUAL_DEVICE='VDI_201AFBBD-4C
* 56-4A1A-92BF-72BC88DD4C5A_0', VIRTUAL_DEVICE='VDI_201AFBBD-4C56-4A1A-92B
***Stack Dump being sent to c:\MSSQL10.MYINSTANCE\MSSQL\LOG\SQLDump0002.txt
* *******************************************************************************
*
* BEGIN STACK DUMP:
* 01/12/11 02:01:21 spid
*
* Location: bckfio.cpp:1351
* Expression: 0
* SPID: 62
* Process ID: 7214
SQL Server Assertion: File: <bckfio.cpp>, line=1351 Failed Assertion = ’0′. This error may be timing-related. If the error persists after rerunning the statement, use DBCC CHECKDB to check the database for structural integrity, or restart the server to ensure in-memory data structures are not corrupted.
Before looking into the dumps we decided to take a close look on ERROLOG once again.
We saw below errors as well.
2012-12-23 02:01:56.76 spid61 Error: 17066, Severity: 16, State: 1.
2012-12-23 02:01:56.76 spid61 SQL Server Assertion: File: <bckvd.cpp>, line=2046 Failed Assertion = ‘req.m_Command.bytesTransferred == sizeOfThisChunk || req.m_Command.completionCode != 0′. This error may be timing-related. If the error persists after rerunning the statement, use DBCC CHECKDB to check the database for structural integrity, or restart the server to ensure in-memory data structures are not corrupted.
2012-12-23 02:01:56.78 spid61 Error: 3624, Severity: 20, State: 1.
2012-12-23 02:01:56.78 spid61 A system assertion check has failed. Check the SQL Server error log for details. Typically, an assertion failure is caused by a software bug or data corruption. To check for database corruption, consider running DBCC CHECKDB. If you agreed to send dumps to Microsoft during setup, a mini dump will be sent to Microsoft. An update might be available from Microsoft in the latest Service Pack or in a QFE from Technical Support.
2012-12-23 02:02:56.79 Backup Error: 18210, Severity: 16, State: 1.
2012-12-23 02:02:56.79 Backup BackupIoRequest::ReportIoError: write failure on backup device ‘VDI_F73C77A5-D7C0-44F2-876C-2FD7875B87E9_13′. Operating system error 112(failed to retrieve text for this error. Reason: 1815).
2012-12-23 02:02:56.80 spid61 Error: 18210, Severity: 16, State: 1.
2012-12-23 02:02:56.80 spid61 BackupVirtualDeviceFile::RequestDurableMedia: Flush failure on backup device ‘VDI_F73C77A5-D7C0-44F2-876C-2FD7875B87E9_17′. Operating system error 995(failed to retrieve text for this error. Reason: 1815).
In the ERRORLOG we found that this backup is first hitting into one OS error 112 and then it is going into KILLED/ROLLEDBAK State.
We checked what does OS error 112 means? below is the output of net helpmsg.
As per the error message this is happening because there is not enough space on the disk but on the contrary there was enough space available on the disk.
We looked into the dump as well in case if we get some more details.
Here are the two threads for SPID 61 that were doing the backup.
Parent Thread: ============== 0:90> kc Call Site ntdll!ZwSignalAndWaitForSingleObject kernel32!SignalObjectAndWait sqlservr!SOS_Scheduler::SwitchContext sqlservr!SOS_Scheduler::SuspendNonPreemptive sqlservr!EventInternal<Spinlock<149,1,0> >::Wait sqlservr!BackupThread::Wait sqlservr!BackupStreamList::WaitUntilDone sqlservr!BackupOperation::CloseMedia sqlservr!BackupOperation::Close sqlservr!BackupOperation::~BackupOperation sqlservr!BackupOperation::`scalar deleting destructor' sqlservr!CPtr<BackupOperation>::Delete sqlservr!`BackupEntry::BackupDatabase'::`1'::catch$1 msvcr80!_CallSettingFrame msvcr80!__CxxCallCatchBlock ntdll!zzz_AsmCodeRange_End sqlservr!BackupEntry::BackupDatabase sqlservr!CStmtDumpDb::XretExecute sqlservr!CMsqlExecContext::ExecuteStmts<1,1> sqlservr!CMsqlExecContext::FExecute sqlservr!CSQLSource::Execute sqlservr!process_request sqlservr!process_commands Child Thread: ==============
0:061> kc Call Site kernel32!RaiseException msvcr80!_CxxThrowException sqlservr!TurnUnwindAndThrowImpl sqlservr!ex_raise2 sqlservr!ex_raise sqlservr!utassert_fail sqlservr!BackupFile::Quiesce sqlservr!BackupMediaIo::Discard sqlservr!BackupMedium::Discard sqlservr!BackupStream::DoClose sqlservr!BackupStream::ThreadMainRoutine sqlservr!BackupThread::ThreadBase sqlservr!SubprocEntrypoint
From the dump it does not look like there is some issue with SQL Server and we suspect VDI application (Litespeed) is throwing an error and SQL Server does not know how to handle VDI error and resulting assertion.
We suspect that the issue is triggered by the following errors
2012-12-23 02:02:56.79 Backup Error: 18210, Severity: 16, State: 1.
2012-12-23 02:02:56.79 Backup BackupIoRequest::ReportIoError: write failure on backup device ‘VDI_F73C77A5-D7C0-44F2-876C-2FD7875B87E9_13′. Operating system error 112(failed to retrieve text for this error. Reason: 1815).
At the same time we thought since SQL Server does not seems to be a culprit here, could it be a problem with litespeed?
We tried to understand how Litespeed takes the backup and we got below kb article which is mentioning about this error message.
“as per this KB “https://support.quest.com/SolutionDetail.aspx?id=SOL51287”
This may occur depending on how Litespeed extends the backup files. Litespeed initially tries to extend the file by 2146435072. If it fails, it drops it to half that amount. It will continue to drop it by half its amount and if it keeps failing, it will eventually fail completely with the 112 out of space error.
Below is the available in kb :
Script your Litespeed backup job and add the @ioflag.
EX: @ioflag=OVERLAPPED, @ioflag=SEQUENTIAL_SCAN, @ioflag=NO_BUFFERING, @ioflag=FILE_EXTENSION=-1
Also as per the below kb they have permanently resolved this issue in the future version on SQL Litespeed.
https://support.quest.com/SolutionDetail.aspx?id=SOL68482
Hope this will help you in resolving another backup hang issue.



Nice blog and useful. We have encountered this error many times and we had to restart SQL Server to clear the hung thread, so is outage on critical application. Now we came to know that Litespeed upgrade (5.0 version to 6.5) is required on all the servers for problem resolution.
Thanks Naveen for your time in reading blog..Good to know that it is helping