Discussion:
analyze a hung system
Andreas Born
2007-08-02 04:23:07 UTC
Hi all,

I'm trying to debug a stalled system, but I'm fairly new to kernel-mode
debugging and WinDbg.

Problem: The system stalls at seemingly random times, sometimes
after 30 minutes, sometimes after 20 hours. Screen, mouse, keyboard etc.
still respond, but every thread that accesses the filesystem or that
causes a page fault gets blocked; all others keep working fine. The system
does not detect any error or display any error message.

The first thing I tried was to get a memory dump for crash analysis, via
Ctrl+Scroll Lock+Scroll Lock, after the first thread hangs.
The expected bluescreen appears and says the dump is being saved,
but I can't find the dump after the next reboot. I'm quite sure there is
an issue with the file system driver, but maybe there's another way to
create a dump in such a situation?

I successfully started a kernel-mode debug session via RS-232 and I'm
able to break into the target when the system hangs. "!analyze -v -hang"
shows blocked threads and some other information, like CURRENT_IRQL: 1c.
But I'm not sure how to interpret this information. (I copied some of
the output below.)

What should I do next? Is win32k.sys the victim or the culprit?
Can I debug a blocked thread? I guess this doesn't make sense, because
the main thing I'd like to know is the cause of this permanent lock,
and/or the module that is causing it.

Maybe someone could point me in the right direction, or to some
material about debugging techniques of this kind. It would be
really appreciated.

And sorry for my bad English ;)


regards, andy




#########################################################################
Scanning for threads blocked on locks ...
Loading symbols for bf800000 win32k.sys -> win32k.sys
Loading symbols for 80a54000 halmacpi.dll -> halmacpi.dll
[...]
CURRENT_IRQL: 1c
BLOCKED_THREAD: 8089d8c0
BLOCKING_THREAD: 87d0a778
LOCK_ADDRESS: 89167718 -- (!locks 89167718)
Resource @ 0x89167718 Exclusively owned
Contention Count = 319595
NumberOfSharedWaiters = 1
NumberOfExclusiveWaiters = 2
Threads: 87501ca8-01<*> 87d0a778-01
Threads Waiting On Exclusive Access:
8712d798 86f7c320
1 total locks, 1 locks currently held
BUGCHECK_STR: LOCK_HELD
LAST_CONTROL_TRANSFER: from 80832f7a to 8088cf3e
FAULTING_THREAD: 87d0a778
STACK_TEXT:
b627cc6c 80832f7a 87d0a7f0 87d0a778 87d0a820 nt!KiSwapContext+0x26
b627cc98 8082927a 87d0a778 89167718 00000000 nt!KiSwapThread+0x284
b627cce0 8087c195 86f82040 0000001b 00000000 nt!KeWaitForSingleObject+0x346
b627cd1c 8087c582 b627cd64 0097ff5c bf8b36f8 nt!ExpWaitForResource+0xd5
b627cd3c 8087c5a9 89167718 00000001 b627cd64 nt!ExAcquireResourceSharedLite+0xc6
b627cd4c bf87ecab 89167718 bf8b3700 0097ff5c nt!ExEnterCriticalRegionAndAcquireResourceShared+0x19
b627cd54 bf8b3700 0097ff5c 80888c6c 00000064 win32k!EnterSharedCrit+0xc
b627cd5c 80888c6c 00000064 7c94ed54 badb0d00 win32k!NtUserGetForegroundWindow+0x8
b627cd5c 7c94ed54 00000064 7c94ed54 badb0d00 nt!KiFastCallEntry+0xfc
WARNING: Frame IP not in any known module. Following frames may be wrong.
00000064 00000000 00000000 00000000 00000000 0x7c94ed54
STACK_COMMAND: .thread 0xffffffff87d0a778 ; kb
FOLLOWUP_IP:
win32k!EnterSharedCrit+c
bf87ecab ?? ???
SYMBOL_STACK_INDEX: 6
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: win32k
IMAGE_NAME: win32k.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 434471b4
SYMBOL_NAME: win32k!EnterSharedCrit+c
FAILURE_BUCKET_ID: LOCK_HELD_win32k!EnterSharedCrit+c
BUCKET_ID: LOCK_HELD_win32k!EnterSharedCrit+c
Followup: MachineOwner
---------
Ivan Brugiolo [MSFT]
2007-08-02 20:35:40 UTC
Knowing that it's the ERESOURCE at win32k!gpresUser makes
the problem somewhat easier to diagnose.

Given this output
Post by Andreas Born
Contention Count = 319595
NumberOfSharedWaiters = 1
NumberOfExclusiveWaiters = 2
Threads: 87501ca8-01<*> 87d0a778-01
The next step would be to chase the thread owning win32k!gpresUser,
which is tagged by the <*> in the output above.
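
If you end up pasting a lot of !locks output around, picking out the tagged entry can be scripted. A minimal Python sketch, assuming the "Threads:" line always has the address-count form quoted above; the names LOCKS_OUTPUT, ENTRY and starred_threads are made up for illustration and are not part of WinDbg:

```python
import re

# Text pasted from the !locks output quoted in this thread.
LOCKS_OUTPUT = """\
Resource @ 0x89167718 Exclusively owned
Contention Count = 319595
NumberOfSharedWaiters = 1
NumberOfExclusiveWaiters = 2
Threads: 87501ca8-01<*> 87d0a778-01
Threads Waiting On Exclusive Access:
8712d798 86f7c320
"""

# One entry on the "Threads:" line: address, hyphen, count, optional <*> tag.
# Assumption: addresses are plain hex digits, as in the 32-bit output above.
ENTRY = re.compile(r"([0-9a-fA-F]{8,16})-(\d+)(<\*>)?")


def starred_threads(text):
    """Return the addresses of the entries tagged <*> on the 'Threads:' line."""
    for line in text.splitlines():
        # "Threads Waiting On ..." does not start with "Threads:", so it is skipped.
        if line.startswith("Threads:"):
            return [m.group(1) for m in ENTRY.finditer(line) if m.group(3)]
    return []


owners = starred_threads(LOCKS_OUTPUT)
for addr in owners:
    # Print the debugger command to run next for each tagged thread.
    print(f"!thread {addr}")
```

The script only prepares the command text; the actual inspection still happens in the debugger.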

You can do
0: kd> !thread 87501ca8
and load the user-mode symbols for that process,
and look for the next culprit in the chain.
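
To keep track of the chain while you go from thread to thread, it can help to write down who waits on what. A rough Python sketch of that bookkeeping, using only the addresses from the output in this thread; follow_chain, WAITS_ON and OWNER_OF are illustrative names, and the tables have to be extended by hand as !thread and !locks reveal more:

```python
def follow_chain(start, waits_on, owner_of):
    """Walk from a blocked thread to the thread at the end of its wait chain.

    waits_on maps a thread address to the lock it is blocked on;
    owner_of maps a lock address to the thread that owns it.
    A thread appearing twice in the result means the chain loops (a deadlock).
    """
    chain = [start]
    seen = {start}
    current = start
    while current in waits_on:
        owner = owner_of.get(waits_on[current])
        if owner is None:
            break  # owner not recorded yet: go back to the debugger
        chain.append(owner)
        if owner in seen:
            break  # cycle found
        seen.add(owner)
        current = owner
    return chain


# Facts taken from the quoted output: three threads wait on 89167718,
# and the entry tagged <*> (87501ca8) owns it.
WAITS_ON = {
    "87d0a778": "89167718",
    "8712d798": "89167718",
    "86f7c320": "89167718",
}
OWNER_OF = {"89167718": "87501ca8"}

chain = follow_chain("87d0a778", WAITS_ON, OWNER_OF)
print(" -> ".join(chain))
```

The last thread printed is the one to look at next; what it is blocked on is not known from the output posted so far.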
--
This posting is provided "AS IS" with no warranties, and confers no rights.
Use of any included script samples are subject to the terms specified at
http://www.microsoft.com/info/cpyright.htm