Discussion:
Process just disappears on Access Violation; no crash dump produced
r***@yahoo.com
2009-11-24 18:03:38 UTC
I have a problem with a production system: some of its processes have
recently started crashing because of a software error. On the
production machines I have set up Dr. Watson to capture a crash dump
file for post-mortem analysis. The problem is that, although Dr. Watson
is correctly registered as the JIT debugger (AeDebug key), no crash
dump is actually produced; the faulty process just disappears without
leaving any trace of what happened.

The platform is Windows XP Professional x64 Edition (64-bit). The
software is native C code, built as a 64-bit release with Microsoft
Visual C++ 8 (Visual Studio 2005).

I have managed to reproduce this behavior with a small 30-line C
program. The crucial thing is that there is a buffer overrun in one of
the functions, and this overrun also overwrites the function's return
address. The function thus returns to a bogus location, and the bad
RIP value immediately produces an Access Violation exception when the
next instruction is fetched.

This Access Violation is always caught when the faulty process is run
under a debugger. However, when the process is running free (not as a
debuggee), the Access Violation cannot be caught by any means: not by
a function-level SEH __try/__except block, not by
UnhandledExceptionFilter, and not by the external JIT debugger. This
effectively rules out any post-mortem crash analysis.

As an experiment, I have switched back into win32 and discovered that
everything works properly there. Access Violation Exception is
properly caught by function level SEH frame, also by
UnhandledExceptionFilter and by external JIT debugger. Everything
works on win32, nothing works on x64.

Some searching on the Internet showed that SEH on Windows x64 was
given a major overhaul and is implemented quite differently from SEH
on win32.

I still can't find an answer, however: why can't an Access Violation
stemming from a buffer overrun and stack corruption be caught by any
means on x64? Having critical processes silently vanish from a
production system without leaving any trace behind is a big concern
to me.

The small 30-line C program reproducing this behaviour follows.
Compile and link it with Microsoft Visual C 8 as a console
application, with /EHa, in release mode, for x64 (the rest of the
settings are pretty much standard). Have a JIT debugger set up in the
registry and see how the process just vanishes without giving the JIT
tool any chance to get hold of it.

#include <stdio.h>
#include <stdlib.h>
#include <Windows.h>

int func_a()
{
    int res = 3;
    int arr[4000];
    int i;
    // write 12 extra bytes so as to overwrite the function return address
    // make this code complex enough so that optimizer does not eliminate it
    for (i = 0; i < 4003; ++i)
        arr[i] = rand();
    for (i = 0; i < 4003; ++i)
        res += arr[i];
    return res;
}

int main()
{
    func_a();
    printf("Sleep\n");
    Sleep(2000);
    printf("Done\n");
    return 0;
}

I understand that the stack is corrupt, so a stack trace might not be
available, but there is still lots of useful information in the failed
process for post-mortem analysis (a partially corrupt stack trace, the
full RAM image, registers, etc.). Why can't a crash dump be produced
in this scenario?
Robert
Kalle Olavi Niemitalo
2009-11-25 00:22:58 UTC
Post by r***@yahoo.com
> However, when the process is running
> free (and not as debuggee) then this Access Violation cannot be caught
> by any means (is not caught by function level SEH __try+__except
> block; is not caught by UnhandledExceptionFilter, is not caught by
> external JIT debugger)
IIRC, Windows on x64 finds the structured exception handlers by
taking a stack backtrace and then comparing the return addresses
to some tables. If the stack is corrupt, then perhaps Windows
gives up at the backtrace. And if Windows cannot know whether
there are handlers, it seems OK to skip UnhandledExceptionFilter
too. On Windows XP x86, UnhandledExceptionFilter is what calls
ReportFault and I think also checks the AeDebug Registry key;
if that is true on Windows XP x64 too, it would explain why you
don't get a crash dump.

Some ideas to try:
- AddVectoredExceptionHandler might let your handler run before
Windows looks at the stack.
- Windows Vista might be able to save a crash dump.
- Run ADPlus in crash mode.
opedroso
2009-11-25 14:42:54 UTC
Hi Robert,

Kalle has the right approach here: run ADPlus in crash mode.

The most likely reason you don't see anything after the problem
happens is that a second exception occurs while the code processing
the first exception is executing (most likely because of the corrupt
stack).

Windows then summarily kills your process.

Running ADPlus.vbs -crash will get you a minidump for each exception
that happens leading up to the final crash. Concentrate on the first
one and resolve it, then move on to the next.
Once you have cleared them all, the program works again.

Not too hard, just methodical work with the right tool.

Good luck and let us know what you find.

Osiris
r***@yahoo.com
2009-11-25 15:10:57 UTC
Post by opedroso
> Hi Robert,
> Kalle has the right approach here: run ADPlus in crash mode.
> The most likely reason you don't see anything after the problem
> happens is because while the code processing the exception is
> executing, a second exception happens (most likely due to the corrupt
> stack).
> Windows summarily kills your process then.
> Using ADPlus.vbs -crash will get you a minidump for each exception
> that happens leading to the final crash. Concentrate on the first and
> resolve it. Then move on to the next.
> Once you clear them all, program works again.
> Not too hard, just methodic work. And using the right tool.
> Good luck and let us know what you find.
> Osiris
Hi Osiris,
Yes, this all starts to make perfect sense: an exception raised while
another exception is being processed throws Windows off track.
I must admit I have never used ADPlus, but I will learn it quickly and
let you know the results.
Catch you later
Robert
r***@yahoo.com
2009-11-27 10:37:59 UTC
Hi,
ADPlus shows the Access Violation as first chance and second chance,
followed by a Process_Shut_Down exception. All my experiments confirm
that Windows shuts the process down when an exception happens while
another exception is being processed, without giving the process a
chance to produce a crash dump (the control flow never reaches
UnhandledExceptionFilter).

The only custom application code that gets invoked is the vectored
exception handler.

I am still not happy about it. The win32 configuration has no problem
invoking the JIT debugger (by means of its top-level
UnhandledExceptionFilter) even if I corrupt the stack in the same way.
Why can't x64 do the same? It looks like careful stepping through
RtlDispatchException is needed.
Thanks
r***@yahoo.com
2009-11-25 15:02:36 UTC
Post by Kalle Olavi Niemitalo
> Post by r***@yahoo.com
>> However, when the process is running
>> free (and not as debuggee) then this Access Violation cannot be caught
>> by any means (is not caught by function level SEH __try+__except
>> block; is not caught by UnhandledExceptionFilter, is not caught by
>> external JIT debugger)
> IIRC, Windows on x64 finds the structured exception handlers by
> taking a stack backtrace and then comparing the return addresses
> to some tables. If the stack is corrupt, then perhaps Windows
> gives up at the backtrace. And if Windows cannot know whether
> there are handlers, it seems OK to skip UnhandledExceptionFilter
> too. On Windows XP x86, UnhandledExceptionFilter is what calls
> ReportFault and I think also checks the AeDebug Registry key;
> if that is true on Windows XP x64 too, it would explain why you
> don't get a crash dump.
> - AddVectoredExceptionHandler might let your handler run before
>   Windows looks at the stack.
> - Windows Vista might be able to save a crash dump.
> - Run ADPlus in crash mode.
There are huge changes in x64 SEH (as compared to x86). Apparently, in
some cases the user-installed UnhandledExceptionFilter is not called
at all and the JIT tool is called directly (see:
https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=101337).
In my case the JIT tool is not called at all, which upsets me.

You are right. The fact that the stack is corrupt probably determines
how Windows behaves here.

You've come up with some interesting ideas to try:

AddVectoredExceptionHandler: I have checked this quickly and it does
get called! The docs say that vectored exception handlers do not
depend on the user stack and are invoked before SEH handlers, and I
can indeed see my handler executed. My handler then returns
EXCEPTION_CONTINUE_SEARCH and the process is gone with no JIT tool
notified. Still better than nothing.

Thanks for this suggestion, Kalle.
I'll try ADPlus later.
I do not have access to Windows Vista at the moment.
Thanks