delphij's Chaos

I chose the word "chaos" because it is really hard to find a better word to describe this place...

03 Mar 2004

minidump for FreeBSD?

junsu and I have been discussing some big hacks to the FreeBSD kernel, and I think implementing a minidump is a good idea.

On Windows there’s a minidump mechanism. A minidump saves only the ‘crashing’ pages, while a full dump saves the whole of core memory when the system is about to crash.

More technically, a minidump means that we do not save “unused” pages when taking this type of dump. Yes, you can easily argue that during a kernel panic the pagetable may itself be damaged, so depending on the state of the pagetable is not a good thing to do. Windows differs from FreeBSD here: its microkernel-style architecture lets people prove the correctness of the “microkernel” rather than of the whole kernel, so depending on the microkernel is possible as long as it correctly protects the pagetable.
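To make the "skip unused pages" idea concrete, here is a minimal user-space sketch in Python. It is only an illustration, not kernel code: the page size, the `used_pages` set standing in for what the pagetable would tell us, and the on-disk layout (a page count followed by index/data records) are all assumptions made up for this example.

```python
import struct

PAGE_SIZE = 4096  # assumed page size for this sketch


def write_minidump(memory, used_pages):
    """Serialize only the pages marked as used.

    Hypothetical layout: a 4-byte page count, then for each page
    a 4-byte page index followed by PAGE_SIZE bytes of data.
    """
    total_pages = len(memory) // PAGE_SIZE
    # Ignore indexes that fall outside physical memory.
    pages = sorted(p for p in used_pages if 0 <= p < total_pages)
    out = bytearray(struct.pack("<I", len(pages)))
    for p in pages:
        out += struct.pack("<I", p)
        out += memory[p * PAGE_SIZE:(p + 1) * PAGE_SIZE]
    return bytes(out)


def read_minidump(dump):
    """Rebuild a {page index: page bytes} mapping from a minidump."""
    (count,) = struct.unpack_from("<I", dump, 0)
    offset = 4
    pages = {}
    for _ in range(count):
        (index,) = struct.unpack_from("<I", dump, offset)
        offset += 4
        pages[index] = dump[offset:offset + PAGE_SIZE]
        offset += PAGE_SIZE
    return pages
```

With eight pages of memory and only two of them in use, the dump carries two page records instead of eight, which is the whole point; a debugger would then need to understand the index records to map pages back to their addresses.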

However, in most cases the pagetable is correct, and an average user will not build a debug kernel for his or her production environment. This creates a paradox: to reproduce problems we definitely want a production environment, but there performance is important and a user will not (as most developers do) run a debugging kernel. When the system crashes, he or she will first post something to the mailing list. Without a backtrace, it is hard to figure out the problem.

Dumping core on a machine with a lot of RAM installed, on the other hand, is a nightmare for the end user. Having the ability to dump a mini-core would make it easier for average users to submit their problems (we could even implement a script that collects the backtrace information on startup where possible and automatically mails the result to the administrator, so that he or she can forward it to the development team).
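The mailing half of such a script could look roughly like the sketch below. It assumes the backtrace has already been extracted as text by some other step (not shown), and the function name, subject line, and sender address are hypothetical choices for this example. It only builds the message; actually sending it is left to whatever mail setup the machine has.

```python
from email.message import EmailMessage


def build_crash_report(backtrace, hostname, recipient):
    """Build a mail message carrying a backtrace (hypothetical format)."""
    msg = EmailMessage()
    msg["Subject"] = f"Kernel crash report from {hostname}"
    msg["From"] = f"root@{hostname}"  # assumed sender for the sketch
    msg["To"] = recipient
    msg.set_content(
        "A crash dump was found at startup. Backtrace:\n\n" + backtrace
    )
    return msg
```

The administrator stays in the loop here on purpose: the report goes to him or her first, and forwarding it to the developers remains a human decision.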

Implementing the mini-core requires modifying several parts, including the kernel itself (of course) and gdb, which has to support the new format. The core dump code could also be improved with the ability to compress the core it writes: when there are many coredumps and administrators have no time to clean them up, this would save them a lot of time.
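As a rough illustration of how much compression can help, the sketch below gzips an existing core file from user space. This is a stand-in for the idea, not the in-kernel change described above: the function name and paths are hypothetical, and gzip is simply one convenient choice of compressor.

```python
import gzip
import shutil


def compress_core(src_path, dst_path):
    """Stream a core file into a gzip archive without loading it all into RAM."""
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        # copyfileobj copies in chunks, so large cores are handled safely.
        shutil.copyfileobj(src, dst)
```

Memory images tend to contain long runs of zeroed pages, so even a simple compressor like this usually shrinks them considerably.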