A Possible Fix: Xen HVM & Windows 2008

Recently, I switched my VPS from VMware ESX to a provider running Xen HVM, since my original host disappeared off the face of the planet. However, the Xen HVM VPS has been blue-screening at random every 24-48 hours or so. Here’s how I troubleshot and possibly fixed the issue.

The Issue:
BSODs randomly occur with the error “A clock interrupt was not received on a secondary processor within the allocated time interval.” and STOP code 0x00000101.

Caused by:
Well, the exact cause is unknown. I have a feeling that this occurs when the host is under higher than usual load, which causes some sort of delay in the RTC (real-time clock) and, eventually, the crashes. Examining 3 crash dumps showed that the Realtek networking driver (Rtnic64.sys) is always near the top of the stack trace, together with kernel code that has something to do with the RTC.

The solution:
There are a few solutions. I’m not sure which one fixes the issue exactly, but here’s what you can try:

First, try adding these options to your VM config (the first ensures guest time always tracks “wall clock time” and the second exposes the Hyper-V interface to the VM):
timer_mode = 2
viridian = 1

You can then also try
os_variant = "vista"

And if worst comes to worst, limit the VM to 1 CPU core:
vcpus = 1
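
To see which of the two suggested options a given VM config still lacks, something like the following Python sketch may help. It is only an illustration: `parse_config` and `missing_settings` are hypothetical helper names, the sample config text is made up, and the parser assumes simple one-line `key = value` entries (it does not understand multi-line lists or `#` characters inside quoted values).

```python
# Hypothetical helper: report which of the settings suggested above are
# missing or set differently in a Xen domain config.
# Assumption: the config uses simple one-line `key = value` entries.

SUGGESTED = {
    "timer_mode": "2",  # guest time tracks wall clock time
    "viridian": "1",    # expose the Hyper-V interface to the guest
}


def parse_config(text):
    """Parse `key = value` lines, ignoring blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding quotes so "2" and 2 compare equal.
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def missing_settings(text, suggested=SUGGESTED):
    """Return {key: wanted_value} for every suggestion not yet applied."""
    current = parse_config(text)
    return {k: v for k, v in suggested.items() if current.get(k) != v}


if __name__ == "__main__":
    # Made-up sample config, for illustration only.
    sample = 'name = "win2008"\nvcpus = 2\nviridian = 1\n'
    for key, value in missing_settings(sample).items():
        print(f"{key} = {value}")
```

The optional `os_variant` and `vcpus` settings are left out of `SUGGESTED` on purpose, since they are fallbacks rather than first steps; add them to the dictionary if you want them checked too.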

Finally, you should install the GPL PV drivers. Windows can only run under Xen HVM (Xen PV is not available for it), and the default drivers under HVM are designed to work with QEMU’s emulated devices, so they carry a performance-degrading overhead. Luckily, there’s been an effort to create PV drivers for Windows that let the VM work more directly with the host’s hardware, which means less overhead and a significant performance boost.

Download the latest drivers (32- or 64-bit, for 2003/2008) from: http://www.meadowcourt.org/downloads/

If you run a 64-bit version of Windows later than 2003 (i.e. Vista, 7, 2008, 2008 R2), you need to disable driver signature integrity checks. You can also create a secondary boot option that disables the PV drivers and falls back to the less optimized default drivers. For example:

Disable Driver Integrity Checks
bcdedit /set loadoptions DISABLE_INTEGRITY_CHECKS
bcdedit /set testsigning ON
bcdedit /set nointegritychecks ON

Create secondary boot option
bcdedit /enum /v
Note the GUID string next to "identifier" under "Windows Boot Loader"
bcdedit /copy {string from above} /d "Windows Server 2008 NOGPLPV"
Description can be whatever you want
bcdedit /set {new GUID from above} LOADOPTIONS "NOGPLPV"
Use the GUID output by the 2nd command (bcdedit /copy)
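
Copying GUIDs by hand is easy to get wrong, so here is a rough Python sketch of the same steps. It only parses text and builds the command strings; it does not run bcdedit itself. The function names are hypothetical, the sample GUIDs are made up, and it assumes English-language bcdedit output in which the entry heading reads "Windows Boot Loader" and `bcdedit /copy` echoes the new entry's GUID in braces.

```python
import re

# Matches a braced GUID such as {11111111-2222-3333-4444-555555555555}.
GUID_RE = re.compile(
    r"\{[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\}"
)


def boot_loader_guid(enum_output):
    """Return the identifier of the first "Windows Boot Loader" entry
    found in the text printed by `bcdedit /enum /v`."""
    in_loader = False
    for line in enum_output.splitlines():
        stripped = line.strip()
        if stripped == "Windows Boot Loader":
            in_loader = True
        elif in_loader and stripped.startswith("identifier"):
            match = GUID_RE.search(stripped)
            if match:
                return match.group(0)
    raise ValueError("no Windows Boot Loader identifier found")


def copied_guid(copy_output):
    """Return the GUID of the new entry from `bcdedit /copy` output."""
    match = GUID_RE.search(copy_output)
    if not match:
        raise ValueError("no GUID found in bcdedit /copy output")
    return match.group(0)


def nogplpv_commands(source_guid, new_guid,
                     description="Windows Server 2008 NOGPLPV"):
    """Build the two commands from the steps above as strings."""
    return [
        f'bcdedit /copy {source_guid} /d "{description}"',
        f'bcdedit /set {new_guid} LOADOPTIONS "NOGPLPV"',
    ]


if __name__ == "__main__":
    # Illustrative text only; paste your real bcdedit output instead.
    enum_text = (
        "Windows Boot Loader\n"
        "-------------------\n"
        "identifier              {11111111-2222-3333-4444-555555555555}\n"
    )
    copy_text = "copied to {aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee}."
    source = boot_loader_guid(enum_text)
    new = copied_guid(copy_text)
    for command in nogplpv_commands(source, new):
        print(command)
```

In practice you would run the first generated command, feed its output to `copied_guid`, and only then run the second one, since the new GUID does not exist until the copy has been made.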

Now install the MSI, restart, open up Device Manager and you should find you have no more “Unknown Devices” and that you now have a different network card (among other things) installed.
