Troubleshooting PXE in SCCM OSD Part 2 Troubleshooting the TFTP Service
Now that the PXE process is working correctly, we can look at troubleshooting errors surrounding abortpxe.com. If you get this error message then you at least have a working PXE environment, even if SCCM doesn't think it should offer a Task Sequence to the machine. Here are some reasons you'll get this error.
- The machine has not been registered in a build collection
This is the simplest reason for the error. Does the machine have a Task Sequence advertised to it? If not, create a collection, advertise a Task Sequence to that collection, and add your machine to the collection. Check smspxe.log; you should see an error such as
ProcessDatabaseReply: No Advertisement found in Db for device 05/03/2011 08:51:36 10368 (0x2880)
- The machine has been recently registered in a build collection, but the server takes some time (up to an hour) to process this information
This is commonly seen when a technician PXE boots the machine to write down the MAC address. If you then create a new computer object based on that MAC address, you need to wait an hour before the WDS service will look up the database again. You can see this happening in smspxe.log with an entry such as
MAC=FF:FF:FF:FF:FF:FF:FF:FF:FF:FF:FF:FF:FF:FF:FF:FF SMBIOS GUID=00000000-0000-0000-0000-000000000000 > Device not found in the database. 07/03/2011 15:18:46 8552 (0x2168)
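If you have a lot of these entries to sift through, a short script can pull out the MAC address and SMBIOS GUID of every machine the server did not recognise. This is only a sketch: the regular expression is modelled on the single sample entry above, so the real log layout may need a looser pattern, and the function name is made up for illustration.

```python
import re

# Pattern modelled on the sample smspxe.log entry above (an assumption:
# other SCCM versions may format the line differently).
DEVICE_NOT_FOUND = re.compile(
    r"MAC=(?P<mac>[0-9A-Fa-f:]+)\s+"
    r"SMBIOS GUID=(?P<guid>[0-9A-Fa-f-]{36})\s+>\s+"
    r"Device not found in the database"
)


def find_unknown_devices(log_lines):
    """Return (mac, guid) pairs for every 'Device not found' entry."""
    found = []
    for line in log_lines:
        match = DEVICE_NOT_FOUND.search(line)
        if match:
            found.append((match.group("mac"), match.group("guid")))
    return found


# Two entries copied from this article; only the first one should match.
sample = [
    "MAC=FF:FF:FF:FF:FF:FF:FF:FF:FF:FF:FF:FF:FF:FF:FF:FF "
    "SMBIOS GUID=00000000-0000-0000-0000-000000000000 > "
    "Device not found in the database. 07/03/2011 15:18:46 8552 (0x2168)",
    "ProcessDatabaseReply: No Advertisement found in Db for device "
    "05/03/2011 08:51:36 10368 (0x2880)",
]

unknown = find_unknown_devices(sample)
```

To run it against a real log, pass the lines of smspxe.log (for example from an open file object) instead of the sample list.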
This is fixed by this hotfix or SP2 for SCCM. The patch alone won't fix this behavior; you also need to configure a registry setting. If it doesn't already exist, create a
...or on a 64-bit server at...
Set the value of CacheExpire to the timeout you want in seconds: a value of 600 gives a timeout of 10 minutes. On an SP2 SCCM site, setting the value to 0 will actually set the timeout to 3600 seconds (back to the 1 hour timeout).
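The CacheExpire rule above is easy to get wrong when picking a value, so here is a small sketch of it as a function. It only models the behaviour described in this article for an SP2 site; the function name is hypothetical and rejecting negative values is my own assumption, not something SCCM documents.

```python
DEFAULT_TIMEOUT_SECONDS = 3600  # the one hour default described above


def effective_cache_timeout(cache_expire):
    """Return the lookup cache timeout in seconds for a CacheExpire value.

    Models the SP2 behaviour described in the article: a value of 0
    falls back to the one hour default rather than disabling the cache.
    """
    if cache_expire < 0:
        # Assumption for this sketch: negative values are not meaningful.
        raise ValueError("CacheExpire must not be negative")
    if cache_expire == 0:
        return DEFAULT_TIMEOUT_SECONDS
    return cache_expire


ten_minutes = effective_cache_timeout(600)  # 600 seconds, i.e. 10 minutes
fallback = effective_cache_timeout(0)       # 3600 seconds on an SP2 site
```

In other words, if you want a short timeout, pick a small non-zero value rather than 0.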
If you are unable to apply the hotfix or SP2, stopping and restarting the WDS service can flush out the cache.
- The SMBIOS GUID of the machine is not unique
This can be seen if you have older hardware, or if you've had an engineer swap out some motherboards and not flash the BIOS correctly.
If you want to find out which machines have duplicate SMBIOS GUIDs, you can run this report:
SELECT SMBIOS_GUID0, COUNT(SMBIOS_GUID0) AS Count
FROM v_R_System
GROUP BY SMBIOS_GUID0, Active0, Client0, Obsolete0
HAVING (Active0 = 1) AND (Client0 = 1) AND (Obsolete0 = 0) AND (COUNT(SMBIOS_GUID0) > 1)
You can then use the following report to pull out the names of the machines with duplicate SMBIOS GUIDs:
SELECT SMBIOS_GUID0, Name0
FROM v_R_System
WHERE SMBIOS_GUID0 = '00000000-0000-0000-0000-000000000000'
Here 00000000-0000-0000-0000-000000000000 is the GUID that you identified in the previous report.
The only way to solve this problem is to flash the BIOS on the affected workstation to set a unique SMBIOS GUID. Contact the PC vendor for the tool to do this.
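If you would rather work from an exported list of machines than run the two reports, the same grouping can be sketched in a few lines of Python. This assumes you already have (name, SMBIOS GUID) pairs from somewhere such as a CSV export; the machine names and the function name below are invented for the example.

```python
from collections import defaultdict


def duplicate_smbios_guids(records):
    """Group machine names by SMBIOS GUID and keep GUIDs seen more than once."""
    by_guid = defaultdict(list)
    for name, guid in records:
        # Compare case-insensitively, since exports may mix upper and lower case.
        by_guid[guid.upper()].append(name)
    return {guid: names for guid, names in by_guid.items() if len(names) > 1}


# Hypothetical export: two machines share the all-zero GUID.
records = [
    ("PC-001", "00000000-0000-0000-0000-000000000000"),
    ("PC-002", "4c4c4544-0042-3510-8052-b4c04f4d4a31"),
    ("PC-003", "00000000-0000-0000-0000-000000000000"),
]

duplicates = duplicate_smbios_guids(records)
```

The result maps each duplicated GUID to the machines that report it, which is the same information the two reports give you in two steps.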
- The machine is linked to an obsolete object on the server
This can happen if you have "Automatically create new client records for duplicate hardware IDs" set in the Advanced tab of your site properties. The solution to this one is to manually delete those obsolete objects.
- The machine was imaged using a technology such as Ghost, but the SID and/or SCCM client GUID were not reset
This can be a bit of a pain to troubleshoot on the server - I once saw a machine that according to the SCCM reports had 30 separate users logging into it. Since this machine was kept in a locked office, this appeared to be a bit odd. It turned out the support team had used Ghost to image one of their machines and then deployed this image to all the machines in their department.
This highlights a wider point about deploying SCCM in your environment: the processes and procedures that worked in the past may need revising. In this case, they'd never had a problem before because their authentication was handled by NetWare.
The easiest way to fix this problem is to power off the machine, delete the computer object in SCCM, recreate the record manually, and then PXE build the machine.
Other pre-Windows PE errors
- \Boot\BCD error
Assuming you can get past abortpxe.com, there is another error you can see at this stage. After pressing the F12 key to PXE boot you can sometimes see
Windows Boot Manager (Server IP: x.y.z.a)
Windows failed to start. A recent hardware or software change might be the cause.
Info: An error occurred while attempting to read the boot configuration data.
The simple solution is to delete the computer object and recreate it, which should fix this problem. I've only ever seen this problem with SCCM 2007 SP2 when deploying Windows 7.
This does look like a BCD error, but in the SCCM implementation of WDS there is no single boot.bcd file. Instead, the boot.bcd file is created on the fly in the RemoteInstall\SMSTemp folder with a name of year.month.day.hour.minute.number.number.guid.boot.bcd.
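If you want to match one of these temporary files to a particular boot attempt, the name can be split into its parts. The sketch below follows the naming pattern exactly as described above and nothing more; the sample file name is made up, and I have not confirmed what the two number fields mean, so they are returned as plain strings.

```python
SUFFIX = ".boot.bcd"


def parse_temp_bcd_name(filename):
    """Split a temporary BCD file name into timestamp, counters and GUID.

    Assumes the layout described in the article:
    year.month.day.hour.minute.number.number.guid.boot.bcd
    """
    if not filename.lower().endswith(SUFFIX):
        raise ValueError("not a temporary boot.bcd file name: " + filename)
    stem = filename[: -len(SUFFIX)]
    parts = stem.split(".", 7)
    if len(parts) != 8:
        raise ValueError("unexpected number of fields in: " + filename)
    year, month, day, hour, minute = (int(p) for p in parts[:5])
    return {
        "created": (year, month, day, hour, minute),
        "numbers": (parts[5], parts[6]),  # meaning not confirmed
        "guid": parts[7],
    }


# Hypothetical file name built from the pattern above.
example = parse_temp_bcd_name(
    "2011.03.07.15.18.46.01.{00000000-0000-0000-0000-000000000000}.boot.bcd"
)
```

Comparing the "created" fields with the time of the failed boot in smspxe.log should narrow down which file belongs to which machine.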
If anyone knows the actual fix for this (without having to delete the computer object) please post in the comments!
- Only using 32-bit boot images when you have 64-bit machines in your environment
Again, this one seems a bit odd. If your workstation is 64-bit (and you'd be hard pressed to find a non-64-bit machine these days), then you need the 64-bit boot files available - even if you are only deploying 32-bit Windows, and are using a 32-bit boot image. The 64-bit boot files are extracted from the boot image and used during the initial PXE process, so if they're missing, you won't be able to PXE boot a 64-bit machine.
If you're getting this error, you'll see something like this in smspxe.log
The SMS PXE Service Point does not have a boot image matching the processor architecture of the PXE booting device.
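To tie the causes in this article together, here is a rough sketch that scans smspxe.log lines for the three messages quoted above and reports the likely cause of each. The matching is a plain substring search on text taken from this article, so treat it as a starting point; the function name and the wording of the causes are my own.

```python
# Substrings taken from the smspxe.log messages quoted in this article,
# mapped to the cause each one points at.
SIGNATURES = {
    "No Advertisement found in Db":
        "no Task Sequence is advertised to the machine",
    "Device not found in the database":
        "machine is unknown, or the lookup cache has not expired yet",
    "does not have a boot image matching the processor":
        "boot image for the machine's architecture is missing",
}


def diagnose(log_lines):
    """Return (line_number, cause) for each line matching a known message."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        for needle, cause in SIGNATURES.items():
            if needle in line:
                hits.append((number, cause))
    return hits


sample_log = [
    "ProcessDatabaseReply: No Advertisement found in Db for device "
    "05/03/2011 08:51:36 10368 (0x2880)",
    "Some unrelated entry",
    "The SMS PXE Service Point does not have a boot image matching the "
    "processor architecture of the PXE booting device.",
]

results = diagnose(sample_log)
```

Each hit gives you the line number to look at in the log and the section of this article that is most likely to apply.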