Ironic Clean PXE failure

One of our Ironic baremetal nodes was suffering a cleaning failure. Fixing it was easy…once we knew the cause.

Cleaning is the process by which Ironic prepares a node for use. It removes old data and configuration from the node. In order to do that, it has to run a simple image. We use a Debian-based image, known as the IPA image because it runs the Ironic Python Agent. This image is booted over the network via PXE. So, if the PXE setup is broken, the node can’t be cleaned.

I watched the node's boot messages via the IPMI Serial over LAN (SOL) console. What I saw indicated “no response from PXE.”

The exact message depends on the hardware you run.

The PXE server in Ironic matches a node to its MAC address via the baremetal port.

To find out what the port is for a given node, use a command like:

openstack baremetal port list --node ac5bf47b-7185-4db5-ab24-7a396deeaf33

Which shows output like this:

+--------------------------------------+-------------------+
| UUID                                 | Address           |
+--------------------------------------+-------------------+
| 268926f8-eab5-4bdf-8b63-7337da43dd52 | 1c:34:da:5a:c7:b0 |
+--------------------------------------+-------------------+

Then compare the (MAC) address returned with the MAC address reported by PXE. In my case, they did not match. I created a new port with:

openstack baremetal port create --node ccad9fe2-1f04-45f1-a4cb-1a993f2a3b69 0c:42:a1:49:ee:c4

Now port list will show both ports. Delete the old one with openstack baremetal port delete, passing the old port's UUID.
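If there are several nodes or ports to tidy up, the compare-and-delete step can be scripted. The following is only a sketch: normalize_mac and delete_stale_ports are made-up helper names, and it assumes the MAC reported by PXE is the only address the node should keep. It relies on the standard -f value -c formatter options of the openstack client to get one "UUID ADDRESS" pair per line.

```shell
# Hypothetical helpers, not part of the openstack client.

normalize_mac() {
    # Lowercase the address and use ':' as the separator,
    # so 0C-42-A1-49-EE-C4 and 0c:42:a1:49:ee:c4 compare equal.
    printf '%s\n' "$1" | tr 'A-Z-' 'a-z:'
}

delete_stale_ports() {
    # $1: node UUID, $2: MAC address reported by PXE (the one to keep)
    node="$1"
    keep=$(normalize_mac "$2")
    openstack baremetal port list --node "$node" -f value -c UUID -c Address |
    while read -r uuid address; do
        if [ "$(normalize_mac "$address")" != "$keep" ]; then
            echo "deleting port $uuid ($address)"
            # stdin is redirected so the client cannot swallow the port list
            openstack baremetal port delete "$uuid" < /dev/null
        fi
    done
}
```

It would be called as delete_stale_ports followed by the node UUID and the MAC address that PXE reported. Since it deletes ports, it is worth running port list by hand first to confirm what it will remove.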

If the node is stuck in the clean wait state, you can use the following command to abort the current attempt so that cleaning can be restarted:

openstack baremetal node abort ccad9fe2-1f04-45f1-a4cb-1a993f2a3b69

And then restart the cleaning process.
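One way to do the restart, assuming the abort left the node in the clean failed state and that automated cleaning is enabled, is to move the node to manageable and then provide it again. This is a sketch under those assumptions; restart_cleaning is a made-up helper name. If the node was also left in maintenance mode, you may need openstack baremetal node maintenance unset before these steps.

```shell
# Hypothetical helper: walk a node from "clean failed" back through cleaning.
restart_cleaning() {
    # $1: node UUID
    node="$1"
    # clean failed -> manageable
    openstack baremetal node manage "$node" || return 1
    # manageable -> cleaning -> available (triggers automated cleaning)
    openstack baremetal node provide "$node"
}
```

For the node in this post, that would be restart_cleaning ccad9fe2-1f04-45f1-a4cb-1a993f2a3b69, after which the SOL console should show the node attempting PXE boot again.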
