While reviewing the comments on the Ironic spec for Secure RBAC, I had to ask myself if the “project” construct makes sense for Ironic. I still think it does, but I’ll write this down to see if I can clarify it for me, and maybe for you, too.
Baremetal servers change. The whole point of Ironic is to control the change of Baremetal servers from inanimate pieces of metal to “really useful engines.” This needs to happen in a controlled and unsurprising way.
Ironic the server does what it is told. If a new piece of metal starts sending out DHCP requests, Ironic is going to PXE boot it. This is the start of this new piece of metal’s journey of self-discovery, at least as far as Ironic is concerned.
But really, someone had to rack and wire said piece of metal. Likely the person that did this is not the person that is going to run workloads on it in the end. They might not even work for the same company; they might be a delivery person from Dell or Supermicro. So, once they are done with it, they don’t own it any more.
Who does? Who owns a piece of metal before it is enrolled in the OpenStack baremetal service?
No one. It does not exist.
Ok, so let’s go back to someone pushing the button, booting our server for the first time, and it doing its PXE boot thing.
Or, we get the MAC address and enter that into the ironic database, so that when it does boot, we know about it.
Either way, Ironic is really the playground monitor, just making sure it plays nice.
What if Ironic is a multi-tenant system? Someone needs to be able to transfer the baremetal server from wherever it lands up front to the people that need to use it.
I suspect that transferring metal from project to project is going to be one of the main use cases after the sun has set on day one.
So, who should be allowed to say what project a piece of baremetal can go to?
Well, in Keystone, we have the idea of hierarchy. A Project is owned by a domain, and a project can be nested inside another project.
But this information is not passed down to Ironic. There is no way to get a token for a project that shows its parent information. But a remote service could query the project hierarchy from Keystone.
Say I want to transfer a piece of metal from one project to another. Should I have a token for the source project or the remote project? Ok, dumb question, I should definitely have a token for the source project. The smart question is whether I should also have a token for the destination project.
Sure, why not. Two tokens: one that has the “delete” role and one that has the “create” role.
The only problem is that nothing like this exists in OpenStack. But it should.
We could fake it with hierarchy; I can pass things up and down the project tree. But that really does not do one bit of good. People don’t really use the tree like that. They should. We built a perfectly nice tree and they ignore it. Poor, ignored, sad, lonely tree.
Actually, it has no feelings. Please stop anthropomorphising the tree.
What you could do is create the destination object, kind of a potential piece-of-metal or metal-receiver. This receiver object gets a UUID. You pass this UUID to the “move” API. But you call the “move” API with a token for the source project. The move is done atomically. Let’s call this thing identified by a UUID a move-request.
The order of operations could be done in reverse. The operator could create the move request on the source, and then pass that to the receiver. This might actually make more sense, as you need to know about the object before you can even think to move it.
Both workflows seem to have merit.
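To make the two workflows concrete, here is a minimal in-memory sketch in Python. Everything in it (the MoveRequest and Inventory names, the fields, the method signatures) is hypothetical and exists only for illustration; nothing like it is in Ironic today. Whichever project creates the move request, the other project has to accept it, and ownership changes in a single step.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class MoveRequest:
    """Hypothetical transfer record; not an existing Ironic object."""
    resource_id: str
    source_project: str
    destination_project: str
    created_by: str  # project the creating token was scoped to
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


class Inventory:
    """Toy stand-in for a service database."""

    def __init__(self):
        self.owners = {}    # resource_id -> owning project
        self.requests = {}  # move-request UUID -> MoveRequest

    def create_move_request(self, token_project, resource_id, destination):
        source = self.owners[resource_id]
        # Either side may open the request, but nobody else.
        if token_project not in (source, destination):
            raise PermissionError("token is scoped to an unrelated project")
        req = MoveRequest(resource_id, source, destination, token_project)
        self.requests[req.id] = req
        return req.id

    def accept(self, token_project, request_id):
        req = self.requests[request_id]
        # The party that did NOT create the request must accept it.
        if req.created_by == req.source_project:
            expected = req.destination_project
        else:
            expected = req.source_project
        if token_project != expected:
            raise PermissionError("wrong project for accepting this request")
        # Guard against the resource having moved in the meantime.
        if self.owners.get(req.resource_id) != req.source_project:
            raise ValueError("resource is no longer owned by the source")
        # A real service would do this inside one database transaction.
        self.owners[req.resource_id] = req.destination_project
        del self.requests[request_id]
```

In this sketch the receiver-first flow is create_move_request called with a token for the destination project followed by accept with a token for the source project; the sender-first flow swaps the two.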
And…this concept seems to be something that OpenStack needs in general.
In fact, why should the API not be a generic API? I mean, it would have to be per service, but the same API could be used to transfer VMs between projects in Nova and volumes between projects in Cinder. The API would have two verbs: one for creating a new move request, and one for accepting it.
POST /thingy/v3.14/resource?resource_id=abcd&destination=project_id
If this is called with a token, it needs to be scoped. If it is scoped to the destination project_id in the request, it creates a receiving type request. If it is scoped to the project_id that owns the resource, it is a sending type request. Either way, it returns a URL. Call GET on that URL and you get information about the transfer. Call PATCH on it with the appropriately scoped token, and the resource is transferred. The PATCH might also require enough information to prove that you know what you are doing: maybe you have to specify the source and target projects in the request.
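The scoping rule can be sketched as two small Python functions. The names classify_request and may_complete, and the "source" and "destination" body keys, are made up for illustration; they are not part of any OpenStack API.

```python
def classify_request(token_project, owner_project, destination_project):
    """Decide what kind of move request a scoped token creates."""
    if token_project == owner_project:
        return "send"      # the current owner is offering the resource
    if token_project == destination_project:
        return "receive"   # the destination is asking for the resource
    raise PermissionError("token must be scoped to the owner or destination")


def may_complete(kind, token_project, owner_project, destination_project, body):
    """Check a PATCH that tries to finish the transfer.

    The counterpart of whoever created the request must send the PATCH,
    and the body must name both projects correctly.
    """
    needed = destination_project if kind == "send" else owner_project
    names_both = (body.get("source") == owner_project
                  and body.get("destination") == destination_project)
    return token_project == needed and names_both
```

Requiring both project IDs in the body is the "prove that you know what you are doing" check; a service could just as well decide the scoped token alone is enough.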
A foolish consistency is the hobgoblin of little minds.
Edit: OK, this is not a new idea. Cinder went through the same thought process according to Duncan Thomas. The result is this API: https://docs.openstack.org/api-ref/block-storage/v3/index.html#volume-transfer
Which looks like it then morphed to this one:
https://docs.openstack.org/api-ref/block-storage/v3/index.html#volume-transfers-volume-transfers-3-55-or-later
We went through some of this dance in cinder… there’s an API called ‘volume transfer’ for giving a volume from one user to another. It was written when keystone was in a much more primitive state than it is now, but it might still be instructive to take a look at.
The tl;dr version is that we used an out-of-band token, generated when the owner calls ‘transfer volume’ to initiate the transfer, and used by the receiving party to take ownership of the volume (if the sender didn’t take it out of a transferring state before the receiver redeemed their token).
It has been a couple of years since I looked at it, so you’ll have to look at the code for any other details, but this has the advantage that keystone hierarchy doesn’t matter: you could transfer a volume/server/etc to any user, assuming you have permissions to give a volume away.
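A rough Python sketch of the out-of-band token idea described in this comment follows. It is not Cinder’s code: the TransferBook class, its method names, and the key length are assumptions made for illustration.

```python
import hmac
import secrets
import uuid


class TransferBook:
    """Toy model of a transfer guarded by a one-time secret."""

    def __init__(self, owners):
        self.owners = owners   # volume_id -> owning project
        self.pending = {}      # transfer_id -> (volume_id, auth_key)

    def create(self, token_project, volume_id):
        if self.owners[volume_id] != token_project:
            raise PermissionError("only the owner can offer a volume")
        transfer_id = str(uuid.uuid4())
        auth_key = secrets.token_hex(8)
        self.pending[transfer_id] = (volume_id, auth_key)
        # The owner hands both values to the receiver out of band.
        return transfer_id, auth_key

    def cancel(self, token_project, transfer_id):
        volume_id, _ = self.pending[transfer_id]
        if self.owners[volume_id] != token_project:
            raise PermissionError("only the owner can cancel")
        del self.pending[transfer_id]

    def accept(self, token_project, transfer_id, auth_key):
        volume_id, expected = self.pending[transfer_id]
        # Constant-time comparison of the presented secret.
        if not hmac.compare_digest(auth_key.encode(), expected.encode()):
            raise PermissionError("bad auth key")
        # Any project holding the key can take ownership; hierarchy is ignored.
        self.owners[volume_id] = token_project
        del self.pending[transfer_id]
```

The sender can still back out with cancel until the receiver redeems the key, which matches the "if the sender didn’t take it out of a transferring state" caveat above.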
Cool. Updated the article to point to the transfer APIs. Looks like there was a redesign. What lessons were learned from this? What is the difference in focus between the pre and post 3.55?
I think at a higher level, being able to move resources from one project to another project is an age-old problem for OpenStack. But if the physical machine owner (because, we have this concept, it is just not widely used) needs to reclaim the hardware from the lessee who is currently using the hardware, then they shouldn’t have to wait. They may not be able to wait. A disaster may be underway and their agreement with the lessee permits the machines to be taken back at any time. As such they should just be able to do it. Naturally there likely need to be workflows wrapped around this with physical metal. The boundary between the owner and lessee may be friendly, or not. And I’m not really sure that is the system admin’s problem, as long as the workflow is enforced or able to be bypassed, which may be separate policy rights in itself.
I completely agree that OpenStack as a whole needs a single, central, orderly mechanism to move resources; however, workflows still need to be able to be engaged or not, and that is ultimately an operational decision of the cloud operator. It is a good thing to have, but an easy thing to kind of focus in on. That may not be the right thing in that instance or moment to be focused on. Hopefully that makes sense.
I think the project construct, at least the “baremetal API lives in its own project” construct, is limiting, and creates an inconsistent user experience. I think that is in part what you’re advocating, that Ironic in essence remains unchanged experience-wise, which is completely confusing to me. Regardless, there is definitely an intermediate party between a physical hardware owner and lessee who can and should be able to be that monitor and facilitator. And I think the RBAC model actually strengthens that and the underlying protections along with separation of roles and responsibilities.
Of course, I may not be making any sense at all. 🙂