This OpenStack summit is the third I have attended where we’ve discussed algorithms for recording quota in Keystone without updating it on each resource allocation and free.
We were stumped, again. The process we had planned on using was game-able and thus broken. I was kinda bummed.
Fortunately, I had a long car ride from Vancouver to Seattle and talked it over with Morgan Fainberg.
We also discussed the Pig War. Great piece of history from the region.
By the time we got to the airport the next day, I think we had it solved. Morgan came to the solution first, and I followed, slowly. Here’s what we think will work.
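First, let’s set up a three-level-deep project tree to use for our discussion: A at the root, B1 and B2 beneath it, C11, C12, and C13 under B1, and C21, C22, and C23 under B2.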
The rule is simple: even if a quota is subdivided, you still need to check the overall quota of the project and all the parent projects.
In the example structure above, let’s assume that project A gets a quota of 100 units of some resource: VMs, GB memory, network ports, hot-dogs, very small rocks, whatever. I’ll use VMs in this example.
There are a couple of ways this could be further managed. The simplest is that any resource allocated anywhere in this tree is counted against this quota. There are 9 total projects in the tree. If each allocates 11 VMs, there will be 99 created and counted against the quota. The next VM created uses up the quota. The request after that will fail due to lack of available quota.
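The arithmetic of that simplest scheme can be sketched in a few lines. This is only an illustration: the `state` dictionary and `create_vm` function are made-up names, not anything from Nova or Keystone, and the project names assume the nine-project tree described above.

```python
# Hypothetical sketch: a single quota shared by every project in the tree.
PROJECTS = ["A", "B1", "B2", "C11", "C12", "C13", "C21", "C22", "C23"]
TREE_QUOTA = 100

state = {"used": 0}  # one counter for the whole tree


def create_vm(project):
    """Count the VM against the tree-wide quota, no matter who asks."""
    if state["used"] >= TREE_QUOTA:
        raise RuntimeError(f"no quota left, request from {project} refused")
    state["used"] += 1


# Every project allocates 11 VMs: 9 * 11 = 99.
for project in PROJECTS:
    for _ in range(11):
        create_vm(project)
used_after_round = state["used"]

create_vm("C11")  # the 100th VM uses up the quota

try:
    create_vm("C11")  # the 101st request
    refused = False
except RuntimeError:
    refused = True

print(used_after_round, state["used"], refused)  # 99 100 True
```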
Let’s say, however, that the users in project C23 are greedy, and allocate all 100 VMs. The people in C11 are filled with righteous indignation. They need VMs too.
The admins wipe everything out and we start all over. They set up a system to fragment the quota by allowing a project to split its quota assignment up and allocate some of it to subordinate projects.
Project A says “I’m going to keep 50 VMs for myself, and allocate 25 each to B1 and B2.”
Project B1 says “I am going to keep 10 for myself and allocate 5 each to C11, C12, and C13.” And the B1 tree is happy.
B2 is a manipulative schemer and decides to play around. B2 allocates his entire quota of 25 to C21. C21 creates 25 VMs.
B2 now withdraws his quota from C21. There is no communication with Nova. The VMs keep running. He then allocates his entire quota of 25 VMs to C22, and C22 creates 25 VMs.
Nova says “What project is this? C22? What is its quota? 25? All good.”
But in reality, B2 has doubled his quota. His subordinates have allocated 50 VMs total. He does this again with project C23, gets up to 75 VMs, and contemplates creating yet another project, C24, just to keep up the pattern. This would allocate more VMs than project A was originally allocated.
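Here is a rough sketch of why the split-quota scheme is game-able. It assumes a toy model where `limits` stands in for what Keystone records and `usage` for what Nova counts, and it takes B2's three children to be C21, C22, and C23. All the names are hypothetical.

```python
# Hypothetical model: limits live in Keystone, usage lives in Nova.
limits = {"A": 50, "B1": 25, "B2": 25}  # quota each project currently holds
usage = {}                               # running VMs per project


def delegate(parent, child, amount):
    """Move quota from parent to child. Running VMs are never consulted."""
    if limits.get(parent, 0) < amount:
        raise ValueError(f"{parent} does not hold {amount} units")
    limits[parent] -= amount
    limits[child] = limits.get(child, 0) + amount


def withdraw(parent, child):
    """Take all quota back from child. The child's VMs keep running."""
    limits[parent] += limits.pop(child)


def create_vm(project):
    """Naive check: only the project's own limit is examined."""
    if usage.get(project, 0) >= limits.get(project, 0):
        raise RuntimeError(f"{project} is over quota")
    usage[project] = usage.get(project, 0) + 1


# B2 lends the same 25 units to each child in turn.
for child in ("C21", "C22", "C23"):
    delegate("B2", child, 25)
    for _ in range(25):
        create_vm(child)
    withdraw("B2", child)

vms_under_b2 = sum(usage.values())
print(vms_under_b2, limits["B2"])  # 75 25
```

Every individual check passed, yet 75 VMs are running under a project that was only ever given 25.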
The admins notice this and get mad, wipe everything out, and start over again. This time they’ve made a change. Whenever they check quota on a project, they will also check quota on the parent project, counting all VMs underneath that parent. Essentially, they will record that a VM created in project C11 also reduces the original quota on B1 and on A. In essence, they record a table. If the user creates a VM, VM1, in project C11, it is recorded and checked against the quota of C11, B1, and A.
When a user then creates a second VM, VM2, in C21, the table is extended with entries for C21, B2, and A.
In addition, when creating VM2, Nova will check quota and see that, after creation:
- C21 now has 1 out of 25 allocated
- B2 now has 1 out of 25 allocated
- A now has 2 out of 100 allocated
(quota is allocated prior to the creation of the resource to prevent a race condition)
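To illustrate the reserve-before-create ordering, here is a minimal sketch using an in-memory counter guarded by a lock. The `QuotaCounter` class and `create_vm` helper are invented for this example; a real service would presumably do the reservation in its database rather than in process memory.

```python
import threading


class QuotaCounter:
    """Hypothetical in-memory quota counter guarded by a lock."""

    def __init__(self, limit):
        self.limit = limit
        self.reserved = 0
        self._lock = threading.Lock()

    def reserve(self):
        """Atomically claim one unit, or return False if none is left."""
        with self._lock:
            if self.reserved >= self.limit:
                return False
            self.reserved += 1
            return True

    def release(self):
        """Give a unit back, for example when the VM failed to boot."""
        with self._lock:
            self.reserved -= 1


def create_vm(counter, boot):
    """Reserve quota first, then create; undo the reservation on failure."""
    if not counter.reserve():
        return False
    try:
        boot()
    except Exception:
        counter.release()
        raise
    return True


counter = QuotaCounter(limit=100)
results = []


def worker():
    for _ in range(20):
        results.append(create_vm(counter, boot=lambda: None))


# 8 workers make 160 requests in total against a quota of 100.
threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sum(results), counter.reserved)  # 100 100
```

Because the check and the increment happen under the same lock, two concurrent requests cannot both claim the last unit.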
Note that the quota is checked against the original amount, and not the amount reduced by sub-allocating the quota. If project C21 allocates 24 more VMs, the quota check will show:
- C21 now has 25 out of 25 allocated
- B2 now has 25 out of 25 allocated
- A now has 26 out of 100 allocated
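The numbers above can be reproduced with a small sketch of the table. It assumes the nine-project tree and the quota split from earlier in the post; `PARENT`, `QUOTA`, `ledger`, and the helper functions are hypothetical names chosen for this example, not an existing API.

```python
# Hypothetical tree and original quota amounts (before sub-allocation).
PARENT = {"A": None, "B1": "A", "B2": "A",
          "C11": "B1", "C12": "B1", "C13": "B1",
          "C21": "B2", "C22": "B2", "C23": "B2"}
QUOTA = {"A": 100, "B1": 25, "B2": 25,
         "C11": 5, "C12": 5, "C13": 5, "C21": 25}

ledger = []  # the table: one (vm_id, project) row per project in the chain


def chain(project):
    """Return the project followed by all of its ancestors."""
    result = []
    while project is not None:
        result.append(project)
        project = PARENT[project]
    return result


def used(project):
    """Count the ledger rows charged to this project."""
    return sum(1 for _, p in ledger if p == project)


def create_vm(vm_id, project):
    """Check every level of the chain, then record a row for each level."""
    for p in chain(project):
        if used(p) >= QUOTA.get(p, 0):
            raise RuntimeError(f"{p} is at quota")
    ledger.extend((vm_id, p) for p in chain(project))


def report():
    return {p: (used(p), QUOTA[p]) for p in ("C21", "B2", "A")}


create_vm("VM1", "C11")
create_vm("VM2", "C21")
first = report()   # {'C21': (1, 25), 'B2': (1, 25), 'A': (2, 100)}

for n in range(3, 27):  # 24 more VMs in C21
    create_vm(f"VM{n}", "C21")
second = report()  # {'C21': (25, 25), 'B2': (25, 25), 'A': (26, 100)}

print(first)
print(second)
```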
If B2 tries to play games, removing the quota from C21 and giving it to C22, project C21 will be over quota, but Nova will have no way to trap this. However, the only people this affects are the other people within projects B2, C21, C22, and C23. If C22 attempts to allocate a virtual machine, the quota check will show that B2 has allocated its full quota and cannot create any more. The quota check will fail.
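As a sketch of that failure, the snippet below starts from the state at the end of the walkthrough (C21 and B2 at 25 of 25, A at 26 of 100) and replays B2's trick. The dictionaries and `check_and_reserve` are hypothetical stand-ins for whatever Keystone and Nova would actually store.

```python
# Hypothetical state at the end of the walkthrough.
PARENT = {"A": None, "B2": "A", "C21": "B2", "C22": "B2"}
quota = {"A": 100, "B2": 25, "C21": 25}
used = {"A": 26, "B2": 25, "C21": 25}


def check_and_reserve(project):
    """Check the project and every ancestor, then charge all of them."""
    chain = []
    p = project
    while p is not None:
        chain.append(p)
        p = PARENT[p]
    for p in chain:
        if used.get(p, 0) >= quota.get(p, 0):
            raise RuntimeError(f"{p} is at quota")
    for p in chain:
        used[p] = used.get(p, 0) + 1


# B2 takes the sub-quota away from C21 and hands it to C22.
quota["C22"] = quota.pop("C21")

blocked_by = None
try:
    check_and_reserve("C22")
except RuntimeError as err:
    blocked_by = str(err)

print(blocked_by)  # B2 is at quota
```

C22's own check passes, since it holds 25 and has used none, but the check on B2 stops the request before anything is charged.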
You might have noticed that the higher-level projects can rob quota from the child projects in this scheme. For example, if project A allocates 74 more VMs now, project B1 and its children will still have allocated quota, but their quota check will fail because A is full. This could be mitigated by having 2 checks for project A: total quota (max 100), and directly allocated quota (max 50).
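A quick sketch of the two-check mitigation, continuing from the point where the tree under A holds 26 VMs and A itself has created none. The constant and function names are made up for the illustration.

```python
TOTAL_QUOTA = 100   # everything in the tree rooted at A
DIRECT_QUOTA = 50   # VMs created directly in A


def can_create_in_a(direct_used, tree_used):
    """Project A must pass both its direct check and its total check."""
    return direct_used < DIRECT_QUOTA and tree_used < TOTAL_QUOTA


direct_used, tree_used = 0, 26  # 1 VM in C11 and 25 in C21 so far
created = 0
for _ in range(74):             # A tries to grab everything that is left
    if not can_create_in_a(direct_used, tree_used):
        break
    direct_used += 1
    tree_used += 1
    created += 1

left_for_children = TOTAL_QUOTA - tree_used
print(created, left_for_children)  # 50 24
```

With the direct limit in place, A stops at 50 VMs and 24 units stay available, which is exactly what B1's subtree has left of its 25.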
This scheme removes the ability to violate quota by gaming the system. I promised to write it up so we could continue to try and poke holes in it.
EDIT: Quota would also have to be allocated per endpoint, or the endpoints will have to communicate with each other to evaluate usage.