Tracking Quota

This OpenStack summit is the third I have attended where we’ve discussed algorithms for recording quota in Keystone without updating it on every resource allocation and free.

We were stumped, again. The process we had planned on using was game-able and thus broken. I was kinda bummed.

Fortunately, I had a long car ride from Vancouver to Seattle and talked it over with Morgan Fainberg.

We also discussed the Pig War. Great piece of history from the region.

By the time we got to the airport the next day, I think we had it solved. Morgan came to the solution first, and I followed, slowly. Here’s what we think will work.

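First, let’s get a three-level-deep project setup to use for our discussion. Project A is the root. B1 and B2 are its children. C11, C12, and C13 sit under B1, and C21, C22, and C23 sit under B2.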

The rule is simple: even if a quota is subdivided, you still need to check the overall quota of the project and all the parent projects.

In the example structure above, let’s assume that project A gets a quota of 100 units of some resource: VMs, GB memory, network ports, hot-dogs, very small rocks, whatever. I’ll use VMs in this example.

There are a couple of ways this could be further managed. The simplest is that any resource allocated anywhere in this tree is counted against this quota. There are 9 total projects in the tree. If each allocates 11 VMs, there will be 99 created and counted against the quota. The next VM created uses up the quota. The request after that will fail due to lack of available quota.
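Here is a rough sketch of that simplest scheme, assuming one shared counter for the whole tree. The project names follow the tree in this post; `POOL_LIMIT`, `allocate`, and the in-memory counter are made up for illustration and are not anything Nova or Keystone actually exposes.

```python
# Hypothetical sketch: one pooled quota for the whole project tree.
PROJECTS = ["A", "B1", "B2", "C11", "C12", "C13", "C21", "C22", "C23"]
POOL_LIMIT = 100  # quota granted to project A

used = 0  # VMs counted against the pool, regardless of owning project


def allocate(project: str) -> bool:
    """Reserve one VM for `project`; return False if the pool is exhausted."""
    global used
    if project not in PROJECTS:
        raise ValueError(f"unknown project {project}")
    if used >= POOL_LIMIT:
        return False
    used += 1
    return True


# Each of the 9 projects allocates 11 VMs: 99 in total.
results = [allocate(p) for p in PROJECTS for _ in range(11)]
vm_100 = allocate("C11")  # uses up the last unit of quota
vm_101 = allocate("C11")  # fails for lack of available quota
```

Nothing here stops a single project from calling `allocate` 100 times, which is exactly the problem described next.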

Let’s say, however, that the users in project C23 are greedy, and allocate all 100 VMs. The people in C11 are filled with righteous indignation. They need VMs too.

The admins wipe everything out and we start all over. They set up a system to fragment the quota by allowing a project to split its quota assignment up and allocate some of it to subordinate projects.

Project A says “I’m going to keep 50 VMs for myself, and allocate 25 each to B1 and B2.”

Project B1 says “I am going to keep 10 for me and allocate 5 each to C11, C12, and C13.” And the B1 tree is happy.

B2 is a manipulative schemer and decides to play around. B2 allocates his entire quota of 25 to C21. C21 creates 25 VMs.

B2 now withdraws his quota from C21. There is no communication with Nova. The VMs keep running. He then allocates his entire quota of 25 VMs to C22, and C22 creates 25 VMs.

Nova says “What project is this? C22? What is its quota? 25? All good.”

But in reality, B2 has doubled his quota. His subordinates have allocated 50 VMs total. He does this again with project C23, gets up to 75 VMs, and contemplates creating yet another project C24 just to keep up the pattern. This would allocate more VMs than project A was originally allocated.
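The trick B2 plays can be reproduced in a few lines. This is a sketch under assumptions: the `quota` and `running` dictionaries and the `naive_create` function are hypothetical stand-ins for a check that only looks at the project's own current quota, and the third child of B2 is written as C23.

```python
# Hypothetical sketch of the broken scheme: only the project's own
# sub-allocated quota is checked, and running VMs are never revisited.
quota = {"C21": 0, "C22": 0, "C23": 0}    # B2's children; B2 itself holds 25
running = {"C21": 0, "C22": 0, "C23": 0}  # VMs currently running per project


def naive_create(project: str) -> bool:
    """Create a VM if the project itself is under its current quota."""
    if running[project] >= quota[project]:
        return False
    running[project] += 1
    return True


for child in ("C21", "C22", "C23"):
    quota[child] = 25           # B2 hands its whole quota to one child
    for _ in range(25):
        assert naive_create(child)
    quota[child] = 0            # B2 withdraws it; the VMs keep running

total_under_b2 = sum(running.values())  # 75 VMs against a quota of 25
```

Every individual check passed, yet the subtree under B2 ends up holding three times what B2 was given.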

The admins notice this and get mad, wipe everything out, and start over again. This time they’ve made a change. Whenever they check quota on a project, they also go and check quota on the parent project, counting all VMs underneath that parent. Essentially, they record that a VM created in project C11 also reduces the original quota on B1 and on A. In essence, they record a table. If the user creates a VM in project C11, the following will be recorded and checked against quota.

VM	Project
VM1	A
VM1	B1
VM1	C11

When a user then creates a VM in C21, the table will extend to this:

VM	Project
VM1	A
VM1	B1
VM1	C11
VM2	A
VM2	B2
VM2	C21

In addition, when creating VM2, Nova will check quota and see that, after creation:

C21 now has 1 out of 25 allocated
B2 now has 1 out of 25 allocated
A now has 2 out of 100 allocated

(Quota is allocated prior to the creation of the resource to prevent a race condition.)
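The check described above can be sketched as a walk up the parent chain. This is a minimal in-memory illustration, not Nova's implementation: `PARENT`, `QUOTA`, `usage`, `lineage`, and `reserve` are hypothetical names, the quota numbers are the ones from this example (with C22 and C23 at 0 because B2 gave everything to C21), and a real service would need to do the check and the update inside one transaction.

```python
# Hypothetical sketch: check and reserve quota at every level of the lineage.
PARENT = {"A": None, "B1": "A", "B2": "A",
          "C11": "B1", "C12": "B1", "C13": "B1",
          "C21": "B2", "C22": "B2", "C23": "B2"}
QUOTA = {"A": 100, "B1": 25, "B2": 25,
         "C11": 5, "C12": 5, "C13": 5,
         "C21": 25, "C22": 0, "C23": 0}
usage = {p: 0 for p in PARENT}  # VMs counted at or below each project


def lineage(project):
    """Yield the project and then each ancestor up to the root."""
    while project is not None:
        yield project
        project = PARENT[project]


def reserve(project):
    """Reserve quota for one VM, or return False if any level is full."""
    chain = list(lineage(project))
    if any(usage[p] >= QUOTA[p] for p in chain):
        return False
    for p in chain:
        usage[p] += 1  # reserved before the VM itself is created
    return True


reserve("C11")  # VM1: counted against C11, B1, and A
reserve("C21")  # VM2: counted against C21, B2, and A
snapshot = {p: usage[p] for p in ("C21", "B2", "A")}
```

After those two calls, `snapshot` holds the same three counts as the list above: one VM each for C21 and B2, and two for A.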

Note that the quota is checked against the original amount, and not the amount reduced by sub-allocating the quota. If project C21 allocates 24 more VMs, the quota check will show:

C21 now has 25 out of 25 allocated
B2 now has 25 out of 25 allocated
A now has 26 out of 100 allocated

If B2 tries to play games, and removes the quota from C21 and gives it to C22, project C21 will be over quota, but Nova will have no way to trap this. However, the only people this affects are the other people within projects B2, C21, C22, and C23. If C22 attempts to allocate a virtual machine, the quota check will show that B2 has allocated its full quota and cannot create any more. The quota check will fail.

You might have noticed that the higher level projects can rob quota from the child projects in this scheme. For example, if project A allocates 74 more VMs now, project B1 and its children will still have allocated quota, but their quota check will fail because A is full. This could be mitigated by having 2 checks for project A: total quota (max 100), and directly allocated quota (max 50).
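A sketch of that two-check mitigation, assuming the state from the example (26 VMs already counted against A, none of them created directly in A). `TOTAL_LIMIT`, `DIRECT_LIMIT`, and `create_in_a` are hypothetical names chosen for illustration.

```python
# Hypothetical sketch: project A carries two limits instead of one.
TOTAL_LIMIT = 100   # everything in the tree, A included
DIRECT_LIMIT = 50   # VMs created directly in project A

tree_usage = 26     # 1 VM in C11 plus 25 in C21, as in the example
direct_usage = 0    # VMs owned by A itself


def create_in_a() -> bool:
    """Create one VM directly in A if both limits allow it."""
    global tree_usage, direct_usage
    if direct_usage >= DIRECT_LIMIT or tree_usage >= TOTAL_LIMIT:
        return False
    direct_usage += 1
    tree_usage += 1
    return True


created = sum(create_in_a() for _ in range(74))  # A tries for 74 more VMs
headroom = TOTAL_LIMIT - tree_usage              # left over for the subtrees
```

With the direct limit in place, A stops at 50 of its 74 attempts, and the remaining headroom matches what B1's subtree has not yet used (24 of its 25).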

This scheme prevents the quota violations that came from gaming the system. I promised to write it up so we could continue to try and poke holes in it.

EDIT: Quota would also have to be allocated per endpoint, or the endpoints will have to communicate with each other to evaluate usage.

Edit: And many years later I find myself returning to this article. The thing that struck me was that freeing quota can only be done after the resource itself is destroyed. This seems obvious, but it does show the complexity of moving quota from one location to another. Withdrawal of quota can be done prior to resource destruction, which will only result in a report that a subordinate is over quota, but that used-up quota cannot be reallocated until it is freed.

The act of withdrawing quota has to have the effect of preventing any party from allocating a new virtual machine that would use that quota. Say B2 is operating at full quota and that quota is 1 VM allocated to C22. If B2 is moving the quota from C22 to C23, you could record the transaction as pending until C22 destroys its VM. Until then, C22 is shown as over quota, and B2 is at full. Once the VM in C22 is destroyed, the transaction can move forward. C23 has an available quota for the VM it wishes to create.

Put another way, once B2 removes the quota from C22, nothing effectively happens. A report would show C22 over quota, but B2 is fine. As is A. C23 would have the quota assigned. Prior to creating a virtual machine, a quota check would start with C23 and say “quota available” and allow the creation to proceed. Next the check would look at the parent of C23, which is B2. At this point, the quota check would return “insufficient quota” and the check would fail, stopping the new machine allocation.
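That sequence can be walked through in code. This is a sketch under assumptions: only the A, B2, C22, C23 slice of the tree is modeled, B2's quota is 1 as in the scenario, and `first_full_level`, `create`, and `destroy` are hypothetical helpers rather than real Nova or Keystone calls.

```python
# Hypothetical sketch: withdrawing quota does nothing until the VM is destroyed.
PARENT = {"A": None, "B2": "A", "C22": "B2", "C23": "B2"}
quota = {"A": 100, "B2": 1, "C22": 1, "C23": 0}
usage = {"A": 1, "B2": 1, "C22": 1, "C23": 0}  # one VM running in C22


def first_full_level(project):
    """Return the first project in the lineage with no quota left, or None."""
    while project is not None:
        if usage[project] >= quota[project]:
            return project
        project = PARENT[project]
    return None


def create(project):
    """Count one new VM at every level, unless some level is full."""
    if first_full_level(project) is not None:
        return False
    while project is not None:
        usage[project] += 1
        project = PARENT[project]
    return True


def destroy(project):
    """Free one VM's worth of quota at every level of the lineage."""
    while project is not None:
        usage[project] -= 1
        project = PARENT[project]


quota["C22"], quota["C23"] = 0, 1          # B2 moves its quota to C23
over_quota = usage["C22"] > quota["C22"]   # C22 is reported as over quota
blocker = first_full_level("C23")          # C23 passes; its parent B2 does not
before = create("C23")                     # refused while C22's VM still runs
destroy("C22")                             # the VM goes away, quota is freed
after = create("C23")                      # now the creation can proceed
```

The move of quota never has to be tracked as a separate pending state here; the parent-level check enforces it on its own.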

Adam Young's Web Log
