At work we have an ASP-powered business management system that includes, among other things, a cleaning rota. Because on a given day a staff member may be absent (client meetings, illness, holiday) it gained the ability to skip a user for a day.

This system has some implementation issues: the circular list has an edge condition when skipping the most recently hired staff member and the code to email people when the rota has been manually moved on … doesn’t. These are besides the point though, because the algorithm is biased towards people who are out of the office frequently.

The (Bad) Science Bit

Times have changed and more of our staff work from home, often on set days of the week. If you work from home on a Wednesday and a Friday then at best you have 3/5 odds of actually doing the clean up when you’re in. Worse than this – if it lands on you on a day that you’re out of the office, it may take 20 working days to cycle back around to you, and then it’s still 3/5 odds.

>>> (1.0 / 21) * (3.0 / 5) * (365 - (52 * 2))
>>> (1.0 / 21) * (365 - (52 * 2))

There’s 4.9 days difference between the most fair (you get picked one in 21 times) and the least fair (you get picked one in 35 times). I have not accounted for bank holidays or annual leave in the above, but the number of days doesn’t have much impact on fairness.

A New Hope

I propose the below:

import random

staff = {'User %d' % k :  for k in range(21)}

def pick():
    most_cleanups = max(staff.values()) > min(staff.values()) and \
        max(staff.values()) or 1 + max(staff.values())
    pool = []
    for k, v in staff.items():
        shares = most_cleanups - v
        pool.extend([k] * shares)
    choice = random.choice(pool)
    if random.random() > 0.75:
        print 'Skipped', choice
        print 'Picked', choice
        staff[choice] += 1

for i in range(80000):

results = [(v,k) for k, v in staff.items()]
print 'Total "unfairness" of the system:', abs(results[][] - results[-1][])

To create a simulation, I set up 21 fake staff members, and initialise them has having never done the cleaning. The process of picking clean-up on a given day is randomised, but everyone who is tied as having done clean-up the most is excluded from being picked.

Everyone else is added to a list with a weighting based on how many days behind “the lead” they are. If you’ve been skipped twice, it’s likely you are 2 or 3 days behind your co-workers. You get one share for each day.

A share is picked at random.

For the purposes of the simulation, I’m using a random threshold of 0.75 to decide if someone should be skipped. Obviously in reality that would be based on if they are in the office on a given day.

The system is fair in both the short and long term. I’ve run it through 80,000 cycles, and (spoiler alert) it’s still fair:

Picked User 19
Skipped User 1
Picked User 15
Picked User 7
Skipped User 9
Picked User 8
Skipped User 20
Picked User 20
Total "unfairness" of the system: 1

This is measuring fairness as the difference between the staff member who cleans up the least and the one who cleans up the most.

Is it really fair?

Well: no. There’s a pretty good argument to say if you’re only in the office 3 days a week that you should only have 3/5 of the chance of being picked as a full-time staff member.

The above is only one possible algorithm, fairer than what we have now: hopefully some discussion will shake out an even better one.