How to do Timetabling Optimization with Python?
Diego Akel, Tassia Forasteiro supervised by PhD Renata Onety
Summary
We tried to explain in this article how to optimize a school timetabling with Genetic Algorithm in Python. This article is focused on using teacher satisfaction as objective function to Genetic Algorithm selection.
Introduction
A recurring problem in any institution that needs to allocate people and resources to a limited set of schedules is how to make this combination as optimized as possible.
That’s not so easy, given that there are several different requirements and limitations, arising from business rules (opening hours, physical limits of spaces etc) and from restrictions on people and resources.
This is why this problem is combinatoric in nature.It becomes extremely complex, being it is not feasible to solve them manually and reach an optimal solution.
We showed here a mathematical model and solution to this problem,called Timetabling.
Problem Statement
Every year, college coordinators and professors are faced with a common problem: class timetables need to be redefined.
Restrictions for that:
- Availability of timetables
- Number of subjects
- Availability of teachers
- Availability of classrooms.
For this distribution we also need to take into account the performance of students and teachers, and many schools fail to see the importance of this in the allocation of timetables — Suarez Castrillón, 2013
What was our solution?
Answer — Design a mathematical solution to create a timetable taking into account each teacher’s satisfaction. For this we used a mono-objective genetic algorithm to choose the best timetable for the school, maximizing teacher’s satisfaction in each generation.
Objective Function — Defining Teacher’s Satisfaction
Focusing on teacher satisfaction, we start from the rule that we will have an input from each teacher, giving a grade from 1 to 5 (with 1 = horrible and 5 = great) for each of the available times.
Thus, after input from a teacher for 3 times, we would have a matrix M of the following format:
And after allocating a teacher to their schedules, we will have an allocation matrix T with binary values (1 = teacher allocated for that time, 0 = teacher not allocated), in the following format:
And multiplying those two matrices, we have:
And with those examples we would end up with the following F matrix that represents the satisfaction of this teacher:
And then, to score the full timetable, with all classes and all teachers, we add the satisfaction values of all teachers with the following equation:
The logic behind Genetic Algorithm
The following picture show us the phasis to build an Genetic Algorithm:
Initialization
An initial random generation of schedules is created. This generation, or initial population is composed of several timetable recommendations for possible periods, which meet all
strong restrictions, such as: teachers cannot be in two classes at the same time and all
disciplines of the period need to be taught.
Fitness Assessment
The population in question is analyzed by the Sat satisfaction criterion, through the objective function.
Each teacher’s satisfaction matrix is multiplied by their matrix of allocated times, and then all values in the matrix are added together, resulting in a single satisfaction value for the teacher.
To define when the optimization cycle can be ended, returning the final values, four stop parameters were defined:
- If the semester remains identical, without any different subject, for 5 generations, the cycle is ended;
- If the best semester has no improvement for 30 generations, the cycle is
finished;
- If all semesters in the population are identical, the cycle is ended.
- If any semester reaches the maximum value of relative satisfaction, the cycle is ended.
Selection
The semesters are then sorted in descending order based on the satisfaction value. All the semesters which are beyond a given position are eliminated from the generation (For example, for 100 semesters created and the generation limit equals to 90. The last 10 in the rank are eliminated from the next step).
Crossover
After ordering the semesters, the first and best semester in the population is combined with the last and worst semester. The combination consists of randomly choosing some subjects from one semester (Father 1) and the rest of subjects from another semester (Father 2). A new semester recommendation will be presented as a result of the combination of these two parents.
Mutation
For each semester selected from the previous stages, the subjects from them are swapped to generate more variability and increase the chance of getting more different semesters, bringing the space of solutions from “Local Optima” to “Global Optima” (Intrinsic Concept of Optimization).
Combining all semesters together
To solve this problem we did everything in Python and used mainly the libraries Pandas, Numpy and Seaborn for visualization. We created two classes, one for scholar Semesters and one for Subjects.
Since many semesters share the same teachers, it is mandatory to optimize them all together so no teacher is set to give two classes at the same time. To solve this problem, we firstly optimize each semester separately, by creating random allocations for each teacher in this semester and then combining multiple semesters to achieve a higher satisfaction rank. This individual optimization for each semester is done many times, and thus we are left with many different options to chose from.
After this optimization is done with all semesters we do the combination between the semesters themselves. For this we start out with the best option of one (randomly chosen) semester and try to combine it with the best option of other (randomly chosen) semester, and if it is not possible because some restriction are broken, we try the second best option of this semester, and so on. This is done for all semesters until we have the complete timetable.
Like the following pictures:
Results
After doing the optimization process for all of the semesters, we also created a random timetable for all of the semesters as a benchmark, without the optimization, only ensuring that the restrictions were met.
You can access the code here.