This project was built for the Home Depot Reinforcement Learning Hackathon. The goal of the hackathon was to optimize deliveries on a grid containing store locations and a central location. We had a set number of trucks, each with its own capacity, and needed to use them to maximize our earnings.
To build our model, we used Gym to set up an environment for the agent to learn in, then defined the boundaries we wanted. We then trained the model and built a front-end view to show the movements of the trucks over time. You can view the code for the project here. We ended up winning the award for best UI at the competition and learned a lot about reinforcement learning in practice.
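To give a feel for what a delivery environment of this kind can look like, here is a minimal sketch. It is not the hackathon code: the class name `DeliveryGridEnv`, the store layout, the reward values, and the single-truck simplification are all assumptions made for illustration. It also treats "boundaries" as the edges of the grid, which is only one possible reading. To stay dependency-free it is a plain Python class that mirrors the classic Gym `reset`/`step` interface rather than subclassing `gym.Env`.

```python
import random


class DeliveryGridEnv:
    """Toy single-truck delivery grid (illustrative assumption, not the project code)."""

    # Action id -> (dx, dy): up, down, left, right
    MOVES = {0: (0, -1), 1: (0, 1), 2: (-1, 0), 3: (1, 0)}

    def __init__(self, size=5, stores=None, capacity=3, max_steps=50):
        self.size = size
        self.depot = (size // 2, size // 2)  # central location
        # Hypothetical store layout: grid position -> units demanded
        self.initial_demand = dict(stores or {(0, 0): 2, (4, 4): 3})
        self.capacity = capacity
        self.max_steps = max_steps
        self.reset()

    def reset(self):
        """Put the truck back at the depot, fully loaded, and restore demand."""
        self.pos = self.depot
        self.load = self.capacity
        self.demand = dict(self.initial_demand)
        self.steps = 0
        return self._obs()

    def _obs(self):
        return (self.pos, self.load, tuple(sorted(self.demand.items())))

    def step(self, action):
        """Apply one move and return (observation, reward, done, info)."""
        dx, dy = self.MOVES[action]
        # Clamp to the grid so the truck cannot leave the map
        x = min(max(self.pos[0] + dx, 0), self.size - 1)
        y = min(max(self.pos[1] + dy, 0), self.size - 1)
        self.pos = (x, y)
        self.steps += 1

        reward = -0.1  # assumed small cost per move
        if self.pos == self.depot:
            self.load = self.capacity  # reload at the central location
        elif self.demand.get(self.pos, 0) > 0 and self.load > 0:
            delivered = min(self.load, self.demand[self.pos])
            self.demand[self.pos] -= delivered
            self.load -= delivered
            reward += delivered  # assumed earnings: one unit of reward per unit delivered

        all_served = all(d == 0 for d in self.demand.values())
        done = all_served or self.steps >= self.max_steps
        return self._obs(), reward, done, {}


def random_rollout(env, seed=0):
    """Run one episode with a random policy and return the total reward."""
    rng = random.Random(seed)
    env.reset()
    total, done = 0.0, False
    while not done:
        _, reward, done, _ = env.step(rng.randrange(4))
        total += reward
    return total
```

A random policy such as `random_rollout(DeliveryGridEnv())` gives a baseline to compare a trained agent against. A real setup would extend this to several trucks and would typically declare Gym observation and action spaces so that standard training libraries can consume it.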