Research Project - Stephen Robertson - 2005

Using Hierarchical Reinforcement Learning to Solve a Complex Problem Consisting of Multiple Conflicting Sub-problems

View my project proposal

View my literature review

View my short paper

View my poster

View my first presentation

View my second presentation

View my final presentation

View my final thesis

Reinforcement learning is a method whereby a system learns to perform the most advantageous action in a given situation by maximizing a numerical reward. At any point in the problem, the system is not told what to do; rather, it is allowed to choose an action on its own and is then given a measure of how good that action was.
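
As a rough illustration of this learning-from-reward idea, the sketch below shows a standard tabular Q-learning update in Java: the system keeps a table of value estimates, picks actions mostly greedily, and nudges each estimate toward the observed reward plus the discounted value of the best next action. The class name and the learning parameters (alpha, gamma, epsilon) are placeholder choices for the example, not values taken from this project.

```java
// Minimal sketch of tabular Q-learning: the agent is never told the correct
// action, it only observes a numerical reward and nudges its value
// estimates toward that reward.
import java.util.Random;

public class QLearner {
    private final double[][] q;          // q[state][action] value estimates
    private final double alpha = 0.1;    // learning rate (placeholder)
    private final double gamma = 0.9;    // discount factor (placeholder)
    private final double epsilon = 0.1;  // exploration rate (placeholder)
    private final Random rng = new Random();

    public QLearner(int numStates, int numActions) {
        q = new double[numStates][numActions];
    }

    // Choose an action: mostly greedy, sometimes random (epsilon-greedy).
    public int chooseAction(int state) {
        if (rng.nextDouble() < epsilon) {
            return rng.nextInt(q[state].length);
        }
        return argMax(q[state]);
    }

    // Update the value of the action just taken using the observed reward
    // and the estimated value of the best action in the next state.
    public void update(int state, int action, double reward, int nextState) {
        double best = q[nextState][argMax(q[nextState])];
        q[state][action] += alpha * (reward + gamma * best - q[state][action]);
    }

    private int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) best = i;
        }
        return best;
    }
}
```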

For relatively small problems, reinforcement learning works well, but as the complexity grows, this approach becomes increasingly inefficient. Therefore, for more complex problems, a different approach needs to be adopted. One alternative is hierarchical reinforcement learning (of which feudal reinforcement learning is one example).

Hierarchical reinforcement learning breaks a complex problem up into smaller sub-problems, and then relies on a scheduler, which uses reinforcement learning, to pick which sub-problem to attempt next. Each sub-problem in turn is also attempted using reinforcement learning, and hence the hierarchy is created.
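
One way to picture this structure, continuing the Q-learning sketch above, is a top-level scheduler that treats each sub-problem as a single "action" and learns which one to hand control to next. The sketch below is only an illustration under my own assumptions: the SubProblem interface, and the idea that a sub-problem runs to completion and reports a single reward back up, are simplifications rather than part of any specific published algorithm.

```java
// Sketch of a two-level hierarchy: the scheduler is itself a reinforcement
// learner whose "actions" are the sub-problems. Each sub-problem runs until
// it reaches a local goal, then reports a reward back up the hierarchy.
public class HierarchicalScheduler {

    // Hypothetical interface: each sub-problem knows how to run itself from
    // the current world state and report how well it did.
    public interface SubProblem {
        double runToCompletion(int worldState);   // reward earned
        int resultingState();                     // world state afterwards
    }

    private final QLearner scheduler;   // learns which sub-problem to pick
    private final SubProblem[] subProblems;

    public HierarchicalScheduler(int numWorldStates, SubProblem[] subProblems) {
        this.subProblems = subProblems;
        this.scheduler = new QLearner(numWorldStates, subProblems.length);
    }

    // One top-level step: pick a sub-problem, let it run, then update the
    // scheduler's estimate of how good that choice was in this state.
    public int step(int worldState) {
        int choice = scheduler.chooseAction(worldState);
        double reward = subProblems[choice].runToCompletion(worldState);
        int nextState = subProblems[choice].resultingState();
        scheduler.update(worldState, choice, reward, nextState);
        return nextState;
    }
}
```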

The aim of this project is to apply the concept of hierarchical reinforcement learning to a complex problem for which flat reinforcement learning would prove inadequate because of its large scale. The problem will consist of an agent interacting with a given environment. The agent will have various needs, such as food, comfort and health, which will have to be balanced in order for it to remain ‘happy’. The hierarchical scheduler will be in charge of balancing these needs, while the agent itself concentrates on gaining a reward for one specific task at a time.
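
To make the balancing of needs concrete, the fragment below sketches one possible way the agent's needs and ‘happiness’ could be represented: each need decays over time, and the happiness signal is taken as the level of the most neglected need, so that no single need can be ignored. The decay rates and the choice of the minimum are illustrative assumptions, not design decisions from this project.

```java
// Hypothetical sketch of the agent's needs. Each need decays over time;
// 'happiness' is taken here as the level of the worst-off need, pushing the
// scheduler to keep all needs balanced rather than maximizing only one.
public class Needs {
    public enum Need { FOOD, COMFORT, HEALTH }

    private final double[] levels = {1.0, 1.0, 1.0};      // 1.0 = fully satisfied
    private final double[] decay  = {0.01, 0.005, 0.002}; // illustrative rates

    // Called once per time step: every need slowly drains.
    public void tick() {
        for (int i = 0; i < levels.length; i++) {
            levels[i] = Math.max(0.0, levels[i] - decay[i]);
        }
    }

    // Satisfying a need (e.g. eating food found in the world) tops it up.
    public void satisfy(Need need, double amount) {
        levels[need.ordinal()] = Math.min(1.0, levels[need.ordinal()] + amount);
    }

    // One possible happiness signal: the level of the most neglected need.
    public double happiness() {
        double worst = 1.0;
        for (double level : levels) {
            worst = Math.min(worst, level);
        }
        return worst;
    }
}
```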

Division of time
2 weeks: As a starting point, I intend to gain a solid understanding of all fundamental aspects of reinforcement learning. I think a good understanding of reinforcement learning is necessary before attempting to fully understand hierarchical reinforcement learning. After that I will do some research into hierarchical reinforcement learning itself.
2 weeks: It will be necessary to build a simple gridworld in which the agent can exist (see the sketch after this list). I think Java would be a good language choice, as it handles simple graphics well in the form of applets, and it provides all the necessary abstract data types. Rather than attempting to build an extremely complex world containing all the proposed aspects at once, it will be better to keep the world simple at first; once it functions sufficiently for the simplified case, it can be extended further.
4 weeks: Once the gridworld is functioning, hierarchical reinforcement learning will need to be implemented. This will consist first of determining a reward function for each of the separate needs, and then of implementing the scheduler.
4 weeks: This time will be allocated to writing up the data and results, and to the thesis itself.
2 weeks: According to the literature, there are various approaches to hierarchical reinforcement learning, including different methods of identifying sub-problems and different ways of implementing the scheduler. Time permitting, I would like to implement some of these approaches and see how they fare on my system.
2 weeks: The last few weeks will be used for optimization and testing.
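
For the gridworld mentioned in the second item above, the sketch below shows one minimal form such a world could take in Java: a small grid, four movement actions, and a single food cell that yields a reward, with the agent's position encoded as a state index suitable for the Q-learning table. The grid size and the single food item are placeholder choices for illustration only.

```java
// Minimal gridworld sketch: the agent occupies a cell in a small grid and
// can move in four directions; stepping onto the food cell yields a reward.
public class GridWorld {
    private final int width = 5, height = 5;   // placeholder size
    private int agentX = 0, agentY = 0;
    private final int foodX = 4, foodY = 4;    // single food item for illustration

    // Actions: 0 = up, 1 = down, 2 = left, 3 = right.
    public double step(int action) {
        switch (action) {
            case 0: agentY = Math.max(0, agentY - 1); break;
            case 1: agentY = Math.min(height - 1, agentY + 1); break;
            case 2: agentX = Math.max(0, agentX - 1); break;
            case 3: agentX = Math.min(width - 1, agentX + 1); break;
        }
        // Reward only when the agent reaches the food cell.
        return (agentX == foodX && agentY == foodY) ? 1.0 : 0.0;
    }

    // Encode the agent's position as a single state index for a Q-table.
    public int state() {
        return agentY * width + agentX;
    }
}
```

A flat learner could then be trained by repeatedly calling step() with actions chosen by the QLearner sketched earlier, which would give a simple baseline before the hierarchy is added.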