Recognising a reinforcement learning situation

In reinforcement learning, we develop algorithms that associate actions with situations, with the goal of maximizing a reward signal. This reward signal, which is invariably numerical, serves as feedback: it determines whether our future actions, and their associations with situations, should be adjusted.

This feedback mechanism creates a closed-loop system, a concept commonly explored in engineering fields such as control systems. It is important to highlight that the feedback does not give explicit information about which actions should be adopted or how precisely they should be tuned. Rather, the learning entity must explore and discover which actions work best in a given situation. Finally, when we say “work best”, we may need to consider not only the immediate reward but also rewards further into the future.

Consequently, the most distinctive characteristics of reinforcement learning are:

  • It takes the form of a closed-loop interaction.
  • It takes place without explicit instruction as to the best action.
  • It involves actions that can have consequences extending far into the future beyond the current moment of the agent’s experience.
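To make the third characteristic concrete, here is a minimal sketch of how rewards that arrive later can be weighed against an immediate reward. It assumes a discount factor, called gamma here, which this article does not introduce; the function name and the two reward sequences are hypothetical and chosen only for illustration.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum a sequence of rewards, weighting each by gamma raised to its delay.

    gamma is an assumed discount factor between 0 and 1; it is not part of
    the article and is used here only to illustrate delayed consequences.
    """
    total = 0.0
    # Work backwards so each earlier step adds its reward to the
    # discounted value of everything that follows it.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Two hypothetical reward sequences starting from the same situation.
greedy = [1.0, 0.0, 0.0, 0.0]   # small reward now, nothing later
patient = [0.0, 0.0, 0.0, 5.0]  # nothing now, larger reward later

print(discounted_return(greedy))
print(discounted_return(patient))
```

Under these assumed numbers, the patient sequence scores higher even though its first three rewards are zero, which is why an agent that looks only at the immediate reward can choose poorly.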

Essential components required to formalize a reinforcement learning challenge

While there are much more formal, mathematical approaches to specifying the reinforcement learning challenge, we can, at this stage, identify the distinct elements of the problem for the purposes of abstraction and structuring.

Firstly, we have the entity that is doing the learning. We commonly refer to this as the agent. This agent interacts with its environment. To do this, just as in our example of the novice cyclist, the agent needs to be able to sense its environment and characterize it in terms of state. It must also be able to perform actions in order to interact with this environment. Finally, the agent needs an objective, which we will term a goal, and which it achieves through interaction with its environment. Reinforcement learning, therefore, refers to any method which performs well in solving these sorts of problems.
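The elements described above (agent, environment, state, action and goal) can be sketched as a small closed-loop program. This is an illustration under stated assumptions, not a method prescribed by this article: the corridor environment, the class and function names, and the parameters epsilon, alpha and gamma are all hypothetical, and the learning rule is a simple tabular Q-learning update chosen here for brevity.

```python
import random

ACTIONS = ("left", "right")


class Corridor:
    """Hypothetical environment: cells 0..4, with the goal in the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        """Apply an action and return (next_state, reward, done)."""
        move = 1 if action == "right" else -1
        # Walls at both ends keep the agent inside the corridor.
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0  # numerical reward, only at the goal
        return self.state, reward, done


class Agent:
    """Learns a value for each (state, action) pair by trial and error."""

    def __init__(self, epsilon=0.2, alpha=0.5, gamma=0.9, seed=0):
        self.q = {}  # (state, action) -> estimated value
        self.epsilon, self.alpha, self.gamma = epsilon, alpha, gamma
        self.rng = random.Random(seed)

    def value(self, state, action):
        return self.q.get((state, action), 0.0)

    def best_action(self, state):
        best = max(self.value(state, a) for a in ACTIONS)
        # Break ties at random so the untrained agent does not get stuck.
        return self.rng.choice([a for a in ACTIONS if self.value(state, a) == best])

    def act(self, state):
        # Explore occasionally; otherwise exploit what has been learned.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return self.best_action(state)

    def learn(self, state, action, reward, next_state, done):
        # The feedback is only a number; it never names the correct action.
        future = 0.0 if done else max(self.value(next_state, a) for a in ACTIONS)
        target = reward + self.gamma * future
        old = self.value(state, action)
        self.q[(state, action)] = old + self.alpha * (target - old)


def train(episodes=500, max_steps=100):
    env, agent = Corridor(), Agent()
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            action = agent.act(state)                      # agent acts
            next_state, reward, done = env.step(action)    # environment responds
            agent.learn(state, action, reward, next_state, done)  # feedback closes the loop
            state = next_state
            if done:
                break
    return agent


agent = train()
print([agent.best_action(s) for s in range(4)])
```

In this sketch the goal is encoded entirely through the reward: the agent is never told to move right, yet after training its preferred action in each non-goal cell is expected to be "right", because that is what the reward signal favours over time.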

References:

Sutton, R. and Barto, A. (2018) Reinforcement Learning: An Introduction, 2nd ed., Cambridge, MA: MIT Press.

© Dublin City University
This article is from the free online course Reinforcement Learning.