Reinforcement learning (RL) is a type of machine learning in which an agent (AI) interacts with an environment and learns from the feedback it receives for the actions it performs. In this kind of learning, there is no supervisor to guide or train the agent. The only thing it has to learn from is a reward, expressed as a real number.
- Environment:- The situation that the agent has to face.
- Agent:- An AI that performs actions in an environment to earn rewards.
- State:- The current situation in which the agent finds itself.
- Reward:- Feedback (in the form of a real number) given to the agent that indicates whether or not it has taken the right action.
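The four terms above fit together in a simple interaction loop. The following is only a minimal sketch: the `Corridor` environment, its reward values (10 for reaching the goal, -1 per other step), and the two example agents are all hypothetical and chosen purely for illustration.

```python
import random


class Corridor:
    """Hypothetical environment: 5 cells in a row, goal at the right end."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0  # the state is the agent's current cell

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the walls clamp the position
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 10.0 if done else -1.0  # assumed reward values
        return self.state, reward, done


def run_episode(env, choose_action, max_steps=100):
    """Let an agent act until the goal is reached or max_steps is used up."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = choose_action(state)           # the agent picks an action
        state, reward, done = env.step(action)  # the environment responds
        total_reward += reward
        if done:
            break
    return total_reward


def always_right(state):
    return 1


def random_agent(state):
    return random.choice([-1, 1])


print(run_episode(Corridor(), always_right))
print(run_episode(Corridor(), random_agent))
```

Neither agent learns anything here. The sketch only shows how state, action, and reward pass between the agent and the environment on every step.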
Types of reinforcement learning
Whenever the agent takes an action, the state changes and the agent receives a reward based on that action.
There are two types of reinforcement learning: positive and negative.
Positive reinforcement occurs when an event that follows a particular behavior increases the strength and frequency of that behavior. Its advantages are that it maximizes performance and sustains change for a longer period of time.
e.g.:- A student gets a reward for achieving the top position. Here the reward he receives is positive reinforcement.
Negative reinforcement is defined as the strengthening of a behavior because a negative condition is stopped or avoided.
e.g.:- A child's parents keep disturbing him by repeatedly asking him to clean his room (a negative condition). Once he cleans the room, the disturbance stops. Here removing the disturbance reinforces the behavior of cleaning, because the child wants to avoid it.
Learning models of reinforcement learning
There are two models in reinforcement learning:-
- MDP (Markov decision process)
- Q learning
Markov decision process:-
To solve a reinforcement learning problem, we first have to formulate it mathematically. This is where the MDP comes in.
(See the figure above.) The following parameters are used to get a solution:
- set of actions – A
- set of states – S
- Reward – R
- Value – V
- Policy – π
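To make these parameters concrete, here is a minimal sketch of a tiny MDP written out in Python. Everything specific in it is an assumption made for illustration: the four-cell layout, the reward of 1.0 for entering the goal, the discount factor `GAMMA`, and the use of value iteration (a standard way to compute V when the MDP is fully known, which the article itself does not cover).

```python
# A tiny hypothetical MDP: four cells in a row, cell 3 is the terminal goal.
STATES = [0, 1, 2, 3]   # S
ACTIONS = [-1, 1]       # A: move left or move right
GAMMA = 0.9             # discount factor (assumed value)


def transition(state, action):
    """Deterministic next state; the walls clamp the move."""
    return min(max(state + action, 0), len(STATES) - 1)


def reward(state, action):
    """R: 1.0 for stepping into the goal, 0.0 otherwise (assumed values)."""
    return 1.0 if transition(state, action) == STATES[-1] else 0.0


def action_value(V, state, action):
    """Immediate reward plus the discounted value of the next state."""
    return reward(state, action) + GAMMA * V[transition(state, action)]


def value_iteration(sweeps=50):
    """V: repeatedly back up the best action value for each state."""
    V = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        for s in STATES[:-1]:  # the terminal state keeps value 0
            V[s] = max(action_value(V, s, a) for a in ACTIONS)
    return V


def greedy_policy(V):
    """Policy: in each non-terminal state, pick the best-valued action."""
    return {s: max(ACTIONS, key=lambda a: action_value(V, s, a))
            for s in STATES[:-1]}


V = value_iteration()
print(V)
print(greedy_policy(V))
```

In this toy problem the greedy policy moves right in every cell, and the value of a cell shrinks by a factor of `GAMMA` for each extra step it is away from the goal.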
Q learning:-
The Q in Q learning stands for quality. Its main objective is to learn a policy that tells the agent which action to take in each situation in order to maximize the reward.
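As a rough illustration of this idea, the sketch below runs tabular Q learning on a hypothetical five-cell corridor. The environment, the hyperparameters `ALPHA`, `GAMMA`, and `EPSILON`, and the epsilon-greedy exploration scheme are assumptions made for the example, not details taken from this article.

```python
import random

N_STATES = 5     # cells 0..4; cell 4 is the goal (hypothetical toy task)
ACTIONS = [-1, 1]
ALPHA = 0.5      # learning rate (assumed)
GAMMA = 0.9      # discount factor (assumed)
EPSILON = 0.1    # exploration probability (assumed)


def step(state, action):
    """Move left or right; reward 1.0 only when the goal is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    # Q[(state, action)] estimates the quality of taking action in state
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < EPSILON:
                action = rng.choice(ACTIONS)  # explore
            else:
                # exploit, breaking ties between equal Q values at random
                best = max(Q[(state, a)] for a in ACTIONS)
                action = rng.choice(
                    [a for a in ACTIONS if Q[(state, a)] == best])
            next_state, reward, done = step(state, action)
            # Q learning update: move Q toward reward + discounted best next Q
            future = 0.0 if done else max(Q[(next_state, a)] for a in ACTIONS)
            Q[(state, action)] += ALPHA * (reward + GAMMA * future
                                           - Q[(state, action)])
            state = next_state
    return Q


Q = train()
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)])
          for s in range(N_STATES - 1)}
print(policy)
```

The learned `policy` maps each non-goal cell to the action with the highest Q value. Unlike a method that needs the full MDP, the agent here only ever sees the states and rewards returned by `step`.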
Applications of reinforcement learning
Many sectors use RL to reduce human effort. Here are some of its applications:-
1. Game playing
RL has already been used in many games such as chess, Go (AlphaGo), Atari Breakout, Pac-Man, tic-tac-toe, etc. Currently, many companies in the game industry are using RL to improve performance. I recommend watching the documentary AlphaGo (how AlphaGo defeated world champion Lee Sedol).
2. Business strategy planning
Businesses use reinforcement learning for business strategy planning.
3. Finance
Many finance companies use RL in trading strategies (stock market) to predict whether the price of a stock will go up or down.
4. Robotics
Industries use robots to automate their systems. Many industries currently use preprogrammed robots, not RL robots, but companies are gradually adopting RL robots.
5. Adaptive control
RL is used for adaptive control, such as admission control in telecommunication and aircraft control. RL has also been applied to flying helicopters autonomously.
6. Education
RL helps create training systems that provide custom instruction and materials according to the needs of students.
When reinforcement learning should be used
- When you want to discover which actions lead to the highest reward (best result) over the long term.
- When you can provide the learning agent with a reward function.
When reinforcement learning should not be used
- When you have enough data (labeled or unlabeled) to solve the problem, you should go with supervised or unsupervised learning, respectively.
- When time or computing power is limited. Reinforcement learning is time-consuming and requires heavy computation, so you need to allow enough time to solve the problem with it.
I hope you like the article.
If you have any questions, please mention them in the comment box.