TY - GEN
T1 - Optimizing Reinforcement Learning Using Failure Data
AU - Cui, Suzeyu George
AU - Parron, Jesse
AU - Modery, Garrett
AU - Wang, Weitian
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Learning from both successes and failures is key to developing robust and efficient policies in reinforcement learning (RL). Traditional RL excels at learning from rewards but often neglects non-rewarding states, especially those leading to negative outcomes. This paper introduces a novel approach that integrates a modified Gaussian distribution into a Deep Q-Network (DQN) framework to learn from failures. By penalizing state-action pairs near historical failure points, the model guides the agent away from pitfalls. The optimized DQN shows improved learning speed and stability, achieving higher and more consistent scores than a standard DQN. This approach highlights the potential of hybrid RL models that combine value-based methods with failure-aware mechanisms to accelerate learning and enhance decision-making.
KW - learning from failure
KW - machine learning
KW - reinforcement learning
KW - robotics
UR - https://www.scopus.com/pages/publications/105002702482
U2 - 10.1109/URTC65039.2024.10937604
DO - 10.1109/URTC65039.2024.10937604
M3 - Conference contribution
AN - SCOPUS:105002702482
T3 - URTC 2024 - 2024 IEEE MIT Undergraduate Research Technology Conference, Proceedings
BT - URTC 2024 - 2024 IEEE MIT Undergraduate Research Technology Conference, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE MIT Undergraduate Research Technology Conference, URTC 2024
Y2 - 11 October 2024 through 13 October 2024
ER -