Training reinforcement neurocontrollers using the polytope algorithm
Journal type
peer reviewed
Journal name
Neural Processing Letters
Description
A new training algorithm is presented for delayed reinforcement learning problems that does not assume the existence of a critic model and employs the polytope optimization algorithm to adjust the weights of the action network so that a simple direct measure of the training performance is maximized. Experimental results from the application of the method to the pole balancing problem indicate improved training performance compared with critic-based and genetic reinforcement approaches.
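The idea in the description can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a hypothetical linear bang-bang action network with four weights, the commonly used cart-pole equations and failure limits (track of ±2.4 m, pole angle of ±12°), and a simplified polytope (Nelder-Mead) search with reflection, expansion, inside contraction, and shrink steps. The performance measure assumed here is simply the number of time steps the pole stays balanced; the names `balance_steps` and `polytope_maximize` and all numeric constants are illustrative choices.

```python
import math


def balance_steps(weights, max_steps=1000, dt=0.02):
    """Count the steps a linear bang-bang controller keeps the pole balanced."""
    # Assumed physical constants (commonly used cart-pole benchmark values).
    g, m_cart, m_pole, half_len, force_mag = 9.8, 1.0, 0.1, 0.5, 10.0
    total_mass = m_cart + m_pole
    x, x_dot, theta, theta_dot = 0.0, 0.0, 0.05, 0.0  # assumed start state
    for step in range(max_steps):
        # Action network: sign of a weighted sum of the four state variables.
        s = (weights[0] * x + weights[1] * x_dot
             + weights[2] * theta + weights[3] * theta_dot)
        force = force_mag if s > 0 else -force_mag
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        temp = (force + m_pole * half_len * theta_dot ** 2 * sin_t) / total_mass
        theta_acc = (g * sin_t - cos_t * temp) / (
            half_len * (4.0 / 3.0 - m_pole * cos_t ** 2 / total_mass))
        x_acc = temp - m_pole * half_len * theta_acc * cos_t / total_mass
        # Euler integration of the cart and pole state.
        x, x_dot = x + dt * x_dot, x_dot + dt * x_acc
        theta, theta_dot = theta + dt * theta_dot, theta_dot + dt * theta_acc
        if abs(x) > 2.4 or abs(theta) > 12 * math.pi / 180:
            return step  # failure: cart left the track or pole fell
    return max_steps


def polytope_maximize(f, x0, step=1.0, iters=200, target=None):
    """Simplified polytope (Nelder-Mead) search that maximizes f."""
    n = len(x0)
    # Initial polytope: x0 plus one vertex offset along each coordinate axis.
    simplex = [list(x0)] + [[x0[j] + (step if j == i else 0.0)
                             for j in range(n)] for i in range(n)]
    values = [f(p) for p in simplex]
    for _ in range(iters):
        order = sorted(range(n + 1), key=lambda i: values[i], reverse=True)
        simplex = [simplex[i] for i in order]  # best vertex first
        values = [values[i] for i in order]
        if target is not None and values[0] >= target:
            break
        worst = simplex[-1]
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]

        def along(t):
            # Point centroid + t * (centroid - worst).
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        f_r = f(reflected)
        if f_r > values[0]:  # better than the best vertex: try expanding
            expanded = along(2.0)
            f_e = f(expanded)
            if f_e > f_r:
                simplex[-1], values[-1] = expanded, f_e
            else:
                simplex[-1], values[-1] = reflected, f_r
        elif f_r > values[-2]:  # better than the second-worst vertex
            simplex[-1], values[-1] = reflected, f_r
        else:  # contract toward the centroid, or shrink toward the best
            contracted = along(-0.5)
            f_c = f(contracted)
            if f_c > values[-1]:
                simplex[-1], values[-1] = contracted, f_c
            else:
                best = simplex[0]
                simplex = [best] + [[(b + q) / 2 for b, q in zip(best, p)]
                                    for p in simplex[1:]]
                values = [values[0]] + [f(p) for p in simplex[1:]]
    i_best = max(range(n + 1), key=lambda i: values[i])
    return simplex[i_best], values[i_best]
```

Under these assumptions, a call such as `polytope_maximize(balance_steps, [0.0] * 4, target=1000)` would search the weight space directly, with no critic model, and stop once a controller balances the pole for the full episode or the iteration budget runs out. Because the step count is piecewise constant in the weights, the search can stall on plateaus, so a practical version might restart from several initial polytopes.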
Keywords
reinforcement learning, neurocontrol, optimization, polytope algorithm, pole balancing, genetic reinforcement
Language
en
Institution and School/Department of submitter
University of Ioannina. School of Sciences. Department of Computer Science and Engineering