Publication Type:

Conference Paper

Source:

2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2012

Keywords:

Approximation algorithms, average cost control, constrained Markov decision processes, Constrained MDP, constrained routing problem, Decision theory, function approximation, inequality constraints, Lagrange multiplier method, learning (artificial intelligence), Markov processes, Minimization, multi-stage stochastic shortest path problem, multistage queueing network, multitimescale Q-learning algorithm, network theory (graphs), Parameter estimation, parameter update, policy parameter, Q-learning with linear function approximation, Q-value parameter, Queueing theory, Reinforcement learning, routing, Vectors

Cite this Research Publication

L. K. and Bhatnagar, S., “A novel Q-learning algorithm with function approximation for constrained Markov decision processes”, in 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2012.