Variational characterizations in Markov decision processes
Coauthor(s): Paul Schweitzer.
Most quantities of interest in discounted and undiscounted (semi-) Markov decision processes can be obtained by solving a system of functional equations. This paper derives bounds and variational characterizations for the solutions of such systems. These are useful for at least three reasons: (1) in any solution procedure the upper and lower bounds can be used to measure the deviation of the current solution from optimality; (2) this in turn may permit elimination of suboptimal actions; and (3) the variational characterizations suggest numerical algorithms (linear programming, policy iteration algorithms, successive approximation schemes).
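As an illustration of points (1) and (2), the following is a minimal sketch and not the paper's own construction. It assumes a finite, discounted, reward-maximizing MDP and uses the classical MacQueen-type bounds for successive approximation. The function name, the data layout (`P[s][a]` as a list of transition probabilities, `r[s][a]` as a one-step reward) and the two-state example are hypothetical.

```python
def value_iteration_with_bounds(P, r, beta, tol=1e-8, max_iter=10_000):
    """Successive approximation with MacQueen-type bounds (illustrative sketch).

    Assumptions (not taken from the paper): finite states and actions,
    discount factor 0 < beta < 1, rewards are maximized.
    """
    n = len(P)
    v = [0.0] * n
    scale = beta / (1.0 - beta)
    lower, upper = v[:], v[:]
    for _ in range(max_iter):
        # One-step lookahead value of every action in every state.
        q = [[r[s][a] + beta * sum(p * v[j] for j, p in enumerate(P[s][a]))
              for a in range(len(P[s]))] for s in range(n)]
        v_new = [max(q_s) for q_s in q]
        diff = [v_new[s] - v[s] for s in range(n)]
        # Lower and upper bounds on the optimal value vector.
        lower = [v_new[s] + scale * min(diff) for s in range(n)]
        upper = [v_new[s] + scale * max(diff) for s in range(n)]
        v = v_new
        # The gap between the bounds measures the deviation from optimality.
        if scale * (max(diff) - min(diff)) < tol:
            break
    # An action is provably suboptimal if even its optimistic value
    # falls below the lower bound of its state.
    suboptimal = set()
    for s in range(n):
        for a in range(len(P[s])):
            optimistic = r[s][a] + beta * sum(
                p * upper[j] for j, p in enumerate(P[s][a]))
            if optimistic < lower[s]:
                suboptimal.add((s, a))
    return lower, upper, suboptimal


# Hypothetical two-state example: action 0 stays put, action 1 switches state.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
r = [[1.0, 0.0],
     [2.0, 0.0]]
lower, upper, suboptimal = value_iteration_with_bounds(P, r, beta=0.9)
print(lower, upper, sorted(suboptimal))
```

In this sketch the elimination test is applied only once, after the bounds have converged. Applying the same test inside the loop would remove actions earlier, at the cost of an extra pass per iteration.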
Source: Journal of Mathematical Analysis and Applications
Federgruen, Awi, and Paul Schweitzer. "Variational characterizations in Markov decision processes." Journal of Mathematical Analysis and Applications 117, no. 2 (August 1, 1986): 326-357.