Contraction mappings underlying undiscounted Markov decision problems
Coauthor(s): Paul Schweitzer, H. C. Tijms.
This paper is concerned with the properties of the value-iteration operator which arises in undiscounted Markov decision problems. We give both necessary and sufficient conditions for this operator to reduce to a contraction operator, in which case it is easy to show that the value-iteration method exhibits a uniform geometric convergence rate. As necessary conditions we obtain a number of important characterizations of the chain and periodicity structures of the problem, and as sufficient conditions, we give a general "scrambling-type" recurrency condition, which encompasses a number of important special cases. Next, we show that a data transformation turns every unichained undiscounted Markov Renewal Program into an equivalent undiscounted Markov decision problem, in which the value-iteration operator is contracting, because it satisfies this "scrambling-type" condition. We exploit this contraction property in order to obtain lower and upper bounds as well as variational characterizations for the fixed point of the optimality equation and a test for eliminating suboptimal actions.
Source: Journal of Mathematical Analysis and Applications
Federgruen, Awi, Paul Schweitzer, and H. C. Tijms. "Contraction mappings underlying undiscounted Markov decision problems." Journal of Mathematical Analysis and Applications 65, no. 3 (October 1978): 711-730.