## Awi Federgruen

*The asymptotic behavior of undiscounted value iteration in Markov decision problems*

Coauthor(s): Paul Schweitzer.

**Abstract:**

This paper considers undiscounted Markov Decision Problems. For the general multichain case, we obtain necessary and sufficient conditions that guarantee that the maximal total expected reward for a planning horizon of *n* epochs, minus *n* times the long-run average expected reward, has a finite limit as *n* approaches infinity for each initial state and each final reward vector. In addition, we obtain a characterization of the chain and periodicity structure of the set of one-step and *J*-step maximal gain policies. Finally, we discuss the asymptotic properties of the undiscounted value-iteration method.

**Source:** *Mathematics of Operations Research*

**Volume:** 2

**Number:** 4

**Pages:** 360-381

**Date:** November 1977