In this paper, we generalize a result of Borkar and Meyn (2000) on the stability and convergence of synchronous-update stochastic approximation algorithms to the case of asynchronous stochastic approximations with delays. Formal proofs will be given in Section 2.

Borkar is known for introducing an analytical paradigm in stochastic optimal control processes and is an elected fellow of all three major Indian science academies.

V. S. Borkar and S. P. Meyn, "The ODE method for convergence of stochastic approximation and reinforcement learning," SIAM Journal on Control and Optimization 38 (2), 447-469, 2000.

Vivek S. Borkar and Vijaymohan R. Konda, "The actor-critic algorithm as multi-time-scale stochastic approximation," Department of Computer Science and Automation, Indian Institute of Science, Bangalore 560 012, India.
The ODE approach to stochastic approximation was initiated by Ljung. DOI: 10.1137/S0363012997331639. In 1999, Borkar and Meyn [13] developed sufficient conditions that guarantee both the stability and convergence of stochastic recursive equations. This is motivated by emerging applications in communications. These algorithms are powerful and have applications in control and communications engineering, artificial intelligence, and economic modeling.

Book Description: The book deals with a powerful and convenient approach to a wide variety of problems of the recursive Monte Carlo or stochastic approximation type. This book is a great reference and, if you are patient, also a very good self-study text in the field of stochastic approximation.
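
To make the ODE viewpoint mentioned above concrete, here is a minimal sketch of a scalar stochastic approximation recursion, x_{n+1} = x_n + a_n (h(x_n) + noise), whose iterates are expected to track the ODE x' = h(x). It is only an illustration under stated assumptions and is not taken from any of the works cited here: the function name `robbins_monro`, the drift `h(x) = -(x - 2)`, the step sizes a_n = 1/(n+1), and the Gaussian noise are all hypothetical choices.

```python
import random


def robbins_monro(h, x0, n_steps, noise_std=1.0, seed=0):
    """Iterate x_{n+1} = x_n + a_n * (h(x_n) + noise) with a_n = 1/(n+1).

    Illustrative only: the step sizes and the noise model are assumptions.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    x = x0
    for n in range(n_steps):
        a_n = 1.0 / (n + 1)  # sum of a_n diverges, sum of a_n^2 converges
        noise = rng.gauss(0.0, noise_std)  # zero-mean observation noise
        x = x + a_n * (h(x) + noise)
    return x


def h(x):
    # Assumed drift: the ODE x' = h(x) has a globally stable equilibrium at 2.
    return -(x - 2.0)


estimate = robbins_monro(h, x0=10.0, n_steps=20000)
print(estimate)  # expected to lie close to the equilibrium 2.0
```

With this particular drift and step size, each iterate is the running average of noisy observations of 2, so the recursion settles near the equilibrium of the limiting ODE. Other drifts or step-size schedules would need their own stability analysis.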