Problem statement
Finite state machine of a recycling robot and its MDP dynamics lookup table (LUT)
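A dynamics lookup table of this kind can be sketched in code as a mapping from each (state, action) pair to its possible outcomes. The sketch below assumes the figure follows the standard recycling robot example from Sutton and Barto's textbook, with two battery states (`high`, `low`) and three actions (`search`, `wait`, `recharge`). The names `build_dynamics` and `expected_reward`, and the numeric values chosen for `alpha`, `beta`, `r_search`, and `r_wait`, are illustrative assumptions and do not come from this page.

```python
from typing import Dict, List, Tuple

# (state, action) -> list of (next_state, reward, probability) outcomes.
Dynamics = Dict[Tuple[str, str], List[Tuple[str, float, float]]]


def build_dynamics(alpha: float, beta: float,
                   r_search: float, r_wait: float) -> Dynamics:
    """Build the dynamics LUT, assuming the textbook recycling robot layout."""
    return {
        # Searching with a high battery keeps it high with probability alpha.
        ("high", "search"): [("high", r_search, alpha),
                             ("low", r_search, 1.0 - alpha)],
        ("high", "wait"): [("high", r_wait, 1.0)],
        # Searching with a low battery depletes it with probability 1 - beta;
        # the robot is then rescued (reward -3) and ends up recharged.
        ("low", "search"): [("low", r_search, beta),
                            ("high", -3.0, 1.0 - beta)],
        ("low", "wait"): [("low", r_wait, 1.0)],
        ("low", "recharge"): [("high", 0.0, 1.0)],
    }


def expected_reward(lut: Dynamics, state: str, action: str) -> float:
    """Expected immediate reward r(s, a) computed from the LUT."""
    return sum(prob * reward for _, reward, prob in lut[(state, action)])


# Illustrative parameter values only (assumptions, not from the source).
lut = build_dynamics(alpha=0.9, beta=0.6, r_search=2.0, r_wait=1.0)
```

Storing outcomes as explicit (next state, reward, probability) triples keeps the table directly usable for Bellman backups, since each row of the LUT corresponds to one term of the sum over next states and rewards.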
Solution
Key references: (Bency et al., 2019; Schmidhuber, 2015; Marco et al., 2017; Hamrick et al., 2017)
References
- Bency, M., Qureshi, A., Yip, M. (2019). Neural Path Planning: Fixed Time, Near-Optimal Path Generation via Oracle Imitation.
- Hamrick, J., Ballard, A., Pascanu, R., Vinyals, O., Heess, N., et al. (2017). Metacontrol for Adaptive Imagination-Based Optimization.
- Marco, A., Berkenkamp, F., Hennig, P., Schoellig, A., Krause, A., et al. (2017). Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization.
- Schmidhuber, J. (2015). On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models.