Show Notes:
- (2:00) Arthur talked about his undergraduate studies in Psychology at North Carolina State University.
- (3:28) Arthur mentioned his time working as a research assistant at the LACElab at NCSU, which does human factors and cognition research.
- (5:08) Arthur discussed his decision to pursue a graduate degree in Cognitive Neuroscience at the University of Oregon right after college.
- (6:35) Arthur went over his Master's thesis, “Navigation performance in virtual environments varies with fractal dimension of landscape,” in more detail.
- (10:30) Arthur unpacked his popular blog series called “Simple Reinforcement Learning in TensorFlow” on Medium.
- (12:56) Arthur recalled his decision to join Unity to work on its reinforcement learning problems.
- (14:31) Arthur recalled his choice to pursue his Ph.D. part-time while working full-time.
- (16:24) Arthur discussed problems with existing reinforcement learning simulation platforms and how the Unity Machine Learning Agents Toolkit addresses them.
- (18:30) Arthur went over the challenges of maintaining and continuously iterating on the Unity ML-Agents Toolkit.
- (20:36) Arthur emphasized the benefit of training agents with an additional curiosity-based intrinsic reward, an approach inspired by a paper from UC Berkeley researchers (check out the Unity blog post).
- (22:33) Arthur talked about the challenges of implementing such curiosity-based techniques.
- (25:15) Arthur unpacked the introduction of the Obstacle Tower - a high-fidelity, 3D, third-person, procedurally generated environment - released in the latest version of the toolkit (read his blog post “On ‘solving’ Montezuma’s Revenge”).
- (29:15) Arthur discussed the Obstacle Tower Challenge, a contest that offers researchers and developers the chance to compete to train the best-performing agents on the Obstacle Tower Environment.
- (32:49) Arthur walked through the theory behind Generative Adversarial Networks using an episode of SpongeBob SquarePants, referring to his fun tutorial “GANs explained with a classic sponge bob squarepants episode.”
- (34:30) Arthur expanded on his post “RL or Evolutionary Strategies? Nature has a solution: Both.”
- (38:36) Arthur shared a couple of approaches to balancing the bias/variance tradeoff in reinforcement learning models, referring to his article “Making sense of the bias/variance tradeoff in Deep RL.”
- (41:19) Arthur talked about successor representations and their applications in deep learning, psychology, and neuroscience (read his post “The present in terms of the future: Successor representations in RL”).
- (42:38) Arthur reflected on the benefits of his Psychology and Neuroscience background for his research career.
- (44:21) Arthur shared his advice for graduate students who want to make a dent in the AI / ML research community.
- (45:30) Closing segment.
His Contact Info:
- Twitter
- GitHub
- Medium
- LinkedIn
- Google Scholar
- Unity Blog
His Recommended Resources:
- DeepMind
- Google Brain
- Being and Time (by Martin Heidegger)