Rewards for AGI

Contents

  1. Introduce the role of rewards in reinforcement learning
    1. Use the example of dog training
    2. Explain dopamine
    3. Explain the
  2. Introduce extrinsic rewards
    1. Classic rewards
    2. Used in reinforcement learning examples
      1. Atari
      2. AlphaGo
      3. World model paper
  3. Introduce intrinsic rewards
    1. Explain the limitations of extrinsic rewards
      1. Montezuma’s Revenge
    2. Explain the mechanism of intrinsic rewards
      1. How curious or exploratory behavior relates to intrinsic rewards
      2. How to formulate intrinsic rewards mathematically
      3. Model-free vs model-based reinforcement learning
      4. Intrinsic rewards examples
        1. Introduce the paper by Stanford
        2. World Discovery Model paper
    3. Introduce the limitations of intrinsic rewards
      1. The search space is too vast for a single agent to learn by exploration
        • Example: Random permutation of the alphabet to generate a meaningful sequence
        • Example: Musical notes
      2. Explain higher-order Markov chains
  4. Introduce social rewards
    1. Introduce imprinting
    2. Introduce how the baby learns
      1. Learn by imitation
      2. Joint attention
      3. Gated learning
    3. Theory-based, mathematically driven algorithms vs. quick-and-dirty rule-based systems
  5. Future research directions
    1. Models that take into account the innate mechanisms
    2. Environments that can test the models with social rewards

Rewards play a key role in reinforcement learning.

In this post, I will explain three types of rewards that can shape intelligent behavior.

  • Extrinsic rewards: rewards triggered by an external entity.
  • Intrinsic rewards: rewards generated by the agent itself.
  • Social rewards: rewards generated by social activities.
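
To make the three definitions concrete, here is a minimal sketch of how the three signals might be kept apart and then combined into one scalar for a learning agent. The weighted sum, the weights, and the names `RewardSignal` and `total_reward` are illustrative assumptions of mine, not taken from a specific paper or library.

```python
from dataclasses import dataclass


@dataclass
class RewardSignal:
    """One time step's reward, split by source (hypothetical structure)."""
    extrinsic: float = 0.0  # triggered by an external entity, e.g. a game score
    intrinsic: float = 0.0  # generated by the agent itself, e.g. curiosity
    social: float = 0.0     # generated by social activities, e.g. imitation


def total_reward(r: RewardSignal, w_ext: float = 1.0,
                 w_int: float = 0.5, w_soc: float = 0.5) -> float:
    """Combine the three sources using assumed, hand-picked weights."""
    return w_ext * r.extrinsic + w_int * r.intrinsic + w_soc * r.social
```

Keeping the three sources in separate fields makes it easy to switch one of them off and see how the behavior changes.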

Extrinsic rewards

Extrinsic rewards are positive or negative rewards triggered by an external entity. A concrete example is the … Until now, "rewards" in reinforcement learning mostly meant extrinsic rewards.

Intrinsic rewards

Social rewards

The search space for generating a meaningful sequence is vast, and only very few of the possible sequences are meaningful. But we can learn meaningful sequences from others, which reduces the search space.

This can be illustrated with musical notes.
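
A rough calculation shows how quickly the space grows. The sketch below assumes a toy setting of my own choosing: only the seven natural notes, no rhythm or octaves, and the seven-note opening of "Twinkle, Twinkle, Little Star" as the one meaningful sequence. The function names are hypothetical.

```python
import random

NOTES = "CDEFGAB"   # seven natural notes (assumption: no octaves, no rhythm)
MELODY = "CCGGAAG"  # opening of "Twinkle, Twinkle, Little Star"


def search_space_size(length: int, n_symbols: int = len(NOTES)) -> int:
    """Number of distinct note sequences of the given length."""
    return n_symbols ** length


def tries_by_random_search(target: str, max_tries: int, seed: int = 0):
    """Sample random sequences until the target appears; None if it never does."""
    rng = random.Random(seed)
    for attempt in range(1, max_tries + 1):
        guess = "".join(rng.choice(NOTES) for _ in target)
        if guess == target:
            return attempt
    return None


def tries_by_imitation(target: str, demonstration: str):
    """A learner that copies a correct demonstration needs a single try."""
    return 1 if demonstration == target else None


print(search_space_size(len(MELODY)))  # 7**7 = 823543 candidate sequences
```

A single agent sampling at random has a 1 in 823,543 chance per try of hitting this short melody, while an agent that hears it once from someone else can reproduce it immediately.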

Should we use a reward mechanism or a reflex mechanism?

Imprinting is an example of a reflex mechanism. Because we have the freedom to follow or not, I think a reward mechanism makes more sense.
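
The difference between the two mechanisms can be sketched in a few lines. This is only an illustration under my own assumptions: the function names, the action labels, and the idea of comparing a social reward against the value of an alternative action are hypothetical, not an established model.

```python
def reflex_policy(sees_caregiver: bool) -> str:
    """Reflex: the response is hard-wired, so the agent cannot decline."""
    return "follow" if sees_caregiver else "idle"


def reward_policy(sees_caregiver: bool, social_reward: float,
                  value_of_alternative: float) -> str:
    """Reward: following is chosen only when it is worth more than the alternative."""
    if sees_caregiver and social_reward > value_of_alternative:
        return "follow"
    return "explore"
```

With the reflex version the agent always follows when the caregiver is visible. With the reward version, following is a preference that the agent can override when something else is more valuable, which matches the freedom to follow or not.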

References

  1. Jones, Susan S. (2012-12-10). “Human Toddlers’ Attempts to Match Two Simple Behaviors Provide No Evidence for an Inherited, Dedicated Imitation Mechanism”. PLOS ONE. 7 (12): e51326. Bibcode:2012PLoSO…751326J. doi:10.1371/journal.pone.0051326. ISSN 1932-6203. PMC 3519587. PMID 23251500
  2. Jones, Susan S. (2009-08-27). “The development of imitation in infancy”. Philosophical Transactions of the Royal Society B: Biological Sciences. 364 (1528): 2325–2335. doi:10.1098/rstb.2009.0045. ISSN 0962-8436. PMC 2865075. PMID 19620104.
  3. Rowland, D.C., Yanovich, Y. and Kentros, C.G. (2011). A stable hippocampal representation of a space requires its direct experience. Proceedings of the National Academy of Sciences. 108(35). 14654-14658. -> Evidence for Gated language
  4. Morton, J. and Johnson, M.H. (1991). CONSPEC and CONLERN: a two-process theory of infant face recognition. Psychological Review. -> Evidence from newborns leads to the conclusion that infants are born with some information about the structure of faces. This structural information, termed CONSPEC, guides the preference for facelike patterns found in newborn infants.
  5. Johnson, M.H., Dziurawiec, S., Ellis, H. and Morton, J. (1991). Newborns’ preferential tracking of face-like stimuli and its subsequent decline. Cognition. -> Goren, Sarty, and Wu (1975) claimed that newborn infants will follow a slowly moving schematic face stimulus with their head and eyes further than they will follow scrambled faces or blank stimuli.
  6. Insel, T.R. and Fernald, R.D. (2004). How the brain processes social information: searching for the social brain. Annual Review of Neuroscience. -> Because information about gender, kin, and social status are essential for reproduction and survival, it seems likely that specialized neural mechanisms have evolved to process social information.
  7. Farroni, T., Csibra, G., Simion, F., … (2002). Eye contact detection in humans from birth. Proceedings of the …, National Academy of Sciences. -> Making eye contact is the most powerful mode of establishing a communicative link between humans. During their first year of life, infants learn rapidly that the looking behaviors of others conveys significant information.
