Signals related to action value functions are found in the striatum (…, 2005; Lau and Glimcher, 2008; Cai et al., 2011; Kim et al., 2009, 2013). In addition, signals necessary for updating the value functions, including the value of the chosen action and reward prediction errors, are also found in the striatum (Kim et al., 2009; Oyama et al., 2010; Asaad and Eskandar, 2011). Moreover, the dorsolateral striatum, or the putamen, might be particularly involved in controlling habitual motor actions (Hikosaka et al., 1999; Tricomi et al., 2009). Although the striatum is most commonly associated with model-free reinforcement learning, additional brain areas are likely to be involved in updating action value functions, depending on the specific type of value function in question. Indeed, signals related to value functions and reward prediction errors are found in many different areas (Lee et al., 2012). Similarly, multivariate decoding analyses show that signals related to rewarding and punishing outcomes can be decoded from the majority of cortical and subcortical areas (Figure 2; Vickery et al., 2011).
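
The chosen-action value and reward prediction error mentioned above are exactly the quantities that enter a standard model-free update rule. As a minimal sketch of that computation (in Python; the function names, learning rate, and reward probabilities are illustrative assumptions, not taken from the cited studies), the value of the chosen action is nudged toward the obtained reward in proportion to the prediction error:

```python
import random

ALPHA = 0.1  # learning rate (illustrative value)

def update_chosen_value(q_values, chosen_action, reward):
    """Model-free update: adjust only the chosen action's value by the RPE."""
    rpe = reward - q_values[chosen_action]   # reward prediction error
    q_values[chosen_action] += ALPHA * rpe   # incremental value update
    return rpe

# Toy example: two actions; action 0 pays off on 80% of trials.
q_values = {0: 0.0, 1: 0.0}
for trial in range(1000):
    if random.random() < 0.1:                  # occasional exploration
        action = random.choice([0, 1])
    else:                                      # otherwise exploit
        action = max(q_values, key=q_values.get)
    reward = 1.0 if (action == 0 and random.random() < 0.8) else 0.0
    update_chosen_value(q_values, action, reward)

print(q_values)  # q_values[0] converges toward 0.8, the expected reward
```

Note that only the chosen action's value changes on each trial, which is why chosen-value and prediction-error signals together suffice for this form of learning.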

The neural substrates for model-based reinforcement learning are much less well understood than those for Pavlovian conditioning and habit learning (Doll et al., 2012). This is not surprising, since the nature of the computations for simulating possible outcomes, and of their neural implementations, might vary widely across decision-making problems. For example, separate regions of the frontal cortex and striatum in the rodent brain might underlie model-based reinforcement learning (place learning) and habit learning (response learning; Tolman et al., 1946). In particular, lesions in the dorsolateral striatum and infralimbic cortex impair habit learning, whereas lesions in the dorsomedial striatum and prelimbic cortex impair model-based reinforcement learning (Balleine and Dickinson, 1998; Killcross and Coutureau, 2003; Yin and Knowlton, 2006). In addition, lesions or inactivation of the hippocampus suppresses strategies based on model-based reinforcement learning (Packard et al., 1989; Packard and McGaugh, 1996).

To update the value functions in model-based reinforcement learning, new information from the decision maker's environment needs to be combined appropriately with previous knowledge. Encoding and updating information about the decision maker's environment might rely on the prefrontal cortex and posterior parietal cortex (Pan et al., 2008; Gläscher et al., 2010; Jones et al., 2012). In addition, the persistent activity often observed in these cortical areas is likely to reflect computations related to reinforcement learning and decision making, in addition to working memory (Kim et al., 2008; Curtis and Lee, 2010). Given that persistent activity in the prefrontal cortex is strongly influenced by dopamine and norepinephrine (Arnsten et al., 2012), prefrontal functions related to model-based reinforcement learning might be regulated by these neuromodulators.
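
To make the contrast with the model-free sketch above concrete, a model-based learner maintains an explicit model of its environment and derives action values by simulating possible outcomes under that model, folding each new observation into its previous knowledge. The sketch below is a minimal illustration of this idea in Python; the class, its method names, and the learning rate are hypothetical, since the text does not commit to any particular algorithm:

```python
from collections import defaultdict

class ModelBasedAgent:
    """Minimal model-based learner: an internal model of the environment
    (transition statistics and outcome values) is updated from experience,
    and action values are computed by simulating outcomes under the model."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha  # learning rate for outcome values (illustrative)
        # transition_counts[action][outcome] = times outcome followed action
        self.transition_counts = defaultdict(lambda: defaultdict(int))
        self.outcome_values = defaultdict(float)

    def observe(self, action, outcome, reward):
        """Combine a new observation with previous knowledge of the world."""
        self.transition_counts[action][outcome] += 1
        v = self.outcome_values[outcome]
        self.outcome_values[outcome] = v + self.alpha * (reward - v)

    def action_value(self, action):
        """Evaluate an action by simulating its outcomes under the model."""
        counts = self.transition_counts[action]
        total = sum(counts.values())
        if total == 0:
            return 0.0  # no knowledge of this action yet
        return sum((n / total) * self.outcome_values[o]
                   for o, n in counts.items())

# Toy usage: pressing a lever usually yields food.
agent = ModelBasedAgent()
for _ in range(50):
    agent.observe('press_lever', 'food', reward=1.0)
print(round(agent.action_value('press_lever'), 2))  # close to 1.0
```

One behavioral consequence of this arrangement: if the stored value of an outcome is lowered (outcome devaluation), the simulated action value drops immediately, without further experience, which is the signature used to distinguish model-based control from habitual control in the lesion studies cited above.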
