When producing a description of a target referent in a visual context, speakers need to choose a set of properties that distinguish it from its distractors. Computational models of language production usually model this as a search process and predict that the time taken will increase both with the number of distractors in a scene and with the number of properties required to distinguish the target. These predictions are reminiscent of classic findings in visual search; however, unlike models of reference production, visual search models also predict that search can become very efficient under certain conditions, something that reference production models do not consider. This paper investigates the predictions of these models empirically. In two experiments, we show that the time taken to plan a referring expression (as reflected by speech onset latencies) is influenced by distractor set size and by the number of properties required, but that this crucially depends on the discriminability of the properties under consideration. We discuss the implications for current models of reference production and recent work on the role of salience in visual search.
Number of pages: 36
Early online date: 6 Jun 2016
Publication status: Published - May 2017
- language production
- visual search
- computational models