Crowding in humans is unlike that in convolutional neural networks

Ben Lonnqvist; Alasdair D. F. Clarke; Ramakrishna Chakravarthi

doi:10.1016/j.neunet.2020.03.021

Corresponding author: Ben Lonnqvist

Psychology

Research output: Contribution to journal › Article › peer-review

Abstract

Object recognition is a primary function of the human visual system. It has recently been claimed that the highly successful ability to recognise objects in a set of emergent computer vision systems—Deep Convolutional Neural Networks (DCNNs)—can form a useful guide to recognition in humans. To test this assertion, we systematically evaluated visual crowding, a dramatic breakdown of recognition in clutter, in DCNNs and compared their performance to extant research in humans. We examined crowding in three architectures of DCNNs with the same methodology as that used among humans. We manipulated multiple stimulus factors including inter-letter spacing, letter colour, size, and flanker location to assess the extent and shape of crowding in DCNNs. We found that crowding followed a predictable pattern across architectures that was different from that in humans. Some characteristic hallmarks of human crowding, such as invariance to size, the effect of target-flanker similarity, and confusions between target and flanker identities, were completely missing, minimised or even reversed. These data show that DCNNs, while proficient in object recognition, likely achieve this competence through a set of mechanisms that are distinct from those in humans. They are not necessarily equivalent models of human or primate object recognition and caution must be exercised when inferring mechanisms derived from their operation.

Original language	English
Pages (from-to)	262-274
Number of pages	13
Journal	Neural Networks
Volume	126
Early online date	27 Mar 2020
DOIs	https://doi.org/10.1016/j.neunet.2020.03.021
Publication status	Published - Jun 2020

Bibliographical note

Acknowledgements
We would like to acknowledge the use of a Tesla K40 GPU card that has been donated to Dr M. S. Baptista by Nvidia. We would also like to thank Dr Micha Elsner for helpful discussions.

Keywords

convolutional neural networks
object recognition
crowding
masking
spatial interaction

Access to Document

10.1016/j.neunet.2020.03.021 (Licence: Unspecified)

Lonnqvist_et_al_NN_CrowdingInHumans_AAM
© 2020. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/
Accepted author manuscript, 6.67 MB (Licence: CC BY-NC-ND)

Cite this

@article{fb90796356a04b44b7689e2e0cc818ae,

title = "Crowding in humans is unlike that in convolutional neural networks",

abstract = "Object recognition is a primary function of the human visual system. It has recently been claimed that the highly successful ability to recognise objects in a set of emergent computer vision systems—Deep Convolutional Neural Networks (DCNNs)—can form a useful guide to recognition in humans. To test this assertion, we systematically evaluated visual crowding, a dramatic breakdown of recognition in clutter, in DCNNs and compared their performance to extant research in humans. We examined crowding in three architectures of DCNNs with the same methodology as that used among humans. We manipulated multiple stimulus factors including inter-letter spacing, letter colour, size, and flanker location to assess the extent and shape of crowding in DCNNs. We found that crowding followed a predictable pattern across architectures that was different from that in humans. Some characteristic hallmarks of human crowding, such as invariance to size, the effect of target-flanker similarity, and confusions between target and flanker identities, were completely missing, minimised or even reversed. These data show that DCNNs, while proficient in object recognition, likely achieve this competence through a set of mechanisms that are distinct from those in humans. They are not necessarily equivalent models of human or primate object recognition and caution must be exercised when inferring mechanisms derived from their operation.",

keywords = "convolutional neural networks, object recognition, crowding, masking, spatial interaction",

author = "Lonnqvist, Ben and Clarke, {Alasdair D. F.} and Chakravarthi, Ramakrishna",

note = "Acknowledgements: We would like to acknowledge the use of a Tesla K40 GPU card that has been donated to Dr M. S. Baptista by Nvidia. We would also like to thank Dr Micha Elsner for helpful discussions.",

year = "2020",

month = jun,

doi = "10.1016/j.neunet.2020.03.021",

language = "English",

volume = "126",

pages = "262--274",

journal = "Neural Networks",

issn = "0893-6080",

publisher = "Elsevier Limited",

}

TY - JOUR

T1 - Crowding in humans is unlike that in convolutional neural networks

AU - Lonnqvist, Ben

AU - Clarke, Alasdair D. F.

AU - Chakravarthi, Ramakrishna

N1 - Acknowledgements: We would like to acknowledge the use of a Tesla K40 GPU card that has been donated to Dr M. S. Baptista by Nvidia. We would also like to thank Dr Micha Elsner for helpful discussions.

PY - 2020/6

Y1 - 2020/6

AB - Object recognition is a primary function of the human visual system. It has recently been claimed that the highly successful ability to recognise objects in a set of emergent computer vision systems—Deep Convolutional Neural Networks (DCNNs)—can form a useful guide to recognition in humans. To test this assertion, we systematically evaluated visual crowding, a dramatic breakdown of recognition in clutter, in DCNNs and compared their performance to extant research in humans. We examined crowding in three architectures of DCNNs with the same methodology as that used among humans. We manipulated multiple stimulus factors including inter-letter spacing, letter colour, size, and flanker location to assess the extent and shape of crowding in DCNNs. We found that crowding followed a predictable pattern across architectures that was different from that in humans. Some characteristic hallmarks of human crowding, such as invariance to size, the effect of target-flanker similarity, and confusions between target and flanker identities, were completely missing, minimised or even reversed. These data show that DCNNs, while proficient in object recognition, likely achieve this competence through a set of mechanisms that are distinct from those in humans. They are not necessarily equivalent models of human or primate object recognition and caution must be exercised when inferring mechanisms derived from their operation.

KW - convolutional neural networks

KW - object recognition

KW - crowding

KW - masking

KW - spatial interaction

UR - http://www.scopus.com/inward/record.url?scp=85082653827&partnerID=8YFLogxK

U2 - 10.1016/j.neunet.2020.03.021

DO - 10.1016/j.neunet.2020.03.021

M3 - Article

C2 - 32272430

SN - 0893-6080

VL - 126

SP - 262

EP - 274

JO - Neural Networks

JF - Neural Networks

ER -
