Exposure to naturalistic occlusion promotes generalized, human-like robustness in deep neural networks

Abstract

Human object recognition is robust to challenging conditions, such as when one’s view of an object is fragmented by an occluding foreground object. In comparison, deep neural networks (DNNs) are typically more susceptible to occlusion, suggesting that human vision relies on distinct mechanisms. Here, we investigated the role of visual diet in the emergence of these mechanisms by asking whether human-like robustness might arise in DNNs trained with image datasets that better reflect the properties of occlusion in natural vision. We trained convolutional and transformer DNNs to classify clear images only, images augmented with artificial occluders (i.e., geometric shapes), or images augmented with natural occluders (objects segmented from photographs). We then evaluated the occlusion robustness of these DNNs and compared their performance profiles with those of 30 human participants. We found that DNNs trained with artificial occluders remained vulnerable to natural occlusion and exhibited less human-like performance than those trained with natural occlusion. Our findings suggest that human robustness to visual occlusion arises from learning to disentangle natural objects from each other rather than simply learning to recognize objects from partial views. They also imply that commonly used forms of artificial occlusion are unsuitable for evaluating or promoting robustness to real-world occlusion in DNNs.
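
The two kinds of occlusion augmentation contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the grayscale nested-list image format and the names `paste_occluder`, `rectangle_occluder`, and `coverage` are hypothetical, chosen only for illustration. An artificial occluder is modeled as a uniform rectangle with a fully opaque mask, while a natural occluder is modeled as a pixel patch with an irregular binary mask, as might come from segmenting an object out of a photograph.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed format: grayscale pixels, row-major


def paste_occluder(image: Image, occluder: Image, mask: Image,
                   top: int, left: int) -> Image:
    """Return a copy of `image` with `occluder` drawn wherever `mask` is 1.

    Occluder pixels that fall outside the image bounds are ignored.
    """
    out = [row[:] for row in image]
    height, width = len(image), len(image[0])
    for r, mask_row in enumerate(mask):
        for c, m in enumerate(mask_row):
            y, x = top + r, left + c
            if m and 0 <= y < height and 0 <= x < width:
                out[y][x] = occluder[r][c]
    return out


def rectangle_occluder(height: int, width: int, value: int) -> Tuple[Image, Image]:
    """Artificial occluder: a uniform rectangle with a fully opaque mask."""
    pixels = [[value] * width for _ in range(height)]
    mask = [[1] * width for _ in range(height)]
    return pixels, mask


def coverage(mask: Image, top: int, left: int, height: int, width: int) -> float:
    """Fraction of a height x width image hidden by the mask at (top, left)."""
    hidden = sum(
        1
        for r, mask_row in enumerate(mask)
        for c, m in enumerate(mask_row)
        if m and 0 <= top + r < height and 0 <= left + c < width
    )
    return hidden / (height * width)


# Hypothetical 4x4 "clear" image.
clear = [[0] * 4 for _ in range(4)]

# Artificial occlusion: a 2x2 rectangle placed at row 1, column 1.
rect_pixels, rect_mask = rectangle_occluder(2, 2, value=9)
artificial = paste_occluder(clear, rect_pixels, rect_mask, top=1, left=1)

# Natural-style occlusion: an irregular mask standing in for a segmented object.
object_pixels = [[5, 6], [7, 8]]
object_mask = [[1, 0], [1, 1]]
natural = paste_occluder(clear, object_pixels, object_mask, top=0, left=2)
```

Here `coverage(rect_mask, 1, 1, 4, 4)` gives the occluded fraction for the rectangle, which is one way an evaluation could match artificial and natural occluders on how much of the object they hide; whether the study controlled coverage this way is not stated in the abstract.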
