Vision-Based Pick and Place Robots Using Faster R-CNN and EfficientNet for Real-Time Object Detection and Classification


Abstract

This paper describes a vision-based pick-and-place robotic system that uses Faster R-CNN for object detection and EfficientNetB0 for classification. The system employs an eye-in-hand 2D camera on a UR5 robotic arm to collect real-time RGB images, which are processed through the dual-model architecture to detect and classify objects from 36 categories. Training and validation were conducted using a publicly available fruit and vegetable dataset to simulate an industrial sorting application. The combined classification accuracy reaches 83%, with high F1-scores for most classes. This architecture provides visual recognition capabilities and real-time processing suitable for automated industrial settings.
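The dual-model flow described above (detect first, then classify each detected region) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the detector and classifier are toy stand-ins for Faster R-CNN and EfficientNetB0, and every name here (`Detection`, `detect_then_classify`, `min_score`, the labels "apple" and "carrot", the 0.5 threshold) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels
Image = List[List[int]]          # toy grayscale image: rows of pixel values


@dataclass
class Detection:
    box: Box
    score: float  # detector confidence in [0, 1]


@dataclass
class ClassifiedObject:
    box: Box
    label: str
    detection_score: float


def crop(image: Image, box: Box) -> Image:
    """Cut the region inside `box` out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def detect_then_classify(
    image: Image,
    detector: Callable[[Image], List[Detection]],
    classifier: Callable[[Image], str],
    min_score: float = 0.5,  # hypothetical confidence threshold
) -> List[ClassifiedObject]:
    """Stage 1: propose boxes. Stage 2: classify each confident crop."""
    results = []
    for det in detector(image):
        if det.score < min_score:
            continue  # drop low-confidence proposals before classifying
        label = classifier(crop(image, det.box))
        results.append(ClassifiedObject(det.box, label, det.score))
    return results


# Toy stand-ins so the sketch runs without any trained model.
def toy_detector(image: Image) -> List[Detection]:
    return [Detection((0, 0, 2, 2), 0.9), Detection((2, 2, 4, 4), 0.3)]


def toy_classifier(patch: Image) -> str:
    pixels = [p for row in patch for p in row]
    mean = sum(pixels) / max(1, len(pixels))
    return "apple" if mean > 127 else "carrot"  # hypothetical labels


if __name__ == "__main__":
    # 4x4 image with a bright 2x2 patch in the top-left corner.
    image = [[200, 200, 0, 0], [200, 200, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
    for obj in detect_then_classify(image, toy_detector, toy_classifier):
        print(obj)
```

In a real system the two callables would wrap the trained detection and classification networks, and the resulting box and label would be handed to the robot's motion planner; those parts are outside the scope of this sketch.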
