Vision-Based Pick and Place Robots Using Faster R-CNN and EfficientNet for Real-Time Object Detection and Classification
Abstract
This paper describes a vision-based pick-and-place robotic system that uses Faster R-CNN for object detection and EfficientNetB0 for classification. The system employs an eye-in-hand 2D camera on a UR5 robotic arm to collect real-time RGB images, which are processed through the dual-model architecture to detect and classify objects from 36 categories. Training and validation were conducted using a publicly available fruit and vegetable dataset to simulate an industrial sorting application. The combined classification accuracy reaches 83%, with high F1-scores for most classes. This architecture provides visual recognition capabilities and real-time processing suitable for automated industrial settings.
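As a minimal illustrative sketch, and not the authors' implementation, the two-stage flow described in the abstract (detect objects, crop each detection, classify the crop) could be wired together as follows. The names `Detection`, `LabeledObject`, `crop`, `detect_then_classify`, the `score_threshold` parameter, and the stub models with their labels are all hypothetical. In a real system the `detector` callable would wrap a trained Faster R-CNN and the `classifier` callable a trained EfficientNetB0; here they are simple stand-ins so the control flow can be run without any deep learning library.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# An image is modelled as rows of RGB pixels; a real system would use an array type.
Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max edges exclusive


@dataclass
class Detection:
    box: Box
    score: float  # detector confidence in [0, 1]


@dataclass
class LabeledObject:
    box: Box
    label: str
    detection_score: float


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image inside `box`, clamped to the image bounds."""
    height = len(image)
    width = len(image[0]) if height else 0
    x0, y0, x1, y1 = box
    x0, x1 = max(0, x0), min(width, x1)
    y0, y1 = max(0, y0), min(height, y1)
    return [row[x0:x1] for row in image[y0:y1]]


def detect_then_classify(
    image: Image,
    detector: Callable[[Image], List[Detection]],
    classifier: Callable[[Image], str],
    score_threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> List[LabeledObject]:
    """Stage 1: localize objects. Stage 2: classify each cropped region."""
    results = []
    for det in detector(image):
        if det.score < score_threshold:
            continue  # drop low-confidence detections
        patch = crop(image, det.box)
        if not patch or not patch[0]:
            continue  # box fell entirely outside the image
        results.append(LabeledObject(det.box, classifier(patch), det.score))
    return results


# Hypothetical stand-ins for the trained models, used only to exercise the flow.
def stub_detector(image: Image) -> List[Detection]:
    return [Detection((0, 0, 2, 2), 0.9), Detection((2, 0, 4, 2), 0.3)]


def stub_classifier(patch: Image) -> str:
    r, g, _ = patch[0][0]
    return "red_item" if r >= g else "green_item"


# A 2x4 test image: left half red, right half green.
RED, GREEN = (255, 0, 0), (0, 255, 0)
demo_image = [[RED, RED, GREEN, GREEN] for _ in range(2)]
demo_results = detect_then_classify(demo_image, stub_detector, stub_classifier)
```

Keeping the detector and classifier as plain callables means either stage can be swapped or tested in isolation. The coordinates in each returned `box` are what a pick-and-place controller would go on to convert into a grasp target, a step this sketch does not cover.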