Improvement of YOLOv8 algorithm through integration of Pyramid Vision Transformer architecture


Abstract

To address the poor detection accuracy of the YOLOv8s model on targets in complex backgrounds, this paper proposes an improved YOLOv8s model that incorporates the Pyramid Vision Transformer (PVT). Specifically, to strengthen feature extraction, PVT replaces the basic convolutional feature extraction blocks in the Backbone stage of YOLOv8s. The resulting structure lets the model process images at several resolution levels, so that it captures both fine detail and contextual information more effectively.
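To make the multi-resolution idea concrete, the following minimal sketch computes the feature-map shapes a four-stage PVT backbone would produce for a given input size. It is an illustration only: the abstract does not state which PVT variant is used or which stages feed the detection head. The per-stage patch sizes (4, 2, 2, 2) and channel widths (64, 128, 320, 512) are assumed from the published PVT-Small configuration, the choice of strides 8, 16, and 32 for detection mirrors the standard YOLOv8 layout, and the names `StageShape`, `pvt_pyramid_shapes`, and `detection_levels` are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageShape:
    """Shape of the feature map emitted by one backbone stage."""
    stage: int
    stride: int      # cumulative downsampling factor relative to the input
    height: int
    width: int
    channels: int

    @property
    def tokens(self) -> int:
        # Number of tokens the stage's attention layers operate on.
        return self.height * self.width


# ASSUMPTION: values taken from the PVT-Small configuration,
# not from the article itself.
PATCH_SIZES = (4, 2, 2, 2)
EMBED_DIMS = (64, 128, 320, 512)


def pvt_pyramid_shapes(height, width,
                       patch_sizes=PATCH_SIZES, embed_dims=EMBED_DIMS):
    """Return the per-stage output shapes of a PVT-style backbone."""
    shapes = []
    stride = 1
    for index, (patch, channels) in enumerate(
            zip(patch_sizes, embed_dims), start=1):
        # Each stage's patch embedding downsamples by its patch size.
        stride *= patch
        if height % stride or width % stride:
            raise ValueError(
                f"input {height}x{width} is not divisible by stride {stride}")
        shapes.append(StageShape(index, stride,
                                 height // stride, width // stride, channels))
    return shapes


def detection_levels(shapes, strides=(8, 16, 32)):
    """Pick the stages whose stride matches the assumed detection levels."""
    return [shape for shape in shapes if shape.stride in strides]
```

For a 640x640 input, `pvt_pyramid_shapes(640, 640)` yields maps of 160x160, 80x80, 40x40, and 20x20, and `detection_levels` keeps the last three. Under these assumptions the early stages retain spatial detail, while the later, coarser stages carry wider context.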
