Advancements in Computer Vision: Exploring Deep Learning and Transformer-Based Models for Enhanced Visual Perception

Yajnavalkya Bandyopadhyay

Abstract

Recent advancements in computer vision have significantly transformed various industries, from healthcare to autonomous driving. This paper presents a comprehensive survey of these developments, with a particular focus on deep learning and transformer-based models. We explore the fundamental concepts and methodologies, including feature extraction, classification, segmentation, and object detection. The paper also highlights the evolution of computer vision frameworks and tools, emphasising the contributions of convolutional neural networks (CNNs), generative models, and transfer learning. Additionally, we discuss emerging trends such as vision transformers and multimodal learning, while acknowledging persistent challenges like data scarcity and real-time processing. Through an in-depth analysis, we aim to provide scholars and professionals with a detailed understanding of the current state and future prospects of computer vision. The paper further examines specific applications in healthcare, autonomous cars, retail, agriculture, and security, illustrating how computer vision technologies are redefining established practices and enhancing decision-making capabilities.

Version published to 10.31219/osf.io/rzsxj on OSF Preprints
Nov 1, 2024

Related articles

Human Activity Recognition in the Deep Learning Era: Different Modalities, Recent Advances in Applications, and Emerging Techniques

This article has 2 authors:
1. Mohammad Osman Khan
2. Imran Khan Apu
This article has no evaluations
Latest version Dec 10, 2025

A Comprehensive Comparative Analysis of Convolutional Neural Network Architectures for Image Classification and Object Detection Tasks

This article has 3 authors:
1. Fahim Al Islam
2. Saif Hossain
3. Monir Hosen
This article has no evaluations
Latest version Feb 3, 2026

Lightweight Deep Learning Models for Face Mask Detection in Real-Time Edge Environments: A Review and Future Research Directions

This article has 1 author:
1. Saim Rasheed
This article has no evaluations
Latest version Jan 7, 2026
