Scene Text Detection Using Attention with Depthwise Separable Convolutions for Mobile Applications

Abstract

Text detection in images and videos benefits many applications, since deep-learned features can effectively capture textual cues. However, many existing methods perform only moderately when detecting arbitrary-shaped text, mainly because of the constraints of their text representations: horizontal boxes, rotated rectangles, and quadrangles. This paper proposes a Deep-Learned Fusion Attention Network (DLFANet) that learns the prominent features of arbitrary-shaped text using a lightweight shared network, which is further fine-tuned by the proposed Feature Attention Module Enhancement (FAME). In addition, a Final Feature Module (FFM) with an Attention Detection Head (ADH) and a Geometry Aware Pixel Network (GAPN) is used to localize text effectively. Performance analysis on the standard datasets Total-Text, CTW1500, and ICDAR 2015 shows that the proposed method outperforms other state-of-the-art algorithms.
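The title attributes the network's mobile-friendliness to depthwise separable convolutions. As a rough illustration of why they suit lightweight backbones (this sketch is not from the paper; the layer sizes below are hypothetical), a standard convolution can be factored into a per-channel depthwise convolution followed by a 1×1 pointwise convolution, which cuts the parameter count sharply:

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Parameters of a standard 2D convolution (bias omitted):
    one k x k filter per (input channel, output channel) pair."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Parameters of a depthwise separable convolution (bias omitted):
    a k x k depthwise filter per input channel, then a 1x1 pointwise
    convolution that mixes channels."""
    depthwise = c_in * k * k      # spatial filtering, channel by channel
    pointwise = c_in * c_out      # 1x1 cross-channel mixing
    return depthwise + pointwise

# Hypothetical layer: 256 -> 256 channels with 3x3 kernels.
standard = conv_params(256, 256, 3)                 # 589,824 parameters
separable = depthwise_separable_params(256, 256, 3)  # 67,840 parameters
print(standard, separable, round(standard / separable, 1))
```

For this example layer the factorization is roughly 8.7× smaller, which is the kind of saving that makes a shared backbone viable on mobile hardware.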