Deep Multi-Scale Context Aware Feature Aggregation for Curved Scene Text Detection

You are here

IEEE Transactions on Multimedia

Top Reasons to Join SPS Today!

1. IEEE Signal Processing Magazine
2. Signal Processing Digital Library*
3. Inside Signal Processing Newsletter
4. SPS Resource Center
5. Career advancement & recognition
6. Discounts on conferences and publications
7. Professional networking
8. Communities for students, young professionals, and women
9. Volunteer opportunities
10. Coming soon! PDH/CEU credits
Click here to learn more.

Deep Multi-Scale Context Aware Feature Aggregation for Curved Scene Text Detection

By: 
Pengwen Dai; Hua Zhang; Xiaochun Cao

Scene text plays a significant role in image and video understanding, which has made great progress in recent years. Most existing models on text detection in the wild have the assumption that all the texts are surrounded by a rotated rectangle or quadrangle. While there also exist lots of curved texts in the wild, which would not be bounded by a regular bounding box. In this paper, we develop a novel architecture to localize the text regions, which can deal with curved-shape scene texts. Specifically, we first design a text-related feature enhancement module by incorporating the prior knowledge of the text shape to enhance the feature representations. After that, based on the enhanced features, we employ a region proposal network to generate the candidate boxes of scene texts. For each text candidate, a pyramid region-of-interest pooling attention module is utilized to extract the fixed-size features. Finally, we exploit the box-aware contextbased text segmentation module and box refinement network to obtain the location of scene text. Experiments are conducted on four challenging benchmarks CTW1500, totalTEXT, ICDAR-2015 and MLT, and the experimental results have demonstrated the superiority of our model. 

SPS ON X

IEEE SPS Educational Resources

IEEE SPS Resource Center

IEEE SPS YouTube Channel