MCL student Junting Zhang presented a paper at the 5th IEEE Global Conference on Signal and Information Processing (GlobalSIP 2017) in Montreal, Quebec, Canada, on November 15, 2017. Here’s an abstract of the paper:
Scene text detection is a critical prerequisite for many applications of vision-based intelligent robots. Existing methods detect text either by using local information only or by casting the task as a semantic segmentation problem. They tend to produce a large number of false alarms or fail to separate individual words accurately. In this work, we present a segmentation-aided text detection solution that predicts word-level bounding boxes using an end-to-end trainable deep convolutional neural network. It exploits the holistic view of a segmentation network to generate a text attention map (TAM) and uses the TAM to refine the convolutional features for the MultiBox detector through a multiplicative gating process. We conduct experiments on the large-scale and challenging COCO-Text dataset and demonstrate that the proposed method significantly outperforms state-of-the-art methods.
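To make the multiplicative gating step concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the function name `gate_features`, the use of a sigmoid to turn segmentation scores into a TAM with values in (0, 1), and the choice to apply the same map to every feature channel are all hypothetical, since the abstract does not specify them.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate_features(features, seg_logits):
    """Refine feature maps with a text attention map (TAM).

    features:   list of C channels, each an H x W nested list of floats.
    seg_logits: H x W nested list of raw segmentation scores
                (assumed input; higher means "more likely text").

    Returns (tam, gated), where gated has the same shape as features.
    """
    # Assumption: the TAM is the sigmoid of the segmentation scores.
    tam = [[sigmoid(v) for v in row] for row in seg_logits]

    # Multiplicative gating: scale every channel elementwise by the TAM,
    # so responses in non-text regions are suppressed.
    gated = [
        [[f * a for f, a in zip(f_row, a_row)]
         for f_row, a_row in zip(channel, tam)]
        for channel in features
    ]
    return tam, gated


if __name__ == "__main__":
    # Two 2x2 feature channels and one 2x2 map of segmentation scores.
    features = [
        [[1.0, 2.0], [3.0, 4.0]],
        [[-1.0, 0.5], [2.0, -2.0]],
    ]
    seg_logits = [[0.0, 10.0], [-10.0, 0.0]]
    tam, gated = gate_features(features, seg_logits)
    print(tam)
    print(gated)
```

In this sketch, positions with a strongly negative segmentation score have their features scaled toward zero, while positions with a strongly positive score pass through almost unchanged, which is the intuition behind using the TAM to reduce false alarms before detection.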