KISS: Keeping It Simple for Scene Text Recognition

November 19, 2019 · Entered Twilight · 🏛 arXiv.org

🌅 TWILIGHT: Old Age
Predates the code-sharing era: a pioneer of its time

"Last commit was 6.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: .gitignore, LICENSE, README.md, __init__.py, commands, common, config.cfg, config, data_server.py, datasets, evaluate.py, evaluation, functions, image_manipulation, insights, iou, optimizers, requirements.txt, resnet, run_eval_on_all_datasets.py, tensorboard_filter, text, train_text_recognition.py, train_utils, transformer, updaters

Authors: Christian Bartz, Joseph Bethge, Haojin Yang, Christoph Meinel
arXiv ID: 1911.08400
Category: cs.CV (Computer Vision)
Citations: 18
Venue: arXiv.org
Repository: https://github.com/Bartzi/kiss (⭐ 110)
Last Checked: 2 months ago
Abstract
Over the past few years, several new methods for scene text recognition have been proposed. Most of these methods propose novel building blocks for neural networks. These novel building blocks are specially tailored for the task of scene text recognition and can thus hardly be used in any other tasks. In this paper, we introduce a new model for scene text recognition that only consists of off-the-shelf building blocks for neural networks. Our model (KISS) consists of two ResNet based feature extractors, a spatial transformer, and a transformer. We train our model only on publicly available, synthetic training data and evaluate it on a range of scene text recognition benchmarks, where we reach state-of-the-art or competitive performance, although our model does not use methods like 2D-attention, or image rectification.
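Of the off-the-shelf blocks named in the abstract, the spatial transformer is the one that resamples image regions. The following is a minimal pure-Python sketch of the generic technique (an affine sampling grid followed by bilinear interpolation). It is not taken from the KISS repository. The function names `affine_grid` and `bilinear_sample`, the 2x3 `theta` layout, and the corner-aligned coordinates in [-1, 1] are assumptions made for illustration.

```python
import math


def affine_grid(theta, out_h, out_w):
    """Map each output pixel to a source coordinate in [-1, 1].

    theta is a hypothetical 2x3 affine matrix ((a, b, tx), (c, d, ty)).
    """
    (a, b, tx), (c, d, ty) = theta
    grid = []
    for i in range(out_h):
        # Normalised target y; a single row is placed at the centre.
        yt = 0.0 if out_h == 1 else -1.0 + 2.0 * i / (out_h - 1)
        row = []
        for j in range(out_w):
            xt = 0.0 if out_w == 1 else -1.0 + 2.0 * j / (out_w - 1)
            row.append((a * xt + b * yt + tx, c * xt + d * yt + ty))
        grid.append(row)
    return grid


def bilinear_sample(image, grid):
    """Sample a 2D list-of-lists image at the grid's source coordinates."""
    h, w = len(image), len(image[0])
    out = []
    for row in grid:
        out_row = []
        for xs, ys in row:
            # Convert normalised coordinates to pixel coordinates.
            px = (xs + 1.0) * (w - 1) / 2.0
            py = (ys + 1.0) * (h - 1) / 2.0
            x0, y0 = math.floor(px), math.floor(py)
            fx, fy = px - x0, py - y0
            value = 0.0
            # Blend the four neighbours; out-of-range pixels count as zero.
            for yy, wy in ((y0, 1.0 - fy), (y0 + 1, fy)):
                for xx, wx in ((x0, 1.0 - fx), (x0 + 1, fx)):
                    if 0 <= yy < h and 0 <= xx < w:
                        value += wy * wx * image[yy][xx]
            out_row.append(value)
        out.append(out_row)
    return out


# Usage: crop the left half of a toy 2x5 "image".
image = [[0, 1, 2, 3, 4], [10, 11, 12, 13, 14]]
left_half = ((0.5, 0.0, -0.5), (0.0, 1.0, 0.0))
crop = bilinear_sample(image, affine_grid(left_half, 2, 3))
```

In a learned model a network would predict `theta`. Here it is fixed by hand to show how a single affine matrix selects a region of the input.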
