VinnyOCR
VinnyOCR is a fast and small OCR library written in Swift. Powered by the Accelerate and Vision frameworks, it uses a neural network to recognize text. VinnyOCR was created and optimized to recognize short, one line long, alphanumeric codes (e.g. DI49CM). Only iOS and macOS are supported at the moment.
Inspiration
This project was inspired by SwiftOCR. The difference between the two projects is that VinnyOCR uses internal
Apple frameworks (Accelerate
and Vision
) for image manipulation and text location.
Features
-
Easy to use training utility
-
High accuracy
-
Fast
-
Small
Requirements
-
iOS 11.0+/macOS 10.13+
-
Swift 5.0+
Installation
CocoaPods
To integrate VinnyOCR into your Xcode project using CocoaPods, add it to your Podfile
:
pod 'VinnyOCR'
How it Works
-
Input image is thresholded (binarized)
-
Characters are located within the image and individually converted into float arrays
-
Each float array is fed through the neural network
How to Use
Using VinnyOCR is very simple.
import VinnyOCR
let modelUrl = Bundle.main.url(forResource: "VinnyOCR", withExtension: "ocr")!
let vinny = VinnyOCR(url: modelUrl)!
vinny.recognize(someImage) { (text) in
print(text)
}
VinnyOCR runs much faster when the build configuration is set to Release
.
Training
To train a model for use, clone this repository and open the VinnyOCR Trainer
Xcode project found in the Trainer
directory.
After starting the training app, do the following:
-
Select which fonts you want the model to train with by clicking the checkbox in the
Should Train?
column for each font. -
Select which backgrounds you want the model to train with by clicking
Choose Backgrounds
and selecting some image files. Sample training backgrounds are provided in theTrainer/Sample Backgrounds/
directory of this repository. -
Select where to output the model file by clicking
Choose Output Directory
. Once training is complete, the model file will be saved to this directory. -
Press
Start Training
. The default error threshold is set to a value that the network will most likely not reach. Instead, after starting training, wait 5-10 minutes and pressStop Training
- the generated model will work as long as the model error is less than 100. The model error can be seen next to the start/stop training button in<current model error>/<target model error>
format.
Notes
-
After you start training, if the model error does not come below 100 after a minute, restart the training process by by pressing
Stop Training
andStart Training
. This will cause a network with new weights to be generated. -
Training parameters can be modified by editing the values in
TrainingParameters.swift
.
Planned Improvements
-
Add training screenshots to README
-
Distort generated training images during training
-
Allow dynamic character set during training
Acknowledgements
-
SwiftOCR for inspiring this project.
-
Swift-AI for providing the code that is used by the neural network within this project.
License
VinnyOCR is released under the MIT license. See LICENSE for details.