# Offline handwritten mathematical expression regnition via stroke extraction

The project provide a proof-of-concept stroke extractor that can extract strokes from clean
bitmap images. The stroke extractor can be used to recognize offline handwritten
mathematical expression if a online recognizer is given. For example, when combined
with MyScript, the resulting offline recognition system was **ranked #3 in the offline
task in CROHME 2019.**

## Accuracy

Dataset|Correct|Up to 1 error|Up to 2 errors|Structural correct
---|---|---|---|---
CROHME 2014|58.22%|71.60%|75.15%|77.38%
CROHME 2016|65.65%|77.68%|82.56%|85.00%
CROHME 2019|65.05%|||

Although good accuracy is achieved on datasets from CROHME, the program
may produce poor results on real world images. For example, the procedure do not
work well on the following images:
- Image containing other objects. An image should contains exactly one formula and nothing else.
Ordinary text and grid lines are not allowed.
- Image with low contrast. The strokes may not be distinguished from background properly.
- Image with low resolution. The stroke extractor may not segment touching symbols correctly.
- Printed mathematical expressions. Serifs can distract the stroke extractor.

## Usage

In order to use the MyScript Cloud recognition engine, you need to [create a account](https://sso.myscript.com/register)
and create an application.

### Graphical interface

1. Run the JAR by double click or command like `java -jar mathocr-myscript.jar`
2. Choose `Image file` from the menu `Recognize`
3. Choose the image file
4. Click the button `Recognize` under stroke preview

### API

First add the jar file to classpath. If you are using Maven, add the following
to you `pom.xml`:

```xml
<dependency>
	<groupId>com.github.chungkwong</groupId>
	<artifactId>mathocr-myscript</artifactId>
	<version>1.0</version>
</dependency>
```

Then you can recognize images of mathematical expression by using code like:

```java
String applicationKey="your application key for MyScript";
String hmacKey="hmac key of your Myscript account";
String grammarId="an uploaded grammar of your Myscript account";
int dpi=96;
MyscriptRecognizer myscriptRecognizer=new MyscriptRecognizer(applicationKey,hmacKey,grammarId,dpi);
Extractor extractor=new Extractor(myscriptRecognizer);

File file=new File("Path to file to be recognized");
EncodedExpression expression=extractor.recognize(ImageIO.read(file));
String latexCode=expression.getCodes(new LatexFormat());
```