The main task is to find the result of an equation based on a video sequence.

The equation will be indicated by a moving robot.

•Several mathematical operators (multiplication, division, subtraction, addition, equals) are placed on the table.

•Several handwritten digits (0 to 8) are placed on the table.

•From an initial location somewhere on the table, the robot moves around the table.

Each time the robot passes above an operator or a digit, the symbol located below the robot is added to the equation.

For example, the sequence “2” → “+” → “3” → “=” becomes “2+3=”.
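The rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's code: the function `symbols_passed`, the `radius` threshold and all coordinates are hypothetical.

```python
from math import hypot

def symbols_passed(trajectory, objects, radius=20.0):
    """Return the symbols the robot passes over, in order.

    trajectory: list of (x, y) robot centres, one per frame.
    objects: list of (symbol, (x, y)) for each digit/operator on the table.
    radius: assumed distance (pixels) under which the robot is "above" a symbol.
    """
    sequence = []
    current = None  # index of the object currently under the robot, if any
    for rx, ry in trajectory:
        hit = None
        for i, (_, (ox, oy)) in enumerate(objects):
            if hypot(rx - ox, ry - oy) <= radius:
                hit = i
                break
        # Append only when the robot arrives on a new object, so a robot
        # staying over one symbol for several frames adds it once.
        if hit is not None and hit != current:
            sequence.append(objects[hit][0])
        current = hit
    return sequence

objects = [("2", (100, 100)), ("+", (200, 100)), ("3", (300, 100)), ("=", (400, 100))]
trajectory = [(50, 100), (95, 100), (105, 100), (150, 100), (200, 105),
              (250, 100), (300, 95), (350, 100), (400, 100)]
equation = "".join(symbols_passed(trajectory, objects))
print(equation)  # 2+3=
```

Tracking the object currently under the robot is one simple way to avoid adding the same symbol on consecutive frames.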

The goal is, given a new video sequence, to retrieve the formula and its associated answer.
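Once the formula string is retrieved, its answer can be computed. The sketch below is one possible approach, assuming standard operator precedence and assuming that adjacent digits form a multi-digit number; the function name `solve` and the `×`/`÷` replacements are hypothetical.

```python
import ast
import operator

# Binary operators allowed in a formula (mapping chosen for this sketch).
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def solve(formula):
    """Evaluate a formula such as "2+3=" and return its numeric result."""
    expr = formula.rstrip("=").replace("×", "*").replace("÷", "/")

    def _eval(node):
        # Only plain numbers and the four binary operators are accepted.
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError(f"unsupported formula: {formula!r}")

    return _eval(ast.parse(expr, mode="eval").body)

print(solve("2+3="))      # 5
print(solve("3/2+7*2="))  # 15.5
```

Walking the parsed tree instead of calling `eval` keeps the evaluation restricted to arithmetic.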

To test the pipeline, three different scenarios will be presented:

•SC1: All operators and digits are in an upright (vertical) orientation.

•SC2: Both operators and digits have random orientations.

•SC3 (Bonus): Both operators and digits are black. Orientations are random.

The input of the algorithm is an “.avi” video sequence, recorded at 2 FPS. The output should be a video sequence with the same frame rate, duration and resolution as the input video. Each frame (e.g., the frame at time t) of the output video should contain the following information, printed on that frame:

•The current state of the formula at time t.

•The trajectory of the robot from start to time t.
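The per-frame content listed above can be sketched as a small generator. This is a hypothetical illustration: `frame_overlays` and the `events` mapping are assumed names, and the actual drawing and video writing (for example with OpenCV's `cv2.putText`, `cv2.polylines` and `cv2.VideoWriter`) is left out.

```python
def frame_overlays(trajectory, events):
    """Yield (formula_so_far, trajectory_so_far) for each frame index t.

    trajectory: list of (x, y) robot centres, one per frame.
    events: dict mapping a frame index to the symbol added at that frame.
    """
    formula = ""
    for t in range(len(trajectory)):
        formula += events.get(t, "")
        # The slice copies the path from the first frame up to and including t.
        yield formula, trajectory[: t + 1]

trajectory = [(0, 0), (10, 0), (20, 5), (30, 5)]
events = {1: "2", 3: "+"}
overlays = list(frame_overlays(trajectory, events))
print(overlays[0])  # ('', [(0, 0)])
print(overlays[3])  # ('2+', [(0, 0), (10, 0), (20, 5), (30, 5)])
```

Producing one overlay per input frame keeps the output at the same frame count as the input.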

Project Overview

IAPR Project –2020

Evan Béal, Maxime Délitroz & Eric Bergkvist

Global architecture

●Object-Oriented Programming:

○Object class: Robot, Number and Operator derived classes

○Equation class
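A possible shape for this class hierarchy is sketched below. Only the class names Robot, Number, Operator and Equation come from the slide; the base class is called `TableObject` here, and every field and method is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class TableObject:
    """Base class for anything segmented on the table (hypothetical fields)."""
    centre: Tuple[float, float]

@dataclass
class Robot(TableObject):
    trajectory: List[Tuple[float, float]] = field(default_factory=list)

    def move_to(self, centre):
        # Update the position and remember it for the trajectory overlay.
        self.centre = centre
        self.trajectory.append(centre)

@dataclass
class Number(TableObject):
    value: int = 0

    def symbol(self):
        return str(self.value)

@dataclass
class Operator(TableObject):
    kind: str = "+"

    def symbol(self):
        return self.kind

@dataclass
class Equation:
    symbols: List[str] = field(default_factory=list)

    def add(self, obj):
        self.symbols.append(obj.symbol())

    def __str__(self):
        return "".join(self.symbols)

robot = Robot((0, 0))
robot.move_to((10, 0))

eq = Equation()
for obj in (Number((100, 100), 2), Operator((200, 100), "+"), Number((300, 100), 3)):
    eq.add(obj)
print(eq)  # 2+3
```

Giving Number and Operator the same `symbol()` method lets the Equation treat both uniformly.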


Workflow

●Segmentation

○Objects (num & op) on initial frame

○Robot in all frames

○With and without colored operators

●Description

●Classification

○Numbers

○Operators

○Robot

Operator Segmentation & Classification

Scenario 1 & 2

Scenario 3

Generalized Hough transform

●Number of sub-parts

●Fourier descriptors

First assessment

Generalized Hough transform

Second assessment

Number of sub-parts

Elongation
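The two features of the second assessment can be sketched as follows. This is a minimal NumPy illustration, assuming a binary mask per segmented operator; the definitions used here (8-connected components, and elongation as the ratio of principal-axis standard deviations) are assumptions, and the slide does not state which definitions the project used.

```python
from collections import deque
import numpy as np

def count_subparts(mask):
    """Count 8-connected foreground components in a binary mask."""
    mask = np.asarray(mask, dtype=bool)
    seen = np.zeros_like(mask)
    h, w = mask.shape
    count = 0
    for r in range(h):
        for c in range(w):
            if mask[r, c] and not seen[r, c]:
                count += 1
                seen[r, c] = True
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one component
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and mask[ny, nx] and not seen[ny, nx]):
                                seen[ny, nx] = True
                                queue.append((ny, nx))
    return count

def elongation(mask):
    """Ratio of the major to the minor principal-axis standard deviation."""
    ys, xs = np.nonzero(mask)
    cov = np.cov(np.stack([xs, ys]).astype(float))
    small, large = np.linalg.eigvalsh(cov)  # eigenvalues in ascending order
    return float(np.sqrt(large / max(small, 1e-12)))

# Synthetic 15x15 operator masks (illustrative shapes only).
minus = np.zeros((15, 15), dtype=bool); minus[6:9, 2:13] = True
plus = minus.copy(); plus[2:13, 6:9] = True
equals = np.zeros((15, 15), dtype=bool); equals[4:6, 2:13] = True; equals[9:11, 2:13] = True
divide = minus.copy(); divide[1:4, 6:9] = True; divide[11:14, 6:9] = True

print(count_subparts(divide), count_subparts(equals))  # 3 2
print(elongation(minus) > elongation(plus))            # True
```

Both features are unchanged by rotation of the shape, up to discretisation, which fits scenarios with random orientations.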

Third assessment

Fourier descriptors
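Fourier descriptors can be made invariant to translation, rotation, scale and starting point, as the sketch below illustrates. It assumes an ordered contour is already available; the function name, the number of coefficients and the normalisation by the first harmonic are choices made for this example, not necessarily those of the project.

```python
import numpy as np

def fourier_descriptors(contour, n_coeffs=8):
    """Rotation-, scale- and translation-invariant contour descriptors.

    contour: (N, 2) array of ordered boundary points (x, y).
    """
    contour = np.asarray(contour, dtype=float)
    z = contour[:, 0] + 1j * contour[:, 1]   # boundary as a complex signal
    coeffs = np.fft.fft(z)
    # coeffs[0] holds the centroid, so skipping it removes translation.
    # Magnitudes discard rotation and the choice of starting point.
    mags = np.abs(coeffs[1:n_coeffs + 1])
    # Dividing by the first harmonic removes scale.
    return mags / mags[0]

# Synthetic closed contour: a circle modulated by a three-lobed term.
t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
r = 1.0 + 0.3 * np.cos(3.0 * t)
shape = np.stack([r * np.cos(t), r * np.sin(t)], axis=1)

# Same contour rotated, scaled, translated and started at another point.
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
moved = np.roll(shape @ rot.T * 2.5 + np.array([40.0, -15.0]), 5, axis=0)

fd_a = fourier_descriptors(shape)
fd_b = fourier_descriptors(moved)
print(np.allclose(fd_a, fd_b))  # True
```

Because only coefficient magnitudes are kept, two operators that differ only by a rotation would receive the same descriptors.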

Reference image

Target image

Accumulator array

Detection
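The four steps (reference image, target image, accumulator array, detection) can be illustrated with a simplified generalized Hough transform. The sketch below handles translation only: it does not search over rotation or scale, which randomly oriented symbols would additionally require. All names and the number of orientation bins are assumptions.

```python
import numpy as np

N_BINS = 8  # number of gradient-orientation bins (assumed value)

def edge_points(image):
    """Return edge coordinates and their quantised gradient orientation."""
    gy, gx = np.gradient(image.astype(float))
    ys, xs = np.nonzero(np.hypot(gx, gy) > 0)
    angles = np.arctan2(gy[ys, xs], gx[ys, xs])  # in [-pi, pi]
    bins = np.floor((angles + np.pi) / (2 * np.pi) * N_BINS).astype(int) % N_BINS
    return ys, xs, bins

def build_r_table(reference):
    """Map each orientation bin to displacements from edge point to reference point."""
    ys, xs, bins = edge_points(reference)
    ref_y, ref_x = int(round(float(ys.mean()))), int(round(float(xs.mean())))
    table = {b: [] for b in range(N_BINS)}
    for y, x, b in zip(ys, xs, bins):
        table[int(b)].append((ref_y - int(y), ref_x - int(x)))
    return table, (ref_y, ref_x)

def vote(target, table):
    """Fill the accumulator: each target edge point votes for candidate centres."""
    acc = np.zeros(target.shape, dtype=int)
    ys, xs, bins = edge_points(target)
    for y, x, b in zip(ys, xs, bins):
        for dy, dx in table[int(b)]:
            vy, vx = int(y) + dy, int(x) + dx
            if 0 <= vy < acc.shape[0] and 0 <= vx < acc.shape[1]:
                acc[vy, vx] += 1
    return acc

# Reference image: a plus sign. Target image: the same sign shifted by (12, 9).
reference = np.zeros((20, 20))
reference[8:12, 4:16] = 1
reference[4:16, 8:12] = 1
target = np.zeros((40, 40))
target[12:32, 9:29] = reference

table, ref_point = build_r_table(reference)
accumulator = vote(target, table)
detection = tuple(int(v) for v in np.unravel_index(accumulator.argmax(), accumulator.shape))
expected = (ref_point[0] + 12, ref_point[1] + 9)
print(detection == expected)  # True
```

The accumulator peak marks where the reference point of the shape lies in the target image.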


Number Classification

"TI-pooling: transformation-invariant pooling for feature learning in Convolutional Neural Networks", D. Laptev, N. Savinov, J.M. Buhmann, M. Pollefeys, CVPR 2016.

Network architecture: TI pooling

Results on MNIST test set: 2.65% error rate

Segmented number processing

Randomly selected samples with random rotations applied
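The idea behind TI-pooling is to apply the same feature extractor to several transformed copies of the input and keep the element-wise maximum. The sketch below illustrates this with NumPy only. It is not the network used in the project: the random linear layer stands in for trained convolutional branches, and the rotation set is restricted to multiples of 90 degrees for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(28 * 28, 16))  # shared weights (random stand-in for a trained branch)

def branch_features(image):
    """Feature extractor applied with the same weights to every transformed copy."""
    return np.maximum(image.ravel() @ W, 0.0)  # linear layer + ReLU

def ti_pool(image):
    """Element-wise maximum of the features over a set of rotations."""
    rotations = [np.rot90(image, k) for k in range(4)]  # 0, 90, 180, 270 degrees
    return np.max([branch_features(r) for r in rotations], axis=0)

digit = rng.random((28, 28))
print(np.allclose(ti_pool(digit), ti_pool(np.rot90(digit))))  # True
```

Rotating the input by 90 degrees only permutes the set of rotated copies, so the pooled features stay the same.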


Results on the most complex scenario