How does it work?

On the mobile device

When the application opens, the user can take a photo of a text.
The user can then, optionally, select a region of interest on the photo displayed on the screen.
Next, the user selects the language for reading or translating the text.
The user then taps to send the photo to the server, which processes the image.
Once the server has finished, the extracted text is displayed on the phone screen, and the user can tap to listen to it.
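
To make the upload step concrete, here is a minimal sketch of how the client could package the photo, the language and the optional region of interest before sending them. The JSON layout, the field names (`image`, `language`, `region`) and the function `build_upload_payload` are assumptions made for illustration; the actual format used by the application is not described here.

```python
import base64
import json
from typing import Optional, Tuple

# Hypothetical region format: (left, top, width, height) in pixels.
Region = Tuple[int, int, int, int]


def build_upload_payload(image_bytes: bytes,
                         language: str,
                         region: Optional[Region] = None) -> str:
    """Return a JSON string holding the photo, language and optional region."""
    payload = {
        # Binary image data is base64-encoded so it can travel inside JSON.
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "language": language,
    }
    # The region is only included when the user actually selected one.
    if region is not None:
        left, top, width, height = region
        payload["region"] = {
            "left": left, "top": top, "width": width, "height": height,
        }
    return json.dumps(payload)
```

Leaving the region out entirely when none was selected lets the server tell "no selection" apart from an empty selection.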

On the server

The server receives the uploaded image, the language, and the coordinates of the region of interest, if the user selected one.
The server pre-processes the image.
It then extracts the text using Optical Character Recognition (OCR) and translates it if the user requested a translation.
The final text is then made available to the client (the phone).
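
The server-side steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the image is modelled as a grid of grayscale values, pre-processing is reduced to cropping plus a simple threshold (the real pre-processing is not specified here), and the OCR and translation engines are passed in as hypothetical callables (`ocr`, `translator`) rather than tied to any particular library.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Image = List[List[int]]             # rows of grayscale pixels (0-255)
Region = Tuple[int, int, int, int]  # hypothetical: left, top, width, height


@dataclass
class OcrRequest:
    image: Image
    language: str
    region: Optional[Region] = None  # None when the user selected no region
    translate: bool = False          # assumed flag for "translation required"


def crop(image: Image, region: Region) -> Image:
    """Keep only the pixels inside the region of interest."""
    left, top, width, height = region
    return [row[left:left + width] for row in image[top:top + height]]


def binarize(image: Image, threshold: int = 128) -> Image:
    """Stand-in pre-processing step: map each pixel to black or white."""
    return [[255 if pixel >= threshold else 0 for pixel in row]
            for row in image]


def process_request(request: OcrRequest,
                    ocr: Callable[[Image, str], str],
                    translator: Callable[[str, str], str]) -> str:
    """Run crop -> pre-processing -> OCR -> optional translation."""
    image = request.image
    if request.region is not None:
        image = crop(image, request.region)
    image = binarize(image)
    text = ocr(image, request.language)
    if request.translate:
        text = translator(text, request.language)
    return text
```

Passing the OCR and translation steps as parameters keeps the order of operations visible and makes the pipeline easy to exercise with stand-in functions.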

Links