2014/15: Automatic information extraction

Our project concerns extracting meaningful information from weakly structured data or images collected from the web.
Agrobase-Logigram is an international company that develops global databases on pesticides and residues in food and feed. We currently work in 66 countries. We collect labels from ministry websites, and we also have pictures of containers from countries where the ministry does not publish the correct data.

All this information has to be collected, digitized and standardized so that it can then be compared across countries in an Oracle database.
We are located in Archamps (Haute-Savoie), but we are willing to come to Grenoble if necessary.

The goal of the proposed project is to develop data processing tools to analyze public labels from the Netherlands ministry. These labels are written in a standard way, but their content is not available in a proper structure; it can only be located through keywords.

Students will have to go through 2 main steps:
1st- Recognize keywords in the Netherlands labels and create a standardized file from them
2nd- Automatically match the newly created file against the one created previously, and raise an alert listing what has changed.
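
The two steps can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the expected solution: it assumes a label has already been converted to plain text and that each keyword appears on its own line as "Keyword: value". The keyword list, the sample label texts and the function names extract_fields and diff_fields are all hypothetical; the real Dutch keywords and label layout would come from the data provided by Agrobase.

```python
import json
import re

# Hypothetical keywords for illustration; the real Dutch keywords would be
# agreed with the Agrobase experts.
KEYWORDS = ["Product name", "Active substance", "Authorization number"]


def extract_fields(text, keywords=KEYWORDS):
    """Step 1: return {keyword: value} for lines shaped like 'Keyword: value'."""
    fields = {}
    for kw in keywords:
        pattern = rf"^\s*{re.escape(kw)}\s*:\s*(.+?)\s*$"
        match = re.search(pattern, text, re.IGNORECASE | re.MULTILINE)
        if match:
            fields[kw] = match.group(1)
    return fields


def diff_fields(old, new):
    """Step 2: return {keyword: (old_value, new_value)} for every change.

    A keyword missing from one of the two versions is reported with None.
    """
    changes = {}
    for key in sorted(set(old) | set(new)):
        if old.get(key) != new.get(key):
            changes[key] = (old.get(key), new.get(key))
    return changes


# Made-up label texts standing in for a past and a present version.
old_text = """Product name: Example 500
Active substance: substance A
Authorization number: 12345
"""
new_text = """Product name: Example 500
Active substance: substance A
Authorization number: 12399
"""

old_fields = extract_fields(old_text)
new_fields = extract_fields(new_text)

# The standardized file could simply be a JSON document.
standardized = json.dumps(new_fields, indent=2, sort_keys=True)

# Alert: print one line per changed field.
for key, (before, after) in diff_fields(old_fields, new_fields).items():
    print(f"ALERT: {key} changed from {before!r} to {after!r}")
```

Storing each label as a flat keyword-to-value mapping keeps the comparison between two versions a simple dictionary comparison; labels whose layout is less regular would need a more tolerant extraction step.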

During the course of this project, Agrobase will provide all necessary data as well as regular meetings with Agrobase experts so that students get feedback on their work.
Students will define, together with the Agrobase experts, the development tools they consider best suited to these tasks (Java, Python, etc.).

We are a French company and we speak French, but I prefer to give explanations in English.

B. Lemaire: Ms. Perez works as a freelancer from Crolles and can easily travel to the campus or elsewhere in Grenoble.

Contact : Patricia Perez, Agrobase-Logigram SARL, patricia.perez@agrobase-logigram.com

