Teaching - DMKM

Opinion Mining - Master DMKM

Sentiment classification

This task consists of predicting whether a text (or a paragraph) is positive or negative. It makes it possible to run automated surveys on almost any subject (company reputation, politics...). This research has many profitable applications, and an extensive literature based on Machine Learning and Natural Language Processing is available. However, sentiment classification is not an easy task: the expression of opinion relies on complex, domain-dependent markers (structured expressions).

In this practical session you must:

  • Understand why sentiment classification is useful and why it is difficult
  • Develop your text classification skills (from bag-of-words conversion to performance evaluation)
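
To make the two ends of that pipeline concrete, here is a minimal Octave sketch of a bag-of-words conversion and of an accuracy computation. Everything in it is a toy assumption made for illustration: the vocabulary, the tokenized document, the labels (1 = positive, 0 = negative) and the variable names are hypothetical and are not taken from the session's data or software.

```octave
% Hypothetical toy vocabulary and tokenized document (not the session's data).
vocab = {'bad', 'film', 'good', 'plot'};
doc   = {'good', 'film', 'good', 'plot', 'boring'};

% pos(k) is the index of doc{k} in vocab, or 0 if the word is unknown.
[found, pos] = ismember(doc, vocab);

% Bag-of-words vector: x(i) counts the occurrences of vocab{i} in doc.
x = zeros(1, numel(vocab));
for k = pos(found)
  x(k) = x(k) + 1;
end

% Performance evaluation on hypothetical labels (1 = positive, 0 = negative).
y_true = [1 0 0 1];
y_pred = [1 0 1 1];
acc = mean(y_pred == y_true);   % fraction of correctly classified documents
```

Words missing from the vocabulary are simply ignored here; a real system may prefer to handle them differently.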

All data and software are available here; you have to develop your own system.

Useful tricks

To read the dictionary in Octave (to check the content of a document):

 d = textread('dico/dicospUnisp.txt', '%s');  % cell array with one dictionary word per cell
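
Once the dictionary is loaded, it can be used to list the words of a document. The sketch below assumes that a document is stored as a vector of word counts aligned with the dictionary, which may not match the actual format of the provided data. To stay runnable on its own, it replaces the dictionary file with a small hypothetical cell array of the same shape as `d`.

```octave
% Toy stand-in for the dictionary read from file (one word per cell).
d = {'bad'; 'film'; 'good'; 'plot'; 'very'};

% Hypothetical document: x(i) is the number of occurrences of word d{i}.
x = [0 1 2 0 1];

% Indices of the dictionary words that occur in the document.
idx = find(x > 0);

% Print each word that is present together with its count.
for i = idx
  printf('%s: %d\n', d{i}, x(i));
end

% Cell array holding only the words present in the document.
words = d(idx);
```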