Sentiment analysis

Sentiment analysis (aka sentiment detection or opinion mining) is an automated technique in natural language processing that tries to infer people’s sentiments as expressed in text documents.

The following summary is not intended to be comprehensive; it is primarily meant to introduce the topic and raise awareness of its complexity. It excerpts some aspects from the much more detailed description in: Liu, Bing (2010) Sentiment Analysis and Subjectivity. In: Handbook of Natural Language Processing, Second Edition (eds. N. Indurkhya and F. J. Damerau)

Example of an "opinionated" document (i.e. a document that expresses the opinion of its author):

“(1) I bought an iPhone a few days ago. (2) It was such a nice phone. (3) The touch screen was really cool. (4) The voice quality was clear too. (5) Although the battery life was not long, that is ok for me. (6) However, my mother was mad with me as I did not tell her before I bought it. (7) She also thought the phone was too expensive, and wanted me to return it to the shop. … ”

As can easily be seen, several opinions are expressed in this review. Sentences (2), (3) and (4) express positive opinions, while sentences (5), (6) and (7) express negative ones. The opinions all have targets or objects on which the sentiments are expressed. The opinion in sentence (2) is on the iPhone as a whole, and the opinions in sentences (3), (4) and (5) are on the “touch screen”, “voice quality” and “battery life” respectively. The opinion in sentence (7) is on the price of the iPhone, but the opinion/emotion in sentence (6) is on “me”, not the iPhone. This distinction can be important, since users are often interested in opinions on certain targets or objects, but not on all of them. Finally, the source or holder of the opinions in sentences (2), (3), (4) and (5) is the author of the review (“I”), but in sentences (6) and (7) it is “my mother”. Good sentiment analysis would have to distinguish all these cases. However, what might seem quite intuitive and easy for a human reader can be an arduous task for a machine.
Nevertheless, automated sentiment analysis is today a widely used and rapidly developing technology.

Basic terminology (following Liu 2010):

  • object: iPhone
  • component (can be an object in its turn): battery
  • feature (or topic): battery life
  • general opinion: “I like iPhone”
  • specific opinion: “The touch screen of iPhone is really cool”
  • explicit feature: “The battery life of this phone is too short”
  • implicit feature: “This phone is too large”
  • feature indicator: “large” is not a synonym of “size”; it merely indicates the feature size.
  • opinion holder or source: the holder of an opinion
  • orientation of an opinion on a feature: positive, negative or neutral.
  • explicit opinion: "The phone is great"
  • implicit opinion: "The phone broke in two days"
  • strength of opinion can be scaled: e.g. strong (“This phone is a piece of junk”), weak (“I think this phone is fine”).
  • direct opinion: a quintuple (\(o_j, f_{jk}, oo_{ijkl}, h_i, t_l\)), where \(o_j\) is an object, \(f_{jk}\) is a feature of the object \(o_j\), \(oo_{ijkl}\) is the orientation or polarity of the opinion on feature \(f_{jk}\) of object \(o_j\), \(h_i\) is the opinion holder and \(t_l\) is the time when the opinion is expressed by \(h_i\). The opinion orientation \(oo_{ijkl}\) can be positive, negative or neutral.
  • comparative opinion: a relation of similarities or differences between two or more objects, and/or object preferences of the opinion holder based on some of the shared features of the objects; it is usually expressed using the comparative or superlative form of an adjective or adverb.

Objective of mining direct opinions: Given an opinionated document \(d\),

  • 1. discover all opinion quintuples (\(o_j, f_{jk}, oo_{ijkl}, h_i, t_l\)) in \(d\), and
  • 2. identify all the synonyms (\(W_{jk}\)) and feature indicators \(I_{jk}\) of each feature \(f_{jk}\) in \(d\).
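The quintuple lends itself directly to a small data structure. Below is a minimal sketch in Python; the field names and the example values are illustrative, not taken from Liu's paper:

```python
from dataclasses import dataclass

@dataclass
class Opinion:
    """A direct opinion quintuple (o_j, f_jk, oo_ijkl, h_i, t_l)."""
    obj: str          # o_j: the object the opinion is about
    feature: str      # f_jk: the feature of the object being evaluated
    orientation: str  # oo_ijkl: "positive", "negative" or "neutral"
    holder: str       # h_i: who expresses the opinion
    time: str         # t_l: when the opinion was expressed

# Sentence (5) of the example review, encoded as a quintuple
# (the date is a placeholder):
op = Opinion("iPhone", "battery life", "negative", "author", "2010-01-01")
print(op)
```

Extracting a list of such tuples from a document is exactly the mining objective stated above; the synonym sets \(W_{jk}\) and indicator sets \(I_{jk}\) would map surface words onto the `feature` field.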

A feature-based summary of opinions:

(Figure: feature-based opinion summary)

A buzz summary shows the frequency of mentions of different competing objects. It can tell the popularity of different objects (products or brands) in a market place.

Trend tracking: needs the time of an opinion to be recorded; indicates how things change over time

The tasks of identifying opinion holders, object names and times of posting are three extraction tasks, collectively known as Named Entity Recognition (NER).

Document-Level Sentiment Classification

Most existing techniques for document-level sentiment classification are based on supervised learning, though some work focuses on unsupervised methods.

Supervised methods

Supervised learning methods use training and testing data, for example taken from product reviews, as references according to which a textual expression is classified with classifiers such as Naive Bayes or Support Vector Machines. The selection of classified terms follows different principles. Sometimes the frequency of certain terms is considered relevant. Other approaches focus on adjectives as important indicators of subjectivity and opinion. Often, so-called opinion words are extracted, like beautiful, wonderful, good and amazing as indicators for positive sentiments, and bad, poor or terrible as negative opinion words. Apart from individual words, there are also opinion phrases and idioms. Negations also receive special attention, since their appearance can change the opinion orientation: in the sentence “I don’t like this camera”, like would be positive, but don’t flips its orientation.
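As an illustration of the supervised approach, here is a toy Naive Bayes classifier over bag-of-words counts; the four training reviews are invented, and a real system would use thousands of labelled documents:

```python
import math
from collections import Counter

# Hypothetical labelled training reviews (invented for illustration).
train = [
    ("the screen is great and the battery is amazing", "pos"),
    ("wonderful phone with a beautiful display", "pos"),
    ("terrible battery and poor sound quality", "neg"),
    ("the camera is bad and the screen broke", "neg"),
]

# Count word frequencies per class (the bag-of-words features).
counts = {"pos": Counter(), "neg": Counter()}
docs = Counter()
for text, label in train:
    docs[label] += 1
    counts[label].update(text.split())

vocab = set(w for c in counts.values() for w in c)

def classify(text):
    """Naive Bayes with add-one (Laplace) smoothing, in log space."""
    best, best_lp = None, -math.inf
    for label in counts:
        lp = math.log(docs[label] / sum(docs.values()))  # class prior
        total = sum(counts[label].values())
        for w in text.split():
            lp += math.log((counts[label][w] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

print(classify("amazing screen"))  # → pos (on this toy corpus)
```

A production system would add tokenization, negation handling and feature selection, but the probabilistic core is the same.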

However, classification is highly sensitive to the domain from which the training data is extracted. A classifier trained on opinionated texts from one domain often performs poorly when it is applied or tested on opinionated texts from another domain. For example, the adjective “unpredictable” may have a negative orientation in a car review (e.g., “unpredictable steering”), but a positive orientation in a movie review (e.g., “unpredictable plot”). Sometimes the same word can even have different orientations in different contexts within the same domain. In the digital camera domain, for example, the word “long” expresses different opinions in the two sentences “The battery life is long” (positive) and “The time taken to focus is long” (negative).

Unsupervised methods

Unsupervised learning techniques focus on opinion words and phrases. Often they proceed in three steps:

First, pairs of consecutive words are extracted if their part-of-speech tags conform to certain patterns, e.g. an adjective followed by a noun, or an adverb followed by an adjective. One member of the pair carries the opinion, while the other provides context.

In the second step the orientation of the extracted phrase is estimated using the pointwise mutual information measure (PMI) given below:

\[PMI(term_1, term_2) = \log_2\left(\frac{Pr(term_1 \land term_2)}{Pr(term_1)*Pr(term_2)}\right)\]

with \(Pr(term_1 \land term_2)\) indicating the co-occurrence probability of \(term_1\) and \(term_2\) and \(Pr(term_1)*Pr(term_2)\) giving the probability that the two terms co-occur if they are statistically independent. The ratio is thus a measure of the degree of statistical dependence between them. The log of this ratio is the amount of information that one acquires about the presence of one of the words when the other is observed.
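Given document counts, PMI can be computed directly from the definition above; the corpus statistics below are invented for illustration:

```python
import math

# Hypothetical corpus statistics (invented): document counts out of N.
N = 1_000_000      # total number of documents
hits_a = 10_000    # documents containing term_1
hits_b = 20_000    # documents containing term_2
hits_ab = 2_000    # documents containing both terms

def pmi(n_a, n_b, n_ab, n_total):
    """Pointwise mutual information in bits (log base 2)."""
    p_a = n_a / n_total
    p_b = n_b / n_total
    p_ab = n_ab / n_total
    return math.log2(p_ab / (p_a * p_b))

# Positive PMI: the terms co-occur more often than independence predicts.
print(round(pmi(hits_a, hits_b, hits_ab, N), 2))  # → 3.32, i.e. log2(10)
```

Under independence (\(p_{ab} = p_a p_b\)) the ratio is 1 and the PMI is 0 bits; values above 0 indicate statistical association.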

The opinion orientation (\(oo\)) of a phrase is computed based on its association with a positive reference word (e.g. “excellent”) and its association with a negative reference word (e.g. “poor”):

\[oo(phrase)=PMI(phrase, "excellent")-PMI(phrase,"poor")\]

The probabilities are calculated by issuing queries to a search engine and collecting the number of hits. For each query, a search engine usually reports the number of documents relevant to the query, i.e. the number of hits. Thus, by searching for the two terms together and separately, one can estimate the probabilities used in the PMI equation above.

In the third step the algorithm computes the average \(oo\) of all phrases in the review, and classifies an object as positive if the average \(oo\) is positive, and as negative otherwise. Alternatively, more complex functions for aggregating opinions can be used.
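The three steps can be strung together once hit counts are available. In the sketch below the search-engine queries are replaced by a hard-coded table of invented hit counts, so all numbers are illustrative:

```python
import math

# Invented hit counts standing in for search-engine results:
# hits[x] = documents containing x; hits[(x, y)] = documents with both.
N = 1_000_000
hits = {
    "excellent": 50_000, "poor": 50_000,
    "really cool": 1_000, "too short": 1_000,
    ("really cool", "excellent"): 400, ("really cool", "poor"): 25,
    ("too short", "excellent"): 25, ("too short", "poor"): 200,
}

def pmi(a, b):
    """PMI estimated from document frequencies (log base 2)."""
    return math.log2((hits[(a, b)] / N) / ((hits[a] / N) * (hits[b] / N)))

def oo(phrase):
    """Orientation: association with 'excellent' minus with 'poor'."""
    return pmi(phrase, "excellent") - pmi(phrase, "poor")

# Step 3: average the orientation of all extracted phrases in a review.
phrases = ["really cool", "too short"]
avg = sum(oo(p) for p in phrases) / len(phrases)
verdict = "positive" if avg > 0 else "negative"
print(round(oo("really cool"), 2), round(oo("too short"), 2), verdict)
```

With these invented counts, “really cool” is strongly associated with “excellent” and “too short” with “poor”, so the two orientations partly cancel and the review average decides the verdict.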

Opinion lexicon generation

Opinion lexica are used as reference for comparison and consist of opinion words, which in the research literature are also known as polar words, opinion-bearing words, and sentiment words. Examples of positive opinion words are: beautiful, wonderful, good, and amazing. Examples of negative opinion words are bad, poor, and terrible. Apart from individual words, there are also opinion phrases and idioms, e.g., this costs someone an arm and a leg. Collectively, they are called the opinion lexicon. Two types exist, the base type and the comparative type, with the second one being used to express comparative and superlative opinions. Examples of such words are better, worse, best, worst, etc.

One simple technique for non-manual lexicon generation is based on bootstrapping, using a small set of seed opinion words and an online dictionary, e.g., WordNet, ConceptNet5, SenticNet, DBPedia, Freebase. The strategy is to first collect a small set of opinion words with known orientations manually, and then to grow this set by searching a database like WordNet for their synonyms and antonyms. The newly found words are added to the seed list, and the next iteration is started. The iterative process stops when no more new words are found. After manual inspection and correction, the result can be used as a reference base.
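The bootstrapping loop itself is simple to sketch. In the following, WordNet is replaced by a tiny hand-made synonym/antonym table, so the dictionary contents are invented:

```python
# Invented stand-in for WordNet lookups: word -> (synonyms, antonyms).
lexical_db = {
    "good":  ({"fine", "nice"}, {"bad"}),
    "fine":  ({"good"}, {"poor"}),
    "nice":  ({"lovely"}, set()),
    "bad":   ({"poor", "awful"}, {"good"}),
    "poor":  ({"bad"}, {"fine"}),
}

def bootstrap(seeds):
    """Grow {word: orientation} from seeds until no new words are found."""
    lexicon = dict(seeds)
    changed = True
    while changed:
        changed = False
        for word, polarity in list(lexicon.items()):
            syns, ants = lexical_db.get(word, (set(), set()))
            for s in syns:                 # synonyms keep the polarity
                if s not in lexicon:
                    lexicon[s] = polarity
                    changed = True
            for a in ants:                 # antonyms flip the polarity
                if a not in lexicon:
                    lexicon[a] = "neg" if polarity == "pos" else "pos"
                    changed = True
    return lexicon

lex = bootstrap({"good": "pos"})
print(sorted(lex.items()))
```

Starting from the single seed “good”, the loop reaches all six other words; in practice this is the point where manual inspection and correction would take over.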

A more sophisticated technique is a corpus-based approach which relies on syntactic or co-occurrence patterns together with a seed list of opinion words. The technique starts with a list of seed opinion adjective words, and uses them and a set of linguistic constraints or conventions on connectives to identify additional adjective opinion words and their orientations. One of the constraints for instance can be conjunction (AND), which says that conjoined adjectives usually have the same orientation. For example, in the sentence, “This car is beautiful and spacious,” if “beautiful” is known to be positive, it can be inferred that “spacious” is also positive. Similar rules or constraints are designed for other connectives, like OR, BUT, EITHER-OR, and NEITHER-NOR.
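The conjunction constraint can be applied mechanically to pairs of conjoined words. The mini-corpus below is invented, and for brevity the sketch matches any word pair around a connective, whereas a real system would first POS-tag the text and keep only adjectives:

```python
import re

# Invented example sentences containing conjoined adjectives.
corpus = [
    "This car is beautiful and spacious.",
    "The interior is spacious and comfortable.",
    "The ride is noisy but comfortable.",
]

known = {"beautiful": "pos"}   # seed adjective with known orientation

# "A and B" -> same orientation; "A but B" -> opposite orientation.
patterns = [(r"(\w+) and (\w+)", False), (r"(\w+) but (\w+)", True)]

changed = True
while changed:
    changed = False
    for sentence in corpus:
        for pattern, flips in patterns:
            for a, b in re.findall(pattern, sentence.lower()):
                # Propagate orientation in either direction of the pair.
                for known_w, new_w in ((a, b), (b, a)):
                    if known_w in known and new_w not in known:
                        pol = known[known_w]
                        known[new_w] = ("neg" if pol == "pos" else "pos") if flips else pol
                        changed = True

print(known)
```

From the single seed “beautiful”, “spacious” and “comfortable” are inferred to be positive via AND, and “noisy” negative via BUT.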

Context-aware sentiment analysis

Weichselbraun, A., Gindl, S., Scharl, A. (2013) Extracting and Grounding Context-Aware Sentiment Lexicons. IEEE Intelligent Systems
See project: Media watch on climate change

Context-aware sentiment analysis tackles the problem of ambiguity by attempting to determine the superordinate concept of the sentiment term in a given context. Straightforward for humans with ample domain experience, this can be a difficult task for automated systems.

One recently developed methodology is concept grounding, a technique that focuses on ambiguous terms and tries to detect their context. A term is considered ambiguous if: 1. its observed sentiment values show a high standard deviation, and 2. the deviations from its average sentiment value lead to considerably different polarities. Then, for each identified ambiguous term, the system collects context terms and stores them in a contextualized sentiment lexicon. The number of co-occurring context terms in positive and negative documents serves as an indicator for the ambiguous term’s positive or negative charge. The system considers all terms, independent of their part of speech and independent of whether they represent a named entity or not. Statistical refinement removes irrelevant terms, using only context terms with the strongest probabilities for a positive or negative context. A Naïve Bayes classifier can then be used to estimate the polarity of an ambiguous term based on the probabilities of the collected context terms.
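The two ambiguity criteria can be sketched as checks over a term's observed sentiment values. The observations and the standard-deviation threshold below are invented:

```python
import statistics

# Hypothetical observed sentiment values of terms across documents,
# on a scale from -1 (negative) to +1 (positive).
observations = {
    "unpredictable": [0.8, 0.7, -0.6, -0.9, 0.6, -0.7],  # polarity flips
    "wonderful": [0.8, 0.9, 0.7, 0.6, 0.8],              # consistently positive
}

def is_ambiguous(values, std_threshold=0.5):
    """Criterion 1: high standard deviation of the sentiment values.
    Criterion 2: the values reach considerably different polarities,
    approximated here as occurring on both sides of zero."""
    high_spread = statistics.stdev(values) > std_threshold
    flips = any(v > 0 for v in values) and any(v < 0 for v in values)
    return high_spread and flips

for term, vals in observations.items():
    print(term, is_ambiguous(vals))
```

Only terms flagged by this test would get a contextualized lexicon entry; unambiguous terms keep a single global sentiment value.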

The actual sentiment analysis then combines polarity values for unambiguous and ambiguous terms, detects negation, and determines the sum of all sentiment values as the overall polarity of the document.
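As a minimal sketch of this final scoring step, the following sums per-term polarity values with a crude one-token negation check; the lexicon values are invented, whereas a real system would draw them from the contextualized lexicon:

```python
# Tiny invented polarity lexicon (values would come from the
# contextualized sentiment lexicon in a real system).
polarity = {"great": 1.0, "nice": 0.8, "poor": -1.0, "expensive": -0.6}
negations = {"not", "never", "no"}

def document_polarity(text):
    """Sum sentiment values; an immediately preceding negation flips the sign."""
    tokens = text.lower().replace(".", "").split()
    score = 0.0
    for i, tok in enumerate(tokens):
        if tok in polarity:
            value = polarity[tok]
            if i > 0 and tokens[i - 1] in negations:
                value = -value           # negation detected: flip polarity
            score += value
    return score

print(document_polarity("The screen is great but not nice and too expensive"))
```

The sign of the returned sum is the overall document polarity; real systems use wider negation windows and weight terms by strength.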

In order to circumvent the strong corpus- and thus domain-specificity of machine learning approaches, many text corpora used for sentiment analysis are assembled, sometimes in real time, by web crawlers. This is relatively easy for movie or product reviews, but can be difficult in domains where pre-tagged corpora are sparse or not available at all. In these cases, generic lexica are merged from contextualized lexica of multiple corpora. Using the graph structure of resources like ConceptNet5 or DBPedia, contexts can be depicted as graphs, where nodes represent concepts and edges the relations between those concepts. This graph yields connectivity measures between potential concepts (e.g., WordNet senses) and the senses of its context terms, which are translated into corresponding similarity measures. The result is then compared, for example, to WordNet senses. The system chooses the WordNet sense with the strongest connection to the context terms and retrieves its sentiment charge.