Microsoft Word turney-littman-acm doc


warm, sweaty hands in your pockets.    0.1535
3. If you want poor customer service and to lose money to ridiculous charges, Bank of America is for you.    -0.1314
In Table 1, for each sentence, the word with the strongest semantic orientation has 
been marked in bold. These bold words dominate the average and largely determine the 
orientation of the sentence as a whole. In the sentence that is misclassified as positive, the 
system is misled by the sarcastic tone. The negative orientations of “stranger’s” and 
“sweaty” were not enough to counter the strong positive orientation of “warm”. 
1 See http://www.epinions.com/.


One application of review classification is to provide summary statistics for search 
engines. Given the query “Paris travel review”, a search engine could report, “There are 
5,000 hits, of which 80% are positive and 20% are negative.” The search results could 
also be sorted by average semantic orientation, so that the user could easily sample the 
most extreme reviews. Alternatively, the user could include the desired semantic 
orientation in the query, “Paris travel review orientation: positive” [Hearst 1992]. 
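The summary statistics described above can be sketched in a few lines of Python. This is only an illustration: the function name `summarize_hits`, the review titles, and the scores are hypothetical, and the sketch assumes each hit already carries an average semantic orientation, with a hit counted as positive when that average is greater than zero.

```python
def summarize_hits(hits):
    """Return (percent_positive, percent_negative, hits ranked by SO).

    `hits` is a list of (title, average_so) pairs. A hit counts as
    positive when its average semantic orientation is greater than zero;
    everything else is counted as negative (an assumption of this sketch).
    """
    if not hits:
        return 0.0, 0.0, []
    positive = sum(1 for _, so in hits if so > 0)
    pct_pos = 100.0 * positive / len(hits)
    # Rank from most positive to most negative so both extremes are easy to sample.
    ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
    return pct_pos, 100.0 - pct_pos, ranked


# Hypothetical scores, invented for illustration only.
hits = [("Review A", 0.42), ("Review B", -0.31), ("Review C", 0.05),
        ("Review D", 1.10), ("Review E", -0.02)]
pct_pos, pct_neg, ranked = summarize_hits(hits)
print(f"There are {len(hits)} hits, of which {pct_pos:.0f}% are positive "
      f"and {pct_neg:.0f}% are negative.")
print("Most positive:", ranked[0][0], "| most negative:", ranked[-1][0])
```

Sorting once and reading from both ends of the list gives the user the most extreme reviews in either direction.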
Preliminary experiments indicate that semantic orientation is also useful for 
summarization of reviews. A positive review could be summarized by extracting the 
sentence with the highest semantic orientation, and a negative review by extracting 
the sentence with the lowest (most negative) semantic orientation. 
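A minimal sketch of this kind of extractive summarization follows. The word scores in `WORD_SO` are invented, and the sketch assumes that a sentence is scored by averaging over all of its words, with unknown words contributing zero. The system described in this paper may score sentences differently.

```python
import re

# Hypothetical word-level semantic orientation scores (invented for illustration).
WORD_SO = {"wonderful": 2.0, "friendly": 1.2, "helpful": 1.0,
           "slow": -0.8, "rude": -1.9, "ridiculous": -1.5}


def sentence_so(sentence, word_so=WORD_SO):
    """Average SO over the words of a sentence; unknown words score 0."""
    words = re.findall(r"[a-z']+", sentence.lower())
    if not words:
        return 0.0
    return sum(word_so.get(w, 0.0) for w in words) / len(words)


def summarize(review_sentences, positive_review):
    """Pick the highest-scoring sentence of a positive review,
    or the lowest-scoring sentence of a negative review."""
    pick = max if positive_review else min
    return pick(review_sentences, key=sentence_so)


review = ["The staff were friendly and helpful.",
          "Service was slow.",
          "Overall a wonderful stay."]
print(summarize(review, positive_review=True))
print(summarize(review, positive_review=False))
```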
Another potential application is filtering “flames” for newsgroups [Spertus 1997]. 
There could be a threshold, such that a newsgroup message is held for verification by the 
human moderator when the semantic orientation of any word in the message drops below 
the threshold. 
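The thresholding idea can be illustrated as follows. The lexicon, the threshold value, and the function name `needs_moderation` are all hypothetical; a real filter would need word scores from an actual semantic orientation method and a threshold tuned on real messages.

```python
import re

# Hypothetical word-level SO scores and threshold (invented for illustration).
WORD_SO = {"thanks": 1.1, "helpful": 1.0, "wrong": -0.9, "idiot": -2.4}
FLAME_THRESHOLD = -2.0


def needs_moderation(message, word_so=WORD_SO, threshold=FLAME_THRESHOLD):
    """Return True when any known word in the message has SO below the threshold."""
    words = re.findall(r"[a-z']+", message.lower())
    return any(word_so[w] < threshold for w in words if w in word_so)


for text in ["Thanks, that was helpful.",
             "I think you are wrong here.",
             "Only an idiot would post this."]:
    status = "hold for moderator" if needs_moderation(text) else "post"
    print(f"{status}: {text}")
```

Because the test is on the single most negative word rather than on the message average, one strongly negative word is enough to hold an otherwise neutral message.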
Tong [2001] presents a system for generating sentiment timelines. This system tracks 
online discussions about movies and displays a plot of the number of positive sentiment 
and negative sentiment messages over time. Messages are classified by looking for 
specific phrases that indicate the sentiment of the author towards the movie, using a 
hand-built lexicon of phrases with associated sentiment labels. There are many potential 
uses for sentiment timelines: Advertisers could track advertising campaigns, politicians 
could track public opinion, reporters could track public response to current events, and 
stock traders could track financial opinions. However, with Tong’s approach, it would be 
necessary to provide a new lexicon for each new domain. Tong’s [2001] system could 
benefit from the use of an automated method for determining semantic orientation, 
instead of (or in addition to) a hand-built lexicon.
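As a rough sketch of how an automated orientation score could feed such a timeline, the following code counts positive and negative messages per day. It is not Tong's system: the message-level scores and dates are invented, and the sketch assumes that each message has already been assigned a semantic orientation by some automated method.

```python
from collections import Counter
from datetime import date


def sentiment_timeline(messages):
    """Map each date to (positive_count, negative_count).

    `messages` is an iterable of (date, so) pairs, where `so` is a
    message-level semantic orientation score. Scores of exactly zero
    are counted in neither group (an assumption of this sketch).
    """
    positive, negative = Counter(), Counter()
    for day, so in messages:
        if so > 0:
            positive[day] += 1
        elif so < 0:
            negative[day] += 1
    days = sorted(set(positive) | set(negative))
    return {day: (positive[day], negative[day]) for day in days}


# Hypothetical messages, invented for illustration only.
messages = [(date(2001, 5, 1), 0.8), (date(2001, 5, 1), -0.3),
            (date(2001, 5, 2), 1.4), (date(2001, 5, 2), 0.2),
            (date(2001, 5, 2), -1.1)]
for day, (pos, neg) in sentiment_timeline(messages).items():
    print(day.isoformat(), "positive:", pos, "negative:", neg)
```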
Semantic orientation could also be used in an automated chat system (a chatbot) to 
help decide whether a positive or negative response is more appropriate. Similarly, 
characters in software games would appear more realistic if they responded to the 
semantic orientation of words that are typed or spoken by the game player. 
Another application is the analysis of survey responses to open-ended questions. 
Commercial tools for this task include TextSmart² (by SPSS) and Verbatim Blaster³ (by 
StatPac). These tools can be used to plot word frequencies or cluster responses into 
categories, but they do not currently analyze semantic orientation. 
2 See http://www.spss.com/textsmart/.
3 See http://www.statpac.com/content-analysis.htm.


3. SEMANTIC ORIENTATION FROM ASSOCIATION 
The general strategy in this paper is to infer semantic orientation from semantic 
association. The semantic orientation of a given word is calculated from the strength of 
its association with a set of positive words, minus the strength of its association with a set 
of negative words: 
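(1) Pwords = a set of words with positive semantic orientation
(2) Nwords = a set of words with negative semantic orientation
(3) A(word_1, word_2) = a measure of association between word_1 and word_2
(4) SO-A(word) = \sum_{pword \in Pwords} A(word, pword) - \sum_{nword \in Nwords} A(word, nword)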
We assume that A(word_1, word_2) maps to a real number. When A(word_1, word_2) is 
positive, the words tend to be associated with each other. Larger values correspond to 
stronger associations. When A(word_1, word_2) is negative, the presence of one word 
makes it likely that the other is absent.
A given word is classified as having a positive semantic orientation when 
SO-A(word) is positive and a negative orientation when SO-A(word) is negative. The 
magnitude (absolute value) of SO-A(word) can be considered the strength of the semantic 
orientation. 
In the following experiments, seven positive words and seven negative words are 
used as paradigms of positive and negative semantic orientation: 
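(5) Pwords = {good, nice, excellent, positive, fortunate, correct, superior}
(6) Nwords = {bad, nasty, poor, negative, unfortunate, wrong, inferior}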
These fourteen words were chosen for their lack of sensitivity to context. For example, a 
word such as “excellent” is positive in almost all contexts. The sets also consist of 
opposing pairs (good/bad, nice/nasty, excellent/poor, etc.). We experiment with randomly 
selected words in Section 5.8. 
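The SO-A calculation can be sketched as below. The association measure is passed in as a parameter, since this section leaves the choice of measure open. The lookup table `TOY_ASSOCIATION` and its values are invented purely to make the example runnable, and only the three opposing pairs named above (good/bad, nice/nasty, excellent/poor) are used as paradigm words. The label "undetermined" for a score of exactly zero is an assumption of this sketch, since the text defines only the positive and negative cases.

```python
# Three of the opposing paradigm pairs named in the text.
PWORDS = ("good", "nice", "excellent")
NWORDS = ("bad", "nasty", "poor")


def so_a(word, association, pwords=PWORDS, nwords=NWORDS):
    """Association with the positive set minus association with the negative set.

    `association(word_1, word_2)` must return a real number: positive when
    the two words tend to occur together, negative when the presence of one
    makes the other less likely.
    """
    positive = sum(association(word, p) for p in pwords)
    negative = sum(association(word, n) for n in nwords)
    return positive - negative


def classify(score):
    """The sign gives the orientation; the magnitude is its strength."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "undetermined"  # the zero case is not defined in the text


# Hypothetical association scores, invented for illustration only.
TOY_ASSOCIATION = {
    ("superb", "good"): 1.5, ("superb", "nice"): 0.5, ("superb", "excellent"): 2.0,
    ("superb", "bad"): -0.5, ("superb", "nasty"): -1.0, ("superb", "poor"): -0.5,
    ("dreadful", "good"): -1.0, ("dreadful", "excellent"): -0.5,
    ("dreadful", "bad"): 2.0, ("dreadful", "nasty"): 1.5,
}


def toy_association(word_1, word_2):
    """Look up a toy association score; unseen pairs are treated as unassociated."""
    return TOY_ASSOCIATION.get((word_1, word_2), 0.0)


for w in ("superb", "dreadful", "table"):
    score = so_a(w, toy_association)
    print(w, score, classify(score), "strength:", abs(score))
```

Keeping the association measure as a parameter mirrors the structure of the definition: any measure that maps a pair of words to a real number can be substituted without changing `so_a`.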
It could be argued that this is a supervised learning algorithm with fourteen labeled 
training examples and millions or billions of unlabeled training examples, but it seems 
more appropriate to say that the paradigm words are defining semantic orientation, rather 
than training the algorithm. Therefore we prefer to describe our approach as 
unsupervised learning. However, this point does not affect our conclusions. 
This general strategy is called SO-A (Semantic Orientation from Association). 
Selecting particular measures of word association results in particular instances of the 
