Presentation - Hybrid Sentiment Analysis Utilizing Multiple Indicators To Determine Temporal Shifts of Opinion in OSNs

“Hybrid Sentiment
Analysis Utilizing Multiple
Indicators To Determine
Temporal Shifts of
Opinion in OSNs”
April 19th, 2016
Joshua White, Robert Hall,
Jeremy Fields, Holly White


Introduction

Shifts in Opinions

Dataset
– Dataset Storage Schema

Analysis
– Language Characteristics
– Demographic Characteristics
• Gender
• Location
• Group Affiliation

Conclusion / Future Work

References / Contact Info
Overview


Social networks allow individuals to share ideals with like-minded
people faster and more broadly than ever before.
– This is true for “extreme” ideals as well (Danger)

We continue to attempt to understand the mechanisms of change
in opinion
– Both public opinion and individual opinion (over time, not suddenly)

Two Major Findings:
– We find that groups are affected most by high-confidence “experts”,
typically males, who inspire trust
• Equally, undecided or uninformed individuals have a positive effect on these
groups (increasing group rationality)
– We find that clusters of low-confidence, like-minded individuals increase
overall confidence in a group through positive feedback mechanisms
• Women are more likely to comprise the other two groups [1]
Introduction


Shifts in public opinion have been the object of
research for some time (psychology / sociology)
– Doing so at scale is fairly new
– Most progress in the area has resulted from increased
computational capabilities
• The ability to simulate or replay long term changes
– Actual lab investigations at this level would be impractical
– Researchers have identified three primary actors in
change of sentiment (As discussed previously):
• The expert
• The undecided/uninformed
• Clusters of low confidence individuals
Shifts in Opinions


Experts are actors with a high level of certainty
(confidence)
– They do not need to actually be experts
– If the percentage of experts within a group reaches ~15%, then
they can affect group opinion
– Often they offer only vague amounts of actual knowledge

Shifts by individuals who are uninformed or undecided that are
not due to expert influence are considered to be noise

Clusters of low-confidence individuals with congruent
opinions create a stable state (majority rule)
– This also creates a positive “boost” feedback in their own
confidence.
Shifts in Opinions
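
The ~15% figure above can be read as a simple threshold check. The following is a minimal sketch only: the per-actor confidence scores, the `expert_cutoff` value, and both function names are assumptions made for illustration and do not come from the slides.

```python
def expert_share(confidences, expert_cutoff=0.9):
    """Fraction of actors whose confidence is at or above expert_cutoff.

    expert_cutoff is a hypothetical value chosen for this sketch.
    """
    if not confidences:
        return 0.0
    experts = sum(1 for c in confidences if c >= expert_cutoff)
    return experts / len(confidences)


def experts_can_shift_group(confidences, expert_cutoff=0.9, critical_share=0.15):
    """True when the expert share reaches the ~15% level cited on the slide."""
    return expert_share(confidences, expert_cutoff) >= critical_share


# Illustrative confidence scores for a ten-actor group (made up for the example).
group = [0.95, 0.92, 0.4, 0.5, 0.3, 0.6, 0.45, 0.55, 0.35, 0.5]
print(expert_share(group))
print(experts_can_shift_group(group))
```

Here two of the ten actors exceed the assumed cutoff, so the group sits above the ~15% level.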


Trust
– In this work, trust was found to be important when
compounded with distance of similarity
– Higher trust = larger shifts in opinion
• Especially if the trust was in an “Expert”
– Actors with similar interests were found to increase confidence in
a bidirectional manner
– Actors with high dissimilarity between ideas were found to have
negligible effects on each other's opinions [2]
– Example:
• Democrats and Independents who trusted scientists became
increasingly concerned with global warming, whereas increased
knowledge was uncorrelated with concern among skeptics of scientists and
among Republicans [3]
Shifts in Opinions


Starting with a series of political hashtags that
were collected as part of a previous research
project, researchers at SUNY Polytechnic
collected 9 million+ tweets from the trickler API.
Dataset Selection

This dataset is available upon request in full or summarized form,
under a data sharing agreement. A complete summary of the dataset
is also available in report form.


As will be discussed in another presentation:
– We represent the data within a semantic model which
expresses relationships within the social network
– We define this model as Fine-Grained User Diffusion
(FGUD)
– This model allows for analytic traversal at the user level
– Sample: (:Post attribute)
Dataset Storage Schema


“Simple” Language Analysis
– K-Means Clustering of Shannon's Entropy
• Language Agnostic Calculation [9, 10]
• Represent the calculated entropy of each message
in the dataset as a 1-dimensional array in R and
compute the initial graph
Entropy K-Means
Analysis
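
The entropy-then-cluster step can be sketched as follows. The slides describe doing this in R; this is a Python sketch under stated assumptions, not the authors' code. It assumes entropy is computed over the bytes of each message (which gives a 0 to 8 bit range) and uses a plain Lloyd's k-means with a fixed `k`; the sample messages and `k=3` are illustrative only.

```python
import math
import random
from collections import Counter


def shannon_entropy(message: str) -> float:
    """Shannon entropy in bits per byte of the UTF-8 encoded message."""
    data = message.encode("utf-8")
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # H = sum over symbols of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def kmeans_1d(values, k, iterations=100, seed=0):
    """Lloyd's k-means on scalar values; returns (centers, labels)."""
    rng = random.Random(seed)
    # Seed the centers with k distinct observed values.
    centers = rng.sample(sorted(set(values)), k)

    def assign(cs):
        # Index of the nearest center for every value.
        return [min(range(k), key=lambda j: abs(v - cs[j])) for v in values]

    for _ in range(iterations):
        labels = assign(centers)
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(centers)


# Illustrative messages (not from the dataset).
messages = [
    "aaaaaaaaaa",
    "abababab",
    "vote vote vote vote",
    "Heading to the polls this morning!",
    "Debate tonight at 9pm, who is watching?",
    "x9$Kq#2LmZ@7pW!c",
]
entropies = [shannon_entropy(m) for m in messages]  # the 1-dimensional array
centers, labels = kmeans_1d(entropies, k=3)
print(list(zip(labels, entropies)))
```

Because the calculation works on raw bytes, it needs no tokenizer or language model, which is what makes it language agnostic.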

● Entropy scale 1-8
● Previous work has shown that Twitter has 3 distinct
groups: Human, Bot, Cyborg
Analysis


Allowing automatic selection of the number of K-Means
clusters, we get 27 distinct groups:
Analysis


Gender Detection
– Both the author's name (if known) and the message content
are used
– Utilizes a Naive Bayesian classifier based on Mustafa
Atik and Nejdet Yucesoy's Genderizer [13]
• Gender was determined for 82.05% of all messages
– Did not use the combined gender inference method of
S. Sakaki et al. because of the 6-fold increase in
computation for only a 0.48% increase in detection
Analysis
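
To illustrate how a Naive Bayesian classifier can combine the author's name with message content, here is a minimal sketch. It is a generic multinomial Naive Bayes written from scratch, not Genderizer's actual API or model; the `NaiveBayes` class, the `features` helper, and the four training rows are all hypothetical and exist only for this example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_counts = Counter()            # training examples per label
        self.token_counts = defaultdict(Counter)  # token counts per label
        self.vocabulary = set()

    def train(self, tokens, label):
        self.class_counts[label] += 1
        self.token_counts[label].update(tokens)
        self.vocabulary.update(tokens)

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            denom = sum(self.token_counts[label].values()) + vocab_size
            for tok in tokens:
                score += math.log((self.token_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def features(name, text):
    """Merge message words and the first name (if known) into one token list."""
    tokens = [w.lower() for w in text.split()]
    if name:
        tokens.append("name=" + name.split()[0].lower())
    return tokens


# Toy training rows, invented for illustration only.
clf = NaiveBayes()
clf.train(features("Alice Smith", "heading to the rally"), "female")
clf.train(features("Mary Jones", "voting today"), "female")
clf.train(features("Bob Lee", "heading to the polls"), "male")
clf.train(features("John Doe", "debate tonight"), "male")

print(clf.predict(features("Mary", "")))
print(clf.predict(features(None, "debate tonight")))
```

A production classifier would also need a rule for leaving a message undetermined when the class scores are too close, which this sketch omits.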


Time Zone subdivision
– Only 0.116% of the dataset was geo-tagged
– Cheap geo-inferencing
– Concentrated only on US time zones
– Broke each into Male/Female
Analysis
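
One inexpensive way to do this kind of subdivision is to bucket posts by a profile time zone label and a previously assigned gender. The sketch below assumes each post is a dict with hypothetical `time_zone` and `gender` keys and that time zone labels look like "Eastern Time (US & Canada)"; the slides do not specify the fields or label format actually used.

```python
from collections import Counter

# Assumed label format for the four contiguous US time zones.
US_ZONES = {
    "Eastern Time (US & Canada)": "Eastern",
    "Central Time (US & Canada)": "Central",
    "Mountain Time (US & Canada)": "Mountain",
    "Pacific Time (US & Canada)": "Pacific",
}


def zone_gender_counts(posts):
    """Count posts per (US time zone, gender); skip anything unresolved."""
    counts = Counter()
    for post in posts:
        zone = US_ZONES.get(post.get("time_zone"))
        gender = post.get("gender")
        if zone is None or gender not in ("male", "female"):
            continue  # non-US zone, missing zone, or undetermined gender
        counts[(zone, gender)] += 1
    return counts


# Illustrative records (not from the dataset).
posts = [
    {"time_zone": "Eastern Time (US & Canada)", "gender": "female"},
    {"time_zone": "Eastern Time (US & Canada)", "gender": "male"},
    {"time_zone": "Pacific Time (US & Canada)", "gender": "female"},
    {"time_zone": "London", "gender": "male"},
    {"time_zone": None, "gender": "female"},
]
print(zone_gender_counts(posts))
```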


Still working to implement the work of M. Conover et al.,
“Predicting the Political Alignment of
Twitter Users” [15].
– This is a TF-IDF (Term Frequency – Inverse
Document Frequency) method
– Allows categorization of “Left” and “Right”
affiliations
– This method has not been implemented on data
subsets like ours (human-only, gender-specific, and
geographic-specific)
Analysis (Group Affiliation Issue)
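
For readers unfamiliar with TF-IDF, the weighting itself can be sketched in a few lines. This shows only the standard term weighting, treating each user's hashtags as one document; it is not the full classification pipeline of Conover et al., and the hashtags used are placeholders.

```python
import math
from collections import Counter


def tf_idf(documents):
    """documents: list of token lists. Returns one {token: weight} dict per document.

    Weight = (term count / document length) * ln(N / document frequency).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each token once per document
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            tok: (n / total) * math.log(n_docs / doc_freq[tok])
            for tok, n in counts.items()
        })
    return weights


# Placeholder hashtags, one token list per (hypothetical) user.
users = [
    ["#left_tag", "#vote"],
    ["#right_tag", "#vote"],
]
for w in tf_idf(users):
    print(w)
```

A hashtag used by every user (here `#vote`) gets zero weight, while hashtags specific to one side keep a positive weight, which is why the measure is useful for separating affiliations.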


The work of M. Conover et al. addresses only network
membership and use of specific hashtags
– Leaves out a number of scenarios:
• Joining a network just to troll it or try to sway others
• Frequent communication with a group/network that
they are not a part of, etc.
Analysis (Group Affiliation Issue)


Presented a down selection approach to select posts

Examined group affiliation detection and found that
more work is needed in this area to lower inaccuracies
before methods can be implemented

We are continuing this work currently
– Traversing and collecting “snapshots” of all posts,
following/followed relationships, and profiles at moments in
time
– 1 complete snapshot of the same accounts each quarter
for 1.5 years before and after the 2016 US presidential
election
– Measuring resultant changes in individuals
Conclusion / Future Work

For more information
contact:
Joshua S. White
Josh@rsignia.com
References / Contact Info
