Falsehood flies, and the Truth comes limping after it
Influence campaign & disinformation on Twitter

Abstract
“Falsehood flies, and truth comes limping after it, so that when men come to be undeceived, it is too late; the jest is over, and the tale hath had its effect”
Jonathan Swift

In the first week of January 2020, the Indian government shared a toll-free number, 88662 88662, and asked people to give a missed call to it to show their support for the Citizenship Amendment Act 2019. [wiki article]

Within a few days the number had been shared thousands of times by different users, and reports soon emerged that it was being shared alongside tantalising offers for anyone who called it. [News report here]

I was interested in the extent and spread of this disinformation campaign on Twitter. I therefore scraped 3589 tweets (posted between 10:00 AM on 2nd Jan 2020 and 10:00 PM on 5th Jan 2020) that mentioned this number and analyzed their content. I was mainly interested in:

  1. Extent of the disinformation, i.e. the number of tweets that encouraged social-media users to call the toll-free number but concealed its actual function.
  2. Spread of the disinformation, i.e. the retweets and likes received by tweets that encouraged social-media users to call the toll-free number but concealed its actual function.
  3. Different types of disinformation campaigns

Please note: The intention here is not to make a statement about the Citizenship Amendment Act but to analyze how Twitter is used to promote disinformation. I chose this case because it was topical, and because I wanted to see whether the reports of disinformation about the number on Twitter were anecdotal or whether the data shows that tweets promoting falsehoods were a substantial percentage of all the tweets.


Challenges

As mentioned above, the idea is to classify the tweets into different categories (clusters) and sub-categories. This task was challenging because there is no training set for this kind of classification. Many tweets were written in Latin script although the language was Hindi. Another big challenge was the presence of images and videos in the tweets, which could not be analyzed textually.

Twitter also imposes restrictions through its API limits. Rate limits are divided into 15-minute intervals, with a limit of 180 requests per interval.
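The limit quoted above works out to one request every 5 seconds on average (900 seconds / 180 requests). Below is a minimal sketch of how a scraper could pace itself under that budget. It is not the code used for this project: `RateBudget` and its methods are hypothetical names, not part of Tweepy or the Twitter API, and timestamps are passed in as plain numbers so the logic can be checked without a network connection.

```python
from collections import deque

WINDOW_SECONDS = 15 * 60   # length of one rate-limit window
MAX_REQUESTS = 180         # requests allowed per window (figure quoted above)


class RateBudget:
    """Sliding-window request counter (hypothetical helper, not a Tweepy class)."""

    def __init__(self, max_requests=MAX_REQUESTS, window=WINDOW_SECONDS):
        self.max_requests = max_requests
        self.window = window
        self.sent = deque()  # timestamps of requests still inside the window

    def wait_time(self, now):
        """Seconds to wait before another request may be sent at time `now`."""
        # Forget requests that have already left the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            return 0.0
        # The oldest request must leave the window before we may send again.
        return self.window - (now - self.sent[0])

    def record(self, now):
        """Register that a request was sent at time `now`."""
        self.sent.append(now)


# Average spacing that never exhausts the budget.
seconds_per_request = WINDOW_SECONDS / MAX_REQUESTS
```

In a real scraper, `now` would come from `time.monotonic()` and the script would sleep for `wait_time(now)` seconds before each request.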

Because of the above challenges, the data set is limited to 3589 tweets (to keep it manageable).


Data Collection & Methodology

The data was collected in Python using Tweepy and the Twitter API. The data set has 3589 tweets, posted between 10:00 AM on 2nd Jan 2020 and 10:00 PM on 5th Jan 2020, that mention '8866288662' or '88662-88662' in their content. The data and analysis focus only on this time window. The data was last updated on 20th Jan 2020, so it is possible that the retweet and like counts are out of date and that many user accounts are now inactive or suspended. The whole data set can be downloaded here.
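The selection criteria above (a mention of the number, inside the time window) can be sketched as a small filter. This is an illustration under stated assumptions, not the script used for the project: `in_scope` is a hypothetical helper, the sample records are made up, and the timestamps are naive `datetime` values because the article does not state a time zone.

```python
import re
from datetime import datetime

# Matches '8866288662' and '88662-88662', the two spellings named above.
NUMBER_PATTERN = re.compile(r"88662-?88662")

WINDOW_START = datetime(2020, 1, 2, 10, 0)   # 10:00 AM, 2nd Jan 2020
WINDOW_END = datetime(2020, 1, 5, 22, 0)     # 10:00 PM, 5th Jan 2020


def in_scope(text, created_at):
    """True if the tweet mentions the number and falls inside the window."""
    mentions_number = NUMBER_PATTERN.search(text) is not None
    return mentions_number and WINDOW_START <= created_at <= WINDOW_END


# Made-up records, for illustration only.
sample = [
    ("Call 8866288662 to support CAA", datetime(2020, 1, 3, 13, 0)),
    ("Dial 88662-88662 now", datetime(2020, 1, 6, 9, 0)),    # outside the window
    ("Nothing to see here", datetime(2020, 1, 3, 13, 0)),    # no number
]
kept = [text for text, created_at in sample if in_scope(text, created_at)]
```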

The first round of classification was done using keyword search. For example, if a tweet mentioned 'CAA' or 'CAB' together with 'support', and did not mention words like 'Netflix' or 'free offer', it was classified as a tweet supporting the CAA (truthful tweets); if it did not mention those things and mentioned 'free netflix subscription', it was classified as a tweet spreading disinformation. After this preliminary tagging, a second, manual round of tagging was done for the tweets that did not fall under these categories. The tweets were sorted into four categories:

  1. Truthful Tweets: Tweets encouraging Twitter users to give a missed call to the number to support the Citizenship Amendment Act (the original intention of the number)
  2. Tweets Spreading Falsehoods: Tweets encouraging users to call the number on the basis of tantalising offers, and tweets discouraging users from calling the number on the basis of fake news.
  3. Fact Checking + Trolling: Tweets that fact-check or troll the tweets spreading disinformation and the users who shared them
  4. Unclassified / Neutral: Tweets that I was not able to place in any of the above categories, mainly because of a lack of context or because the content of the tweet seemed neutral to me
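The keyword pass described above can be sketched roughly as follows. This is one reading of the example rule, not the actual rule set: the keywords are only the ones quoted in the text, `classify` is a hypothetical function name, and a return value of `None` stands for "left for manual tagging".

```python
import re


def classify(text):
    """First-pass keyword label; None means the tweet needs manual tagging."""
    lowered = text.lower()
    # Whole-word matching, so that 'cab' does not match inside 'vocabulary'.
    words = set(re.findall(r"[a-z0-9]+", lowered))

    mentions_act = bool(words & {"caa", "cab"})
    mentions_offer = "netflix" in words or "free offer" in lowered

    if mentions_act and "support" in words and not mentions_offer:
        return "Truthful Tweets"
    if not mentions_act and "free netflix subscription" in lowered:
        return "Tweets Spreading Falsehoods"
    return None
```

A rule this narrow leaves many tweets unlabelled; for instance, a tweet that mentions Netflix without the exact phrase 'free netflix subscription' falls through to the manual round.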

Key Findings

Of the 3589 tweets analyzed, 1853 are truthful tweets, 1059 spread falsehoods, 336 fact-check or troll other users, and 341 are either neutral or unclassified.


Around 1 in 3 (29.5%) of the tweets that mention 88662 88662 spread falsehoods
Breakdown of the 3589 total tweets:
  Truthful Tweets: 1853 (51.6%)
  Tweets Spreading Falsehoods: 1059 (29.5%)
  Fact Checking + Trolling: 336 (9.4%)
  Unclassified / Neutral: 341 (9.5%)
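The percentages follow directly from the category counts. A short check, using only the counts reported above:

```python
# Category counts as reported in the key findings.
counts = {
    "Truthful Tweets": 1853,
    "Tweets Spreading Falsehoods": 1059,
    "Fact Checking + Trolling": 336,
    "Unclassified / Neutral": 341,
}

total = sum(counts.values())

# Share of each category, as a percentage rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label}: {counts[label]} ({share}%)")
```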

Timeline of tweeting

Truthful Tweets: There was an initial wave of truthful tweets that peaked around 1 PM on 3rd Jan.
Tweets Spreading Falsehoods: Then a wave of tweets spreading falsehoods kicked in. In this period the number of tweets spreading falsehoods was higher than the number of tweets sharing truthful information.
Fact Checking + Trolling: Once people started realizing the extent of the false tweets, a small wave of fact-checking and trolling tweets began, warning users about the misinformation.
Time axis: 02/01 10:00 to 05/01 22:00, in 12-hour steps.
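A timeline like this can be built by counting tweets per category in fixed time buckets. The sketch below uses hourly buckets, which is an assumption (the article does not say what bucket size the chart used). `hourly_counts` and `peak_hour` are hypothetical helper names, and the input is taken to be `(category, created_at)` pairs.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(tweets):
    """Count tweets per (category, hour) bucket."""
    counts = Counter()
    for category, created_at in tweets:
        # Truncate the timestamp to the start of its hour.
        hour = created_at.replace(minute=0, second=0, microsecond=0)
        counts[(category, hour)] += 1
    return counts


def peak_hour(counts, category):
    """Hour with the most tweets for one category, or None if it has none."""
    hours = {hour: n for (cat, hour), n in counts.items() if cat == category}
    return max(hours, key=hours.get) if hours else None
```

Comparing the per-hour counts of two categories then shows the periods in which one outnumbered the other.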

Types of falsehoods

Tweets spreading falsehoods can be further divided in 2 categories:

  1. Tweets that use falsehoods to encourage users to call
  2. Tweets that use falsehoods to discourage users from calling

95.7% of the tweets spreading falsehoods encouraged users to call the number on the basis of tantalising offers. This suggests that the vast majority of tweets spreading falsehoods came from the side that supports the act.
Breakdown of the 1059 falsehood tweets:
  Tweets encouraging to call: 1013 (95.7%)
  Tweets discouraging to call: 46 (4.3%)

A lot of bizarre promises were made for dialing the number.


Around 1 in 9 (10.5%) of the tweets that mentioned the number promised sex, hot chat with hot girls, 72 virgins or a porn site subscription in return for calling the number

Some of the most shared, popular or bizarre categories of tweets are:

  1. Tweets promising free sex or hot chat with hot girls or 72 virgins or porn site subscription: 378 tweets (around 1 in 9) promised free sex or hot chats, or claimed that this is the phone number of porn stars or film actresses. Tweets in this category read like “Hey TweetHearts Save my number & Call me 8866288662” or “Too bored today, so ready to share my number with all my followers”
  2. Tweets promising free Netflix subscription: This is by far the most retweeted or shared false promise. On average, these tweets were retweeted about 9 times and liked about 34 times. An example of such a tweet: "@MuralikrishnaE1 Thanks, #NetFlix I got my 6 Months free subscription by calling 8866288662"
  3. Tweets promising free offers like data plans, free pizzas, photoshop subscription, jobs and even Rs. 15 Lac in the account: Tweets like "#Jio Maha offer- Call on this no. And get 10 GB data. 8866288662", "For free pizza call this number now. Offer valid only till 6 PM today 8866288662" or "Breaking: Modi govt is finally giving 15 Lakhs to people. But you have to call 8866288662 today to register. Hurry up. Call NOW. That number again 8866288662"
  4. Tweets promising "Babri Masjid back": This is a small minority of tweets, but with very communal content like "Want ur Masjid back? okay give a Simple Call on #8866288662 and get Rs 64.60 Talk time Voting Started now https://t.co/jXccqTNMJC"
  5. Tweets saying that calling this number registers your protest against the BJP or your support for the protest against the CAA: This category of tweets was targeted specifically at people who either oppose the act, are anti-BJP (the ruling party), or support the Indian National Congress or the Aam Aadmi Party. The content in this category varied from supporting Rahul Gandhi as the next Congress president, to supporting Arvind Kejriwal as the next CM of Delhi, to supporting the protest against the act.


Spread of truthful tweets and falsehoods

The most popular tweet, not surprisingly, was by Mr. Amit Shah (Home Minister of India).


On average, truthful tweets were retweeted about 8 times and liked about 24 times. But if you remove the retweets and likes that Mr. Amit Shah (Home Minister of India) received, the averages drop to 5 retweets and 12 likes (very close to the figures for tweets spreading falsehoods).

On average, tweets spreading falsehoods were retweeted about 4 times and liked about 13 times.

On average, tweets promising a Netflix subscription for calling this number were retweeted about 9 times and liked about 34 times. The average user engagement with this falsehood was even higher than the average user engagement with the truthful tweets.
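The effect of a single high-profile account on an average can be reproduced with a small helper that computes mean retweets and likes with and without one user. This is a sketch only: `average_engagement`, the field names and the sample numbers are all made up for illustration and are not taken from the data set.

```python
from statistics import mean


def average_engagement(tweets, exclude_user=None):
    """Mean (retweets, likes) over tweets, optionally leaving one user out."""
    rows = [t for t in tweets if t["user"] != exclude_user]
    return mean(t["retweets"] for t in rows), mean(t["likes"] for t in rows)


# Made-up numbers: one very popular account and two ordinary ones.
sample = [
    {"user": "high_profile_account", "retweets": 1000, "likes": 4000},
    {"user": "user_a", "retweets": 2, "likes": 8},
    {"user": "user_b", "retweets": 4, "likes": 12},
]

with_outlier = average_engagement(sample)
without_outlier = average_engagement(sample, exclude_user="high_profile_account")
```

Because the mean is sensitive to a single large value, reporting it both ways (as done above for truthful tweets) gives a fairer comparison between categories.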

The spread of disinformation was so great that @NetFlixIndia had to tweet "This is absolutely fake. If you want free Netflix please use someone else's account like the rest of us" [tweet here] and Mr. Amit Shah (Home Minister of India) also had to warn people of the rumours by clarifying: "Since y'day rumours are being spread that the number (toll free number launched by BJP to garner support for #CitizenshipAct) belongs to some channel called,Netflix. I would like to clarify that the number never belonged to Netflix rather it is BJP's toll free number." [Source: ANI].

Lexicon of different categories

The visualization below shows the 50 most common words and emojis (after removing stop words and stemming) used in the tweets of each category


Truthful Tweets
Tweets Spreading Falsehoods
Fact Checking + Trolling
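The word counts behind a lexicon like this can be approximated with a simple frequency count per category. The sketch below is deliberately simplified and is not the pipeline used for the visualization: it uses a tiny illustrative stop-word list, skips stemming, and ignores emojis and digits. `top_terms` is a hypothetical helper name.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "the", "to", "on", "for", "this", "and", "of", "is", "by", "my"}


def top_terms(texts, n=50):
    """Return the n most common non-stop-word tokens as (token, count) pairs."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep runs of letters only (drops digits and emojis).
        for token in re.findall(r"[a-z]+", text.lower()):
            if token not in STOP_WORDS:
                counts[token] += 1
    return counts.most_common(n)
```

Running `top_terms` once per category on the tweet texts gives one ranked word list per panel.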




Interactive timeline of tweet content. Legend: All Tweets, Truthful Tweets, Tweets Spreading Falsehoods, Fact Checking + Trolling, Unknown / Neutral, Inactive Users.

The size of each circle represents the number of retweets. Mouse over a circle to see the details and highlight other tweets by the same user; click to open the tweet in a new window.

Time axis: 02/01 10:00 to 05/01 22:00, in 12-hour steps.

Conclusion

Although this is a very small subset of all the available tweets, it helps us see a pattern. It highlights a big problem with getting information from Twitter.

"Fake news is perfect for spreadability: It’s going to be shocking, it’s going to be surprising, and it’s going to be playing on people’s emotions, and that’s a recipe for how to spread disinformation", Miriam Metzger, a UC Santa Barbara communications researcher explained to a Vox reporter [article here]. It seems that the same strategy was used here to promote the number.

Trolling often helps achieve the very goal it aims to defeat, because it promotes the same thing it aims to impede; there is no such thing as bad PR if no one has heard of you. For example, in this case some of the trolling tweets were shared hundreds of times (trolling tweets are generally shared more often than fact-checking tweets). Although they made fun of the tweets spreading falsehoods, they also helped spread the number and therefore helped promote it.

These visualizations can also be used to identify bots or paid accounts by looking at users who have:

  1. Tweeted the same tweet multiple times
  2. Tweeted multiple tweets in a burst within a short period of time
  3. Promoted multiple falsehoods
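The three signals above can also be checked programmatically. The following is a rough sketch under stated assumptions, not a validated bot detector: `flag_accounts` and the field names (`user`, `text`, `created_at`, `falsehood`) are hypothetical, and the burst threshold (3 tweets within 5 minutes) is an arbitrary choice for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_accounts(tweets, burst_size=3, burst_window=timedelta(minutes=5)):
    """Return {user: set of reasons} for the three patterns listed above."""
    by_user = defaultdict(list)
    for tweet in tweets:
        by_user[tweet["user"]].append(tweet)

    flags = defaultdict(set)
    for user, rows in by_user.items():
        # 1. The same text tweeted more than once.
        texts = [r["text"] for r in rows]
        if len(set(texts)) < len(texts):
            flags[user].add("duplicate")

        # 2. burst_size or more tweets inside burst_window.
        times = sorted(r["created_at"] for r in rows)
        for i in range(len(times) - burst_size + 1):
            if times[i + burst_size - 1] - times[i] <= burst_window:
                flags[user].add("burst")
                break

        # 3. More than one distinct type of falsehood promoted.
        kinds = {r["falsehood"] for r in rows if r["falsehood"] is not None}
        if len(kinds) > 1:
            flags[user].add("multiple falsehoods")

    return dict(flags)
```

A flagged account is only a candidate for closer inspection; genuine users can also repeat themselves or tweet in bursts.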