
Analysis and Visualisation
News Media Sentiment Analysis
​
We are observing increasing antipathy and even hostility towards refugees, specifically refugees from the Middle East, both in the UK and globally. There is also a growing trend towards tighter restrictions in law and public policy with regard to refugees. Kaye (1994) has argued that such changes have been made possible in part because of how the public perception of refugees has been altered.
​
In carrying out this news media sentiment analysis, we want to establish the general image of refugees created by the media in the UK and then compare it with the discourse on social media, namely Twitter. Our hypotheses are that 1) more articles will be published around the time of the three major terrorist attacks that happened in the United Kingdom in 2017; 2) the sentiment scores of articles will be largely either highly negative or highly positive, which could indicate strong bias on the part of the publishers; 3) the sentiment scores will become more polarised for a period of time after each of the attacks.
​
The attacks were: the Westminster attack (22/03/2017), the Manchester Arena attack (22/05/2017), and the London Bridge attack (03/06/2017).
As predicted, the sentiment scores of the articles are more likely to sit at the polar ends than near the neutral score. The dataset also leans towards negative sentiment. This could be because: 1) the articles describe refugees negatively; 2) the word "refugee" is usually associated with some kind of trauma happening to a person, and the words describing that trauma could be identified as negative by VADER; or 3) both. The last option is the most likely explanation, and it remains a limitation of the methodology used to assess the sentiment of the articles. To examine the distribution of the sentiment scores further, we looked at the average sentiment score given to articles on each day of the year in order to find any patterns that might emerge throughout the year.
This graph shows that there is no single clear media attitude towards the topic of refugees and that the average score varies considerably over time. The calculated variance of 0.84 is very high for scores confined to the range -1 to +1, which indicates that the data points tend to be spread far from the mean and from each other. In practice, an article opened at random from our dataset may well be scored as either extremely positive or extremely negative.
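The daily averaging and the variance discussed above can be reproduced along the following lines. This is only a sketch under stated assumptions: the `articles` rows are invented for illustration, and the text does not say whether the 0.84 was computed over individual article scores or over the daily averages (the sketch uses individual scores). Because VADER compound scores lie between -1 and +1, their variance can never exceed 1, which is why a value of 0.84 counts as very high.

```python
from collections import defaultdict
from statistics import mean, pvariance

# Hypothetical (publication date, VADER compound score) pairs.
# The real dataset is not reproduced in the text.
articles = [
    ("2017-03-21", 0.92),
    ("2017-03-21", -0.88),
    ("2017-03-22", -0.95),
    ("2017-03-22", -0.60),
    ("2017-03-23", 0.97),
]


def daily_average(rows):
    """Group compound scores by publication date and average each group."""
    by_day = defaultdict(list)
    for day, score in rows:
        by_day[day].append(score)
    return {day: mean(scores) for day, scores in sorted(by_day.items())}


daily = daily_average(articles)

# Population variance of the individual article scores (assumption: the
# text does not state which values its variance was computed over).
spread = pvariance([score for _, score in articles])
```

A day with one strongly positive and one strongly negative article averages out close to zero, which is one reason a daily average can hide how polarised the individual articles are.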
The red lines mark the attacks that took place in the spring and summer of 2017. It is surprising to see that there was no apparent prolonged decrease in the average sentiment score immediately after each event, although there are three other distinct periods of sentiment scores below zero. Towards the end of the year, the average sentiment scores are fairly consistently negative, but there are still some spikes towards a score of +1.
​
Many publishers contributed only between 1 and 20 articles to the dataset, which is not a representative sample, so we chose the six publishers with the highest number of articles to investigate their reporting style further.
​
As you can see from the graph above, The Independent and The Guardian were the leading publishers in this dataset, with the Daily Mail slightly behind them. According to the Digital News Report by the Reuters Institute [2], both The Independent and The Guardian rank highly in the attention given to them by the British audience. MailOnline, which is the online version of the Daily Mail, is also high on the list. The Telegraph and The Times also appear in the summary. What is surprising is that the BBC is not represented well enough in our dataset to reflect its popularity among the British population.
​
Nexis Library does not allow filtering for specific publications, and we assumed that the selection of articles returned by a search might be random. However, the fact that the highest number of articles come from widely recognised publications that appear in the Reuters report might suggest otherwise.
In terms of political stance, The Independent advertises itself as "free from party political bias, free from proprietorial influence" [3]. However, in the 2015 elections it endorsed the Liberal Democrats [4], a party that strongly advocates better treatment of refugees [5]. The Guardian is viewed as a largely left-wing newspaper, and in both the 2015 [4] and the 2017 elections it endorsed the Labour Party [6]. The Daily Mail is also leaning towards the left-wing political stance, endorsing the Labour Party in the 2015 elections [4]. The Times, on the other hand, voices its support for the Conservative Party, whose 2017 manifesto pointed towards focusing on help abroad and decreasing the help offered to refugees in the UK. It states: “wherever possible, the government will offer asylum and refuge to people in parts of the world affected by conflict and oppression, rather than to those who have made it to Britain” [7]. Furthermore, Theresa May, the party’s leader since 2016, has spoken about the refugee “crisis” in terms of the disadvantages of welcoming refugees into the country [8].
​
A number of websites analysing media bias, including All Sides and Media Bias Check, categorised The Telegraph as right-leaning [9] [10]. The newspaper is sometimes referred to as the Torygraph [11], and Media Bias Check claims it uses loaded words to favour conservative causes [10].
Express Online also leans to the right and openly supports the Conservative Party.
Even though the number of articles in the dataset is greater for the left-leaning publications, we believe that the sample is well balanced, with three publications representing each of the two political stances.
Do terrorist events influence the number of articles published by the selected publications?
From the graph above showing the average score given to articles on each day of the year, we can see that the three terrorist events that happened in the UK in 2017 did not influence the sentiment scores of articles in any particular way. To see whether the events influenced the reporting of individual publications, we first look at the number of articles published by each of the selected publications and check for any emerging patterns.
Surprisingly, the graph above shows a relatively large spike in the number of articles published on the 28th of January, to 78 overall. This is driven mainly by the Daily Mail, with 38 articles published on that day, but an increase can also be seen for The Times and The Telegraph. On that day, the US president Donald Trump's executive order to close America’s borders to refugees and immigrants was enforced [12]. This was quite an important topic in last year's politics, but it is nevertheless unexpected that international news received more media attention than the attacks. Another quite visible increase in the number of articles published happened on the 20th of September. On that day, the media talked about refugees in Australian camps being resettled to the US. One reason for the difference could be that newspapers do not want to give their attention to terrorist attacks, so as not to help the terrorist organisations achieve their goals.
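Spikes like the one on the 28th of January can be located by counting articles per day and per publisher. The following is a minimal sketch, assuming each article is available as a (date, publisher) pair; the `rows` below are invented and far smaller than the real dataset.

```python
from collections import Counter

# Hypothetical (publication date, publisher) pairs, one per article.
rows = [
    ("2017-01-28", "Daily Mail"),
    ("2017-01-28", "Daily Mail"),
    ("2017-01-28", "The Times"),
    ("2017-01-29", "The Guardian"),
]

# Articles per day across all publishers.
per_day = Counter(day for day, _ in rows)

# Articles per (day, publisher) pair, to see who drives a spike.
per_day_and_publisher = Counter(rows)

# The day with the most articles and its article count.
busiest_day, busiest_count = per_day.most_common(1)[0]
```

Looking up `per_day_and_publisher[(busiest_day, "Daily Mail")]` then shows how much of the spike a single publisher accounts for.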
This report on Terrorism and the Media describes three ways in which terrorism uses the press: 1) gaining public attention, 2) gaining sympathy for its cause, 3) spreading concern and terror in the general public and influencing political change [13]. By withholding that attention, the media could limit the exposure terrorists receive. However, it is highly unlikely that this assumption is true, as a recent study done at the University of Alabama has reported that terrorist attacks carried out by Muslims receive 357% more press attention than those committed by non-Muslims. Therefore, this discrepancy has to be attributed to the dataset we received, and in order to gain a deeper understanding of the actual reporting of the events, the data gathered needs to be more specific.
What is the sentiment of articles about refugees published by the selected UK publications?
Interestingly, most of the selected publications display a more negative sentiment, even when their political stance is more left-leaning and they might therefore be expected to support the values and policy propositions of the parties they endorse. Especially interesting, in the context of the previously mentioned Torygraph nickname, is the fact that The Telegraph displays the most positive sentiment scores of all the publications analysed.
​
Only three publications, The Independent, The Guardian and The Times, published a relatively small proportion of articles classified as neutral. Again, the reason for such a distribution of scores can be attributed to the fact that reports on events relating to refugees can involve descriptions of traumatic events and situations of war, which can influence the sentiment analysis carried out by VADER. In fact, the article that scored the lowest in this analysis was one published in The Guardian about Behrouz Boochani’s diaries, which talks about the author’s struggles with escaping Iran and seeking asylum in Australia, and in the end being sent to a detention centre on Manus Island [12]. The article describes the conditions in the detention centre and evaluates the new policy introduced by the right-wing Coalition and Labor governments. It uses loaded language like “Australia built a hell for refugees on Manus” and “the agony of extreme hunger”, but it is mostly written in a report style. However, it also includes excerpts from the diaries, which contain sentiments like “[t]he agony of extreme hunger…”, “Again, I wake from nightmares […] I am starving […] I am extremely sleepy”, “It is harrowing to witness this middle-aged man lamenting in pain; he has been utterly degraded.” Such sentences clearly influence the overall sentiment score of the article.
​
Based on this finding, one could infer that the sentiment categories work almost the other way around: the articles scoring higher are potentially more negatively biased towards the subject of refugees. To test this hypothesis, I also read the article that scored the highest. My hypothesis proved wrong, as the highest scoring article was also published by The Guardian and described how Borussia Dortmund help give refugees hope [13]. However, this is obviously not a perfect way of testing the hypothesis, and it only shows the limitations of the method we employed to analyse the texts. Sentiment analysis does not show the attitude of a text towards refugees, but rather the tone of the language used in it. It can be inferred that VADER sentiment analysis is not an ideal method for analysing the sentiment of articles that handle such sensitive and emotional topics.

Twitter Sentiment Analysis
Reports by the United Nations (Emmerson, 2016) state that there is no evidence that migration leads to increased terrorist activities. However, our hypothesis is that public sentiment changes towards refugees after terrorist attacks.
We focused on three major terrorist attacks that occurred in the United Kingdom in 2017 to analyse the sentiment of social media: the London Bridge attack, the Manchester Arena bombing, and the Westminster attack.
Our analysis looks at Twitter sentiment towards refugees (all tweets containing the word 'refugee') in the hours surrounding each attack. The sentiment of each tweet is calculated with VADER sentiment.
​
After this preliminary analysis, we use natural language processing to understand the choice of words social media uses when talking about refugees.
London Bridge Attack date and time: June 3rd 2017, 22:16
​
VADER sentiment scores:
+0.5 to 1: Strongly positive
+0.05 to 0.5: Slightly positive
-0.05 to +0.05: Neutral
-0.05 to -0.5: Slightly negative
-0.5 to -1: Strongly negative
​
Tweets containing the search term 'refugee'.
​
Tweets are counted in bins of 10-minute intervals.
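The legend above can be read as a two-step recipe: map each tweet's VADER compound score to one of the five bands, then count tweets per 10-minute bin. The sketch below illustrates this with invented tweets. It assumes the compound score has already been obtained (for example from `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]` in NLTK), and the handling of scores that fall exactly on a band boundary is an assumption, since the legend does not say which band they belong to.

```python
from collections import Counter
from datetime import datetime


def sentiment_band(compound):
    """Map a VADER compound score in [-1, 1] to one of the five bands.

    Assumption: boundary values (+/-0.05, +/-0.5) go to the more extreme band.
    """
    if compound >= 0.5:
        return "strongly positive"
    if compound >= 0.05:
        return "slightly positive"
    if compound > -0.05:
        return "neutral"
    if compound > -0.5:
        return "slightly negative"
    return "strongly negative"


def ten_minute_bin(ts):
    """Floor a timestamp to the start of its 10-minute interval."""
    return ts.replace(minute=ts.minute - ts.minute % 10, second=0, microsecond=0)


# Hypothetical tweets as (timestamp, compound score) pairs.
tweets = [
    (datetime(2017, 6, 4, 0, 21, 5), -0.74),
    (datetime(2017, 6, 4, 0, 27, 40), 0.31),
    (datetime(2017, 6, 4, 0, 29, 59), -0.74),
    (datetime(2017, 6, 4, 0, 30, 0), 0.0),
]

# Count tweets per (10-minute bin, sentiment band) pair.
histogram = Counter(
    (ten_minute_bin(ts), sentiment_band(score)) for ts, score in tweets
)
```

Each key of `histogram` corresponds to one coloured segment of one bar in a stacked histogram.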
The London Bridge attack occurred on the 3rd of June 2017 at 22:16. As can be seen from the histogram, the number of tweets containing the search term 'refugee' rose sharply after the attack until it reached a peak just past midnight at 00:20, a little over two hours after the attack. At this peak, 61 tweets had a strongly negative sentiment, 36 a slightly negative sentiment, 37 a slightly positive sentiment and 30 a strongly positive sentiment. Hence, at this peak, 37.2% of these tweets had a strongly negative sentiment. By comparison, at 00:20 on the 3rd of June (24 hours earlier), only 23.8% of the tweets containing the word ‘refugee’ had a strongly negative sentiment, a much lower percentage than after the attack.
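The 37.2% figure follows directly from the four counts reported for the 00:20 bin. A short check, using only the numbers given in the text (neutral tweets are not listed there, so they are not part of the total):

```python
# Counts reported in the text for the 00:20 bin.
peak_counts = {
    "strongly negative": 61,
    "slightly negative": 36,
    "slightly positive": 37,
    "strongly positive": 30,
}


def share(counts, band):
    """Percentage of tweets in `band` among all tweets in `counts`."""
    return 100 * counts[band] / sum(counts.values())


strong_negative_share = round(share(peak_counts, "strongly negative"), 1)  # 37.2
```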
​
In the 24 hours leading up to the attack, there were a total of 7,043 tweets, and the positive to negative sentiment ratio of tweets was 1.166:1.

In the 24 hours after the attack, the number of tweets jumped to 14,375 (a 104.1% increase), and the positive to negative sentiment ratio of tweets fell to 0.645:1 (for every positive tweet, there were 1.551 negative tweets).
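The percentage increase and the "negative tweets per positive tweet" figure are simple transformations of the reported numbers. The sketch below uses only values from the text. Note that inverting the rounded ratio 0.645 gives roughly 1.550 rather than 1.551, so the figure in the text was presumably derived from the unrounded ratio.

```python
def percent_increase(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100 * (after - before) / before


def negatives_per_positive(pos_to_neg_ratio):
    """Invert a positive:negative ratio to get negative tweets per positive tweet."""
    return 1 / pos_to_neg_ratio


# Tweet volumes in the 24 hours before and after the London Bridge attack.
volume_change = round(percent_increase(7043, 14375), 1)  # 104.1

# Negative tweets per positive tweet after the attack (about 1.55).
after_attack = negatives_per_positive(0.645)
```

The same two functions apply unchanged to the Manchester Arena and Westminster figures.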
Manchester Arena Bombing date and time: May 22nd 2017, 22:31
​
Histogram: tweets containing the search term 'refugee', counted in 10-minute bins and grouped by the same VADER sentiment bands as listed for the London Bridge attack.
The Manchester Arena Attack took place on the 22nd of May 2017 at 22:31.
Similar to the London Bridge Attack, the data shows that the number of tweets containing the term ‘refugee’ again increased after the attack.
In the 24 hours leading up to the bombing, there were 11,066 tweets, and the positive to negative sentiment ratio of tweets was 0.804:1 (for every positive tweet, there were 1.244 negative tweets).
In the 24 hours after the attack, there were 16,046 tweets (a 45% increase), and the positive to negative sentiment ratio of tweets fell to 0.665:1 (for every positive tweet, there were 1.504 negative tweets).
There are three interesting outliers on this plot on May 22 2017, at 15:10, 15:30 and 20:30. Remarkably, all three can largely be attributed to a large number of auto-spam accounts using an automated social media management platform to promote the same Mashable article (there must have been time lags between the batches).
Ignoring these spam tweets might raise the positive to negative sentiment ratio above 1 for the 24 hours leading up to the bombing, which would bring it closer to the ratio seen before the London Bridge attack.
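One way to test this suggestion would be to drop heavily repeated tweets and recompute the ratio. The sketch below is a hypothetical illustration, not the procedure used in this analysis: the `max_copies` threshold, the sample tweets and their scores are all invented, and exact-text matching is only one of several possible spam heuristics.

```python
from collections import Counter


def drop_repeated_texts(tweets, max_copies=3):
    """Remove every tweet whose exact text occurs more than `max_copies` times."""
    copies = Counter(text for text, _ in tweets)
    return [(text, score) for text, score in tweets if copies[text] <= max_copies]


def positive_to_negative(tweets):
    """Ratio of positive to negative tweets, using the +/-0.05 cut-offs.

    Raises ZeroDivisionError if there are no negative tweets.
    """
    positive = sum(1 for _, score in tweets if score >= 0.05)
    negative = sum(1 for _, score in tweets if score <= -0.05)
    return positive / negative


# Hypothetical sample: five copies of one promoted headline plus three
# organic tweets. Texts and compound scores are invented.
sample = [("Promoted headline http://example.com", -0.6)] * 5 + [
    ("Refugees welcome here", 0.7),
    ("Proud to support refugee families", 0.8),
    ("Worried about the refugee camps", -0.4),
]

ratio_with_spam = positive_to_negative(sample)  # 2 positive / 6 negative
ratio_without_spam = positive_to_negative(drop_repeated_texts(sample))  # 2 / 1
```

In this invented sample the ratio moves from below 1 to above 1 once the repeated tweets are removed, which is the kind of shift suggested above; whether the real data behaves this way would need to be checked.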
Generally, a greater number of tweets with strongly negative sentiment are posted on Twitter after the attack.
Westminster Attack date and time: March 22nd 2017, 14:40
​
Histogram: tweets containing the search term 'refugee', counted in 10-minute bins and grouped by the same VADER sentiment bands as listed for the London Bridge attack.
The Westminster attack happened on the 22nd of March 2017 at 14:40. As with the two other terrorist attacks, the number of tweets containing the search term ‘refugee’ was greater after the attack than before.
​
The number of tweets before and after was somewhat similar to the Manchester Arena bombing, with 11,696 tweets before the attack and 15,705 tweets after the attack, a 34.3% increase.
​
Interestingly, the Twitter sentiment ratios around the Westminster attack differ from those of the previously studied incidents. In the 24 hours leading up to the attack, the positive to negative sentiment ratio of tweets was 0.794:1 (for every positive tweet, there were 1.260 negative tweets). In the 24 hours after the attack, the positive to negative sentiment ratio of tweets actually rose to 0.873:1 (for every positive tweet, there were 1.146 negative tweets).
​
What is also interesting to note about this histogram is that, irrespective of the attack, there seems to be a cyclical daily pattern in the number of tweets posted which contain the term ’refugee’, with troughs during the British night.
Overall, it can be observed that after all three attacks, there was an increase in the number of tweets containing the word ‘refugee’. Possibly due to this overall increase in tweet number, there was also a greater number of tweets with a strongly negative sentiment.
​
One limitation is that the tweets displayed in the histogram are global tweets. It would be interesting to look at tweets from only the UK, to gain a clearer understanding of how the public sentiment specifically in the UK changed after the three attacks. It would also be interesting to compare the histograms above with a histogram from a random date in 2017, on which no terrorist attack took place in the UK.
​
Another limitation of these visualisations is that correlation does not imply causation: it may merely be a coincidence that the number of tweets containing the word ‘refugee’ generally increased after each attack. This is addressed later using the Natural Language Toolkit's dispersion plots (we will see that negative tweets also use the words 'terrorist[s]' and 'terrorism' more after each attack than in the lead-up to it).
Natural Language Toolkit Analysis
Python's Natural Language Toolkit (NLTK) allows simple natural language processing (NLP) visualisations and analyses to be carried out. It is extremely useful for diving deeper into the language of social media.
​
To begin with, sentiment analysis has told us how social media feels, but NLP can give us insight into how Twitter users express their sentiment, what words and phrases they use to convey it, and what topics related to refugees they discuss.
​
https://www.wonderfoundation.org.uk/blog/refugees-media-%E2%80%9Cdifferent-species%E2%80%9D
​
Based on the literature review conducted, we create lexical dispersion plots on words that are commonly used to describe the status of and movement of refugees. These plots can also serve to show Twitter's reaction to each terrorist attack.
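A lexical dispersion plot marks, for each chosen word, the positions in the token stream at which that word occurs. NLTK can draw such plots with `Text.dispersion_plot(words)`; the sketch below computes the underlying positions in plain Python so the idea is visible without plotting. The tweets are invented, and splitting on whitespace is a simplifying assumption (a real pipeline would use a proper tokenizer).

```python
def dispersion_offsets(tokens, targets):
    """Return, for each target word, the token positions at which it occurs.

    Matching is case-insensitive; result keys are the lower-cased targets.
    """
    offsets = {word.lower(): [] for word in targets}
    for position, token in enumerate(tokens):
        key = token.lower()
        if key in offsets:
            offsets[key].append(position)
    return offsets


# Hypothetical tweets, already ordered by posting time.
tweets = [
    "refugees welcome in our city",
    "another influx of migrants reported",
    "London attack blamed on refugees by some",
    "influx fears grow after London attack",
]

# Concatenate all tweets into one time-ordered token stream.
tokens = [token for tweet in tweets for token in tweet.split()]

offsets = dispersion_offsets(tokens, ["influx", "London", "welcome"])
```

Because the tweets are ordered by time, positions further to the right correspond to later tweets, which is what lets such a plot show a word appearing only after an event.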
​
We then create word clouds to see if we can gain an understanding of how Twitter users convey their sentiment.
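A word cloud is essentially a picture of word frequencies, so the counts behind it can be inspected directly. The following is a minimal sketch with invented tweets. The stop-word list is a tiny illustrative one, and excluding the search term itself via `extra_stop_words` is an assumption about what makes the cloud informative, not a documented step of this analysis.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to", "we"}


def word_frequencies(texts, extra_stop_words=()):
    """Count lower-cased words across `texts`, skipping stop words."""
    stop = STOP_WORDS | set(extra_stop_words)
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in stop:
                counts[word] += 1
    return counts


# Hypothetical positive-sentiment tweets.
positive_tweets = [
    "Refugees are welcome in London",
    "We welcome refugees and migrants",
]

frequencies = word_frequencies(positive_tweets, extra_stop_words={"refugees"})
```

The resulting counts can be passed to a plotting library of choice, or simply ranked with `frequencies.most_common()`.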
​
"Refugee arrivals are [often] referred to in metaphors comparing refugees to movement of water (flooding, pouring, or streaming over borders; camps or centres overflowing) or pestilence (swarms of refugees), which contribute to an image of these groups not as individuals seeking asylum but as some kind of uncontrolled and unpredictable force of nature."
(Greenbank, 2017)
http://www.languageonthemove.com/refugees-in-the-media-villains-and-victims/
London Bridge Attack
Tweets from 2 June 2017 22:17 to 4 June 2017 22:16
Positive sentiment tweets
​
'Welcome' is a popular word in positive tweets related to refugees, likely due to the slogan 'Refugees welcome'. The graph shows that most of the other words, including "London," "terrorist," and "terrorism", appeared less often in positive tweets.

Negative sentiment tweets
​
What is most noticeable from this plot is that words linked to terrorism increased after the London Bridge attack. Moreover, mentions of 'London' immediately appeared and in great numbers after the attack, compared to just one mention in the 24 hours prior.
​
It also appears that words like 'influx', 'wave', and 'flood' are mentioned more in negative tweets than in positive tweets, although the difference that can be perceived in a dispersion plot may in fact be negligible.
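Since differences are hard to judge by eye on a dispersion plot, one option is to compare how often a word appears per 1,000 tweets in each sentiment group, which also corrects for the groups being different sizes. This is a hypothetical sketch with invented tweets. It matches whole whitespace-separated words only and does not strip punctuation.

```python
def mentions_per_thousand(tweets, word):
    """Number of tweets containing `word` per 1,000 tweets in the group."""
    word = word.lower()
    hits = sum(1 for text in tweets if word in text.lower().split())
    return 1000 * hits / len(tweets)


# Hypothetical negative-sentiment and positive-sentiment tweets.
negative_tweets = [
    "influx of refugees",
    "refugee wave",
    "no more",
    "closed borders",
]
positive_tweets = [
    "refugees welcome",
    "welcome all",
]

negative_rate = mentions_per_thousand(negative_tweets, "influx")  # 250.0
positive_rate = mentions_per_thousand(positive_tweets, "influx")  # 0.0
```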

Neutral sentiment tweets
​
'Migrants' and 'immigrants' are the most common words in tweets considered to be 'neutral'. Interestingly, while in both positive and negative sentiment tweets, 'immigrants' is overwhelmingly preferred for use compared to 'migrants', for neutral sentiment tweets, they appear to have a similar amount of usage.
​
Neutral sentiment tweets also have no mention of 'swarm'; in fact, 'swarm' has only 3 mentions across all tweets.

Manchester Arena Bombing
Tweets from 21 May 2017 22:32 to 23 May 2017 22:31
Positive sentiment tweets
​
'Welcome' is again the most frequent word in positive tweets after the bombing. We can also see that both "asylum" and "seekers" appeared more often than around the London Bridge attack.
​
We see a more balanced distribution of the use of 'migrants' and 'immigrants.'

​
Negative sentiment tweets
​
Similar results to the London Bridge attack can be seen here. Again, 'influx' is the most used description of refugee movement, and again, 'flood' is the second most used.
​
We do see the frequency of 'influx' and 'flood' increase after the bombing compared to before it.

Neutral sentiment tweets
​
"Migrants" and "immigrants" are still the most frequent words used in neutral sentiment tweets. Interestingly, the word "conflict" only appeared three times in the neutral tweets.

Westminster Attack
Tweets from 21 March 2017 14:41 to 23 March 2017 14:40
Positive sentiment tweets
The results are similar to those for the other attacks: "welcome" is the most frequent word, and "immigrants" is the second most used. However, both "asylum" and "seekers" appeared less often than around the Manchester Arena bombing.
​
​
​

Negative sentiment tweets
We do see the frequency of words related to terrorism increase after the attack. The word "illegal" also appears more often in negative tweets than in positive ones.

Neutral sentiment tweets
'Migrants' and 'immigrants' are still the most common words in tweets considered to be 'neutral'. Interestingly, in both positive and negative sentiment tweets, "immigrants" is used more often than "migrants," but in the neutral tweets, the two words seem to have a similar amount of usage.
​
