Nexalogy Teams up with York U and JMI, India, to support India election research project

India Election 2014 Study by Nexalogy, York University, and JMI India

Have you ever wondered how social media can be studied?

For a good example, check out the paper below by professors from York University and JMI, who used the NexaMaster platform to collect and analyze over 40,000 tweets about the most recent election in India, the world’s largest democracy.

Their paper is titled:

Report on Media Activism and Other Manias: How the English Mass Circulation Indian Press Framed the 2014 Election Campaign  

Daniel Drache, York University; Fred Fletcher, York University; Biswajit Das, JMI; Taberez A. Neyazi, JMI

In it, they argue that “The growing use of the Internet and mobile telephones has dramatically altered the dynamics of political communication and has helped political candidates appeal to voters directly and over the heads of the elites.”

They carried out an analysis of traditional media but also worked with Nexalogy to evaluate the emergent field of social media as a realm for political communication:

“As a supplement to the content analysis of the press, a sample of more than 40,000 tweets was downloaded into a file for analysis. The sample was drawn from the period May 5 to May 17, with the intention of capturing not only the final days of the campaign, when the strategies and appeals had become routinized, but also the online speculation about reasons for the outcome. Because the key research questions here involve the impact of the AAP on the campaign, the sample includes only those tweets in which the AAP or Kejriwal are mentioned. In order to keep the detailed coding manageable, only tweets in English were included – to facilitate comparison with the English-language press – and every second tweet was coded, a total of 21,151. In the discussion that follows, some descriptive data are drawn from the entire sample, while a more detailed analysis is based on the sub-sample. Because the tweets were organized by publisher, the sub-sample reflects quite accurately the most frequent “tweeters.””
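The "every second tweet" approach described in the quote is a form of systematic sampling. Here is a minimal Python sketch of the idea, not the authors' actual workflow: the `publisher` and `text` field names and the toy tweets are assumptions made up for illustration. It also shows why sorting by publisher first keeps the most frequent tweeters well represented in the sub-sample.

```python
from collections import Counter


def systematic_sample(tweets, step=2, start=0):
    """Order tweets by publisher, then keep every `step`-th one."""
    # sorted() is stable, so each publisher's tweets stay in their original order.
    ordered = sorted(tweets, key=lambda t: t["publisher"])
    return ordered[start::step]


# Toy data (hypothetical): one heavy tweeter, one moderate, one occasional.
tweets = (
    [{"publisher": "alice", "text": f"a{i}"} for i in range(6)]
    + [{"publisher": "bob", "text": f"b{i}"} for i in range(3)]
    + [{"publisher": "carol", "text": "c0"}]
)

sample = systematic_sample(tweets)
counts = Counter(t["publisher"] for t in sample)
print(len(sample), dict(counts))
```

Because the tweets are grouped by publisher before every second one is taken, a publisher with many tweets keeps roughly half of them, while a one-off tweeter may drop out entirely.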

As the authors explain, they combined a robust sampling strategy with the NexaMaster scoring manager, which allows teams to qualitatively code Twitter data by publisher:

“We were able to develop a similar methodology for our study of Twitter usage during the weeks prior to the May 2014 election and for five days following the release of the official results. We collected 45,000 tweets in order to analyze what Indians said online about the election, particularly their perceptions of the main issues, their views of the leaders and their electoral preferences. Throughout the election, the mainstream Indian English language print media helped shape and manage the style, tone and substance of Tweets, particularly around Twitter content portraying Modi as the man most capable of leading India into the 21st century, but also in many other ways.”

And the findings stress the importance of Twitter as an emerging domain of political contestation in India:

“Perhaps the most significant development in the 2014 election was that social media, and in particular Twitter, emerged as an alternative channel of communication for hundreds of thousands of people. Social media users get the impression that they are participants in the campaign and in the making of history. The sense of ‘we-ness’ while often superficial and fleeting is important nonetheless. In the eyes of tweeters it helps corrects the impression that the public is indeed a phantom, something remote and distant under the control of the elites.”

NexaMaster helped this research team to identify the top influencers:

“The 140-character limit on Twitter makes it primarily a headline and referral service, alerting followers to news and events they might like to be aware of.  In other jurisdictions, studies have shown that the Twitter-sphere tends to be dominated by the most active publishers, who post a significant proportion of tweets and often set the agenda. (See, for example, Chu and Fletcher, 2014: 158-162). The “influencers” are those whose messages are retweeted, quoted or cited most often.  In this sample, not surprisingly, the top influencers were Arvind Kejriwal, with a score of 917, and the aamaadmiparty (527).  The third most mentioned influencer was narendramodi (291). Several of the others were celebrities (musicians, actors, poets) who identified themselves as AAP supporters.  Two of the top ten were news organizations – NDTV and ANI-News – and two others identified themselves as online commentators. As these observations indicate, the organized party units were the most influential, followed by a reasonably wide range of other participants.  In other jurisdictions, leading journalists are often among the top publishers and influencers (Chu and Fletcher, 2014: 158-161), but this does not seem to have been the case here. The ten most used hashtags tell a similar story.  The top two are related to AAP accounts and are intended to promote the party.  They constitute a full 57.3% of the top ten hashtags.”
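To give a feel for how mention counts and hashtag shares like the ones quoted above can be tallied, here is a rough Python sketch. It is not NexaMaster's scoring method, which the paper does not spell out: it simply counts @mentions and #hashtags with regular expressions, and the sample tweets are invented for illustration.

```python
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")
HASHTAG = re.compile(r"#(\w+)")


def count_tokens(texts, pattern):
    """Count case-insensitive matches of `pattern` across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(token.lower() for token in pattern.findall(text))
    return counts


def top_share(counts, top_k=2, out_of=10):
    """Percentage of the top `out_of` items' volume held by the top `top_k`."""
    top = counts.most_common(out_of)
    total = sum(n for _, n in top)
    head = sum(n for _, n in top[:top_k])
    return 100.0 * head / total if total else 0.0


# Invented example tweets, not taken from the study's data set.
texts = [
    "RT @ArvindKejriwal: thank you Delhi #AAP",
    "@ArvindKejriwal and @AamAadmiParty fought well #AAP #Elections2014",
    "Watching results with @narendramodi trending #Elections2014",
    "RT @AamAadmiParty: onward #AAP",
]

mentions = count_tokens(texts, MENTION)
hashtags = count_tokens(texts, HASHTAG)
print(mentions.most_common(3))
print(top_share(hashtags, top_k=1))
```

With real data, `top_share(hashtags, top_k=2)` would give the kind of figure the authors report for the top two hashtags as a share of the top ten.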

“As Table 6 shows, the most mentioned campaign themes in the print sample and on Twitter were essentially the same in the sample period (May 1 to 17).  The major exceptions are the absence of references to platforms and policies on Twitter and the modest attention to corruption.

Table 6. Themes in newspaper and Twitter content (% of reports / tweets in which theme is mentioned)

Columns: % of items mentioning topic, May 1-16; % of tweets mentioning topic, May 5-17

Rows:
Leadership / character
Regional influence on election
Corruption / accountability
Polls / predictions
Platforms / manifestos
Dirty tricks in election
Number of items

Source: Drache-Fletcher 2014 Indian Election Data Set

This methodology led their team to novel findings:

“In general, the key elements of the AAP campaign were present in the tweets. Direct positive references to the AAP were present in 6.2% of tweets, with the majority emphasizing fighting corruption as the reason for their support (4.9% of all tweets). During this period, negative references to the AAP were more common than positive ones (11.2% of the tweets), primarily reflecting disappointment with the party’s standing in the polls or Kejriwal’s decision to abandon his Delhi assembly seat to contest the national election (5.4% of tweets). Nevertheless, many tweets later in the period supported the view that the party had done well for a first campaign and would do better in future (7.9% of tweets posted on May 16 and 17).”

And in conclusion they found that social media is moving from the margins to the mainstream of political communication:

“The rise of social media such as Facebook and Twitter has complicated the complex relationship between media and diversity, challenging conventional understandings of the ‘gatekeeper’ and ‘watchdog’ functions of the news media. As such, satellite channels today rely heavily on social media accounts for their stories. At the same time, as seen in the case of Egypt and Tunisia, it is often their framing of these accounts that allow social media stories to have national or international credence. The public expression of dissent in the Middle East demonstrates the consequences of the devolution of power downwards, as allowed by the participatory technology of Web 2.0. In another example from Canada, in 2008, about 400,000 citizens signed online petition to protest the prorogation of the minority parliament by the Prime Minister. This event enables us to better understand how the web has helped social activism, not always, but increasingly and unpredictably, move from the margins to the mainstream of the political landscape.”

For a full copy of the paper, contact Daniel Drache:

drache [at] yorku [dot] ca

To know more about NexaMaster as a research solution, write to us at:

info [at] nexalogy [dot] com

Alberta’s Orange Revolution

Alberta’s election results this year took everyone by surprise – even the pollsters who predicted them. To their great relief, the seemingly unlikely forecast that the NDP would topple the Conservatives’ 43-year reign turned out to be absolutely spot on.

Given the uncertainty of the outcome and the polls’ previously shaky track record in the province, we thought it would be instructive to run the social media conversation around the elections through our platform and make a comparison.

The polls closed at 8pm GMT, and the flow of Twitter commentary on the elections peaked at 9:00pm GMT, when the CBC predicted the NDP victory on air. Therefore, we decided to compare the conversation for 12 hours before and after the CBC’s announcement.

This lexical map charts the 125 most important words in the discussion leading up to 9pm and their interconnections. It reveals some pretty interesting themes. There is a huge push for increased voter participation, especially in the wake of Prince Edward Island’s outstanding voter turnout of 81% in Monday’s provincial election. It seems this spurred a little friendly competition – though Alberta’s turnout was ultimately only 58.25%, that was still the province’s highest in 22 years.

Nexalogy Lexical Map

A popular reference to PEI’s massive voter turnout:
PEI turnout

Several tweets bantered about the potential inaccuracy of the polls. There was concern on the part of NDP partisans that polls would turn out to be disastrously wrong, or that they might even mislead part of the voting public into assuming the Conservative party would lose, and therefore not bother to go vote. This may have significantly contributed to voter turnout.

The pollsters were regarded with considerable suspicion.

There was also a rather entertaining conversation about “pigs” as the results began to come in.

10th top retweet:
Pigs Are Flying

Rachel Notley was the top influencer leading up to CBC’s prediction, with a whopping 1255 results, followed closely by @albertandp.
Nexalogy -  Top Influencers Before

We tabulated the mentions of different leaders and parties in order to determine their share of voice and turned up a couple of interesting tidbits. Notley and Prentice, but especially Notley, have an overwhelming number of mentions in their own right, indicating popularity reminiscent of Jack Layton’s “orange wave” in Quebec. Jean is proportionally much less significant, but this is not that surprising: he has been at the helm of Wildrose for a relatively short time.
Nexalogy - Share of Voice of Party Leaders
Nexalogy - Share of Voice of Party Leaders with Party Hashtags
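For readers curious how a share-of-voice tabulation can work, here is a simplified Python sketch. It is an illustration under our own assumptions rather than a description of the platform's internals: the search terms per leader and the sample tweets are hypothetical, and each tweet counts once for every leader it mentions.

```python
import re


def share_of_voice(texts, terms_by_label):
    """Return each label's percentage of all label mentions across the texts."""
    # One case-insensitive, whole-term pattern per label.
    patterns = {
        label: re.compile(
            r"(?<!\w)(?:" + "|".join(re.escape(t) for t in terms) + r")(?!\w)",
            re.IGNORECASE,
        )
        for label, terms in terms_by_label.items()
    }
    counts = {label: 0 for label in terms_by_label}
    for text in texts:
        for label, pattern in patterns.items():
            if pattern.search(text):
                counts[label] += 1
    total = sum(counts.values())
    return {label: (100.0 * n / total if total else 0.0) for label, n in counts.items()}


# Hypothetical search terms and invented tweets.
leaders = {"Notley": ["notley"], "Prentice": ["prentice"], "Jean": ["brian jean"]}
texts = [
    "Congrats Notley! #abvote",
    "Notley vs Prentice tonight",
    "Prentice concedes",
    "Notley wins",
    "Brian Jean holds his seat",
]

leaders_share = share_of_voice(texts, leaders)
print(leaders_share)
```

Adding party hashtags to each leader's term list would give the second variant of the chart, where party mentions are folded into the leader's share.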

Following 9pm, expressions of congratulations and relief by Notley’s supporters were at the heart of the discussion. Major sidebars concern references to Jack Layton, celebration of Greg Clark for winning the first seat for the Alberta Party, and schadenfreude around Prentice’s loss as well as anticipation of his resignation speech.

Nexalogy Lexical Map - After

A comment from a Jack Layton fan:
Jack Layton

Though there was considerable anxiety on social media that pollster predictions of an NDP landslide would somehow “jinx” the party, they ultimately proved correct. The gender balance in the conversation was fairly even with slightly more men, and the Twitter demographics of the #abvote discussion show a middle-aged cohort of journalists, educators, writers and senior managers who used Twitter well to support their cause.



Overwhelming support of the NDP in general and Notley in particular is apparent in the Twitter conversation leading up to the peak of discussions, and relief, amazement, and pride are dominant in the aftermath. The alignment of social media analysis and advance polling predictions is good evidence for the validity of the former to bolster the latter – especially in the case of an impending major political upset.

The “orange crush” victory of the NDP in Canada’s most longstanding Conservative bastion has given way to fervent discussion of the results at the federal level. It will be interesting to see how the next federal elections play out in light of Alberta’s major political upheaval, and clearly worthwhile to pay attention not only to advance polls but also to social media buzz as indicators of future votes.

#MTLBDW: MTL Big Data Week

Last week, Nexalogy had the pleasure of helping bring Big Data Week to Montreal for the second time, in conjunction with MTL DATA, an organization for professionals of all domains who want to solve problems using data. The Montreal community came together in full force, and we were blown away by the support from partners, sponsors, and attendees.

#MTLBDW kicked off on Monday at Centre, with experts from Nexalogy, Ajah, Datacratic, Intel Security, KeaText, and McGill HPC.

View clip here

Pablo’s slides here

Tuesday was the first ever Open Data Book Club in Montreal, where attendees gathered to collaboratively explore an open dataset, in this case the STM metro turnstile data.

Wednesday was Papers We Love, where Peter Zion, Chief Architect at Fabric Engine, presented two great papers on deep learning: Yann LeCun’s seminal 1998 paper, “Gradient-Based Learning Applied to Document Recognition”, and Krizhevsky, Sutskever, and Hinton’s 2012 paper, “ImageNet Classification with Deep Convolutional Neural Networks”.

On Thursday, McGill HPC gave an introductory workshop on GPU/CUDA, and Saturday brought the much-anticipated hackathon, where the data and developer communities came together to build cool big data projects in 24 action-packed hours at La Gare, a spanking new co-working space.

Highly skilled mentors were on-site to give lightning talks/workshops and lend a hand as needed.

La Gare closed up at midnight on Saturday and opened up again at 8am on Sunday, when keen hackers came in bright and early to prepare for the hour of judgement to come.

1pm. Showtime. Meet the judging panel. From left to right,

- Nicolas Kruchten, Staff Software Engineer, Datacratic

- Alexis Smirnov, Chief Evangelist, Radialpoint

- Greg Gilbert, VP R&D, Analytics, Aviva Canada

- Michael Lenczner, CEO, Ajah

- Charles-Olivier Simard, VP & CTO, KeaText

MTLBDW 2015 Judges

Presentation #1: Select your own defining criteria for a perfect neighbourhood, and Dream Island will help you identify it. Git Repo | Demo

Presentation #2: Attitude Analysis used a corpus of 988,000 tweets containing the words “feminism”, “feminist” or “feminists”, retrieved from Twitter’s Search API from January to April 2015. The team trained a classifier to label them as pro-feminist, anti-feminist or neither (regardless of sentiment), and determined the most characteristic words used by each group with the log-likelihood keyness method. Git Repo | Slides – Congratulations on winning the Data Science Award and Data Visualization Award!
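For anyone unfamiliar with log-likelihood keyness, here is a small Python sketch of the standard G2 statistic used in corpus linguistics to find words that are unusually frequent in one corpus compared with another. This is our own illustration, not the team's code (see their repo for that), and the two tiny token lists are made up.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """G2 for a word seen `a` times in a corpus of `c` tokens and `b` times in one of `d`."""
    e1 = c * (a + b) / (c + d)  # expected count in corpus 1
    e2 = d * (a + b) / (c + d)  # expected count in corpus 2
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2.0 * g2


def keywords(target_tokens, reference_tokens, top_n=5):
    """Words over-represented in the target corpus, ranked by G2."""
    target, reference = Counter(target_tokens), Counter(reference_tokens)
    c, d = sum(target.values()), sum(reference.values())
    scored = []
    for word, a in target.items():
        b = reference[word]
        if a / c > b / d:  # keep only words relatively more frequent in the target
            scored.append((log_likelihood(a, b, c, d), word))
    return [word for _, word in sorted(scored, reverse=True)[:top_n]]


# Made-up token lists standing in for two groups of tweets.
group_a = "equality rights equality vote".split()
group_b = "rights vote vote news".split()
key_a = keywords(group_a, group_b)
print(key_a)
```

A word that is equally frequent in both groups scores zero, so only vocabulary that really sets one group apart floats to the top.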

Presentation #3: Hackapun hacked an algorithm that discriminates between “homonymous puns” and non-pun sentences that contain homonyms. Git Repo – Congratulations on winning the Big Data Award!

Presentation #4: Lingua Franca hacked a recommendation engine for #topic engagement, which creates and grows a receptive audience by finding common ground using Nexalogy’s API, and #idlenomore Twitter data. Git Repo | Demo | Slides – Congratulations on winning the Grand Prize!

“Lingua Franca” means a common third language used to communicate between two parties with differing backgrounds. How it works: a user searches for a hashtag or topic. The Lingua Franca engine finds all topics related to that original hashtag or topic, and produces a ‘score’ for how closely related those topics are to the original topic. Lingua Franca then runs the “common topics” through Twitter again and identifies individuals who would make a receptive audience for the original topic. The cross-platform identities of those individuals are then presented, combined with “suggested topics” to engage them with. Here is a video of the hack in progress:

Lingua Franca from Gnito on Vimeo.
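The Lingua Franca flow described above (score related topics, then look for people who talk about those topics but not the original one) can be sketched in a few lines of Python. This is only our guess at the general shape: the team's real scoring and their use of Nexalogy's API are not shown here, the Jaccard overlap is a stand-in score we chose, and the users and tweets are invented.

```python
import re
from collections import defaultdict

HASHTAG = re.compile(r"#(\w+)")


def related_topics(tweets, seed):
    """Score hashtags that co-occur with `seed` by Jaccard overlap of their tweet sets."""
    seed = seed.lower()
    tweets_by_tag = defaultdict(set)
    for i, (_, text) in enumerate(tweets):
        for tag in HASHTAG.findall(text.lower()):
            tweets_by_tag[tag].add(i)
    seed_ids = tweets_by_tag.get(seed, set())
    scores = {}
    for tag, ids in tweets_by_tag.items():
        if tag != seed and ids & seed_ids:
            scores[tag] = len(ids & seed_ids) / len(ids | seed_ids)
    return scores


def receptive_audience(tweets, seed, scores):
    """Users who tweet about related topics but have not used the seed hashtag."""
    seed = seed.lower()
    engaged, candidates = set(), set()
    for user, text in tweets:
        tags = set(HASHTAG.findall(text.lower()))
        if seed in tags:
            engaged.add(user)
        elif tags & set(scores):
            candidates.add(user)
    return candidates - engaged


# Invented (user, text) pairs for illustration.
tweets = [
    ("ana", "#idlenomore rally today #environment"),
    ("ben", "#idlenomore #treaty"),
    ("cy", "clean water matters #environment"),
    ("dee", "hockey night #habs"),
    ("ana", "more on #environment"),
]

scores = related_topics(tweets, "idlenomore")
audience = receptive_audience(tweets, "idlenomore", scores)
print(scores, audience)
```

The related topics with the highest scores double as the “suggested topics” to open a conversation with the people in the audience set.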

Presentation #5: One-man team Frederic Gingras’ initial idea was to gather insights on what makes successful open-source projects thrive and work so well. The problem is the huge gap between the mass of good but unknown projects and the few that are widely used and maintained. Git Repo – Congratulations on winning the Solo Dev Award, a 1-year subscription to Safari!

Presentation #6: Great Explorer’s Footprint looked at street names along French Canadian explorers’ travels and used simple n-gram frequencies to categorize the language of the streets. Git Repo | Demo
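As a rough idea of what n-gram language categorization can look like, here is a toy Python sketch using character trigrams. It is not the team's implementation: the tiny training lists and the street names are assumptions chosen for illustration, and a real profile would be built from far more text.

```python
from collections import Counter


def ngrams(text, n=3):
    """Character n-grams of the lowercased text, padded with spaces."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def build_profile(words, n=3):
    """Relative frequency of each n-gram across the training words."""
    counts = Counter(g for w in words for g in ngrams(w, n))
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}


def classify(name, profiles, n=3):
    """Pick the language whose profile gives the name's n-grams the most weight."""
    scores = {
        lang: sum(profile.get(g, 0.0) for g in ngrams(name, n))
        for lang, profile in profiles.items()
    }
    return max(scores, key=scores.get)


# Tiny hypothetical training sets.
profiles = {
    "french": build_profile(["rue", "chemin", "avenue du parc", "boulevard saint laurent"]),
    "english": build_profile(["street", "road", "park avenue", "king street"]),
}

print(classify("rue saint denis", profiles))
print(classify("queen street", profiles))
```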

Presentation #7: EXPG Hockey calculated expected goals based on shot location using the SPORTLOGiQ datasets. Git Repo

Presentation #8: Fedorer Streamer finds relationships in the fedora event stream. Git Repo
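A very simple way to think about expected goals is to ask how often shots from a similar spot have gone in before. The Python sketch below bins shots by distance from the net and uses the historical scoring rate in each bin. It is an illustration only: the coordinate convention (net at the origin), the bin size and the shot data are all assumptions, not the SPORTLOGiQ format or the team's model.

```python
import math
from collections import defaultdict


def fit_xg(shots, bin_size=10.0):
    """Goal rate per distance bin from (x, y, is_goal) shots; net assumed at (0, 0)."""
    made, taken = defaultdict(int), defaultdict(int)
    for x, y, is_goal in shots:
        b = int(math.hypot(x, y) // bin_size)
        taken[b] += 1
        made[b] += int(is_goal)
    return {b: made[b] / taken[b] for b in taken}


def expected_goals(shots_xy, model, bin_size=10.0):
    """Sum the per-bin goal rates over a list of (x, y) shots; unseen bins count as 0."""
    return sum(
        model.get(int(math.hypot(x, y) // bin_size), 0.0) for x, y in shots_xy
    )


# Invented shot history: (x, y, scored?)
history = [
    (3, 4, True), (6, 0, False), (0, 8, True), (5, 5, False),
    (12, 9, False), (15, 0, True), (0, 18, False), (10, 10, False),
]

model = fit_xg(history)
xg = expected_goals([(3, 4), (12, 9), (30, 0)], model)
print(model, xg)
```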

For more hackathon photos, check out La Gare’s album on Facebook.

Thanks to all who joined us for MTLBDW! Nexalogy & MTL DATA will be joining forces again to champion a Data tent at the International Startup Festival, July 15-18 – contact us if you’d like to get involved, or click here to purchase a ticket at 10% off.