OK, “Ostracism is Community” sounds very Orwellian. It could very well be a mantra for a dictatorship based on isolating people, or one of the lines from The Sphinx in Mystery Men.

But there is something to this mantra. The story starts with Stowe, as many stories do. In our discussions he had brought up mathematically based community identification algorithms. It turns out this is a question that has plagued graph theory specialists for a long time, and not something an astrophysicist could solve on the back of an envelope (bear in mind that in astronomy an order of magnitude is often seen as close enough).

So Stowe sent me a paper that proposed a new algorithm for the detection of communities in networks. Modularity is the key metric in this business, and this paper is no exception. The quick explanation of modularity is that it measures how much more tightly knit a subnetwork is than randomness would predict for the same group of nodes. I will spare you the details of the paper, as my tenuous grasp of graph theory barely enabled me to understand them myself, but suffice it to say that small communities are very hard to delimit and that the calculation time required to perform this analysis is prohibitive for large networks.
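
For readers who like to see the arithmetic, here is a minimal sketch of a modularity calculation. It assumes the standard Newman-Girvan definition (share of edges inside each community, minus the share expected if edges were placed at random given node degrees), which may differ in detail from the variant used in the paper. The function name and the toy network are made up for illustration.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Newman-Girvan modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping each node to a community label.
    """
    m = len(edges)
    intra_edges = defaultdict(int)   # edges with both ends in the same community
    degree_sum = defaultdict(int)    # total degree of the nodes in each community
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            intra_edges[community_of[u]] += 1
    # Observed share of internal edges minus the share expected by chance.
    return sum(intra_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Toy network (made-up data): two triangles joined by a single bridge edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_two = modularity(edges, two_groups)   # one community per triangle
q_one = modularity(edges, one_group)    # everything lumped together
```

Splitting the toy network into its two triangles scores higher than treating the whole network as one community, which scores exactly zero. That is the "when do you stop?" problem in miniature: the score only tells you which of the partitions you tried is better, and the number of possible partitions explodes as the network grows.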

What makes algorithmic community detection so difficult is a problem inherent in most algorithms: When do you make it stop? Mathematically, communities exist within networks, and in society the same is true. In today’s nomenclature we often confuse networks with communities. Where does the community end if it doesn’t take up the whole network? What maximum number of communities do we want to set in the network as the boundary condition?

The delimitation of communities is then a fundamental distinguishing factor between a community and a network; in fact, it is a necessary condition of community. Communities are more closely knit (as modularity calculations demonstrate), but additionally, at least in real human communities, they also include goals or a set of guiding principles, and the human boundary condition to community is a process of excommunication or ostracism. You may ask: “How can something as vile and hurtful as ostracism be a part of something as inclusive and nurturing as community?”

In fact, it is the ostracism that makes the community nurturing, because without ostracism the bonds of trust that communities need are broken by the disruptive elements that may operate within that community. The community’s only answer to this is to expel the elements which break either the guiding principles of the community or injure the bonds of trust that enable it to take action.

The very word ostracism comes from an actual community that regularly kicked people out: Athens, in fact the first recorded democracy in history. Every year Athenians had a secret vote on one citizen they would choose to expel. They wrote the names of people they wanted to exclude on pieces of broken pottery called ostraka (singular: ostrakon).

The Athenian who received the most “votes” was then kicked out of Athens, or ostracized. For instance, this ostrakon has the name of the Athenian statesman Cimon written on it. It’s kinda like voter recall but for everyone.

It sucks when the cool kids don’t want to hang out with you, I know (astrophysicists are rarely part of the cool kid crowd in high school). Ostracism hurts when it happens, but every one of us has told a troll to leave the discussion on a forum or thread (Mr. Troll, you know who you are). We have all stopped inviting the party guest who has to be right all night, derails every discussion, starts fights and pisses every other guest off. We have excluded the carnivore at the vegetarian potluck, the swing dancer at a tango night, or my favorite, the speed metal bassist in a country band. We have all done it, because we all cherish our communities and the goals they aim to achieve.

A recent startup called SNTMNT has opened its Trading Indicator API, which provides price predictions for S&P 500 stocks. The company says its accuracy is about 64%, which suggests that users of the API will be combining these indicators with other sources of information.

The approach is a very detailed level of analysis of Twitter sentiment about the stocks, based on so-called ‘stock tickers’ — references to companies like AT&T, whose ticker is T. Many Twitter users talk about stocks, referring to the companies by ticker,  which is preceded in most cases by a dollar sign, as in $T.
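
To make the ticker convention concrete, here is a small sketch of pulling dollar-sign tickers out of tweet text. It is not SNTMNT’s method, which the company has not described here; the pattern assumes tickers are one to five uppercase letters, and so ignores symbols with dots or digits. The function name and sample tweet are invented.

```python
import re

# Assumption: a ticker is 1-5 uppercase letters right after a dollar sign.
# The lookbehind rejects a "$" glued to a preceding word character or "$".
CASHTAG = re.compile(r"(?<![A-Za-z0-9_$])\$([A-Z]{1,5})\b")

def extract_tickers(text):
    """Return the ticker symbols mentioned in text, in order of appearance."""
    return CASHTAG.findall(text)

# Made-up tweet: two tickers and one plain dollar amount.
sample = "Long $T and $AAPL today, and paid $5 for lunch"
tickers = extract_tickers(sample)
```

The dollar amount is skipped because the pattern requires letters after the sign. A real pipeline would still need to check candidates against a list of actual S&P 500 symbols before scoring sentiment.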

Johan Bollen and other researchers published a paper, Twitter mood predicts the stock market, back in 2010, which was widely discussed. Their work suggests that the general mood on Twitter predicts the rise or fall of the daily closing price of the Dow Jones Industrial Average with 86.7% accuracy. SNTMNT is strongly influenced by that work, and is another indicator of the growing possibilities in computational social science.

My prediction is that sentiment analysis at either the macro or micro level (predicting aggregate stock market moves or the trends for specific stocks) will become commonplace over the next few years, one of the most obvious applications of big social data.

Now, if someone could only predict the weather…

On June 30th, 2012, in order to keep the time of day close to the mean solar time, a second was inserted into our calendars.  So here in Montreal, clocks showed 19:59:59, then 19:59:60, then 20:00:00.

You might guess that a leap second can disturb the highly time-dependent computing world, something like Y2K.

You would be right.

In fact, several web services were affected by that time distortion: Reddit, Wikimedia, Mozilla to name a few… and us.
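
One way to see why a 61-second minute is awkward for software: many time libraries simply have no slot for second 60. The sketch below uses Python’s standard datetime module as an example. It is not a diagnosis of what went wrong in any of the services named above, and the helper name is made up.

```python
from datetime import datetime, timedelta, timezone

def make_utc(year, month, day, hour, minute, second):
    """Build a UTC datetime, or return None when the fields are not representable."""
    try:
        return datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
    except ValueError:
        return None

before_leap = make_utc(2012, 6, 30, 23, 59, 59)
leap_second = make_utc(2012, 6, 30, 23, 59, 60)   # the inserted second
after_leap = make_utc(2012, 7, 1, 0, 0, 0)

# Montreal was on daylight time (UTC-4) that evening.
montreal = timezone(timedelta(hours=-4))
local_before = before_leap.astimezone(montreal)

# datetime arithmetic counts 1 second between these instants,
# although 2 real seconds elapsed because of the insertion.
gap = (after_leap - before_leap).total_seconds()
```

Here `leap_second` comes back as `None` because datetime only accepts seconds 0 through 59, and `local_before` lands on 19:59:59, matching the Montreal clock reading above.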

Our analysis systems thus became confused for a while, and new Tweets could not be fetched; fortunately we have recovered most of them, using the current means we have for time travel!

Leap seconds seldom occur: there have been only three so far this century. We’re happy to say the underlying technical fix is on its way into our infrastructure.

Thank you for your patience in this matter.

Gossip, talking about people when they are not present, is a staple of human societies, a universal aspect of human interaction. Not surprisingly, gossip occurs in all social contexts, including online conversations, like email, Twitter, and instant messaging.

“Gossip is a sort of smoke that comes from the dirty tobacco-pipes of those who diffuse it: it proves nothing but the bad taste of the smoker.” – George Eliot

We may consider gossip negative, as George Eliot did, but the anthropological research is fairly consistent in showing that gossip is necessary to healthy social organizations, whether small or large. Robin Dunbar proposed the idea that human language evolved from the latent desire to gossip (see Grooming, Gossip, and the Evolution of Language). There is a great deal of research into gossip as an exchange of social information that might prove useful, and into its moral dimension: judgments about the actions of others can lead to social repercussions for those considered to be bad actors, or untrustworthy.

So, it will come as no surprise that a study of gossip in a large collection of email (the Enron email database of 517,431 messages) shows a consistent set of patterns of gossip. This was reported in Have You Heard?: How Gossip Flows Through Workplace Email, where the researchers, Tanushree Mitra and Eric Gilbert of Georgia Institute of Technology, analyzed the email and found 7,206 messages that they believed were clearly gossip-oriented.

Their findings are very revealing:

“[...] sending email to a small set of people is more frequent and it is more common to see gossip in messages targeted to a smaller audience.

[...] gossip is present in both personal and business email and across all sections of the hierarchy, which demonstrates its all-pervasive nature in organizations.

[...] the hierarchical position of an employee affects his gossip behavior, both in terms of his frequency of gossip and the audience with whom he gossips. Our results indicate that people are most likely to gossip with their peers.

[...] gossip is a social process. Some people are actively involved in generating gossip messages (“gossip source”), while others are silent readers of the messages (“gossip sink”), and there are some who play both roles.

[...] frequent dyadic email interactions do not show an increase in gossip email. [That is to say that those who are likely to be working more closely may have other opportunities to gossip than email.]”

Other findings are more organizational, like the finding that VPs and Directors at Enron were very likely to move gossip-related emails up to their next immediate supervisor. Also, people at the bottom of the totem pole are most likely to gossip, and to do so among themselves.

Gossip Flow Across The Enron Hierarchy

This research also suggests that lower tier management is least involved in gossip, although that suggests they may also be the topic of the most gossip.

Enron was a disaster as a company, but the patterns of gossip there may well not be unusual. The researchers did not delve very deeply into the question of what was being said in these emails, aside from some superficial observations (like the phrase ‘in response to your email’ being common). However, their sentiment analysis suggests that gossip is more strongly correlated with negative emotions (37.44%) than positive ones (13.99%), but most strongly linked to neutral emotions (48.63%). My hunch is that the more negative the emotional state of the company, the more negative gossip might arise, but that gossip itself is a kind of background radiation, and is always present.

We were profiled by PwC. Nice one!

Here at Nexalogy we have been developing new ways to cut to the heart of online conversations and reveal the discussions that matter most. We have been putting social media intelligence technology to work to find better business intelligence solutions. This week, we are happy to report these efforts are lauded in the most recent edition of PwC’s technologyforecast.

The technologyforecast highlights Nexalogy’s work with Josée Latendresse and the Latendresse Groupe Conseil, looking at online conversations about boots made for industrial uses. Using our custom lexical analysis, not only did we discover that people were talking about the boots in terms of safety, but interestingly, important conversations were also talking about them as a fashion statement. That revelation helped Latendresse discover a whole new niche they did not know existed, despite focus group research.

Unlike the anecdotal discussions of a focus group, online conversations are sprawling, unmoderated, and diverse. We have honed our skills to sift through the landscape of online conversation, highlight what is important, and give expert analysis on the back-end. PwC points out that Nexalogy is among the first to take advantage of cloud analytics. We are delighted they have featured our work and thank the PwC Centre for Technology and Innovation for their ongoing interest in our efforts. Thanks!

Think smart? We do! Nexalogy’s software solutions have been developed with the academic community in mind. “Making Sense of the Conversation Concerning Sleep Apnea on the Internet”, a study that Nexalogy collaborated on with McGill University, Mount Sinai Hospital and ORSMedical, illustrates how Nexalogy’s solutions can be effectively used in research.

Contact us for more information on how our full range of solutions can be put to use in your academic institution.


One thing we can learn from even a casual inspection of the science of networks is the limits of our everyday understanding of people. For example, consider how much time and energy social metrics companies are spending convincing us that they can find those with the highest levels of influence (‘influencers’) relative to a market or a group of brands, and once discovered, these influencers simply need to be influenced as a stepping stone to convincing the entire market to buy your corn flakes, mobile device, or book. A sort of marketing Domino theory.

But it turns out that people, and marketers, don’t really understand influence very well, despite being embedded in social networks their entire lives: we really don’t understand the way that we are influenced by other people. For example, if someone touches you when you first meet, you are ten times more likely to remember that person. But we are unaware, later, that the touch was the reason for our recollection. We underestimate the impact of a kind word, or the chilling effects of workplace fear. There are dozens of examples of this sort coming out of cognitive science that demonstrate that we are being strongly influenced below the conscious level, physiologically, all the time. The actions of others can make us fearful, or confident, or curious, or suspicious, and it can happen invisibly. People just don’t have great insight into the social interactions of people, despite being involved in them.

Most contemporary thinking about our social interactions is derived from an economic view that considers groups as collections of individuals, where each individual makes more-or-less rational decisions intended to maximize benefits to themselves and their loved ones.

I think there is an analogy with the historical physics view of how fluids, and water specifically, work.

Like people, water is everywhere, and we come into contact with it every day, when we wash, cook, drink, or bathe. But, just like people, close contact with water does not let us understand water’s workings. And, strangely, we don’t come into regular contact with other liquids to any extent like our experience of water, so it is both familiar and yet badly understood.

For example, liquid water is not an amorphous blob of H2O molecules, as I was taught in high school. It is a complex, quasi-crystalline substance, with giant aggregations of water molecules forming and breaking apart all the time. These supermolecules are responsible to a great extent for the extremely unusual qualities of water, like its high heat capacity: water changes temperature very slowly because these supermolecules are slow to change their rates of vibration. If water were really just a bunch of uninvolved H2O molecules, it wouldn’t be anything like water. Water is also inclined to move from networks of supermolecules into a crystalline solid as it gets colder, which leads to one of the strangest qualities of water: solid water floats on liquid water. We take this for granted, and don’t consider it unusual, but it is very, very unusual, and life on Earth depends on that property of water.

Just like water, we are around people all the time, but we don’t understand how connected together we are. It’s easy to consider people as individuals, bumping into each other like billiard balls, making independent decisions, individuals coincidentally living and working in close proximity. But we aren’t like that, at all, any more than water is made up of totally independent molecules.

The new physics has opened our understanding of the most common liquid on earth, one absolutely central to life, one that we touch every day. Just so, social physics will allow us to understand our connection to each other, and how our thinking, beliefs, values and behavior are almost totally shaped by social ties to others. But it may require that we reconsider almost everything we think we know about ourselves and others.

Valdis Krebs makes an interesting supposition. He wonders if Amazon might launch a new social network based on the connections we have through books. If I have commented on David Weinberger’s new book, Too Big To Know, that could connect me with others who have read and commented on it.

Below is a network map (via social network analysis) of a very interesting new book: Too Big to Know [2B2K] by internet scholar David Weinberger. David’s book is shown by the magenta node in the center of the network. Directly connected to his book are the books that Amazon mentions customers also bought [green nodes], in addition to 2B2K. These books are probably more similar to 2B2K than different. The blue nodes are books that are 2 steps away from 2B2K; they are probably more different from 2B2K, but retain similarities. The arrows show the direction of the majority of also-bought activity. If you find 2B2K interesting, you will probably find a pleasant read in one of the green books or possibly a blue book, depending upon your desire for difference.
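
The one-step and two-step colouring described for the map can be reproduced with a breadth-first search over also-bought links. This is only a sketch of the idea, not how Valdis built his map: the function name is invented, the links are followed in the direction of the arrows, and the book titles are placeholders rather than real also-bought data.

```python
from collections import deque

def books_within(also_bought, start, max_steps=2):
    """Breadth-first search: map each reachable book to its step count from start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        book = queue.popleft()
        if distance[book] == max_steps:
            continue  # do not expand beyond the requested radius
        for neighbor in also_bought.get(book, []):
            if neighbor not in distance:
                distance[neighbor] = distance[book] + 1
                queue.append(neighbor)
    return distance

# Placeholder titles only; these are not real also-bought data.
also_bought = {
    "2B2K": ["Book A", "Book B"],
    "Book A": ["Book C", "2B2K"],
    "Book B": ["Book C", "Book D"],
    "Book C": ["Book E"],
}
steps = books_within(also_bought, "2B2K")
green = sorted(b for b, d in steps.items() if d == 1)  # one step away
blue = sorted(b for b, d in steps.items() if d == 2)   # two steps away
```

In this toy data "Book E" sits three steps out, so it never shows up on the map at all.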

Today, Amazon introduces you to similar books. Tomorrow, they will introduce you to similar readers.

Of course, Amazon doesn’t have to build such a network from scratch; they could begin by acquiring Goodreads or Readmill.

Readmill would be particularly relevant, since it is based on highlights and notes that readers make within the Kindle experience, which are shared with friends, as shown here:

I think Valdis is onto something, since reading is perhaps one of the deepest ways of characterizing our identities. The works that have informed and shaped us could be a bridgework to bring us closer together, certainly.

The city is a built environment, and has been designed by our actions and behaviors. I grew up in Boston, where there is a perplexing layout in the oldest sections of town; the apocryphal rationale is that the streets were paved over the wanderings of cows and drunken sailors.

There are cognitive universals in how people choose to wander, and what paths we would rather use in cities, and what emerges is a spectrum of quite complex societal outcomes.

Walk this way via The Economist

According to Ruth Conroy Dalton of the University of Northumbria, people perceive routes with changes in directions to be longer than straighter ones, even if they are actually the same distance. Odder still, equivalent routes that have landmarks on them are also reckoned to be longer than routes that do not. That may be because memorising changes in direction and landmarks both require the brain to do more work than a route that simply heads in as straight a line as possible.

These preferences may help explain why it is that some city streets are more crowded than others. Why is it that Oxford Street, for instance, is London’s busiest shopping street and not, say, Regent Street or Piccadilly? Tim Stonor of Space Syntax, an architectural consultancy, says that the answer lies in graph theory, a branch of mathematics that studies nodes and the connections between them. Counterintuitively, though, Space Syntax’s model represents street segments as the graph’s nodes and road intersections as connections between the nodes. The resulting topsy-turvy simulation is then used to chart the most linear conceivable route to join every street in a city with every other street. It soon becomes clear that not all roads are equal. Some are more accessible and integrated than others—which is why Oxford Street is more likely to be walked along than any other street in London.

That has implications for how street layouts can be consciously designed to create areas that are more or less vibrant—more suited to shopping, say, or family living. It can also be used to identify places that are unhealthily segregated. Mr Stonor points out that 85% and 96% of riots last August in north and south London respectively took place within a five-minute walk of a post-war housing estate. Most observers would put that down to the fact that the estates’ cheap accommodation draws poorer folk, resulting in pockets of poverty and deprivation whose denizens are more likely to commit crime and engage in acts of vandalism. But Mr Stonor believes that the complex, insular design of many housing estates exacerbates the problem by limiting interactions between people and thus encouraging anti-social behaviour—the exact opposite of what their creators envisaged.

So, network science seems to get at some fundamentals in our thinking about space, and our movement through it. Stonor’s observations show that alienation will arise when groups are dead-ended in pockets of our cities’ networks of streets, and the result is a corresponding isolation from the non-physical, societal social networks and the social capital that they create, and distribute.
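
The topsy-turvy model in the Economist excerpt, with streets as nodes and intersections as connections, is easy to sketch. What follows is a rough illustration and not Space Syntax’s actual software or measure: it assumes a connected layout, uses the average number of turns to reach every other street as a simple stand-in for how integrated a street is, and the street names are made up.

```python
from collections import deque
from itertools import combinations

def dual_graph(intersections):
    """Turn intersections (groups of streets that meet) into a street-to-street graph."""
    graph = {}
    for streets in intersections:
        for s in streets:
            graph.setdefault(s, set())
        for a, b in combinations(streets, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def mean_depth(graph, start):
    """Average number of turns (graph steps) from start to every other street."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        street = queue.popleft()
        for nxt in graph[street]:
            if nxt not in depth:
                depth[nxt] = depth[street] + 1
                queue.append(nxt)
    others = [d for s, d in depth.items() if s != start]
    return sum(others) / len(others)

# Made-up layout: a high street crossed by three side streets,
# one of which leads on to a dead-end housing close.
intersections = [
    ("High St", "North Rd"),
    ("High St", "Mill Ln"),
    ("High St", "Station Rd"),
    ("Station Rd", "Estate Close"),
]
streets = dual_graph(intersections)
ranking = sorted(streets, key=lambda s: mean_depth(streets, s))
```

The high street comes out as the most integrated (fewest turns to everywhere else) and the dead-end close as the most segregated, which is the pattern Stonor describes for busy shopping streets and insular estates.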

I wonder how we will be able to work through complex social issues like the riots in London, the underlying motivations of the Occupy movement, or the growing inequities of our globalized economy, when our leaders have no understanding of the social physics that underlies human association? We are in a world dominated by business leaders, lawyers, and economists, alas, and far too few anthropologists, physicists, and scientists.

Do you have a big data set you’ve been wanting to analyse? We can help.

Nexalogy is pleased to announce that we have developed a JSON API for our clients. By popular demand, we are releasing the API, which will allow you to input data into our system. We have developed a generic system that uses JSON-formatted data with a few simple API calls. That will allow you to create projects and populate them with your data. In the weeks and months ahead, the API will be expanded and made more robust.

You can follow the API Specs here and try it for yourself. Does it allow you to do what you want? What features would you like to see? How can we improve it? We look forward to your feedback.