Tracking social circles and measuring civilization

(This is basically an attempt to build a primitive but working model of Psychohistory using contemporary AI techniques)

Society can be viewed as a collection of humans and a set of overlapping groupings of those humans.

Example groupings are: family, friends, religion, neighbours, city, state

People have varying degrees of interest in each group, i.e. a varying strength binding them to that group. The interests of groupings can align with or oppose each other. People within one group support each other. People from different groups treat each other according to the relation between their groups.

Groupings can overlap and can even be contained in other groupings. This leads to conflicts of interest. For example, nepotism is a condition where a person assigns higher importance to their 'family' group than to the groups derived from their responsibilities, such as 'coworkers', 'company' and 'state'.

I propose the following hypothesis:

In more civilized countries, people assign higher importance / have a higher feeling of belonging to larger groups. Primitive tribes focus on family, friends and neighbours. Civilizations also have larger groupings people can have loyalty to, like the state and society as a whole. The more civilized the country, the stronger this effect.

I believe that this tendency is consistent, measurable, and useful. If we develop an objective way to measure a person's feeling of belonging to various groups, we can use it to create a Civilization Index. I believe that this index would correlate very strongly with how pleasant a society is to live in, much more so than economic indicators like GDP.
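One way to make the index concrete, as a rough sketch only: for each person, take the belonging-weighted average of the share of society that each of their groups covers, then average over all people. The function name, the input layout and the choice to measure group size relative to the whole society are assumptions made for illustration, not a settled definition.

```python
def civilization_index(people, group_sizes, society_size):
    """Average, over people, of the belonging-weighted share of society
    covered by the groups each person feels loyal to (hypothetical definition).

    people:       list of dicts mapping group name -> belonging strength (>= 0)
    group_sizes:  dict mapping group name -> number of members
    society_size: total number of people in the society
    """
    scores = []
    for belonging in people:
        total = sum(belonging.values())
        if total == 0:
            continue  # no measured belonging: skip this person
        covered = sum(
            weight * group_sizes[group] / society_size
            for group, weight in belonging.items()
        )
        scores.append(covered / total)
    return sum(scores) / len(scores) if scores else 0.0


# Toy data (invented): a tribe whose village is the whole society,
# and a larger village where loyalty has retreated to smaller circles.
tribe = civilization_index(
    [{"family": 0.5, "village": 0.5}],
    {"family": 5, "village": 100},
    society_size=100,
)
large_village = civilization_index(
    [{"family": 0.6, "neighbours": 0.4}],
    {"family": 5, "neighbours": 50, "village": 1000},
    society_size=1000,
)
```

Under this definition the tribe scores higher than the large village, because loyalty to a village that makes up the entire society counts fully, while loyalty to family and neighbours inside a bigger society covers only a small share of it.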

This is how some sample types of states would score on this index (very roughly):
- A small isolated tribe actually scores fairly well: So long as the village is small enough that everyone can know everyone, people feel loyalty to the entire village as a whole.
- Once a village's population exceeds Dunbar's Number, people can no longer feel loyal to the entire village as a whole, and tend to focus more on family, friends and neighbours. The index of such a village would be lower than that of a smaller tribe, which reflects the fact that many people view life in such small communities as preferable to life in larger societies.
- A feudal society has a hierarchy of groupings. Besides the groupings mentioned before, there is one additional grouping for each level of the nobility. Most people focus their loyalty on the smaller groups, and there is animosity between many groups. The index would show that feudalism is terrible to live in.
- A larger, capitalist society, like most Western countries, has fewer political/geographical groupings, but people have a stronger belief in abstract ideals that encompass many people, such as a belief in the constitution. In societies with a strong divide between political factions, or with many ideologies and beliefs that are at odds with each other, the index would be lower.
- The ideal of communism would likely score extremely highly on this index. Obviously any practical implementation of communism in reality leads to disaster for a number of reasons, but it's interesting to see that a theoretical perfect communist state would actually score very highly.

It is possible to implement such an index in practice, using only contemporary machine learning technology and access to the database of a sufficiently large social network, such as Facebook.

In addition to providing an index on the degree of civilization of a nation, it would also have a much more practical use:

The model can track changes of groups over time and make predictions about this. It will be able to predict when two political parties will make common cause, or when a group will experience a schism.
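As a minimal sketch of what tracking a group over time could look like, assume a group is represented as a vector of per-person membership strengths, measured once per time window. Comparing the vectors from two windows with a cosine similarity gives a crude stability score. The representation and the function name are assumptions for illustration.

```python
import math


def group_similarity(strengths_then, strengths_now):
    """Absolute cosine similarity between two membership-strength vectors
    that list the same people in the same order.

    The absolute value is used because the overall sign of such a vector is
    arbitrary when it comes from a decomposition like PCA.
    """
    dot = sum(a * b for a, b in zip(strengths_then, strengths_now))
    norm_then = math.sqrt(sum(a * a for a in strengths_then))
    norm_now = math.sqrt(sum(b * b for b in strengths_now))
    if norm_then == 0 or norm_now == 0:
        return 0.0
    return abs(dot) / (norm_then * norm_now)


# Toy vectors (invented): the same two camps with flipped sign,
# and a complete realignment of who sides with whom.
stable = group_similarity([0.5, 0.5, -0.5, -0.5], [-0.5, -0.5, 0.5, 0.5])
realigned = group_similarity([0.5, 0.5, -0.5, -0.5], [0.5, -0.5, 0.5, -0.5])
```

A score that stays near 1 suggests a stable group. A score that drifts toward 0 over successive windows would be one possible warning sign of a merger or a schism, though turning that into an actual prediction needs far more than this.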

There are a myriad of possible applications if it works.

The NSA and Facebook probably already do this, but there is no publicly available repository of such information, or else I would expect news agencies to refer to it quite frequently, since it would provide concrete measures of how much different groups like each other.

This model can also be used to predict the behavior of individuals in relation to groups:

Will two individuals be able to get along, based on their group affiliations? How much does this individual value his family compared to his friends? His nation compared to his religion? How likely is this religious person to join a more radical group of the same religion?
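To illustrate how a question like "family compared to friends" could be turned into a number, here is a rough sketch that computes what share of one person's positive interactions goes to each group. The tuple layout `(sender, receiver, sentiment)`, the membership dictionary and the function name are all assumptions, and positive interactions are only a crude proxy for how much someone values a group.

```python
from collections import defaultdict


def belonging_shares(person, interactions, memberships):
    """Share of a person's positive outgoing sentiment received by each group.

    interactions: iterable of (sender, receiver, sentiment) with sentiment
                  in {-1, 0, +1} (assumed output of sentiment analysis)
    memberships:  dict mapping group name -> set of member names
    A receiver who belongs to several groups counts toward each of them.
    """
    positive = defaultdict(float)
    for sender, receiver, sentiment in interactions:
        if sender != person or sentiment <= 0:
            continue  # only this person's positive interactions count
        for group, members in memberships.items():
            if receiver in members:
                positive[group] += sentiment
    total = sum(positive.values())
    return {g: v / total for g, v in positive.items()} if total else {}


# Toy data (invented names and interactions).
interactions = [
    ("ann", "mum", 1), ("ann", "mum", 1), ("ann", "dad", 1),
    ("ann", "zoe", 1), ("ann", "zoe", -1), ("zoe", "ann", 1),
]
memberships = {"family": {"mum", "dad"}, "friends": {"zoe"}}
shares = belonging_shares("ann", interactions, memberships)
```

In this toy example three of the four positive interactions go to family, so the family share is 0.75 and the friends share is 0.25.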

The implications of this model are immense. While undoubtedly very useful, it also has a lot of potential for abuse.

Worse, predictions about individuals would be much less reliable than predictions about groups, so there would be a lot of false accusations.

Building the model

How could such a model be built with contemporary technology?

First, we need to get the data. What we need is the set of interactions between as many people as possible over time, classified by positive/neutral/negative sentiment. To do this, we can perform sentiment analysis on the messages people exchange on social networks. This is complex, but people are already doing it for various other applications, so for a first version this problem can be considered solved.
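A minimal sketch of how the classified interactions could be stored for the later steps, assuming each interaction arrives as a `(person_a, person_b, sentiment)` tuple with sentiment in {-1, 0, +1}. The tuple layout and the function name are hypothetical. The result is a symmetric person-by-person matrix of mean sentiment.

```python
from collections import defaultdict


def build_sentiment_matrix(interactions):
    """Aggregate interactions into a symmetric matrix of mean sentiment.

    Returns (people, matrix), where people is a sorted list of names and
    matrix[i][j] is the mean sentiment between people[i] and people[j]
    (0.0 if they never interacted). Direction of the interaction is ignored.
    """
    people = sorted({p for a, b, _ in interactions for p in (a, b)})
    index = {p: i for i, p in enumerate(people)}
    totals = defaultdict(float)
    counts = defaultdict(int)
    for a, b, sentiment in interactions:
        key = tuple(sorted((index[a], index[b])))
        totals[key] += sentiment
        counts[key] += 1
    n = len(people)
    matrix = [[0.0] * n for _ in range(n)]
    for (i, j), total in totals.items():
        mean = total / counts[(i, j)]
        matrix[i][j] = mean
        matrix[j][i] = mean
    return people, matrix


# Toy data (invented).
people, matrix = build_sentiment_matrix([
    ("ann", "bob", 1), ("bob", "ann", 1), ("ann", "cat", -1),
])
```

To follow changes over time, one such matrix could be built per time window, for example one per month.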

Now we need to generate groups from this data. Off the top of my head there is one very simple algorithm that would probably already yield good results: Principal Component Analysis. Each Principal Component will represent at least one group of people. It is possible that a principal component represents two distinct groups, the difference between which is described by another principal component. This distinction can be detected by comparing the way the principal components change over time. The mathematics behind this are non-trivial, but shouldn't pose much of a problem to a machine learning expert.
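As a rough illustration of the PCA step, assume the interactions have already been aggregated into a person-by-person matrix of mean sentiment (an assumed data layout). Each row is treated as one person's profile, the columns are centered, and the leading principal component is found by power iteration on the covariance matrix. A real implementation would use a numerical library and keep many components. All names here are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (component, scores) for the leading principal component.

    rows: list of equal-length lists, one profile per person (needs >= 2 rows).
    scores[i] is the projection of person i onto the component.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the columns (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration from a fixed, non-uniform start vector.
    v = [1.0 / (k + 1) for k in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Toy matrix (invented): persons 0 and 1 like each other, persons 2 and 3
# like each other, and the two pairs dislike one another.
sentiment = [
    [0, 1, -1, -1],
    [1, 0, -1, -1],
    [-1, -1, 0, 1],
    [-1, -1, 1, 0],
]
component, scores = first_principal_component(sentiment)
```

In this toy case the scores of persons 0 and 1 share one sign and those of persons 2 and 3 share the other, so a single component separates two opposing groups. This is the situation described above where one principal component represents two distinct groups.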

For making predictions, it is relevant to consider that some types of groupings evolve differently over time than others. The social bonds of religions differ from those of families. These differences can be learned automatically: an additional algorithm finds the behaviors in which the people within a group differ most from those outside it, and makes those differences available as features for a prediction algorithm.
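One simple way to find such differences, sketched under the assumption that each person is already described by a few numeric behavior measurements: compare the mean of each measurement inside and outside the group, scale by its overall spread, and rank by the size of the gap. The measurement names and the function name are invented for illustration.

```python
from statistics import mean, pstdev


def distinguishing_features(behavior, members, top_k=2):
    """Rank behavior measurements by how strongly members differ from others.

    behavior: dict mapping person -> dict of measurement name -> number
    members:  set of people in the group (assumes at least one person inside
              and at least one outside)
    Returns the top_k (name, standardized gap) pairs, largest gap first.
    """
    inside = [v for p, v in behavior.items() if p in members]
    outside = [v for p, v in behavior.items() if p not in members]
    names = sorted(next(iter(behavior.values())))
    gaps = {}
    for name in names:
        spread = pstdev([v[name] for v in behavior.values()]) or 1.0
        gap = mean([v[name] for v in inside]) - mean([v[name] for v in outside])
        gaps[name] = gap / spread
    ranked = sorted(gaps.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:top_k]


# Toy data (invented): members attend gatherings far more often than others.
behavior = {
    "ann": {"posts_per_week": 5, "gatherings_per_month": 4},
    "bob": {"posts_per_week": 6, "gatherings_per_month": 5},
    "cat": {"posts_per_week": 5, "gatherings_per_month": 0},
    "dan": {"posts_per_week": 7, "gatherings_per_month": 1},
}
ranked = distinguishing_features(behavior, {"ann", "bob"})
```

The ranked gaps could then be handed to a prediction algorithm as per-group features, so that it can learn, for example, that groups held together by regular gatherings evolve differently from groups that are not.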

To train the model, experiment until you can get it to automatically recognize important groups that we know to exist, such as families, religions and political parties.

There are a number of problems that need to be addressed and that I am unsure about, but I believe they are all solvable with a concerted effort:

Some groupings have many instances but belong to the same category.

For example: There are multiple cities and multiple families, but each city and each family is the same type of group, which should be recognized by the algorithm.

This is a problem if Principal Component Analysis is used, because the principal components of each individual city or family would be small, even though the influence of families in general on social cohesion is enormous. The algorithm would likely need to be modified substantially to take this into account, so we need something more complex than Principal Component Analysis.

How can this be modeled? How can it be derived automatically without hardcoding things?

How to measure the degree of cooperation/fighting between people?

Sentiment analysis cannot capture everything there is to know about social interactions because some interactions are subtle. If we rely solely on sentiment analysis in its current state of the art, the model would be good enough to be useful, but would still be flawed and could thus probably be exploited somehow.

How to measure the degree of harmony and coordination within a grouping? By the size of the Principal Component? Or something more complex?
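As a baseline to compare more complex measures against, here is a minimal sketch that uses nothing but mean pairwise sentiment: within a group as a measure of harmony, and across two groups as a measure of cooperation or fighting. It assumes a symmetric person-by-person matrix of mean sentiment, and the function names are hypothetical.

```python
def cohesion(matrix, members):
    """Mean sentiment over all distinct pairs of members (row/column indices)."""
    pairs = [(i, j) for i in members for j in members if i < j]
    if not pairs:
        return 0.0
    return sum(matrix[i][j] for i, j in pairs) / len(pairs)


def relation(matrix, group_a, group_b):
    """Mean sentiment over all pairs with one person from each group."""
    pairs = [(i, j) for i in group_a for j in group_b if i != j]
    if not pairs:
        return 0.0
    return sum(matrix[i][j] for i, j in pairs) / len(pairs)


# Toy matrix (invented): two pairs of friends who dislike the other pair.
sentiment = [
    [0, 1, -1, -1],
    [1, 0, -1, -1],
    [-1, -1, 0, 1],
    [-1, -1, 1, 0],
]
harmony_first = cohesion(sentiment, [0, 1])
between = relation(sentiment, [0, 1], [2, 3])
```

Here the first group has a cohesion of 1.0 and the relation between the two groups is -1.0. Such averages ignore subtle interactions and coordination that never shows up as sentiment, so they are only a starting point.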

I think all of these problems are easily surmountable if an expert thinks about them for a while. The potential payoff of building such a model would be immense.