Why study the Marvel Universe?

A graph is simply a mathematical tool. It consists of a set of points (called the nodes) joined by a set of lines (called the arcs). Graphs have many uses in science and engineering. They can model communication networks, electrical circuits, molecules, etc. In the last few years it has become fashionable to study very large graphs, with thousands or even millions of nodes and arcs, mainly because of the socioeconomic interest of the networks that can be modeled as one of these graphs.

For instance, the Internet can be modeled as a large graph, with each node representing a page and each arc representing a link between two pages. Another possible representation is for each node to represent anything related to e-mail (from personal web pages to servers), with arcs representing the direct physical connections that e-mail travels through. Given the economic importance of the Internet, it is crucial to know how a computer virus spreads through the e-mail graph, which nodes could completely disconnect the net should they be attacked by a terrorist group, or whether and how a cascade of server failures could make the whole net collapse.

Large graphs are also used in epidemiology. We can model the spread of infectious diseases, such as AIDS, pneumonia, or sexually transmitted diseases, by studying a graph where each node represents a person and each arc an infection path. It is obviously very relevant to know how a disease makes its way through one of these graphs in order to, for instance, establish vaccination policies. Similar graphs are studied in sociology. In fact, the graphs used in epidemiology are particular cases of social graphs, where the nodes represent people and the arcs some kind of relationship, be it having sexual contact, being friends, having a commercial transaction, or being a member of the same club. These graphs help us study the spread of ideas, information, diseases, or wealth through a given society.

In all these cases the graphs of interest are either too big (as is the case of the web) or not well enough known (as is the case of the infection graphs). This is why, rather than studying the actual graph, scientists create a mathematical model of the graph that allows them to deduce the information they are interested in. This model must preserve both how the graph is created and its basic characteristics. For example, most of the large graphs that have been studied have several properties in common: they have far fewer arcs than they could have; you can usually go from one node to another through a 'short' path (traversing a small number of arcs); two nodes that are each connected by an arc to a third one have a higher probability of being connected to each other than two randomly selected nodes; the fraction of nodes from which k arcs emanate declines with k following a special type of mathematical function, etc.
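The four properties just listed can be measured directly on any graph small enough to fit in memory. What follows is a minimal sketch in Python, using only the standard library. The five-node toy graph and all the function names are invented for illustration and are not taken from the Marvel study. The clustering measure used here is the fraction of closed triples (often called transitivity), which is only one of several definitions in use.

```python
from collections import Counter, deque
from itertools import combinations


def build_graph(arcs):
    """Build an undirected graph as a dict: node -> set of neighbours."""
    adj = {}
    for u, v in arcs:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def density(adj):
    """Arcs present divided by arcs possible (low values mean a sparse graph)."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) // 2  # each arc is counted twice
    return m / (n * (n - 1) / 2)


def average_path_length(adj):
    """Mean number of arcs on a shortest path, over all connected pairs."""
    total = pairs = 0
    for source in adj:
        # Breadth-first search gives shortest distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def clustering(adj):
    """Fraction of pairs of neighbours of a common node that are themselves joined."""
    closed = triples = 0
    for neighbours in adj.values():
        for a, b in combinations(neighbours, 2):
            triples += 1
            if b in adj[a]:
                closed += 1
    return closed / triples if triples else 0.0


def degree_distribution(adj):
    """Map k -> fraction of nodes that have exactly k arcs."""
    n = len(adj)
    counts = Counter(len(nb) for nb in adj.values())
    return {k: c / n for k, c in sorted(counts.items())}


# Hypothetical toy graph: a triangle A-B-C with a tail C-D-E.
toy = build_graph([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")])

if __name__ == "__main__":
    print("density:", density(toy))
    print("average path length:", average_path_length(toy))
    print("clustering:", clustering(toy))
    print("degree distribution:", degree_distribution(toy))
```

On a graph this small the numbers mean little; the interest lies in comparing them, for a large real graph, against what a very regular or a completely random graph with the same number of nodes and arcs would give.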

Some 'ideal' models of large graphs, very regular or completely random, are well known in physics and mathematics, but none seem suited to represent interesting real graphs of this kind. It is therefore necessary to develop new graph models that will allow us to understand these social networks. To be able to build these models it is necessary to know the data and the evolution of as many real graphs as possible, with as much variety in their nature as can be achieved. And here is where our analysis of the Marvel Universe enters.

As an experiment, prior to the analysis of a certain large graph that appears in our research in computational biology, we decided to analyze the Marvel Comics collaboration graph. There were several reasons. First of all, there was a database that, with some work, allowed us to perform this study. Also, the Marvel Universe is a totally artificial social network that aims to imitate a real social graph, and therefore it seemed interesting to find out whether the properties of this graph were similar to those of a real one. Of course, we were also aware that the subject was amusing and fun, and that physicists and mathematicians who were Marvel fans would like to know who was at the center, or what was the diameter, of the Marvel Universe.

Our analysis shows that the Marvel Universe is closer to a real social graph than one might expect, but is not exactly 'real.' These data, and a subsequent analysis of how the graph has grown, can be used to test and refine the models of social graphs that have been used to date, models that will later bear on subjects as dissimilar as epidemiology and security.