Foundations (F) Session 5
Time and Date: 10:45  12:45 on 22nd Sep 2016
Room: M  Effectenbeurszaal
Chair: Marta SalesPardo
438  Breaking of Ensemble Equivalence in Complex Networks
[abstract] Abstract: It is generally believed that, in the thermodynamic limit, the microcanonical description as a function of
energy coincides with the canonical description as a function of temperature. However, various examples
of systems for which the microcanonical and canonical ensembles are not equivalent have been identified.
A complete theory of this intriguing phenomenon is still missing. Here we show that ensemble
nonequivalence can manifest itself also in random graphs with topological constraints. We find that,
while graphs with a given number of links are ensemble equivalent, graphs with a given degree sequence
are not. This result holds irrespective of whether the energy is nonadditive (as in unipartite graphs) or
additive (as in bipartite graphs). In contrast with previous expectations, our results show that (1) physically,
nonequivalence can be induced by an extensive number of local constraints, and not necessarily by longrange
interactions or nonadditivity, (2) mathematically, nonequivalence is determined by a different
largedeviation behavior of microcanonical and canonical probabilities for a single microstate, and not
necessarily for almost all microstates. The latter criterion, which is entirely local, is not restricted to
networks and holds in general.

Diego Garlaschelli 
234  Universal temporal features of ranking in sports and games.
[abstract] Abstract: Sports and games can be described as complex systems. We describe these systems through a mathematical theory to find a universal behavior in the dynamics of different sports and games related to the players performance. This description can be done studying the ranking dynamics. We employed data from tennis, chess, golf, poker and football. Players are ordered by each official association.
Each player or team has a score that varies with time, this results in a time dependent ranking. We tested five models to describe the distribution of scores for a given time. The rank diversity is introduced to quantify evolution of rankings; it is defined as the number of distinct elements of the complex set which have a given rank k within a given period of time. In other words, we analyze the time dependence of the ranks and observed a global behavior, despite the differences in distributions. Rank diversity in six cases follows closely the cumulative
of a Gaussian. We introduce a simple model to reproduce the observed behavior.
A member with rank k(t), at the discrete time t, is converted to rank k(t+1), tanking k(t) we add a Gaussian random number generator of the standard deviation corresponding to the original rank diversity getting s(t+1), once we obtained this for all members they are ordered according to their magnitude, a new data set is gotten. Good agreement is obtained.

JosÃ© Antonio MoralesAlvarez and Carlos Gershenson 
56  How to estimate epidemic risk from incomplete contact diaries data
[abstract] Abstract: Social interactions shape the patterns of spreading processes in a population. Techniques such as diaries or proximity sensors allow to collect data about encounters and to build networks of contacts between individuals. The contact networks obtained from these different techniques are however
quantitatively different. Here, we first show how these discrepancies affect the prediction of the epidemic risk when these data are fed to numerical models of epidemic spread: low participation rate,
underreporting of contacts and overestimation of contact durations in contact diaries with respect to sensor data determine indeed important differences in the outcomes of the corresponding simulations. Most importantly, we investigate if and how information gathered from contact diaries can be used in such simulations in order to yield an accurate description of the epidemic risk, assuming that data from sensors represent the ground truth. The contact networks built from contact sensors and diaries present indeed several structural similarities: this suggests the possibility to construct, using only the contact diary network information, a surrogate contact network such that simulations using this surrogate network give the same estimation of the epidemic risk as simulations using the contact sensor network.
We present and evaluate several methods to build such surrogate data and show that it is indeed possible to obtain a good agreement between the outcomes of simulations using surrogate and sensor data, as long as the contact diary information is complemented by publicly available data describing the heterogeneity of the durations of human contacts.

Alain Barrat and Rossana Mastrandrea 
139  Epidemic Mitigation via Awareness Propagation in Communications Network: the Role of Time Scale
[abstract] Abstract: The pervasiveness of the Internet and smartphones enables individuals to participate in multiple
networks such as communications networks and the physical contact network. The feedback between these networks opens new possibilities to mitigate epidemic spreading. For instance, the
spread of a disease in a physical contact network may trigger the propagation of the information
related to this disease in a communications network, which in turn may
increase the alertness of some individuals resulting in their avoidance of contact with their infected
neighbours in the physical contact network, possibly protecting the population from the infection
[1,2]. In this work, we aim to understand how the time scale of the information propagation
(speed at which information is spreaded and forgotten) in the communications network relative to
that of the epidemic spread in the physical contract
network influences such mitigation using awareness information. We firstly propose a model of the
interaction between information propagation and epidemic spread, taking into account their relative time scale. We analytically derived the average fraction of infected nodes in the metastable
state for this model (i) by developing an individual based MeanField Approximation method and
(ii) by extending the Microscopic Markov Chain Approach, firstly introduced in [1,2]. Furthermore, we find that the optimal mitigation, can be achieved at a relative time scale, not necessarily zero nor infinity depending on the rate at which an infected individual gets aware.
Contrary to our intuition, fast information spread in the communications network could even reduce the mitigation effect. Finally, we find that the mitigation tends to perform better when more
node pairs are connected both in the physical contact and communications network.
We explain these findings using both theoretical analysis and physical interpretations.
[1] Granell C, Gomez S, Arenas A, Phys.Rev.E. 2014;90(1):012808.
[2] Granell C, Gomez S, Arenas A, Phys.Rev.L. 2013;111(12):128701.

Huijuan Wang, Chuyi Chen, Bo Qu, Daqing Li and Shlomo Havlin 
108  The ground truth about metadata and community detection in networks
[abstract] Abstract: Across many scientific domains, there is common need to automatically extract a simplified view or a coarsegraining of how a complex system's components interact. This general task is called community detection in networks and is analogous to searching for clusters in vector data. It is common to evaluate the performance of community detection algorithms by their ability to find socalled "ground truth" communities. This works well in synthetic networks with planted communities because such networks' links are formed explicitly based on the planted communities.
In real world networks there are no planted communities. Instead, it is standard practice to treat some observed discretevalued node attributes, or metadata, as ground truth. However, failure to find a good division that correlates with our metadata is a highly confounded outcome, arising for any of several reasons: (i) these particular metadata are irrelevant to the structure of the network, (ii) the detected communities and the metadata capture different aspects of the network's structure, (iii) the network contains no communities, or (iv) the community detection algorithm performed poorly. Most work on community detection assumes that failure to find communities that correlate with metadata implies case (iv).
Here, we show that metadata are not the same as ground truth, and that treating them as such induces severe theoretical and practical problems, and can lead to incorrect scientific conclusions even in relatively benign circumstances. However, node metadata still have value and a careful exploration of their relationship with network structure can yield insights of genuine worth. We illustrate this point by introducing two statistical techniques that can quantify the relationship between metadata and community structure for a broad class of community structure models.
We demonstrates both techniques using sets of synthetic and realworld networks, featuring multiple types of metadata and community structure.

Leto Peel, Daniel Larremore and Aaron Clauset 
251  Effect of multiple initial spreaders on the spreading velocity in SIS epidemics
[abstract] Abstract: Epidemic processes describe several realworld dynamic activities on networks, where we are concerned about how fast the virus will overspread the network. We use the average pervasion time to measure the spreading velocity in a SIS process, where the pervasion time is defined as the minimum time when every node in the network is infected at least once. We show that the pervasion time resembles a lognormal distribution, and the average pervasion time for the same spreaders exponentially decreases with the effective infection rate.
Increasing the number of initially infected nodes can accelerate the spreading. The simulation results show the different effect of multiple spreaders on the spreading velocity for the different underlying topology, which provide some clues whether more spreaders are worthy to be invested. The investment utility is defined as the normalized decrement of the average pervasion time after adding a new spreader, which measures the effect of this spreader. We show that the investment utility decreases steadily with the number of initial spreaders in the homogeneous graphs like complete graph and lattice; However, in the network with powerlaw degree distribution, the investment utility of the newest spreader becomes suddenly low when the number of spreaders is over a value. After that, the investment utility of the lowest degree node dramatically becomes the highest among the rest of nodes. In addition, we reveal that the total investment utility for the same number of spreaders increases with the effective infection rate in a network, which implies the higher investment value to deploy multiple spreaders for the process with a higher effective infection rate.

Zhidong He and Piet Van Mieghem 