Foundations (F) Session 7
Time and Date: 13:45 - 15:30 on 22nd Sep 2016
Room: N - Graanbeurszaal
Chair: Yamir Moreno
298 | Reinventing the Triangles: Assessing Detectability of Communities
[abstract] Abstract: Statistical significance of network clustering has been an unresolved problem since it was observed that community detection algorithms produce false positives even in random graphs. After a phase transition between undetectable and detectable cluster structures was discovered [1,2], the spectra of adjacency matrices and detectability limits were shown to be connected, and they were calculated for a wide range of networks with arbitrary degree distributions and community structure [3].
In practice, given a real-world network, neither the hypothetical network model nor its full eigenspectrum is known, so it cannot easily be established whether the network has any communities within the detectability regime.
Based on the global clustering coefficient (GCC), we construct a criterion that tells whether an undirected, unweighted network has detectable community structure, has none, or lies in a transient regime. To that end we use approximations of the GCC and of the spectra of adjacency matrices, together with empirical observations showing that: a) for graphs with community structure, the GCC reaches the random-graph value before the connectivity is fully randomized, and b) this saturation of the GCC coincides with the theoretically predicted detectability limit.
We compare the criterion against existing methods of assessing the significance of network partitioning. We compute it on various benchmark graphs, as well as on real-world networks from the SNAP database, and compare it with the results of state-of-the-art community detection methods. The method is simple and faster than approaches that involve bootstrapping, and it remains robust on sparse networks. Analogous criteria appear plausible for directed graphs as well.
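As a rough illustration of the underlying idea (a minimal sketch, not the authors' exact criterion), the following Python snippet compares the GCC of a network with the GCC of degree-matched random graphs drawn from the configuration model; the use of networkx, the karate-club example graph and the helper name gcc_vs_random are assumptions made for illustration only.

    import networkx as nx
    import numpy as np

    def gcc_vs_random(G, n_samples=20, seed=0):
        """Compare a graph's GCC with degree-matched random graphs."""
        rng = np.random.default_rng(seed)
        gcc = nx.transitivity(G)                 # global clustering coefficient
        degrees = [d for _, d in G.degree()]
        baseline = []
        for _ in range(n_samples):
            R = nx.configuration_model(degrees, seed=int(rng.integers(1 << 30)))
            R = nx.Graph(R)                      # collapse parallel edges
            R.remove_edges_from(nx.selfloop_edges(R))
            baseline.append(nx.transitivity(R))
        return gcc, float(np.mean(baseline)), float(np.std(baseline))

    G = nx.karate_club_graph()                   # stand-in example network
    gcc, mu, sigma = gcc_vs_random(G)
    print(f"GCC = {gcc:.3f}, random baseline = {mu:.3f} +/- {sigma:.3f}")

By the heuristic sketched in the abstract, a GCC that has already saturated at the degree-matched random baseline would suggest that any community structure may lie below the detectability limit.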
[1] J. Reichardt and M. Leone, Phys. Rev. Lett. 101, 078701 (2008).
[2] A. Decelle, F. Krzakala, C. Moore, and L. Zdeborova, Phys. Rev. Lett. 107, 065701 (2011).
[3] X. Zhang, R. R. Nadakuditi, and M. E. J. Newman, Phys. Rev. E 89, 042816 (2014).
|
Jeremi Ochab and Zdzisław Burda |
334 | Optimising peer review with evolutionary computation
[abstract] Abstract: Peer review is the most prevalent mechanism for validating the quality of published research. While scientists are not blind to many of the shortcomings of this system (e.g. bias), studies show that peer review is perceived as a necessary tool that does improve the quality of published papers. Surprisingly, despite the importance of this process and its place in the scientific world, peer review has not been extensively studied until recently. The goal of our work was twofold. Firstly, we wanted to deepen our understanding of the peer review process as perceived from the viewpoint of a journal editor. Secondly, through extensive numerical experiments based on real-world data, we wanted to improve the effectiveness of the process.
In order to understand some of the dynamical properties of the peer review process, we analysed a dataset containing information about 58 papers submitted to the Journal of the Serbian Chemical Society. The dataset consisted of 311 “review threads”: full descriptions of the review process initiated by sending an invitation to a potential reviewer. We were able to separate the process into distinct phases and to reconstruct the transition probabilities between phases as well as their durations. This allowed us to uncover many interesting properties of the process and to create a framework that can serve as a basis for simulations.
We were particularly interested in how editorial workflows – sets of guidelines and rules that journal editors use to acquire reviews for submitted manuscripts – can be improved. To this end, we employed Cartesian Genetic Programming to evolve and optimise editorial strategies. We showed that these evolved strategies are better than typical strategies used by some editors and lead to shorter review times.
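As a toy illustration of how transition probabilities between review phases can be estimated from such threads, a short Python sketch follows; the phase names and the three example threads are hypothetical and are not taken from the study's dataset.

    from collections import Counter, defaultdict

    # Hypothetical review threads: each is the ordered list of phases observed
    # for one invited reviewer (names invented for illustration).
    threads = [
        ["invited", "agreed", "review_submitted"],
        ["invited", "declined"],
        ["invited", "agreed", "reminder_sent", "review_submitted"],
    ]

    counts = defaultdict(Counter)
    for thread in threads:
        for current, following in zip(thread, thread[1:]):
            counts[current][following] += 1

    # Maximum-likelihood transition probabilities between phases.
    transition_probs = {
        phase: {nxt: n / sum(nxts.values()) for nxt, n in nxts.items()}
        for phase, nxts in counts.items()
    }
    print(transition_probs)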
|
Maciej J. Mrowinski, Agata Fronczak, Piotr Fronczak, Olgica Nedic and Marcel Ausloos |
331 | Network structure, metadata and the prediction of missing nodes
[abstract] Abstract: The empirical validation of community detection methods is often based on available annotations on the nodes that serve as putative indicators of the large-scale network structure [1]. Most often, the suitability of the annotations as topological descriptors is not itself assessed, and without this it is impossible to distinguish between actual shortcomings of the community detection algorithms on the one hand, and the incompleteness, inaccuracy or structured nature of the annotations themselves on the other.
In this work we present a principled method to assess both aspects simultaneously. We construct a joint generative model for the data and metadata, and a non-parametric Bayesian framework [2] to infer its parameters from annotated datasets. We assess the quality of the metadata not according to its direct alignment with the network communities, but rather by its capacity to predict the placement of edges in the network. We also show how this feature can be used to predict the connections to missing nodes when only the metadata is available.
By investigating a wide range of datasets, we show that, while there are seldom exact agreements between metadata tokens and the inferred data groups, the metadata is nevertheless often informative of the network structure and can improve the prediction of missing nodes. This shows that the method uncovers meaningful patterns in both the data and metadata, without requiring or expecting a perfect agreement between the two.
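To illustrate the general idea of judging metadata by its ability to predict edges (a simple frequency-based sketch, not the authors' non-parametric Bayesian model), the following Python snippet fits group-pair edge densities from metadata annotations and scores held-out edges; the use of networkx, the karate-club graph and its "club" attribute as stand-in metadata are assumptions for illustration only.

    import random
    from collections import Counter
    from itertools import combinations
    import networkx as nx

    def group_edge_probs(G, groups):
        """Empirical edge probability between every pair of metadata groups."""
        sizes = Counter(groups.values())
        pair_edges = Counter()
        for u, v in G.edges():
            pair_edges[tuple(sorted((groups[u], groups[v])))] += 1
        probs = {}
        for g, h in combinations(sorted(sizes), 2):
            probs[(g, h)] = pair_edges[(g, h)] / (sizes[g] * sizes[h])
        for g in sizes:
            possible = sizes[g] * (sizes[g] - 1) / 2
            probs[(g, g)] = pair_edges[(g, g)] / possible if possible else 0.0
        return probs

    def score_edge(u, v, groups, probs):
        return probs[tuple(sorted((groups[u], groups[v])))]

    G = nx.karate_club_graph()                       # stand-in annotated network
    groups = {v: d["club"] for v, d in G.nodes(data=True)}

    random.seed(1)
    held_out = random.sample(list(G.edges()), 10)    # edges to predict
    G_train = G.copy()
    G_train.remove_edges_from(held_out)

    probs = group_edge_probs(G_train, groups)
    mean_score = sum(score_edge(u, v, groups, probs) for u, v in held_out) / len(held_out)
    print(f"mean predicted probability of held-out edges: {mean_score:.3f}")

A higher mean predicted probability on held-out edges than on random non-edges would indicate, in this simplified setting, that the metadata carries information about the placement of edges.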
[1] J. Yang and J. Leskovec, "Structure and overlaps of ground-truth communities in networks", ACM Trans. Intell. Syst. Technol., vol. 5, pp. 26:1–26:35, Apr. 2014.
[2] T. P. Peixoto, "Parsimonious Module Inference in Large Networks", Phys. Rev. Lett., vol. 110, p. 148701, Apr. 2013.
|
Darko Hric, Tiago Peixoto and Santo Fortunato |
54 | Dynamical conditions for emergence imply its logical conditions
[abstract] Abstract: Complex systems have large numbers of components with correspondingly large numbers of possible interactions. Even very large systems, however, can be simple in the sense that they can be understood as additive combinations of the contributions of their parts. Even the largest computer can be fully analysed this way, and its behaviour for a given input is fully predictable for any length of time one chooses. It was long assumed that all systems were similarly analysable and predictable, and that our only problem was to show this for more complex systems. This assumption was justified both by striking successes and on theoretical grounds. However, such systems are special among the range of all possible systems (Rosen 1991), and some systems with few components violate these assumptions by being both irreducible and noncomputable. We call such systems complexly organized. The complexity lies not in the numbers, but in their behaviour. Such systems can be given a dynamical characterization: for both energy-based and information-based systems, the key is the loss of available energy or information as entropy disposed of outside the system, maintained by an external source of energy or information. This permits what is called self-organization, though not all complexly organized systems are self-organized. The unpredictability and irreducibility of complexly organized systems fit well with the ideas of emergence developed in the 19th century to explain living systems, and formalized more carefully by C. D. Broad in the 20th century. I will argue that the dynamics of complex systems implies the logical conditions that these philosophers laid down. The advantage of having a dynamical account is that it gives us a way to interact with and test systems for these properties. The logical conditions, on the other hand, were mostly attributed qualitatively (and sometimes erroneously), which is distressingly common even today.
|
John Collier |
344 | Benard cells as a model for Entropy Production, Entropy Decrease and Action Minimization in Self-Organization
[abstract] Abstract: In their self-organization, complex systems increase the entropy in their surroundings and decrease their internal entropy, but the mechanisms, the reasons and the physical laws leading to these processes are still a matter of debate. In our approach, energy gradients across complex systems lead to changes in the structure of the systems, decreasing their internal entropy to ensure the most efficient energy transport and therefore maximum entropy production in the surroundings. This approach stems from fundamental variational principles in physics, such as the principle of least action, which determine the motion of particles. We compare energy transport through a fluid cell whose molecules move at random with transport through a cell that can form convection cells. We examine the signs of the change of entropy and the action needed for the motion inside those systems. The system in which convective motion occurs reduces the time for energy transmission compared to random motion. In more complex systems, these convection cells form a network of transport channels, in order to obey the equations of motion in this geometry. Such transport networks are an essential feature of complex systems in biology, ecology, economy and society in general. This approach can help explain some of the features of those transport networks and how they correlate with the level of organization of systems.
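A back-of-the-envelope comparison of the two transport modes (the numerical values below are illustrative assumptions, not taken from the paper): the diffusive time across a layer of thickness L scales as L^2/kappa, while the convective time scales as L/v, so convection shortens the transmission time once flow speeds are modest.

    # Illustrative, order-of-magnitude values (assumed, not from the paper).
    L = 1e-2        # fluid layer thickness in metres
    kappa = 1.4e-7  # thermal diffusivity of water, m^2/s
    v = 1e-3        # typical convective flow speed, m/s

    t_diffusion = L**2 / kappa   # time scale for conduction/diffusion alone
    t_convection = L / v         # time scale once convection cells form
    print(f"diffusive transport ~ {t_diffusion:.0f} s, convective ~ {t_convection:.0f} s")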
|
Georgi Georgiev, Atanu Chatterjee, Thanh Vu and Germano Iannacchione |