UrbanNet 2016: Smart Cities, Complexity and Urban Networks (U2SC) Session 2
Time and Date: 14:15 - 18:00 on 21st Sep 2016
Room: B - Berlage zaal
Chair: Oliva Garcia Cantu / Fabio Lamanna
14008 - Electric vehicle charging as complex adaptive system - information geometric approach
Abstract: In all major cities in the Netherlands, charging points for electric vehicles seem to spring up like mushrooms. In the city of Amsterdam alone, for example, there were 231 charging points at the end of 2012, compared with 1,185 today, and roughly two new charging stations are added every week. Over the same period, the average number of charging sessions per week went up from 550 to 8,000. All charging sessions in the Netherlands are recorded by the service providers, and those from Amsterdam, Rotterdam, Utrecht, The Hague and the provinces of Northern Holland, Flevoland and Utrecht are made available for research through the respective municipalities to the Urban Technology research program at the University of Applied Sciences Amsterdam (see footnote 1). The dataset of charging sessions, which is the largest of its kind in the world, currently holds more than 3.3 million records, containing information about the duration, the location and a unique identifier of the user [1]. The tremendous growth in electric vehicle adoption, in combination with the existence of this large and rich dataset, creates a unique opportunity to study many aspects of electric mobility and infrastructure in the context of complex social systems. The question we focus on is the following: if we consider the e-mobility system as complex and adaptive, what is its phase structure? Are there regime changes in the system? And could we define distinct states of the dynamics of the system at hand? The framework in which we study these questions is that of information geometry [2]. To construct the framework we first define observables of interest from the data. We then estimate the probability distributions of these observables as a function of time or other parameters of the system. As the system evolves, the shape of the probability distributions might change. We say that a regime shift has occurred when a large and persistent change in the probability distributions has happened.
To define a large change in the probability distribution we use the Fisher information [3]. Our approach is based on an analogy with the theory of phase transitions in statistical physics, especially second-order or "critical" transitions. In statistical physics one can study the information geometry of the Gibbs distribution and show that at second-order phase transitions and on the spinodal curve the curvature of the statistical manifold diverges [4]. Taking it a step further, Prokopenko et al. showed that one can use the Fisher information matrix directly as an order parameter [5]. Following these results, a maximum of the Fisher information matrix is used as a definition of criticality in complex systems, e.g. in [6]. The application of our approach is particularly challenging in the charging infrastructure system since 1) it is an open system (the number of users and charging points changes over time), and 2) it is an irreversible system (the municipalities gain experience in deploying charging points, the users of the system optimize their usage of the charging point infrastructure, and policies and user support systems change). All this indicates that there is no straightforward notion of phase space for this system which would allow a Gibbs-like distribution to be defined. Our previous work, which applied this framework to a nonlinear reaction-diffusion system (the Gray-Scott model), is encouraging since we were also able to detect regime changes based on a macroscopic distribution of observables, independent of the microscopic dynamics of the system [7]. These challenges, however, are typical of complex adaptive social systems, and therefore finding a satisfactory solution to them might allow for a generalization of the method to different social systems. In the talk we will present the results of pursuing this line of investigation.
We will discuss the different observables we tried and the insights we gained into the system from our work. Understanding the phase structure of electric vehicle charging, and hence the dynamics of charging, can have large implications for our understanding of the dynamics of neighborhoods, for planning and policy implementation, and for the study of urban science in general.
Footnote 1: http://www.idolaad.nl
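The regime-shift criterion described in this abstract can be illustrated with a minimal Python sketch. It assumes the observable has already been binned into normalized histograms, one per time step, and approximates the Fisher information with respect to time by a finite difference of square-root probabilities, using the identity $\sum_x (\partial_t p)^2 / p = 4 \sum_x (\partial_t \sqrt{p})^2$. The function names, the weekly binning and the toy numbers are illustrative assumptions, not the authors' implementation or data, and the sketch only flags a large change; checking that the change is persistent is left out.

```python
import math

def fisher_information(p, q, dt=1.0):
    """Finite-difference Fisher information between two normalized
    histograms p and q (same bins) observed dt apart."""
    return 4.0 * sum((math.sqrt(b) - math.sqrt(a)) ** 2
                     for a, b in zip(p, q)) / dt ** 2

def fisher_series(histograms, dt=1.0):
    """Fisher information for each pair of consecutive histograms."""
    return [fisher_information(p, q, dt)
            for p, q in zip(histograms, histograms[1:])]

# Hypothetical weekly distributions of an observable over three bins.
weeks = [
    [0.70, 0.20, 0.10],
    [0.69, 0.21, 0.10],
    [0.30, 0.30, 0.40],  # abrupt change in shape
    [0.31, 0.29, 0.40],
]
series = fisher_series(weeks)
# Index of the interval with the largest Fisher information,
# i.e. the candidate location of a regime shift.
peak = max(range(len(series)), key=series.__getitem__)
print(peak, series)
```

In this toy series the largest value falls on the interval between the second and third histograms, where the shape of the distribution changes abruptly.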

Omri Har-Shemesh
14009 - Residential Flows and the Stagnation of Social Mobility with the Economic Recession of 2008
Abstract: The movement of people within a city is a driver for the growth, development, and culture of the city. Understanding such movements in more detail is important for a range of diverse issues, including the spread of diseases, city planning, traffic engineering and nowcasting economic well-being [1, 3]. Residential environment characteristics have been shown to be strongly associated with changes in individual socioeconomic status, making residential relocation a potential determinant of social mobility [2]. Examining residential mobility flows therefore offers an opportunity to better understand the determinants of social mobility. By using a novel dataset recording the movement of people within the city of Madrid (Spain) over a period of 10 years (2004-2014), we studied how residential flows changed during the economic recession of 2008. Here we present preliminary results from these investigations. In particular, we found that the crisis had a profound impact on social change, reducing social mobility within the city as a whole and thus leading to a "social stagnation" phenomenon. Methods: We used data from a continuous administrative census of the entire Spanish population (the "Padron") that includes universal information on all residential relocations. Using these data, we can assess the mobility within, into and out of the city of Madrid, stratified by age, education and country of origin. For analyses involving property value and unemployment, the granularity of our analysis is at the level of the neighborhood (~20,000 people each, n = 128 in Madrid). For all other analyses, our granularity is at the level of the census section (~1,500 people each, n ≈ 2,400 in Madrid), providing a very fine-grained perspective on the residential flows within the city.
To examine changes in residential mobility flows, we categorized them as follows: any mobility (any change of residential location), mobility within the city of Madrid, and mobility within the city but to a different area. We further divided this last type of flow into upward (from poorer to richer areas) and downward (from richer to poorer areas) mobility. Figure 1 (left) shows an example of the geographical delineations and the associated residential mobility flows.
Figure 1: (Left) A data overlay of a section of Madrid. Red outlines correspond to neighborhoods, colored by quintile of property value for 2004 (red areas indicate the highest property value quintile). Black outlines correspond to census sections, and arrows represent residential mobility flows. In particular, white arrows indicate movement to areas of higher property value, black to lower, and blue to areas of equal value. (Right) The total movement (inflow + outflow) within each census section for the year 2004. Red areas indicate high residential flows.
Figure 2: (Left) "Social and Residential Mobility in Madrid" (x-axis: Year; y-axis: Quintile of Destination minus Origin Neighborhood). Time series of social mobility (average change in quintile of property value of all movers in the neighborhood, where a positive number represents upward mobility and 0 represents no social mobility) in the six neighborhoods with the highest change in social mobility from 2005 to 2014. (Right) "Unemployment Rate per Neighborhood in Madrid" (x-axis: Year; y-axis: Unemployment Rate (%)). Unemployment time series in all neighborhoods of Madrid, with thicker lines for the six neighborhoods pictured on the left.
Results: We find that residential mobility peaked in 2007-2008, especially due to the contribution of incoming flows to Northern and Southeastern Madrid.
A centrality-based analysis of the residential mobility network reveals the intensity of change in the downtown area (Centro) of Madrid (Figure 1, Right). We further assessed the effect of the 2008 financial crisis on residential mobility flows, showing that neighborhoods at the lower end of the socioeconomic spectrum and those that had changed the most during the housing boom of the 2000s were the most affected by the recession (Figure 2, Right). In particular, these neighborhoods showed a decrease in social mobility associated with residential relocation, with a decreasing proportion of people in poorer areas relocating to neighborhoods with a higher property value (Figure 2, Left). Moreover, there was also a decreasing proportion of people in richer areas relocating to neighborhoods with a lower property value. This lack of upward mobility (from poorer areas) and downward mobility (from richer areas) led to a stagnation of residential mobility in the aftermath of the recession. Discussion: A combination of fine-grained relocation, socioeconomic and property value data has allowed us to detect communities with increased mobility flows, as well as areas of relative residential stability or stagnation. It has further allowed us to explore changes with the economic recession. Our finding that social mobility at the neighborhood level has stagnated is consistent with previous findings of increased economic segregation concurrent with the economic recession of 2008 [4].
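The social mobility measure used in this abstract (quintile of the destination area minus quintile of the origin area, averaged over movers) can be sketched in a few lines of Python. The record layout, the area labels and the numbers below are hypothetical; the sketch averages per year only, whereas the abstract also groups movers by neighborhood.

```python
from collections import defaultdict

def mean_quintile_change(moves, quintile):
    """Average (destination quintile - origin quintile) per year.

    moves:    iterable of (year, origin_area, destination_area)
    quintile: dict mapping area -> property-value quintile (1 = lowest)
    """
    total = defaultdict(int)
    count = defaultdict(int)
    for year, origin, destination in moves:
        total[year] += quintile[destination] - quintile[origin]
        count[year] += 1
    return {year: total[year] / count[year] for year in total}

# Hypothetical areas and relocations, for illustration only.
quintile = {"A": 1, "B": 3, "C": 5}
moves = [
    (2006, "A", "B"),  # upward move
    (2006, "A", "C"),  # upward move
    (2010, "A", "A"),  # no change in quintile
    (2010, "C", "B"),  # downward move
]
mobility = mean_quintile_change(moves, quintile)
print(mobility)
```

A positive average indicates net upward mobility in that year, a negative one net downward mobility, and a value near zero the kind of stagnation the abstract reports.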

Usama Bilal 
14010 - title to be confirmed (invited talk) - Filippo Simini
14011 - Smart Street Sensor
Abstract: Urban street structures are a snapshot of human mobility and resources, and are an important medium for facilitating human interaction. Previous studies have analyzed the topology and morphology of street structures in various ways, for example as fractal patterns [1] or complex spatial networks [2]. From a functional perspective, it is important to discuss how street networks are used by people. There are also studies analyzing the efficiency [3], accessibility [4] and road usage [5] of street networks. In those studies, the researchers investigated either empirical or theoretical travel routes to understand the functionality of the street network. A travel route is a path within the network selected by people or selected under a given condition. Since the determination of a travel route is directly influenced by travel demand and the spatial pattern of the city, including the street network and land-use formation, a selected route is a good way to capture complex interactions among factors which are often hidden. For instance, fastest routes estimate the possible distribution of traffic as well as the street structure in a city. In this study, we analyze the geometric properties of routes to understand the street network, considering its hierarchical properties and traffic conditions. Although many studies discuss the efficiency of a route or a street network, few investigate the geometry of a route [6] or study how individual routes are intrinsic to the city structure. Two cities with similar efficiency can have different geometries of congestion and traffic patterns [7]. Therefore, understanding the geometric features of routes can link the existing knowledge of routes to the structure of the urban street network. We especially focus on how much a route is skewed toward the city center by measuring a new metric, inness.
The inness I of a route is defined as the difference between the inner travel area $P_{inner}$ and the outer travel area $P_{outer}$, $I = P_{inner} - P_{outer}$. The areas are defined after a route is divided into an inner part and an outer part by the straight line connecting the origin and destination, as described in Fig. 1. We measured the inness of the collected optimal routes within a 30 km radius from the center for 100 global cities, including NYC, London and Delhi. In these cities, we identified two competing forces. Due to the agglomeration of businesses and people, street networks grow denser around the center area to meet the demand, and attract traffic toward the interior of the city. On the other hand, many cities deploy arterial roads located outside of the city to help disperse the congestion at the urban core. The arterial roads act as the other force, pushing traffic toward the exterior of the city. This tendency is well captured by our suggested metric. We analyze two types of optimal routes, obtained by minimizing the travel time and the distance. While the shortest routes reveal the mere geometric structure of the roads, the fastest routes show the geometry in which the road hierarchy is reflected. We systematically select origins and destinations with different bearings and different radii from the center. Then, we collect the optimal routes of the origin-destination (OD) pairs via the OpenStreetMap API. Our results consist of two parts. We first compare the general average inness of both the shortest and fastest routes of the 100 global cities in order to point out their fundamental differences. Later, we analyze the inness patterns of individual cities and discuss the street layout and the effects of street hierarchy in each city.
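One way to compute the inness defined above is as a signed polygon area, sketched here in Python. The sketch assumes the route is given as planar coordinates (for example a local projection in kilometres) and that "inner" means the side of the origin-destination line on which the city center lies. The function name and the toy coordinates are hypothetical, and the authors' actual procedure may differ.

```python
def inness(route, center):
    """Signed area between a route and its origin-destination chord.

    route:  list of (x, y) points from origin to destination, in a
            planar projection.
    center: (x, y) of the city center in the same projection.
    Area on the center side of the chord counts as positive (inner),
    area on the far side as negative (outer).
    """
    (ox, oy), (dx, dy) = route[0], route[-1]
    cx, cy = center
    # Close the polygon by returning from the destination to the origin.
    closed = list(route) + [route[0]]
    # Shoelace formula: positive for counter-clockwise polygons.
    signed = 0.5 * sum(x1 * y2 - x2 * y1
                       for (x1, y1), (x2, y2) in zip(closed, closed[1:]))
    # The cross product tells on which side of the chord the center lies.
    side = (dx - ox) * (cy - oy) - (dy - oy) * (cx - ox)
    sign = (side > 0) - (side < 0)
    return -signed * sign

# Hypothetical route that bulges to one side of the chord.
detour = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
toward_center = inness(detour, center=(1.0, 5.0))
away_from_center = inness(detour, center=(1.0, -5.0))
# A route that crosses the chord, with equal area on both sides.
balanced = inness([(0, 0), (1, 1), (2, 0), (3, -1), (4, 0)], center=(2, 5))
print(toward_center, away_from_center, balanced)
```

Because the shoelace sum is signed, parts of the route on opposite sides of the chord cancel automatically, which matches the difference in the definition without splitting the route explicitly.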

Balamurugan Soundararaj 
14012 - A Retail Location Choice Model: Measuring the Role of Agglomeration in Retail Activity
Abstract: The objective of our work is to build a consumer choice model in which consumers choose their retail destinations based only on a retailer's floorspace and its agglomeration with others. In other words, at a very aggregated level, the goal is to describe a retailer's success with a model which only takes into account its position and its floorspace. We define the attractiveness of a retailer r as
$A_r = f_r^{\alpha} + \sum_{r'} f_{r'}^{\alpha} e^{-\varepsilon d_{rr'}}$ (1)
where $f_r$ is the retailer's floorspace and $d_{rr'}$ is the distance between r and some other retail unit r'. Eq. (1) states that the composite perceived utility $A_r$ that a consumer attaches to a particular retailer r is equal to its individual utility, quantified as choice and therefore floorspace $f_r^{\alpha}$, plus the utility of the shops in its vicinity. In Eq. (1), $\alpha$ controls the extent of the internal economies of scale and $\varepsilon$ that of the external economies of scale. If $\alpha > 1$, the relationship between the consumer-perceived utility of a shop and its size is superlinear and the economies of scale are positive, meaning that a retailer would benefit from larger floorspace. Similarly, low values of $\varepsilon$, which translate into a slow decay, would imply a strong dependency on vicinity to other attractive neighbours, and vice versa. Exploiting Eq. (1), we define the probability of consumer i shopping in r as
$p_{i \to r} = \frac{A_r e^{-\vartheta C(d_{ir}, \varpi)}}{\sum_{r'} A_{r'} e^{-\vartheta C(d_{ir'}, \varpi)}}$ (2)
where $C(d_{ir}, \varpi)$ is the cost function of travelling from i to r, and $\varpi$ and $\vartheta$ are two parameters. Eq. (2) has been formulated using random utility theory and, as one can see, in the proposed cross-nested logit model consumers prefer to shop at larger shops (internal economies of scale) and at locations with a higher concentration of retail activity (external economies of scale). In this work we have considered two types of trips, namely work to retail and home to retail.
The model is therefore defined by 6 different parameters: two describing the attractiveness of retailers through their internal and external economies, $(\alpha, \varepsilon)$, and two for each kind of trip describing the cost function, $(\varpi_h, \vartheta_h)$ and $(\varpi_w, \vartheta_w)$. The total modelled turnover will therefore be of the form
$Y_r = Y_r^w + Y_r^h = \sum_l \left( n_l^w p_{l \to r}^w + n_l^h p_{l \to r}^h \right)$ (3)
The $\varpi$ and $\vartheta$ parameters have been calibrated using the LTDS dataset, a survey that includes 5004 home to retail and 2242 work to retail trips. Having completed the calibration of the distance profiles, we can now calculate the modelled turnover estimates of Eq. (3) for each retailer r for a set of $(\alpha, \varepsilon)$ parameters. This will tell us the modelled fraction of the population that will end up shopping in each retailer given their attractiveness and distance. Following this, we calculate the correlation between the modelled turnovers and the observed floorspace rents. For each retailer r, we use the VOA rateable value as an indicator of willingness to pay for floorspace $f_r$. The Rateable Value is considered a very good indicator of the property value of the respective hereditament.
Figure 1: (a) Correlations. (b) Scatter Plot. As we can see from these figures, the model yields high correlations with the VOA dataset's rents. In the left panel we show the correlation between the expected turnover $Y_r(\alpha, \varepsilon)/f_r$ and the Rateable Value / Size found in the dataset; $C_{max} = C(\alpha = 1.3, \varepsilon = 0.008)$. These values are in agreement with a superlinear scaling in floorspace and with the observed retail agglomeration. In the right panel we present a scatter plot of the two quantities.
In Fig. 1 we compare the results of the model with rent data coming from the VOA. In Fig. 1(a) we can see that the maximum correlation between the modelled and real rents per square metre is given by the set of parameters $(\alpha_{max} = 1.3, \varepsilon_{max} = 0.008)$. The $\alpha$ value is in line with a superlinear scaling of floorspace and expected earnings, and seems realistic, while the $\varepsilon$ value indicates a benefit in the agglomeration of retail activities (the sign is positive), and indicates that the vicinity of a retail activity does have a non-negligible role in defining its attractiveness.
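A small Python sketch may help to see how Eqs. (1) and (2) of this abstract combine. It assumes that the sum in Eq. (1) runs over the other retailers, that the cost function is simply the distance itself, and that a single decay parameter multiplies the cost. The parameter names (`alpha`, `decay_external`, `decay_cost`), the floorspaces and the distances are hypothetical; only the values 1.3 and 0.008 are taken from the abstract.

```python
import math

def attractiveness(floorspace, dist, alpha, decay_external):
    """Eq. (1): own floorspace term plus distance-discounted
    floorspace of the other retailers (assumed: sum excludes r)."""
    n = len(floorspace)
    return [
        floorspace[r] ** alpha
        + sum(floorspace[s] ** alpha * math.exp(-decay_external * dist[r][s])
              for s in range(n) if s != r)
        for r in range(n)
    ]

def choice_probabilities(attract, cost, decay_cost):
    """Eq. (2) for one consumer: logit-style shares over retailers,
    with cost[r] the travel cost to retailer r (assumed: C(d) = d)."""
    weights = [a * math.exp(-decay_cost * c) for a, c in zip(attract, cost)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical retailers: 0 and 1 are close together, 2 is isolated.
floorspace = [100.0, 400.0, 100.0]
dist = [
    [0.0, 50.0, 1000.0],
    [50.0, 0.0, 1000.0],
    [1000.0, 1000.0, 0.0],
]
A = attractiveness(floorspace, dist, alpha=1.3, decay_external=0.008)
p = choice_probabilities(A, cost=[1.0, 1.0, 1.0], decay_cost=0.5)
print(A, p)
```

With equal travel costs, retailer 0 ends up more attractive than the equally sized but isolated retailer 2, which is the agglomeration effect the model is meant to capture.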

Duccio Piovani 
14013 - Revealing patterns in human spending behavior
Abstract: In the last decade, big data originating from human activities has given us the opportunity to analyze individual and collective behavior with unprecedented detail. These approaches are radically changing the way in which we can conceive social studies via complex systems methods. Large datasets, passively collected from mobile phones or social media, have informed us about social interactions in space and time [1], helping us to understand the laws that govern human mobility [2-4] or to predict wealth in geographic areas [5]. More recently, data from Credit Card Shopping Records (CCSR) has also been explored, providing new insights into human economic activities. Ref. [6] has shown that a fingerprint exists in the sequence of individual payment activities which makes users identifiable from only a few of their records. Shoppers' spending behaviors and visitation patterns are closely related to urban mobility [7]. Both mobility decisions and expenditure behavior are subject to urban and geographical constraints [8] and to economic and demographic conditions [9, 10]. Further understanding of consumer behavior is valuable for modelling market dynamics and for depicting the differences between income groups [11]. In particular, CCSRs have the potential to transform how we conceive the study of social inequality and human behavior within the geographic and socioeconomic constraints of cities. Here we present a novel method to exploit CCSRs to provide new insights into the characterization of human spending patterns and how these are related to sociodemographic attributes. We analyze CCSRs of approx. 150,000 users over a period of 10 weeks. The dataset is anonymized, and for each user the following demographic information is provided: age, gender and zip code. For all users we have the chronological sequence of their transaction history with the associated shop typology according to the Merchant Category Codes (MCC) [12].
Our analysis of the aggregated CCSR data reveals that the majority of shoppers use credit card payments for twelve types of transactions among the hundreds of possible MCCs. These are: grocery stores, eating places, toll roads, information services, food stores, gas stations, department stores, telecommunication services, ATM use, taxis, fast food restaurants, and computer software stores. These transaction activities are depicted as icons in Fig. 1. Interestingly, the temporal sequence in which these transactions occur differs among individuals. First, we identify the dominant sequences of transactions for each user using the SEQUITUR algorithm [13]. Then we evaluate the significance level of each sequence by calculating its z-score with respect to the sequences computed from 100 randomized sequences that preserve the number of transactions per type. Each sequence of transactions defines a path in the space of the transaction codes. We define the User Transaction Network (UTN) by connecting the codes of the most statistically significant sequences (with z-score > 2), preserving the order. We compute the matrix of user similarity (Fig. 1, lower left) by calculating the Jaccard index between all users with at least 3 links in their UTN. Applying the Louvain method [14] for community detection, we are able to group users according to their most significant sequences of payments. Fig. 1 shows our results for the six different behavioral groups detected, with each cluster ordered in appearance from 1 to 6 in the matrix of user similarity. The upper part of the figure describes the most common sequences of transactions for each group; the link value with its error represents the probability for a user of the group to follow that particular transaction order, and the value in parentheses gives the fraction of users in the group that perform that transaction sequence.
The bottom part shows the demographic attributes of each group with respect to the average population in red. In summary, we have uncovered lifestyle groups in the transaction history of the CCSR data that relate to nontrivial demographic groups. We will discuss future applications of these lifestyle clusters in the context of the adoption of innovations in the city.
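The significance test and the user similarity described in this abstract can be sketched in Python under simplifying assumptions. Instead of SEQUITUR, the sketch counts a single ordered pair of consecutive transaction types; it compares that count against 100 shuffles of the same history, which preserve the number of transactions per type, and it measures user similarity as the Jaccard index of two sets of network links. All names and data below are hypothetical.

```python
import random
from statistics import mean, pstdev

def count_pair(sequence, pair):
    """Number of times the ordered pair occurs as consecutive items."""
    return sum(1 for a, b in zip(sequence, sequence[1:]) if (a, b) == pair)

def pair_zscore(sequence, pair, n_shuffles=100, seed=0):
    """z-score of the observed pair count against shuffled histories."""
    rng = random.Random(seed)
    observed = count_pair(sequence, pair)
    shuffled = list(sequence)
    counts = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # keeps the number of items per type
        counts.append(count_pair(shuffled, pair))
    spread = pstdev(counts)
    return 0.0 if spread == 0 else (observed - mean(counts)) / spread

def jaccard(links_a, links_b):
    """Jaccard index between two sets of directed links."""
    a, b = set(links_a), set(links_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical user who always buys groceries right before fuel.
history = ["grocery", "gas"] * 20
z = pair_zscore(history, ("grocery", "gas"))

# Hypothetical link sets of two users' transaction networks.
user_1 = [("grocery", "gas"), ("gas", "taxi"), ("taxi", "fast food")]
user_2 = [("grocery", "gas"), ("gas", "taxi"), ("ATM", "toll road")]
similarity = jaccard(user_1, user_2)
print(z, similarity)
```

The resulting similarity matrix would then be passed to a community detection method such as Louvain, which is not reproduced here.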

Riccardo Di Clemente 
14014 - The universal dynamics of urbanization (invited talk) - Marc Barthelemy
14015 - Identifying and tackling Water Leaks in Mexico through Twitter
Abstract: As cities have become smarter, the data generated daily have become increasingly granular. Sensors, cameras, crowdsourcing, social media sharing, etc., can monitor different aspects of our cities, such as commuter flows, air quality over different time periods or public transport performance. The rise of the "smart city" thus has the potential to throw some light on many fundamental urban problems, and to pave the way to making cities more livable and efficient places. In particular, Twitter has attracted a lot of attention in recent years (Ausserhofer & Maireder, 2013) for its richness in content. People are not only sharing personal information with their closest contacts, but are also using Twitter as a social and political platform to inform and disseminate all sorts of statements or ideas (Weng & Menczer, 2015; Lu & Brelsford, 2014; Piña-García, Gershenson, & Siqueiros-García, 2016). Exploring this type of data is gradually becoming more and more important in terms of data collection. In addition, mining urban social signals can provide quick knowledge of a real-world situation (Roy & Zeng, 2014). It should be noted that the enormous volume of Twitter data has given rise to major computational challenges that sometimes result in the loss of useful information embedded in tweets. Apparently, more and more people are relying on Twitter for information. Twitter has been tagged a strong medium for opinion expression and information dissemination on diverse issues (Adedoyin-Olowe, Gaber, Stahl, & Gomes, 2015). Leveraging large-scale public data from Twitter, we are able to analyze and map the spread of information related to water leaks in the street, under the pavement and roads in Mexico (see Fig. 1). We gathered an initial sample of 2000 geolocated tweets, posted by 1599 users, that contain the Spanish keywords "fuga de agua" (water leak).

Carlos Adolfo Piña García 
14016 - Estimating nonlinearity in cities' scaling laws
Abstract: The study of statistical and dynamical properties of cities from a complex-systems perspective is increasingly popular [1]. A celebrated result is the scaling between a city-specific observation y (e.g., the number of patents filed in the city) and the population x of the city as [2] $y = \alpha x^{\beta}$ (1), with a nontrivial ($\beta \neq 1$) exponent. Superlinear scaling ($\beta > 1$) was observed when y quantifies creative or economic outputs and indicates that the concentration of people in large cities leads to an increase in the per-capita production (y/x). Sublinear scaling ($\beta < 1$) was observed when y quantifies resource use and suggests that large cities are more efficient in the per-capita (y/x) consumption. Since its proposal, nonlinear scaling has been reported in an impressive variety of different aspects of cities. It has also inspired the proposal of different generative processes to explain its ubiquitous occurrence. Scalings similar to the one in Eq. (1) appear in physical (e.g., phase transitions) and biological (e.g., allometric scaling) systems, suggesting that cities share similarities with these and other complex systems (e.g., fractals). More recent results cast doubts on the significance of the $\beta \neq 1$ observations [3, 4, 5]. These results call for a more careful statistical analysis that rigorously quantifies the evidence for $\beta \neq 1$ in different datasets. We propose a statistical framework based on a probabilistic formulation of the scaling law (1) that allows us to perform hypothesis testing and model comparison. In particular, we quantify the evidence in favor of $\beta \neq 1$ by comparing (through the Bayesian Information Criterion, BIC) models with $\beta \neq 1$ to models with $\beta = 1$. The scaling relation in Eq. (1) describes a relation between two quantities y and x. However, the empirical data indicate that this relation can only be fulfilled on average. The statistical analysis we propose is based on the likelihood L of the data being generated by different models.
Following Ref. [6], we assume that the index y (e.g., the number of patents) of a city of size x is a random variable with probability density $P(y \mid x)$. We interpret Eq. (1) as the scaling of the expectation of y with x, $E(y \mid x) = \alpha x^{\beta}$ (2). This relation does not specify the shape of $P(y \mid x)$; e.g., it does not specify how the fluctuations $V(y \mid x) \equiv E(y^2 \mid x) - E(y \mid x)^2$ of y around $E(y \mid x)$ scale with x. Here we are interested in models $P(y \mid x)$ satisfying $V(y \mid x) = \gamma E(y \mid x)^{\delta}$ (3). This choice corresponds to Taylor's law. It is motivated by its ubiquitous appearance in complex systems, where typically $\delta \in [1, 2]$, and by previous analyses of city data which reported nontrivial fluctuations. The fluctuations in our models aim to effectively describe the combination of different effects, such as the variability in human activity and imprecisions in data gathering. In principle, these effects can be explicitly included in our framework by considering distinct models for each of them. We specify different models $P(y \mid x)$ compatible with Eqs. (2, 3). City models are the ones where we assume that each data point $y_i$ is an independent realization from the conditional distribution $P(y \mid x_i)$, effectively giving each city the same weight when computing the BIC of the model. For this model, we considered two different types of fluctuations, one Gaussian and the other log-normally distributed, thus choosing a priori a parametric form for $P(y \mid x)$. Person models are based on the natural interpretation of Eq. (1) that people's efficiency (or consumption) scales with the size of the city they are living in. This motivates us to consider a generative process in which tokens (e.g., a patent, a dollar of GDP, a mile of road) are produced or consumed by (assigned to) individual persons, which leads to a $P(y \mid x)$ that effectively weights the observations in terms of people.
Figure 1: Comparison of the city and person models. (A) Reported deaths by AIDS with respect to cities' population (dots); axes: x, Population; y, BrazilAids; legend: City Model, Person Model, Running mean. The lines represent the estimated scaling law giving the same weight to each city (city model, $\beta = 0.61$) and giving the same weight to each person (person model). (B) Cumulative distribution of the heavy-tailed distribution of city sizes in terms of cities and persons, i.e. the fraction of (i) cities of size $\leq x$ (City Model) and (ii) the population in cities of size $\leq x$; y-axis: fraction < x; annotations: 80% of the cities, 75% of the population.
We apply this approach to 15 datasets of cities from 5 regions and find that the conclusions regarding $\beta$ vary dramatically, depending not only on the datasets but also on assumptions of the models that go beyond Eq. (1). We argue that the estimation of $\beta$ is challenging and depends sensitively on the model because of the following two statistical properties of cities: (i) the distribution of city population has heavy tails (Zipf's law); (ii) there are large and heterogeneous fluctuations of y as a function of x (heteroscedasticity). We found that in most cases models are rejected by the data, and therefore conclusions can only be based on the comparison between the descriptive power of the different models considered here. Moreover, we found that models which differ only in their assumptions on the fluctuations can lead to different estimates of the scaling exponent $\beta$. In extreme cases, even the conclusion on whether a city index scales linearly ($\beta = 1$) or nonlinearly ($\beta \neq 1$) with city population depends on assumptions on the fluctuations. A further factor contributing to the large variability of $\beta$ is the broad city-size distribution, which causes models to be dominated either by small or by large cities.
In particular, these results show that the usual approach based on least-squares fitting is not sufficient to conclude on the existence of nonlinear scaling. Recent works have focused on developing generative models of urban formation that explain nonlinear scalings. Our finding that most models are rejected by the data confirms the need for such improved models. The significance of our results on models with different fluctuations is that they show that the estimation of the scaling exponent and the development of generative models cannot be done as separate steps. Instead, it is essential to consider the predicted fluctuations not only in the validation of the model but also in the estimation of the scaling exponent.
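The kind of model comparison described in this abstract can be illustrated with a deliberately simplified Python sketch. It assumes log-normal fluctuations with constant variance in log space (so that the variance of y scales as the square of its mean), fits the scaling law by least squares on the logarithms, and compares a free exponent against an exponent fixed to 1 through the BIC. This is only one special case of the models discussed, not the authors' framework, and the function name and data are hypothetical. The Jacobian term of the log-normal likelihood is the same for both models and is therefore omitted.

```python
import math

def fit_loglog(x, y, fixed_beta=None):
    """Fit ln y = ln(prefactor) + beta * ln x with Gaussian noise.

    Returns (prefactor, beta, bic). If fixed_beta is given, only the
    prefactor and the noise variance are estimated.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    if fixed_beta is None:
        sxx = sum((a - mx) ** 2 for a in lx)
        sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
        beta, k = sxy / sxx, 3  # prefactor, exponent, variance
    else:
        beta, k = fixed_beta, 2  # prefactor, variance
    intercept = my - beta * mx
    rss = sum((b - intercept - beta * a) ** 2 for a, b in zip(lx, ly))
    sigma2 = rss / n  # maximum-likelihood noise variance
    log_like = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    bic = k * math.log(n) - 2 * log_like
    return math.exp(intercept), beta, bic

# Synthetic cities: y = 2 * x**1.2 with fixed multiplicative noise.
x = [1e3, 3e3, 1e4, 3e4, 1e5, 3e5, 1e6, 3e6]
noise = [1.10, 0.90, 1.05, 0.95, 1.10, 0.90, 1.05, 0.95]
y = [2 * xi ** 1.2 * e for xi, e in zip(x, noise)]

_, beta_free, bic_free = fit_loglog(x, y)
_, _, bic_linear = fit_loglog(x, y, fixed_beta=1.0)
print(beta_free, bic_free, bic_linear)
```

A lower BIC for the free-exponent model indicates that the data favor nonlinear scaling under this particular noise assumption; as the abstract stresses, a different assumption on the fluctuations can change both the estimate and the conclusion.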

José M. Miotto 
14017 - Estimating Railway Travel Demand Through Social Media Geolocalised Data
Abstract: The fundamental four-stage modelling framework in railway planning is highly focused both on modal choice models and on the assignment of passengers' flows over networks. These last steps pursue the achievement of the maximum potential of new transportation policies, constantly moving towards more efficient and ecological modes. In Europe we are witnessing the emergence of several projects that aim to interconnect urban areas within and among countries, both with new or better-performing links and through the development of rolling stock able to interoperate among national networks characterized by different power-supply infrastructures and signalling/security systems and protocols. Linking demand and supply is therefore a challenge in designing, providing and validating better international services that are both reliable and of high quality. Here we develop a new framework able to estimate railway traffic demand through the detection of a set of geolocalised tweets, posted in the last three years, overlapping railway lines in Europe. We scale the data of the potential passengers over a line through the so-called "penetration rate", which estimates the ratio of our sample to the total tweeting population. We compare our data per line with the frequency of the services on several railway branches in order to calibrate our estimations of flows. Our findings provide information about passengers' flows through regions, going beyond current methodologies, which generally constrain data within single countries or administrations. The potential of the methodology therefore lies in the interoperability of data across countries, helping planners not only by providing a new source of cross-country demand estimation, but also by offering a new tool and set of data for the calibration and validation of transportation demand models.

Fabio Lamanna 