WWPS401: Data Polygamy: The Many-Many Relationships among Urban Spatio-Temporal Datasets
AWS re:Invent 2016 - A podcast by AWS
In this session, learn how Data Polygamy, a scalable topology-based framework, can enable users to query for statistically significant relationships between spatio-temporal datasets. With the increasing ability to collect data from urban environments and a push toward openness by governments, we can analyze numerous spatio-temporal datasets covering diverse aspects of a city. Urban data captures the behavior of the city’s citizens, existing infrastructure (both physical and policy), and environment over space and time. Discovering relationships between these datasets can produce new insights by enabling domain experts to not only test but also generate hypotheses. However, discovery is difficult. A relationship between two datasets may occur only at locations or time periods that behave differently compared to the region’s neighborhood. The size and number of datasets, and the diverse spatial and temporal scales at which the data is available, present computational challenges. Finally, of several thousand possible relationships, only a small fraction is actually informative. We have implemented the framework on Amazon EMR and show through an experimental evaluation using over 300 spatio-temporal urban datasets that our approach is scalable and effective at identifying significant relationships. Find details about the work at