Geospatial Awareness

From Knoesis wiki
Jump to: navigation, search

Participants

  • Derek Doran, Advisor (derek.doran@wright.edu)
  • Matthew Maurice, MS Student (matt@knoesis.org)
  • Matthew Pickenrock, MS Student (matthew.pickenrock@wright.edu)

Project Overview

Wide area motion imagery (WAMI) is routinely captured by aerial surveillance platforms to survey vary large geographic regions. Whereas current work focuses on a pittance of the data collected in WAMI to track single targets (e.g. people or cars), this project will leverage great quantities of the data to discover the positional, temporal, and dynamic characteristics of a broad region for geospatial awareness. This work seeks to extract temporal network representations of regions where dynamic flows (arcs) that begin and end at different positions (nodes) are captured. Analysis of these networks could be used to automatically reveal critical information about a region, including the ranking of positions by their ‘interestingness’, learning patterns of normal regional dynamics to discover abnormal activity, building identifiable fingerprints of a region for classification, and predicting the evolution of regional activity over time.

Our first effort focuses on building systems to ingest WAMI, extract a network representation, and translate sequences of frames into temporal network models. Extracting the network representation may be done with existing algorithms, but translating frame sequences into temporal networks is an open research question. Multiple experimental designs for translating WAMI into a temporal network will be explored. Diagnostics rooted in information theory will be devised to evaluate translation quality.

Project Objectives

The two objectives of this CSR project are to:

- Build a software tool (a Java library) that automatically extracts network representations of the dynamics of geospaces. Ideally, the library should be useful to any research group wanting to embark on WAMI analysis without building their own in house tools from scratch.
- Exploration of techniques to automatically develop temporal network representations from streaming data. The techniques could resolve this open research problem for any type of data source, but we also or instead may make our solution tailor made to WAMI.

Tentatively, the work is partitioned so that Spring and Summer 2015 are devoted to tool development, while Fall 2015 and Spring 2016 are devoted to temporal networks. An undergraduate research assistant, with aspirations for pursuing a graduate degree at WSU, will join the CSR team Fall 2015.

Project Details

Part 1: WAMINet

As the collection and use of WAMI becomes more prevalent, automated systems able to extract intelligence from them will become ever more important. However, two challenges stand in the way of the development and use of such automated systems in practice. The first challenge is that current state-of-the-art systems focus on identifying the features and dynamics of an individual person, a car, or a small number of other tracks moving within the geospace, without any regard for the “global” context of where and how these tracks are moving. Track identification and move- ment analysis is very important, but such movements may not be interpretable without a broader understanding of the shape and dynamics of the entire geospace. For example, it may not be possible to identify if the trajectory of a track, discovered with modern WAMI processing systems, is routine or unexpected without knowing the common trajec- tories that tracks in the geospace take. Completely automated analysis of WAMI, therefore, requires the discovery and analysis of not only local tracks and movements in the imagery, but also a reference model that characterizes the global features and dynamics of the geospace. The second challenge is the lack of software tools or libraries that support the development of WAMI analysis platforms. The typically high resolution and the environmental factors faced when capturing WAMI may require a specialized set of preprocessing steps compared to typical image processing tasks. This preprocessing must be built from scratch by every research group interested in running experiments over WAMI and by developers wanting to build practical tools.


WAMINet Progress

Note: Progress current as of (November 1 2015)

1. Project make and build, integrating GDAL, OpenImaj, using Apache Maven for simple project building -- Done

2. WAMI image registration from Greene 2007 SDMS WAMI dataset -- Done

- GDAL-based image rasterization and scaling
- Registration algorithms: after a comparative analysis, WAMINet uses RANSAC for key-point detection and the ASIFT algorithm for frame alignment.

3. Foreground (track) detection -- Done

- from k frames, we construct a median image that eliminates dynamic movement. Gaussian threshold filter applied to the difference of a frame against the median reveals the position of foreground objects.

4. Track assignment -- On Hold -- Ground-truth tracking data from AFRL has been provided.

- Resolve track assignment problem across multiple image difference frames
- Explore use of hungarian algorithm may be promising. Dealing with noise that gets through the gassuian filter

5. Point of interest detection -- Done

- We use a hierarchical clustering approach: for each track, use non-parametric clustering algorithm (dbscan) to identify locations where the individual does not move for extended periods of time
- Superimposing the centroids of such locations over all users onto a single dataset, we run another non-parametric clustering algorithm (OPTICS) to find locations -- points of interest -- where a large number of individual tracks are stationary.

<poi placeholder>

6. Library development -- Done

- WAMINet has been refactored using a Python wrapper. This Python module calls Image processing code from Java, and point of interest detection (hierarchical dbscan and optics) and data visualization and analysis code from R into a single pipeline for easy execution. All that should be necessary is specification of input image files, and the parameters k and beta to extract temporal network representations.
- R code has been developed into an importable R package, with a refactored and generic interface for any dataset.

Part 2: Extracting Temporal Network Representations

Time-ordered data such as WAMI frames stand to benefit through a representation as a temporal network, which is an n-tuple of networks G_1, …, G_n where each G_i is a single network (called a frame) representing the structure of the data observed during the ith discrete time period. However, data must be discretized or "binned" into time windows from which each G_i may be constructed. But without a sound methodology to perform discretization, it is unclear if any insights can be reached with a temporal network representation: analysis of a temporal network that does not adequately model the true dynamics and patterns of the time-ordered data can easily lead to inappropriate conclusions in a scientific study.

<temp net figure placeholder>

Hyper-parameters that determine how time-ordered data should be discretized, namely the length of the time interval k and the amount of overlap allowed between intervals \beta impact the shape of the resulting temporal network and its interpretation. An illustration of this effect is seen in the Figure above. For some data stream, the top representation uses large time windows so that each G_i summarizes a large number of data transactions or relations. The bottom representation over the same stream uses smaller windows, with some overlap, resulting in network frames that have structural correlations and are less dense. Recognizing that the two series of networks have very different structural properties and represent different sets of relationships, how one chooses to discretize the data will strongly impact the interpretation of the model.

Short of a natural method to divide WAMI frames into discrete bins, an ideal discretization is one yielding a temporal network that conveys as much "information" as possible. Consider for example a discretization of data about relationship formation among n entities. For small values of k and \beta, the intervals will feature fewer records of relationship formation, thus the resulting G_i's will have few edges. These sparse frames, likely to be very different from each other, thus convey little information and will lack meaningful patterns and correlations between the frames. At the other extreme, for large values of k and \beta, the larger number of relationships formed in a time interval will create Gi’s that are heavily correlated and dense, hiding significant structural patterns in its complex structure. A useful representation is thus one where frames are sufficiently "complex"to capture important structural patterns, but not so dense that such patterns are hidden within highly correlated frames.

In the same way that the Shannon entropy of a probability distribution quantifies the amount of information realizations convey to an observer, network entropy quantifies the randomness, and hence information, the structure of a network conveys. The discretization parameters maximizing the entropy of the resulting temporal network may thus be an appropriate choice. Yet network entropy has scarcely been explored in the context of temporal networks; analysis of the adaptability of present measures is thus necessary.

Network entropy measures broadly take the form

E =  -\sum_{i} \frac{f(V_i)}{\sum_j f(V_j)} \log{\frac{f(V_i)}{\sum_j f(V_j)}}

where f(V_i) is a functional mapping each vertex of the network to a real number. Different choices of f thus lead to different definitions of network entropy.

Temporal Network Progress

1. Identify UGRA supporting this task -- Done

2. Identify relevant R packages and frameworks for temporal network modeling, visualization, and analysis -- Done

3. Investigate relevant entropy measures for temporal networks

- Single-network measures and projections of temporal networks to a single structure
- Measures over random graph generators: ERGM still appears promising
- How may entropy measures be compared? Rates of change over time; expected changes in entropy against small perturbations in the network model

4. Develop methods for finding the 'best' temporal network representation across a collection of n > k frames -- In Progress

- From "best" model: appropriate max or min points on the surface (k, beta, Entropy) -- if convex.
- Current Experiments: given a simulated temporal network where edges are added and deleted with uniform probability based on a candidate set of nearest neighbors,

we explore different ways to measure entropy.

Reporting

Journals

1. Software Tools and Libraries for WAMI (placeholder title)

- Target: Journal of Systems and Software paper on WAMINet. New Ideas and Trends Paper (NITP) submission.

2. Dynamic Geospatial Analysis from Temporal Networks

- "Capstone" publication along with Matt Maurice's thesis.

3. WAMINet / OPTICS++

- Target: Journal of statistical software, 2016

Conference Proceedings

1. WAMINet : An Open Source Library for Dynamic Geospace Analysis Using WAMI

- IEEE International Symposium on Multimedia, December 2015
- Status: Accepted

2. Translations of of entropy measures for temporal networks

- Target: Graphs as Models

3. Temporal network representations from streaming data

- Target: NetSci 2016

4. OPTICS++

- Target: ICDM/KDD/SDM

CSR Monthly Reports

October/November 2015

There have been a number of recent developments to the project:

1. We have completed implementation of a compiled R package for WAMINet. This R package implements the DBSCAN track-level clustering, OPTICS geospace-level clustering, builds a temporal network representations of the data given k (window size) and beta (window overlap) and offers a number of visualizations for an analyst.

2. Extracting temporal network representations of the data given k and beta is complete. Edges between points of interest in the geospace are defined when a track is observed moving from an area in one point of interest to an area in another. The area of a point of interest is defined as the convex hull of the geospace-level cluster found by OPTICS (see figure above).

3. MP has been iterating and running experiments over a synthetic dataset where energy passes between neighboring vertices in a network with randomly placed vertices over time. Various ways to define the 'entropy' of a temporal network have been explored, with an eye towards finding a measure that is intuitive and yields a surface Entropy(k, \beta) that is convex. Initial experiments used an entropy measure of the form E =  -\sum_{n} \sum_{i} \frac{d^n_i}{\sum_n\sum_j d^n_j} \log{\frac{d^n_i}{\sum_n\sum_j d^n_j}} where n enumerates over all temporal network frames and d^n_i is the undirected degree of node i in temporal network frame G_n. In other words, the entropies of each individual network, using degree as our network statistic to induce a probability distribution, are computed with respect the sum of all node degrees across every temporal network. Recognizing that this entropy measure are, possibly inappropriately, combining the local distributions of vertex degree with a 'global' distribution of vertex degrees across all network frames, MP is turning to forms that sum only local entropies.

4. MP has devised a number of visuals for an analyst to interpret the temporal network that has been generated given some k and \beta. They may be used to study, among other things, sets of tracks that appear and disappear during common periods of time, paths over the geo-space that are consistently or sporadically followed.

5. MM is translating data about the temporal network representations into a visualization over Google Maps. The Google Maps API enables easy navigation and visualization of the geo-space with overlays for where the points of interest are located, and how tracks tend to move across the space over time. Summary visuals of movements are currently developed.

Steps for December 2015:

- MM is examining how to integrate semantic data about points of interest into a Google Maps visualization of track movements and points of interest. Reverse geocoding APIs may be used to annotate points of interest with metadata. Paths frequently seen across the set of temporal network frames {G_i}, annotated with this metadata, reveals qualitative information about patterns of life. Paths that emerge sporadically may also reveal an 'interesting' one-time event.

- MP will continue exploring entropy measures for temporal networks. With WAMINet now able to compute a temporal network of a geospace given k and beta, MP will apply these measures to the real tracking and synthetic datasets. Our target is to find an entropy measure specifically tailored to the tracking data provided by AFRL. We will also reach out to gain new tracking datasets to test if the entropy measure generalizes.