By John S. Hollywood, Operations Researcher; Kevin J. Strom, Senior Scientist; and Mark Pope, Research Analyst, RTI International, Research Triangle Park, North Carolina
Since the terrorist attacks of September 11, 2001, there have been repeated calls to find better ways to “connect the dots” by identifying and assembling small bits of information that collectively could spotlight a potential terrorist plot in progress. In recent years, the federal government has encouraged state and local law enforcement agencies to establish fusion centers to integrate data from different sources to create actionable knowledge. Yet while substantial direction has been provided on how to establish, organize, and manage a fusion center, little guidance has been offered on how to conduct data fusion for counterterrorism purposes.1
Past experiences have shown that both international and domestic terrorist organizations use surveillance-based activities as part of their site selection and operational planning phases. In 2006, for example, two men were charged with taking surveillance video of the U.S. Capitol building, the World Bank, a Masonic temple, and a fuel depot in Washington, D.C., to send to overseas terrorist groups.2 One approach to detecting terrorist activity is to identify instances of preplanning behaviors, including surveillance and probing of potential targets. Preplanning behaviors can include videotaping, photographing, or writing notes or drawing sketches of a building’s structural components or security defenses. Other activities include trespassing in secure areas, asking detailed questions about a target’s occupants or defenses, or leaving suspicious packages or making bomb threats to study emergency response operations.
This article describes a method for using 9-1-1 calls-for-service (CFS) data to find potential instances of surveillance by terrorists and presents a test case of the method.3 RTI International collaborated with the Washington, D.C., Metropolitan Police Department (MPD) to develop and test the method using more than 1.3 million CFS records that covered nearly two years of 9-1-1 calls. The project was sponsored by the National Institute of Justice (NIJ), the research arm of the U.S. Department of Justice.
Using 9-1-1 CFS records offers two distinct advantages. First, because of the nature of the calls themselves—callers observed a behavior that was suspicious enough to call 9-1-1—the call data have already been filtered to some extent in terms of the level of perceived seriousness of the incidents. Second, 9-1-1 call data are public information that can be analyzed without infringing on individuals’ privacy rights. This places the method in stark contrast to methods that rely on analyzing personal data, such as credit card transactions, phone records, and information on individuals brokered by data aggregators, which have come under heavy criticism for violating individual privacy rights.4
It should be stressed that 9-1-1 CFS data should not be analyzed in isolation for homeland security purposes. Rather, analyses are intended to augment a jurisdiction’s ongoing counterterrorism efforts by providing an additional information source that can assist in identifying locations (or types of locations) at an elevated risk of attack. This information can be cross-referenced with other investigative information, including known threats to specific locations or types of infrastructure (such as bridges or tunnels). In some cases, information can also be extracted that can be used in follow-up investigations, such as vehicle tag numbers related to certain suspicious activity.
One of the key advantages to the methods described in this article is that these processes can be easily replicated and do not require extensive technical training or software. Once refined and tested in additional jurisdictions, the methods could be implemented more widely to monitor levels of suspicious activity as part of an agency’s homeland security processes, identifying locations at increased risk.
The method for analyzing 9-1-1 calls has four major phases:
- Preprocessing the 9-1-1 CFS data to produce a single searchable data set
- Filtering the data set by type and keyword to identify incidents that might constitute surveillance or probing activity and reviewing the remaining incidents for relevance
- Identifying the clustering of incidents by location, time, and type of activity
- Prioritizing clusters of incidents and searching for additional information to test whether the cluster locations are at risk of being targets
Phase 1: Preprocessing the Data
To effectively analyze a jurisdiction’s 9-1-1 data, the following four fields are required:
- Location of the call, at minimum an address and preferably a location name (for landmarks) and geospatial coordinates
- Time and date of the call (or at minimum the date of the call)
- Type of call, including at least codes for suspicious persons or vehicle activities, suspicious packages, and bomb threats
- The 9-1-1 caller’s free-text description of the incident
Note that specific requirements for preprocessing 9-1-1 call data to create these fields will vary by jurisdiction.
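The four fields listed above can be pictured as a single record layout. The following is a minimal sketch in Python, not the MPD's actual schema: the class name `CfsRecord`, its field names, and the helper `is_usable` are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass(frozen=True)
class CfsRecord:
    """One preprocessed 9-1-1 call-for-service record (hypothetical layout)."""
    address: str                  # location of the call (minimum requirement)
    call_date: datetime           # date of the call, with time of day when known
    call_type: str                # call-type code or label assigned to the call
    description: str              # free-text description of the incident
    location_name: Optional[str] = None                 # landmark name, if any
    coordinates: Optional[Tuple[float, float]] = None   # (latitude, longitude)


def is_usable(record: CfsRecord) -> bool:
    """Return True when the three required text fields are non-empty.

    The date is always present because the dataclass requires it.
    """
    required_text = (record.address, record.call_type, record.description)
    return all(value.strip() for value in required_text)
```

A record with only an address, date, type, and description passes the check; the landmark name and coordinates stay optional because the article treats them as preferable rather than required.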
Phase 2: Filtering the Data
With millions of CFS records, the second phase filters out all but a small fraction of records that might reflect instances of surveillance or probing behavior. This phase involves two primary steps: filtering the data using certain keywords and manually reviewing remaining cases.
Keyword Filtering: While there are sophisticated algorithms for filtering data, one can also categorize records effectively through call type and key phrase matching. For the MPD data, potential surveillance reports were found in calls typed as “suspicious person,” “suspicious vehicle,” “investigate the trouble,” and “other.” Similarly, potential probing reports were found in calls typed as “suspicious package,” “hazmat,” and “bomb threat.” To find calls potentially related to surveillance, records containing the following key strings were selected:
- Photography: photo, camera, picture
- Video: video, taping, film, camcorder
- Note-taking: note, write, typing
- Visual aids: binocular, telescope, lens
Jurisdictions might need to identify keywords in order to exclude records as well. In the case of the MPD data, many of the records including photography keywords were requests for pictures of crime scenes or prisoners; this discovery led to a second round of keyword matching to eliminate such records. For the MPD data, the filtering process reduced the candidate pool of calls to about 1,150 records—a major reduction from the more than 1.3 million initial records.
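The type and key-string matching described above can be sketched in a few lines of Python. The call types and key strings come from the lists in this article; the exclusion phrases and the function names are hypothetical, since the article does not list the exact strings used in the second round of matching.

```python
# Call types in which potential surveillance reports were found (from the article).
SURVEILLANCE_CALL_TYPES = {
    "suspicious person", "suspicious vehicle", "investigate the trouble", "other",
}

# Key strings from the article, grouped by activity.
KEY_STRINGS = {
    "photography": ("photo", "camera", "picture"),
    "video": ("video", "taping", "film", "camcorder"),
    "note-taking": ("note", "write", "typing"),
    "visual aids": ("binocular", "telescope", "lens"),
}

# Hypothetical exclusion phrases; each jurisdiction would derive its own.
EXCLUDE_STRINGS = ("crime scene", "prisoner")


def matched_categories(description):
    """Return the activity categories whose key strings appear in the text."""
    text = description.lower()
    return [cat for cat, keys in KEY_STRINGS.items()
            if any(key in text for key in keys)]


def is_candidate(call_type, description):
    """Keep a record only if its type, key strings, and exclusions all allow it."""
    text = description.lower()
    if call_type.lower() not in SURVEILLANCE_CALL_TYPES:
        return False
    if any(phrase in text for phrase in EXCLUDE_STRINGS):
        return False
    return bool(matched_categories(text))
```

Plain substring matching is deliberately loose: "note" also matches "noted," for instance. Mismatches of that kind are left for the manual review step rather than handled in code.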
Manual Review: The next step in the filtering process is to review manually the remaining records of interest to assess call relevance with respect to potential surveillance and probing activity. During the review of the MPD CFS records, the following types of reports were excluded:
- Calls in which the reported behavior was suspicious but was strongly associated with more traditional crime such as larceny or burglary (such as suspicious individuals videotaping the inside of parked cars or wandering in and around personal residences)
- Calls that included a follow-up stating that the person engaged in the reported activity had no criminal intent (for example, a traveler accidentally leaving a suitcase or a genuine tourist engaging in risky photography)
- Calls in which the key phrase resulted in a mismatch, such as references to security cameras, photographs taken by crime scene technicians, or someone threatening to steal a camera
This manual review reduced the total candidate pool of calls to about 850 records, which were considered to have at least minimal potential relevance to hostile surveillance or probing. Although this number of records was fairly large because it covered 20 months of data, it was still feasible to assess all the records manually. In an operational setting, an analyst would most likely have no more than a few dozen new records to assess each month.
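The size of the reduction can be checked with simple arithmetic. The sketch below uses only the approximate counts reported in this article; because the starting figure is "more than 1.3 million," the computed shares are upper bounds, and the variable names are illustrative.

```python
initial_records = 1_300_000      # "more than 1.3 million" (lower bound)
keyword_matches = 1_150          # approximate count after type and keyword filtering
relevant_after_review = 850      # approximate count after manual review
months_covered = 20

# Share of all records that survived keyword filtering.
keyword_share = keyword_matches / initial_records
# Share of keyword matches that the manual review kept.
review_share = relevant_after_review / keyword_matches
# Average number of relevant records per month of data.
relevant_per_month = relevant_after_review / months_covered

print(f"kept by keyword filter: {keyword_share:.3%}")
print(f"kept by manual review:  {review_share:.1%}")
print(f"relevant per month:     {relevant_per_month:.1f}")
```

With these figures, keyword filtering keeps under 0.09 percent of the records, manual review keeps roughly 74 percent of the keyword matches, and the relevant records average 42.5 per month across the 20 months.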
Phase 3: Identifying Incident Clusters
The third phase of the method involves the identification of incident clusters by time, location, and type. In the MPD data, clusters of incidents were found in two ways. First, clusters were found by address. Incidents at the same address (or immediately nearby) were found by sorting the records by address and flagging duplicate addresses. The call data also included geospatial coordinates, which allowed for further identification of groups of incidents at nearby addresses by sorting on the coordinates. Several additional clusters were found by plotting the geospatial coordinates using geographic information system (GIS) software. The next step was to determine whether the clusters of incidents were further related by time (if the incidents were within a few days or weeks of each other) or by type (if the activities described in the comments were similar).
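Clustering by address and by proximity can be approximated without GIS software. The sketch below is one possible approach, not the procedure used in the study: it groups incidents by a normalized address string and then, instead of sorting on coordinates or plotting them, compares every pair of incidents by great-circle (haversine) distance. The dictionary keys, function names, and the 150-meter threshold are all assumptions.

```python
from collections import defaultdict
from math import asin, cos, radians, sin, sqrt


def normalize_address(address):
    """Uppercase and collapse whitespace so trivially different spellings match."""
    return " ".join(address.upper().split())


def clusters_by_address(incidents):
    """Group incident ids by normalized address; keep addresses with 2+ incidents."""
    groups = defaultdict(list)
    for inc in incidents:
        groups[normalize_address(inc["address"])].append(inc["id"])
    return {addr: ids for addr, ids in groups.items() if len(ids) > 1}


def haversine_m(a, b):
    """Great-circle distance in meters between two (latitude, longitude) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6_371_000 * asin(sqrt(h))


def nearby_pairs(incidents, max_m=150.0):
    """Return id pairs for incidents at different addresses within max_m meters."""
    pairs = []
    for i, a in enumerate(incidents):
        for b in incidents[i + 1:]:
            different = normalize_address(a["address"]) != normalize_address(b["address"])
            if different and haversine_m(a["coords"], b["coords"]) <= max_m:
                pairs.append((a["id"], b["id"]))
    return pairs
```

The pairwise comparison grows quadratically with the number of incidents, which is acceptable for a reviewed pool of a few hundred records but would need a spatial index for much larger sets.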
Figure 1. Tracking suspicious incidents by time and location. The horizontal axis numbers represent three-month intervals, and the vertical axis numbers indicate the number of incidents.
Second, clusters of incidents were found by location type, such as hotels, hospitals, and highway stretches. To find these clusters, incidents were manually assigned consistent labels for “landmark” locations that were not a generic residence or office building. Twenty-three types of locations were found, some of which were single locations with a large number of incidents. From these data, it was easy to prepare bar graphs comparing the numbers of incidents at each type of location and panels of line graphs tracking whether the number of incidents changed over time. Figure 1 shows a subset of the line graphs used to track MPD incidents by quarter, indicating increased activity over time for hospitals and hotels.
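The quarterly counts behind such line graphs are simple to tabulate once each incident carries a location-type label. The following sketch assumes incidents are supplied as (location type, date) pairs and that three-month intervals are counted from a chosen start month; the function names and this input layout are illustrative, not taken from the study.

```python
from collections import Counter
from datetime import date


def quarter_index(day, start):
    """Zero-based three-month interval of `day`, counted from the month of `start`."""
    months = (day.year - start.year) * 12 + (day.month - start.month)
    return months // 3


def counts_by_type_and_quarter(incidents, start):
    """Count incidents per (location type, quarter index)."""
    return Counter((loc_type, quarter_index(day, start))
                   for loc_type, day in incidents)


def trend(counts, loc_type, n_quarters):
    """Return the per-quarter series for one location type, zero-filled."""
    return [counts[(loc_type, q)] for q in range(n_quarters)]
```

Each series returned by `trend` corresponds to one line in a panel of graphs; a series that rises toward the final quarters is the pattern the article reports for hospitals and hotels.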
Phase 4: Prioritizing Incident Clusters
The final phase of the method identifies and analyzes incident clusters posing the greatest risk. The first step in this phase is to assess the risk of the incidents that make up the clusters using an assessment framework developed in consultation with MPD staff and other subject matter experts. The second step is to identify clusters of incidents, and their associated “locations of interest,” that warrant further study. For the MPD 9-1-1 data, 12 locations of interest were identified, with multiple moderately suspicious incidents.
The third step is to search for additional evidence that the locations of interest are being targeted. This step involves revisiting the 9-1-1 call data set to review every incident at the location of interest, identifying additional incidents potentially related to surveillance or probing. In the analysis of the MPD call data, the review yielded a number of additional cases at the locations of interest. Most of these were low-risk suspicious package calls, but a handful of calls were considered moderately suspicious and worthy of additional attention.
The fourth step is to summarize the evidence for and against the hypotheses that the locations of interest are actually being targeted. Evidence for targeting includes descriptions of the relevant clusters of incidents, along with any notable details. Evidence against targeting includes descriptions of incidents that initially appeared suspicious but were not, along with potential alternate explanations for the behavior based on input from investigators and analysts in the department.
The final step is to assess the evidence critically and prioritize the locations of interest by risk. For the MPD 9-1-1 data, 4 of the 12 locations of interest were identified as warranting additional police attention. Each of these locations experienced multiple instances of genuinely atypical behavior that continued or escalated through the end of the examination period. Table 1 provides a version of what the evidence table looked like for these four locations, with the specific locational information removed.
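The prioritization in the study rested on critical judgment by analysts, and the article does not specify the assessment framework. Purely as an illustration of the stated outcome (multiple atypical incidents that continued through the end of the examination period), the sketch below applies a hypothetical screening rule; the class, the function names, and the two-incident threshold are all assumptions and should not be read as the study's criteria.

```python
from dataclasses import dataclass


@dataclass
class LocationEvidence:
    name: str
    atypical_quarters: list  # quarter index of each incident judged atypical


def warrants_attention(ev, final_quarter, min_incidents=2):
    """Hypothetical screen: enough atypical incidents, at least one in the final quarter."""
    return (len(ev.atypical_quarters) >= min_incidents
            and final_quarter in ev.atypical_quarters)


def prioritize(evidence, final_quarter):
    """Return flagged locations, those with the most atypical incidents first."""
    flagged = [ev for ev in evidence if warrants_attention(ev, final_quarter)]
    return sorted(flagged, key=lambda ev: len(ev.atypical_quarters), reverse=True)
```

A rule like this can only narrow the list for an analyst; weighing the evidence for and against targeting at each location remains a manual step.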
Table 1. Locations of Interest
No examples of highly suspicious activity warning of a terrorist plot were found in the MPD CFS data. However, four areas were identified that had seen enough activity to warrant further investigation. Representatives from the MPD found these results to have significant operational value, because they could reinforce and expand the department’s existing knowledge of at-risk locations of interest or, in some cases, provide the department with new locations of interest. The MPD also noted likely ordinary explanations for some of these findings; for example, the MPD felt that most of the highway and bridge incidents described tourists taking pictures.
Attempting to identify suspicious behavior indicative of far more sinister plans often resembles looking for the proverbial needle in the haystack. Behaviors can be misinterpreted by citizens, officers, or security personnel, resulting in an unknown number of false-positive reports. Yet calls from citizens about suspicious activity could also represent an important source of information on potential terrorist threats. Unfortunately, the use of 9-1-1 data has been largely unexplored in terms of its potential utility to homeland security.
The objectives of this project were twofold: (1) to apply data analysis approaches to a data source commonly available to law enforcement agencies to verify if relevant findings could be produced and (2) to document this process so that it could be tested and refined in other jurisdictions and ultimately incorporated into standard operating procedures within law enforcement agencies or data fusion centers. The intention of this pilot study was not to identify confirmed terrorist activity; rather, the objective was to develop a process for reducing a large volume of information to a smaller subset of incidents and associated locations of interest that met some predefined criteria and were considered worthy of additional follow-up investigation.
The findings show that simple analytic approaches can produce operationally relevant findings using 9-1-1 CFS data. Type- and keyword-based filtering reduced the CFS records to a number manageable for a human analyst. That subset could then be characterized, clustered, and typed to produce a list of locations worth further investigation.
Plans are under way to continue to refine the method (for example, fully automating some of the data processing steps) and to expand the testing of the method to additional jurisdictions. As shown by this study, analyzing 9-1-1 call data can produce results that were previously unknown or can reinforce existing information on counterterrorism activities. In both of these scenarios, results can be produced that are operationally valuable to law enforcement and homeland security officials. More broadly, findings from CFS analysis can establish a baseline level of suspicious activity in jurisdictions. Ultimately, suspicious activity could be monitored on an ongoing basis, and the results could be used to take a proactive approach to terrorism prevention. This regular flow of prioritized information could be incorporated into a department’s strategic review process or into a fusion center environment.
The approaches taken in this study also have relevance to traditional crime prevention. Many calls of suspicious or criminal activity do not result in a formal police report; such calls include drug activity, disorderly conduct, and suspicious activity related to criminal activities (such as casings of locations or victims). As such, crime incident and arrest data might be insufficient for analyzing local crime trends and predicting emerging patterns in crime. By utilizing 9-1-1 CFS data, it should be possible to identify and predict small-area upswings in crime, as well as to better understand which types of suspicious and criminal activity are precursors to violent crime. Ultimately, the systematic use of 9-1-1 CFS data can help law enforcement agencies take more complete advantage of citizen reporting, both in terms of counterterrorism and violent crime prevention.
The authors would like to thank Chief Lanier and the MPD for their support. They would especially like to thank Anne Grant, Mel Blizzard, and Steven Sund, who provided valuable assistance and support throughout the course of this project. They would also like to thank the National Institute of Justice (NIJ), which provided funding for this research.
1. U.S. Department of Justice, Global Justice Information Sharing Initiative, Fusion Center Guidelines: Developing and Sharing Information and Intelligence in a New World, July 2005, http://www.fas.org/irp/agency/ise/guidelines.pdf (accessed August 25, 2008).
2. “Prosecutors Allege Suspects Shot ‘Casing Video,’” Associated Press, April 28, 2006, http://www.msnbc.msn.com/id/12539510/ (accessed August 25, 2008).
3. The method is derived from RTI’s more general Trinity Sight methodology for conducting predictive analysis for counterterrorism and law enforcement, which is described in Colleen McCue, Data Mining and Predictive Analysis: Intelligence Gathering and Crime Analysis (Burlington, Massachusetts: Butterworth-Heinemann, 2007).
4. As an example, see Robert O’Harrow Jr., “Centers Tap into Personal Databases: State Groups Were Formed after 9/11,” Washington Post, April 2, 2008, http://www.washingtonpost.com/wp-dyn/content/article/2008/04/01/AR2008040103049.html (accessed August 25, 2008).
From The Police Chief, vol. LXXV, no. 10, October 2008. Copyright held by the International Association of Chiefs of Police, 515 North Washington Street, Alexandria, VA 22314 USA.