A common thread among all of the introductory Physical Geography and Cultural Geography courses I teach every semester is that students bring with them both prior geographical knowledge, which they may or may not be aware of, and an interest in using technology to learn about the world. Keeping this in mind, I introduce students to geospatial technologies early in the semester by demonstrating a technology most are familiar with: Google Maps. I geocode an address by typing it into the application, which ties it to geographic coordinates and places a point on a map. Knowing most students have already used the technology, I wait for the inevitable sighs or eye-rolls that suggest I am not teaching them anything new. Then I ask them, “How does the application know where to place the point for the specified address?” Silence typically reigns, which allows me to start talking about how geocoding accuracy depends on a high-quality reference dataset.
A reference dataset is a collection of geographic features (points, lines, or polygons) representing real-world objects (e.g., property centers, road centerlines, ZIP Code boundaries) that already have addresses tied to them in an associated table. During geocoding, an application like Google Maps compares the input address with the reference dataset, and when a match exists, it uses the coordinates of the matching feature to map the input address. A reference dataset’s quality depends on the locational accuracy of its features (e.g., how well a point marks the center of a property) and on the attribute data used for address matching (e.g., Street Name, ZIP Code). That quality matters because it determines how accurately the matched address is mapped.
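The matching step described above can be sketched in a few lines of Python. This is only an illustration, not how Google Maps or any production geocoder works: the reference table, its addresses and coordinates, and the function names `normalize` and `geocode` are all made up for the example, and matching is reduced to an exact lookup on a street string and ZIP Code.

```python
from typing import Dict, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)

# Hypothetical reference dataset: (street address, ZIP Code) -> coordinates.
# Every address and coordinate here is invented for illustration.
REFERENCE: Dict[Tuple[str, str], Coordinates] = {
    ("100 MAIN ST", "21201"): (39.2900, -76.6200),
    ("8000 YORK RD", "21252"): (39.3900, -76.6100),
}


def normalize(text: str) -> str:
    """Uppercase the text and collapse runs of whitespace to single spaces."""
    return " ".join(text.upper().split())


def geocode(street: str, zip_code: str,
            reference: Dict[Tuple[str, str], Coordinates] = REFERENCE
            ) -> Optional[Coordinates]:
    """Return the coordinates of the matching reference feature, or None."""
    key = (normalize(street), zip_code.strip())
    return reference.get(key)
```

Calling `geocode(" 100 main  st ", "21201")` returns the stored coordinates for that entry, while an address missing from the table returns `None`. Either way, the point can only be as accurate as the coordinates stored in the reference dataset.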
Maryland Statewide Addressing Initiative
Towson University Center for GIS (CGIS) has been supporting addressing initiatives in Maryland for quite some time, and continually looks for cutting-edge technologies and improved processes to collect addressing information from across the state for a holistic reference dataset. Since 2007, we have led the Maryland Statewide Addressing Initiative, which began as a pilot project testing geodatabase replication in the collection of road centerline address data for the entire State of Maryland. This initiative was supported by the Maryland State Geographic Information Committee (MSGIC) and ESRI and was initially funded through a grant from the Maryland Highway Safety Office in an effort to improve highway safety through the timely and accurate location of traffic-related incidents.
Since 2013, the Maryland Department of Information Technology (DoIT) has worked on developing and maintaining a high-quality reference dataset of address points. Every address point is associated with a property parcel, and its location typically corresponds to an addressable structure on that property, which makes the dataset very accurate from a locational standpoint. Building upon the established Maryland Statewide Addressing Initiative’s methodologies, DoIT collects address point data from Maryland’s counties and Baltimore City every four months and standardizes the data into a single statewide dataset of over 2.5 million address points. The statewide address point dataset serves as an important reference dataset for several of Maryland’s geocoding services, which people use to map addresses of interest.
Reducing Level of Effort and Gaining Efficiencies
Since the beginning of the statewide address point initiative, CGIS has worked with DoIT to develop an automated process that uses the local jurisdictions’ address point submissions to create a statewide address point dataset. In 2013, CGIS created several Python scripts that 1) standardize each local jurisdiction’s submission to the statewide address point schema; 2) clean up the data so that all text values are trimmed, uppercase, and free of hyphens and apostrophes; 3) add and calculate additional fields; and 4) combine all of the submissions into a single statewide address point dataset.
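The four steps can be sketched roughly as follows. This is not the CGIS code: it assumes each submission is a plain list of dictionaries rather than a GIS feature class, the field names (`ADD_NUMBER`, `STREET_NAME`, `STREET_TYPE`, `FULL_ADDRESS`) are hypothetical stand-ins for the statewide schema, and hyphens and apostrophes are assumed to be removed outright rather than replaced with spaces.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]
FieldMap = Dict[str, str]  # local field name -> statewide field name


def clean_text(value: str) -> str:
    """Step 2: uppercase, drop hyphens and apostrophes, then trim."""
    for char in ("-", "'", "\u2019"):  # hyphen, straight and curly apostrophe
        value = value.replace(char, "")
    return value.strip().upper()


def standardize(record: Record, field_map: FieldMap) -> Record:
    """Step 1: rename local fields to the shared schema, cleaning each value."""
    return {target: clean_text(record.get(source, ""))
            for source, target in field_map.items()}


def add_full_address(record: Record) -> Record:
    """Step 3: calculate an extra field from the standardized ones."""
    parts = [record.get(name, "")
             for name in ("ADD_NUMBER", "STREET_NAME", "STREET_TYPE")]
    result = dict(record)
    result["FULL_ADDRESS"] = " ".join(part for part in parts if part)
    return result


def build_statewide(submissions: List[Tuple[List[Record], FieldMap]]) -> List[Record]:
    """Step 4: combine every jurisdiction's records into one dataset."""
    statewide: List[Record] = []
    for records, field_map in submissions:
        for record in records:
            statewide.append(add_full_address(standardize(record, field_map)))
    return statewide
```

Each submission carries its own field map, so a jurisdiction that calls the house number `HouseNo` and another that calls it `num` both end up with the same `ADD_NUMBER` field in the combined output.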
In 2017, CGIS and DoIT revisited the address point process and identified areas for improvement, based largely on changes to how some local jurisdictions submitted data to DoIT. In the previous process, the local jurisdictions comprising the Baltimore Metropolitan Council (BMC) submitted their data to BMC, and BMC submitted a single combined dataset to DoIT for inclusion in the statewide address point dataset. Now, the local jurisdictions making up the BMC make their submissions directly to DoIT, which required a significant change to the original process. CGIS and DoIT used this change in workflow as an opportunity to streamline the original process by automating previously manual tasks and by integrating the entire process into FME Server, a software package DoIT currently uses for several other data management tasks. By reducing the amount of manual intervention and integrating data validation checks, the modifications make maintenance of DoIT’s statewide address point dataset more efficient and sustainable.