CS 128/ES 228 - Introduction to Geographic Information Systems

Lab 8: Geocoding

Goals:

        By the conclusion of this lab period, you will have:

  1. Learned to perform basic geocoding operations in a GIS.
  2. Dealt with some of the most common geocoding errors.

Disclaimer

    All of the data/addresses/events in this lab (other than the TIGER data) are the product of (one of) your instructor's overactive imagination.  The addresses themselves are real, but the events described are fictional and should be treated as such!

Background

    Geocoding is the act of locating a point based solely upon its address.  In other words, the geocoding operation turns a conventional address into a point in the coordinate system of the given GIS.  To accomplish this, one must have several items:

  1. A base layer that includes streets (and cities/towns) by name as well as by location, and the attribute data for those streets must include the numbers of the houses at both ends of the street.  TIGER files do contain such data and often serve this purpose in a GIS.

  2. An algorithm to match addresses to locations.  This algorithm is generally built into the GIS, but is often "tunable" in the sense that the user may specify the degree of tolerance for matters such as spelling.
  3. A review mechanism to allow users to "correct" choices made by the matching algorithm. 
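
As a rough illustration of items 2 and 3, the sketch below scores street names with a spelling tolerance and flags results that would need human review.  It is not the algorithm ArcGIS uses: the street list, the min_score threshold, and the function names are all hypothetical, and the similarity measure is simply the one in Python's standard difflib module.

```python
from difflib import SequenceMatcher

# Hypothetical street names standing in for the base layer's attribute table.
STREETS = ["N UNION ST", "S UNION ST", "WASHINGTON ST", "KAMERY RD"]


def score(query, candidate):
    """Similarity between two street names on a 0-100 scale."""
    ratio = SequenceMatcher(None, query.upper(), candidate.upper()).ratio()
    return round(100 * ratio)


def candidates(query, streets, min_score=80):
    """Return (score, street) pairs at or above min_score, best first."""
    scored = [(score(query, s), s) for s in streets]
    return sorted((p for p in scored if p[0] >= min_score), reverse=True)


def needs_review(matches):
    """True when nothing matched or the two best candidates are tied."""
    if not matches:
        return True
    return len(matches) > 1 and matches[0][0] == matches[1][0]


# A misspelled name still finds one clear candidate.
print(candidates("Washingtn St", STREETS))
# A name missing its direction prefix matches two streets equally well.
print(needs_review(candidates("Union St", STREETS)))
```

Lowering min_score admits sloppier spellings at the cost of more false candidates, which is the kind of trade-off the tuning controls expose.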

The same base layer may be, and often is, used for all of the geocoding within a given GIS.  In our course, we will use TIGER maps for this purpose.
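
To make the role of the house numbers concrete, here is a minimal sketch of how a locator might place an address along a street segment by linear interpolation between the numbers at its two ends.  The Segment layout, the coordinates, and the number range are invented for illustration; real TIGER records also carry separate ranges for the left and right sides of a street, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One street segment with a single house-number range (hypothetical layout)."""
    name: str
    from_addr: int   # house number at the start of the segment
    to_addr: int     # house number at the end of the segment
    start: tuple     # (x, y) of the segment's start point
    end: tuple       # (x, y) of the segment's end point


def interpolate(segment, house_number):
    """Place a house number proportionally along a straight segment."""
    lo, hi = sorted((segment.from_addr, segment.to_addr))
    if not lo <= house_number <= hi:
        return None  # the number falls outside this segment's range
    span = segment.to_addr - segment.from_addr
    t = 0.0 if span == 0 else (house_number - segment.from_addr) / span
    x = segment.start[0] + t * (segment.end[0] - segment.start[0])
    y = segment.start[1] + t * (segment.end[1] - segment.start[1])
    return (x, y)


# Made-up segment: numbers 100 to 200 along a 1000-unit north-south line.
seg = Segment("N UNION ST", 100, 200, (0.0, 0.0), (0.0, 1000.0))
print(interpolate(seg, 146))   # a point a bit less than halfway along
print(interpolate(seg, 250))   # None: no range on this segment covers 250
```

A segment whose range does not cover the requested number yields no point at all, which is why the range attributes in the base layer matter as much as the geometry.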

 Instructions

  1. Getting the base layer.  For this lab, we will begin with an empty map.  Our base layer will be the TIGER data for roads in Cattaraugus County.  Download the zip file for this layer, extract it in a folder named Lab8 on your machine and add the appropriate file to your empty map.  Open the attribute table.  How many records are in this database?  How much space does the database take up on disk?  How much space does the complete data set take up on disk?


  2. Preparing the address locator.  ArcGIS locates addresses through an address locator.  In a simple sense, an address locator specifies the database to be used by the geocoding algorithm and the names of the fields within that database that are to be used for each purpose.  Before we can use our TIGER layer to help us geocode a set of addresses, we need to create an address locator based upon the layer.  To do so, start ArcCatalog.  In the left window, select the Address Locator option.  In the right window, you should then double-click on Create New Address Locator.  Name your locator "CattaraugusCountyLocator", have it be of type US Streets (File-based), and have the reference data be your TIGER shapefile.  Once you set the reference data, field names should appear in the windows on the left.  You should examine the values on the right to learn about your options, but you should leave them unchanged for now.  Simply click OK to create the locator.  To make ArcMap aware of the address locator, choose Address Locator Manager from Geocoding... under the Tools menu and then select the one you just created.  (The Address Locator option should be part of the pull-down menu.)

  3. Preparing the table of addresses.  The last data set needed for the geocoding operation is a series of addresses.  We will build our data set directly in ArcCatalog.  Within ArcCatalog, navigate to your Lab8 folder and then create a new dBASE table.  Name it events.dbf.  Add this file as a layer to your map.  Open its attribute table.  Add two new text fields named Address and Event, of length 35 and 20 respectively.  Delete the field named Field1.  Put the following data into the fields (remember to start editing first):
     
    Address               Event
    1414 Washington St    Party
    146 N Union St        Injury
    1220 Kamery Rd        Crime

    Save your data and stop editing.


  4. Geocoding (when it works).  Within ArcMap, geocode the addresses by choosing to do so from under the Tools menu.  Before you dismiss the window, determine how well the matching worked.  Report these results.  Once you dismiss the window, you should be able to find the three dots on your map.  Note that you may need to zoom appropriately to really understand things.  Once you are done, remove this layer (so that it doesn't block the results of the next step).


  5. Simple addition.  Go back to your table and add another injury at 120 South 14th St.  Do what is necessary to have this location appear on the map as well.


  6. Handling ambiguous addresses.  A crime has occurred at the Olean Public Library (located at 134 N Second St).  Add the appropriate data to the events table and geocode a new layer.  The library should be located within the quadrilateral defined by the four previous points.  It is not.  Where is it?  The problem is that ArcMap found a tie for the best match and did not choose the location you wanted.  Review this match and choose the more southerly of the two plausible locations for the library.  (Note that while you are reviewing, all candidates show up in light blue with your current choice in yellow.  Choose different records until the yellow dot falls on the location you want.)  Warning: you may have to turn the layer off and then back on to see your changes.  What is the match score for the choice you made?

  7. Errors in the base layer.  The library in Step 6 was not matched properly because there were too many matches and the algorithm did not choose the one you desired.  A different problem arises when the database does not contain the address at all.  To your events table add a party at 3299 West Valley View Dr.  After deleting the previous results layer, geocode your new table.  Review all of the choices, particularly the one for your new party.  What is the problem with the data?  Do the best you can in terms of locating the party.


  8. Further selections.  A geocoded layer is like any other layer.  Using standard techniques, select the locations within your geocoded layer that correspond to parties.  Create a screen snapshot showing both layers plus the selection.


  9. Correcting the base layer.  (Complete this step only if time permits.)  In truth, 3299 West Valley View Dr. is at the southern (not northern) end of the road, right where the road bends to the east.  The east-west road there has no house number data.  What is the record number for this road?  How did you find out?  Add some data for that road into the shapefile; make the house numbers go from 3298 to 3500.  Save your edits.  Go back in to make sure that they were saved correctly.  Now perform the geocoding operation again.  Describe the changes you see (or don't see) and how you account for them.  If you recall enough about the early steps in this lab, you should be able to explain this completely.  If not, give the best explanation you can.

 

To Hand In

    Hand in "the usual": a Word document containing your responses to the questions and explanation requests from the various steps, the screen snapshot from Step 8, and a cover page.

 

 

Help Policy

       Help Policy in Effect for This Assignment: Group Project With Limited Collaboration

       In particular, you may discuss the assignment and concepts related to the assignment with the following persons, in addition to an instructor in this course: any GIS instructor and any student enrolled in CS 128/ES 228.

       You may use the following materials produced by other students: materials produced by members of your own group.