CS 232
Algorithms and Data Structures
Edit Distance Assignment

Background

    This programming project will require you to implement (in Java) an algorithm to compute the minimum edit distance between two strings.  The algorithm has been explained in lecture and you must use that particular algorithm.  As with the previous programming assignment, this is a phased assignment.  As was the case previously, any phase submitted early will be graded within three business days or an extension will be granted.

The Problem
 

    Given a string, we assume that the three operations below can each be performed for unit cost:

    For this problem, you are to read in two strings and compute the cheapest way to convert the first string into the second using the cost model above.  For instance, "be" can be converted into "boy" at a cost of 2, by changing the 'e' to an 'o' and then inserting a 'y' at the end.

    There are two versions of this problem - Phases III and IV.  In Phase III, you need only report the length of the list of commands.  In Phase IV, you must specify the actual commands for the transformation.  If more than one sequence of commands will work, then any valid sequence will suffice, as long as it is of minimal length.  Note that if you complete Phase IV first, it is trivial to write Phase III (simply return the length of the list), but that it is probably wise to complete them in the specified order.  The class you create should be named Editor.  The two methods you will need to write have the following signatures:

    public static int editDistance(String initial, String modified)

and

    public static ArrayList editChanges(String initial, String modified)

 

Phase I (due Friday, April 11 at 4 p.m.)

    Once again, we will begin with JUnit tests.  This time you must write two different kinds of tests, one for each of the methods.  Of course, since many lists may be correct for the editChanges method, it can be tricky to test it.  Fortunately, you have the Transform class (again from the CS 232 package).  The class is provided in EditHelper.jar, along with the necessary documentation.  It contains a method apply, that will apply an ArrayList<String> of commands to a given String and return the result.  (It contains another method as well; read about it.)  Invoking this method from within your test methods should enable you to test the editChanges method.  As with the skyline problem, include your initials as part of each test method in your suite.  (Hopefully) correct and buggy versions of the solution are also available.

Phase II (due Friday, April 11 at 4 p.m.)

   Similar to Phase II of the skyline problem, this phase simply requires you to demonstrate that you understand the commands necessary to convert one String to another and that you can properly use the Transform class.  You are to build an ArrayList of commands that convert the String "Harlan" into the String "Hunkins".  You are then to apply this list to the String "Harlan" while the class is using "verbose" mode.  Thus, the program will show the results of your transformation at each step.

Phase III (due Friday April 25 at 4 p.m.)

   This program is to compute the minimum edit distance between two Strings.  In other words, you need to complete the editDistance method described above.  As was the case with the skyline assignment, the complete set of test cases submitted by students will be shared as soon as feasible.  Phase III tests are now available.

Phase IV (due Friday, April 25, at 4 p.m.)

   In this phase, the program is not necessarily to compute the number of steps, but is to build an ArrayList of the steps themselves.  A sample output showing the format is shown below

    delete(5)
    insert(3,a)
    change(2,q)

This output says that the change can be effected by deleting the fifth character, followed by inserting an 'a' after the third character, followed by changing the second character to a 'q'. 

As was the case with the skyline assignment, the complete set of test cases submitted by students will be shared as soon as feasible.

Phase IV tests are now available.

Note:  Each Phase is to be turned in via a separate email message.