CS 232
Algorithms and Data Structures
Edit Distance Assignment
Background
This programming project will require you to implement (in Java) an algorithm to compute the minimum edit distance between two strings. The algorithm will be explained in lecture and you must use that particular algorithm. As with the previous assignment, this is a phased assignment. As was the case previously, any phase submitted early will be graded within seven days or an extension will be granted.
The Problem
Given a string, we assume that the three operations below can each be performed for unit cost:
- Delete a character from the string
- Insert a character into the string
- Change a character in the string
For this problem, you are to read in two strings and compute the cheapest way to convert the first string into the second using the cost model above. For instance, "be" can be converted into "boy" at a cost of 2, by changing the 'e' to an 'o' and then inserting a 'y' at the end.
Phase I (due Friday, April 29 in class)
It is always better to determine your test data before you begin writing code. Doing so prevents you from designing your test data to cover only those cases you dealt with in the code. Therefore, the first phase of the assignment requires that you create a test suite against which you will run your code. Your test suite should include triples indicating, the input (two strings), the (correct) output (just the edit distance will suffice), and the condition being tested. For instance, ("be"/"boy", 2, typical case) would be the triple for the example above.
Phase II (due Wednesday, May 4, at 4 p.m.)
This program is to read two strings from the keyboard (NOT from a JOptionPane or other widget), and compute the minimum edit distance. It should simply print out the number of steps needed to convert the first string into the second. Only the integer should print; nothing else.
Phase III (due Wednesday, May 4, at 4 p.m.)
In this phase, the program is to compute not only the number of steps, but is to print out the steps themselves, one to a line. A sample output showing the format is shown below:
delete(5)
insert(3,a)
change(2,q)
This output says that the change can be effected by deleting the fifth character, followed by inserting an 'a' before the third character, followed by changing the second character to a 'q'.
Note the spelling of the commands as well as the fact that the letters do NOT have quotes around them. Remember that the initial character of a string has an index of 0.
Note: Phases II and III should be turned in via email.