David runs FileJockey Software and can be reached at http://www.FileJockeySoftware.com/ or [email protected].
To have a fighting chance of remembering relatives' birthdays and anniversaries, I put together a database table with the necessary information. But in designing the table I ran into a problem: Which field-types should I select? The database I use, Corel Paradox, provides three date-related field-typesDate, Time, and Timestamp. None of these seem appropriate since I do not want to sort rows based on the years when people were born or married. Instead, I included Month Name, Day, Year, and Month Number columns. After entering the data, I sorted on the Month Number, Day, Last Name, and First Name columns.
This worked but has two drawbacksI had to enter month numbers that correspond to month names, and the Month Number column was displayed on the printout. I later found that I could define a date display screen format that suppresses year numbers. (This format does not apply to printing, exporting, or publishing in HTML.) I then combined the Month Name, Day, and Month Number columns into a Date column. The resulting display looks cleaner but makes adding to the table more difficult. The reason is that the Date column includesbut doesn't showyear information. To add birthdays or anniversaries in future years, I will need to display the dates in a format that shows year numbers and enter new rows with the year set to 2005.
I present an alternative in this article. Databases should support a month field-type that will sort in month order. In searching the Web, I found that databases have varying support for date-related field-types. Of those that I looked at, MySQL has the most such field-types. It allows Date, Datetime, Time, Timestamp, and Year fields.
Month-text ordering is also useful for filesystem sorting. For example, you could have a 2005 Marketing Plan folder that contains files such as January.doc, February.doc, and so on. These files might not be created in order and they may be modified out of order. As a result, sorting by the last-modified date/time would not necessarily arrange them properly. Other examples of month-related files include progress reports, sales reports, and newsletters. Some of these may be edited months after they are created. Folders with month names could be useful for storing photos and magazine articles.
MonthOrder Class
The first design goal for this class was to allow for short and long month names. In addition, I decided to include support for the common abbreviation "Sept." Because the month names are not expected to change, they can be sorted alphabetically, stored in an array, and found using a binary search. However, because hash tables are available in Java and MFC, I decided to use one to store the month names.
How many entries should be placed in this hash table? In a shareware file manager that I haven't completed, I use 12 entries that are keyed on the first three letters of the month names. To check if a word corresponds to a short or long month name, this program copies the first three letters, converts them to uppercase, and tests whether this key is in the table. If there is a match, it then tests if this word matches the beginning part of the corresponding month name and that the length of this word is three for any month, four for September, or the full length of the month name.
For the MonthOrder class, I decided to use 24 table entries. The additional entries are the short names, except for "MAY," plus "SEPT." As a result, the testing logic is simplified. A potential month name is converted to uppercase, an existing trailing dot is removed, and the key is checked in the hash table. (The previous routine does not handle trailing dots since they are removed by a tokenizer.) See Listing One.
The compareTo function in Listing Two starts by requesting month numbers for a pair of strings. If both strings are month names and their month numbers differ, these numbers are subtracted. If they match, their lengths are subtracted. This is unnecessary but makes the order look nicer. If only one string is a month name, it is placed first. By grouping month names above other strings, there are no inconsistencies. For example, sorting "January," "February," and "Hat" would lead to an unpredictable arrangement without grouping. If neither string is a month name, they are ordered by using a routine that is based on my article "Alphanumeric Ordering" (DDJ, October 2000).
The Java implementation of this routine differs from the C implementation in that substrings consisting solely of digits must be created for Integer.parseInt to work. The C function atoi can handle strings that start with digits and are followed by letters; see Listing Three.
Sample Program
MonthOrderApplet (available electronically, see "Resource Center," page 4) demonstrates month-text ordering. To use it, load MonthOrder.html into a browser. The class files must be in the same folder as this file. Then, you could either add strings individually or you could add name sets. For the latter, check either or both of the checkboxes and press the Add Name Set(s) button. After checking the Long Month Names box and pressing this button, you will see what is in Figure 1. Then press the Sort button. The month names will be arranged as in Figure 2. You can add short month names and resort the list. Because the compareTo function in Listing Two orders strings based on length when they represent the same month, short names appear before full names; see Figure 3.
Pressing the Remove Item(s) button removes selected items. The corresponding routine does not use an array of indexes. Instead, it looks at each item and deletes it if it is selected. Afterwards, either the count variable or the current index is updated; see Listing Four.
File and Folder Names
Several month names have meanings in other contexts. For example, "march" and "may" are common words. In addition, "April," "May," and "June" are women's names. This creates an interpretation problem. One resolution is to surround text with braces to signal that it should be interpreted as a potential month. Then, when a filesystem comparison function sees a string such as "{May}," it knows to look in a hash table to find the index of the text inside the braces.
The other problem with mixing month names and other strings was already addressed. Month names are grouped above other strings so that a comparison function does not return inconsistent results to a sorting algorithm.
Month Field-Type
A month field-type could contain short and long month names and compare them using the routines in Listings One and Two. An alternate approach is to cache the month indexes along with case and length information.
One possible encoding in a 32-bit integer starts with the left-most byte being zero. The remaining bytes would contain a case code, a length, and a month index. Months encoded in this manner would be compared after applying a 0xFF mask to extract the month index. The case code would range from zero to two to represent mixed, upper, and lowercase, respectively. If the first letter of an input month name is in lowercase, assume that this name is in lowercase. Otherwise, if the second letter is in uppercase, assume that this name is in uppercase. The remaining circumstance is to assume that this name is in mixed case, as in "January."
The lengths would range from three to nine, and the month indexes would range from one to twelve. A display routine would subtract one from an index, use this value to select a month name from a string array, extract the required number of characters, and adjust the case, if necessary.
If 16-bit integers are available, then each part of a month-field code could be put into a nybble (4 bits). Then the mask would be 0xF.
Conclusion
A month field-type would be helpful in database tables when day or year information is not useful for sorting or display. In a filesystem context, month-text ordering supplements alphanumeric ordering in that it also arranges file and folder names in a way that respects the patterns that people see in strings.
DDJ
import java.util.Hashtable; public class MonthOrder implements SortTest { Hashtable map = new Hashtable(); public MonthOrder() { MonthInfo[] monthList = new MonthInfo[24]; monthList[0] = new MonthInfo ("JANUARY", 1); monthList[1] = new MonthInfo ("FEBRUARY", 2); monthList[2] = new MonthInfo ("MARCH", 3); monthList[3] = new MonthInfo ("APRIL", 4); monthList[4] = new MonthInfo ("MAY", 5); monthList[5] = new MonthInfo ("JUNE", 6); monthList[6] = new MonthInfo ("JULY", 7); monthList[7] = new MonthInfo ("AUGUST", 8); monthList[8] = new MonthInfo ("SEPTEMBER", 9); monthList[9] = new MonthInfo ("OCTOBER", 10); monthList[10] = new MonthInfo ("NOVEMBER", 11); monthList[11] = new MonthInfo ("DECEMBER", 12); monthList[12] = new MonthInfo ("JAN", 1); monthList[13] = new MonthInfo ("FEB", 2); monthList[14] = new MonthInfo ("MAR", 3); monthList[15] = new MonthInfo ("APR", 4); monthList[16] = new MonthInfo ("JUN", 6); // MAY is above monthList[17] = new MonthInfo ("JUL", 7); monthList[18] = new MonthInfo ("AUG", 8); monthList[19] = new MonthInfo ("SEP", 9); monthList[20] = new MonthInfo ("SEPT", 9); // alternative monthList[21] = new MonthInfo ("OCT", 10); monthList[22] = new MonthInfo ("NOV", 11); monthList[23] = new MonthInfo ("DEC", 12); for (int i = 0; i < monthList.length; i++) { map.put (monthList[i].monthName, monthList[i]); } } // MonthOrder private int findMonth (String monthName) { if (monthName == null) return -1; // Prepares test string String key = monthName.toUpperCase(); int dot = key.indexOf ('.'); // trims off trailing dot if (dot != -1) key = key.substring (0, dot); MonthInfo mi = (MonthInfo) map.get (key); if (mi == null) return -1; else return mi.index; } // findMonth ... }Back to article
Listing Two
public int compareTo (String name1, String name2) { int m1 = findMonth (name1); int m2 = findMonth (name2); if (m1 == -1 && m2 == -1) return simpleANCompareTo (name1, name2); else if (m1 == -1) // month names appear before other items return +1; else if (m2 == -1) return -1; else if (m1 == m2) // not needed, but makes order look nicer return name1.length() - name2.length(); else return m1 - m2; } // compareToBack to article
Listing Three
private int simpleANCompareTo (String name1, String name2) { int size1 = name1.length(), size2 = name2.length(); int n1 = 0, n2 = 0, e1, e2; int val1, val2, test = 0; while (n1 < size1 && n2 < size2) { if (Character.isDigit (name1.charAt (n1)) && Character.isDigit (name2.charAt (n2))) { // Finds ends of numbers so that parseInt will work for (e1 = n1 + 1; e1 < size1 && Character.isDigit (name1.charAt (e1)); ) e1++; for (e2 = n2 + 1; e2 < size2 && Character.isDigit (name2.charAt (e2)); ) e2++; try { val1 = Integer.parseInt (name1.substring (n1, e1)); } catch (NumberFormatException ex) { val1 = -1; } try { val2 = Integer.parseInt (name2.substring (n2, e2)); } catch (NumberFormatException ex) { val2 = -1; } if ((test = val1 - val2) != 0) return test; n1 = e1; n2 = e2; } else { // caseless match test = Character.toLowerCase (name1.charAt (n1)) - Character.toLowerCase (name2.charAt (n2)); if (test != 0) return test; else { n1++; n2++; } } } // One or more strings has ended if (n1 >= size1 && n2 >= size2) return 0; else if (n1 >= size1) return -1; else return +1; } // simpleANCompareToBack to article
Listing Four
void removeButton_ActionPerformed (java.awt.event.ActionEvent event) { try { int count = itemList.getItemCount(); for (int i = 0; i < count; ) { if (itemList.isIndexSelected (i)) { itemList.delItem (i); count--; } else i++; } } catch (Exception e) { } } // removeButton_ActionPerformedBack to article