Converting file formats using an XML dialect
Al is DDJ's senior contributing editor and a professional musician. He can be contacted at alalstevens.com.
When I retired the DDJ "C Programming" column in 2003, I thought my coding days were just about over and I would be a full-time jazz musician and teacher. Computers would be only for music applications and, whenever possible, I would use existing software. I wouldn't write a program unless I couldn't find one to do what I needed. Given all the music applications for Windows and Linux, I figured I'd never have to write code again. My web site (http://www.alstevens.com/) contains little to indicate that I made my living as a programmer. Not that I didn't enjoy programming. But I have other things to do now and all you dot-netters, see-sharpers, and ex-em-ellers can carry the banner while I pursue a different muse. I didn't realize, however, that as I got into using my computer only for music applications, I'd find more problems waiting to be solved.
This article is about one such solutionhow to convert Band-In-A-Box song files into Finale notation files by using MusicXML as a porting medium.
XML & MusicXML
Most programmers know what XML is. If you program Internet applications, you can't escape it. XML is becoming the language of the Internet. A text-based markup language to describe document content and structure, XML is, among other things, the next-generation HTML, but its capabilities go far beyond describing web pages. XML can describe the architecture of virtually any kind of structured information to be exchanged between platform-independent applications. XML has been used, for example, to define the structure and control the content of relational databases. The World Wide Web Consortium (http://www.w3.org/) maintains XML and its standardization, and you can read all about XML at http://www.w3.org/TR/2004/REC-xml-20040204/.
MusicXML (http://www.musicxml.org/) is a dialect of XML that describes music notation. All those cryptic lines and symbols that musicians read are implemented in XML Document Type Definition (DTD) files. The complete MusicXML DTD is at http://www.musicxml.org/dtds/.
MusicXML could and should become a universal language for music applications to exchange dataif only application developers would buy into the concept of interchangeable data files. But most music notation applicationsand there are manyuse proprietary file formats and closed architectures. Developers do not open their file architectures lest they facilitate the porting of scores to competing notation applications. That would never do.
Composers, arrangers, and musicians have other priorities, however, and we need a way to exchange data between music applications. If the Internet is to become the computing platform of the 21st century, it needs common data formats for every kind of document, and applications that aim to be Internet aware are going to have to abandon the closed architecture mentality to achieve that goal.
Eventually, browser plug-ins will render MusicXML files as music notation. Until then, you need a MusicXML-friendly notation application to view the files in other than text format, and music notation on web pages continues to be portrayed only with graphical image files.
MIDI for Notation Import/Export
Rendering scores is one thing. Exchanging score data is another. Prior to MusicXML, the main way to move scores between music notation programs was to export them as Musical Instrument Digital Interface (MIDI) packets in Standard MIDI Format (SMF) files from one application and import those SMF files into the other. Most notation programs support SMF import/export.
You can order the MIDI specification from http://www.midi.org/. Many books about MIDI describe the specification, too. SMF is the file format for storing MIDI data. You can find the SMF specification at http://help-site.com/local/midi.txt.
Using MIDI to exchange notation gives less than perfect results. Much music notation has no standard representation in MIDI, a protocol that encodes music performances as event packets. MIDI packets encode what notes to play, how loud to play them, when to play them, what instruments to play them on, and what devices play those instruments' sounds. Music notation does some of that, too, but notation was designed centuries ago for people to read and write. MIDI was designed decades ago for machines to read and write. Notation includes symbols and text meant to be interpreted by people in the context of rehearsals and performances, and these data include rehearsal numbers, chord symbols, endings, repeats, codas, cues, slurs, dynamics, expression, who's soloing and who's tacit, and so on. Many of these symbols are notational conveniences not represented in the typical MIDI sequence.
Of the items on the list I just cited, the ability to move chord symbols between music notation applications is the one most relevant to the problem I address here.
Because of these notational requirements, when you use an SMF file to export a score from one notation application to another, you must apply a lot of manual fix-up to restore the imported score to its original configuration. MusicXML addresses this problem. It contains markup tags for everything imaginable in a musical score. A score exported to MusicXML from one application retains virtually all its information when imported into another application. The only thing that might differ is how the two applications render the score with respect to music fonts, page layout, and so on.
If you want to convert an SMF file into music notation, the best program for the job is MidiNotate (http://www.notation.com/). MidiNotate, however, is not a music notation editor. It doesn't let you do much in the way of correcting the notation, and it does not export to MusicXML, so you can't easily move MidiNotate's notation data into a notation editor program.
Scanning for Notation Import/Export
Some notation programs let you scan printed music and convert the scanned bitmap into the application's notational format. Likewise, programs such as PhotoScore (http://www.neuratron.com/photoscore .htm) and SharpEye (http://www.visiv.co.uk/) convert scanned scores into MusicXML, which you can then import into some notation editors.
Any program that can print music notation can indirectly generate a graphical bitmap. I do it by printing to a file in PostScript format and using GSView (http://www.ghostgum.com.au/) to convert the file to a TIF. I convert the TIF by using SharpEye, which treats the TIF as a scanned image and converts it to MusicXML.
These programs work well, but they do not convert one item of notational information that I require in my work. They do not convert chord symbols.
Band-In-A-Box
I regularly use a program called Band-In-A-Box for Windows (http://www.pgmusic .com/). If you are a musician, you've probably heard of it. Band-In-A-Box generates automatic musical accompaniments. It uses MIDI technology to produce accompaniments with drums, bass, piano, and strings. You enter a sequence of chord symbols and select a playback style, and Band-In-A-Box does the rest. Figure 1 shows the Band-In-A-Box screen with the old swing tune "Back Home Again In Indiana" loaded and ready to play in the ZZJAZZ style.
Band-In-A-Box is notorious for its nonstandard user interface and disorganized options procedures. But it's the only program that does what it does, and it does it well. Until recently, Band-In-A-Box was available on the PC platform only as a 16-bit application. The latest version, Band-In-A-Box 2004 is a Win32 application, but it still has that quaint UI that its loyal user base has come to love.
I use Band-In-A-Box to practice in my studio and as a teaching tool. It provides a ready-made rhythm section for students learning to play jazz. Many musicians use Band-In-A-Box on the bandstand, others use it for practicing, and some use it in the studio to build backing tracks. The piano/bass/drums trio MP3s on my web site use Band-In-A-Box bass lines, which, in its jazz styles, can sound realistic after some tweaking with the Sonar sequencer program (http://www.cakewalk.com/ Products/SONAR/).
Musicians find a wealth of standard-tune files in Band-In-A-Box format available for download all over the Internet. Almost any popular tune from the early part of the previous century until the present time is available somewhere in Band-In-A-Box format. These files usually include the melodies of the tunes and often the lyrics.
This enormous online resource is a boon to working musicians because Band-In-A-Box lets you transpose a tune to any key on either clef and print its music notation in lead sheet format. (See the accompanying text box entitled "Lead Sheets and Fake Books.")
I won't say where you can find such files lest I alert those dedicated RIAA lawyers and divert them from their untiring quest to lock up impoverished students who use university computers to share online punk/rap/hip hop/rock/crap MP3s, but a word to the wise is "Google."
Bear in mind that the software in this article is intended for personal use only or to help people publish lead sheets for which the publisher fully intends to pay the required royalties.
Playing the Charts
As any club date musician knows, when a band reads a tune in the concert key of B-flat (the "concert" key is the original key the tune was written in and is usually pitched to suit the average male vocal range), the piano player reads the tune in B-flat, trumpet and tenor sax players read it in C, alto and baritone saxes in G, and trombone and bass in C but from the bass rather than the treble clef. Reading a tune can require as many as four different lead sheets depending on what instruments are in the band. Then there are the chick singers who, because of their vocal range, must sing tunes in nonoriginal keys. They need even more lead sheets for each of the musicians to read. A working musician who prepares for a gig needs a way to rapidly print out readable lead sheets in multiple keys and clefs. Unless you want to carry a trunk full of fake books to the gig. Assuming the fake books are available. Which they aren't.
We used to have thick fake books with a lot of tunes, most of which nobody would ever play. When is the last time you heard a request for "The Naughty Lady of Shady Lane?" The tunes in those old fake books were in the concert key on the treble clef. Most were notated by hand, printed illegibly, distributed underground, and had lots of mistakes. All players read the same notation, and horn players transposed in real time. We shouldn't need those books now because music notation software abounds. All we need is to get every known tune in the Western world into lead sheet format with a music notation program. Then we can print it in any key on any clef for any musician.
Notation Editing
Every known tune in the Western world is somewhere in a Band-In-A-Box file. And Band-In-A-Box has a notation editor feature. You enter the melody lines of a tune by using the notation editor or by playing the melody on a digital piano. Almost everybody uses the digital piano because the Band-In-A-Box notation editor is lacking, to put it mildly. Consequently, the melody notation in most of those thousands of freebie tune files reflects playing errors, player interpretations, and quantizing variations. You'd think someone who was about to release such a file for public consumption would use the notation editor to fix up the notation, but they don't because the notation editor itself discourages people from using it.
But, given a melody line and the chord changes, Band-In-A-Box can print a lead sheet. Figure 2 shows the first of several measures of Band-In-A-Box's notation for "Back Home Again In Indiana."
There's not much wrong with the notation in Figure 2, but few of the tunes you find online are anywhere nearly as clean. And the Band-In-A-Box editor has some annoying conventions that discourages one from trying to clean up the notation. It improperly decides that a note should have a particular sharp or flat accidental, using an A-flat where the harmonic context of the tune calls for G#, for example. Worse, it will make that decision in a key where A-flat is part of the key signature and needs no accidental notation whatsoever. Or it will decide that an F ought to be an E-sharp or something silly like that. It assigns double barlines where you don't want them. I could go on and on. These are only some of its contrary, arbitrary notational decisions that you cannot change.
When you mention such deficiencies to Band-In-A-Box fans, they say, "Band-In-A-Box is not supposed to be a notation editor. Use if for what it is best at." Well, duh. If it's not supposed to be one, why even put the feature in? Oh, well, another problem to solve.
With a MusicXML export of Band-In-A-Box notation, I could import a tune's notation into Finale and correct it to my heart's content. But Band-In-A-Box does not support MusicXML.
Finale
What to do? I have all these songs with notation that cannot be easily corrected. They can, however, be exported to SMF files. I have the full-featured Finale notation editor program that supports both MusicXML import/export and SMF import/export. How do I bring the two applications together to get that large catalog of lead sheets into a manageable format?
Remember as you look at these examples that my objective is to have lead sheets with melody lines and chord symbols that can be edited with a real music notation editor. And I want to do it a lot of times with a lot of tunes. Reaching that goal ought to involve as little manual effort as possible.
Band-In-A-Box plays tunes based on programmed styles. These styles represent the kind of music that fits the tune. They define lines of notes for the bass to play, chords for the piano to play, rhythmic patterns for both, and, of course, drum patterns for the drums to pound out. There are hundreds of styles available in .STY files, and you can even take a tune intended to be played as a country ballad, for example, and play it as a hip hop tune. You can. I won't.
I exported an SMF file of "Back Home Again In Indiana" with the ZZJAZZ style in effect from Band-In-A-Box and imported it into Finale to see what it looks like and how easy it would be to convert to a lead sheet. The imported notation shows some Band-In-A-Box stylemaker's notion of what to play for each chord, not necessarily enough information to infer the actual chord symbol. Figure 3 shows the imported SMF data in notation format.
As you can see, the bass line is all over the place; it does not always convey the chord's root value, which you need to infer a lead sheet chord symbol. The piano part has altered and substitution chords and inversions selected by the style. Those piano chords are, of course, generated in a comping style with voicings, embellishments, and rhythmic interpretations that real piano players use when improvising accompaniments. Because they simulate piano accompaniments, the piano chords rarely unambiguously state the chord itself and rarely occur in a measure precisely where the chord symbol ought to be on a lead sheet.
Obviously, a stylistic rendering of a tune such as Figure 3 shows is information overload on the one hand and yet insufficient information for my purposes on the other.
Exporting with Style
To get an SMF file with the melody and sufficient harmonic information to represent the current chord symbol at the proper place, I developed a Band-In-A-Box export style by using Band-In-A-Box's Stylemaker process. Building the style took less than five minutes because the export style plays only one bass note, the root, in the bass part and only one chord in the piano accompaniment part for each chord change. The style plays a very simple version of the tune. Kind of like when you learned "Heart and Soul" on the piano when you were a kid.
When you load the style to play with a tune that you want to export, you must allow Band-In-A-Box to reconfigure the melody to eliminate Band-In-A-Box's so-called "swing" feeling. A lead sheet should have a plain, unadorned melody line. Musicians apply their own interpretations during performances.
Figure 4 shows "Back Home Again In Indiana" with the export style exported by Band-In-A-Box to an SMF file and imported into Finale. If you used that style for playback in Band-In-A-Box, you would hear a rather boring rendition of the tune. The style is only for exporting information; it is not meant to be used for playback. Unless, of course, you really enjoy listening to boring music, in which case I can recommend some of the hip hop styles for your listening pleasure.
You must apply some manual effort to get Band-In-A-Box to render the exported SMF file correctly and to tell Finale how to import it. You have to mute all Band-In-A-Box tracks other than piano, bass, and melody, tell Band-In-A-Box not to embellish chords during playback, tell it to delete all melody choruses other than the first one, and to export only one chorus. Then you have to tell Finale to import the SMF file by quantizing to the sixteenth note resolution and to space notes evenly in every beat to neutralize some of the tune's original input interpretation.
Because of the simplicity of the export style, the SMF file that Band-In-A-Box renders as shown in Figure 4 has the complete melody but only sufficient harmonic information (a bass note for the root and a piano chord for the chord's other notes) to describe the chord and only when a chord changes. This is enough information to build a Finale lead sheet. If I had only a few such tunes to convert, I could use Finale's Chord/Two-staff Analysis command to click on each chord and create a chord symbol, move the chord symbols to the melody staff, and delete the bass and piano staves. That's simple enough, but it takes enough time for each tune that you wouldn't want to do it for a large catalog of tunes.
The melody track is important for this project. The Band-In-A-Box file needs a melody track. You can't build a lead sheet without a melody for the lead instrument to play. If all you want is a chord sheet, Band-In-A-Box's lead sheet notation is sufficient.
Converting SMF to MusicXML
This problem cries for a software solution to convert such SMF files into a MusicXML file. The solution would parse the bass and piano staves to infer chord symbols, which it adds to the remaining melody staff, and it would eliminate those staves in the converted MusicXML file. With that capability you have your choice of several notation editors that support MusicXML.
At the present, music notation applications that support MusicXML are mostly those that support a software plug-in architecture, and their MusicXML plug-ins are typically developed by third parties. Most notable of those applications is Finale (http://www.finalemusic.com/), and recent versions of Finale include a light version of the MusicXML Dolet plug-in (http://store.recordare.com/doletfin1.html). Finale is the leading music notation program. and versions 2000 through 2004 support the MusicXML format.
Sibelius (http://www.sibelius.com/) is another full-featured notation editor that supports MusicXML with a Dolet plug-in (http://store.recordare.com/doletsib1.html).
Alternative Solutions
My first plan was to write a Finale plug-in that would process the imported score of Figure 4 and create a lead sheet. I abandoned that plan after a careful reading of the Finale plug-in API documentation. I found no way in that API to add chord symbols to a score. That doesn't mean the API doesn't support it; I just couldn't find it. I did find an overly complex, under-documented API so typical of the contemporary programming environments that nudged me into an early retirement last year.
My second plan was to write a script by using the new FinaleScript feature introduced with the latest version of Finale. I upgraded specifically to get that feature. The script would step through the score and apply the Chord/Two-staff Analysis command I mentioned earlier. I abandoned plan two when I could not find commands in the script language for working with chord symbols.
My third plan was to write the program I just mentioned. I considered the knurly problems of converting MIDI data to notation, figuring out based on MIDI delta clock values and NOTE ON/NOTE OFF events what duration to assign a note, whether it is tied to another note, whether it ought to be dotted, and so on. It isn't as easy as it looks, and all the notation programs have already done it. Why not leverage their work?
ChordBuilder
And so I arrived at plan four, the program that this article describes. The program, with the unimaginative name, ChordBuilder, converts to MusicXML. I am pretty good at writing text-parsing programs, having written several programming language and script interpreters, which is why I figured I could get this plan working before any of the others.
To use ChordBuilder, follow Figure 5, a data flow diagram. You begin with a Band-In-A-Box file that contains the chords and melody of a tune you wish to export to a music notation program. That file is represented by the icon in the upper left corner of Figure 5. You wish to end with a lead sheet notation file in Finale's notation file format. That file is represented by the icon in the lower right corner of Figure 5. Table 1 lists the steps in Figure 5.
Indiana.xml (available electronically; see "Resource Center," page 5) is built from the first five measures of "Back Home Again In Indiana." As you can see, MusicXML is a verbose language because it uses XML markup tags to define even the smallest entity of music notation.
ChordBuilder reads the exported score, parses the chord data from the bass and piano parts, inserts chord symbol markup into the melody part, and writes a new version of the score in MusicXML format with only the melody part and its new chord symbols. Indiana[1].xml (available electronically) is the output from ChordBuilder.
You import the output from ChordBuilder into Finale to view, edit, and print a lead sheet. Figure 6 shows "Back Home Again In Indiana" in its lead sheet format in Finale.
The MusicXML Library
I built ChordBuilder from the ground up by designing and coding all its MusicXML classes. I did not have to do it that way. The MusicXML Library (http://libmusicxml .sourceforge.net/) is an open-source project that supports a C++ class library for processing MusicXML data. If I was writing a complex MusicXML program, I might get on board with the MusicXML Library project, and I'd have to ensure that everything complied with the GNU Library or Lesser General Public License. ChordBuilder, however, is a relatively simple text parsing program, and it just seemed less trouble to design a few simple classes than to climb someone else's learning curve and buy into their rules. Most of my code in the DDJ "C Programming" column was open source, but it wasn't Open Source, and I'm not ready to jump into that fray from the comforts of a mostly noncoding retirement.
About the Program
ChordBuilder is a command-line program with only one command-line parameter. You must specify the name and not the extension of the MusicXML file that ChordBuilder reads. The extension has to be .XMl. ChordBuilder builds a file with the same name and a [1] suffix. So indiana.xml becomes indiana[1].xml.
ChordBuilder also reads the SMF file that Band-In-A-Box exported. ChordBuilder expects that file to have the same filename. In this example, that would be indiana.mid. ChordBuilder has to read that file to extract the tune's title. For some reason, although virtually all SMF-processing programs reserve the META_SEQTRKNAME event in track 1 for the song title, the Finale SMF import feature does not import that title, so it does not get exported to the first MusicXML file.
I wrote the program in Standard C++ with no platform dependencies. There is a version of Band-In-A-Box for the Mac and, if music notation programs for the Mac support MusicXML import/export, perhaps you can port ChordBuilder to the Mac. I compiled and tested ChordBuilder with Microsoft Visual C++ 6.0 and MinGW GCC 3.2.1 running under the Quincy 2002 IDE (http://www.alstevens .com/quincy.html).
ChordBuilder's source code is available electronically. The download includes the export.sty file that implements the export style for Band-In-A-Box and a Quincy project file. You can download an executable version of the program along with the style file and instructions for using the package with Band-In-A-Box and Finale on the Windows platform. The download file is found at http://www.alstevens.com/midifitz/.
The program's organization is a straightforward C++ class design. Two classes, musicXMLin and musicXMLout handle reading and writing the MusicXML files. Several structures defined in the musicdata.h source file organize music notation data for internal processing. The ChordBuilder class encapsulates the parsing of MusicXML input data and the generation of MusicXML output data.
To extract the song title from an SMF file, I used a MIDIFile class that I developed several years ago and published in the "C Programming" column.
All console output is isolated in a chordbuildermessage function in main.cpp. If someone wants to port ChordBuilder to a GUI, this function is where you would capture exception and warning messages from the program. The messages themselves have sufficient unique text data that a program can determine what to do with them.
Although the program is reasonably portable among Standard C++-compliant compilers, portability was not my motivation for writing ChordBuilder as a command-line program in Standard C++. I did it because that's how I like to write programs. I never much liked writing GUI programs for any platform, especially with MFC, and I'm retired now and can write programs any way I want.
DDJ