One of the big features I’m working on for the next version of uSightRead is sound. Now, this post isn’t about getting sound to play on the iPhone (which can be a pain in the arse), but about how I created the audio content for the game.
I wanted to have the actual notes play when the player gets the note correct. For the range of notes that I support (bass and treble clef) I needed 50 different notes. To create them I made a project in Cubase 5, opened the MIDI key editor and step-entered the notes I needed. I set the tempo so that each whole note would be one second long, and exported a wav file (16-bit, 22,050 Hz sample rate). I thought it would be a simple matter of finding an audio-splitting app and breaking the one-second chunks out of the large wav file.
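As a quick sanity check on those numbers, here is a small sketch of the arithmetic. It assumes 4/4 time with the tempo counted in quarter notes per minute, and a mono file; neither is stated above, and the function names are made up for illustration.

```python
def tempo_for_whole_note(seconds_per_whole_note, beats_per_whole_note=4):
    """Tempo in quarter-note beats per minute (assumes a whole note is 4 beats)."""
    return 60.0 * beats_per_whole_note / seconds_per_whole_note


def chunk_bytes(seconds, frame_rate=22050, sample_width=2, channels=1):
    """Size in bytes of the raw audio data for a chunk (mono is an assumption)."""
    return int(seconds * frame_rate) * sample_width * channels


print(tempo_for_whole_note(1.0))  # 240.0
print(chunk_bytes(1))             # 44100
```

So under those assumptions a one-second whole note means a tempo of 240, and each one-second chunk carries 44,100 bytes of sample data.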
Boy, was I wrong. The first problem is that one note will “bleed” into the next note, which can cause a small pop when the file is cut into one-second chunks. Similarly, if the audio is in the middle of an up or down trend when it is cut at the end, a pop will occur because the sample goes from a high or low value to zero immediately. This was not acceptable. I could get around these problems by exporting the notes one at a time, but I steadfastly refused to do this for 50 notes. Enter Python.
Python ships with a built-in wav file module called wave. To get around the first problem, I simply made the notes play for eight seconds, which allowed each note to decay naturally before the next one started. The second problem required more work. I set up a Python script to break an audio file into one-second chunks, scan the last 5% of each chunk for a low point, and zero out every sample after it. This eliminated any pops at the end. The script also skips the remaining seven seconds of each note because I don’t need them. All in all it probably took me longer to make the script than it would have to manually export the notes, but I can also redo the entire operation with, say, another instrument without any hassle.
Here’s the script:
import sys
import wave
import struct

# Output names C1.wav ... B7.wav, in the order the notes appear in the input
notes = ['C', 'Cs', 'D', 'Ds', 'E', 'F', 'Fs', 'G', 'Gs', 'A', 'As', 'B']
names = []
for i in range(1, 8):
    for note in notes:
        names.append("%s%d.wav" % (note, i))

inputFile = sys.argv[1]
print("Splitting file: " + inputFile)
inputWav = wave.open(inputFile, "rb")
print("n Frames: %d" % inputWav.getnframes())
print("Frame Rate: %d" % inputWav.getframerate())
print("Channels: %d" % inputWav.getnchannels())

frameRate = inputWav.getframerate()
sampleWidth = inputWav.getsampwidth()  # assumed to be 2 (16-bit mono input)
totalSeconds = inputWav.getnframes() // frameRate
skipSeconds = 8  # one note starts every eight seconds

for second in range(0, totalSeconds, skipSeconds):
    print("Part: %05d" % second)
    # determine the output file name
    name = "%05d.wav" % second
    if len(names) > 0:
        name = names.pop(0)
    outWav = wave.open(name, "wb")
    outWav.setnchannels(1)
    outWav.setsampwidth(sampleWidth)
    outWav.setframerate(frameRate)
    # read the first second of this note as a list of byte values
    outputFrames = inputWav.readframes(frameRate)
    packFormat = "%dB" % len(outputFrames)
    outputBytes = list(struct.unpack(packFormat, outputFrames))
    # scan the last 5% backwards for a near-silent sample; if none is
    # found, leave the chunk untouched instead of zeroing all of it
    last = len(outputBytes)
    first = int(len(outputBytes) * 0.95)
    first -= first % sampleWidth  # stay aligned to sample boundaries
    for x in reversed(range(first, len(outputBytes) - 1, sampleWidth)):
        # The data is little endian and signed: the low byte comes first
        value = outputBytes[x] + outputBytes[x + 1] * 256
        if value >= 32768:
            value -= 65536
        if abs(value) < 50:
            last = x
            break
    # zero everything from the low point to the end of the chunk
    outputBytes[last:] = [0] * (len(outputBytes) - last)
    # write output
    outWav.writeframes(struct.pack(packFormat, *outputBytes))
    outWav.close()
    # skip intervening seconds
    for i in range(0, skipSeconds - 1):
        inputWav.readframes(frameRate)
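The tail-trimming step is easier to check in isolation. Here is a minimal sketch of the same idea written for Python 3, working on decoded sample values rather than raw bytes. It assumes 16-bit signed little-endian mono data; `zero_tail` and its `threshold` and `tail_fraction` parameters are hypothetical names, not part of the script.

```python
import struct


def zero_tail(frames, threshold=50, tail_fraction=0.05):
    """Zero everything after the last near-silent sample in the tail.

    frames: raw bytes of 16-bit signed little-endian mono audio (assumed).
    The data is returned unchanged if no sample in the tail has an
    absolute value below threshold.
    """
    count = len(frames) // 2
    samples = list(struct.unpack("<%dh" % count, frames[:count * 2]))
    first = count - int(count * tail_fraction)
    # walk backwards from the end looking for a quiet sample
    for i in range(count - 1, first - 1, -1):
        if abs(samples[i]) < threshold:
            samples[i:] = [0] * (count - i)
            break
    return struct.pack("<%dh" % count, *samples)


# Tiny made-up example: the quiet sample (20) sits at index 3
data = struct.pack("<8h", 1000, -900, 800, 20, 700, -600, 500, 400)
print(struct.unpack("<8h", zero_tail(data, tail_fraction=0.75)))
# (1000, -900, 800, 0, 0, 0, 0, 0)
```

Comparing absolute values means a quiet negative sample counts as a low point too, and returning the data untouched when nothing quiet is found avoids silencing a whole chunk by accident.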
In the end it was pretty simple, but I haven’t done Python in about 8 years, so there was a lot of internet-search-based programming.