The Development of Verse-Level Audio at the ESV Online Edition

Webmaster, Good News Publishers (July 10, 2006. Updated November 15, 2006)

The Former State of Affairs
From time immemorial (or at least 1999), Bible Gateway has offered RealAudio streams of chapters for some of the Bibles on their site.

Bible Gateway has a “Listen to this passage” link

Hearing a biblical passage as well as—or instead of—reading it provides insights into the text and forces you to slow down and approach the Bible more meditatively. You can hear a Bible chapter read to you when you buy a set of CDs. But the Internet offers a chance at interactivity that many Bible sites weren’t (and still aren’t) fully exploiting.

We had a straightforward plan: let people listen to individual verses of the ESV, rather than having to get through a full chapter to hear a single verse. We wanted as many people as possible to be able to hear the ESV text.


Random Access of RealAudio Streams
We’d used SMIL to stitch together Bible Gateway RealAudio streams of consecutive chapters (so you could hear more than one chapter at a time):

<smil>
  <body>
    <audio src="rtsp://media.gospelcom.net/.../john-03-ml.rm"/>
    <audio src="rtsp://media.gospelcom.net/.../john-04-ml.rm"/>
  </body>
</smil>

A little research uncovered the clipBegin and clipEnd attributes, which allowed us to start and end streams at any point:

<smil>
  <body>
    <audio src="rtsp://media.gospelcom.net/.../john-03-ml.rm"
        clipBegin="1.0s" clipEnd="5.0s"/>
  </body>
</smil>

A few tests showed that this approach would work. We had a basic proof-of-concept. Now came the hard part: we had to create a database of start and end times for each verse (i.e., we had to “versify” the New Testament).

Versifying: Approach #1 (5 Chapters / 2 Hours)
Our first approach, though doomed to disaster, at least taught us what wouldn’t work.

We had a SMIL file and RealPlayer; why not play an MP3 in RealPlayer and note the time that each verse began? RealPlayer showed the elapsed seconds, so getting an approximate start time posed no problem. Getting an exact start time, however, proved nettlesome, especially when two verses ran together.

“OK, verse 16 starts somewhere between 136 and 137 seconds. Maybe halfway?”

(Stop RealPlayer. Edit the SMIL file to start playing at 136.5 seconds. Play the clip in RealPlayer.)

“No. Too early. What about 136.7 seconds?”

(Stop RealPlayer. Edit the SMIL file to start playing at 136.7 seconds. Play the clip in RealPlayer.)

“No. Too late. How about 136.6 seconds?”

(Stop RealPlayer. Edit the SMIL file to start playing at 136.6 seconds. Play the clip in RealPlayer.)

“Well, close enough. Let’s move on.”

Repeat that procedure 7940 more times, and you have start and end times for all the verses in the New Testament—unfortunately, you’ve also spent way too much time dealing with trial and error. We had to find a better way.

Versifying: Approach #2 (255 Chapters / 84 Hours)

Enter Audacity, a free audio editor. Seeing the audio waveform would help us determine the best points to mark verse boundaries.

Our workspace. We used Audacity to find the times, Excel to record the data, and the ESV Online Edition to keep track of the text of verses.

Our workflow for each chapter looked something like the following:

  1. Open the chapter’s MP3 file in Audacity.
  2. Zoom in to see about thirty seconds of audio.
  3. Position the playhead just before where it appears the first verse starts.
  4. Play a brief clip to confirm the beginning of the verse. (Adjust the start position if necessary.)
  5. Record the start time in Excel. (The end time for the previous verse, if any, corresponds to the start time of the current verse.)
  6. Move the playhead to where it appears the next verse may start.
  7. Repeat steps 4-6 until reaching the end of the chapter.

This procedure had three benefits over our first approach:

  • It was about 30% faster.
  • It was more precise (to the hundredths of a second if necessary).
  • It was more fun.

It also had its share of drawbacks:

  • It was still slow—it took about four times real time to record the data. In other words, versifying ten minutes of audio took forty minutes of work.
  • It lent itself to transcription errors—someone had to type the start time in the Excel file.

Nevertheless, we finished. We had an Excel spreadsheet containing the start times for all the verses in the New Testament.

Application #1: Verse-Level RealAudio Streams
Moving from the Excel spreadsheet containing start times for each verse to generating SMIL files on the fly had three steps.

1. Create and fill the database

We created a table:

CREATE TABLE `esv_audio_times` (
  `version` enum('ml','mm') NOT NULL default 'ml',
  `unit_id` int(8) unsigned zerofill NOT NULL default '00000000',
  `start_time` float(5,2) NOT NULL default '0.00',
  `end_time` float(5,2) NOT NULL default '0.00',
  PRIMARY KEY  (`version`,`unit_id`)
);

Then we exported the Excel spreadsheet to a text file and wrote a Perl script to read it and insert the values into the database. We used the following verse’s start time as the current verse’s end time, so John 1:1 ended when John 1:2 began. The final verse in the chapter had an end time of zero.

The Perl script checked transcription errors by flagging very short or negative-length verses. We double-checked these verses in Audacity and corrected the data as necessary.
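
The end-time rule and the sanity check can be sketched as follows. This is a minimal Python illustration, not the original Perl script; the function names, the sample times, and the 0.5-second threshold are all assumptions made for the example.

```python
def add_end_times(starts):
    """Given (unit_id, start_time) pairs for one chapter, in verse order,
    return (unit_id, start, end) rows. Each verse ends where the next one
    begins; the final verse gets an end time of 0."""
    rows = []
    for i, (unit_id, start) in enumerate(starts):
        end = starts[i + 1][1] if i + 1 < len(starts) else 0.0
        rows.append((unit_id, start, end))
    return rows


def flag_suspicious(rows, min_length=0.5):
    """Return the unit ids whose length is negative or shorter than
    min_length seconds (a hypothetical threshold). The last verse in a
    chapter (end == 0) is skipped because its length isn't known here."""
    flagged = []
    for unit_id, start, end in rows:
        if end == 0:
            continue
        if end - start < min_length:
            flagged.append(unit_id)
    return flagged


# Made-up times: the third start time was mistyped (91.60 instead of 19.60).
chapter = [(43001001, 2.10), (43001002, 9.85),
           (43001003, 91.60), (43001004, 24.30)]
print(flag_suspicious(add_end_times(chapter)))  # [43001003]
```

A check like this only catches typos that produce impossible or implausibly short lengths; a mistyped time that still yields a plausible length would slip through.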

2. Add the link to the Bible page

We added a “Listen” link next to the verse reference on the page.

The link pointed to a PHP script that generated a SMIL file based on the verse reference passed to it.

3. Generate the SMIL file

The PHP script generating the SMIL file looked at the parameters sent to the page and then used the same parsing engine as the rest of the ESV site to figure out which verses the query referred to. We chose this approach rather than an easier-to-program script that took unit ids (e.g., using ?passage=John:3:16-17 instead of ?ids=43003016-43003017) so that others could link directly to the SMIL files using human-readable query strings if they so chose.

We wanted people to think of SMIL as just another format in which to access the ESV. The web service let you access the text as HTML or plain text, so why not SMIL?

The PHP script looked up the corresponding start and end times for the verses indicated and created the necessary output. For example:

<smil>
  <body>
    <audio src="rtsp://media.gospelcom.net/.../john-03-ml.rm"
      clipBegin="npt=131.60s" clipEnd="npt=154.20s"/>
  </body>
</smil>
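
As a rough illustration of this step, here is a minimal Python sketch (the original was PHP) that builds the same kind of SMIL output from a start and end time. The function name and the example.com URL are placeholders, and omitting clipEnd when the end time is zero is an assumption about how the "play to the end of the chapter" case might be handled.

```python
from xml.sax.saxutils import quoteattr


def make_smil(stream_url, start, end):
    """Build a one-clip SMIL document. start and end are in seconds; an
    end of 0 is treated as "play to the end", so clipEnd is left out."""
    attrs = [f"src={quoteattr(stream_url)}",
             f'clipBegin="npt={start:.2f}s"']
    if end > 0:
        attrs.append(f'clipEnd="npt={end:.2f}s"')
    return (
        "<smil>\n  <body>\n    <audio " + " ".join(attrs) + "/>\n"
        "  </body>\n</smil>\n"
    )


print(make_smil("rtsp://media.example.com/john-03-ml.rm", 131.60, 154.20))
```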

Limitations to This Approach

  1. Few people liked RealPlayer; not many people were willing to download the program just to hear the audio.
  2. RealPlayer sometimes skipped the first few seconds in a stream, making the audio start in the middle of a verse.
  3. RealPlayer took about ten seconds to start playing a stream even when everything worked right.

We hoped for better, but it worked well enough for the moment.

Application #2: Verse-of-the-Day MP3s

We could now play any combination of verses from the New Testament, and we wanted to integrate that ability in as many ways as possible with our existing applications. An obvious target: adding audio to our verse-of-the-day RSS and Javascript syndication. This procedure also had three steps.

1. Get the start and end times for each verse

We stored the daily verses as plain-text references (e.g., the database contained the string John 3:16-17 for those verses); we hadn’t mapped each reference to the numeric verse identifiers we could use for direct lookup. So we had to run each reference through the ESV Online Edition’s query parser to figure out which verses the strings referred to. We used our web service to do it.

After we had the verse ids, we just had to look up the start and end times for each verse (or range of verses) and determine which MP3 they referred to. We programmatically renamed our MP3s to facilitate this process.

John 3:16-17 → 43003016-43003017 → part of 43003.mp3
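
The mapping above suggests an eight-digit id made of a two-digit book number, a three-digit chapter, and a three-digit verse. That layout is inferred from the example rather than stated outright, so treat this Python sketch and its function names as illustrative only.

```python
def verse_id(book, chapter, verse):
    """Pack book, chapter, and verse into an 8-digit id: BBCCCVVV."""
    return f"{book:02d}{chapter:03d}{verse:03d}"


def chapter_file(unit_id):
    """The chapter MP3 is named after the first five digits of the id."""
    return unit_id[:5] + ".mp3"


start = verse_id(43, 3, 16)  # John is book 43 in this numbering
end = verse_id(43, 3, 17)
print(f"{start}-{end}", chapter_file(start))
# 43003016-43003017 43003.mp3
```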

2. Create individual MP3s

We definitely weren’t going to make MP3s for all 400 verses by hand. Enter Perl’s MP3::Splitter and MP3::Info modules. We also needed to reduce the quality from 128 Kbps, a process for which LAME proved useful.

First we re-encoded the files at lower bitrates:

foreach my $size (qw(16 32 64))
{
  my $new_file = "$local_dir\\$size\\$file";
  `lame -b $size -h "$old_file" "$new_file"`;
}

A simplified version of the program we wrote to split the MP3s looked like the following:

use MP3::Splitter;
use MP3::Tag;

split_files();

sub split_files
{
  print "Splitting files...\n";
  foreach my $ref (@ids)
  {
    #we've gotten these values from a database
    my ($id, $start, $length) = @{$ref};
    #the source file is equivalent to the first 5 characters
    my $file = substr($id, 0, 5) . '.mp3';
    #extract the verse from the chapter file
    mp3split("mp3s/$file", {verbose => 1}, [$start, $length]);
    #move it to the target folder
    `move 01_mp3s\\$file mp3_verses\\$id.mp3`;
    #rewrite the id3 tags
    handle_id3($id);
  }
}

sub handle_id3
{
  my ($id) = @_;
  my $tag = MP3::Tag->new("mp3_verses/$id.mp3");
  my $id3 = $tag->new_tag('ID3v2');
  #make_title(...) converts the verse id into a readable reference
  $id3->add_frame('TIT2', make_title($id));
  $id3->add_frame('TALB', 'ESV Audio Bible');
  $id3->add_frame('TPE1', '[Artist Name]');
  $id3->add_frame('TYER', '[Copyright Year]');
  $id3->add_frame('TCON', 'Speech');
  $id3->write_tag();
}

3. Add the MP3s to the feeds

Technically, this step proved straightforward. The PHP script generating the feed checked for the existence of an MP3 for the given verse (remember, we only had New Testament verses available) and added an <enclosure> element to the RSS feed or a link to the Javascript feed.

A few people had written programs to consume the RSS or Javascript, so we emailed them before we released the code to make sure they had a chance to update their tools.

Augmenting Bible Gateway’s RealAudio Streams

As time went on, people occasionally wrote us to tell us that some of the audio files cut off before the end of the chapter. We traced the problem to incomplete RealAudio streams provided by Bible Gateway and created our own RealAudio files to replace the problematic chapters. The SMIL file pointed to one of these replacement streams when someone tried to listen to one of the incomplete chapters. No one should have noticed a difference between our files and Bible Gateway’s.

Application #3: Windows Media Streams

After a few months of using SMIL, we thought we’d see if we could create a way to access the verses with Windows Media Player. We had to overcome a number of obstacles, and in the end we didn’t succeed as well as we’d hoped. We don’t question the soundness of our approach; rather, technical limitations in the current version of Windows Media Player (version 10) resulted in an underwhelming experience.

1. Create ASX files

Windows Media Player doesn’t understand SMIL. But it does understand ASX, a proprietary Microsoft alternative to SMIL. For our purposes, an ASX file looked similar enough to a SMIL file that we didn’t need to do a lot of recoding; we essentially just changed the names of the elements. (You may ask why Microsoft bothered to create an alternative that uses almost the same syntax to accomplish exactly the same purpose. We won’t venture an answer.)

2. Convert MP3s to Windows Media Files

Now that we knew that we could use a similar approach to our RealAudio streams, we had to see whether we could easily convert many MP3s into Windows Media format. We came across a nifty command-line MP3 to WMA converter included in the Windows Media 9 SDK from Microsoft that did exactly what we needed.

First we edited the Wmcmd.vbs file that came with the SDK because we couldn’t change everything with the command line. Download the .vbs file we used.

Here’s what we did:

my $script = 'cscript "\\program files\\wmsdk\\wmencsdk9\\samples\\vb\\wmcmd\\album.wmcmd.vbs"';
my $out_path = "\\output\\wma\\48k";

opendir DIR, 'mp3s' or die "Can't open mp3s: $!";
while (my $file = readdir DIR)
{
  next unless ($file =~ /mp3$/);
  my $out = $file;
  #make sure the title is correct for the metadata
  my $title = make_title($file);
  $out =~ s/mp3$/wma/;
  #run the vbs from the command line; the command has to be a single line,
  #so we join the pieces into one string before running it
  #the -album tag doesn't do anything; it just needs to be there. Edit the vbs file
  #to change it
  my $cmd = join ' ',
    $script,
    qq(-input "mp3s\\$file" -output "$out_path\\$out" -profile a32),
    qq(-title "$title" -author "Marquis Laughlin" -copyright 2003),
    qq(-album "ESV Audio New Testament");
  `$cmd`;
}
closedir DIR;

Limitations to this approach

The main problem with this approach stemmed from a limitation in Windows Media Player. Simply put, it didn’t have the frame-level precision needed for this type of application. In particular, it got stop times only approximately right; it would often continue playing for up to a second past the stop time. The listener thus got to hear the beginning of the next verse in many cases. You can imagine how distracting it got after a while.

Versifying: Approach #3 (1189 chapters / 60 hours)

The time came in November 2005 to versify another audio recording of the Bible, this one read by Max McLean. Max had recorded the entire ESV, not just the New Testament, and we knew that the single-person approach we took with the New Testament wouldn’t scale to the complete Bible: we were looking at 300 hours of work.

We considered a couple of different approaches. We thought about asking for volunteers and offering prizes. Amazon’s Mechanical Turk came out about this time, and paying a small amount per verse had appeal.

In the end, we found a much faster single-person approach that let us versify the complete Bible faster than real time: 60 hours of work for 72 hours of audio.

Scripting Windows Media Player

Windows Media Player did exactly what we needed; we used Javascript in Internet Explorer to control it. We scripted two properties:

  • CurrentPosition. This property returned a floating-point number rather than the integer we feared. Thus, we could use it to get precise times.
  • Rate. This property controlled playback speed. We used it to slow down playback when we couldn’t tell precisely when verses started.

Also see the full list of Windows Media Player properties accessible from Javascript.

We created a web page running on an internal server that let us control MP3 playback using the keyboard. Then we listened to the MP3s of each chapter and pressed a key on the keyboard to mark when each verse began and ended.

The controls allowed us to back up when we made a mistake or slow down playback when the verses ran together. We could also skip over the middles of verses. In this way, we were able to versify the whole Bible in about 60 hours, or 87% of real time.

Versify John 1 yourself (Windows Internet Explorer only). Do that 1188 more times, and you’re done.

We weren’t thrilled at having to use Internet Explorer, but only one person was going to use the application, so we didn’t need to worry about cross-browser compatibility. We did some research that suggested we could develop an implementation for Firefox, but this approach did what we needed when we needed it.

The end result was a database table containing start and end times for each verse. Here’s the data for John.

Limitations to this approach

We couldn’t always tell the best place to split the audio precisely when verses ran together. A waveform to provide some objectivity to the process would have helped.

Application #4: Podcasts

We had a strict deadline by which we had to finish most of the versifying: January 1, 2006, marked the planned start of offering through-the-Bible-in-a-year podcasts.

M3Us

M3U is a file format that lets you distribute a small text file to point to any number of MP3s. M3Us represented our first choice in delivering the podcasts because they were technically easier to implement.

We needed individual files for each day’s reading for each of the three reading plans we offered. Most of the readings involved complete chapters, so we didn’t have to manipulate them. But we did have to make MP3s of sub-chapter readings. As before, we turned to Perl, using the same approach we had with the daily verses.

Unfortunately, iTunes didn’t handle M3Us well, so we decided to switch to podcasting the MP3s themselves after a week of testing.

MP3s

We needed a way to combine the split MP3 files we’d just made into a single file for each reading plan for each day.

We found a program that did it. In retrospect, we probably could have just concatenated the files and gotten the same results.

Our first tests ran together different readings without any breaks, confusing listeners. So we spliced in 0.75-second breaks between readings and added chapter titles (e.g., “John 3”) for orientation.

Unresolved issues

  • Sudden breaks. The Max McLean recording contained subtle background music, meaning that splicing different segments together could result in sudden breaks in the music. We experimented with fading in and out, but we couldn’t come up with an interval that worked well in all circumstances.
  • Sudden volume changes. Different MP3s had somewhat different volume levels, which was occasionally jarring. We could use a normalizer to try to even out the volume differences.

Dilemma: How to Deliver the Audio

Few people liked RealPlayer, and the Windows Media streams didn’t work well. We wanted to find a new solution for our new recording. We had a few criteria:

  • It needed to be streaming—that is, the average person shouldn’t be able to save the audio to a computer.
  • It should work without obscure plugins.

Our research led us to lean heavily toward a Flash / progressive download / pseudo-streaming solution. Flash would make it hard for most people to figure out how to record the audio, and 95% of our visitors had Flash 7 or higher installed.

We asked for advice about this approach on the ESV Blog, from which we learned that people have strong preferences about audio players. We remained convinced of the viability of our planned approach, but some people mentioned not having or preferring not to use Flash. We decided to see how we could offer MP3s in addition to Flash.

Our other concern revolved around bandwidth costs. We’d been piggybacking off Bible Gateway’s RealAudio streams, but now we were going to host our own audio. We needed to find a reliable provider who ran PHP and had low bandwidth and storage costs. After checking out the possibilities, we signed up with Dreamhost and registered a new domain name (esvmedia.org). They gave us 20 GB of storage and a terabyte of bandwidth per month for a reasonable cost.

We later investigated Flash Media Server (aka Flash Communication Server) hosting but found it too expensive for the performance and bandwidth we’d need—or the companies offering hosting didn’t post prices on their websites, which amounted to the same thing.

Solution #1: Flash Video (.flv)

We’d been using Flash 5 for various small projects for five years. It worked decently, but it had minimal support for streaming audio. Creating a simple MP3 player looked daunting. We needed to upgrade.

Flash 8 had come out in September 2005 and promised to make playing media a breeze. Flash 8 Professional came with an encoder that could convert audio to Flash Video (FLV) format. This encoder proved crucial to our plans.

At this point, we still planned to stream MP3s through a Flash player. The more we researched, however, the more attractive FLV (not MP3) streaming became.

For one thing, someone had taken the time to document the FLV format. This page linked to a blog post that used PHP to extract stream data from FLVs, similar to what we had in mind.

We also discovered that the FLV::Info module would let us manipulate FLVs in Perl.

We decided to go with the FLV approach, which involved the following steps:

1. Convert chapter MP3s to FLVs

The Flash 8 Video Encoder made this process easy. We dragged the 1189 chapter MP3s to the encoder, set up our output options (32 Kbps MP3 encoding), and pressed “Start Queue.” Twenty hours later, we had 1189 FLVs.

2. Catalog length discrepancies between MP3s and FLVs

The FLV encoder didn’t create FLVs with exactly the same number of milliseconds as the MP3 source files. Each FLV was between 32 and 1688 milliseconds longer than the MP3 from which it came.

We discovered after a good deal of experimentation that we could simply subtract the difference to convert between the two formats. So, for example, if we knew that the difference was 500 milliseconds, we calculated that a position 12.5 seconds into the FLV file was equivalent to 12 seconds into the MP3 file.
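
The conversion is plain arithmetic. A tiny Python sketch, with hypothetical function names and all values in milliseconds, might look like this:

```python
def flv_to_mp3_ms(flv_ms, difference_ms):
    """Convert a position in the FLV to the matching MP3 position by
    subtracting that chapter's measured length difference."""
    return flv_ms - difference_ms


def mp3_to_flv_ms(mp3_ms, difference_ms):
    """Inverse mapping: a time recorded against the MP3 becomes an FLV time."""
    return mp3_ms + difference_ms


print(flv_to_mp3_ms(12500, 500))  # 12000, the example from the text
```

Because each chapter had its own difference, a real implementation would need to look up the value for the chapter in question rather than use a single constant.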

Here’s the Perl program we used to calculate the differences between the MP3s and FLVs:

use MP3::Info;
use FLV::Info;

my $file = '43003016';
my $mp3 = mp3_length("mp3s/$file.mp3");
my $flv = flv_length("flvs/$file.flv");
print $flv - $mp3;

sub flv_length
{
  #FLV::Info needs a reader object before it can parse a file
  my $reader = FLV::Info->new();
  $reader->parse($_[0]);
  my %info = $reader->get_info();
  return $info{duration};
}

sub mp3_length
{
  my $info = get_mp3info($_[0]);
  return round($info->{SECS} * 1000);
}

sub round
{
  my ($number) = @_;
  return int($number + .5);
}

3. Create verse-level FLVs

Next we wrote a program to create an FLV for each verse. We simply extracted the frames from the FLV audio stream that corresponded most closely to the times we recorded for each verse. The resulting files weren’t complete FLVs—they didn’t have the headers and footers required by the FLV spec. We had to remember to add the header and footer when we streamed the audio.

my $frame_header_length = 15;
my $audio_type = 8; #an FLV constant
my $start = 1000; #ms at which the verse starts
my $end = 12500; #ms at which the verse ends

#get the content of the chapter FLV
my $content = get_full_chapter_content();
open OUT, ">$out_folder/$id.flv";
#binmode is really important
binmode OUT;
(%last_frame, %last_printed) = ();

while (length $content >= $frame_header_length)
{
  my $ref = get_frame_header();
  #extract and remove
  $ref->{full} = substr $content, 0, $frame_header_length + $ref->{length}, '';
  #get the length (in ms) of the current frame
  $ref->{ms_length} = get_frame_length($ref);
  $ref->{ms_end} = $ref->{ms_start} + $ref->{ms_length};
  #the verse hasn't started yet
  if ($ref->{ms_start} < $start)
  {
    %last_frame = %{$ref};
  }
  #we're in the middle of the verse
  #$end == 0 if it goes to the end of the file (it's the last verse in the chapter)
  elsif ($ref->{ms_end} <= $end || $end == 0)
  {
    print_frame($ref);
    %last_frame = %{$ref};
  }
  #the verse has ended
  elsif ($ref->{ms_end} > $end)
  {
    print_frame($ref) unless ($end == $ref->{ms_start});
    %last_frame = ();
    $content = '';
  }
}
print_frame(\%last_frame) if (%last_frame && $end > 0);

sub get_frame_header
{
  my $offset = unpack 'N', substr($content, 0, 4);
  my $type = ord substr($content, 4, 1);
  my $length = unpack 'N', "\0" . substr($content, 5, 3);
  my $ms = unpack 'N', "\0" . substr($content, 8, 3);
  return {
    prev_length => $offset,
    type => $type,
    length => $length,
    ms_start => $ms,
    full_length => $length + 11,
    full => '',
    };
}

sub get_frame_length
{
  my ($ref) = @_;
  #there's another frame; get the metadata for the next frame and subtract
  if (length $content > $frame_header_length)
  {
    my $next_ref = get_frame_header();
    return $next_ref->{ms_start} - $ref->{ms_start};
  }
  #otherwise use the length of the previous frame as an approximation
  elsif ($last_frame{ms_start})
  {
    return $ref->{ms_start} - $last_frame{ms_start};
  }
  #otherwise we're not sure
  else
  {
    return $ref->{ms_start};
  }
}

sub print_frame
{
  my ($ref) = @_;
  #ignore meta frames
  return if ($ref->{type} > $audio_type);
  my $frame = $ref->{full};
  print OUT $frame;
  %last_printed = %{$ref};
  $ref->{full} = $frame;
}
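
For readers who don't work in Perl, the 15-byte layout that get_frame_header() reads can be sketched in Python as below. It assumes the same layout as the code above: a 4-byte previous-frame size, a 1-byte type, a 3-byte data length, and a 3-byte timestamp, with the remaining 4 header bytes ignored. The function and key names are illustrative, and total_bytes counts the whole 15-byte header plus data, which differs from the Perl full_length field.

```python
import struct

FRAME_HEADER_LENGTH = 15  # 4-byte previous size + 11-byte frame header


def parse_frame_header(buf, pos=0):
    """Decode the header fields from the 15 bytes starting at pos."""
    if len(buf) - pos < FRAME_HEADER_LENGTH:
        raise ValueError("not enough bytes for a frame header")
    prev_length, frame_type = struct.unpack_from(">IB", buf, pos)
    length = int.from_bytes(buf[pos + 5:pos + 8], "big")     # 24-bit size
    ms_start = int.from_bytes(buf[pos + 8:pos + 11], "big")  # 24-bit time
    return {
        "prev_length": prev_length,
        "type": frame_type,
        "length": length,
        "ms_start": ms_start,
        "total_bytes": FRAME_HEADER_LENGTH + length,
    }


# A synthetic audio frame (type 8) starting at 1000 ms with 3 data bytes.
frame = (struct.pack(">IB", 0, 8) + (3).to_bytes(3, "big")
         + (1000).to_bytes(3, "big") + bytes(4) + b"abc")
print(parse_frame_header(frame))
```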

4. Make a database to facilitate lookups

We needed to know which files corresponded to which verses and the length of each verse in milliseconds to create the necessary metadata for each download. We simply had to extract this data from the FLVs we’d just created. Again, we used FLV::Info.

We stored the data in a flat file with the following columns:

  1. Verse id.
  2. FLV file name for that verse.
  3. Whether the verse starts or ends a chapter (and which chapter it starts or ends).
  4. Number of milliseconds in the verse.
  5. Number of bytes in the verse.

5. Write PHP to stream FLVs

Next we had to write a program to adjust frame-level metadata for streaming and join multiple FLVs when someone wanted to listen to more than one verse.

(We explored whether we could achieve better performance by uploading complete chapters and simply extracting the frames we needed using fseek() and fread(). This approach was marginally faster in some circumstances but much slower in most cases. We decided to go with our original strategy.)

Conceptually, the program worked like this:

A. Receive a request for a url in the following format:

http://www.example.com/flv.play/mm/43003016-43003017

B. Parse the url into its component elements.

  • flv.play: the PHP handler.
  • mm: the audio version (in this case, mm for Max McLean).
  • 43003016-43003017: a numerical representation of the verses to play (John 3:16-17). We decided to use numerical representations here rather than a human-readable string because we would be the only consumers of the FLV. We didn’t expect anyone else to access the stream directly.
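
A minimal Python sketch of this parsing step follows (the production handler was PHP). The regular expression, the function name, and the single-verse URL form without a hyphen are assumptions for illustration.

```python
import re

URL_PATTERN = re.compile(
    r"/flv\.play/(?P<version>[a-z]+)/"
    r"(?P<start>\d{8})(?:-(?P<end>\d{8}))?$"
)


def parse_request(path):
    """Split a path such as /flv.play/mm/43003016-43003017 into the audio
    version and the first and last verse ids. Returns None if no match."""
    match = URL_PATTERN.search(path)
    if match is None:
        return None
    start = match.group("start")
    end = match.group("end") or start  # a single verse is a range of one
    return match.group("version"), start, end


print(parse_request("/flv.play/mm/43003016-43003017"))
# ('mm', '43003016', '43003017')
```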

C. Look up the files that correspond to the given verses.

We do a simple lookup in our flat file.

D. Check whether we’re playing exactly one chapter.

If so, we skip steps E through H and simply pass the contents of the complete chapter FLV file through PHP.

Yes, this approach means that we store a partial FLV for each verse in addition to a complete FLV for each chapter, doubling the amount of disk space we need.

Why would we waste so much space? We’re making a classic tradeoff in computer programming: space vs. speed. The audio we stream breaks down to the following percentages:

  • 11%: Single verse.
  • 11%: Multiple verses.
  • 70%: Complete chapter.
  • 8%: Verses spanning chapters.

We can therefore handle 70% of requests simply by sending a complete FLV with minimal computational overhead. It’s faster for the person who wants to hear the audio, and it uses fewer server resources on our end.

E. Check whether we can stream a cached file.

We cache each dynamic FLV we output to reduce server load for future requests. About 20% of non-complete-chapter requests use a cached file.

We skip steps F through H if we have a cached file.

F. Send the FLV file header.

This content consists of the 9 bytes required by the FLV spec and 139 bytes of metadata (duration, audiodatarate, audiodelay, audiocodecid, and canSeekToEnd; more metadata details). We took these tags from one of the FLVs the Flash Video Encoder made—if those tags were good enough for the official encoder, they were good enough for us.

We edit the duration metadata before sending to reflect the actual duration in the audio we’re about to stream:

#on little-endian machines, we need to flip the value to be big-endian
define('NEEDS_BIG_ENDIAN_FLIP', pack('S', 1) == pack('v', 1));
$seconds = 100.45; #for example; the length of the complete audio
$pack = pack('d', $seconds); #convert it to a double
if (NEEDS_BIG_ENDIAN_FLIP) $pack = strrev($pack);
#$meta is the 139-byte string containing the metadata
#position 44 is where the length is stored in the string
$meta = substr_replace($meta, $pack, 44, 8);
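
The same patch can be written without the endianness check in a language that lets you request big-endian packing directly. This Python sketch uses a zero-filled placeholder in place of the real 139-byte metadata string, and the function name is hypothetical; only the offset of 44 comes from the text.

```python
import struct

DURATION_OFFSET = 44  # position of the duration value, per the text


def set_duration(meta, seconds):
    """Return a copy of the metadata block with the 8-byte duration
    replaced by seconds, encoded as a big-endian double."""
    packed = struct.pack(">d", seconds)
    return meta[:DURATION_OFFSET] + packed + meta[DURATION_OFFSET + 8:]


meta = bytes(139)  # placeholder for the real metadata string
patched = set_duration(meta, 100.45)
print(len(patched), struct.unpack_from(">d", patched, DURATION_OFFSET)[0])
# 139 100.45
```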

G. Print the FLV frames.

Now comes the complicated part. We have to adjust the timestamp of every frame to compensate for our not knowing when the frame occurs in the output.

See the source code for this step.
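
To give a feel for what the timestamp adjustment involves, here is a simplified Python sketch. It is not the linked source code. It assumes each verse fragment uses the 15-byte frame layout described earlier, that timestamps fit in 24 bits, and that the caller knows each verse's length in milliseconds (as stored in the flat file). The function names are hypothetical, and the sketch leaves the previous-size field of each frame untouched.

```python
FRAME_HEADER_LENGTH = 15  # 4-byte previous size + 11-byte frame header


def rebase_frames(fragment, offset_ms):
    """Rewrite the 24-bit timestamp of every frame in a verse fragment so
    that the first frame starts at offset_ms."""
    out = bytearray(fragment)
    pos = 0
    first_ms = None
    while len(out) - pos >= FRAME_HEADER_LENGTH:
        length = int.from_bytes(out[pos + 5:pos + 8], "big")
        old_ms = int.from_bytes(out[pos + 8:pos + 11], "big")
        if first_ms is None:
            first_ms = old_ms
        new_ms = offset_ms + (old_ms - first_ms)
        out[pos + 8:pos + 11] = new_ms.to_bytes(3, "big")
        pos += FRAME_HEADER_LENGTH + length
    return bytes(out)


def join_verses(fragments):
    """fragments: (bytes, length_ms) pairs in playback order. Each verse
    is shifted to begin where the previous one ended."""
    offset = 0
    parts = []
    for data, length_ms in fragments:
        parts.append(rebase_frames(data, offset))
        offset += length_ms
    return b"".join(parts)
```

The file header and footer described in steps F and H would still need to be sent around the joined frames.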

We spent a lot of time refactoring and optimizing this code for speed, especially in the main loop. Our testing server can parse a verse in about 3.2 milliseconds, an improvement of 79% over our initial code, though we sacrificed some flexibility in the process. In the realm of non-obvious optimization tips, we found:

  • Creating and accessing a number of class variables is faster than using fewer class variables and accessing them as associative arrays.
  • Class constants are faster than class variables.
  • Declaring temporary (function-scope) variables is relatively expensive, though it’s faster to access and change a local variable repeatedly than to access a class variable (or class constant).
  • Creating a string reference and repeatedly modifying it is hugely expensive.

H. Print the FLV file footer.

The footer consists of four bytes that never change.

6. Create an SWF to play the FLVs.

We wanted a simple interface without unnecessary features. We stripped the player down to the minimum: Play, Pause, and Return to Beginning.

Scripting these basic functions turned out to be straightforward (ActionScript source) despite our inexperience with Flash programming. The complete Flash 7+ compatible SWF weighed in at only 1.5K and 56 lines of code.

However, we did encounter an interesting bug: Internet Explorer had a problem when we sent the Content-Length header with the FLV; the audio would only play some of the time. We stopped sending the header and haven’t had a problem since.

The “Listen” link is an SWF. We originally had the background white, but a commenter pointed out that defining a color didn’t work in all circumstances. The background is now transparent. We may replace the word “Listen” with an icon at some point.

Solution #2: MP3s

Not everyone can or wants to use Flash, however. We hoped to provide a way for such people to listen to the audio. MP3s are cross-platform, nearly universal, and sounded like a good solution. MP3s also have a widely understood and simple playlist format, M3U, that we’d dealt with before.

Security

The MP3 format’s main weakness is its poor security. Media players make it easy to save MP3s to your computer indefinitely for easy playback, which posed a problem: we needed to let people hear the audio but prevent them from saving extended passages.

We decided on a three-pronged approach. First, we would break up the chapter MP3s into individual verses: each verse became an MP3. Second, we would issue HTTP headers to instruct browsers not to cache the MP3s. Third, we would create a time-based key to provide access to the MP3s for a limited period only.
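
The text doesn't say how the time-based key was built, so the following Python sketch shows just one common way to do it: sign the verse id together with a coarse time window using an HMAC. The secret, the five-minute window, and the function names are all assumptions.

```python
import hashlib
import hmac

SECRET = b"replace-with-a-real-secret"  # hypothetical server-side secret
WINDOW_SECONDS = 300                    # hypothetical key lifetime


def _sign(verse_id, window):
    """HMAC of the verse id and a time-window number."""
    message = f"{verse_id}:{window}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def make_key(verse_id, now):
    """Issue a key tied to the time window that contains now (in seconds)."""
    return _sign(verse_id, int(now) // WINDOW_SECONDS)


def key_is_valid(verse_id, key, now):
    """Accept keys from the current window or the one just before it, so a
    key issued near a window boundary doesn't expire immediately."""
    window = int(now) // WINDOW_SECONDS
    return any(hmac.compare_digest(_sign(verse_id, w), key)
               for w in (window, window - 1))


key = make_key("43003016", now=1000)
print(key_is_valid("43003016", key, now=1100))  # True
print(key_is_valid("43003016", key, now=1600))  # False
```

In real use, now would come from the server clock (for example time.time()); fixed values are used here so the behavior is easy to follow.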

1. Create MP3s of individual verses.

We had experience doing this kind of thing from the daily verse audio. We used MP3::Splitter to make an MP3 for each verse.

2. Write PHP programs to generate M3Us dynamically and stream MP3s.

There’s nothing particularly innovative here. We look up which MP3s correspond to the verses the person wants to hear and generate an M3U playlist.

Then we pass the contents of each MP3 through a PHP file.
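
Generating the playlist is little more than printing one URL per line. A minimal Python sketch follows; the URL scheme, the key parameter name, and the function name are hypothetical, and the real programs were written in PHP.

```python
def make_m3u(base_url, verse_ids, key):
    """Return an M3U playlist with one keyed MP3 URL per verse."""
    lines = ["#EXTM3U"]
    for verse_id in verse_ids:
        lines.append(f"{base_url}/{verse_id}.mp3?key={key}")
    return "\n".join(lines) + "\n"


print(make_m3u("http://www.example.com/mp3.play/mm",
               ["43003016", "43003017"], "abc123"))
```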

Results

There are audible gaps between verses, an artifact of the MP3 splitting process. It works well enough otherwise.

Future Directions

We’re comfortable with how our pseudo-streaming works for now. We see a few possibilities for improvement in the versifying process, however.

For starters, we could generate a waveform in ten-millisecond slices to help guide the versifier. Here’s how. (We don’t actually know whether it would be of any help.)
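
One way to picture the idea: reduce the audio to a peak level per ten-millisecond slice and highlight the quiet slices as candidate verse boundaries. The Python sketch below works on a list of mono PCM samples that has already been decoded; decoding the MP3 is outside its scope, and the function names and threshold are assumptions.

```python
def slice_peaks(samples, sample_rate, slice_ms=10):
    """Return the peak absolute amplitude of each slice_ms-long slice of
    a sequence of mono PCM samples."""
    per_slice = max(1, sample_rate * slice_ms // 1000)
    return [max(abs(s) for s in samples[i:i + per_slice])
            for i in range(0, len(samples), per_slice)]


def quiet_slices(peaks, threshold):
    """Indexes of slices whose peak falls below threshold: candidate
    verse boundaries for a person to check."""
    return [i for i, peak in enumerate(peaks) if peak < threshold]


# Toy data at 1000 samples per second: loud, near-silent, loud.
samples = [100] * 10 + [2] * 10 + [-90] * 10
peaks = slice_peaks(samples, sample_rate=1000)
print(peaks, quiet_slices(peaks, threshold=10))
# [100, 2, 90] [1]
```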

Second, we could use Mechanical Turk to perform the actual versification. We think we could have three people versify each chapter and take the consensus or do some manual verifying ourselves.
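
"Taking the consensus" could mean several things. One simple possibility, sketched here in Python, is to take the median of the three workers' times for each verse and flag the verses where they disagree by more than some tolerance for manual checking. The function name, the tolerance, and the sample times are assumptions.

```python
from statistics import median


def consensus_times(submissions, tolerance=0.25):
    """submissions: one list of verse start times (seconds) per worker,
    all the same length. Returns (times, disputed): the median start time
    for each verse, and the indexes of verses where the workers' times
    spread by more than tolerance seconds."""
    times, disputed = [], []
    for i, marks in enumerate(zip(*submissions)):
        times.append(median(marks))
        if max(marks) - min(marks) > tolerance:
            disputed.append(i)
    return times, disputed


workers = [
    [0.00, 5.10, 9.80],
    [0.10, 5.00, 12.00],  # this worker marked the third verse late
    [0.05, 5.20, 9.90],
]
print(consensus_times(workers))
# ([0.05, 5.1, 9.9], [2])
```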

This article and attendant source code are licensed under a Creative Commons Attribution-ShareAlike License. Comments are welcome (webmaster@gnpcb.org). Thanks to Jon Udell for inspiring the original idea of making the audio accessible on a verse level.

You might also be interested in the Technical Introduction to the ESV Online Edition, which gives a lot of background on how we designed this site.
