Chapter 25 Audio Presentations

CONTENTS

Bandwidth and Hardware
Audio Formats
Audio Server Software
- Streaming Audio Packages
Recording Audio
- Choosing Sound-Editing Software
The Future of Online Audio
- The Internet Phone
- Voice Recognition
Summary

Think back to the last time you used a tape recorder not to play but to record a cassette. Maybe you taped your daughter's high school graduation ceremony or a company lecture. Perhaps you took a recorder with you to a family reunion to preserve some oral history. You might have even accidentally recorded a phone conversation on your answering machine. Most likely, though, it hasn't been all that long since you played a tape in a cassette player. Maybe you listened to an audio book driving to work this morning, or perhaps you carried a small portable cassette player with you on your daily walk. If you're a fan of music, you probably have a CD player on your stereo system. You also might have a CD player on your home or work computer. Both the tape recorder and CD player are familiar tools with which to play and create audio. But you can use another tool to make audio clips and import them into presentations, Web pages, and movies. This tool is the computer.

Your computer has the potential to be a microphone, mixer, and boom box all in one. The computer doesn't care what you are recording: your own voice, a telephone conversation off the speaker phone, a music CD, a bird call, or a water fountain. You need only a microphone, speakers, an editor, and a sound card for your computer's motherboard. These days, many personal computers (and all Macintosh computers) are built with sound cards.

Creating audio files for an online (or offline) presentation is about as easy as tape-recording your grandfather's war stories, once you get the hang of it. The hardest part is determining your hardware needs and selecting suitable software.

This chapter explores the current options available for adding an audio segment to your computer system and discusses the future of computers and audio. To begin this discussion, let's look at what's happening to a fictional company called Nemosyne, Inc.

Nemosyne's main product is a series of foreign-language study packages. Each package contains a manual and two cassette tapes oriented to one of 30 different languages. Nemosyne's primary market is American executives who travel overseas on business. Although Nemosyne wants to expand its market to European and Asian executives, they have not yet reached that point. Currently, their biggest seller is the English-to-Japanese study package, with French running a close second.

Not too long ago, Nemosyne hired a Webmaster to create and maintain a corporate intranet because the board of directors wanted to capitalize on the advantages of a well-developed Web site. The new Webmaster set up a few employee bulletin boards, an online contact database, and CGI scripts that allow customers to order products via an electronic form. Nemosyne has already noticed an increase in sales.

The board of directors has big plans for expanding the public side of their Web site: They want to be the first to bring foreign language study to the Web. The company's Webmaster has told the board that by creating a large base of audio files (all of which can be extracted from their existing language tapes), they can make a site to which American travelers can refer when they need to know pronunciation for certain words or phrases.

A businessman on assignment in China, for example, needs to find a dry cleaner. If he's online, the businessman can go to the Nemosyne site where he can find a hotlink that looks something like this:

DryCleaning.ra

After he clicks the link, he is presented with a list of audio files for dry cleaning-related phrases such as "Where would I find a dry cleaner?" He selects this phrase, and within seconds a Chinese version of this phrase plays on his laptop. He can repeat the phrase as many times as he likes, until he can say it himself. (By this time, he might be so impressed that, before he looks for the dry cleaners, he'll take the time to order Nemosyne's Chinese language package electronically with his credit card.)

The board thinks the concept of online "instant phrases" is terrific, but they're still a bit hesitant. The Webmaster has been asked to create an online presentation that will give them a feel for how the proposed site would operate. The presentation needs to be posted on Nemosyne's intranet so that the CEO (who is spending the summer in Bermuda) can access it easily.

The Webmaster begins by browsing the Web, just to see what other users are doing with audio files. In one two-hour session, it becomes apparent that creating online audio is not a difficult task.

Bandwidth and Hardware

The biggest challenge for the Webmaster in developing the Nemosyne presentation is overcoming limited bandwidth. This is the challenge that all Web multimedia developers are facing currently. A 28.8 Kbps modem can move about 3.6 kilobytes (3.6K) per second, making it about 50 times slower than the 176K per second that is needed to play CD-quality audio.
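The comparison above can be checked with a few lines of arithmetic. The sketch below assumes the standard CD audio parameters (44,100 samples per second, 16 bits per sample, two channels), which the chapter does not spell out, and it ignores modem protocol overhead.

```python
# Rough bandwidth comparison: a 28.8 Kbps modem versus CD-quality audio.

MODEM_BITS_PER_SECOND = 28_800
modem_bytes_per_second = MODEM_BITS_PER_SECOND / 8  # 8 bits per byte

# Standard CD audio parameters (an assumption not stated in the text).
SAMPLE_RATE_HZ = 44_100
BYTES_PER_SAMPLE = 2   # 16-bit samples
CHANNELS = 2           # stereo
cd_bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS

# How many times more data CD audio needs than the modem can deliver.
shortfall = cd_bytes_per_second / modem_bytes_per_second

print(f"Modem:    {modem_bytes_per_second / 1000:.1f}K per second")
print(f"CD audio: {cd_bytes_per_second / 1000:.1f}K per second")
print(f"CD audio needs about {shortfall:.0f} times the modem's capacity")
```

The ratio works out to 49, which is where the "about 50 times slower" figure comes from.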

To overcome the lack of bandwidth, you must compress all audio files for the Web. Otherwise, they would come out sounding like a tape recorder running on old batteries (or even worse). Although you can compress audio files in many different ways, you can offer audio on your Web site in only two ways.

The first option is to post files for users to download and save to their hard drives. The disadvantage of this method is that it asks a lot of the users. They must know how to download a file and how to find and use an audio player, and they must sacrifice hard disk space, all just to listen to something that sounds merely okay. The second option, streaming audio, also sounds merely okay but is much less demanding. A streaming audio file plays just seconds after a user clicks its link, provided a plug-in player has been downloaded and stored in the proper directory.

Streaming audio works by creating an ultra-compressed version of a regular digital audio file while keeping the compressed data in an order that the computer (with the help of the player) can decode as it arrives. Currently, streaming audio products are marketed by several different companies (more about them later in this chapter), but they all rely on a codec (coder/decoder) with two components: a compressor, which encodes the original audio into a compact stream, and a decompressor, which decodes the stream for playback. Because the compression discards data, the decompressor reconstructs audio that is similar to the original but not identical.
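To make the "similar but not the same" point concrete, here is a minimal sketch of a lossy compressor and decompressor pair. The codecs inside commercial streaming packages are proprietary, so this example substitutes mu-law companding, the simple telephony technique also associated with .AU files; it is an illustration of lossy coding in general, not the method any streaming product uses. The test tone and sample rate are arbitrary choices.

```python
import math

MU = 255  # companding constant used by mu-law coding


def compress(sample: float) -> int:
    """Encode a sample in [-1.0, 1.0] as a signed 8-bit code (-127..127)."""
    sample = max(-1.0, min(1.0, sample))
    # Logarithmic curve: quiet sounds get more of the available codes.
    magnitude = math.log1p(MU * abs(sample)) / math.log1p(MU)
    code = round(magnitude * 127)
    return code if sample >= 0 else -code


def decompress(code: int) -> float:
    """Decode an 8-bit code back to an approximate sample value."""
    magnitude = abs(code) / 127
    value = math.expm1(magnitude * math.log1p(MU)) / MU
    return value if code >= 0 else -value


# A short 440 Hz test tone sampled 8,000 times per second (arbitrary values).
original = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(80)]
codes = [compress(s) for s in original]
restored = [decompress(c) for c in codes]

# The round trip is close to the original, but not exact.
worst_error = max(abs(a - b) for a, b in zip(original, restored))
print(f"Largest difference after the round trip: {worst_error:.4f}")
```

Each sample shrinks to a single byte, and the small difference reported at the end is the price paid for the smaller size.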

To hear some clips of presidential speeches that have been transformed into streaming audio files, visit Webcorp's audio archives at http://www.webcorp.com. To hear some streaming audio examples of presentation music, visit NetworkMusic's home page at http://www.networkmusic.com.

Opinions about the quality of streaming audio are varied. Although having a short download time is nice, many streaming audio clips (especially those that were originally created on old-fashioned tape players) sound rough. Don't expect even the most basic stereo sound from streaming audio; however, the quality should progress as bandwidth increases.

For the Nemosyne presentation, the Webmaster wants to use streaming audio. Using these files is quicker than making each user download the entire audio file before it is played. The Webmaster, however, anticipates that the proposed site will eventually include some audio files for downloading. If the Nemosyne company wants to offer a sample tape for downloading to potential customers, for example, streaming audio would not be the best choice. In this case, having a less-compressed higher-quality version is better. As it now stands, streaming audio is best for audio files that are meant to be played spontaneously online. Other formats might be used for longer audio files, especially ones in which sound quality is especially important.

As the quality of streaming audio improves, it might completely replace the other audio file formats (for online purposes). Streaming audio is easier to handle: the number of streams can be determined by the server software. A company that wants to have unlimited streams has the option. Likewise, having as few as five or six streams is possible. Streaming audio's flexibility is a new branch in the ways to distribute audio files over a network. But like most technology having to do with the Internet, its capabilities and protocols are always changing.

Audio Formats

Many times you'll be required to create your file in a sound editor and then transform it for real-time playback with a streaming audio encoder. For that reason, you should become familiar with the audio formats used to digitize sound for the Web. If you don't plan to stream your audio files, you can post any audio format on your Web site. Remember that your users will have to download the files and have the means to obtain a player. Including a compatible player along with your bank of audio files is usually a good idea. On the Web, the most frequent uses of nonstreaming audio are song samples (visit Geffen/DGC Records at http://www.geffen.com), audio greetings, stories, and clips from speeches.

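Several different audio formats are available to Web developers. Each has advantages and disadvantages. Compressed encodings such as audio-MPEG and the uLaw encoding common in .AU files are lossy, meaning the compression causes a decrease in sound quality; .WAV and .AIFF files usually hold uncompressed samples, which preserves quality at the cost of much larger files. Remember that a particular sound format might support more than one way to encode the sound data.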

For the Web, audio-MPEG, .WAV, .AIFF, and .AU (all described next) are some of the most popular formats. Most likely, your sound editor will allow you to create files in one of these formats. However, you will occasionally run across sites that use other formats. Choosing between audio formats is a matter of testing them. You also should take into consideration what type of sound you want to post. Some formats work better with speech than with music and vice versa. The same holds true for sound editors and players, including the streaming variety.

The following are some descriptions of audio file formats:

MPEG-Audio: A favorite of multimedia experts, audio-MPEG (named for the Moving Picture Experts Group) is the sound counterpart of the digital video format MPEG. Using a technique called perceptual coding, audio-MPEG compresses audio by removing data that listeners are unlikely to hear. This format produces small file sizes and good sound quality, probably the best of all the formats. You can choose between MPEG 1 (slower) or MPEG 2 (faster). You probably won't have to flip a coin to decide.
.AU (also uLaw, NeXT, and Sun Audio): A common UNIX format, .AU is still popular for the Web. .AU is a versatile format used mainly for its capability to make a very small file rather than for its playback quality, which is usually rather poor.
.WAV (MS Windows) and .AIFF (Macintosh): You can decide between 8- or 16-bit sound when using these formats. The advantage is the flexibility, but the disadvantage is a tendency to produce a "grainy" sound. These files have the potential to become very large, which means a long download time for the users.
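To see how quickly uncompressed files grow, the following sketch builds one minute of silence as a .WAV file in memory with Python's standard wave module and reports its size at 8 and 16 bits. The 11,025 Hz mono sample rate is an assumption chosen for illustration; the 3.6K-per-second modem rate is the figure quoted earlier in this chapter.

```python
import io
import wave

MODEM_BYTES_PER_SECOND = 3_600  # 28.8 Kbps modem, as quoted earlier


def wav_size(seconds: int, sample_rate: int, bytes_per_sample: int,
             channels: int = 1) -> int:
    """Return the size in bytes of an uncompressed .WAV file of silence."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(bytes_per_sample)
        wav.setframerate(sample_rate)
        # 8-bit .WAV samples are unsigned (silence is 0x80);
        # 16-bit samples are signed (silence is 0x0000).
        silence = (b"\x80" if bytes_per_sample == 1 else b"\x00") * bytes_per_sample
        wav.writeframes(silence * channels * sample_rate * seconds)
    return len(buffer.getvalue())


for bytes_per_sample in (1, 2):
    size = wav_size(60, 11_025, bytes_per_sample)
    minutes = size / MODEM_BYTES_PER_SECOND / 60
    print(f"{bytes_per_sample * 8}-bit: {size:,} bytes, "
          f"about {minutes:.1f} minutes to download")
```

Under these assumptions, a one-minute 8-bit clip takes roughly three minutes to download over the modem, and the 16-bit version takes roughly six.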

Chances are good that you will be able to find a shareware program that produces audio-MPEG, .WAV, .AIFF, and .AU files. Go to Virtual Noise's Audio Help Desk at http://www.virtualnoise.com/audio1.html. You can find some hot links that lead you to some outstanding shareware sound-editing programs and players.

It's becoming more common for computer companies to pre-install sound editors, players, and other multimedia applications along with the system software. You can use the Sound Recorder that comes with Windows 95 or Macintosh's SoundMachine to make audio files and to play them. If you want to offer all your online audio files in .AU, audio-MPEG, .WAV, or .AIFF format, give your users instructions on what applications they need to play the files. You can be very general and say something like "You'll need a sound player on your machine to hear this file." To link an audio file on your site for download, use the standard HTML anchor tag:

<A HREF="example.wav">Example</A>

If you plan to make any part of your site dependent on audio files, and you anticipate that a shareware program will not meet your needs, you might want to look into some of the commercially available sound-editing packages, such as Macromedia's SoundEdit Pro (for Macintosh only) or Cakewalk Music Software's Cakewalk Pro Audio (Windows). Expect to pay around $400 to $500.

Audio Server Software

One issue the Webmaster considers while developing the online Nemosyne presentation is whether to install an audio server. Think of the audio server as a supplement to the Web and mail servers: it gives you the capability to play audio streams from a given site. You should not confuse an audio server with sound-editing software. Think of sound editors as the "drawing programs" of audio: they are nonserver applications that provide tools for editing and manipulating sound.

A streaming audio server, available from the companies that are currently developing streaming audio products, allows a Web site to offer live, on-demand audio in real time. With one of the more expensive packages, at least a couple dozen users (and possibly many more) can request streaming audio files at the same time. No downloading is involved. Without the audio server, you would have to link streaming audio files to a site that does have audio server software.

The Webmaster decides on the most basic version of Progressive Networks' RealAudio server software, which allows about five or six simultaneous users. Nemosyne will pay the $600 it costs to provide this service. The deluxe server would be quite pricey, as the capacity for unlimited audio streams runs about $5,000 to $6,000. The Webmaster plans to ask for the more deluxe server software during the Nemosyne final presentation.

Streaming Audio Packages

Most streaming audio packages come in three parts: the server, the encoder, and the player. Most likely, you can get some parts (if not all) of the package for free by downloading them from Web sites (URLs are listed next). Every package offers different features. Do some research to find the one that's best for you.

RealAudio by Progressive Networks: RealAudio, developed by Progressive Networks (a pioneer in streaming audio technology), is the most polished and most popular package. Most people find that RealAudio is easy to use, fully functional, and reasonably priced (as far as streaming packages go). RealAudio moves real-time audio over 14.4 Kbps and faster modems, and the player can be downloaded for free. Progressive Networks offers several packages especially for intranets. Each includes the server, the deluxe version of the encoder, license, and upgrades. Prices begin at $495. For more information, visit http://realaudio.com.
StreamWorks by Xing: StreamWorks is a well-developed (although not particularly user-friendly) package. It's more expensive than some of the others but would work well for an intranet that maintains hundreds of simultaneous audio streams, like a broadcasting company. Like RealAudio, StreamWorks requires dedicated server software, whereas TrueSpeech and IWave (described in a moment) do not. Technical support for the free downloadable player costs $29. You might need the tech support considering that the Xing player is a bare-minimum MIME (Multipurpose Internet Mail Extensions) version operating from an embedded applet. Xing StreamWorks server software (platforms offered are SGI, Sun, HP, and Linux) starts at $3,500. For more information, visit http://www.xingtech.com.
TrueSpeech by DSP: TrueSpeech, a freeware package, offers excellent quality at low bandwidth for both music and speech. Like IWave (below), TrueSpeech requires no special server software, which makes both of them much less expensive than their competitors. The free TrueSpeech player, however, is a bit too basic for large files: users do not have the option of stopping and resuming the playback, or adjusting the volume. The TrueSpeech encoder is included with MS Windows 95 (it's integrated into the Sound Recorder). For more information, visit http://www.dspg.com.
IWave by VocalTec: IWave (Internet Wave) offers high-quality music at low bandwidth, but it does not handle speech as well as other players. You can download IWave for free. You also can obtain a well-designed (but large) player by downloading the demo version of VocalTec's Internet phone. The IWave server is limited in its functionality, but if your streaming audio needs are for music, IWave is a good choice. For more information, visit http://www.vocaltec.com/.
ToolVox by Voxware, Inc.: A freeware package, ToolVox is solely for speech. It lacks a fully functional player and does not have a server component. ToolVox audio files must be linked with standard HTML tags, which are then distributed in the same manner as GIF and JPEG files. IWave and TrueSpeech (above) also use the HTML method of distribution. The disadvantage of not having server software is that you lose the control and flexibility you might need for a high-volume commercial system. ToolVox is a good test run if you are thinking about developing streaming audio capabilities into your site. For more information, visit http://www.voxware.com.

Recording Audio

With a cassette from one of the foreign-language packages in hand, the Nemosyne Webmaster is ready to make a recording. She can record directly into the version of the RealAudio Encoder that came with the RealAudio server, but she wants to gain experience creating an audio file in a sound-editing program and then converting it to RealAudio format. For this reason, she must download the free version of the RealAudio Encoder from the RealAudio Web page. (The free version does not have the capability to record live; it can only convert an existing audio file into a RealAudio file.) While the encoder is downloading, the Webmaster searches the Web for a good sound-editing program.

Choosing Sound-Editing Software

A sound editor's first purpose is to record a sound and transform it into a digital format (preferably one of the more popular ones). A good audio editor has the following functions:

Operates with specialized sound cards
Provides accurate input meters
Lets you change bandwidth and format specifications
Offers transitions between the RAM and the hard disk
Creates more than one track
Provides a variety of ways to mark time, such as in measures/beats and seconds/milliseconds

The Webmaster is going to use a Microsoft Windows-compatible program called GoldWave as a sound editor. The good thing about GoldWave is that it is a shareware program and can be downloaded from the Web for free. If you would like a copy of GoldWave, along with some good documentation, visit GoldWave's home page at

http://garfield.cs.mun.ca/%7Echris3/goldwave/goldwave.html

You also can download many other freeware and shareware audio editor packages (such as Cool Edit, Sonic Screwdriver, WAVany, and WHAM) from the Web.

After GoldWave is installed, the Webmaster creates a new file in which to record the first audio file, as shown in Figure 25.1.

Figure 25.1: A new file is created in GoldWave.

Because the Webmaster is making a new audio file from an existing source (a cassette), she needs to route the computer to a tape player or stereo system. Most computers have audio ports that allow this routing. See Figure 25.2.

Figure 25.2: GoldWave's Device Controls panel allows you to play and record sound.

Using a sound-editing program, the Webmaster can extract a one-minute clip from the tape and adjust the bandwidth specifications. Once edited, the file is called "greeting" and is saved as a .WAV file, as shown in Figure 25.3.

Figure 25.3: The file is saved as greeting.WAV.

After an audio file is in .WAV (or another) format, it can be stored in the proper directory and linked into an HTML file. A user can then download the file and play it with a computer audio player.
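The Webmaster does her trimming inside GoldWave, but the same step can be sketched in code. The function below uses Python's standard wave module to copy a stretch of an existing .WAV recording into a new file. The function name, its parameters, and the file names in the usage note are hypothetical, and the sketch assumes an uncompressed .WAV source.

```python
import wave


def extract_clip(source_path: str, clip_path: str,
                 start_seconds: float, length_seconds: float) -> int:
    """Copy part of a .WAV file into a new file; return the frames written."""
    with wave.open(source_path, "rb") as source:
        params = source.getparams()
        rate = source.getframerate()
        # Jump to the start of the clip, clamped to the end of the recording.
        start_frame = min(int(start_seconds * rate), source.getnframes())
        source.setpos(start_frame)
        # readframes returns fewer frames if the recording ends early.
        frames = source.readframes(int(length_seconds * rate))

    with wave.open(clip_path, "wb") as clip:
        # Keep the source's channels, sample width, and sample rate.
        clip.setparams(params)
        clip.writeframes(frames)

    return len(frames) // (params.sampwidth * params.nchannels)
```

For example, extract_clip("tape.wav", "greeting.wav", 0, 60) would save the first minute of a recording named tape.wav as greeting.wav.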

.WAV files, like any files that have to be downloaded, do not play in real time, however. Because it has already been decided that the Nemosyne presentation files will be streaming audio, the Webmaster pulls up the RealAudio Encoder, shown in Figure 25.4.

Figure 25.4: The RealAudio Encoder can transform a variety of audio formats into streaming audio.

The RealAudio Encoder allows the Webmaster to transform the .WAV file into an .RA file (.RA is the extension used specifically for RealAudio files): greeting.RA.

Now that the file is in .RA format, the Webmaster wants to link it to the online presentation site. To do so, RealAudio requires that an additional document, called a metafile, be created and attached to the site. The metafile, which has the extension .RAM, is a text file that contains the URL of the RealAudio file. It provides a link between the Web server and the RealAudio server. The Webmaster also configures the Web server to recognize the .RA and .RAM MIME types. The RealAudio page (http://www.realaudio.com) gives specific instructions on creating the metafile and configuring the Web server.

The Webmaster creates the metafile (greeting.RAM), configures the Web server, and then links the audio file to the desired page. The link itself needs only a simple HTML tag that points to the metafile:

<A HREF="/greeting.RAM">Greeting</A>

Provided that the RealAudio Player is installed on the user's machine, the file should play in real time. This procedure can be repeated to make additional real-time audio files.
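Because the metafile is just a one-line text file, it is easy to generate. The sketch below is an illustration under several assumptions: the audio server host name is hypothetical, the pnm:// URL scheme and the audio/x-pn-realaudio MIME type are the ones commonly associated with RealAudio servers of this period, and the vendor's own instructions should be treated as the authority for any real installation.

```python
from pathlib import Path

AUDIO_SERVER = "audio.nemosyne.example"        # hypothetical host name
REALAUDIO_MIME_TYPE = "audio/x-pn-realaudio"   # assumed type for .RA and .RAM


def write_metafile(clip_name: str, directory: str = ".") -> Path:
    """Create clip_name.RAM containing the URL of clip_name.RA."""
    # The metafile's only content is the address of the audio file
    # on the audio server (pnm:// scheme assumed).
    url = f"pnm://{AUDIO_SERVER}/{clip_name}.RA"
    metafile = Path(directory) / f"{clip_name}.RAM"
    metafile.write_text(url + "\n", encoding="ascii")
    return metafile
```

Calling write_metafile("greeting") would produce greeting.RAM in the current directory. The Web server then needs to map the .RA and .RAM extensions to the MIME type so that browsers hand the metafile to the player.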

Streaming audio files are going to provide the Webmaster with what she needs to present her proposal effectively to Nemosyne's board of directors. But you should remember that the procedure described in this chapter is only one of many that you can use to make an audio file. You have to decide whether to purchase an audio server and what editing software you use. You also might decide that you don't want to use streaming audio. It's the trend of the future, but it's also expensive and a bit underdeveloped. Waiting for the streaming audio technology to progress also is an option.

The Future of Online Audio

Developments like streaming audio and other online multimedia are indicators that the Web is rapidly becoming a truly interactive environment. Real-time audio and video capabilities bring to the Web what satellite dishes brought to television. Conferences, court proceedings, talk shows, and celebrity chat hours eventually might be available online.

The Internet Phone

Internet Phone is more than a speaker phone and more than a chat room; VocalTec's Internet Phone gives Internet users the ability to talk to each other in their real voices. By connecting to the IRC (Internet Relay Chat) network, the Internet Phone software provides a list of online users and conversation topics. After you have a TCP/IP Internet connection, select a user from the list to call. The minimum connection is a SLIP/PPP modem connection of 14,400 bps. Internet Phone works best with at least a 25MHz 486SX PC with 8MB of RAM. Versions for both Windows and Macintosh are available.

Internet Phone works by employing a voice compression algorithm that minimizes bandwidth consumption. Calls made from the Internet Phone cannot be traced, and the software allows for "private topics" that cannot be accessed by outsiders. Users of Internet Phone speak into a computer microphone.

A novel idea, the Phone's most obvious disadvantage is that you can make Internet phone calls only to people who also have the software. At this point, the Internet Phone is something like a very sophisticated chat room or BBS. But who knows, maybe someday everyone will be trashing touch-tones and buying high-tech microphones (with built-in answering machines, of course).

Besides the VocalTec phone, other audio-conferencing software is available. An excellent FAQ is available at http://www.gi.net/NET/PM-1995/95-04/95-04-28/0004.html. For more general information, visit VocalTec's home page at http://vocaltec.com.

Voice Recognition

Also in the future is the development of voice recognition computing. Imagine what it would be like to direct your computer to your favorite Web sites not with a mouse, but with your voice! Voice recognition often is used in word processing and other software as an aid to the visually impaired.

Just about any computer function that is performed with a keyboard and mouse has the potential to be performed with voice recognition. This is good news for people who get tired of moving the mouse around, or who were never that adept at dragging and clicking to begin with. Although it's doubtful that voice recognition would make the mouse and pad extinct (drawing programs are especially dependent on the trackball), you can count on the technology becoming integrated with more software packages.

The concept of machine-voice communication is older than you might think: about 60 years. It wasn't until the 1980s, however, that small vocabulary speech recognition software was developed to run on IBM PCs. The software has continued to progress and is greatly assisted by the Pentium and other powerful processors.

A twist on voice recognition is voice verification. You've probably seen science-fiction movies in which a fingerprint is used as a passkey; but because your voice is as unique as your fingerprint, anticipate the development of the "voiceprint," which might consist of a spoken password or phrase ("Open Sesame"?), or the repetition of certain words at the computer's request.

You can find an excellent directory of voice recognition resources at http://www.kurz-ai.com/gen-vr.html.

MCKEON & JEFFRIES

In the short term, McKeon & Jeffries has very little use or need for audio. Not only do most of their machines lack audio capability, but the information that's most important to them is technical research that is meant to be read, not heard. On the other hand, M&J hopes to use audio in a few applications in the future:

Continuing education. Every CPA has to keep up with current tax law and accounting practices. To do so, many accountants attend seminars and talks given by experts. McKeon & Jeffries hopes to someday be able to broadcast these seminars through the intranet to their users to save time and money, as shown in Figure 25.5.
Conference calls. In the distant future, M&J hopes to be able to hold online audio conferencing through its intranet. The next step would be to recognize the audio, save it to a file, and make it searchable for individuals who want to use it as a reference.

Figure 25.5: Continuing education is much more convenient and efficient using an intranet.

THE SPORTING GOODS AND APPAREL ASSOCIATION

The SGAA plans to implement sound in several different areas of the intranet. As their users become more sophisticated and have the ability to not only listen to audio but also to create audio, the site will use more and more audio technology.

What's new. The SGAA wants to greet users on each visit with a list of the new things added to the site that day, as shown in Figure 25.6. However, they don't want to crowd the home page with a lot of text. Using the RealAudio server, the SGAA staff can record a new message every day so that when users log on, they can hear about new attractions on the site, fresh news, and any events scheduled for that day such as online chats or audio conferences.
Advertising. Several of the manufacturers and distributors have product pages that contain information on specific products. For some of these products, radio advertising or the audio from television advertising is captured and made available on the server for resellers and distributors to hear.
Audio conferencing. Several times a month, the SGAA's executive committee engages in a short conference call to discuss association business. Members are invited to listen in using RealAudio through the intranet.

Figure 25.6: A different audio message greets users every day at the SGAA site.

Summary

You can use digital audio for online presentations and as a means to develop the resources of your intranet. Creating high-quality, functional audio is a matter of determining your needs (speech, music, or both) and scouting for software that will help you accomplish your goals. Deciding how to compress your audio is most important. Streaming audio packages enable you to play back in real time, so if you want to bring live conferences to your site, streaming audio is for you. On the other hand, if you are more interested in posting music files that do not have to be played simultaneously, and you want to preserve quality, you might want to post your files for downloading. You must then decide what format to use for posting your audio files. Audio-MPEG, .AIFF, .WAV, and .AU are the common choices, and they differ in how much they compress the sound.

In both cases, take into consideration the software choices you have available: many servers, encoders, and players are available through shareware, so if audio will not be a large part of your presentation or intranet, they might be the best route. If you have plans to make audio an important part of your presentation or your intranet, look into one of the more sophisticated commercially available packages. Don't forget that online sources can answer questions that arise while you are in the process of creating audio.