by Stefan Westerfeld
In This Chapter
What has traditionally been the domain of other systems is slowly coming to Linux (and UNIX) desktops. Images, sound effects, music, and video are a fascinating way to make applications more lively and to enable whole new uses. When I was showing a KDE 2.0 preview at CeBIT 2000, I often presented some of the multimedia stuff—nice sound effects, flashing lights, and great music. Many people who were passing by stopped and could not take their eyes off the screen. Multimedia programs capture much more attention from a user than simple, "boring" applications that just run in a rectangular space and remain silent and unmoving.
However, those technologies will become widespread in KDE applications only if they are easily accessible for developers. Take audio as an example. KDE is supposed to run on a variety of UNIX platforms, and not all of them support sound. Among those systems that do, there are very different ways of accessing the sound driver. Writing a proper (portable) application isn't really easy.
KDE 1.0 started providing support for easily playing sound effects with the KAudioServer. Thus, a game such as KReversi could support sound without caring about portability. With a single KAudio class, all the problems of supporting different platforms, and of loading, decoding, and playing such a file, were gone.
The idea behind KDE 2.0 multimedia support remains the same: make multimedia technologies easily accessible to developers. What has changed is the scale. For KDE 1.0, playing a wave file was about all the multimedia support you could get from the libraries. For KDE 2.0 and beyond, the idea is to really care about multimedia.
KDE 2.0 takes into consideration all audio applications—not only those that casually play a file, but everything from the heavy real-time-oriented game to the sequencer. KDE 2.0 also supports plug-ins and small modules that can easily be recombined, as well as MIDI support and video support.
The challenge of delivering multimedia in all its forms to the KDE desktop is a big one. The KDE multimedia support should therefore work like glue between applications, so that the puzzle pieces already solved by various programmers become usable in any application, and the picture slowly grows complete.
The road for KDE 2.0 (and later versions) is integration through one consistent streaming-media technology. The idea is that you can write any multimedia task as little pieces, which pass multimedia streams.
Quite some time before KDE 2.0, I first heard of Christian Esken's plans for KAudioServer2, an attempt to rewrite and improve the audio server so that it would support streaming media to a certain degree. Meanwhile, I had been working on the aRts (analog real-time synthesizer) software for quite a while and had already implemented some nice streaming support. In fact, aRts was a modular software synthesizer that worked through little plug-ins and streams between them. And, most important of all, aRts was already working great.
So, after some consideration, we decided at the KDE 2.0 Developer meeting to make aRts the base for all streaming multimedia under KDE. Many things would have to change to turn a synthesizer into a base for all multimedia tasks, but this was a much better approach than trying to build something completely new and different, because aRts had already proven that it worked.
As I see it, the important parts of streaming multimedia support are
How these things work is probably best illustrated with a small example. Suppose you want to listen to a beep in which the left speaker plays a 440Hz tone and the right speaker plays an 880Hz tone. That would look something like the following:
As you see, the task has been divided into very small components, each of which does only a part of the whole. The frequency generators only generate the frequency (they can also be used for other waveforms), nothing more. The sine wave objects only calculate the sine of the values they get. The play object only takes care that these things really reach your sound card. To get a first impression, the source code for this example is shown in Listing 14.1:
Listing 14.1. Listening to a Stereo Beep
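The listing, sketched here from memory, uses the aRts flow API: the module names (`Synth_FREQUENCY`, `Synth_WAVE_SIN`, `Synth_PLAY`) and the port names (`invalue_left`, `invalue_right`) follow aRts conventions, but exact headers and names should be checked against the actual aRts sources.

```
#include "artsflow.h"
#include "connect.h"

using namespace Arts;

int main()
{
    Dispatcher dispatcher;         // handles communication and scheduling

    Synth_FREQUENCY freq1, freq2;  // generate the frequencies
    Synth_WAVE_SIN  sin1, sin2;    // turn them into sine waves
    Synth_PLAY      play;          // send the result to the sound card

    setValue(freq1, 440.0);        // left channel: 440Hz
    setValue(freq2, 880.0);        // right channel: 880Hz

    connect(freq1, sin1);          // frequency generator -> sine wave
    connect(freq2, sin2);
    connect(sin1, play, "invalue_left");
    connect(sin2, play, "invalue_right");

    freq1.start(); freq2.start();
    sin1.start();  sin2.start();
    play.start();

    dispatcher.run();              // process the stream until interrupted
}
```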
Now, keeping that simple example in mind, consider Figure 14.2:
Figure 14.2 illustrates a real-life example. I've simply composed three tasks that happen at the same time.
First, consider the MIDI player. The MIDI-player component is probably reading a file and sending out MIDI events. These are sent through a software synthesizer, which takes the incoming MIDI events and converts them to an audio stream. This is not about your hardware wave table on the sound card; all things that we are talking about here are happening before the data is sent to the sound card.
On the other hand, there is the game. Games often have very specific requirements for how they calculate their sound, so they might have a complete engine for that task. One example is Quake: it calculates sound effects according to the player's position, so you can orient yourself by listening closely to what you hear. In that case, the game generates a complete audio stream itself, which is then simply sent to the mixer.
The next chain is the one with the microphone attached. In this example, the microphone output is sent through a pitch-shifting effect; then the output goes through the mixer, the same as everything else. Through the pitch shifting, your voice sounds higher (or lower), and the frequency change gives it a funny cartoon-character effect. If you like, you can also imagine a more "serious" application in its place, such as speech recognition or Internet telephony.
Finally, everything is combined in the mixer component and then, after being sent through one last effect (which adds reverb), played over your sound card.
This example shows a bit more of what the multimedia support does. For instance, you can see that not all of the components involved run in the same process. You wouldn't want to run your Quake game inside the audio server, which also handles the other tasks. Your MIDI player might likewise be external, or it might be a component that runs inside the audio server. Thus, the signal flow is distributed between the processes; each component runs where it fits best.
You also see that different kinds of streams are involved. The first kind is the normal audio stream, which the aRts/MCOP combination manages nicely (and which is the most convenient to use). The second is the MIDI stream. The two differ a lot. An audio stream always carries data: in one second, 44,100 values pass across the stream. In contrast, a MIDI stream transmits something only when needed: when a note is played, a small event is sent over the stream; when nothing happens, nothing is sent.
The third type is byte audio, which refers to the way the game in this case could produce audio. A byte stream carries the same format that would normally be sent directly to the sound card (16-bit, little-endian, 44.1kHz, stereo). To process such data with the mixer, it must first go through a converter, because the mixer mixes only "real" audio streams.