Hi again guys!
Over the past years OSS has been criticized because it doesn’t support this or that feature of this or that sound card. Instead it limits applications to a common subset of features found on every sound card. In general this is true. However, is it actually a problem?
Using sound in applications is in many ways similar to using networking. For example, a web browser simply tells the networking software to connect to a given TCP port of a given server (name/address). The networking software then builds the connection and that’s it. Web browsers don’t try to find out which network interface card is installed in the system. They don’t try to find out whether the device has some hardware parameters that could be tweaked to get better performance. In fact the networking software doesn’t even let them do that (such changes would ruin the performance of the other applications using the same NIC). Instead the NIC settings are managed by the kernel’s network core/drivers, which have better knowledge of the local network.
This is known as the “black box” model. A web browser application sees only the networking subsystem box. There may be some control switches on the front panel of the box. However, they relate only to the current stream/socket. The actual NIC devices are located on the back side of the box. There may also be some control switches for the devices, but the application cannot see them. Instead they can be used by the system operator (if necessary).
The OSS API is based on this black box approach too. The application doesn’t need to worry about the device parameters at all. Instead it just tells the OSS box to create an audio stream with the given parameters (sampling rate, number of bits and number of channels). After that the application programmer can focus on doing his/her job. OSS will automatically perform the required conversions in software if the device doesn’t support the requested format itself. Once the stream is running, the programmer can use the usual POSIX/Unix/Linux system calls such as read(), write() and select()/poll() to feed more data.
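As a rough sketch of what this looks like in C: the function names, the device path passed by the caller and the choice of 16-bit little-endian samples are assumptions made for illustration only. The ioctl calls are the classic SNDCTL_DSP_* requests from the OSS header.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/soundcard.h>
#include <unistd.h>

/* Open an OSS device for playback and ask for a stream format.
 * Returns the file descriptor, or -1 on failure. On success *rate and
 * *channels hold the values the driver actually granted.
 * (Hypothetical helper; 16-bit little-endian samples are an assumption.) */
static int open_playback_stream(const char *devname, int *rate, int *channels)
{
    int fmt = AFMT_S16_LE;
    int fd = open(devname, O_WRONLY);

    if (fd == -1) {
        perror(devname);
        return -1;
    }
    /* Ask for channels, sample format and sampling rate, in that order. */
    if (ioctl(fd, SNDCTL_DSP_CHANNELS, channels) == -1 ||
        ioctl(fd, SNDCTL_DSP_SETFMT, &fmt) == -1 ||
        ioctl(fd, SNDCTL_DSP_SPEED, rate) == -1 ||
        fmt != AFMT_S16_LE) {
        fprintf(stderr, "%s: cannot set up the requested stream\n", devname);
        close(fd);
        return -1;
    }
    return fd;
}

/* Feed samples with plain write(); it simply blocks until the data
 * has been accepted. Returns 0 on success, -1 on error. */
static int play_samples(int fd, const short *samples, size_t count)
{
    const char *p = (const char *)samples;
    size_t left = count * sizeof(short);

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: just retry */
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }
    return 0;
}
```

A caller would typically do something like open_playback_stream("/dev/dsp", &rate, &channels) once and then call play_samples() in a loop. Nothing in it depends on which sound card is behind the device file.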
The black box model makes OSS audio applications very simple and robust. About 95% of applications don’t need to use any advanced techniques to get audio working (unfortunately developers seem to find this very difficult to accept). Practically all audio applications I have examined do completely unnecessary things that don’t make anything better. In fact, in many cases such applications will simply break on systems that have slightly different sound cards (usually some of the high end professional ones).
Of course there are special applications that need to be fully aware of the hardware details. For example, it doesn’t make any sense to send encoded digital bit streams (such as AC3) to an ordinary analog output device that cannot decode them. Or an audio recorder/editor application probably should prevent the user from recording a 24bit/192kHz/5.1 file from a (modem) device that supports only 8bits/8kHz/mono. However, such applications are rare (about 5% of all applications doing audio). They are usually audio oriented and designed by programmers with good audio knowledge.
The majority of applications doing audio require just a very limited number of features from the sound subsystem. Applications that require anything more are rather rare. For this reason the OSS API has been divided into a small subset of fundamental core functions that are easy to use and a wider set of less frequently used functions. This makes it very easy to learn and master OSS. At the same time programmers seeking a challenge can get it.
The above was the first half of my “mission”. The second half is that I was introduced to Unix in 1984. I have always liked its powerful capability to do mighty things with just a small set of carefully selected features. I have also done some MS-DOS (since the introduction of the HP150 microcomputer) and Windows programming during my life before Linux but I never liked that.
I think the above explains why OSS became what it is now. There are some fundamental rules I have tried to follow as much as possible:
o Use the familiar Unix/POSIX/Linux device/file API as much as possible. There are millions of programmers who already know how to use this interface. Developing an entirely new interface that is not compatible with anything earlier will just cause massive confusion and make it difficult to get programs working with the good old legacy API.
o Keep the number of features as small as possible. In this way the API can (hopefully) be properly tested and documented. A compact API is also easier for application developers to understand. Add new API features only if it’s absolutely necessary. In this way OSS currently has something like 200 ioctl functions. Even this may be too many but fortunately most applications need just a handful of them. There is a good chance that we can actually document and test all these functions within a reasonable amount of time. Compare this to some competing API that has some 1500+ (and counting) functions. I would be very surprised if all of them ever get documented (or even used by any applications).
o Don’t require the application programmers to use nasty programming techniques such as callbacks or mutexes/semaphores. Instead let the programs work in a fully linear manner. Callbacks and semaphores are features that belong to the kernel code, not to the applications.
o Use the kernel’s device file interface (open, close, read, write, ioctl, select/poll) instead of some library interface. This interface uses very weak binding which makes it possible to use OSS enabled applications in systems that don’t have OSS installed (of course sound doesn’t work but the other features of the application can still be used). Library based applications in turn don’t even start if the required sound library is not installed (unless the program uses nasty dynamic linking to load some sound plugin).
o Let the applications do only the things they should do. For example, try to prevent ordinary audio applications from changing the global output volume (that would disturb all the concurrently running applications). Instead provide a way to change just their own output level.
o Provide full backward and forward compatibility of the API. Applications developed for any earlier version of the OSS API should run without recompiling under the very latest OSS version. Equally, applications developed and compiled for the latest OSS version should run under any earlier OSS version. This is possible if the application designer uses the suggested default actions when the new features are not available in the older system (ioctl returns errno=EINVAL).
o Keep the API endianness neutral. Applications should work without modifications whether they are compiled under a big endian or a little endian architecture.
o Make the API 64/32 bit neutral. Applications compiled using a 32 bit compiler should run without any problems under a 64 bit compiled kernel.
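Two of the rules above, the weak binding of the device file interface and the EINVAL fallback used for compatibility, can be sketched in C roughly like this. Both helper names are made up for this illustration, and the request code is left as a parameter so that no particular ioctl is assumed to exist in a given OSS version.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Weak binding: try to open the audio device. If it is not there the
 * program is told so and can keep running without sound.
 * Returns the file descriptor, or -1 if audio is unavailable.
 * (Hypothetical helper name.) */
static int open_audio_or_disable(const char *devname)
{
    int fd = open(devname, O_WRONLY);

    if (fd == -1)
        fprintf(stderr, "%s: %s (continuing without sound)\n",
                devname, strerror(errno));
    return fd;
}

/* Compatibility: issue an optional ioctl that an older driver may not
 * know about. Returns
 *    1 if the driver accepted the request,
 *    0 if the driver rejected it with EINVAL; *value is then set to
 *      the caller's suggested default,
 *   -1 on any other error (*value is left untouched).
 * (Hypothetical helper name; the request code comes from the caller.) */
static int optional_ioctl(int fd, unsigned long request, int *value,
                          int fallback)
{
    int tmp = *value;

    if (ioctl(fd, request, &tmp) == -1) {
        if (errno == EINVAL) {
            *value = fallback;  /* feature missing: use the default */
            return 0;
        }
        return -1;
    }
    *value = tmp;
    return 1;
}
```

An application built this way starts and runs even where no OSS device exists, and a program compiled against a newer header keeps working on an older driver as long as it picks a sensible fallback for each optional request.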
That is pretty much all. Comments are welcome and I will try to answer them as well as I can. You may wonder why I have only talked about audio but not MIDI. MIDI is an entirely different beast and I will discuss it later.