New business model for future OSS development

May 16th, 2008

Some months ago I realized that developing Open Sound System is no longer my profession. Nobody is paying me, so all OSS hacking has become just a hobby (just for fun). I make my living by doing contract development for Sun and a very few other paying customers. However that doesn’t cover the work I have planned to do to develop OSS as an independent product for everybody. This means that my focus will be on keeping customers like Sun happy (by developing the features they need) and on developing the features I want for my own needs.

At the same time users of OSS are struggling with problems related to their (hdaudio) laptops. Fixing them will require hours or days of work for each system. In addition to doing the work I would need to buy each laptop and pay for it out of my own pocket. Why should I do that when nobody is paying anything for OSS? Users of OSS seem to just want free support from me without paying anything for it.

Recently I bought a decent iMac system because I decided to move from Windows to Mac (I’m an amateur photographer and use Photoshop a lot). I can use this machine to fix some OSS problems related to Mac machines, but this is justified because I have other uses for the machine. In addition I would like to use OSS under Solaris on this system myself.

However there are many OSS users with many different kinds of laptops (Lenovo, Sony, Toshiba, etc). What should I do with them? Do I just go and buy all the laptops out of my own pocket? No. I don’t need any more laptops for myself and in addition my salary will not be enough to buy any more of them.

So what should I do about the users of such laptops? I don’t care. I don’t have any use for such laptops. Developing hacks for more hdaudio systems is not even fun (it’s the lamest of all lame jobs). Should I just say fuck off to all these users? After all, being nice to them will not improve my life in any way. I just have to waste my spare time doing something I don’t want to do. If I don’t do it, the competitiveness of OSS will degrade. However does that really matter when I will have to find some other job anyway? Is this the only choice? I hope not.

What I have been thinking about is a new business model for OSS. If 100+ owners of some laptop (say the Thinkpad T61) would like to donate something like $10 each to the project then I could pay for a T61 laptop and cover the required work with that money. In this way it will be profitable and I don’t need to cover the work from my just-for-fun time budget (which will never ever cover the price of a T61 laptop I don’t need). I just need to spend some additional time developing an auction web site for this purpose (which is actually fun).

What do you think guys/gals? Comments are welcome.

There is something wrong in the state of Open Source

November 7th, 2007

The source code for our Open Sound System product was “open sourced” under GPLv2 and CDDL about three months ago. There were several reasons why we did so.

  1. Third party driver developers can contribute their changes to the common OSS source tree.
  2. We wanted to give our customers a guarantee that they can keep using OSS even if 4Front Technologies goes out of business.
  3. By having the source code our customers can customize the code for their purposes.
  4. When a large number of developers can see the source code they may find bugs that might otherwise stay undetected for years.
  5. We wanted to make OSS an acceptable sound subsystem solution for all Unix/Linux/POSIX/whatever compatible operating systems.
  6. We wanted to promote an open source software concept where everybody who uses open source software from other companies/developers also contributes their own development to the community.
  7. We wanted to do all the above in a way that makes it possible for us to continue our work as professional software developers.
  8. We wanted to continue to sell OSS as a commercial product to the customers who want to use OSS from their proprietary/inhouse software.

We decided to release the source code of OSS under two commonly used open source licenses, GPLv2 and CDDL. We would have preferred just one license scheme, but of all the options these two provided the widest coverage of the open source software market. These two licenses are compatible with all the different “open source” licenses so far. We didn’t release OSS under licenses like BSD because we feel they are “half closed source” licenses.

Did this licensing model work? The answer is yes and no. The success story is that we have already got several talented developers to contribute their work back to OSS. What was disappointing is that open sourcing actually decreased our sales significantly. We are in the unfortunate position of providing only device drivers, which are supposed to be free.

So for us open sourcing means kissing goodbye to any hope of revenue. The customers who previously bought OSS licenses don’t do it any more. They think that having the source code means that they have at least the same rights as we have.

So what can we do? Developing OSS is no longer profitable which means we should find some other way to feed our families. Who will then continue developing OSS for you?

You may think we could start selling OSS T-shirts to fund the development. Have you ever bought any T-shirts to sponsor some open source project? At least I have never done so.

I tried adding some Google ads to our open source web pages and to the OSS Programmer’s Guide (which gets about 2000 hits/day). In two weeks I have earned exactly $00.00 that way.

So does anybody have better ideas?

To GPL or not to GPL, that is the question

June 12th, 2007

Why is this subject so difficult? I started writing this blog entry about a month ago but I’m still here. Somehow everything I have written so far doesn’t make any sense when I look at it the next day.

Ok. We have made the decision to open source OSS and the announcement will be made on June 14th 2007. The announcement was supposed to be kept secret but we realized that members of the open source communities may want to give some feedback before the announcement. So the secret was leaked, which proved to be the right solution.

The original idea was to release OSS under CDDL for Solaris/OpenSolaris and GPLv2 for the others. However as we expected the BSD communities didn’t like this so they will get OSS under CDDL too (how much I hate these silly open source license wars).

In addition we will announce our intention to move further development of Open Sound System to a community.

The license is really GPLv2 instead of LGPL or the Linux kernel license. This may look stupid but actually it’s not. The beauty of GPLv2 is that the license permits use of OSS only with operating systems and applications that are also released under GPL.
In this way the source code is available under GPL for everybody. Organizations and users using GPLed applications under GPLed operating systems (kernels) can use GPLed OSS for free. Just the companies and organizations using OSS in non-GPL compatible environments are required to buy the commercial license.

Some open source operating systems like OpenSolaris and the BSD variants are not compatible with GPL. We will make OSS available also under the CDDL license for these operating systems. CDDL makes things more complicated than with GPLv2 alone. However I think the benefits will be greater than the possible problems.

I think this licensing policy is good for everybody. The open source community will now have access to the source code after a 10 year closed source period. Users of closed source applications and operating systems can still use OSS and its source code if they purchase the commercial license. We as the initial developer of the software will still have some chance to continue development of OSS as professional developers. More details about the licensing will be published after the official announcement.

Why do we do this?

First of all we have recognized the benefits of open sourcing to the user and software developer communities. We also believe that with the help of the developer community we can make OSS better than it is now.

Another reason is that our potential customers currently expect to have access to the source code.

What does “open” mean in Open Sound System?

April 23rd, 2007

Open Sound System has been criticized because it’s not free, open sourced, GPLed or whatever. People claim it should not be called open because it’s in fact closed.

But what do “open” and “closed” mean after all?

During the early years of computing all software was written in assembly. There was no concept of an operating system. Instead all applications (such as accounting software) accessed the hardware directly. The hardware had to be fully documented since otherwise nobody would have been able to use it. In fact most (if not all) computer manufacturers shipped some software (in assembly language) with their hardware. So the situation was “all open” and also “all open source”.

Soon after that some new computer manufacturers entered the market. Since existing customers had already invested truckloads of money in their assembly software the new competitors had no other choice but to clone the original instruction set and hardware interface. The concept of “IBM compatibility” was born. Without compatibility it would have been too expensive for the old big customers to change their hardware vendor.

A few decades later the situation was different. Proper operating systems and programming languages had been invented. So 100% hardware compatibility was no longer necessary. It was possible to recompile the applications on some new system and get them running at reasonable cost. So being “open” was not the only choice.

Some systems even hid all the hardware (even the CPU instruction set) from the user and the software was implemented on top of some interpreted language (much like Java works today). The HP250 computer by Hewlett-Packard (see http://www.decodesystems.com/hp250.html) comes to my mind. It’s probably not the only or even the first system of this kind (it’s just the one I have seen). However all applications for it were written in HP BusinessBasic. Nobody knew what kind of hardware architecture it had. A friend of mine managed to get behind the Basic interpreter and got a feeling that it was actually a 32 bit system (HP’s “bigger” HP3000 machines were still 16 bit systems at that time). Anyway, this was an example of “closed”.

In the 80’s many (if not most) computers were “proprietary” and “closed”. They were running some operating system or BIOS that was not available for any other hardware. This was true of the first IBM PC too. The IBM PC was MS-DOS based and Microsoft licensed DOS to some other computer manufacturers such as HP. However IBM’s BIOS was “proprietary”. During the mid 80’s I used an HP150 microcomputer that was also MS-DOS based. However it had its own (again “proprietary”) firmware that was called AGIOS (or something like that). Both the HP150 and the IBM PC were running the same MS-DOS, which was character oriented. For this reason all applications had to use some firmware functionality to be able to do anything useful such as graphics. So “IBM compatibility” got a new meaning. IBM compatible applications could be used only on the IBM PC because they required the services of the “proprietary” BIOS that IBM refused to license to anybody else. Fortunately Phoenix managed to reverse engineer the IBM BIOS and companies like Compaq and HP were able to produce their IBM compatible clones. After that the PC was “open” again.

Then Windows was invented by Microsoft (btw, Windows 1.0 was originally shipped as a co-product of the Microsoft mouse). Apple also released the original Macintosh. Both products were “closed” again. MacOS was only available for Macintosh computers made by Apple. Windows was only available for MS-DOS. And so on…

Soon after that Unix started to gain more and more attention. It was considered “open” because AT&T licensed it to every computer manufacturer who wanted to license it. However every Unix vendor started to make their own extensions to it and soon the situation was approaching “closed” and “proprietary” again. Each vendor had their own windowing system running on top of X11 so GUI software written for one Unix variant was unusable under the others.

So several Unix manufacturers joined together in the Open Software Foundation (OSF) and they developed the Motif GUI toolkit based on HP’s design. They even started to develop a common Unix version called OSF1 but in the end only Digital used it in their Alpha machines (now it’s called Tru64Unix). Sun also stayed outside OSF because they had developed yet another GUI toolkit that was incompatible with Motif (and better IMHO). And the story is getting too boring and I have to stop…

The “open” in Open Sound System comes from the above context. It means that the software is fully documented and any company can license it (or create their own implementation based on the specification). In fact this has happened in Linux and FreeBSD that both have their own implementations of the OSS API. In addition some companies like Sun and SCO have already licensed OSS for their operating systems.

What happened to MIDI/sequencer?

April 8th, 2007

Once again we need to go back about 15 years in history.

I was very enthusiastic about the Yamaha OPL(2) FM synthesizer chip found on my SB card when I started working on the Sound Blaster driver for Minix and Linux. I soon realized that it was not possible to maintain precise timing at application level because both Minix and Linux (at that time) had rather bad scheduling latencies. An application that wrote the FM chip registers directly (using some ioctl calls) couldn’t play any MIDI files with acceptable timing precision.

The solution was to move the timing stuff to kernel space where the (100 Hz) timer interrupts worked much better. So the /dev/sequencer interface was born. In the beginning this interface was only used to drive the FM chip. Then a few months later the Gravis Ultrasound card was released, which was even more exciting. I quickly hacked the /dev/sequencer API to support the hardware wave table engine of the Ultrasound and added yet another set of ioctl calls for patch management. In addition the API was expanded to support MIDI (serial) interface ports too. Support for the better OPL3 FM chip (4 operators instead of just two) was also added.

From the very beginning the sequencer API had an event queue where the application could write 4 byte event records such as instrument change, note on and note off. The event queue was read by a timer callback that forwarded the events to the actual device and removed them from the queue. Timing was “great” and the result sounded good. I even did a “gmod” program (or whatever it was called) that played module files (.MOD) using the Ultrasound’s wave table engine. That was a great improvement because my mighty 386/25 PC was able to play nice music without a hiccup while compiling the kernel in another VT. A similar player was soon implemented for MIDI by Greg Lee.

The problem was that now there were 3-4 slightly different variations of the sequencer API (OPL2, OPL3, Gravis and MIDI ports) that all required slightly different handling in the applications. The solution was to create another version of the sequencer API (called /dev/sequencer2 and later /dev/music). The idea was to expand the event records to 8 bytes, which was enough to distribute MIDI messages to several synthesizer devices and MIDI ports at the same time. The MIDI port driver then converted the events back to MIDI while the hardware synth drivers played them directly. I also added support for the Music Quest MQX-32M MIDI adapter that supported “MPU-401 intelligent mode” with features such as SMPTE timing.

The result looked and sounded great and I was very proud of it. Applications still had to support two different “patch loading” methods for different cards but that was solved by adding an interface for external patch management daemons. Finally it was possible for one program to play MIDI files to any device(s) without any device dependent code.
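
To make the event queue idea concrete, here is a minimal Python sketch of it. It is a toy model under loudly stated assumptions, not the real /dev/sequencer code: the opcodes (EV_WAIT, EV_NOTE_ON, EV_NOTE_OFF), the 4 byte layout and the EventQueue class are all made up for illustration. The only point is that the application writes fixed size records and a periodic timer callback drains them at the right tick.

```python
import struct
from collections import deque

# Hypothetical opcodes and record layout (opcode, channel, p1, p2).
# These are illustration only, NOT the real /dev/sequencer event format.
EV_WAIT, EV_NOTE_ON, EV_NOTE_OFF = 1, 2, 3


def pack_event(opcode, channel, p1, p2):
    """Build one fixed size 4 byte event record."""
    return struct.pack("4B", opcode, channel, p1, p2)


class EventQueue:
    def __init__(self):
        self.queue = deque()
        self.wait_ticks = 0
        self.played = []  # stands in for the synth hardware

    def write(self, record):
        """What the application does: append a record to the queue."""
        if len(record) != 4:
            raise ValueError("event records are exactly 4 bytes")
        self.queue.append(record)

    def timer_tick(self):
        """What the timer callback does on every tick (100 Hz in the text)."""
        if self.wait_ticks > 0:
            self.wait_ticks -= 1
            if self.wait_ticks > 0:
                return  # still waiting
        while self.queue:
            opcode, channel, p1, p2 = struct.unpack("4B", self.queue.popleft())
            if opcode == EV_WAIT:
                self.wait_ticks = p1  # pause the queue for p1 ticks
                return
            self.played.append((opcode, channel, p1, p2))
```

An application would write a note on, a wait of N ticks and a note off, and then simply let the timer do the rest. The timing precision comes from the tick, not from the scheduling of the application.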

The real problems started a few months after that when I tried to document the sequencer/music interface. The concept itself was easy to document. However there were tens of different event types. Each MIDI message (note on, note off, MTC/SMPTE, etc) had one of them. The MIDI message system itself was documented in the official MIDI 1.0 specification published by MMA. However I would have had to duplicate all that to explain how to translate MIDI messages to the sequencer events and vice versa. Due to lack of time I gave up but promised to write the documentation later.

Then a few years later I was writing another version of the OSS API documentation for OSS 3.8 and again the job stalled at the same point. I got something written down but then I had to give up. Once again I just promised to complete the documentation later. Unfortunately developers couldn’t do anything useful with the interface without documentation so it didn’t get widely supported.

Finally a couple of years ago I was working on the OSS 4.0 version and noticed that no applications were actually using the sequencer API any more. All Linux MIDI applications were already using ALSA so the OSS sequencer API was no longer active. There was no point in keeping the current sequencer stuff in OSS and we decided to remove it and rewrite it from scratch. At the same time we could fix the known problems in it.

But what should actually be done with that stuff? The main problem was apparently the documentation and we were thinking about hiring a technical writer to do that part. But there was something else that should have been redesigned before that. But what should be changed and how?

There were some major problems that were pointed out to us by some MIDI developers before they jumped on the ALSA boat:

  • Only one application was able to use the sequencer at the same time. That made it impossible to implement applications like software synthesizers.
  • The API was said to be “playback only”. It was not suitable for interactive performances because echoing live keyboard events to the output devices caused hanging notes. Btw, the reason for this was actually not the API design but a bug in the parser that translates incoming MIDI messages to OSS sequencer events. It didn’t handle running status correctly. We noticed this problem later when the same code was used somewhere else in OSS.
  • Because there is only one timing queue it’s possible that a flood of events going to one device may delay events sent to the other (non-flooded) ones.
  • Some software based synthesizer engines (such as the SoftOSS virtual wave table engine of OSS) have greater latencies than many hardware based implementations (such as FM and wave table chips). It is necessary to compensate for this by sending events to such devices slightly ahead of the scheduled time. In this way the notes will play simultaneously on all devices.
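
The running status bug in the second item is easy to show with a small Python sketch. This is not the OSS parser, just an illustration of what handling running status correctly means according to the MIDI 1.0 rules: a data byte that arrives where a status byte is expected reuses the previous channel status, real time bytes (0xF8 and above) can appear anywhere without disturbing it, and system common/sysex bytes (0xF0 to 0xF7) cancel it. The function name and the simplification of ignoring system message contents are my assumptions.

```python
# Number of data bytes that follow each channel message status (high nibble).
DATA_LEN = {0x80: 2, 0x90: 2, 0xA0: 2, 0xB0: 2, 0xC0: 1, 0xD0: 1, 0xE0: 2}


def parse_channel_messages(stream):
    """Split a raw MIDI byte stream into complete channel messages.

    Simplified sketch: system messages are skipped, only channel
    messages are returned as tuples (status, data1[, data2]).
    """
    messages = []
    status = None
    data = []
    for byte in stream:
        if byte >= 0xF8:
            continue  # real time byte: may appear anywhere, keeps running status
        if byte >= 0xF0:
            status, data = None, []  # system common/sysex cancels running status
            continue
        if byte & 0x80:
            status, data = byte, []  # new channel status byte
            continue
        if status is None:
            continue  # stray data byte with no status to attach to
        data.append(byte)
        if len(data) == DATA_LEN[status & 0xF0]:
            messages.append((status, *data))
            data = []  # keep the status: the next data bytes reuse it
    return messages
```

Many keyboards send note off as a note on with velocity 0 under running status. A parser that forgets the status after the first message throws those bytes away, and the result is exactly the kind of hanging note described above.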

These problems should have been easy to fix while rewriting the code. In addition all the supported synthesizer chips were out of production (they were ISA based) so the amount of code to rewrite was not that big. However something looked to be seriously wrong and I couldn’t figure out what it was.

Finally on one stormy night I realized it: The problem was not just the documentation. The whole /dev/sequencer concept was seriously bogus. It looked good some ten years earlier when it was designed. However the problems the previous scheme tried to solve were all gone with the old ISA cards. First of all the MIDI streams were first converted to the sequencer events which were then converted back to MIDI before sending them to the output device. And there was an even bigger issue.

When somebody is writing a new MIDI application he/she is (at least supposed to be) familiar with the MIDI 1.0 specification. In addition to the official spec MIDI is also documented in many books and on many internet sites. The sequencer concept requires that the developer knows how to map the MIDI messages to the sequencer events and vice versa. This makes the development process rather difficult if the application should support some advanced MIDI concepts such as MMC. In particular the sequencer scheme makes it difficult to port MIDI applications from other environments that use APIs derived from plain MIDI. So it didn’t make any sense to rewrite the sequencer API hoping that somebody will manage to document it in the future.

Instead what we are working on now is a different MIDI subsystem based on plain MIDI message streams with some timing headers. The application simply packs all MIDI messages to be sent at given timing tick to a packet. The packet has a header record that contains the transmission time and some other control information. The MIDI core then appends the records to the output queue of the target device and finally sends them to the device at the right moment. Input is handled in the same way.
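
As a rough illustration of the packet idea, here is a Python sketch. The header layout used here (a 32 bit tick, a 16 bit payload length and 16 bits of flags, little endian) is purely my assumption for the example and is not the actual OSS format, and pack_packet/unpack_packet are made up names. What matters is that the payload is plain MIDI bytes, so nothing needs to be translated to some other event format and back.

```python
import struct

# Hypothetical header: transmission tick, payload length, flags.
# This layout is an assumption for illustration, not the OSS wire format.
HEADER = struct.Struct("<IHH")


def pack_packet(tick, messages, flags=0):
    """Pack all MIDI messages to be sent at `tick` into one packet."""
    payload = b"".join(bytes(message) for message in messages)
    return HEADER.pack(tick, len(payload), flags) + payload


def unpack_packet(packet):
    """Return (tick, flags, payload) where payload is raw MIDI bytes."""
    tick, length, flags = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    return tick, flags, payload


# A two note chord (C and E on channel 1) to be played at tick 480.
chord = pack_packet(480, [(0x90, 60, 100), (0x90, 64, 100)])
```

The application side stays trivial: collect the messages for one tick, prepend the header, write the packet to the device.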

There is no central /dev/sequencer device file that would be common to all MIDI/synth devices. Instead each target/source device has its own /dev/midi# device. An application can open as many MIDI device files as it wants and slave them to the same timer device (it’s also possible for each MIDI device to use a different timer with independent tempo/etc settings if that makes any sense).

It is possible to handle playback of live keyboard input by inserting received MIDI bytes at the head of the queues of the output devices. Such live packets will be played as soon as the target device has played all the MIDI bytes from the currently active (incompletely played) packet. In most situations there are no packets currently playing (the device is waiting for its scheduled time) so the delays will be minimal. In the future it’s also possible to implement automatic kernel level echo and rerouting capability on top of this simple API.
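
The queue handling described above can be sketched like this in Python. The OutputQueue class and its method names are hypothetical, and the model works on whole packets only, so the currently active packet can never be split by a live one. One detail is my own choice and not from the design: several live packets are kept in their arrival order, ahead of all the scheduled ones.

```python
from collections import deque


class OutputQueue:
    """Toy model of one output device's packet queue (names are made up)."""

    def __init__(self):
        # Each entry is (tick, payload). Live packets use tick None.
        self.pending = deque()

    def schedule(self, tick, payload):
        """Normal sequenced output: append in time order."""
        self.pending.append((tick, payload))

    def insert_live(self, payload):
        """Live input: goes before all scheduled packets,
        but after any live packets that are already waiting."""
        index = 0
        while index < len(self.pending) and self.pending[index][0] is None:
            index += 1
        self.pending.insert(index, (None, payload))

    def next_packet(self, now):
        """Return the payload to send at time `now`, or None if nothing is due."""
        if not self.pending:
            return None
        tick, payload = self.pending[0]
        if tick is None or tick <= now:
            self.pending.popleft()
            return payload
        return None
```

When the device is idle and waiting for tick 100, a key press that arrives at tick 0 goes out right away instead of waiting behind the scheduled packet.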

Running status is handled so that the application “loses” or flushes the status at packet boundaries. Each packet will be started with a full status byte even if the previous message had the same status byte. This doesn’t cause any performance problems. Running status is only needed when a large number of MIDI events (say pitch bend changes) need to be sent rapidly to the device. A packet boundary normally means that there is a pause in the output until the next bytes need to be sent so retiring the running status doesn’t cause any timing problems. Handling running status in this way makes it possible to insert live packets at the head of the queue. The application just needs to make sure that the MIDI channel used for echo messages is not used by the sequence that is currently playing.
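
In code the rule is very small. The following Python sketch (encode_packet is a made up name, and it only handles channel messages) uses running status inside one packet but always starts a packet with a full status byte:

```python
def encode_packet(messages):
    """Encode channel messages into one packet payload.

    Running status is used inside the packet only. The remembered status
    starts empty for every packet, so the first message always carries
    its full status byte.
    """
    out = bytearray()
    last_status = None  # flushed at every packet boundary
    for status, *data in messages:
        if status != last_status:
            out.append(status)
            last_status = status
        out.extend(data)
    return bytes(out)


# A burst of pitch bend changes in one packet: the status is sent once.
burst = encode_packet([(0xE0, 0, 64), (0xE0, 10, 64), (0xE0, 20, 64)])
```

Because every packet is decodable on its own, a live packet can be slipped in between two scheduled packets without the receiver misreading the bytes that follow it.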

Support for software based synthesizers (wave table, modelling, etc) is handled by a special MIDI loopback driver. Each loopback device pair has two ends. The “server” device is used by the synthesizer application. It can change the device name seen by the “client” side to something descriptive such as “ACME super hyper modelling synth”. The user of the client application (say a MIDI player/sequencer) can pick the modelling synth from the device list and then start playing the MIDI sequence using it (input is also possible if that makes any sense). The server side can report its delay level (in microseconds) to the MIDI core so that timed events can be sent to it slightly ahead of time. In this way any latencies caused by the audio side can be compensated.
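
The delay compensation is plain arithmetic: an event that should sound at time T is handed to a device with reported delay D at time T - D. A tiny Python sketch, where the function name, the device names and the delay figures are all invented for the example:

```python
def dispatch_times(scheduled_us, device_delays_us):
    """For each device, return when (in microseconds) an event must be handed
    over so that it sounds at scheduled_us on every device."""
    return {
        name: max(0, scheduled_us - delay)  # never schedule before time zero
        for name, delay in device_delays_us.items()
    }


# Hypothetical delay figures, for illustration only.
delays = {"hardware_fm": 0, "soft_synth": 20_000}
plan = dispatch_times(1_000_000, delays)
```

With these made up numbers the soft synth gets its event 20000 microseconds before the FM chip does, and both notes come out at the same moment.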

This is the good news. The bad news is that there is no MIDI support in OSS 4.0. There are a few bugs in the MIDI code and we decided to ship OSS 4.0 without it (instead of delaying the release of the other parts that were ready). MIDI support will be included in OSS 4.1 (hopefully within this year).

What about ALSA which has superior MIDI/sequencer API?

ALSA’s sequencer API is a ‘clone’ of the OSS one on steroids. While they have fixed the rest of the flaws of OSS the design is fundamentally the same, including the fundamental problem I mentioned above. It will be interesting to see if they ever manage to get it fully documented (or if anybody will ever care to document any of it).

In addition ALSA has some “advanced” bells and whistles that make me suspicious. However this is just my understanding and I cannot check the details because there is no documentation available.

  • Their timing model is rather odd. I have to confess that I have not managed to understand how it works. However they seem to use something other than the traditional tempo/timebase style MIDI timing.
  • ALSA has a sophisticated MIDI rerouting (alsaconnect) capability that is supposed to be able to route MIDI input from any device/application to the input of any MIDI output device or application. However in practice applications seem to bind themselves to some hardcoded devices, which prevents this scheme from working.

OSS is dead. Long live OSS!

April 8th, 2007

The question I am often asked is: “OSS is deprecated so why are you still developing and maintaining it?”.

This is a short and simple question. However the answer is not short or simple. First of all we need to return some 10 years back in time.

Once upon a time in Linux there was a tiny sound subsystem called VoxWare (formerly known as the Linux Sound Driver). It was maintained by me and released under GPL for Linux (and under the BSD license for FreeBSD and some other Unix variants). That piece of code was included in the Linux kernel source tree. I was working on the code “just for fun”. However it became too difficult to work on the sound stuff in my spare time while working on some Windows projects for my living. I was contacted by 4Front Technologies and we decided to make our living with a commercial version of OSS.

Unfortunately it took too long to find the proper procedure to support the GPL/BSD version and the commercial one from the same source tree. So a well known Linux distribution vendor got irritated and hired another person to create another version of OSS for them (without even asking me to do that). The result was rather different from my plans for the future so I had to quit as the maintainer of sound for Linux.

Since that moment the kernel (OSS/Free) and the commercial OSS versions have been maintained by different teams. Unfortunately the Linux kernel version of the API got frozen to the OSS 3.8 version while we continued the development of the official API. In addition the OSS/Free version was (unfortunately) restructured so that most of the common (device independent) code was duplicated in the individual low level drivers. This made it impossible to keep the kernel drivers up to date with the development made to the official OSS version. The result was that the kernel drivers got frozen to the 3.8 version forever, unfortunately.

Then a couple of years later a group of fearless programmers created an entirely different, incompatible and Linux-only sound API called ALSA. They pushed it to the Linux kernel tree and the old OSS/Free version was declared “deprecated”. ALSA was supposed to become more advanced than OSS/Free 3.8. It was released under GPL (only) so it seemed to be the right thing. However application programmers didn’t like the ALSA API and continued to use OSS instead. It was necessary to declare OSS “deprecated” to push application developers to support ALSA instead of OSS.

However even that was not enough. Application developers still preferred OSS. This was bad for ALSA because they had to provide OSS emulation. In addition the kernel level OSS emulation bypassed some features (such as dmix) that ALSA has implemented at library level. So the OSS emulation was later implemented at library level. However providing OSS emulation in ALSA caused some side effects. Developers of audio applications still refused to convert to ALSA because the OSS API was still available. So an even more aggressive policy was needed.

So far the pro-ALSA Borgs have managed to get Linux distributions to compile most audio enabled applications with just the ALSA plugins enabled (all OSS support is stripped). In some cases the distributions even try to prevent users from removing ALSA and installing OSS by keeping ALSA’s mixer interface busy (the Gnome/GTK mixer applet is immediately relaunched if it gets killed). Or the kernel may have been modified to keep parts of the kernel’s sound core included even if sound support is completely disabled in the kernel’s configuration. “We are the ALSA project. Your system will be assimilated. Resistance is futile”. Has anybody ever heard about “freedom of choice”?

ALSA was officially included in the Linux 2.6.0 kernel that was released more than 3 years ago (December 2003). If ALSA is as great as they claim then shouldn’t it have completely replaced OSS in all applications during that time? Apparently that has not happened so far. Will it happen during the next three years? I don’t think so.

There is a relatively small community of ALSA believers who have written most of the currently available ALSA applications (usually called ALSA this or ALSA that). Older applications still support OSS in addition to ALSA. Some newer ones are ALSA-only because their developers have been told that the OSS API will disappear tomorrow. However the ALSA API is still almost completely undocumented (three years after its release) so how can anybody expect that programmers could develop good applications based on it?

A funny detail is that even some key developers of ALSA now suggest that developers use the Jack API instead of alsa-lib (btw, Jack has a fully functional OSS plugin). Somehow this is starting to smell like the Emperor’s New Clothes.

Back to the subject. The latest Linux 2.6.20 kernel still has the old and obsolete 10+ years old OSS version included. It’s being killed (for a very good reason). However it looks like we are getting a very long funeral. ALSA too has OSS emulation. In fact there are two redundant versions of it: one in the kernel and another implemented in library level. Both of them emulate only the now obsolete 3.8 API version. This is the dead and deprecated OSS.

However this is not the only OSS. We at 4Front have continued working on Open Sound System for all the past years. It has become the real Common Unix and Linux Sound Solution (CULSS). In addition to Linux it’s now the official sound subsystem for all the Unix variants (other than MacOS). However for many Linux diehards it’s not an alternative because:

  • It’s not GPLed (yet). Instead it’s a commercial product by some evil capitalist pigs.
  • It’s not in the Linux kernel source tree so it doesn’t exist.
  • It’s being used also by the public enemies of Linux.
  • It’s “binary only”.

For the above reasons the benefits of OSS are widely ignored:

  • It’s based on the widely known Unix/POSIX/Linux device model.
  • It’s fully documented (OTOH some parts of the documentation are still under construction).
  • The API is simple and compact which makes it very easy to use for programmers.
  • It has been there for 15 years so practically all applications already support it.
  • It’s kernel only.
  • It’s designed to work under general purpose operating systems such as Linux and Unix. There is no need to use any special real time enabled kernels (they can be used but it’s not a requirement).
  • The limitations and “idiosyncrasies” referred to by ALSA’s marketing propaganda were fixed years ago.
  • Fully dynamic minor/major device number allocation which permits unlimited number of audio/MIDI/mixer devices.
  • New device naming that makes applications immune to changes in the device configuration (installing and removing devices).
  • Transparent virtual mixing that makes it possible for any number of applications to share the same physical audio device(s). This also works for recording and full duplex.
  • Powerful device enumeration support.
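
To give a rough idea of what mixing several application streams into one device stream involves, here is a minimal sketch in C. It is not the OSS virtual mixer itself: the names `mix_streams` and `clip16` are made up for this illustration, and it assumes that all streams already use 16-bit signed samples at the same rate and channel count.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32-bit sum back into the 16-bit sample range (saturation). */
static int16_t clip16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Hypothetical helper: add nstreams input streams sample by sample
 * into out. Each stream must hold at least `samples` samples. */
void mix_streams(const int16_t *const *streams, size_t nstreams,
                 int16_t *out, size_t samples)
{
    for (size_t i = 0; i < samples; i++) {
        int32_t sum = 0;                 /* wide accumulator avoids overflow */
        for (size_t s = 0; s < nstreams; s++)
            sum += streams[s][i];
        out[i] = clip16(sum);
    }
}
```

When this happens inside the sound subsystem, each application simply writes its own stream and never sees the other ones.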

Then we have ALSA which is:

  • Not documented. Use the Source, Luke!
  • The API is not compatible/similar with anything else (past, present or future).
  • Very thin device abstraction.
  • The API is designed for low/zero latency which makes it very challenging to use in normal applications that don’t have any latency requirements.
  • Requires redundant library layers in addition to the kernel space code (alsa-lib, Jack). This causes increased memory requirements in embedded systems.
  • Has an enormous number of functions (1500+ a couple of years ago). The majority of the calls have never been used by any application (and many applications use functions that no other application uses). The massive number of unnecessary library functions increases the memory footprint even further. And what about the CPU consumption? And will anybody ever be able to document (or even test) all of them?
  • There are multiple (redundant) transfer methods for audio. How does the programmer know which one should be used with the given hardware?
  • Some devices use interleaved channels (for stereo and multichannel) while some others use non-interleaved.
  • Static minor number assignment that wastes the available device/card space. The number of cards, devices and subdevices possible in the system is limited.
  • Strange configuration file mechanism that requires a degree in LISP programming to understand.
  • Sharing of devices is based on the dmix feature that nobody but experts can configure properly.
  • The API is based on callbacks which requires deep programming knowledge from the developers. Gotos have been considered harmful for decades. Callbacks are even worse (in fact they are a re-incarnation of the famous come-from statement).
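
The interleaved versus non-interleaved distinction in the list above is a difference in memory layout only. The following minimal sketch shows the two layouts and a conversion between them. The helper names are hypothetical and not part of any sound API, and 16-bit samples are assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Interleaved stereo:      L0 R0 L1 R1 L2 R2 ...
 * Non-interleaved stereo:  L0 L1 L2 ... R0 R1 R2 ...
 *
 * Hypothetical helper: convert interleaved frames to the
 * non-interleaved layout. Both buffers hold frames * channels samples. */
void deinterleave(const int16_t *in, int16_t *out,
                  size_t frames, size_t channels)
{
    for (size_t f = 0; f < frames; f++)
        for (size_t c = 0; c < channels; c++)
            out[c * frames + f] = in[f * channels + c];
}

/* The inverse conversion, back to interleaved frames. */
void interleave(const int16_t *in, int16_t *out,
                size_t frames, size_t channels)
{
    for (size_t f = 0; f < frames; f++)
        for (size_t c = 0; c < channels; c++)
            out[f * channels + c] = in[c * frames + f];
}
```

An application that has to support both kinds of device needs code like this on every transfer path, which is the burden the list item complains about.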

So which one should be declared as deprecated? As we are talking about APIs, the right authority to make the decision is the application developers. They have their “freedom of choice”.

Actually it’s not nice to compare OSS against ALSA in this way so I won’t continue any further. However they have done the same for years (see ALSA’s web page (before they remove that stuff)). So I couldn’t resist. At least we have given them three years of time to discover and fix the above problems but nothing seems to have happened. And I didn’t even mention MIDI yet. Maybe I should do it next…

Regards,

Hannu

Why OSS is OSS?

April 5th, 2007

Hi again guys!

Over the past years OSS has been criticized because it doesn’t support this and that feature of this and that sound card. Instead it limits applications to some common subset of features found on every sound card. In general this is true. However is it actually a problem?

Using sound in applications is in many ways similar to using networking. For example a web browser simply tells the networking software to connect to a given TCP port of a given server (name/address). Networking software then builds the connection and that’s it. Web browsers don’t try to find out which network interface card is installed in the system. They don’t try to find out if the device has some hardware parameters that could be tweaked to get better performance. In fact the networking software doesn’t even let them do that (such changes would ruin the performance of the other applications using the same NIC). Instead the NIC settings are managed by the kernel’s network core/drivers which have better knowledge about the local network.

This is known as the “black box” model. A web browser application sees only the networking subsystem box. There may be some control switches on the front panel of the box. However they are related to the current stream/socket. The actual NIC devices are located on the back side of the box. There may also be some control switches for the devices but the application cannot see them. Instead they can be used by the system operator (if necessary).

The OSS API is based on this black box approach too. The application doesn’t need to worry about the device parameters at all. Instead it just tells the OSS box to create an audio stream with the given parameters (sampling rate, number of bits and number of channels). After that the application programmer can focus on doing his/her job. OSS will automatically perform the required conversions in software if the device doesn’t support the requested format itself. When the stream is running the programmer can use the usual POSIX/Unix/Linux system calls such as read(), write() and select()/poll() to feed more data.
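
The stream setup described above can be sketched in a few lines of C. This is a minimal illustration, not a complete player. The ioctl names and `AFMT_S16_NE` come from `<sys/soundcard.h>`; the helper names `open_stream`, `fill_square` and `play_samples` are made up for this example, and the device path (conventionally something like `/dev/dsp`) is an assumption that depends on the system.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/soundcard.h>
#include <unistd.h>

/* Hypothetical helper: open a playback stream and request its format.
 * Returns a file descriptor, or -1 if the device cannot be used. */
int open_stream(const char *path, int channels, int rate)
{
    int fmt = AFMT_S16_NE;          /* 16-bit signed, native byte order */
    int fd = open(path, O_WRONLY);

    if (fd == -1)
        return -1;

    /* Each ioctl writes the value actually selected back into its
     * argument; a careful program would compare it with the request. */
    if (ioctl(fd, SNDCTL_DSP_CHANNELS, &channels) == -1 ||
        ioctl(fd, SNDCTL_DSP_SETFMT, &fmt) == -1 ||
        ioctl(fd, SNDCTL_DSP_SPEED, &rate) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Fill buf with a square wave that flips sign every half_period samples. */
void fill_square(int16_t *buf, size_t samples, size_t half_period,
                 int16_t amp)
{
    for (size_t i = 0; i < samples; i++)
        buf[i] = ((i / half_period) % 2 == 0) ? amp : (int16_t)-amp;
}

/* Feed samples with plain blocking write() calls: no callbacks needed. */
int play_samples(int fd, const int16_t *buf, size_t samples)
{
    const char *p = (const char *)buf;
    size_t left = samples * sizeof *buf;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n <= 0)
            return -1;              /* error, or nothing could be written */
        p += n;
        left -= (size_t)n;
    }
    return 0;
}
```

A program would call `open_stream()` once, then alternate between producing samples and calling `play_samples()` in an ordinary loop. Nothing in the code asks which sound card is installed.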

The black box model makes OSS audio applications very simple and robust. About 95% of applications don’t need to use any advanced techniques to get audio working (unfortunately this seems to be very difficult for developers to accept). Practically all audio applications I have examined do completely unnecessary things that don’t make anything better. In fact in many cases such applications will simply break in systems that have slightly different sound cards (usually some of the high end professional ones).

Of course there are special applications that need to be fully aware of the hardware details. For example it doesn’t make any sense to send encoded digital bit streams (such as AC3) to an ordinary analog output device that cannot decode them. Or an audio recorder/editor application probably should prevent the user from recording a 24bit/192kHz/5.1 file from a (modem) device that supports only 8bits/8kHz/mono. However such applications are rare (about 5% of all applications doing audio). They are usually audio oriented and designed by programmers with good audio knowledge.

The majority of all applications doing audio require just a very limited number of features from the sound subsystem. Applications that require anything more are rather rare. For this reason the OSS API has been divided into a small subset of fundamental core functions that are easy to use and a wider set of less frequently used functions. This makes it very easy to learn and master OSS. At the same time programmers looking for a challenge can get it.

The above was the first half of my “mission”. The second half is that I was introduced to Unix in 1984. I have always liked its powerful capability to do mighty things with just a small set of carefully selected features. I have also done some MS-DOS (since the introduction of the HP150 microcomputer) and Windows programming during my life before Linux but I never liked that.

I think the above explains why OSS became what it is now. There are some fundamental rules I have tried to follow as much as possible:

o Use the familiar Unix/POSIX/Linux device/file API as much as possible. There are millions of programmers who already know how to use this interface. Developing an entirely new interface that is not compatible with anything earlier will just cause massive confusion and make it difficult to get programs working with the good old legacy API.

o Keep the number of features as small as possible. In this way the API can (hopefully) be properly tested and documented. A compact API is also easier for application developers to understand. Add new API features only if it’s absolutely necessary. In this way OSS currently has something like 200 ioctl functions. Even this may be too much but fortunately most applications need just a handful of them. There are good chances that we can actually document and test all these functions within a reasonable amount of time. Compare this to some competing API that has some 1500+ (and counting) functions. I would be very surprised if all of them ever get documented (or even used by any applications).

o Don’t require the application programmers to use nasty programming techniques such as callbacks or mutexes/semaphores. Instead let the programs work in a fully linear manner. Callbacks and semaphores are features that belong to the kernel code, not to the applications.

o Use the kernel’s device file interface (open, close, read, write, ioctl, select/poll) instead of some library interface. This interface uses very weak binding which makes it possible to use OSS enabled applications in systems that don’t have OSS installed (of course sound doesn’t work but the other features of the application can still be used). Library based applications in turn don’t even start if the required sound library is not installed (unless the program uses nasty dynamic linking to load some sound plugin).

o Let applications do only the things they should do. For example try to prevent ordinary audio applications from changing the global output volume (that would disturb all the concurrently running applications). Instead provide a way to change just their own output level.

o Provide full backward and forward compatibility of the API. Applications developed for any earlier version of the OSS API should run without recompiling under the very latest OSS version. Equally, applications developed and compiled for the latest OSS version should run under any earlier OSS version. This is possible if the application designer uses the suggested default actions when the new features are not available in the older system (ioctl returns errno=EINVAL).

o Keep the API endianness neutral. Applications should work without modifications whether they are compiled under a big endian or a little endian architecture.

o Make the API 64/32 bit neutral. Applications compiled using a 32 bit compiler should run without any problems under a 64 bit compiled kernel.
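
Two of the rules above, the weak binding of the device file interface and the EINVAL fallback for missing features, can be sketched as follows. This is only an illustration under stated assumptions: the helper names (`audio_open_optional`, `feature_missing`, `ioctl_or_default`) are hypothetical and not part of OSS, and the only failure treated as "feature not available" is the errno=EINVAL case named in the text.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>

/* Weak binding: try to open the audio device. A result of -1 only
 * means "run without sound"; the program itself still starts. */
int audio_open_optional(const char *path)
{
    return open(path, O_WRONLY);
}

/* Did an ioctl fail just because this OSS version lacks the feature? */
int feature_missing(int err)
{
    return err == EINVAL;
}

/* Hypothetical helper: issue an optional ioctl with an int argument.
 * Returns 1 if the driver handled the call, 0 if the feature is
 * missing and *value was set to the suggested default, and -1 on any
 * other error (in which case *value is left untouched). */
int ioctl_or_default(int fd, unsigned long request, int *value,
                     int fallback)
{
    if (ioctl(fd, request, value) != -1)
        return 1;
    if (feature_missing(errno)) {
        *value = fallback;
        return 0;
    }
    return -1;
}
```

An application written this way keeps working when a newer call is unknown to an older OSS version, and it still runs (silently) on a system without any sound device at all.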

That is pretty much all. Comments are welcome and I will try to answer them as well as I can. You may wonder why I have only talked about audio but not MIDI. MIDI is an entirely different beast and I will discuss it later.

Blogging starting up - please wait..

March 22nd, 2007

Why am I blogging?

The reason is that I started working on OSS almost exactly 15 years ago. I had got a Sound Blaster 1.5 card a few months earlier and decided to write a Minix driver for it. I knew that there was something called Linux (I had seen the famous message by Linus on comp.os.minix and I was studying at the University of Helsinki at the same time as him). However it took several months before I finally got Linux to work with my SCSI disk (due to some IRQ assignment problems).

So I got the Minix driver working after a few weeks of hacking. For obvious reasons its performance was not acceptable and playback was just clicking. Later in the summer I had got Linux installed and the driver converted for it. The initial version of the Sound Blaster driver for Linux was released at the end of August 1992. This was the beginning of my never ending odyssey with Linux/Unix sound.

Why sound and why Linux/Unix?

The story started about 15 years earlier, in the late 70’s. I heard a radio documentary about computer and electronic music. I was very impressed by the compositions made with Music V. I immediately knew it was something I would like to do. Then more than ten years later I saw a sound card made by a small Singapore based company called Creative Labs. I bought the card immediately and the SDK a couple of weeks later. That was in 1990 or was it 1991.

After some hacking under MS-DOS I found out that it was more challenging to get applications to talk to the card than it was to write the application itself. The SDK required use of techniques such as interrupts or callbacks. That was so lame that I had to start looking for some other approach. I had used Unix (HP-UX) since 1984 but the only alternative that worked on a PC at that time was the very expensive Xenix. I had heard about Minix from my friends and decided to try it. Btw, there was some other guy who ordered Minix before me in the same bookstore in Helsinki and I think that guy was Linus (I didn’t know him yet at that time).

Eventually the Linux sound driver got released. Initially it supported only Linux and the 8 bit mono Sound Blaster 1.0/1.5 cards (at that time nothing else was available). Then after a while a stereo card (Sound Blaster Pro) and a 16 bit stereo card (Pro Audio Spectrum 16) were introduced. A bit later came the Gravis Ultrasound.

About the same time I found out that there were a few other PC Unix operating systems such as 386BSD (or was it BSD386), SCO Xenix386 and Novell Unixware. It was natural to expand the Sound Blaster only Linux driver to support multiple sound card architectures and operating systems. The driver got renamed to VoxWare (I didn’t know that some startup company (that was later acquired by Netscape) had registered the same name).

In autumn 1995 I was contacted by Dev Mazumdar who had made a SB Pro (MCA) driver for AIX. Soon after that we decided to join forces and I became an employee of Dev’s 4Front Technologies. Our initial plan was to port VoxWare to AIX and start selling the product for AIX and the other commercial Unix versions. Part of the plan was to continue supporting OSS as an open source project for Linux and BSD. However our first announcement was actually for Linux. We released USS (Unix Sound System) for Linux during summer 1996. Unfortunately that name irritated the owners of the Unix trademark who suggested that we call it Open Sound System instead. So OSS was born.

Now 15 years later we have finally released OSS 4.0 which is the biggest milestone in the history of the product. That means 10 years of work since the OSS/Free drivers that are (still) included in the kernel source tree were frozen. The majority of the API changes have been developed during the past 5 years. It has taken 2.5 years to rewrite the core functionality of OSS to be compatible with the latest interfaces provided by different operating systems.

OSS 4.0 is very close to the original idea I had when I started working on sound drivers. It just took 15 years to get it done. Most of this time was spent on implementing drivers for dozens of sound cards and many different operating systems. However a significant amount of time has also been spent on working with the OSS application developers and on finding ways to make their applications work even better. But now it’s finished and OSS 4.0 is what it is. More about that later…