<Sound Dev> New thoughts on Java Sound

Florian Bomers javasound-dev at bome.com
Mon Oct 26 13:23:08 UTC 2015


Hi Bob,
thanks for the updated subject.

There is the general problem of "standardized features" vs. "query at
runtime which features are available".

Like most audio APIs, Java Sound follows the latter.

As you know, at runtime, you can use the various .Info objects to
answer questions 1 to 6 (pretty much). There is no hardcoded set of
features which are promised to be available on every platform. After
all, it is also an issue of availability of the respective hardware.
An implementer should query the available features before assuming
that something works. Or use a hardcoded audio format, but gracefully
handle the case where it does not work.
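
For illustration, here is a minimal sketch of that kind of runtime
query with the standard javax.sound.sampled classes (the output and
the available formats will of course differ per machine):

import javax.sound.sampled.*;

public class ListMixerFormats {
    public static void main(String[] args) {
        // Enumerate all installed mixers and ask each one which
        // SourceDataLine formats it claims to support.
        for (Mixer.Info mixerInfo : AudioSystem.getMixerInfo()) {
            Mixer mixer = AudioSystem.getMixer(mixerInfo);
            System.out.println("Mixer: " + mixerInfo.getName());
            for (Line.Info lineInfo : mixer.getSourceLineInfo()) {
                // Only DataLine.Info carries format information,
                // hence the instanceof juggling mentioned below.
                if (lineInfo instanceof DataLine.Info) {
                    for (AudioFormat fmt : ((DataLine.Info) lineInfo).getFormats()) {
                        System.out.println("  supports: " + fmt);
                    }
                }
            }
        }
    }
}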

I'm not saying it's convenient to juggle the .Info objects and the
many "instanceof" checks. But it does work as specified in the API...

So I don't think I agree with you that we should be able to tell
implementers which formats are always available, because Java Sound
depends on the available hardware, and on the underlying
implementation. Which, yes, might be platform dependent.

Point 7 -- multi-channel audio (e.g. 5.1):
correct, this is completely missing from Java Sound. It's unlikely
you'll get more than 2 channels to work on a mixer.
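
To make that concrete, here is a small sketch that merely probes for a
6-channel output line; on a typical stock mixer it will simply report
false:

import javax.sound.sampled.*;

public class SurroundProbe {
    public static void main(String[] args) {
        // 48 kHz, 16-bit signed PCM, 6 channels (5.1), little-endian.
        AudioFormat surround = new AudioFormat(48000f, 16, 6, true, false);
        DataLine.Info info = new DataLine.Info(SourceDataLine.class, surround);
        // Expect "false" on most desktop mixers today.
        System.out.println("6-channel output supported: "
                + AudioSystem.isLineSupported(info));
    }
}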

Point 8 -- Controls!
Again, this is an area which works as specified, but which is
extremely inconvenient for the implementer -- to the point of being
unusable. You can't blame only Java Sound here -- for example, the
Windows Mixer implementation is one of the most complicated APIs I
have ever worked with. Note: the Java Sound Controls implement what is
called the "Mixer" in Windows (i.e. the "gain" of the various lines).
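
For reference, this is roughly how the Controls API looks from the
application side -- check support before casting, since which controls
exist is entirely implementation dependent (the -6 dB value is just an
arbitrary example):

import javax.sound.sampled.*;

public class GainControlExample {
    public static void main(String[] args) throws LineUnavailableException {
        AudioFormat format = new AudioFormat(44100f, 16, 2, true, false);
        SourceDataLine line = AudioSystem.getSourceDataLine(format);
        line.open(format);
        // Controls are only guaranteed to exist if the line says so.
        if (line.isControlSupported(FloatControl.Type.MASTER_GAIN)) {
            FloatControl gain =
                    (FloatControl) line.getControl(FloatControl.Type.MASTER_GAIN);
            gain.setValue(-6.0f); // attenuate by 6 dB
            System.out.println("Gain is now " + gain.getValue() + " dB");
        } else {
            System.out.println("MASTER_GAIN is not available on this line");
        }
        line.close();
    }
}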

Hacking Coder
The hacking coder is always a problem! However, I agree: the more
complicated an API is, or the less it complies with industry
standards, the more likely a hacking coder will get it wrong.

Use Case
I guess the main problem is the different use cases. For a game or
media playback, for example, you want multi-channel playback where
available, and down-conversion where not.
For a DAW, however, you most likely don't want auto-conversion
(because it reduces quality and increases latency).

New Java Audio API?
I doubt there is enough interest to start a JSR for an entirely new
Java Sound. Although it would be great!

But it is possible to create a new, modern audio API in pure Java on
top of Java Sound that implements your models. It would take care of
the different available mixers and audio formats, and do the
conversions as needed.
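
Such a layer could lean on the format converters Java Sound already
ships. A rough sketch of the fallback idea (the 44.1 kHz / 16-bit
fallback format is only an assumption for illustration, and whether a
particular conversion is available is itself implementation
dependent):

import javax.sound.sampled.*;
import java.io.File;

public class PlayWithFallback {
    public static void main(String[] args) throws Exception {
        AudioInputStream in = AudioSystem.getAudioInputStream(new File(args[0]));
        AudioFormat format = in.getFormat();
        DataLine.Info info = new DataLine.Info(SourceDataLine.class, format);
        if (!AudioSystem.isLineSupported(info)) {
            // The mixer can't handle the file's native format:
            // convert to a "safe" fallback (44.1 kHz, 16-bit stereo PCM).
            AudioFormat fallback = new AudioFormat(44100f, 16, 2, true, false);
            in = AudioSystem.getAudioInputStream(fallback, in);
            format = fallback;
        }
        SourceDataLine line = AudioSystem.getSourceDataLine(format);
        line.open(format);
        line.start();
        byte[] buffer = new byte[4096];
        int read;
        while ((read = in.read(buffer)) != -1) {
            line.write(buffer, 0, read);
        }
        line.drain();
        line.close();
        in.close();
    }
}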

Only for 5.1 and other multi-channel models would improved native
support be necessary. AudioFormat.properties() can provide the missing
link: the properties could, for example, describe the layout of the
audio channels, so that no change to the Java Sound API would be
necessary.
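
A sketch of what that could look like; note that the "channel-layout"
property key and its value are purely hypothetical here, not part of
any existing specification:

import javax.sound.sampled.*;
import java.util.HashMap;
import java.util.Map;

public class ChannelLayoutSketch {
    public static void main(String[] args) {
        // Hypothetical property describing the order/meaning of the 6 channels.
        Map<String, Object> props = new HashMap<>();
        props.put("channel-layout", "FL,FR,FC,LFE,RL,RR");

        AudioFormat surround = new AudioFormat(
                AudioFormat.Encoding.PCM_SIGNED,
                48000f,  // sample rate
                16,      // bits per sample
                6,       // channels (5.1)
                12,      // frame size in bytes (6 channels * 2 bytes)
                48000f,  // frame rate
                false,   // little-endian
                props);

        // An implementation aware of the convention could read it back:
        System.out.println("Channel layout: "
                + surround.getProperty("channel-layout"));
    }
}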

Some more cents from me...
Florian


On 24.10.2015 14:18, Bob Lang wrote:
> Hi Florian
> 
> I agree that getting sound out of a clip file isn't too problematic, but the portability problems arise when you want to do something more ambitious, and on reflection I think many of these issues arise over the vague definition of what should be a mixer - or maybe, what should be the default mixer. For example:
> 
> 1. A clip should be short, but how short?  Or to put it another way, what is the maximum length of a clip?  10 seconds, a minute, 5 minutes?  
> 2. How many clips can I open at once? 8? 16? 32? Is there a pool for clip data, and if so, does it have a maximum length?
> 3. What file formats should my clips be in? What sample rate? What sample size? 
> 
> Each time something is left undefined, some implementer will make his or her own decisions about what is reasonable or practical. Let's move on:
> 
> 4. I can open a SourceDataLine for dynamic audio output, but what sample rates are accepted? We can probably rely on 44100 and 48000, but what about 96000 or even 192000?  And how about lower rates? Can I reasonably expect a Mixer to play at any sample rate between (say) 2000 and 48000?  
> 5. A similar set of questions for SourceDataLine sample sizes ...?
> 6. Can I open two SourceDataLines and write to them both simultaneously? Can they both have different sample rates and sizes?  What about the relative volume of the two lines?
> 
> Of course, it may well be that the default mixers all agree on these points, but if the spec is vague and undefined then there is always potential for a programmer to write and test a program on (say) Windows and discover that the default mixer on a Mac is different. This might happen down the line, in 4 years' time.
> 
> There are some parts that are completely undefined:
> 7.  If I open a SourceDataLine with two channels then I'm pretty certain that I should buffer the data for the left channel first, followed by the right.  But what about 5.1 data?  What order does that get written in?  Is it fixed between platforms?  Or even between different sound cards?
> 
> And there are parts that are so wacky that only the cognoscenti can get to grips with them:
> 8. Controls! I probably don't need to say any more, but I will.  For any entity there may be a set of controls. I can get hold of an array of these controls by making some call or another, and then if I can figure out what each control actually is, I can adjust it in my code.  Whoever designed this obviously thought that KISS was meant for other people.
> 
> Ok - I know that there are answers for each of the questions I pose in 1-7, but that's not the point. The point is that the answers aren't specified anywhere.  This is where the possibility of non-portability arises.
> 
> There is also the problem of the hacking coder. Someone who doesn't fully understand what's going on but has been told he or she has got three days to get this sound working on this app!  With sufficient trial and error, it's always possible to bodge something together that works in the here-and-now - but heaven help what happens in the future!  
> 
> What I would like to see is a few well defined audio models that we can rely on. Off the top of my head, these might be:
> 1. A stereo model, capable of supporting 16 clips and 8 simultaneous SourceDataLines, at a default sample size of 24 bits, and switchable between 44100 and 48000 sample rates. 
> 
> 2. A surround model geared towards 5.1 support. This has one 6-channel (5+1) SourceDataLine, and a further 7 SourceDataLines that can be mapped into the 3D space around the listener. This would mean that a spacecraft could pass through an asteroid field, and the user would hear each asteroid as it goes past (if there were actually any sound in space!). There would also be 16 clips that can be mapped into 3D space.  (Here's the killer bit!) Opening the surround sound model on a two-speaker system would invoke an HRTF mapper that will map the 3D sound into two channels, probably for use with earphones. Opening the model on a computer with a different arrangement of multichannel speakers invokes whatever remapping is necessary to give the same audio effect to the user. 
> 
> 3. I can also see us having a high-definition model that will work at 96000 Hz or higher, with bigger sample sizes.
> 
> 4. For non-standard equipment, we retain an updated and improved SPI, in much the same way as we do today.
> 
> Just my two cents
> 
> Bob
> --
> On 24 Oct 2015, at 11:18, Florian Bomers <javasound-dev at bome.com> wrote:
> 
>> Hi Bob,
>>
>> I do agree on a higher level, but not for this particular case.
>>
>> Improving an established cross-platform API (like Java Sound) is never
>> easy because it must stay compatible for its users. With "users", I
>> mean programmers using Java Sound, and their users. Here, on this
>> list, we discuss the underlying implementation. Often enough it's
>> painful and complicated to still fulfill the spec of 15 years ago. But
>> the upside is that these efforts guarantee that using Java Sound
>> remains the same and even 15-year-old JS programs still work the same.
>>
>> In this particular case (the NPE thing), I don't think anything is
>> broken: the "old" implementation works and is "according to
>> specification". I just want to ensure that it does not open a loophole
>> to become broken!
>>
>> On a side note, it is very important to protect users from poorly
>> programmed, or broken, or malicious plugins (i.e. SPIs). That's why
>> we catch the NPE and don't pass it on to the unsuspecting user.
>>
>>> Personally, I've always thought the concept of a mixer is
>>> flawed, because it's inherently a non-portable concept, (...)
>>
>> Hmmm, I cannot follow here. For me, only the name "Mixer" is
>> non-portable. If you think of it as "AudioDevice" or "AudioCard" then
>> Java Sound is pretty much the same as any other audio hardware
>> abstraction. The same is true for the naming of SourceDataLine
>> ("OutputStream") and TargetDataLine ("InputStream").
>>
>>> When I write a Java Sound program, I want the same level
>>> of simplicity.
>>
>> It seems to me that you imply that you need to implement an SPI in
>> order to play back a sound? That would be horrible indeed!
>>
>> But, as I'm sure you know, if you need simplicity, just get a Clip,
>> load a file into it, and play: 3 simple lines of code. If you want to
>> do, for example, low latency VoIP or a software synthesizer, things
>> become a bit more complicated, but that's the same for any audio API
>> I've worked with.
>>
>> For me, it's important that the simple things are /simple/ to do, and
>> the advanced things are /possible/ to do.
>>
>>> Is it time to freeze and deprecate the existing Java Sound, and
>>> start again with a new design, with cross platform portability as
>>> its major aim?
>>
>> The idea is tempting: a modern audio API in Java. But frankly, when
>> thinking about it, it would most likely boil down to little more than
>> JS class renaming/consolidating/cleaning. If you look at other audio
>> APIs, you'll always find the concepts of device, stream, audio
>> format, file, codec -- just as in Java Sound. The main difference is
>> that JS uses strange naming and overly complicated arrangements...
>>
>> What I don't see at all is the supposed lack of platform portability in Java Sound.
>>
>> Thanks,
>> Florian
>>
>>
>> On 24.10.2015 02:12, Bob Lang wrote:
>>> On 23 Oct 2015, at 19:56, Florian Bomers <javasound-dev at bome.com>
>>> wrote:
>>>
>>>> Hi Sergey,
>>>>
>>>> I guess you're right and the second loop will never be executed
>>>> if we always have the default mixer providers.
>>>>
>>>> Removing the NPE catch clause, however, will still cause a
>>>> backwards incompatibility, because if a poorly programmed
>>>> MixerProvider gets installed which throws NPE for whatever reason
>>>> (might also happen when "info" is non-null), now
>>>> AudioSystem.getMixer() will throw NPE, where it previously
>>>> worked.
>>>>
>>>> I agree that it's harder to debug mixer providers if the NPE is
>>>> ignored. Other than that, I don't see any problem with keeping
>>>> the NPE catch for backwards compatibility's sake. Even if just
>>>> theoretical... But you never know, companies might be using poorly
>>>> programmed in-house software or the like.
>>>>
>>>> Thanks, Florian
>>>
>>> So, it's broken if you do the right thing and it's broken if you
>>> don't??
>>>
>>> There comes a point in any software project where years of
>>> cumulative amendments, fixes and modifications make the code so
>>> fragile that it's no longer modifiable.
>>>
>>> Personally, I've always thought the concept of a mixer is flawed,
>>> because it's inherently a non-portable concept, which should be
>>> anathema in a language that has portability as its main goal.  The
>>> Mixer SPI only works when the programmer writing the code actually
>>> understands what's going on and the implications of any choice -
>>> and frankly, how often does that happen?  When I write a graphics
>>> program, I don't have to worry about the specific capabilities of
>>> the specific video card on the user's specific computer - all that
>>> detail is (quite rightly) hidden from me.  And because it's hidden
>>> from me, my program works on any desktop/laptop computer. When I
>>> write a Java Sound program, I want the same level of simplicity.
>>> It should be possible, because sound is inherently simpler than
>>> video - yet Java Sound makes it far more complex.
>>>
>>> Is it time to freeze and deprecate the existing Java Sound, and
>>> start again with a new design, with cross platform portability as
>>> its major aim?
>>>
>>> Bob --
>>>
>>
>> -- 
>> Florian Bomers
>> Bome Software
>>
>> everything sounds.
>> http://www.bome.com
>> __________________________________________________________________
>> Bome Software GmbH & Co KG        Gesellschafterin:
>> Dachauer Str.187                  Bome Komplementär GmbH
>> 80637 München, Germany            Geschäftsführung: Florian Bömers
>> Amtsgericht München HRA95502      Amtsgericht München HRB185574
>>
> 
> 

-- 
Florian Bomers
Bome Software

everything sounds.
http://www.bome.com
__________________________________________________________________
Bome Software GmbH & Co KG        Gesellschafterin:
Dachauer Str.187                  Bome Komplementär GmbH
80637 München, Germany            Geschäftsführung: Florian Bömers
Amtsgericht München HRA95502      Amtsgericht München HRB185574


