audio: Added SDL_PutAudioStreamSeparatedData. #12872
Conversation
Also, I just plugged this into SDL_remixer, and it's so much nicer to be like, "here, libvorbisfile handed me a bunch of arrays, figure this out for me." :)
Some suggestions:

This is typically called planar audio data, so maybe …
Also, I'm going to play devil's advocate on both sides here. On one side, it's pretty easy to interleave the data, especially since the chunks you're getting may not be the size you want to put into the stream, so maybe we don't need this? On the other side, you might have audio data coming from different sources; for example, the DualSense controller has four channels of audio, two for stereo audio and two for haptic feedback, so maybe it makes sense to have a function to send each channel separately? (I don't necessarily think we should do either of these, just raising them for discussion.)
That was my first impulse, but I thought that was a term for image data... apparently it is not. PortAudio uses "non-interleaved". ChatGPT thinks it's "planar" (and because AI is terrible, it had to add a wrong comment about PortAudio and libsndio at the end). So I'm fine with "planar".
It needs to be interleaved one way or the other, regardless of size. If the app doesn't want to interleave it all because the chunk is too large, it can hold onto its arrays and offset into them to interleave more later; it would have to anyhow. But SDL_AudioStream is designed to absorb data and use it as necessary, so it can often just shove the whole chunk in there anyway.

As for "easy to interleave"... this is what I had in SDL_remixer for this:

```c
SDL_assert(amount <= SDL_arraysize(samples));
const int channels = d->current_channels;
if (channels == 1) { // one channel, just copy it right through.
    SDL_PutAudioStreamData(stream, samples, amount * sizeof (float));
} else { // multiple channels, we need to interleave them.
    // We have `amount` sample frames, which is <= SDL_arraysize(samples), but at worst,
    // `samples` is a factor of `channels` too small to hold the interleaved data, so
    // interleave it and put it to the audio stream in chunks.
    const int max_frames = SDL_arraysize(samples) / channels; // most sample frames per chunk.
    const int num_chunks = amount / max_frames;
    const int chunksize = max_frames * channels * sizeof (float);
    int frame = 0;
    for (int chunk = 0; chunk < num_chunks; chunk++) {
        float *fptr = samples;
        for (int i = 0; i < max_frames; i++) {
            for (int channel = 0; channel < channels; channel++) {
                *(fptr++) = pcm_channels[channel][frame];
            }
            frame++;
        }
        SDL_PutAudioStreamData(stream, samples, chunksize);
    }

    // put the last bit.
    SDL_assert(frame <= amount);
    SDL_assert(((amount - frame) * channels) <= SDL_arraysize(samples));
    const int remaining = amount - frame;
    if (remaining > 0) {
        float *fptr = samples;
        for (int i = 0; i < remaining; i++) {
            for (int channel = 0; channel < channels; channel++) {
                *(fptr++) = pcm_channels[channel][frame];
            }
            frame++;
        }
        SDL_PutAudioStreamData(stream, samples, remaining * channels * sizeof (float));
    }
    SDL_assert(frame == amount);
}
```

(Note that libvorbisfile only gives you float data; this gets more complicated if data comes in various formats.)

libvorbisfile gives you planar arrays of data, and we tapdance to interleave it in a little stack space. We were able to avoid that tapdance inside SDL because we were going to malloc anyhow to store the data in the audio stream, so we can just allocate the buffer upfront and use it all the way through, instead of allocating, interleaving, allocating again, copying, freeing, freeing.

Here's the vorbis code, in my working copy, with the new API:

```c
SDL_PutAudioStreamSeparatedData(stream, (const void * const *) pcm_channels, amount);
```
A good point. Another thing I ran into is that libFLAC gives you 3-channel audio in FL-FR-FC format, as opposed to the usual 2.1 layout of FL-FR-LFE, so you have to move up to 5.1 before you get anything with a front-center channel, and here we can just give it a single silent array to blank out those unneeded channels. (This is a strong argument for the channel mask stuff we've talked about, too, but that's a longer discussion.)
Example code in action: https://icculus.org/~icculus/emscripten/sdl3-planar-audio/
Looks good!
Force-pushed from 3db952b to d9b205d.
Okay, this has been rebased, so it's ready to merge if the builders turn green. |
Builders are green! Last call before I click the Big Green Button! |
Thinking on this a little more, Sam said:
And I pushed back, but on further consideration this actually makes some sense, in that an app could be free to specify fewer channels than are needed and SDL could treat the missing range as NULL arrays, instead of the app having to supply them. This would actually be usable today in SDL_remixer's libFLAC decoder. I'll make this tweak before 3.4.0 ships this API.
Fixes #12846.
I'm not in love with the new function name, but I couldn't think of anything better.
I resisted the urge to plumb the separated channel arrays all the way into SDL_AudioTrack. I decided it would add complexity for not much performance gain, and just interleaving the data upfront was both easier and cleaner.
There's an obvious FIXME for SIMD code. My suspicion is an SSE unpack (or ARM NEON zip) implementation would be a massive speed boost over the generic for-loop, at least for stereo data, but that can be done later, or never, as appropriate.
CC'ing @0x1F9F1, in case any of this is interesting.