Audio Blocks

Post by **JohnM** » Sep 15th, '15, 15:23

Mark Guzdial asked: "How do I get a sampled sound into a sound buffer that I can then manipulate like in the SoundSynthesis projects? And once I change a buffer, how do I turn it into a sound that I can play?"

To explore sampled sound processing, you should use the low-level sound API in the "Audio" category. (The "play sound" block in the "Sound" category is just for playing WAV files imported into the sounds tab.)

The audio blocks revolve around the notion of fixed-size sound buffers. The size of those buffers and the sampling rate are determined by the "open audio" block. When you invoke "open audio", it starts a sound thread that both reads and writes audio continuously. If you don't use incoming audio, it is just discarded. If you don't write audio, the audio thread just plays silence (a buffer of zeros). The range of sample values is -1.0 to 1.0.

Once the audio thread is running, the "wait for audio" block waits for both the next input buffer and for space to be available in the audio output queue. At that point, "read audio" will return a buffer of samples (an array) or you can output a buffer of audio (the same size as the input buffer) using "write audio" -- or both!

Here is a simple loop that reads and echoes incoming audio:

Note that if you don't use headphones, you'll get feedback because the audio coming from the speakers is picked up by the microphone and appears in the next incoming audio buffer. (The frequency of the feedback depends on the overall delay in the system, which is determined in part by the audio buffer size.) The stop button stops the audio thread, so you can quickly silence feedback or other painful sounds when things go wrong. :-) You seldom need to use the "close audio" block, since the stop button does that.

The "fft" block takes an audio buffer and returns the lower half of the discrete Fourier transform of that buffer (the upper half of the Fourier transform is just a reflection of the lower half). The Fourier transform gives you a breakdown of the frequencies that were present in that buffer of sound. The first element is the constant offset (sometimes called the "D.C." component), which is typically close to zero and, in any case, not usually of interest. The next component is for a frequency that would fit one cycle of a sine wave into a sound buffer. To figure out what that frequency is, you divide the sampling rate by the buffer size. For example, with the default sampling rate of 22050 samples/second and buffer size of 2048, it would be 22050 / 2048, or 10.7666 Hz, which is below what humans can usually hear. The next component is double that, 21.53 Hz, and the following component is three times that, 32.30 Hz. The last component is 1023 times that, 11014.2 Hz. You might use this frequency data to make a real-time frequency graph, drive a music visualizer, or perhaps make a tuner for your guitar or ukulele.
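As a quick check of the arithmetic above, here is a small sketch in Python (not GP) that lists the frequency of each FFT component. The function name `bin_frequencies` is made up for this illustration; the sampling rate and buffer size are the defaults mentioned above.

```python
def bin_frequencies(sampling_rate, buffer_size):
    """Return the frequency in Hz of each component in the lower half of the FFT."""
    spacing = sampling_rate / buffer_size  # Hz between adjacent components
    return [k * spacing for k in range(buffer_size // 2)]

freqs = bin_frequencies(22050, 2048)
print(len(freqs))            # 1024 components, index 0 is the D.C. offset
print(round(freqs[1], 4))    # 10.7666
print(round(freqs[2], 4))    # 21.5332
print(round(freqs[3], 4))    # 32.2998
print(round(freqs[-1], 4))   # 11014.2334
```

Note that Python lists are indexed from 0, so `freqs[0]` here corresponds to the first element of the result.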

(Note for digital audio geeks: The FFT result buffer contains only the magnitudes of the FFT values, so the phase information is lost. That means you can't use the "fft" block to do a reverse FFT, as one might do in advanced audio applications like sound codecs or vocoders. However, the "fft" block makes it much easier to simply get and use the frequency spectrum for a sound buffer.)
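To illustrate why magnitudes alone can't be reversed, here is a sketch in Python (not GP) using a naive, unoptimized discrete Fourier transform. The helper `dft_magnitudes` is hypothetical, and its scaling (no normalization) is an assumption; the "fft" block may scale its results differently. A sine and a cosine at the same frequency differ only in phase, yet they produce the same magnitudes.

```python
import cmath
import math

def dft_magnitudes(buffer):
    """Naive O(n^2) DFT; returns the magnitudes of the lower half only."""
    n = len(buffer)
    mags = []
    for k in range(n // 2):
        total = sum(buffer[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        mags.append(abs(total))  # abs() discards the phase
    return mags

n = 64
# Two signals with four cycles per buffer, differing only in phase.
sine = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
cosine = [math.cos(2 * math.pi * 4 * t / n) for t in range(n)]

sine_mags = dft_magnitudes(sine)
cosine_mags = dft_magnitudes(cosine)

# Index of the strongest component (0-based, so 4 means four cycles per buffer).
peak = max(range(len(sine_mags)), key=lambda k: sine_mags[k])
print(peak)
```

Both magnitude lists peak at the same component, so nothing in the result tells you which of the two signals you started with.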

Post by **mguzdial** » Sep 17th, '15, 16:57

I'm working on being able to manipulate samples and then play them again. I've now decided that I want to create new sounds with changed samples, and I think I know how to do that.

But in my explorations, I built a bit of code whose behavior I don't quite understand yet. Here's the code that I'm playing with:

getSamples works. Here's what it looks like:

Code:

to getSamples snd {
  if (isClass snd 'String') {
    // argument is the name of a sound
    proj = (projectForMorph (morph (implicitReceiver)))
    if (notNil proj) { snd = (soundNamed proj snd) }
  }
  if (not (isClass snd 'Sound')) { return }
  allsamples = (list)
  for i (samples snd) {
    add allsamples (i / 32767.0)
  }
  return (toArray allsamples)
}

When I play the blocks code listed above, it plays "anti" of "antidisestablishmentarianism." When the buffer size was 1024, I only got a bit of "a." Changing it up to 8000 gives me "anti." But more than 8000 doesn't buy me any more of the word.

Here's my guess at what's going on: I'm playing the first 8000 bytes of the sound, and I need to write the next 8000 to get more of the sound. But I still don't get why making the buffer size larger doesn't get me more of the sound. Any tips?

Post by **JohnM** » Sep 17th, '15, 20:00

The underlying audio system has a maximum buffer size of 8192. Thus, even if you ask for a larger buffer size, you don't get it.

To solve this problem, you need to write a loop to play your sound in buffer-sized chunks, doing a "wait for audio" between chunks. You'll need to keep track of the index where the next sound buffer should start. You can use the "copy" block from the "Data" category to copy a buffer-sized subsection of the full sound. (The copy block has optional parameters for a start and stop index.)
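The bookkeeping for that loop can be sketched as follows. This is Python, not GP, and it only shows the index tracking and chunking; the helper name `split_into_buffers` is made up, and padding the final partial chunk with zeros (silence) is my assumption rather than something the audio blocks require. In the real loop, each chunk would be written with "write audio" after a "wait for audio".

```python
def split_into_buffers(samples, buffer_size):
    """Split samples into buffer_size chunks, zero-padding the last one."""
    buffers = []
    start = 0  # index where the next sound buffer should start
    while start < len(samples):
        chunk = list(samples[start:start + buffer_size])
        # Pad a short final chunk with silence so every buffer is full-size.
        chunk.extend([0.0] * (buffer_size - len(chunk)))
        buffers.append(chunk)
        start += buffer_size
    return buffers

buffers = split_into_buffers([0.5] * 5000, 2048)
print(len(buffers))             # 3
print(buffers[-1].count(0.0))   # 1144 samples of padding in the last buffer
```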

Good luck!

Post by **mguzdial** » Sep 25th, '15, 12:56

I wrote a piece of code to play samples.

Code:

to playSamples samples {
  openPortAudio
  len = (count samples)
  i = 1
  while (i <= len) {
    // collect up to 1024 samples for the next buffer
    bufsamples = (list)
    bufcount = 0
    while (and (bufcount < 1024) (i <= len)) {
      add bufsamples (at samples i)
      i = (i + 1)
      bufcount = (bufcount + 1)
    }
    waitForAudio
    writeAudioData (toArray bufsamples)
  }
  closePortAudio
}

I have a getSamples block that returns the samples from the sound (normalized to +/- 1.0). I use these together like this:

[Image: playsamples.png]

This works! But...it's...really...choppy.

Am I doing something there that's eating up gobs of cycles? Is there a faster way to do it?

I have a workaround now. I've successfully turned samples back into a sound, then just used the soundplayer. That works, but I'd like to understand what I'm doing wrong here. Thanks!

Post by **JohnM** » Dec 14th, '15, 12:35

I guess you've already figured this out. The problem is that the default buffer size is 2048 and playSamples is playing buffers with only 1024 samples, so the other half of each sound buffer is just silence. The result is a very fast alternation between short chunks of your sound and short chunks of silence, which produces the choppy sound you report. You could request a sound buffer size of 1024 by changing the first line to openPortAudio 1024. However, even if you do that, it is good practice to get the actual buffer size using audioBufferSize and supply buffers of that size to writeAudioData, since some platforms may have restrictions on the allowable buffer size, perhaps restricting it to a power of 2 or to some range. (On Mac OS, for example, the maximum buffer size is 8192.) In other words, the actual buffer size may not be what you requested.
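To put rough numbers on the choppiness, here is a back-of-the-envelope sketch in Python (not GP). The function `playback_stats` is hypothetical. It assumes the chunk is no larger than the buffer and that the unfilled part of each buffer plays as silence, as described above; 22050 is the default sampling rate mentioned earlier in the thread.

```python
def playback_stats(chunk_size, buffer_size, sampling_rate):
    """Return (fraction of each buffer that is silence, buffer length in ms)."""
    silence_fraction = (buffer_size - chunk_size) / buffer_size
    buffer_ms = 1000.0 * buffer_size / sampling_rate
    return silence_fraction, buffer_ms

silence_fraction, buffer_ms = playback_stats(1024, 2048, 22050)
print(silence_fraction)      # 0.5
print(round(buffer_ms, 1))   # 92.9
```

Under those assumptions, each buffer lasts about 93 ms, split roughly into 46 ms of sound and 46 ms of silence, which would account for the stutter.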

GP Beta Test Forum
