Audio Blocks
Posted: Sep 15th, '15, 15:23
Mark Guzdial asked: "How do I get a sampled sound into a sound buffer that I can then manipulate like in the SoundSynthesis projects? And once I change a buffer, how do I turn it into a sound that I can play?"
To explore sampled sound processing, you should use the low-level sound API in the "Audio" category. (The "play sound" block in the "Sound" category is just for playing WAV files imported into the sounds tab.)
The audio blocks revolve around the notion of fixed-size sound buffers. The buffer size and the sampling rate are both set by the "open audio" block. When you invoke "open audio", it starts a sound thread that both reads and writes audio continuously. If you don't use incoming audio, it is just discarded. If you don't write audio, the audio thread just plays silence (a buffer of zeros). The range of sample values is -1.0 to 1.0.
Once the audio thread is running, the "wait for audio" block waits for both the next input buffer and for space to be available in the audio output queue. At that point, "read audio" will return a buffer of samples (an array) or you can output a buffer of audio (the same size as the input buffer) using "write audio" -- or both!
Here is a simple loop that reads and echoes incoming audio:
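For readers more at home in text code, the same wait/read/write cycle can be sketched in Python. This is only an illustration: FakeAudio and its method names are hypothetical stand-ins for the audio thread, not the actual blocks API, and the "microphone" input is a short list of made-up buffers.

```python
from collections import deque


class FakeAudio:
    """Hypothetical stand-in for the audio thread (not the real blocks API)."""

    def __init__(self, buffer_size, input_buffers):
        self.buffer_size = buffer_size
        self._inputs = deque(input_buffers)  # pretend microphone data
        self._current = None
        self.played = []                     # everything sent to the "speaker"

    def wait_for_audio(self):
        """Advance to the next input buffer; False once the input runs out."""
        if not self._inputs:
            return False
        self._current = self._inputs.popleft()
        return True

    def read_audio(self):
        """Return a copy of the current input buffer."""
        return list(self._current)

    def write_audio(self, buf):
        """Queue one output buffer; it must match the input buffer size."""
        if len(buf) != self.buffer_size:
            raise ValueError("output buffer must match the input buffer size")
        self.played.append(list(buf))


def echo(audio):
    """Read each incoming buffer and write it straight back out."""
    while audio.wait_for_audio():
        audio.write_audio(audio.read_audio())


# Two tiny buffers of samples in the range -1.0 to 1.0.
mic = [[0.0, 0.5, -0.5, 1.0], [0.25, -0.25, 0.0, -1.0]]
device = FakeAudio(4, mic)
echo(device)
```

To process the sound instead of just echoing it, you would change the samples between the read and the write, for example scaling each one by 0.5 to halve the volume.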
Note that if you don't use headphones, you'll get feedback because the audio coming from the speakers is picked up by the microphone and appears in the next incoming audio buffer. (The frequency of the feedback depends on the overall delay in the system, which is determined in part by the audio buffer size.) The stop button stops the audio thread, so you can quickly silence feedback or other painful sounds when things go wrong. :-) You seldom need to use the "close audio" block, since the stop button does that.
The "fft" block takes an audio buffer and returns the lower half of the discrete Fourier transform of that buffer (the upper half of the Fourier transform is just a reflection of the lower half). The Fourier transform gives you a breakdown of the frequencies that were present in that buffer of sound. The first element is the constant offset (sometimes called the "D.C." component), which is typically close to zero and, in any case, not usually of interest. The next component is for a frequency that would fit one cycle of a sine wave into a sound buffer. To figure out what that frequency is, you divide the sampling rate by the buffer size. For example, with the default sampling rate of 22050 samples/second and buffer size of 2048, it would be 22050 / 2048, or 10.7666 Hz, which is below what humans can usually hear. The next component is double that, 21.53 Hz, and the following component is three times that, 32.30 Hz. The last component is 1023 times that, 11014.2 Hz. You might use this frequency data to make a real-time frequency graph, drive a music visualizer, or perhaps make a tuner for your guitar or ukulele.
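The arithmetic above is easy to check in a few lines of Python. This is just a sketch: the function names are made up for illustration, and the defaults are the sampling rate and buffer size mentioned above.

```python
def bin_frequency(k, sample_rate=22050, buffer_size=2048):
    """Frequency in Hz of FFT component k (k = 0 is the constant offset)."""
    return k * sample_rate / buffer_size


def nearest_bin(frequency, sample_rate=22050, buffer_size=2048):
    """Index of the FFT component whose frequency is closest to `frequency`."""
    return round(frequency * buffer_size / sample_rate)


# Print the components discussed in the text.
for k in (1, 2, 3, 1023):
    print(k, round(bin_frequency(k), 2))

# A tuner would look near this component for the A above middle C (440 Hz).
a440 = nearest_bin(440.0)
```

Note that neighboring components are about 10.8 Hz apart, so a tuner built this way can only resolve pitch to roughly that precision unless you use a larger buffer.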
(Note for digital audio geeks: The FFT result buffer contains only the magnitudes of the FFT values, so the phase information is lost. That means you can't use the "fft" block to do a reverse FFT, as one might do in advanced audio applications like sound codecs or vocoders. However, the "fft" block makes it much easier to simply get and use the frequency spectrum for a sound buffer.)
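To see why magnitudes alone can't be turned back into a sound, here is a small Python sketch. It uses a slow, direct Fourier transform rather than a real FFT, and the scaling is an assumption (the "fft" block may scale its results differently). Two sine waves with the same frequency but different phase produce the same magnitudes, so the phase can't be recovered from them.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Magnitudes of the lower half of the DFT (direct, unnormalized sum)."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        total = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        mags.append(abs(total))  # abs() keeps the size and drops the phase
    return mags


def sine_buffer(cycles, size, phase=0.0):
    """A buffer holding `cycles` whole cycles of a sine wave."""
    return [math.sin(2 * math.pi * cycles * t / size + phase)
            for t in range(size)]


size = 64  # kept small because the direct sum is slow
a = magnitude_spectrum(sine_buffer(5, size))
b = magnitude_spectrum(sine_buffer(5, size, phase=math.pi / 3))

peak_a = a.index(max(a))  # strongest component of the first wave
peak_b = b.index(max(b))  # strongest component of the shifted wave
same = all(abs(x - y) < 1e-9 for x, y in zip(a, b))
```

Both waves peak at component 5 (five cycles per buffer), and the two magnitude lists match even though the waveforms themselves are different.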