After the New Year festivities, it’s time to get back to my series of Unity3D tutorials. This time, I’ll show you how to extract the fundamental, or strongest, frequency from a mixed-signal input, such as the one coming from a microphone into Unity3D. Then we’ll look at how you can compare that frequency to the notes of a bass or any other instrument.

### Do the Fast Fourier Transform

As mentioned in a previous tutorial, we can use the Fast Fourier Transform (FFT) to get the frequency data out of a signal. In Unity3D we don’t have to implement our own FFT, since Unity3D provides the GetSpectrumData function. You pass it a float array whose size is a power of two (e.g. 128, 256, 512), with a minimum of 64 and a maximum of 8192, along with the channel to extract data from and a window function to increase precision. Building on the MicrophoneInput script from my previous tutorial, we’ll add a new function called GetFundamentalFrequency, which first reads the spectrum data into an array. I’ve also defined a variable for the fundamental frequency that we are going to calculate later on.

    float GetFundamentalFrequency()
    {
        float fundamentalFrequency = 0.0f;
        float[] data = new float[8192];
        audio.GetSpectrumData(data, 0, FFTWindow.BlackmanHarris);
        return fundamentalFrequency;
    }

### Find the bin

Now, we are not really calculating the exact frequency that is strongest in the signal; instead, we find the FFT bin that holds the strongest signal. We do that by iterating through the data with a simple loop and a couple of temporary variables: s holds the strength of the strongest signal found so far, and i holds the index of the bin where that signal was found.

    float s = 0.0f;
    int i = 0;
    for (int j = 1; j < 8192; j++)
    {
        if (s < data[j])
        {
            s = data[j]; // Remember the strongest level seen so far
            i = j;       // ...and the bin it was found in
        }
    }
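The same search can be written as a small standalone function, which makes it easy to try outside Unity. This is only a sketch: the class name `SpectrumPeak` and the method name `FindPeakBin` are made up for illustration, and the loop skips bin 0 simply because the loop above starts at index 1.

```csharp
using System;

public static class SpectrumPeak
{
    // Returns the index of the largest value in the spectrum array,
    // starting the search at bin 1 like the tutorial loop does.
    // Returns 0 if no bin from index 1 onwards holds a value above zero.
    public static int FindPeakBin(float[] spectrum)
    {
        float strongest = 0.0f;
        int peakIndex = 0;
        for (int j = 1; j < spectrum.Length; j++)
        {
            if (spectrum[j] > strongest)
            {
                strongest = spectrum[j];
                peakIndex = j;
            }
        }
        return peakIndex;
    }
}
```

Using `spectrum.Length` instead of a hard-coded 8192 means the function keeps working if you later change the FFT size.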

### Calculate the frequency

In order to get the frequency, we have to do some maths. Since the precision of the FFT also depends on our sample rate, we must take that into account. Earlier, I wrote a post about the FFT and its precision, so you might want to check that out for the details. The formula we use to calculate the frequency of the strongest bin is as follows:

frequency = binIndex * samplerate / bins
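As a worked instance of this formula, assume a sample rate of 11025 Hz and 8192 bins, and take bin 194 purely as an illustrative index:

$$
f = \frac{k \cdot f_s}{N} = \frac{194 \times 11025}{8192} \approx 261.09\ \text{Hz}
$$

The next bin, 195, works out to about 262.44 Hz under the same assumptions, so a tone at 261.63 Hz would land in one of these two neighbouring bins rather than on an exact value.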

As you can see, the precision is dependent on the sample rate and the number of bins (size of array) used in the FFT. After adding that equation to the function, it looks like this.

    float GetFundamentalFrequency()
    {
        float fundamentalFrequency = 0.0f;
        float[] data = new float[8192];
        audio.GetSpectrumData(data, 0, FFTWindow.BlackmanHarris);
        float s = 0.0f;
        int i = 0;
        for (int j = 1; j < 8192; j++)
        {
            if (s < data[j])
            {
                s = data[j];
                i = j;
            }
        }
        // Divide by a float so an integer sample rate does not truncate the result
        fundamentalFrequency = i * samplerate / 8192.0f;
        return fundamentalFrequency;
    }

### Putting it together

Now we have a function that gives us the strongest frequency in the signal fed in by our microphone. To integrate it properly into the script, we should add public fields for the sample rate and for the frequency we found, so that we can access them from other scripts. With these changes, the full MicrophoneInput script should look something like this:

    using UnityEngine;
    using System.Collections;

    [RequireComponent(typeof(AudioSource))]
    public class MicrophoneInput : MonoBehaviour
    {
        public float sensitivity = 100.0f;
        public float loudness = 0.0f;
        public float frequency = 0.0f;
        public int samplerate = 11025;

        void Start()
        {
            audio.clip = Microphone.Start(null, true, 10, samplerate);
            audio.loop = true; // Set the AudioClip to loop
            audio.mute = true; // Mute the sound, we don't want the player to hear it
            while (!(Microphone.GetPosition(null) > 0)) {} // Wait until the recording has started
            audio.Play(); // Play the audio source!
        }

        void Update()
        {
            loudness = GetAveragedVolume() * sensitivity;
            frequency = GetFundamentalFrequency();
        }

        float GetAveragedVolume()
        {
            float[] data = new float[256];
            float a = 0;
            audio.GetOutputData(data, 0);
            foreach (float s in data)
            {
                a += Mathf.Abs(s);
            }
            return a / 256;
        }

        float GetFundamentalFrequency()
        {
            float fundamentalFrequency = 0.0f;
            float[] data = new float[8192];
            audio.GetSpectrumData(data, 0, FFTWindow.BlackmanHarris);
            float s = 0.0f;
            int i = 0;
            for (int j = 1; j < 8192; j++)
            {
                if (s < data[j])
                {
                    s = data[j];
                    i = j;
                }
            }
            // Divide by a float so the integer sample rate does not truncate the result
            fundamentalFrequency = i * samplerate / 8192.0f;
            return fundamentalFrequency;
        }
    }

### Now let’s figure out what note that is…

OK, we have the strongest frequency now. If you want to convert that to a note, you need to know the fundamental frequency of that note and compare it to the frequency given by our function. Let’s say we want to know whether the note being played is C4, or “middle C”. If we assume that A4 is 440Hz, as it usually is in standard tuning, C4 is 261.63Hz. Now all you need to do is make a simple comparison between that and the frequency you get from the script above. Let’s make that into a script. I’ll call it NoteFinder for now and make it display the note in a GUIText component if it is found. The beginning of the script is pretty much the same as the SpawnByLoudness script from the previous post, except that it also requires a GUIText component.

    using UnityEngine;
    using System.Collections;

    [RequireComponent(typeof(GUIText))] // Require GUIText component so we can display a text
    public class NoteFinder : MonoBehaviour
    {
        public GameObject audioInputObject;
        public float threshold = 1.0f;
        MicrophoneInput micIn;

        // Use this for initialization
        void Start()
        {
            if (audioInputObject == null)
                audioInputObject = GameObject.Find("MicMonitor");
            micIn = (MicrophoneInput) audioInputObject.GetComponent("MicrophoneInput");
        }

        // Update is called once per frame
        void Update()
        {
            int f = (int)micIn.frequency; // Get the frequency from our MicrophoneInput script
            if (f >= 261 && f <= 262) // Compare the frequency to known value, take possible rounding error in to account
            {
                this.guiText.text = "Middle-C played!";
            }
            else
            {
                this.guiText.text = "Play another note...";
            }
        }
    }
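If you would rather not hard-code one comparison per note, the frequencies can be computed instead. The sketch below is not part of the original scripts: it assumes twelve-tone equal temperament with A4 at 440Hz, borrows MIDI note numbers (69 for A4, 60 for C4) as a convenient index, and uses a hypothetical helper class called `NoteNames`. It expects positive frequencies and non-negative note numbers.

```csharp
using System;

public static class NoteNames
{
    static readonly string[] Names =
        { "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B" };

    // Frequency in Hz of a MIDI note number, with A4 (note 69) tuned to 440 Hz.
    public static double NoteToFrequency(int midiNote)
    {
        return 440.0 * Math.Pow(2.0, (midiNote - 69) / 12.0);
    }

    // Nearest MIDI note number to a frequency in Hz (frequency must be above zero).
    public static int FrequencyToNote(double frequency)
    {
        return (int)Math.Round(69.0 + 12.0 * Math.Log(frequency / 440.0, 2.0));
    }

    // Name with octave for a non-negative MIDI note number, e.g. 60 gives "C4".
    public static string NoteName(int midiNote)
    {
        int octave = midiNote / 12 - 1;
        return Names[midiNote % 12] + octave;
    }
}
```

With something like this, `NoteNames.NoteName(NoteNames.FrequencyToNote(micIn.frequency))` could replace the fixed 261 to 262 check, as long as you guard against a frequency of zero when the input is silent.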

### That’s all folks, or is it?

You should now have a system that can detect frequencies and notes for you. You can go ahead and implement different versions of the frequency detection, for example by dividing the spectrum into frequency bands and using the combined loudness of each band to trigger events in your game. If you want to detect more notes, you can refer to a table like http://www.phy.mtu.edu/~suits/notefreqs.html for more frequency values.
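The frequency-band idea could be sketched roughly as follows. The names `SpectrumBands`, `BandLevel` and `hzPerBin` are hypothetical. The width of one bin in Hz is passed in rather than computed, because it depends on how the spectrum was produced, so it is worth checking which frequency span your spectrum data actually covers before fixing that value.

```csharp
using System;

public static class SpectrumBands
{
    // Sums the bins whose frequency (binIndex * hzPerBin) lies in [lowHz, highHz).
    // hzPerBin is the width of a single bin in Hz and must be above zero.
    public static float BandLevel(float[] spectrum, float hzPerBin, float lowHz, float highHz)
    {
        int first = Math.Max(0, (int)Math.Ceiling(lowHz / hzPerBin));
        int last = Math.Min(spectrum.Length, (int)Math.Ceiling(highHz / hzPerBin));
        float sum = 0.0f;
        for (int j = first; j < last; j++)
        {
            sum += spectrum[j];
        }
        return sum;
    }
}
```

A game script could then compare, say, the level of a low band against a threshold each frame and trigger an event when it is exceeded.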

There are a couple of considerations, though. Make sure you select an appropriate sample rate for your implementation. For example, since I want good resolution in the low frequencies, I use a sample rate of 11025 and an FFT size of 8192. This gives me a bit over 1Hz resolution up to around 5000Hz. There is also a way to speed up the frequency calculation. Since the FFT, by nature, does not give any meaningful information about a real-world signal above the Nyquist frequency, we can ignore the upper half of the bins. So when using 8192 bins, we only need to iterate over the first 4096, which speeds up the GetFundamentalFrequency loop quite a bit.
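To see why the sample rate matters for low notes, compare the bin spacing given by the formula above for two sample rates at the same FFT size. The 44100 Hz figure is only a comparison point I picked, and the note values assume A4 at 440 Hz:

$$
\Delta f = \frac{f_s}{N}: \qquad \frac{11025}{8192} \approx 1.35\ \text{Hz}, \qquad \frac{44100}{8192} \approx 5.38\ \text{Hz}
$$

The two lowest notes on a four-string bass, E1 at about 41.20 Hz and F1 at about 43.65 Hz, are only around 2.45 Hz apart. A spacing of 1.35 Hz can tell them apart, while a spacing of 5.38 Hz cannot. The price is the lower ceiling: the Nyquist frequency is $f_s / 2 = 5512.5$ Hz at a sample rate of 11025 Hz.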

Thank you so much for this series on Unity audio. It’s my first introduction to DSP and I’ve loved it! I’ve implemented your code and would like to add a high-pass filter to the microphone data. I don’t have Unity Pro which gives you that filter, but I was wondering if it’s possible to manually implement a high-pass filter on the binned sample data returned by audio.getSpectrumData?

Thanks!

Constructing a filter, be it low-pass, high-pass or band-pass, while the signal is in the frequency domain is pretty straightforward, since you can make a crude version by simply discarding the bins you don’t need. This works fine if you use the leftover frequency data to do stuff in your game. However, if you want to play the filtered data, you need to get a bit creative, because moving the signal back to the time domain requires an inverse FFT, and Unity doesn’t provide a ready-made function for it. I’ll dig around my old files a bit and see if I can put together an example or a small how-to on filter design and inverse FFT inside Unity. In terms of performance, though, if you really need a high-pass filter to modify audio that plays during the game, I suggest getting the Pro version.
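A minimal sketch of the crude "discard the bins" approach might look like this. It only affects the spectrum array you analyse, not the audio you hear, and the names `CrudeSpectrumFilter`, `HighPass` and `hzPerBin` are made up for illustration. As with any bin-based calculation, `hzPerBin` has to match the frequency span your spectrum data really covers.

```csharp
using System;

public static class CrudeSpectrumFilter
{
    // Zeroes, in place, every bin whose frequency (binIndex * hzPerBin)
    // lies below cutoffHz. hzPerBin must be above zero.
    public static void HighPass(float[] spectrum, float hzPerBin, float cutoffHz)
    {
        int firstKept = (int)Math.Ceiling(cutoffHz / hzPerBin);
        firstKept = Math.Min(spectrum.Length, Math.Max(0, firstKept));
        for (int j = 0; j < firstKept; j++)
        {
            spectrum[j] = 0.0f;
        }
    }
}
```

Calling this on the array right after it has been filled, and before searching for the strongest bin, would keep low-frequency rumble from being picked as the fundamental.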

This is great! I am completely new to all of this and after lots of searching the web I came across these tutorials which have got me started!

However I am having a problem with the frequency that is being picked up. I have copied the numbers you have (ie. sample rate of 11025 & FFT size of 8192) but when I play a middle C into my laptop mic the Frequency on the Microphone Input script is showing around 120.

The frequency hits 261 half way between a C# and a D an octave above middle C. I can get these numbers to change by adjusting the sample rate but I am guessing that is not the best way to do it and that I am missing something.

Can anyone help with this at all?

Hi, I want to construct a high-pass filter to cut away all the noise from the recorded sound. How do I do it?

thanks very useful!

Hi can you give me link to download your project, please.

thanks.

You can grab an example project with more sophisticated version of the code used in this old tutorial from Unity Asset Store and support further development of it. Here’s the link: http://u3d.as/8P4