IndexOutOfRangeException while running FftPitchDetector - C#

I'm facing some problems implementing FftPitchDetector. What I actually want to do is get the real-time frequency from a guitar input, but I'm not sure how to use the functions in FftPitchDetector.cs. Any ideas?
private void sourceStream_DataAvailable(object sender, NAudio.Wave.WaveInEventArgs e)
{
    if (waveWriter == null) return;

    byte[] buffer = e.Buffer;
    float[] floats = new float[buffer.Length];
    float sample32 = 0;
    int bytesRecorded = e.BytesRecorded;

    waveWriter.Write(buffer, 0, bytesRecorded);

    for (int index = 0; index < e.BytesRecorded; index += 2)
    {
        short sample = (short)((buffer[index + 1] << 8) | buffer[index + 0]);
        sample32 = sample / 32768f;
        sampleAggregator.Add(sample32);
    }

    floats = bytesToFloats(buffer);
    FftPitchDetector PitchDetect = new FftPitchDetector(sample32);
    PitchDetect.DetectPitch(floats, bytesRecorded);
    Console.WriteLine("{0}", sample32);
}
private static float[] bytesToFloats(byte[] bytes)
{
    float[] floats = new float[bytes.Length / 2];
    for (int i = 0; i < bytes.Length; i += 2)
    {
        floats[i / 2] = bytes[i] | (bytes[i + 1] << 8);
    }
    return floats;
}
When I execute the code, I get an unhandled IndexOutOfRangeException that points to the line
fftBuffer[n * 2] = buffer[n-inFrames] * window(n, frames);
in FftPitchDetector.cs. What is the problem in my code?
Also, is there any open-source C# guitar tuner code? I would like to incorporate it into my project.

The issue is that in either fftBuffer[] or buffer[] you are trying to access an index that does not exist in the array.
So if fftBuffer[] has 4 items and n * 2 evaluates to 6, then you're trying to access fftBuffer[6], which doesn't exist.
So basically you need to check how many items are in each array before trying to access their values.
You will need to place checks in your code, for example:
if ((n * 2) < fftBuffer.Length && (n - inFrames) < buffer.Length)
{
    fftBuffer[n * 2] = buffer[n - inFrames] * window(n, frames);
}
You first ensure that the indexes you are about to use for each array are not greater than or equal to the number of items in that array.
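As a guess at where the mismatch comes from in the posted handler (an assumption on my part, since FftPitchDetector.cs isn't shown): DetectPitch is given e.BytesRecorded, which is a byte count, while floats only holds half that many 16-bit samples, and the detector is constructed with a single sample value rather than the sample rate. A minimal sketch of how the call might look instead, assuming the constructor expects the sample rate and DetectPitch expects the number of samples:
// Sketch only; FftPitchDetector's exact signatures are assumed, not confirmed.
float[] floats = bytesToFloats(e.Buffer);
int sampleCount = e.BytesRecorded / 2;              // 16-bit samples, not bytes

var pitchDetector = new FftPitchDetector(44100f);   // assumed: the sample rate of your WaveIn format
float pitch = pitchDetector.DetectPitch(floats, sampleCount);
Console.WriteLine("Detected pitch: {0} Hz", pitch);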

Related

Converting 32bit wav array to x-bit

I have a 32-bit WAV array and I want to convert it to 8-bit.
So I tried to take this function, which converts 32-bit to 16-bit:
void _waveIn_DataAvailable(object sender, WaveInEventArgs e)
{
    byte[] newArray16Bit = new byte[e.BytesRecorded / 2];
    short two;
    float value;
    for (int i = 0, j = 0; i < e.BytesRecorded; i += 4, j += 2)
    {
        value = (BitConverter.ToSingle(e.Buffer, i));
        two = (short)(value * short.MaxValue);
        newArray16Bit[j] = (byte)(two & 0xFF);
        newArray16Bit[j + 1] = (byte)((two >> 8) & 0xFF);
    }
}
and modify it so it can take any bit depth as the destination:
private byte[] Convert32BitRateToNewBitRate(byte[] bytes, int newBitRate)
{
    var sourceBitRate = 32;
    byte[] newArray = new byte[bytes.Length / (sourceBitRate / newBitRate)];
    for (int i = 0, j = 0; i < bytes.Length; i += (sourceBitRate / 8), j += (newBitRate / 8))
    {
        var value = (BitConverter.ToSingle(bytes, i));
        var two = (short)(value * short.MaxValue);
        newArray[j] = (byte)(two & 0xFF);
        newArray[j + 1] = (byte)((two >> 8) & 0xFF);
    }
    return newArray;
}
My problem is that I wasn't sure how to convert the code within the for loop; I tried to debug it, but I couldn't quite figure out how it works.
I saw here: simple wav 16-bit / 8-bit converter source code?, that they divided the value by 256 to go from 16-bit to 8-bit. I tried dividing by 256 to go from 32-bit to 16-bit, but it didn't work:
for (int i = 0, j = 0; i < bytes.Length; i += sourceBitRateBytes, j += newBitRateBytes)
{
    var value = BitConverter.ToInt32(bytes, i);
    value /= (int)Math.Pow(256, sourceBitRate / newBitRate / 2.0);
    var valueBytes = BitConverter.GetBytes(value);
    for (int k = 0; k < newBitRateBytes; k++)
    {
        newArray[k + j] = valueBytes[k];
    }
}
The for loop is still using 16-bit in the following places:
short.MaxValue; use byte.MaxValue instead.
Assigning two bytes, at [j] and [j + 1]; assign one byte only.
I don't have the rest of the program or any sample data, so it's hard for me to try it out, but I'd say the following sounds about right:
for (int i = 0, j = 0; i < bytes.Length; i += (sourceBitRate / 8), j += (newBitRate / 8))
{
    var value = (BitConverter.ToSingle(bytes, i));
    var two = (byte)(value * byte.MaxValue);
    newArray[j] = two;
}
Be aware that this works for 8-bit only, so newBitRate must be 8, otherwise it does not work; it should probably not be a parameter to the method.
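One more caveat, added as a note of my own: 8-bit PCM in WAV files is conventionally unsigned and centered on 128, while the floats coming out of BitConverter.ToSingle are signed values in [-1, 1]. A sketch of the loop body with that offset applied (the clamping is just illustrative):
for (int i = 0, j = 0; i < bytes.Length; i += 4, j += 1)
{
    float value = BitConverter.ToSingle(bytes, i);

    // Clamp to [-1, 1] to guard against overs, then map to unsigned 8-bit:
    // -1.0 -> 0, 0.0 -> 128 (silence), +1.0 -> 255.
    value = Math.Max(-1f, Math.Min(1f, value));
    newArray[j] = (byte)Math.Round((value + 1f) * 127.5f);
}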

C# .wav file data analysis & draw Freq.Domain graph from it

This is my first question on Stack Overflow, so please let me know if I violate the forum rules.
I am trying to analyze .wav audio data (basically a sinusoid, with or without noise) with an FFT and draw the result as a frequency-domain graph using C#. I need to analyze the sine wave and find whether any noise is present in 10 to 20 seconds of audio data.
The sine wave was generated using Audacity with the following properties:
1 kHz sine tone, .wav data of 15 to 20 s
16 bits per sample
44100 Hz sample rate
mono channel
Here is the code that separates the audio data from the header data:
// --- Open & read the .wav file from the path
System.IO.FileStream WaveFile = System.IO.File.OpenRead(waveFilePath);
System.IO.BinaryReader br = new System.IO.BinaryReader(WaveFile);

// Length of the file in bytes.
waveLength = WaveFile.Length;
wave = new byte[waveLength];

// --- Skip the .wav file header (first 44 bytes) to get the real audio data.
// Source -> http://www.topherlee.com/software/pcm-tut-wavformat.html
wavenew = new byte[(waveLength - 44)];
for (long i = 0; i < waveLength; i++)
{
    wave[i] = br.ReadByte();
    // Keep only the audio data byte values
    if (i >= 44)
    {
        wavenew[i - 44] = wave[i];
    }
}

// Bits per sample (8 bit or 16 bit) - header bytes 35, 36.
bitsPerSample = BitConverter.ToInt16(wave, 34); // 16 bit
// Number of channels (mono or stereo) - header bytes 23, 24.
noOfChannels = BitConverter.ToInt16(wave, 22); // mono

// Calculate the overall audio data length. Only powers of 2 can be given to the FFT,
// so take 2^17 = 131072 for this calculation; the remaining bytes are neglected.
// Divide by 4 since every audio value contains 4 bytes.
// (Mono, 16 bit or stereo, 8 bit - 4 bytes. Mono, 8 bit - 2 bytes.)
dataLength = (131116 - 44) / 4;
data = new double[dataLength];

// Sample rate - header bytes 25 to 28 (4 bytes).
SampleRate = BitConverter.ToInt32(wave, 24); // 44100

// Every 4 bytes contain one value of the sine wave.
for (int i = 0; i < dataLength; i++)
{
    data[i] = BitConverter.ToInt32(wavenew, (1 + i) * 4);
}
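Or should the conversion read one Int16 every 2 bytes instead, since the file is 16-bit mono? Something like this (an untested sketch of what I have in mind):
// Untested sketch: 16-bit mono PCM is 2 bytes per sample, little-endian and signed.
int sampleCount = wavenew.Length / 2;
double[] samples = new double[sampleCount];
for (int i = 0; i < sampleCount; i++)
{
    samples[i] = BitConverter.ToInt16(wavenew, i * 2) / 32768.0; // normalize to [-1, 1)
}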
Here is some example raw audio data after the conversion:
890381306
1403864333
1804165263
2058843210
2147319456
2063433407
1812231337
1414947118
903955929
320087260
-289603352
-875963166
-1391936590
-1795514361
-2054321130
-2147384959
For the FFT, I use http://www.lomont.org/Software/Misc/FFT/LomontFFT.html as a reference, which returns double real and imaginary values.
public void RealFFT(double[] data, bool forward)
{
    var n = data.Length; // # of real inputs, 1/2 the complex length
    // checks n is a power of 2 in 2's complement format
    if ((n & (n - 1)) != 0)
        throw new ArgumentException(
            "data length " + n + " in FFT is not a power of 2"
            );

    var sign = -1.0; // assume inverse FFT, this controls how algebra below works
    if (forward)
    { // do packed FFT. This can be changed to FFT to save memory
        TableFFT(data, true);
        sign = 1.0;
        // scaling - divide by scaling for N/2, then mult by scaling for N
        if (A != 1)
        {
            var scale = Math.Pow(2.0, (A - 1) / 2.0);
            for (var i = 0; i < data.Length; ++i)
                data[i] *= scale;
        }
    }

    var theta = B * sign * 2 * Math.PI / n;
    var wpr = Math.Cos(theta);
    var wpi = Math.Sin(theta);
    var wjr = wpr;
    var wji = wpi;

    for (var j = 1; j <= n / 4; ++j)
    {
        var k = n / 2 - j;
        var tkr = data[2 * k];     // real and imaginary parts of t_k = t_(n/2 - j)
        var tki = data[2 * k + 1];
        var tjr = data[2 * j];     // real and imaginary parts of t_j
        var tji = data[2 * j + 1];

        var a = (tjr - tkr) * wji;
        var b = (tji + tki) * wjr;
        var c = (tjr - tkr) * wjr;
        var d = (tji + tki) * wji;
        var e = (tjr + tkr);
        var f = (tji - tki);

        // compute entry y[j]
        data[2 * j] = 0.5 * (e + sign * (a + b));
        data[2 * j + 1] = 0.5 * (f + sign * (d - c));
        // compute entry y[k]
        data[2 * k] = 0.5 * (e - sign * (b + a));
        data[2 * k + 1] = 0.5 * (sign * (d - c) - f);

        var temp = wjr;
        // todo - allow more accurate version here? make option?
        wjr = wjr * wpr - wji * wpi;
        wji = temp * wpi + wji * wpr;
    }

    if (forward)
    {
        // compute final y0 and y_{N/2}, store in data[0], data[1]
        var temp = data[0];
        data[0] += data[1];
        data[1] = temp - data[1];
    }
    else
    {
        var temp = data[0]; // unpack the y0 and y_{N/2}, then invert FFT
        data[0] = 0.5 * (temp + data[1]);
        data[1] = 0.5 * (temp - data[1]);
        // do packed inverse (table based) FFT. This can be changed to regular inverse FFT to save memory
        TableFFT(data, false);
        // scaling - divide by scaling for N, then mult by scaling for N/2
        //if (A != -1) // todo - off by factor of 2? this works, but something seems weird
        {
            var scale = Math.Pow(2.0, -(A + 1) / 2.0) * 2;
            for (var i = 0; i < data.Length; ++i)
                data[i] *= scale;
        }
    }
}

void Scale(double[] data, int n, bool forward)
{
    // forward scaling if needed
    if ((forward) && (A != 1))
    {
        var scale = Math.Pow(n, (A - 1) / 2.0);
        for (var i = 0; i < data.Length; ++i)
            data[i] *= scale;
    }

    // inverse scaling if needed
    if ((!forward) && (A != -1))
    {
        var scale = Math.Pow(n, -(A + 1) / 2.0);
        for (var i = 0; i < data.Length; ++i)
            data[i] *= scale;
    }
}

public void TableFFT(double[] data, bool forward)
{
    var n = data.Length;
    // checks n is a power of 2 in 2's complement format
    if ((n & (n - 1)) != 0)
        throw new ArgumentException(
            "data length " + n + " in FFT is not a power of 2"
            );
    n /= 2; // n is the number of samples

    Reverse(data, n); // bit index data reversal

    // make table if needed
    if ((cosTable == null) || (cosTable.Length != n))
        Initialize(n);

    // do transform: so single point transforms, then doubles, etc.
    double sign = forward ? B : -B;
    var mmax = 1;
    var tptr = 0;
    while (n > mmax)
    {
        var istep = 2 * mmax;
        for (var m = 0; m < istep; m += 2)
        {
            var wr = cosTable[tptr];
            var wi = sign * sinTable[tptr++];
            for (var k = m; k < 2 * n; k += 2 * istep)
            {
                var j = k + istep;
                var tempr = wr * data[j] - wi * data[j + 1];
                var tempi = wi * data[j] + wr * data[j + 1];
                data[j] = data[k] - tempr;
                data[j + 1] = data[k + 1] - tempi;
                data[k] = data[k] + tempr;
                data[k + 1] = data[k + 1] + tempi;
            }
        }
        mmax = istep;
    }

    // perform data scaling as needed
    Scale(data, n, forward);
}
The amplitude is derived from the returned real and imaginary values as below, i.e. magnitude = sqrt(real * real + imag * imag):
// Converting the RealFFT return values into an amplitude for the graph
// sqrt(real * real + imag * imag)
realPart = new double[(dataLength / 2)];
imagPart = new double[(dataLength / 2)];
dataNew = new double[(dataLength / 2)];

for (int i = 0; i < dataNew.Length; i++)
{
    if (i == 0)
    {
        // Ignore the first entry, since data[0] and data[1] contain the special packed values.
    }
    else
    {
        // Calculate the magnitude from the real & imaginary parts.
        realPart[i] = data[(i * 2)] * data[(i * 2)];
        imagPart[i] = data[((i * 2) + 1)] * data[((i * 2) + 1)];
        dataNew[i] = Math.Sqrt(realPart[i] + imagPart[i]);
    }
}

// Get these values for Y Axis as Amplitude
Values.Add(dataNew);
int N = dataNew.Length;
double[] frequencies = new double[N];
for (int i = 0; i < N; i++)
{
    if (i < (N / 2))
    {
        frequencies[i] = i * SampleRate / N;
    }
    else if (i >= (N / 2))
    {
        frequencies[i] = (N - i) * SampleRate / N;
    }
    //frequencies[i] = (i / N); //*10;
}

// Get the Frequency values for X Axis
Values.Add(frequencies);
The processed amplitude values look like this:
0
9030724,08220743
78204971,4566076
11562334,8871099
10855402,9273669
9213306,99124749
39816810,42806
9154491,10446211
10747893,2800474
9744695,14696198
140738122,900694
The frequencies derived from the sample rate are as follows:
0
2
5
8
10
13
16
18
21
24
26
29
Finally, the graph does not show the correct frequency range, and peaks are observed throughout the chart, even though the .wav file is a pure sine wave and does not contain any noise. (I'm not able to attach the graph image due to missing reputation, but I'm ready to post it.)
For a pure sine tone there should be only one peak, around 1 kHz, but instead I'm again seeing something that looks like a sine signal. I've checked with different frequencies, but the result does not match the expectation.
I could not clearly figure out where I'm making the mistake, whether in:
processing the audio data (splitting into 4-byte values after the header info and converting to double),
processing the data within the FFT,
or the amplitude/frequency graph calculation.
I'm ready to provide additional information if required, and any help would be appreciated.

How to read IMediaSample 24 bit PCM data

I have the following method which collects PCM data from the IMediaSample into floats for the FFT:
public int PCMDataCB(IntPtr Buffer, int Length, ref TDSStream Stream, out float[] singleChannel)
{
    int numSamples = Length / (Stream.Bits / 8);
    int samplesPerChannel = numSamples / Stream.Channels;
    float[] samples = new float[numSamples];

    if (Stream.Bits == 32 && Stream.Float)
    {
        // this seems to work for 32 bit floating point
        byte[] buffer32f = new byte[numSamples * 4];
        Marshal.Copy(Buffer, buffer32f, 0, numSamples);
        for (int j = 0; j < buffer32f.Length; j += 4)
        {
            samples[j / 4] = System.BitConverter.ToSingle(new byte[] { buffer32f[j + 0], buffer32f[j + 1], buffer32f[j + 2], buffer32f[j + 3] }, 0);
        }
    }
    else if (Stream.Bits == 24)
    {
        // I need this code
    }

    // compress result into one mono channel
    float[] result = new float[samplesPerChannel];
    for (int i = 0; i < numSamples; i += Stream.Channels)
    {
        float tmp = 0;
        for (int j = 0; j < Stream.Channels; j++)
            tmp += samples[i + j] / Stream.Channels;
        result[i / Stream.Channels] = tmp;
    }

    // mono output to be used for visualizations
    singleChannel = result;
    return 0;
}
This seems to work for 32-bit float, because I get sensible data in the spectrum analyzer (although it seems shifted, or compressed, toward the lower frequencies).
I also seem to manage to make it work for 8-, 16- and 32-bit non-float data, but I can only read garbage when the bit depth is 24.
How can I adapt this to work with 24-bit PCM coming into Buffer?
Buffer comes from an IMediaSample.
Another thing I am wondering is whether the method I use to mix all channels down to one, by summing and dividing by the number of channels, is OK.
I figured it out:
byte[] buffer24 = new byte[numSamples * 3];
Marshal.Copy(Buffer, buffer24, 0, numSamples * 3);
var window = (float)(255 << 16 | 255 << 8 | 255);
for (int j = 0; j < buffer24.Length; j += 3)
{
    samples[j / 3] = (buffer24[j] << 16 | buffer24[j + 1] << 8 | buffer24[j + 2]) / window;
}
This creates an integer from the three bytes and then scales it into the range -1 to 1 by dividing by the maximum value of three bytes.
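One caveat, added here as my own note rather than part of the original answer: 24-bit PCM in a media stream is normally little-endian and signed, and the expression above treats it as big-endian and unsigned. A sketch that honors sign and byte order shifts the three bytes into the top of an Int32, so the sign bit lands in bit 31, and then scales by 2^31:
// Sketch: interpret each 3-byte group as signed little-endian 24-bit PCM.
// Packing the bytes into the top 24 bits of an int makes the sign bit line up,
// so the division back down preserves the sign.
for (int j = 0; j < buffer24.Length; j += 3)
{
    int value = (buffer24[j] << 8) | (buffer24[j + 1] << 16) | (buffer24[j + 2] << 24);
    samples[j / 3] = value / 2147483648f; // 2^31, scales to [-1, 1)
}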
Have you tried
byte[] buffer24f = new byte[numSamples * 3];
Marshal.Copy(Buffer, buffer24f, 0, numSamples);
for (int j = 0; j < buffer24f.Length; j += 3)
{
    samples[j / 3] = System.BitConverter.ToSingle(
        new byte[] {
            0,
            buffer24f[j + 0],
            buffer24f[j + 1],
            buffer24f[j + 2]
        }, 0);
}

Memory Object Allocation failure using c# and opencl

I am writing an image processing program with the express purpose of altering large images; the one I'm working with is 8165 by 4915 pixels. I was told to implement GPU processing, so after some research I decided to go with OpenCL, and I started implementing the OpenCL C# wrapper OpenCLTemplate.
My code takes in a bitmap and uses LockBits to lock its memory location. I then copy each byte into an array, run the array through the OpenCL kernel, which inverts each value, and then write the inverted values back into the memory location of the image. I split this process into ten chunks so that I can increment a progress bar.
My code works perfectly with smaller images, but when I try to run it with my big image I keep getting a MemObjectAllocationFailure when trying to execute the kernel. I don't know why it's doing this, and I would appreciate any help in figuring out why, or how to fix it.
using OpenCLTemplate;

public static void Invert(Bitmap image, ToolStripProgressBar progressBar)
{
    string openCLInvert = @"
        __kernel void Filter(__global uchar * Img0,
                             __global float * ImgF)
        {
            // Gets information about work-item
            int x = get_global_id(0);
            int y = get_global_id(1);

            // Gets information about work size
            int width = get_global_size(0);
            int height = get_global_size(1);

            int ind = 4 * (x + width * y);

            // Inverts image colors
            ImgF[ind] = 255.0f - (float)Img0[ind];
            ImgF[1 + ind] = 255.0f - (float)Img0[1 + ind];
            ImgF[2 + ind] = 255.0f - (float)Img0[2 + ind];

            // Leave alpha component equal
            ImgF[ind + 3] = (float)Img0[ind + 3];
        }";

    // Lock the image in memory and get image lock data
    var imageData = image.LockBits(new Rectangle(0, 0, image.Width, image.Height), ImageLockMode.ReadWrite, PixelFormat.Format32bppArgb);

    CLCalc.InitCL();
    for (int i = 0; i < 10; i++)
    {
        unsafe
        {
            int adjustedHeight = (((i + 1) * imageData.Height) / 10) - ((i * imageData.Height) / 10);
            int count = 0;
            byte[] Data = new byte[(4 * imageData.Stride * adjustedHeight)];
            var startPointer = (byte*)imageData.Scan0;

            for (int y = ((i * imageData.Height) / 10); y < (((i + 1) * imageData.Height) / 10); y++)
            {
                for (int x = 0; x < imageData.Width; x++)
                {
                    byte* Byte = (byte*)(startPointer + (y * imageData.Stride) + (x * 4));
                    Data[count] = *Byte;
                    Data[count + 1] = *(Byte + 1);
                    Data[count + 2] = *(Byte + 2);
                    Data[count + 3] = *(Byte + 3);
                    count += 4;
                }
            }

            CLCalc.Program.Compile(openCLInvert);
            CLCalc.Program.Kernel kernel = new CLCalc.Program.Kernel("Filter");
            CLCalc.Program.Variable CLData = new CLCalc.Program.Variable(Data);
            float[] imgProcessed = new float[Data.Length];
            CLCalc.Program.Variable CLFiltered = new CLCalc.Program.Variable(imgProcessed);
            CLCalc.Program.Variable[] args = new CLCalc.Program.Variable[] { CLData, CLFiltered };
            kernel.Execute(args, new int[] { imageData.Width, adjustedHeight });
            CLCalc.Program.Sync();
            CLFiltered.ReadFromDeviceTo(imgProcessed);

            count = 0;
            for (int y = ((i * imageData.Height) / 10); y < (((i + 1) * imageData.Height) / 10); y++)
            {
                for (int x = 0; x < imageData.Width; x++)
                {
                    byte* Byte = (byte*)(startPointer + (y * imageData.Stride) + (x * 4));
                    *Byte = (byte)imgProcessed[count];
                    *(Byte + 1) = (byte)imgProcessed[count + 1];
                    *(Byte + 2) = (byte)imgProcessed[count + 2];
                    *(Byte + 3) = (byte)imgProcessed[count + 3];
                    count += 4;
                }
            }
        }
        progressBar.Owner.Invoke((Action)progressBar.PerformStep);
    }

    // Unlock image
    image.UnlockBits(imageData);
}
You may have reached a memory allocation limit of your OpenCL driver/device. Check the values returned by clGetDeviceInfo: there is a limit on the size of a single memory object. The OpenCL driver may allow the total size of all allocated memory objects to exceed the memory size of your device, and will copy them to/from host memory when needed.
To process large images, you may have to split them into smaller pieces and process those separately.
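As a rough sketch of the splitting idea (not tied to any particular wrapper API; the 64 MB cap below is only an assumed placeholder, the real value is whatever clGetDeviceInfo reports for the maximum allocation size): process the image in horizontal strips whose device buffers stay under the limit, and compile the kernel once up front rather than inside the loop.
// Sketch: pick a strip height so each device buffer stays under an assumed limit.
// The float output buffer is the larger one: each byte of the locked bitmap row
// becomes a 4-byte float on the device.
const long assumedMaxAlloc = 64L * 1024 * 1024;            // placeholder; query the device for the real limit
long floatBytesPerRow = (long)imageData.Stride * sizeof(float);
int maxRowsPerStrip = (int)Math.Max(1, assumedMaxAlloc / floatBytesPerRow);
int stripCount = (imageData.Height + maxRowsPerStrip - 1) / maxRowsPerStrip;

for (int strip = 0; strip < stripCount; strip++)
{
    int startRow = strip * maxRowsPerStrip;
    int rows = Math.Min(maxRowsPerStrip, imageData.Height - startRow);

    // Copy rows [startRow, startRow + rows) out of the locked bitmap,
    // run the kernel on just that strip, then copy the results back,
    // exactly as in the 10-chunk loop above but with a size-driven strip count.
}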

C# predictive coding for image compression

I've been playing with Huffman compression on images to reduce their size while keeping the image lossless, but I've also read that you can use predictive coding to compress image data further by reducing its entropy.
From what I understand, in the lossless JPEG standard each pixel is predicted as the weighted average of the 4 adjacent pixels already encountered in raster order (three above and one to the left), e.g. trying to predict the value of a pixel a based on the preceding pixels x to the left of and above a:
x x x
x a
Then calculate and encode the residual (difference between predicted and actual value).
But what I don't get is: if the sum of the 4 neighboring pixels isn't a multiple of 4, you'd get a fraction, right? Should that fraction be ignored? If so, would the proper encoding of an 8-bit image (stored in a byte[]) be something like this?
public static void Encode(byte[] buffer, int width, int height)
{
    var tempBuff = new byte[buffer.Length];
    for (int i = 0; i < buffer.Length; i++)
    {
        tempBuff[i] = buffer[i];
    }

    for (int i = 1; i < height; i++)
    {
        for (int j = 1; j < width - 1; j++)
        {
            int offsetUp = ((i - 1) * width) + (j - 1);
            int offset = (i * width) + (j - 1);

            int a = tempBuff[offsetUp];
            int b = tempBuff[offsetUp + 1];
            int c = tempBuff[offsetUp + 2];
            int d = tempBuff[offset];
            int pixel = tempBuff[offset + 1];

            var ave = (a + b + c + d) / 4;
            var val = (byte)(ave - pixel);
            buffer[offset + 1] = val;
        }
    }
}

public static void Decode(byte[] buffer, int width, int height)
{
    for (int i = 1; i < height; i++)
    {
        for (int j = 1; j < width - 1; j++)
        {
            int offsetUp = ((i - 1) * width) + (j - 1);
            int offset = (i * width) + (j - 1);

            int a = buffer[offsetUp];
            int b = buffer[offsetUp + 1];
            int c = buffer[offsetUp + 2];
            int d = buffer[offset];
            int pixel = buffer[offset + 1];

            var ave = (a + b + c + d) / 4;
            var val = (byte)(ave - pixel);
            buffer[offset + 1] = val;
        }
    }
}
I don't see how this will really reduce entropy. How will this help compress my images further while still being lossless?
Thanks for any enlightenment.
EDIT:
After playing with the predictively coded images, I noticed that the histogram shows a lot of values around +/-1 for the various pixels. This reduces the entropy quite a bit in some cases. Here is a screenshot:
Yes, just truncate. It doesn't matter, because you store the difference. It reduces entropy because you only store small values; a lot of them will be -1, 0 or 1. There are a couple of off-by-one bugs in your snippet, by the way.
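As a sketch of how the scheme from the question can be made to round-trip exactly (this is my own reading, not code from the thread): keep the residuals modulo 256 so the byte arithmetic is reversible, leave the first row and first column unencoded as seed values, and let the decoder predict from pixels it has already reconstructed, which raster order guarantees.
// Sketch: lossless predictive coding for an 8-bit grayscale buffer in raster order.
public static void Encode(byte[] buffer, int width, int height)
{
    var original = (byte[])buffer.Clone(); // predictors always come from original pixels
    for (int i = 1; i < height; i++)
    {
        for (int j = 1; j < width - 1; j++)
        {
            int up = (i - 1) * width + j;
            int cur = i * width + j;
            int ave = (original[up - 1] + original[up] + original[up + 1] + original[cur - 1]) / 4;
            buffer[cur] = (byte)(ave - original[cur]); // residual, wraps modulo 256
        }
    }
}

public static void Decode(byte[] buffer, int width, int height)
{
    // Raster order guarantees every predictor has already been decoded (or was never encoded).
    for (int i = 1; i < height; i++)
    {
        for (int j = 1; j < width - 1; j++)
        {
            int up = (i - 1) * width + j;
            int cur = i * width + j;
            int ave = (buffer[up - 1] + buffer[up] + buffer[up + 1] + buffer[cur - 1]) / 4;
            buffer[cur] = (byte)(ave - buffer[cur]); // undoes the modulo-256 subtraction
        }
    }
}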
