How to get Bass, Mid, Treble data from FFT - c#

I'm new to this whole audio processing area and I'm wondering how to extract the bass, mid and treble from an FFT output. I'm currently using this to get the data: https://stackoverflow.com/a/20414331/2714577, which uses NAudio.
But I'm using an FFT length of 1024 (I require speed). I'm trying to get these 3 sections in a format such as 0-255 for colour purposes.
I currently have this:
double[] data = new double[512];

void FftCalculated(object sender, FftEventArgs e)
{
    for (int j = 0; j < e.Result.Length / 2; j++)
    {
        double magnitude = Math.Sqrt(e.Result[j].X * e.Result[j].X + e.Result[j].Y * e.Result[j].Y);
        double dbValue = 20 * Math.Log10(magnitude);
        data[j] = dbValue;
    }

    double d = 0;
    for (int i = 20; i < 89; i++)
    {
        d += data[i];
    }

    double m = 0;
    for (int i = 150; i < 255; i++)
    {
        m += data[i];
    }

    double t = 0;
    for (int i = 300; i < 512; i++)
    {
        t += data[i];
    }

    Debug.Message("" + d + " |||| " + m + " |||| " + t);
}
Is this right? How do I get this data into something more usable?

The coefficients you get out of a Fourier transform can be positive or negative - what you're interested in is the magnitude (i.e. the amount of each frequency), so you will want to take the absolute value in your summation.
Also, I would recommend normalizing - at the end of your summation do this:
double total = data.Sum(x => Math.Abs(x));
d /= total;
m /= total;
t /= total;
This way, your numbers will be confined to the range [0, 1) and you will get the same information out even if the sound is quieter (unless you don't want that). Actually, the range will be somewhat narrower than that, because each of your summations only covers part of the spectrum. So you may want to scale them by the largest of the three:
double largest = Math.Max(d, Math.Max(m, t));
d /= largest;
m /= largest;
t /= largest;
Now the range of each should be between 0 and 1. You can then multiply by 255 or 256 and truncate the decimal if you like.
The downside of the last step is that if the values are all zero (because the inputs were all zero), you will divide by zero. Oops! At this point you need to decide exactly what you want. If you don't do this scaling, then a sound which is entirely treble (according to your breakdown above) will have (0, 0, 1) for (d, m, t), a sound which is an even mixture of the three will have (0.3333, 0.3333, 0.3333), and a sound which is completely quiet will have (0, 0, 0). If that's not what you want, then you need to define exactly what you want before I can help you any further.
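To make the divide-by-zero case explicit, here is a minimal sketch of the whole normalization step under the assumptions above (abs-summed band values d, m and t, using System.Linq for Sum, and falling back to zero when everything is silent):

double total = data.Sum(x => Math.Abs(x));
if (total > 0)
{
    d /= total;
    m /= total;
    t /= total;
}

double largest = Math.Max(d, Math.Max(m, t));
if (largest > 0)
{
    d /= largest;
    m /= largest;
    t /= largest;
}

// Scale the 0..1 values to 0..255 for colour purposes.
byte bass   = (byte)(d * 255);
byte mid    = (byte)(m * 255);
byte treble = (byte)(t * 255);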

Your dbValue is already a very good number, measuring the level in decibels relative to 1.0, which becomes 0.0 dB.
You should average instead of sum the individual dB values at the various frequencies.
Then map the dB range of about -80 dB .. 0.0 dB to your colour range.
Also note: speech and music tend to have an average pink-noise spectrum. This means that low frequencies tend to have higher dB values than high frequencies.
You should compensate for this effect (probably before averaging the frequencies) to get a "better" display.
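Here is a minimal sketch of that mapping, assuming the roughly -80 dB .. 0 dB range mentioned above and the band index ranges from the question (the helper name BandToByteLevel is illustrative):

double BandToByteLevel(double[] dbData, int from, int to)
{
    // Average the dB values in the band (indices [from, to)).
    double sum = 0;
    for (int i = from; i < to; i++)
        sum += dbData[i];
    double avgDb = sum / (to - from);

    // Map roughly -80 dB .. 0 dB onto 0 .. 1, clamping anything outside that range,
    // then scale to 0 .. 255.
    double normalized = (avgDb + 80.0) / 80.0;
    normalized = Math.Max(0.0, Math.Min(1.0, normalized));
    return normalized * 255.0;
}

Used with the band limits from the question:

byte bass   = (byte)BandToByteLevel(data, 20, 89);
byte mid    = (byte)BandToByteLevel(data, 150, 255);
byte treble = (byte)BandToByteLevel(data, 300, 512);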

Related

How to calculate a normal distribution kernel for a 1D Gaussian filter

I need to apply a 1D Gaussian filter to a list of floats in C#, i.e. to smooth a graph.
I got as far as simply averaging each value with its n neighbours, but the result wasn't quite right, and so I discovered that I need to apply a normal distribution weight to the contributions of the values in each iteration.
I can't find a library like scipy that has a function for this, and I don't quite understand the algebraic formulas I have found for computing a Gaussian kernel. Examples are generally geared towards a 2D implementation for images.
Can anyone suggest the modifications that would need to be made to the following code to achieve the proper Gaussian effect?
public static List<float> MeanFloats(List<float> floats, int width)
{
    List<float> results = new List<float>();

    if (width % 2 == 0)
        width -= 1;                      // make sure width is odd
    int halfWidthMinus1 = width / 2;     // width is known to be odd, divide by 2 will round down

    for (int i = 0; i < floats.Count; i++)   // iterate through all floats in list
    {
        float result = 0;
        for (int j = 0; j < width; j++)
        {
            var index = i - halfWidthMinus1 + j;
            index = math.max(index, 0);               // clamp index - the first and last elements of the list
            index = math.min(index, floats.Count - 1); // are reused when the window runs past the ends of the list
            result += floats[index];                  // multiply with kernel here??
        }
        result /= width;                 // calculate mean
        results.Add(result);
    }
    return results;
}
If relevant this is for use in a Unity game.
A 1-dimensional Gaussian kernel is defined as

G(x) = exp(-x^2 / (2 * sigma^2)) / sqrt(2 * pi * sigma^2)

where sigma is the standard deviation (which controls how strong the smoothing is) and x is the index distance from the centre of the kernel.
You then create a kernel by filling each of its array slots with a multiplier. Here is an (untested) example:
private static float[] GaussianKernel(int width, float sigma)
{
    // width is the number of taps on each side of the centre,
    // so the kernel has (2 * width + 1) entries in total
    float[] kernel = new float[width + 1 + width];
    for (int i = -width; i <= width; i++)
    {
        kernel[width + i] = Mathf.Exp(-(i * i) / (2 * sigma * sigma)) / Mathf.Sqrt(2 * Mathf.PI * sigma * sigma);
    }
    return kernel;
}
In your smoothing function you apply this multiplier to the floats[index] value. Finally, before adding the result, instead of dividing it by the width, you divide it by the total sum of the kernel weights (the values of the kernel array).
You can accumulate that sum of kernel weights during each iteration of your j-loop, e.g. weightSum += kernel[j].
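Putting the two pieces together, here is a minimal (untested) sketch of the smoothing function with the kernel applied; the name GaussianFloats and the sigma parameter are illustrative additions, not part of the original code:

public static List<float> GaussianFloats(List<float> floats, int width, float sigma)
{
    List<float> results = new List<float>();

    if (width % 2 == 0)
        width -= 1;                      // make sure width is odd
    int halfWidthMinus1 = width / 2;
    float[] kernel = GaussianKernel(halfWidthMinus1, sigma);  // kernel has exactly 'width' entries

    for (int i = 0; i < floats.Count; i++)
    {
        float result = 0;
        float weightSum = 0;
        for (int j = 0; j < width; j++)
        {
            var index = i - halfWidthMinus1 + j;
            index = math.max(index, 0);                   // clamp to the list bounds, as before
            index = math.min(index, floats.Count - 1);
            result += floats[index] * kernel[j];          // weight each sample by the kernel
            weightSum += kernel[j];                       // accumulate the weights actually used
        }
        results.Add(result / weightSum);                  // normalise by the total weight instead of the width
    }
    return results;
}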

Dividing and spacing

I need to divide a variable distance in a very specific way. The spacing for the divisions must be 40 units minimum and 80 units maximum.
I've tried several different variations of this code, but I am struggling to wrap my head around how to include the min/max constraints in my division.
double totaldist = X;
double division = totaldist / 80;
double roundup = Math.Ceiling(division);
double space = totaldist / roundup;
double increment = 0;
while (increment < totaldist)
{
    increment = increment + space;
}
The attached code obviously falls short of what I want to accomplish, and I'm not sure how to bridge the gap. Thank you.
So all you have to do is loop over all the possible divisors and pick the best one. The simplest way to accomplish this is as follows:
public static int remainder(int totalDist)
{
    double minRemainder = (totalDist % 40) / 40.0;
    int bestDivision = 40;
    for (var i = 40; i <= 80; i++)
    {
        double cRemainder = (totalDist % i) / (double)i;
        if (totalDist % i == 0) return i;
        else if (cRemainder < minRemainder) { minRemainder = cRemainder; bestDivision = i; }
    }
    return bestDivision;
}
This will always return the closest result. Even if there is no real solution, it will still provide an approximate answer as a fallback.
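As a usage sketch (the variable names are illustrative), the returned value is the spacing, and the number of whole divisions follows from it:

int totalDist = 430;                      // example distance
int spacing = remainder(totalDist);       // spacing between 40 and 80 with the smallest relative remainder
int divisions = totalDist / spacing;      // number of whole segments at that spacing
Console.WriteLine($"{divisions} segments of {spacing} units, {totalDist % spacing} units left over");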
I'd test every divisor for mod 0 (no remainder)
int d = 420;
int s = 40;
for (; s <= 80; s++)
{
    if (d % s == 0)
        break;
}
if (s == 81)
    Console.Write("There is no suitable divisor");
else
    Console.Write($"{d} divides into {d/s} segments of {s} with no remainder");
If you instead want to maximise the segment length (fewer segments), start at 80 and work down towards 40 in the loop - set your d to 480, start at 80 and you should get "6 segments of length 80" rather than "12 segments of length 40".
You can even get cute with your loop and have no body:
for (; s <= 80 && d % s > 0; s++) { }
But it's not quite so readable or self-explanatory.
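For completeness, a minimal sketch of the reversed variant described above (same d, same message format), which prefers the largest spacing:

int s = 80;
for (; s >= 40 && d % s > 0; s--) { }
if (s == 39)
    Console.Write("There is no suitable divisor");
else
    Console.Write($"{d} divides into {d/s} segments of {s} with no remainder");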

Normalizing signed values results in values outside 0 and 1

I am trying to create some kind of heatmap to visualize data I get from a sensor. So far this works, but it looked odd, so I figured out that my code which normalizes the values (ranging from at least -100 to 100) must be invalid.
I often got exceptions because the factor (the normalized value) I use to calculate the color (byte) was negative or greater than 1.
So I modified my code and basically shifted min, max and the value by Math.Abs(min) to ensure that my values are positive. The negative issue is fixed, but I sometimes end up with values like 1.02xxxx, which must not happen. EDIT: Still getting negative values in some cases...
This is my code so far:
double min = data.Min();
double max = data.Max();
double avg = data.Average();
double v = max - min;
byte a, b, g, r;

// Set minimum and maximum
Dispatcher.InvokeAsync(() =>
{
    Minimum = min;
    Maximum = max;
    Average = avg;
});

double shift = 0;
if (min < 0) // If minimum is negative, shift it to 0 and everything else positively by Abs(min)
{
    max += shift = Math.Abs(min);
    min += shift;
}

int skipFactor = (int)(1 / _pointResolution);
for (int i = 0; i < data.Length; i += skipFactor)
{
    // If min == 0 then shift data
    double value = Math.Abs(min) < 0.0001d ? (data[i] + shift) / max : (data[i] - min) / v;
    ...
}
How can I fix this without adding rounding or "edge-cases"? I assume the Math.Abs() combined with adding operations results in slightly off values... But there must be a solution to that, right?
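For reference, here is a minimal sketch of the plain min-max normalization that the shifting above is trying to emulate, with an explicit clamp. This is a generic sketch rather than a drop-in fix, and it assumes min and max are computed from the same snapshot of data that the loop reads:

double min = data.Min();
double max = data.Max();
double range = max - min;

for (int i = 0; i < data.Length; i += skipFactor)
{
    // (value - min) / (max - min) maps [min, max] onto [0, 1] without any shifting,
    // regardless of whether min is negative.
    double value = range > 0 ? (data[i] - min) / range : 0.0;

    // Clamp defensively in case data changes between computing min/max and this loop.
    value = Math.Max(0.0, Math.Min(1.0, value));

    byte level = (byte)(value * 255);
    // ... map level to a colour ...
}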

Getting a neural network to output anything in between -1.0 and 1.0

I am trying to make a back propagation neural network.
It is based upon the tutorials I found here: MSDN article by James McCaffrey. He gives many examples, but all his networks are based upon the same problem to solve, so his networks look like 4:7:3 >> 4 input - 7 hidden - 3 output.
His output is always binary 0 or 1: one output gets a 1, to classify an Iris flower into one of the three categories.
I would like to solve another problem with a neural network, and that would require two neural networks, where one needs an output between 0 and 255 and the other between 0 and 2 times Pi (a full turn, a circle). Essentially I think I need an output that ranges from 0.0 to 1.0, or from -1 to 1, and anything in between, so that I can multiply it to become 0..255 or 0..2Pi.
I think his network behaves the way it does because of his ComputeOutputs, which I show below here:
private double[] ComputeOutputs(double[] xValues)
{
    if (xValues.Length != numInput)
        throw new Exception("Bad xValues array length");

    double[] hSums = new double[numHidden]; // hidden nodes sums scratch array
    double[] oSums = new double[numOutput]; // output nodes sums

    for (int i = 0; i < xValues.Length; ++i) // copy x-values to inputs
        this.inputs[i] = xValues[i];

    for (int j = 0; j < numHidden; ++j)      // compute i-h sum of weights * inputs
        for (int i = 0; i < numInput; ++i)
            hSums[j] += this.inputs[i] * this.ihWeights[i][j]; // note +=

    for (int i = 0; i < numHidden; ++i)      // add biases to input-to-hidden sums
        hSums[i] += this.hBiases[i];

    for (int i = 0; i < numHidden; ++i)      // apply activation
        this.hOutputs[i] = HyperTanFunction(hSums[i]); // hard-coded

    for (int j = 0; j < numOutput; ++j)      // compute h-o sum of weights * hOutputs
        for (int i = 0; i < numHidden; ++i)
            oSums[j] += hOutputs[i] * hoWeights[i][j];

    for (int i = 0; i < numOutput; ++i)      // add biases to hidden-to-output sums
        oSums[i] += oBiases[i];

    double[] softOut = Softmax(oSums); // softmax activation does all outputs at once for efficiency
    Array.Copy(softOut, outputs, softOut.Length);

    double[] retResult = new double[numOutput]; // could define a GetOutputs method instead
    Array.Copy(this.outputs, retResult, retResult.Length);
    return retResult;
}
The network uses the following HyperTan function:
private static double HyperTanFunction(double x)
{
    if (x < -20.0) return -1.0;     // approximation is correct to 30 decimals
    else if (x > 20.0) return 1.0;
    else return Math.Tanh(x);
}
Above, the output layer uses Softmax(), and I think that is critical to the problem here: I think it is what makes his output all binary. It looks like this:
private static double[] Softmax(double[] oSums)
{
    // determine max output sum
    // does all output nodes at once so scale doesn't have to be re-computed each time
    double max = oSums[0];
    for (int i = 0; i < oSums.Length; ++i)
        if (oSums[i] > max) max = oSums[i];

    // determine scaling factor -- sum of exp(each val - max)
    double scale = 0.0;
    for (int i = 0; i < oSums.Length; ++i)
        scale += Math.Exp(oSums[i] - max);

    double[] result = new double[oSums.Length];
    for (int i = 0; i < oSums.Length; ++i)
        result[i] = Math.Exp(oSums[i] - max) / scale;

    return result; // now scaled so that xi sum to 1.0
}
How can I rewrite Softmax so that the network will be able to give non-binary answers?
Notice the full code of the network is here, if you would like to try it out.
Also, to test the network the following accuracy function is used; maybe the binary behaviour emerges from it:
public double Accuracy(double[][] testData)
{
    // percentage correct using winner-takes-all
    int numCorrect = 0;
    int numWrong = 0;
    double[] xValues = new double[numInput];  // inputs
    double[] tValues = new double[numOutput]; // targets
    double[] yValues;                         // computed Y

    for (int i = 0; i < testData.Length; ++i)
    {
        Array.Copy(testData[i], xValues, numInput); // parse test data into x-values and t-values
        Array.Copy(testData[i], numInput, tValues, 0, numOutput);
        yValues = this.ComputeOutputs(xValues);
        int maxIndex = MaxIndex(yValues);   // which cell in yValues has largest value?
        int tMaxIndex = MaxIndex(tValues);
        if (maxIndex == tMaxIndex)
            ++numCorrect;
        else
            ++numWrong;
    }
    return (numCorrect * 1.0) / (double)testData.Length;
}
Just in case someone gets into the same situation:
If you need some example code of neural network regression (an NNR, as they are called), here is a link to sample code in C#, and here is a good article about it. Notice the guy writes more articles there; you won't find everything, but there's a lot there. Although I had been following this man for a while, I missed this specific article, as I didn't know what they were called when I asked the question here on Stack Overflow.
I'm a bit rusty at neural networks, but I think if you want a range of values from your output, then you need to make sure the activation function on your output layer is linear (or something that has a similar effect).
Try adding this method:
private static double[] Linear(double[] oSums)
{
    // requires "using System.Linq;" for the Sum() extension method
    double sum = oSums.Sum(d => Math.Abs(d));
    double[] result = new double[oSums.Length];
    for (int i = 0; i < oSums.Length; ++i)
        result[i] = Math.Abs(oSums[i]) / sum;

    // scaled so that the xi sum to 1.0
    return result;
}
And then in the ComputeOutputs method you need to use this new activation function for the output (rather than Softmax):
...
//double[] softOut = Softmax(oSums); // all outputs at once for efficiency
double[] softOut = Linear(oSums); // all outputs at once for efficiency
Array.Copy(softOut, outputs, softOut.Length);
...
This should now output linear values.
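Once the outputs are in the 0..1 range, scaling them to the ranges asked for in the question is just a multiplication. A minimal sketch (illustrative only):

double[] yValues = this.ComputeOutputs(xValues); // with Linear as the output activation, each value is in [0, 1]

double asByteRange = yValues[0] * 255.0;          // scale 0..1 up to 0..255
double asAngle     = yValues[0] * 2.0 * Math.PI;  // or scale 0..1 up to 0..2*Pi (a full turn)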

Calculate Biggest Rational Fraction Within Some Bounds

I am trying to place currency trades that match an exact rate on a market that only accepts integral bid/offer amounts. I want to make the largest trade possible at a specific rate. This is a toy program, not a real trading bot, so I am using C#.
I need an algorithm that returns an answer in a reasonable amount of time even when the numerator and denominator can be large (100000+).
static bool CalcBiggestRationalFraction(float target_real, float epsilon, int numerator_max, int denominator_max, out int numerator, out int denominator)
{
    // target_real is the ratio we are trying to achieve in our output fraction (numerator / denominator)
    // epsilon is the largest difference abs(target_real - (numerator / denominator)) we are willing to tolerate in the answer
    // numerator_max, denominator_max are the upper bounds on the numerator and the denominator in the answer
    //
    // in the case where there are multiple answers, we want to return the largest one
    //
    // in the case where an answer is found that is within epsilon, we return true and the answer.
    // in the case where an answer is not found that is within epsilon, we return false and the closest answer that we have found.
    //
    // ex: CalcBiggestRationalFraction(.5, .001, 4, 4, num, denom) returns (2/4) instead of (1/2).
}
I asked a previous question that is similar (http://stackoverflow.com/questions/4385580/finding-the-closest-integer-fraction-to-a-given-random-real) before I thought about what I was actually trying to accomplish and it turns out that I am trying to solve a different, but related problem.
The canonical way to solve your problem is with continued fraction expansion. In particular, see this section.
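For reference, here is a minimal sketch of that idea under stated assumptions: it walks the continued fraction convergents of the target and stops before the denominator exceeds denominator_max. The name BestRationalApproximation and the stopping rule are illustrative (it ignores semiconvergents), not the linked algorithm verbatim:

static void BestRationalApproximation(double target, int denominatorMax, out long numerator, out long denominator)
{
    // Convergent recurrences: h(k) = a(k)*h(k-1) + h(k-2), k(k) = a(k)*k(k-1) + k(k-2)
    long hPrev = 1, hPrevPrev = 0;   // numerators of the previous two convergents
    long kPrev = 0, kPrevPrev = 1;   // denominators of the previous two convergents
    double x = target;

    while (true)
    {
        long a = (long)Math.Floor(x);
        long h = a * hPrev + hPrevPrev;
        long k = a * kPrev + kPrevPrev;
        if (k > denominatorMax)
            break;                   // the next convergent would overshoot the denominator bound

        hPrevPrev = hPrev; hPrev = h;
        kPrevPrev = kPrev; kPrev = k;

        double frac = x - a;
        if (frac < 1e-12)
            break;                   // the target is (numerically) exactly h/k
        x = 1.0 / frac;
    }

    numerator = hPrev;
    denominator = kPrev;
}

The result is in lowest terms; to get the biggest equivalent fraction within the bounds, multiply numerator and denominator by the largest integer that keeps both under numerator_max and denominator_max.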
If you want the unreduced fraction, then here's one optimization you can do: since you're never interested in n/2 when you could have 2n/4, 4n/8, or 1024n/2048, you only need to check some of the denominators. Once you've checked a multiple of 2, you never need to check 2 itself. So I believe you can try just the denominators from denominator_max down to denominator_max/2; every smaller denominator has a multiple somewhere in that range, so they are all covered implicitly.
I'm not at a compiler at the moment, so I haven't checked this code for correctness, or even that it compiles, but it should be close.
static bool CalcBiggestRationalFraction(float target_real, float epsilon,
                                        int numerator_max, int denominator_max,
                                        out int numerator, out int denominator)
{
    if ((int)Math.Round(target_real * denominator_max) > numerator_max)
    {
        // We were given values that don't match up.
        // For example, target_real = 0.5, but max_num / max_den = 0.3
        denominator_max = (int)(numerator_max / target_real);
    }

    numerator = 0;
    denominator = 1;
    float bestEpsilon = float.MaxValue;

    for (int den = denominator_max; den >= denominator_max / 2; den--)
    {
        int num = (int)Math.Round(target_real * den);
        float thisEpsilon = Math.Abs(((float)num / den) - target_real);
        if (thisEpsilon < bestEpsilon)
        {
            numerator = num;
            denominator = den;
            bestEpsilon = thisEpsilon;
        }
    }
    return bestEpsilon < epsilon;
}
Let's try this:
First, we need to turn the float into a fraction. Easiest way I can think to do this is to find the order of magnitude of the epsilon, multiply the float by that order, and truncate to get the numerator.
long orderOfMagnitude = 1;
while (epsilon * orderOfMagnitude < 1)
    orderOfMagnitude *= 10;

numerator = (int)(target_real * orderOfMagnitude);
denominator = (int)orderOfMagnitude;

// sanity check; if the initial fraction isn't within the epsilon, then add sig figs until it is
while (target_real - (float)numerator / denominator > epsilon)
{
    orderOfMagnitude *= 10;
    numerator = (int)(target_real * orderOfMagnitude);
    denominator = (int)orderOfMagnitude;
}
Now we can reduce the fraction to lowest terms. The most efficient way I know of is to attempt to divide by all prime numbers less than or equal to the square root of the smaller of the numerator and denominator.
var primes = new List<int> { 2, 3, 5, 7, 11, 13, 17, 19, 23 }; // to start us off
var i = 0;
while (true)
{
    if (Math.Sqrt(numerator) < primes[i] || Math.Sqrt(denominator) < primes[i]) break;
    if (numerator % primes[i] == 0 && denominator % primes[i] == 0)
    {
        numerator /= primes[i];
        denominator /= primes[i];
        i = 0;
    }
    else
    {
        i++;
        if (i >= primes.Count)
        {
            // Find the next prime number by looking for the first number not divisible
            // by any prime < sqrt(number).
            // We are actually unlikely to have to use this, because the denominator
            // is a power of 10, so its prime factorization will be 2^x * 5^x
            var next = primes.Last() + 2;
            bool add;
            do
            {
                add = true;
                for (var x = 0; primes[x] <= Math.Sqrt(next); x++)
                    if (next % primes[x] == 0)
                    {
                        add = false;
                        break;
                    }
                if (add)
                    primes.Add(next);
                else
                    next += 2;
            } while (!add);
        }
    }
}
