I am reading a raw wave stream coming from the microphone. (This part works as I can send it to the speaker and get a nice echo.)
For simplicity, let's say I want to detect a DTMF tone in the wave data. In reality I want to detect any frequency, not just those in DTMF, but I always know which frequency I am looking for.
I have tried running it through an FFT, but that doesn't seem very efficient if I want fine time resolution in the detection (say the tone is present for only 20 ms). I can only resolve it down to around 200 ms.
What are my options with regards to algorithms?
Are there any .Net libs for it?
You may want to look at the Goertzel algorithm if you're trying to detect specific frequencies such as DTMF input. There is a C# DTMF generator/detector library on Sourceforge based on this algorithm.
There is a very nice implementation of Goertzel there. A C# adaptation:
private double GoertzelFilter(float[] samples, double freq, int start, int end)
{
    double sPrev = 0.0;
    double sPrev2 = 0.0;
    int i;
    // SIGNAL_SAMPLE_RATE is the sample rate in Hz, defined elsewhere in the class
    double normalizedfreq = freq / SIGNAL_SAMPLE_RATE;
    double coeff = 2 * Math.Cos(2 * Math.PI * normalizedfreq);
    // Run the second-order recurrence over samples[start..end)
    for (i = start; i < end; i++)
    {
        double s = samples[i] + coeff * sPrev - sPrev2;
        sPrev2 = sPrev;
        sPrev = s;
    }
    // Squared magnitude of the signal component at freq
    double power = sPrev2 * sPrev2 + sPrev * sPrev - coeff * sPrev * sPrev2;
    return power;
}
Works great for me.
Let's say that typical DTMF frequencies are 200 Hz to 1000 Hz. A 20 ms tone then spans only 4 to 20 cycles. An FFT will not get you far, I guess: over a 20 ms window the bins are spaced 1 / 0.02 s = 50 Hz apart, so you only resolve multiples of 50 Hz. This is a built-in property of the FFT; packing more samples into the same 20 ms will not solve your problem. You'll have to do something more clever.
Your best shot is to linear least-square fit your data to
h(t) = A cos (omega t) + B sin (omega t)
for a given omega (one of the DTMF frequencies). See this for details (in particular how to set a statistical significance level) and links to the literature.
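The fit above has a closed-form solution because the model is linear in A and B. Here is a minimal sketch of it, solving the 2x2 normal equations directly. The class and method names (`SineFit`, `FitAmplitude`) and the parameter layout are made up for illustration, and it does nothing about the statistical significance level mentioned above.

```csharp
using System;

public static class SineFit
{
    // Least-squares fit of x[n] to A*cos(w*n) + B*sin(w*n) for a known frequency.
    // Returns the fitted amplitude sqrt(A^2 + B^2).
    public static double FitAmplitude(float[] samples, int start, int count,
                                      double freqHz, double sampleRate)
    {
        double w = 2.0 * Math.PI * freqHz / sampleRate; // radians per sample
        double scc = 0, sss = 0, scs = 0, sxc = 0, sxs = 0;
        for (int n = 0; n < count; n++)
        {
            double c = Math.Cos(w * n);
            double s = Math.Sin(w * n);
            double x = samples[start + n];
            scc += c * c;
            sss += s * s;
            scs += c * s;
            sxc += x * c;
            sxs += x * s;
        }
        // Normal equations: [scc scs; scs sss] * [A; B] = [sxc; sxs]
        double det = scc * sss - scs * scs;
        if (Math.Abs(det) < 1e-12)
        {
            return 0.0; // degenerate case, e.g. too few samples
        }
        double a = (sxc * sss - sxs * scs) / det;
        double b = (sxs * scc - sxc * scs) / det;
        return Math.Sqrt(a * a + b * b);
    }
}
```

You would call it on each 20 ms block (160 samples at 8000 Hz, for example) and compare the returned amplitude against a threshold you pick by experiment.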
I found this simple implementation of Goertzel. I haven't gotten it to work yet (looking for the wrong frequency?), but I thought I'd share it anyway. It is copied from this site.
public static double CalculateGoertzel(byte[] sample, double frequency, int samplerate)
{
double Skn, Skn1, Skn2;
Skn = Skn1 = Skn2 = 0;
for (int i = 0; i < sample.Length; i++)
{
Skn2 = Skn1;
Skn1 = Skn;
Skn = 2 * Math.Cos(2 * Math.PI * frequency / samplerate) * Skn1 - Skn2 + sample[i];
}
double WNk = Math.Exp(-2 * Math.PI * frequency / samplerate);
return 20 * Math.Log10(Math.Abs((Skn - WNk * Skn1)));
}
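One possible reason the snippet above misbehaves: `Math.Exp(-2 * Math.PI * frequency / samplerate)` is a real exponential, whereas the Goertzel output stage uses the complex factor e^(-j*2*pi*f/fs). The magnitude can be computed without complex numbers at all. A minimal sketch follows; `GoertzelSketch`, `Power`, `Amplitude` and `FromPcm16` are names invented here, and treating the `byte[]` as 16-bit little-endian mono PCM is only an assumption about the input format.

```csharp
using System;

public static class GoertzelSketch
{
    // Squared magnitude of the block's component at freqHz.
    public static double Power(double[] block, double freqHz, double sampleRate)
    {
        double coeff = 2.0 * Math.Cos(2.0 * Math.PI * freqHz / sampleRate);
        double s1 = 0.0, s2 = 0.0;
        foreach (double x in block)
        {
            double s0 = x + coeff * s1 - s2;
            s2 = s1;
            s1 = s0;
        }
        return s1 * s1 + s2 * s2 - coeff * s1 * s2;
    }

    // Rough sine amplitude estimate: a sine of amplitude A gives |X| of about A*N/2.
    public static double Amplitude(double[] block, double freqHz, double sampleRate)
    {
        double power = Math.Max(Power(block, freqHz, sampleRate), 0.0);
        return 2.0 * Math.Sqrt(power) / block.Length;
    }

    // Assumption: raw holds 16-bit little-endian mono PCM.
    public static double[] FromPcm16(byte[] raw)
    {
        double[] result = new double[raw.Length / 2];
        for (int i = 0; i < result.Length; i++)
        {
            short sample = (short)(raw[2 * i] | (raw[2 * i + 1] << 8));
            result[i] = sample / 32768.0;
        }
        return result;
    }
}
```

The amplitude estimate is only clean when the block holds a whole number of cycles of the target frequency; otherwise expect some leakage and pick the detection threshold by experiment.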
As for .NET libraries that do this, try the TAPIEx ToneDecoder.Net Component. I use it for detecting DTMF, but it can detect custom tones as well.
I know this question is old, but maybe it will save someone else several days of searching and trying out code samples and libraries that just don't work.
Spectral analysis.
Every application where you extract frequencies from signals falls under the field of spectral analysis.
I searched Google for this but found no questions that matched my problem, so I am posting it here along with the code example below.
Hello. I have a minimum number (3200), a maximum number (4000) and a third number (3663). I am trying to work out how far the third number has progressed from the minimum towards the maximum, as a percentage.
To put it in better terms: I have levels, each starting at a given XP count, and I am trying to get the progression to the next level.
Raw [C#]: (3663 - 3200) / (4000 - 3200) * 100
Code:
int progression = (newXp - Convert.ToInt32(currentGrade["grade_xp_needed"])) / (Convert.ToInt32(newGrade["grade_xp_needed"]) - Convert.ToInt32(currentGrade["grade_xp_needed"])) * 100;
Why is it returning 0% but I still have 300 or something XP to go? I've worked it out in PHP and it returns 57%
PHP Code:
$progression = round(($currentXp - $currentGrade->grade_xp_needed) / ($nextGrade->grade_xp_needed - $currentGrade->grade_xp_needed) * 100, 1);
You need to convert the values to decimal to get the correct division results. Then you can convert back the result of the whole equation into integer.
int progression = (int)((newXp - Convert.ToDecimal(currentGrade["grade_xp_needed"]))
/ (Convert.ToDecimal(newGrade["grade_xp_needed"])
- Convert.ToDecimal(currentGrade["grade_xp_needed"])) * 100);
In C#, an int divided by an int results in an int.
You are getting a wrong result because you are performing an integer calculation.
Make either the numerator or the denominator a floating-point value and it will work.
int progression = (int) ((1.0*(newXp - Convert.ToInt32(currentGrade["grade_xp_needed"]))) / (Convert.ToInt32(newGrade["grade_xp_needed"]) - Convert.ToInt32(currentGrade["grade_xp_needed"])) * 100);
Hope this helps.
You have to use double instead of integer.
See this link for sample code and output.
public static void Main()
{
double low = 3200;
double high = 4000;
double xp = 3663;
double progression = (xp - low) / (high - low) * 100;
Console.WriteLine(progression + "%");
}
Audio noob here and math challenged. I'm working with DirectSound which uses a -10000 to 0 range, converting that to a 0-100 scale.
I found this function here to obtain the millibels based on a percentage:
private int ConvertPercentageToMillibels(double value)
{
double attenuation = 1.0 / 1024.0 + value / 100.0 * 1023.0 / 1024.0;
double db = 10 * Math.Log10(attenuation) / Math.Log10(2);
return (int)(db * 100);
}
I need help getting the inverse of this function, basically to get the percentage based on millibels. Here is what I've got so far, which isn't working:
private double ConvertMillibelsToPercentage(int value)
{
double db = value / 100;
double attenuation = Math.Pow(10, db) / 10 * Math.Pow(10, 2);
double percentage = (1.0 * attenuation) - (1024.0 * 100.0 / 1023.0 * 1024.0);
return percentage;
}
Here you go!
private double ConvertMillibelsToPercentage(int value)
{
double exponent = ((value / 1000.0) + 10);
double numerator = 100.0 * (Math.Pow(2, exponent) - 1);
return numerator / 1023.0;
}
Answer will differ slightly due to obvious issues that arise from going between an int and a double.
EDIT: Per the "teach how to fish" request, here are the first mathematical steps toward arriving at the solution. I didn't show the whole thing because I didn't want to spoil all the fun. All log functions should be considered log base 10 unless otherwise noted:
millibels = db*100; // Beginning to work backward
millibels = 10*Log(attenuation)*(1/Log(2))*100; // Substituting for db
millibels = 1000*Log(attenuation)/Log(2); // Simplifying
let millibels = m. Then:
m = 1000*Log(attenuation)/Log(2);
from here you can go two routes, you can either use properties of logs to find that:
m = 1000* Log_2(attenuation);// That is, log base 2 here
attenuation = 2^(m/1000);
OR you can ignore that particular property and realize:
attenuation = 10^(m*Log(2)/1000);
Try to work it out from one of the above options by plugging in the value that you know for attenuation:
attenuation = (1/1024)+(percentage/100)*(1023/1024);
And then solving for percentage. Good luck!
PS If you ever get stuck on things like this, I highly recommend going to the math stack exchange - there are some smart people there who love to solve math problems.
OR if you are particularly lazy and just want the answer, you can often simply type this stuff into Wolfram Alpha and it will "magically" give you the answer. Check this out
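To check an inverse like this one, a round trip through both functions is a quick test. The sketch below restates the forward function from the question and writes the inverse as two steps (attenuation from millibels, then percentage from attenuation). The class and method names are made up for illustration.

```csharp
using System;

public static class VolumeScale
{
    // Forward mapping, as in the question: percentage (0..100) to millibels (-10000..0).
    public static int PercentageToMillibels(double percent)
    {
        double attenuation = 1.0 / 1024.0 + percent / 100.0 * 1023.0 / 1024.0;
        double db = 10.0 * Math.Log10(attenuation) / Math.Log10(2.0);
        return (int)(db * 100.0);
    }

    // Inverse mapping: millibels back to a percentage.
    public static double MillibelsToPercentage(int millibels)
    {
        // millibels = 1000 * log2(attenuation), so attenuation = 2^(millibels / 1000)
        double attenuation = Math.Pow(2.0, millibels / 1000.0);
        // attenuation = 1/1024 + (percent/100) * (1023/1024), solved for percent
        return (attenuation * 1024.0 - 1.0) * 100.0 / 1023.0;
    }
}
```

Because the forward function truncates to an int, a round trip will not reproduce the percentage exactly; it should land within a small fraction of a percent.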
I just wrote the implementation of dft. Here is my code:
int T = 2205;
float[] sign = new float[T];
for (int i = 0, j = 0; i < T; i++, j++)
sign[i] = (float)Math.Sin(2.0f * Math.PI * 120.0f * i/ 44100.0f);
float[] re = new float[T];
float[] im = new float[T];
float[] dft = new float[T];
for (int k = 0; k < T; k++)
{
for (int n = 0; n < T; n++)
{
re[k] += sign[n] * (float)Math.Cos(2.0f* Math.PI * k * n / T);
im[k] += sign[n] * (float)Math.Sin(2.0f* Math.PI * k * n / T);
}
dft[k] = (float)Math.Sqrt(re[k] * re[k] + im[k] * im[k]);
}
So the sampling frequency is 44100 Hz and I have a 50 ms segment of a 120 Hz sine wave. According to the result, the DFT magnitude peaks at points 7 and 2200. Did I do something wrong and if not, how should I interpret the results?
I tried the FFT method of AForge. Here is my code.
int T = 2048;
float[] sign = new float[T];
AForge.Math.Complex[] input = new AForge.Math.Complex[T];
for (int i = 0; i < T; i++)
{
sign[i] = (float)Math.Sin(2.0f * Math.PI * 125.0f * i / 44100.0f);
input[i].Re = sign[i];
input[i].Im = 0.0;
}
AForge.Math.FourierTransform.FFT(input, AForge.Math.FourierTransform.Direction.Forward);
AForge.Math.FourierTransform.FFT(input, AForge.Math.FourierTransform.Direction.Backward);
I had expected to get the original signal back, but I got something different (a function with only positive values). Is that normal?
Thanks in advance!
Your code looks correct, but it could be more efficient. The DFT is usually computed with the FFT (fast Fourier transform); it is not a different transform, just an algorithm that computes the DFT more efficiently.
Even if you do not want to implement an FFT (which is a bit harder to understand, and harder to make work on data whose length is not a power of 2) or use some open source code, you can make your implementation a bit faster. For example, 2.0f * Math.PI * k / T is constant within the inner loop, so you can compute it once for each k (move it outside the inner loop) and then just multiply it by n in your cos/sin calls.
As for position and interpretation: you have changed domain, so the X-axis (the index into the array) now corresponds to frequency, not time. You sample at 44100 Hz and captured 2205 samples, so adjacent bins are 44100 Hz / 2205 = 20 Hz apart, and bin k holds the magnitude of your input at k * 20 Hz. Your magnitude peak is at the 7th point (index 6) because your signal is 120 Hz and 6 * 20 Hz = 120 Hz, which is what you would expect.
The second peak might seem to represent some high frequency, but it is only a mirror image. Because your sampling rate is 44100 Hz, you cannot measure frequencies above 44100 Hz / 2 (the Nyquist limit); that is your cut-off point, and for a real-valued input the DFT bins beyond it carry no new information. The second half of your table is basically the first half mirrored, so you can ignore it.
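The points above (hoisting the constant, the 20 Hz bin spacing, ignoring the mirrored half) can be sketched as follows. This is only an illustration; `DftSketch`, `HalfSpectrum`, `PeakBin` and `BinToHz` are hypothetical names, and it is still the slow O(N^2) DFT, not an FFT.

```csharp
using System;

public static class DftSketch
{
    // Magnitudes for bins 0..N/2 of a real signal; the upper half mirrors these.
    public static double[] HalfSpectrum(float[] signal)
    {
        int n = signal.Length;
        double[] mag = new double[n / 2 + 1];
        for (int k = 0; k < mag.Length; k++)
        {
            double step = 2.0 * Math.PI * k / n; // constant within the inner loop
            double re = 0.0, im = 0.0;
            for (int i = 0; i < n; i++)
            {
                re += signal[i] * Math.Cos(step * i);
                im -= signal[i] * Math.Sin(step * i);
            }
            mag[k] = Math.Sqrt(re * re + im * im);
        }
        return mag;
    }

    // Index of the largest magnitude.
    public static int PeakBin(double[] mag)
    {
        int best = 0;
        for (int k = 1; k < mag.Length; k++)
        {
            if (mag[k] > mag[best])
            {
                best = k;
            }
        }
        return best;
    }

    // Frequency in Hz represented by a bin of an n-point DFT.
    public static double BinToHz(int bin, int n, double sampleRate)
    {
        return bin * sampleRate / n;
    }
}
```

With the 2205-sample, 120 Hz test signal from the question, the peak bin converts back to 120 Hz through `BinToHz`.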
Edit:
From your questions I can see that you are interested in audio processing. You might want to google the NForge.Net library, which is a great open-source library for audio and visual processing; its author has many good articles on codeproject.com covering many of its features.
I'm trying to implement Hanning and Hamming window functions in C#. I can't find any .Net samples anywhere and I'm not sure if my attempts at converting from C++ samples does the job well.
My problem is mainly that looking at the formulas I imagine they need to have the original number somewhere on the right hand side of the equation - I just don't get it from looking at the formulas. (My math isn't that good yet obviously.)
What I have so far:
public Complex[] Hamming(Complex[] iwv)
{
Complex[] owv = new Complex[iwv.Length];
double omega = 2.0 * Math.PI / (iwv.Length);
// owv[i].Re = real number (raw wave data)
// owv[i].Im = imaginary number (0 since it hasn't gone through FFT yet)
for (int i = 1; i < owv.Length; i++)
// Translated from c++ sample I found somewhere
owv[i].Re = (0.54 - 0.46 * Math.Cos(omega * (i))) * iwv[i].Re;
return owv;
}
public Complex[] Hanning(Complex[] iwv)
{
Complex[] owv = new Complex[iwv.Length];
double omega = 2.0 * Math.PI / (iwv.Length);
for (int i = 1; i < owv.Length; i++)
owv[i].Re = (0.5 + (1 - Math.Cos((2d * Math.PI ) / (i -1)))); // Uhm... wrong
return owv;
}
Here's an example of a Hamming window in use in an open source C# application I wrote a while back. It's being used in a pitch detector for an autotune effect.
You can use the Math.NET library.
double[] hammingWindow = MathNet.Numerics.Window.Hamming(dataIn.Length);
for (int i = 0; i < dataIn.Length; i++)
{
    dataOut[i] = hammingWindow[i] * dataIn[i];
}
See my answer to a similar question:
https://stackoverflow.com/a/42939606/246758
The operation of "windowing" means multiplying a signal by a window function. This code you found appears to generate the window function and scale the original signal. The equations are for just the window function itself, not the scaling.
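To make the split between "window function" and "scaling" concrete, here is a minimal sketch that generates the window on its own and then multiplies it into the signal. It assumes the common symmetric definitions, w[n] = 0.54 - 0.46*cos(2*pi*n/(N-1)) for Hamming and w[n] = 0.5 - 0.5*cos(2*pi*n/(N-1)) for Hann (Hanning), and works on plain `double[]` instead of the `Complex[]` from the question. The class and method names are made up.

```csharp
using System;

public static class WindowSketch
{
    // Symmetric Hamming window of the given length.
    public static double[] Hamming(int length)
    {
        return Cosine(length, 0.54, 0.46);
    }

    // Symmetric Hann (Hanning) window of the given length.
    public static double[] Hann(int length)
    {
        return Cosine(length, 0.5, 0.5);
    }

    // Shared form: w[n] = a - b * cos(2*pi*n / (N - 1)).
    private static double[] Cosine(int length, double a, double b)
    {
        double[] w = new double[length];
        if (length == 1)
        {
            w[0] = 1.0; // avoid dividing by zero for a single-point window
            return w;
        }
        for (int n = 0; n < length; n++)
        {
            w[n] = a - b * Math.Cos(2.0 * Math.PI * n / (length - 1));
        }
        return w;
    }

    // "Windowing": multiply the signal by the window, sample by sample.
    public static double[] Apply(double[] signal, double[] window)
    {
        double[] result = new double[signal.Length];
        for (int i = 0; i < signal.Length; i++)
        {
            result[i] = signal[i] * window[i];
        }
        return result;
    }
}
```

Note that the loops start at index 0; starting at 1, as in the question's code, leaves the first output sample at its default value.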
I am writing a C# function for doing dynamic range compression (an audio effect that basically squashes transient peaks and amplifies everything else to produce an overall louder sound). I have written a function that does this (I think):
[Image: http://www.freeimagehosting.net/uploads/feea390f84.jpg]
public static void Compress(ref short[] input, double thresholdDb, double ratio)
{
double maxDb = thresholdDb - (thresholdDb / ratio);
double maxGain = Math.Pow(10, -maxDb / 20.0);
for (int i = 0; i < input.Length; i += 2)
{
// convert sample values to ABS gain and store original signs
int signL = input[i] < 0 ? -1 : 1;
double valL = (double)input[i] / 32768.0;
if (valL < 0.0)
{
valL = -valL;
}
int signR = input[i + 1] < 0 ? -1 : 1;
double valR = (double)input[i + 1] / 32768.0;
if (valR < 0.0)
{
valR = -valR;
}
// calculate mono value and compress
double val = (valL + valR) * 0.5;
double posDb = -Math.Log10(val) * 20.0;
if (posDb < thresholdDb)
{
posDb = thresholdDb - ((thresholdDb - posDb) / ratio);
}
// measure L and R sample values relative to mono value
double multL = valL / val;
double multR = valR / val;
// convert compressed db value to gain and amplify
val = Math.Pow(10, -posDb / 20.0);
val = val / maxGain;
// re-calculate L and R gain values relative to compressed/amplified
// mono value
valL = val * multL;
valR = val * multR;
double lim = 1.5; // determined by experimentation, with the goal
// being that the lines below should never (or rarely) be hit
if (valL > lim)
{
valL = lim;
}
if (valR > lim)
{
valR = lim;
}
double maxval = 32000.0 / lim;
// convert gain values back to sample values
input[i] = (short)(valL * maxval);
input[i] *= (short)signL;
input[i + 1] = (short)(valR * maxval);
input[i + 1] *= (short)signR;
}
}
and I am calling it with threshold values between 10.0 dB and 30.0 dB and ratios between 1.5 and 4.0. This function definitely produces a louder overall sound, but with an unacceptable level of distortion, even at low threshold values and low ratios.
Can anybody see anything wrong with this function? Am I handling the stereo aspect correctly (the function assumes stereo input)? As I (dimly) understand things, I don't want to compress the two channels separately, so my code is attempting to compress a "virtual" mono sample value and then apply the same degree of compression to the L and R sample value separately. Not sure I'm doing it right, however.
I think part of the problem may be the "hard knee" of my function, which kicks in the compression abruptly when the threshold is crossed. I think I may need to use a "soft knee" like this:
[Image: http://www.freeimagehosting.net/uploads/4c1040fda8.jpg]
Can anybody suggest a modification to my function to produce the soft knee curve?
The open source Skype Voice Changer project includes a port to C# of a number of nice compressors written by Scott Stillwell, all with configurable parameters:
Fast attack compressor
Fairly childish (compressor limiter)
Event Horizon (peak eating limiter)
The first one looks like it has the capability to do soft-knee, although the parameter to do so is not exposed.
I think your basic understanding of how to do compression is wrong (sorry ;)). It's not about "compressing" individual sample values; that will radically change the waveform and produce severe harmonic distortions. You need to assess the input signal volume over many samples (I would have to Google for the correct formula), and use this to apply a much-more-gradually-changing multiplier to the input samples to generate output.
The DSP forum at kvraudio.com/forum might point you in the right direction if you have a hard time finding the usual techniques.
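For what it's worth, here is a rough sketch of the approach described above: follow the signal level over many samples with an attack/release envelope, turn that level into a slowly changing gain, and apply the same gain to both channels. It also includes a quadratic soft knee. This is one common textbook layout, not the code from any of the projects mentioned; every name and parameter is made up for illustration, and levels are assumed to be in dB relative to full scale (so a threshold would be negative, e.g. -20).

```csharp
using System;

public static class CompressorSketch
{
    // Gain change in dB (zero or negative) for a given input level, with a soft knee.
    public static double GainReductionDb(double levelDb, double thresholdDb,
                                         double ratio, double kneeDb)
    {
        double over = levelDb - thresholdDb;
        double slope = 1.0 / ratio - 1.0;
        if (kneeDb > 0.0 && 2.0 * Math.Abs(over) <= kneeDb)
        {
            // Inside the knee: blend smoothly from no reduction to full slope.
            double x = over + kneeDb / 2.0;
            return slope * x * x / (2.0 * kneeDb);
        }
        return over > 0.0 ? slope * over : 0.0;
    }

    // Compresses interleaved 16-bit stereo in place, one shared gain per frame.
    public static void Process(short[] stereo, double sampleRate, double thresholdDb,
                               double ratio, double kneeDb, double attackSec,
                               double releaseSec, double makeupDb)
    {
        double attack = Math.Exp(-1.0 / (attackSec * sampleRate));
        double release = Math.Exp(-1.0 / (releaseSec * sampleRate));
        double makeup = Math.Pow(10.0, makeupDb / 20.0);
        double env = 0.0; // smoothed level estimate, 0..1
        for (int i = 0; i + 1 < stereo.Length; i += 2)
        {
            double l = stereo[i] / 32768.0;
            double r = stereo[i + 1] / 32768.0;
            double level = Math.Max(Math.Abs(l), Math.Abs(r));
            double coef = level > env ? attack : release;
            env = coef * env + (1.0 - coef) * level;
            double levelDb = 20.0 * Math.Log10(Math.Max(env, 1e-9));
            double gainDb = GainReductionDb(levelDb, thresholdDb, ratio, kneeDb);
            double gain = makeup * Math.Pow(10.0, gainDb / 20.0);
            stereo[i] = ToSample(l * gain);
            stereo[i + 1] = ToSample(r * gain);
        }
    }

    // Converts back to 16-bit, clamping instead of wrapping around.
    private static short ToSample(double v)
    {
        double scaled = Math.Round(v * 32768.0);
        if (scaled > short.MaxValue) return short.MaxValue;
        if (scaled < short.MinValue) return short.MinValue;
        return (short)scaled;
    }
}
```

Attack times of a few milliseconds and release times of a few hundred milliseconds are a typical starting point, but the right values depend on the material, so treat them as something to tune by ear.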