I am totally new to signal processing and am trying to make a program that shows the amplitude of low-frequency signals in a PCM (WAV) file.
So far, I've been able to read in the WAV file and populate an array (actually a multi-dimensional array, one per channel, but let's consider it on a channel-by-channel basis) of floats with the data points of the sound file taken from the WAV. Each data point is an amplitude. In short, I have the time-domain representation of the sound wave.
I use this to draw a graph of the amplitude of the wave with respect to time, which looks like:
My goal is to do exactly the same, but only display frequencies below a certain value (e.g. 350 Hz). To be clear, it's not that I want to display a graph in the frequency domain (i.e. after a Fast Fourier Transform). I want to display the same amplitude vs. time graph, but for frequencies in the range [0, 350 Hz].
I'm looking for a function that does something like this:
// Returns an array of data points that contains
// amplitude data points, after a low-pass filter
float[] low_pass_filter(float[] original_data, float low_pass_freq = 350.0f)
{
    ...
}
I've read up on the FFT, read Chris Lomont's code for the FFT and understand the "theory" behind a low-pass filter, but I'm finding it difficult to get my head around how to actually implement this specific function (above). Any help (+ explanations) would be greatly appreciated!
I ended up using this example which works really well. I wrapped it a bit nicer and ended up with:
/// <summary>
/// Returns a low-pass-filtered copy of the data
/// </summary>
/// <param name="data">Data to filter</param>
/// <param name="cutoff_freq">The frequency below which data will be preserved</param>
/// <param name="sample_rate">Sample rate of the data, in Hz</param>
/// <param name="quality_factor">Q of the filter</param>
private float[] lowPassFilter(float[] data, float cutoff_freq, int sample_rate, float quality_factor = 1.0f)
{
    // Calculate filter parameters
    float O = (float)(2.0 * Math.PI * cutoff_freq / sample_rate);
    float C = quality_factor / O;
    float L = 1 / quality_factor / O;

    // Loop through and apply the filter
    float[] output = new float[data.Length];
    float V = 0, I = 0, T;
    for (int s = 0; s < data.Length; s++)
    {
        T = (I - V) / C;
        I += (data[s] * O - V) / L;
        V += T;
        output[s] = V / O;
    }
    return output;
}
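For example, to keep only the content below 350 Hz in a 44.1 kHz file (the variable names here are illustrative, not from the code above):

// 'samples' holds one channel of the WAV data read earlier.
float[] filtered = lowPassFilter(samples, 350.0f, 44100);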
The output of both regular and low-pass waveforms:
And isolating the regular waveforms vs low-pass waveforms:
My problem is very specific. I want to create narrowband noise for my small WPF application. I am using the NAudio library to create an infinite stream of noise that can be stopped and started by the user. I have already created a tone (a simple sine wave), a warble (a sine wave modulated by another wave), and white noise.
This is the class I use to let any stereo ISampleProvider send sound to the left channel, the right channel, or both, depending on what the user wants.
using NAudio.Wave;
using System;

namespace AppForSoundCard
{
    public class SignalStereoProvider : ISampleProvider
    {
        private readonly ISampleProvider sample;

        public WaveFormat WaveFormat => sample.WaveFormat;
        public float LeftVolume { get; set; }
        public float RightVolume { get; set; }

        public SignalStereoProvider(ISampleProvider sample)
        {
            if (sample.WaveFormat.Channels != 2)
                throw new ArgumentException("Source sample provider must be stereo");
            this.sample = sample;
        }

        public int Read(float[] buffer, int offset, int count)
        {
            int samplesRead = sample.Read(buffer, offset, count);
            // Scale the interleaved stereo samples (left, right, left, ...),
            // iterating only over the samples actually read.
            for (int n = 0; n < samplesRead; n += 2)
            {
                buffer[offset + n] *= LeftVolume;
                buffer[offset + n + 1] *= RightVolume;
            }
            return samplesRead;
        }
    }
}
I use this code to generate the audio stream I mentioned. You may wonder what NarrowBandProvider32 is: it is the class that is supposed to generate narrowband noise from white noise. Its code appears after the next paragraph.
signalGenerator = new SignalGenerator();
signalGenerator.Type = SignalGeneratorType.White;
signalGenerator.Gain = 1.0;
signalGenerator.Frequency = Frequency;
narrowBand = new NarrowBandProvider32(signalGenerator, Frequency, 96000, 100, dB);
stereoProvider = new SignalStereoProvider(narrowBand)
{
    RightVolume = !((ComboBoxItem)RoutingCombobox.SelectedItem).Tag.ToString().Equals("Left")
        ? (float)Math.Pow(10, (dB - 80) / 20.0) * (float)Settings.Default["ReferenceAmplitudeFor" + Frequency.ToString()]
        : 0.0f,
    LeftVolume = !((ComboBoxItem)RoutingCombobox.SelectedItem).Tag.ToString().Equals("Right")
        ? (float)Math.Pow(10, (dB - 80) / 20.0) * (float)Settings.Default["ReferenceAmplitudeFor" + Frequency.ToString()]
        : 0.0f
};
Output.Init(stereoProvider);
I first generate white noise using NAudio's SignalGenerator class, then feed it to the NarrowBandProvider32 I wrote, which is supposed to make the white noise narrowband. After all that, I route the sound to the left channel, the right channel, or both.
LeftVolume and RightVolume are the linear amplitudes for the decibel value given by the user. There is a combobox for routing in which you can choose left, right, or bilateral; depending on your choice, the left volume, the right volume, or both are set to that amplitude.
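For reference, the dB-to-linear conversion those volume expressions use is the standard one; a minimal restatement (the 80 dB offset and the per-frequency reference amplitude are specific to this app):

// Convert a user-supplied decibel level to a linear amplitude,
// relative to an 80 dB reference and a per-frequency calibration value.
static float DbToAmplitude(float dB, float referenceAmplitude)
{
    return (float)Math.Pow(10.0, (dB - 80.0) / 20.0) * referenceAmplitude;
}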
I have visited several sites about how narrowband noise can be generated. They all suggested bandpass-filtering white noise. I tried that; it sort of did what I wanted, but the result was narrower than I wanted. You can find the frequency response of the noise I generated for 500 Hz.
Here is the NarrowBandProvider32 class code for that noise:
using NAudio.Dsp;
using NAudio.Wave;
using System;

namespace AppForSoundCard
{
    class NarrowBandProvider32 : ISampleProvider
    {
        ISampleProvider sample;
        float lowFreq;
        float highFreq;
        BiQuadFilter biQuad;

        public WaveFormat WaveFormat => sample.WaveFormat;

        public NarrowBandProvider32(ISampleProvider sample, float frequency, float sampleRate, float q, float dB)
        {
            if (sample.WaveFormat.Channels != 2)
                throw new ArgumentException("Source sample provider must be stereo");
            this.sample = sample;

            // The low and high frequency bounds are defined like this in audiometry;
            // they are the boundaries of the narrowband noise.
            lowFreq = (float)Math.Round(frequency / Math.Pow(2, 1.0 / 4.0));
            highFreq = (float)Math.Round(frequency * Math.Pow(2, 1.0 / 4.0));

            // Note: each Set*Filter call below replaces the biquad's coefficients,
            // so after these three calls only the low-pass setting is in effect.
            biQuad = BiQuadFilter.BandPassFilterConstantSkirtGain(sampleRate, frequency, q);
            biQuad.SetHighPassFilter(sampleRate, lowFreq, q);
            biQuad.SetLowPassFilter(sampleRate, highFreq, q);
        }

        public int Read(float[] buffer, int offset, int count)
        {
            int samplesRead = sample.Read(buffer, offset, count);
            for (int i = 0; i < samplesRead; i++)
                buffer[offset + i] = biQuad.Transform(buffer[offset + i]);
            return samplesRead;
        }
    }
}
These are the arguments I passed:
narrowBand = new NarrowBandProvider32(signalGenerator, Frequency, 96000, 100, dB);
As I said, this noise is close to the narrowband noise defined in audiometry, but it is narrower. Narrowband noise for 500 Hz in audiometry has this frequency response.
As you can see, it is wider than the noise I generated. How can I generate narrowband noise that is close to audiometric narrowband noise for any frequency? I only gave examples at 500 Hz in the images, but in my code you can generate noise between 150 Hz and 8000 Hz. What filter should I use on white noise in order to generate that type of narrowband noise? Any help is appreciated.
Edit:
I found a standard which explains what narrowband noise should look like for any frequency and decibel level:
Where narrow-band masking is required, the noise band shall be centred geometrically
around the test frequency. The band limits for the masking noise are given in Table 4.
Outside these band limits the sound pressure spectrum density level of the noise shall fall at
a rate of at least 12 dB per octave for at least three octaves and outside these three octaves it
shall be at least 36 dB below the level at the centre frequency. Measurements are required in
the range from 31,5 Hz to 10 kHz for instruments limited to 8 kHz. For EHF instruments
measurements are required up to 20 kHz.
Due to limitations of transducers, ear simulators, acoustic couplers and mechanical couplers,
measurements of the bandwidth at 4 kHz and above may not accurately describe the
spectrum of the masking noise. Therefore at centre frequencies above 3,15 kHz
measurements shall be made electrically across the transducer terminals.
With that definition, I guess a standard bandpass filter won't work, and I have to define a custom filter for the noise. Is there a C# library that allows defining custom filters? If there is, how should I define the custom filter in order to generate noise meeting that standard?
They were all suggesting bandpass filtering a white noise to generate narrow band noise. I tried that. It sort of did what I wanted but it was narrower than I wanted.
The approach of applying a bandpass filter to a white noise source makes sense. The problem is just that the bandpass filter design is too narrow. You can make it wider by reducing the q, moving the lowFreq and highFreq a bit outward, or switching to a different filter design method.
I suggest that rather than coding directly in C#, it might be useful to prototype this first in Python using the scipy.signal library, which has various tools for designing and working with filters.
In the code below, I vary the c parameter to tweak the low and high edges of the band.
Code:
# Copyright 2022 Google LLC.
# SPDX-License-Identifier: Apache-2.0
import matplotlib.pyplot as plt
import numpy as np
import scipy.signal as sig

fs = 96000  # Sample rate.
f0 = 500    # Center frequency in Hz.

# Generate noise with a few different bandwidths.
for c in [1.03, 1.07, 1.15]:
    # Design a second-order Butterworth bandpass filter.
    sos = sig.butter(2, [f0 / c, f0 * c], 'bandpass', output='sos', fs=fs)
    # Generate white noise.
    white_noise = np.random.randn(fs)
    # Run it through the filter.
    output = sig.sosfilt(sos, white_noise)
    # Use Welch's method to estimate the PSD of the filtered noise.
    f, psd = sig.welch(output, fs, nperseg=4096)
    plt.semilogx(f, 10 * np.log10(psd), label=f'c = {c}')

plt.axvline(x=f0, color='k')
plt.xlim(50, fs / 2)
plt.ylim(-140, -40)
plt.xlabel('Frequency (Hz)', fontsize=15)
plt.ylabel('PSD (dB)', fontsize=15)
plt.legend()
plt.show()
Output:
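Once a bandwidth factor works in the prototype, one way to carry the design back to C# is to cascade two separate NAudio BiQuadFilter instances rather than reusing one (each Set*Filter call replaces the previous coefficients, so the single biquad in NarrowBandProvider32 effectively applies only the last call). A sketch, assuming NAudio's static BiQuadFilter factory methods; the q of 0.7 is an assumption, roughly Butterworth per stage:

using NAudio.Dsp;

// Widen the band by moving the edges outward with the same factor 'c'
// used in the Python prototype above (values here are placeholders to tune).
float frequency = 500f;  // centre frequency in Hz
float c = 1.07f;         // band-edge factor
BiQuadFilter highPass = BiQuadFilter.HighPassFilter(96000f, frequency / c, 0.7f);
BiQuadFilter lowPass = BiQuadFilter.LowPassFilter(96000f, frequency * c, 0.7f);

// Inside Read(), run each sample through both filters in series:
//     buffer[offset + i] = lowPass.Transform(highPass.Transform(buffer[offset + i]));

Each second-order stage rolls off at 12 dB per octave, which lines up with the minimum skirt slope required by the audiometry standard quoted in the question.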
This is code that I found online somewhere; it works quite well, but I don't fully understand how it converts a bunch of math into an audio wave:
// Requires: using System; using System.IO; using System.Media;
public static void Beeps(int Amplitude, int Frequency, int Duration)
{
    // Peak sample value: Amplitude is treated as thousandths of the full 16-bit range.
    double A = ((Amplitude * (System.Math.Pow(2, 15))) / 1000) - 1;
    // Phase increment per sample for this frequency at a 44100 Hz sample rate.
    double DeltaFT = 2 * Math.PI * Frequency / 44100.0;
    // Duration is in milliseconds: 44100 * Duration / 1000 samples.
    int Samples = 441 * Duration / 10;
    // Four bytes per sample frame: 2 channels x 16-bit samples.
    int Bytes = Samples * 4;
    // A 44-byte canonical WAV header packed into 32-bit ints.
    int[] Hdr =
        { 0X46464952, 36 + Bytes, 0X45564157,
          0X20746D66, 16, 0X20001, 44100, 176400, 0X100004,
          0X61746164, Bytes };

    using (MemoryStream MS = new MemoryStream(44 + Bytes))
    {
        using (BinaryWriter BW = new BinaryWriter(MS))
        {
            // Write the WAV header...
            for (int I = 0; I < Hdr.Length; I++)
            {
                BW.Write(Hdr[I]);
            }
            // ...then one sine sample per channel (left and right) per frame.
            for (int T = 0; T < Samples; T++)
            {
                short Sample = System.Convert.ToInt16(A * Math.Sin(DeltaFT * T));
                BW.Write(Sample);
                BW.Write(Sample);
            }
            BW.Flush();
            MS.Seek(0, SeekOrigin.Begin);

            // Hand the in-memory WAV stream to SoundPlayer and play it.
            using (SoundPlayer SP = new SoundPlayer(MS))
            {
                SP.PlaySync();
            }
        }
    }
}
It looks like all it does is beep at certain pitches. The reason math converts into sound is that when the data is fed to your speaker, it's really a stream of bytes telling the speaker how to vibrate at each instant.
If you're asking about how sound works, it's based on how vibrations move through the air. Vibrations exist as waves; they literally shake the air in certain patterns that your brain interprets as noise through your ears. If the sound has a higher pitch, the sound waves are closer together; if it has a lower pitch, they're farther apart. This is why a computer can "convert a bunch of math into an audio wave": that's all sound really is, a constantly manipulated wave. That method takes a frequency and creates a sine wave from it, converts the samples to bytes, and feeds them to your speaker at a certain volume (Amplitude) for a certain duration. Cool stuff, right?
Also, you're looking at a "method", not a class. :)
Here's more about sound if you're interested: https://en.wikipedia.org/wiki/Sound#Sound_wave_properties_and_characteristics
This answer has a good overview of how WAV files work:
Simply sample the waveform at fixed intervals, and write the amplitude at each interval into your file.
That's what the BW.Write calls are doing; T represents the time index.
In order to play the sound, that data goes after the Hdr section, which is simply the correct header for a standard .wav file: 0X46464952 is ASCII for "RIFF" and 0X45564157 is "WAVE". The player also needs to know the rate at which the wave was sampled; in this case it's 44100 Hz, a common standard.
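For completeness, the rest of the packed header decodes as standard WAV fields (little-endian, so the low bytes of each int come first):

// 0x46464952  "RIFF"   chunk ID
// 36 + Bytes           RIFF chunk size
// 0x45564157  "WAVE"   format
// 0x20746D66  "fmt "   format subchunk ID
// 16                   fmt subchunk size (PCM)
// 0x00020001           channels = 2 (high word), format tag = 1, PCM (low word)
// 44100                sample rate in Hz
// 176400               byte rate = 44100 * 2 channels * 2 bytes per sample
// 0x00100004           bits per sample = 16 (high word), block align = 4 (low word)
// 0x61746164  "data"   data subchunk ID
// Bytes                data subchunk size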
I am trying to "fake 3D" in a game in WPF. Think of a road, and objects that appear somewhere in the distance. As they get closer, they look bigger, and eventually they grow in size very fast.
I'm thinking that when the object appears, its width and height are close to 0. As it moves towards the player, it approaches one hundred percent of its true size.
I think I will need to solve this using logarithmic calculations, and there are several threads on that. What I would really want to do, however, is to send three values to a LogaritmicGrowth method:
the starting Y point
the point at which the object should appear at 100%
the y point where the object is at this very moment.
Thus, what I would like to get in return is the scaling factor for the object in question. So if it's halfway between the starting point and the ending point, then perhaps 0.3 (or so) should be returned.
I can write the method inputs and outputs myself, but need help with the calculation. Thanks!
I am not entirely sure about the use of log here. This is a simple geometry problem.
Think about a point P which is at distance D in front of you and at height Y (above your line of observation). Your screen is at distance d in front of you. The light ray from P intersects the screen at a point p, at height y on the screen.
Then, by considering similar triangles, one can show that:
y = (Y / D) * d
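In code, the scale factor is just that ratio; a minimal sketch (the names are illustrative):

// Apparent size on screen scales as (screen distance) / (object distance),
// following the similar-triangles relation y = (Y / D) * d above.
static float PerspectiveScale(float screenDistance, float objectDistance)
{
    return screenDistance / objectDistance;
}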
Just in case someone else is looking at this question in the future, here's the correct reply (I figured it out myself):
/// <summary>
/// Method that enlarges the kind of object sent in
/// </summary>
public void ExponentialGrowth2(string name, float startY, float endY)
{
    float totalDistance = endY - startY;
    float currentY = 0;
    for (int i = 0; i < Bodies.Bodylist.Count; i++)
    {
        if (Bodies.Bodylist[i].Name.StartsWith(name)) // looks for all bodies of this type
        {
            currentY = Bodies.Bodylist[i].PosY;
            float distance = currentY - startY + (float)Bodies.Bodylist[i].circle.Height;
            float fraction = distance / totalDistance; // such as 0.8
            Bodies.Bodylist[i].circle.Width = Bodies.Bodylist[i].OriginalWidth * Math.Pow(fraction, 3);
            Bodies.Bodylist[i].circle.Height = Bodies.Bodylist[i].OriginalHeight * Math.Pow(fraction, 3);
        }
    }
}
The method could be worked on further, such as allowing randomized exponents (say from 1.5 to 4.5). Note that the higher the exponent, the stronger the effect.
The Problem
How would one interpolate between two given angles, given a certain time delta, so that the simulated motion from rotation A to rotation B takes a similar amount of time when the algorithm is run at different frequencies (without a fixed-time-step dependency)?
Potential Solution
I have been using the following C# code to do this kind of interpolation between two points. It solves the differential equation for the situation:
Vector3 SmoothLerpVector3(Vector3 x0, Vector3 y0, Vector3 yt, double t, float k)
{
    // x0 = current position
    // y0 = last target position
    // yt = current target position
    // t  = time delta between last and current target positions
    // k  = damping
    Vector3 value = x0;
    if (t > 0)
    {
        Vector3 f = x0 - y0 + (yt - y0) / (k * (float)t);
        value = yt - (yt - y0) / (k * (float)t) + f * (float)Math.Exp(-k * t);
    }
    return value;
}
This code is usable for 2D coordinates by having the Z coordinate of the Vector3 set as 0.
The "last" and "current" positions are because the target can move during the interpolation. Not taking this in to account causes motion jitter at moderately high speeds.
I did not write this code and it appears to work. I had trouble altering this for angles because, for example, an interpolation between the angles 350° and 10° would take the 'long' way round instead of going in the direction of the 20° difference in angle.
I've looked into quaternion slerp but haven't been able to find an implementation that takes a time delta into account. Something that I have thought of, but not been able to implement either, is to interpolate between both angles twice, but the second time with a phase difference of 180° on each angle and to output the smaller of the two multiplied by -1.
Would appreciate any help or direction!
The way I've done this before is to test whether the difference between the two angles is greater than 180°, and if so, add 360° to the smaller value, then do your stuff with those two angle values. So in your example, instead of interpolating between 350° and 10°, you interpolate between 350° and 370°. You can always take the result modulo 360 if you need to display it.
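A minimal sketch of that adjustment (angles in degrees, t is the interpolation fraction; this is a hypothetical helper, not code from the question):

// Interpolate along the shorter arc by shifting the smaller angle up
// by 360 degrees when the direct difference exceeds 180 degrees.
static float LerpAngleDegrees(float a, float b, float t)
{
    if (Math.Abs(b - a) > 180f)
    {
        if (a < b) a += 360f;
        else b += 360f;
    }
    // Plain linear interpolation, wrapped back into [0, 360).
    return (a + (b - a) * t) % 360f;
}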
Use Slerp() and make sure you wrap the angles between -π and π with one of these helper functions
/// <summary>
/// Wraps angle between -π and π
/// </summary>
/// <param name="angle">The angle</param>
/// <returns>A bounded angle value</returns>
public static double WrapBetweenPI(this double angle)
{
    return angle + (2 * Math.PI) * Math.Floor((Math.PI - angle) / (2 * Math.PI));
}

/// <summary>
/// Wraps angle between -180 and 180
/// </summary>
/// <param name="angle">The angle</param>
/// <returns>A bounded angle value</returns>
public static double WrapBetween180(this double angle)
{
    return angle + 360 * Math.Floor((180 - angle) / 360);
}
Caution: Related Post for Inconsistency with Math.Round()
The solution
I have some working code using quaternions. In order to take time steps into account (to remove the reliance on a fixed-step update), the slerp/lerp amount is calculated as amount = 1 - Math.Exp(-k * t). The constant k affects damping (1 = very sluggish, 20 = almost instant snap to target).
I decided to not try and get this to work for 3D as I'm developing a 2D game.
public static float SlerpAngle(
    float currentAngle, float targetAngle, double t, float k)
{
    // No time has passed: keep the angle as-is
    if (t == 0)
        return currentAngle;

    // Avoid unexpectedly large angles
    currentAngle = MathHelper.WrapAngle(currentAngle);
    targetAngle = MathHelper.WrapAngle(targetAngle);

    // Make sure the shortest path between
    // current -> target doesn't overflow the
    // -pi -> pi range, otherwise the 'long
    // way round' will be calculated
    float difference = Math.Abs(currentAngle - targetAngle);
    if (difference > MathHelper.Pi)
    {
        if (currentAngle > targetAngle)
        {
            targetAngle += MathHelper.TwoPi;
        }
        else
        {
            currentAngle += MathHelper.TwoPi;
        }
    }

    // Quaternion.Slerp was outputting a close-to-0 value
    // when the target was in the range (-pi, 0). Ensuring
    // positivity, halving the difference between current
    // and target, then doubling the result before output
    // solves this.
    currentAngle += MathHelper.TwoPi;
    targetAngle += MathHelper.TwoPi;
    currentAngle /= 2;
    targetAngle /= 2;

    // Calculate the spherical interpolation
    Quaternion qCurrent = Quaternion.CreateFromAxisAngle(
        Vector3.UnitZ, currentAngle);
    Quaternion qTarget = Quaternion.CreateFromAxisAngle(
        Vector3.UnitZ, targetAngle);
    Quaternion qResult = Quaternion.Slerp(
        qCurrent, qTarget, (float)(1 - Math.Exp(-k * t)));

    // Recover the angle (2 * acos(W)), then double it
    // again to undo the halving above
    float value = 2 * 2 * (float)Math.Acos(qResult.W);
    return value;
}
Rotation speed is consistent over the 5 Hz -> 1000 Hz range, which I thought were suitable extremes. There's no real reason to run this higher than 60 Hz.
I just wrote an implementation of the DFT. Here is my code:
int T = 2205;
float[] sign = new float[T];
for (int i = 0; i < T; i++)
    sign[i] = (float)Math.Sin(2.0f * Math.PI * 120.0f * i / 44100.0f);

float[] re = new float[T];
float[] im = new float[T];
float[] dft = new float[T];
for (int k = 0; k < T; k++)
{
    for (int n = 0; n < T; n++)
    {
        re[k] += sign[n] * (float)Math.Cos(2.0f * Math.PI * k * n / T);
        im[k] += sign[n] * (float)Math.Sin(2.0f * Math.PI * k * n / T);
    }
    dft[k] = (float)Math.Sqrt(re[k] * re[k] + im[k] * im[k]);
}
So the sampling frequency is 44100 Hz and I have a 50 ms segment of a 120 Hz sine wave. According to the result, I have peaks of the DFT at points 7 and 2200. Did I do something wrong, and if not, how should I interpret the results?
I tried the FFT method of AForge. Here is my code:
int T = 2048;
float[] sign = new float[T];
AForge.Math.Complex[] input = new AForge.Math.Complex[T];
for (int i = 0; i < T; i++)
{
    sign[i] = (float)Math.Sin(2.0f * Math.PI * 125.0f * i / 44100.0f);
    input[i].Re = sign[i];
    input[i].Im = 0.0;
}
AForge.Math.FourierTransform.FFT(input, AForge.Math.FourierTransform.Direction.Forward);
AForge.Math.FourierTransform.FFT(input, AForge.Math.FourierTransform.Direction.Backward);
I had expected to get the original signal back, but I got something different (a function with only positive values). Is that normal?
Thanks in advance!
Your code looks correct, but it could be more efficient. The DFT is usually computed with the FFT algorithm (the fast Fourier transform; it's not a different transform, just a more efficient algorithm for computing the DFT).
Even if you do not want to implement the FFT (which is a bit harder to understand and harder to make work on data whose length is not a power of two) or use some open-source code, you can make your implementation a bit faster. For example, 2.0f * Math.PI * k / T is constant with respect to the inner loop, so you can compute it once per k (move it outside the inner loop) and then just multiply it by n inside your cos/sin calls.
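For example (a small rewrite of the question's loop; the behaviour is unchanged):

for (int k = 0; k < T; k++)
{
    // Hoisted: constant for the whole inner loop.
    float w = 2.0f * (float)Math.PI * k / T;
    for (int n = 0; n < T; n++)
    {
        re[k] += sign[n] * (float)Math.Cos(w * n);
        im[k] += sign[n] * (float)Math.Sin(w * n);
    }
    dft[k] = (float)Math.Sqrt(re[k] * re[k] + im[k] * im[k]);
}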
As for the position and interpretation: you have changed domains, so the X axis, i.e. the index into the table, now corresponds to frequency rather than time. Your sampling rate is 44100 Hz and you captured 2205 samples, which means each point represents the magnitude of your input signal at a frequency step of 44100 Hz / 2205 = 20 Hz. Your magnitude peak is at the 7th point (index 6) because your signal is 120 Hz, and 6 * 20 Hz = 120 Hz, which is what you would expect.
The second peak might seem to represent some high frequency, but it's spurious: because your sampling rate is 44100 Hz, you cannot measure frequencies higher than 44100 Hz / 2 (the Nyquist limit). That is your cut-off point; beyond it the DFT data is not valid. That's why the second half of your table is basically the first half mirrored, and you can ignore it.
Edit:
From your questions I can see that you are interested in audio processing. You might want to look up the AForge.NET library, a great open-source library for audio and visual processing; its author has many good articles on codeproject.com covering its features.