I'm working on a project about chord recognition. I'm using someone's journal paper as a reference, but I still have little grasp of DSP. According to the reference, the first thing I need to do is split the signal from a WAV file into a number of frames. In my case, each frame should be 65 ms long, with 2866 samples per frame.
I have searched for how to split a signal into frames, but the explanations I found weren't clear enough for me to understand.
So far this is some of my code in the WavProcessing class:
public void SetFileName(String fileNameWithPath) //called first in the form, to get the FileStream
{
_fileNameWithPath = fileNameWithPath;
strm = File.OpenRead(_fileNameWithPath);
}
public double getLengthTime(uint wavSize, uint sampleRate, int bitRate, int channels)
{
wavTimeLength = ((strm.Length - 44) / (sampleRate * (bitRate / 8))) / channels;
return wavTimeLength;
}
public int getNumberOfFrames() //return number of frames, I just divided total length time with interval time between frames. (in my case, 3000ms / 65 ms = 46 frames)
{
numOfFrames = (int) (wavTimeLength * 1000 / _sampleFrameTime);
return numOfFrames;
}
public int getSamplePerFrame(UInt32 sampleRate, int sampleFrameTime) // return the sample per frame value (in my case, it's 2866)
{
_sampleRate = sampleRate;
_sampleFrameTime = sampleFrameTime;
sFr = (int)(sampleRate * (sampleFrameTime / 1000.0 ));
return sFr;
}
I still just don't get how to split the signal into 65 ms frames in C#.
Do I need to read the FileStream, break the data into frames, and save them into arrays? Or something else?
With NAudio you would do it like this:
using (var reader = new AudioFileReader("myfile.wav"))
{
float[] sampleBuffer = new float[2866];
int samplesRead = reader.Read(sampleBuffer, 0, sampleBuffer.Length);
}
As others have commented, the number of samples you read ought to be a power of 2 if you plan to pass it into an FFT. Also, if the file is stereo, the left and right samples will be interleaved, so your FFT code will need to be able to cope with that.
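If it helps to see the framing step in isolation, here is a minimal sketch independent of any audio library: once the decoded samples are in an array, slicing them into fixed-size frames (2866 samples per frame, from the question) is just a copy loop. The helper name is illustrative.

```csharp
using System;
using System.Collections.Generic;

// Splits a signal into consecutive, non-overlapping frames of the given size.
// The last partial frame is dropped here; padding it with zeros is the other
// common choice.
static List<float[]> SplitIntoFrames(float[] signal, int samplesPerFrame)
{
    var frames = new List<float[]>();
    for (int start = 0; start + samplesPerFrame <= signal.Length; start += samplesPerFrame)
    {
        var frame = new float[samplesPerFrame];
        Array.Copy(signal, start, frame, 0, samplesPerFrame);
        frames.Add(frame);
    }
    return frames;
}
```

Each frame can then be handed to the chroma/FFT stage one at a time instead of loading the whole file into a single buffer.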
Related
I wrote a quick WAV file normalizer in C# using NAudio.
Currently it locks the thread and creates 1 KB files. sm is the highest peak of the file.
using (WaveFileReader reader = new WaveFileReader(aktuellerPfad))
{
using (WaveFileWriter writer = new WaveFileWriter("temp.wav", reader.WaveFormat))
{
byte[] bytesBuffer = new byte[reader.Length];
int read = reader.Read(bytesBuffer, 0, bytesBuffer.Length);
writer.WriteSample(read *32768/sm);
}
}
You need to perform a mathematical operation on the audio buffer to normalise the signal. The normalising steps would be:
a. Read the audio buffer as you are doing (although I would prefer reading in chunks).
byte[] bytesBuffer = new byte[reader.Length];
reader.Read( bytesBuffer, 0, bytesBuffer.Length );
b. Calculate the multiplier value. There are different ways to calculate it; I don't know how you are calculating yours, but from your code it looks like the value is 32768/sm. I'll denote the multiplier as "mul".
c. Now iterate through the buffer and multiply each sample by the multiplier. Note that for 16-bit audio each sample occupies two bytes, so convert each byte pair to a short before multiplying:
for ( int i = 0; i < bytesBuffer.Length; i += 2 )
{
    short sample = BitConverter.ToInt16( bytesBuffer, i );
    sample = (short)( sample * mul );
    byte[] sampleBytes = BitConverter.GetBytes( sample );
    bytesBuffer[i] = sampleBytes[0];
    bytesBuffer[i + 1] = sampleBytes[1];
}
d. Finally, write the samples to the file.
writer.Write( bytesBuffer, 0, bytesBuffer.Length );
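Putting steps a-d together for 16-bit little-endian PCM, here is a sketch operating on a raw byte buffer, with a clipping guard added (the file I/O from the question is omitted, and `sm` is assumed to be the peak sample value):

```csharp
using System;

// Normalizes 16-bit little-endian PCM in place: scales every sample so the
// peak (sm) reaches full scale (32767), clipping defensively.
static void Normalize16BitPcm(byte[] bytesBuffer, short sm)
{
    float mul = 32767f / sm; // step b: the multiplier
    for (int i = 0; i < bytesBuffer.Length - 1; i += 2) // step c
    {
        short sample = BitConverter.ToInt16(bytesBuffer, i);
        int scaledInt = (int)(sample * mul);
        if (scaledInt > short.MaxValue) scaledInt = short.MaxValue;
        if (scaledInt < short.MinValue) scaledInt = short.MinValue;
        short scaled = (short)scaledInt;
        byte[] b = BitConverter.GetBytes(scaled);
        bytesBuffer[i] = b[0];
        bytesBuffer[i + 1] = b[1];
    }
}
```

The clipping guard matters if `sm` was computed from a different chunk than the one being scaled.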
I'm working with raw phone sounds and recordings, and I want to normalize them to a certain volume level in a .NET C# project.
The sound is a collection of raw audio bytes (mono, unheadered, 16-bit signed PCM, 16000 Hz).
The audio is split into blocks of 3200 bytes == 100 ms.
Any suggestions on how to increase the volume/amplitude so the sound is louder?
I haven't got a clue whether I need to add a constant or multiply values, or whether I need to do it to every 1st, 2nd, 3rd... byte. And maybe there is already an open-source solution for this?
To answer my own question (for others):
The solution is to multiply every sample (for 16-bit PCM, that's every 2 bytes) by a constant value.
To avoid overflow/too much increase, you can calculate the highest constant value that is safe to use: find the highest sample value in the buffer and compute the multiplier that brings it to the highest possible sample value, which for 16-bit PCM is 32767.
Here is a little example:
public byte[] IncreaseDecibel(byte[] audioBuffer, float multiplier)
{
// Max range -32768 and 32767
var highestValue = GetHighestAbsoluteSample(audioBuffer);
var highestPosibleMultiplier = (float)Int16.MaxValue/highestValue; // Int16.MaxValue = 32767
if (multiplier > highestPosibleMultiplier)
{
multiplier = highestPosibleMultiplier;
}
for (var i = 0; i < audioBuffer.Length; i = i + 2)
{
Int16 sample = BitConverter.ToInt16(audioBuffer, i);
sample = (Int16)(sample * multiplier);
byte[] sampleBytes = GetLittleEndianBytesFromShort(sample);
audioBuffer[i] = sampleBytes[sampleBytes.Length-2];
audioBuffer[i+1] = sampleBytes[sampleBytes.Length-1];
}
return audioBuffer;
}
// ADDED GetHighestAbsoluteSample; hopefully it's still correct, because the code has changed over time
/// <summary>
/// Peak sample value
/// </summary>
/// <param name="audioBuffer">audio</param>
/// <returns>0 - 32767</returns>
public static short GetHighestAbsoluteSample(byte[] audioBuffer)
{
Int16 highestAbsoluteValue = 0;
for (var i = 0; i < (audioBuffer.Length-1); i = i + 2)
{
Int16 sample = ByteConverter.GetShortFromLittleEndianBytes(audioBuffer, i);
// prevent Math.Abs overflow exception
if (sample == Int16.MinValue)
{
sample += 1;
}
var absoluteValue = Math.Abs(sample);
if (absoluteValue > highestAbsoluteValue)
{
highestAbsoluteValue = absoluteValue;
}
}
return (highestAbsoluteValue > LowestPossibleAmplitude) ?
highestAbsoluteValue : LowestPossibleAmplitude;
}
I'm just trying to use NAudio to implement a plucked guitar string using the Karplus-Strong algorithm, following http://www.cs.princeton.edu/courses/archive/fall07/cos126/assignments/guitar.html.
Now I have this for Read method:
public override int Read(byte[] buffer, int offset, int count)
{
int samples = (int)(SamplingRate / Frequency);
for (int i = 0; i < samples; i++)
{
var arr = _ringBuffer.ToArray();
buffer[i] = (byte)((short)((Math.Pow(2, 15) - 1) * arr[i]) & 0x00ff);
}
Tic();
return samples;
}
and this for Tic method:
private void Tic()
{
var newLastValue = ENERGY_DECAY_FACTOR * ((1 / 2) * (_ringBuffer.Dequeue() + _ringBuffer.Peek()));
_ringBuffer.Enqueue((float)newLastValue);
}
And the sound I hear is really not a guitar sound; it's something metallic :). Is there some way to do this better? What am I doing wrong? Any ideas?
Looks like you're truncating 16-bit samples into 8-bit ones (I can't see what WaveFormat you're using). I'd recommend creating a class that implements ISampleProvider, allowing you to provide floating-point samples (where 1.0f is full scale) instead.
Also, you're completely ignoring the count parameter of the Read function. It indicates how many bytes are required; for ISampleProvider it is the number of samples.
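A floating-point version of the core update might look like the sketch below (plain C#, with the NAudio ISampleProvider wiring left out; names are illustrative). One further thing worth checking in the question's Tic method: `(1 / 2)` is integer division, which evaluates to 0 and silences the string after one pass through the buffer; it needs to be 0.5f.

```csharp
using System;
using System.Collections.Generic;

// One Karplus-Strong update step in floating point.
// Dequeue the first sample, average it with the new front sample, apply the
// decay, and push the result back; the dequeued value is the output sample.
static float KarplusStrongTic(Queue<float> ringBuffer, float energyDecayFactor)
{
    float first = ringBuffer.Dequeue();
    // 0.5f, not (1 / 2): integer division would make this 0.
    float newLast = energyDecayFactor * 0.5f * (first + ringBuffer.Peek());
    ringBuffer.Enqueue(newLast);
    return first;
}

// Fills the ring buffer with white noise to simulate the pluck.
static Queue<float> Pluck(int sampleRate, float frequency)
{
    var rng = new Random(0);
    int n = (int)(sampleRate / frequency);
    var buffer = new Queue<float>(n);
    for (int i = 0; i < n; i++)
        buffer.Enqueue((float)(rng.NextDouble() - 0.5));
    return buffer;
}
```

In an ISampleProvider.Read implementation you would call the tic once per requested sample and write the returned float straight into the output buffer.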
I'm trying to record 3-4 streams of audio and write them to separate WAV files. I'm using NAudio's AsioOut to capture the audio. At first I was just writing a single 3- or 4-channel WAV file. I searched questions here and thought I had found an answer, but I still can't figure it out. Now I can write the separate WAV files, but they have multiple channels in each file:
channel_1.wav --------> has channel 1 and 2
channel_2.wav --------> has channel 1 and 3
channel_3.wav --------> has channel 2 and 3
I think there is a problem in how I parse the result of GetAsInterleavedSamples().
This is my asioOut_AudioAvailable():
void asioOut_AudioAvailable(object sender, AsioAudioAvailableEventArgs e)
{
float[] samples = new float[2 * 44100 * GetUserSpecifiedChannelCount()];
samples = e.GetAsInterleavedSamples();
int offset = 0;
while (offset < samples.Length)
{
for (int n = 0; n < this.writers.Length; n++)
{
this.writers[n].WriteSamples(samples, offset, 2);
offset += 2;
}
}
}
I'm new at this, so I'm just learning as I go; any help would be greatly appreciated.
Thanks.
If I understand you correctly, you only want to write one sample to each writer, so the third parameter to WriteSamples should be 1, and offset should be incremented by 1. You've probably copied the += 2 from an example dealing with 16-bit audio in a byte array.
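To illustrate the corrected loop with plain arrays (a sketch; arrays stand in for the WaveFileWriters, and the helper name is illustrative): each pass over the channels consumes exactly one sample per channel from the interleaved buffer.

```csharp
using System;

// De-interleaves an ASIO-style buffer laid out channel 0, 1, 2, ... repeating.
// Each output array receives one sample per pass, i.e. WriteSamples(..., 1).
static float[][] Deinterleave(float[] interleaved, int channels)
{
    int frames = interleaved.Length / channels;
    var result = new float[channels][];
    for (int c = 0; c < channels; c++)
        result[c] = new float[frames];

    int offset = 0;
    for (int f = 0; f < frames; f++)
    {
        for (int c = 0; c < channels; c++)
        {
            result[c][f] = interleaved[offset]; // one sample to writer c
            offset += 1;                        // advance by 1, not 2
        }
    }
    return result;
}
```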
First time asking, though I have been visiting for some time.
Here's the problem:
I'm currently trying to isolate the base frequency of a signal contained in a WAVE data file with these properties:
PCM audio format, i.e. linear quantization
8000 Hz sample rate
16 bits per sample
16000 byte rate
One channel only, so there is no interleaving.
Getting the byte value:
System.IO.FileStream WaveFile = System.IO.File.OpenRead(@"c:\tet\fft.wav");
byte[] data = new byte[WaveFile.Length];
WaveFile.Read(data,0,Convert.ToInt32(WaveFile.Length));
Converting it to an array of doubles:
for (int i = 0; i < 32768; i++)//this is only for a relatively small chunk of the file
{
InReal[i] =BitConverter.ToDouble(data, (i + 1) * 8 + 44);
}
and finally passing it to a transform function.
FFT FftObject = new FFT();
FftObject.Transform(InReal, InImg, 0, 32768, out outReal, out outImg, false);
Now the first question: as I understand it, the PCM values of the WAV file should be within the bounds of -1 and 1, but when converting to double I get these values:
2.65855908666825E-235
2.84104982662944E-285
-1.58613492930337E+235
-1.25617351166869E+264
1.58370933499389E-242
6.19284549187335E-245
-2.92969500042228E+254
-5.90042665390976E+226
3.11954507295188E-273
3.06831908609091E-217
NaN
2.77113146323761E-302
6.76597919848376E-306
-1.55843653898344E+291
These are the first few values of the array, and the rest of the array is in those limits too.
My conclusion is that I have some sort of code malfunction, but I can't seem to find it.
Any help would be appreciated.
And the second question: because I'm only providing real data to the FFT algorithm, should I expect only real-part data in the response vector too?
Thank you very much.
I was finally able to find out what was going wrong: it seems I didn't account for the pulse-code representation of the signal in the data (the samples are 16-bit integers, not doubles). Because I found many unanswered questions here about preparing a wave file for a Fourier transform, here is the code, in a function that prepares the wave file.
public static Double[] prepare(String wavePath, out int SampleRate)
{
Double[] data;
byte[] wave;
byte[] sR= new byte[4];
System.IO.FileStream WaveFile = System.IO.File.OpenRead(wavePath);
wave = new byte[WaveFile.Length];
data = new Double[(wave.Length - 44) / 2];//16-bit samples follow the 44-byte header
WaveFile.Read(wave, 0, Convert.ToInt32(WaveFile.Length));//read the wave file into the wave variable
/***********Converting and PCM accounting***************/
for (int i = 0; i < data.Length; i++)
{
    data[i] = BitConverter.ToInt16(wave, 44 + i * 2) / 32768.0;
    //32768.0 = 2^(n-1), n = bits per sample; scales the samples into [-1, 1)
}
/**************assigning sample rate**********************/
for (int i = 24; i < 28; i++)
{
sR[i-24]= wave[i];
}
SampleRate = BitConverter.ToInt32(sR,0);
return data;
}
All you need to do now is send the sample rate and the returned result to your FFT algorithm.
The code does no error handling, so add your own handling as needed.
It has been tested on phone recordings of busy tones, ringing, and speech, and it functions correctly.
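One detail worth adding for anyone reusing this: the FFT bin index of the peak maps back to a frequency through the sample rate and transform size. A small helper (illustrative, not part of the FFT class above):

```csharp
using System;

// Converts an FFT bin index to its center frequency in Hz:
// frequency = binIndex * sampleRate / fftSize
static double BinToFrequency(int binIndex, int sampleRate, int fftSize)
{
    return (double)binIndex * sampleRate / fftSize;
}
```

For the 8000 Hz file above with a 32768-point transform, each bin is 8000 / 32768 ≈ 0.244 Hz wide, which is plenty of resolution for telephony tones.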