Read binary code transmitted via radio in WAV recording - C#

I have some WAV files that were recorded from a radio transmission. Each contains information about who sent the transmission, and I want to be able to read this information.
The information is transmitted by sending x Hz for a 0 and y Hz for a 1 (more about AFSK on Wikipedia).
My problem is: how do I get the binary data out of the WAV file? Existing controls for C# would be nice, but some source code would be even better for understanding.
Any ideas?

The WAV file specification is your blueprint for reading the sound data from the WAV file. Sample code for reading and manipulating WAV files can be found in this CodeProject article.
To achieve the tone mapping, you can read this article, which describes how to write software that transfers data between two sound cards. To find out how much of a given frequency is present in a particular segment of the WAV file, you would use a Fourier transform.
Something like this:
double fourier1(double x_in[], double n, int length)
{
    double x_complex[2] = { 0, 0 };
    int i;
    for (i = 0; i < length; i++)
    {
        x_complex[0] += x_in[i] * cos(M_PI * 2 * i * n / (double) length);
        x_complex[1] += x_in[i] * sin(M_PI * 2 * i * n / (double) length);
    }
    return sqrt(x_complex[0]*x_complex[0] + x_complex[1]*x_complex[1]) / (double) length;
}
Where x_in is a series of numbers between -1 and 1, and n is the modified frequency:
(length * frequency / rate)
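To go from that magnitude measurement to bits, one simple (if naive) approach is to split the recording into one chunk per bit period and compare the magnitudes at the two tone frequencies. A minimal C# sketch; the 1200/2200 Hz pair is borrowed from Bell 202 AFSK purely as an example, since the question's x and y frequencies are unspecified:

```csharp
using System;

class AfskDemo
{
    // Magnitude of a single DFT bin, same math as the fourier1 routine above.
    public static double BinMagnitude(double[] x, double n)
    {
        double re = 0, im = 0;
        for (int i = 0; i < x.Length; i++)
        {
            re += x[i] * Math.Cos(Math.PI * 2 * i * n / x.Length);
            im += x[i] * Math.Sin(Math.PI * 2 * i * n / x.Length);
        }
        return Math.Sqrt(re * re + im * im) / x.Length;
    }

    // Classify one bit period by comparing the energy at the two candidate tones.
    public static int DetectBit(double[] chunk, double freq0, double freq1, double rate)
    {
        double n0 = chunk.Length * freq0 / rate; // "modified frequency" for the 0 tone
        double n1 = chunk.Length * freq1 / rate; // "modified frequency" for the 1 tone
        return BinMagnitude(chunk, n1) > BinMagnitude(chunk, n0) ? 1 : 0;
    }

    static void Main()
    {
        // Synthesize one bit period of the 2200 Hz tone and check it reads as a 1.
        double rate = 44100;
        var chunk = new double[512];
        for (int i = 0; i < chunk.Length; i++)
            chunk[i] = Math.Sin(2 * Math.PI * 2200 * i / rate);
        Console.WriteLine(DetectBit(chunk, 1200, 2200, rate)); // prints 1
    }
}
```

Real AFSK decoding also needs bit synchronization and some tolerance for frequency drift, but this shows the core tone comparison.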

Related

Can someone explain how this class (which generates a sine wave frequency that can be changed) works?

This is code that I found online somewhere. It works quite well, but I don't fully understand how it converts a bunch of math into an audio wave:
public static void Beeps(int Amplitude, int Frequency, int Duration)
{
    double A = ((Amplitude * (System.Math.Pow(2, 15))) / 1000) - 1;
    double DeltaFT = 2 * Math.PI * Frequency / 44100.0;
    int Samples = 441 * Duration / 10;
    int Bytes = Samples * 4;
    int[] Hdr =
    {
        0X46464952, 36 + Bytes, 0X45564157,
        0X20746D66, 16, 0X20001, 44100, 176400, 0X100004,
        0X61746164, Bytes
    };
    using (MemoryStream MS = new MemoryStream(44 + Bytes))
    {
        using (BinaryWriter BW = new BinaryWriter(MS))
        {
            for (int I = 0; I < Hdr.Length; I++)
            {
                BW.Write(Hdr[I]);
            }
            for (int T = 0; T < Samples; T++)
            {
                short Sample = System.Convert.ToInt16(A * Math.Sin(DeltaFT * T));
                BW.Write(Sample);
                BW.Write(Sample);
            }
            BW.Flush();
            MS.Seek(0, SeekOrigin.Begin);
            using (SoundPlayer SP = new SoundPlayer(MS))
            {
                SP.PlaySync();
            }
        }
    }
}
It looks like all it does is beep at certain pitches. The reason math converts into sound is because when the data is fed to your speaker, it's really bytes telling it how to vibrate during that instant.
If you're asking about how sound works, it's based on how vibrations move through the air. Vibrations exist as waves; they literally shake the air in certain patterns that your brain, via your ears, interprets as noise. If the sound has a higher pitch, the sound waves are closer together, and if it has a lower pitch, they're further apart. This is why a computer can "convert a bunch of math into an audio wave": that's all sound really is, a constantly manipulated wave. That method takes a frequency and creates a sine wave based on it, converts it to bytes, and feeds it to your speaker at a certain volume (Amplitude) and for a certain duration. Cool stuff, right?
Also, you're looking at a "method", not a class. :)
Here's more about sound if you're interested: https://en.wikipedia.org/wiki/Sound#Sound_wave_properties_and_characteristics
This answer has a good overview of how wav files work:
Simply sample the waveform at fixed intervals, and write the amplitude at each interval into your file.
That's what the BW.Write calls are doing. T represents the Time.
In order to play the sound, that data goes after the Hdr section, which is simply the correct header for a standard .wav file. 0X46464952 is ASCII for "RIFF" and 0X45564157 is "WAVE". The player needs to know the rate at which the wave was sampled; in this case it's 44100 Hz, a common standard.
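You can verify those header constants yourself: each of the magic ints is just four ASCII bytes packed little-endian. A quick sketch (assumes a little-endian machine, which is what the header layout above also assumes):

```csharp
using System;
using System.Text;

class HdrDemo
{
    // Unpack one of the little-endian header ints back into its four ASCII characters.
    public static string FourCC(int value) =>
        Encoding.ASCII.GetString(BitConverter.GetBytes(value));

    static void Main()
    {
        Console.WriteLine(FourCC(0x46464952)); // RIFF
        Console.WriteLine(FourCC(0x45564157)); // WAVE
        Console.WriteLine(FourCC(0x20746D66)); // fmt (with trailing space)
        Console.WriteLine(FourCC(0x61746164)); // data
    }
}
```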

Convert Audio data in IeeeFloat buffer to PCM in buffer

I use NAudio to capture sound input and the input appears as a buffer containing the sound information in IeeeFloat format.
Now that I have this data in the buffer, I want to translate it to PCM at a different sampling rate.
I have already figured out how to convert from IeeeFloat to PCM, and also convert between mono and stereo. Converting the sampling rate is the tough one.
Is there any solution, preferably using NAudio, that can convert the IeeeFloat buffer to a buffer in the PCM format of my choice (including changing the sampling rate)?
If you want to resample while you receive data, then you need to perform input driven resampling. I wrote an article on this a while ago.
NAudio has some helper classes to go from mono to stereo, and float to PCM, but they tend to operate on IWaveProvider or ISampleProvider inputs. Typically, if I just had the samples as a raw block of bytes, I'd write my own simple code to go from float to PCM and double up the samples. It's not that hard to do, and the WaveBuffer class will allow you to read float samples directly from a byte[].
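The manual conversion Mark describes might be sketched like this (plain arrays only, no NAudio types; the clamp is a common precaution since float input can slightly exceed ±1, and no resampling is done here):

```csharp
using System;

class FloatToPcm
{
    // Convert interleaved stereo IEEE float samples to 16-bit mono PCM.
    public static short[] StereoFloatToMonoPcm16(float[] stereo)
    {
        var mono = new short[stereo.Length / 2];
        for (int i = 0; i < mono.Length; i++)
        {
            // average the two channels, then scale (-1, 1) to the short range
            float avg = (stereo[2 * i] + stereo[2 * i + 1]) / 2f;
            // clamp to avoid wrap-around on out-of-range input
            avg = Math.Max(-1f, Math.Min(1f, avg));
            mono[i] = (short)(avg * short.MaxValue);
        }
        return mono;
    }

    static void Main()
    {
        var pcm = StereoFloatToMonoPcm16(new float[] { 0.5f, 0.5f, -1f, -1f });
        Console.WriteLine(pcm[0]); // 16383
        Console.WriteLine(pcm[1]); // -32767
    }
}
```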
I recently had to do this and couldn't find a built in way to do it, so I did just what Mark is talking about, converting the raw data manually. Below is code to downsample IeeeFloat (32 bit float samples), 48000 samples/second, 2 channels to 16 bit short, 16000 samples/second, 1 channel.
I hardcoded some things because my formats were known and fixed, but the same principles apply.
private void DownsampleFile()
{
    var file = {your file}
    using (var reader = new NAudio.Wave.WaveFileReader(file.FullName))
    using (var writer = new NAudio.Wave.WaveFileWriter({your output file}, MyWaveFormat))
    {
        float[] floats;
        // a variable to flag the mod-3-ness of the current sample
        // we're mapping 48000 --> 16000, so we need to average 3 source
        // samples to make 1 output sample
        var arity = -1;
        var runningSamples = new short[3];
        while ((floats = reader.ReadNextSampleFrame()) != null)
        {
            // simple average to collapse 2 channels into 1
            float mono = (float)(((double)floats[0] + (double)floats[1]) / 2);
            // convert (-1, 1) range float to short
            short sixteenbit = (short)(mono * 32767);
            // the input is 48000Hz and the output is 16000Hz, so we need 1/3rd of the data points
            // so save up 3 running samples and then mix and write to the file
            arity = (arity + 1) % 3;
            runningSamples[arity] = sixteenbit;
            // on the third of 3 running samples
            if (arity == 2)
            {
                // simple average of the 3, put in the 0th position
                runningSamples[0] = (short)(((int)runningSamples[0] + (int)runningSamples[1] + (int)runningSamples[2]) / 3);
                // write the one 16 bit short to the output
                writer.WriteData(runningSamples, 0, 1);
            }
        }
    }
}

Reading Geo tiff Latitude and Longitude [duplicate]

I have acquired Digital Elevation Maps(Height Map of Earth) of some area. My aim was to create Realistic Terrains.
Terrain Generation is no problem. I have practiced that using VC# & XNA framework.
The problem is that those height map files are in GeoTIFF format, which I don't know how to read. Nor do I have any previous experience reading image files, so I couldn't get far experimenting with the bits and pieces of advice available on the internet about reading GeoTIFF files. So far I have been unsuccessful.
The GeoTIFF files I have are 3601 x 3601 files.
Each file has two versions: a decimal-valued and a num-valued file.
Each file has data for every second of longitude and latitude of the geo-coordinates, along with the height map, i.e. lon, lat, height from sea level.
How do I read these files? :)
The files I have are from ASTER G-DEM Version 2 LINK TO OFFICIAL DESCRIPTION. According to them, GeoTIFF is pretty standard, which seems right, because some GeoTIFF visualizers I downloaded show me the correct data.
I am going to be using C#, so I would appreciate answers in relation to this language.
EDIT
Okay, I got LibTiff and this is what I have done:
using (Tiff tiff = Tiff.Open(@"Test\N41E071_dem.tif", "r"))
{
    int width = tiff.GetField(TiffTag.IMAGEWIDTH)[0].ToInt();
    int height = tiff.GetField(TiffTag.IMAGELENGTH)[0].ToInt();
    double dpiX = tiff.GetField(TiffTag.XRESOLUTION)[0].ToDouble();
    double dpiY = tiff.GetField(TiffTag.YRESOLUTION)[0].ToDouble();
    byte[] scanline = new byte[tiff.ScanlineSize()];
    ushort[] scanline16Bit = new ushort[tiff.ScanlineSize() / 2];
    for (int i = 0; i < height; i++)
    {
        tiff.ReadScanline(scanline, i); // loading ith line
        MultiplyScanLineAs16BitSamples(scanline, scanline16Bit, 16, i);
    }
}
private static void MultiplyScanLineAs16BitSamples(byte[] scanline, ushort[] temp, ushort factor, int row)
{
    if (scanline.Length % 2 != 0)
    {
        // each two bytes define one sample, so there should be an even number of bytes
        throw new ArgumentException();
    }
    Buffer.BlockCopy(scanline, 0, temp, 0, scanline.Length);
    for (int i = 0; i < temp.Length; i++)
    {
        temp[i] *= factor;
        MessageBox.Show("Row:" + row.ToString() + "Column:" + (i / 2).ToString() + "Value:" + temp[i].ToString());
    }
}
Where I am displaying the message box, I am displaying the corresponding values. Am I doing it right? I am asking because this is my first experience with images and the 8/16-bit issue. I think that, unlike the official LibTiff tutorials, I should be using short instead of ushort, because the images I am using are "GeoTIFF, signed 16 bits".
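For signed 16-bit elevation data like this, the scanline bytes can indeed be copied into a short[] rather than a ushort[]. A minimal self-contained sketch of just that conversion (it assumes little-endian byte order, which is what a plain Buffer.BlockCopy gives you on typical hardware):

```csharp
using System;

class ScanlineDemo
{
    // Copy a raw TIFF scanline (two bytes per sample) into signed 16-bit samples.
    public static short[] ToSignedSamples(byte[] scanline)
    {
        if (scanline.Length % 2 != 0)
            throw new ArgumentException("two bytes per sample expected");
        var samples = new short[scanline.Length / 2];
        Buffer.BlockCopy(scanline, 0, samples, 0, scanline.Length);
        return samples;
    }

    static void Main()
    {
        // 0xFF 0xFF is -1 as a little-endian signed 16-bit value,
        // which a ushort would misread as 65535.
        var samples = ToSignedSamples(new byte[] { 0xFF, 0xFF, 0x10, 0x00 });
        Console.WriteLine(samples[0]); // -1
        Console.WriteLine(samples[1]); // 16
    }
}
```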
There are some SDKs out there usable from C# to read GeoTIFF files:
http://www.bluemarblegeo.com/global-mapper/developer/developer.php#details (commercial)
http://bitmiracle.com/libtiff/ (free)
http://trac.osgeo.org/gdal/wiki/GdalOgrInCsharp (free?)
UPDATE:
The spec for GeoTIFF can be found here - to me it seems that GeoTIFFs can contain different "subtypes" of information which in turn need to be interpreted appropriately...
Here's a guy that did it without GDAL: http://build-failed.blogspot.com.au/2014/12/processing-geotiff-files-in-net-without.html
GDAL is available in NuGet, though.
If the GeoTIFF contains tiles, you need a different approach. This is how to read a GeoTIFF that contains 32-bit floats with height data:
int buffersize = 1000000;
using (Tiff tiff = Tiff.Open(geotifffile, "r"))
{
    int nooftiles = tiff.GetField(TiffTag.TILEBYTECOUNTS).Length;
    int width = tiff.GetField(TiffTag.TILEWIDTH)[0].ToInt();
    int height = tiff.GetField(TiffTag.TILELENGTH)[0].ToInt();
    byte[] buffer = new byte[buffersize];
    for (int i = 0; i < nooftiles; i++)
    {
        int size = tiff.ReadEncodedTile(i, buffer, 0, buffersize);
        float[,] data = new float[width, height];
        Buffer.BlockCopy(buffer, 0, data, 0, size); // convert byte array to x,y array of floats (height data)
        // Do whatever you want with the height data (calculate hillshade images etc.)
    }
}

FFT Inaccuracy for C#

I've been experimenting with the FFT algorithm. I use NAudio along with working FFT code from the internet. Based on my observations of its performance, the resulting pitch is inaccurate.
What happens is this: I have a MIDI file (generated from GuitarPro) converted to a WAV file (44.1 kHz, 16-bit, mono) that contains a pitch progression starting from E2 (the lowest guitar note) up to about E6. The results for the lower notes (around E2-B3) are generally very wrong. But from around C4 it is somewhat correct, in that you can already see the proper progression (the next note is C#4, then D4, etc.). However, the problem there is that the detected pitch is a half-note lower than the actual pitch (e.g. C4 should be the note, but D#4 is displayed).
What do you think may be wrong? I can post the code if necessary. Thanks very much! I'm still beginning to grasp the field of DSP.
Edit: Here is a rough sketch of what I'm doing:
byte[] buffer = new byte[8192];
int bytesRead;
do
{
    bytesRead = stream16.Read(buffer, 0, buffer.Length);
} while (bytesRead != 0);
And then (waveBuffer is simply a class that converts the byte[] into a float[], since the function only accepts float[]):
public int Read(byte[] buffer, int offset, int bytesRead)
{
    int frames = bytesRead / sizeof(float);
    float pitch = DetectPitch(waveBuffer.FloatBuffer, frames);
}
And lastly (SmbPitchShift is the class that has the FFT algorithm... I believe there's nothing wrong with it, so I'm not posting it here):
private float DetectPitch(float[] buffer, int inFrames)
{
    Func<int, int, float> window = HammingWindow;
    if (prevBuffer == null)
    {
        prevBuffer = new float[inFrames]; // only contains zeroes
    }
    // double frames since we are combining present and previous buffers
    int frames = inFrames * 2;
    if (fftBuffer == null)
    {
        fftBuffer = new float[frames * 2]; // times 2 because it is complex input
    }
    for (int n = 0; n < frames; n++)
    {
        if (n < inFrames)
        {
            fftBuffer[n * 2] = prevBuffer[n] * window(n, frames);
            fftBuffer[n * 2 + 1] = 0; // need to clear out as fft modifies buffer
        }
        else
        {
            fftBuffer[n * 2] = buffer[n - inFrames] * window(n, frames);
            fftBuffer[n * 2 + 1] = 0; // need to clear out as fft modifies buffer
        }
    }
    SmbPitchShift.smbFft(fftBuffer, frames, -1);
}
And for interpreting the result:
float binSize = sampleRate / frames;
int minBin = (int)(82.407 / binSize);   // lowest E string on the guitar
int maxBin = (int)(1244.508 / binSize); // highest E string on the guitar
float maxIntensity = 0f;
int maxBinIndex = 0;
for (int bin = minBin; bin <= maxBin; bin++)
{
    float real = fftBuffer[bin * 2];
    float imaginary = fftBuffer[bin * 2 + 1];
    float intensity = real * real + imaginary * imaginary;
    if (intensity > maxIntensity)
    {
        maxIntensity = intensity;
        maxBinIndex = bin;
    }
}
return binSize * maxBinIndex;
UPDATE (if anyone is still interested):
So, one of the answers below stated that the frequency peak from the FFT is not always equivalent to pitch. I understand that. But I wanted to try something for myself, given that there are times when the frequency peak IS the resulting pitch. So basically, I got two programs (SpectraPLUS and FFT Properties by DewResearch; credits to them) that are able to display the frequency domain of audio signals.
So here are the results of the frequency peaks:
SpectraPLUS
and FFT Properties:
This was done using a test note of A2 (around 110 Hz). Looking at the images, they show frequency peaks around the range of 102-112 Hz for SpectraPLUS and 108 Hz for FFT Properties. My code gives me 104 Hz (I use 8192-sample blocks and a sample rate of 44.1 kHz; the 8192 samples are then doubled to form the complex input, so in the end I get around a 5 Hz bin size, compared to the 10 Hz bin size of SpectraPLUS).
So now I'm a bit confused, since the programs seem to return the correct result, but my code always gives 104 Hz (note that I have compared the FFT function I used with others such as Math.NET, and it seems to be correct).
Do you think the problem may be with my interpretation of the data? Or do the programs do something else before displaying the frequency spectrum? Thanks!
It sounds like you may have an interpretation problem with your FFT output. A few random points:
- the FFT has a finite resolution - each output bin has a resolution of Fs / N, where Fs is the sample rate and N is the size of the FFT
- for notes which are low on the musical scale, the difference in frequency between successive notes is relatively small, so you will need a sufficiently large N to discriminate between notes which are a semitone apart (see note 1 below)
- the first bin (index 0) contains energy centered at 0 Hz but includes energy from +/- Fs / 2N
- bin i contains energy centered at i * Fs / N but includes energy from +/- Fs / 2N either side of this center frequency
- you will get spectral leakage from adjacent bins - how bad this is depends on what window function you use - with no window (i.e. a rectangular window) spectral leakage will be very bad (very broad peaks) - for frequency estimation you want to pick a window function that gives you sharp peaks
- pitch is not the same thing as frequency - pitch is a percept, frequency is a physical quantity - the perceived pitch of a musical instrument may be slightly different from the fundamental frequency, depending on the type of instrument (some instruments do not even produce significant energy at their fundamental frequency, yet we still perceive their pitch as if the fundamental were present)
My best guess from the limited information available though is that perhaps you are "off by one" somewhere in your conversion of bin index to frequency, or perhaps your FFT is too small to give you sufficient resolution for the low notes, and you may need to increase N.
You can also improve your pitch estimation via several techniques, such as cepstral analysis, or by looking at the phase component of your FFT output and comparing it for successive FFTs (this allows for a more accurate frequency estimate within a bin for a given FFT size).
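One simple alternative to the phase-based approach for estimating a frequency between bin centers is parabolic interpolation around the peak bin. A sketch (not taken from any of the code in this thread):

```csharp
using System;

class PeakInterp
{
    // Quadratic (parabolic) interpolation around the strongest magnitude bin.
    // Fits a parabola through the peak bin and its two neighbours and returns
    // the interpolated frequency; delta stays within (-0.5, 0.5) bins.
    public static double InterpolatePeak(double[] mag, int peakBin, double binSize)
    {
        double a = mag[peakBin - 1], b = mag[peakBin], c = mag[peakBin + 1];
        double delta = 0.5 * (a - c) / (a - 2 * b + c); // fractional bin offset
        return (peakBin + delta) * binSize;
    }

    static void Main()
    {
        // Fake magnitudes whose true peak lies between bins 20 and 21
        double[] mag = new double[32];
        mag[19] = 0.3; mag[20] = 1.0; mag[21] = 0.8;
        Console.WriteLine(InterpolatePeak(mag, 20, 5.38)); // a bit above bin 20
    }
}
```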
Notes
(1) Just to put some numbers on this, E2 is 82.4 Hz and F2 is 87.3 Hz, so you need a resolution somewhat better than 5 Hz to discriminate between the lowest two notes on a guitar (and much finer than this if you actually want to do, say, accurate tuning). At a 44.1 kHz sample rate you probably need an FFT of at least N = 8192 to give you sufficient resolution (44100 / 8192 = 5.4 Hz); N = 16384 would probably be better.
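Note (1)'s arithmetic can be checked in a few lines:

```csharp
using System;

class ResolutionDemo
{
    // Smallest power-of-two FFT size whose bin width Fs/N is below `needed` Hz.
    public static int SmallestFftSize(double fs, double needed)
    {
        int n = 1;
        while (fs / n > needed) n *= 2;
        return n;
    }

    static void Main()
    {
        // E2 = 82.407 Hz and F2 = 87.307 Hz are about 4.9 Hz apart
        int n = SmallestFftSize(44100.0, 87.307 - 82.407);
        Console.WriteLine(n);           // 16384
        Console.WriteLine(44100.0 / n); // ~2.69 Hz per bin
    }
}
```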
I thought this might help you. I made some plots of the 6 open strings of a guitar. The code is in Python using pylab, which I recommend for experimenting:
# analyze distorted guitar notes from
# http://www.freesound.org/packsViewSingle.php?id=643
#
# 329.6 E - open 1st string
# 246.9 B - open 2nd string
# 196.0 G - open 3rd string
# 146.8 D - open 4th string
# 110.0 A - open 5th string
# 82.4 E - open 6th string
from pylab import *
import wave

fs = 44100.0
N = 8192 * 10
t = r_[:N] / fs
f = r_[:N/2+1] * fs / N
gtr_fun = [329.6, 246.9, 196.0, 146.8, 110.0, 82.4]
gtr_wav = [wave.open('dist_gtr_{0}.wav'.format(n), 'r') for n in r_[1:7]]
gtr = [fromstring(g.readframes(N), dtype='int16') for g in gtr_wav]
gtr_t = [g / float64(max(abs(g))) for g in gtr]
gtr_f = [2 * abs(rfft(g)) / N for g in gtr_t]

def make_plots():
    for n in r_[:len(gtr_t)]:
        fig = figure()
        fig.subplots_adjust(wspace=0.5, hspace=0.5)
        subplot2grid((2,2), (0,0))
        plot(t, gtr_t[n]); axis('tight')
        title('String ' + str(n+1) + ' Waveform')
        subplot2grid((2,2), (0,1))
        plot(f, gtr_f[n]); axis('tight')
        title('String ' + str(n+1) + ' DFT')
        subplot2grid((2,2), (1,0), colspan=2)
        M = int(gtr_fun[n] * 16.5 / fs * N)
        plot(f[:M], gtr_f[n][:M]); axis('tight')
        title('String ' + str(n+1) + ' DFT (16 Harmonics)')

if __name__ == '__main__':
    make_plots()
    show()
String 1, fundamental = 329.6 Hz:
String 2, fundamental = 246.9 Hz:
String 3, fundamental = 196.0 Hz:
String 4, fundamental = 146.8 Hz:
String 5, fundamental = 110.0 Hz:
String 6, fundamental = 82.4 Hz:
The fundamental frequency isn't always the dominant harmonic. It determines the spacing between harmonics of a periodic signal.
I had a similar question, and the answer for me was to use Goertzel instead of an FFT. If you know what tones you are looking for (MIDI), Goertzel is capable of detecting the tones to within one sine wave (one cycle). It does this by generating the sine wave of the sound and "placing it on top of the raw data" to see if it exists. An FFT samples large amounts of data to provide an approximate frequency spectrum.
Musical pitch is different from the frequency peak. Pitch is a psycho-perceptual phenomenon that may depend more on the overtones and such. The frequency of what a human would call the pitch could be missing or quite small in the actual signal spectrum.
And a frequency peak in a spectrum can be different from any FFT bin center. The FFT bin center frequencies change in frequency and spacing depending only on the FFT length and sample rate, not on the spectrum of the data.
So you have at least two problems to contend with. There are a ton of academic papers on frequency estimation, as well as on the separate subject of pitch estimation. Start there.

Detect a specific frequency/tone from raw wave-data

I am reading a raw wave stream coming from the microphone. (This part works as I can send it to the speaker and get a nice echo.)
For simplicity lets say I want to detect a DTMF-tone in the wave data. In reality I want to detect any frequency, not just those in DTMF. But I always know which frequency I am looking for.
I have tried running it through an FFT, but it doesn't seem very efficient if I want high accuracy in the detection (say the tone is present for only 20 ms). I can only detect it down to an accuracy of around 200 ms.
What are my options with regards to algorithms?
Are there any .Net libs for it?
You may want to look at the Goertzel algorithm if you're trying to detect specific frequencies such as DTMF input. There is a C# DTMF generator/detector library on Sourceforge based on this algorithm.
A very nice implementation of Goertzel is available here. A C# modification:
private double GoertzelFilter(float[] samples, double freq, int start, int end)
{
    double sPrev = 0.0;
    double sPrev2 = 0.0;
    int i;
    double normalizedfreq = freq / SIGNAL_SAMPLE_RATE;
    double coeff = 2 * Math.Cos(2 * Math.PI * normalizedfreq);
    for (i = start; i < end; i++)
    {
        double s = samples[i] + coeff * sPrev - sPrev2;
        sPrev2 = sPrev;
        sPrev = s;
    }
    double power = sPrev2 * sPrev2 + sPrev * sPrev - coeff * sPrev * sPrev2;
    return power;
}
Works great for me.
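As a quick sanity check that the filter behaves as described, you can feed it a synthesized tone and compare the power at the tone frequency against some other frequency. A self-contained sketch, with the filter reproduced and an assumed 8 kHz sample rate:

```csharp
using System;

class GoertzelUsage
{
    const double SIGNAL_SAMPLE_RATE = 8000.0;

    // Same Goertzel filter as above, reproduced so this sketch is self-contained.
    public static double GoertzelFilter(float[] samples, double freq, int start, int end)
    {
        double sPrev = 0.0, sPrev2 = 0.0;
        double coeff = 2 * Math.Cos(2 * Math.PI * freq / SIGNAL_SAMPLE_RATE);
        for (int i = start; i < end; i++)
        {
            double s = samples[i] + coeff * sPrev - sPrev2;
            sPrev2 = sPrev;
            sPrev = s;
        }
        return sPrev2 * sPrev2 + sPrev * sPrev - coeff * sPrev * sPrev2;
    }

    static void Main()
    {
        // ~205 ms of the 770 Hz DTMF row tone at 8 kHz
        var samples = new float[1640];
        for (int i = 0; i < samples.Length; i++)
            samples[i] = (float)Math.Sin(2 * Math.PI * 770.0 * i / SIGNAL_SAMPLE_RATE);

        double atTone = GoertzelFilter(samples, 770.0, 0, samples.Length);
        double offTone = GoertzelFilter(samples, 941.0, 0, samples.Length);
        Console.WriteLine(atTone > 100 * offTone); // True: energy concentrates at 770 Hz
    }
}
```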
Let's say that a typical DTMF frequency is 200 Hz-1000 Hz. Then you'd have to detect a signal based on between 4 and 20 cycles. An FFT will not get you anywhere, I guess, since you'll detect only frequencies that are multiples of 50 Hz: this is a built-in feature of the FFT, and increasing the number of samples will not solve your problem. You'll have to do something more clever.
Your best shot is a linear least-squares fit of your data to
h(t) = A cos(ωt) + B sin(ωt)
for a given ω (one of the DTMF frequencies). See this for details (in particular how to set a statistical significance level) and for links to the literature.
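A bare-bones C# sketch of that least-squares fit, solving the 2x2 normal equations for A and B at a known ω (ω here is in radians per sample; the 697 Hz tone and 8 kHz rate in the demo are just example numbers):

```csharp
using System;

class ToneFit
{
    // Least-squares fit of h[i] ≈ A cos(w*i) + B sin(w*i) for a known w.
    // Returns the fitted amplitude sqrt(A^2 + B^2).
    public static double FitAmplitude(double[] h, double w)
    {
        double cc = 0, cs = 0, ss = 0, ch = 0, sh = 0;
        for (int i = 0; i < h.Length; i++)
        {
            double c = Math.Cos(w * i), s = Math.Sin(w * i);
            cc += c * c; cs += c * s; ss += s * s;
            ch += c * h[i]; sh += s * h[i];
        }
        // solve the normal equations [cc cs; cs ss] [A; B] = [ch; sh]
        double det = cc * ss - cs * cs;
        double a = (ch * ss - sh * cs) / det;
        double b = (sh * cc - ch * cs) / det;
        return Math.Sqrt(a * a + b * b);
    }

    static void Main()
    {
        // 697 Hz DTMF row tone: 20 ms at 8 kHz = 160 samples
        double w = 2 * Math.PI * 697.0 / 8000.0;
        var h = new double[160];
        for (int i = 0; i < h.Length; i++) h[i] = 0.8 * Math.Sin(w * i);
        Console.WriteLine(FitAmplitude(h, w)); // ~0.8
    }
}
```

A large fitted amplitude at one of the candidate frequencies (relative to the noise floor) indicates the tone is present; the statistical significance test from the linked reference decides the threshold.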
I found this as a simple implementation of Goertzel. I haven't gotten it to work yet (maybe I'm looking for the wrong frequency?), but I thought I'd share it anyway. It is copied from this site.
public static double CalculateGoertzel(byte[] sample, double frequency, int samplerate)
{
    double Skn, Skn1, Skn2;
    Skn = Skn1 = Skn2 = 0;
    for (int i = 0; i < sample.Length; i++)
    {
        Skn2 = Skn1;
        Skn1 = Skn;
        Skn = 2 * Math.Cos(2 * Math.PI * frequency / samplerate) * Skn1 - Skn2 + sample[i];
    }
    double WNk = Math.Exp(-2 * Math.PI * frequency / samplerate);
    return 20 * Math.Log10(Math.Abs(Skn - WNk * Skn1));
}
As far as any .NET libraries that do this try TAPIEx ToneDecoder.Net Component. I use it for detecting DTMF, but it can do custom tones as well.
I know this question is old, but maybe it will save someone else several days of searching and trying out code samples and libraries that just don't work.
Spectral analysis.
All applications where you extract frequencies from signals fall under the field of spectral analysis.
