C# .wav file processing advice needed - c#

What I am making is a small audio editor that loads a .wav file and displays it in the time domain. The user can select a part of it and zoom in or DFT the chunk which displays a small window in the frequency domain. (extra functionality to be added later)
I think I am making a mistake when splitting my byte array into two float arrays.
I use NAudio to get my samples into a byte array.
Then I use a loop I found on stack overflow to split the array into left and right channels.
private void readToArrays(WaveFileReader pcm) {
wf = pcm.WaveFormat;
int samplesDesired = (int)pcm.Length;
buffer = new byte[samplesDesired * 4];
left = new float[samplesDesired];
right = new float[samplesDesired];
int bytesRead = pcm.Read(buffer, 0, samplesDesired);
if (wf.BitsPerSample == 16) {
if (wf.Channels == 1) {
for (int i = 0; i < buffer.Length / 4; i++)
//handle
}
else if (wf.Channels == 2) {
int index = 0;
for (int sample = 0; sample < bytesRead / 4; sample++) {
left[sample] = BitConverter.ToInt16(buffer, index);
index += 2;
right[sample] = BitConverter.ToInt16(buffer, index);
index += 2;
}
}
}else if (wf.BitsPerSample == 8) {
if (wf.Channels == 1) {
//handle
}
else if (wf.Channels == 2) {
//handle
}
}
}
TL:DR: I get a lot of noise from playback with individual files. What I need is some advice as to how I can still be able to modify my samples / DFT them AND be able to output them without noise. Do note that I am just a 2nd year Computing student and do not have years of experience.
Extra Info : Bit Depth = 16; Sample Rate = 22050;

Related

Algorithms and techniques for string search across multiple GiB of text files

I have to create a utility that searches through 40 to 60 GiB of text files as quick as possible.
Each file has around 50 MB of data that consists of log lines (about 630.000 lines per file).
A NOSQL document database is unfortunately no option...
As of now I am using a Aho-Corsaick algorithm for the search which I stole from Tomas Petricek off of his blog. It works very well.
I process the files in Tasks. Each file is loaded into memory by simply calling File.ReadAllLines(path). The lines are then fed into the Aho-Corsaick one by one, thus each file causes around 600.000 calls to the algorithm (I need the line number in my results).
This takes a lot of time and requires a lot of memory and CPU.
I have very little expertise in this field as I usually work in image processing.
Can you guys recommend algorithms and approaches which could speed up the processing?
Below is more detailed view to the Task creation and file loading which is pretty standard. For more information on the Aho-Corsaick, please visit the linked blog page above.
private KeyValuePair<string, StringSearchResult[]> FindInternal(
IStringSearchAlgorithm algo,
string file)
{
List<StringSearchResult> result = new List<StringSearchResult>();
string[] lines = File.ReadAllLines(file);
for (int i = 0; i < lines.Length; i++)
{
var results = algo.FindAll(lines[i]);
for (int j = 0; j < results.Length; j++)
{
results[j].Row = i;
}
}
foreach (string line in lines)
{
result.AddRange(algo.FindAll(line));
}
return new KeyValuePair<string, StringSearchResult[]>(
file, result.ToArray());
}
public Dictionary<string, StringSearchResult[]> Find(
params string[] search)
{
IStringSearchAlgorithm algo = new StringSearch();
algo.Keywords = search;
Task<KeyValuePair<string, StringSearchResult[]>>[] findTasks
= new Task<KeyValuePair<string, StringSearchResult[]>>[_files.Count];
Parallel.For(0, _files.Count, i => {
findTasks[i] = Task.Factory.StartNew(
() => FindInternal(algo, _files[i])
);
});
Task.WaitAll(findTasks);
return findTasks.Select(t => t.Result)
.ToDictionary(x => x.Key, x => x.Value);
}
EDIT
See section Initial Answer for the original Answer.
I further optimized my code by doing the following:
Added paging to prevent memory overflow / crash due to large amount of result data.
I offload the search results into local files as soon as they exceed a certain buffer size (64kb in my case).
Offloading the results required me to convert my SearchData struct to binary and back.
Splicing the array of files which are processed and running them in Tasks greatly increased performance (from 35 sec to 9 sec when processing about 25 GiB of search data)
Splicing / scaling the file array
The code below gives a scaled/normalized value for T_min and T_max.
This value can then be used to determine the size of each array holding n-amount of file paths.
private int ScalePartition(int T_min, int T_max)
{
// Scale m to range.
int m = T_max / 2;
int t_min = 4;
int t_max = Math.Max(T_max / 16, T_min);
m = ((T_min - m) / (T_max - T_min)) * (t_max - t_min) + t_max;
return m;
}
This code shows the implementation of the scaling and splicing.
// Get size of file array portion.
int scale = ScalePartition(1, _files.Count);
// Iterator.
int n = 0;
// List containing tasks.
List<Task<SearchData[]>> searchTasks = new List<Task<SearchData[]>>();
// Loop through files.
while (n < _files.Count) {
// Local instance of n.
// You will get an AggregateException if you use n
// as n changes during runtime.
int num = n;
// The amount of items to take.
// This needs to be calculated as there might be an
// odd number of elements in the file array.
int cnt = n + scale > _files.Count ? _files.Count - n : scale;
// Run the Find(int, int, Regex[]) method and add as task.
searchTasks.Add(Task.Run(() => Find(num, cnt, regexes)));
// Increment iterator by the amount of files stored in scale.
n += scale;
}
Initial Answer
I had the best results so far after switching to MemoryMappedFile and moving from the Aho-Corsaick back to Regex (a demand has been made that pattern matching is a must have).
There are still parts that can be optimized or changed and I'm sure this is not the fastest or best solution but for it's alright.
Here is the code which returns the results in 30 seconds for 25 GiB worth of data:
// GNU coreutil wc defined buffer size.
// Had best performance with this buffer size.
//
// Definition in wc.c:
// -------------------
// /* Size of atomic reads. */
// #define BUFFER_SIZE (16 * 1024)
//
private const int BUFFER_SIZE = 16 * 1024;
private KeyValuePair<string, SearchData[]> FindInternal(Regex[] rgx, string file)
{
// Buffer for data segmentation.
byte[] buffer = new byte[BUFFER_SIZE];
// Get size of file.
FileInfo fInfo = new FileInfo(file);
long fSize = fInfo.Length;
fInfo = null;
// List of results.
List<SearchData> results = new List<SearchData>();
// Create MemoryMappedFile.
string name = "mmf_" + Path.GetFileNameWithoutExtension(file);
using (var mmf = MemoryMappedFile.CreateFromFile(
file, FileMode.Open, name))
{
// Create read-only in-memory access to file data.
using (var accessor = mmf.CreateViewStream(
0, fSize,
MemoryMappedFileAccess.Read))
{
// Store current position.
int pos = (int)accessor.Position;
// Check if file size is less then the
// default buffer size.
int cnt = (int)(fSize - BUFFER_SIZE > 0
? BUFFER_SIZE
: fSize - BUFFER_SIZE);
// Iterate through file until end of file is reached.
while (accessor.Position < fSize)
{
// Write data to buffer.
accessor.Read(buffer, 0, cnt);
// Update position.
pos = (int)accessor.Position;
// Update next buffer size.
cnt = (int)(fSize - pos >= BUFFER_SIZE
? BUFFER_SIZE
: fSize - pos);
// Convert buffer data to string for Regex search.
string s = Encoding.UTF8.GetString(buffer);
// Run regex against extracted data.
foreach (Regex r in rgx) {
// Get matches.
MatchCollection matches = r.Matches(s);
// Create SearchData struct to reduce memory
// impact and only keep relevant data.
foreach (Match m in matches) {
SearchData sd = new SearchData();
// The actual matched string.
sd.Match = m.Value;
// The index in the file.
sd.Index = m.Index + pos;
// Index to find beginning of line.
int nFirst = m.Index;
// Index to find end of line.
int nLast = m.Index;
// Go back in line until the end of the
// preceeding line has been found.
while (s[nFirst] != '\n' && nFirst > 0) {
nFirst--;
}
// Append length of \r\n (new line).
// Change this to 1 if you work on Unix system.
nFirst+=2;
// Go forth in line until the end of the
// current line has been found.
while (s[nLast] != '\n' && nLast < s.Length-1) {
nLast++;
}
// Remove length of \r\n (new line).
// Change this to 1 if you work on Unix system.
nLast-=2;
// Store whole line in SearchData struct.
sd.Line = s.Substring(nFirst, nLast - nFirst);
// Add result.
results.Add(sd);
}
}
}
}
}
return new KeyValuePair<string, SearchData[]>(file, results.ToArray());
}
public List<KeyValuePair<string, SearchData[]>> Find(params string[] search)
{
var results = new List<KeyValuePair<string, SearchData[]>>();
// Prepare regex objects.
Regex[] regexes = new Regex[search.Length];
for (int i=0; i<regexes.Length; i++) {
regexes[i] = new Regex(search[i], RegexOptions.Compiled);
}
// Get all search results.
// Creating the Regex once and passing it
// to the sub-routine is best as the regex
// engine adds a lot of overhead.
foreach (var file in _files) {
var data = FindInternal(regexes, file);
results.Add(data);
}
return results;
}
I had a stupid idea yesterday were I though that it might work out converting the file data to a bitmap and looking for the input within pixels as pixel checking is quite fast.
Just for the giggles... here is the non-optimized test code for that stupid idea:
public struct SearchData
{
public string Line;
public string Search;
public int Row;
public SearchData(string l, string s, int r) {
Line = l;
Search = s;
Row = r;
}
}
internal static class FileToImage
{
public static unsafe SearchData[] FindText(string search, Bitmap bmp)
{
byte[] buffer = Encoding.ASCII.GetBytes(search);
BitmapData data = bmp.LockBits(
new Rectangle(0, 0, bmp.Width, bmp.Height),
ImageLockMode.ReadOnly, bmp.PixelFormat);
List<SearchData> results = new List<SearchData>();
int bpp = Bitmap.GetPixelFormatSize(bmp.PixelFormat) / 8;
byte* ptFirst = (byte*)data.Scan0;
byte firstHit = buffer[0];
bool isFound = false;
for (int y=0; y<data.Height; y++) {
byte* ptStride = ptFirst + (y * data.Stride);
for (int x=0; x<data.Stride; x++) {
if (firstHit == ptStride[x]) {
byte[] temp = new byte[buffer.Length];
if (buffer.Length < data.Stride-x) {
int ret = 0;
for (int n=0, xx=x; n<buffer.Length; n++, xx++) {
if (ptStride[xx] != buffer[n]) {
break;
}
ret++;
}
if (ret == buffer.Length) {
int lineLength = 0;
for (int n = 0; n<data.Stride; n+=bpp) {
if (ptStride[n+2] == 255 &&
ptStride[n+1] == 255 &&
ptStride[n+0] == 255)
{
lineLength=n;
}
}
SearchData sd = new SearchData();
byte[] lineBytes = new byte[lineLength];
Marshal.Copy((IntPtr)ptStride, lineBytes, 0, lineLength);
sd.Search = search;
sd.Line = Encoding.ASCII.GetString(lineBytes);
sd.Row = y;
results.Add(sd);
}
}
}
}
}
return results.ToArray();
bmp.UnlockBits(data);
return null;
}
private static unsafe Bitmap GetBitmapInternal(string[] lines, int startIndex, Bitmap bmp)
{
int bpp = Bitmap.GetPixelFormatSize(bmp.PixelFormat) / 8;
BitmapData data = bmp.LockBits(
new Rectangle(0, 0, bmp.Width, bmp.Height),
ImageLockMode.ReadWrite,
bmp.PixelFormat);
int index = startIndex;
byte* ptFirst = (byte*)data.Scan0;
int maxHeight = bmp.Height;
if (lines.Length - startIndex < maxHeight) {
maxHeight = lines.Length - startIndex -1;
}
for (int y = 0; y < maxHeight; y++) {
byte* ptStride = ptFirst + (y * data.Stride);
index++;
int max = lines[index].Length;
max += (max % bpp);
lines[index] += new string('\0', max % bpp);
max = lines[index].Length;
for (int x=0; x+2<max; x+=bpp) {
ptStride[x+0] = (byte)lines[index][x+0];
ptStride[x+1] = (byte)lines[index][x+1];
ptStride[x+2] = (byte)lines[index][x+2];
}
ptStride[max+2] = 255;
ptStride[max+1] = 255;
ptStride[max+0] = 255;
for (int x = max + bpp; x < data.Stride; x += bpp) {
ptStride[x+2] = 0;
ptStride[x+1] = 0;
ptStride[x+0] = 0;
}
}
bmp.UnlockBits(data);
return bmp;
}
public static unsafe Bitmap[] GetBitmap(string filePath)
{
int bpp = Bitmap.GetPixelFormatSize(PixelFormat.Format24bppRgb) / 8;
var lines = System.IO.File.ReadAllLines(filePath);
int y = 0x800; //lines.Length / 0x800;
int x = lines.Max(l => l.Length) / bpp;
int cnt = (int)Math.Ceiling((float)lines.Length / (float)y);
Bitmap[] results = new Bitmap[cnt];
for (int i = 0; i < results.Length; i++) {
results[i] = new Bitmap(x, y, PixelFormat.Format24bppRgb);
results[i] = GetBitmapInternal(lines, i * 0x800, results[i]);
}
return results;
}
}
You can split the file into partitions and regex search each partition in parallel then join the results. There are some sharp edges in the details like handling values that span two partitions. Gigantor is a c# library I have created that does this very thing. Feel free to try it or have a look at the source code.

Accessing processed values from FFT

I am attempting to use Lomont FFT in order to return complex numbers to build a spectrogram / spectral density chart using c#.
I am having trouble understanding how to return values from the class.
Here is the code I have put together thus far which appears to be working.
int read = 0;
Double[] data;
byte[] buffer = new byte[1024];
FileStream wave = new FileStream(args[0], FileMode.Open, FileAccess.Read);
read = wave.Read(buffer, 0, 44);
read = wave.Read(buffer, 0, 1024);
data = new Double[read];
for (int i = 0; i < read; i+=2)
{
data[i] = BitConverter.ToInt16(buffer, i) / 32768.0;
Console.WriteLine(data[i]);
}
LomontFFT LFFT = new LomontFFT();
LFFT.FFT(data, true);
What I am not clear on is, how to return/access the values from Lomont FFT implementation back into my application (console)?
Being pretty new to c# development, I'm thinking I am perhaps missing a fundamental aspect of understanding regarding how to retrieve processed values from the instance of the Lomont Class, or perhaps even calling it incorrectly.
Console.WriteLine(LFFT.A); // Returns 0
Console.WriteLine(LFFT.B); // Returns 1
I have been searching for a code snippet or explanation of how to do this, but so far have come up with nothing that I understand or explains this particular aspect of the issue I am facing. Any guidance would be greatly appreciated.
A subset of the results held in data array noted in the code above can be found below and based on my current understanding, appear to be valid:
0.00531005859375
0.0238037109375
0.041473388671875
0.0576171875
0.07183837890625
0.083465576171875
0.092193603515625
0.097625732421875
0.099639892578125
0.098114013671875
0.0931396484375
0.0848388671875
0.07354736328125
0.05963134765625
0.043609619140625
0.026031494140625
0.007476806640625
-0.011260986328125
-0.0296630859375
-0.047027587890625
-0.062713623046875
-0.076141357421875
-0.086883544921875
-0.09454345703125
-0.098785400390625
-0.0994873046875
-0.0966796875
-0.090362548828125
-0.080810546875
-0.06842041015625
-0.05352783203125
-0.036712646484375
-0.0185546875
What am I actually attempting to do? (perspective)
I am looking to load a wave file into a console application and return a spectrogram/spectral density chart/image as a jpg/png for further processing.
The wave files I am reading are mono in format
UPDATE 1
I Receive slightly different results depending on which FFT is used.
Using RealFFT
for (int i = 0; i < read; i+=2)
{
data[i] = BitConverter.ToInt16(buffer, i) / 32768.0;
//Console.WriteLine(data[i]);
}
LomontFFT LFFT = new LomontFFT();
LFFT.RealFFT(data, true);
for (int i = 0; i < buffer.Length / 2; i++)
{
System.Console.WriteLine("{0}",
Math.Sqrt(data[2 * i] * data[2 * i] + data[2 * i + 1] * data[2 * i + 1]));
}
Partial Result of RealFFT
0.314566983321381
0.625242818210924
0.30314888696868
0.118468857708093
0.0587697011760449
0.0369034115568654
0.0265842582236275
0.0207195964060356
0.0169601273233317
0.0143745438577886
0.012528799609089
0.0111831275153128
0.0102313284519146
0.00960198279358434
0.00920236001619566
Using FFT
for (int i = 0; i < read; i+=2)
{
data[i] = BitConverter.ToInt16(buffer, i) / 32768.0;
//Console.WriteLine(data[i]);
}
double[] bufferB = new double[2 * data.Length];
for (int i = 0; i < data.Length; i++)
{
bufferB[2 * i] = data[i];
bufferB[2 * i + 1] = 0;
}
LomontFFT LFFT = new LomontFFT();
LFFT.FFT(bufferB, true);
for (int i = 0; i < bufferB.Length / 2; i++)
{
System.Console.WriteLine("{0}",
Math.Sqrt(bufferB[2 * i] * bufferB[2 * i] + bufferB[2 * i + 1] * bufferB[2 * i + 1]));
}
Partial Result of FFT:
0.31456698332138
0.625242818210923
0.303148886968679
0.118468857708092
0.0587697011760447
0.0369034115568653
0.0265842582236274
0.0207195964060355
0.0169601273233317
0.0143745438577886
0.012528799609089
0.0111831275153127
0.0102313284519146
0.00960198279358439
0.00920236001619564
Looking at the LomontFFT.FFT documentation:
Compute the forward or inverse Fourier Transform of data, with
data containing complex valued data as alternating real and
imaginary parts. The length must be a power of 2. The data is
modified in place.
This tells us a few things. First the function is expecting complex-valued data whereas your data is real. A quick fix for this is to create another buffer of twice the size and setting all the imaginary parts to 0:
double[] buffer = new double[2*data.Length];
for (int i=0; i<data.Length; i++)
{
buffer[2*i] = data[i];
buffer[2*i+1] = 0;
}
The documentation also tells us that the computation is done in place. That means that after the call to FFT returns, the input array is replaced with the computed result. You could thus print the spectrum with:
LomontFFT LFFT = new LomontFFT();
LFFT.FFT(buffer, true);
for (int i = 0; i < buffer.Length/2; i++)
{
System.Console.WriteLine("{0}",
Math.Sqrt(buffer[2*i]*buffer[2*i]+buffer[2*i+1]*buffer[2*i+1]));
}
Note since your input data is real valued you could also use LomontFFT.RealFFT. In that case, given a slightly different packing rule, you would obtain the FFT results using:
LomontFFT LFFT = new LomontFFT();
LFFT.RealFFT(data, true);
System.Console.WriteLine("{0}", Math.Abs(data[0]);
for (int i = 1; i < data.Length/2; i++)
{
System.Console.WriteLine("{0}",
Math.Sqrt(data[2*i]*data[2*i]+data[2*i+1]*data[2*i+1]));
}
System.Console.WriteLine("{0}", Math.Abs(data[1]);
This would give you the non-redundant lower half of the spectrum (Unlike LomontFFT.FFT which provides the entire spectrum). Also, numerical differences on the order of double precision (around 1e-16 times the spectrum peak value) with respect to LomontFFT.FFT can be expected.

Unwanted Audio Intensity Decrease

I have a c# windows form project with the following c# code which uses BiQuadFilter class of NAudio library to implement low pass filter. The problem I am facing is that the intensity of the sound is also dropping along with the frequency thus making the volume barely audible. How can I fix this?
My Code:
private ISampleProvider sourceProvider;
private BiQuadFilter[] filters;
private int channels,cutOffFreq;
//Constructor
public MyFilter(ISampleProvider sourceProvider,int cutOffFreq)
{
this.sourceProvider = sourceProvider;
this.cutOffFreq = cutOffFreq;
channels = sourceProvider.WaveFormat.Channels;
filters = new BiQuadFilter[channels];
CreateFilters();
}
private void CreateFilters()
{
for (int n = 0; n < channels; n++)
if (filters[n] == null)
filters[n] = BiQuadFilter.LowPassFilter(44100, cutOffFreq, 1);
else
filters[n].SetLowPassFilter(44100, cutOffFreq, 1);
}
public WaveFormat WaveFormat { get { return sourceProvider.WaveFormat; } }
public int Read(float[] buffer, int offset, int count)
{
int samplesRead = sourceProvider.Read(buffer, offset, count);
for (int i = 0; i < samplesRead; i++)
buffer[offset + i] = filters[(i % channels)].Transform(buffer[offset + i]);
return samplesRead;
}
This is how I am using it:
waveOut.Init(new MyFilter(new AudioFileReader(path + "\\Audio_Tracks\\" + topic + "\\" + currentTrack + ".wav"), 750));
waveOut.Play();
Well I don't know what your source material is, but a low pass filter with a cutoff frequency of just 750Hz is potentially going to be cutting a lot out of your original signal. Generally speaking a filter will make things quieter. Amplify the sound afterwards (e.g. multiply each sample by a gain factor after transforming it).

VB6: How are binary files encoded? using Put statement

I have this code
Open WritingPath & "\FplDb.txt" For Random As #1 Len = Len(WpRec)
For i = 1 To 99
WpRec.WpIndex = FplDB(i, 1)
WpRec.WpName = FplDB(i, 2)
WpRec.WpLat = FplDB(i, 3)
WpRec.WpLon = FplDB(i, 4)
WpRec.WpLatDir = FplDB(i, 5)
WpRec.WpLonDir = FplDB(i, 6)
Put #1, i, WpRec
Next i
Close #1
SaveOk = 1
FplSave = SaveOk
Exit Function
This function makes binary serialization of a matrix of 99 structs (WpRec) to file, using "Open" and "Put" statements. But I didn't get how it is encoded... It is important to me because I need to rewrite the same serialization in C# but I need to know what encoding method is used for that so I can do the same in C#....
The tricky bit in VB6 was that you were allowed to declare structures with fixed length strings so that you could write records containing strings that didn't need a length prefix. The length of the string buffer was encoded into the type instead of needing to be written out with the record. This allowed for fixed size records. In .NET, this has kind of been left behind in the sense that VB.NET has a mechanism to support it for backward compatibility, but it's not really intended for C# as far as I can tell: How to declare a fixed-length string in VB.NET?.
.NET seems to have a preference for generally writing out strings with a length prefix, meaning that records are generally variable-length. This is suggested by the implementation of BinaryReader.ReadString.
However, you can use System.BitConverter to get finer control over how records are serialized and de-serialized as bytes (System.IO.BinaryReader and System.IO.BinaryWriter are probably not useful since they make assumptions that strings have a length prefix). Keep in mind that a VB6 Integer maps to a .NET Int16 and a VB6 Long is a .Net Int32. I don't know exactly how you have defined your VB6 structure, but here's one possible implementation as an example:
class Program
{
static void Main(string[] args)
{
WpRecType[] WpRec = new WpRecType[3];
WpRec[0] = new WpRecType();
WpRec[0].WpIndex = 0;
WpRec[0].WpName = "New York";
WpRec[0].WpLat = 40.783f;
WpRec[0].WpLon = 73.967f;
WpRec[0].WpLatDir = 1;
WpRec[0].WpLonDir = 1;
WpRec[1] = new WpRecType();
WpRec[1].WpIndex = 1;
WpRec[1].WpName = "Minneapolis";
WpRec[1].WpLat = 44.983f;
WpRec[1].WpLon = 93.233f;
WpRec[1].WpLatDir = 1;
WpRec[1].WpLonDir = 1;
WpRec[2] = new WpRecType();
WpRec[2].WpIndex = 2;
WpRec[2].WpName = "Moscow";
WpRec[2].WpLat = 55.75f;
WpRec[2].WpLon = 37.6f;
WpRec[2].WpLatDir = 1;
WpRec[2].WpLonDir = 2;
byte[] buffer = new byte[WpRecType.RecordSize];
using (System.IO.FileStream stm =
new System.IO.FileStream(#"C:\Users\Public\Documents\FplDb.dat",
System.IO.FileMode.OpenOrCreate, System.IO.FileAccess.ReadWrite))
{
WpRec[0].SerializeInto(buffer);
stm.Write(buffer, 0, buffer.Length);
WpRec[1].SerializeInto(buffer);
stm.Write(buffer, 0, buffer.Length);
WpRec[2].SerializeInto(buffer);
stm.Write(buffer, 0, buffer.Length);
// Seek to record #1, load and display it
stm.Seek(WpRecType.RecordSize * 1, System.IO.SeekOrigin.Begin);
stm.Read(buffer, 0, WpRecType.RecordSize);
WpRecType rec = new WpRecType(buffer);
Console.WriteLine("[{0}] {1}: {2} {3}, {4} {5}", rec.WpIndex, rec.WpName,
rec.WpLat, (rec.WpLatDir == 1) ? "N" : "S",
rec.WpLon, (rec.WpLonDir == 1) ? "W" : "E");
}
}
}
class WpRecType
{
public short WpIndex;
public string WpName;
public Single WpLat;
public Single WpLon;
public byte WpLatDir;
public byte WpLonDir;
const int WpNameBytes = 40; // 20 unicode characters
public const int RecordSize = WpNameBytes + 12;
public void SerializeInto(byte[] target)
{
int position = 0;
target.Initialize();
BitConverter.GetBytes(WpIndex).CopyTo(target, position);
position += 2;
System.Text.Encoding.Unicode.GetBytes(WpName).CopyTo(target, position);
position += WpNameBytes;
BitConverter.GetBytes(WpLat).CopyTo(target, position);
position += 4;
BitConverter.GetBytes(WpLon).CopyTo(target, position);
position += 4;
target[position++] = WpLatDir;
target[position++] = WpLonDir;
}
public void Deserialize(byte[] source)
{
int position = 0;
WpIndex = BitConverter.ToInt16(source, position);
position += 2;
WpName = System.Text.Encoding.Unicode.GetString(source, position, WpNameBytes);
position += WpNameBytes;
WpLat = BitConverter.ToSingle(source, position);
position += 4;
WpLon = BitConverter.ToSingle(source, position);
position += 4;
WpLatDir = source[position++];
WpLonDir = source[position++];
}
public WpRecType()
{
}
public WpRecType(byte[] source)
{
Deserialize(source);
}
}
Add a reference to Microsoft.VisualBasic and use FilePut
It is designed to assist with compatibility with VB6
The VB6 code in your question would be something like this in C# (I haven't compiled this)
Microsoft.VisualBasic.FileOpen (1, WritingPath & "\FplDb.txt", OpenMode.Random,
RecordLength:=Marshal.SizeOf(WpRec))
for (i = 1; i < 100 ; i++) {
WpRec.WpIndex = FplDB(i, 1)
WpRec.WpName = FplDB(i, 2)
WpRec.WpLat = FplDB(i, 3)
WpRec.WpLon = FplDB(i, 4)
WpRec.WpLatDir = FplDB(i, 5)
WpRec.WpLonDir = FplDB(i, 6)
Microsoft.VisualBasic.FilePut(1, WpRec, i)
}
Microsoft.VisualBasic.FileClose(1)
I think Marshal.SizeOf(WpRec) returns the same value that Len(WpRec) will return in VB6 - do check this though.
The put statement in VB6 does not do any encoding. It saves a structure just as it is stored internally in memory. For example, put saves a double as a 64-bit floating point value, just as it is represented in memory. In your example, the members of WpRec are stored in the put statement just as WpRec is stored in memory.

Detecting audio silence in WAV files using C#

I'm tasked with building a .NET client app to detect silence in a WAV files.
Is this possible with the built-in Windows APIs? Or alternately, any good libraries out there to help with this?
Audio analysis is a difficult thing requiring a lot of complex math (think Fourier Transforms). The question you have to ask is "what is silence". If the audio that you are trying to edit is captured from an analog source, the chances are that there isn't any silence... they will only be areas of soft noise (line hum, ambient background noise, etc).
All that said, an algorithm that should work would be to determine a minimum volume (amplitude) threshold and duration (say, <10dbA for more than 2 seconds) and then simply do a volume analysis of the waveform looking for areas that meet this criteria (with perhaps some filters for millisecond spikes). I've never written this in C#, but this CodeProject article looks interesting; it describes C# code to draw a waveform... that is the same kind of code which could be used to do other amplitude analysis.
If you want to efficiently calculate the average power over a sliding window: square each sample, then add it to a running total. Subtract the squared value from N samples previous. Then move to the next step. This is the simplest form of a CIC Filter. Parseval's Theorem tells us that this power calculation is applicable to both time and frequency domains.
Also you may want to add Hysteresis to the system to avoid switching on&off rapidly when power level is dancing about the threshold level.
http://www.codeproject.com/Articles/19590/WAVE-File-Processor-in-C
This has all the code necessary to strip silence, and mix wave files.
Enjoy.
I'm using NAudio, and I wanted to detect the silence in audio files so I can either report or truncate.
After a lot of research, I came up with this basic implementation. So, I wrote an extension method for the AudioFileReader class which returns the silence duration at the start/end of the file, or starting from a specific position.
Here:
static class AudioFileReaderExt
{
public enum SilenceLocation { Start, End }
private static bool IsSilence(float amplitude, sbyte threshold)
{
double dB = 20 * Math.Log10(Math.Abs(amplitude));
return dB < threshold;
}
public static TimeSpan GetSilenceDuration(this AudioFileReader reader,
SilenceLocation location,
sbyte silenceThreshold = -40)
{
int counter = 0;
bool volumeFound = false;
bool eof = false;
long oldPosition = reader.Position;
var buffer = new float[reader.WaveFormat.SampleRate * 4];
while (!volumeFound && !eof)
{
int samplesRead = reader.Read(buffer, 0, buffer.Length);
if (samplesRead == 0)
eof = true;
for (int n = 0; n < samplesRead; n++)
{
if (IsSilence(buffer[n], silenceThreshold))
{
counter++;
}
else
{
if (location == SilenceLocation.Start)
{
volumeFound = true;
break;
}
else if (location == SilenceLocation.End)
{
counter = 0;
}
}
}
}
// reset position
reader.Position = oldPosition;
double silenceSamples = (double)counter / reader.WaveFormat.Channels;
double silenceDuration = (silenceSamples / reader.WaveFormat.SampleRate) * 1000;
return TimeSpan.FromMilliseconds(silenceDuration);
}
}
This will accept almost any audio file format not just WAV.
Usage:
using (AudioFileReader reader = new AudioFileReader(filePath))
{
TimeSpan duration = reader.GetSilenceDuration(AudioFileReaderExt.SilenceLocation.Start);
Console.WriteLine(duration.TotalMilliseconds);
}
References:
How audio dB levels are calculated.
Floating-point samples range.
More about amplitude.
Here a nice variant to detect threshold alternatings:
static class AudioFileReaderExt
{
private static bool IsSilence(float amplitude, sbyte threshold)
{
double dB = 20 * Math.Log10(Math.Abs(amplitude));
return dB < threshold;
}
private static bool IsBeep(float amplitude, sbyte threshold)
{
double dB = 20 * Math.Log10(Math.Abs(amplitude));
return dB > threshold;
}
public static double GetBeepDuration(this AudioFileReader reader,
double StartPosition, sbyte silenceThreshold = -40)
{
int counter = 0;
bool eof = false;
int initial = (int)(StartPosition * reader.WaveFormat.Channels * reader.WaveFormat.SampleRate / 1000);
if (initial > reader.Length) return -1;
reader.Position = initial;
var buffer = new float[reader.WaveFormat.SampleRate * 4];
while (!eof)
{
int samplesRead = reader.Read(buffer, 0, buffer.Length);
if (samplesRead == 0)
eof = true;
for (int n = initial; n < samplesRead; n++)
{
if (IsBeep(buffer[n], silenceThreshold))
{
counter++;
}
else
{
eof=true; break;
}
}
}
double silenceSamples = (double)counter / reader.WaveFormat.Channels;
double silenceDuration = (silenceSamples / reader.WaveFormat.SampleRate) * 1000;
return TimeSpan.FromMilliseconds(silenceDuration).TotalMilliseconds;
}
public static double GetSilenceDuration(this AudioFileReader reader,
double StartPosition, sbyte silenceThreshold = -40)
{
int counter = 0;
bool eof = false;
int initial = (int)(StartPosition * reader.WaveFormat.Channels * reader.WaveFormat.SampleRate / 1000);
if (initial > reader.Length) return -1;
reader.Position = initial;
var buffer = new float[reader.WaveFormat.SampleRate * 4];
while (!eof)
{
int samplesRead = reader.Read(buffer, 0, buffer.Length);
if (samplesRead == 0)
eof=true;
for (int n = initial; n < samplesRead; n++)
{
if (IsSilence(buffer[n], silenceThreshold))
{
counter++;
}
else
{
eof=true; break;
}
}
}
double silenceSamples = (double)counter / reader.WaveFormat.Channels;
double silenceDuration = (silenceSamples / reader.WaveFormat.SampleRate) * 1000;
return TimeSpan.FromMilliseconds(silenceDuration).TotalMilliseconds;
}
}
Main usage:
using (AudioFileReader reader = new AudioFileReader("test.wav"))
{
double duratioff = 1;
double duration = 1;
double position = 1;
while (duratioff >-1 && duration >-1)
{
duration = reader.GetBeepDuration(position);
Console.WriteLine(duration);
position = position + duration;
duratioff = reader.GetSilenceDuration(position);
Console.WriteLine(-duratioff);
position = position + duratioff;
}
}
I don't think you'll find any built-in APIs for detection of silence. But you can always use good ol' math/discreete signal processing to find out loudness.
Here's a small example: http://msdn.microsoft.com/en-us/magazine/cc163341.aspx
Use Sox. It can remove leading and trailing silences, but you'll have to call it as an exe from your app.
See code below from Detecting audio silence in WAV files using C#
private static void SkipSilent(string fileName, short silentLevel)
{
WaveReader wr = new WaveReader(File.OpenRead(fileName));
IntPtr format = wr.ReadFormat();
WaveWriter ww = new WaveWriter(File.Create(fileName + ".wav"),
AudioCompressionManager.FormatBytes(format));
int i = 0;
while (true)
{
byte[] data = wr.ReadData(i, 1);
if (data.Length == 0)
{
break;
}
if (!AudioCompressionManager.CheckSilent(format, data, silentLevel))
{
ww.WriteData(data);
}
}
ww.Close();
wr.Close();
}

Categories