How to use BiQuadFilter of NAudio? - c#

I use NAudio to record sound from the microphone and save it to a file. For this I use:
public WaveFileWriter m_WaveFile = null;
m_WaveFile = new WaveFileWriter(strFile, m_WaveSource.WaveFormat);
void DataAvailable(object sender, WaveInEventArgs e)
{
if (m_WaveFile != null)
{
m_WaveFile.Write(e.Buffer, 0, e.BytesRecorded);
}
}
Now I would like to apply a high-pass filter and a low-pass filter to the recorded sound. So far I have found that BiQuadFilter should do this for me, but I don't understand how to use it.
The examples I found all look like this:
var r = BiQuadFilter.LowPassFilter(44100, 1500, 1);
var reader = new WaveFileReader(File.OpenRead(strFile));
var filter = new MyWaveProvider(reader, r); // reader is the source for filter
var waveOut = new WaveOut();
waveOut.Init(filter); // filter is the source for waveOut
waveOut.Play();
If I understand this correctly, the filter is applied to the class that is playing the sound. But I don't want to play the sound; I want the high-pass and low-pass filters applied to the file and the result saved to a new file. How can I do that?
edit:
This is MyWaveProvider class:
class MyWaveProvider : ISampleProvider
{
private ISampleProvider sourceProvider;
private float cutOffFreq;
private int channels;
private BiQuadFilter[] filters;
public MyWaveProvider (ISampleProvider sourceProvider, int cutOffFreq)
{
this.sourceProvider = sourceProvider;
this.cutOffFreq = cutOffFreq;
channels = sourceProvider.WaveFormat.Channels;
filters = new BiQuadFilter[channels];
CreateFilters();
}
private void CreateFilters()
{
for (int n = 0; n < channels; n++)
if (filters[n] == null)
filters[n] = BiQuadFilter.LowPassFilter(44100, cutOffFreq, 1);
else
filters[n].SetLowPassFilter(44100, cutOffFreq, 1);
}
public WaveFormat WaveFormat { get { return sourceProvider.WaveFormat; } }
public int Read(float[] buffer, int offset, int count)
{
int samplesRead = sourceProvider.Read(buffer, offset, count);
for (int i = 0; i < samplesRead; i++)
buffer[offset + i] = filters[(i % channels)].Transform(buffer[offset + i]);
return samplesRead;
}
}

The following code should satisfy what you are trying to do:
using (var reader = new WaveFileReader(File.OpenRead(strFile)))
{
// reader (converted to a sample provider) is the source for the filter;
// note that MyWaveProvider above takes the cut-off frequency, not a BiQuadFilter
var filter = new MyWaveProvider(reader.ToSampleProvider(), 1500);
// CreateWaveFile16 pulls everything through the filter and writes a 16-bit WAV
WaveFileWriter.CreateWaveFile16("filteroutput.wav", filter);
}
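If you also need the high-pass stage, the same pattern extends naturally: chain a HighPassFilter and a LowPassFilter per channel inside one ISampleProvider and pass it to WaveFileWriter.CreateWaveFile16. The sketch below is untested; the class name BandPassWaveProvider and the 300/3400 Hz cut-offs are just placeholders:

using NAudio.Dsp;
using NAudio.Wave;

class BandPassWaveProvider : ISampleProvider
{
    private readonly ISampleProvider source;
    private readonly BiQuadFilter[] highPass;
    private readonly BiQuadFilter[] lowPass;
    private readonly int channels;

    public BandPassWaveProvider(ISampleProvider source, float highPassCutoff, float lowPassCutoff)
    {
        this.source = source;
        channels = source.WaveFormat.Channels;
        int sampleRate = source.WaveFormat.SampleRate;
        highPass = new BiQuadFilter[channels];
        lowPass = new BiQuadFilter[channels];
        for (int n = 0; n < channels; n++)
        {
            // one filter instance per channel, built for the source's actual sample rate
            highPass[n] = BiQuadFilter.HighPassFilter(sampleRate, highPassCutoff, 1);
            lowPass[n] = BiQuadFilter.LowPassFilter(sampleRate, lowPassCutoff, 1);
        }
    }

    public WaveFormat WaveFormat { get { return source.WaveFormat; } }

    public int Read(float[] buffer, int offset, int count)
    {
        int samplesRead = source.Read(buffer, offset, count);
        for (int i = 0; i < samplesRead; i++)
        {
            int ch = i % channels;
            // run each sample through the high-pass filter, then the low-pass filter
            buffer[offset + i] = lowPass[ch].Transform(highPass[ch].Transform(buffer[offset + i]));
        }
        return samplesRead;
    }
}

Used the same way as above, with no playback involved:

using (var reader = new WaveFileReader(strFile))
{
    var filtered = new BandPassWaveProvider(reader.ToSampleProvider(), 300, 3400);
    WaveFileWriter.CreateWaveFile16("filteroutput.wav", filtered);
}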

There's an example of using the BiQuadFilter to make a multi-band equalizer in the NAudio WPF demo code. You can see the Equalizer code here.


Performing linear search on randomized array

I am trying to perform a linear search on a randomized array, but I just can't see where the code gets stuck. It's a simple Big-O notation algorithm test that I am currently learning and I have messed up somewhere. I am using a text editor for my coding. Can I get some help with this code?
using System;
using System.Collections.Generic;
using System.Data;
namespace TheBigONotations
{
public class BigONotations
{
///<properties>Class Properties</properties>
private int[] theArray;
private int arraySize;
private int itemsInArray = 0;
DateTime startTime;
DateTime endTime;
public BigONotations(int size)
{
arraySize = size;
theArray = new int[size];
}
///<order>O(1)</order>
public void addItemToArray(int newItem)
{
theArray[itemsInArray++] = newItem;
}
///<order>O(n)</order>
public void linearSearch(int value)
{
bool valueInArray = false;
string indexsWithValue = "";
startTime = DateTime.Now;
for (int i = 0; i < arraySize; i++)
{
if (theArray[i] == value)
{
valueInArray = true;
indexsWithValue += i + " ";
}
}
Console.WriteLine("Value found: " + valueInArray);
endTime = DateTime.Now;
Console.WriteLine("Linear Search took "+(endTime - startTime) );
}
///<order>O(n^2)</order>
public void BinarySearch()
{
}
///<order>O(n)</order>
public void generateRandomArray()
{
Random rnd = new Random();
for (int i = 0; i < arraySize; i++)
{
theArray[i] = (int)(rnd.Next() * 1000) + 10;
}
itemInArray = arraySize - 1;
}
public static void Main()
{
BigONotations algoTest1 = new BigONotations(100000);
algoTest1.generateRandomArray();
BigONotations algoTest2 = new BigONotations(200000);
algoTest2.generateRandomArray();
BigONotations algoTest3 = new BigONotations(300000);
algoTest3.generateRandomArray();
BigONotations algoTest4 = new BigONotations(400000);
algoTest4.generateRandomArray();
BigONotations algoTest5 = new BigONotations(500000);
algoTest5.generateRandomArray();
algoTest2.linearSearch(20);
algoTest3.linearSearch(20);
algoTest4.linearSearch(20);
}
}
}
I put all the information in the code, but I get an error and can't see exactly where it is.
I think the error is because itemInArray inside the generateRandomArray() function is not declared. Perhaps you meant itemsInArray?
String concatenation (with the + and += operators) over many results can take a long time during linearSearch(). Consider using StringBuilder, which is significantly faster and more memory efficient.
StringBuilder sb = new StringBuilder();
// ...
// Let's say "i" is an integer.
sb.Append(i.ToString());
sb.Append(" ");
// ...
Console.WriteLine("Indexes: " + sb.ToString());

NAudio - wave split and combine wave in realtime

I am working with a multi-input sound card and I want to achieve live mixing of multiple inputs. All the inputs are stereo, so I need to split them first, mix a selection of channels and provide the result as a mono stream.
The goal would be a mix something like this: Channel1[left] + Channel3[right] + Channel4[right] -> mono stream.
I have already implemented a process chain like this:
1) WaveIn -> create a BufferedWaveProvider for each channel -> add samples (just the ones for the current channel) to each BufferedWaveProvider using wavein.DataAvailable += { buffwavprovider[channel].AddSamples(...)...
This gives me a nice list of BufferedWaveProviders. The audio-splitting part is implemented correctly.
2) Select multiple BufferedWaveProviders and give them to MixingWaveProvider32. Then create a WaveStream (using WaveMixerStream32 and IWaveProvider).
3) A MultiChannelToMonoStream takes that WaveStream and generates a mixdown. This also works.
But the result is that the audio is choppy, as if there is some trouble with the buffering.
Is this the correct way to handle this problem, or is there a better solution?
edit - code added:
public class AudioSplitter
{
public List<NamedBufferedWaveProvider> WaveProviders { private set; get; }
public string Name { private set; get; }
private WaveIn _wavIn;
private int bytes_per_sample = 4;
/// <summary>
/// Splits up one WaveIn into one BufferedWaveProvider for each channel
/// </summary>
/// <param name="wavein"></param>
/// <returns></returns>
public AudioSplitter(WaveIn wavein, string name)
{
if (wavein.WaveFormat.Encoding != WaveFormatEncoding.IeeeFloat)
throw new Exception("Format must be IEEE float");
WaveProviders = new List<NamedBufferedWaveProvider>(wavein.WaveFormat.Channels);
Name = name;
_wavIn = wavein;
_wavIn.StartRecording();
var outFormat = NAudio.Wave.WaveFormat.CreateIeeeFloatWaveFormat(wavein.WaveFormat.SampleRate, 1);
for (int i = 0; i < wavein.WaveFormat.Channels; i++)
{
WaveProviders.Add(new NamedBufferedWaveProvider(outFormat) { DiscardOnBufferOverflow = true, Name = Name + "_" + i });
}
bytes_per_sample = _wavIn.WaveFormat.BitsPerSample / 8;
wavein.DataAvailable += Wavein_DataAvailable;
}
/// <summary>
/// add samples for each channel to bufferedwaveprovider
/// </summary>
/// <param name="sender"></param>
/// <param name="e"></param>
private void Wavein_DataAvailable(object sender, WaveInEventArgs e)
{
int channel = 0;
byte[] buffer = e.Buffer;
for (int i = 0; i < e.BytesRecorded - bytes_per_sample; i = i + bytes_per_sample)
{
byte[] channel_buffer = new byte[bytes_per_sample];
for (int j = 0; j < bytes_per_sample; j++)
{
channel_buffer[j] = buffer[i + j];
}
WaveProviders[channel].AddSamples(channel_buffer, 0, channel_buffer.Length);
channel++;
if (channel >= _wavIn.WaveFormat.Channels)
channel = 0;
}
}
}
Using the AudioSplitter for each input gives a list of buffered wave providers (mono, 32-bit float).
var mix = new MixingWaveProvider32(_waveProviders);
var wps = new WaveProviderToWaveStream(mix);
MultiChannelToMonoStream mms = new MultiChannelToMonoStream(wps);
new Thread(() =>
{
byte[] buffer = new byte[4096];
while (mms.Read(buffer, 0, buffer.Length) > 0 && isrunning)
{
using (FileStream fs = new FileStream("C:\\temp\\audio\\mono_32.wav", FileMode.Append, FileAccess.Write))
{
fs.Write(buffer, 0, buffer.Length);
}
}
}).Start();
There is some room left for optimization, but basically this gets the job done:
private void Wavein_DataAvailable(object sender, WaveInEventArgs e)
{
int channel = 0;
byte[] buffer = e.Buffer;
List<List<byte>> channelbuffers = new List<List<byte>>();
for (int c = 0; c < _wavIn.WaveFormat.Channels; c++)
{
channelbuffers.Add(new List<byte>());
}
for (int i = 0; i < e.BytesRecorded; i++)
{
var byteList = channelbuffers[channel];
byteList.Add(buffer[i]);
if (i % bytes_per_sample == bytes_per_sample - 1)
channel++;
if (channel >= _wavIn.WaveFormat.Channels)
channel = 0;
}
for (int j = 0; j < channelbuffers.Count; j++)
{
WaveProviders[j].AddSamples(channelbuffers[j].ToArray(), 0, channelbuffers[j].Count());
}
}
We need to provide a WaveProvider (WaveProviders[j]) for each channel.
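As an alternative to WaveProviderToWaveStream + MultiChannelToMonoStream, the selected mono providers can be fed straight into NAudio's MixingSampleProvider and written through a WaveFileWriter, which produces a WAV file with a valid header instead of raw bytes appended to a FileStream. A rough sketch; splitters, isrunning and sampleRate are placeholders for your own objects:

using System.Collections.Generic;
using System.Linq;
using System.Threading;
using NAudio.Wave;
using NAudio.Wave.SampleProviders;

// pick the channels to mix, e.g. Channel1[left], Channel3[right], Channel4[right]
var selected = new List<NamedBufferedWaveProvider>
{
    splitters[0].WaveProviders[0],
    splitters[2].WaveProviders[1],
    splitters[3].WaveProviders[1]
};

// all providers are mono IEEE float with the same sample rate, so they can be mixed directly
var mixer = new MixingSampleProvider(selected.Select(p => p.ToSampleProvider()));
var writer = new WaveFileWriter(@"C:\temp\audio\mono_mix.wav", WaveFormat.CreateIeeeFloatWaveFormat(sampleRate, 1));

// pump the mixer into the writer on a background thread while recording
new Thread(() =>
{
    float[] buffer = new float[4096];
    while (isrunning)
    {
        int read = mixer.Read(buffer, 0, buffer.Length);
        writer.WriteSamples(buffer, 0, read);
    }
    writer.Dispose();   // finalizes the WAV header
}).Start();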

Importing and removing duplicates from a massive amount of text files using C# and Redis

This is a bit of a doozy and it's been a while since I worked with C#, so bear with me:
I'm running a jruby script to iterate through 900 files (5 MB to 1500 MB in size) to figure out how many dupes STILL exist within these (already uniq'd) files. I had little luck with awk.
My latest idea was to insert them into a local MongoDB instance like so:
db.collection('hashes').update({ :_id => hash }, { $inc: { count: 1 } }, { upsert: true })
... so that later I could just query it like db.collection.where({ count: { $gt: 1 } }) to get all the dupes.
This is working great except it's been over 24 hours and at the time of writing I'm at 72,532,927 Mongo entries and growing.
I think Ruby's .each_line is bottlenecking the IO hardcore.
So what I'm thinking now is compiling a C# program which fires up a thread PER FILE and inserts each line (an MD5 hash) into a Redis list.
From there, I could have another compiled C# program simply pop the values off and ignore the save if the count is 1.
So the questions are:
Will using a compiled file reader and multithreading the file reads significantly improve performance?
Is using Redis even necessary? With a tremendous amount of AWS memory, could I not just use the threads to fill some sort of a list atomically and proceed from there?
Thanks in advance.
Updated
New solution. The main idea is to calculate a dummy hash (just the sum of all chars in the string) for each line and store it in Dictionary<ulong, List<LinePosition>> _hash2LinePositions. Multiple lines in the same stream can end up with the same hash; that is handled by the List in the Dictionary value. When the hashes are the same, we read and compare the actual strings from the streams. LinePosition is used to store info about a line: its position in the stream and its length. I don't have files as huge as yours, but my tests show that it works. Here is the full code:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
public class Solution
{
struct LinePosition
{
public long Start;
public long Length;
public LinePosition(long start, long count)
{
Start = start;
Length = count;
}
public override string ToString()
{
return string.Format("Start: {0}, Length: {1}", Start, Length);
}
}
class TextFileHasher : IDisposable
{
readonly Dictionary<ulong, List<LinePosition>> _hash2LinePositions;
readonly Stream _stream;
bool _isDisposed;
public HashSet<ulong> Hashes { get; private set; }
public string Name { get; private set; }
public TextFileHasher(string name, Stream stream)
{
Name = name;
_stream = stream;
_hash2LinePositions = new Dictionary<ulong, List<LinePosition>>();
Hashes = new HashSet<ulong>();
}
public override string ToString()
{
return Name;
}
public void CalculateFileHash()
{
int readByte = -1;
ulong dummyLineHash = 0;
// Line start position in file
long startPosition = 0;
while ((readByte = _stream.ReadByte()) != -1) {
// Read until new line
if (readByte == '\r' || readByte == '\n') {
// If there was data
if (dummyLineHash != 0) {
// Add line hash and line position to the dict
AddToDictAndHash(dummyLineHash, startPosition, _stream.Position - 1 - startPosition);
// Reset line hash
dummyLineHash = 0;
}
}
else {
// Was it new line ?
if (dummyLineHash == 0)
startPosition = _stream.Position - 1;
// Calculate dummy hash
dummyLineHash += (uint)readByte;
}
}
if (dummyLineHash != 0) {
// Add line hash and line position to the dict
AddToDictAndHash(dummyLineHash, startPosition, _stream.Position - startPosition);
// Reset line hash
dummyLineHash = 0;
}
}
public List<LinePosition> GetLinePositions(ulong hash)
{
return _hash2LinePositions[hash];
}
public List<string> GetDuplicates()
{
List<string> duplicates = new List<string>();
foreach (var key in _hash2LinePositions.Keys) {
List<LinePosition> linesPos = _hash2LinePositions[key];
if (linesPos.Count > 1) {
duplicates.AddRange(FindExactDuplicates(linesPos));
}
}
return duplicates;
}
public void Dispose()
{
if (_isDisposed)
return;
_stream.Dispose();
_isDisposed = true;
}
private void AddToDictAndHash(ulong hash, long start, long count)
{
List<LinePosition> linesPosition;
if (!_hash2LinePositions.TryGetValue(hash, out linesPosition)) {
linesPosition = new List<LinePosition>() { new LinePosition(start, count) };
_hash2LinePositions.Add(hash, linesPosition);
}
else {
linesPosition.Add(new LinePosition(start, count));
}
Hashes.Add(hash);
}
public byte[] GetLineAsByteArray(LinePosition prevPos)
{
long len = prevPos.Length;
byte[] lineBytes = new byte[len];
_stream.Seek(prevPos.Start, SeekOrigin.Begin);
_stream.Read(lineBytes, 0, (int)len);
return lineBytes;
}
private List<string> FindExactDuplicates(List<LinePosition> linesPos)
{
List<string> duplicates = new List<string>();
linesPos.Sort((x, y) => x.Length.CompareTo(y.Length));
LinePosition prevPos = linesPos[0];
for (int i = 1; i < linesPos.Count; i++) {
if (prevPos.Length == linesPos[i].Length) {
var prevLineArray = GetLineAsByteArray(prevPos);
var thisLineArray = GetLineAsByteArray(linesPos[i]);
if (prevLineArray.SequenceEqual(thisLineArray)) {
var line = System.Text.Encoding.Default.GetString(prevLineArray);
duplicates.Add(line);
}
#if false
string prevLine = System.Text.Encoding.Default.GetString(prevLineArray);
string thisLine = System.Text.Encoding.Default.GetString(thisLineArray);
Console.WriteLine("PrevLine: {0}\r\nThisLine: {1}", prevLine, thisLine);
StringBuilder sb = new StringBuilder();
sb.Append(prevPos);
sb.Append(" is '");
sb.Append(prevLine);
sb.Append("'. ");
sb.AppendLine();
sb.Append(linesPos[i]);
sb.Append(" is '");
sb.Append(thisLine);
sb.AppendLine("'. ");
sb.Append("Equals => ");
sb.Append(prevLine.CompareTo(thisLine) == 0);
Console.WriteLine(sb.ToString());
#endif
}
else {
prevPos = linesPos[i];
}
}
return duplicates;
}
}
public static void Main(String[] args)
{
List<TextFileHasher> textFileHashers = new List<TextFileHasher>();
string text1 = "abc\r\ncba\r\nabc";
TextFileHasher tfh1 = new TextFileHasher("Text1", new MemoryStream(System.Text.Encoding.Default.GetBytes(text1)));
tfh1.CalculateFileHash();
textFileHashers.Add(tfh1);
string text2 = "def\r\ncba\r\nwet";
TextFileHasher tfh2 = new TextFileHasher("Text2", new MemoryStream(System.Text.Encoding.Default.GetBytes(text2)));
tfh2.CalculateFileHash();
textFileHashers.Add(tfh2);
string text3 = "def\r\nbla\r\nwat";
TextFileHasher tfh3 = new TextFileHasher("Text3", new MemoryStream(System.Text.Encoding.Default.GetBytes(text3)));
tfh3.CalculateFileHash();
textFileHashers.Add(tfh3);
List<string> totalDuplicates = new List<string>();
Dictionary<ulong, Dictionary<TextFileHasher, List<LinePosition>>> totalHashes = new Dictionary<ulong, Dictionary<TextFileHasher, List<LinePosition>>>();
textFileHashers.ForEach(tfh => {
foreach(var dummyHash in tfh.Hashes) {
Dictionary<TextFileHasher, List<LinePosition>> tfh2LinePositions = null;
if (!totalHashes.TryGetValue(dummyHash, out tfh2LinePositions))
totalHashes[dummyHash] = new Dictionary<TextFileHasher, List<LinePosition>>() { { tfh, tfh.GetLinePositions(dummyHash) } };
else {
List<LinePosition> linePositions = null;
if (!tfh2LinePositions.TryGetValue(tfh, out linePositions))
tfh2LinePositions[tfh] = tfh.GetLinePositions(dummyHash);
else
linePositions.AddRange(tfh.GetLinePositions(dummyHash));
}
}
});
HashSet<TextFileHasher> alreadyGotDuplicates = new HashSet<TextFileHasher>();
foreach(var hash in totalHashes.Keys) {
var tfh2LinePositions = totalHashes[hash];
var tfh = tfh2LinePositions.Keys.FirstOrDefault();
// Get duplicates in the TextFileHasher itself
if (tfh != null && !alreadyGotDuplicates.Contains(tfh)) {
totalDuplicates.AddRange(tfh.GetDuplicates());
alreadyGotDuplicates.Add(tfh);
}
if (tfh2LinePositions.Count <= 1) {
continue;
}
// Algo to get duplicates in more than 1 TextFileHashers
var tfhs = tfh2LinePositions.Keys.ToArray();
for (int i = 0; i < tfhs.Length; i++) {
var tfh1Positions = tfhs[i].GetLinePositions(hash);
for (int j = i + 1; j < tfhs.Length; j++) {
var tfh2Positions = tfhs[j].GetLinePositions(hash);
for (int k = 0; k < tfh1Positions.Count; k++) {
var tfh1Pos = tfh1Positions[k];
var tfh1ByteArray = tfhs[i].GetLineAsByteArray(tfh1Pos);
for (int m = 0; m < tfh2Positions.Count; m++) {
var tfh2Pos = tfh2Positions[m];
if (tfh1Pos.Length != tfh2Pos.Length)
continue;
var tfh2ByteArray = tfhs[j].GetLineAsByteArray(tfh2Pos);
if (tfh1ByteArray.SequenceEqual(tfh2ByteArray)) {
var line = System.Text.Encoding.Default.GetString(tfh1ByteArray);
totalDuplicates.Add(line);
}
}
}
}
}
}
Console.WriteLine();
if (totalDuplicates.Count > 0) {
Console.WriteLine("Total number of duplicates: {0}", totalDuplicates.Count);
Console.WriteLine("#######################");
totalDuplicates.ForEach(x => Console.WriteLine("{0}", x));
Console.WriteLine("#######################");
}
// Free resources
foreach (var tfh in textFileHashers)
tfh.Dispose();
}
}
If you have tons of RAM... you guys are overthinking it...
var fileLines = File.ReadAllLines(@"c:\file.csv").Distinct();
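If everything really does fit in memory, the same idea extends across all 900 files with an ordinary Dictionary used as a counter; a rough sketch (the directory path and file pattern are placeholders):

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

class DuplicateCounter
{
    static void Main()
    {
        var counts = new Dictionary<string, int>();

        // stream each file line by line so only the dictionary itself has to fit in RAM
        foreach (var file in Directory.EnumerateFiles(@"c:\hashes", "*.txt"))
        {
            foreach (var line in File.ReadLines(file))
            {
                int n;
                counts.TryGetValue(line, out n);
                counts[line] = n + 1;
            }
        }

        // anything with a count above 1 is a duplicate - the equivalent of the Mongo count: { $gt: 1 } query
        foreach (var pair in counts.Where(p => p.Value > 1))
        {
            Console.WriteLine("{0}\t{1}", pair.Key, pair.Value);
        }
    }
}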

Increase/Decrease audio volume using FFmpeg

I'm currently using C# interop calls to the FFmpeg APIs to handle video and audio. I have the following code in place to extract the audio from a video and write it to a file.
while (ffmpeg.av_read_frame(formatContext, &packet) >= 0)
{
if (packet.stream_index == streamIndex)
{
while (packet.size > 0)
{
int frameDecoded;
int frameDecodedResult = ffmpeg.avcodec_decode_audio4(codecContext, frame, &frameDecoded, packet);
if (frameDecoded > 0 && frameDecodedResult >= 0)
{
//writeAudio.WriteFrame(frame);
packet.data += totalBytesDecoded;
packet.size -= totalBytesDecoded;
}
}
frameIndex++;
}
Avcodec.av_free_packet(&packet);
}
This is all working correctly. I'm currently using the FFmpeg.AutoGen project for the API access.
I want to be able to increase/decrease the volume of the audio before it's written to the file, but I cannot seem to find a command or any help with this. Does it have to be done manually?
Update 1:
After receiving some help, this is the class layout I have:
public unsafe class FilterVolume
{
#region Private Member Variables
private AVFilterGraph* m_filterGraph = null;
private AVFilterContext* m_aBufferSourceFilterContext = null;
private AVFilterContext* m_aBufferSinkFilterContext = null;
#endregion
#region Private Constant Member Variables
private const int EAGAIN = 11;
#endregion
public FilterVolume(AVCodecContext* codecContext, AVStream* stream, float volume)
{
CodecContext = codecContext;
Stream = stream;
Volume = volume;
Initialise();
}
public AVFrame* Adjust(AVFrame* frame)
{
AVFrame* returnFilteredFrame = ffmpeg.av_frame_alloc();
if (m_aBufferSourceFilterContext != null && m_aBufferSinkFilterContext != null)
{
int bufferSourceAddFrameResult = ffmpeg.av_buffersrc_add_frame(m_aBufferSourceFilterContext, frame);
if (bufferSourceAddFrameResult < 0)
{
}
int bufferSinkGetFrameResult = ffmpeg.av_buffersink_get_frame(m_aBufferSinkFilterContext, returnFilteredFrame);
if (bufferSinkGetFrameResult < 0 && bufferSinkGetFrameResult != -EAGAIN)
{
}
}
return returnFilteredFrame;
}
public void Dispose()
{
Cleanup(m_filterGraph);
}
#region Private Properties
private AVCodecContext* CodecContext { get; set; }
private AVStream* Stream { get; set; }
private float Volume { get; set; }
#endregion
#region Private Setup Helper Functions
private void Initialise()
{
m_filterGraph = GetAllocatedFilterGraph();
string aBufferFilterArguments = string.Format("sample_fmt={0}:channel_layout={1}:sample_rate={2}:time_base={3}/{4}",
(int)CodecContext->sample_fmt,
CodecContext->channel_layout,
CodecContext->sample_rate,
Stream->time_base.num,
Stream->time_base.den);
AVFilterContext* aBufferSourceFilterContext = CreateFilter("abuffer", m_filterGraph, aBufferFilterArguments);
AVFilterContext* volumeFilterContext = CreateFilter("volume", m_filterGraph, string.Format("volume={0}", Volume));
AVFilterContext* aBufferSinkFilterContext = CreateFilter("abuffersink", m_filterGraph);
LinkFilter(aBufferSourceFilterContext, volumeFilterContext);
LinkFilter(volumeFilterContext, aBufferSinkFilterContext);
SetFilterGraphConfiguration(m_filterGraph, null);
m_aBufferSourceFilterContext = aBufferSourceFilterContext;
m_aBufferSinkFilterContext = aBufferSinkFilterContext;
}
#endregion
#region Private Cleanup Helper Functions
private static void Cleanup(AVFilterGraph* filterGraph)
{
if (filterGraph != null)
{
ffmpeg.avfilter_graph_free(&filterGraph);
}
}
#endregion
#region Private Helpers
private AVFilterGraph* GetAllocatedFilterGraph()
{
AVFilterGraph* filterGraph = ffmpeg.avfilter_graph_alloc();
if (filterGraph == null)
{
}
return filterGraph;
}
private AVFilter* GetFilterByName(string name)
{
AVFilter* filter = ffmpeg.avfilter_get_by_name(name);
if (filter == null)
{
}
return filter;
}
private void SetFilterGraphConfiguration(AVFilterGraph* filterGraph, void* logContext)
{
int filterGraphConfigResult = ffmpeg.avfilter_graph_config(filterGraph, logContext);
if (filterGraphConfigResult < 0)
{
}
}
private AVFilterContext* CreateFilter(string filterName, AVFilterGraph* filterGraph, string filterArguments = null)
{
AVFilter* filter = GetFilterByName(filterName);
AVFilterContext* filterContext;
int aBufferFilterCreateResult = ffmpeg.avfilter_graph_create_filter(&filterContext, filter, filterName, filterArguments, null, filterGraph);
if (aBufferFilterCreateResult < 0)
{
}
return filterContext;
}
private void LinkFilter(AVFilterContext* source, AVFilterContext* destination)
{
int filterLinkResult = ffmpeg.avfilter_link(source, 0, destination, 0);
if (filterLinkResult < 0)
{
}
}
#endregion
}
The Adjust() function is called after a frame is decoded. I'm currently getting a -22 error when av_buffersrc_add_frame() is called. This indicates that a parameter is invalid, but after debugging, I cannot see anything that would be causing this.
This is how the code is called:
while (ffmpeg.av_read_frame(formatContext, &packet) >= 0)
{
if (packet.stream_index == streamIndex)
{
while (packet.size > 0)
{
int frameDecoded;
int frameDecodedResult = ffmpeg.avcodec_decode_audio4(codecContext, frame, &frameDecoded, packet);
if (frameDecoded > 0 && frameDecodedResult >= 0)
{
AVFrame* filteredFrame = m_filterVolume.Adjust(frame);
//writeAudio.WriteFrame(filteredFrame);
packet.data += totalBytesDecoded;
packet.size -= totalBytesDecoded;
}
}
frameIndex++;
}
Avcodec.av_free_packet(&packet);
}
Update 2:
Cracked it. The "channel_layout" option in the filter argument string is supposed to be given in hexadecimal (note the {1:X} format specifier, so the value is actually rendered as hex). This is what the string formatting should look like:
string aBufferFilterArguments = string.Format("sample_fmt={0}:channel_layout=0x{1:X}:sample_rate={2}:time_base={3}/{4}",
(int)CodecContext->sample_fmt,
CodecContext->channel_layout,
CodecContext->sample_rate,
Stream->time_base.num,
Stream->time_base.den);
I do not know what API you are using, but ffmpeg has a command-line option that lets you increase or decrease the audio volume:
Decrease to half:
ffmpeg -i input.wav -af "volume=0.5" output.wav
Increase 50%:
ffmpeg -i input.wav -af "volume=1.5" output.wav
or in dB:
ffmpeg -i input.wav -af "volume=10dB" output.wav
Hope it helps.
What you need to do is build a filter graph and process the audio stream through that graph. In your case, the graph is just INPUT ("abuffer") -> VOLUME -> OUTPUT ("abuffersink").
Here is a sample console app that demonstrates that. It's loosely based on ffmpeg samples filtering_audio, filter_audio and remuxing.
You can use it like this:
ChangeVolume.exe http://www.quirksmode.org/html5/videos/big_buck_bunny.mp4 bunny_half.mp4 0.5
And here is the code:
class Program
{
static unsafe void Main(string[] args)
{
Console.WriteLine(#"Current directory: " + Environment.CurrentDirectory);
Console.WriteLine(#"Running in {0}-bit mode.", Environment.Is64BitProcess ? #"64" : #"32");
// adapt this to your context
var ffmpegPath = string.Format(#"../../../FFmpeg/bin/{0}", Environment.Is64BitProcess ? #"x64" : #"x86");
InteropHelper.SetDllDirectory(ffmpegPath);
int ret, i;
if (args.Length < 3)
{
Console.WriteLine("usage: ChangeVolume input output <volume ratio>");
return;
}
string in_filename = args[0];
string out_filename = args[1];
double ratio = double.Parse(args[2]);
ffmpeg.av_register_all();
ffmpeg.avfilter_register_all();
// open input file
AVFormatContext* ifmt_ctx = null;
InteropHelper.Check(ffmpeg.avformat_open_input(&ifmt_ctx, in_filename, null, null));
// dump input
ffmpeg.av_dump_format(ifmt_ctx, 0, in_filename, 0);
// get streams info to determine audio stream index
InteropHelper.Check(ffmpeg.avformat_find_stream_info(ifmt_ctx, null));
// determine input decoder
AVCodec* dec;
int audio_stream_index = ffmpeg.av_find_best_stream(ifmt_ctx, AVMediaType.AVMEDIA_TYPE_AUDIO, -1, -1, &dec, 0);
AVCodecContext* dec_ctx = ifmt_ctx->streams[audio_stream_index]->codec;
// open input decoder
InteropHelper.Check(ffmpeg.avcodec_open2(dec_ctx, dec, null));
// build a filter graph
AVFilterContext* buffersrc_ctx;
AVFilterContext* buffersink_ctx;
AVFilterGraph* filter_graph = init_filter_graph(ifmt_ctx, dec_ctx, audio_stream_index, &buffersrc_ctx, &buffersink_ctx, ratio);
// prepare output
AVFormatContext* ofmt_ctx = null;
InteropHelper.Check(ffmpeg.avformat_alloc_output_context2(&ofmt_ctx, null, null, out_filename));
InteropHelper.Check(ofmt_ctx);
// create output streams
AVCodecContext* enc_ctx = null;
ofmt_ctx->oformat->flags |= InteropHelper.AVFMT_NOTIMESTAMPS;
for (i = 0; i < ifmt_ctx->nb_streams; i++)
{
AVStream* in_stream = ifmt_ctx->streams[i];
if (in_stream->codec->codec_type == AVMediaType.AVMEDIA_TYPE_DATA) // skip these
continue;
AVStream* out_stream = ffmpeg.avformat_new_stream(ofmt_ctx, in_stream->codec->codec);
InteropHelper.Check(out_stream);
InteropHelper.Check(ffmpeg.avcodec_copy_context(out_stream->codec, in_stream->codec));
out_stream->codec->codec_tag = 0;
if ((ofmt_ctx->oformat->flags & InteropHelper.AVFMT_GLOBALHEADER) != 0)
{
out_stream->codec->flags |= InteropHelper.AV_CODEC_FLAG_GLOBAL_HEADER;
}
if (i == audio_stream_index)
{
// create audio encoder from audio decoder
AVCodec* enc = ffmpeg.avcodec_find_encoder(dec_ctx->codec_id);
InteropHelper.Check(enc);
enc_ctx = ffmpeg.avcodec_alloc_context3(enc);
InteropHelper.Check(enc_ctx);
enc_ctx->sample_rate = dec_ctx->sample_rate;
enc_ctx->channel_layout = dec_ctx->channel_layout;
enc_ctx->channels = ffmpeg.av_get_channel_layout_nb_channels(enc_ctx->channel_layout);
enc_ctx->sample_fmt = enc->sample_fmts[0];
enc_ctx->time_base.num = 1;
enc_ctx->time_base.den = enc_ctx->sample_rate;
InteropHelper.Check(ffmpeg.avcodec_open2(enc_ctx, enc, null));
}
}
// dump output
ffmpeg.av_dump_format(ofmt_ctx, 0, out_filename, 1);
if ((ofmt_ctx->oformat->flags & InteropHelper.AVFMT_NOFILE) == 0)
{
// open output file
InteropHelper.Check(ffmpeg.avio_open(&ofmt_ctx->pb, out_filename, InteropHelper.AVIO_FLAG_WRITE));
}
// write output file header
InteropHelper.Check(ffmpeg.avformat_write_header(ofmt_ctx, null));
// read all packets and process
AVFrame* frame = ffmpeg.av_frame_alloc();
AVFrame* filt_frame = ffmpeg.av_frame_alloc();
while (true)
{
AVStream* in_stream;
AVStream* out_stream;
AVPacket pkt;
ret = ffmpeg.av_read_frame(ifmt_ctx, &pkt);
if (ret < 0)
break;
in_stream = ifmt_ctx->streams[pkt.stream_index];
if (in_stream->codec->codec_type == AVMediaType.AVMEDIA_TYPE_DATA)
continue;
// audio stream? we need to pass it through our filter graph
if (pkt.stream_index == audio_stream_index)
{
// decode audio (packet -> frame)
int got_frame = 0;
InteropHelper.Check(ffmpeg.avcodec_decode_audio4(dec_ctx, frame, &got_frame, &pkt));
if (got_frame > 0)
{
// add the frame into the filter graph
InteropHelper.Check(ffmpeg.av_buffersrc_add_frame(buffersrc_ctx, frame));
while (true)
{
// get the frame out from the filter graph
ret = ffmpeg.av_buffersink_get_frame(buffersink_ctx, filt_frame);
const int EAGAIN = 11;
if (ret == -EAGAIN)
break;
InteropHelper.Check(ret);
// encode audio (frame -> packet)
AVPacket enc_pkt = new AVPacket();
int got_packet = 0;
InteropHelper.Check(ffmpeg.avcodec_encode_audio2(enc_ctx, &enc_pkt, filt_frame, &got_packet));
enc_pkt.stream_index = pkt.stream_index;
InteropHelper.Check(ffmpeg.av_interleaved_write_frame(ofmt_ctx, &enc_pkt));
ffmpeg.av_frame_unref(filt_frame);
}
}
}
else
{
// write other (video) streams
out_stream = ofmt_ctx->streams[pkt.stream_index];
pkt.pts = ffmpeg.av_rescale_q_rnd(pkt.pts, in_stream->time_base, out_stream->time_base, AVRounding.AV_ROUND_NEAR_INF | AVRounding.AV_ROUND_PASS_MINMAX);
pkt.dts = ffmpeg.av_rescale_q_rnd(pkt.dts, in_stream->time_base, out_stream->time_base, AVRounding.AV_ROUND_NEAR_INF | AVRounding.AV_ROUND_PASS_MINMAX);
pkt.duration = ffmpeg.av_rescale_q(pkt.duration, in_stream->time_base, out_stream->time_base);
pkt.pos = -1;
InteropHelper.Check(ffmpeg.av_interleaved_write_frame(ofmt_ctx, &pkt));
}
ffmpeg.av_packet_unref(&pkt);
}
// write trailer, close file
ffmpeg.av_write_trailer(ofmt_ctx);
ffmpeg.avformat_close_input(&ifmt_ctx);
if ((ofmt_ctx->oformat->flags & InteropHelper.AVFMT_NOFILE) == 0)
{
ffmpeg.avio_closep(&ofmt_ctx->pb);
}
ffmpeg.avformat_free_context(ofmt_ctx);
ffmpeg.av_frame_free(&filt_frame);
ffmpeg.av_frame_free(&frame);
ffmpeg.avfilter_graph_free(&filter_graph);
return;
}
static unsafe AVFilterGraph* init_filter_graph(AVFormatContext* format, AVCodecContext* codec, int audio_stream_index, AVFilterContext** buffersrc_ctx, AVFilterContext** buffersink_ctx, double volumeRatio)
{
// create graph
var filter_graph = ffmpeg.avfilter_graph_alloc();
InteropHelper.Check(filter_graph);
// add input filter
var abuffersrc = ffmpeg.avfilter_get_by_name("abuffer");
if (abuffersrc == null) InteropHelper.CheckTag("\x00F8FIL");
string args = string.Format("sample_fmt={0}:channel_layout={1}:sample_rate={2}:time_base={3}/{4}",
(int)codec->sample_fmt,
codec->channel_layout,
codec->sample_rate,
format->streams[audio_stream_index]->time_base.num,
format->streams[audio_stream_index]->time_base.den);
InteropHelper.Check(ffmpeg.avfilter_graph_create_filter(buffersrc_ctx, abuffersrc, "IN", args, null, filter_graph));
// add volume filter
var volume = ffmpeg.avfilter_get_by_name("volume");
if (volume == null) InteropHelper.CheckTag("\x00F8FIL");
AVFilterContext* volume_ctx;
InteropHelper.Check(ffmpeg.avfilter_graph_create_filter(&volume_ctx, volume, "VOL", "volume=" + volumeRatio.ToString(CultureInfo.InvariantCulture), null, filter_graph));
// add output filter
var abuffersink = ffmpeg.avfilter_get_by_name("abuffersink");
if (abuffersink == null) InteropHelper.CheckTag("\x00F8FIL");
InteropHelper.Check(ffmpeg.avfilter_graph_create_filter(buffersink_ctx, abuffersink, "OUT", "", null, filter_graph));
// connect input -> volume -> output
InteropHelper.Check(ffmpeg.avfilter_link(*buffersrc_ctx, 0, volume_ctx, 0));
InteropHelper.Check(ffmpeg.avfilter_link(volume_ctx, 0, *buffersink_ctx, 0));
InteropHelper.Check(ffmpeg.avfilter_graph_config(filter_graph, null));
return filter_graph;
}
}
It uses a utility InteropHelper class derived from AutoGen's:
public class InteropHelper
{
[DllImport("kernel32", SetLastError = true)]
public static extern bool SetDllDirectory(string lpPathName);
public static readonly int AVERROR_EOF = -GetTag("EOF ");
public static readonly int AVERROR_UNKNOWN = -GetTag("UNKN");
public static readonly int AVFMT_GLOBALHEADER = 0x0040;
public static readonly int AVFMT_NOFILE = 0x0001;
public static readonly int AVIO_FLAG_WRITE = 2;
public static readonly int AV_CODEC_FLAG_GLOBAL_HEADER = (1 << 22);
public static readonly int AV_ROUND_ZERO = 0;
public static readonly int AV_ROUND_INF = 1;
public static readonly int AV_ROUND_DOWN = 2;
public static readonly int AV_ROUND_UP = 3;
public static readonly int AV_ROUND_PASS_MINMAX = 8192;
public static readonly int AV_ROUND_NEAR_INF = 5;
public static readonly int AVFMT_NOTIMESTAMPS = 0x0080;
public static unsafe void Check(void* ptr)
{
if (ptr != null)
return;
const int ENOMEM = 12;
Check(-ENOMEM);
}
public static unsafe void Check(IntPtr ptr)
{
if (ptr != IntPtr.Zero)
return;
Check((void*)null);
}
// example: "\x00F8FIL" is "Filter not found" (check libavutil/error.h)
public static void CheckTag(string tag)
{
Check(-GetTag(tag));
}
public static int GetTag(string tag)
{
var bytes = new byte[4];
for (int i = 0; i < 4; i++)
{
bytes[i] = (byte)tag[i];
}
return BitConverter.ToInt32(bytes, 0);
}
public static void Check(int res)
{
if (res >= 0)
return;
string err = "ffmpeg error " + res;
string text = GetErrorText(res);
if (!string.IsNullOrWhiteSpace(text))
{
err += ": " + text;
}
throw new Exception(err);
}
public static string GetErrorText(int res)
{
IntPtr err = Marshal.AllocHGlobal(256);
try
{
ffmpeg.av_strerror(res, err, 256);
return Marshal.PtrToStringAnsi(err);
}
finally
{
Marshal.FreeHGlobal(err);
}
}
}

Suppressing Frequencies From FFT

What I am trying to do is retrieve the frequencies from a song and suppress all the frequencies that do not appear in the human vocal range (or, more generally, in any given range). Here is my suppression function.
public void SupressAndWrite(Func<FrequencyUnit, bool> func)
{
this.WaveManipulated = true;
while (this.mainWave.WAVFile.NumSamplesRemaining > 0)
{
FrequencyUnit[] freqUnits = this.mainWave.NextFrequencyUnits();
Complex[] compUnits = (from item
in freqUnits
select (func(item)
? new Complex(item.Frequency, 0) :Complex.Zero))
.ToArray();
FourierTransform.FFT(compUnits, FourierTransform.Direction.Backward);
short[] shorts = (from item
in compUnits
select (short)item.Real).ToArray();
foreach (short item in shorts)
{
this.ManipulatedFile.AddSample16bit(item);
}
}
this.ManipulatedFile.Close();
}
Here is my class for my wave.
public sealed class ComplexWave
{
public readonly WAVFile WAVFile;
public readonly Int32 SampleSize;
private FourierTransform.Direction fourierDirection { get; set; }
private long position;
/// <param name="file"></param>
/// <param name="sampleSize in BLOCKS"></param>
public ComplexWave(WAVFile file, int sampleSize)
{
file.NullReferenceExceptionCheck();
this.WAVFile = file;
this.SampleSize = sampleSize;
if (this.SampleSize % 8 != 0)
{
if (this.SampleSize % 16 != 0)
{
throw new ArgumentException("Sample Size");
}
}
if (!MathTools.IsPowerOf2(sampleSize))
{
throw new ArgumentException("Sample Size");
}
this.fourierDirection = FourierTransform.Direction.Forward;
}
public Complex[] NextSampleFourierTransform()
{
short[] newInput = this.GetNextSample();
Complex[] data = newInput.CopyToComplex();
if (newInput.Any((x) => x != 0))
{
Debug.Write("done");
}
FourierTransform.FFT(data, this.fourierDirection);
return data;
}
public FrequencyUnit[] NextFrequencyUnits()
{
Complex[] cm = this.NextSampleFourierTransform();
FrequencyUnit[] freqUn = new FrequencyUnit[(cm.Length / 2)];
int max = (cm.Length / 2);
for (int i = 0; i < max; i++)
{
freqUn[i] = new FrequencyUnit(cm[i], this.WAVFile.SampleRateHz, i, cm.Length);
}
Array.Sort(freqUn);
return freqUn;
}
private short[] GetNextSample()
{
short[] retval = new short[this.SampleSize];
for (int i = 0; i < this.SampleSize; i++)
{
if (this.WAVFile.NumSamplesRemaining > 0)
{
retval[i] = this.WAVFile.GetNextSampleAs16Bit();
this.position++;
}
}
return retval;
}
}
Both the forward and the backward FFT work correctly. Could you please tell me what my error is?
Unfortunately, the human voice, even when singing, isn't confined to a single frequency range. It usually has one fundamental frequency and a multitude of harmonics that follow it, depending on the phoneme.
Use this https://play.google.com/store/apps/details?id=radonsoft.net.spectralview&hl=en or some similar app to see what I mean, and then redefine your strategy. Also google the 'karaoke' effect.
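For reference, the classic 'karaoke' effect is usually just one channel subtracted from the other, which cancels anything panned dead centre (typically the lead vocal). A minimal sketch, assuming interleaved stereo samples already converted to float:

// buffer holds interleaved stereo samples: L, R, L, R, ...
// centre-panned content (often the vocal) largely cancels in the difference signal
static float[] KaraokeMix(float[] stereo)
{
    var mono = new float[stereo.Length / 2];
    for (int i = 0; i < mono.Length; i++)
    {
        float left = stereo[2 * i];
        float right = stereo[2 * i + 1];
        mono[i] = 0.5f * (left - right);
    }
    return mono;
}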
Next: it's not obvious from your example, but you should scan the whole file in windows (google 'FFT windowing') to process it in full.
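On the windowing point: a common approach is to multiply each block by a Hann window before the forward FFT and process the file in overlapping blocks (overlap-add on the way back). A small sketch of just the windowing step; converting the result to your Complex type and choosing the overlap are left out:

// multiply one block of time-domain samples by a Hann window before the forward FFT
static double[] ApplyHannWindow(short[] block)
{
    int n = block.Length;
    var windowed = new double[n];
    for (int i = 0; i < n; i++)
    {
        double w = 0.5 * (1.0 - Math.Cos(2.0 * Math.PI * i / (n - 1)));
        windowed[i] = block[i] * w;
    }
    return windowed;
}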
