I finished getting the raw data out of a WAV file, so now I know the information about the file, such as its DataRate, BitsPerSample, and so on.
After reading the WAV file I have the sample data in different types depending on the bit depth: Int16[] for 16-bit audio and byte[] for 8-bit audio.
Now I am trying to convert the Int16[] to float[].
I found the NAudio.Wave class Wave16ToFloatProvider.
I have seen Converting 16 bit to 32-bit floating point, but I couldn't quite follow it, because I don't need to write the result with WaveFileWriter.
So I tried to do it without WaveFileWriter. Here is my code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using NAudio;
using NAudio.Wave;

namespace WaveREader
{
    class WaveReader
    {
        WaveFileReader reader = new WaveFileReader("wavetest.wav");
        IWaveProvider stream32 = new Wave16ToFloatProvider(reader);
        byte[] buffer = new byte[8192];
        float[] DATASIXTEEN;

        for (int i = 0; i < buffer.Length; i++)
        {
            DATASIXTEEN = new float[buffer.Length];
            DATASIXTEEN[i] = stream32.Read(buffer, 0, buffer.Length);
        }
    }
}
I think this part is wrong: DATASIXTEEN[i] = stream32.Read(buffer, 0, buffer.Length);, but I have no idea how to correct it.
Would you give me some advice, or some example code using Wave16ToFloatProvider?
Or, alternatively, how would I do the conversion without Wave16ToFloatProvider?
The return value from Stream.Read is the number of bytes read, not the sample data you're after. The data you want is in the buffer, but each 32-bit float sample is spread across four 8-bit bytes.
There are a number of ways to get the data as 32-bit float.
The first is to use an ISampleProvider which converts the data into the floating point format and gives a simple way to read the data in that format:
WaveFileReader reader = new WaveFileReader("wavetest.wav");
ISampleProvider provider = new Pcm16BitToSampleProvider(reader);
int blockSize = 2000;
float[] buffer = new float[blockSize];
// Read blocks of samples until no more available
int rc;
while ((rc = provider.Read(buffer, 0, blockSize)) > 0)
{
// Process the array of samples in here.
// rc is the number of valid samples in the buffer
// ....
}
Alternatively, there is a method in WaveFileReader that lets you read floating-point samples directly. The downside is that it reads one sample frame at a time (that is, one sample for each channel: one for mono, two for stereo), which can be time consuming. Reading and processing arrays is faster in most cases.
WaveFileReader reader = new WaveFileReader("wavetest.wav");
float[] buffer;
while ((buffer = reader.ReadNextSampleFrame()) != null)
{
// Process samples in here.
// buffer contains one sample per channel
// ....
}
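If you would rather stick with Wave16ToFloatProvider, as in your original attempt, here is a minimal sketch along the same lines (assuming the 16-bit "wavetest.wav" from your code): Read fills a byte array, and BitConverter.ToSingle turns each group of four bytes into one float.
using (var reader = new WaveFileReader("wavetest.wav"))
{
    IWaveProvider stream32 = new Wave16ToFloatProvider(reader);
    byte[] buffer = new byte[8192];              // room for 2048 32-bit float samples
    int bytesRead;
    while ((bytesRead = stream32.Read(buffer, 0, buffer.Length)) > 0)
    {
        // Read returns a byte count, so divide by 4 to get the number of samples
        float[] samples = new float[bytesRead / 4];
        for (int i = 0; i < samples.Length; i++)
        {
            samples[i] = BitConverter.ToSingle(buffer, i * 4);
        }
        // process the samples array here
    }
}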
I am using C# in a Universal Windows App to talk to the Watson Speech-to-Text service.
For now, instead of sending the audio to the Watson service, I write it to a file and then open it in Audacity to confirm it is in the right format, since the Watson service wasn't returning correct responses to me; the following explains why.
For some reason, even though I create 16-bit PCM encoding properties, when I read the buffer I can only interpret the data correctly as 32-bit PCM (that works well); if I read it as 16-bit PCM it plays in slow motion and all the speech is basically corrupted.
I don't really know what exactly needs to be done to convert from 32-bit to 16-bit, but here's what I have in my C# application:
//Creating PCM Encoding properties
var pcmEncoding = AudioEncodingProperties.CreatePcm(16000, 1, 16);
var result = await AudioGraph.CreateAsync(
new AudioGraphSettings(AudioRenderCategory.Speech)
{
DesiredRenderDeviceAudioProcessing = AudioProcessing.Raw,
AudioRenderCategory = AudioRenderCategory.Speech,
EncodingProperties = pcmEncoding
}
);
graph = result.Graph;
//Initialize microphone
var microphone = await DeviceInformation.CreateFromIdAsync(MediaDevice.GetDefaultAudioCaptureId(AudioDeviceRole.Default));
var micInputResult = await graph.CreateDeviceInputNodeAsync(MediaCategory.Speech, pcmEncoding, microphone);
//Create frame output node
frameOutputNode = graph.CreateFrameOutputNode(pcmEncoding);
//Callback function to fire when buffer is filled with data
graph.QuantumProcessed += (s, a) => ProcessFrameOutput(frameOutputNode.GetFrame());
frameOutputNode.Start();
//Make the microphone write into the frame node
micInputResult.DeviceInputNode.AddOutgoingConnection(frameOutputNode);
micInputResult.DeviceInputNode.Start();
graph.Start();
Initialization step is done at this stage. Now, actually reading from the buffer and writing to the file is only working if I use 32-bit PCM encoding with the following function (commented out is the PCM 16-bit code that is resulting in a slow motion speech output):
private void ProcessFrameOutput(AudioFrame frame)
{
    //Making a copy of the audio frame buffer
    var audioBuffer = frame.LockBuffer(AudioBufferAccessMode.Read);
    var buffer = Windows.Storage.Streams.Buffer.CreateCopyFromMemoryBuffer(audioBuffer);
    buffer.Length = audioBuffer.Length;
    using (var dataReader = DataReader.FromBuffer(buffer))
    {
        dataReader.ByteOrder = ByteOrder.LittleEndian;
        byte[] byteData = new byte[buffer.Length];
        int pos = 0;
        while (dataReader.UnconsumedBufferLength > 0)
        {
            /*Reading Float -> Int 32*/
            /*With this code I can import raw wav file into the Audacity
              using Signed 32-bit PCM Encoding, and it is working well*/
            var singleTmp = dataReader.ReadSingle();
            var int32Tmp = (Int32)(singleTmp * Int32.MaxValue);
            byte[] chunkBytes = BitConverter.GetBytes(int32Tmp);
            byteData[pos++] = chunkBytes[0];
            byteData[pos++] = chunkBytes[1];
            byteData[pos++] = chunkBytes[2];
            byteData[pos++] = chunkBytes[3];

            /*Reading Float -> Int 16 (Slow Motion)*/
            /*With this code I can import raw wav file into the Audacity
              using Signed 16-bit PCM Encoding, but when I play it, it's in
              a slow motion*/
            //var singleTmp = dataReader.ReadSingle();
            //var int16Tmp = (Int16)(singleTmp * Int16.MaxValue);
            //byte[] chunkBytes = BitConverter.GetBytes(int16Tmp);
            //byteData[pos++] = chunkBytes[0];
            //byteData[pos++] = chunkBytes[1];
        }
        WriteBytesToFile(byteData);
    }
}
Can anyone think of a reason why this is happening? Is it because Int32 PCM is larger in size and when I use Int16, it extends it and makes the sound longer? Or am I not sampling it properly?
Note: I tried reading bytes directly from the buffer and then using that as raw data, but it's not encoded as PCM that way.
Reading Int16/Int32 values directly from the buffer doesn't work either.
In the above example I am only using a frame output node. If I create a file output node that automatically writes to the raw file, it works really well as 16-bit PCM, so something in my callback function is causing the slow motion.
Thanks
//Creating PCM Encoding properties
var pcmEncoding = AudioEncodingProperties.CreatePcm(16000, 1, 16);
var result = await AudioGraph.CreateAsync(
new AudioGraphSettings(AudioRenderCategory.Speech)
{
DesiredRenderDeviceAudioProcessing = AudioProcessing.Raw,
AudioRenderCategory = AudioRenderCategory.Speech,
EncodingProperties = pcmEncoding
}
);
graph = result.Graph;
pcmEncoding does not make much sense here since only Float encoding is supported by AudioGraph.
byte[] byteData = new byte[buffer.Length];
It should be buffer.Length / 2, since you are converting from float data with 4 bytes per sample to Int16 data with 2 bytes per sample.
/*Reading Float -> Int 16 (Slow Motion)*/
/*With this code I can import raw wav file into the Audacity
using Signed 16-bit PCM Encoding, but when I play it, it's in
a slow motion*/
var singleTmp = dataReader.ReadSingle();
var int16Tmp = (Int16)(singleTmp * Int16.MaxValue);
byte[] chunkBytes = BitConverter.GetBytes(int16Tmp);
byteData[pos++] = chunkBytes[0];
byteData[pos++] = chunkBytes[1];
This code is correct and should work. Your "slow motion" is most likely caused by the buffer size you set incorrectly above.
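Putting the two fixes together, a sketch of the corrected callback could look like the following (it assumes, as discussed above, that the graph actually delivers 32-bit float samples and that WriteBytesToFile simply appends the raw bytes):
private void ProcessFrameOutput(AudioFrame frame)
{
    var audioBuffer = frame.LockBuffer(AudioBufferAccessMode.Read);
    var buffer = Windows.Storage.Streams.Buffer.CreateCopyFromMemoryBuffer(audioBuffer);
    buffer.Length = audioBuffer.Length;

    using (var dataReader = DataReader.FromBuffer(buffer))
    {
        dataReader.ByteOrder = ByteOrder.LittleEndian;

        // 4 bytes of float in, 2 bytes of Int16 out, so the output is half the size
        byte[] byteData = new byte[buffer.Length / 2];
        int pos = 0;
        while (dataReader.UnconsumedBufferLength > 0)
        {
            var sample = dataReader.ReadSingle();
            var int16Sample = (Int16)(sample * Int16.MaxValue);
            byte[] chunkBytes = BitConverter.GetBytes(int16Sample);
            byteData[pos++] = chunkBytes[0];
            byteData[pos++] = chunkBytes[1];
        }
        WriteBytesToFile(byteData);
    }
}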
I must admit Microsoft needs someone to review their bloated APIs
I'm trying to play raw PCM data delivered from the ohLibSpotify C# library (https://github.com/openhome/ohLibSpotify).
I get the data in the following callback:
public void MusicDeliveryCallback(SpotifySession session, AudioFormat format, IntPtr frames, int num_frames)
{
//EXAMPLE DATA
//format.channels = 2, format.samplerate = 44100, format.sample_type = Int16NativeEndian
//frames = ?
//num_frames = 2048
}
Now I want to play the received data directly with NAudio (http://naudio.codeplex.com/). With the following code snippet I can play an MP3 file from disk. Is it possible to pass the data received from Spotify directly to NAudio and play it in real time?
using (var ms = File.OpenRead("test.pcm"))
using (var rdr = new Mp3FileReader(ms))
using (var wavStream = WaveFormatConversionStream.CreatePcmStream(rdr))
using (var baStream = new BlockAlignReductionStream(wavStream))
using (var waveOut = new WaveOut(WaveCallbackInfo.FunctionCallback()))
{
    waveOut.Init(baStream);
    waveOut.Play();
    while (waveOut.PlaybackState == PlaybackState.Playing)
    {
        Thread.Sleep(100);
    }
}
EDIT:
I updated my code. The program doesn't throw any errors, but I can't hear any music either. Is there anything wrong with my code?
This is the music delivery callback:
public void MusicDeliveryCallback(SpotifySession session, AudioFormat format, IntPtr frames, int num_frames)
{
    //format.channels = 2, format.samplerate = 44100, format.sample_type = Int16NativeEndian
    //frames = ?
    //num_frames = 2048

    byte[] frames_copy = new byte[num_frames];
    Marshal.Copy(frames, frames_copy, 0, num_frames);

    bufferedWaveProvider = new BufferedWaveProvider(new WaveFormat(format.sample_rate, format.channels));
    bufferedWaveProvider.BufferDuration = TimeSpan.FromSeconds(40);
    bufferedWaveProvider.AddSamples(frames_copy, 0, num_frames);
    bufferedWaveProvider.Read(frames_copy, 0, num_frames);

    if (_waveOutDeviceInitialized == false)
    {
        IWavePlayer waveOutDevice = new WaveOut();
        waveOutDevice.Init(bufferedWaveProvider);
        waveOutDevice.Play();
        _waveOutDeviceInitialized = true;
    }
}
And these are the overridden callbacks in the SessionListener:
public override int MusicDelivery(SpotifySession session, AudioFormat format, IntPtr frames, int num_frames)
{
    _sessionManager.MusicDeliveryCallback(session, format, frames, num_frames);
    return base.MusicDelivery(session, format, frames, num_frames);
}

public override void GetAudioBufferStats(SpotifySession session, out AudioBufferStats stats)
{
    stats.samples = 2048 / 2; //???
    stats.stutter = 0;        //???
}
I think you can do this:
Create a BufferedWaveProvider.
Pass this to waveOut.Init.
In your MusicDeliveryCallback, use Marshal.Copy to copy from the native buffer into a managed byte array.
Pass this managed byte array to AddSamples on your BufferedWaveProvider.
In your GetAudioBufferStats callback, use bufferedWaveProvider.BufferedBytes / 2 for "samples" and leave "stutter" as 0.
I think that will work. It involves some unnecessary copying and doesn't accurately keep track of stutters, but it's a good starting point. I think it might be a better (more efficient and reliable) solution to implement IWaveProvider and manage the buffering yourself.
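A rough sketch of those steps is below. The field and method names are placeholders of mine, and it assumes the values from your callback comments (16-bit interleaved samples, format.sample_rate and format.channels); create the provider and output device once, not on every delivery:
private BufferedWaveProvider bufferedWaveProvider;
private IWavePlayer waveOutDevice;

private void InitPlayback(AudioFormat format)
{
    // 16-bit PCM with the rate and channel count reported by libspotify
    bufferedWaveProvider = new BufferedWaveProvider(new WaveFormat(format.sample_rate, 16, format.channels))
    {
        BufferDuration = TimeSpan.FromSeconds(5)
    };
    waveOutDevice = new WaveOutEvent();
    waveOutDevice.Init(bufferedWaveProvider);
    waveOutDevice.Play();
}

public void MusicDeliveryCallback(SpotifySession session, AudioFormat format, IntPtr frames, int num_frames)
{
    if (bufferedWaveProvider == null)
    {
        InitPlayback(format);
    }
    // num_frames counts frames, and each frame holds one 16-bit sample per channel
    int byteCount = num_frames * format.channels * 2;
    byte[] managed = new byte[byteCount];
    Marshal.Copy(frames, managed, 0, byteCount);
    bufferedWaveProvider.AddSamples(managed, 0, byteCount);
}

// In the SessionListener:
public override void GetAudioBufferStats(SpotifySession session, out AudioBufferStats stats)
{
    // buffered bytes / 2 bytes per 16-bit sample
    stats.samples = bufferedWaveProvider != null ? bufferedWaveProvider.BufferedBytes / 2 : 0;
    stats.stutter = 0;
}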
I wrote the ohLibSpotify wrapper library, but I no longer work for the same company, so I'm not involved in its development any more. You might be able to get more help from someone on this forum: http://forum.openhome.org/forumdisplay.php?fid=6 As far as music delivery goes, ohLibSpotify aims to have as little overhead as possible. It doesn't copy the music data at all; it just passes you the same native pointer that the libspotify library itself provided, so that you can copy it yourself to its final destination and avoid an unnecessary layer of copying. It does make it a bit clunky for simple usage, though.
Good luck!
First, the code snippet shown above is more complicated than it needs to be. You only need two using statements instead of five; Mp3FileReader decodes to PCM for you. Second, use WaveOutEvent in preference to WaveOut with function callbacks; it is much more reliable.
using (var rdr = new Mp3FileReader("test.pcm"))
using (var waveOut = new WaveOutEvent())
{
//...
}
To answer your actual question, you need to use a BufferedWaveProvider. You create one of these and pass it to your output device in the Init method. Then, as you receive audio, decompress it to PCM (if it is compressed) and put it into the BufferedWaveProvider. The NAudioDemo application includes examples of how to do this, so look at the NAudio source code to see how it's done.
I'm working with the NAudio library and would like to apply a fast Fourier transform (FFT) to a WaveStream. I saw that NAudio already has a built-in FFT, but how do I use it?
I heard I have to use the SampleAggregator class.
You need to read the entire blog article to best understand the following code sample, which I have lifted here so that the sample is preserved even if the article isn't:
using (WaveFileReader reader = new WaveFileReader(fileToProcess))
{
    IWaveProvider stream32 = new Wave16ToFloatProvider(reader);
    IWaveProvider streamEffect = new AutoTuneWaveProvider(stream32, autotuneSettings);
    IWaveProvider stream16 = new WaveFloatTo16Provider(streamEffect);
    using (WaveFileWriter converted = new WaveFileWriter(tempFile, stream16.WaveFormat))
    {
        // buffer length needs to be a power of 2 for FFT to work nicely
        // however, make the buffer too long and pitches aren't detected fast enough
        // successful buffer sizes: 8192, 4096, 2048, 1024
        // (some pitch detection algorithms need at least 2048)
        byte[] buffer = new byte[8192];
        int bytesRead;
        do
        {
            bytesRead = stream16.Read(buffer, 0, buffer.Length);
            converted.WriteData(buffer, 0, bytesRead);
        } while (bytesRead != 0 && converted.Length < reader.Length);
    }
}
In short, once you have the WAV file created, you can use that sample as the basis for feeding your own audio through the FFT.
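If you only need the FFT itself and not the AutoTune processing, here is a minimal sketch using NAudio's built-in FastFourierTransform (from NAudio.Dsp); the file name, the use of AudioFileReader, and the 1024-sample frame size are my assumptions, not from the article:
using (var reader = new AudioFileReader("wavetest.wav"))   // reads samples as floats
{
    int fftLength = 1024;      // must be a power of two
    int m = 10;                // log2(1024), required by the FFT call
    float[] samples = new float[fftLength];
    int read = reader.Read(samples, 0, fftLength);          // one frame; loop for the whole file

    var fftBuffer = new NAudio.Dsp.Complex[fftLength];
    for (int i = 0; i < fftLength; i++)
    {
        // window the input to reduce spectral leakage
        fftBuffer[i].X = (float)(samples[i] * NAudio.Dsp.FastFourierTransform.HammingWindow(i, fftLength));
        fftBuffer[i].Y = 0;
    }

    NAudio.Dsp.FastFourierTransform.FFT(true, m, fftBuffer);

    for (int i = 0; i < fftLength / 2; i++)
    {
        // magnitude of each frequency bin
        double magnitude = Math.Sqrt(fftBuffer[i].X * fftBuffer[i].X + fftBuffer[i].Y * fftBuffer[i].Y);
        // use magnitude here
    }
}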
I've been looking over the NAudio examples trying to work out how I can get mu-law samples suitable for packaging up as an RTP payload. I'm attempting to generate the samples from an MP3 file using the code below. Not surprisingly, since I don't really have a clue what I'm doing with NAudio, when I transmit the samples across the network to a softphone all I get is static.
Can anyone provide any direction on how I should be getting 160-byte (8 kHz @ 20 ms) mu-law samples from an MP3 file using NAudio?
private void GetAudioSamples()
{
    var pcmStream = WaveFormatConversionStream.CreatePcmStream(new Mp3FileReader("whitelight.mp3"));
    byte[] buffer = new byte[2];
    byte[] sampleBuffer = new byte[160];
    int sampleIndex = 0;
    int bytesRead = pcmStream.Read(buffer, 0, 2);
    while (bytesRead > 0)
    {
        var ulawByte = MuLawEncoder.LinearToMuLawSample(BitConverter.ToInt16(buffer, 0));
        sampleBuffer[sampleIndex++] = ulawByte;
        if (sampleIndex == 160)
        {
            m_rtpChannel.AddSample(sampleBuffer);
            sampleBuffer = new byte[160];
            sampleIndex = 0;
        }
        bytesRead = pcmStream.Read(buffer, 0, 2);
    }
    logger.Debug("Finished adding audio samples.");
}
Here are a few pointers. First of all, as long as you are using NAudio 1.5, there is no need for the additional WaveFormatConversionStream: Mp3FileReader's Read method already returns PCM.
However, you will not be getting 8kHz out, so you need to resample it first. WaveFormatConversionStream can do this, although it uses the built-in Windows ACM sample rate conversion, which doesn't seem to filter the incoming audio well, so there could be aliasing artefacts.
Also, you should usually read bigger blocks than just two bytes at a time, as the MP3 decoder needs to decode whole frames at a time (the resampler will also want to deal with bigger block sizes). I would try reading at least 20 ms worth of bytes at a time.
Your use of BitConverter.ToInt16 is correct for getting the 16-bit sample value, but bear in mind that an MP3 is likely stereo, with interleaved left and right samples. Are you sure your phone expects stereo?
Finally, I recommend making a mu-law WAV file as a first step, using WaveFileWriter. Then you can easily listen to it in Windows Media Player and check that what you are sending to your softphone is what you intended.
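For example, a quick sanity check along those lines might look like this (it assumes the ACM converter on your machine will accept the sample rate and channel change in one step):
using (var mp3 = new Mp3FileReader("whitelight.mp3"))
using (var pcm8k = new WaveFormatConversionStream(new WaveFormat(8000, 16, 1), mp3))
using (var ulaw = new WaveFormatConversionStream(WaveFormat.CreateMuLawFormat(8000, 1), pcm8k))
{
    // Write the mu-law stream to a WAV file you can listen to in Windows Media Player
    WaveFileWriter.CreateWaveFile("whitelight-ulaw.wav", ulaw);
}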
Below is the way I eventually got it working. I do lose one of the channels from the MP3, and I guess there's some way to mix the channels down as part of the conversion, but that doesn't matter for my situation.
The 160-byte buffer size gives me 20 ms mu-law samples, which work perfectly with the SIP softphone I'm testing with.
var pcmFormat = new WaveFormat(8000, 16, 1);
var ulawFormat = WaveFormat.CreateMuLawFormat(8000, 1);

using (WaveFormatConversionStream pcmStm = new WaveFormatConversionStream(pcmFormat, new Mp3FileReader("whitelight.mp3")))
{
    using (WaveFormatConversionStream ulawStm = new WaveFormatConversionStream(ulawFormat, pcmStm))
    {
        byte[] buffer = new byte[160];
        int bytesRead = ulawStm.Read(buffer, 0, 160);
        while (bytesRead > 0)
        {
            byte[] sample = new byte[bytesRead];
            Array.Copy(buffer, sample, bytesRead);
            m_rtpChannel.AddSample(sample);
            bytesRead = ulawStm.Read(buffer, 0, 160);
        }
    }
}
I am trying to read PCM samples from a (converted) MP3 file using NAudio, but failing as the Read method returns zero (indicating EOF) every time.
Example: this piece of code, which attempts to read a single 16-bit sample, always prints "0":
using System;
using NAudio.Wave;

namespace NAudioMp3Test
{
    class Program
    {
        static void Main(string[] args)
        {
            using (Mp3FileReader fr = new Mp3FileReader("MySong.mp3"))
            {
                byte[] buffer = new byte[2];
                using (WaveStream pcm = WaveFormatConversionStream.CreatePcmStream(fr))
                {
                    using (WaveStream aligned = new BlockAlignReductionStream(pcm))
                    {
                        Console.WriteLine(aligned.WaveFormat);
                        Console.WriteLine(aligned.Read(buffer, 0, 2));
                    }
                }
            }
        }
    }
}
output:
16 bit PCM: 44kHz 2 channels
0
But this version, which reads from a WAV file, works fine (I used iTunes to convert the MP3 to a WAV, so they should contain similar samples):
static void Main(string[] args)
{
    using (WaveFileReader pcm = new WaveFileReader("MySong.wav"))
    {
        byte[] buffer = new byte[2];
        using (WaveStream aligned = new BlockAlignReductionStream(pcm))
        {
            Console.WriteLine(aligned.WaveFormat);
            Console.WriteLine(aligned.Read(buffer, 0, 2));
        }
    }
}
output:
16 bit PCM: 44kHz 2 channels
2
What is going on here? Both streams have the same wave formats so I would expect to be able to use the same API to read samples. Setting the Position property doesn't help either.
You probably need to read in larger chunks. NAudio uses ACM to perform the conversion from MP3 to WAV, and if your target buffer isn't big enough, the codec may refuse to convert any data at all. In other words, you need to convert a block of samples before you can read the first sample.
WAV files are a different matter as it is nice and easy to read a single sample from them.
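For example, here is a sketch of the MP3 case reading roughly a tenth of a second at a time (the buffer size is an arbitrary choice of mine; anything comfortably larger than one decoded MP3 frame should do):
using (var fr = new Mp3FileReader("MySong.mp3"))
using (var pcm = WaveFormatConversionStream.CreatePcmStream(fr))
{
    byte[] buffer = new byte[pcm.WaveFormat.AverageBytesPerSecond / 10];   // ~100 ms of audio
    int bytesRead;
    while ((bytesRead = pcm.Read(buffer, 0, buffer.Length)) > 0)
    {
        // bytesRead bytes of 16-bit PCM are now in buffer;
        // e.g. the first sample of this block is BitConverter.ToInt16(buffer, 0)
    }
}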