I am trying to read PCM samples from a (converted) MP3 file using NAudio, but failing as the Read method returns zero (indicating EOF) every time.
Example: this piece of code, which attempts to read a single 16-bit sample, always prints "0":
using System;
using NAudio.Wave;

namespace NAudioMp3Test
{
    class Program
    {
        static void Main(string[] args)
        {
            using (Mp3FileReader fr = new Mp3FileReader("MySong.mp3"))
            {
                byte[] buffer = new byte[2];
                using (WaveStream pcm = WaveFormatConversionStream.CreatePcmStream(fr))
                {
                    using (WaveStream aligned = new BlockAlignReductionStream(pcm))
                    {
                        Console.WriteLine(aligned.WaveFormat);
                        Console.WriteLine(aligned.Read(buffer, 0, 2));
                    }
                }
            }
        }
    }
}
output:
16 bit PCM: 44kHz 2 channels
0
But this version, which reads from a WAV file, works fine (I used iTunes to convert the MP3 to a WAV, so the two files should contain similar samples):
static void Main(string[] args)
{
    using (WaveFileReader pcm = new WaveFileReader("MySong.wav"))
    {
        byte[] buffer = new byte[2];
        using (WaveStream aligned = new BlockAlignReductionStream(pcm))
        {
            Console.WriteLine(aligned.WaveFormat);
            Console.WriteLine(aligned.Read(buffer, 0, 2));
        }
    }
}
output:
16 bit PCM: 44kHz 2 channels
2
What is going on here? Both streams have the same wave formats so I would expect to be able to use the same API to read samples. Setting the Position property doesn't help either.
You probably need to read in larger chunks. NAudio uses ACM to perform the conversion from MP3 to PCM, and if your target buffer isn't big enough, the codec may refuse to convert any data at all. In other words, a whole block of samples has to be converted before you can read the first one.
WAV files are a different matter, as reading a single sample from them is nice and easy.
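For example, here is a minimal sketch (reusing the file name from the question) that reads a whole second of converted audio at once and then pulls the first 16-bit sample out of the managed buffer:

using System;
using NAudio.Wave;

class LargeChunkRead
{
    static void Main()
    {
        using (var fr = new Mp3FileReader("MySong.mp3"))
        using (var pcm = WaveFormatConversionStream.CreatePcmStream(fr))
        {
            // give ACM a big enough target buffer to convert a whole block
            byte[] buffer = new byte[pcm.WaveFormat.AverageBytesPerSecond]; // ~1 second
            int bytesRead = pcm.Read(buffer, 0, buffer.Length);
            Console.WriteLine(bytesRead); // no longer 0
            if (bytesRead >= 2)
            {
                short firstSample = BitConverter.ToInt16(buffer, 0);
                Console.WriteLine(firstSample);
            }
        }
    }
}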
I am using C# in a Universal Windows App to write a Watson speech-to-text service.
For now, instead of using the Watson service, I write the audio to a file and then read it in Audacity to confirm it is in the right format, since the Watson service wasn't returning correct responses to me, and the following explains why.
For some reason, when I create 16-bit PCM encoding properties and read the buffer, I can only read the data as 32-bit PCM. That works well, but if I read it as 16-bit PCM the audio is in slow motion and all the speech is basically corrupted.
I don't really know what exactly needs to be done to convert from 32-bit to 16-bit, but here's what I have in my C# application:
//Creating PCM Encoding properties
var pcmEncoding = AudioEncodingProperties.CreatePcm(16000, 1, 16);
var result = await AudioGraph.CreateAsync(
    new AudioGraphSettings(AudioRenderCategory.Speech)
    {
        DesiredRenderDeviceAudioProcessing = AudioProcessing.Raw,
        AudioRenderCategory = AudioRenderCategory.Speech,
        EncodingProperties = pcmEncoding
    }
);
graph = result.Graph;

//Initialize microphone
var microphone = await DeviceInformation.CreateFromIdAsync(MediaDevice.GetDefaultAudioCaptureId(AudioDeviceRole.Default));
var micInputResult = await graph.CreateDeviceInputNodeAsync(MediaCategory.Speech, pcmEncoding, microphone);

//Create frame output node
frameOutputNode = graph.CreateFrameOutputNode(pcmEncoding);

//Callback function to fire when buffer is filled with data
graph.QuantumProcessed += (s, a) => ProcessFrameOutput(frameOutputNode.GetFrame());
frameOutputNode.Start();

//Make the microphone write into the frame node
micInputResult.DeviceInputNode.AddOutgoingConnection(frameOutputNode);
micInputResult.DeviceInputNode.Start();
graph.Start();
The initialization step is done at this stage. Now, actually reading from the buffer and writing to the file only works if I use 32-bit PCM encoding with the following function (the commented-out code is the 16-bit PCM path that results in slow-motion speech output):
private void ProcessFrameOutput(AudioFrame frame)
{
    //Making a copy of the audio frame buffer
    var audioBuffer = frame.LockBuffer(AudioBufferAccessMode.Read);
    var buffer = Windows.Storage.Streams.Buffer.CreateCopyFromMemoryBuffer(audioBuffer);
    buffer.Length = audioBuffer.Length;
    using (var dataReader = DataReader.FromBuffer(buffer))
    {
        dataReader.ByteOrder = ByteOrder.LittleEndian;
        byte[] byteData = new byte[buffer.Length];
        int pos = 0;
        while (dataReader.UnconsumedBufferLength > 0)
        {
            /*Reading Float -> Int 32*/
            /*With this code I can import the raw wav file into Audacity
              using Signed 32-bit PCM encoding, and it is working well*/
            var singleTmp = dataReader.ReadSingle();
            var int32Tmp = (Int32)(singleTmp * Int32.MaxValue);
            byte[] chunkBytes = BitConverter.GetBytes(int32Tmp);
            byteData[pos++] = chunkBytes[0];
            byteData[pos++] = chunkBytes[1];
            byteData[pos++] = chunkBytes[2];
            byteData[pos++] = chunkBytes[3];

            /*Reading Float -> Int 16 (Slow Motion)*/
            /*With this code I can import the raw wav file into Audacity
              using Signed 16-bit PCM encoding, but when I play it, it's in
              slow motion*/
            //var singleTmp = dataReader.ReadSingle();
            //var int16Tmp = (Int16)(singleTmp * Int16.MaxValue);
            //byte[] chunkBytes = BitConverter.GetBytes(int16Tmp);
            //byteData[pos++] = chunkBytes[0];
            //byteData[pos++] = chunkBytes[1];
        }
        WriteBytesToFile(byteData);
    }
}
Can anyone think of a reason why this is happening? Is it because Int32 PCM is larger in size, so when I use Int16 the data gets stretched and the sound becomes longer? Or am I not sampling it properly?
Note: I tried reading bytes directly from the buffer and using them as raw data, but the result is not encoded as PCM that way.
Reading Int16/Int32 from the buffer directly also doesn't work.
In the example above I am only using a frame output node. If I create a file output node that automatically writes to the raw file, it works really well as 16-bit PCM, so something in my callback function is causing the slow motion.
Thanks
//Creating PCM Encoding properties
var pcmEncoding = AudioEncodingProperties.CreatePcm(16000, 1, 16);
var result = await AudioGraph.CreateAsync(
    new AudioGraphSettings(AudioRenderCategory.Speech)
    {
        DesiredRenderDeviceAudioProcessing = AudioProcessing.Raw,
        AudioRenderCategory = AudioRenderCategory.Speech,
        EncodingProperties = pcmEncoding
    }
);
graph = result.Graph;
pcmEncoding does not make much sense here since only Float encoding is supported by AudioGraph.
byte[] byteData = new byte[buffer.Length];
It should be buffer.Length / 2, since you are converting from float data with 4 bytes per sample to Int16 data with 2 bytes per sample.
/*Reading Float -> Int 16 (Slow Motion)*/
/*With this code I can import raw wav file into the Audacity
using Signed 16-bit PCM Encoding, but when I play it, it's in
a slow motion*/
var singleTmp = dataReader.ReadSingle();
var int16Tmp = (Int16)(singleTmp * Int16.MaxValue);
byte[] chunkBytes = BitConverter.GetBytes(int16Tmp);
byteData[pos++] = chunkBytes[0];
byteData[pos++] = chunkBytes[1];
This code is correct and should work. Your "slow motion" is most likely related to the buffer size you set incorrectly before.
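Putting both fixes together, here is a sketch of the corrected 16-bit loop (dataReader, buffer, and WriteBytesToFile are the question's own names):

//half-size output buffer: 4-byte floats become 2-byte Int16s
byte[] byteData = new byte[buffer.Length / 2];
int pos = 0;
while (dataReader.UnconsumedBufferLength > 0)
{
    var singleTmp = dataReader.ReadSingle();
    var int16Tmp = (Int16)(singleTmp * Int16.MaxValue);
    byte[] chunkBytes = BitConverter.GetBytes(int16Tmp);
    byteData[pos++] = chunkBytes[0];
    byteData[pos++] = chunkBytes[1];
}
WriteBytesToFile(byteData);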
I must admit Microsoft needs someone to review their bloated APIs
I need to convert a WAV file to 8000 Hz, 16-bit, mono WAV. I already have code which works well with the NAudio library, but I want to use a MemoryStream instead of a temporary file.
using System.IO;
using NAudio.Wave;

static void Main()
{
    var input = File.ReadAllBytes("C:/input.wav");
    var output = ConvertWavTo8000Hz16BitMonoWav(input);
    File.WriteAllBytes("C:/output.wav", output);
}

public static byte[] ConvertWavTo8000Hz16BitMonoWav(byte[] inArray)
{
    using (var mem = new MemoryStream(inArray))
    using (var reader = new WaveFileReader(mem))
    using (var converter = WaveFormatConversionStream.CreatePcmStream(reader))
    using (var upsampler = new WaveFormatConversionStream(new WaveFormat(8000, 16, 1), converter))
    {
        // todo: without saving to file using MemoryStream or similar
        WaveFileWriter.CreateWaveFile("C:/tmp_pcm_8000_16_mono.wav", upsampler);
        return File.ReadAllBytes("C:/tmp_pcm_8000_16_mono.wav");
    }
}
Not sure if this is the optimal way, but it works...
public static byte[] ConvertWavTo8000Hz16BitMonoWav(byte[] inArray)
{
    using (var mem = new MemoryStream(inArray))
    using (var reader = new WaveFileReader(mem))
    using (var converter = WaveFormatConversionStream.CreatePcmStream(reader))
    using (var upsampler = new WaveFormatConversionStream(new WaveFormat(8000, 16, 1), converter))
    {
        byte[] data;
        using (var m = new MemoryStream())
        {
            upsampler.CopyTo(m);
            data = m.ToArray();
        }
        using (var m = new MemoryStream())
        {
            // to create a proper WAV header, which begins with RIFF
            var w = new WaveFileWriter(m, upsampler.WaveFormat);
            // append the WAV data body
            w.Write(data, 0, data.Length);
            // flush so the chunk sizes in the header get updated
            // before the stream contents are copied out
            w.Flush();
            return m.ToArray();
        }
    }
}
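A variation on the tail end of this, assuming NAudio's IgnoreDisposeStream helper (in NAudio.Utils): disposing the writer finalizes the header sizes, while the wrapper keeps the MemoryStream open:

using (var m = new MemoryStream())
{
    // the wrapper keeps m open when the writer is disposed, and
    // disposing the writer finalizes the RIFF header sizes
    using (var w = new WaveFileWriter(new NAudio.Utils.IgnoreDisposeStream(m), upsampler.WaveFormat))
    {
        w.Write(data, 0, data.Length);
    }
    return m.ToArray();
}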
It might be added (sorry, I can't comment yet due to lack of points) that NAudio ALWAYS writes 46-byte headers, which in certain situations can cause crashes. I want to add this in case someone encounters it while searching for a clue as to why NAudio WAV files start crashing certain programs.
I encountered this problem after figuring out how to convert and/or resample WAV with NAudio, was stuck for two days, and only figured it out with a hex editor.
(The two extra bytes are located at bytes 37 and 38, right before the data subchunk [d,a,t,a,size<4bytes>].)
Here is a comparison of two WAV file headers: on the left, one saved by NAudio (46 bytes); on the right, one saved by Audacity (44 bytes).
You can check this by looking at the NAudio source in WaveFormat.cs at line 310, where instead of 16 bytes for the fmt chunk, 18+extra are reserved (+extra because some WAV files contain even bigger headers than 46 bytes), yet NAudio always seems to write 46-byte headers and never 44 (the MS standard). It may also be noted that NAudio is in fact able to read 44-byte headers (line 210 in WaveFormat.cs).
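A quick way to check which variant a given file has is to read the declared fmt chunk size directly; here is a minimal sketch, assuming the canonical layout where the fmt chunk immediately follows the RIFF/WAVE header:

using System;
using System.IO;

class FmtChunkCheck
{
    static void Main(string[] args)
    {
        using (var br = new BinaryReader(File.OpenRead(args[0])))
        {
            br.BaseStream.Position = 16;  // offset of the fmt chunk size field in a canonical WAV
            int fmtSize = br.ReadInt32(); // 16 -> 44-byte header, 18 -> 46-byte header
            Console.WriteLine("fmt chunk size: " + fmtSize);
        }
    }
}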
I'm trying to play raw PCM data delivered from the ohLibSpotify C# library (https://github.com/openhome/ohLibSpotify).
I get the data in the following callback:
public void MusicDeliveryCallback(SpotifySession session, AudioFormat format, IntPtr frames, int num_frames)
{
//EXAMPLE DATA
//format.channels = 2, format.samplerate = 44100, format.sample_type = Int16NativeEndian
//frames = ?
//num_frames = 2048
}
Now I want to play the received data directly with NAudio (http://naudio.codeplex.com/). With the following code snippet I can play an MP3 file from disk. Is it possible to pass the data received from Spotify directly to NAudio and play it in real time?
using (var ms = File.OpenRead("test.pcm"))
using (var rdr = new Mp3FileReader(ms))
using (var wavStream = WaveFormatConversionStream.CreatePcmStream(rdr))
using (var baStream = new BlockAlignReductionStream(wavStream))
using (var waveOut = new WaveOut(WaveCallbackInfo.FunctionCallback()))
{
    waveOut.Init(baStream);
    waveOut.Play();
    while (waveOut.PlaybackState == PlaybackState.Playing)
    {
        Thread.Sleep(100);
    }
}
EDIT:
I updated my code. The program doesn't throw any errors, but I also can't hear any music. Is anything wrong in my code?
This is the music delivery callback:
public void MusicDeliveryCallback(SpotifySession session, AudioFormat format, IntPtr frames, int num_frames)
{
    //format.channels = 2, format.samplerate = 44100, format.sample_type = Int16NativeEndian
    //frames = ?
    //num_frames = 2048
    byte[] frames_copy = new byte[num_frames];
    Marshal.Copy(frames, frames_copy, 0, num_frames);
    bufferedWaveProvider = new BufferedWaveProvider(new WaveFormat(format.sample_rate, format.channels));
    bufferedWaveProvider.BufferDuration = TimeSpan.FromSeconds(40);
    bufferedWaveProvider.AddSamples(frames_copy, 0, num_frames);
    bufferedWaveProvider.Read(frames_copy, 0, num_frames);
    if (_waveOutDeviceInitialized == false)
    {
        IWavePlayer waveOutDevice = new WaveOut();
        waveOutDevice.Init(bufferedWaveProvider);
        waveOutDevice.Play();
        _waveOutDeviceInitialized = true;
    }
}
And these are the overridden callbacks in the SessionListener:
public override int MusicDelivery(SpotifySession session, AudioFormat format, IntPtr frames, int num_frames)
{
    _sessionManager.MusicDeliveryCallback(session, format, frames, num_frames);
    return base.MusicDelivery(session, format, frames, num_frames);
}

public override void GetAudioBufferStats(SpotifySession session, out AudioBufferStats stats)
{
    stats.samples = 2048 / 2; //???
    stats.stutter = 0; //???
}
I think you can do this:
Create a BufferedWaveProvider.
Pass this to waveOut.Init.
In your MusicDeliveryCallback, use Marshal.Copy to copy from the native buffer into a managed byte array.
Pass this managed byte array to AddSamples on your BufferedWaveProvider.
In your GetAudioBufferStats callback, use bufferedWaveProvider.BufferedBytes / 2 for "samples" and leave "stutters" as 0.
I think that will work. It involves some unnecessary copying and doesn't accurately keep track of stutters, but it's a good starting point; see the sketch below. A better (more efficient and reliable) solution might be to implement IWaveProvider and manage the buffering yourself.
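Here is a minimal sketch of those steps. The SpotifySession/AudioFormat signatures come from the question, while the InitPlayback helper and the five-second buffer duration are assumptions:

//Sketch of the suggested wiring; InitPlayback is a hypothetical helper
private BufferedWaveProvider bufferedWaveProvider;
private IWavePlayer waveOutDevice;

private void InitPlayback(AudioFormat format)
{
    //create the provider once and hand it to the output device
    bufferedWaveProvider = new BufferedWaveProvider(new WaveFormat(format.sample_rate, 16, format.channels));
    bufferedWaveProvider.BufferDuration = TimeSpan.FromSeconds(5);
    waveOutDevice = new WaveOutEvent();
    waveOutDevice.Init(bufferedWaveProvider);
    waveOutDevice.Play();
}

public void MusicDeliveryCallback(SpotifySession session, AudioFormat format, IntPtr frames, int num_frames)
{
    if (bufferedWaveProvider == null)
    {
        InitPlayback(format);
    }
    //num_frames counts sample frames, not bytes: 16-bit samples are
    //2 bytes each, and every frame holds one sample per channel
    int byteCount = num_frames * format.channels * 2;
    byte[] managed = new byte[byteCount];
    Marshal.Copy(frames, managed, 0, byteCount);
    bufferedWaveProvider.AddSamples(managed, 0, byteCount);
}

public override void GetAudioBufferStats(SpotifySession session, out AudioBufferStats stats)
{
    stats.samples = bufferedWaveProvider == null ? 0 : bufferedWaveProvider.BufferedBytes / 2;
    stats.stutter = 0;
}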
I wrote the ohLibSpotify wrapper-library, but I don't work for the same company anymore, so I'm not involved in its development anymore. You might be able to get more help from someone on this forum: http://forum.openhome.org/forumdisplay.php?fid=6 So far as music delivery goes, ohLibSpotify aims to have as little overhead as possible. It doesn't copy the music data at all, it just passes you the same native pointer that the libspotify library itself provided, so that you can copy it yourself to its final destination and avoid an unnecessary layer of copying. It does make it a bit clunky for simple usage, though.
Good luck!
First, your code snippet shown above is more complicated than it needs to be. You only need two using statements instead of five: Mp3FileReader decodes to PCM for you. Second, use WaveOutEvent in preference to WaveOut with function callbacks; it is much more reliable.
using (var rdr = new Mp3FileReader("test.pcm"))
using (var waveOut = new WaveOutEvent())
{
    //...
}
To answer your actual question: you need to use a BufferedWaveProvider. You create one of these and pass it to your output device in the Init method. Then, as you receive audio, decompress it to PCM (if it is compressed) and put it into the BufferedWaveProvider. The NAudioDemo application includes examples of how to do this, so look at the NAudio source code to see how it's done.
I finished getting the raw data from a WAV file, so now I know the information about the WAV file, such as the data rate and bits per sample.
After reading the WAV file I have several kinds of data types: 16 bits - Int16[], 8 bits - byte[].
Now I am trying to convert Int16[] to float[]!
I found the NAudio class Wave16ToFloatProvider().
I have seen Converting 16 bit to 32-bit floating point, but I couldn't follow it, because I don't need to write with WaveFileWriter.
So I tried to do it without WaveFileWriter. Here is my code.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using NAudio;
using NAudio.Wave;

namespace WaveREader
{
    class WaveReader
    {
        static void Main(string[] args)
        {
            WaveFileReader reader = new WaveFileReader("wavetest.wav");
            IWaveProvider stream32 = new Wave16ToFloatProvider(reader);
            byte[] buffer = new byte[8192];
            float[] DATASIXTEEN;
            for (int i = 0; i < buffer.Length; i++)
            {
                DATASIXTEEN = new float[buffer.Length];
                DATASIXTEEN[i] = stream32.Read(buffer, 0, buffer.Length);
            }
        }
    }
}
I think this part is wrong: DATASIXTEEN[i] = stream32.Read(buffer, 0, buffer.Length);, but I have no idea how to correct it.
Would you give me some advice or example code using Wave16ToFloatProvider?
Or could you tell me how to convert without Wave16ToFloatProvider?
The return value from Stream.Read is the count of the number of bytes read, not the sample data you're after. The data you want is in the buffer, but each 32-bit sample is spread across four 8-bit bytes.
There are a number of ways to get the data as 32-bit float.
The first is to use an ISampleProvider, which converts the data into floating point format and gives a simple way to read it in that format:
WaveFileReader reader = new WaveFileReader("wavetest.wav");
ISampleProvider provider = new Pcm16BitToSampleProvider(reader);
int blockSize = 2000;
float[] buffer = new float[blockSize];
// Read blocks of samples until no more available
int rc;
while ((rc = provider.Read(buffer, 0, blockSize)) > 0)
{
    // Process the array of samples in here.
    // rc is the number of valid samples in the buffer
    // ....
}
Alternatively, there is a method in WaveFileReader that lets you read floating point samples directly. The downside is that it reads one sample frame at a time (that is, one sample for each channel: one for mono, two for stereo), which can be time consuming. Reading and processing arrays is faster in most cases.
WaveFileReader reader = new WaveFileReader("wavetest.wav");
float[] buffer;
while ((buffer = reader.ReadNextSampleFrame()) != null)
{
    // Process samples in here.
    // buffer contains one sample per channel
    // ....
}
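And if you already have the samples in an Int16[] array, as mentioned in the question, and want to convert without Wave16ToFloatProvider, here is a minimal manual sketch; the 32768 scale factor maps the Int16 range onto [-1.0, 1.0):

// Manual Int16[] -> float[] conversion, no NAudio provider involved
static float[] ToFloatSamples(short[] samples)
{
    var result = new float[samples.Length];
    for (int i = 0; i < samples.Length; i++)
    {
        result[i] = samples[i] / 32768f; // scale into [-1.0, 1.0)
    }
    return result;
}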
I've been looking over the NAudio examples trying to work out how I can get ULAW samples suitable for packaging up as an RTP payload. I'm attempting to generate the samples from an MP3 file using the code below. Not surprisingly, since I don't really have a clue what I'm doing with NAudio, when I transmit the samples across the network to a softphone all I get is static.
Can anyone provide any direction on how I should be getting 160-byte (8kHz @ 20ms) ULAW samples from an MP3 file using NAudio?
private void GetAudioSamples()
{
    var pcmStream = WaveFormatConversionStream.CreatePcmStream(new Mp3FileReader("whitelight.mp3"));
    byte[] buffer = new byte[2];
    byte[] sampleBuffer = new byte[160];
    int sampleIndex = 0;
    int bytesRead = pcmStream.Read(buffer, 0, 2);
    while (bytesRead > 0)
    {
        var ulawByte = MuLawEncoder.LinearToMuLawSample(BitConverter.ToInt16(buffer, 0));
        sampleBuffer[sampleIndex++] = ulawByte;
        if (sampleIndex == 160)
        {
            m_rtpChannel.AddSample(sampleBuffer);
            sampleBuffer = new byte[160];
            sampleIndex = 0;
        }
        bytesRead = pcmStream.Read(buffer, 0, 2);
    }
    logger.Debug("Finished adding audio samples.");
}
Here are a few pointers. First of all, as long as you are using NAudio 1.5, there is no need for the additional WaveFormatConversionStream - Mp3FileReader's Read method already returns PCM.
However, you will not get 8kHz out of it, so you need to resample first. WaveFormatConversionStream can do this, although it uses the built-in Windows ACM sample rate conversion, which doesn't seem to filter the incoming audio well, so there could be aliasing artefacts.
Also, you should usually read bigger blocks than just two bytes at a time, as the MP3 decoder needs to decode whole frames at once (the resampler will also want to deal with bigger block sizes). I would try reading at least 20ms worth of bytes at a time.
Your use of BitConverter.ToInt16 is correct for getting the 16-bit sample value, but bear in mind that an MP3 is likely stereo, with interleaved left and right samples. Are you sure your phone expects stereo?
Finally, I recommend making a mu-law WAV file as a first step, using WaveFileWriter. Then you can easily listen to it in Windows Media Player and check that what you are sending to your softphone is what you intended.
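Here is a minimal sketch of that sanity check, reusing the file name from the question (the output name is an example):

// Convert the MP3 to 8kHz mono PCM, then to mu-law, and write a WAV
// file that can be listened to in Windows Media Player.
var pcmFormat = new WaveFormat(8000, 16, 1);
var ulawFormat = WaveFormat.CreateMuLawFormat(8000, 1);
using (var reader = new Mp3FileReader("whitelight.mp3"))
using (var pcmStm = new WaveFormatConversionStream(pcmFormat, reader))
using (var ulawStm = new WaveFormatConversionStream(ulawFormat, pcmStm))
{
    WaveFileWriter.CreateWaveFile("whitelight-ulaw.wav", ulawStm);
}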
Below is the way I eventually got it working. I do lose one of the channels from the mp3, and I guess there's some way to combine the channels as part of a conversion, but that doesn't matter for my situation.
The 160 byte buffer size gives me 20ms ulaw samples which work perfectly with the SIP softphone I'm testing with.
var pcmFormat = new WaveFormat(8000, 16, 1);
var ulawFormat = WaveFormat.CreateMuLawFormat(8000, 1);
using (WaveFormatConversionStream pcmStm = new WaveFormatConversionStream(pcmFormat, new Mp3FileReader("whitelight.mp3")))
{
    using (WaveFormatConversionStream ulawStm = new WaveFormatConversionStream(ulawFormat, pcmStm))
    {
        byte[] buffer = new byte[160];
        int bytesRead = ulawStm.Read(buffer, 0, 160);
        while (bytesRead > 0)
        {
            byte[] sample = new byte[bytesRead];
            Array.Copy(buffer, sample, bytesRead);
            m_rtpChannel.AddSample(sample);
            bytesRead = ulawStm.Read(buffer, 0, 160);
        }
    }
}