Playing a sound from a generated buffer in a Windows 8 app - C#

I'm porting some C# Windows Phone 7 apps over to Windows 8.
The phone apps used an XNA SoundEffect to play arbitrary sounds from a buffer. In the simplest cases I'd just create a sine wave of the required duration and frequency. Both the duration and frequency can vary greatly, so I'd prefer not to rely on MediaElements (unless there is some way to shift the frequency of a base file, but that would only help me with single-frequency generation).
What is the equivalent of an XNA SoundEffectInstance in WinRT?
I assume I'll need to use DirectX for this, but I'm not sure how to go about it from an otherwise C#/XAML app. I've had a look at SharpDX, but it didn't seem to have the DirectSound and SecondaryBuffer classes that I assume I'd need to use.
I've made a number of assumptions above. It may be that I'm looking for the wrong classes, or that there is an entirely different way to generate arbitrary sound in a Windows 8 app.
I found an example using XAudio2 from SharpDX to play a wav file via an AudioBuffer. This seems promising: I'd just need to substitute my generated audio buffer for the native file stream.
PM> Install-Package SharpDX
PM> Install-Package SharpDX.XAudio2
public void PlaySound()
{
    XAudio2 xaudio = new XAudio2();
    MasteringVoice masteringVoice = new MasteringVoice(xaudio);
    // Open the wav file from the app's assets
    var nativefilestream = new NativeFileStream(
        @"Assets\SpeechOn.wav",
        NativeFileMode.Open,
        NativeFileAccess.Read,
        NativeFileShare.Read);
    var soundstream = new SoundStream(nativefilestream);
    var waveFormat = soundstream.Format;
    var buffer = new AudioBuffer
    {
        Stream = soundstream.ToDataStream(),
        AudioBytes = (int)soundstream.Length,
        Flags = BufferFlags.EndOfStream
    };
    var sourceVoice = new SourceVoice(xaudio, waveFormat, true);
    // There is also support for shifting the frequency.
    sourceVoice.SetFrequencyRatio(0.5f);
    sourceVoice.SubmitSourceBuffer(buffer, soundstream.DecodedPacketsInfo);
    sourceVoice.Start();
}

The only way to generate dynamic sound in WinRT is to use XAudio2, so you should be able to do this with SharpDX.XAudio2.
Instead of using NativeFileStream, just instantiate a DataStream directly, giving it your managed buffer (or you can use an unmanaged buffer, or let DataStream instantiate one for you). The code would be like this:
// Initialization phase, keep this buffer during the life of your application
// Allocate 10s at 44.1kHz of stereo 16-bit samples
var myBufferOfSamples = new short[44100 * 10 * 2];
// Create a DataStream with a pinned managed buffer
var dataStream = DataStream.Create(myBufferOfSamples, true, true);
var buffer = new AudioBuffer
{
    Stream = dataStream,
    AudioBytes = (int)dataStream.Length,
    Flags = BufferFlags.EndOfStream
};
//...
// Fill myBufferOfSamples
//...
// PCM 44.1kHz stereo 16-bit format
var waveFormat = new WaveFormat();
XAudio2 xaudio = new XAudio2();
MasteringVoice masteringVoice = new MasteringVoice(xaudio);
var sourceVoice = new SourceVoice(xaudio, waveFormat, true);
// Submit the buffer
sourceVoice.SubmitSourceBuffer(buffer, null);
// Start playing
sourceVoice.Start();
Sample method to fill the buffer with a Sine wave:
private void FillBuffer(short[] buffer, int sampleRate, double frequency)
{
    int frame = 0; // one frame = one left sample + one right sample
    for (int i = 0; i < buffer.Length - 1; i += 2)
    {
        double time = (double)frame / (double)sampleRate;
        short currentSample = (short)(Math.Sin(2 * Math.PI * frequency * time) * short.MaxValue);
        buffer[i] = currentSample;     // left channel
        buffer[i + 1] = currentSample; // right channel
        frame++; // advance one sample period per stereo frame
    }
}
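For example, to put a 440 Hz tone into the 10-second stereo buffer allocated in the initialization phase above (440 Hz is an arbitrary choice):
// Fill the buffer allocated earlier with a 440 Hz sine tone before submitting it
FillBuffer(myBufferOfSamples, 44100, 440.0);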

You can also use WASAPI to play dynamically generated sound buffers in WinRT (XAudio2 isn't the only solution).
I wrote sample code for it in VB here (the C# will be essentially the same):
http://www.codeproject.com/Articles/460145/Recording-and-playing-PCM-audio-on-Windows-8-VB
I believe the NAudio author is planning to translate and incorporate my sample code into NAudio for a Win8-supported version, so that will be easier to use.

Related

NAudio playing 32 bit float IEEE

I have an application which is generating 32-bit floats (big endian). If I write these to a file and then open the file in Audacity, it plays correctly.
I am trying to play the stream using NAudio. If I create a WaveFormat of 24k samples, 32-bit and 2 channels, I do hear noise (although, since it's the wrong format, the stream isn't rendering properly, of course). If I create the correct format (IeeeFloatWave), I don't hear anything at all. I know the samples are arriving correctly, as I can save them to disk, but I just can't play them. Can anybody see what I'm doing wrong?
Updated and solved: the big-endian-to-little-endian conversion is done in the feeding routine, and a WaveFloatTo16Provider converts the format before playing.
private bool _streamActive = false;
private BufferedWaveProvider bufferedWaveProvider = null;
private WaveFloatTo16Provider waveFloatTo16Provider = null;
private WaveOut waveOut = null;
private WaveFormat waveFormat = null;
// ProcessSound is fed incoming byte packets
// (28 bytes header plus 1024 bytes audio)
// by a background thread.
// Data is converted from big to little endian in the background thread.
public void ProcessSound(byte[] rxData)
{
    // get data length
    int datalen = rxData.Length;
    // check whether the player is already active
    if (_streamActive == true)
    {
        // add samples to buffer ('28' allows for header information)
        bufferedWaveProvider.AddSamples(rxData, 28, datalen - 28);
        return;
    }
    // start it going
    waveFormat = WaveFormat.CreateIeeeFloatWaveFormat(24000, 2);
    // create buffer to allow samples to be added
    bufferedWaveProvider = new BufferedWaveProvider(waveFormat);
    // convert from 32-bit float to 16-bit PCM
    waveFloatTo16Provider = new WaveFloatTo16Provider(bufferedWaveProvider);
    // add samples to buffer
    bufferedWaveProvider.AddSamples(rxData, 28, datalen - 28);
    // create waveOut player
    waveOut = new WaveOut();
    waveOut.Init(waveFloatTo16Provider);
    waveOut.Volume = 0.25f;
    waveOut.Play();
    // mark stream as active
    _streamActive = true;
}
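The endian conversion itself is not shown; a minimal sketch of what it might look like (the helper name is hypothetical), reversing each 4-byte float in place before the audio bytes reach AddSamples:
private static void SwapFloat32Endianness(byte[] data, int offset, int count)
{
    // Reverse each 4-byte group in place: big-endian float -> little-endian
    for (int i = offset; i + 3 < offset + count; i += 4)
    {
        byte t = data[i];
        data[i] = data[i + 3];
        data[i + 3] = t;
        t = data[i + 1];
        data[i + 1] = data[i + 2];
        data[i + 2] = t;
    }
}
Called on the audio portion of each packet (offset 28) before AddSamples, this makes the bytes match the little-endian layout BitConverter and NAudio expect.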

Audio Encoding conversion problems with PCM 32-bit to PCM 16-bit

I am using C# in a Universal Windows App to write a Watson speech-to-text service.
For now, instead of using the Watson service, I write to a file and then read it in Audacity to confirm it is in the right format, since the Watson service wasn't returning correct responses to me, and the following explains why.
For some reason, when I create 16-bit PCM encoding properties and read the buffer, I am only able to read the data as 32-bit PCM. That works well, but if I read it as 16-bit PCM it is in slow motion, and all the speech is basically corrupted.
I don't really know what exactly needs to be done to convert from 32-bit to 16-bit, but here's what I have in my C# application:
//Creating PCM Encoding properties
var pcmEncoding = AudioEncodingProperties.CreatePcm(16000, 1, 16);
var result = await AudioGraph.CreateAsync(
    new AudioGraphSettings(AudioRenderCategory.Speech)
    {
        DesiredRenderDeviceAudioProcessing = AudioProcessing.Raw,
        AudioRenderCategory = AudioRenderCategory.Speech,
        EncodingProperties = pcmEncoding
    }
);
graph = result.Graph;
//Initialize microphone
var microphone = await DeviceInformation.CreateFromIdAsync(MediaDevice.GetDefaultAudioCaptureId(AudioDeviceRole.Default));
var micInputResult = await graph.CreateDeviceInputNodeAsync(MediaCategory.Speech, pcmEncoding, microphone);
//Create frame output node
frameOutputNode = graph.CreateFrameOutputNode(pcmEncoding);
//Callback function to fire when buffer is filled with data
graph.QuantumProcessed += (s, a) => ProcessFrameOutput(frameOutputNode.GetFrame());
frameOutputNode.Start();
//Make the microphone write into the frame node
micInputResult.DeviceInputNode.AddOutgoingConnection(frameOutputNode);
micInputResult.DeviceInputNode.Start();
graph.Start();
Initialization is done at this stage. Now, actually reading from the buffer and writing to a file only works if I use 32-bit PCM encoding with the following function (commented out is the 16-bit PCM code that results in slow-motion speech output):
private void ProcessFrameOutput(AudioFrame frame)
{
    //Making a copy of the audio frame buffer
    var audioBuffer = frame.LockBuffer(AudioBufferAccessMode.Read);
    var buffer = Windows.Storage.Streams.Buffer.CreateCopyFromMemoryBuffer(audioBuffer);
    buffer.Length = audioBuffer.Length;
    using (var dataReader = DataReader.FromBuffer(buffer))
    {
        dataReader.ByteOrder = ByteOrder.LittleEndian;
        byte[] byteData = new byte[buffer.Length];
        int pos = 0;
        while (dataReader.UnconsumedBufferLength > 0)
        {
            /*Reading Float -> Int 32*/
            /*With this code I can import the raw wav file into Audacity
              using Signed 32-bit PCM encoding, and it is working well*/
            var singleTmp = dataReader.ReadSingle();
            var int32Tmp = (Int32)(singleTmp * Int32.MaxValue);
            byte[] chunkBytes = BitConverter.GetBytes(int32Tmp);
            byteData[pos++] = chunkBytes[0];
            byteData[pos++] = chunkBytes[1];
            byteData[pos++] = chunkBytes[2];
            byteData[pos++] = chunkBytes[3];
            /*Reading Float -> Int 16 (Slow Motion)*/
            /*With this code I can import the raw wav file into Audacity
              using Signed 16-bit PCM encoding, but when I play it, it's in
              slow motion*/
            //var singleTmp = dataReader.ReadSingle();
            //var int16Tmp = (Int16)(singleTmp * Int16.MaxValue);
            //byte[] chunkBytes = BitConverter.GetBytes(int16Tmp);
            //byteData[pos++] = chunkBytes[0];
            //byteData[pos++] = chunkBytes[1];
        }
        WriteBytesToFile(byteData);
    }
}
Can anyone think of a reason why this is happening? Is it because Int32 PCM is larger in size, so when I use Int16 it stretches the data and makes the sound longer? Or am I not sampling it properly?
Note: I tried reading bytes directly from the buffer and then using that as raw data, but it's not encoded as PCM that way. Reading Int16/Int32 from the buffer directly also doesn't work.
In the above example I am only using a frame output node. If I create a file output node that automatically writes to a raw file, it works really well as 16-bit PCM, so something is wrong in my callback function that causes the slow motion.
Thanks
//Creating PCM Encoding properties
var pcmEncoding = AudioEncodingProperties.CreatePcm(16000, 1, 16);
var result = await AudioGraph.CreateAsync(
    new AudioGraphSettings(AudioRenderCategory.Speech)
    {
        DesiredRenderDeviceAudioProcessing = AudioProcessing.Raw,
        AudioRenderCategory = AudioRenderCategory.Speech,
        EncodingProperties = pcmEncoding
    }
);
graph = result.Graph;
pcmEncoding does not make much sense here, since only float encoding is supported by AudioGraph.
byte[] byteData = new byte[buffer.Length];
It should be buffer.Length / 2, since you are converting from float data with 4 bytes per sample to Int16 data with 2 bytes per sample.
/*Reading Float -> Int 16 (Slow Motion)*/
/*With this code I can import the raw wav file into Audacity
  using Signed 16-bit PCM encoding, but when I play it, it's in
  slow motion*/
var singleTmp = dataReader.ReadSingle();
var int16Tmp = (Int16)(singleTmp * Int16.MaxValue);
byte[] chunkBytes = BitConverter.GetBytes(int16Tmp);
byteData[pos++] = chunkBytes[0];
byteData[pos++] = chunkBytes[1];
This code is correct and should work. Your "slow motion" is most likely related to the buffer size you incorrectly set before.
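Putting the two corrections together, the 16-bit conversion would look roughly like this sketch (same names as the question):
byte[] byteData = new byte[buffer.Length / 2]; // 4-byte floats in, 2-byte Int16 samples out
int pos = 0;
while (dataReader.UnconsumedBufferLength > 0)
{
    var singleTmp = dataReader.ReadSingle();
    var int16Tmp = (Int16)(singleTmp * Int16.MaxValue);
    byte[] chunkBytes = BitConverter.GetBytes(int16Tmp);
    byteData[pos++] = chunkBytes[0];
    byteData[pos++] = chunkBytes[1];
}
WriteBytesToFile(byteData);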
I must admit Microsoft needs someone to review their bloated APIs.

How to know if audio is going dead in NAudio C#

I am trying to find out when a song goes dead (inaudible sound for a few seconds). I am using the NAudio library in C#. So far I am able to get the PCM data and plot the amplitude of the audio, and I am guessing at dead audio from the amplitude I obtain. But I am a bit confused about audio channels. Following is the piece of code I wrote.
NAudio.Wave.WaveChannel32 wave = new NAudio.Wave.WaveChannel32(new NAudio.Wave.WaveFileReader(open.FileName));
int songLength = (int)wave.Length;
byte[] songPCM = new byte[songLength];
int sampleRate = (int)wave.WaveFormat.SampleRate;
int bitsPerSample = (int)wave.WaveFormat.BitsPerSample;
int numChannels = (int)wave.WaveFormat.Channels;
wave.Read(songPCM, 0, songLength);
// WaveChannel32 yields 32-bit float samples: 8 bytes per stereo frame
double[] _waveLeft = new double[songLength / 8];
double[] _waveRight = new double[songLength / 8];
System.IO.StreamWriter fileoutLeft = new System.IO.StreamWriter("E:\\LOutputSongPCM.dat", true);
System.IO.StreamWriter fileoutRight = new System.IO.StreamWriter("E:\\ROutputSongPCM.dat", true);
int h = 0;
for (int i = 0; i < songLength; i += 8)
{
    _waveLeft[h] = (double)BitConverter.ToSingle(songPCM, i);
    _waveRight[h] = (double)BitConverter.ToSingle(songPCM, i + 4);
    chart1.Series["wave"].Points.Add(_waveLeft[h]);
    //chart1.Series["wave"].Points.Add(_waveRight[h]);
    fileoutLeft.WriteLine(_waveLeft[h]);
    fileoutRight.WriteLine(_waveRight[h]);
    h++;
}
fileoutLeft.Close();
fileoutRight.Close();
Now, for this piece of code, I know the audio is 2-channel. I referred to many links and threads and got confused about whether I am reading the PCM data for each channel correctly. I compared the plots of each channel and they look good (matching the original song), but I am not sure about their accuracy. Can you guide me on getting the exact raw data for any channel, for mono, stereo and 5.1?
Thanks.
You'll find it easier to get at the samples by using the AudioFileReader class, whose Read method takes a float array. The samples are stored interleaved for multi-channel audio, so for stereo, you'll get left sample, then right sample, then another left and so on.
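A minimal sketch of that approach, extended with a crude dead-audio check (the fileName variable, the 0.001 threshold and the one-second window are assumptions, not part of the answer):
using System;
using NAudio.Wave;

var reader = new AudioFileReader(fileName);          // samples come back as 32-bit floats
int channels = reader.WaveFormat.Channels;           // works for mono, stereo, 5.1, ...
float[] window = new float[reader.WaveFormat.SampleRate * channels]; // ~1 second
int read, second = 0;
while ((read = reader.Read(window, 0, window.Length)) > 0)
{
    float peak = 0f;
    for (int i = 0; i + channels <= read; i += channels) // one interleaved frame at a time
        for (int ch = 0; ch < channels; ch++)
            peak = Math.Max(peak, Math.Abs(window[i + ch]));
    if (peak < 0.001f)                                   // arbitrary "dead audio" threshold
        Console.WriteLine("Dead audio around {0}s", second);
    second++;
}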

Play dynamically-created simple sounds in C# without external libraries

I need to be able to dynamically generate a waveform and play it, using C#, without any external libraries and without having to store sound files on the hard disk. Latency isn't an issue; the sounds would be generated well before they are needed by the application.
Actually the Console.Beep() method might meet my needs if it weren't for the fact that Microsoft says it isn't supported in 64-bit versions of Windows.
If I generate my own sound dynamically, I can get something fancier than a simple beep. For example, I could make a waveform from a triangle wave that increases in frequency from 2 kHz to 4 kHz while decaying in volume. I don't need fancy 16-bit stereo; 8-bit mono is fine. I don't need dynamic control over volume and pitch either; I basically want to generate a sound file in memory and play it without storing it.
The last time I needed to generate sounds was many years ago, on the Apple II, on HP workstations, and on my old Amiga. I haven't needed to do it since, and something as simple as what I describe seems to have gotten a lot more complicated. I'm having trouble believing that something so simple is so hard. Most of the answers I see refer to NAudio or similar libraries, and that isn't an option for this project (aside from the fact that pulling in an entire library just to play a tone seems like a waste).
Based on one of the links in the answers I received, and some other pages I found about .wav header formats, here is my working code for a little class that generates an 8-bit "ding!" sound with a user-specified frequency and duration. It's basically a beep that decays linearly to zero in amplitude during the specified duration.
public class AlertDing {
    private SoundPlayer player = null;
    private BinaryWriter writer = null;
    /// <summary>
    /// Dynamically generate a "ding" sound and save it to a memory stream
    /// </summary>
    /// <param name="freq">Frequency in Hertz, e.g. 880</param>
    /// <param name="tenthseconds">Duration in multiples of 1/10 second</param>
    public AlertDing(double freq, uint tenthseconds) {
        string header_GroupID = "RIFF";  // RIFF
        uint header_FileLength = 0;      // total file length minus 8, which is taken up by RIFF
        string header_RiffType = "WAVE"; // always WAVE
        string fmt_ChunkID = "fmt ";     // Four bytes: "fmt "
        uint fmt_ChunkSize = 16;         // Length of header in bytes
        ushort fmt_FormatTag = 1;        // 1 for PCM
        ushort fmt_Channels = 1;         // Number of channels, 2=stereo
        uint fmt_SamplesPerSec = 14000;  // sample rate, e.g. CD=44100
        ushort fmt_BitsPerSample = 8;    // bits per sample
        ushort fmt_BlockAlign =
            (ushort)(fmt_Channels * (fmt_BitsPerSample / 8)); // sample frame size, in bytes
        uint fmt_AvgBytesPerSec =
            fmt_SamplesPerSec * fmt_BlockAlign; // for estimating RAM allocation
        string data_ChunkID = "data";    // "data"
        uint data_ChunkSize;             // Length of data in bytes
        byte[] data_ByteArray;
        // Fill the data array with sample data
        // Number of samples = sample rate * channels * duration in seconds
        uint numSamples = fmt_SamplesPerSec * fmt_Channels * tenthseconds / 10;
        data_ByteArray = new byte[numSamples];
        //int amplitude = 32760, offset = 0; // for 16-bit audio
        int amplitude = 127, offset = 128;   // for 8-bit audio
        double period = (2.0 * Math.PI * freq) / (fmt_SamplesPerSec * fmt_Channels);
        double amp;
        for (uint i = 0; i < numSamples - 1; i += fmt_Channels) {
            amp = amplitude * (double)(numSamples - i) / numSamples; // amplitude decay
            // Fill each channel with a waveform with amplitude decay
            for (int channel = 0; channel < fmt_Channels; channel++) {
                data_ByteArray[i + channel] = Convert.ToByte(amp * Math.Sin(i * period) + offset);
            }
        }
        // Calculate file and data chunk size in bytes
        data_ChunkSize = (uint)(data_ByteArray.Length * (fmt_BitsPerSample / 8));
        header_FileLength = 4 + (8 + fmt_ChunkSize) + (8 + data_ChunkSize);
        // Write data to a MemoryStream with BinaryWriter
        MemoryStream audioStream = new MemoryStream();
        writer = new BinaryWriter(audioStream); // assign the field, not a local, so Dispose can close it
        // Write the header
        writer.Write(header_GroupID.ToCharArray());
        writer.Write(header_FileLength);
        writer.Write(header_RiffType.ToCharArray());
        // Write the format chunk
        writer.Write(fmt_ChunkID.ToCharArray());
        writer.Write(fmt_ChunkSize);
        writer.Write(fmt_FormatTag);
        writer.Write(fmt_Channels);
        writer.Write(fmt_SamplesPerSec);
        writer.Write(fmt_AvgBytesPerSec);
        writer.Write(fmt_BlockAlign);
        writer.Write(fmt_BitsPerSample);
        // Write the data chunk
        writer.Write(data_ChunkID.ToCharArray());
        writer.Write(data_ChunkSize);
        foreach (byte dataPoint in data_ByteArray) {
            writer.Write(dataPoint);
        }
        player = new SoundPlayer(audioStream);
    }
    /// <summary>
    /// Call this to clean up when the program is done using this sound
    /// </summary>
    public void Dispose() {
        if (writer != null) writer.Close();
        if (player != null) player.Dispose();
        writer = null;
        player = null;
    }
    /// <summary>
    /// Play the "ding" sound
    /// </summary>
    public void Play() {
        if (player != null) {
            player.Stream.Seek(0, SeekOrigin.Begin); // rewind stream
            player.Play();
        }
    }
}
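Usage is then just construction and playback; for example, an 880 Hz ding lasting half a second:
var ding = new AlertDing(880.0, 5); // 880 Hz, 5 tenths of a second
ding.Play();
// ... when the sound is no longer needed:
ding.Dispose();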
Hopefully this should help others who are trying to produce a simple alert sound dynamically without needing a sound file.
The following article explains how a *.wav file can be generated and played using SoundPlayer. Note that SoundPlayer can take a stream as an argument, so you can generate the wav-file contents in a MemoryStream and avoid saving to a file.
http://blogs.msdn.com/b/dawate/archive/2009/06/24/intro-to-audio-programming-part-3-synthesizing-simple-wave-audio-using-c.aspx
I tried out the code snippet from Anachronist (2012-10) and it is working for me.
The biggest hurdle for me was getting rid of the systematic clicking noise at the end of the "AlertDing" wav. It is caused by a soft bug in the snippet:
for (uint i = 0; i < numSamples - 1; i += fmt_Channels)
needs to change to
for (uint i = 0; i < numSamples; i += fmt_Channels)
If this is not changed, a systematic zero is generated at the end of each play, causing a sharp clicking noise (the amplitude jumps 0 -> min -> 0).
The original question implies "without clicking noise", of course :)

Use NAudio to get Ulaw samples for RTP

I've been looking over the NAudio examples trying to work out how I can get ulaw samples suitable for packaging up as an RTP payload. I'm attempting to generate the samples from an MP3 file using the code below. Not surprisingly, since I don't really have a clue what I'm doing with NAudio, when I transmit the samples across the network to a softphone all I get is static.
Can anyone provide any direction on how I should be getting 160-byte (8kHz @ 20ms) ulaw samples from an MP3 file using NAudio?
private void GetAudioSamples()
{
    var pcmStream = WaveFormatConversionStream.CreatePcmStream(new Mp3FileReader("whitelight.mp3"));
    byte[] buffer = new byte[2];
    byte[] sampleBuffer = new byte[160];
    int sampleIndex = 0;
    int bytesRead = pcmStream.Read(buffer, 0, 2);
    while (bytesRead > 0)
    {
        var ulawByte = MuLawEncoder.LinearToMuLawSample(BitConverter.ToInt16(buffer, 0));
        sampleBuffer[sampleIndex++] = ulawByte;
        if (sampleIndex == 160)
        {
            m_rtpChannel.AddSample(sampleBuffer);
            sampleBuffer = new byte[160];
            sampleIndex = 0;
        }
        bytesRead = pcmStream.Read(buffer, 0, 2);
    }
    logger.Debug("Finished adding audio samples.");
}
Here are a few pointers. First of all, as long as you are using NAudio 1.5, there is no need for the additional WaveFormatConversionStream: Mp3FileReader's Read method already returns PCM.
However, you will not be getting 8kHz out of it, so you need to resample first. WaveFormatConversionStream can do this, although it uses the built-in Windows ACM sample rate conversion, which doesn't seem to filter the incoming audio well, so there could be aliasing artefacts.
Also, you should usually read bigger blocks than just two bytes at a time, as the MP3 decoder needs to decode frames one at a time (the resampler will also want to deal with bigger block sizes). I would try reading at least 20ms worth of bytes at a time.
Your use of BitConverter.ToInt16 is correct for getting the 16-bit sample value, but bear in mind that an MP3 is likely stereo, with interleaved left and right samples. Are you sure your phone expects stereo?
Finally, I recommend making a mu-law WAV file as a first step, using WaveFileWriter. Then you can easily listen to it in Windows Media Player and check that what you are sending to your softphone is what you intended.
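A sketch of that verification step (the output file name is an arbitrary choice; the conversion chain mirrors the solution below):
var pcmFormat = new WaveFormat(8000, 16, 1);
var ulawFormat = WaveFormat.CreateMuLawFormat(8000, 1);
using (var pcmStm = new WaveFormatConversionStream(pcmFormat, new Mp3FileReader("whitelight.mp3")))
using (var ulawStm = new WaveFormatConversionStream(ulawFormat, pcmStm))
{
    // Writes a playable mu-law WAV that can be sanity-checked in a media player
    WaveFileWriter.CreateWaveFile("whitelight-ulaw.wav", ulawStm);
}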
Below is the way I eventually got it working. I do lose one of the channels from the MP3; I guess there's some way to combine the channels as part of a conversion, but that doesn't matter for my situation.
The 160-byte buffer size gives me 20ms ulaw samples, which work perfectly with the SIP softphone I'm testing with.
var pcmFormat = new WaveFormat(8000, 16, 1);
var ulawFormat = WaveFormat.CreateMuLawFormat(8000, 1);
using (WaveFormatConversionStream pcmStm = new WaveFormatConversionStream(pcmFormat, new Mp3FileReader("whitelight.mp3")))
{
    using (WaveFormatConversionStream ulawStm = new WaveFormatConversionStream(ulawFormat, pcmStm))
    {
        byte[] buffer = new byte[160];
        int bytesRead = ulawStm.Read(buffer, 0, 160);
        while (bytesRead > 0)
        {
            byte[] sample = new byte[bytesRead];
            Array.Copy(buffer, sample, bytesRead);
            m_rtpChannel.AddSample(sample);
            bytesRead = ulawStm.Read(buffer, 0, 160);
        }
    }
}
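On combining the channels: one option is to resample to 8kHz stereo PCM first, mix the interleaved 16-bit samples down to mono yourself, and then feed the mixed samples through MuLawEncoder.LinearToMuLawSample as in the original attempt. A hypothetical mixing helper (not part of the solution above):
// Average interleaved 16-bit stereo PCM down to mono
private static byte[] MixStereoToMono(byte[] stereo, int bytesRead)
{
    byte[] mono = new byte[bytesRead / 2];
    for (int i = 0, o = 0; i + 3 < bytesRead; i += 4, o += 2)
    {
        short left = BitConverter.ToInt16(stereo, i);
        short right = BitConverter.ToInt16(stereo, i + 2);
        short mixed = (short)((left + right) / 2); // no overflow: sum is done in int
        byte[] b = BitConverter.GetBytes(mixed);
        mono[o] = b[0];
        mono[o + 1] = b[1];
    }
    return mono;
}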
