I have a video that I'm trying to play back using the FFmpeg framework, with NAudio playing back the audio frames. The video plays back just fine, but when I play the audio through a BufferedWaveProvider, the underlying audio sounds correct yet it has a ton of static/noise. What's important to note in the attached code is that when events fire from the managed C++ class, a C# class receives the event and then grabs the latest relevant AVFrame data. This works flawlessly for video, but as I said, when I try to play back the audio I get a lot of noise. For now I've hardcoded the NAudio WaveFormat settings, but they should match the video's audio.
///////////////////////// C# /////////////////////////////
// Gets the size of the data from the audio AVFrame (audioFrame->linesize[0])
int size = _Decoder.GetAVAudioFrameSize();
if (!once)
{
managedArray = new byte[size];
once = true;
}
// Gets the data array from the audio AVFrame
IntPtr data = _Decoder.GetAVAudioFrameData();
Marshal.Copy(data, managedArray, 0, size);
if (_waveProvider == null)
{
_waveProvider = new BufferedWaveProvider(new WaveFormat(44100, 16, 2));
_waveOut.NumberOfBuffers = 2;
_waveOut.DesiredLatency = 300;
_waveOut.Init(_waveProvider);
_waveOut.Play();
}
_waveProvider.AddSamples(managedArray, 0, size);
//////////////////////// Managed C++ /////////////////////////
while (_shouldRead)
{
// if we're not initialized, sleep
if (!_initialized || !_FFMpegObject->AVFormatContextReady())
{
Thread::Sleep(READ_INIT_WAIT_TIME);
}
else if (_sequentialFailedReadCount > MAX_SEQUENTIAL_READ_FAILS)
{
// we've failed a lot and probably lost the stream, try to reconnect.
_sequentialFailedReadCount = 0;
_initialized = false;
StartInitThread();
Thread::Sleep(READ_INIT_WAIT_TIME << 1);
}
else // otherwise, try reading one AV packet
{
if (_FFMpegObject->AVReadFrame())
{
if (_FFMpegObject->GetAVPacketStreamIndex() == _videoStreamIndex)
{
_sequentialFailedReadCount = 0;
// decode the video packet
_frameLock->WaitOne();
frameFinished = _FFMpegObject->AVCodecDecodeVideo2();
_frameLock->ReleaseMutex();
if (frameFinished)
{
_framesDecoded++;
if (!_isPaused) // If paused and AFAP playback, just don't call the callback
{
FrameFinished(this, EventArgs::Empty);
Thread::Sleep((int)(1000 / (GetFrameRate() * _speed)));
}
}
}
else if (_FFMpegObject->GetAVPacketStreamIndex() == _audioStreamIndex)
{
// decode the audio packet.
_frameLock->WaitOne();
audioFrameFinished = _FFMpegObject->AVCodecDecodeAudio4();
_frameLock->ReleaseMutex();
if (audioFrameFinished)
{
if (!_isPaused) // If paused and AFAP playback, just don't call the callback
{
// Fire the event - leaving it up to video to do the sleeps -- not sure if that will work or not
AudioFrameFinished(this, EventArgs::Empty);
}
}
}
_FFMpegObject->AVFreeFramePacket();
}
else // failed to read an AV packet
{
_sequentialFailedReadCount++;
_failedReadCount++;
Thread::Sleep(READ_FAILED_WAIT_TIME);
}
}
}
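One thing I suspect but have not verified: FFmpeg audio decoders often output 32-bit float samples (for example AV_SAMPLE_FMT_FLT, or planar AV_SAMPLE_FMT_FLTP), while the BufferedWaveProvider above is configured for interleaved 16-bit PCM. If the raw AVFrame bytes are floats, feeding them straight to a 16-bit WaveFormat produces exactly this kind of static. A minimal conversion sketch, assuming the marshaled data is already interleaved floats (the helper name is made up for illustration):
// Hypothetical helper: converts interleaved 32-bit float samples into interleaved
// 16-bit PCM bytes matching new WaveFormat(44100, 16, 2). Planar (FLTP) data would
// additionally need its channel planes interleaved before this step.
private static byte[] FloatToPcm16(float[] samples)
{
    byte[] pcm = new byte[samples.Length * 2];
    for (int i = 0; i < samples.Length; i++)
    {
        float clamped = Math.Max(-1f, Math.Min(1f, samples[i])); // clamp to [-1, 1]
        short s = (short)(clamped * short.MaxValue);             // scale to Int16 range
        pcm[2 * i] = (byte)(s & 0xFF);                           // little-endian low byte
        pcm[2 * i + 1] = (byte)((s >> 8) & 0xFF);                // high byte
    }
    return pcm;
}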
Hey,
I'm trying to capture my screen and send/communicate the stream via MR-WebRTC. Communication between two PCs or between a PC and a HoloLens worked with webcams for me, so I thought the next step could be streaming my screen. So I took the UWP application that I already had, which worked with my webcam, and tried to make things work.
Setup:
The UWP app is based on the example UWP app from MR-WebRTC.
For capturing I'm using the Microsoft instructions about screen capturing via GraphicsCapturePicker.
So now I'm stuck in the following situation:
I get a frame from the screen capture, but its type is Direct3D11CaptureFrame. You can see it below in the code snippet.
MR-WebRTC takes a frame of type I420AVideoFrame (also in a code snippet).
How can I "connect" them?
I420AVideoFrame wants a frame in the I420A format (YUV 4:2:0).
Configuring the framePool I can set the DirectXPixelFormat, but it has no YUV 4:2:0 option.
I found this post on SO saying that it is possible.
Code snippet, frame from Direct3D:
_framePool = Direct3D11CaptureFramePool.Create(
_canvasDevice, // D3D device
DirectXPixelFormat.B8G8R8A8UIntNormalized, // Pixel format
3, // Number of frames
_item.Size); // Size of the buffers
_session = _framePool.CreateCaptureSession(_item);
_session.StartCapture();
_framePool.FrameArrived += (s, a) =>
{
using (var frame = _framePool.TryGetNextFrame())
{
// Here I would take the Frame and call the MR-WebRTC method LocalI420AFrameReady
}
};
Code snippet, frame from WebRTC:
// This is the way with the webcam; so LocalI420 was subscribed to
// the event I420AVideoFrameReady and got the frame from there
_webcamSource = await DeviceVideoTrackSource.CreateAsync();
_webcamSource.I420AVideoFrameReady += LocalI420AFrameReady;
// enqueueing the newly captured video frames into the bridge,
// which will later deliver them when the Media Foundation
// playback pipeline requests them.
private void LocalI420AFrameReady(I420AVideoFrame frame)
{
lock (_localVideoLock)
{
if (!_localVideoPlaying)
{
_localVideoPlaying = true;
// Capture the resolution into local variable useable from the lambda below
uint width = frame.width;
uint height = frame.height;
// Defer UI-related work to the main UI thread
RunOnMainThread(() =>
{
// Bridge the local video track with the local media player UI
int framerate = 30; // assumed, for lack of an actual value
_localVideoSource = CreateI420VideoStreamSource(
width, height, framerate);
var localVideoPlayer = new MediaPlayer();
localVideoPlayer.Source = MediaSource.CreateFromMediaStreamSource(
_localVideoSource);
localVideoPlayerElement.SetMediaPlayer(localVideoPlayer);
localVideoPlayer.Play();
});
}
}
// Enqueue the incoming frame into the video bridge; the media player will
// later dequeue it as soon as it's ready.
_localVideoBridge.HandleIncomingVideoFrame(frame);
}
I found a solution for my problem by creating an issue on the GitHub repo. The answer was provided by KarthikRichie:
You have to use the ExternalVideoTrackSource.
You can convert from the Direct3D11CaptureFrame to an Argb32VideoFrame.
// Setting up external video track source
_screenshareSource = ExternalVideoTrackSource.CreateFromArgb32Callback(FrameCallback);
struct WebRTCFrameData
{
public IntPtr Data;
public uint Height;
public uint Width;
public int Stride;
}
public void FrameCallback(in FrameRequest frameRequest)
{
try
{
if (FramePool != null)
{
using (Direct3D11CaptureFrame _currentFrame = FramePool.TryGetNextFrame())
{
if (_currentFrame != null)
{
WebRTCFrameData webRTCFrameData = ProcessBitmap(_currentFrame.Surface).Result; // blocks the callback on the async bitmap copy
frameRequest.CompleteRequest(new Argb32VideoFrame()
{
data = webRTCFrameData.Data,
height = webRTCFrameData.Height,
width = webRTCFrameData.Width,
stride = webRTCFrameData.Stride
});
}
}
}
}
catch (Exception ex)
{
// exceptions from the conversion are swallowed here; consider at least logging ex
}
}
private async Task<WebRTCFrameData> ProcessBitmap(IDirect3DSurface surface)
{
SoftwareBitmap softwareBitmap = await SoftwareBitmap.CreateCopyFromSurfaceAsync(surface, Windows.Graphics.Imaging.BitmapAlphaMode.Straight);
byte[] imageBytes = new byte[4 * softwareBitmap.PixelWidth * softwareBitmap.PixelHeight];
softwareBitmap.CopyToBuffer(imageBytes.AsBuffer());
WebRTCFrameData argb32VideoFrame = new WebRTCFrameData();
argb32VideoFrame.Data = GetByteIntPtr(imageBytes);
argb32VideoFrame.Height = (uint)softwareBitmap.PixelHeight;
argb32VideoFrame.Width = (uint)softwareBitmap.PixelWidth;
var test = softwareBitmap.LockBuffer(BitmapBufferAccessMode.Read);
int count = test.GetPlaneCount();
var pl = test.GetPlaneDescription(count - 1);
argb32VideoFrame.Stride = pl.Stride;
return argb32VideoFrame;
}
private IntPtr GetByteIntPtr(byte[] byteArr)
{
// Note: UnsafeAddrOfPinnedArrayElement does not pin the array itself; the caller must
// keep it alive (and ideally pinned) for as long as the pointer is used.
IntPtr intPtr2 = System.Runtime.InteropServices.Marshal.UnsafeAddrOfPinnedArrayElement(byteArr, 0);
return intPtr2;
}
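A side note on GetByteIntPtr (my own addition, not from KarthikRichie's answer): because Marshal.UnsafeAddrOfPinnedArrayElement does not actually pin the managed array, the GC is free to move it while WebRTC is still reading from the pointer. A safer sketch pins the buffer with GCHandle (requires System.Runtime.InteropServices; the member names are illustrative):
private GCHandle _pinnedFrameHandle; // keeps the byte[] pinned while the frame is in use
private IntPtr PinBuffer(byte[] byteArr)
{
    // pin the array so its address stays valid until FreePinnedBuffer is called
    _pinnedFrameHandle = GCHandle.Alloc(byteArr, GCHandleType.Pinned);
    return _pinnedFrameHandle.AddrOfPinnedObject();
}
private void FreePinnedBuffer()
{
    if (_pinnedFrameHandle.IsAllocated)
    {
        _pinnedFrameHandle.Free();
    }
}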
I am processing frames received from a Kinect v2 (Color and IR) in UWP. The program runs on a remote machine (Xbox One S). The main goal is to get the frames and write them to disk at 30 fps for Color and IR, to process them further later.
I am using the following code to check the frame rate:
public MainPage()
{
this.InitialiseFrameReader(); // initialises MediaCapture for IR and Color
}
const int COLOR_SOURCE = 0;
const int IR_SOURCE = 1;
private async void InitialiseFrameReader()
{
await CleanupMediaCaptureAsync();
var allGroups = await MediaFrameSourceGroup.FindAllAsync();
if (allGroups.Count == 0)
{
return;
}
_groupSelectionIndex = (_groupSelectionIndex + 1) % allGroups.Count;
var selectedGroup = allGroups[_groupSelectionIndex];
var kinectGroup = selectedGroup;
try
{
await InitializeMediaCaptureAsync(kinectGroup);
}
catch (Exception exception)
{
_logger.Log($"MediaCapture initialization error: {exception.Message}");
await CleanupMediaCaptureAsync();
return;
}
// Set up frame readers, register event handlers and start streaming.
var startedKinds = new HashSet<MediaFrameSourceKind>();
foreach (MediaFrameSource source in _mediaCapture.FrameSources.Values.Where(x => x.Info.SourceKind == MediaFrameSourceKind.Color || x.Info.SourceKind == MediaFrameSourceKind.Infrared))
{
MediaFrameSourceKind kind = source.Info.SourceKind;
MediaFrameSource frameSource = null;
int frameindex = COLOR_SOURCE;
if (kind == MediaFrameSourceKind.Infrared)
{
frameindex = IR_SOURCE;
}
// Ignore this source if we already have a source of this kind.
if (startedKinds.Contains(kind))
{
continue;
}
MediaFrameSourceInfo frameInfo = kinectGroup.SourceInfos[frameindex];
if (_mediaCapture.FrameSources.TryGetValue(frameInfo.Id, out frameSource))
{
// Create a frameReader based on the source stream
MediaFrameReader frameReader = await _mediaCapture.CreateFrameReaderAsync(frameSource);
frameReader.FrameArrived += FrameReader_FrameArrived;
_sourceReaders.Add(frameReader);
MediaFrameReaderStartStatus status = await frameReader.StartAsync();
if (status == MediaFrameReaderStartStatus.Success)
{
startedKinds.Add(kind);
}
}
}
}
private async Task InitializeMediaCaptureAsync(MediaFrameSourceGroup sourceGroup)
{
if (_mediaCapture != null)
{
return;
}
// Initialize mediacapture with the source group.
_mediaCapture = new MediaCapture();
var settings = new MediaCaptureInitializationSettings
{
SourceGroup = sourceGroup,
SharingMode = MediaCaptureSharingMode.SharedReadOnly,
StreamingCaptureMode = StreamingCaptureMode.Video,
MemoryPreference = MediaCaptureMemoryPreference.Cpu
};
await _mediaCapture.InitializeAsync(settings);
}
private void FrameReader_FrameArrived(MediaFrameReader sender, MediaFrameArrivedEventArgs args)
{
using (var frame = sender.TryAcquireLatestFrame())
{
if (frame != null)
{
//Settings.cameraframeQueue.Enqueue(null, frame.SourceKind.ToString(), frame.SystemRelativeTime.Value); //Add to Queue to process frame
Debug.WriteLine(frame.SourceKind.ToString() + " : " + frame.SystemRelativeTime.ToString());
}
}
}
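For measuring the frame rate, the handler above only prints timestamps. A rough counting sketch like the following (my own addition, with assumed names) is one way to turn SystemRelativeTime into the per-second counts listed below; it would be called from FrameReader_FrameArrived with frame.SourceKind and frame.SystemRelativeTime.Value:
private readonly Dictionary<string, long> _lastSecond = new Dictionary<string, long>();
private readonly Dictionary<string, int> _framesInSecond = new Dictionary<string, int>();
private void CountFrame(MediaFrameSourceKind kind, TimeSpan relativeTime)
{
    string key = kind.ToString();
    long second = (long)relativeTime.TotalSeconds;
    if (!_lastSecond.ContainsKey(key))
    {
        _lastSecond[key] = second;
        _framesInSecond[key] = 0;
    }
    if (second != _lastSecond[key])
    {
        // a whole second of SystemRelativeTime has elapsed for this source kind
        Debug.WriteLine($"{key}: {_framesInSecond[key]} fps");
        _lastSecond[key] = second;
        _framesInSecond[key] = 0;
    }
    _framesInSecond[key]++;
}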
I am trying to debug the application to check the frame rate so I have removed further processing.
I am not sure if I am not calculating it properly or something else is wrong.
For example, System Relative Time from 04:37:06 to 04:37:48 gives:
IR, fps (occurrences): 31 (1), 30 (36), 29 (18), 28 (4)
Color, fps (occurrences): 30 (38), 29 (18), 28 (3)
I want this frame rate to be constant (30 fps) and aligned, so that IR and Color have the same number of frames over that time.
This is without any additional code. As soon as I add a processing queue or any other code, the fps decreases and ranges from 15 to 30.
Can anyone please help me with this?
Thank you.
UPDATE:
After some testing and workarounds, I have noticed that the PC produces 30 fps, but the Xbox One (the remote device) produces a very low fps in debug mode. This does improve when running in release mode, but the memory allocated for UWP apps on the console is quite low:
https://learn.microsoft.com/en-us/windows/uwp/xbox-apps/system-resource-allocation
The Xbox One has a maximum of 1 GB of memory available for apps and 5 GB for games, while on the PC the fps is 30 (as memory has no such restriction there). This restriction causes the frame rate to drop; however, the fps did improve when running in release mode or when published to the MS Store.
I'm developing an audio application in C# and UWP using the AudioGraph API.
My AudioGraph setup is the following :
AudioFileInputNode --> AudioSubmixNode --> AudioDeviceOutputNode.
I attached a custom echo effect to the AudioSubmixNode.
If I play the AudioFileInputNode I can hear some echo.
But when the AudioFileInputNode playback finishes, the echo sound stops abruptly.
I would like it to stop gradually over a few seconds instead.
If I use the EchoEffectDefinition from the AudioGraph API, the echo sound does not stop after the sample playback has finished.
I don't know if the problem comes from my effect implementation or if it's a strange behavior of the AudioGraph API...
The behavior is the same in the "AudioCreation" sample in the SDK, scenario 6.
Here is my custom effect implementation :
public sealed class AudioEchoEffect : IBasicAudioEffect
{
public AudioEchoEffect()
{
}
private readonly AudioEncodingProperties[] _supportedEncodingProperties = new AudioEncodingProperties[]
{
AudioEncodingProperties.CreatePcm(44100, 1, 32),
AudioEncodingProperties.CreatePcm(48000, 1, 32),
};
private AudioEncodingProperties _currentEncodingProperties;
private IPropertySet _propertySet;
private readonly Queue<float> _echoBuffer = new Queue<float>(100000);
private int _delaySamplesCount;
private float Delay
{
get
{
if (_propertySet != null && _propertySet.TryGetValue("Delay", out object val))
{
return (float)val;
}
return 500.0f;
}
}
private float Feedback
{
get
{
if (_propertySet != null && _propertySet.TryGetValue("Feedback", out object val))
{
return (float)val;
}
return 0.5f;
}
}
private float Mix
{
get
{
if (_propertySet != null && _propertySet.TryGetValue("Mix", out object val))
{
return (float)val;
}
return 0.5f;
}
}
public bool UseInputFrameForOutput { get { return true; } }
public IReadOnlyList<AudioEncodingProperties> SupportedEncodingProperties { get { return _supportedEncodingProperties; } }
public void SetProperties(IPropertySet configuration)
{
_propertySet = configuration;
}
public void SetEncodingProperties(AudioEncodingProperties encodingProperties)
{
_currentEncodingProperties = encodingProperties;
// compute the number of samples for the delay
_delaySamplesCount = (int)MathF.Round((this.Delay / 1000.0f) * encodingProperties.SampleRate);
// fill empty samples in the buffer according to the delay
for (int i = 0; i < _delaySamplesCount; i++)
{
_echoBuffer.Enqueue(0.0f);
}
}
unsafe public void ProcessFrame(ProcessAudioFrameContext context)
{
AudioFrame frame = context.InputFrame;
using (AudioBuffer buffer = frame.LockBuffer(AudioBufferAccessMode.ReadWrite))
using (IMemoryBufferReference reference = buffer.CreateReference())
{
((IMemoryBufferByteAccess)reference).GetBuffer(out byte* dataInBytes, out uint capacity);
float* dataInFloat = (float*)dataInBytes;
int dataInFloatLength = (int)buffer.Length / sizeof(float);
// read parameters once
float currentWet = this.Mix;
float currentDry = 1.0f - currentWet;
float currentFeedback = this.Feedback;
// Process audio data
float sample, echoSample, outSample;
for (int i = 0; i < dataInFloatLength; i++)
{
// read values
sample = dataInFloat[i];
echoSample = _echoBuffer.Dequeue();
// compute output sample
outSample = (currentDry * sample) + (currentWet * echoSample);
dataInFloat[i] = outSample;
// compute delay sample
echoSample = sample + (currentFeedback * echoSample);
_echoBuffer.Enqueue(echoSample);
}
}
}
public void Close(MediaEffectClosedReason reason)
{
}
public void DiscardQueuedFrames()
{
// reset the delay buffer
_echoBuffer.Clear();
for (int i = 0; i < _delaySamplesCount; i++)
{
_echoBuffer.Enqueue(0.0f);
}
}
}
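For context, this is roughly how I attach the custom effect to the submix node, following the pattern from the AudioCreation sample (submixNode and the initial property values here are assumptions about my setup):
var echoProperties = new PropertySet
{
    { "Delay", 500.0f },    // milliseconds
    { "Feedback", 0.5f },
    { "Mix", 0.5f }
};
var echoDefinition = new AudioEffectDefinition(typeof(AudioEchoEffect).FullName, echoProperties);
submixNode.EffectDefinitions.Add(echoDefinition);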
EDIT:
I changed my audio effect to mix the input samples with a sine wave. The ProcessFrame method runs continuously before and after the sample playback (while the effect is active), so the sine wave should be heard before and after the sample playback. But the AudioGraph API seems to ignore the effect output when there is no active playback...
Here is a screen capture of the audio output:
So my question is: how can the built-in EchoEffectDefinition output sound after the playback has finished? Access to the EchoEffectDefinition source code would be a great help...
By looping the file input node infinitely, it will always provide an input frame until the audio graph stops. Of course we don't want to hear the file looping, so we can listen for the FileCompleted event of the AudioFileInputNode. When the file finishes playing, the event fires and we simply set the OutgoingGain of the AudioFileInputNode to zero. The file therefore plays back once, but it keeps looping silently, providing input frames with no audio content to which the echo tail can be added.
Still using scenario 4 in the AudioCreation sample as an example: in scenario 4 there is a field named fileInputNode1. As mentioned above, add the following code for fileInputNode1 and test again with your custom echo effect.
fileInputNode1.LoopCount = null; //Null makes it loop infinitely
fileInputNode1.FileCompleted += FileInputNode1_FileCompleted;
private void FileInputNode1_FileCompleted(AudioFileInputNode sender, object args)
{
fileInputNode1.OutgoingGain = 0.0;
}
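If you also want the silent looping to end once the echo tail has died out, the handler could be extended with a delayed stop. This part is an assumption on my side; the delay length and the graph variable name depend on your setup:
private async void FileInputNode1_FileCompleted(AudioFileInputNode sender, object args)
{
    fileInputNode1.OutgoingGain = 0.0;
    await Task.Delay(TimeSpan.FromSeconds(5)); // rough guess at how long the echo tail lasts
    graph.Stop();                              // 'graph' is the AudioGraph the nodes belong to
}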
I'm trying to play sound provided by an Ethernet microphone.
The device streams live audio via UDP packets, which I read in a network receiver thread:
MemoryStream msAudio = new MemoryStream();
private void process_stream(byte[] fragment)
{
msAudio.Write(fragment, 0, fragment.Length);
}
process_stream is called in a task.
Then I have another task to play the stream with NAudio (NAudio isn't mandatory):
while (IsConnected)
{
msAudio.Position = 0;
var waveFormat = new WaveFormat(8000, 16, 1); // Same format
using (WaveStream blockAlignedStream = new BlockAlignReductionStream(
WaveFormatConversionStream.CreatePcmStream(
new RawSourceWaveStream(msAudio , waveFormat))))
{
using (WaveOut waveOut = new WaveOut(WaveCallbackInfo.FunctionCallback()))
{
waveOut.Init(blockAlignedStream);
waveOut.Play();
while (waveOut.PlaybackState == PlaybackState.Playing)
{
System.Threading.Thread.Sleep(100);
}
}
}
}
My problems are:
I hear a "toc toc toc" noise (about 4 times per second).
The audio is slowed down and the voice is distorted, as if the sample rate were too low (but 8 kHz should be correct).
The audio is looped; I think I have to flush my stream, but I don't see where...
Thanks a lot if you can give me some advice...
P.S.: For reference, the original code works on Android using AudioTrack; the code is here.
P.S. 2: Here is an "image" of the audio noise that I get:
I don't like answering my own question, but I think it is too specific...
So I resolved my problems by:
Using a BufferedWaveProvider in place of a MemoryStream
Doubling the sample rate (16 kHz)
Removing the first 4 bytes of each buffer
Now my code looks like:
public ctor()
{
var waveFormat = new WaveFormat(16000, 16, 1);
buffer = new BufferedWaveProvider(waveFormat)
{
BufferDuration = TimeSpan.FromSeconds(10),
DiscardOnBufferOverflow = true
};
}
internal void OnDataReceived(byte[] currentFrame)
{
if (mPlaying && mAudioTrack != null)
{
buffer.AddSamples(currentFrame, 4, currentFrame.Length - 4); // skip the 4-byte header
}
}
internal void ConfigureCodec()
{
mAudioTrack = new WaveOut(WaveCallbackInfo.FunctionCallback());
mAudioTrack.Init(buffer);
if (mPlaying)
{
mAudioTrack.Play();
}
}
No more Thread....
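For completeness, the network side that feeds OnDataReceived is just a plain UDP receive loop. A minimal sketch (using System.Net and System.Net.Sockets; the port variable is an assumption, not my actual code):
using (var udp = new UdpClient(localPort)) // localPort: whatever port the microphone streams to
{
    var remote = new IPEndPoint(IPAddress.Any, 0);
    while (IsConnected)
    {
        byte[] datagram = udp.Receive(ref remote); // blocks until a packet arrives
        OnDataReceived(datagram);                  // hand the raw packet to the buffered provider
    }
}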
I'm developing a UWP application (for Windows 10) which works with audio data. At startup it receives a buffer of samples in the form of a float array, whose items range from -1f to 1f.
Earlier I used NAudio.dll 1.8.0, which provides all the necessary functionality.
I worked with the WaveFileReader, waveBuffer.FloatBuffer, and WaveFileWriter classes.
However, when I finished this app and tried to build the Release version, I got this error:
ILT0042: Arrays of pointer types are not currently supported: 'System.Int32*[]'.
I’ve tried to solve it:
https://forums.xamarin.com/discussion/73169/uwp-10-build-fail-arrays-of-pointer-types-error
The advice there is to remove the reference to the .dll, but I need it.
I've tried to install the same version of NAudio using Manage NuGet Packages, but WaveFileReader and WaveFileWriter are not available that way.
In the NAudio developer's answer (How to store a .wav file in Windows 10 with NAudio) I read about using AudioGraph, but it only lets me build the float array of samples during realtime playback, whereas I need the full set of samples right after the audio file is loaded. Here is an example of getting samples during recording or playback:
https://learn.microsoft.com/ru-ru/windows/uwp/audio-video-camera/audio-graphs
That's why I need help: how do I get a FloatBuffer for working with samples right after the audio file is loaded? For example, for building audio waveforms or for calculations before applying audio effects.
Thank you in advance.
I’ve tried to use FileStream and BitConverter.ToSingle(), however, I had a different result compared to NAudio.
In other words, I’m still looking for a solution.
private float[] GetBufferArray()
{
// NOTE: this opens a compressed .mp3 file and reads its raw encoded bytes;
// they are not PCM float samples, which is why the result differs from NAudio.
string _path = ApplicationData.Current.LocalFolder.Path.ToString() + "/track_1.mp3";
FileStream _stream = new FileStream(_path, FileMode.Open);
BinaryReader _binaryReader = new BinaryReader(_stream);
// the first 4 bytes of the file are treated as a length prefix, which an MP3 does not have
int _dataSize = _binaryReader.ReadInt32();
byte[] _byteBuffer = _binaryReader.ReadBytes(_dataSize);
int _sizeFloat = sizeof(float);
float[] _floatBuffer = new float[_byteBuffer.Length / _sizeFloat];
for (int i = 0, j = 0; i < _byteBuffer.Length - _sizeFloat; i += _sizeFloat, j++)
{
_floatBuffer[j] = BitConverter.ToSingle(_byteBuffer, i);
}
return _floatBuffer;
}
Another way to read samples from an audio file in UWP is to use the AudioGraph API. It works for all audio formats that Windows 10 supports.
Here is some sample code:
namespace AudioGraphAPI_read_samples_from_file
{
// App opens a file using FileOpenPicker and reads samples into array of
// floats using AudioGragh API
// Declare COM interface to access AudioBuffer
[ComImport]
[Guid("5B0D3235-4DBA-4D44-865E-8F1D0E4FD04D")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
unsafe interface IMemoryBufferByteAccess
{
void GetBuffer(out byte* buffer, out uint capacity);
}
public sealed partial class MainPage : Page
{
StorageFile mediaFile;
AudioGraph audioGraph;
AudioFileInputNode fileInputNode;
AudioFrameOutputNode frameOutputNode;
/// <summary>
/// We are going to fill this array with audio samples
/// This app loads only one channel
/// </summary>
float[] audioData;
/// <summary>
/// Current position in audioData array for loading audio samples
/// </summary>
int audioDataCurrentPosition = 0;
public MainPage()
{
this.InitializeComponent();
}
private async void Open_Button_Click(object sender, RoutedEventArgs e)
{
// We ask user to pick an audio file
FileOpenPicker filePicker = new FileOpenPicker();
filePicker.SuggestedStartLocation = PickerLocationId.MusicLibrary;
filePicker.FileTypeFilter.Add(".mp3");
filePicker.FileTypeFilter.Add(".wav");
filePicker.FileTypeFilter.Add(".wma");
filePicker.FileTypeFilter.Add(".m4a");
filePicker.ViewMode = PickerViewMode.Thumbnail;
mediaFile = await filePicker.PickSingleFileAsync();
if (mediaFile == null)
{
return;
}
// We load samples from file
await LoadAudioFromFile(mediaFile);
// We wait 5 sec
await Task.Delay(5000);
if (audioData == null)
{
ShowMessage("Error loading samples");
return;
}
// After LoadAudioFromFile method finished we can use audioData
// For example we can find max amplitude
float max = audioData[0];
for (int i = 1; i < audioData.Length; i++)
if (Math.Abs(audioData[i]) > Math.Abs(max))
max = audioData[i];
ShowMessage("Maximum is " + max.ToString());
}
private async void ShowMessage(string Message)
{
var dialog = new MessageDialog(Message);
await dialog.ShowAsync();
}
private async Task LoadAudioFromFile(StorageFile file)
{
// We initialize an instance of AudioGraph
AudioGraphSettings settings =
new AudioGraphSettings(
Windows.Media.Render.AudioRenderCategory.Media
);
CreateAudioGraphResult result1 = await AudioGraph.CreateAsync(settings);
if (result1.Status != AudioGraphCreationStatus.Success)
{
ShowMessage("AudioGraph creation error: " + result1.Status.ToString());
}
audioGraph = result1.Graph;
if (audioGraph == null)
return;
// We initialize FileInputNode
CreateAudioFileInputNodeResult result2 =
await audioGraph.CreateFileInputNodeAsync(file);
if (result2.Status != AudioFileNodeCreationStatus.Success)
{
ShowMessage("FileInputNode creation error: " + result2.Status.ToString());
}
fileInputNode = result2.FileInputNode;
if (fileInputNode == null)
return;
// We read audio file encoding properties to pass them to FrameOutputNode creator
AudioEncodingProperties audioEncodingProperties = fileInputNode.EncodingProperties;
// We initialize FrameOutputNode and connect it to fileInputNode
frameOutputNode = audioGraph.CreateFrameOutputNode(audioEncodingProperties);
fileInputNode.AddOutgoingConnection(frameOutputNode);
// We add a handler for reaching the end of the file
fileInputNode.FileCompleted += FileInput_FileCompleted;
// We add a handler which will transfer every audio frame into audioData
audioGraph.QuantumStarted += AudioGraph_QuantumStarted;
// We initialize audioData
int numOfSamples = (int)Math.Ceiling(
(decimal)0.0000001
* fileInputNode.Duration.Ticks
* fileInputNode.EncodingProperties.SampleRate
);
audioData = new float[numOfSamples];
audioDataCurrentPosition = 0;
// We start the process which will read the audio file frame by frame
// and will generate QuantumStarted events when a frame is in memory
audioGraph.Start();
}
private void FileInput_FileCompleted(AudioFileInputNode sender, object args)
{
audioGraph.Stop();
}
private void AudioGraph_QuantumStarted(AudioGraph sender, object args)
{
AudioFrame frame = frameOutputNode.GetFrame();
ProcessInputFrame(frame);
}
unsafe private void ProcessInputFrame(AudioFrame frame)
{
using (AudioBuffer buffer = frame.LockBuffer(AudioBufferAccessMode.Read))
using (IMemoryBufferReference reference = buffer.CreateReference())
{
// We get data from current buffer
((IMemoryBufferByteAccess)reference).GetBuffer(
out byte* dataInBytes,
out uint capacityInBytes
);
// We discard first frame; it's full of zeros because of latency
if (audioGraph.CompletedQuantumCount == 1) return;
float* dataInFloat = (float*)dataInBytes;
uint capacityInFloat = capacityInBytes / sizeof(float);
// Number of channels defines step between samples in buffer
uint step = fileInputNode.EncodingProperties.ChannelCount;
// We transfer audio samples from buffer into audioData
for (uint i = 0; i < capacityInFloat; i += step)
{
if (audioDataCurrentPosition < audioData.Length)
{
audioData[audioDataCurrentPosition] = dataInFloat[i];
audioDataCurrentPosition++;
}
}
}
}
}
}
Edited: It solves the problem because it reads samples from a file into a float array
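As a small usage sketch (my own addition, not part of the sample above): once audioData has been filled, one simple way to turn it into a waveform for display is a peak-per-bucket reduction:
private static float[] BuildPeaks(float[] audioData, int buckets)
{
    float[] peaks = new float[buckets];
    int samplesPerBucket = Math.Max(1, audioData.Length / buckets);
    for (int i = 0; i < audioData.Length; i++)
    {
        int bucket = Math.Min(buckets - 1, i / samplesPerBucket);
        float abs = Math.Abs(audioData[i]);
        if (abs > peaks[bucket])
        {
            peaks[bucket] = abs;
        }
    }
    return peaks;
}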
First popular way of getting AudioData from a wav file.
Thanks to PI user's answer (How to read the data in a wav file to an array), I've solved the problem of reading a wav file into a float array in a UWP project.
But the file's structure differs from the standard one (maybe this problem only exists in my project) when it is recorded to a wav file using AudioGraph. This leads to unpredictable results: we receive the value 1263424842 instead of the expected 544501094 when reading the format chunk id, and after that all following values are read incorrectly. I found the correct id by searching through the bytes sequentially. I realised that AudioGraph adds an extra chunk of data to the recorded wav file, but the record's format is still PCM. This extra chunk looks like data about the file format, but it also contains empty values, empty bytes. I can't find any information about that; maybe somebody here knows? I've adapted the solution from PI for my needs. This is what I've got:
using (FileStream fs = File.Open(filename, FileMode.Open))
{
BinaryReader reader = new BinaryReader(fs);
int chunkID = reader.ReadInt32();
int fileSize = reader.ReadInt32();
int riffType = reader.ReadInt32();
int fmtID;
long _position = reader.BaseStream.Position;
while (_position != reader.BaseStream.Length-1)
{
reader.BaseStream.Position = _position;
int _fmtId = reader.ReadInt32();
if (_fmtId == 544501094) {
fmtID = _fmtId;
break;
}
_position++;
}
int fmtSize = reader.ReadInt32();
int fmtCode = reader.ReadInt16();
int channels = reader.ReadInt16();
int sampleRate = reader.ReadInt32();
int byteRate = reader.ReadInt32();
int fmtBlockAlign = reader.ReadInt16();
int bitDepth = reader.ReadInt16();
int fmtExtraSize;
if (fmtSize == 18)
{
fmtExtraSize = reader.ReadInt16();
reader.ReadBytes(fmtExtraSize);
}
int dataID = reader.ReadInt32();
int dataSize = reader.ReadInt32();
byte[] byteArray = reader.ReadBytes(dataSize);
int bytesForSamp = bitDepth / 8;
int samps = dataSize / bytesForSamp;
float[] asFloat = null;
switch (bitDepth)
{
case 16:
Int16[] asInt16 = new Int16[samps];
Buffer.BlockCopy(byteArray, 0, asInt16, 0, dataSize);
IEnumerable<float> tempInt16 =
from i in asInt16
select i / (float)Int16.MaxValue;
asFloat = tempInt16.ToArray();
break;
default:
return false;
}
//For one channel wav audio
floatLeftBuffer.AddRange(asFloat);
}
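As a quick sanity check on the magic numbers used above (my own note): 544501094 is simply the little-endian Int32 for the ASCII bytes "fmt ", and the unexpected 1263424842 decodes the same way to "JUNK", a RIFF padding chunk, which appears to be the extra chunk AudioGraph writes before the real format chunk:
int fmtId = BitConverter.ToInt32(Encoding.ASCII.GetBytes("fmt "), 0);   // 544501094
int junkId = BitConverter.ToInt32(Encoding.ASCII.GetBytes("JUNK"), 0);  // 1263424842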
Writing from a buffer back to a file uses the inverse algorithm. At the moment this is the only correct algorithm I have for working with wav files that lets me get at the audio data.
I used this article on working with AudioGraph: https://learn.microsoft.com/ru-ru/windows/uwp/audio-video-camera/audio-graphs. Note that you can set up the necessary record format with AudioEncodingQuality when recording from the mic to a file.
Second way of getting AudioData: using NAudio from NuGet packages.
I used the MediaFoundationReader class.
float[] floatBuffer;
using (MediaFoundationReader media = new MediaFoundationReader(path))
{
// converting 16-bit samples to 32-bit floats doubles the byte count (this assumes a 16-bit source)
int _byteBuffer32_length = (int)media.Length * 2;
int _floatBuffer_length = _byteBuffer32_length / sizeof(float);
IWaveProvider stream32 = new Wave16ToFloatProvider(media);
WaveBuffer _waveBuffer = new WaveBuffer(_byteBuffer32_length);
stream32.Read(_waveBuffer, 0, (int)_byteBuffer32_length);
floatBuffer = new float[_floatBuffer_length];
for (int i = 0; i < _floatBuffer_length; i++) {
floatBuffer[i] = _waveBuffer.FloatBuffer[i];
}
}
Comparing the two ways, I noticed:
The received sample values differ by about 1/1,000,000. I can't say which way is more precise (if you know, I'd be glad to hear);
The second way of getting AudioData works for MP3 files, too.
If you find any mistakes or have comments about this, you're welcome.
Import statements:
using NAudio.Wave;
using NAudio.Wave.SampleProviders;
Inside the function:
AudioFileReader reader = new AudioFileReader(filename);
ISampleProvider isp = reader.ToSampleProvider();
float[] buffer = new float[reader.Length / 2];
isp.Read(buffer, 0, buffer.Length);
The buffer array will contain 32-bit IEEE float samples.
This uses the NAudio NuGet package in Visual Studio.
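One caution about the snippet above (my own note): ISampleProvider.Read may return fewer samples than requested, so for long files reading in a loop is safer:
int totalRead = 0;
int read;
while ((read = isp.Read(buffer, totalRead, buffer.Length - totalRead)) > 0)
{
    totalRead += read;
}
// totalRead now holds the number of float samples actually read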