This is similar to this question (which is somewhat outdated). My problem is that I am trying to create an AudioClip from a float array that was converted from a byte array, which in turn was decoded from a base64 string. When I play the result, all I get is a loud, horrible, fast sound.
You can use this online tool to encode a small .wav file into a base64 string, and this other online tool to verify that decoding that base64 string reproduces the exact audio file.
I followed the solution from the other question and also tried changing the frequency value, but it still plays a horrible sound. What am I doing wrong?
AudioSource myAudio;
public float frequency = 44100; //Maybe 22050? since it is a wav mono sound

void Start(){
    myAudio = GetComponent<AudioSource>();
}

//When clicking on the game object, play the sound
private void OnMouseDown(){
    string audioAsString = "UklGRjbPAQBXQVZFZm10IBIAAAABAAEAIlYAAESsAAACABAAAABkYXRh....."; //Base64 string encoded from the online tool. I can't put the whole string here because of the character limit on Stack Overflow questions.
    byte[] byteArray = Convert.FromBase64String(audioAsString); //From Base64 string to byte[]
    float[] floatArray = ConvertByteArrayToFloatArray(byteArray); //From byte[] to float[]
    AudioClip audioClip = AudioClip.Create("test", floatArray.Length, 1, (int)frequency, false);
    audioClip.SetData(floatArray, 0);
    myAudio.clip = audioClip;
    myAudio.Play(); //Plays the audio
}

private float[] ConvertByteArrayToFloatArray(byte[] array)
{
    float[] floatArr = new float[array.Length / 4];
    for (int i = 0; i < floatArr.Length; i++)
    {
        if (BitConverter.IsLittleEndian) //I am still not sure what this line does
            Array.Reverse(array, i * 4, 4);
        floatArr[i] = BitConverter.ToSingle(array, i * 4) / 0x80000000;
    }
    return floatArr;
}
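For reference, a standard 16-bit PCM .wav file stores each sample as a two-byte little-endian signed integer after the header, not as four-byte floats. A rough, untested sketch of that conversion is below; the 44-byte header offset and the helper name are assumptions (header sizes can vary), and AudioClip.SetData expects values in the -1..1 range.

//Sketch: treat the decoded bytes as 16-bit little-endian PCM, skipping a typical 44-byte WAV header (assumption; real headers can differ in size).
private float[] ConvertPcm16ToFloatArray(byte[] array, int headerSize = 44)
{
    int sampleCount = (array.Length - headerSize) / 2;
    float[] floatArr = new float[sampleCount];
    for (int i = 0; i < sampleCount; i++)
    {
        short sample = BitConverter.ToInt16(array, headerSize + i * 2);
        floatArr[i] = sample / 32768f; //scale into the -1..1 range expected by AudioClip.SetData
    }
    return floatArr;
}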
A related question continues here:
After some research, I have got the following:
private float[] Normalize(float[] data) {
    float max = float.MinValue;
    for (int i = 0; i < data.Length; i++){
        if (System.Math.Abs(data[i]) > max) max = System.Math.Abs(data[i]);
    }
    for (int i = 0; i < data.Length; i++) data[i] = data[i] / max;
    return data;
}

private float[] ConvertByteToFloat(byte[] array){
    float[] floatArr = new float[array.Length / 4];
    for (int i = 0; i < floatArr.Length; i++){
        if (System.BitConverter.IsLittleEndian) System.Array.Reverse(array, i * 4, 4);
        floatArr[i] = System.BitConverter.ToSingle(array, i * 4);
    }
    return Normalize(floatArr);
}

byte[] bytes = System.Convert.FromBase64String(data);
float[] f = ConvertByteToFloat(bytes);
qa[i] = AudioClip.Create("qjAudio", f.Length, 2, 44100, false);
qa[i].SetData(f, 0);
However, all I heard was some random noise.
Someone suggested converting it to a file first:
[SerializeField] private AudioSource _audioSource;

private void Start()
{
    StartCoroutine(ConvertBase64ToAudioClip(EXAMPLE_BASE64_MP3_STRING, _audioSource));
}

IEnumerator ConvertBase64ToAudioClip(string base64EncodedMp3String, AudioSource audioSource)
{
    var audioBytes = Convert.FromBase64String(base64EncodedMp3String);
    var tempPath = Path.Combine(Application.persistentDataPath, "tmpMP3Base64.mp3");
    File.WriteAllBytes(tempPath, audioBytes);

    UnityWebRequest request = UnityWebRequestMultimedia.GetAudioClip(tempPath, AudioType.MPEG);
    yield return request.SendWebRequest();

    if (request.result.Equals(UnityWebRequest.Result.ConnectionError))
        Debug.LogError(request.error);
    else
    {
        audioSource.clip = DownloadHandlerAudioClip.GetContent(request);
        audioSource.Play();
    }

    File.Delete(tempPath);
}
However, for this to work on Android devices, I would need to request Android storage permissions, which can discourage players from trying my game. It also adds a lot of complexity, since I would have to handle the case where the player has already rejected the permission once, so that the request dialogs don't keep reappearing.
How can I correctly convert a base64 MP3 string into an AudioClip without resorting to physical storage?
The code should be simple and should not require any byte swapping. See the following: https://docs.unity.cn/540/Documentation/ScriptReference/WWW.GetAudioClip.html
byte[] bytes = System.Convert.FromBase64String(data);
MemoryStream ms = new MemoryStream(bytes);
ms.Position = 0;
// Signature from the linked documentation:
// public AudioClip GetAudioClip(bool threeD, bool stream, AudioType audioType); // e.g. AudioType.MPEG
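For context, the GetAudioClip overload on the linked page belongs to the legacy WWW class and is normally called on a WWW object created from a URL rather than from a MemoryStream; a rough, untested sketch of that documented usage (the url parameter here is a placeholder) would be:

//Sketch of the legacy WWW.GetAudioClip usage from the linked documentation; the url value is a placeholder.
IEnumerator LoadClipWithWWW(string url, AudioSource audioSource)
{
    WWW www = new WWW(url); //e.g. a "file://..." or "http://..." address
    yield return www; //wait for the load to finish
    AudioClip clip = www.GetAudioClip(false, false, AudioType.MPEG); //threeD, stream, audio type
    audioSource.clip = clip;
    audioSource.Play();
}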
I'm running the C# code below, which computes optical flow maps and saves them as PNGs during gameplay, using Unity with a VR headset that enforces an upper limit of 90 FPS. Without this code, the project runs smoothly at 90 FPS. To keep the project consistently above 80 FPS with this code running on top of it, I had to use WaitForSeconds(0.2f) in the coroutine, but the ideal scenario would be to compute and save the optical flow map for every frame of the game, or at least with a much lower delay of about 0.01 seconds. I'm already using AsyncGPUReadback and WriteAsync.
Main Question: How can I speed up this code further?
Side Question: Is there any way I can dump the calculated optical flow maps as consecutive rows of a single CSV file, so that everything goes into one file rather than creating a separate PNG for each map? Or would that be even slower?
using System.Collections;
using UnityEngine;
using System.IO;
using UnityEngine.Rendering;

namespace OpticalFlowAlternative
{
    public class OpticalFlow : MonoBehaviour {

        protected enum Pass {
            Flow = 0,
            DownSample = 1,
            BlurH = 2,
            BlurV = 3,
            Visualize = 4
        };

        public RenderTexture Flow { get { return resultBuffer; } }

        [SerializeField] protected Material flowMaterial;
        protected RenderTexture prevFrame, flowBuffer, resultBuffer, renderTexture, rt;

        public string customOutputFolderPath = "";
        private string filepathforflow;
        private int imageCount = 0;

        int targetTextureWidth, targetTextureHeight;

        private EyeTrackingV2 eyeTracking;

        protected void Start () {
            eyeTracking = GameObject.Find("XR Rig").GetComponent<EyeTrackingV2>();

            targetTextureWidth = Screen.width / 16;
            targetTextureHeight = Screen.height / 16;
            flowMaterial.SetFloat("_Ratio", 1f * Screen.height / Screen.width);

            renderTexture = new RenderTexture(targetTextureWidth, targetTextureHeight, 0);
            rt = new RenderTexture(Screen.width, Screen.height, 0);

            StartCoroutine("StartCapture");
        }

        protected void LateUpdate()
        {
            eyeTracking.flowCount = imageCount;
        }

        protected void OnDestroy ()
        {
            if(prevFrame != null)
            {
                prevFrame.Release();
                prevFrame = null;

                flowBuffer.Release();
                flowBuffer = null;

                rt.Release();
                rt = null;

                renderTexture.Release();
                renderTexture = null;
            }
        }

        IEnumerator StartCapture()
        {
            while (true)
            {
                yield return new WaitForSeconds(0.2f);

                ScreenCapture.CaptureScreenshotIntoRenderTexture(rt);
                //compensating for image flip
                Graphics.Blit(rt, renderTexture, new Vector2(1, -1), new Vector2(0, 1));

                if (prevFrame == null)
                {
                    Setup(targetTextureWidth, targetTextureHeight);
                    Graphics.Blit(renderTexture, prevFrame);
                }

                flowMaterial.SetTexture("_PrevTex", prevFrame);

                //calculating motion flow frame here
                Graphics.Blit(renderTexture, flowBuffer, flowMaterial, (int)Pass.Flow);
                Graphics.Blit(renderTexture, prevFrame);

                AsyncGPUReadback.Request(flowBuffer, 0, TextureFormat.ARGB32, OnCompleteReadback);
            }
        }

        void OnCompleteReadback(AsyncGPUReadbackRequest request)
        {
            if (request.hasError)
                return;

            var tex = new Texture2D(targetTextureWidth, targetTextureHeight, TextureFormat.ARGB32, false);
            tex.LoadRawTextureData(request.GetData<uint>());
            tex.Apply();

            WriteTextureAsync(tex);
        }

        async void WriteTextureAsync(Texture2D tex)
        {
            imageCount++;
            filepathforflow = customOutputFolderPath + imageCount + ".png";
            var stream = new FileStream(filepathforflow, FileMode.OpenOrCreate);
            var bytes = tex.EncodeToPNG();
            await stream.WriteAsync(bytes, 0, bytes.Length);
        }

        protected void Setup(int width, int height)
        {
            prevFrame = new RenderTexture(width, height, 0);
            prevFrame.format = RenderTextureFormat.ARGBFloat;
            prevFrame.wrapMode = TextureWrapMode.Repeat;
            prevFrame.Create();

            flowBuffer = new RenderTexture(width, height, 0);
            flowBuffer.format = RenderTextureFormat.ARGBFloat;
            flowBuffer.wrapMode = TextureWrapMode.Repeat;
            flowBuffer.Create();
        }
    }
}
First thought here is to use CommandBuffers; with them you can perform a no-copy readback of the screen, apply your calculations, and store the results in separate buffers (textures).
Then you can request readbacks of parts of a texture, or of multiple textures, spread over several frames, without blocking access to the texture currently being computed. When a readback completes, the best approach is to encode it to PNG/JPG on a separate thread so the main thread is never blocked, as sketched below.
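As a rough, untested sketch of that idea (assuming Unity's ImageConversion.EncodeArrayToPNG is available, that the readback data is 8-bit RGBA, and reusing field names from the question's class; none of this is taken from the original project):

//Sketch: readback callback that encodes and saves the PNG off the main thread, without creating a Texture2D.
//Would live inside the OpticalFlow class; requires System.IO, System.Threading.Tasks,
//UnityEngine, UnityEngine.Experimental.Rendering and UnityEngine.Rendering.
void OnCompleteReadbackThreaded(AsyncGPUReadbackRequest request)
{
    if (request.hasError)
        return;

    //Copy the native readback data to a managed array while still on the main thread.
    byte[] raw = request.GetData<byte>().ToArray();
    int width = targetTextureWidth, height = targetTextureHeight;
    string path = customOutputFolderPath + (++imageCount) + ".png";

    Task.Run(() =>
    {
        //EncodeArrayToPNG works on plain arrays; the GraphicsFormat here is an assumption about the readback layout.
        byte[] png = ImageConversion.EncodeArrayToPNG(
            raw, GraphicsFormat.R8G8B8A8_UNorm, (uint)width, (uint)height);
        File.WriteAllBytes(path, png);
    });
}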
As an alternative to async readbacks, if you are on DX11/Desktop it is also possible to create a D3D buffer configured for fast CPU readback and map it every frame, if you want to avoid the few frames of latency that async readback introduces.
Creating a texture from the readback data is another waste of performance here: since the readback already gives you the pixel values, you can use a general-purpose PNG encoder and save the files from worker threads (whereas texture creation is only allowed on the main thread).
If the latency is fine for you but you want an exact frame-number-to-image mapping, it is also possible to encode the frame number into the target image, so you always have it available before saving the PNG.
About the side question: CSV could be faster than the default PNG encoding, because PNG uses zip-like compression internally, while CSV is just numbers formatted as strings.
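A minimal sketch of the CSV idea (the file path, the one-flattened-map-per-row layout, and the float[] input are assumptions):

//Sketch: append one optical-flow map per CSV row instead of writing a PNG per frame.
void AppendFlowMapToCsv(float[] flowValues, int frameNumber, string csvPath)
{
    var sb = new System.Text.StringBuilder();
    sb.Append(frameNumber); //first column: frame number
    for (int i = 0; i < flowValues.Length; i++)
    {
        sb.Append(',').Append(flowValues[i].ToString(System.Globalization.CultureInfo.InvariantCulture));
    }
    sb.Append('\n');
    System.IO.File.AppendAllText(csvPath, sb.ToString()); //one growing file, no per-frame PNGs
}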
I am using C# WPF to make a real-time FFT.
I am using NAudio's WaveIn and BufferedWaveProvider to capture any sound recorded by Stereo Mix. I take the FFT of the buffer many times per second and display it using a bitmap, so the display shows a real-time Fourier transform of any audio playing through the speakers.
My problem is that, as expected, the displayed FFT lags behind the audio coming from the speakers by a small amount (maybe 200 ms).
Is there any way I can capture the audio that is about to play from the speakers, perform the FFT on it, and then play it back a short time later (e.g. 200 ms) while muting the original real-time audio?
The end result would effectively remove the perceived delay from the displayed FFT. Audio from a YouTube video, for example, would lag slightly behind the video while my program is running.
Here are the relevant methods from what I have right now:
public MainWindow()
{
    sampleSize = (int)Math.Pow(2, 13);
    BUFFERSIZE = sampleSize * 2;

    InitializeComponent();

    // get the WaveIn class started
    WaveIn wi = new WaveIn();
    wi.DeviceNumber = deviceNo;
    wi.WaveFormat = new NAudio.Wave.WaveFormat(RATE, WaveIn.GetCapabilities(wi.DeviceNumber).Channels);

    // create a wave buffer and start the recording
    wi.DataAvailable += new EventHandler<WaveInEventArgs>(wi_DataAvailable);
    bwp = new BufferedWaveProvider(wi.WaveFormat);
    bwp.BufferLength = BUFFERSIZE; //each sample is 2 bytes
    bwp.DiscardOnBufferOverflow = true;
    wi.StartRecording();
}

public void UpdateFFT()
{
    // read the bytes from the stream
    byte[] buffer = new byte[BUFFERSIZE];
    bwp.Read(buffer, 0, BUFFERSIZE);
    if (buffer[BUFFERSIZE - 2] == 0) return;

    Int32[] vals = new Int32[sampleSize];
    Ys = new double[sampleSize];
    for (int i = 0; i < vals.Length; i++)
    {
        // bit shift the byte buffer into the right variable format
        byte hByte = buffer[i * 2 + 1];
        byte lByte = buffer[i * 2 + 0];
        vals[i] = (short)((hByte << 8) | lByte);
        Ys[i] = vals[i];
    }

    FFT(Ys);
}
I am still new to audio processing - any help would be appreciated.
The cause of your delay is the latency of WaveIn, which is about 200 ms by default. You can reduce that, but at the risk of dropouts.
Whilst you can capture the audio being played by the system with WasapiLoopbackCapture, there is no way with NAudio to modify that audio or delay its playback.
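For completeness, here is a rough sketch of that capture path with NAudio's loopback capture; the class, event, and method names are standard NAudio APIs, but the wiring below is only illustrative, and note that loopback capture typically delivers 32-bit IEEE float samples rather than the 16-bit PCM assumed by the bit-shifting loop above:

//Sketch: capture whatever the system is playing and feed it to a buffer, mirroring the WaveIn setup above.
using NAudio.Wave;

var capture = new WasapiLoopbackCapture(); //captures the default render device's output
var provider = new BufferedWaveProvider(capture.WaveFormat)
{
    DiscardOnBufferOverflow = true
};
capture.DataAvailable += (s, e) =>
{
    provider.AddSamples(e.Buffer, 0, e.BytesRecorded); //same pattern as the WaveIn DataAvailable handler
};
capture.StartRecording();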
(Newbie question)
NAudio lets you start playing an MP3 file from a given position (by converting that position from milliseconds into bytes using WaveFormat.AverageBytesPerSecond), but is it possible to make it stop playing exactly at another given position (in ms)? Do I have to manipulate the wave stream somehow, or are there easier ways?
There is a workaround that starts playback and then stops it from a Timer that fires at the right moment, but it doesn't produce reliable results at all.
I'd create a custom IWaveProvider that returns at most a specified number of bytes from Read. Then reposition your Mp3FileReader to the desired start and pass it into the custom trimming wave provider.
Here's some completely untested example code to give you an idea.
class TrimWaveProvider : IWaveProvider
{
    private readonly IWaveProvider source;
    private int bytesRead;
    private readonly int maxBytesToRead;

    public TrimWaveProvider(IWaveProvider source, int maxBytesToRead)
    {
        this.source = source;
        this.maxBytesToRead = maxBytesToRead;
    }

    public WaveFormat WaveFormat { get { return source.WaveFormat; } }

    public int Read(byte[] buffer, int offset, int bytesToRead)
    {
        int bytesToReadThisTime = Math.Min(bytesToRead, maxBytesToRead - bytesRead);
        int bytesReadThisTime = source.Read(buffer, offset, bytesToReadThisTime);
        bytesRead += bytesReadThisTime;
        return bytesReadThisTime;
    }
}
// and call it like this...
var reader = new Mp3FileReader("myfile.mp3");
reader.Position = reader.WaveFormat.AverageBytesPerSecond * 3; // start 3 seconds in
// read 5 seconds
var trimmer = new TrimWaveProvider(reader, reader.WaveFormat.AverageBytesPerSecond * 5);
WaveOut waveOut = new WaveOut();
waveOut.Init(trimmer);
waveOut.Play();
I have an application where I create my own depth frame (using the Kinect SDK). The problem is that when a human is detected, the FPS of the depth (and then the color too) slows down significantly. Here is a movie of the frame rate slowing down. The code I am using:
using (DepthImageFrame DepthFrame = e.OpenDepthImageFrame())
{
    depthFrame = DepthFrame;
    pixels1 = GenerateColoredBytes(DepthFrame);
    depthImage = BitmapSource.Create(
        depthFrame.Width, depthFrame.Height, 96, 96, PixelFormats.Bgr32, null, pixels1,
        depthFrame.Width * 4);
    depth.Source = depthImage;
}
...
private byte[] GenerateColoredBytes(DepthImageFrame depthFrame2)
{
    short[] rawDepthData = new short[depthFrame2.PixelDataLength];
    depthFrame.CopyPixelDataTo(rawDepthData);

    byte[] pixels = new byte[depthFrame2.Height * depthFrame2.Width * 4];

    const int BlueIndex = 0;
    const int GreenIndex = 1;
    const int RedIndex = 2;

    for (int depthIndex = 0, colorIndex = 0;
        depthIndex < rawDepthData.Length && colorIndex < pixels.Length;
        depthIndex++, colorIndex += 4)
    {
        int player = rawDepthData[depthIndex] & DepthImageFrame.PlayerIndexBitmask;
        int depth = rawDepthData[depthIndex] >> DepthImageFrame.PlayerIndexBitmaskWidth;

        byte intensity = CalculateIntensityFromDepth(depth);
        pixels[colorIndex + BlueIndex] = intensity;
        pixels[colorIndex + GreenIndex] = intensity;
        pixels[colorIndex + RedIndex] = intensity;

        if (player > 0)
        {
            pixels[colorIndex + BlueIndex] = Colors.Gold.B;
            pixels[colorIndex + GreenIndex] = Colors.Gold.G;
            pixels[colorIndex + RedIndex] = Colors.Gold.R;
        }
    }

    return pixels;
}
FPS is quite crucial to me since I am making an app that saves pictures of people when they are detected. How can I maintain a faster FPS? Why is my application doing this?
G.Y is correct that you're not disposing properly. You should refactor your code so the DepthImageFrame is disposed of ASAP.
...
private short[] rawDepthData = new short[640*480]; // assuming your resolution is 640*480
using (DepthImageFrame depthFrame = e.OpenDepthImageFrame())
{
    depthFrame.CopyPixelDataTo(rawDepthData);
}
pixels1 = GenerateColoredBytes(rawDepthData);
...
private byte[] GenerateColoredBytes(short[] rawDepthData){...}
You said that you're using the depth frame elsewhere in the application. This is bad. If you need some specific data from the depth frame, save it separately.
dowhilefor is also correct that you should look at using a WriteableBitmap; it's super simple.
private WriteableBitmap wBitmap;
//somewhere in your initialization
wBitmap = new WriteableBitmap(...);
depth.Source = wBitmap;
//Then to update the image:
wBitmap.WritePixels(...);
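As a rough illustration of what those elided calls could look like (the 640x480 size, the Bgr32 format, and the reuse of pixels1 are assumptions based on the question's code, not part of the original answer):

//Sketch: create the WriteableBitmap once, then overwrite its pixels every frame.
wBitmap = new WriteableBitmap(640, 480, 96, 96, PixelFormats.Bgr32, null);
depth.Source = wBitmap;

//Per frame: copy the freshly generated BGR32 byte array into the existing bitmap.
wBitmap.WritePixels(
    new Int32Rect(0, 0, 640, 480), //region to update (whole image)
    pixels1,                       //the byte[] produced by GenerateColoredBytes
    640 * 4,                       //stride in bytes (width * 4 bytes per pixel)
    0);                            //input buffer offset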
Also, you're creating new arrays to store pixel data again and again on every frame. You should declare these arrays as class-level fields, allocate them once, and then just overwrite them on every frame.
Finally, although this shouldn't make a huge difference, I'm curious about your CalculateIntensityFromDepth method. If the compiler isn't inlining that method, that's a lot of extraneous method calls. Try to remove that method and just write the code where the method call is right now.