Streaming video (C# using FFmpeg AutoGen) sends multiple data requests - c#

I've written a video generator that writes a video in H.264 format (MP4). When I stream the video from my Azure service, I'm seeing the following network traffic:
The AVCodecContext layout I'm using is as follows:
AVCodec* videoCodec = ffmpeg.avcodec_find_encoder(AVCodecID.AV_CODEC_ID_H264);
AVCodecContext* videoCodecContext = ffmpeg.avcodec_alloc_context3(videoCodec);
videoCodecContext->bit_rate = 400000;
videoCodecContext->width = 1280;
videoCodecContext->height = 720;
videoCodecContext->gop_size = 12;
videoCodecContext->max_b_frames = 1;
videoCodecContext->pix_fmt = videoCodec->pix_fmts[0];
videoCodecContext->codec_id = videoCodec->id;
videoCodecContext->codec_type = videoCodec->type;
videoCodecContext->time_base = new AVRational
{
num = 1,
den = 30
};
ffmpeg.av_opt_set(videoCodecContext->priv_data, "preset", "ultrafast", 0);
I also tried setting the "movflags" option for avformat_write_header() via an AVDictionary, but then av_write_trailer() returns -2, causing the file to not finish writing.
I cannot figure out how to solve this problem. Videos generated with Windows Movie Maker stream perfectly.
I know this has something to do with the positions of the mdat and moov atoms.
Also, this appears to happen only in Google Chrome.

OK, figured this out. I had been writing all the video frames first and the audio frames afterwards. Instead, you have to write them side by side for faststart to actually work and allow the video to stream.
So, write a specific amount of audio, then determine whether a video frame should be written by comparing the stream timebases against the current write indexes.
This example will show you how it's done.
Also, to get the video and audio streams to have accurate PTS/DTS values, look at this question.
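The "compare timebases against the current write indexes" decision can be sketched as a small, self-contained illustration. This is the same idea as FFmpeg's av_compare_ts(); the class name, packet durations, and frame rates here are hypothetical, and real code would hand the chosen frame to ffmpeg.av_interleaved_write_frame() rather than print a message:

```csharp
using System;

class InterleaveDemo
{
    // Compare timestamp a (in timebase aNum/aDen) with timestamp b (in
    // timebase bNum/bDen) by cross-multiplying, so no floating point is
    // needed. Returns <0 if a is earlier, >0 if b is earlier, 0 if equal.
    public static int CompareTs(long aPts, int aNum, int aDen,
                                long bPts, int bNum, int bDen)
    {
        long a = aPts * aNum * bDen;
        long b = bPts * bNum * aDen;
        return a.CompareTo(b);
    }

    static void Main()
    {
        long videoPts = 0, audioPts = 0;
        // Hypothetical streams: video at timebase 1/30, audio at 1/44100
        // with 1024-sample packets. Whichever stream is behind gets written.
        for (int i = 0; i < 8; i++)
        {
            if (CompareTs(videoPts, 1, 30, audioPts, 1, 44100) <= 0)
            {
                Console.WriteLine($"write video frame, pts={videoPts}");
                videoPts += 1;        // advance by one frame (1/30 s)
            }
            else
            {
                Console.WriteLine($"write audio packet, pts={audioPts}");
                audioPts += 1024;     // advance by 1024 samples (~23 ms)
            }
        }
    }
}
```

The cross-multiplication avoids floating-point drift, which matters because PTS comparisons have to be exact for the muxer to interleave correctly.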

Related

Microsoft SpeechSynthesizer crackles when outputting to files and streams

I'm writing a thing that uses the SpeechSynthesizer to generate wave files on request, but I'm having problems with crackling noises. The weird thing is that output directly to the sound card is just fine.
This short PowerShell script demonstrates the issue, though I'm writing my program in C#.
Add-Type -AssemblyName System.Speech
$speech = New-Object System.Speech.Synthesis.SpeechSynthesizer
$speech.Speak('Guybrush Threepwood, mighty pirate!')
$speech.SetOutputToWaveFile("${PSScriptRoot}\foo.wav")
$speech.Speak('Guybrush Threepwood, mighty pirate!')
What this should do, is output to the speakers, and then save that same sound as "foo.wav" next to the script.
What it does is output to the speakers, and then save a crackling, old record player sounding version as a wave file. I've tested this on three different machines, and though they select different voices by default (all Microsoft provided default ones), they all sound like garbage falling down stairs in the wave file.
Why?
EDIT: I am testing this on Windows 10 Pro, with the latest updates that add that annoying "People" button on the taskbar.
EDIT 2: Here's a link to an example sound generated with the above script. Notice the crackling voice, that's not there when the script outputs directly to the speakers.
EDIT 3: It's even more noticeable with a female voice
EDIT 4: The same voice as above, saved to file with TextAloud 3 - no crackling, no vertical spikes.
I find it hard to believe this is a PoSH issue.
It's not PoSH doing the encoding on serialization to disk; it's the API/class that is being used.
msdn.microsoft.com/en-us/library/system.speech.synthesis.speechsynthesizer(v=vs.110).aspx
As per MSDN, there is no option to control the encoding, bit rate, etc.
.wav has never been HQ stuff. So I'd wonder, if you ran that .wav through a converter to make it an .mp3 or .mp4, whether that would correct your quality concerns. But that also means getting the converter onto users' systems.
Secondly, since Win8 the default player does not even play .wav correctly, or at all. Sure, you can still set the default player for .wav to Windows Media Player, or open the file in VLC, but it's still a .wav file. Yet that also means having to set the media player assignment on every target system.
This is an issue with the SpeechSynthesizer API, which simply provides bad quality, crackling audio as seen in the samples above. The solution is to do what TextAloud does, which is to use the SpeechLib COM objects directly.
This is done by adding a COM reference to "Microsoft Speech Object Library (5.4)". Here is a snippet of the code I ended up with, which produces audio clips of the same quality as TextAloud:
public new static byte[] GetSound(Order o)
{
    const SpeechVoiceSpeakFlags speechFlags = SpeechVoiceSpeakFlags.SVSFlagsAsync;
    var synth = new SpVoice();
    var wave = new SpMemoryStream();
    var voices = synth.GetVoices();
    try
    {
        // synth setup
        synth.Volume = Math.Max(1, Math.Min(100, o.Volume ?? 100));
        synth.Rate = Math.Max(-10, Math.Min(10, o.Rate ?? 0));
        foreach (SpObjectToken voice in voices)
        {
            if (voice.GetAttribute("Name") == o.Voice.Name)
            {
                synth.Voice = voice;
            }
        }
        wave.Format.Type = SpeechAudioFormatType.SAFT22kHz16BitMono;
        synth.AudioOutputStream = wave;
        synth.Speak(o.Text, speechFlags);
        synth.WaitUntilDone(Timeout.Infinite);
        var waveFormat = new WaveFormat(22050, 16, 1);
        using (var ms = new MemoryStream((byte[])wave.GetData()))
        using (var reader = new RawSourceWaveStream(ms, waveFormat))
        using (var outStream = new MemoryStream())
        using (var writer = new WaveFileWriter(outStream, waveFormat))
        {
            reader.CopyTo(writer);
            writer.Flush(); // ensure the WAV header and samples are written out
            // ToArray() rather than GetBuffer(): GetBuffer() returns the whole
            // underlying buffer, including unused capacity past the data.
            return o.Mp3 ? ConvertToMp3(outStream) : outStream.ToArray();
        }
    }
    finally
    {
        Marshal.ReleaseComObject(voices);
        Marshal.ReleaseComObject(wave);
        Marshal.ReleaseComObject(synth);
    }
}
This is the code to convert a wave file to mp3. It uses NAudio.Lame from nuget.
internal static byte[] ConvertToMp3(Stream wave)
{
    wave.Position = 0;
    using (var mp3 = new MemoryStream())
    using (var reader = new WaveFileReader(wave))
    using (var writer = new LameMP3FileWriter(mp3, reader.WaveFormat, 128))
    {
        reader.CopyTo(writer);
        writer.Flush(); // flush the encoder's buffered frames before reading mp3
        mp3.Position = 0;
        return mp3.ToArray();
    }
}

Windows Form vs Unity3D HttpWebRequest Performance

I am working on a simple program that grabs image from a remote IP camera. After days of research, I was able to extract JPEG images from MJPEG live stream with sample codes I got.
I did a prototype using Windows Forms. With Windows Forms, I receive approximately 80 images every 10 seconds from the IP camera.
Now I've ported the code to Unity3D and I get about 2 frames every 10 seconds.
So basically, about 78 images are not received.
The thing looks like a medieval PowerPoint slide show.
I am running the function in a new Thread, just like I did in the Windows Forms app. At first I thought the problem in Unity was that I was displaying the image, but it wasn't.
I removed the code that displays the image as a texture and used an integer to count the number of images received. Still, I get about 2 to 4 images every 10 seconds in Unity, whereas in the Windows Forms app I get about 80 to 100 images every 10 seconds.
Receiving 2 images in 10 seconds in Unity is unacceptable for what I am doing. The code I wrote doesn't seem to be the problem, because it works great in Windows Forms.
Things I've Tried:
I thought the problem was the Unity3D Editor run-time, so I built for Windows 10 64-bit and ran that, but it didn't solve the problem.
Changed the Scripting Backend from Mono2x to IL2CPP, but the problem still remains.
Changed the Api Compatibility Level from .NET 2.0 to .NET 2.0 Subset, and nothing changed.
Below is the function that is having the problem. It runs too slowly in Unity even though I call it from another thread.
bool keepRunning = true;

private void Decode_MJPEG_Images(string streamTestURL = null)
{
    keepRunning = true;
    streamTestURL = "http://64.122.208.241:8000/axis-cgi/mjpg/video.cgi?resolution=320x240"; // For testing purposes only
    // create HTTP request
    HttpWebRequest req = (HttpWebRequest)WebRequest.Create(streamTestURL);
    // get response
    WebResponse resp = req.GetResponse();
    System.IO.Stream imagestream = resp.GetResponseStream();
    const int BufferSize = 5000000;
    byte[] imagebuffer = new byte[BufferSize];
    int a = 2;
    int framecounter = 0;
    int startreading = 0;
    byte[] start_checker = new byte[2];
    byte[] end_checker = new byte[2];
    while (keepRunning)
    {
        start_checker[1] = (byte)imagestream.ReadByte();
        end_checker[1] = start_checker[1];
        // This if statement searches for the JPEG header, and performs the relevant operations
        if (start_checker[0] == 0xff && start_checker[1] == 0xd8)
        {
            Array.Clear(imagebuffer, 0, imagebuffer.Length);
            // Rebuild the JPEG header at the start of imagebuffer
            imagebuffer[0] = 0xff;
            imagebuffer[1] = 0xd8;
            a = 2;
            framecounter++;
            startreading = 1;
        }
        // This if statement searches for the JPEG footer, and performs the relevant operations
        if (end_checker[0] == 0xff && end_checker[1] == 0xd9)
        {
            startreading = 0;
            // Write the final byte of the JPEG footer into imagebuffer
            imagebuffer[a] = start_checker[1];
            System.IO.MemoryStream jpegstream = new System.IO.MemoryStream(imagebuffer);
            Debug.Log("Received Full Image");
            Debug.Log(framecounter.ToString());
            // Display Image
        }
        // This if statement fills the imagebuffer, if the relevant flags are set
        if (startreading == 1 && a < BufferSize)
        {
            imagebuffer[a] = start_checker[1];
            a++;
        }
        // Catches the error condition where a reaches the buffer size - this should not happen in normal operation
        if (a == BufferSize)
        {
            a = 2;
            startreading = 0;
        }
        start_checker[0] = start_checker[1];
        end_checker[0] = end_checker[1];
    }
    resp.Close();
}
Now I am blaming HttpWebRequest for this problem. Maybe it was poorly implemented in Unity. Not sure....
What's going on? Why is this happening? How can I fix it?
Is it perhaps the case that one has to use Read (pulling many bytes at a time) instead of ReadByte?
From the Stream.Read documentation:
https://msdn.microsoft.com/en-us/library/system.io.stream.read(v=vs.110).aspx
"Return Value. Type: System.Int32. The total number of bytes read into the buffer. This can be less than the number of bytes requested if that many bytes are not currently available."
Conceivably, ReadAsync could help (see the manual), although it results in wildly different code.
I'm a bit puzzled as to which part of your code you are saying has the performance problem - is it displaying the images, or is it the snippet of code you've published here? Assuming the HttpWebRequest isn't your problem (which you can easily test in Fiddler to see how long the call and fetch actually take), I'm guessing your problem is in the display of the images, not the code you've posted (which won't behave differently between WinForms and Unity).
My guess is, if the problem is in Unity, that you are passing the created MemoryStream to Unity to create a graphics resource. Your code reads the stream and, when it hits the end-of-image marker, creates a new MemoryStream containing the entire data buffer. This may be a problem for Unity that isn't a problem in WinForms - the memory stream contains the whole buffer you created, which is bigger than the actual content you read. Does Unity perhaps see this as a corrupted JPEG?
Try using the MemoryStream constructor that takes a byte range from your byte[] and pass through just the data you know makes up your image.
Other issues in the code might be (though they are unlikely to be performance related): Large Object Heap fragmentation from the creation and discard of the large byte[]; non-dynamic storage of the incoming stream (fixed destination buffer size); and no checking of the incoming stream size or end-of-stream indicators (if the response stream does not contain the whole image, there doesn't seem to be a strategy to deal with it).
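One concrete suspect is that ReadByte() is called once per byte; on an unbuffered network stream each call can be expensive. Below is a hedged sketch of the same marker-scanning logic reading in large chunks instead. The class and method names are invented for illustration, and it is deliberately simplified: a real JPEG payload can itself contain 0xFF 0xD8 sequences (e.g. EXIF thumbnails), which this scanner does not handle.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

class MjpegChunkReader
{
    // Scan an MJPEG byte stream in large chunks and return each complete
    // JPEG frame (from SOI marker 0xFF 0xD8 to EOI marker 0xFF 0xD9)
    // as its own byte[], sized exactly to the frame's content.
    public static List<byte[]> ReadFrames(Stream stream, int chunkSize = 64 * 1024)
    {
        var frames = new List<byte[]>();
        var current = new List<byte>();
        var chunk = new byte[chunkSize];
        bool inFrame = false;
        int prev = -1;   // previous byte, carried across chunk boundaries
        int read;
        while ((read = stream.Read(chunk, 0, chunk.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                int b = chunk[i];
                if (!inFrame && prev == 0xFF && b == 0xD8)
                {
                    // Start of image: begin a new frame with the SOI marker.
                    current.Clear();
                    current.Add(0xFF);
                    current.Add(0xD8);
                    inFrame = true;
                }
                else if (inFrame)
                {
                    current.Add((byte)b);
                    if (prev == 0xFF && b == 0xD9)
                    {
                        frames.Add(current.ToArray()); // complete JPEG frame
                        inFrame = false;
                    }
                }
                prev = b;
            }
        }
        return frames;
    }
}
```

Because each returned byte[] is exactly the frame's length, this also addresses the "whole oversized buffer passed to Unity" concern above.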

How to append image on running video using C#

I want to overlay images on a running video using C#.
Here is my code, but it is not working:
byte[] mainAudio = System.IO.File.ReadAllBytes(Server.MapPath(image path)); // uploaded by user
byte[] intreAudio = System.IO.File.ReadAllBytes(Server.MapPath(video path)); // file selected for interruption
List<byte> int1 = new List<byte>(mainAudio);
int1.AddRange(intreAudio);
byte[] gg = int1.ToArray();
using (FileStream fs =
    System.IO.File.Create(Server.MapPath(@"\TempBasicAudio\myfile1.mp3")))
{
    if (gg != null)
    {
        fs.Write(gg, 0, gg.Length);
    }
}
Did it ever occur to you that a video file is not just a mindless "array of images", so you cannot simply append another byte range at the end?
Depending on the video type, there is a quite complex structure of management data that you are ignoring. Video is a highly complex encoding.
You may have to add the images in a specific form while updating that management information - or you may even have to transcode the video (decode all the images, then re-encode the whole video stream).
Maybe a book about the basics of video processing is in order now? You are like the guy asking why you cannot get more horsepower out of your car by running it on rocket fuel - totally ignoring the realities of how cars operate.
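Since byte-level concatenation cannot work, the practical route is usually to re-encode with a tool that understands the container - for example by shelling out to FFmpeg's overlay filter. A hedged sketch (the file names are placeholders, and it assumes an ffmpeg binary is available on the PATH):

```csharp
using System.Diagnostics;

// Overlay logo.png at position (10,10) on every frame of input.mp4,
// re-encoding the video stream and copying the audio unchanged.
// Paths are placeholders - substitute your own files.
var ffmpeg = Process.Start(new ProcessStartInfo
{
    FileName = "ffmpeg",
    Arguments = "-i input.mp4 -i logo.png -filter_complex overlay=10:10 -codec:a copy output.mp4",
    UseShellExecute = false
});
ffmpeg.WaitForExit();
```

The key point is that the overlay happens inside the decode/re-encode pipeline, so the container's management structures stay consistent.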

Play sound on specific channel with NAudio

I'm using NAudio for sound work, and I have a specific task: I have an 8-channel sound card. How can I play a sound on only 1, 2 or more specific channels? For example, I have this code:
Device = new WaveOut();
var provider = new SineWaveProvider32();
provider.SetWaveFormat(44100, 1);
provider.Frequency = 1000f;
provider.Amplitude = 1f;
Device.DeviceNumber = number;
Device.Init(provider);
Device.Play();
This code plays the sound on all channels.
What do I need to change?
You could use a MultiplexingWaveProvider and pass in a silence-producing wave provider for one channel and a SineWaveProvider32 for the other.
Also note that your sound card may not necessarily support multi-channel audio through the WaveOut API.
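A hedged sketch of that suggestion, assuming the SineWaveProvider32 class from the question and NAudio's MultiplexingWaveProvider and SilenceProvider (check that your NAudio version provides both). The silence input matters because, by default, MultiplexingWaveProvider maps its input channels onto the output channels in a repeating pattern, so a single mono input would otherwise end up on every output:

```csharp
using NAudio.Wave;

// SineWaveProvider32 is the custom provider from the question.
var sine = new SineWaveProvider32();
sine.SetWaveFormat(44100, 1);
sine.Frequency = 1000f;
sine.Amplitude = 1f;

// A mono silence source with the same format as the sine.
var silence = new SilenceProvider(sine.WaveFormat);

// Two mono inputs feeding an 8-channel output:
// input channel 0 = silence, input channel 1 = sine.
var mux = new MultiplexingWaveProvider(
    new IWaveProvider[] { silence, sine }, 8);

// Route silence to every output, then put the sine on the one
// channel we want - here, output channel 3.
for (int ch = 0; ch < 8; ch++)
    mux.ConnectInputToOutput(0, ch);
mux.ConnectInputToOutput(1, 3);

var device = new WaveOut { DeviceNumber = number }; // number from the question
device.Init(mux);
device.Play();
```

To play on several channels at once, call ConnectInputToOutput(1, ch) for each additional channel.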

how to improve my code to make better video quality?

I am using the following code to leverage Windows Media Encoder to record the screen. I am using Windows Vista, screen resolution 1024 × 768, 32-bit. My issue is that the video is recorded successfully, but when I play back the recording, the quality is not very good - e.g. characters are very blurry. I am wondering which parameters I should tune to get better quality in the recorded video.
My code,
static WMEncoder encoder = new WMEncoder();

IWMEncSourceGroup SrcGrp;
IWMEncSourceGroupCollection SrcGrpColl;
SrcGrpColl = encoder.SourceGroupCollection;
SrcGrp = (IWMEncSourceGroup)SrcGrpColl.Add("SG_1");
IWMEncVideoSource2 SrcVid;
SrcVid = (IWMEncVideoSource2)SrcGrp.AddSource(WMENC_SOURCE_TYPE.WMENC_VIDEO);
SrcVid.SetInput("ScreenCap://ScreenCapture1", "", "");
IWMEncFile File = encoder.File;
File.LocalFileName = "C:\\OutputFile.avi";
// Choose a profile from the collection.
IWMEncProfileCollection ProColl = encoder.ProfileCollection;
IWMEncProfile Pro;
for (int i = 0; i < ProColl.Count; i++)
{
    Pro = ProColl.Item(i);
    if (Pro.Name == "Windows Media Video 8 for Local Area Network (384 Kbps)")
    {
        SrcGrp.set_Profile(Pro);
        break;
    }
}
encoder.Start();
thanks in advance,
George
Video encoders use a certain kbit/second budget to limit the size of the generated stream. The fewer kbit/s, the less detail you get, due to fewer surviving DCT coefficients and bigger quantization values. In other words: the more kbit/s you give the encoder, the more detail it can store in the stream.
Judging by your code, you have chosen a profile that uses 384 kbit/s, which is not very much for a 1024×768 video. You should try other profiles, or set the bitrate you want yourself.
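A rough way to sanity-check a profile is the bits-per-pixel-per-frame figure. This sketch shows why 384 kbps is far too little at 1024×768 (the 15 fps value is an assumption for screen capture, and the ~0.05 threshold is only a rule of thumb, not a standard):

```csharp
using System;

class BitrateCheck
{
    // Heuristic: bitrate / (width * height * fps). Values well below
    // roughly 0.05 bits/pixel/frame usually mean visibly poor quality
    // for screen content with sharp text.
    public static double BitsPerPixel(int bitrate, int width, int height, double fps)
        => bitrate / (width * height * fps);

    static void Main()
    {
        // The 384 kbps LAN profile at 1024x768, assuming ~15 fps capture:
        double bpp = BitsPerPixel(384_000, 1024, 768, 15.0);
        Console.WriteLine($"{bpp:F4} bits/pixel/frame"); // ~0.0326
    }
}
```

At roughly 0.03 bits per pixel per frame, the quantizer has to discard most of the fine detail, which is exactly why text comes out blurry.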
