Using C# code, I want to take a number of images, add some music and create a video.
I think I can best explain what I want in pseudo-code...:
var video = new Video(1080, 1920); //create a video 1080*1920px
video.AddFrame(@"C:\temp\frame01.jpg", 2000); //show frame for 2000ms
video.AddTransition(Transitions.Fade, 500); //fade from first to second frame for 500ms
video.AddFrame(@"C:\temp\frame02.jpg", 1000); //show frame for 1000ms
video.AddTransition(Transitions.Fade, 500); //fade from second to third frame for 500ms
video.AddFrame(@"C:\temp\frame03.jpg", 2000); //show frame for 2000ms
video.AddSound(@"C:\temp\mymusic.mp3"); //added from start of video
video.Save(@"C:\temp\MyAwesomeVideo.avi", Format.MP4);
Does something like this exist?
I know there are a couple of older libraries that wrap ffmpeg to create slideshows, but the ones I looked at are insanely tricky to get working and designed for something quite different.
Backstory:
I created a system for a cinema which, every week, generates a number of images from movie posters, showtimes, etc. I would like to take those images and turn them into a video that will be shared on social media.
Possibly check out AForge.NET.
I have used this previously, and the results were sufficient, especially considering the ease with which you can construct a video. It uses FFmpeg under the hood, so you don't need to concern yourself with extensive terminal commands.
A possible downside is that there is no immediately available option (to my knowledge) to add transitions or to keep a still image on the screen for more than one frame, so you would need to implement those capabilities yourself.
Keep in mind that AForge.NET is licensed under GPLv3.
Check out an example of the VideoFileWriter class
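To give an idea, here is a rough sketch of how frame durations and a fade could be built on top of VideoFileWriter from AForge.Video.FFMPEG. It assumes the source images already match the video size, and note that VideoFileWriter does not handle audio, so the music would have to be muxed in separately (e.g. with a plain ffmpeg command):

using System.Drawing;
using System.Drawing.Imaging;
using AForge.Video.FFMPEG;

const int frameRate = 25;

using (var writer = new VideoFileWriter())
{
    writer.Open(@"C:\temp\MyAwesomeVideo.avi", 1080, 1920, frameRate, VideoCodec.MPEG4);

    using (var frame01 = (Bitmap)Image.FromFile(@"C:\temp\frame01.jpg"))
    using (var frame02 = (Bitmap)Image.FromFile(@"C:\temp\frame02.jpg"))
    {
        // Show the first frame for 2000 ms by writing it repeatedly.
        for (int i = 0; i < 2 * frameRate; i++)
            writer.WriteVideoFrame(frame01);

        // Fade to the second frame over 500 ms by blending the two bitmaps per frame.
        int fadeFrames = frameRate / 2;
        for (int i = 0; i < fadeFrames; i++)
        {
            using (var blended = new Bitmap(1080, 1920))
            using (var g = Graphics.FromImage(blended))
            {
                g.DrawImage(frame01, 0, 0, 1080, 1920);

                // Draw the second frame on top with increasing opacity.
                var matrix = new ColorMatrix { Matrix33 = (i + 1) / (float)fadeFrames };
                var attributes = new ImageAttributes();
                attributes.SetColorMatrix(matrix);
                g.DrawImage(frame02, new Rectangle(0, 0, 1080, 1920),
                    0, 0, frame02.Width, frame02.Height, GraphicsUnit.Pixel, attributes);

                writer.WriteVideoFrame(blended);
            }
        }

        // Show the second frame for 1000 ms.
        for (int i = 0; i < frameRate; i++)
            writer.WriteVideoFrame(frame02);
    }

    writer.Close();
}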
I am working on a rhythm game, which means I want screen elements to be displayed to the beat. For that I need to be able to tell where on the timeline of the music I currently am. I am using the NAudio library (ASIO, to be more specific), and I know that I can read the CurrentTime property of the WaveStream I'm currently playing to find out how much time has passed since the music started playing. However, that property doesn't seem to be very accurate (at least not nearly accurate enough to fluently display a moving screen element), as it's just an approximation based on the average bytes per second. That's why I'm trying to sync a Stopwatch to the music instead. However, no matter what I try, I always get an offset. My current setup works like this:
using System.Diagnostics;
using System.Threading.Tasks;
using NAudio.Wave;

var audioFileReader = new AudioFileReader("music.mp3");
var waveChannel32 = new WaveChannel32(audioFileReader);
var asioOut = new AsioOut(0);
var stopWatch = new Stopwatch();

asioOut.Init(waveChannel32);
asioOut.Play();

// Give playback a moment to start, then rewind the stream and start timing.
Task.Delay(1000).Wait();
waveChannel32.Position = 0;
stopWatch.Start();
If I then compare stopWatch.Elapsed with waveChannel32.CurrentTime, the stopwatch looks spot on (it jumps around by ±10 ms because the latter is inaccurate, but on average they're about equal). If, however, I compare it with the files from other rhythm games (which note at which timestamps the beats land), my stopwatch turns out to be 50±25 ms early. Now, one could argue that those files may not be accurate or that the other rhythm games have offsets of their own, but after checking countless files in several games I feel safe saying that I have ruled this out.
What is causing my offset here and how do I fix it?
EDIT:
I'm measuring the offset like this: whenever I press a key I play a click sound (known as a "hitsound" in rhythm games) and tap along with the song, trying to sync not my keyboard input but the resulting hitsound with the beat, and I measure my average offset. This way I don't need to think about hardware and audio driver latency. Of course I'm not a robot and thus not perfectly accurate, but my inaccuracy is normally distributed, so I can just take the average; and even without that I am at least accurate enough to tell that something is wrong.
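Since the error I measure is roughly constant, one obvious workaround is to calibrate it away rather than eliminate it. A minimal sketch (the 50 ms is just the measured average mentioned above, not a magic constant):

// Calibration offset measured by tapping along; treat it as a user-tunable setting.
TimeSpan measuredOffset = TimeSpan.FromMilliseconds(50);

// Use this instead of stopWatch.Elapsed when deciding what to draw to the screen.
TimeSpan songPosition = stopWatch.Elapsed - measuredOffset;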
I have a question about speeding up a couple of Emgu CV calls. Currently I have a capture card that takes in a camera at 1920x1080 @ 30 Hz. Using DirectShow with a sample grabber, I capture each frame and display it on a form. Now I have written an image stabilizer, but the fastest I can run it is about 20 Hz.
The first thing I do in my stabilizer is scale the 1920x1080 frame down to 640x480, because it makes the feature tracking much faster.
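For context, that downscale plus grayscale conversion looks roughly like this (a sketch; the interpolation constant is just one reasonable choice):

// Work on a small grayscale copy for feature detection and tracking.
Image<Gray, byte> frame_gray = frame
    .Resize(640, 480, Emgu.CV.CvEnum.INTER.CV_INTER_LINEAR)
    .Convert<Gray, byte>();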
Then I use goodFeaturesToTrack
var prev_corner = previousFrameGray.GoodFeaturesToTrack(sampleSize, sampleQuality, minimumDistance, blockSize)[0];
which takes about 12-15ms.
The next thing I do is an optical flow calculation using this
OpticalFlow.PyrLK(previousFrameGray, frame_gray, prev_corner.ToArray(), new Size(15, 15), 5, new MCvTermCriteria(5), out temp, out status, out err);
and that takes about 15-18ms.
The last time-consuming method I call is the WarpAffine function
Image<Bgr, byte> warped_frame = frame.WarpAffine(T, interpMethod, Emgu.CV.CvEnum.WARP.CV_WARP_DEFAULT, new Bgr(BackgroundColor));
this takes about 10-12ms.
The rest of the calculations, image scaling and what not take a total of around 7-8ms.
So the total time for a frame calculation is about 48ms or about 21Hz.
Somehow I need to get the total time under 33ms.
So now for my questions.
First: if I switch to using the GPU for GoodFeaturesToTrack and the optical flow, will that provide the necessary increase in speed, if any?
Second: Are there any other methods besides using the GPU that could speed up these calculations?
Well I finally converted the functions to their GPU counterparts and got the speed increase I was looking for. I went from 48ms down to 22ms.
OK, so I have been having a bit of a tough time with webcam capture, and am in need of help to find a way to capture the video at a consistent frame rate.
I have been using AForge's AVIWriter and VideoFileWriter, but to no avail, and have also typed every related phrase I can think of into Google.
I have had a look at DirectShowLib, but have yet to find it any more accurate.
The video must have a minimum frame rate of 25 fps; it is also to be shown in sync with other data that is collected at the same time.
I have also tried an infinite loop:
// Grab the current frame, write it, then sleep 40 ms to aim for 25 fps.
while (recvid)
{
    if (writer.IsOpen)
    {
        Bitmap image = videoSourcePlayer1.GetCurrentVideoFrame();
        if (image != null)
        {
            writer.WriteVideoFrame(image);
        }
        Thread.Sleep(40);
    }
}
Even though this is more accurate for timing, the user can see that the fps changes when they watch the video and view data at the same time.
Any pointers or tips would be greatly appreciated, as I cannot think of a way to go from here.
Two main issues that I can see:
Is the writer.WriteVideoFrame() call happening in a separate thread? If not, it will take time, and hence the timing might not be accurate.
Second, Thread.Sleep() means "sleep for at least 40 ms", not exactly 40 ms. To get better results, reduce the wait to something like 5 ms and sleep in a loop, using the system's time to figure out how long you have actually waited before grabbing the next frame.
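A rough sketch of that pacing idea, reusing the writer, recvid flag and videoSourcePlayer1 from the question (and assuming the whole loop runs on its own thread):

// Pace frame grabs off a Stopwatch instead of a fixed Thread.Sleep(40).
var timer = Stopwatch.StartNew();
long frameIndex = 0;
const double frameIntervalMs = 1000.0 / 25; // target 25 fps

while (recvid)
{
    // Sleep in small steps until the next frame is due.
    double due = frameIndex * frameIntervalMs;
    while (timer.Elapsed.TotalMilliseconds < due)
        Thread.Sleep(5);

    Bitmap image = videoSourcePlayer1.GetCurrentVideoFrame();
    if (image != null && writer.IsOpen)
    {
        writer.WriteVideoFrame(image);
    }
    frameIndex++;
}

If your AForge version has the WriteVideoFrame(Bitmap, TimeSpan) overload, passing timer.Elapsed as the timestamp should also help keep playback speed consistent.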
Hope this helps
With most web cameras (except maybe rare exceptions and higher-end cameras that offer fine control over the capture process) you don't have much control over the camera frame rate. The camera will capture a stream of frames at its maximal frame rate for the given mode of operation, capped mainly by resolution and data bandwidth, possibly dropping lower in low-light conditions.
No Thread.Sleep is going to help you there, because it is way too slow and unresponsive. To capture 25 fps, the hardware needs to run smoothly without interruptions or explicit "capture the next frame now" instructions: the camera keeps pushing new data onto one end of the queue while you pop captured frames off the other end. You typically have a lag of a few video frames even with decent hardware.
I am trying to apply image manipulation effects directly to the camera feed in a Windows 8 app.
I have tried a canvas-based approach, redrawing images after applying effects to the frames obtained from the webcam. This works fine for basic effects, but for effects like edge detection it creates a large lag and flickering.
The other way is to create an MFT (Media Foundation Transform), but that has to be implemented in C/C++, which I have no experience with.
Can anyone tell me how I can apply effects to the webcam stream directly in a Windows 8 Metro style app: either by improving the canvas approach so that heavy effects like edge detection don't cause issues, by using an MFT from C# (the language I have worked with), or by some other approach?
I have just played quite a bit in this area the last week and even considered writing a blog post about it. I guess this answer can be just as good.
You can go the MFT way, which needs to be done in C++, but the things you would need to write would not be much different between C# and C++. The only thing of note is that I think the MFT works in YUV color space, so your typical convolution filters/effects might behave a bit differently or require conversion to RGB. If you decide to go that route, on the C# application side the only thing you would need to do is call MediaCapture.AddEffectAsync(). Well, that and editing your Package.appxmanifest, but let's take first things first.
If you look at the Media capture using webcam sample, it already does what you need: it applies a grayscale effect to your camera feed. It includes a C++ MFT project that is used in an application that is available in a C# version. I had to apply the effect to a MediaElement, which might not be what you need, but it is just as simple: call MediaElement.AddVideoEffect() and your video file playback now applies the grayscale effect. To be able to use the MFT, you simply need to add a reference to the GrayscaleTransform project and add the following lines to your appxmanifest:
<Extensions>
  <Extension Category="windows.activatableClass.inProcessServer">
    <InProcessServer>
      <Path>GrayscaleTransform.dll</Path>
      <ActivatableClass ActivatableClassId="GrayscaleTransform.GrayscaleEffect" ThreadingModel="both" />
    </InProcessServer>
  </Extension>
</Extensions>
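The C# side hook-up is then roughly this (a sketch; mediaCapture and mediaElement are assumed to be objects you have already initialized in your app):

// Register the MFT from the referenced GrayscaleTransform project on the camera preview stream.
await mediaCapture.AddEffectAsync(
    Windows.Media.Capture.MediaStreamType.VideoPreview,
    "GrayscaleTransform.GrayscaleEffect",   // activatable class id from the manifest above
    null);

// Or, for video file playback in XAML:
mediaElement.AddVideoEffect("GrayscaleTransform.GrayscaleEffect", true, null);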
How the MFT code works:
The following lines create a pixel color transformation matrix
float scale = (float)MFGetAttributeDouble(m_pAttributes, MFT_GRAYSCALE_SATURATION, 0.0f);
float angle = (float)MFGetAttributeDouble(m_pAttributes, MFT_GRAYSCALE_CHROMA_ROTATION, 0.0f);
m_transform = D2D1::Matrix3x2F::Scale(scale, scale) * D2D1::Matrix3x2F::Rotation(angle);
Depending on the pixel format of the video feed - a different transformation method is selected to scan the pixels. Look for these lines:
m_pTransformFn = TransformImage_YUY2;
m_pTransformFn = TransformImage_UYVY;
m_pTransformFn = TransformImage_NV12;
For my sample m4v file - the format is detected as NV12, so it is calling TransformImage_NV12.
For pixels within the specified range (m_rcDest) or within the entire screen if no range was specified - the TransformImage_~ methods call TransformChroma(mat, &u, &v).
For other pixels - the values from original frame are copied.
TransformChroma transforms the pixels using m_transform. If you want to change the effect - you can simply change the m_transform matrix or if you need access to neighboring pixels as in an edge detection filter - modify the TransformImage_ methods to process these pixels.
This is one way to do it. I think it is quite CPU intensive, so personally I prefer to write a pixel shader for such operations. How do you apply a pixel shader to a video stream though? Well, I am not quite there yet, but I believe you can transfer video frames to a DirectX surface fairly easily and call a pixel shader on them later. So far I was able to transfer the video frames, and I am hoping to apply the shaders next week. I might write a blog post about it.
I took the meplayer class from the Media engine native C++ playback sample and moved it to a template C++ DirectX project converted to a WinRTComponent library, then used it with a C#/XAML application, associating the swapchain the meplayer class creates with the SwapChainBackgroundPanel that I use in the C# project to display the video.
I had to make a few changes in the meplayer class. First, I had to move it to a public namespace that would make it available to other assemblies. Then I had to modify the swapchain it creates to a format accepted for use with a SwapChainBackgroundPanel:
DXGI_SWAP_CHAIN_DESC1 swapChainDesc = {0};
swapChainDesc.Width = m_rcTarget.right;
swapChainDesc.Height = m_rcTarget.bottom;
// Most common swapchain format is DXGI_FORMAT_R8G8B8A8_UNORM
swapChainDesc.Format = m_d3dFormat;
swapChainDesc.Stereo = false;
// Don't use Multi-sampling
swapChainDesc.SampleDesc.Count = 1;
swapChainDesc.SampleDesc.Quality = 0;
//swapChainDesc.BufferUsage = DXGI_USAGE_BACK_BUFFER | DXGI_USAGE_RENDER_TARGET_OUTPUT;
swapChainDesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT; // Allow it to be used as a render target.
// Use more than 1 buffer to enable Flip effect.
//swapChainDesc.BufferCount = 4;
swapChainDesc.BufferCount = 2;
//swapChainDesc.Scaling = DXGI_SCALING_NONE;
swapChainDesc.Scaling = DXGI_SCALING_STRETCH;
swapChainDesc.SwapEffect = DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL;
swapChainDesc.Flags = 0;
Finally - instead of calling CreateSwapChainForCoreWindow - I am calling CreateSwapChainForComposition and associating the swapchain with my SwapChainBackgroundPanel:
// Create the swap chain and then associate it with the SwapChainBackgroundPanel.
DX::ThrowIfFailed(
    spDXGIFactory.Get()->CreateSwapChainForComposition(
        spDevice.Get(),
        &swapChainDesc,
        nullptr, // allow on all displays
        &m_spDX11SwapChain)
    );

ComPtr<ISwapChainBackgroundPanelNative> dxRootPanelAsSwapChainBackgroundPanel;

// Set the swap chain on the SwapChainBackgroundPanel.
reinterpret_cast<IUnknown*>(m_swapChainPanel)->QueryInterface(
    IID_PPV_ARGS(&dxRootPanelAsSwapChainBackgroundPanel)
    );

DX::ThrowIfFailed(
    dxRootPanelAsSwapChainBackgroundPanel->SetSwapChain(m_spDX11SwapChain.Get())
    );
EDIT:
Forgot about one more thing. If your goal is to stay in pure C#: if you figure out how to capture frames to a WriteableBitmap (maybe by calling MediaCapture.CapturePhotoToStreamAsync() with a MemoryStream and then calling WriteableBitmap.SetSource() on the stream), you can use WriteableBitmapEx to process your images. It might not be top performance, but if your resolution is not too high or your frame rate requirements are not high, it might just be enough. The project on CodePlex does not officially support WinRT yet, but I have a version that should work that you can try here (Dropbox).
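A rough sketch of that capture path (with the caveat that SetSource wants an IRandomAccessStream, so I am assuming an InMemoryRandomAccessStream rather than a plain MemoryStream):

// Grab one frame from the camera into memory and load it into a WriteableBitmap.
var stream = new Windows.Storage.Streams.InMemoryRandomAccessStream();
await mediaCapture.CapturePhotoToStreamAsync(
    Windows.Media.MediaProperties.ImageEncodingProperties.CreateJpeg(), stream);

stream.Seek(0);
var bitmap = new Windows.UI.Xaml.Media.Imaging.WriteableBitmap(1, 1); // resized when the source is set
bitmap.SetSource(stream);

// bitmap.PixelBuffer now holds the decoded pixels for WriteableBitmapEx-style processing.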
As far as I know, MFTs need to be implemented in C++. I believe that there is a media transform SDK sample which shows implementing some straightforward transforms from a metro style application.
We have an application where we get a message from an external system, then take a picture, do some processing, and return something back to the external system. Doing some performance testing, I found two problems (they are somewhat related). I was hoping someone would be able to explain these to me.
1) Does _capture.QueryFrame() buffer frames?
What we see is that if there is a gap between the queries for two frames from a web camera, the second frame is often an older picture and not the one from the moment QueryFrame was called.
We were able to mitigate this problem to some extent by discarding some frames, i.e. calling _capture.QueryFrame() 2-3 times and discarding the results.
2) The second issue is that when we timed different parts of the application, we found that clearing the buffer (calling QueryFrame() 2-3 times and not using the results) takes about 65ms, and then this line: Image<Bgr, Byte> source = _capture.QueryFrame() takes about 80ms. These two parts take the biggest chunk of processing time; our actual processing takes only about 20-30ms more.
Is there a faster way (a) to clear the buffer (b) to capture the frame?
If you have experience with OpenCV and know of something related, please do let me know.
I answered a similar question, System.TypeInitializationException using Emgu.CV in C#, and having tested the various possibilities for acquiring an up-to-date frame, I found the method below to be the best.
1) Yes, when you set up a Capture from a webcam, a ring buffer is created to store the images in; this allows efficient allocation of memory.
2) Yes, there is a faster way. Set your Capture device up globally, set it off recording, and call ProcessFrame to get an image from the buffer whenever it can. Now change your QueryFrame simply to copy whatever frame it has just acquired. This should stop your problem of getting the previous frame, and you will now have the most recent frame out of the buffer.
private Capture _capture;
private Image<Bgr, Byte> frame;
private Image<Gray, Byte> grayFrame;

public CameraCapture()
{
    InitializeComponent();
    _capture = new Capture();
    // height / width: your desired capture resolution.
    _capture.SetCaptureProperty(Emgu.CV.CvEnum.CAP_PROP.CV_CAP_PROP_FRAME_HEIGHT, height);
    _capture.SetCaptureProperty(Emgu.CV.CvEnum.CAP_PROP.CV_CAP_PROP_FRAME_WIDTH, width);
    Application.Idle += ProcessFrame;
}

// Runs whenever the UI is idle: keeps pulling the latest frame out of the capture buffer.
private void ProcessFrame(object sender, EventArgs arg)
{
    frame = _capture.QueryFrame();
    grayFrame = frame.Convert<Gray, Byte>();
}

// Callers get a copy of the most recently acquired frame instead of querying the camera directly.
public Image<Bgr, byte> QueryFrame()
{
    return frame.Copy();
}
I hope this helps; if not, let me know and I'll try to tailor a solution to your requirements. Don't forget you can always have your acquisition running on a different thread and invoke the new QueryFrame method.
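A minimal sketch of that threaded variant, reusing the same _capture and frame fields as above (you would also want to take the same lock around the frame.Copy() in QueryFrame):

// Continuously pull frames on a background thread instead of relying on Application.Idle.
private readonly object frameLock = new object();
private volatile bool running = true;

private void StartAcquisition()
{
    var worker = new System.Threading.Thread(() =>
    {
        while (running)
        {
            var latest = _capture.QueryFrame();
            lock (frameLock)
            {
                frame = latest; // keep only the newest frame
            }
        }
    });
    worker.IsBackground = true;
    worker.Start();
}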
Cheers
Chris
This could also be due to the refresh rate of the web camera you are using. My camera works at 60 Hz, so I have a timer that captures a frame every 15 milliseconds.