I am using a MagicLeap headset and the MLCamera API to capture raw video. The output format is YUV_420_888, which I am assuming is YUV420P. The API returns yBuffer, uBuffer and vBuffer separately. I am having trouble combining these channels in C# without Bitmap, since I am using Unity and therefore Mono. What I am trying to do is combine these channels and send the result to my remote Python server, which processes the captured image. To process the image, the server needs a full image. I have tried using just the Y plane to create a grayscale image, but the server couldn't process it, so I need to combine all 3 channels on the client and then compress the result, preferably to JPEG, since that reduces the size drastically. I am processing the images at 420x420, although the camera output is 1920x1080. I have been trying different methods for the last week and a half but couldn't find anything decent. There are a couple of methods, especially for Android, but I don't want to convert to NV21 if I don't have to. I have also seen one that uses ARCore, but I can't use that either since I am on MagicLeap.
PS: Latency and processing time are super important, so if there is a way to convert YCbCr to JPEG directly without going through RGB, I think that would suit my case better, but I don't know if it's possible. In general, I think I lack some basic knowledge, and that is what prevents me from going further.
Any help is greatly appreciated!
I tried something similar in the past and was beating my head against the YUV420 stuff for weeks, but couldn't solve it. In the end, I bought the OpenCV for Unity library. It has custom parts just for the MagicLeap, including reading frames from the camera at reduced resolution for a speed-up.
I'm not sure, however, whether it managed real time. Maybe at the reduced resolution, yes.
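For anyone stuck on the same three-plane output, here is a minimal sketch of how the Y, U and V planes relate to RGB pixels. It is written in Python only to show the arithmetic; the same loops port directly to C#. Everything named here is an assumption, not part of the MLCamera API: the function name `yuv420_to_rgb` is hypothetical, the planes are assumed to carry a row stride (and possibly a pixel stride) as Android's YUV_420_888 does, and the coefficients assume full-range BT.601 YCbCr. Check what your device actually reports before relying on any of it.

```python
def clamp8(x):
    """Round to the nearest integer and clamp into the 0..255 byte range."""
    return max(0, min(255, int(round(x))))


def yuv420_to_rgb(y, u, v, width, height,
                  y_row_stride, uv_row_stride, uv_pixel_stride=1):
    """Convert three 4:2:0 planes into interleaved RGB bytes.

    Each chroma sample covers a 2x2 block of luma samples, so the
    chroma index uses row // 2 and col // 2. The strides allow for
    padded rows (assumed here, not confirmed for MLCamera).
    """
    rgb = bytearray(width * height * 3)
    out = 0
    for row in range(height):
        for col in range(width):
            luma = y[row * y_row_stride + col]
            c = (row // 2) * uv_row_stride + (col // 2) * uv_pixel_stride
            cb = u[c] - 128
            cr = v[c] - 128
            # Full-range BT.601 YCbCr to RGB (an assumption about the source).
            rgb[out] = clamp8(luma + 1.402 * cr)
            rgb[out + 1] = clamp8(luma - 0.344136 * cb - 0.714136 * cr)
            rgb[out + 2] = clamp8(luma + 1.772 * cb)
            out += 3
    return bytes(rgb)
```

On the PS about skipping RGB: JPEG stores YCbCr internally, so an encoder that accepts planar YCbCr input (the TurboJPEG API in libjpeg-turbo can compress from YUV planes, for example) avoids the conversion above entirely. Whether such an encoder is usable from Unity on the headset is something I have not verified.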
Related
I’m making an audio synthesizer and I’m having issues figuring out what to use for audio playback. I’m using physics and math to calculate the source waveforms and then need to feed that waveform to something which can play it as sound. I need something that can 1) play the waveforms I calculate and 2) play multiple sounds simultaneously (like holding one key down on a piano while pressing other keys). I’ve done a fair bit of research into this and I can’t find something that does both of those things. As far as I know, I have 5 potential options:
DirectSound. It can take a waveform (a short[]) as a parameter and play it as sound, and can play multiple sounds simultaneously. But it won’t work with .NET 4.5.
System.Media.SoundPlayer. It works with .NET 4.5 and has better quality audio than DirectSound, but it has to play sound from a .wav file and cannot play multiple sounds at once (nor can multiple instances of SoundPlayer). I ‘trick’ SoundPlayer into working by translating my waveform into .wav format in memory and then sending SoundPlayer a MemoryStream of the in-memory .wav file. Could I potentially achieve control over the playback by altering the stream? I cannot append bytes to the stream (I tried), but I could potentially make the stream an arbitrary size and just rewrite all the bytes in the stream with the next segment of audio data every time the end of the stream is reached.
System.Windows.Controls.MediaElement. I have not experimented with this yet, but from MSDN’s documentation I don’t see a way to send it a waveform in memory without saving it to disk first and then reading it; I don’t think I can send it a stream.
System.Windows.Media.MediaPlayer. I have not experimented with this either, but the documentation says it’s meant to be used as a companion to some kind of animation. I could potentially use this without doing any real (user-perceivable) animation to achieve my desired effect.
An open source solution. I’m hesitant to use an open source solution as I find they are typically poorly documented and not very maintainable, but I am open to ideas if there is one out there that is well documented and can do what I need.
Can anyone offer me any guidance on this or how to create flexible audio playback?
http://naudio.codeplex.com, without a doubt. Mark is a regular here on SO, the project is alive and well, and there are good code examples.
It works. We built some great stuff with it.
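Whichever playback library you pick, the "multiple sounds at once" requirement in the question can also be met before playback, by summing the sample buffers of all active notes into one buffer. The sketch below shows that idea together with the in-memory .wav trick the question describes. It is Python rather than C#, purely to illustrate the arithmetic, and the names `sine`, `mix` and `to_wav_bytes` as well as the 44.1 kHz mono 16-bit format are my own assumptions, not anything from NAudio or SoundPlayer.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44100  # assumed output rate in Hz


def sine(freq_hz, seconds, amplitude=0.3):
    """Generate one voice as floats in the range -1..1."""
    n = int(SAMPLE_RATE * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]


def mix(*voices):
    """Sum voices sample by sample, then hard-clip to -1..1."""
    length = max(len(v) for v in voices)
    mixed = [0.0] * length
    for voice in voices:
        for i, sample in enumerate(voice):
            mixed[i] += sample
    return [max(-1.0, min(1.0, s)) for s in mixed]


def to_wav_bytes(samples):
    """Wrap float samples as a mono 16-bit PCM .wav file held in memory."""
    pcm = struct.pack("<%dh" % len(samples),
                      *(int(s * 32767) for s in samples))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(pcm)
    return buf.getvalue()


# A C major chord: three notes held at the same time.
chord = mix(sine(261.63, 0.5), sine(329.63, 0.5), sine(392.00, 0.5))
wav_data = to_wav_bytes(chord)
```

Keeping each voice's amplitude low (0.3 here) leaves headroom so that three notes can be summed without clipping. A real synthesizer would mix in small blocks as keys go down and up rather than rendering whole notes in advance.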
Recently, I finished my conference application. People can talk to and see each other. To do this, I capture images from the webcam using the DirectShow library (the IntPtr of the buffer is converted to JPEG). Right now I do not have any problems, since the program has only been used on a LAN. But I'm planning to implement an Internet version of it.
So my question is: Should I use something other than JPEG? Should I compare image x and image x+1 and only send the differences? Should I use Motion-JPEG? (Sorry, I do not know anything about Motion-JPEG, but it sounds relevant.)
You are on the right track in recognizing that images change little from frame to frame, and that sending a sequence of JPEGs is not the way to go. I believe MJPEG just sends a sequence of JPEGs, which makes it a poor choice. I do not use C#, but I believe there is a C# wrapper for FFmpeg (a video compression library).
FFmpeg is extremely fast, but it is not well documented and is written in pure ANSI C. I think a better approach in your case is, as you already suggested, to compress the difference between image x and image x-1; this should be enough to provide a significant bandwidth saving.
You should also compress the whole frame every once in a while, or whenever the difference from the previous frame is above a certain threshold.
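The scheme described in the two paragraphs above (send differences, but fall back to a full frame periodically or when the change is large) could be sketched roughly as follows. This is only an illustration in Python on raw 8-bit pixel buffers: `encode_frame`, `decode_frame`, the key interval of 30 frames and the threshold of 12 are all made-up names and values, and zlib stands in for whatever compressor you would really use. A proper video codec does far more (motion compensation, for one).

```python
import zlib


def encode_frame(prev, cur, frames_since_key, key_interval=30, threshold=12.0):
    """Return ("key", data) for a full frame or ("delta", data) for a diff.

    prev and cur are bytes of equal length (one byte per sample).
    A full frame is sent when there is no usable previous frame, when
    key_interval frames have passed, or when the mean absolute change
    per sample exceeds threshold.
    """
    need_key = (prev is None or not cur or len(prev) != len(cur)
                or frames_since_key >= key_interval)
    if not need_key:
        mean_diff = sum(abs(a - b) for a, b in zip(cur, prev)) / len(cur)
        need_key = mean_diff > threshold
    if need_key:
        return ("key", zlib.compress(bytes(cur)))
    # Differences wrap modulo 256 so each one still fits in a byte.
    delta = bytes((a - b) & 0xFF for a, b in zip(cur, prev))
    return ("delta", zlib.compress(delta))


def decode_frame(prev, packet):
    """Rebuild the current frame from a packet and the previous frame."""
    kind, payload = packet
    data = zlib.decompress(payload)
    if kind == "key":
        return data
    return bytes((p + d) & 0xFF for p, d in zip(prev, data))
```

When little changes between frames, the delta buffer is mostly zeros and compresses far better than the frame itself. The periodic full frame also lets a receiver that lost a packet resynchronize.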
Hi,
I am developing a video capture application using C#/.NET. I capture video through a webcam and save it as JPEG images; then I want to make a WMV file from those images. How can I do that? What are the basic steps? Can anybody help?
I am working on this myself. I have found two ways that may be possible - both require the purchase of an outside library.
The first one looks to be the easiest but costs the most. You can use it for free, but you will have to deal with a pop-up telling you to purchase the library: http://bytescout.com/products/developer/imagetovideosdk/imagetovideosdk_convert_jpg_to_video.html
The other involves using Microsoft Expression Encoder 4. I am still working on this one. You can get the free version here: http://www.microsoft.com/download/en/details.aspx?id=18974
C# doesn't natively support much in the way of sound or video so outside reference assemblies seem to be a necessity.
Right now that is the best help I can offer until I figure it out.
I need some help with an algorithm. I'm using an artificial neural network to read an electrocardiogram and trying to recognize some disturbances in the waves. That part is fine: I have the neural network and can test it without problems.
What I'd like to do is let the user open an electrocardiogram (import a JPEG) and have the program find the waves and convert them into the arrays that will feed my ANN, and that is where the problem lies. I wrote some code that reads the image and transforms it into a binary image, but I can't find a good way for the program to locate the waves, since their exact position can vary from hospital to hospital. I need some suggestions for approaches I should use.
If you've got the wave values in a list, you can use a Fourier transform or FFT (fast Fourier transform) to determine the frequency content at any particular time value. Disturbances typically create additional high-frequency content (i.e., sharp, steep waves) that you should be able to use to spot irregularities.
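As a rough illustration of the idea above, the following Python sketch measures what fraction of a window's signal power sits at or above some frequency bin. It uses a naive O(n²) DFT so that it needs nothing outside the standard library; in practice you would use a real FFT routine. The function names and the choice of cutoff bin are assumptions made for the example, and picking a cutoff that separates normal ECG content from disturbances is something you would have to tune on real data.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of DFT bins 0..n/2 (naive O(n^2), for illustration only)."""
    n = len(samples)
    return [abs(sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(samples)))
            for k in range(n // 2 + 1)]


def high_freq_ratio(samples, cutoff_bin):
    """Share of non-DC power found in bins >= cutoff_bin (0.0 to 1.0)."""
    mags = dft_magnitudes(samples)
    power = [m * m for m in mags[1:]]  # power[i] belongs to bin i + 1
    total = sum(power)
    if total == 0:
        return 0.0
    return sum(power[cutoff_bin - 1:]) / total


# Example: a smooth wave versus the same wave with a fast ripple added.
n = 64
smooth = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
rippled = [s + 0.5 * math.sin(2 * math.pi * 20 * t / n)
           for t, s in enumerate(smooth)]
```

Sliding such a window along the trace and flagging windows whose ratio jumps is one simple way to turn "additional high-frequency content" into a number your program can threshold.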
You'd have to assume a certain minimal contrast between the "signal" (the waves) and the background of the image. An edge-finding algorithm might be useful in that case. You could isolate the wave from the background and plot the wave.
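Once the wave has been isolated from the background, turning it into an array can be as simple as scanning the image column by column. The sketch below assumes a lot: the image is already binary (as the question says), grid lines and text have been removed, and 1 marks a trace pixel. The function name `trace_from_binary` is hypothetical. It is shown in Python only to make the steps concrete.

```python
def trace_from_binary(image):
    """Convert a binary image into one amplitude value per column.

    image is a list of rows, each a list of 0/1 values with 1 = trace.
    For every column, the mean row index of the trace pixels is taken
    and flipped so that larger values mean "higher on the page".
    Columns with no trace pixels yield None.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    trace = []
    for col in range(width):
        rows = [r for r in range(height) if image[r][col]]
        if rows:
            trace.append((height - 1) - sum(rows) / len(rows))
        else:
            trace.append(None)
    return trace
```

Because the trace is located per column rather than at a fixed position, it does not matter where on the page a given hospital prints the wave, as long as only one wave is present in the region being scanned.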
This post by Rick Barraza deals with vector fields in Silverlight. You might be able to adapt the concept to your particular problem.
I have a live 16-bit gray-scale video stream that is pushed through a ring-buffer in memory as a raw, uncompressed byte stream (2 bytes per pixel, 2^18 pixels/frame, 32 frames/sec). (This is coming from a scientific grade camera, via a PCI frame-grabber). I would like to do some simple processing on the video (clip dynamic range, colorize, add overlays) and then show it in a window, using C#.
I have this working using Windows Forms & GDI (for each frame, build a Bitmap object, write raw 32-bit RGB pixel values based on my post-processing steps, and then draw the frame using the Graphics class). But this uses a significant chunk of CPU that I'd like to use for other things. So I'm interested in using WPF for its GPU-accelerated video display. (I'd also like to start using WPF for its data binding & layout features.)
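One CPU-side detail that is independent of GDI versus WPF: the "clip dynamic range" step can be reduced to a single table lookup per pixel by precomputing a 65536-entry table whenever the clip window changes. A minimal sketch of that idea follows, in Python for brevity. The function names are made up, and little-endian 16-bit samples are an assumption about the frame-grabber's output.

```python
import struct


def build_lut(low, high):
    """Map every 16-bit value to 8 bits, clipping to the window [low, high]."""
    span = max(1, high - low)
    return bytes(0 if v <= low else 255 if v >= high
                 else (v - low) * 255 // span
                 for v in range(65536))


def frame_to_gray8(raw, lut):
    """Convert raw little-endian 16-bit pixels to 8-bit via the lookup table."""
    pixels = struct.unpack("<%dH" % (len(raw) // 2), raw)
    return bytes(lut[p] for p in pixels)
```

The same table approach extends to colorizing: store three bytes (or a packed 32-bit value) per entry instead of one, and the per-frame loop stays a plain lookup.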
But I've never used WPF before, so I'm unsure how to approach this. Most of what I find online about video & WPF involves reading a compressed video file from disk (e.g. WMV), or getting a stream from a consumer-grade camera using a driver layer that Windows already understands. So it doesn't seem to apply here (but correct me if I'm wrong about this).
So, my questions:
Is there a straightforward, WPF-based way to play video from raw, uncompressed bytes in memory (even if just as 8-bit grayscale, or 24-bit RGB)?
Will I need to build DirectShow filters (or other DirectShow/Media Foundation-ish things) to get the post-processing working on the GPU?
Also, any general advice / suggestions for documentation, examples, blogs, etc that are appropriate to these tasks would be appreciated. Thanks!
Follow-up: After some experimentation, I found WriteableBitmap to be fast enough for my needs, and extremely easy to use correctly: Simply call WritePixels() and any Image controls bound to it will update themselves. InteropBitmap with memory-mapped sections is noticeably faster, but I had to write p/invokes to kernel32.dll to use it on .NET 3.5.
My VideoRendererElement, though very efficient, does use some hackery to make it work. You may also want to experiment with the WriteableBitmap in .NET 3.5 SP1.
The InteropBitmap is also very fast: it is much more efficient than the WriteableBitmap because it's not double buffered, though it can be subject to video tearing.
Some further Google-searching yielded this:
http://www.codeplex.com/VideoRendererElement
I'm looking into it now, and it may be the right approach here. Of course, further thoughts and suggestions are still very much welcome.