Multiple threads reading random files - c#

I have a thread pool in which each thread holds a fixed set of images, say 100 each. I need to detect which of those images contains another image that resides in an image bank.
thread 1 - 100 images
thread 2 - 100 images
thread 3 - 100 images
thread 4 - 100 images
Image Base - 50 images
Now I need every thread to look through the image base and check whether any of the images it holds resembles one of the base images. The image matching itself is done; what I am worried about is several threads opening the same image file at the same time. What would be the proper way to handle this? Ideally I would not have to lock out all the other threads for each I/O operation.
Thanks !

What about something like this, where each thread has a reference to the image bank and provides a delegate that will be called for each of the files in the image bank? Here is a skeleton of what the image bank might look like.
public class ImageBank {
public delegate bool ImageProcessorDelegate(String imageName);
private readonly List<String> _imageBank;
public ImageBank()
{
// initialize _imageBank with list of image file names
}
private int ImageBankCount {
get {
lock(_imageBank) {
return _imageBank.Count;
}
}
}
private List<String> FilterImageBank(ISet<String> exclude)
{
lock(_imageBank)
{
return _imageBank.Where(name => !exclude.Contains(name)).ToList();
}
}
public void ProcessImages(ImageProcessorDelegate imageProcessor)
{
var continueProcessing = true;
var processedImages = new HashSet<String>();
var remainingImages = new List<String>();
do
{
remainingImages = FilterImageBank(processedImages);
while(continueProcessing && remainingImages.Any())
{
var currentImageName = remainingImages[0];
remainingImages.RemoveAt(0);
// protect this image being accessed by multiple threads.
var mutex = new Mutex(false, currentImageName);
if (mutex.WaitOne(0))
{
try
{
// break out of the loop if the processor found what it was looking for.
continueProcessing = imageProcessor(currentImageName);
}
catch (Exception)
{
// exception thrown by the imageProcessor... do something about it.
}
finally
{
// add the current name to the set of images we've already seen and release the mutex we acquired
processedImages.Add(currentImageName);
mutex.ReleaseMutex();
}
}
}
}
while (continueProcessing && processedImages.Count < ImageBankCount); // stop once every bank image has been handled
}
}
Then, each thread would have its list of images (_myImageList) and a ThreadProc that looks something like this:
void ThreadProc(object bank)
{
var imageBank = bank as ImageBank;
foreach(var myImage in _myImageList)
{
imageBank.ProcessImages(imageName =>
{
// do something with imageName and myImage
// return true to continue with the next image from the image bank
// or false to stop processing more images from the image bank
}
);
}
}
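If the worker threads only read the bank images, the operating system can let several of them open the same file at once, so a per-file mutex may not be needed for read-only access. A minimal sketch under that assumption; `SharedImageReader` and `ReadImageBytes` are hypothetical names, not part of the answer above:

```csharp
using System.IO;

public static class SharedImageReader
{
    // Reads the whole file while allowing other threads or processes
    // to open the same file for reading at the same time.
    public static byte[] ReadImageBytes(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        using (var buffer = new MemoryStream())
        {
            stream.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}
```

This only covers readers. If something else has the file open with FileShare.None, or opens it for writing, the open call can still fail with an IOException.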

Assuming all threads work on the same set of images, and assuming the paths to these files are in a list or some other collection, you could try something like this:
// A shared collection
List<string> paths = new List<string>();
// Fill this collection with your fixed set.
IEnumerator<string> e = paths.GetEnumerator();
// Now create all threads and use e as the parameter. Now all threads have the same enumerator.
// Inside each thread you can do this:
while(true)
{
string path;
lock(e)
{
if (!e.MoveNext())
return; // Exit the thread.
path = e.Current;
}
// From here, open the file, read the image, process it, etc.
}
In this example you only put a lock on the enumerator. Only one thread can read from it at the same time. So each time it is called, a different path will come out of it.
Outside the lock you can do all the processing, I/O, etc.
Of course the collection could also be of another type, like an array.
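The same idea of handing out one path at a time can also be written without an explicit lock by using ConcurrentQueue<T> from System.Collections.Concurrent. This is a swapped-in technique rather than the enumerator approach above, and `PathDispatcher`, `ProcessAll`, and the `process` callback are hypothetical names:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PathDispatcher
{
    // Runs workerCount workers; each path is handed to exactly one worker.
    public static void ProcessAll(IEnumerable<string> paths, int workerCount, Action<string> process)
    {
        var queue = new ConcurrentQueue<string>(paths);
        var workers = new Task[workerCount];
        for (int w = 0; w < workerCount; w++)
        {
            workers[w] = Task.Run(() =>
            {
                // TryDequeue is thread-safe; it returns false once the queue is empty.
                while (queue.TryDequeue(out string path))
                {
                    process(path); // open the file, read the image, compare, etc.
                }
            });
        }
        Task.WaitAll(workers);
    }
}
```

As with the enumerator version, only the hand-out of the next path is synchronized; the file I/O and image processing inside `process` run in parallel.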

Related

An ItemsControl is inconsistent with its items source - Problem when using Dispatcher.Invoke()

I'm writing a WPF application (MVVM pattern using MVVM Light Toolkit) to read and display a bunch of internal log files my company uses. The goal is to read from multiple files, extract content from each line, put them in a class object and add the said object to an ObservableCollection. I've set the ItemsSource of a DataGrid on my GUI to this list so it displays the data in neat rows and columns. I have a ProgressBar control in a second window, which during the file read and display process will update the progress.
Setup
Note that all these methods are stripped down to the essentials removing all the irrelevant code bits.
Load Button
When the user selects the directory which contains log files and clicks this button, the process begins. I open up the window that contains the ProgressBar at this point. I use a BackgroundWorker for this process.
public void LoadButtonClicked()
{
_dialogService = new DialogService();
BackgroundWorker worker = new BackgroundWorker
{
WorkerReportsProgress = true
};
worker.DoWork += ProcessFiles;
worker.ProgressChanged += Worker_ProgressChanged;
worker.RunWorkerAsync();
}
ProcessFiles() Method
This reads all files in the directory selected, and processes them one by one. Here, when launching the progress bar window, I'm using Dispatcher.Invoke().
private void ProcessFiles(object sender, DoWorkEventArgs e)
{
LogLineList = new ObservableCollection<LogLine>();
System.Windows.Application.Current.Dispatcher.Invoke(() =>
{
_dialogService.ShowProgressBarDialog();
});
var fileCount = 0;
foreach (string file in FileList)
{
fileCount++;
int currProgress = Convert.ToInt32(fileCount / (double)FileList.Length * 100);
ProcessOneFile(file);
(sender as BackgroundWorker).ReportProgress(currProgress);
}
}
ProcessOneFile() Method
This, as the name suggests, reads one file, goes through it line by line, converts the content to my class objects and adds them to the list.
public void ProcessOneFile(string fileName)
{
if (FileIO.OpenAndReadAllLinesInFile(fileName, out List<string> strLineList))
{
foreach (string line in strLineList)
{
if (CreateLogLine(line, out LogLine logLine))
{
if (logLine.IsRobotLog)
{
LogLineList.Add(logLine);
}
}
}
}
}
So this works just fine, and displays my logs as I want them.
Problem
However, after displaying them, if I scroll my DataGrid, the GUI hangs and gives me the following exception.
System.InvalidOperationException: 'An ItemsControl is inconsistent
with its items source. See the inner exception for more
information.'
After reading about this on SO and with the help of Google, I figured out that this happens because my LogLineList is out of sync with the ItemsSource, which results in a conflict.
Current Solution
I found out that if I put the line of code in ProcessOneFile where I add a class object to my list inside a second Dispatcher.Invoke() it solves my problem. Like so:
if (logLine.IsRobotLog)
{
System.Windows.Application.Current.Dispatcher.Invoke(() =>
{
LogLineList.Add(logLine);
});
}
Now this again works fine, but the problem is that it slows down processing terribly. Whereas previously a log file with 10,000 lines took about 1s, it now takes maybe 5-10 times longer.
Am I doing something wrong, or is this to be expected? Is there a better way to handle this?
ObservableCollection is not thread-safe. The second version works because every Add now runs on the UI thread via the dispatcher.
You can use asynchronous operations to make this kind of flow easier. By awaiting the results and updating the collection and the progress bar as each result arrives, you keep the UI responsive and the code clean.
If you can't or don't want to use asynchronous operations, batch the updates to the collection and apply each batch on the UI thread.
Edit
Something like this as an example
private async void Button_Click(object sender, RoutedEventArgs e)
{
//dir contents
var files = new string[4] { "file1", "file2", "file3", "file4" };
//progress bar for each file
Pg.Value = 0;
Pg.Maximum = files.Length;
foreach(var file in files)
{
await ProcessOneFile(file, entries =>
{
foreach(var entry in entries)
{
LogEntries.Add(entry);
}
});
Pg.Value++;
}
}
public async Task ProcessOneFile(string fileName, Action<List<string>> onEntryBatch)
{
//Get the lines
var lines = await Task.Run(() => GetRandom());
//the max amount of lines you want to update at once
var batchBuffer = new List<string>(100);
//Process lines
foreach (string line in lines)
{
//Create the line
if (CreateLogLine(line, out object logLine))
{
//do your check
if (logLine != null)
{
//add
batchBuffer.Add($"{fileName} -{logLine.ToString()}");
//check if we need to flush
if (batchBuffer.Count != batchBuffer.Capacity)
continue;
//update\flush
onEntryBatch(batchBuffer);
//clear
batchBuffer.Clear();
}
}
}
//One last flush
if(batchBuffer.Count > 0)
onEntryBatch(batchBuffer);
}
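The batching step can also be pulled out into a small helper that hands over each batch as its own list, so the callback never receives a buffer that is cleared afterwards. A sketch assuming batchSize is greater than zero; `Batcher` and `InBatches` are hypothetical names:

```csharp
using System.Collections.Generic;

public static class Batcher
{
    // Splits a sequence into lists of at most batchSize items.
    // The last list may be shorter; empty input yields no lists.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int batchSize)
    {
        var batch = new List<T>(batchSize);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize); // fresh list, the previous one stays intact
            }
        }
        if (batch.Count > 0)
        {
            yield return batch; // final partial batch
        }
    }
}
```

Each batch could then be added to the collection in a single call on the UI thread, which keeps the number of cross-thread calls at roughly one per batch instead of one per line.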
public object SyncLock = new object();
In your constructor:
BindingOperations.EnableCollectionSynchronization(LogLineList, SyncLock);
Then in your function:
if (logLine.IsRobotLog)
{
lock(SyncLock)
{
LogLineList.Add(logLine);
}
}
This will keep the collection synchronized whichever thread you update it from.

Speed improvement using multiple threads

I have a CustomControl called PlaylistView. It displays elements in a playlist with name and thumbnail. The method DisplayPlaylist ensures that a thread is started, in which the individual elements are added one by one and the thumbnails (30th frame) are read out:
public void DisplayPlaylist(Playlist playlist)
{
Thread thread = new Thread(() => DisplayElements(playlist));
thread.Start();
}
private void DisplayElements(Playlist playlist)
{
for (int i = 0; i < playlist.elements.Count; i++)
DisplayElement(playlist.elements[i], i);
}
private void DisplayElement(IPlayable element, int index)
{
VideoSelect videoSelect = null;
if (element is Audio)
//
else if (element is Video)
videoSelect = new VideoSelect(index, element.name, GetThumbnail(element.path, SystemData.thumbnailFrame));
videoSelect.Location = GetElementsPosition(index);
panel_List.BeginInvoke(new Action(() =>
{
panel_List.Controls.Add(videoSelect);
}));
}
private Bitmap GetThumbnail(string path, int frame)
{
VideoFileReader reader = new VideoFileReader();
try
{
reader.Open(path);
for (int i = 1; i < frame; i++)
reader.ReadVideoFrame();
return reader.ReadVideoFrame();
}
catch
{
return null;
}
}
But there is a problem.
It is much too slow (about 10 elements/sec). With a playlist length of 614, you would have to wait more than a minute until all are displayed. Each time you change the playlist, such as adding or deleting an item, the procedure starts with the new item. Adding 2 or more will make it even more complicated.
My next approach was to use multiple threads, with the number of threads chosen by the user (1 to 10). The implementation currently looks like this (only the parts that changed compared to the code above):
public void DisplayPlaylist(Playlist playlist)
{
for (int i = 0; i < SystemData.usedDisplayingThreads; i++)
{
Thread thread = new Thread(() => DisplayElements(playlist, i));
thread.Start();
}
}
private void DisplayElements(Playlist playlist, int startIndex)
{
for (int i = startIndex; i < playlist.elements.Count; i += SystemData.usedDisplayingThreads)
DisplayElement(playlist.elements[i], i);
}
The problem is that GetThumbnail now very often returns null, which causes an error. In addition, a System.AccessViolationException is often thrown.
I believe the cause is having several VideoFileReader instances active at the same time. However, I do not know what exactly triggers the problem, so I cannot offer a solution. Maybe you know what the actual trigger is and how to fix it, or you know other, perhaps more elegant, ways to improve the speed.
I would start by logging which exception is raised in the GetThumbnail method. Your code swallows it and returns null. Change the handler to catch (Exception exc) and write the exception details to a log, or at least inspect them in the debugger. That can give a hint.
Also, I'm pretty sure your VideoFileReader instances are IDisposable, so you have to release them by calling reader.Close(). Maybe previous instances were not disposed and you are trying to open the same file multiple times.
Update: the video frame has to be disposed as well. You will probably need to make a copy of the bitmap if it is still referenced by the reader and that prevents disposal.
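One further thing worth checking in the multi-threaded version: the lambda passed to new Thread captures the loop variable i itself, not its value, and a C# for loop shares one i across all iterations. A thread that starts late can therefore see a later value of i, so some start indices may be processed twice and others skipped. Copying the variable into a local inside the loop body avoids this. A self-contained sketch; `WorkerStarter`, `StartWorkers`, and the `work` callback are hypothetical stand-ins for DisplayPlaylist and DisplayElements:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class WorkerStarter
{
    // Starts threadCount threads and passes each one its own start index.
    public static List<Thread> StartWorkers(int threadCount, Action<int> work)
    {
        var threads = new List<Thread>();
        for (int i = 0; i < threadCount; i++)
        {
            int startIndex = i; // per-iteration copy; the lambda captures this, not i
            var thread = new Thread(() => work(startIndex));
            thread.Start();
            threads.Add(thread);
        }
        return threads;
    }
}
```

Whether this is the cause of the AccessViolationException is not certain; it only guarantees that every thread receives a distinct start index.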

Having 1 thread execute multiple methods at the same time

So I have a list with 900+ entries in C#. A method has to be executed for every entry in the list, and all of them must run at the same time. First I thought of doing this:
public void InitializeThread()
{
Thread myThread = new Thread(run());
myThread.Start();
}
public void run()
{
foreach(Object o in ObjectList)
{
othermethod();
}
}
Now the problem here is that this will execute 1 method at a time for each entry in the list. But I want every single one of them to be running at the same time.
Then I tried making a separate thread for each entry like this:
public void InitializeThread()
{
foreach(Object o in ObjectList)
{
Thread myThread = new Thread(run());
myThread.Start();
}
}
public void run()
{
while(//thread is allowed to run)
{
// do stuff
}
}
But this seems to give me System.OutOfMemoryException errors (not a surprise, since the list has almost 1000 entries).
Is there a way to successfully run all those methods at the same time, either using multiple threads or only one?
What I'm ultimately trying to achieve is this: I have a GMap and want to have a few markers on it. These markers represent trains. A marker pops up on the GMap at a certain point in time and disappears when it reaches its destination. All the trains move about on the map at the same time.
If I need to post more of the code I tried please let me know.
Thanks in advance!
What you're looking for is Parallel.ForEach:
Executes a foreach operation on an IEnumerable in which iterations may
run in parallel.
And you use it like this:
Parallel.ForEach(ObjectList, (obj) =>
{
// Do parallel work here on each object
});
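A slightly fuller sketch of the same call. Note that Parallel.ForEach spreads the iterations over a limited number of worker threads rather than starting all 900 at once; MaxDegreeOfParallelism caps that number. `ParallelDemo`, `CountProcessed`, and the per-item work are hypothetical:

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelDemo
{
    // Processes every item in parallel and returns how many were handled.
    public static int CountProcessed(IEnumerable<object> items, int maxParallel)
    {
        int processed = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        Parallel.ForEach(items, options, item =>
        {
            // Per-item work goes here. Shared state needs synchronization,
            // so the counter is updated with Interlocked.
            Interlocked.Increment(ref processed);
        });
        return processed; // Parallel.ForEach returns only after all iterations finish
    }
}
```

Because the call blocks until every iteration has finished, it suits work that completes, not loops that are meant to keep running for as long as a marker is on the map.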

Monotouch Threading - order/priority of requests

I had some performance issues when using a UICollectionView where each cell was showing an image (loaded from the hard disk).
I solved that issue by loading the image in the background.
Basically, in my "GetCell" method I check if the image is in my ImageCache.
If so, set the image on my ImageView in the cell.
If not, load the image in the background and request a reload for that specific
item (I request a reload because I don't know if the cell is recycled
in the meanwhile, so it's not safe to directly set the image)
Snippet of my background process:
ThreadPool.QueueUserWorkItem (delegate {
ImagesCache.AddImageToCache(""+indexPath.Row,ImageForPosition(indexPath.Row));
InvokeOnMainThread (delegate {
collectionView.ReloadItems(Converter.ToIndexPathArray(indexPath));
});
});
It works OK, but if you scroll fast, all of these async tasks are queued, and the problem is that the requests are executed in order (FIFO). So when you scroll fast, images for cells that are no longer visible are loaded before images for visible cells.
Does anyone have an idea how I could optimise this process to get a better user experience?
If I extended this to include images from the internet, the problem would get worse (because of the download time).
Increasing the maximum number of simultaneous threads would let the later requests start immediately, but it would decrease the overall performance and download speed, so that's not a real solution either.
Thanks,
Matt
My project's solution in short: one thread that downloads images, fed by a queue, plus a check in the target UI control that it wasn't dequeued for reuse in the meantime.
Long version:
Implement a queue with Start and Stop methods. When Start is called, it starts a background thread that loops (while (true) { DoSomething(); }) and tries to dequeue a request from the queue. If nothing was dequeued, it sleeps a bit. If a request was dequeued, it executes it (downloads the image). The Stop method tells the thread to exit the loop:
public void Start()
{
if (started) {
return;
}
started = true;
new Thread (new ThreadStart (() => {
while (started) {
var request = GetRequest();
if (request != null) {
request.State = RequestState.Executing;
Download (request);
} else {
Thread.Sleep (QueueSleepTime);
}
}
})).Start ();
}
public void Stop()
{
started = false;
}
Then add a private method to the queue that downloads an image with this logic: check for the image in the file cache. If the file is available, read and return it. If not, download it, save it to a file, and return it (call Action<UIImage> onDownload), or report an error (call Action<Exception> onError). Call this method from the queue's loop. Name it Download:
private void Download(Request request)
{
try {
var image = GetImageFromCache(request.Url);
if (image == null) {
image = DownloadImageFromServer(request.Url); // Could be synchronous
}
request.OnDownload(image);
} catch (Exception e) {
request.OnError(e);
}
}
Then add a public method that adds a request to the queue. The Command pattern is useful for wrapping requests for the queue: it stores the Actions and the current State. Name it DownloadImageAsync:
public void DownloadImageAsync(string imageUrl, Action<UIImage> onDownload, Action<Exception> onError)
{
var request = MakeRequestCommand(imageUrl, onDownload, onError);
queue.Enqueue(request);
}
When a UITableViewCell is about to be shown, it requests its image:
// Custom UITableViewCell method, which is called in `UITableViewSource`'s `GetCell`
public void PrepareToShow()
{
var imageURLClosure = imageURL;
queue.DownloadImageAsync(imageURL, (UIImage image) => {
if (imageURLClosure == imageURL) {
// Cell was not dequeued. URL from request and from cell are equals.
imageView.Image = image;
} else {
// Do nothing. Cell was dequeued, imageURL was changed.
}
}, (Exception e) => {
// Set default image. Log.
});
}
Checking (imageURLClosure == imageURL) is important to avoid showing multiple images in one UIImageView when the user scrolls fast. One cell can issue multiple requests, but only the result of the last one should be used.
Further improvements:
LIFO execution. If no request is already running, add the new one to the front of the queue;
Using Action<byte[]> onDownload instead of Action<UIImage> onDownload for cross-platform code compatibility;
The ability to cancel an image download request when the cell becomes invisible (WillMoveToSuperview). It's not strictly necessary: after the first download the image is in the cache, so any further requests for it are served quickly;
In-memory cache. Thus, in the worst case, the chain will be:
Find image in in-memory cache -> Find image in file cache -> Downloading from server.
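The LIFO idea from the list above can be sketched as a small thread-safe stack that also drops the oldest pending requests once a limit is reached, so that cells scrolled past long ago do not hold up visible ones. Dropping old requests is an added assumption, not something the answer prescribes; `LifoRequestQueue` and its members are hypothetical names:

```csharp
using System.Collections.Generic;

public class LifoRequestQueue<T>
{
    private readonly LinkedList<T> _pending = new LinkedList<T>();
    private readonly object _gate = new object();
    private readonly int _capacity;

    public LifoRequestQueue(int capacity)
    {
        _capacity = capacity;
    }

    // Newest request goes to the front; the oldest is discarded when full.
    public void Enqueue(T request)
    {
        lock (_gate)
        {
            _pending.AddFirst(request);
            if (_pending.Count > _capacity)
            {
                _pending.RemoveLast();
            }
        }
    }

    // Returns the most recently added request, or false if none is pending.
    public bool TryDequeue(out T request)
    {
        lock (_gate)
        {
            if (_pending.Count == 0)
            {
                request = default(T);
                return false;
            }
            request = _pending.First.Value;
            _pending.RemoveFirst();
            return true;
        }
    }
}
```

A dropped request never invokes its callbacks, so this only fits a design where a cell asks for its image again the next time it is shown.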

Threading issue "The calling thread cannot access this object because a different thread owns it". Any solutions? [duplicate]

This question already has answers here:
The calling thread cannot access this object because a different thread owns it
(3 answers)
Closed 8 years ago.
I have a listview that contains file names. I have another listview that contains possible actions to rename these files. Finally I have a label that displays a preview of the result. When an object is selected in each of the lists I want to display the preview. You can select only one file but one or more actions. I use WPF/Xaml for my UI. I chose to perform my preview with a thread.
Here is a part of my code :
private Thread _thread;
public MainWindow()
{
InitializeComponent();
_thread = new Thread(DoWork);
}
public void DoWork()
{
while (true)
{
FileData fileData = listViewFiles.SelectedItem as FileData; // ERROR HERE
if (fileData != null)
{
string name = fileData.FileName;
foreach (var action in _actionCollection)
{
name = action.Rename(name);
}
previewLabel.Content = name;
}
Thread.Sleep(1000);
}
}
private void listViewFiles_SelectionChanged(object sender, SelectionChangedEventArgs e)
{
_thread.Start();
}
At run time I get the error "The calling thread cannot access this object because a different thread owns it." on the line FileData fileData = listViewFiles.SelectedItem as FileData;. Do you know what I should do?
You can't modify or access the UI from a non-UI thread. So if you still want to use a separate thread, the first thing you need to do is add some kind of model (for more info about binding and models, search for "wpf mvvm"), then bind listViewFiles.SelectedItem to a property of this model; this allows you to read the selected value from other threads. Second, you need to move all logic that changes the UI into a separate method or lambda, so in the end it can look like this:
public void DoWork()
{
while (true)
{
FileData fileData = Model.SelectedValue;
if (fileData != null)
{
string name = fileData.FileName;
foreach (var action in _actionCollection)
{
name = action.Rename(name);
}
this.Dispatcher.Invoke((Action)(() => // use Window.Dispatcher
{
label3.Content = fileData.FileName;
label4.Content = name;
}));
}
Thread.Sleep(1000);
}
}
UPD. Some additional words about synchronizing with the UI: in WPF every UI object inherits from the DispatcherObject class. Such an object can only be accessed from the thread on which it was created. If you want to access a DispatcherObject (DO) from another thread, you need to use the DO.Dispatcher.Invoke(Delegate) method, which queues your code on the DO's thread. So, in conclusion, to run code on the UI thread you need to use the Dispatcher of any UI element; in this case we use the Dispatcher of the Window (assuming the code is in the window's code-behind).
The simple answer is that you can't do that: thread A cannot directly access UI objects (controls) that thread B created.
In practice, you can use a delegate to run this safely on the owning thread. In WPF that goes through the Dispatcher (in WinForms the equivalent is form.Invoke with a MethodInvoker):
Dispatcher.Invoke(new Action(() => {
FileData fileData = listViewFiles.SelectedItem as FileData; // now runs on the UI thread
if (fileData != null)
{
string name = fileData.FileName;
foreach (var action in _actionCollection)
{
name = action.Rename(name);
}
previewLabel.Content = name;
}
}));
However, you may want to just use a BackgroundWorker: http://msdn.microsoft.com/en-us/library/8xs8549b.aspx
In more detail at: http://weblogs.asp.net/justin_rogers/pages/126345.aspx
