C# Downloader: should I use Threads, BackgroundWorker or ThreadPool? - c#

I'm writing a downloader in C# and stopped at the following problem: what kind of method should I use to parallelize my downloads and update my GUI?
In my first attempt, I used 4 threads and started a new one whenever one of them completed; the main problem was that my CPU went to 100% each time a new thread started.
Googling around, I found BackgroundWorker and ThreadPool. Given that I want to update my GUI with the progress of each link that I'm downloading, what is the best solution?
1) Creating 4 different BackgroundWorkers, attaching to each ProgressChanged event a delegate to a function in my GUI that updates the progress?
2) Using ThreadPool and setting the max and min number of threads to the same value?
If I choose #2, when there are no more work items in the queue, does it stop the 4 working threads? Does it suspend them? Since I have to download different lists of links (20 links each) and move from one list to the next when one is completed, does the ThreadPool start and stop threads between lists?
If I want to change the number of working threads at run time and decide to use ThreadPool, changing from 10 threads to 6, does it throw an exception and stop 4 random threads?
This is the only part that is giving me a headache.
I thank each of you in advance for your answers.

I would suggest using WebClient.DownloadFileAsync for this. You can have multiple downloads going, each raising the DownloadProgressChanged event as it goes along, and DownloadFileCompleted when done.
You can control the concurrency by using a queue with a semaphore or, if you're using .NET 4.0, a BlockingCollection. For example:
// Information used in callbacks.
class DownloadArgs
{
public readonly string Url;
public readonly string Filename;
public readonly WebClient Client;
public DownloadArgs(string u, string f, WebClient c)
{
Url = u;
Filename = f;
Client = c;
}
}
const int MaxClients = 4;
// create a queue that allows the max items
BlockingCollection<WebClient> ClientQueue = new BlockingCollection<WebClient>(MaxClients);
// queue of urls to be downloaded (unbounded)
Queue<string> UrlQueue = new Queue<string>();
// create four WebClient instances and put them into the queue
for (int i = 0; i < MaxClients; ++i)
{
var cli = new WebClient();
cli.DownloadProgressChanged += DownloadProgressChanged;
cli.DownloadFileCompleted += DownloadFileCompleted;
ClientQueue.Add(cli);
}
// Fill the UrlQueue here
// Now go until the UrlQueue is empty
while (UrlQueue.Count > 0)
{
WebClient cli = ClientQueue.Take(); // blocks if there is no client available
string url = UrlQueue.Dequeue();
string fname = CreateOutputFilename(url); // or however you get the output file name
cli.DownloadFileAsync(new Uri(url), fname,
new DownloadArgs(url, fname, cli));
}
void DownloadProgressChanged(object sender, DownloadProgressChangedEventArgs e)
{
DownloadArgs args = (DownloadArgs)e.UserState;
// Do status updates for this download
}
void DownloadFileCompleted(object sender, AsyncCompletedEventArgs e)
{
DownloadArgs args = (DownloadArgs)e.UserState;
// do whatever UI updates
// now put this client back into the queue
ClientQueue.Add(args.Client);
}
There's no need for explicitly managing threads or going to the TPL.
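The semaphore variant mentioned above is not shown in the answer, so here is a minimal sketch of how it might look. It assumes .NET 4.0 or later (SemaphoreSlim, CountdownEvent); the class name ThrottledDownloads, the Run method and the download delegate are hypothetical names for illustration, and the delegate stands in for whatever actually fetches a URL. The method returns the peak number of simultaneous downloads it observed, only so that the throttling can be checked.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

static class ThrottledDownloads
{
    // Runs 'download' once per URL with at most maxConcurrent calls active at a time.
    // Returns the highest number of calls observed running simultaneously.
    public static int Run(IEnumerable<string> urls, int maxConcurrent, Action<string> download)
    {
        var list = new List<string>(urls);
        if (list.Count == 0) return 0;

        int running = 0, peak = 0;
        object gate = new object();

        using (var slots = new SemaphoreSlim(maxConcurrent, maxConcurrent))
        using (var done = new CountdownEvent(list.Count))
        {
            foreach (string url in list)
            {
                slots.Wait();             // blocks while maxConcurrent downloads are active
                string current = url;     // private copy for the closure
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try
                    {
                        lock (gate) { running++; if (running > peak) peak = running; }
                        download(current);
                    }
                    finally
                    {
                        lock (gate) { running--; }
                        slots.Release();  // free the slot for the next URL
                        done.Signal();    // mark this download as finished
                    }
                });
            }
            done.Wait();                  // wait for the last downloads to finish
        }
        return peak;
    }
}
```

A call such as ThrottledDownloads.Run(urls, 4, url => Thread.Sleep(50)) simulates four-at-a-time downloads. Because Run blocks its caller, it should be started from a background thread rather than from the GUI thread.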

I think you should look into using the Task Parallel Library, which is new in .NET 4 and is designed for solving these types of problems.
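The answer gives no code, so the following is only a sketch of what a TPL-based approach might look like, using Parallel.ForEach with ParallelOptions.MaxDegreeOfParallelism to cap the number of simultaneous downloads. The names TplDownloads, DownloadAll and the fetch delegate are hypothetical; fetch stands in for the real download call (for example a WebClient.DownloadString wrapper).

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

static class TplDownloads
{
    // Calls 'fetch' for every URL, running at most maxParallel calls at once,
    // and returns the results sorted so the output order is deterministic.
    public static List<string> DownloadAll(IEnumerable<string> urls, int maxParallel,
                                           Func<string, string> fetch)
    {
        var results = new ConcurrentBag<string>();   // thread-safe result collection
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        // Blocks until every URL has been processed.
        Parallel.ForEach(urls, options, url => results.Add(fetch(url)));

        var list = new List<string>(results);
        list.Sort(StringComparer.Ordinal);
        return list;
    }
}
```

Parallel.ForEach blocks the calling thread until all items are done, so in a GUI application this would have to run off the UI thread, with progress marshalled back to the form separately.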

Having 100% CPU load has nothing to do with the download itself (your network is practically always the bottleneck). I would check the logic you use to wait for the download to complete.
Can you post the code of the thread that you start multiple times?

By creating 4 different BackgroundWorkers you will be creating separate threads that no longer interfere with your GUI. BackgroundWorkers are simple to implement and, from what I understand, will do exactly what you need them to do.
Personally I would do this and simply allow the others to not start until the previous one is finished. (Or maybe just one, and allow it to execute one method at a time in the correct order.)
FYI - Backgroundworker

Related

A thread for updating a RichTextBox.Text twice a second is not working

First of all, I'm a very inexperienced programmer. I am building the foundation of a simple music app for my bachelor's degree project. My question is about an internal clock method that is meant to increase an int value by 1, BPM times a minute.
I've created an internalClock class:
public class internalClock
{
// THIS METHOD WILL BE CALLED WHEN THE THREAD IS STARTED
public static void clockMethod()
{
int BPM = 135;
int clockTick = 1;
Form1 clockForm = new Form1();
// infinite loop
while (true)
{
if (clockTick == 8)
{
clockTick = 1;
}
else
{
clockTick++;
}
clockForm.metrobox.Text = clockTick.ToString();
Thread.Sleep(60 * 1000 / BPM);
}
}
}
This is how I managed to get an access to the RichTextBox itself:
public RichTextBox metrobox
{
get { return metroBox; }
set { metroBox = value; }
}
In the main 'Program.cs' I've written what's meant to start a separate thread to run the clockMethod until the program is closed:
// THREADING
// Create a thread
internalClock oClock = new internalClock();
Thread oClockThread = new Thread(new ThreadStart(internalClock.clockMethod));
// Start the internalClock thread
oClockThread.Start();
It's not updating the text in the RichTextBox. Also, if I call the clockMethod() without creating a separate thread for it - the application freezes. Sorry for my amateur question, I'm just getting started with C# (yeah.. my uni is useless). What am I doing wrong in my code?
The code above has several problems, but I would encourage you to look at the Timer control, which you can add to the form when you want to do processing at a certain interval or in "ticks". MSDN Form Timer
With the timer you can remove that class and the code that starts a new thread. I would read up on the Timer class in the given link and think about how you can rework your application structure to fit it. The reasons why that thread isn't working are frankly not that important for where you're at. I think you just need to focus for now on a tool that already does what you want, which I believe is the Timer.
As a general note, you usually don't need to create a raw thread in .NET. As of .NET 4.0 you can access types called Tasks to perform multi-threaded logic and processing. If you find the need to do that later on, check that out. Task Type MSDN
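As a rough sketch of the Timer suggestion, assuming a Windows Forms project: the form owns a System.Windows.Forms.Timer whose Tick event runs on the UI thread, so the RichTextBox can be updated directly. The names Metronome and ClockForm are hypothetical; the 135 BPM value and the 1 to 8 counter follow the question's code.

```csharp
using System;
using System.Windows.Forms;

static class Metronome
{
    // Milliseconds between ticks for a given tempo (integer division, as in the question).
    public static int IntervalMs(int bpm) { return 60 * 1000 / bpm; }

    // Counts 1, 2, ... 8 and then wraps back to 1.
    public static int NextTick(int tick) { return tick >= 8 ? 1 : tick + 1; }
}

class ClockForm : Form
{
    private readonly RichTextBox metroBox = new RichTextBox { Dock = DockStyle.Fill };
    private readonly System.Windows.Forms.Timer clockTimer = new System.Windows.Forms.Timer();
    private int clockTick = 1;

    public ClockForm()
    {
        Controls.Add(metroBox);
        clockTimer.Interval = Metronome.IntervalMs(135);
        // Tick is raised on the UI thread, so touching the control here is safe.
        clockTimer.Tick += (s, e) =>
        {
            clockTick = Metronome.NextTick(clockTick);
            metroBox.Text = clockTick.ToString();
        };
        clockTimer.Start();
    }

    [STAThread]
    static void Main() { Application.Run(new ClockForm()); }
}
```

Unlike the question's code, this updates the form that is actually on screen instead of a second Form1 instance created inside the clock class, and it never blocks the UI thread with Thread.Sleep.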

Method called twice in same moment

I'm working on a Windows Forms application and fighting with a very nasty error. The application is supposed to run on a local machine and handle requests from a server application. The client application looks like this:
public Reader mr_obj;
public Form1()
{
mr_obj = new MyReader.Reader(7137);
mr_obj.UserEvent += new ReaderEvent(UserEvent);
}
private void UserEvent(UserEvent e, long threadID)
{
Thread.Sleep(1000);
SafeSomethingToDB();
}
The Reader object connects the client application to the server application. After that, the server application is able to trigger the UserEvent() method in the client application. The problem is that the client application, which handles the UserEvents, hangs if the UserEvent() method is triggered twice within one second.
(It's actually not crashing, just hanging until you kill the task; a try/catch doesn't report an error.)
What I've tried so far is to delegate Thread.Sleep() and SafeSomethingToDB() to another thread. This doesn't work because the server application does not wait until the thread has finished, so it does not find the data in the DB because it's not waiting 1 second...
The same problem happened when I tried it with background workers.
Is there a way to handle these two triggers, which come from the same server application, in a sort of parallel way at the same time?
Any suggestions are very much appreciated.
EDIT: I think locking the method does not cause the application to process both triggers at the same time. To make this visible I've tried this:
private void UserEventHandler(UserEvent e, long threadID)
{
lock (_lockObject)
{
MessageBox.Show("Messagebox 1");
MessageBox.Show("Messagebox 2");
}
}
When the first request triggers UserEvent(), "Messagebox 1" appears. If you press OK, "Messagebox 2" appears. But if UserEvent is triggered a second time while "Messagebox 2" is still open, "Messagebox 1" does not appear; instead, the application starts hanging. Shouldn't "Messagebox 1" appear again for the second trigger of UserEvent() if the two triggers really are being processed at the same time? So the two triggers are not being performed in parallel, or am I mistaken here?
Without knowing why you do the Sleep or what exactly SafeSomethingToDB does and what causes your problems, try to synchronize the calls:
private readonly object _lockObject = new object();
private void UserEvent(UserEvent e, long threadID)
{
lock(_lockObject)
{
Thread.Sleep(1000);
SafeSomethingToDB();
}
}
I think a simple lock for synchronization will work for you, try this
public Reader mr_obj;
private static readonly object sync = new object();
public Form1()
{
mr_obj = new MyReader.Reader(7137);
mr_obj.UserEvent += new ReaderEvent(UserEvent);
}
private void UserEvent(UserEvent e, long threadID)
{
lock(sync)
{
SafeSomethingToDB();
}
}
As you write in the comments, if SafeSomethingToDB() is called a second time before the first call has finished, then it crashes. So in other words: SafeSomethingToDB() is not re-entrant.
What you can do is use a Mutex (which stands for mutual exclusion), which defines a "critical section" in your code, meaning a code that can have only one thread executing it at any one time.
For instance:
private static Mutex mutex = new Mutex();
public void SafeSomethingToDB()
{
mutex.WaitOne(); // wait until it is safe to enter the critical section
try
{
// Critical section begins here
DoWorkAndStuff();
}
finally
{
// Always release, even if DoWorkAndStuff() throws; otherwise the mutex stays held
mutex.ReleaseMutex(); // indicate the end of the critical section
}
}
For more about System.Threading.Mutex, see http://msdn.microsoft.com/en-us/library/system.threading.mutex(v=vs.110).aspx.

Will calling a progress report method slow down a C# file download on Windows Phone 7 significantly?

I have a Windows Phone 7 (7.1) method in C# that given a URL in string form downloads the contents of that URL to a file (see code below). As you can see from the code, I assign a DownloadProgressChanged() event handler. In that handler, if the caller provided an IProgress object, I call the Report() method on that object. Given the potential for the user having a slow Web connection, I want to make sure that the download will go as fast as possible. Will calling the IProgress.Report() method in the WebClient's DownloadProgressChanged() callback slow down the download considerably?
I'm not familiar enough with IProgress.Report() to know if it executes on the current thread or the calling thread. Does it execute on the calling thread? My concern is that a repetitive thread switch would really bog things down. I'll probably wrap the call to this method in a Task.Run() call to keep the UI thread happy. But just in case, I'll ask if there any potential problems with my code as far as bogging down the UI thread is concerned?
Any other comments on the code pertaining to structure or performance are appreciated. Note, I'm using the Microsoft.Bcl.Async package in this app.
UPDATE (added later): In regards to thread switching, apparently the DownloadProgressChanged() event is raised on the UI thread, not the download thread, so there is no need to do anything fancy with Dispatcher or the like to do UI updates in this scenario. At least according to this Code Project article:
Progress Reporting in C# 5 Async
public static void URLToFile(string strUrl, string strDestFilename, IProgress<int> progress, int iNumSecondsToWait = 30)
{
strUrl = strUrl.Trim();
if (String.IsNullOrWhiteSpace(strUrl))
throw new ArgumentException("(Misc::URLToFile) The URL is empty.");
strDestFilename = strDestFilename.Trim();
if (String.IsNullOrWhiteSpace(strDestFilename))
throw new ArgumentException("(Misc::URLToFile) The destination file name is empty.");
if (iNumSecondsToWait < 1)
throw new ArgumentException("(Misc::URLToFile) The number of seconds to wait is less than 1.");
// Create the isolated storage file.
StreamWriter sw = openIsoStorFileAsStreamWriter(strDestFilename);
// If the stream writer is NULL, then the file could not be created.
if (sw == null)
throw new System.IO.IOException("(Misc::URLToFile) Error creating or writing to the file named: " + strDestFilename);
// Asynchronous download. Note, the Silverlight version of WebClient does *not* implement
// IDisposable.
WebClient wc = new WebClient();
try
{
// Create a download progress changed handler so we can pass on progress
// reports to the caller if they provided a progress report object.
wc.DownloadProgressChanged += (s, e) =>
{
// Do we have a progress report handler?
if (progress != null)
// Yes, call it.
progress.Report(e.ProgressPercentage);
};
// Use a Lambda expression for the "completed" handler
// that writes the contents to a file.
wc.OpenReadCompleted += (s, e) =>
e.Result.CopyTo(sw.BaseStream);
// Now make the call to download the file.
wc.DownloadStringAsync(new Uri(strUrl));
}
finally
{
// Make sure the stream is cleaned up.
sw.Flush();
sw.Close();
// Make sure the StreamWriter is diposed of.
sw.Dispose();
} // try/finally
// CancellationTokenSource srcCancelToken = new CancellationTokenSource();
// srcCancelToken.CancelAfter(TimeSpan.FromSeconds(iNumSecondsToWait));
} // public static void URLToFile()
No, the download progress check does not noticeably affect the download speed, mainly because the values that are checked are not themselves downloaded. When you initiate the download, you get a size declaration (content length), which is used as the reference for the complete file. Then the size of the local (downloaded) byte content is checked, and a ratio can be built from the two values.
NOTE: Does not apply for streaming, for obvious reasons, since there is no final size estimate.
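The ratio described above can be sketched as a small helper. This is an illustration only, not what WebClient does internally; the names DownloadProgress and Percent are hypothetical, and returning -1 for an unknown total (the streaming case) is an assumption made for this example.

```csharp
using System;

static class DownloadProgress
{
    // Returns the completed share as a whole percentage (0..100),
    // or -1 when the total size is unknown, e.g. for a stream without a content length.
    public static int Percent(long bytesReceived, long totalBytes)
    {
        if (totalBytes <= 0) return -1;                 // no size declaration available
        if (bytesReceived <= 0) return 0;
        if (bytesReceived >= totalBytes) return 100;
        return (int)(bytesReceived * 100 / totalBytes); // integer ratio of the two values
    }
}
```

The calculation is a couple of integer operations per progress event, which is negligible next to the network transfer itself.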

Receiving packets in c# using threading

I am following a Java example that uses a Completion Service to submit queries to a 3rd party app that receives packets by calling:
completionService.submit(new FetchData());
Then it calls:
Future<Data> future = completionService.take();
Data data = future.get(timeout, TimeUnit.MILLISECONDS);
Which waits for one of the submitted tasks to finish and returns the data. These two calls are in a while(true) loop.
I am developing an app in c# and I was wondering if this is the proper way to wait for packets and if it is how do I do it in c#.
I have tried this but I'm not sure if I am doing it right:
new Thread(delegate() {
Dictionary<ManualResetEvent, FetchData> dataDict = new Dictionary<ManualResetEvent, FetchData>();
ManualResetEvent[] doneEvents;
ManualResetEvent doneEvent;
FetchData fetch;
int index;
while(true) {
// Create new fetch
doneEvent = new ManualResetEvent(false);
fetch = new FetchData(this, doneEvent);
// event -> fetch association
dataDict.Add(doneEvent, fetch);
ThreadPool.QueueUserWorkItem(fetch.DoWork);
doneEvents = new ManualResetEvent[dataDict.Count];
dataDict.Keys.CopyTo(doneEvents, 0);
// wait for any of them to finish
index = WaitHandle.WaitAny(doneEvents, receiveThreadTimeout);
// did we timeout?
if (index == WaitHandle.WaitTimeout) {
continue;
}
// grab done event
doneEvent = doneEvents[index];
// grab fetch
fetch = dataDict[doneEvent];
// remove from dict
dataDict.Remove(doneEvent);
// process data
processData(fetch.GetData());
}
}).Start();
EDIT: One last note, I am using this in Unity which uses Mono 2.6 and is limited to .NET 2.0
EDIT 2: I changed the code around some. I realized that the ThreadPool has its own max limit and will queue up tasks if there are no threads left, so I removed that logic from my code.
Do you really need to use multithreading in your Unity3D application? I'm asking because Unity "is not" multi-threaded: there is a way to deal with threads, but you had better rely on coroutines for this. Please refer to this documentation to find out more about coroutines.
One note: if you are using Unity 3.5, it uses Mono 2.6.5 that supports almost everything of .NET 4.0. I don't know about the Task class, but it certainly covers .NET 3.0.
It turns out that I only need a single thread to listen for packets, so I don't have to use a thread pool like in my example above.

Background task in a ASP webapp

I'm fairly new to C#, and recently built a small webapp using .NET 4.0. This app has 2 parts: one is designed to run permanently and will continuously fetch data from given resources on the web. The other one accesses that data upon request to analyze it. I'm struggling with the first part.
My initial approach was to set up a Timer object that would execute a fetch operation (whatever that operation is doesn't really matter here) every, say, 5 minutes. I would define that timer on Application_Start and let it live after that.
However, I recently realized that applications are created / destroyed based on user requests (from my observation they seem to be destroyed after some time of inactivity). As a consequence, my background activity will stop / resume out of my control where I would like it to run continuously, with absolutely no interruption.
So here comes my question: is that achievable in a webapp? Or do I absolutely need a separate Windows service for that kind of things?
Thanks in advance for your precious help!
Guillaume
While doing this in a web app is not ideal, it is achievable, provided that the site is always up.
Here's a sample: I'm creating a Cache item in the global.asax with an expiration. When it expires, an event is fired. You can fetch your data or whatever in the OnRemove() event.
Then you can set a call to a page(preferably a very small one) that will trigger code in the Application_BeginRequest that will add back the Cache item with an expiration.
global.asax:
private const string VendorNotificationCacheKey = "VendorNotification";
private const int IntervalInMinutes = 60; //Expires after X minutes & runs tasks
protected void Application_Start(object sender, EventArgs e)
{
//Set value in cache with expiration time
CacheItemRemovedCallback callback = OnRemove;
Context.Cache.Add(VendorNotificationCacheKey, DateTime.Now, null, DateTime.Now.AddMinutes(IntervalInMinutes), TimeSpan.Zero,
CacheItemPriority.Normal, callback);
}
private void OnRemove(string key, object value, CacheItemRemovedReason reason)
{
SendVendorNotification();
//Need Access to HTTPContext so cache can be re-added, so let's call a page. Application_BeginRequest will re-add the cache.
var siteUrl = ConfigurationManager.AppSettings.Get("SiteUrl");
var client = new WebClient();
client.DownloadData(siteUrl + "default.aspx");
client.Dispose();
}
private void SendVendorNotification()
{
//Do Tasks here
}
protected void Application_BeginRequest(object sender, EventArgs e)
{
//Re-add if it doesn't exist
if (HttpContext.Current.Request.Url.ToString().ToLower().Contains("default.aspx") &&
HttpContext.Current.Cache[VendorNotificationCacheKey] == null)
{
//ReAdd
CacheItemRemovedCallback callback = OnRemove;
Context.Cache.Add(VendorNotificationCacheKey, DateTime.Now, null, DateTime.Now.AddMinutes(IntervalInMinutes), TimeSpan.Zero,
CacheItemPriority.Normal, callback);
}
}
This works well if your scheduled task is quick.
If it's a long-running process, you definitely need to keep it out of your web app.
As long as the first request has started the application, this will keep firing every 60 minutes even if the site has no visitors.
I suggest putting it in a windows service. You avoid all the hoops mentioned above, the big one being IIS restarts. A windows service also has the following benefits:
Can automatically start when the server starts. If you are running in IIS and your server reboots, you have to wait until a request is made to start your process.
Can place this data fetching process on another machine if needed
If you end up load-balancing your website on multiple servers, you could accidentally have multiple data fetching processes causing you problems
Easier to maintain the code separately (single responsibility principle): the code only does what it needs to do and doesn't also have to fool IIS.
Create a static class with a constructor, creating a timer event.
However like Steve Sloka mentioned, IIS has a timeout that you will have to manipulate to keep the site going.
using System.Runtime.Remoting.Messaging;
public static class Variables
{
static Variables()
{
m_wClass = new WorkerClass();
// creates and registers an event timer
m_flushTimer = new System.Timers.Timer(1000);
m_flushTimer.Elapsed += new System.Timers.ElapsedEventHandler(OnFlushTimer);
m_flushTimer.Start();
}
private static void OnFlushTimer(object o, System.Timers.ElapsedEventArgs args)
{
// determine the frequency of your update
if (System.DateTime.Now - m_timer1LastUpdateTime > new System.TimeSpan(0,1,0))
{
// call your class to do the update
m_wClass.DoMyThing();
m_timer1LastUpdateTime = System.DateTime.Now;
}
}
private static readonly System.Timers.Timer m_flushTimer;
private static System.DateTime m_timer1LastUpdateTime = System.DateTime.MinValue;
private static readonly WorkerClass m_wClass;
}
public class WorkerClass
{
public delegate WorkerClass MyDelegate();
public void DoMyThing()
{
m_test = "Hi";
m_test2 = "Bye";
//create async call to do the work
MyDelegate myDel = new MyDelegate(Execute);
AsyncCallback cb = new AsyncCallback(CommandCallBack);
IAsyncResult ar = myDel.BeginInvoke(cb, null);
}
private WorkerClass Execute()
{
//do my stuff in an async call
m_test2 = "Later";
return this;
}
public void CommandCallBack(IAsyncResult ar)
{
// this is called when your task is complete
AsyncResult asyncResult = (AsyncResult)ar;
MyDelegate myDel = (MyDelegate)asyncResult.AsyncDelegate;
WorkerClass command = myDel.EndInvoke(ar);
// command is a reference to the original class that envoked the async call
// m_test will equal "Hi"
// m_test2 will equal "Later";
}
private string m_test;
private string m_test2;
}
I think you can achieve it by using a BackgroundWorker, but I would rather suggest that you go for a service.
Your application context lives as long as your worker process in IIS is functioning. In IIS there are some default timeouts that determine when the worker process will recycle (e.g. number of idle minutes (20), or regular intervals (1740)).
That said, if you adjust those settings in IIS, you should be able to have the requests live, however, the other answers of using a Service would work as well, just a matter of how you want to implement.
I recently made a file upload functionality for uploading Access files to the database (not the best way but just a temporary fix to a longterm issue).
I solved it by creating a background thread that ran through the ProcessAccess function, and was deleted when completed.
Unless IIS has a setting with which it kills a thread after a set amount of time regardless of inactivity, you should be able to create a thread that calls a function that never ends. Don't use recursion, because the number of open function calls will eventually blow up in your face; just have a for(;;) loop run 5,000,000 times so it'll keep busy :)
Application Initialization Module for IIS 7.5 does precisely this type of init work. More details on the module are available here Application Initialization Module
