I am writing a WCF service that has source data from multiple sources. These are large files in various formats.
I have implemented Caching and set-up a polling interval so these files are kept up to date with fresh data.
I have constructed a manager class that basically is responsible for returning XDocument objects back to the caller. The manager class first checks the cache for existence. If it doesn't exist - it makes the call to retrieve fresh data. Nothing big here.
What I would like to do to keep the response snappy is serialize the file previously downloaded and pass that back to the caller - again nothing new...however...I want to spawn a new thread as soon as the serialization is complete to retrieve the fresh data and overwrite the old file. This is my problem...
I'm admittedly an intermediate programmer. I came across a few examples on multi-threading (here, for that matter). The problem is that they introduced the concept of delegates, and I am really struggling with this.
Here is some of my code:
//this method invokes another object that is responsible for making the
//http call, decompressing the file and persisting to the hard drive.
private static void downloadFile(string url, string LocationToSave)
{
using (WeatherFactory wf = new WeatherFactory())
{
wf.getWeatherDataSource(url, LocationToSave);
}
}
//A new thread variable
private static Thread backgroundDownload;
//the delegate...but I am so confused on how to use this...
delegate void FileDownloader(string url, string LocationToSave);
//The method that should be called in the new thread....
//right now the compiler is complaining that I don't have the arguments from
//the delegate (Url and LocationToSave...
//the problem is I don't pass URL and LocationToSave here...
static void Init(FileDownloader download)
{
backgroundDownload = new Thread(new ThreadStart(download));
backgroundDownload.Start();
}
I'd like to implement this the correct way...so a bit of education on how to make this work would be appreciated.
I would use the Task Parallel library to do this:
//this method invokes another object that is responsible for making the
//http call, decompressing the file and persisting to the hard drive.
private static void downloadFile(string url, string LocationToSave)
{
using (WeatherFactory wf = new WeatherFactory())
{
wf.getWeatherDataSource(url, LocationToSave);
}
//Update cache here?
}
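private static void StartBackgroundDownload(string url, string LocationToSave)
{
//Things to consider:
// 1. what if we are already downloading, start new anyway?
// 2. when/how to update your cache
//A parameterless lambda captures the arguments, so no custom delegate type is needed.
var task = Task.Factory.StartNew(() => downloadFile(url, LocationToSave));
}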
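The first point under "Things to consider" (what if a download is already running?) can be handled with a small guard. The following is only a sketch under assumptions: `BackgroundRefresher` and `TryStartRefresh` are hypothetical names, not part of the question's code, and skipping the second request is just one possible policy.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: lets at most one background refresh run at a time.
public static class BackgroundRefresher
{
    // 0 = idle, 1 = a refresh is currently running.
    private static int _refreshing;

    // Starts the refresh on a thread-pool thread unless one is already running.
    // Returns the started task, or null when the request was skipped.
    public static Task TryStartRefresh(Action refresh)
    {
        // Atomically flip 0 -> 1; any other caller sees 1 and backs off.
        if (Interlocked.CompareExchange(ref _refreshing, 1, 0) != 0)
        {
            return null;
        }

        return Task.Factory.StartNew(() =>
        {
            try
            {
                refresh();
            }
            finally
            {
                // Always release the flag, even if the refresh throws.
                Interlocked.Exchange(ref _refreshing, 0);
            }
        });
    }
}
```

With this in place, the call site could become `BackgroundRefresher.TryStartRefresh(() => downloadFile(url, LocationToSave));`, and the cache update could go at the end of the delegate.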
I have an application that runs both as a console and as a windows service. It processes files and can handle several files at once with threading. Every now and then it fetches a file that is "tagged" as a testfile.
The flow is something like this:
Read file
Determine if test
Validate file
Save contents of file into db
Move file
Report to other service about the file.
All the steps need to know if the file is a testfile or not, but not all the steps have access to the file per se.
Instead of passing a bool isTest parameter to every method I would like to create a context variable that is applicable to just that file and that particular execution (specific for the stack). Somewhat similar to OperationContext in WCF. I would use ThreadStatic, but I have a lot of async/await, and I'm afraid that the thread can be re-used by another context.
Is there a way to keep a variable in a sort of session or context that is bound to a specific execution? Something like this (just example code):
var isTest = true;
var fileProcessor = new FileProcessor(); // It's injected, but just an example.
using(ContextFactory.CreateContext(isTest))
{
// Process file.
// All methods in this stack should be able to determine if it's a testfile or not.
fileProcessor.ProcessFile(myFilePath);
}
public class FileProcessor
{
public void ProcessFile(string fileName)
{
// Should be able to determine if it's a test or not.
var isTest = ContextFactory.IsTest();
// This method will call other classes and other methods in a long chain.
}
}
I'm using Unity for IoC and .NET 4.5.
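One way such an ambient context could look is sketched below. It is not from the question: it assumes .NET 4.6 or later, because it relies on `AsyncLocal<T>`, which flows with the execution context across `await` instead of being tied to a thread. `FileContext`, `Begin` and `IsTest` are hypothetical names standing in for the question's `ContextFactory`.

```csharp
using System;
using System.Threading;

// Hypothetical ambient context: the value follows the logical call flow,
// including continuations after await, rather than a particular thread.
public static class FileContext
{
    private static readonly AsyncLocal<bool> _isTest = new AsyncLocal<bool>();

    public static bool IsTest
    {
        get { return _isTest.Value; }
    }

    // Sets the flag for the current flow and returns a scope that restores
    // the previous value when disposed.
    public static IDisposable Begin(bool isTest)
    {
        bool previous = _isTest.Value;
        _isTest.Value = isTest;
        return new Scope(previous);
    }

    private sealed class Scope : IDisposable
    {
        private readonly bool _previous;

        public Scope(bool previous)
        {
            _previous = previous;
        }

        public void Dispose()
        {
            _isTest.Value = _previous;
        }
    }
}
```

Usage would mirror the example above: `using (FileContext.Begin(isTest)) { fileProcessor.ProcessFile(myFilePath); }`, with any method further down the chain reading `FileContext.IsTest`. Note that `Begin` should be called from the method that owns the `using` block, since a value set inside an awaited async method does not flow back to its caller.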
I have an MVC3/.NET 4 application which uses Entity Framework (4.3.1 Code First)
I have wrapped EF into a Repository/UnitOfWork pattern as described here…
http://www.asp.net/mvc/tutorials/getting-started-with-ef-using-mvc/implementing-the-repository-and-unit-of-work-patterns-in-an-asp-net-mvc-application
Typically, as it explains in the article, when I require the creation of a new record I’ve been doing this…
public ActionResult Create(Course course)
{
unitOfWork.CourseRepository.Add(course);
unitOfWork.Save();
return RedirectToAction("Index");
}
However, when more than simply saving a record to a database is required I wrap the logic into what I’ve called an IService. For example…
private ICourseService courseService;
public ActionResult Create(Course course)
{
courseService.ProcessNewCourse(course);
return RedirectToAction("Index");
}
In one of my services I have something like the following…
public void ProcessNewCourse(Course course)
{
// Save the course to the database…
unitOfWork.CourseRepository.Add(course);
unitOfWork.Save();
// Generate a PDF and email some people about the new course being created, which requires more use of the unitOfWork…
var someInformation = unitOfWork.AnotherRepository.GetStuff();
var myPdfCreator = new PdfCreator();
IEnumerable<People> people = unitOfWork.PeopleRepository.GetAllThatWantNotifiying(course);
foreach(var person in people)
{
var message = "Hi " + person.FullName;
var attachment = myPdfCreator.CreatePdf();
// etc...
smtpClient.Send();
}
}
The above isn’t the actual code (my app has nothing to do with courses, I’m using view models, and I have separated the PDF creation and email message out into other classes) but the gist of what is going on is as above!
My problem is that the generation of the PDF and emailing it out is taking some time. The user just needs to know that the record has been saved to the database so I thought I would put the code below the unitOfWork.Save(); into an asynchronous method. The user can then be redirected and the server can happily take its time processing the emails, and attachments and whatever else I require it to do post save.
This is where I’m struggling.
I’ve tried a few things, the current being the following in ICourseService…
public class CourseService : ICourseService
{
private delegate void NotifyDelegate(Course course);
private NotifyDelegate notifyDelegate;
public CourseService()
{
notifyDelegate = new NotifyDelegate(this.Notify);
}
public void ProcessNewCourse(Course course)
{
// Save the course to the database…
unitOfWork.CourseRepository.Add(course);
unitOfWork.Save();
notifyDelegate.BeginInvoke(course, null, null);
}
private void Notify(Course course)
{
// All the stuff under unitOfWork.Save(); moved here.
}
}
My Questions/Problems
I’m randomly getting the error: "There is already an open DataReader associated with this Command which must be closed first." in the Notify() method.
Is it something to do with the fact that I’m trying to share the unitOfWork, and therefore a dbContext, across threads?
If so, can someone be kind enough to explain why this is a problem?
Should I be giving a new instance of unitOfWork to the Notify method?
Am I using the right patterns/classes to invoke the method asynchronously? Or should I be using something along the lines of....
new System.Threading.Tasks.Task(() => { Notify(course); }).Start();
I must say I've become very confused with the terms asynchronous, parallel, and concurrent!!
Any links to articles (c# async for idiots) would be appreciated!!
Many thanks.
UPDATE:
A little more digging got me to this SO page: https://stackoverflow.com/a/5491978/192999 which says...
"Be aware though that EF contexts are not thread safe, i.e. you cannot use the same context in more than one thread."
...so am I trying to achieve the impossible? Does this mean I should be creating a new IUnitOfWork instance for my new thread?
You could create a polling background thread that does the lengthy operation separately from your main flow. This thread could scan the database for new items (or items marked to process). This solution is pretty simple and ensures that jobs get done even if your application crashes (they will be picked up when the polling thread is started again).
You could also use a synchronised queue, if it's not terrible for a request to be 'lost' when your application crashes after the doc is requested and before it's generated/sent.
One thing is almost sure - as rikitikitik said - you will need to use a new unit of work, which means a separate transaction.
You could also look at the question "Best threading queue example / best practice".
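The in-memory queue idea, combined with "a new unit of work per job", could be sketched roughly as follows. Everything here is an assumption for illustration: `NotificationQueue<TUnitOfWork>` is a hypothetical class, the unit-of-work factory and the `notify` callback stand in for the question's `IUnitOfWork` and `Notify` method, and only the course id (not the tracked entity) is passed to the worker.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical in-memory job queue: one worker thread drains the queue and
// creates a fresh unit of work for every job, so no context is shared
// with the request thread.
public sealed class NotificationQueue<TUnitOfWork> : IDisposable
    where TUnitOfWork : IDisposable
{
    private readonly BlockingCollection<int> _courseIds = new BlockingCollection<int>();
    private readonly Task _worker;

    public NotificationQueue(Func<TUnitOfWork> createUnitOfWork, Action<TUnitOfWork, int> notify)
    {
        _worker = Task.Factory.StartNew(() =>
        {
            // Blocks while the queue is empty; ends after CompleteAdding().
            foreach (int courseId in _courseIds.GetConsumingEnumerable())
            {
                try
                {
                    using (TUnitOfWork unitOfWork = createUnitOfWork())
                    {
                        notify(unitOfWork, courseId);
                    }
                }
                catch (Exception)
                {
                    // A real implementation would log here; swallowing keeps
                    // one failed job from stopping the worker.
                }
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Called from the request thread after unitOfWork.Save().
    public void Enqueue(int courseId)
    {
        _courseIds.Add(courseId);
    }

    // Stops accepting jobs and waits for the queued ones to finish.
    public void Dispose()
    {
        _courseIds.CompleteAdding();
        _worker.Wait();
        _courseIds.Dispose();
    }
}
```

As noted above, jobs still in this queue are lost if the application crashes or the worker process recycles, which is the trade-off against the database-polling approach.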
I'm taking over a C# project, and when testing it out I'm getting errors. The error is that the log file cannot be written to because it is in use by another process. Here's the code:
public void WriteToLog(string msg)
{
if (!_LogExists)
{
this.VerifyOrCreateLogFile(); // Creates log file if it does not already exist.
}
// do the actual writing on its own thread so execution control can immediately return to the calling routine.
Thread t = new Thread(new ParameterizedThreadStart(WriteToLog));
t.Start((object)msg);
}
private void WriteToLog(object msg)
{
lock (_LogLock)
{
string message = msg as string;
using (StreamWriter sw = File.AppendText(LogFile))
{
sw.Write(message);
sw.Close();
}
}
}
_LogLock is defined as a class variable:
private object _LogLock = 0;
Based on my research and the fact that this has been working fine in a production system for a few years now, I don't know what the problem could be. The lock should prevent another thread from attempting to write to the log file.
The changes I've made that need to be tested are a lot more log usage. We're basically adding a debug mode to save much more info to the log than used to be saved.
Thanks for any help!
EDIT:
Thanks for the quick answers! The code for VerifyOrCreateLogFile() does use the _LogLock, so that shouldn't be an issue. It does do some writing to the log before it errors out, so it gets past creating the file just fine.
What seems to be the problem is that previously only one class created an instance of the log class, and now I've added instances to other classes. It makes sense that this would create problems. Changing the _LogLock field to be static fixes the issue.
Thanks again!
The lock should prevent another thread from attempting to write to the log file.
This is only true if you're using a single instance of this class.
If each (or even some) of the log requests use a separate instance, then the lock will not protect you.
You can easily "correct" this by making the _LogLock field static:
private static object _LogLock = 0;
This way, all instances will share the same lock.
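To make the difference concrete, here is a minimal sketch of a logger whose lock is shared by all instances. `SharedLockLogger` is a hypothetical class, not the asker's; it uses `File.AppendAllText` instead of a `StreamWriter` purely to keep the example short.

```csharp
using System;
using System.IO;

// Hypothetical logger: the lock is static, so every instance in the
// process serializes on the same object before touching the file.
public class SharedLockLogger
{
    private static readonly object LogLock = new object();
    private readonly string _logFile;

    public SharedLockLogger(string logFile)
    {
        _logFile = logFile;
    }

    public void WriteToLog(string message)
    {
        lock (LogLock)
        {
            // Opens, appends and closes the file while holding the lock.
            File.AppendAllText(_logFile, message + Environment.NewLine);
        }
    }
}
```

With an instance-level lock, two `SharedLockLogger` objects pointing at the same path could both enter `WriteToLog` at once and collide on the file; the static field rules that out within a single process (it does not help across processes).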
I see 2 problems with the code:
The lock must be the same among all "users" of this Log class; the easiest solution is to make either _LogLock or the complete class static
VerifyOrCreateLogFile could pose a problem if 2 or more parallel threads call WriteToLog when _LogExists is false...
One possibility is that the OS isn't releasing the file lock quickly enough: you exit the lock in WriteToLog, and another thread that was blocked waiting for the lock tries to open the file before the OS has finished releasing it. Yes, it can happen. You either need to sleep for a little before trying to open the file, or centralize the writing to the log in a dedicated object (so that it and only it has access to this file and you don't have to worry about file lock contention).
Another possibility is that you need to lock around
if (!_LogExists) {
this.VerifyOrCreateLogFile(); // Creates log file if it does not already exist.
}
The third possibility is that you have multiple instances of whatever class is housing these methods. The lock object won't be shared across instances (make it static to solve this).
At the end of the day, unless you're an expert in writing safe multi-threaded code, just let someone else worry about this stuff for you. Use a framework that handles these issues for you (log4net?).
You can make the code work by simply removing sw.Close(); from your code (the using block already closes the writer). Do that and it will work fine.
I'm fairly new to C#, and recently built a small webapp using .NET 4.0. This app has 2 parts: one is designed to run permanently and will continuously fetch data from given resources on the web. The other one accesses that data upon request to analyze it. I'm struggling with the first part.
My initial approach was to set up a Timer object that would execute a fetch operation (whatever that operation is doesn't really matter here) every, say, 5 minutes. I would define that timer on Application_Start and let it live after that.
However, I recently realized that applications are created / destroyed based on user requests (from my observation they seem to be destroyed after some time of inactivity). As a consequence, my background activity will stop / resume out of my control where I would like it to run continuously, with absolutely no interruption.
So here comes my question: is that achievable in a webapp? Or do I absolutely need a separate Windows service for that kind of thing?
Thanks in advance for your precious help!
Guillaume
While doing this in a web app is not ideal, it is achievable, given that the site is always up.
Here's a sample: I'm creating a Cache item in the global.asax with an expiration. When it expires, an event is fired. You can fetch your data or whatever in the OnRemove() event.
Then you can set up a call to a page (preferably a very small one) that will trigger code in Application_BeginRequest, which adds the Cache item back with an expiration.
global.asax:
private const string VendorNotificationCacheKey = "VendorNotification";
private const int IntervalInMinutes = 60; //Expires after X minutes & runs tasks
protected void Application_Start(object sender, EventArgs e)
{
//Set value in cache with expiration time
CacheItemRemovedCallback callback = OnRemove;
Context.Cache.Add(VendorNotificationCacheKey, DateTime.Now, null, DateTime.Now.AddMinutes(IntervalInMinutes), TimeSpan.Zero,
CacheItemPriority.Normal, callback);
}
private void OnRemove(string key, object value, CacheItemRemovedReason reason)
{
SendVendorNotification();
//Need Access to HTTPContext so cache can be re-added, so let's call a page. Application_BeginRequest will re-add the cache.
var siteUrl = ConfigurationManager.AppSettings.Get("SiteUrl");
//The using block disposes the client even if the request throws
using (var client = new WebClient())
{
client.DownloadData(siteUrl + "default.aspx");
}
}
private void SendVendorNotification()
{
//Do Tasks here
}
protected void Application_BeginRequest(object sender, EventArgs e)
{
//Re-add if it doesn't exist
if (HttpContext.Current.Request.Url.ToString().ToLower().Contains("default.aspx") &&
HttpContext.Current.Cache[VendorNotificationCacheKey] == null)
{
//ReAdd
CacheItemRemovedCallback callback = OnRemove;
Context.Cache.Add(VendorNotificationCacheKey, DateTime.Now, null, DateTime.Now.AddMinutes(IntervalInMinutes), TimeSpan.Zero,
CacheItemPriority.Normal, callback);
}
}
This works well if your scheduled task is quick.
If it's a long-running process, you definitely need to keep it out of your web app.
As long as the first request has started the application, this will keep firing every 60 minutes even if the site has no visitors.
I suggest putting it in a windows service. You avoid all the hoops mentioned above, the big one being IIS restarts. A windows service also has the following benefits:
Can automatically start when the server starts. If you are running in IIS and your server reboots, you have to wait until a request is made to start your process.
Can place this data fetching process on another machine if needed
If you end up load-balancing your website on multiple servers, you could accidentally have multiple data fetching processes causing you problems
Easier to maintain the code separately (single responsibility principle), since it's just doing what it needs to do and not also trying to fool IIS.
Create a static class with a constructor, creating a timer event.
However like Steve Sloka mentioned, IIS has a timeout that you will have to manipulate to keep the site going.
using System.Runtime.Remoting.Messaging;
public static class Variables
{
static Variables()
{
m_wClass = new WorkerClass();
// creates and registers an event timer
m_flushTimer = new System.Timers.Timer(1000);
m_flushTimer.Elapsed += new System.Timers.ElapsedEventHandler(OnFlushTimer);
m_flushTimer.Start();
}
private static void OnFlushTimer(object o, System.Timers.ElapsedEventArgs args)
{
// determine the frequency of your update
if (System.DateTime.Now - m_timer1LastUpdateTime > new System.TimeSpan(0,1,0))
{
// call your class to do the update
m_wClass.DoMyThing();
m_timer1LastUpdateTime = System.DateTime.Now;
}
}
private static readonly System.Timers.Timer m_flushTimer;
private static System.DateTime m_timer1LastUpdateTime = System.DateTime.MinValue;
private static readonly WorkerClass m_wClass;
}
public class WorkerClass
{
public delegate WorkerClass MyDelegate();
public void DoMyThing()
{
m_test = "Hi";
m_test2 = "Bye";
//create async call to do the work
MyDelegate myDel = new MyDelegate(Execute);
AsyncCallback cb = new AsyncCallback(CommandCallBack);
IAsyncResult ar = myDel.BeginInvoke(cb, null);
}
private WorkerClass Execute()
{
//do my stuff in an async call
m_test2 = "Later";
return this;
}
public void CommandCallBack(IAsyncResult ar)
{
// this is called when your task is complete
AsyncResult asyncResult = (AsyncResult)ar;
MyDelegate myDel = (MyDelegate)asyncResult.AsyncDelegate;
WorkerClass command = myDel.EndInvoke(ar);
// command is a reference to the original class that invoked the async call
// m_test will equal "Hi"
// m_test2 will equal "Later";
}
private string m_test;
private string m_test2;
}
I think you can achieve it by using a BackgroundWorker, but I would rather suggest you go for a service.
Your application context lives as long as your worker process in IIS is functioning. In IIS there are some default timeouts for when the worker process will shut down or recycle (e.g. an idle timeout of 20 minutes, or a regular recycle interval of 1740 minutes).
That said, if you adjust those settings in IIS, you should be able to have the requests live, however, the other answers of using a Service would work as well, just a matter of how you want to implement.
I recently made a file upload functionality for uploading Access files to the database (not the best way but just a temporary fix to a longterm issue).
I solved it by creating a background thread that ran through the ProcessAccess function, and was deleted when completed.
Unless IIS has a setting in which it kills a thread after a set amount of time regardless of inactivity, you should be able to create a thread that calls a function that never ends. Don't use recursion, because the number of open function calls will eventually blow up in your face; just have a for(;;) loop run 5,000,000 times so it'll keep busy :)
The Application Initialization Module for IIS 7.5 does precisely this type of init work. More details on the module are available under "Application Initialization Module".
In a Windows Forms window, multiple events can trigger an asynchronous method. This method downloads a file and caches it. My problem is that I want that method to be executed only once. In other words, I want to prevent the file from being downloaded multiple times.
If the method downloading the file is triggered twice, I want the second call to wait for the file (or wait for the first method to be done).
Does someone have an idea on how to achieve that?
UPDATE: I am simply trying to prevent unnecessary downloads. In my case, when a client hovers the mouse over an item in a ListBox for more than a couple of milliseconds, we start the download. We make the assumption that the user will click and request the file. What can potentially happen is that the user keeps the mouse over the item for one second and then clicks. In this case two downloads start. I am looking for the best way to handle such a scenario.
UPDATE 2: There is a possibility that the user will move the mouse over multiple items. As a consequence, multiple downloads will occur. I've not really thought about this scenario, but right now if we face it we don't abandon the download. The file will be downloaded (files are usually around 50-100 KB) and then cached.
Maintain the state of what's happening in a form variable and have your async method check that state before it does anything. Make sure you synchronize access to it, though! Mutexes and semaphores are good for this kind of thing.
If you can download different files simultaneously, you'll need to keep track of what's being downloaded in a list for reference.
If only one file can be downloaded at a time, and you don't want to queue things up, you could just unhook the event while something is being downloaded, too, and rehook it when the download is complete.
Here is a dummy implementation that supports multiple file downloads:
Dictionary<string, object> downloadLocks = new Dictionary<string, object>();
void DownloadFile(string localFile, string url)
{
object fileLock;
lock (downloadLocks)
{
if (!downloadLocks.TryGetValue(url, out fileLock))
{
fileLock = new object();
downloadLocks[url] = fileLock;
}
}
lock (fileLock)
{
// check if file is already downloaded
// if not then download file
}
}
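An alternative to per-URL lock objects is to store one lazily started task per URL, so that later callers simply wait on the download that is already in flight. This is a sketch under assumptions: `DownloadCache` is a hypothetical class, and the `download` delegate stands in for whatever actually fetches the file.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical cache: at most one download is started per URL, and every
// caller for that URL receives the same task.
public sealed class DownloadCache
{
    private readonly ConcurrentDictionary<string, Lazy<Task<byte[]>>> _downloads =
        new ConcurrentDictionary<string, Lazy<Task<byte[]>>>();
    private readonly Func<string, byte[]> _download;

    public DownloadCache(Func<string, byte[]> download)
    {
        _download = download;
    }

    public Task<byte[]> GetAsync(string url)
    {
        // GetOrAdd may build several Lazy wrappers under contention, but only
        // one is stored, and only that one ever starts a download.
        Lazy<Task<byte[]>> entry = _downloads.GetOrAdd(url, key =>
            new Lazy<Task<byte[]>>(() =>
                Task.Factory.StartNew<byte[]>(() => _download(key))));
        return entry.Value;
    }
}
```

In the hover-then-click scenario, both events would call `GetAsync(url)`; the click handler attaches a continuation to (or waits on) the task the hover already started. One caveat of this sketch is that a failed download stays cached as a faulted task until the entry is removed.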
You can simply wrap your method call within a lock statement like this:
private static readonly Object padLock = new Object();
...
lock(padLock)
{
YourMethod();
}
I'm not sure how it would be done in C#, but in Java you would synchronize on a private static final object in the class before downloading the file. This would block any further requests until the current one was completed. You could then check to see if the file was downloaded or not and act appropriately.
private static final Object lock = new Object();
private File theFile;
public void method() {
synchronized(lock) {
// only download when the file has not been fetched yet
if(theFile == null) {
//download the file
}
}
}
In general, I agree with Michael, use a lock around the code that actually gets the file. However, if there's a single event that always occurs first and you can always load the file then, consider using Futures. In the initial event, start the future running
Future<String> file = InThe.Future<String>(delegate { return LoadFile(); });
and in every other event, wait on the future's value
DoSomethingWith(file.Value);
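The `InThe.Future` call above comes from a third-party library. With only the Task Parallel Library, roughly the same idea could be sketched like this; `FileFuture` is a hypothetical wrapper, and `loadFile` stands in for the `LoadFile()` method mentioned above.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical future: starts loading immediately, and Value blocks until
// the result is available. The load runs exactly once.
public sealed class FileFuture
{
    private readonly Task<string> _file;

    public FileFuture(Func<string> loadFile)
    {
        // Kick off the load on a thread-pool thread right away.
        _file = Task.Factory.StartNew<string>(loadFile);
    }

    // Waits for the load if it is still running; afterwards returns the
    // cached result without loading again.
    public string Value
    {
        get { return _file.Result; }
    }
}
```

The initial event would create the `FileFuture`, and every later event would read `Value`. On a UI thread, blocking on `Value` freezes the form until the load completes, so attaching a continuation to the task may be preferable there.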
If you want one thread to wait for another thread to finish a task, you probably want to use a ManualResetEvent. Maybe something like this:
private ManualResetEvent downloadCompleted = new ManualResetEvent(false); // starts non-signaled
private bool downloadStarted = false;
public void Download()
{
bool doTheDownload = false;
lock(downloadCompleted)
{
if (!downloadStarted)
{
downloadCompleted.Reset();
downloadStarted = true;
doTheDownload = true;
}
}
if (doTheDownload)
{
// Code to do the download
lock(downloadCompleted)
{
downloadStarted = false;
}
// When finished notify anyone waiting.
downloadCompleted.Set();
}
else
{
// Wait until it is done...
downloadCompleted.WaitOne();
}
}