Task Parallel Library - I don't understand what I'm doing wrong - c#

This is a two part question.
I have a class that gets all processes asynchronously and polls them for CPU usage. Yesterday I had a bug with it and it was solved here.
The first part of the question is why the solution helped. I didn't understand the explanation.
The second part of the question is that I still get an "Object reference not set to an instance of object" exception occasionally when I try to print the result at the end of the process. This is because item.Key is indeed null. I don't understand why that is because I put a breakpoint checking for (process == null) and it was never hit. What am I doing wrong?
Code is below.
class ProcessCpuUsageGetter
{
private IDictionary<Process, int> _usage;
public IDictionary<Process, int> Usage { get { return _usage; } }
public ProcessCpuUsageGetter()
{
while (true)
{
Process[] processes = Process.GetProcesses();
int processCount = processes.Count();
Task[] tasks = new Task[processCount];
_usage = new Dictionary<Process, int>();
for (int i = 0; i < processCount; i++)
{
var localI = i;
var localProcess = processes[localI];
tasks[localI] = Task.Factory.StartNew(() => DoWork(localProcess));
}
Task.WaitAll(tasks);
foreach (var item in Usage)
{
Console.WriteLine("{0} - {1}%", item.Key.ProcessName, item.Value);
}
}
}
private void DoWork(object o)
{
Process process = (Process)o;
PerformanceCounter pc = new PerformanceCounter("Process", "% Processor Time", process.ProcessName, true);
pc.NextValue();
Thread.Sleep(1000);
int cpuPercent = (int)pc.NextValue() / Environment.ProcessorCount;
if (process == null)
{
var x = 5;
}
if (_usage == null)
{
var t = 6;
}
_usage.Add(process, cpuPercent);
}
}

The line
_usage.Add(process, cpuPercent);
is accessing a collection that is not thread-safe from several threads at once.
Use a ConcurrentDictionary<K,V> instead of the normal dictionary.
The 'null reference' error is just a random symptom, you could get other errors too.
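A minimal sketch of the ConcurrentDictionary suggestion, assuming .NET 4.0 or later. The class name UsageCollector and the Measure helper are hypothetical stand-ins for the asker's class and its PerformanceCounter sampling; integer ids replace Process objects so the example runs anywhere.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class UsageCollector
{
    // Hypothetical stand-in for the CPU measurement done in DoWork.
    private static int Measure(int id)
    {
        return id % 100;
    }

    public static ConcurrentDictionary<int, int> Collect(int count)
    {
        var usage = new ConcurrentDictionary<int, int>();
        var tasks = new Task[count];
        for (int i = 0; i < count; i++)
        {
            int id = i; // copy the loop variable before the lambda captures it
            tasks[id] = Task.Factory.StartNew(() =>
            {
                // TryAdd may be called from many threads concurrently.
                usage.TryAdd(id, Measure(id));
            });
        }
        Task.WaitAll(tasks);
        return usage;
    }
}
```

Calling UsageCollector.Collect(200) should always return 200 entries, whereas the same loop over a plain Dictionary can lose entries, corrupt its internal state, or throw.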

Related

Display CPU usage of all running processes, programmatically?

Question
How do I get the cpu usage of each process into PopulateApplications()?
What's happening
getCPUUsage() gives me the same value for each process. It's like it's getting the cpu usage for only one process.
The rest of the code seems to work fine.
getCPUUsage() from class Core:
public static double getCPUUsage()
{
ManagementObject processor = new ManagementObject("Win32_PerfFormattedData_PerfOS_Processor.Name='_Total'");
processor.Get();
return double.Parse(processor.Properties["PercentProcessorTime"].Value.ToString());
}
What I've tried
In form1, I have a method by which I display information about processes like icons, name, and statuses (i.e. running/not running).
void PopulateApplications()
{
DoubleBufferedd(dataGridView1, true);
int rcount = dataGridView1.Rows.Count;
int rcurIndex = 0;
foreach (Process p in Process.GetProcesses())
{
try
{
if (File.Exists(p.MainModule.FileName))
{
var icon = Icon.ExtractAssociatedIcon(p.MainModule.FileName);
Image ima = icon.ToBitmap();
ima = resizeImage(ima, new Size(25, 25));
ima = (Image)(new Bitmap(ima, new Size(25, 25)));
String status = p.Responding ? "Running" : "Not Responding";
if (rcurIndex < rcount - 1)
{
var currentRow = dataGridView1.Rows[rcurIndex];
currentRow.Cells[0].Value = ima;
currentRow.Cells[1].Value = p.ProcessName;
currentRow.Cells[2].Value = cpuusage;
currentRow.Cells[3].Value = status;
}
else
{
dataGridView1.Rows.Add(
ima, p.ProcessName,cpuusage, status);//false, ima, p.ProcessName, status);
}
rcurIndex++;
}
}
catch ( Exception e)
{
string t = "error";
}
}
if (rcurIndex < rcount - 1)
{
for (int i = rcurIndex; i < rcount - 1; i++)
{
dataGridView1.Rows.RemoveAt(rcurIndex);
}
}
}
I added this line:
currentRow.Cells[2].Value = cpuusage;
cpuusage is a variable of type double.
I changed this line, also, to include addition of cpuusage:
dataGridView1.Rows.Add(
ima, p.ProcessName,cpuusage, status);
Now I have a background worker event, dowork, whereby I use cpuusage to get the cpu usage values:
this.Invoke(new Action(() => cpuusage = Core.getCPUUsage()));
Maybe I don't need to call the method getCPUUsage() through backgroundworker.
This is what I see when I run the program:
All the processes have the same CPU usage, which makes no sense.
Then, when there is an update, I see:
Again, all the cells show the same CPU usage value, even though each of the many processes listed on the left should have its own.

Use Task.Run instead of Delegate.BeginInvoke

I have recently upgraded my projects to ASP.NET 4.5 and I have been waiting a long time to use 4.5's asynchronous capabilities. After reading the documentation I'm not sure whether I can improve my code at all.
I want to execute a task asynchronously and then forget about it. The way that I'm currently doing this is by creating delegates and then using BeginInvoke.
Here's one of the filters in my project, which creates an audit in our database every time a user accesses a resource that must be audited:
public override void OnActionExecuting(ActionExecutingContext filterContext)
{
var request = filterContext.HttpContext.Request;
var id = WebSecurity.CurrentUserId;
var invoker = new MethodInvoker(delegate
{
var audit = new Audit
{
Id = Guid.NewGuid(),
IPAddress = request.UserHostAddress,
UserId = id,
Resource = request.RawUrl,
Timestamp = DateTime.UtcNow
};
var database = (new NinjectBinder()).Kernel.Get<IDatabaseWorker>();
database.Audits.InsertOrUpdate(audit);
database.Save();
});
invoker.BeginInvoke(StopAsynchronousMethod, invoker);
base.OnActionExecuting(filterContext);
}
But in order to finish this asynchronous task, I need to always define a callback, which looks like this:
public void StopAsynchronousMethod(IAsyncResult result)
{
var state = (MethodInvoker)result.AsyncState;
try
{
state.EndInvoke(result);
}
catch (Exception e)
{
var username = WebSecurity.CurrentUserName;
Debugging.DispatchExceptionEmail(e, username);
}
}
I would rather not use the callback at all due to the fact that I do not need a result from the task that I am invoking asynchronously.
How can I improve this code with Task.Run() (or async and await)?
If I understood your requirements correctly, you want to kick off a task and then forget about it. When the task completes, and if an exception occurred, you want to log it.
I'd use Task.Run to create a task, followed by ContinueWith to attach a continuation task. This continuation task will log any exception that was thrown from the parent task. Also, use TaskContinuationOptions.OnlyOnFaulted to make sure the continuation only runs if an exception occurred.
Task.Run(() => {
var audit = new Audit
{
Id = Guid.NewGuid(),
IPAddress = request.UserHostAddress,
UserId = id,
Resource = request.RawUrl,
Timestamp = DateTime.UtcNow
};
var database = (new NinjectBinder()).Kernel.Get<IDatabaseWorker>();
database.Audits.InsertOrUpdate(audit);
database.Save();
}).ContinueWith(task => {
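task.Exception.Handle(ex => {
var username = WebSecurity.CurrentUserName;
Debugging.DispatchExceptionEmail(ex, username);
// Handle takes a Func<Exception, bool>; returning true marks the exception as handled
return true;
});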
}, TaskContinuationOptions.OnlyOnFaulted);
As a side-note, background tasks and fire-and-forget scenarios in ASP.NET are highly discouraged. See The Dangers of Implementing Recurring Background Tasks In ASP.NET
It may sound a bit out of scope, but if you just want to forget about the work after you launch it, why not use ThreadPool directly?
Something like:
ThreadPool.QueueUserWorkItem(
x =>
{
try
{
// Do something
...
}
catch (Exception e)
{
// Log something
...
}
});
I had to do some performance benchmarking for different async call methods and I found that (not surprisingly) ThreadPool works much better, but also that, actually, BeginInvoke is not that bad (I am on .NET 4.5). That's what I found out with the code at the end of the post. I did not find something like this online, so I took the time to check it myself. Each call is not exactly equal, but it is more or less functionally equivalent in terms of what it does:
ThreadPool: 70.80ms
Task: 90.88ms
BeginInvoke: 121.88ms
Thread: 4657.52ms
public class Program
{
public delegate void ThisDoesSomething();
// Perform a very simple operation to see the overhead of
// different async calls types.
public static void Main(string[] args)
{
const int repetitions = 25;
const int calls = 1000;
var results = new List<Tuple<string, double>>();
Console.WriteLine(
"{0} parallel calls, {1} repetitions for better statistics\n",
calls,
repetitions);
// Threads
Console.Write("Running Threads");
results.Add(new Tuple<string, double>("Threads", RunOnThreads(repetitions, calls)));
Console.WriteLine();
// BeginInvoke
Console.Write("Running BeginInvoke");
results.Add(new Tuple<string, double>("BeginInvoke", RunOnBeginInvoke(repetitions, calls)));
Console.WriteLine();
// Tasks
Console.Write("Running Tasks");
results.Add(new Tuple<string, double>("Tasks", RunOnTasks(repetitions, calls)));
Console.WriteLine();
// Thread Pool
Console.Write("Running Thread pool");
results.Add(new Tuple<string, double>("ThreadPool", RunOnThreadPool(repetitions, calls)));
Console.WriteLine();
Console.WriteLine();
// Show results
results = results.OrderBy(rs => rs.Item2).ToList();
foreach (var result in results)
{
Console.WriteLine(
"{0}: Done in {1}ms avg",
result.Item1,
(result.Item2 / repetitions).ToString("0.00"));
}
Console.WriteLine("Press a key to exit");
Console.ReadKey();
}
/// <summary>
/// The do stuff.
/// </summary>
public static void DoStuff()
{
Console.Write("*");
}
public static double RunOnThreads(int repetitions, int calls)
{
var totalMs = 0.0;
for (var j = 0; j < repetitions; j++)
{
Console.Write(".");
var toProcess = calls;
var stopwatch = new Stopwatch();
var resetEvent = new ManualResetEvent(false);
var threadList = new List<Thread>();
for (var i = 0; i < calls; i++)
{
threadList.Add(new Thread(() =>
{
// Do something
DoStuff();
// Safely decrement the counter
if (Interlocked.Decrement(ref toProcess) == 0)
{
resetEvent.Set();
}
}));
}
stopwatch.Start();
foreach (var thread in threadList)
{
thread.Start();
}
resetEvent.WaitOne();
stopwatch.Stop();
totalMs += stopwatch.ElapsedMilliseconds;
}
return totalMs;
}
public static double RunOnThreadPool(int repetitions, int calls)
{
var totalMs = 0.0;
for (var j = 0; j < repetitions; j++)
{
Console.Write(".");
var toProcess = calls;
var resetEvent = new ManualResetEvent(false);
var stopwatch = new Stopwatch();
var list = new List<int>();
for (var i = 0; i < calls; i++)
{
list.Add(i);
}
stopwatch.Start();
for (var i = 0; i < calls; i++)
{
ThreadPool.QueueUserWorkItem(
x =>
{
// Do something
DoStuff();
// Safely decrement the counter
if (Interlocked.Decrement(ref toProcess) == 0)
{
resetEvent.Set();
}
},
list[i]);
}
resetEvent.WaitOne();
stopwatch.Stop();
totalMs += stopwatch.ElapsedMilliseconds;
}
return totalMs;
}
public static double RunOnBeginInvoke(int repetitions, int calls)
{
var totalMs = 0.0;
for (var j = 0; j < repetitions; j++)
{
Console.Write(".");
var beginInvokeStopwatch = new Stopwatch();
var delegateList = new List<ThisDoesSomething>();
var resultsList = new List<IAsyncResult>();
for (var i = 0; i < calls; i++)
{
delegateList.Add(DoStuff);
}
beginInvokeStopwatch.Start();
foreach (var delegateToCall in delegateList)
{
resultsList.Add(delegateToCall.BeginInvoke(null, null));
}
// We lose a bit of accuracy, but if the loop is big enough,
// it should not really matter
while (resultsList.Any(rs => !rs.IsCompleted))
{
Thread.Sleep(10);
}
beginInvokeStopwatch.Stop();
totalMs += beginInvokeStopwatch.ElapsedMilliseconds;
}
return totalMs;
}
public static double RunOnTasks(int repetitions, int calls)
{
var totalMs = 0.0;
for (var j = 0; j < repetitions; j++)
{
Console.Write(".");
var resultsList = new List<Task>();
var stopwatch = new Stopwatch();
stopwatch.Start();
for (var i = 0; i < calls; i++)
{
resultsList.Add(Task.Factory.StartNew(DoStuff));
}
// We lose a bit of accuracy, but if the loop is big enough,
// it should not really matter
while (resultsList.Any(task => !task.IsCompleted))
{
Thread.Sleep(10);
}
stopwatch.Stop();
totalMs += stopwatch.ElapsedMilliseconds;
}
return totalMs;
}
}
Here's one of the filters in my project, which creates an audit in our database every time a user accesses a resource that must be audited
Auditing is certainly not something I would call "fire and forget". Remember, on ASP.NET, "fire and forget" means "I don't care whether this code actually executes or not". So, if your desired semantics are that audits may occasionally be missing, then (and only then) you can use fire and forget for your audits.
If you want to ensure your audits are all correct, then either wait for the audit save to complete before sending the response, or queue the audit information to reliable storage (e.g., Azure queue or MSMQ) and have an independent backend (e.g., Azure worker role or Win32 service) process the audits in that queue.
But if you want to live dangerously (accepting that occasionally audits may be missing), you can mitigate the problems by registering the work with the ASP.NET runtime. Using the BackgroundTaskManager from my blog:
public override void OnActionExecuting(ActionExecutingContext filterContext)
{
var request = filterContext.HttpContext.Request;
var id = WebSecurity.CurrentUserId;
BackgroundTaskManager.Run(() =>
{
try
{
var audit = new Audit
{
Id = Guid.NewGuid(),
IPAddress = request.UserHostAddress,
UserId = id,
Resource = request.RawUrl,
Timestamp = DateTime.UtcNow
};
var database = (new NinjectBinder()).Kernel.Get<IDatabaseWorker>();
database.Audits.InsertOrUpdate(audit);
database.Save();
}
catch (Exception e)
{
var username = WebSecurity.CurrentUserName;
Debugging.DispatchExceptionEmail(e, username);
}
});
base.OnActionExecuting(filterContext);
}

Parallel.For loop for spell checking using NHunspell and c#

I have a list of strings. If I run spell checking with NHunspell sequentially, everything works fine, but if I use a Parallel.For loop over the list, the application stops working partway through (with some kind of address violation error).
public static bool IsSpellingRight(string inputword, byte[] frDic, byte[] frAff, byte[] enDic, byte[] enAff)
{
if (inputword.Length != 0)
{
bool correct;
if (IsEnglish(inputword))
{
using (var hunspell = new Hunspell(enAff, enDic))
{
correct = hunspell.Spell(inputword);
}
}
else
{
using (var hunspell = new Hunspell(frAff, frDic))
{
correct = hunspell.Spell(inputword);
}
}
return correct ;
}
return false;
}
Edit:
var tokenSource = new CancellationTokenSource();
CancellationToken ct = tokenSource.Token;
var poptions = new ParallelOptions();
// Keep one core/CPU free...
poptions.MaxDegreeOfParallelism = Environment.ProcessorCount - 1;
Task task = Task.Factory.StartNew(delegate
{
Parallel.For(0, total, poptions, i =>
{
if (words[i] != "")
{
_totalWords++;
if (IsSpellingRight(words[i],dictFileBytes,
affFileBytes,dictFileBytesE,affFileBytesE))
{
// do something
}
else
{
BeginInvoke((Action) (() =>
{
//do something on UI thread
}));
}
}
});
}, tokenSource.Token);
task.ContinueWith((t) => BeginInvoke((Action) (() =>
{
MessageBox.Show("Done");
})));
I think you should forget about your parallel loop and implement things correctly first.
Are you aware of the fact that this code loads and constructs the dictionary:
using (var hunspell = new Hunspell(enAff, enDic))
{
correct = hunspell.Spell(inputword);
}
You are loading and constructing the dictionary over and over again with your code. This is awfully slow! Load your dictionary once, check all the words, then dispose it. And don't do this in parallel, because Hunspell objects are not thread-safe.
Pseudocode:
Hunspell hunspell = null;
try
{
// load and construct the dictionary once
hunspell = new Hunspell(enAff, enDic);
for( ... )
{
hunspell.Spell(inputword[i]);
}
}
finally
{
if( hunspell != null ) hunspell.Dispose();
}
If you need to check a large number of words in parallel, consider reading this article:
http://www.codeproject.com/Articles/43769/Spell-Check-Hyphenation-and-Thesaurus-for-NET-with
Ok, now I can see a potential problem. In line
_totalWords++;
you're incrementing a value that (I suppose) is declared somewhere outside the loop, so several threads update it at the same time. Use a locking mechanism.
edit:
Also, you could use Interlocked.Increment(ref val);, which would be faster than simple locking.
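The Interlocked.Increment suggestion can be sketched as follows. The class and method names (WordCounter, CountNonEmpty) are made up for illustration, and the local variable total plays the role of _totalWords from the question.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class WordCounter
{
    // Counts the non-empty entries of words from a parallel loop.
    public static int CountNonEmpty(string[] words)
    {
        int total = 0; // shared by all iterations of the loop
        Parallel.For(0, words.Length, i =>
        {
            if (words[i] != "")
            {
                // Atomic read-modify-write; a plain total++ here could lose updates.
                Interlocked.Increment(ref total);
            }
        });
        return total;
    }
}
```

This only makes the counter safe. It does not make the Hunspell calls themselves safe to run in parallel.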
edit2:
Here's what the locking I described in the comment should look like for the problem you encountered:
static object Locker = new object(); //anywhere in the class
//then in your method
if (inputword.Length != 0)
{
bool correct;
bool isEnglish;
lock(Locker) {isEnglish = IsEnglish(inputword);}
if(isEnglish)
{
//..do your stuff
}
//rest of your function
}

Producer Consumer model using TPL, Tasks in .net 4.0

I have a fairly large XML file(around 1-2GB).
The requirement is to persist the xml data in to database.
Currently this is achieved in 3 steps.
1. Read the large file with as small a memory footprint as possible.
2. Create entities from the XML data.
3. Store the data from the created entities in the database using SqlBulkCopy.
To achieve better performance I want to create a Producer-consumer model where the producer creates a set of entities say a batch of 10K and adds it to a Queue. And the consumer should take the batch of entities from the queue and persist to the database using sqlbulkcopy.
Thanks,
Gokul
void Main()
{
int iCount = 0;
string fileName = @"C:\Data\CatalogIndex.xml";
DateTime startTime = DateTime.Now;
Console.WriteLine("Start Time: {0}", startTime);
FileInfo fi = new FileInfo(fileName);
Console.WriteLine("File Size:{0} MB", fi.Length / 1048576.0);
/* I want to change this loop to create a producer consumer pattern here to process the data parallel-ly
*/
foreach (var element in StreamElements(fileName,"title"))
{
iCount++;
}
Console.WriteLine("Count: {0}", iCount);
Console.WriteLine("End Time: {0}, Time Taken:{1}", DateTime.Now, DateTime.Now - startTime);
}
private static IEnumerable<XElement> StreamElements(string fileName, string elementName)
{
using (var rdr = XmlReader.Create(fileName))
{
rdr.MoveToContent();
while (!rdr.EOF)
{
if ((rdr.NodeType == XmlNodeType.Element) && (rdr.Name == elementName))
{
var e = XElement.ReadFrom(rdr) as XElement;
yield return e;
}
else
{
rdr.Read();
}
}
rdr.Close();
}
}
Is this what you are trying to do?
void Main()
{
const int inputCollectionBufferSize = 1024;
const int bulkInsertBufferCapacity = 100;
const int bulkInsertConcurrency = 4;
BlockingCollection<object> inputCollection = new BlockingCollection<object>(inputCollectionBufferSize);
Task loadTask = Task.Factory.StartNew(() =>
{
foreach (object nextItem in ReadAllElements(...))
{
// this will potentially block if there are already enough items
inputCollection.Add(nextItem);
}
// mark this collection as done
inputCollection.CompleteAdding();
});
Action parseAction = () =>
{
List<object> bulkInsertBuffer = new List<object>(bulkInsertBufferCapacity);
foreach (object nextItem in inputCollection.GetConsumingEnumerable())
{
// List<T> exposes Count, not Length
if (bulkInsertBuffer.Count == bulkInsertBufferCapacity)
{
CommitBuffer(bulkInsertBuffer);
bulkInsertBuffer.Clear();
}
bulkInsertBuffer.Add(nextItem);
}
// commit whatever is left once the producer has finished
if (bulkInsertBuffer.Count > 0)
{
CommitBuffer(bulkInsertBuffer);
}
};
List<Task> parseTasks = new List<Task>(bulkInsertConcurrency);
for (int i = 0; i < bulkInsertConcurrency; i++)
{
parseTasks.Add(Task.Factory.StartNew(parseAction));
}
// wait before exiting
loadTask.Wait();
Task.WaitAll(parseTasks.ToArray());
}
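For experimenting with the same shape without an XML file or a database, here is a self-contained sketch, assuming .NET 4.0 or later. Integers stand in for the entities, and an Interlocked.Add on a counter stands in for the SqlBulkCopy call. The name BatchPipeline and its parameters are invented for this example.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class BatchPipeline
{
    // Returns how many items the consumers "committed".
    public static int Run(int itemCount, int batchSize, int consumerCount)
    {
        var queue = new BlockingCollection<int>(1024); // bounded: Add blocks when full
        int committed = 0;

        Task producer = Task.Factory.StartNew(() =>
        {
            try
            {
                for (int i = 0; i < itemCount; i++) queue.Add(i);
            }
            finally
            {
                queue.CompleteAdding(); // lets the consumers' foreach loops end
            }
        });

        var consumers = new Task[consumerCount];
        for (int c = 0; c < consumerCount; c++)
        {
            consumers[c] = Task.Factory.StartNew(() =>
            {
                var batch = new List<int>(batchSize);
                foreach (int item in queue.GetConsumingEnumerable())
                {
                    batch.Add(item);
                    if (batch.Count == batchSize)
                    {
                        Interlocked.Add(ref committed, batch.Count); // stand-in for a bulk insert
                        batch.Clear();
                    }
                }
                // Flush the final, partially filled batch.
                if (batch.Count > 0) Interlocked.Add(ref committed, batch.Count);
            });
        }

        producer.Wait();
        Task.WaitAll(consumers);
        return committed;
    }
}
```

Calling CompleteAdding in a finally block means a failure in the producer does not leave the consumers blocked forever.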

ThreadPool QueueUserWorkItem with list

I would like to use the QueueUserWorkItem from the ThreadPool. When I use the following code everything works well.
private int ThreadCountSemaphore = 0;
private void (...) {
var reportingDataList = new List<LBReportingData>();
ThreadCountSemaphore = reportingDataList.Count;
using (var autoResetEvent = new AutoResetEvent(false)) {
ThreadPool.QueueUserWorkItem((o) => this.FillReportingData(settings, reportingDataList[0], autoResetEvent));
ThreadPool.QueueUserWorkItem((o) => this.FillReportingData(settings, reportingDataList[1], autoResetEvent));
ThreadPool.QueueUserWorkItem((o) => this.FillReportingData(settings, reportingDataList[2], autoResetEvent));
}
}
private void FillReportingData(...) {
if (Interlocked.Decrement(ref this.ThreadCountSemaphore) == 0) {
waitHandle.Set();
}
}
But when I use a list instead of the single method calls, my program crashes without an exception.
private void (...) {
var reportingDataList = new List<LBReportingData>();
ThreadCountSemaphore = reportingDataList.Count;
using (var autoResetEvent = new AutoResetEvent(false)) {
ThreadPool.QueueUserWorkItem((o) => this.FillReportingData(settings, reportingDataList[i], autoResetEvent));
}
}
What am I doing wrong? What should I change?
Update
Sorry, I made a mistake in the code. I use .NET 2.0 with VS2010.
Here's the complete code:
private int ThreadCountSemaphore = 0;
private IList<LBReportingData> LoadReportsForBatch() {
var reportingDataList = new List<LBReportingData>();
var settings = OnNeedEntitySettings();
if (settings.Settings.ReportDefinition != null) {
var definitionList = new List<ReportDefinitionen> { ReportDefinitionen.OrgStatus, ReportDefinitionen.Mittelwerte, ReportDefinitionen.Verteilungsstatistik };
using (var autoResetEvent = new AutoResetEvent(false)) {
foreach (var reportDefinition in definitionList) {
foreach (DataRow row in settings.Settings.ReportDefinition.Select("AuswertungsTyp = " + (int)reportDefinition)) {
reportingDataList.Add(new LBReportingData { SourceData = row, ReportType = reportDefinition });
}
}
ThreadCountSemaphore = reportingDataList.Count;
foreach(var reportingDataItem in reportingDataList) {
ThreadPool.QueueUserWorkItem((o) => this.FillReportingData(settings, reportingDataItem, autoResetEvent));
}
autoResetEvent.WaitOne();
}
}
return reportingDataList;
}
private void FillReportingData(IEntitySettings<DSLBUReportDefinition> settings, LBReportingData reportingData, AutoResetEvent waitHandle){
DoSomeWork();
if (Interlocked.Decrement(ref this.ThreadCountSemaphore) == 0) {
waitHandle.Set();
}
}
Thanks
You are disposing the WaitHandle immediately after queueing the work items. There is a race between the call to Dispose on the main thread and the call to Set on the worker thread. There may be other problems, but it is difficult to guess because the code is incomplete.
Here is how the pattern is supposed to work.
using (var finished = new CountdownEvent(1))
{
foreach (var item in reportingDataList)
{
var captured = item;
finished.AddCount();
ThreadPool.QueueUserWorkItem(
(state) =>
{
try
{
DoSomeWork(captured); // FillReportingData?
}
finally
{
finished.Signal();
}
}, null);
}
finished.Signal();
finished.Wait();
}
The code uses the CountdownEvent class. It is available in .NET 4.0 or as part of the Reactive Extensions download.
As Hans pointed out, it is not clear where "i" is coming from. I can also see that your using block exits, and disposes the event, before the work items have finished, because you never call WaitOne on it (or you have not copied that part of the code).
I would also prefer to use WaitAll rather than Interlocked.
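A rough sketch of the WaitAll approach, assuming the answer means WaitHandle.WaitAll (which is available on .NET 2.0). Each work item gets its own ManualResetEvent, so no shared counter is needed. The names WaitAllExample and Run are invented, and squaring the index stands in for the real work. Note that WaitHandle.WaitAll accepts at most 64 handles and does not support waiting on multiple handles from an STA thread.

```csharp
using System.Threading;

public static class WaitAllExample
{
    public static int[] Run(int count)
    {
        var results = new int[count];
        if (count == 0) return results; // WaitAll rejects an empty array

        var handles = new WaitHandle[count];
        for (int i = 0; i < count; i++)
        {
            int index = i; // copy the loop variable before capturing it
            var done = new ManualResetEvent(false);
            handles[index] = done;
            ThreadPool.QueueUserWorkItem(state =>
            {
                try
                {
                    results[index] = index * index; // placeholder for the real work
                }
                finally
                {
                    done.Set(); // signal even if the work throws
                }
            });
        }

        WaitHandle.WaitAll(handles); // blocks until every event is set
        foreach (WaitHandle handle in handles) handle.Close();
        return results;
    }
}
```

Because the events are closed only after WaitAll returns, this avoids the race between Set and Dispose described above.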
