I tried searching for this but did not find a suggestion well suited to the issue I am facing.
My issue is that we have a list/stack of available resources (Calculation Engines). These resources are used to perform certain calculations.
The request to perform a calculation is triggered from an external process. So when a calculation request arrives, I need to check whether any of the available resources is currently free; if they are all busy, I wait for some time and check again.
I was wondering what the best way to implement this is. I have the following code in place, but not sure if it is very safe.
If you have any further suggestions, that will be great:
void Process(int retries = 0) {
    CalcEngineConnection connection = null;
    bool securedConnection = false;
    foreach (var calcEngineConnection in _connections) {
        securedConnection = Monitor.TryEnter(calcEngineConnection);
        if (securedConnection) {
            connection = calcEngineConnection;
            break;
        }
    }
    if (securedConnection) {
        //Dequeue the next request
        var calcEnginePool = _pendingPool.Dequeue();
        //Perform the operation and exit.
        connection.RunCalc(calcEnginePool);
        Monitor.Exit(connection);
    }
    else {
        if (retries < 10)
            retries += 1;
        Thread.Sleep(200);
        Process(retries);
    }
}
I'm not sure that using Monitor is the best approach here anyway, but if you do decide to go that route, I'd refactor the above code to:
bool TryProcessWithRetries(int retries) {
    for (int attempt = 0; attempt < retries; attempt++) {
        if (TryProcess()) {
            return true;
        }
        Thread.Sleep(200);
    }
    // Throw an exception here instead?
    return false;
}

bool TryProcess() {
    foreach (var connection in _connections) {
        if (TryProcess(connection)) {
            return true;
        }
    }
    return false;
}

bool TryProcess(CalcEngineConnection connection) {
    if (!Monitor.TryEnter(connection)) {
        return false;
    }
    try {
        var calcEnginePool = _pendingPool.Dequeue();
        connection.RunCalc(calcEnginePool);
    } finally {
        Monitor.Exit(connection);
    }
    return true;
}
This decomposes the three pieces of logic:
Retrying several times
Trying each connection in a collection
Trying a single connection
It also avoids using recursion for the sake of it, and puts the Monitor.Exit call into a finally block, which it absolutely should be in.
You could replace the middle method implementation with:
return _connections.Any(TryProcess);
... but that may be a little too "clever" for its own good.
Personally I'd be tempted to move TryProcess into CalcEngineConnection itself - that way this code doesn't need to know about whether or not the connection is able to process something - it's up to the object itself. It means you can avoid having publicly visible locks, and also it would be flexible if some resources could (say) process two requests at a time in the future.
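For example, here is a minimal sketch of that shape (my illustration, not the original code; CalcEngineRequest and the private lock object are assumptions):

using System.Collections.Generic;
using System.Threading;

public class CalcEngineRequest
{
    // Whatever a pending calculation looks like.
}

public class CalcEngineConnection
{
    private readonly object padlock = new object();

    // Returns false immediately if this engine is busy; otherwise runs the
    // next pending request and returns true. Callers never see the lock.
    public bool TryProcessNext(Queue<CalcEngineRequest> pendingPool)
    {
        if (!Monitor.TryEnter(padlock))
            return false;

        try
        {
            var request = pendingPool.Dequeue();
            RunCalc(request);
        }
        finally
        {
            Monitor.Exit(padlock);
        }
        return true;
    }

    private void RunCalc(CalcEngineRequest request)
    {
        // The actual calculation lives here.
    }
}

The outer retry loop would then just walk the connections and call TryProcessNext on each.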
There are multiple issues that could potentially occur, but let's simplify your code first:
void Process(int retries = 0)
{
    foreach (var connection in _connections)
    {
        if(Monitor.TryEnter(connection))
        {
            try
            {
                //Dequeue the next request
                var calcEnginePool = _pendingPool.Dequeue();
                //Perform the operation and exit.
                connection.RunCalc(calcEnginePool);
            }
            finally
            {
                // Release the lock
                Monitor.Exit(connection);
            }
            return;
        }
    }
    if (retries < 10)
    {
        Thread.Sleep(200);
        Process(retries+1);
    }
}
This will correctly protect your connection, but note that one of the assumptions here is that your _connections list is safe and it will not be modified by another thread.
Furthermore, you might want to use a thread-safe queue for the _connections, because at certain load levels you might end up using only the first few connections (not sure if that will make a difference). To use all of your connections relatively evenly, I would place them in a queue and dequeue them; that also guarantees that no two threads are using the same connection, and you don't have to use Monitor.TryEnter().
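A rough sketch of that idea (my illustration only - CalcEnginePool and CalcEngineRequest are invented names, and RunCalc is assumed from the question):

using System.Collections.Concurrent;
using System.Collections.Generic;

public class CalcEnginePool
{
    private readonly ConcurrentQueue<CalcEngineConnection> available =
        new ConcurrentQueue<CalcEngineConnection>();

    public CalcEnginePool(IEnumerable<CalcEngineConnection> connections)
    {
        foreach (var connection in connections)
            available.Enqueue(connection);
    }

    // Returns false if every engine is currently busy; otherwise runs the
    // request on a free engine and puts the engine back afterwards.
    public bool TryProcess(CalcEngineRequest request)
    {
        CalcEngineConnection connection;
        if (!available.TryDequeue(out connection))
            return false;

        try
        {
            connection.RunCalc(request);
        }
        finally
        {
            available.Enqueue(connection);
        }
        return true;
    }
}

Because a dequeued connection is invisible to every other thread until it is enqueued again, no explicit lock is needed around the calculation itself.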
Note: I've gone through millions of questions where the issue is not disposing the reader/connection properly, or where the error is caused by badly handled lazy loading. I believe this issue is a different one, and probably related to MySQL's .NET connector.
I'm using a MySQL server (5.6) database extensively through its .NET connector (6.8.3). All tables are created with the MyISAM engine for performance reasons. I have only one process with one thread (update: in fact, that's not true; see below) accessing the DB sequentially, so there is no need for transactions and concurrency.
Today, after many hours of processing, the following piece of code:
public IEnumerable<VectorTransition> FindWithSourceVector(double[] sourceVector)
{
    var sqlConnection = this.connectionPool.Take();
    this.selectWithSourceVectorCommand.Connection = sqlConnection;
    this.selectWithSourceVectorCommand.Parameters["#epsilon"].Value
        = this.epsilonEstimator.Epsilon.Min() / 10;
    for (int d = 0; d < this.dimensionality; ++d)
    {
        this.selectWithSourceVectorCommand.Parameters["#source_" + d.ToString()]
            .Value = sourceVector[d];
    }
    // *** the following line (201) throws the exception presented below
    using (var reader = this.selectWithSourceVectorCommand.ExecuteReader())
    {
        while (reader.Read())
        {
            yield return ReaderToVectorTransition(reader);
        }
    }
    this.connectionPool.Putback(sqlConnection);
}
threw the following exception:
MySqlException: There is already an open DataReader associated with this Connection which must be closed first.
Here is the relevant part of the stack trace:
at MySql.Data.MySqlClient.ExceptionInterceptor.Throw(Exception exception)
at MySql.Data.MySqlClient.MySqlConnection.Throw(Exception ex)
at MySql.Data.MySqlClient.MySqlCommand.CheckState()
at MySql.Data.MySqlClient.MySqlCommand.ExecuteReader(CommandBehavior behavior)
at MySql.Data.MySqlClient.MySqlCommand.ExecuteReader()
at implementation.VectorTransitionsMySqlTable.<FindWithSourceVector>d__27.MoveNext() in C:\Users\bartoszp...\implementation\VectorTransitionsMySqlTable.cs:line 201
at System.Linq.Enumerable.<TakeIterator>d__3a`1.MoveNext()
at System.Linq.Buffer`1..ctor(IEnumerable`1 source)
at System.Linq.Enumerable.ToArray[TSource](IEnumerable`1 source)
at implementation.VectorTransitionService.Add(VectorTransition vectorTransition) in C:\Users\bartoszp...\implementation\VectorTransitionService.cs:line 38
at Program.Go[T](Environment`2 p, Space parentSpace, EpsilonEstimator epsilonEstimator, ThresholdEstimator thresholdEstimator, TransitionTransformer transitionTransformer, AmbiguityCalculator ac, VectorTransitionsTableFactory vttf, AxesTableFactory atf, NeighbourhoodsTableFactory ntf, AmbiguitySamplesTableFactory astf, AmbiguitySampleMatchesTableFactory asmtf, MySqlConnectionPool connectionPool, Boolean rejectDuplicates, Boolean addNew) in C:\Users\bartoszp...\Program.cs:line 323
The connectionPool.Take returns the first connection that satisfies the following predicate:
private bool IsAvailable(MySqlConnection connection)
{
    var result = false;
    try
    {
        if (connection != null
            && connection.State == System.Data.ConnectionState.Open)
        {
            result = connection.Ping();
        }
    }
    catch (Exception e)
    {
        Console.WriteLine("Ping exception: " + e.Message);
    }
    return result && connection.State == System.Data.ConnectionState.Open;
}
(This is related to my previous question, when I resolved a different, but similar issue: MySQL fatal error during information_schema query (software caused connection abort))
The FindWithSourceVector method is called by the following piece of code:
var existing
= this.vectorTransitionsTable
.FindWithSourceVector(vectorTransition.SourceVector)
.Take(2)
.ToArray();
(I need to find at most two duplicate vectors) - this is the VectorTransitionService.cs:line 38 part of the stack trace.
Now the most interesting part: when the debugger stopped execution after the exception occurred, I investigated the sqlConnection object and found that it doesn't have a reader associated with it (picture below)!
Why is this happening (apparently at "random" - this method was being called almost every minute for the last ~20 hours)? Can I avoid it (in ways other than guess-adding some sleeps when Ping throws an exception and praying it'll help)?
Additional information regarding the implementation of the connection pool:
Get is intended for methods that call only simple queries and are not using readers, so the returned connection can be used in a re-entrant way. It is not used directly in this example (because of the reader involved):
public MySqlConnection Get()
{
    var result = this.connections.FirstOrDefault(IsAvailable);
    if (result == null)
    {
        Reconnect();
        result = this.connections.FirstOrDefault(IsAvailable);
    }
    return result;
}
The Reconnect method just iterates though the whole array and recreates and opens the connections.
Take uses Get but also removes the returned connection from the list of available connections, so that when a method holding a reader calls other methods that also need a connection, the connection will not be shared. That is not the case here either, since FindWithSourceVector is simple (it doesn't call other methods that use the DB); Take is used purely for the sake of convention - if there is a reader, use Take:
public MySqlConnection Take()
{
    var result = this.Get();
    var index = Array.IndexOf(this.connections, result);
    this.connections[index] = null;
    return result;
}
Putback just puts a connection to the first empty spot, or just forgets about it if the connection pool is full:
public void Putback(MySqlConnection mySqlConnection)
{
    int index = Array.IndexOf(this.connections, null);
    if (index >= 0)
    {
        this.connections[index] = mySqlConnection;
    }
    else if (mySqlConnection != null)
    {
        mySqlConnection.Close();
        mySqlConnection.Dispose();
    }
}
I suspect this is the problem, at the end of the method:
this.connectionPool.Putback(sqlConnection);
You're only taking two elements from the iterator - so you never complete the while loop unless there's actually only one value returned from the reader. Now you're using LINQ, which will automatically be calling Dispose() on the iterator, so your using statement will still be disposing of the reader - but you're not putting the connection back in the pool. If you do that in a finally block, I think you'll be okay:
var sqlConnection = this.connectionPool.Take();
try
{
    // Other stuff here...
    using (var reader = this.selectWithSourceVectorCommand.ExecuteReader())
    {
        while (reader.Read())
        {
            yield return ReaderToVectorTransition(reader);
        }
    }
}
finally
{
    this.connectionPool.Putback(sqlConnection);
}
Or ideally, if your connection pool is your own implementation, make Take return something which implements IDisposable and returns the connection back to the pool when it's done.
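For example, a rough sketch of that last idea (hypothetical names - PooledConnection and a TakeLease method are not part of the actual pool): Take would hand out a small IDisposable wrapper whose Dispose puts the connection back, so a using statement covers the return path even when the iterator is abandoned early.

using System;
using MySql.Data.MySqlClient;

public sealed class PooledConnection : IDisposable
{
    private readonly MySqlConnectionPool pool;

    public MySqlConnection Connection { get; private set; }

    public PooledConnection(MySqlConnectionPool pool, MySqlConnection connection)
    {
        this.pool = pool;
        this.Connection = connection;
    }

    public void Dispose()
    {
        // The return path lives here, so every exit route gives the
        // connection back, including early disposal of the iterator.
        pool.Putback(Connection);
    }
}

FindWithSourceVector would then wrap its body in using (var lease = this.connectionPool.TakeLease()) and read from lease.Connection.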
Here's a short but complete program to demonstrate what's going on, without any actual databases involved:
using System;
using System.Collections.Generic;
using System.Linq;
class DummyReader : IDisposable
{
private readonly int limit;
private int count = -1;
public int Count { get { return count; } }
public DummyReader(int limit)
{
this.limit = limit;
}
public bool Read()
{
count++;
return count < limit;
}
public void Dispose()
{
Console.WriteLine("DummyReader.Dispose()");
}
}
class Test
{
static IEnumerable<int> FindValues(int valuesInReader)
{
Console.WriteLine("Take from the pool");
using (var reader = new DummyReader(valuesInReader))
{
while (reader.Read())
{
yield return reader.Count;
}
}
Console.WriteLine("Put back in the pool");
}
static void Main()
{
var data = FindValues(2).Take(2).ToArray();
Console.WriteLine(string.Join(",", data));
}
}
As written - modelling the situation with the reader only finding two values - the output is:
Take from the pool
DummyReader.Dispose()
0,1
Note that the reader is disposed, but we never get as far as returning anything from the pool. If you change Main to model the situation where the reader only has one value, like this:
var data = FindValues(1).Take(2).ToArray();
Then we get all the way through the while loop, so the output changes:
Take from the pool
DummyReader.Dispose()
Put back in the pool
0
I suggest you copy my program and experiment with it. Make sure you understand everything about what's going on... then you can apply it to your own code. You might want to read my article on iterator block implementation details too.
TyCobb and Jon Skeet correctly guessed that the problem was the pool implementation and multi-threading. I forgot that I actually did start some tiny Tasks in the Reconnect method. The first connection was created and opened synchronously, but all the others were opened asynchronously.
The idea was that, because I only need one connection at a time, the others could reconnect on different threads. However, because I didn't always put the connection back (as explained in Jon's answer), reconnecting happened quite frequently, and because the system was quite loaded these reconnection threads weren't fast enough, which eventually led to race conditions. The fix is to reconnect in a simpler and more straightforward manner:
private void Reconnect()
{
    for (int i = 0; i < connections.Length; ++i)
    {
        if (!IsAvailable(this.connections[i]))
        {
            this.ReconnectAt(i);
        }
    }
}

private void ReconnectAt(int index)
{
    try
    {
        this.connections[index] = new MySqlConnection(this.connectionString);
        this.connections[index].Open();
    }
    catch (MySqlException mse)
    {
        Console.WriteLine("Reconnect error: " + mse.Message);
        this.connections[index] = null;
    }
}
I'm doing what amounts to a glorified mail merge and then file conversion to PDF... Based on .NET 4.5, I see a couple of ways I can do the threading. The one using a thread-safe queue seems interesting (Plan A), but I can see a potential problem. What do you think? I'll try to keep it short, but put in what is needed.
This works on the assumption that it will take far more time to do the database processing than the PDF conversion.
In both cases, the database processing for each file is done in its own thread/task, but the PDF conversion could be done in many individual threads/tasks (Plan B) or in a single long-running thread (Plan A). It is that PDF conversion I am wondering about. It is all in a try/catch statement, but that thread must not fail or everything fails (Plan A). Do you think that is a good idea? Any suggestions would be appreciated.
/* A class to process a file: */
public class c_FileToConvert
{
public string InFileName { get; set; }
public int FileProcessingState { get; set; }
public string ErrorMessage { get; set; }
public List<string> listData = null;
public c_FileToConvert(string inFileName)
{
InFileName = inFileName;
FileProcessingState = 0;
ErrorMessage = ""; // yah, yah, yah - String.Empty
listData = new List<string>();
}
public void doDbProcessing()
{
// get the data from database and put strings in this.listData
DAL.getDataForFile(this.InFileName, this.ErrorMessage); // static function
if(this.ErrorMessage != "")
this.FileProcessingState = -1; //fatal error
else // Open file and append strings to it
{
foreach(string s in this.listData)
...
FileProcessingState = 1; // enum DB_WORK_COMPLETE ...
}
}
public void doPDFProcessing()
{
PDFConverter cPDFConverter = new PDFConverter();
cPDFConverter.convertToPDF(InFileName, InFileName + ".PDF");
FileProcessingState = 2; // enum PDF_WORK_COMPLETE ...
}
}
/*** These only for Plan A ***/
public ConcurrentQueue<c_FileToConvert> ConncurrentQueueFiles = new ConcurrentQueue<c_FileToConvert>();
public bool bProcessPDFs;
public void doProcessing() // This is the main thread of the Windows Service
{
List<c_FileToConvert> listcFileToConvert = new List<c_FileToConvert>();
/*** Only for Plan A ***/
bProcessPDFs = true;
Task task1 = new Task(new Action(startProcessingPDFs)); // Start it and forget it
task1.Start();
while(1 == 1)
{
List<string> listFileNamesToProcess = new List<string>();
DAL.getFileNamesToProcessFromDb(listFileNamesToProcess);
foreach(string s in listFileNamesToProcess)
{
c_FileToConvert cFileToConvert = new c_FileToConvert(s);
listcFileToConvert.Add(cFileToConvert);
}
foreach(c_FileToConvert c in listcFileToConvert)
{
    if(c.FileProcessingState == 0)
    {
        Thread t = new Thread(new ThreadStart(c.doDbProcessing));
        t.Start();
    }
}
/** This is Plan A - throw it on single long running PDF processing thread **/
foreach(c_FileToConvert c in listcFileToConvert)
if(c.FileProcessingState == 1)
ConncurrentQueueFiles.Enqueue(c);
/*** This is Plan B - traditional thread for each file conversion ***/
foreach(c_FileToConvert c in listcFileToConvert)
{
    if(c.FileProcessingState == 1)
    {
        Thread t = new Thread(new ThreadStart(c.doPDFProcessing));
        t.Start();
    }
}
for(int iCount = 0; iCount < listcFileToConvert.Count; iCount++)
{
    c_FileToConvert c = listcFileToConvert[iCount];
    if((c.FileProcessingState == -1) || (c.FileProcessingState == 2))
    {
        DAL.updateProcessingState(c.FileProcessingState);
        listcFileToConvert.RemoveAt(iCount);
        iCount--; // don't skip the element that shifted into this slot
    }
}
Thread.Sleep(1000);
}
}
public void startProcessingPDFs() /*** Only for Plan A ***/
{
while (bProcessPDFs == true)
{
if (ConncurrentQueueFiles.IsEmpty == false)
{
c_FileToConvert cFileToConvert = null;
try
{
    if (ConncurrentQueueFiles.TryDequeue(out cFileToConvert) == true)
        cFileToConvert.doPDFProcessing();
}
catch(Exception e)
{
    cFileToConvert.FileProcessingState = -1;
    cFileToConvert.ErrorMessage = e.Message;
}
}
}
}
Plan A seems like a nice solution, but what if the Task fails somehow? Yes, the PDF conversion can be done with individual threads, but I want to reserve them for the database processing.
This was written in a text editor as the simplest code I could manage, so there may be mistakes, but I think I got the idea across.
How many files are you working with? 10? 100,000? If the number is very large, using 1 thread to run the DB queries for each file is not a good idea.
Threads are a very low-level control flow construct, and I advise you try to avoid a lot of messy and detailed thread spawning, joining, synchronizing, etc. etc. in your application code. Keep it stupidly simple if you can.
How about this: put the data you need for each file in a thread-safe queue. Create another thread-safe queue for results. Spawn some number of threads which repeatedly pull items from the input queue, run the queries, convert to PDF, then push the output into the output queue. The threads should share absolutely nothing but the input and output queues.
You can pick any number of worker threads which you like, or experiment to see what a good number is. Don't create 1 thread for each file -- just pick a number which allows for good CPU and disk utilization.
OR, if your language/libraries have a parallel map operator, use that. It will save you a lot of messing around.
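To make that concrete, here is a rough .NET sketch of the queue-based version (my illustration, not the asker's code; it reuses the c_FileToConvert type and its methods from the question, and BlockingCollection plays the role of the thread-safe input queue):

using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

static void ProcessFiles(IEnumerable<c_FileToConvert> files, int workerCount)
{
    // The workers share nothing but these two queues.
    var inputQueue = new BlockingCollection<c_FileToConvert>();
    var outputQueue = new ConcurrentQueue<c_FileToConvert>();

    var workers = new List<Task>();
    for (int i = 0; i < workerCount; i++)
    {
        workers.Add(Task.Run(() =>
        {
            foreach (var file in inputQueue.GetConsumingEnumerable())
            {
                file.doDbProcessing();           // run the DB queries
                if (file.FileProcessingState == 1)
                    file.doPDFProcessing();      // then convert to PDF
                outputQueue.Enqueue(file);       // push the result
            }
        }));
    }

    foreach (var file in files)
        inputQueue.Add(file);
    inputQueue.CompleteAdding();                 // lets the workers drain and exit
    Task.WaitAll(workers.ToArray());
}

The "parallel map" route in .NET would be PLINQ, e.g. files.AsParallel().WithDegreeOfParallelism(workerCount).ForAll(f => { ... }), which hides the queue plumbing entirely.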
Yesterday I was giving a talk about the new C# "async" feature, in particular delving into what the generated code looked like, and the GetAwaiter() / BeginAwait() / EndAwait() calls.
We looked in some detail at the state machine generated by the C# compiler, and there were two aspects we couldn't understand:
Why the generated class contains a Dispose() method and a $__disposing variable, which never appear to be used (and the class doesn't implement IDisposable).
Why the internal state variable is set to 0 before any call to EndAwait(), when 0 normally appears to mean "this is the initial entry point".
I suspect the first point could be answered by doing something more interesting within the async method, although if anyone has any further information I'd be glad to hear it. This question is more about the second point, however.
Here's a very simple piece of sample code:
using System.Threading.Tasks;
class Test
{
static async Task<int> Sum(Task<int> t1, Task<int> t2)
{
return await t1 + await t2;
}
}
... and here's the code which gets generated for the MoveNext() method which implements the state machine. This is copied directly from Reflector - I haven't fixed up the unspeakable variable names:
public void MoveNext()
{
try
{
this.$__doFinallyBodies = true;
switch (this.<>1__state)
{
case 1:
break;
case 2:
goto Label_00DA;
case -1:
return;
default:
this.<a1>t__$await2 = this.t1.GetAwaiter<int>();
this.<>1__state = 1;
this.$__doFinallyBodies = false;
if (this.<a1>t__$await2.BeginAwait(this.MoveNextDelegate))
{
return;
}
this.$__doFinallyBodies = true;
break;
}
this.<>1__state = 0;
this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();
this.<a2>t__$await4 = this.t2.GetAwaiter<int>();
this.<>1__state = 2;
this.$__doFinallyBodies = false;
if (this.<a2>t__$await4.BeginAwait(this.MoveNextDelegate))
{
return;
}
this.$__doFinallyBodies = true;
Label_00DA:
this.<>1__state = 0;
this.<2>t__$await3 = this.<a2>t__$await4.EndAwait();
this.<>1__state = -1;
this.$builder.SetResult(this.<1>t__$await1 + this.<2>t__$await3);
}
catch (Exception exception)
{
this.<>1__state = -1;
this.$builder.SetException(exception);
}
}
It's long, but the important lines for this question are these:
// End of awaiting t1
this.<>1__state = 0;
this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();
// End of awaiting t2
this.<>1__state = 0;
this.<2>t__$await3 = this.<a2>t__$await4.EndAwait();
In both cases the state is changed again afterwards before it's next obviously observed... so why set it to 0 at all? If MoveNext() were called again at this point (either directly or via Dispose) it would effectively start the async method again, which would be wholly inappropriate as far as I can tell... and if MoveNext() isn't called, the change in state is irrelevant.
Is this simply a side-effect of the compiler reusing iterator block generation code for async, where it may have a more obvious explanation?
Important disclaimer
Obviously this is just a CTP compiler. I fully expect things to change before the final release - and possibly even before the next CTP release. This question is in no way trying to claim this is a flaw in the C# compiler or anything like that. I'm just trying to work out whether there's a subtle reason for this that I've missed :)
Okay, I finally have a real answer. I sort of worked it out on my own, but only after Lucian Wischik from the VB part of the team confirmed that there really is a good reason for it. Many thanks to him - and please visit his blog (on archive.org), which rocks.
The value 0 here is only special because it's not a valid state which you might be in just before the await in a normal case. In particular, it's not a state which the state machine may end up testing for elsewhere. I believe that using any non-positive value would work just as well: -1 isn't used for this as it's logically incorrect, as -1 normally means "finished". I could argue that we're giving an extra meaning to state 0 at the moment, but ultimately it doesn't really matter. The point of this question was finding out why the state is being set at all.
The value is relevant if the await ends in an exception which is caught. We can end up coming back to the same await statement again, but we mustn't be in the state meaning "I'm just about to come back from that await" as otherwise all kinds of code would be skipped. It's simplest to show this with an example. Note that I'm now using the second CTP, so the generated code is slightly different to that in the question.
Here's the async method:
static async Task<int> FooAsync()
{
    var t = new SimpleAwaitable();
    for (int i = 0; i < 3; i++)
    {
        try
        {
            Console.WriteLine("In Try");
            return await t;
        }
        catch (Exception)
        {
            Console.WriteLine("Trying again...");
        }
    }
    return 0;
}
Conceptually, the SimpleAwaitable can be any awaitable - maybe a task, maybe something else. For the purposes of my tests, it always returns false for IsCompleted, and throws an exception in GetResult.
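For reference, here is a minimal sketch of an awaitable matching that description (my reconstruction, not the author's actual test class; it uses the GetAwaiter/IsCompleted/OnCompleted/GetResult pattern that the generated code below expects):

using System;
using System.Runtime.CompilerServices;

class SimpleAwaitable
{
    public SimpleAwaiter GetAwaiter() { return new SimpleAwaiter(); }
}

class SimpleAwaiter : INotifyCompletion
{
    // Never completed synchronously, so the state machine always has to
    // register a continuation and return.
    public bool IsCompleted { get { return false; } }

    public void OnCompleted(Action continuation)
    {
        // Resume the state machine straight away; a real awaitable would do
        // this when its asynchronous work finished.
        continuation();
    }

    // Always faults, driving execution into the catch block of FooAsync.
    public int GetResult()
    {
        throw new Exception("GetResult deliberately failed");
    }
}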
Here's the generated code for MoveNext:
public void MoveNext()
{
int returnValue;
try
{
int num3 = state;
if (num3 == 1)
{
goto Label_ContinuationPoint;
}
if (state == -1)
{
return;
}
t = new SimpleAwaitable();
i = 0;
Label_ContinuationPoint:
while (i < 3)
{
// Label_ContinuationPoint: should be here
try
{
num3 = state;
if (num3 != 1)
{
Console.WriteLine("In Try");
awaiter = t.GetAwaiter();
if (!awaiter.IsCompleted)
{
state = 1;
awaiter.OnCompleted(MoveNextDelegate);
return;
}
}
else
{
state = 0;
}
int result = awaiter.GetResult();
awaiter = null;
returnValue = result;
goto Label_ReturnStatement;
}
catch (Exception)
{
Console.WriteLine("Trying again...");
}
i++;
}
returnValue = 0;
}
catch (Exception exception)
{
state = -1;
Builder.SetException(exception);
return;
}
Label_ReturnStatement:
state = -1;
Builder.SetResult(returnValue);
}
I had to move Label_ContinuationPoint to make it valid code - otherwise it's not in the scope of the goto statement - but that doesn't affect the answer.
Think about what happens when GetResult throws its exception. We'll go through the catch block, increment i, and then loop round again (assuming i is still less than 3). We're still in whatever state we were before the GetResult call... but when we get inside the try block we must print "In Try" and call GetAwaiter again... and we'll only do that if state isn't 1. Without the state = 0 assignment, it will use the existing awaiter and skip the Console.WriteLine call.
It's a fairly tortuous bit of code to work through, but that just goes to show the kinds of thing that the team has to think about. I'm glad I'm not responsible for implementing this :)
If it was kept at 1 (first case) you would get a call to EndAwait without a call to BeginAwait. If it's kept at 2 (second case) you'd get the same result, just on the other awaiter.
I'm guessing that calling BeginAwait returns false if it has been started already (a guess on my side) and keeps the original value to return at EndAwait. If that's the case it would work correctly, whereas if you set it to -1 you might have an uninitialized this.<1>t__$await1 for the first case.
This however assumes that BeginAwait won't actually start the action on any calls after the first and that it will return false in those cases. Starting would of course be unacceptable, since it could have side effects or simply give a different result. It also assumes that EndAwait will always return the same value no matter how many times it's called, and that it can be called when BeginAwait returns false (as per the above assumption).
It would seem to be a guard against race conditions.
If we inline the statements where MoveNext is called by a different thread after the state = 0 in question, it would look something like the below:
this.<a1>t__$await2 = this.t1.GetAwaiter<int>();
this.<>1__state = 1;
this.$__doFinallyBodies = false;
this.<a1>t__$await2.BeginAwait(this.MoveNextDelegate)
this.<>1__state = 0;
//second thread
this.<a1>t__$await2 = this.t1.GetAwaiter<int>();
this.<>1__state = 1;
this.$__doFinallyBodies = false;
this.<a1>t__$await2.BeginAwait(this.MoveNextDelegate)
this.$__doFinallyBodies = true;
this.<>1__state = 0;
this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();
//other thread
this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();
If the assumptions above are correct, there's some unneeded work done, such as getting the awaiter again and reassigning the same value to <1>t__$await1. If the state was kept at 1, the last part would instead be:
//second thread
// I suppose this unmatched call to EndAwait will fail
this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();
Further, if it was set to 2, the state machine would assume it had already gotten the value of the first action, which would be untrue, and a (potentially) unassigned variable would be used to calculate the result.
Could it be something to do with stacked/nested async calls?
i.e.:
async Task m1()
{
    await m2();
}

async Task m2()
{
    await m3();
}

async Task m3()
{
    Thread.Sleep(10000);
}
Does the MoveNext delegate get called multiple times in this situation?
Just a punt really?
Explanation of the actual states:
0 - initialized (I think so) or waiting for the end of an operation
>0 - just called MoveNext, choosing the next state
-1 - ended
Is it possible that this implementation just wants to ensure that if another call to MoveNext happens from wherever (while waiting), it will re-evaluate the whole state chain from the beginning, re-evaluating results which could in the meantime already be outdated?
I'm building a T4 template that will help people construct Azure queues in a consistent and simple manner. I'd like to make this self-documenting, and somewhat consistent.
First I made the queue name at the top of the file; the queue names have to be in lowercase, so I added ToLower().
The public constructor uses the built-in StorageClient API's to access the connection strings. I've seen many different approaches to this, and would like to get something that works in almost all situations. (ideas? do share)
I dislike the unneeded HTTP requests to check if the queues have been created, so I made it a static bool. I didn't implement a lock(monitorObject) since I don't think one is needed.
Instead of using a string and parsing it with commas (like most MSDN documentation) I'm serializing the object when passing it into the queue.
For further optimization I'm using a JSON serializer extension method to get the most out of the 8k limit (a sketch of the enqueue side follows these points). Not sure if an encoding will help optimize this any more.
Added retry logic to handle certain scenarios that occur with the queue (see html link)
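As a sketch of the enqueue side implied by those points (my illustration - ToJSONString is the assumed counterpart of the FromJSONString<T>() extension used below, and the 8k check is only approximate since the real limit applies to the encoded message):

using System;
using Microsoft.WindowsAzure.StorageClient;

public static class AgentQueueProducer
{
    // Hypothetical producer-side helper, not part of the posted class.
    public static void EnqueueAction(CloudQueue queue, AgentQueueEntry entry)
    {
        string json = entry.ToJSONString();   // assumed JSON extension method
        if (json.Length > 8 * 1024)
            throw new InvalidOperationException("Serialized entry exceeds the 8k queue message limit");

        queue.AddMessage(new CloudQueueMessage(json));
    }
}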
Q: Is "DataContext" appropriate name for this class?
Q: Is it a poor practice to name the Queue Action Name in the manner I have done?
What additional changes do you think I should make?
public class AgentQueueDataContext
{
// Queue names must always be in lowercase
// Is named like a const, but isn't one because .ToLower won't compile...
static string AGENT_QUEUE_ACTION_NAME = "AgentQueueActions".ToLower();
static bool QueuesWereCreated { get; set; }
DataModel.SecretDataSource secDataSource = null;
CloudStorageAccount cloudStorageAccount = null;
CloudQueueClient cloudQueueClient = null;
CloudQueue queueAgentQueueActions = null;
static AgentQueueDataContext()
{
QueuesWereCreated = false;
}
public AgentQueueDataContext() : this(false)
{
}
public AgentQueueDataContext(bool CreateQueues)
{
// This pattern of setting up queues is from:
// http://convective.wordpress.com/2009/11/15/queues-azure-storage-client-v1-0/
//
this.cloudStorageAccount = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
this.cloudQueueClient = cloudStorageAccount.CreateCloudQueueClient();
this.secDataSource = new DataModel.SecretDataSource();
queueAgentQueueActions = cloudQueueClient.GetQueueReference(AGENT_QUEUE_ACTION_NAME);
if (QueuesWereCreated == false || CreateQueues)
{
queueAgentQueueActions.CreateIfNotExist();
QueuesWereCreated = true;
}
}
// This is the method that will be spawned using ThreadStart
public void CheckQueue()
{
while (true)
{
try
{
CloudQueueMessage msg = queueAgentQueueActions.GetMessage();
bool DoRetryDelayLogic = false;
if (msg != null)
{
// Deserialize using JSON (allows more data to be stored)
AgentQueueEntry actionableMessage = msg.AsString.FromJSONString<AgentQueueEntry>();
switch (actionableMessage.ActionType)
{
case AgentQueueActionEnum.EnrollNew:
{
// Add to
break;
}
case AgentQueueActionEnum.LinkToSite:
{
// Link within Agent itself
// Link within Site
break;
}
case AgentQueueActionEnum.DisableKey:
{
// Disable key in site
// Disable key in AgentTable (update modification time)
break;
}
default:
{
break;
}
}
//
// Only delete the message if the requested agent has been missing for
// at least 10 minutes
//
if (DoRetryDelayLogic)
{
if (msg.InsertionTime != null)
if (msg.InsertionTime < DateTime.UtcNow + new TimeSpan(0, 10, 10))
continue;
// ToDo: Log error: AgentID xxx has not been found in table for xxx minutes.
// It is likely the result of the registration host crashing.
// Data is still consistent. Deleting queued message.
}
//
// If execution made it to this point, then we are either fully processed, or
// there is sufficient reason to discard the message.
//
try
{
queueAgentQueueActions.DeleteMessage(msg);
}
catch (StorageClientException ex)
{
// As of July 2010, this is the best way to detect this class of exception
// Description: http://blog.smarx.com/posts/deleting-windows-azure-queue-messages-handling-exceptions
if (ex.ExtendedErrorInformation.ErrorCode == "MessageNotFound")
{
// pop receipt must be invalid
// ignore or log (so we can tune the visibility timeout)
}
else
{
// not the error we were expecting
throw;
}
}
}
else
{
// allow control to fall to the bottom, where the sleep timer is...
}
}
catch (Exception e)
{
// Justification: Thread must not fail.
//Todo: Log this exception
// allow control to fall to the bottom, where the sleep timer is...
// Rationale: not doing so may cause queue thrashing on a specific corrupt entry
}
// todo: Thread.Sleep() is bad
// Replace with something better...
Thread.Sleep(9000);
}
Q: Is "DataContext" appropriate name for this class?
In .NET we have a lot of DataContext classes, so in the sense that you want names to communicate appropriately what a class does, I think XyzQueueDataContext does that - although you can't query from it.
If you want to stay more aligned with accepted pattern languages, Patterns of Enterprise Application Architecture calls any class that encapsulates access to an external system a Gateway, while more specifically you may want to use the term Channel in the language of Enterprise Integration Patterns - that's what I would do.
Q: Is it a poor practice to name the Queue Action Name in the manner I have done?
Well, it certainly tightly couples the queue name to the class. This means that if you later decide that you want to decouple those, you can't.
As a general comment I think this class might benefit from trying to do less. Using the queue is not the same thing as managing it, so instead of having all of that queue management code there, I'd suggest injecting a CloudQueue into the instance. Here's how I implement my AzureChannel constructor:
private readonly CloudQueue queue;

public AzureChannel(CloudQueue queue)
{
    if (queue == null)
    {
        throw new ArgumentNullException("queue");
    }
    this.queue = queue;
}
This better fits the Single Responsibility Principle and you can now implement queue management in its own (reusable) class.
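For instance, the queue wiring could then live in a small bootstrap/composition step, using only the StorageClient calls already present in the question (a sketch under the v1 StorageClient namespaces, not a prescription):

using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

public static class ChannelBootstrap
{
    public static AzureChannel CreateAgentChannel()
    {
        // Queue management happens once, at composition time...
        CloudStorageAccount account = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
        CloudQueueClient client = account.CreateCloudQueueClient();
        CloudQueue queue = client.GetQueueReference("agentqueueactions");
        queue.CreateIfNotExist();

        // ...and the channel just uses whatever queue it is given.
        return new AzureChannel(queue);
    }
}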
I have a thread which produces data in the form of simple object (record). The thread may produce a thousand records for each one that successfully passes a filter and is actually enqueued. Once the object is enqueued it is read-only.
I have one lock, which I acquire once the record has passed the filter, and I add the item to the back of the producer_queue.
On the consumer thread, I acquire the lock, confirm that the producer_queue is not empty,
set consumer_queue to equal producer_queue, create a new (empty) queue, and set it on producer_queue. Without any further locking I process consumer_queue until it's empty and repeat.
Everything works beautifully on most machines, but on one particular dual-quad server, in roughly 1 in 500k iterations I see an object that is not fully initialized when I read it out of consumer_queue. The condition is so fleeting that when I dump the object after detecting the condition, the fields are correct 90% of the time.
So my question is this: how can I assure that the writes to the object are flushed to main memory when the queue is swapped?
Edit:
On the producer thread:
(producer_queue above is m_fillingQueue; consumer_queue above is m_drainingQueue)
private void FillRecordQueue() {
    while (!m_done) {
        int count;
        lock (m_swapLock) {
            count = m_fillingQueue.Count;
        }
        if (count > 5000) {
            Thread.Sleep(60);
        } else {
            DataRecord rec = GetNextRecord();
            if (rec == null) break;
            lock (m_swapLock) {
                m_fillingQueue.AddLast(rec);
            }
        }
    }
}
In the consumer thread:
private DataRecord Next(bool remove) {
    bool drained = false;
    while (!drained) {
        if (m_drainingQueue.Count > 0) {
            DataRecord rec = m_drainingQueue.First.Value;
            if (remove) m_drainingQueue.RemoveFirst();
            if (rec.Time < FIRST_VALID_TIME) {
                throw new InvalidOperationException("Detected invalid timestamp in Next(): " + rec.Time + " from record " + rec);
            }
            return rec;
        } else {
            lock (m_swapLock) {
                m_drainingQueue = m_fillingQueue;
                m_fillingQueue = new LinkedList<DataRecord>();
                if (m_drainingQueue.Count == 0) drained = true;
            }
        }
    }
    return null;
}
The producer is rate-limited (it sleeps when the filling queue grows past 5000 entries), so it can't get too far ahead of the consumer.
The behavior I see is that sometimes the Time field is reading as DateTime.MinValue; by the time I construct the string to throw the exception, however, it's perfectly fine.
Have you tried the obvious: is the microcode update applied on the fancy 8-core box (via BIOS update)? Did you run Windows Update to get the latest processor driver?
At first glance, it looks like you're locking your containers properly, so I am recommending the systems approach, as it sounds like you're not seeing this issue on a good ol' dual-core box.
Assuming these are in fact the only methods that interact with the m_fillingQueue variable, and that DataRecord cannot be changed after GetNextRecord() creates it (read-only properties hopefully?), then the code at least on the face of it appears to be correct.
In which case I suggest that GregC's answer would be the first thing to check; make sure the failing machine is fully updated (OS / drivers / .NET Framework), because the lock statement should involve all the required memory barriers to ensure that the rec variable is fully flushed out of any caches before the object is added to the list.