Related
We are using the following method in a Stateful Service on Service-Fabric. The service has partitions. Sometimes we get a FabricNotReadableException from this peace of code.
public async Task HandleEvent(EventHandlerMessage message)
{
var queue = await StateManager.GetOrAddAsync<IReliableQueue<EventHandlerMessage>>(EventHandlerServiceConstants.EventHandlerQueueName);
using(ITransaction tx = StateManager.CreateTransaction())
{
await queue.EnqueueAsync(tx, message);
await tx.CommitAsync();
}
}
Does that mean that the partition is down and is being moved? Of that we hit a secondary partition? Because there is also a FabricNotPrimaryException that is being raised in some cases.
I have seen the MSDN link (https://msdn.microsoft.com/en-us/library/azure/system.fabric.fabricnotreadableexception.aspx). But what does
Represents an exception that is thrown when a partition cannot accept reads.
mean? What happened that a partition cannot accept a read?
Under the covers Service Fabric has several states that can impact whether a given replica can safely serve reads and writes. They are:
Granted (you can think of this as normal operation)
Not Primary
No Write Quorum (again mainly impacting writes)
Reconfiguration Pending
FabricNotPrimaryException which you mention can be thrown whenever a write is attempted on a replica which is not currently the Primary, and maps to the NotPrimary state.
FabricNotReadableException maps to the other states (you don't really need to worry or differentiate between them), and can happen in a variety of cases. One example is if the replica you are trying to perform the read on is a "Standby" replica (a replica which was down and which has been recovered, but there are already enough active replicas in the replica set). Another example is if the replica is a Primary but is being closed (say due to an upgrade or because it reported fault), or if it is currently undergoing a reconfiguration (say for example that another replica is being added). All of these conditions will result in the replica not being able to satisfy writes for a small amount of time due to certain safety checks and atomic changes that Service Fabric needs to handle under the hood.
You can consider FabricNotReadableException retriable. If you see it, just try the call again and eventually it will resolve into either NotPrimary or Granted. If you get FabricNotPrimary exception, generally this should be thrown back to the client (or the client in some way notified) that it needs to re-resolve in order to find the current Primary (the default communication stacks that Service Fabric ships take care of watching for non-retriable exceptions and re-resolving on your behalf).
There are two current known issues with FabricNotReadableException.
FabricNotReadableException should have two variants. The first should be explicitly retriable (FabricTransientNotReadableException) and the second should be FabricNotReadableException. The first version (Transient) is the most common and is probably what you are running into, certainly what you would run into in the majority of cases. The second (non-transient) would be returned in the case where you end up talking to a Standby replica. Talking to a standby won't happen with the out of the box transports and retry logic, but if you have your own it is possible to run into it.
The other issue is that today the FabricNotReadableException should be deriving from FabricTransientException, making it easier to determine what the correct behavior is.
Posted as an answer (to asnider's comment - Mar 16 at 17:42) because it was too long for comments! :)
I am also stuck in this catch 22. My svc starts and immediately receives messages. I want to encapsulate the service startup in OpenAsync and set up some ReliableDictionary values, then start receiving message. However, at this point the Fabric is not Readable and I need to split this "startup" between OpenAsync and RunAsync :(
RunAsync in my service and OpenAsync in my client also seem to have different Cancellation tokens, so I need to work around how to deal with this too. It just all feels a bit messy. I have a number of ideas on how to tidy this up in my code but has anyone come up with an elegant solution?
It would be nice if ICommunicationClient had a RunAsync interface that was called when the Fabric becomes ready/readable and cancelled when the Fabric shuts down the replica - this would seriously simplify my life. :)
I was running into the same problem. My listener was starting up before the main thread of the service. I queued the list of listeners needing to be started, and then activated them all early on in the main thread. As a result, all messages coming in were able to be handled and placed into the appropriate reliable storage. My simple solution (this is a service bus listener):
public Task<string> OpenAsync (CancellationToken cancellationToken)
{
string uri;
Start ();
uri = "<your endpoint here>";
return Task.FromResult (uri);
}
public static object lockOperations = new object ();
public static bool operationsStarted = false;
public static List<ClientAuthorizationBusCommunicationListener> pendingStarts = new List<ClientAuthorizationBusCommunicationListener> ();
public static void StartOperations ()
{
lock (lockOperations)
{
if (!operationsStarted)
{
foreach (ClientAuthorizationBusCommunicationListener listener in pendingStarts)
{
listener.DoStart ();
}
operationsStarted = true;
}
}
}
private static void QueueStart (ClientAuthorizationBusCommunicationListener listener)
{
lock (lockOperations)
{
if (operationsStarted)
{
listener.DoStart ();
}
else
{
pendingStarts.Add (listener);
}
}
}
private void Start ()
{
QueueStart (this);
}
private void DoStart ()
{
ServiceBus.WatchStatusChanges (HandleStatusMessage,
this.clientId,
out this.subscription);
}
========================
In the main thread, you call the function to start listener operations:
protected override async Task RunAsync (CancellationToken cancellationToken)
{
ClientAuthorizationBusCommunicationListener.StartOperations ();
...
This problem likely manifested itself here as the bus in question already had messages and started firing the second the listener was created. Trying to access anything in state manager was throwing the exception you were asking about.
I've got a C# console app running on Windows Server 2003 whose purpose is to read a table called Notifications and a field called "NotifyDateTime" and send an email when that time is reached. I have it scheduled via Task Scheduler to run hourly, check to see if the NotifyDateTime falls within that hour, and then send the notifications.
It seems like because I have the notification date/times in the database that there should be a better way than re-running this thing every hour.
Is there a lightweight process/console app I could leave running on the server that reads in the day's notifications from the table and issues them exactly when they're due?
I thought service, but that seems overkill.
My suggestion is to write simple application, which uses Quartz.NET.
Create 2 jobs:
First, fires once a day, reads all awaiting notification times from database planned for that day, creates some triggers based on them.
Second, registered for such triggers (prepared by the first job), sends your notifications.
What's more,
I strongly advice you to create windows service for such purpose, just not to have lonely console application constantly running. It can be accidentally terminated by someone who have access to the server under the same account. What's more, if the server will be restarted, you have to remember to turn such application on again, manually, while the service can be configured to start automatically.
If you're using web application you can always have this logic hosted e.g. within IIS Application Pool process, although it is bad idea whatsoever. It's because such process is by default periodically restarted, so you should change its default configuration to be sure it is still working in the middle of the night, when application is not used. Unless your scheduled tasks will be terminated.
UPDATE (code samples):
Manager class, internal logic for scheduling and unscheduling jobs. For safety reasons implemented as a singleton:
internal class ScheduleManager
{
private static readonly ScheduleManager _instance = new ScheduleManager();
private readonly IScheduler _scheduler;
private ScheduleManager()
{
var properties = new NameValueCollection();
properties["quartz.scheduler.instanceName"] = "notifier";
properties["quartz.threadPool.type"] = "Quartz.Simpl.SimpleThreadPool, Quartz";
properties["quartz.threadPool.threadCount"] = "5";
properties["quartz.threadPool.threadPriority"] = "Normal";
var sf = new StdSchedulerFactory(properties);
_scheduler = sf.GetScheduler();
_scheduler.Start();
}
public static ScheduleManager Instance
{
get { return _instance; }
}
public void Schedule(IJobDetail job, ITrigger trigger)
{
_scheduler.ScheduleJob(job, trigger);
}
public void Unschedule(TriggerKey key)
{
_scheduler.UnscheduleJob(key);
}
}
First job, for gathering required information from the database and scheduling notifications (second job):
internal class Setup : IJob
{
public void Execute(IJobExecutionContext context)
{
try
{
foreach (var kvp in DbMock.ScheduleMap)
{
var email = kvp.Value;
var notify = new JobDetailImpl(email, "emailgroup", typeof(Notify))
{
JobDataMap = new JobDataMap {{"email", email}}
};
var time = new DateTimeOffset(DateTime.Parse(kvp.Key).ToUniversalTime());
var trigger = new SimpleTriggerImpl(email, "emailtriggergroup", time);
ScheduleManager.Instance.Schedule(notify, trigger);
}
Console.WriteLine("{0}: all jobs scheduled for today", DateTime.Now);
}
catch (Exception e) { /* log error */ }
}
}
Second job, for sending emails:
internal class Notify: IJob
{
public void Execute(IJobExecutionContext context)
{
try
{
var email = context.MergedJobDataMap.GetString("email");
SendEmail(email);
ScheduleManager.Instance.Unschedule(new TriggerKey(email));
}
catch (Exception e) { /* log error */ }
}
private void SendEmail(string email)
{
Console.WriteLine("{0}: sending email to {1}...", DateTime.Now, email);
}
}
Database mock, just for purposes of this particular example:
internal class DbMock
{
public static IDictionary<string, string> ScheduleMap =
new Dictionary<string, string>
{
{"00:01", "foo#gmail.com"},
{"00:02", "bar#yahoo.com"}
};
}
Main entry of the application:
public class Program
{
public static void Main()
{
FireStarter.Execute();
}
}
public class FireStarter
{
public static void Execute()
{
var setup = new JobDetailImpl("setup", "setupgroup", typeof(Setup));
var midnight = new CronTriggerImpl("setuptrigger", "setuptriggergroup",
"setup", "setupgroup",
DateTime.UtcNow, null, "0 0 0 * * ?");
ScheduleManager.Instance.Schedule(setup, midnight);
}
}
Output:
If you're going to use service, just put this main logic to the OnStart method (I advice to start the actual logic in a separate thread not to wait for the service to start, and the same avoid possible timeouts - not in this particular example obviously, but in general):
protected override void OnStart(string[] args)
{
try
{
var thread = new Thread(x => WatchThread(new ThreadStart(FireStarter.Execute)));
thread.Start();
}
catch (Exception e) { /* log error */ }
}
If so, encapsulate the logic in some wrapper e.g. WatchThread which will catch any errors from the thread:
private void WatchThread(object pointer)
{
try
{
((Delegate) pointer).DynamicInvoke();
}
catch (Exception e) { /* log error and stop service */ }
}
You trying to implement polling approach, where a job is monitoring a record in DB for any changes.
In this case we are trying to hit DB for periodic time, so if the one hour delay reduced to 1 min later stage, then this solution turns to performance bottle neck.
Approach 1
For this scenario please use Queue based approach to avoid any issues, you can also scale up number of instances if you are sending so many emails.
I understand there is a program updates NotifyDateTime in a table, the same program can push a message to Queue informing that there is a notification to handle.
There is a windows service looking after this queue for any incoming messages, when there is a message it performs the required operation (ie sending email).
Approach 2
http://msdn.microsoft.com/en-us/library/vstudio/zxsa8hkf(v=vs.100).aspx
you can also invoke C# code from SQL Server Stored procedure if you are using MS SQL Server. but in this case you are making use of your SQL server process to send mail, which is not a good practice.
However you can invoke a web service, OR WCF service which can send emails.
But Approach 1 is error free, Scalable , Track-able, Asynchronous , and doesn't trouble your data base OR APP, you have different process to send email.
Queues
Use MSMQ which is part of windows server
You can also try https://www.rabbitmq.com/dotnet.html
Pre-scheduled tasks (at undefined times) are generally a pain to handle, as opposed to scheduled tasks where Quartz.NET seems well suited.
Furthermore, another distinction is to be made between fire-and-forget for tasks that shouldn't be interrupted/change (ex. retries, notifications) and tasks that need to be actively managed (ex. campaign or communications).
For the fire-and-forget type tasks a message queue is well suited. If the destination is unreliable, you will have to opt for retry levels (ex. try send (max twice), retry after 5 minutes, try send (max twice), retry after 15 minutes) that at least require specifying message specific TTL's with a send and retry queue. Here's an explanation with a link to code to setup a retry level queue
The managed pre-scheduled tasks will require that you use a database queue approach (Click here for a CodeProject article on designing a database queue for scheduled tasks)
. This will allow you to update, remove or reschedule notifications given you keep track of ownership identifiers (ex. specifiy a user id and you can delete all pending notifications when the user should no longer receive notifications such as being deceased/unsubscribed)
Scheduled e-mail tasks (including any communication tasks) require finer grained control (expiration, retry and time-out mechanisms). The best approach to take here is to build a state machine that is able to process the e-mail task through its steps (expiration, pre-validation, pre-mailing steps such as templating, inlining css, making links absolute, adding tracking objects for open tracking, shortening links for click tracking, post-validation and sending and retrying).
Hopefully you are aware that the .NET SmtpClient isn't fully compliant with the MIME specifications and that you should be using a SAAS e-mail provider such as Amazon SES, Mandrill, Mailgun, Customer.io or Sendgrid. I'd suggest you look at Mandrill or Mailgun. Also if you have some time, take a look at MimeKit which you can use to construct MIME messages for the providers allow sending raw e-mail and doesn't necessarily support things like attachments/custom headers/DKIM signing.
I hope this sets you on the right path.
Edit
You will have to use a service to poll at specific intervals (ex. 15 seconds or 1 minute). The database load can be somewhat negated by checkout out a certain amount of due tasks at a time and keeping an internal pool of messages due for sending (with a time-out mechanism in place). When there's no messages returned, just 'sleep' the polling for a while. I'd would advise against building such a system out against a single table in a database - instead design an independant e-mail scheduling system that you can integrate with.
I would turn it into a service instead.
You can use System.Threading.Timer event handler for each of the scheduled times.
Scheduled tasks can be scheduled to run just once at a specific time (as opposed to hourly, daily, etc.), so one option would be to create the scheduled task when the specific field in your database changes.
You don't mention which database you use, but some databases support the notion of a trigger, e.g. in SQL: http://technet.microsoft.com/en-us/library/ms189799.aspx
If you know when the emails need to be sent ahead of time then I suggest that you use a wait on an event handle with the appropriate timeout. At midnight look at the table then wait on an event handle with the timeout set to expire when the next email needs to be sent. After sending the email wait again with the timeout set based on the next mail that should be sent.
Also, based on your description, this should probably be implemented as a service but it is not required.
I have been dealing with the same problem about three years ago. I have changed the process several times before it was good enough, I tell you why:
First implementation was using special deamon from webhosting which called the IIS website. The website checked the caller IP and then check the database and send emails. This was working until one day, when I got a lot of very dirty emails from the users that I have totally spammed their mailboxes. The drawback of keeping email in database and sending from SMTP email is that there is NOTHING which ensure DB to SMTP transaction. You are never sure if the email has been successfully sent or not. Sending email can be successfull, can failed or it can be false positive or it can be false negative (SMTP client tells you, that the email was not sent, but it was). There was some problem with SMTP server and the server returned false(email not send), but the email was sent. The daemon was resending the email every hour the whole day before the dirty emails appears.
Second implementation: To prevent spamming, I have changed the algorithm, that the email is considered to be sent even if it failed (my email notification was not too important). My first advice is: "Don't launch the deamon too often, because this false negative smtp error makes users upset."
After several month there were some changes on the server and the daemon was not working well. I got the idea from the stackoverflow: bind the .NET timer to the web application domain. It wasn't good idea, because it seems, that IIS can restart application from time to time because of memory leaks and the timer never fires if the restarts are more often then timer ticks.
The last implementation. Windows scheduler every hour fires python batch which read local website. This fire ASP.NET code. The advantage is that time windows scheduler call the the local batch and website reliably. IIS doesn't hang, it has restart ability. The timer site is part of my website, it is still one projects. (you can use console app instead). Simple is better. It just works!
Your first choice is the correct option in my opinion. Task Scheduler is the MS recommended way to perform periodic jobs. Moreover it's flexible, can reports failures to ops, is optimized and amortized amongst all tasks in the system, ...
Creating any console-kind app that runs all the time is fragile. It can be shutdown by anyone, needs an open seesion, doesn't restart automatically, ...
The other option is creating some kind of service. It's guaranteed to be running all the time, so that would at least work. But what was your motivation?
"It seems like because I have the notification date/times in the database that there should be a better way than re-running this thing every hour."
Oh yeah optimization... So you want to add a new permanently running service to your computer so that you avoid one potentially unrequired SQL query every hour? The cure looks worse than the disease to me.
And I didn't mention all the drawbacks of the service. On one hand, your task uses no resource when it doesn't run. It's very simple, lightweight and the query efficient (provided you have the right index).
On the other hand, if your service crashes it's probably gone for good. It needs a way to be notified of new e-mails that may need to be sent earlier than what's currently scheduled. It permanently uses computer resources, such as memory. Worse, it may contain memory leaks.
I think that the cost/benefit ratio is very low for any solution other than the trivial periodic task.
I am working on an assignment in asp.net to send notification email to users at specific intervals.
But the problem is that since the server is not privately owned i cannot implement a windows service on it.
Any ideas?
There's no reliable way to achieve that. If you cannot install a Windows Service on the host you could write a endpoint (.aspx or .ashx) that will send the email and then purchase on some other site a service which will ping this endpoint at regular intervals by sending it HTTP request. Obviously you should configure this endpoint to be accessible only from the IP address of the provider you purchase the service from, otherwise anyone could send an HTTP request to the endpoint and trigger the process which is probably undesirable.
Further reading: The Dangers of Implementing Recurring Background Tasks In ASP.NET.
There are several ways to get code executing on an interval that don't require a windows service.
One option is to use the Cache class - use one of the Insert overloads that takes a CacheItemRemovedCallback - this will be called when the cache item is removed. You can re-add the cache item with this callback again and again...
Though, the first thing you need to do is contact the hosting company and find out if they already have some sort of solution for you.
You could set up a scheduled task on the server to invoke a program with the desired action.
You can always use a System.Timer and create a call at specific intervals. What you need to be careful is that this must be run one time, eg on application start, but if you have more than one pools, then it may run more times, and you also need to access some database to read the data of your actions.
using System.Timers;
var oTimer = new Timer();
oTimer.Interval = 30000; // 30 second
oTimer.Elapsed += new ElapsedEventHandler(MyThreadFun);
oTimer.Start();
private static void MyThreadFun(object sender, ElapsedEventArgs e)
{
// inside here you read your query from the database
// get the next email that must be send,
// you send them, and mark them as send, log the errors and done.
}
why I select system timer:
http://msdn.microsoft.com/en-us/magazine/cc164015.aspx
more words
I use this in a more complex class and its work fine. What are the points that I have also made.
Signaling the application stop, to wait for the timer to end.
Use mutex and database for synchronize the works.
Easiest solution is to exploit global.asax application events
On application startup event, create a thread (or task) into a static singleton variable in the global class.
The thread/task/workitem will have an endless loop while(true) {...} with your "service like" code inside.
You'll also want to put a Thread.Sleep(60000) in the loop so it doesn't eat unnecessary CPU cycles.
static void FakeService(object obj) {
while(true) {
try {
// - get a list of users to send emails to
// - check the current time and compare it to the interval to send a new email
// - send emails
// - update the last_email_sent time for the users
} catch (Exception ex) {
// - log any exceptions
// - choose to keep the loop (fake service) running or end it (return)
}
Thread.Sleep(60000); //run the code in this loop every ~60 seconds
}
}
EDIT Because your task is more or less a simple timer job any of the ACID type concerns from an app pool reset or other error don't really apply, because it can just start up again and keep trucking along with any data corruption. But you could also use the thread to simply execute a request to an aspx or ashx that would hold your logic.
new WebClient().DownloadString("http://localhost/EmailJob.aspx");
Can anyone point me to a good working solution to the following problem?
The application I'm working on needs to communicate over TCP to software running on another system. Some of the requests I send to that system can take a long time to complete (up to 15sec).
In my application I have a number of threads, including the main UI thread, which can access the service which communicates with the remote system. There is only a single instance of the service which is accessed by all threads.
I need to only allow a single request to be processed at a time, i.e. it needs to be serialized, otherwise bad things happen with the TCP comms.
Attempted Solutions so far
Initially I tried using lock() with a static object to protect each 'command' method, as follows:
lock (_cmdLock)
{
SetPosition(position);
}
However I found that sometimes it wouldn't release the lock, even though there are timeouts on the remote system and on the TCP comms. Additionally, if two calls came in from the same thread (e.g. a user double clicked a button) then it would get past the lock - after reading up about locking again I know that the same thread won't wait for the lock.
I then tried to use AutoResetEvents to only allow a single call through at a time. But without the locking it wouldn't work with multiple threads. The following is the code I used to send a command (from the calling thread) and process a command request (running in the background on its own thread)
private static AutoResetEvent _cmdProcessorReadyEvent = new AutoResetEvent(false);
private static AutoResetEvent _resultAvailableEvent = new AutoResetEvent(false);
private static AutoResetEvent _sendCommandEvent = new AutoResetEvent(false);
// This method is called to send each command and can run on different threads
private bool SendCommand(Command cmd)
{
// Wait for processor thread to become ready for next cmd
if (_cmdProcessorReadyEvent.WaitOne(_timeoutSec + 500))
{
lock (_sendCmdLock)
{
_currentCommand = cmd;
}
// Tell the processor thread that there is a command present
_sendCommandEvent.Set();
// Wait for a result from the processor thread
if (!_resultAvailableEvent.WaitOne(_timeoutSec + 500))
_lastCommandResult.Timeout = true;
}
return _lastCommandResult.Success;
}
// This method runs in a background thread while the app is running
private void ProcessCommand()
{
try
{
do
{
// Indicate that we are ready to process another commnad
_cmdProcessorReadyEvent.Set();
_sendCommandEvent.WaitOne();
lock (_sendCmdLock)
{
_lastCommandResult = new BaseResponse(false, false, "No Command");
RunCOMCommand(_currentCommand);
}
_resultAvailableEvent.Set();
} while (_processCommands);
}
catch (Exception ex)
{
_lastCommandResult.Success = false;
_lastCommandResult.Timeout = false;
_lastCommandResult.LastError = ex.Message;
}
}
I haven't tried implementing a queue of command requests as the calling code expects everything to be synchronous - i.e. the previous command must have completed before I sent the next one.
Additional Background
The software running on the remote system is a 3rd party product and I don't have access to it, it is used to control a laser marking machine with an integrated XY table.
I'm actually using a legacy VB6 DLL to communicate with the laser as it has all the code for formatting commands and processing the responses. This VB6 DLL uses a WinSock control for the comms.
I'm not sure why a queueing solution wouldn't work.
Why not put each request, plus details for a callback with result, on a queue ? Your application would queue these requests, and the module interfacing to your 3rd party system can take each queue item in turn, process, and return the result.
I think it's a cleaner separation of concerns between modules rather than implementing locking around request dispatch etc. Your requestor is largely oblivious of the serialisation constraints, and the 3rd-party interfacing module can look after serialisation, managing timeouts and other errors etc.
Edit: In the Java world we have BlockingQueues which are synchronised for consumers/publishers and make this sort of thing quite easy. I'm not sure if you have the same in the C# world. A quick search suggests not, but there's source code floating around for this sort of thing (if anyone in the C# world can shed some light that would be appreciated)
My original question from a while ago is MSMQ Slow Queue Reading, however I have advanced from that and now think I know the problem a bit more clearer.
My code (well actually part of an open source library I am using) looks like this:
queue.Receive(TimeSpan.FromSeconds(10), MessageQueueTransactionType.Automatic);
Which is using the Messaging.MessageQueue.Receive function and queue is a MessageQueue. The problem is as follows.
The above line of code will be called with the specified timeout (10 seconds). The Receive(...) function is a blocking function, and is supposed to block until a message arrives in the queue at which time it will return. If no message is received before the timeout is hit, it will return at the timeout. If a message is in the queue when the function is called, it will return that message immediately.
However, what is happening is the Receive(...) function is being called, seeing that there is no message in the queue, and hence waiting for a new message to come in. When a new message comes in (before the timeout), it isn't detecting this new message and continues waiting. The timeout is eventually hit, at which point the code continues and calls Receive(...) again, where it picks up the message and processes it.
Now, this problem only occurs after a number of days/weeks. I can make it work normally again by deleting & recreating the queue. It happens on different computers, and different queues. So it seems like something is building up, until some point when it breaks the triggering/notification ability that the Receive(...) function uses.
I've checked a lot of different things, and everything seems normal & isn't different from a queue that is working normally. There is plenty of disk space (13gig free) and RAM (about 350MB free out of 1GB from what I can tell). I have checked registry entries which all appear the same as other queues, and the performance monitor doesn't show anything out of the normal. I have also run the TMQ tool and can't see anything noticably wrong from that.
I am using Windows XP on all the machines and they all have service pack 3 installed. I am not sending a large amount of messages to the queues, at most it would be 1 every 2 seconds but generally a lot less frequent than that. The messages are only small too and nowhere near the 4MB limit.
The only thing I have just noticed is the p0000001.mq and r0000067.mq files in C:\WINDOWS\system32\msmq\storage are both 4,096KB however they are that size on other computers also which are not currently experiencing the problem. The problem does not happen to every queue on the computer at once, as I can recreate 1 problem queue on the computer and the other queues still experience the problem.
I am not very experienced with MSMQ so if you post possible things to check can you please explain how to check them or where I can find more details on what you are talking about.
Currently the situation is:
ComputerA - 4 queues normal
ComputerB - 2 queues experiencing problem, 1 queue normal
ComputerC - 2 queues experiencing problem
ComputerD - 1 queue normal
ComputerE - 2 queues normal
So I have a large number of computers/queues to compare and test against.
Any particular reason you aren't using an event handler to listen to the queues? The System.Messaging library allows you to attach a handler to a queue instead of, if I understand what you are doing correctly, looping Receive every 10 seconds. Try something like this:
class MSMQListener
{
public void StartListening(string queuePath)
{
MessageQueue msQueue = new MessageQueue(queuePath);
msQueue.ReceiveCompleted += QueueMessageReceived;
msQueue.BeginReceive();
}
private void QueueMessageReceived(object source, ReceiveCompletedEventArgs args)
{
MessageQueue msQueue = (MessageQueue)source;
//once a message is received, stop receiving
Message msMessage = null;
msMessage = msQueue.EndReceive(args.AsyncResult);
//do something with the message
//begin receiving again
msQueue.BeginReceive();
}
}
We are also using NServiceBus and had a similar problem inside our network.
Basically, MSMQ is using UDP with two-phase commits. After a message is received, it has to be acknowledged. Until it is acknowledged, it cannot be received on the client side as the receive transaction hasn't been finalized.
This was caused by different things in different times for us:
once, this was due to the Distributed Transaction Coordinator unable to communicate between machines as firewall misconfiguration
another time, we were using cloned virtual machines without sysprep which made internal MSMQ ids non-unique and made it receive a message to one machine and ack to another. Eventually, MSMQ figures things out but it takes quite a while.
Try this
public Message Receive( TimeSpan timeout, Cursor cursor )
overloaded function.
To get a cursor for a MessageQueue, call the CreateCursor method for that queue.
A Cursor is used with such methods as Peek(TimeSpan, Cursor, PeekAction) and Receive(TimeSpan, Cursor) when you need to read messages that are not at the front of the queue. This includes reading messages synchronously or asynchronously. Cursors do not need to be used to read only the first message in a queue.
When reading messages within a transaction, Message Queuing does not roll back cursor movement if the transaction is aborted. For example, suppose there is a queue with two messages, A1 and A2. If you remove message A1 while in a transaction, Message Queuing moves the cursor to message A2. However, if the transaction is aborted for any reason, message A1 is inserted back into the queue but the cursor remains pointing at message A2.
To close the cursor, call Close.
If you want to use something completely synchronous and without event you can test this method
public object Receive(string path, int millisecondsTimeout)
{
var mq = new System.Messaging.MessageQueue(path);
var asyncResult = mq.BeginReceive();
var handles = new System.Threading.WaitHandle[] { asyncResult.AsyncWaitHandle };
var index = System.Threading.WaitHandle.WaitAny(handles, millisecondsTimeout);
if (index == 258) // Timeout
{
mq.Close();
return null;
}
var result = mq.EndReceive(asyncResult);
return result;
}