WMQ: Distributing MQ readers over several machines - c#

I am using WMQ to access an IBM WebSphere MQ on a mainframe - using c#.
We are considering spreading out our service on several machines, and we then need to make sure that two services on two different machines cannot read/get the same MQ message at the same time.
My code for getting messages is this:
var connectionProperties = new Hashtable();
const string transport = MQC.TRANSPORT_MQSERIES_CLIENT;
connectionProperties.Add(MQC.TRANSPORT_PROPERTY, transport);
connectionProperties.Add(MQC.HOST_NAME_PROPERTY, mqServerIP);
connectionProperties.Add(MQC.PORT_PROPERTY, mqServerPort);
connectionProperties.Add(MQC.CHANNEL_PROPERTY, mqChannelName);
_mqManager = new MQQueueManager(mqManagerName, connectionProperties);
var queue = _mqManager.AccessQueue(_queueName, MQC.MQOO_INPUT_SHARED + MQC.MQOO_FAIL_IF_QUIESCING);
var queueMessage = new MQMessage {Format = MQC.MQFMT_STRING};
var queueGetMessageOptions = new MQGetMessageOptions {Options = MQC.MQGMO_WAIT, WaitInterval = 2000};
queue.Get(queueMessage, queueGetMessageOptions);
queue.Close();
_mqManager.Commit();
return queueMessage.ReadString(queueMessage.MessageLength);
Is WebSphere MQ transactional by default, or is there something I need to change in my configuration to enable this?
Or - do I need to ask our mainframe guys to do some of their magic?
Thx

Unless you actively BROWSE the message (ie read it but leave it there with no locks), only one getter will ever be able to 'get' the message. Even without transactionality, MQ will still only deliver the message once... but once delivered its gone
MQ is not transactional 'by default' - you need to get with GMO_SYNCPOINT (MQ transactions) and commit at the connection (MQQueueManager level) if you want transactionality (or integrate with .net transactions is another option)
If you use syncpoint then one getter will get the message, the other will ignore it, but if you subsequently have an issue and rollback, then it is made available to any getter (as you would want). It is this scenario where you might see a message twice, but thats because you aborted the transaction and hence asked for it to be put back to how it was before the get.

I wish I'd found this sooner because the accepted answer is incomplete. MQ provides once and only once delivery of messages as described in the other answer and IBM's documentation. If you have many clients listening on the same queue, MQ will deliver only one copy of the message. This is uncontested.
That said, MQ, or any other async messaging for that matter, must deal with session handling and ambiguous outcomes. The affect of these factors is such that any async messaging application should be designed to gracefully handle dupe messages.
Consider an application putting a message onto a queue. If the PUT call receives a 2009 Connection Broken response, it is unclear whether the connection failed before or after the channel agent received and acted on the API call. The application, having no way to tell the difference, must put the message again to assure it is received. Doing the PUT under syncpoint can result in a 2009 on the COMMIT (or equivalent return code in messaging transports other than MQ) and the app doesn't know if the COMMIT was successful or if the PUT will eventually be rolled back. To be safe it must PUT the message again.
Now consider the partner application receiving the messages. A GET issued outside of syncpoint that reaches the channel agent will permanently remove the message from the queue, even if the channel agent cannot then deliver it. So use of transacted sessions ensures that the message is not lost. But suppose that the message has been received and processed and the COMMIT returns a 2009 Connection Broken. The app has no way to know whether the message was removed during the COMMIT or will be rolled back and delivered again. At the very least the app can avoid losing messages by using transacted sessions to retrieve them, but can not guarantee to never receive a dupe.
This is of course endemic to all async messaging, not just MQ, which is why the JMS specification directly address it. The situation is addressed in all versions but in the JMS 1.1 spec look in section 4.4.13 Duplicate Production of Messages which states:
If a failure occurs between the time a client commits its work on a
Session and the commit method returns, the client cannot determine if
the transaction was committed or rolled back. The same ambiguity
exists when a failure occurs between the non-transactional send of a
PERSISTENT message and the return from the sending method.
It is up to a JMS application to deal with this ambiguity. In some
cases, this may cause a client to produce functionally duplicate
messages.
A message that is redelivered due to session recovery is not
considered a duplicate message.
If it is critical that the application receive one and only one copy of the message, use 2-Phase transactions. The transaction manager and XA protocol will provide very strong (but still not absolute) assurance that only one copy of the message will be processed by the application.
The behavior of the messaging transport in delivering one and only one copy of a given message is a measure of the reliability of the transport. By contrast, the behavior of an application which relies on receipt of one and only one copy of the message is a measure of the reliability of the application.
Any duplicate messages received from an IBM MQ transport are almost certainly going to be due to the application's failure to use XA to account for the ambiguous outcomes inherent in async messaging and not a defect in MQ. Please keep this in mind when the Production version of the application chokes on its first duplicate message.
On a related note, if Disaster Recovery is involved, the app must also gracefully recover from lost messages, or else find a way to violate the laws of relativity.

Related

Trying to understand the nature of consumer queues with RabbitMq

RabbitMq 3.8.5, C# RabbitMqClient v6.1.0, .Net Core 3.1
I feel that I'm misunderstanding something with RabbitMq so I'm looking for clarification:
If I have a client sending a message to an exchange, and there's no consumer on the other side, what is meant to happen?
I had thought that it should sit in a queue until it's picked up, but the issue I've got is that, right now there is no queue on the other end of the exchange (which may well be my issue).
This is my declaration code:
channel.ExchangeDeclare(name, exchangeType, durable, autoDelete);
var queueName = ret._channel.QueueDeclare().QueueName;
channel.ConfirmSelect();
and this is my publisher:
channel.BasicPublish(exchangeName, routingKeyOrTopicName, messageProperties, message);
However doing that gives me one queue name for the outbound exchange, and another for the inbound consumer.
Would someone help this poor idiot out in understanding how this is meant to work? What is the expected behavior if there's no consumer at the other end? I do have an RPC mechanism that does work, but wasn't sure if that's the right way to handle this, or not.
Everything works find if I have my consumer running first, however if I fire up my Consumer after the client, then the messages are lost.
Edit
To further clarify, I've set up a simple RPC type test; I've two Direct Exchanges on the client side, one for the outbound Exchange, and another for the inbound RPC consumer.
Both those have their own queue.
Exchange queue name = amq.gen-fp-J9-TQxOJ7NpePEnIcGQ
Consumer queue name = amq.gen-wDFEJ269QcMsHMbAz-t3uw
When the Consumer app fires up, it declares its own Direct exchange and its own queue.
Consumer queue name = amq.gen-o-1O2uSczjXQDihTbkgeqA
If I do it that way though, the message gets lost.
If I fire up the consumer first then I still get three queues in total, but the messages are handled correctly.
This is the code I use to send my RPC message:
messageProperties.ReplyTo = _rpcResponder._routingKeyOrTopicName;
messageProperties.Type = "rpc";
messageProperties.Priority = priority;
messageProperties.Persistent = persistent;
messageProperties.Headers = headers;
messageProperties.Expiration = "3600000";
Looking at the management GUI, I see that all three queues end up being marked as Exclusive, but I'm not declaring them as such. In fact, I'm not creating any queues myself, rather letting the Client library handle that for me, for example, this is how I define my Consumer:
channel.ExchangeDeclare(name, exchangeType, durable, autoDelete);
var queueName = ret._channel.QueueDeclare().QueueName;
Console.WriteLine($"Consumer queue name = {queueName}");
channel.QueueBind(ret.QueueName, name, routingKeyOrTopicName, new Dictionary<string, object>());
In RabbitMQ, messages stay in queues, but they are published to exchanges. The way to link an exchange to a queue is through bindings (there are some default bindings).
If there are no queues, or the exchange's policy doesn't find any queue to forward the message, the message is lost.
Once a message is in a queue, the message is sent to one of that queue's consumers.
Maybe you're using exclusive queues? These queues get deleted when their declaring connection is gone.
Found the issue: I was allowing the library to generate the queue names rather than using specific ones. This meant that RabbitMq was always having to deal with a shifting target each time.
If I use 'well defined' queue names AND the consumer has fired up at least once to define the queue on RabbitMq, then I do see the message being dropped into the queue and stay there, even though the consumer isn't running.

Check if NamedPipeClientStream write is successful

Basically the title... I'd like to have same feedback on weather NamedPipeServerStream object successfully received a value. This is the starting code:
static void Main(string[] args){
Console.WriteLine("Client running!");
NamedPipeClientStream npc = new NamedPipeClientStream("somename");
npc.Connect();
// npc.WriteTimeout = 1000; does not work, says it is not supported for this stream
byte[] message = Encoding.UTF8.GetBytes("Message");
npc.Write(message);
int response = npc.ReadByte();
Console.WriteLine("response; "+response);
}
I've implemented a small echo message from the NamedPipeServerStream on every read. I imagine I could add some async timeout to check if npc.ReadByte(); did return a value in lets say 200ms. Similar to how TCP packets are ACKed.
Is there a better way of inspecting if namedPipeClientStream.Write() was successful?
I'd like to have same feedback on weather NamedPipeServerStream object successfully received a value
The only way to know for sure that the data you sent was received and successfully processed by the client at the remote endpoint, is for your own application protocol to include such acknowledgements.
As a general rule, you can assume that if your send operations are completing successfully, the connection remains viable and the remote endpoint is getting the data. If something happens to the connection, you'll eventually get an error while sending data.
However, this assumption only goes so far. Network I/O is buffered, usually at several levels. Any of your send operations almost certainly involve doing nothing more than placing the data in a local buffer for the network layer. The method call for the operation will return as soon as the data has been buffered, without regard for whether the remote endpoint has received it (and in fact, almost never will have by the time your call returns).
So if and when such a call throws an exception or otherwise reports an error, it's entirely possible that some of the previously sent data has also been lost in transit.
How best to address this possibility depends on what you're trying to do. But in general, you should not worry about it at all. It will typically not matter if a specific transmission has been received. As long as you can continue transmitting without error, the connection is fine, and asking for acknowledgement is just unnecessary overhead.
If you want to handle the case where an error occurs, invalidating the connection, forcing you to retry, and you want to make the broader operation resumable (e.g. you're streaming some data to the remote endpoint and want to ensure all of the data has been received, without having to resend data that has already been received), then you should build into your application protocol the ability to resume, where on reconnecting the remote endpoint reports the number of bytes it's received so far, or the most recent message ID, or whatever it is your application protocol would need to understand where it needs to start sending again.
See also this very closely-related question (arguably maybe even an actual duplicate…though it doesn't mention named pipes specifically, pretty much all network I/O will involve similar issues):
Does TcpClient write method guarantees the data are delivered to server?
There's a good answer there, as well as links to even more useful Q&A in that answer.

Why does MessageQueue.Send(Object) NOT work when sending to remote private queue using FormatName?

So...this one's got me baffled.
The target queue lives on ServerA where MSMQ is running in Workgroup mode. The queue is a non-transactional, private queue, with Full rights on pretty much the world (including NETWORK SERVICE, but EXCLUDING ANONYMOUS LOGON).
I'm specifying the queue address as such: FormatName:DIRECT=OS:ServerA\private$\targetqueue.
If I'm interested in sending "fire-and-forget"-style (no need for transaction as there is no other persistence going on), I would assume it would be fine to simply call:
Message message = ConstructMessageWithObjectPayload(serializableObject);
using (MessageQueue queue = new MessageQueue(queueAddress))
{
queue.Send(message);
}
But strangely, the message never arrives in the target queue and enabling negative source journaling (which interestingly enough causes the message to be sent to the Dead-letter messages queue on the target server) tells me that it is a "Nontransactional message".
Consequently, using
queue.Send(message, MessageQueueTransactionType.Single);
works! Having a hard time wrapping my head around this. What am I missing?
Also, I've seen a good number of posts by others where their similar problem was solved by giving ANONYMOUS LOGIN Full rights. In what scenario is this necessary? Giving NETWORK SERVICE access somewhat made sense because that is the account that MSMQ itself runs under. If running in Workgroup mode like I am, is it necessary at all to assign rights to Everyone or even the account that my process runs under?
Appreciate the help!

nservicebus + webhooks +Errors +MaxRetries

Feature Description
The NServiceBus gateway, http://docs.particular.net/nservicebus/gateway/, seems to be a way to achieve an internal webhook using the NServiceBus infrastructure.
We need to go further with this concept to open up a few event to any 3rd party subscriber that has access to register a webhook url in our system.
Review
We plan to create two initial window services
1) WebHookBatchService, that can be added as a subscriber to specific messages of interest.
<UnicastBusConfig>
<MessageEndpointMappings>
.......
<add Messages="MyMessages.MyImportantMessage, MyMessages" Endpoint="WebHookBatchService.Queue"/>
.......
</MessageEndpointMappings>
</UnicastBusConfig>
2) WebHookProcessService - actually processes 1 message sent by the WebHookBatchService.
Once messages are received on the WebHookBatchService.Queue our WebHookBatchService will look up all the subscribers for the specific tenant + message type and foreach send individual messages to WebHookProcessService.Queue for the WebHookProcessService (which we can make an instance of nservicebus loadbalancer to bridge the batch and actual processor) to actually process the real messages probably using http://restsharp.org/.
Questions
Are there any existing open source projects that do this today?
Now since we have no control of the durability of the subscribers how should we manage errors?
http://wiki.shopify.com/WebHook
A webhook will be deleted if there are 19 consecutive failures for the exact same webhook.
It doesn't mention any delays in the webhook.. What have people experienced with standard delay in retry logic?
Here are some other thoughts:
proposal 0: MaxRetries="1". Purge WebHookProcessService.ErrorQueue nightly. (no retry - guaranteed message loss if it fails the first time)
proposal 1:
MaxRetries="1" on exception catch send email containing xml version of the message that would have been delivered over http.
Purge WebHookProcessService.ErrorQueue nightly.
-- I see potential a spam issues.
proposal 2: The nservicebus MaxRetries retries right away without delay. So i would need to create (1hr - 24hr) bucket queues and use a RetrySchedulerService although I see this as difficult to maintain and confusing for subscribers when they all at once get 25 messages in a non DateCreated ordered fashion when there service endpoint begins to work.
Digging for ideas...
The Gateway is typically used for communication between physical sites over HTTP. Since you are exposing an endpoint to the world to accept callbacks, I'm thinking you could just use the built-in WCF hosting and expose your endpoint through the firewall to 3rd parties. The rest of your setup sounds appropriate to me.
As for errors, you are correct, NSB retries immediately, but if you using web call backs this may get you by in the cases there are small hiccups. You will need to determine how you want to process the error queues, we just build in a new endpoint to process the error queues with logic to determine the retries, delay etc. A nice way to accomplish this is to use a Saga, which includes a Timeout manager. This enables a workflow where you can retry a specified number of times, try another communication, log everything, and ultimately notify someone who can contact the 3rd party to let them know there stuff is busted.

What should the client do while the TIBCO EMS server attempts failover?

The TIBCO EMS user's guide (pg 292) says:
The backup server will work indefinitely to either A) become the
primary server or B) reconnect to the primary server. It also says
clients may receive fail-over notification when the switch is successful (see also TIBCO EMS .NET reference pg 220).
I have some questions spinning off of these facts...
What kind of errors occur on the client side while the servers are attempting fail-over/reconnect?
What is the appropriate response from the client?
Get new Connection objects from the ConnectionFactory until one works?
Wait for fail-over notification? (are current Connection instances fixed at this time? or do I need to get a new instance?)
I hope the scenario is clear, any related information or advice would be appreciated too.
I can at least answer #1 above.
If you have enabled Tibems.SetExceptionOnFTSwitch(true); and have set up an exception handler to capture the messages the server sends to the client, you will see the following:
For single-server, non-fault tolerant connection failures:
"Connection has been terminated".
For fault-tolerant connection failures:
"Connection has performed fault-tolerant switch to "
If you attempt to publish while the connection is down, a TIBCO.EMS.IllegalStateException is thrown with the "Producer is closed" message.
for #2 above, I think the answer is to allow the EMS library to handle as much as possible. Once we got the EMS reconnect functionality to work, it gracefully tried to reconnect until the server became available again and once it reconnected, it was like there was never a problem. The only gotcha is probably if you try to publish a message before the ems connection is back. This is where the exception handler comes in, Once notified that you are in failover mode, you can adjust exception handling on the publisher side to suppress the error until the connection is back. The thing I don't know is how do you tell when you've exhausted all reconnect attempts.
Anyway, Seems like our two worlds are closely related when it comes to EMS - hope our findings (based on your comments on my questions) help you.
We use TEMS (Tibco EMS - a Tibco Product for WCF) So it becomes a custom binding. We tried to break it by doing things like bounce the server to force switch overs and it works really well. make sure you are using version 1.2 not 1.1 because you cannot do anything other then client acknowledgement.

Categories