TxSelect and TransactionScope - c#

Recently, I've been checking out RabbitMQ over C# as a way to implement pub/sub. I'm more used to working with NServiceBus. NServiceBus handles transactions by enlisting MSMQ in a TransactionScope. Other transaction aware operations can also enlist in the same TransactionScope (like MSSQL) so everything is truly atomic. Underneath, NSB brings in MSDTC to coordinate.
I see that the C# client API for RabbitMQ has IModel.TxSelect() and IModel.TxCommit(). These work well for holding messages back from the exchange until the commit, which covers the use case where multiple messages sent to the exchange need to be atomic. However, is there a good way to synchronize a database call (say to MSSQL) with the RabbitMQ transaction?

You can write a RabbitMQ Resource Manager to be used by MSDTC by implementing the IEnlistmentNotification interface. The implementation provides two phase commit notification callbacks for the transaction manager upon enlisting for participation. Please note that MSDTC comes with a heavy price and will degrade your overall performance drastically.
Example of RabbitMQ resource manager:
using System.Transactions;
using RabbitMQ.Client;

// Enlists a RabbitMQ channel transaction in a System.Transactions transaction.
sealed class RabbitMqResourceManager : IEnlistmentNotification
{
    private readonly IModel _channel;

    public RabbitMqResourceManager(IModel channel, Transaction transaction)
    {
        _channel = channel;
        _channel.TxSelect(); // put the channel into transactional mode
        transaction.EnlistVolatile(this, EnlistmentOptions.None);
    }

    // Enlists in the ambient transaction, if there is one.
    public RabbitMqResourceManager(IModel channel)
    {
        _channel = channel;
        _channel.TxSelect();
        if (Transaction.Current != null)
            Transaction.Current.EnlistVolatile(this, EnlistmentOptions.None);
    }

    // Phase 2: the transaction committed, so release the published messages.
    public void Commit(Enlistment enlistment)
    {
        _channel.TxCommit();
        enlistment.Done();
    }

    // Outcome unknown: treated here as a rollback.
    public void InDoubt(Enlistment enlistment)
    {
        Rollback(enlistment);
    }

    // Phase 1: nothing to prepare, so vote to commit.
    public void Prepare(PreparingEnlistment preparingEnlistment)
    {
        preparingEnlistment.Prepared();
    }

    public void Rollback(Enlistment enlistment)
    {
        _channel.TxRollback();
        enlistment.Done();
    }
}
Example using resource manager
using (TransactionScope trx = new TransactionScope())
{
    var basicProperties = _channel.CreateBasicProperties();
    basicProperties.DeliveryMode = 2; // persistent
    // Pass the ambient transaction; a TransactionScope is not itself a Transaction.
    new RabbitMqResourceManager(_channel, Transaction.Current);
    _channel.BasicPublish(someExchange, someQueueName, basicProperties, someData);
    trx.Complete();
}

As far as I'm aware there is no way of coordinating the TxSelect/TxCommit with the TransactionScope.
Currently, the approach I'm taking is to use durable queues with persistent messages to ensure they survive RabbitMQ restarts. When consuming from the queues, I read a message, do some processing, and then insert a record into the database. Once all this is done, I ACK(nowledge) the message and it is removed from the queue. The potential problem with this approach is that a message could end up being processed twice (for example, if the record is committed to the DB but the connection to RabbitMQ drops before the message can be ack'd). For the system we're building, though, throughput is the bigger concern. (I believe this is called the "at-least-once" approach.)
The RabbitMQ site does say that there is a significant performance hit using the TxSelect and TxCommit so I would recommend benchmarking both approaches.
Whichever way you do it, you will need to ensure that your consumer can cope with a message potentially being processed twice.
If you haven't found it yet, take a look at the .NET user guide for RabbitMQ here, specifically section 3.5.
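One common way to cope with redelivery is to make the handler idempotent by recording the ID of every message it has already processed. The following is a minimal sketch, assuming each message carries a unique ID. The names IdempotentHandler and TryHandle are hypothetical, and the in-memory set is only a stand-in for what would normally be a table written in the same database transaction as the business data.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: skips messages whose ID has already been processed.
// Not thread-safe; a real consumer would use a database table instead.
public sealed class IdempotentHandler
{
    // In-memory stand-in for a "processed messages" table.
    private readonly HashSet<string> _processedIds = new HashSet<string>();
    private readonly Action<string> _process;

    public IdempotentHandler(Action<string> process)
    {
        _process = process;
    }

    // Returns true if the body was processed, false if the ID was a duplicate.
    public bool TryHandle(string messageId, string body)
    {
        if (_processedIds.Contains(messageId))
            return false;             // redelivered message: safe to ack again

        _process(body);               // do the real work first
        _processedIds.Add(messageId); // record the ID only after the work succeeded
        return true;
    }
}
```

In either case the caller can ack the message afterwards: a duplicate has already had its effect, so acknowledging it again loses nothing.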

Let's say you've got a service bus implementation for your abstraction IServiceBus. We can pretend it's RabbitMQ under the hood, but it certainly doesn't need to be.
When you call servicebus.Publish, you can check System.Transactions.Transaction.Current to see if you're in a transaction. If you are, and it's a transaction for a SQL Server connection, then instead of publishing to Rabbit you can publish to a broker queue within SQL Server, which will respect the commit/rollback of whatever database operation you're performing. (You will want to do some connection magic here to avoid the broker publish upgrading your transaction to MSDTC.)
Now you need to create a service that reads the broker queue and does the actual publish to Rabbit. This way, for very important things, you can guarantee that your database operation completed first and that the message gets published to Rabbit at some point in the future (when the service relays it). Failures are still possible here if an exception occurs while committing the broker receive, but the window for problems is drastically reduced, and in the worst case you would end up publishing multiple times; you would never lose a message. This is very unlikely. The SQL Server going offline after the receive but before the commit would be an example of when you would end up at least double publishing (when the server comes back online you'd publish again). You can build your service to be smart enough to mitigate some of this, but unless you use MSDTC and all that comes with it (yikes) or build your own MSDTC (yikes yikes), you are going to have potential failures. It's all about making the window small and unlikely to occur.
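The relay step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: OutboxRelay is a hypothetical name, the in-memory queue stands in for the SQL Server broker queue, and the publish delegate stands in for the real RabbitMQ publish call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical relay: forwards messages from a transactional store to the broker.
public sealed class OutboxRelay
{
    // Stand-in for the SQL Server queue that is written in the same
    // transaction as the business data.
    private readonly Queue<string> _pending = new Queue<string>();
    private readonly Action<string> _publishToBroker;

    public OutboxRelay(Action<string> publishToBroker)
    {
        _publishToBroker = publishToBroker;
    }

    public int PendingCount => _pending.Count;

    public void Enqueue(string message) => _pending.Enqueue(message);

    // Publishes pending messages in order and returns how many were relayed.
    // A message is removed only after the publish call returns, so a failure
    // leaves it in place for the next run (possible duplicate, never a loss).
    public int RelayPending()
    {
        var relayed = 0;
        while (_pending.Count > 0)
        {
            _publishToBroker(_pending.Peek()); // may throw; message stays queued
            _pending.Dequeue();                // the "commit" of the receive
            relayed++;
        }
        return relayed;
    }
}
```

The ordering is the design choice that matters: publish first, remove second. If the process dies between the two steps, the message is published again on the next run, which is the double-publish window the answer describes.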

Related

Does MassTransit In-Memory Outbox work with Mediator?

Does the In-Memory Outbox only work with an underlying messaging transport configured?
The documentation and some of the various posts I have read are leading me to believe that it will ONLY work with a specific underlying transport specified. It would be nice if that wasn't the case.
I say this because I have read discussions about the outbox acknowledging messages "from a broker": only once all processing has completed successfully are messages acknowledged and publishing occurs.
So, when handling the messaging (e.g. via Amazon SQS) oneself and publishing messages into the state machine (i.e. taking the transport message, creating a new message, and then handing it off for publishing to a consumer or saga state machine), how would the outbox know about and work with the underlying transport messages?
To be really clear, will the outbox work when using the following configuration (note the absence of any messaging transport configuration) :
services.AddMediator(configurator =>
{
    configurator.AddConsumer<PublishMessageConsumer>();
    configurator.AddSagaStateMachine<YetAnotherStateMachine, YetSomeMoreState>(
        sagaConfigurator =>
        {
            sagaConfigurator.UseInMemoryOutbox();
        }).DynamoDbRepository()
    /// Snip
});
If it DOES work: suppose I wanted a consumer AND the saga state machine to work in concert, such that the saga published to the consumer and the consumer failed for some reason. What would actually happen?
The sole purpose of the in-memory outbox is to defer calls to Send/Publish until after the consumer has completed. In the case of a saga, it means after the saga has been persisted to the saga repository after all state machine behaviors for the event have completed successfully (without throwing an exception).
In the case above, the saga would complete all activities for the triggering event, the instance would be saved to the saga repository, and finally the consumer would be created/called by the Send/Publish call from the saga.
If the consumer throws an exception, it won't affect the already persisted saga instance in any way, as that has already completed.
NOW. If you do NOT use the in-memory outbox in this scenario, since it is using mediator (and not a transport), if you call Send/Publish in a state machine activity, control is transferred immediately to the consumer of the message sent/published. After that consumer completes, control returns to the saga, which is persisted to the repository once its activities have completed. The original message consumed by the saga then completes, returning control to the original Send/Publish call.
Mediator is immediate, and any messages produced by consumers and/or sagas are consumed immediately as well.
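The deferral behavior described in this answer can be illustrated with a small stand-alone sketch. This is not MassTransit code and does not use its API; DeferredPublisher is a hypothetical class that only demonstrates the idea of buffering published messages until the handler has finished successfully.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of the outbox idea; not MassTransit code.
public sealed class DeferredPublisher
{
    private readonly List<object> _buffered = new List<object>();
    private readonly Action<object> _deliver;

    public DeferredPublisher(Action<object> deliver)
    {
        _deliver = deliver;
    }

    // Called by the handler: the message is only buffered, not delivered.
    public void Publish(object message) => _buffered.Add(message);

    // Runs the handler, then delivers buffered messages only if it succeeded.
    public void Run(Action<DeferredPublisher> handler)
    {
        _buffered.Clear();
        try
        {
            handler(this);
        }
        catch
        {
            _buffered.Clear(); // handler failed: discard everything it "sent"
            throw;
        }

        // Take a snapshot, then deliver after the handler's work is done.
        var toDeliver = _buffered.ToArray();
        _buffered.Clear();
        foreach (var message in toDeliver)
            _deliver(message);
    }
}
```

In the scenario from the question, the handler plays the role of the saga (including saving it to the repository) and the deliver delegate plays the role of the consumer, so a consumer failure happens only after the saga's own work is complete.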

It was not possible to connect to the redis server(s); ConnectTimeout

I'm using Azure Functions V1 with StackExchange.Redis 1.2.6. The function receives thousands of messages per minute, and for every message, for every device, I check Redis. I noticed that when message volume is high, we get the error below.
Exception while executing function: TSFEventRoutingFunction No connection is available to service this operation: HGET GEO_DYNAMIC_hash; It was not possible to connect to the redis server(s); ConnectTimeout; IOCP: (Busy=1,Free=999,Min=24,Max=1000), WORKER: (Busy=47,Free=32720,Min=24,Max=32767), Local-CPU: n/a It was not possible to connect to the redis server(s); ConnectTimeout
CacheService as recommended by MS
public class CacheService : ICacheService
{
    private readonly IDatabase cache;
    private static readonly string connectionString = ConfigurationManager.AppSettings["RedisConnection"];
    public CacheService()
    {
        this.cache = Connection.GetDatabase();
    }
    private static Lazy<ConnectionMultiplexer> lazyConnection = new Lazy<ConnectionMultiplexer>(() =>
    {
        return ConnectionMultiplexer.Connect(connectionString);
    });
    public static ConnectionMultiplexer Connection
    {
        get
        {
            return lazyConnection.Value;
        }
    }
    public async Task<string> GetAsync(string hashKey, string ruleKey)
    {
        return await this.cache.HashGetAsync(hashKey, ruleKey);
    }
}
I'm injecting ICacheService into the Azure Function and calling the GetAsync method on every request.
I'm using an Azure Redis C3 instance.
As you can see, I currently have a single connection. Would creating multiple connections help solve this issue? Or is there any other suggestion to solve or understand it?
There are many different causes of the error you are getting. Here are some I can think of off the top of my head (not in any particular order):
Your connectTimeout is too small. I often see customers set a small connect timeout because they think it will ensure that the connection is established within that time span. The problem with this approach is that when something goes wrong (high client CPU, high server CPU, etc.), the connection attempt will fail. This often makes a bad situation worse: instead of helping, it aggravates the problem by forcing the system to restart the process of trying to reconnect, often resulting in a connect -> fail -> retry loop. I generally recommend that you leave your connectTimeout at 15 seconds or higher. It is better to let your connection attempt succeed after 15 or 20 seconds than to have it fail after 5 seconds repeatedly, resulting in an outage lasting several minutes until the system finally recovers.
A server-side failover occurs. A connection is severed by the server as a result of some type of failover from master to replica. This can happen if the server-side software is updated at the Redis layer, the OS layer or the hosting layer.
A networking infrastructure failure of some type (hardware sitting between the client and the server sees some type of issue).
You change the access password for your Redis instance. Changing the password will reset connections to all clients to force them to re-authenticate.
Thread Pool Settings need to be adjusted. If your thread pool settings are not adjusted correctly for your workload, then you can run into delays in spinning up new threads as explained here.
I have written a bunch of best practices for Redis that will help you avoid other problems as well.
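Two of the suggestions above (a longer connect timeout and adjusted thread pool settings) can be sketched as follows. This is an illustration under assumptions, not a recommended configuration: the host name and access key are placeholders, RedisClientTuning is a hypothetical name, and suitable minimum thread counts depend on the workload. In the error shown in the question, WORKER Busy=47 is above Min=24, which is the situation where new threads may be injected slowly.

```csharp
using System;
using System.Threading;

public static class RedisClientTuning
{
    // Placeholder host name and key. connectTimeout is in milliseconds;
    // abortConnect=False lets the multiplexer keep retrying in the background.
    public const string ConnectionString =
        "contoso.redis.cache.windows.net:6380,password=<access-key>," +
        "ssl=True,abortConnect=False,connectTimeout=15000";

    // Raises the thread pool minimums so bursts do not wait on thread injection.
    // Never lowers the current values. Returns false if the runtime rejected them.
    public static bool RaiseMinThreads(int minWorker, int minIocp)
    {
        ThreadPool.GetMinThreads(out var currentWorker, out var currentIocp);
        return ThreadPool.SetMinThreads(
            Math.Max(minWorker, currentWorker),
            Math.Max(minIocp, currentIocp));
    }
}
```

A call such as RedisClientTuning.RaiseMinThreads(200, 200) would go once at process startup, before the first Redis call; the value 200 is only an example.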
We solved this issue by upgrading StackExchange.Redis to 2.1.30.

How to open multiple sql connections with ADO when handling an nservicebus message

I have a message handler using NServiceBus that needs to execute SQL code on two different databases. The connection strings have different initial catalogs but are otherwise identical.
When the message is picked up, the first sql connection opens successfully but the second sql connection causes the following exception to be thrown when .Open is called.
Network access for Distributed Transaction Manager (MSDTC) has been
disabled. Please enable DTC for network access in the security
configuration for MSDTC using the Component Services Administrative
tool.
We don't use MSDTC.
Here's the code that fails. It will fail on connB.Open()
public void Handle(MyMsgCmd message)
{
    using (SqlConnection connA = new SqlConnection(myConnectionStringA))
    {
        connA.Open();
    }
    using (SqlConnection connB = new SqlConnection(myConnectionStringB))
    {
        connB.Open();
    }
}
This same code works perfectly fine when run from a command line application or web application. The exception is only thrown when it's called from NServiceBus.
Each of these connections will successfully open when opened first or when opened by itself but whenever there's a second connection present the second connection will always fail to open with the same exception even when it's known good.
Is there additional configuration needed to open more than one connection in sequence with NServiceBus?
It looks like by default NServiceBus wraps each message handler in a transaction and that causes queries to different database connections inside the same message handler to fail unless MSDTC is enabled.
I can disable that with BusConfiguration.Transactions().DoNotWrapHandlersExecutionInATransactionScope()
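A narrower alternative, if only some of the work should run outside the handler's transaction, is to suppress the ambient transaction around that work using System.Transactions. This is a technique swapped in for illustration, not something the NServiceBus setting above does. The sketch below is minimal, the names AmbientTransactionDemo and RunOutsideTransaction are hypothetical, and anything done inside the suppressed block is not rolled back if the handler later fails.

```csharp
using System;
using System.Transactions;

public static class AmbientTransactionDemo
{
    // Runs the action with no ambient transaction, so connections opened
    // inside it do not enlist in (or escalate) the surrounding transaction.
    public static void RunOutsideTransaction(Action action)
    {
        using (var suppress = new TransactionScope(TransactionScopeOption.Suppress))
        {
            action();
            suppress.Complete();
        }
    }
}
```

Inside the handler, the second connection would be opened within the action passed to RunOutsideTransaction. The consistency trade-off between the two databases remains and has to be handled by the application.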
You can find more information on transactions in the NServiceBus documentation.
This isn't related exclusively to NServiceBus, we just provide different ways of connecting to a transport (like MSMQ, Azure Service Bus, etc), a persister and your own database.
But even without NServiceBus, when connecting to two databases, you need either a distributed transaction, or to make sure the transaction is not escalated to a distributed transaction. The thing is, without distributed transactions, one transaction might commit successfully while the other fails, with the result that your two databases are no longer in sync or consistent.
If orders in DatabaseA are stored and inventory is tracked in DatabaseB, you might deduct 1 from inventory, but the order might never be stored because the transaction failed. You need to compensate for this yourself without distributed transactions.
That's not to say distributed transactions are always the way to go. You're probably not using them because your DBA doesn't like them. MSDTC always puts serializable transactions on your data, which take the heaviest locks. The longer you keep them open, the longer concurrently running transactions will need to wait, possibly causing huge performance issues in your software.
On the other hand, it can be very, very difficult to create compensating transactions. And just think about the fact that DatabaseA might fail, DatabaseB might succeed. But what happens to the message? Is it gone from the queue? Or will it remain in the queue and be processed again? Will DatabaseB succeed again with the possible result of duplicate data?
Luckily you're already using NServiceBus. You might want to check out the Outbox feature that can help solve some of these issues.

WMQ: Distributing MQ readers over several machines

I am using WMQ to access an IBM WebSphere MQ on a mainframe - using c#.
We are considering spreading out our service on several machines, and we then need to make sure that two services on two different machines cannot read/get the same MQ message at the same time.
My code for getting messages is this:
var connectionProperties = new Hashtable();
const string transport = MQC.TRANSPORT_MQSERIES_CLIENT;
connectionProperties.Add(MQC.TRANSPORT_PROPERTY, transport);
connectionProperties.Add(MQC.HOST_NAME_PROPERTY, mqServerIP);
connectionProperties.Add(MQC.PORT_PROPERTY, mqServerPort);
connectionProperties.Add(MQC.CHANNEL_PROPERTY, mqChannelName);
_mqManager = new MQQueueManager(mqManagerName, connectionProperties);
var queue = _mqManager.AccessQueue(_queueName, MQC.MQOO_INPUT_SHARED + MQC.MQOO_FAIL_IF_QUIESCING);
var queueMessage = new MQMessage {Format = MQC.MQFMT_STRING};
var queueGetMessageOptions = new MQGetMessageOptions {Options = MQC.MQGMO_WAIT, WaitInterval = 2000};
queue.Get(queueMessage, queueGetMessageOptions);
queue.Close();
_mqManager.Commit();
return queueMessage.ReadString(queueMessage.MessageLength);
Is WebSphere MQ transactional by default, or is there something I need to change in my configuration to enable this?
Or - do I need to ask our mainframe guys to do some of their magic?
Thx
Unless you actively BROWSE the message (i.e. read it but leave it there with no locks), only one getter will ever be able to 'get' the message. Even without transactionality, MQ will still only deliver the message once, but once delivered it's gone.
MQ is not transactional 'by default'. If you want transactionality, you need to get with GMO_SYNCPOINT (MQ transactions) and commit at the connection (MQQueueManager) level; integrating with .NET transactions is another option.
If you use syncpoint then one getter will get the message and the other will ignore it, but if you subsequently have an issue and roll back, then it is made available to any getter (as you would want). It is in this scenario that you might see a message twice, but that's because you aborted the transaction and hence asked for it to be put back to how it was before the get.
I wish I'd found this sooner because the accepted answer is incomplete. MQ provides once and only once delivery of messages as described in the other answer and IBM's documentation. If you have many clients listening on the same queue, MQ will deliver only one copy of the message. This is uncontested.
That said, MQ, or any other async messaging for that matter, must deal with session handling and ambiguous outcomes. The effect of these factors is such that any async messaging application should be designed to gracefully handle dupe messages.
Consider an application putting a message onto a queue. If the PUT call receives a 2009 Connection Broken response, it is unclear whether the connection failed before or after the channel agent received and acted on the API call. The application, having no way to tell the difference, must put the message again to assure it is received. Doing the PUT under syncpoint can result in a 2009 on the COMMIT (or equivalent return code in messaging transports other than MQ) and the app doesn't know if the COMMIT was successful or if the PUT will eventually be rolled back. To be safe it must PUT the message again.
Now consider the partner application receiving the messages. A GET issued outside of syncpoint that reaches the channel agent will permanently remove the message from the queue, even if the channel agent cannot then deliver it. So use of transacted sessions ensures that the message is not lost. But suppose that the message has been received and processed and the COMMIT returns a 2009 Connection Broken. The app has no way to know whether the message was removed during the COMMIT or will be rolled back and delivered again. At the very least the app can avoid losing messages by using transacted sessions to retrieve them, but can not guarantee to never receive a dupe.
This is of course endemic to all async messaging, not just MQ, which is why the JMS specification directly addresses it. The situation is addressed in all versions, but in the JMS 1.1 spec look in section 4.4.13, Duplicate Production of Messages, which states:
If a failure occurs between the time a client commits its work on a
Session and the commit method returns, the client cannot determine if
the transaction was committed or rolled back. The same ambiguity
exists when a failure occurs between the non-transactional send of a
PERSISTENT message and the return from the sending method.
It is up to a JMS application to deal with this ambiguity. In some
cases, this may cause a client to produce functionally duplicate
messages.
A message that is redelivered due to session recovery is not
considered a duplicate message.
If it is critical that the application receive one and only one copy of the message, use 2-Phase transactions. The transaction manager and XA protocol will provide very strong (but still not absolute) assurance that only one copy of the message will be processed by the application.
The behavior of the messaging transport in delivering one and only one copy of a given message is a measure of the reliability of the transport. By contrast, the behavior of an application which relies on receipt of one and only one copy of the message is a measure of the reliability of the application.
Any duplicate messages received from an IBM MQ transport are almost certainly going to be due to the application's failure to use XA to account for the ambiguous outcomes inherent in async messaging and not a defect in MQ. Please keep this in mind when the Production version of the application chokes on its first duplicate message.
On a related note, if Disaster Recovery is involved, the app must also gracefully recover from lost messages, or else find a way to violate the laws of relativity.

WCF - Client callback vs. polling for "keep list of subscribers"

I want to create a simple client-server example in WCF. I did some testing with callbacks, and it works fine so far. I played around a little bit with the following interface:
[ServiceContract(SessionMode = SessionMode.Required, CallbackContract = typeof(IStringCallback))]
public interface ISubscribeableService
{
    [OperationContract]
    void ExecuteStringCallBack(string value);
    [OperationContract]
    ServerInformation Subscribe(ClientInformation c);
    [OperationContract]
    ServerInformation Unsubscribe(ClientInformation c);
}
It's a simple example, adjusted a little. You can ask the server to "execute a string callback", in which case the server reverses the string and calls all subscribed client callbacks.
Now, here comes the question: If I want to implement a system where all clients "register" with the server, and the server can "ask" the clients if they are still alive, would you implement this with callbacks (so instead of this "stringcallback", a kind of TellTheClientThatIAmStillHereCallback)? By checking the communication state on the callback I can also "know" if a client is dead. Something similar to this:
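// Iterate over a snapshot so that removing dead subscribers does not
// modify the list while it is being enumerated.
foreach (IStringCallback callback in Subscribers.ToArray())
{
    if (((ICommunicationObject)callback).State == CommunicationState.Opened)
    {
        callback.StringCallbackFunction(new string(retVal));
    }
    else
    {
        Subscribers.Remove(callback);
    }
}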
My problem, put in another way:
The server might have 3 clients
Client A dies (I pull the plug of the laptop)
The server dies and comes back online
A new client comes up
So basically, would you use callbacks to verify the "still living state" of clients, or would you use polling and keep track of "how long I haven't heard from a client"?
You can detect most changes to the connection state via the Closed, Closing, and Faulted events of ICommunicationObject. You can hook them at the same time that you set up the callback. This is definitely better than polling.
IIRC, the Faulted event will only fire after you actually try to use the callback (unsuccessfully). So if the Client just disappears - for example, a hard reboot or power-off - then you won't be notified right away. But do you need to be? And if so, why?
A WCF callback might fail at any time, and you always need to keep this in the back of your mind. Even if both the client and server are fine, you might still end up with a faulted channel due to an exception or a network outage. Or maybe the client went offline sometime between your last poll and your current operation. The point is, as long as you code your callback operations defensively (which is good practice anyway), then hooking the events above is usually enough for most designs. If an error occurs for any reason - including a client failing to respond - the Faulted event will kick in and run your cleanup code.
This is what I would refer to as the passive/lazy approach and requires less coding and network chatter than polling or keep-alive approaches.
If you enable reliable sessions, WCF internally maintains a keep-alive control mechanism. It regularly checks, via hidden infrastructure test messages, if the other end is still there. The time interval of these checks can be influenced via the ReliableSession.InactivityTimeout property. If you set the property to, say, 20 seconds, then the ICommunicationObject.Faulted event will be raised about 20 to 30 (maximum) seconds after a service breakdown has occurred on the other side.
If you want to be sure that client applications always remain "auto-connected", even after temporary service breakdowns, you may want to use a worker thread (from the thread pool) that repeatedly tries to create a new proxy instance on the client side, and calls a session-initiating operation, after the Faulted event has been raised there.
As a second approach, since you are implementing a worker thread mechanism anyway, you might also ignore the Faulted event and let the worker thread loop during the whole lifetime of the client application. You let the thread repeatedly check the proxy state, and try to do its repair work whenever the state is faulted.
Using the first or the second approach, you can implement a service bus architecture (mediator pattern), guaranteeing that all client application instances are constantly ready to receive "spontaneous" service messages whenever the service is running.
Of course, this only works if the reliable session "as such" is configured correctly to begin with (using a session-capable binding, and applying the ServiceContractAttribute.SessionMode, ServiceBehaviorAttribute.InstanceContextMode, OperationContractAttribute.IsInitiating, and OperationContractAttribute.IsTerminating properties in meaningful ways).
I had a similar situation using WCF and callbacks. I did not want to use polling, but I was using a "reliable" protocol, so if a client died, it would hang the server until it timed out and crashed.
I do not know if this is the most correct or elegant solution, but what I did was create a class in the service to represent the client proxy. Each instance of this class contained a reference to the client proxy, and would execute the callback function whenever the server set the "message" property of the class. By doing this, when a client disconnected, the individual wrapper class would get the timeout exception and remove itself from the server's list of listeners, but the service would not have to wait for it. This doesn't actually answer your question about determining if the client is alive, but it is another way of structuring the service to address the issue. If you needed to know when a client died, you would be able to pick up when the client wrapper removed itself from the listener list.
I have not tried to use WCF callbacks over the wire, but I have used them for interprocess communication. I was having a problem where all of the calls being sent ended up on the same thread, deadlocking the service when calls depended on that same thread.
This may apply to the problem that you currently have, so here is what I had to do to fix it.
Put this attribute on the WCF server implementation class, on both the server and the client side:
[ServiceBehavior(ConcurrencyMode = ConcurrencyMode.Multiple)]
public class WCFServerClass
ConcurrencyMode.Multiple makes each call process on its own thread, which should help with the server locking up until it times out when a client dies.
I also used a thread pool on the client side to make sure that there were no threading issues there.
