We are writing a C# application to process events/messages from a student application system.
The web-based portal sends events/messages to a queue table. We dequeue these and, based on message type, want to process each event.
Example events are ‘applicationSubmitted’, ‘applicationUpdated’, ‘offerAccepted’, etc.
There are quite a number of different event/message types. We want to use the CQRS pattern, but would value input on how commands should be structured. Would you create a command for each event/message type? Would you use a command factory of some sort?
In an event-driven CQRS architecture, events are the result of validating commands: commands can be rejected, while events can at most be ignored (but cannot be denied). Typically, commands come into an application which validates them against a write model (almost always with some sort of concurrency control which imposes an ordering); that write model is the authoritative source of truth, emitting events for consumption by a read model.
So the question then becomes, where is your write-model?
You have a web-based portal sending events/messages to a queue. Is that portal the authoritative source of truth? If so, then it's writing events to the queue and your commands sound like they're going to be the HTTP requests and their bodies: you can structure those like you would any other web request.
On the other hand, if the processor which dequeues the message is maintaining its state and deciding what the truth is, then "application submitted" isn't an event but is a command (in which case, a more imperative phrasing is probably less confusing: "submit application").
Very often, one finds that one component's events are another component's commands.
The command pattern doesn't really have anything to do with the C in CQRS. The latter "command" is in the (to use the domain-driven-design terminology) architecture bounded context and the former "command" is in the lower-level implementation bounded context. Neither implies the other. Similarly there's a (perhaps small) distinction between "event" in the architecture bounded context (along the lines of event-driven architecture) and "event" in the implementation bounded context (along the lines of event-sourcing).
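If the dequeuing processor does turn out to be your write model, a minimal sketch of that translation might look like this (all type and member names here are hypothetical): one imperative command per message type, with a small factory centralizing the mapping.

using System;

public interface ICommand { }

public sealed record SubmitApplication(Guid ApplicationId, string Payload) : ICommand;
public sealed record UpdateApplication(Guid ApplicationId, string Payload) : ICommand;
public sealed record AcceptOffer(Guid ApplicationId, string Payload) : ICommand;

public static class CommandFactory
{
    // One imperative command per message type; unknown types fail loudly.
    public static ICommand Create(string messageType, Guid applicationId, string payload) =>
        messageType switch
        {
            "applicationSubmitted" => new SubmitApplication(applicationId, payload),
            "applicationUpdated" => new UpdateApplication(applicationId, payload),
            "offerAccepted" => new AcceptOffer(applicationId, payload),
            _ => throw new ArgumentException($"Unknown message type: {messageType}")
        };
}

Each command then goes to its own handler, which validates it against the write model before any event is emitted.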
I'm trying to figure out how to implement a fault-tolerant message publication solution using MassTransit. We'll focus on the simple scenario where we only need to commit a database change and publish an event indicating that change. Because there is no (built-in) mechanism that allows an atomic "commit and publish", when our process crashes we will end up in an inconsistent state (some changes would only be committed to the database, and some events would only be published to the message queue).
This documentation page offers a solution: because we assume message handling is idempotent, we can rely on the entire operation being retried in case of failure, and these partial commits will be resolved eventually. This is a great solution, but it has one caveat: it assumes that the operation we are performing was triggered by a message, and that if we don't send an ack, processing will be retried. This is not a reasonable assumption, as messaging is typically used only for internal communication inside the system, not for communication with the outside world. What should I do when I need to save-and-publish while handling an HTTP request from an external client?
One possible solution is to hack our way into the approach presented in the article by only publishing (or sending) a message and listening to it ourselves; then, in the message handler, we do the commit and publish the actual event we want others to listen to. The main problem I have with this is that it assumes we never have to return anything in the HTTP response. What if we need to indicate the success or failure of the database transaction back to the HTTP client? (For example, if we rely on a UNIQUE constraint to tell us whether or not we should accept the request, and we want to indicate failure to the client.) We could solve it by using request-response over the message queue (with ourselves), but this is ugly and increases latency and complexity considerably for what is actually a very common scenario.
The approach I see most often on the internet to solve this problem is to use an outbox that is persisted to the same database we need to write to anyway, so that we can wrap the two operations in a regular ACID database transaction. A background task then polls this database for new events and publishes them to the message broker. Unlike other frameworks, I understand that MassTransit does not support this behavior out of the box. So I guess my question boils down to: before rushing to implement this relatively complex mechanism myself (once per database technology), is there another solution I'm missing? What is the accepted solution to this problem in the MassTransit community?
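For clarity, this is roughly the shape I mean (a sketch only; the names are hypothetical and it assumes EF Core, where a single SaveChangesAsync commits both writes in one transaction):

using System;
using System.Threading.Tasks;

public class OutboxMessage
{
    public Guid Id { get; set; }
    public string Type { get; set; } = "";
    public string Payload { get; set; } = "";
    public DateTime? PublishedAt { get; set; } // set by the background publisher
}

public class UserService
{
    // AppDbContext and User are hypothetical; the point is the single transaction.
    public async Task SaveAndEnqueueAsync(AppDbContext db, User user, string eventJson)
    {
        db.Users.Add(user);
        db.Outbox.Add(new OutboxMessage { Id = Guid.NewGuid(), Type = "UserCreated", Payload = eventJson });
        await db.SaveChangesAsync(); // one ACID transaction covers the entity and the outbox row
    }
}

A background task would then poll the Outbox table and publish any unsent rows to the broker.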
This has been asked several times, in a variety of forms, here and other places. But the short answer is simple.
In your controller, write to the message broker only. Let the consumer deal with the database, in the context of consuming a reliable message, with all the nice retry and redelivery options that are available in that context. Then you get all the benefits of the InMemoryOutbox, without adding the extreme complexity of having three parties (HTTP, database, and broker) in a single conversation.
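A minimal sketch of that flow (route and type names hypothetical; IPublishEndpoint and IConsumer are MassTransit's standard abstractions):

using System;
using System.Threading.Tasks;
using MassTransit;
using Microsoft.AspNetCore.Mvc;

public record UpdatePassword(Guid UserId, string NewPassword);

[ApiController]
public class AccountsController : ControllerBase
{
    private readonly IPublishEndpoint _publishEndpoint;
    public AccountsController(IPublishEndpoint publishEndpoint) => _publishEndpoint = publishEndpoint;

    [HttpPost("accounts/{id}/password")]
    public async Task<IActionResult> ChangePassword(Guid id, [FromBody] string newPassword)
    {
        // Write to the broker only; no database work here.
        await _publishEndpoint.Publish(new UpdatePassword(id, newPassword));
        return Accepted();
    }
}

public class UpdatePasswordConsumer : IConsumer<UpdatePassword>
{
    public async Task Consume(ConsumeContext<UpdatePassword> context)
    {
        // The database write happens here, inside a reliable message context
        // with retry and redelivery available if it fails.
        await Task.CompletedTask; // e.g. save the change via your DbContext
    }
}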
I understand the difference between transactional consistency and eventual consistency. Say I am developing an application where there are three Microservices and there is a Message Bus, which sends messages between them when Integration Events are raised meaning eventual consistency. For example, Microservice B publishes an integration event and Microservice A handles it two hours later because Microservice B was down at the time the event was published and the message is durable - this is fine.
The way I understand it, there should be transactional consistency inside a Microservice: Aggregate A may publish a domain event that Aggregate B is interested in, so a Domain Event is raised and any updates to the database are performed within the same transaction.
I do not understand how CQRS fits into this transactional consistency/eventual consistency scenario because:
I cannot use transactional consistency because the read model (NoSQL) and write model (SQL server) cannot be updated inside the same transaction.
I cannot use the Message Bus because updating the read model is not an integration event i.e. the read model and write model are contained inside the same Microservice.
With CQRS I believe there are two options:
If using Event Store for the write side, then the read side can poll it - this solves the problem because there is no event to deliver.
If using an Event Log/relational database for the write side, then a Domain event is raised to update the read side.
If option two is chosen then how can I guarantee that the read model will be in sync with the write model eventually? For example, the read model may be down when the event is raised.
CQRS fits into the concept of eventual consistency by making you less vulnerable to optimistic locking when using a DBMS for your read-only systems. Separating your commands and queries enables you to have working reads/writes regardless of either side's availability.
1). Transactional consistency is not advisable if you want to have highly available endpoints because of optimistic locking.
2). You can definitely use a message bus for updating your read models, since the concept of queueing is not synonymous with inter-context data synchronization.
Technically an aggregate IS the unit of atomicity in DDD, so there doesn't need to be a guarantee of consistency between aggregates communicating via domain events. From Evans' book:
An AGGREGATE is a cluster of associated objects that we treat as a unit for the purpose of data changes ... Invariants, which are consistency rules that must be maintained whenever data changes, will involve relationships between members of the AGGREGATE. Any rule that spans AGGREGATES will not be expected to be up-to-date at all times ... But the invariants applied within an AGGREGATE will be enforced with the completion of each transaction.
For practical purposes, however, most of the services I've developed do wrap the processing of domain events in the same ambient transaction created to handle the processing of the initial request. Distributed applications are hard enough to design and debug without worrying about things like compensating actions inside a service!
I'm currently using the MediatR library to decouple the domain event handlers from the original command/request handler that generates them. It has very similar send/handle semantics to messaging systems, and includes a robust middleware-like pipeline for validation and pre-/post-processing.
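A small sketch of what that looks like (the event and handler names are hypothetical; INotification and INotificationHandler are MediatR's standard interfaces):

using System;
using System.Threading;
using System.Threading.Tasks;
using MediatR;

public record ApplicationSubmitted(Guid ApplicationId) : INotification;

public class ProjectToReadModelHandler : INotificationHandler<ApplicationSubmitted>
{
    public Task Handle(ApplicationSubmitted notification, CancellationToken cancellationToken)
    {
        // Project the change into the read model here.
        return Task.CompletedTask;
    }
}

// From inside the command handler, usually within the same ambient transaction:
// await _mediator.Publish(new ApplicationSubmitted(applicationId), cancellationToken);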
If option two is chosen then how can I guarantee that the read model will be in sync with the write model eventually?
The solution is a mix of your two options:
Raise a domain event when executing a command.
Store the domain event in an event store in the write model. This task is performed by a static lightweight subscriber. The event is stored in the same transaction in which the command is executed.
A worker or batch process takes the events of event store and send them through a message queue.
A subscriber takes them from the queue and updates the read model.
This way you never lose events. If the read model isn't available for whatever reason, the worker will push the event through again.
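A sketch of the worker step (IEventStore and IMessageQueue here are hypothetical abstractions over your actual store and broker):

using System.Collections.Generic;
using System.Threading.Tasks;

public interface IEventStore
{
    Task<IReadOnlyList<object>> LoadUndispatchedAsync();
    Task MarkDispatchedAsync(object evt);
}

public interface IMessageQueue
{
    Task SendAsync(object evt);
}

public class EventRelayWorker
{
    // Events are only marked dispatched after a successful send, so a crash or an
    // unavailable read model just means the event is sent again on the next pass
    // (at-least-once delivery; subscribers should be idempotent).
    public async Task RelayPendingEventsAsync(IEventStore store, IMessageQueue queue)
    {
        foreach (var evt in await store.LoadUndispatchedAsync())
        {
            await queue.SendAsync(evt);
            await store.MarkDispatchedAsync(evt);
        }
    }
}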
I have a project which is designed, or at least should be, according to the well-known DDD principles.
Backend - DDD + CQRS + Event Store
UI - ngrx/store
I have a lot of questions to ask about it but for now I will stick to these two:
How should the UI store be updated after a single Command/Action is executed?
a) subscribe to response.ok
b) listen to domain events
c) trigger a generic event holding the created/updated/removed object ?
Is it a good idea to transfer the whole aggregate root DTO with all its entities in each command/event, or is it better to have more granular commands/events, e.g. with only a single property?
How should the UI store be updated after a single Command/Action is executed?
The command methods from my Aggregates return void (respecting CQS); thus, the REST endpoints that receive the command requests respond only with something like OK, command is accepted. Then, it depends on how the command is processed inside the backend server:
if the command is processed synchronously then a simple OK, command is accepted is sufficient as the UI will refresh itself and the new data will be there;
if the command is processed asynchronously, then things get more complicated and some kind of command ID should be returned, so a response like OK, command is accepted and it has the ID 1234-abcd-5678-efgh; please check later at this URI for command completion status.
At the same time, you could listen to the domain events. I do this using Server-sent events that are sent from the backend to the UI; this is useful if the UI is web based, as there could be more than one browser window open and the data will be updated in the background for all pages; that's nice, the client is pleased.
About including some data from the read side in the command response: this depends on your specific case. I avoid it because it implies reading when writing, which means I can't separate the write from the read at a higher level; I like to be able to scale the write and read parts independently. So a response.ok is the cleanest solution. It also implies that the command/write endpoint makes query assumptions about the caller: why should a command handler/command endpoint assume what data the caller needs? But there could be exceptions, for example if you want to reduce the number of requests, or if you use an API gateway that also does a READ after the command is sent to the backend server.
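For the asynchronous case, the endpoint might look roughly like this (a sketch; the route, command type, and _commandQueue stand-in are all hypothetical):

[HttpPost("applications/commands")]
public IActionResult SubmitCommand([FromBody] SubmitApplicationCommand command)
{
    var commandId = Guid.NewGuid();
    _commandQueue.Enqueue(commandId, command); // processed later, in the background
    // 202 Accepted, plus a status URI the client can poll for completion.
    return Accepted($"/applications/commands/{commandId}/status", new { commandId });
}

Here _commandQueue stands in for whatever mechanism hands the command to the background processor.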
Is it a good idea to transfer the whole aggregate root DTO with all its entities in each command/event, or is it better to have more granular commands/events, e.g. with only a single property?
I never send the whole Aggregate when using CQRS; you have the read-models, so each Aggregate has a different representation on each read-model. So, you should create a read-model for each UI component; in this way you keep and send only the data that is displayed on the UI and not some god-like object that contains anything that anybody would need to display anywhere.
Commands basically fall into one of two categories: creation commands and the rest.
Creation commands
With creation commands, you often want to get back a handle to the thing you just created, otherwise you're left in the dark with no place to go to further manipulate it.
I believe that creation commands in CQS and CQRS can return an identifier or location of some sort: see my answer here. The identifier will probably be known by the command handler, which can return it in its response. This maps well to 201 Created + Location header in REST.
You can also have the client generate the ID. In that case, see below.
All other commands
The client obviously has the address of the object. It can simply requery its location after it gets an OK from the HTTP part. Optionally, you could poll the location until something indicates that the command was successful. That could be a resource version ID, a status (as Constantin pointed out), an Atom feed, etc.
Also note that it might be simpler for the Command Handler to return the success status of the operation, it's debatable whether that really violates CQS or not (again, see answer above).
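A sketch of the creation case (route, command type, and _handler are hypothetical): the handler returns just the new identifier, which the endpoint maps to 201 Created + Location:

[HttpPost("applications")]
public IActionResult Create([FromBody] CreateApplicationCommand command)
{
    Guid id = _handler.Handle(command); // returns only the identifier of the new aggregate
    return Created($"/applications/{id}", new { id });
}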
Is it a good idea to transfer the whole aggregate root DTO with all its entities in each command/event, or is it better to have more granular commands/events, e.g. with only a single property?
Indeed it is better to have granular commands and events.
Commands and events should be immutable, expressive objects that clearly express an intent or a past business event. This works best if the objects contain exactly the data that is about to change or was changed.
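For example (hypothetical names), granular looks like this in practice: immutable records that carry exactly the data being changed, rather than a whole aggregate DTO.

public sealed record ChangeApplicantEmail(Guid ApplicationId, string NewEmail);
public sealed record ApplicantEmailChanged(Guid ApplicationId, string NewEmail, DateTime ChangedAtUtc);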
Let's imagine I have a hot observable that is a source of weather events. This source is a socket connection to a remote server that provides information about the weather in my current location.
That remote server can also send me events about other related topics such as traffic, extreme weather warnings, etc., if I send a command to indicate this desire.
How can I model this with Reactive Extensions without creating coupling between observable and observer?
The idea is that when I subscribe ExtremeWeatherWarnings (observer) to my WeatherServerConnection (observable), the former somehow issues some commands to the latter to enable the subscription.
I have tried implementing a special interface on the observer that tells the observable which commands it needs to execute on subscription and unsubscription, but it breaks the moment I put a Where in the middle, because the Rx LINQ operators wrap the observable and the wrapper does not implement my interface.
I could also require an instance of the WeatherServerConnection object in the ExtremeWeatherWarnings constructor, but that would create coupling, which I would like to avoid.
Cheers.
If your observable is designed to send generic messages, and your observer is designed to translate them, you also need a way of indicating to the producer what kinds of messages you're interested in.
One way to do this is "requesting an observable".
ObservableWeatherSource GetNotifications(WeatherWarnings warningTypes, string location);
Another way might be to lazily indicate what notifications you're interested in.
ObservableWeatherSource source = GetWeatherSource();
source
.Where(x => x.WeatherWarningType == WeatherWarnings.Rain)
.Subscribe(new WeatherObserver());
source.ExpressInterestIn(WeatherWarnings.Rain, "San Francisco");
Or, possibly, you might be interested in writing a specialized query language for weather. You could probably do this through the IQbservable and a query provider, but I have little knowledge of this area of Rx.
It all depends on how you think of your observables.
If it's a stream of events independent of the observers, you can just expose the observable to anyone who wants to subscribe.
That was the approach I followed on my reactive GeoCoordinateWatcher. The wrapped GeoCoordinateWatcher class will generate events independent of the subscription.
For the reactive Geolocator I chose to follow the same approach. But because the Geolocator needs parameterization to produce events, I could have chosen to implement an observable factory.
The bottom line is (as @Christopher said) that you send commands to something that exposes an observable, not to the observable itself.
You might do something like sending commands to the observable by applying Rx operators. If those filters are to be applied remotely, you can implement (as @Christopher said) an IQbservable instead.
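One concrete way to keep the command exchange out of the observer (a sketch; the connection, WeatherEvent, and WeatherEventKind members are hypothetical, and it assumes the System.Reactive package): wrap the connection in Observable.Create, so that subscribing sends the enabling command and disposing sends the disabling one.

using System;
using System.Reactive.Disposables;
using System.Reactive.Linq;

IObservable<WeatherEvent> warnings = Observable.Create<WeatherEvent>(observer =>
{
    // Subscribing is what enables the server-side feed.
    connection.SendCommand("enable-extreme-weather-warnings");
    IDisposable subscription = connection.Events
        .Where(e => e.Kind == WeatherEventKind.ExtremeWarning)
        .Subscribe(observer);
    return Disposable.Create(() =>
    {
        // Disposing tears the feed down again.
        subscription.Dispose();
        connection.SendCommand("disable-extreme-weather-warnings");
    });
});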
After reading through the pub/sub project sample in MassTransit, it left me scratching my head.
In the sample, the client application publishes a request for the subscriber application to update the password of a fictitious user. This sample code works fine, and it's easy to follow the bouncing ball of this project.
HOWEVER--
In a real-world environment, the purpose of pub/sub (in my understanding) is to have a small number of publishers interacting with a large number of subscribers. In the case of a subscriber performing any sort of CRUD operation, shouldn't the communication pattern prevent more than one subscriber from handling the message? It would be less than desirable to have twenty subscribers attempt to update the same database record, for instance.
Is this just a case of a misguided sample project?
If pub/sub can be used for CRUD operations, how do you configure the framework to only allow one subscriber to perform an operation?
Am I just completely missing some basic info on the purpose of pub/sub?
Thanks for any clarification provided...
David
The scenario you refer to is usually referred to as 'competing consumers', and is quite typical of pub/sub.
If each consumer has its own unique queue name, each consumer will receive its own copy of each message.
Alternatively, to get competing-consumer behaviour, have the consumers share the same queue name; the consumers then compete for each message (so each message will only be received once).
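In MassTransit terms, a sketch (the queue and consumer names are hypothetical): every instance that configures a receive endpoint with the same queue name joins the competing-consumer group, so each message goes to only one of them:

var busControl = Bus.Factory.CreateUsingRabbitMq(cfg =>
{
    cfg.ReceiveEndpoint("password-updates", e =>
    {
        // All instances binding "password-updates" compete for its messages.
        e.Consumer<UpdatePasswordConsumer>();
    });
});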
You can have n-to-n, many-to-few, or few-to-many publishers to subscribers in any pub/sub system. It's really a matter of how many actors you want responding to a given message.
The sample project might not be the best, but we feel it shows what's going on. In real-world cases, though, it can be used for CRUD-type behaviours; however, it's more along the lines of many front ends sending "load data" type messages to middleware (cache) requesting a response with the same data. If that data gets updated on the front end somehow, it must publish some message indicating that, and multiple middleware pieces need to update (cache, backend store, etc.) [see CQRS].
Messaging in general is more about working with disconnected systems. Your specific concern is more about the structure of consumers and publishers. I've seen implementations of MassTransit where most of the routes were static and it wasn't really pub/sub at all, but just a lot of sends along a known topography of systems. For really understanding the concepts, the best book I know of is Enterprise Service Bus: Theory in Practice.
I hope this helps!
Edit: Also see our documentation, some of the concepts are touched on there.