I am wondering whether there is an established pattern to control the flow my application will have.
Simply put, it's supposed to work like this:
User provides a file
File is being processed
User receives a processed file
There will be several processing steps, let's say
PreprocessingOne, PreprocessingTwo, PreprocessingThree and FinalProcessing.
Naturally, we do not control the files that the user provides - they will require different numbers of preprocessing steps.
Since my message handler services will be in separate APIs, I don't want to invoke them just to have them return 'Cannot process yet' or 'Does not require processing', for performance reasons.
Similarly, I don't want to pass the uploaded file around between services.
Ideally, I would like to design the flow for a file dynamically by evaluating the content and inserting only those of the message handlers that make sense.
I am saying 'Inverted' pipeline because, instead of going from A to Z, I would rather check which stages I need starting from Z and insert only the last ones.
So, if the uploaded file qualifies for FinalProcessing right away, the flow would be just one element.
If the file needs to start from PreprocessingTwo, then the flow would be PreprocessingTwo > PreprocessingThree > FinalProcessing.
So, I was thinking I could implement something like this, but I am not sure about the details.
public interface IMessageHandler
{
void Process(IFile file);
}
public interface IContentEvaluator
{
IList<IMessageHandler> PrepareWorkflow(IFile file);
}
public interface IPipelineExecutor
{
void ExecuteWorkflow(IList<IMessageHandler> workflow, IFile file);
}
And then in the application
public void Start(IFile newFile)
{
var contentEvaluator = new ContentEvaluator(this.availableHandlers); // would be DI
var workflow = contentEvaluator.PrepareWorkflow(newFile);
this.executor.ExecuteWorkflow(workflow, newFile);
}
Could you please advise, recommend some approach or further read?
You can consider using the Strategy pattern: ...selects an algorithm at runtime...
But if you have too many combinations of flows, then the number of strategies that need to be implemented will grow and the solution can become complex.
Another approach can be to use SEDA: ...decomposes a complex, event-driven application into a set of stages connected by queues...
PreprocessingOne, PreprocessingTwo, PreprocessingThree and FinalProcessing are the stages, and flows can be defined by directing outgoing messages to different queues.
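As a rough, in-process sketch of the SEDA idea: each stage owns a queue and forwards the file to the next stage's queue, so the flow is determined by which queue a file is dropped into first. The IFile interface is borrowed from the question; BlockingCollection stands in for whatever real message queues you would use, so treat all of this as an assumption about your infrastructure.

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public interface IFile { string Name { get; } }

// One stage = one queue plus a worker loop.
public class Stage
{
    public BlockingCollection<IFile> Inbox { get; } = new BlockingCollection<IFile>();
    private readonly Action<IFile> work;
    private readonly Stage next; // null for FinalProcessing

    public Stage(Action<IFile> work, Stage next = null)
    {
        this.work = work;
        this.next = next;
    }

    public Task Run() => Task.Run(() =>
    {
        foreach (var file in Inbox.GetConsumingEnumerable())
        {
            work(file);
            next?.Inbox.Add(file); // route the file onward to the next stage's queue
        }
        next?.Inbox.CompleteAdding();
    });
}

The content evaluator then only has to pick the entry queue: a file that qualifies for FinalProcessing goes straight into that stage's Inbox, while one that needs PreprocessingTwo is dropped into that queue and flows through PreprocessingThree and FinalProcessing automatically.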
Is that a decorator pattern?
Definition
Attach additional responsibilities to an object dynamically.
Decorators provide a flexible alternative to subclassing for extending
functionality.
Related
I want to use the TPL Dataflow for my .NET Core application and followed the example from the docs.
Instead of having all the logic in one file I would like to separate each TransformBlock and ActionBlock (I don't need the other ones yet) into their own files. A small TransformBlock example converting integers to strings
class IntToStringTransformer : TransformBlock<int, string>
{
public IntToStringTransformer() : base(number => number.ToString()) { }
}
and a small ActionBlock example writing strings to the console
class StringWriter : ActionBlock<string>
{
public StringWriter() : base(Console.WriteLine) { }
}
Unfortunately this won't work because the block classes are sealed. Is there a way I can organize those blocks into their own files?
Dataflow steps/blocks/goroutines are fundamentally functional in nature and best organized as modules of factory functions, not separate classes. A TPL Dataflow pipeline is quite similar to a pipeline of function calls in F#, or any other language. In fact, one could look at it as a PowerShell pipeline, except it's easier to write.
There's no need to create a class or implement an interface to add a new function to that pipeline, you just add it and redirect the output to the next function.
TPL Dataflow blocks provide the primitives to construct a pipeline already and only require a transformation function. That's why they are sealed, to prevent misuse.
The natural way to organize dataflows is similar to F# too - create libraries with the functions that perform each job, putting them in modules of related functions. Those functions are stateless, so they can easily go into a static library, just like extension methods.
For example, there could be one module for database related functions that perform bulk inserts or read data, another to handle exports to various file formats, separate classes to call external web services, another to parse specific message formats.
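To make that concrete for the question's IntToStringTransformer and StringWriter, here is a minimal sketch of the factory-function style; the class and method names are just illustrative.

using System;
using System.Threading.Tasks.Dataflow;

// A "module" of related block factories instead of subclasses of the sealed block types.
public static class ConsolePipelineBlocks
{
    public static TransformBlock<int, string> CreateIntToString() =>
        new TransformBlock<int, string>(number => number.ToString());

    public static ActionBlock<string> CreateConsoleWriter() =>
        new ActionBlock<string>(Console.WriteLine);
}

Composing the pipeline is then just linking the blocks:

var toText = ConsolePipelineBlocks.CreateIntToString();
var writer = ConsolePipelineBlocks.CreateConsoleWriter();
toText.LinkTo(writer, new DataflowLinkOptions { PropagateCompletion = true });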
A real Example
For the last 7 years I've been working with several complex pipelines for an Online Travel Agency (OTA). One of them calls several GDSs (the intermediaries between OTAs and airlines) to retrieve transaction information - ticket issues, refunds, cancellations etc. The next step retrieves the ticket records, i.e. the detailed ticket information. Finally, the records are inserted into the database.
GDSs are too big to bother with standards, so their "SOAP" web services aren't even SOAP-compliant, much less do they follow WS-* standards. So each GDS needs a separate class library to call the services and parse the outputs. No dataflows there yet, the project is already complex enough.
Writing the data to the database is pretty much the same always, so there's a separate project with methods that take eg an IEnumerable<T> and write it to the database with SqlBulkCopy.
It's not enough to load new data though; things often go wrong, so I need to be able to load already stored ticket information as well.
Organisation
To preserve sanity:
Each pipeline gets its own file:
A Daily pipeline to load new data,
A Reload pipeline to load all stored data
A "Rerun" pipeline to use the existing data and ask again for any missing data.
Static classes are used to hold the worker functions and, separately, factory methods that produce Dataflow blocks based on configuration. E.g., CreateLogger(path, level) creates an ActionBlock<Message> that logs specific messages.
Common dataflow extension methods - since Dataflow blocks follow the same basic patterns, it's easy to create a logged block by combining e.g. a Func<TIn,TOut> and a logger block, or to create a LinkTo overload that redirects bad records to a logger or database. Those are common enough to become extension methods.
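As a rough sketch of the CreateLogger-style factory mentioned above - the Message type and the level filtering here are assumptions, not the actual production code:

using System;
using System.IO;
using System.Threading.Tasks.Dataflow;

public record Message(int Level, string Text);

public static class LoggingBlocks
{
    // Produces an ActionBlock<Message> configured from parameters,
    // so each pipeline can get its own logger block.
    public static ActionBlock<Message> CreateLogger(string path, int minLevel) =>
        new ActionBlock<Message>(msg =>
        {
            if (msg.Level >= minLevel)
                File.AppendAllText(path, $"{DateTime.UtcNow:O} [{msg.Level}] {msg.Text}{Environment.NewLine}");
        });
}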
If those were in the same file, it would be very hard to edit one pipeline without affecting another. Besides, there's a lot more to a pipeline than the core tasks, eg:
Logging
Handling bad records and partial results (can't stop a 100K import for 10 errors)
Error handling (which isn't the same as handling bad records)
Monitoring - what has this monster been doing for the last 15 minutes? Did a DOP of 10 improve performance at all?
Don't create a parent pipeline class.
Some of the steps are common, so at first I created a parent class with common steps that got overloaded, or simply replaced, in child classes. VERY BAD IDEA. Each pipeline is similar but not quite the same, and inheritance means that modifying one step or one connection risks breaking everything. After about a year things became unbearable, so I split the parent class into separate classes.
As @Panagiotis explained, I think you have to put aside the OOP mindset a little.
What you have with Dataflow are building blocks that you configure to execute what you need. I'll try to create a little example of what I mean by that:
// Interface and impl. are in separate files. Actually, they could
// even be in a different project ...
public interface IMyComplicatedTransform
{
    Task<string> TransformFunction(int input);
}

public class MyComplicatedTransform : IMyComplicatedTransform
{
    public Task<string> TransformFunction(int input)
    {
        // some complex logic, e.g.:
        return Task.FromResult(input.ToString());
    }
}
class DataFlowUsingClass
{
    private readonly IMyComplicatedTransform myTransformer;
    private TransformBlock<int, string> myTransform; // not readonly: it is assigned in CreatePipeline
    // ... some more blocks ...

    public DataFlowUsingClass()
    {
        myTransformer = new MyComplicatedTransform(); // maybe use ctor injection?
        CreatePipeline();
    }

    private void CreatePipeline()
    {
        // create blocks
        myTransform = new TransformBlock<int, string>(myTransformer.TransformFunction);
        // ... init some more blocks
        // TODO link blocks
    }
}
I think this is the closest to what you are looking to do.
What you end up with is a set of interfaces and implementations which can be tested independently. The client basically boils down to "glue code".
Edit: As @Panagiotis correctly states, the interfaces are even superfluous. You could do without them.
I use Command Query Separation in my system.
To describe the problem, let's start with an example. Let's say we have code like this:
public class TenancyController : ControllerBase{
public async Task<ActionResult> CreateTenancy(CreateTenancyRto rto){
// 1. Run Blah1Command
// 2. Run Blah2Command
// 3. Run Bar1Query
// 4. Run Blah3Command
// 5. Run Bar2Query
// ...
// n. Run BlahNCommand
// n+1. Run BarNQuery
//example how to run a command in the system:
var command = new UploadTemplatePackageCommand
{
Comment = package.Comment,
Data = Request.Body,
TemplatePackageId = id
};
await _commandDispatcher.DispatchAsync(command);
return Ok();
}
}
The CreateTenancy has a very complex implementation and runs many different queries and commands.
Each command or query can be reused in other places of the system.
Each Command has a CommandHandler
Each Query has a QueryHandler
Example:
public class UploadTemplatePackageCommandHandler : PermissionedCommandHandler<UploadTemplatePackageCommand>
{
//ctor
protected override Task<IEnumerable<PermissionDemand>> GetPermissionDemandsAsync(UploadTemplatePackageCommand command) {
//return list of demands
}
protected override async Task HandleCommandAsync(UploadTemplatePackageCommand command)
{
//some business logic
}
}
Every time you try to run a command or query there is a permission check. The problem that appears in CreateTenancy is when you run, let's say, 10 commands.
There can be a case where you have permissions for all of the first 9 commands but are missing a permission needed to run the last one. In such a situation you may have made complex modifications to the system by running those 9 commands, and at the end you cannot finish the whole transaction because you cannot run the last command. In that case you need a complex rollback.
I believe that in the above example the permission check should be done only once, at the very beginning of the whole transaction, but I'm not sure of the best way to achieve this.
My first idea is to create a command, let's say CreateTenancyCommand, and place the whole logic from CreateTenancy(CreateTenancyRto rto) in its HandleCommandAsync.
So it would look like this:
public class CreateTenancyCommandHandler : PermissionedCommandHandler<CreateTenancyCommand>
{
    //ctor

    protected override Task<IEnumerable<PermissionDemand>> GetPermissionDemandsAsync(CreateTenancyCommand command)
    {
        //return list of demands
    }

    protected override async Task HandleCommandAsync(CreateTenancyCommand command)
    {
        // 1. Run Blah1Command
        // 2. Run Blah2Command
        // 3. Run Bar1Query
        // 4. Run Blah3Command
        // 5. Run Bar2Query
        // ...
        // n. Run BlahNCommand
        // n+1. Run BarNQuery
    }
}
Is it a good approach to invoke a command inside the command handler of another command? I'm not sure.
I think that each command handler should be independent.
Am I right that the permission check should happen only once?
If yes, how do you do the permission check when you want to run a command that modifies the database and then return some data to the client?
In such a case, you would need to do 2 permission checks...
There can be a theoretical case where you modify the database by running the command and then cannot run a query which only reads the database, because you are missing some of the permissions. It can be very hard for a developer to detect such a situation if the system is big and there are hundreds of different permissions; even good unit test coverage can miss it.
My second idea is to create some kind of wrapper or extra layer above the commands and queries and do the permission check there, but I'm not sure how to implement it.
What is the proper way to do the permission check for the CreateTenancy transaction described above, which is implemented in the controller action in the example?
A situation where you have some sort of process that requires multiple commands / service calls to carry it out is an ideal candidate for a DomainService.
A DomainService is by definition one which has some Domain Knowledge, and is used to facilitate a process which interacts with multiple Aggregates / services.
In this instance I would look to have your Controller Action call a CQRS Command/CommandHandler. That CommandHandler will take the domain service as a single dependency. The CommandHandler then has the single responsibility of calling the Domain Service method.
This then means your CreateTenancy process is contained in one place: the DomainService.
I typically have my CommandHandlers simply call into service methods. Therefore a DomainService can call into multiple services to perform its function, rather than calling into multiple CommandHandlers. I treat the Command Handlers as a facade through which my Controllers can access the Domain.
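A minimal sketch of that shape - all the type names here are illustrative rather than taken from your code:

using System.Threading.Tasks;

public class CreateTenancyCommand
{
    public string UserId { get; set; }
    // ... the rest of the data needed to create a tenancy
}

// The domain service owns the multi-step CreateTenancy process.
public interface ITenancyDomainService
{
    Task CreateTenancyAsync(CreateTenancyCommand command);
}

// The command handler is just a facade over the domain.
public class CreateTenancyCommandHandler
{
    private readonly ITenancyDomainService _tenancies;

    public CreateTenancyCommandHandler(ITenancyDomainService tenancies) => _tenancies = tenancies;

    public Task HandleAsync(CreateTenancyCommand command) => _tenancies.CreateTenancyAsync(command);
}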
When it comes to permissions, I typically first decide whether the user's authorisation to carry out a process is a Domain issue. If so, I will typically create an interface to describe the user's permissions, and I make that interface specific to the Bounded Context I am working in. So in this case you may have something like:
public interface ITenancyUserPermissions
{
bool CanCreateTenancy(string userId);
}
I would then have the ITenancyUserPermissions interface be a dependency in my CommandValidator:
public class CommandValidator : AbstractValidator<Command>
{
private ITenancyUserPermissions _permissions;
public CommandValidator(ITenancyUserPermissions permissions)
{
_permissions = permissions;
RuleFor(r => r).Must(HavePermissionToCreateTenancy).WithMessage("You do not have permission to create a tenancy.");
}
public bool HavePermissionToCreateTenancy(Command command)
{
return _permissions.CanCreateTenancy(command.UserId);
}
}
You said that the permission to create a Tenancy is dependent on the permission to perform the other tasks / commands. Those other commands would have their own set of Permission Interfaces. And then ultimately within your application you would have an implementation of these interfaces such as:
public class UserPermissions : ITenancyUserPermissions, IBlah1Permissions, IBlah2Permissions
{
    private readonly IAuthService _authService; // assumed to be injected via the constructor

    public bool CanCreateTenancy(string userId)
    {
        return CanBlah1(userId) && CanBlah2(userId);
    }

    public bool CanBlah1(string userId)
    {
        return _authService.Can("Blah1", userId);
    }

    public bool CanBlah2(string userId)
    {
        return _authService.Can("Blah2", userId);
    }
}
In my case I use an ABAC system, with the policy stored and processed as an XACML file.
Using the above method may mean you have slightly more code, and several permissions interfaces, but it does mean that any permissions you define are specific to the Bounded Context you are working within. I feel this is better than having a Domain-Model-wide IUserPermissions interface, which may define methods that are of no relevance, or confusing, in your Tenancy bounded context.
This means you can check user permissions in your QueryValidator or CommandValidator instances. And of course you can use the implementation of your IPermission interfaces at the UI level to control which buttons / functions etc are shown to the user.
There is no "The Proper Way", but I'd suggest that you could approach the solution from the following angle.
Usage of the word Controller in your names and returning Ok() tells me that you are handling an HTTP request. But what is happening inside is part of a business use case that has nothing to do with HTTP. So, you'd better go a bit Onion-ish and introduce a (business) application layer.
This way, your HTTP controller would be responsible for: 1) parsing the create tenancy HTTP request into a create tenancy business request, i.e. a request object model in terms of the domain language, free of any infrastructure terms; 2) formatting the business response into an HTTP response, including translating business errors into HTTP errors.
So, what enters the application layer is a business create tenancy request. But it's not a command yet. I can't remember the source, but someone once said that a command should be internal to a domain; it cannot come from outside. You may consider a command to be the comprehensive object model necessary to make a decision about whether to change the application's state. So, my suggestion is that in your business application layer you build the command not only from the business request, but also from the results of all those queries, including queries to the necessary permission read models.
Next, you may have a separate decision-making business core of a system that takes a command (a value object) with all the comprehensive data, applies a pure decision-making function and returns a decision, also a value object (event or rejection), containing, again, all necessary data calculated from the command.
Then, when your business application layer gets back a decision, it can execute it, writing to event stores or repositories, logging, firing events and ultimately producing a business response to the controller.
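As a small illustration of that decision-making core - the names and the single permission flag are assumptions made for the sketch:

// The command already carries everything the decision needs, including the
// results of the permission queries gathered by the application layer.
public record CreateTenancyCommand(string TenancyName, bool UserMayCreateTenancy);

public abstract record Decision;
public record TenancyCreated(string TenancyName) : Decision;
public record Rejected(string Reason) : Decision;

public static class TenancyDecisions
{
    // Pure function: no I/O, so every branch is trivially unit-testable.
    public static Decision Decide(CreateTenancyCommand command) =>
        command.UserMayCreateTenancy
            ? new TenancyCreated(command.TenancyName)
            : new Rejected("Missing permission to create a tenancy.");
}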
In most cases you'll be OK with this single-step decision-making process. If it needs more than a single step, maybe that's a hint to reconsider the business flow, because it is getting too complex to process in a single HTTP request.
This way you'll get all the permissions before handling a command, so your business core will be able to decide whether those permissions are sufficient to proceed. It also makes the decision-making logic much more testable and, therefore, reliable, because it is the main part that should be tested for every branch of the calculation flow.
Keep in mind that this approach leans toward eventual consistency, which you have anyway in a distributed system. That said, if you are interacting with a single database, you may run the application-layer code in a single transaction. I suppose, though, that you deal with eventual consistency anyway.
Hope this helps.
I am writing a piece of software in C# .NET 4.0 and am running into a wall in making sure that the code base is extensible, reusable and flexible in a particular area.
We have data coming into it that needs to be broken down in discrete organizational units. These units will need to be changed, sorted, deleted, and added to as the company grows.
No matter how we slice the data structure, we keep running into a boatload of conditional statements (upwards of 100 or so to start) that we are trying to avoid, so that we can modify the OUs easily.
We are hoping to find an object-oriented method that would allow us to route the object to different workflows based on properties of that object without having to add switch statements every time.
So, for example, let's say I have an object called "Order" come into the system. This object has 'orderItems' inside of it. Each of those different kinds of 'orderItems' would need to fire a different function in the code to be handled appropriately. Each 'orderItem' has a different workflow. The conditional looks basically like this -
if (order.OrderItem == "photo")
{ /* do this */ }
else if (order.OrderItem == "canvas")
{ /* do this */ }
edit: Trying to clarify.
I'm not sure your question is very well defined, you need a lot more specifics here - a sample piece of data, sample piece of code, what have you tried...
No matter how we slice the data structure we keep running into a boat-load of conditional statements (upwards of 100 or so to start) that we are trying to avoid
This usually means you're trying to encode data in your code - just add a data field (or a few).
Chances are your ifs are linked to each other; it's hard to come up with 100 independent ifs - that would imply you have 100 independent branches for 100 independent data conditions. I haven't encountered anything in my career that really required hard-coding 100 ifs.
Worst-case scenario, you can make an additional data field contain a config file or even a script of your choice. Either way, your data is incomplete if you need 100 ifs.
With the update you've put in your question, here's one simple approach, kind of low-tech. You can do better with dependency injection and some configuration, but that can get excessive too, so be careful:
public class OrderHandler
{
    public static Dictionary<string, OrderHandler> Handlers = new Dictionary<string, OrderHandler>()
    {
        { "photo", new PhotoHandler() },
        { "canvas", new CanvasHandler() },
    };

    public virtual void Handle(Order order)
    {
        var handler = Handlers[order.OrderType];
        handler.Handle(order);
    }
}
public class PhotoHandler: OrderHandler{...}
public class CanvasHandler: OrderHandler{...}
What you could do is called "Message-Based Routing" or "Message Content-Based Routing", depending on how you implement it.
In short, instead of using conditional statements in your business logic, you should implement the organizational units so that they look for the messages they are interested in.
For example:
Say your organization has the following departments - "Plant Products", "Paper Products", "Utilities". Say there is only one place where the orders come in - the Ordering module.
Here is a sample incoming message:
Party:"ABC Cop"
Department: "Plant Product"
Qty: 50
Product: "Some plan"
Publish a message with this information. In the module that processes orders for "Plant Products", configure it so that it listens to messages that have "Department = Plant Products". This way, you push the onus onto the department modules instead of the main ordering module.
You can do this using NServiceBus, BizTalk, or any other ESB you might already have.
This is how you do it in BizTalk and this is how you can do it in NServiceBus.
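A minimal in-process illustration of the idea - not tied to NServiceBus or BizTalk, and all the names are hypothetical:

using System;
using System.Collections.Generic;

public record OrderMessage(string Party, string Department, int Qty, string Product);

public class SimpleBus
{
    private readonly List<(Func<OrderMessage, bool>, Action<OrderMessage>)> subscribers = new();

    public void Subscribe(Func<OrderMessage, bool> interestedIn, Action<OrderMessage> handle)
        => subscribers.Add((interestedIn, handle));

    public void Publish(OrderMessage message)
    {
        // Routing lives here; department modules never see messages they didn't subscribe to.
        foreach (var (interestedIn, handle) in subscribers)
            if (interestedIn(message))
                handle(message);
    }
}

public static class Demo
{
    public static void Main()
    {
        var bus = new SimpleBus();
        bus.Subscribe(m => m.Department == "Plant Products",
                      m => Console.WriteLine($"Plant Products module handles: {m.Product}"));
        bus.Publish(new OrderMessage("ABC Corp", "Plant Products", 50, "Some plant"));
    }
}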
Have you considered sub-typing OrderItem?
public class PhotoOrderItem : OrderItem {}
public class CanvasOrderItem : OrderItem {}
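One way to make the sub-typing pay off is to give the base type a workflow entry point, so the dispatch becomes a virtual call instead of a switch - a rough sketch, with a made-up Process method:

public abstract class OrderItem
{
    public abstract void Process();
}

public class PhotoOrderItem : OrderItem
{
    public override void Process() { /* photo-specific workflow */ }
}

public class CanvasOrderItem : OrderItem
{
    public override void Process() { /* canvas-specific workflow */ }
}

Handling an order then becomes: foreach (var item in order.OrderItems) item.Process();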
Another option would be to use the Strategy pattern. Add an extra property to your OrderItem class definition for the OrderProcessStrategy and use a PhotoOrderStrategy/CanvasOrderStrategy to contain all of the different logic.
public class OrderItem
{
    public IOrderItemStrategy Strategy { get; set; }
}

public interface IOrderItemStrategy
{
    void Checkout();
    Control CheckoutStub { get; }
    bool PreCheckoutValidate();
}

public class PhotoOrderStrategy : IOrderItemStrategy { }
public class CanvasOrderStrategy : IOrderItemStrategy { }
Taking the specific example:
You could have some Evaluator that takes an order and iterates over each line item. Instead of processing the if logic, raise events that carry the photo or canvas details in their event arguments.
Have a collection of 'Initiator' objects that define: 1) a handler that can process Evaluator messages, 2) a simple bool that can be set to indicate whether they know what to do with something in the message, and 3) an Action or Process method which can perform or initiate the workflow. Design an interface to abstract these (see the sketch below).
Issue the messages. Visit each Initiator, ask it if it can process the line item and, if it can, tell it to do so. The processing is kicked off by the 'Initiators' and they can call other workflows etc.
Name the pieces outlined above whatever best suits your domain. This should offer some flexibility. Problems may arise depending on concurrent processing requirements and workflow dependencies between the Initiators.
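A sketch of what those pieces might look like - all the names are placeholders:

using System;

// Carries the line-item details instead of branching on them in the evaluator.
public class OrderItemEventArgs : EventArgs
{
    public string ItemType { get; set; }   // e.g. "photo", "canvas"
    public object Details { get; set; }
}

public interface IInitiator
{
    // 2) can this initiator do anything with the item?
    bool CanProcess(OrderItemEventArgs item);

    // 3) perform or kick off the workflow
    void Process(OrderItemEventArgs item);
}

The evaluator then visits each initiator:

foreach (var initiator in initiators)
    if (initiator.CanProcess(item))
        initiator.Process(item);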
In general, without knowing a lot more detail, size of the project, workflows, use cases etc it is hard to comment.
In my application I want to provide the user with a small undo feature. There aren't many actions that can be undone by the user. Specifically, the actions are:
Add notes to an object
Color an object
Tag an object with a string
Now I have thought about how to implement this. I first thought of an Action class that is the abstract base class for the 3 different actions the user can take. Every time the user takes one of these actions, a new instance of the appropriate subclass of this abstract Action class is created and inserted into a list that contains all actions.
Whenever the user wants to undo something, the list is displayed and he can choose which action he wants to undo.
Now I was thinking what has to be stored in such an action object:
the state of the object before the action
the actual action that was taken (e.g. the string that was added to an object's notes)
I'm not sure if this is enough. I also thought about something like a chronological ordering, but this shouldn't be necessary since the list can be maintained in chronologically correct order anyway.
Are there any other things I should consider?
Undo/redo is commonly implemented with the Command Pattern. The Action class can be used as the basis for this, but you need a 'do' action and an 'undo' action within each command. Here is an example of this in practice.
You should probably store the executed commands in a stack, as it makes the implementation much easier and it is much easier for the user to follow.
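A bare-bones sketch of how such a command could look - ColoredObject and ColorCommand are hypothetical stand-ins modelled on your 'color an object' action:

using System.Drawing;

// Minimal stand-in for whatever object gets colored in your app.
public class ColoredObject
{
    public Color Color { get; set; }
}

public interface IUndoableCommand
{
    void Do();
    void Undo();
}

public class ColorCommand : IUndoableCommand
{
    private readonly ColoredObject target;
    private readonly Color newColor;
    private Color previousColor;

    public ColorCommand(ColoredObject target, Color newColor)
    {
        this.target = target;
        this.newColor = newColor;
    }

    public void Do()
    {
        previousColor = target.Color;   // capture the state needed for undo
        target.Color = newColor;
    }

    public void Undo() => target.Color = previousColor;
}

The application pushes each executed command onto a Stack<IUndoableCommand> and calls Undo() on whatever it pops.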
You could do something simple like this:
Stack<Action> undoStack = new Stack<Action>();
void ChangeColor(Color color)
{
var original = this.Object.Color;
undoStack.Push(() => this.Object.Color = original);
this.Object.Color = color;
}
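Undoing with that approach is then just popping and invoking the saved delegate (assuming you guard against an empty stack):

void Undo()
{
    if (undoStack.Count > 0)
        undoStack.Pop()();   // run the saved restore action
}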
You should implement the Command pattern for every action you want to undo:
how to implement undo/redo operation without major changes in program
A correct and proven implementation of undo functionality is the Command pattern.
It's hard to overlook this Simple-Undo-redo-library-for-Csharp-NET when adding undo/redo functionality to existing projects.
I'm wondering how others deal with trying to centralize MessageBox function calls. Instead of having long text embedded all over the place in code, in the past (in a non-.NET language) I would put system and application base "messagebox"-type messages into a database file which would be "burned" into the executable, much like a resource file in .NET. When a prompting condition arose, I would just call something like
MBAnswer = MyApplication.CallMsgBox( IDUserCantDoThat )
then check MBAnswer upon return, such as a yes/no/cancel or whatever.
In the database table, I would have things like what the messagebox title would be, the buttons that would be shown, the actual message, and a special flag that automatically tacked on a subsequent standard comment like "Please contact the help desk if this happens.". The function would call the messagebox with all applicable settings and just return the answer. The big benefits of this were having one location for all the "context" of the messages and, via constants, it being easier to read which message was going to be presented to the user.
Does anyone have a similar system in .NET for a similar approach, or is this just a bad idea in the .NET environment?
We used to handle centralized messages with Modules (VB). We had one module with all the messages and we called it in our code. This was done so that we could change a message in one place (due to business needs) and have it reflected everywhere, and it was also easier to handle the change in one file instead of changing the message in multiple files. We also opened up that file to business analysts (via VSS) so that they could change it. I don't think it is a bad idea if it involves modules or a static class, but it might be overkill to fetch it from a DB.
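In C#, the equivalent of that VB module would just be a static class of constants or readonly strings, for example (the names here are made up):

public static class UserMessages
{
    public const string UserCantDoThat =
        "You cannot perform that action. Please contact the help desk if this happens.";

    public const string ConfirmDelete =
        "Are you sure you want to delete this item?";
}

// Usage:
// MessageBox.Show(UserMessages.UserCantDoThat, "My Application", MessageBoxButtons.OK);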
HTH
You could use resource files and move all the text there (it doubles as a localization feature as well). ReSharper 5.0 really helps with that by highlighting text that can be moved to a resource.
Usually it looks like this:
Before: MessageBox.Show(error.ToString(), "Error with extraction");
Suggestion: Localizable string "Error with extraction"
Right click and choose Move to Resource
Choose the resource file and name (MainForm_ExtractArchive_Error_with_extraction), and also check the checkbox Find identical items in class ...
Call it like this: MessageBox.Show(error.ToString(), Resources.MainForm_ExtractArchive_Error_with_extraction);
Best of all, it makes it easy to translate things to other languages, as well as keeping the text for the MessageBox in a separate resource. Of course, ReSharper does it all for you, so there's no need to type that much :-)
I suppose you could use a Hashtable to do something similar to this; it can be found in:
using System.Collections;
To keep it globally accessible, I was thinking of a couple of functions in a class holding the Hashtable to get/set a certain message.
Let's see now.
public class MessageBoxStore
{
    private Hashtable stock = new Hashtable();

    public string Get(string msg)
    {
        if (stock.ContainsKey(msg))
            return (string)stock[msg];
        else
            return string.Empty;
    }

    public void Set(string msg, string msgcontent)
    {
        stock[msg] = msgcontent;
    }
}
or something like that. You could keep multiple different pieces of information in the Hashtable and subsequently compose the messagebox in the function too, instead of just returning the string for the messagebox contents...
But to use this would be quite simple.
Call a function like this on program load:
public void LoadErrorMessages()
{
    storeClass = new MessageBoxStore();
    storeClass.Set("UserCantDoThat", "Invalid action. Please confirm your action and try again");
}
for example, and then:
MessageBox.Show(storeClass.Get("UserCantDoThat"));
I put this in a new class instead of using the Hashtable get/set methods directly because it leaves room for customization, so the messagebox could be created in the Get, and more than one piece of information could be stored in the Set to handle the messagebox title, button type, content, etc.