I am writing a piece of software in C# on .NET 4.0 and am running into a wall in making sure that the code base is extensible, reusable and flexible in a particular area.
We have data coming into it that needs to be broken down in discrete organizational units. These units will need to be changed, sorted, deleted, and added to as the company grows.
No matter how we slice the data structure we keep running into a boat-load of conditional statements (upwards of 100 or so to start) that we are trying to avoid, allowing us to modify the OUs easily.
We are hoping to find an object-oriented method that would allow us to route the object to different workflows based on properties of that object without having to add switch statements every time.
So, for example, let's say I have an object called "Order" come into the system. This object has 'orderItems' inside of it. Each of those different kinds of 'orderItems' would need to fire a different function in the code to be handled appropriately. Each 'orderItem' has a different workflow. The conditional looks basically like this -
if (order.orderitem == "photo")
{ /* do this */ }
else if (order.orderitem == "canvas")
{ /* do that */ }
edit: Trying to clarify.
I'm not sure your question is very well defined, you need a lot more specifics here - a sample piece of data, sample piece of code, what have you tried...
No matter how we slice the data structure we keep running into a boat-load of conditional statements (upwards of 100 or so to start) that we are trying to avoid
This usually means you're trying to encode data in your code - just add a data field (or a few).
Chances are your ifs are linked to each other, it's hard to come up with 100 independent ifs - that would imply you have 100 independent branches for 100 independent data conditions. I haven't encountered such a thing in my career that really would require hard-coding 100 ifs.
Worst case scenario, you can make an additional data field contain a config file or even a script of your choice. In either case, your data is incomplete if you need 100 ifs.
With the update you've put in your question here's one simple approach, kind of low tech. You can do better with dependency injection and some configuration but that can get excessive too, so be careful:
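public class OrderHandler{
    // Maps an order type to the handler responsible for it.
    public static readonly Dictionary<string,OrderHandler> Handlers = new Dictionary<string,OrderHandler>(){
        {"photo", new PhotoHandler()},
        {"canvas", new CanvasHandler()},
    };
    // Entry point: looks up the handler for the order type and delegates to it.
    // Subclasses must override Handle, otherwise this call recurses forever.
    public virtual void Handle(Order order){
        var handler = Handlers[order.OrderType];
        handler.Handle(order);
    }
}
public class PhotoHandler: OrderHandler{...}
public class CanvasHandler: OrderHandler{...}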
What you could do is called - "Message Based Routing" or "Message Content Based" Routing - depending on how you implement it.
In short, instead of using conditional statements in your business logic, you should implement organizational units to look for the messages they are interested in.
For example:
Say your organization has following departments - "Plant Products", "Paper Products", "Utilities". Say there is only one place where the orders come in - Ordering (module).
here is a sample incoming message.
Party:"ABC Cop"
Department: "Plant Products"
Qty: 50
Product: "Some plan"
Publish out a message with this information. In the module that processes orders for "Plant Products" configure it such that it listens to a message that has "Department = Plant Products". This way, you push the onus on the department modules instead of on the main ordering module.
You can do this using NServiceBus, BizTalk, or any other ESB you might already have.
This is how you do it in BizTalk, and this is how you can do it in NServiceBus.
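To make the idea concrete without an ESB, here is a minimal in-process sketch. It is plain C#, not the NServiceBus or BizTalk API; `OrderMessage` and `MessageBus` are hypothetical names invented for the illustration. Each department module registers the filter describing the messages it cares about, so the ordering module contains no conditionals at all.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message shape, mirroring the sample message above.
public sealed class OrderMessage
{
    public string Party { get; set; }
    public string Department { get; set; }
    public int Qty { get; set; }
    public string Product { get; set; }
}

// A tiny in-process stand-in for a bus: each subscriber supplies the filter
// describing the messages it is interested in.
public sealed class MessageBus
{
    private readonly List<KeyValuePair<Predicate<OrderMessage>, Action<OrderMessage>>> subscriptions =
        new List<KeyValuePair<Predicate<OrderMessage>, Action<OrderMessage>>>();

    public void Subscribe(Predicate<OrderMessage> filter, Action<OrderMessage> handler)
    {
        subscriptions.Add(
            new KeyValuePair<Predicate<OrderMessage>, Action<OrderMessage>>(filter, handler));
    }

    // Delivers the message to every subscriber whose filter matches and
    // returns how many handlers received it.
    public int Publish(OrderMessage message)
    {
        int delivered = 0;
        foreach (var subscription in subscriptions)
        {
            if (subscription.Key(message))
            {
                subscription.Value(message);
                delivered++;
            }
        }
        return delivered;
    }
}
```

A "Plant Products" module would call `bus.Subscribe(m => m.Department == "Plant Products", HandlePlantOrder)` once at startup; adding a department then means adding a subscriber, not editing the ordering module.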
Have you considered sub-typing OrderItem?
public class PhotoOrderItem : OrderItem {}
public class CanvasOrderItem : OrderItem {}
Another option would be to use the Strategy pattern. Add an extra property to your OrderItem class definition for the OrderProcessStrategy and use a PhotoOrderStrategy/CanvasOrderStrategy to contain all of the different logic.
public class OrderItem{
    public IOrderItemStrategy Strategy;
}
// Interface members are implicitly public; explicit modifiers do not compile on .NET 4.0.
public interface IOrderItemStrategy{
    void Checkout();
    Control CheckoutStub{get;}
    bool PreCheckoutValidate();
}
public class PhotoOrderStrategy : IOrderItemStrategy{...}
public class CanvasOrderStrategy : IOrderItemStrategy{...}
Taking the specific example:
You could have some Evaluator that takes an order and iterates over each line item. Instead of processing if logic, it raises events that carry the photo or canvas details in their event arguments.
Have a collection of 'Initiator' objects that define: 1) a handler that can process Evaluator messages, 2) a simple bool that can be set to indicate whether they know what to do with something in the message, and 3) an Action or Process method which can perform or initiate the workflow. Design an interface to abstract these.
Issue the messages. Visit each Initiator and ask it whether it can process the line item; if it can, tell it to do so. The processing is kicked off by the Initiators, and they can call other workflows etc.
Name the pieces outlined above whatever best suits your domain. This should offer some flexibility. Problems may arise depending on concurrent processing requirements and workflow dependencies between the Initiators.
In general, without knowing a lot more detail, size of the project, workflows, use cases etc it is hard to comment.
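A rough sketch of the Evaluator/Initiator idea follows. All type and member names (`LineItem`, `IInitiator`, `KindInitiator`, `Evaluator`) are made up for illustration, and it uses direct method calls in place of events to stay short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical line item; only the Kind property matters for routing here.
public sealed class LineItem
{
    public string Kind { get; set; }
}

// The abstraction described above: can you handle this item, and if so, do it.
public interface IInitiator
{
    bool CanProcess(LineItem item);
    void Process(LineItem item);
}

// Example initiator: starts a (hypothetical) workflow for one kind of item.
public sealed class KindInitiator : IInitiator
{
    private readonly string kind;
    private readonly Action<LineItem> workflow;

    public KindInitiator(string kind, Action<LineItem> workflow)
    {
        this.kind = kind;
        this.workflow = workflow;
    }

    public bool CanProcess(LineItem item) { return item.Kind == kind; }
    public void Process(LineItem item) { workflow(item); }
}

public sealed class Evaluator
{
    private readonly IList<IInitiator> initiators;

    public Evaluator(IList<IInitiator> initiators) { this.initiators = initiators; }

    // Visits every initiator for every item; returns the items nobody claimed.
    public IList<LineItem> Evaluate(IEnumerable<LineItem> items)
    {
        var unhandled = new List<LineItem>();
        foreach (var item in items)
        {
            bool handled = false;
            foreach (var initiator in initiators)
            {
                if (initiator.CanProcess(item))
                {
                    initiator.Process(item);
                    handled = true;
                }
            }
            if (!handled) unhandled.Add(item);
        }
        return unhandled;
    }
}
```

Returning the unclaimed items is one possible design choice; it makes a missing workflow visible instead of silently dropping the line item.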
I want to use the TPL Dataflow for my .NET Core application and followed the example from the docs.
Instead of having all the logic in one file I would like to separate each TransformBlock and ActionBlock (I don't need the other ones yet) into their own files. A small TransformBlock example converting integers to strings
class IntToStringTransformer : TransformBlock<int, string>
{
public IntToStringTransformer() : base(number => number.ToString()) { }
}
and a small ActionBlock example writing strings to the console
class StringWriter : ActionBlock<string>
{
public StringWriter() : base(Console.WriteLine) { }
}
Unfortunately this won't work because the block classes are sealed. Is there a way I can organize those blocks into their own files?
Dataflow steps/blocks/goroutines are fundamentally functional in nature and best organized as modules of factory functions, not separate classes. A TPL DataFlow pipeline is quite similar to a pipeline of function calls in F#, or any other language. In fact, one could look at it as a PowerShell pipeline, except it's easier to write.
There's no need to create a class or implement an interface to add a new function to that pipeline, you just add it and redirect the output to the next function.
TPL Dataflow blocks provide the primitives to construct a pipeline already and only require a transformation function. That's why they are sealed, to prevent misuse.
The natural way to organize dataflows is similar to F# too - create libraries with the functions that perform each job, putting them in modules of related functions. Those functions are stateless, so they can easily go into a static library, just like extension methods.
For example, there could be one module for database related functions that perform bulk inserts or read data, another to handle exports to various file formats, separate classes to call external web services, another to parse specific message formats.
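As a sketch of that organization applied to the question's two blocks: the worker function lives in one static class, the block factories in another, and the pipeline file only wires them together. The class names (`Conversions`, `Blocks`, `DailyPipeline`) are hypothetical, the collector replaces the console writer so the result can be inspected, and the code assumes the System.Threading.Tasks.Dataflow package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Stateless worker functions, grouped like a module.
public static class Conversions
{
    public static string IntToString(int number) { return number.ToString(); }
}

// Factory methods that wrap worker functions in configured blocks.
public static class Blocks
{
    public static TransformBlock<int, string> CreateIntToString(int dop = 1)
    {
        return new TransformBlock<int, string>(
            n => Conversions.IntToString(n),
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = dop });
    }

    // Default options process one message at a time, so a plain list is safe here.
    public static ActionBlock<string> CreateCollector(ICollection<string> sink)
    {
        return new ActionBlock<string>(s => sink.Add(s));
    }
}

// One pipeline per file: only the wiring lives here.
public static class DailyPipeline
{
    public static async Task<List<string>> RunAsync(IEnumerable<int> input)
    {
        var results = new List<string>();
        var transform = Blocks.CreateIntToString();
        var collect = Blocks.CreateCollector(results);
        transform.LinkTo(collect, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var n in input) transform.Post(n);
        transform.Complete();
        await collect.Completion;
        return results;
    }
}
```

Each static class can live in its own file, which gives the separation the question asks for without subclassing the sealed block types.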
A real Example
For the last 7 years I've been working with several complex pipelines for an Online Travel Agency (OTA). One of them calls several GDSs (the intermediaries between OTAs and airlines) to retrieve transaction information - ticket issues, refunds, cancellations etc. The next step retrieves the ticket records, the detailed ticket information. Finally, the records are inserted into the database.
GDSs are too big to bother with standards, so their "SOAP" web services aren't even SOAP-compliant, much less follow WS-* standards. So each GDS needs a separate class library to call the services and parse the outputs. No dataflows there yet; the project is already complex enough.
Writing the data to the database is pretty much the same always, so there's a separate project with methods that take eg an IEnumerable<T> and write it to the database with SqlBulkCopy.
It's not enough to load new data though, things often go wrong so I need to be able to load already stored ticket information.
Organisation
To preserve sanity:
Each pipeline gets its own file:
A Daily pipeline to load new data,
A Reload pipeline to load all stored data
A "Rerun" pipeline to use the existing data and ask again for any missing data.
Static classes are used to hold the worker functions and, separately, the factory methods that produce Dataflow blocks based on configuration. E.g., a CreateLogger(path, level) creates an ActionBlock<Message> that logs specific messages.
Common dataflow extension methods - since DataFlow blocks follow the same basic patterns, it's easy to create a logged block by combining eg a Func<TIn,TOut> and a logger block. Or create a LinkTo overload that redirects bad records to a logger or database. Those are common enough they can become extension methods.
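One possible shape for such a LinkTo-style extension is sketched below. `LinkWithRejects` is a hypothetical name; it only combines the two existing `LinkTo` forms (with and without a predicate) and assumes the System.Threading.Tasks.Dataflow package is referenced.

```csharp
using System;
using System.Threading.Tasks.Dataflow;

public static class DataflowExtensions
{
    // Sends records that satisfy isGood to `next`; anything the first link
    // declines is offered to `rejected` (e.g. a logger or database block).
    public static void LinkWithRejects<T>(
        this ISourceBlock<T> source,
        ITargetBlock<T> next,
        ITargetBlock<T> rejected,
        Predicate<T> isGood)
    {
        var options = new DataflowLinkOptions { PropagateCompletion = true };
        source.LinkTo(next, options, isGood); // filtered link, tried first
        source.LinkTo(rejected, options);     // catch-all link, tried second
    }
}
```

Because links are offered messages in the order they were created, the catch-all link only sees what the filtered link turned down, so a bad record never stalls the source block.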
If those were in the same file, it would be very hard to edit one pipeline without affecting another. Besides, there's a lot more to a pipeline than the core tasks, eg:
Logging
Handling bad records and partial results (can't stop a 100K import for 10 errors)
error handling (which isn't the same as handling bad records)
monitoring - what has this monster been doing for the last 15 minutes? Did a DOP=10 improve performance at all?
Don't create a parent pipeline class.
Some of the steps are common, so at first, I created a parent class with common steps that got overloaded, or simply replaced in child classes. VERY BAD IDEA. Each pipeline is similar but not quite, and inheritance means that modifying one step or one connection risks breaking everything. After about 1 year things became unbearable, so I split the parent class into separate classes.
As @Panagiotis explained, I think you have to put aside the OOP mindset a little.
What you have with Dataflow are building blocks that you configure to execute what you need. I'll try to create a little example of what I mean by that:
// Interface and impl. are in separate files. Actually, they could
// even be in a different project ...
public interface IMyComplicatedTransform
{
Task<string> TransformFunction(int input);
}
public class MyComplicatedTransform : IMyComplicatedTransform
{
    // Implicit implementation; an explicit one cannot carry an access modifier.
    public Task<string> TransformFunction(int input)
    {
        // Some complex logic goes here; this placeholder just formats the number.
        return Task.FromResult(input.ToString());
    }
}
class DataFlowUsingClass{
    private readonly IMyComplicatedTransform myTransformer;
    // Not readonly: it is assigned in CreatePipeline, outside the constructor.
    private TransformBlock<int , string> myTransform;
    // ... some more blocks ...
    public DataFlowUsingClass()
    {
        myTransformer = new MyComplicatedTransform(); // maybe use ctor injection?
        CreatePipeline();
    }
    private void CreatePipeline()
    {
        // create blocks
        myTransform = new TransformBlock<int, string>(myTransformer.TransformFunction);
        // ... init some more blocks
        // TODO link blocks
    }
}
I think this is the closest to what you are looking for.
What you end up with is a set of interfaces and implementations which can be tested independently. The client basically boils down to "glue code".
Edit: As @Panagiotis correctly states, the interfaces are even superfluous. You could do without them.
I am wondering whether there is an established pattern to control the flow that my application will have.
Simply put, it's supposed to be something like that:
User provides a file
File is being processed
User receives a processed file
There will be several processing steps, let's say PreprocessingOne, PreprocessingTwo, PreprocessingThree and FinalProcessing.
Naturally, we do not control the files that the user provides - they will require a different number of preprocessing steps.
Since my message handler services will be in separate APIs, I don't want to invoke them just to return 'Cannot process yet' or 'Does not require processing', for performance reasons.
Similarly, I don't want to pass the uploaded file around between services.
Ideally, I would like to design the flow for a file dynamically by evaluating the content and inserting only those of the message handlers that make sense.
I am saying 'Inverted' pipeline, because instead of going from A to Z I would rather like to check which stages I need starting from Z and only insert the last ones.
So, if the uploaded file qualifies for FinalProcessing right away, the flow would be just one element.
If the file requires to go from PreprocessingTwo then the flow would be PreprocessingTwo > PreprocessingThree > FinalProcessing
So, I was thinking I could implement something like that, but I am not sure about the details.
public interface IMessageHandler
{
void Process(IFile file);
}
public interface IContentEvaluator
{
IList<IMessageHandler> PrepareWorkflow(IFile file);
}
public interface IPipelineExecutor
{
void ExecuteWorkflow(IList<IMessageHandler> workflow, IFile file);
}
And then in the application
public void Start(IFile newFile)
{
var contentEvaluator = new ContentEvaluator(this.availableHandlers); // would be DI
var workflow = contentEvaluator.PrepareWorkflow(newFile);
this.executor.ExecuteWorkflow(workflow, newFile);
}
Could you please advise, recommend some approach or further read?
You can consider using the Strategy pattern: ...selects an algorithm at runtime...
But if there are too many combinations of the flow, then the number of strategies that need to be implemented will increase and the solution can become complex.
Another approach can be to use SEDA: ...decomposes a complex, event-driven application into a set of stages connected by queues...
PreprocessingOne, PreprocessingTwo, PreprocessingThree and FinalProcessing are the stages, and flows can be defined by directing outgoing messages to different queues.
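Whichever transport connects the stages, choosing the entry stage can stay small. Below is a sketch of the question's "start from Z" evaluation; `UploadedFile`, its `Level` property, `Stage` and `CanStartHere` are all assumptions made for the illustration, not part of the question's interfaces.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical file abstraction: Level says how far along the file already is
// (0 = raw upload, 3 = ready for final processing).
public sealed class UploadedFile
{
    public int Level { get; set; }
}

// Hypothetical stage description: a name plus a test for whether a file can
// enter the flow at this stage without any earlier stage.
public sealed class Stage
{
    public string Name { get; private set; }
    public Func<UploadedFile, bool> CanStartHere { get; private set; }

    public Stage(string name, Func<UploadedFile, bool> canStartHere)
    {
        Name = name;
        CanStartHere = canStartHere;
    }
}

public static class ContentEvaluator
{
    // Walks the stages from last to first, prepending each one, and stops at
    // the first stage the file can enter directly.
    public static IList<Stage> PrepareWorkflow(IList<Stage> orderedStages, UploadedFile file)
    {
        var workflow = new List<Stage>();
        for (int i = orderedStages.Count - 1; i >= 0; i--)
        {
            workflow.Insert(0, orderedStages[i]);
            if (orderedStages[i].CanStartHere(file)) return workflow;
        }
        throw new InvalidOperationException("No stage can accept this file.");
    }
}
```

With the four stages from the question and a file that can enter at PreprocessingTwo, this yields PreprocessingTwo > PreprocessingThree > FinalProcessing, and only those handlers are ever invoked.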
Is that a decorator pattern?
Definition
Attach additional responsibilities to an object dynamically. Decorators provide a flexible alternative to subclassing for extending functionality.
I am new to DDD, and I am trying to figure out a way to update aggregate by using a PUT verb.
If all properties in the aggregate have private setters, then it's obvious I need to have set of functionality for every business requirement. For an example
supportTicket.Resolve();
It's clear to me that I can achieve this with an endpoint such as /api/tickets/5/resolve, but what if I want to provide a way to update the whole ticket atomically?
As an example, user can make a PUT request to /api/tickets/5 with a following body
{"status" : "RESOLVED", "Title":"Some crazy title"}
Do I need to do something like this in the ApplicationService?
if (dto.Status != null && dto.Status == "RESOLVED")
    supportTicket.Resolve();
if (dto.Title != null)
    supportTicket.setNewTitle(dto.Title);
If that's the case and changing ticket title has some business logic to prevent changing it if the ticket is resolved, should I consider some kind of prioritizing when updating aggregate, or I am looking at this entirely wrong?
Domain Driven Design for RESTful Systems -- Jim Webber
what if i want to provide a way to update whole ticket atomically?
If you want to update the whole ticket atomically, ditch aggregates; aggregates are the wrong tool in your box if what you really want is a key value store with CRUD semantics.
Aggregates only make sense when there are business rules for the domain to enforce. Don't build a tractor when all you need is a shovel.
As an example, user can make a PUT request to /api/tickets/5
That's going to make a mess. In a CRUD implementation, replacing the current state of a resource by sending it a representation of a new state is appropriate. But that doesn't really fit for aggregates at all, because the state of the aggregate is not under the control of you, the client/publisher.
The more appropriate idiom is to publish a message onto a bus, which when handled by the domain will have the side effect of achieving the changes you want.
PUT /api/tickets/5/messages/{messageId}
Now your application service looks at the message and sends commands to the aggregate:
if (dto.Status != null && dto.Status == "RESOLVED")
    supportTicket.Resolve();
if (dto.Title != null)
    supportTicket.setNewTitle(dto.Title);
This is OK, but in practice it's much more common to make the message explicit about what is to be done.
{ "messageType" : "ResolveWithNewTitle"
, "status" : "RESOLVED"
, "Title":"Some crazy title"
}
or even...
[
{ "messageType" : "ChangeTitle"
, "Title" : "Some crazy title"
}
, { "messageType" : "ResolveTicket"
}
]
Basically, you want to give the app enough context that it can do real message validation.
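A minimal sketch of an application service consuming the second message shape is shown below. The aggregate's rule (no title change once resolved) is taken from the question; `TicketMessage`, `TicketApplicationService` and the method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Minimal aggregate, assuming the rule from the question: the title of a
// resolved ticket can no longer be changed.
public sealed class SupportTicket
{
    public string Title { get; private set; }
    public bool IsResolved { get; private set; }

    public SupportTicket(string title) { Title = title; }

    public void ChangeTitle(string newTitle)
    {
        if (IsResolved) throw new InvalidOperationException("Ticket is already resolved.");
        if (string.IsNullOrWhiteSpace(newTitle)) throw new ArgumentException("Title is required.");
        Title = newTitle;
    }

    public void Resolve() { IsResolved = true; }
}

// Hypothetical DTO for one entry of the message array shown above.
public sealed class TicketMessage
{
    public string MessageType { get; set; }
    public string Title { get; set; }
}

public static class TicketApplicationService
{
    // Translates each explicit message into one command on the aggregate,
    // in the order the client sent them.
    public static void Handle(SupportTicket ticket, IEnumerable<TicketMessage> messages)
    {
        foreach (var message in messages)
        {
            switch (message.MessageType)
            {
                case "ChangeTitle": ticket.ChangeTitle(message.Title); break;
                case "ResolveTicket": ticket.Resolve(); break;
                default: throw new ArgumentException("Unknown message type: " + message.MessageType);
            }
        }
    }
}
```

The ordering question from the original post now sits with the client: sending ChangeTitle before ResolveTicket succeeds, while the reverse order is rejected by the aggregate. To keep the update atomic, a real service would presumably save the ticket only after every message has been applied without an exception.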
let's say I had aggregates which encapsulated needed business logic, but besides that there is a new demand for atomic update functionality and I am trying to understand a best way to deal with this.
So the right way to deal with this is first to deal with it on the domain level -- sit down with your domain experts, make sure that everybody understands the requirement and how to express it in the ubiquitous language, etc.
Implement any new methods that you need in the aggregate root.
Once you have the use case correctly supported in the domain, then you can start worrying about your resources following the previous pattern - the resource just takes the incoming request, and invokes the appropriate commands.
Is changing the Title a requirement of Resolving a ticket? If not, they should not be the same action in DDD. You wouldn't want to not resolve the ticket if the new name was invalid, and you wouldn't want to not change the name if the ticket was not resolvable.
Make 2 calls to perform the 2 separate actions. This also allows for flexibility: for example, the Title can be changed immediately, but perhaps "resolving" the ticket will kick off some complex and time-consuming (asynchronous) workflow before the ticket is actually resolved. Perhaps it needs to have a manager sign off? You don't want the call to change "title" tied up in that mix.
If need be, create something to orchestrate multiple commands as per @VoiceOfUnreason's comment.
Wherever possible, keep things separate, and code to use cases as opposed to minimizing interactions with entities.
You're probably right. But it's probably wiser to encapsulate such logic inside the ticket itself, by adding a "change()" method that receives a changeCommandModel (or something like this), so you can define the business rules inside your domain object.
if (dto.Status != null && dto.Status == "RESOLVED")
    supportTicket.Resolve(dto.Title);
I would change the underlying method to take the title as a parameter; this clarifies the resolve action. The second if and its validation belong in the domain method. It's really a matter of preference; more important is the message, and I agree with @VoiceOfUnreason's second option.
Many web applications I've contributed in (mostly ASP.NET), need to handle multiple and different user types.
Let's say you have a school portal, which both pupils and teachers use daily. On the front page of the application the users are met with almost the same GUI except some links to some tools only the teachers have access to. Let's say this is a messaging tool. A teacher can have different roles which defines who the teacher can send to.
For example:
A teacher with the role Publisher is allowed to send to everyone in the school.
A teacher with no extra roles is only allowed to send to everyone in his/her classes.
In the future, parents will also be able to access this portal and see detailed information about their children.
The problem I always get into is that my code is always getting cluttered with if-statements when handling different user types all over the application. Not only because of the different user types, but also by different business rules. I feel I can't find any way to properly handle different user types.
I guess the concept of roles in ASP.NET kind of solves this, but you would still end up with if-statements around the application.
My question is: Is there any best-practice on how to handle different users/user types in an application without infecting the code with if-statements?
You should split these responsibilities (like the sending capabilities of a teacher for example). You can do this by using the strategy pattern.
So a teacher has an extra property Publishing, whose type is an interface. The interface can have multiple implementations (e.g. NoPublishing for a teacher who does not have publishing functionality, or PublishDefault). Each teacher can have its Publishing property set to either NoPublishing or PublishDefault. It can even be changed at runtime if needed.
An example:
public class Teacher
{
    // Settable so the strategy can be assigned or swapped at runtime.
    public IPublishing Publishing { get; set; }
}
public interface IPublishing
{
    void Send();
}
public class NoPublishing : IPublishing
{
    public void Send()
    {
        // Intentionally does nothing: this teacher cannot publish.
    }
}
public class PublishDefault : IPublishing
{
    public void Send()
    {
        // Send a message the default way
    }
}
Create a teacher:
var teacher = new Teacher();
Create a publisher strategy.
var defaultStrategy = new PublishDefault();
Connect them
teacher.Publishing = defaultStrategy;
Now you can send a message by:
teacher.Publishing.Send();
Depending on which Publishing strategy has been connected it will either send nothing or send something the default way.
You only need to instantiate each used Publishing strategy once and reuse it for each Teacher (or even other classes who needs to be able to send).
When you need other publish functionality, just add a new strategy (e.g. SmsPublishing, LetterPublishing etc).
You can even change the strategy on the fly if it is needed (by reassigning the Publishing property).
Why not implement the interface directly in Teacher?
Separation of Concerns principle: IPublishing contains a specific and different responsibility.
Possibly IPublishing contains functionality that can be used later in different classes or even other projects, so it is more reusable.
Testing is easier since IPublishing does not need any knowledge about Teacher.
Possibility to change the publishing behavior of Teacher at runtime.
(note: I don't have a compiler here, so code is only for explanation purposes).
Back story:
So I've been stuck on an architecture problem for the past couple of nights on a refactor I've been toying with. Nothing important, but it's been bothering me. It's actually an exercise in DRY, and an attempt to take it to such an extreme as the DAL architecture is completely DRY. It's a completely philosophical/theoretical exercise.
The code is based in part on one of @JohnMacIntyre's refactorings which I recently convinced him to blog about at http://whileicompile.wordpress.com/2010/08/24/my-clean-code-experience-no-1/. I've modified the code slightly, as I tend to, in order to take the code one level further - usually, just to see what extra mileage I can get out of a concept... anyway, my reasons are largely irrelevant.
Part of my data access layer is based on the following architecture:
abstract public class AppCommandBase : IDisposable { }
This contains basic stuff, like creation of a command object and cleanup after the AppCommand is disposed of. All of my command base objects derive from this.
abstract public class ReadCommandBase<T, ResultT> : AppCommandBase
This contains basic stuff that affects all read-commands - specifically in this case, reading data from tables and views. No editing, no updating, no saving.
abstract public class ReadItemCommandBase<T, FilterT> : ReadCommandBase<T, T> { }
This contains some more basic generic stuff - like definition of methods that will be required to read a single item from a table in the database, where the table name, key field name and field list names are defined as required abstract properties (to be defined by the derived class).
public class MyTableReadItemCommand : ReadItemCommandBase<MyTableClass, int?> { }
This contains specific properties that define my table name, the list of fields from the table or view, the name of the key field, a method to parse the data out of the IDataReader row into my business object and a method that initiates the whole process.
Now, I also have this structure for my ReadList...
abstract public class ReadListCommandBase<T> : ReadCommandBase<T, IEnumerable<T>> { }
public class MyTableReadListCommand : ReadListCommandBase<MyTableClass> { }
The difference being that the List classes contain properties that pertain to list generation (i.e. PageStart, PageSize, Sort and returns an IEnumerable) vs. return of a single DataObject (which just requires a filter that identifies a unique record).
Problem:
I'm hating that I've got a bunch of properties in my MyTableReadListCommand class that are identical in my MyTableReadItemCommand class. I've thought about moving them to a helper class, but while that may centralize the member contents in one place, I'll still have identical members in each of the classes, that instead point to the helper class, which I still dislike.
My first thought was dual inheritance would solve this nicely, even though I agree that dual inheritance is usually a code smell - but it would solve this issue very elegantly. So, given that .NET doesn't support dual inheritance, where do I go from here?
Perhaps a different refactor would be more suitable... but I'm having trouble wrapping my head around how to sidestep this problem.
If anyone needs a full code base to see what I'm harping on about, I've got a prototype solution on my DropBox at http://dl.dropbox.com/u/3029830/Prototypes/Prototype%20-%20DAL%20Refactor.zip. The code in question is in the DataAccessLayer project.
P.S. This isn't part of an ongoing active project, it's more a refactor puzzle for my own amusement.
Thanks in advance folks, I appreciate it.
Separate the result processing from the data retrieval. Your inheritance hierarchy is already more than deep enough at ReadCommandBase.
Define an interface IDatabaseResultParser. Implement ItemDatabaseResultParser and ListDatabaseResultParser, both with a constructor parameter of type ReadCommandBase ( and maybe convert that to an interface too ).
When you call IDatabaseResultParser.Value() it executes the command, parses the results and returns a result of type T.
Your commands focus on retrieving the data from the database and returning them as tuples of some description (actual Tuples, or an array of arrays, etc.); your parser focuses on converting the tuples into objects of whatever type you need. See NHibernate's IResultTransformer for an idea of how this can work (and it's probably a better name than IDatabaseResultParser too).
Favor composition over inheritance.
Having looked at the sample I'll go even further...
Throw away AppCommandBase - it adds no value to your inheritance hierarchy as all it does is check that the connection is not null and open and creates a command.
Separate query building from query execution and result parsing - now you can greatly simplify the query execution implementation as it is either a read operation that returns an enumeration of tuples or a write operation that returns the number of rows affected.
Your query builder could all be wrapped up in one class to include paging / sorting / filtering; however, it may be easier to build some form of limited structure around these so you can separate paging, sorting and filtering. If I was doing this I wouldn't bother building the queries; I would simply write the SQL inside an object that allowed me to pass in some parameters (effectively stored procedures in C#).
So now you have IDatabaseQuery / IDatabaseCommand / IResultTransformer and almost no inheritance =)
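A sketch of how those pieces might fit together is shown below. The interface names come from the answer, but every member signature is an assumption, and `InMemoryQuery` stands in for real ADO.NET access so the example runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Names follow the answer above; the member signatures are assumptions.
public interface IDatabaseQuery
{
    // Returns raw rows; a real implementation would run SQL through ADO.NET.
    IEnumerable<object[]> Execute();
}

public interface IResultTransformer<T>
{
    T Transform(IEnumerable<object[]> rows);
}

// Stand-in query that serves rows from memory.
public sealed class InMemoryQuery : IDatabaseQuery
{
    private readonly IList<object[]> rows;
    public InMemoryQuery(IList<object[]> rows) { this.rows = rows; }
    public IEnumerable<object[]> Execute() { return rows; }
}

// List case: map every row.
public sealed class ListTransformer<T> : IResultTransformer<IList<T>>
{
    private readonly Func<object[], T> map;
    public ListTransformer(Func<object[], T> map) { this.map = map; }
    public IList<T> Transform(IEnumerable<object[]> rows) { return rows.Select(map).ToList(); }
}

// Item case: reuses the same row mapping and expects at most one row.
public sealed class ItemTransformer<T> : IResultTransformer<T>
{
    private readonly Func<object[], T> map;
    public ItemTransformer(Func<object[], T> map) { this.map = map; }
    public T Transform(IEnumerable<object[]> rows) { return rows.Select(map).SingleOrDefault(); }
}
```

The per-table knowledge (how one row becomes one object) is written once as the `map` function and handed to both transformers, which is where the duplicated properties from the question would go.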
I think the short answer is that, in a system where multiple inheritance has been outlawed "for your protection", strategy/delegation is the direct substitute. Yes, you still end up with some parallel structure, such as the property for the delegate object. But it is minimized as much as possible within the confines of the language.
But let's step back from the simple answer and take a wide view....
Another big alternative is to refactor the larger design structure such that you inherently avoid this situation where a given class consists of the composite of behaviors of multiple "sibling" or "cousin" classes above it in the inheritance tree. To put it more concisely, refactor to an inheritance chain rather than an inheritance tree. This is easier said than done. It usually requires abstracting very different pieces of functionality.
The challenge you'll have in taking this tack that I'm recommending is that you've already made a concession in your design: You're optimizing for different SQL in the "item" and "list" cases. Preserving this as is will get in your way no matter what, because you've given them equal billing, so they must by necessity be siblings. So I would say that your first step in trying to get out of this "local maximum" of design elegance would be to roll back that optimization and treat the single item as what it truly is: a special case of a list, with just one element. You can always try to re-introduce an optimization for single items again later. But wait till you've addressed the elegance issue that is vexing you at the moment.
But you have to acknowledge that any optimization for anything other than the elegance of your C# code is going to put a roadblock in the way of design elegance for the C# code. This trade-off, just like the "time-space" trade-off of algorithm design, is fundamental to the very nature of programming.
As is mentioned by Kirk, this is the delegation pattern. When I do this, I usually construct an interface that is implemented by the delegator and the delegated class. This reduces the perceived code smell, at least for me.
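Roughly, that arrangement could look like the sketch below. `ITableDefinition`, `MyTableDefinition` and `ReadItemCommand` are hypothetical names chosen to echo the question, not code from the linked prototype.

```csharp
using System;

// Hypothetical shared interface: the table metadata both commands need.
public interface ITableDefinition
{
    string TableName { get; }
    string KeyFieldName { get; }
}

// The delegated class: defined once per table.
public sealed class MyTableDefinition : ITableDefinition
{
    public string TableName { get { return "MyTable"; } }
    public string KeyFieldName { get { return "Id"; } }
}

// A delegator: implements the same interface by forwarding to the definition.
public sealed class ReadItemCommand : ITableDefinition
{
    private readonly ITableDefinition definition;

    public ReadItemCommand(ITableDefinition definition) { this.definition = definition; }

    public string TableName { get { return definition.TableName; } }
    public string KeyFieldName { get { return definition.KeyFieldName; } }

    public string BuildSql()
    {
        return "SELECT * FROM " + TableName + " WHERE " + KeyFieldName + " = @key";
    }
}
```

The forwarding properties are still repeated in each command, but they carry no table-specific content, and the shared interface lets the compiler check that delegator and delegate stay in step.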
I think the simple answer is... Since .NET doesn't support Multiple Inheritance, there is always going to be some repetition when creating objects of a similar type. .NET simply does not give you the tools to re-use some classes in a way that would facilitate perfect DRY.
The not-so-simple answer is that you could use code generation tools, instrumentation, code dom, and other techniques to inject the objects you want into the classes you want. It still creates duplication in memory, but it would simplify the source code (at the cost of added complexity in your code injection framework).
This may seem unsatisfying like the other solutions, however if you think about it, that's really what languages that support MI are doing behind the scenes, hooking up delegation systems that you can't see in source code.
The question comes down to, how much effort are you willing to put into making your source code simple. Think about that, it's rather profound.
I haven't looked deeply at your scenario, but I have some thoughts on the dual-hierarchy problem in C#. To share code in a dual-hierarchy, we need a different construct in the language: either a mixin, a trait or a role (as in Perl 6). C# makes it very easy to share code with inheritance (which is not the right operator for code-reuse), and very laborious to share code via composition (you know, you have to write all that delegation code by hand).
There are ways to get a kind of mixin in C#, but it's not ideal.
The Oxygene language (an Object Pascal for .NET) also has an interesting feature for interface delegation that can be used to create all that delegating code for you.