Step by step pattern with data transfer between steps

I am writing a simple application that consists of several steps. Each step takes its own specific input data and produces its own specific output. I am trying to implement this on top of the pipeline pattern. These are the common parts:
The interface that each step must implement:
interface IStep
{
Data Execute(Data data);
}
Interface that must be implemented by class that processes steps:
interface IProcess
{
void AddStep(IStep step);
void Run();
}
class Process: IProcess
{
private List<IStep> steps = new List<IStep>();
public void AddStep(IStep step)
{
steps.Add(step);
}
public void Run()
{
var data = new Data();
foreach (var step in steps)
{
data = step.Execute(data);
}
}
}
The Data class is implemented as follows:
public class Data: Dictionary<string, object> {}
And this is the problem. I have to keep key constants for Data, and at the beginning of each step I have to extract the values that the step requires. Is there a more elegant way to implement this approach?
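One possible direction, sketched here under assumptions rather than taken from the original post, is to make the input and output types part of each step's contract so that no string keys or casts are needed. The names `IStep<TIn, TOut>`, `Then`, and the three example steps are hypothetical.

```csharp
// Hypothetical typed step: the input and output types are part of the contract.
public interface IStep<TIn, TOut>
{
    TOut Execute(TIn input);
}

public static class StepExtensions
{
    // Chains two steps; the compiler checks that the first step's output
    // type matches the second step's input type.
    public static IStep<TIn, TOut> Then<TIn, TMid, TOut>(
        this IStep<TIn, TMid> first, IStep<TMid, TOut> second)
    {
        return new ChainedStep<TIn, TMid, TOut>(first, second);
    }

    private sealed class ChainedStep<TIn, TMid, TOut> : IStep<TIn, TOut>
    {
        private readonly IStep<TIn, TMid> _first;
        private readonly IStep<TMid, TOut> _second;

        public ChainedStep(IStep<TIn, TMid> first, IStep<TMid, TOut> second)
        {
            _first = first;
            _second = second;
        }

        // Feeds the output of the first step into the second step.
        public TOut Execute(TIn input) => _second.Execute(_first.Execute(input));
    }
}

// Illustrative steps (invented for this example).
public sealed class ParseStep : IStep<string, int>
{
    public int Execute(string input) => int.Parse(input);
}

public sealed class DoubleStep : IStep<int, int>
{
    public int Execute(int input) => input * 2;
}

public sealed class FormatStep : IStep<int, string>
{
    public string Execute(int input) => "Result: " + input;
}
```

With these types, `new ParseStep().Then(new DoubleStep()).Then(new FormatStep())` builds an `IStep<string, string>`; steps whose types do not line up fail at compile time instead of at a dictionary lookup.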

Related

Refactoring to make code open for extensions but closed for modifications

For my project I need to send metrics to AWS.
I have a main class called SendingMetrics.
private CPUMetric _cpuMetric;
private RAMMetric _ramMetric;
private HDDMetric _hddMetric;
private CloudWatchClient _cloudwatchClient; // AWS client that contains the Send() method, which sends metrics to AWS
public SendingMetrics()
{
_cpuMetric = new CPUMetric();
_ramMetric = new RAMMetric();
_hddMetric = new HDDMetric();
_cloudwatchClient = new CloudWatchClient();
InitializeTimer();
}
private void InitializeTimer()
{
//here I initialize Timer object which will call method SendMetrics() each 60 seconds.
}
private void SendMetrics()
{
SendCPUMetric();
SendRAMMetric();
SendHDDMetric();
}
private void SendCPUMetric()
{
_cloudwatchClient.Send("CPU_Metric", _cpuMetric.GetValue());
}
private void SendRAMMetric()
{
_cloudwatchClient.Send("RAM_Metric", _ramMetric.GetValue());
}
private void SendHDDMetric()
{
_cloudwatchClient.Send("HDD_Metric", _hddMetric.GetValue());
}
I also have CPUMetric, RAMMetric and HDDMetric classes that look very similar, so I will show the code of only one of them.
internal sealed class CPUMetric
{
private int _cpuThreshold;
public CPUMetric()
{
_cpuThreshold = 95;
}
public int GetValue()
{
var currentCpuLoad = ... //logic for getting machine CPU load
if(currentCpuLoad > _cpuThreshold)
{
return 1;
}
else
{
return 0;
}
}
}
The problem is that my example does not follow clean-code principles. I have three metrics to send, and if I need to introduce a new metric I have to create a new class, initialize it in the SendingMetrics class, and modify that class, which is not what I want. I want to satisfy the Open Closed Principle, so that the code is open for extension but closed for modification.
What is the right way to do it? I would move the send methods (SendCPUMetric, SendRAMMetric, SendHDDMetric) to the corresponding classes (SendCPUMetric to CPUMetric, SendRAMMetric to RAMMetric, etc.), but how do I modify the SendingMetrics class so that it is closed for modification and does not have to change when I add a new metric?

In object-oriented languages like C#, the Open Closed Principle (OCP) is usually achieved through polymorphism, that is, objects of the same kind react differently to one and the same message. Looking at your class SendingMetrics, it is obvious that the class works with different types of metrics. The good thing is that SendingMetrics talks to all types of metrics in the same way, by sending the message GetValue. Hence you can introduce a new abstraction by creating an interface IMetric that is implemented by the concrete metric types. That way you decouple SendingMetrics from the concrete metric types, which means the class does not know about them. It only knows IMetric and treats all metrics in the same way, which makes it possible to add any new collaborator (type of metric) that implements IMetric (open for extension) without changing SendingMetrics (closed for modification). This also requires that the metric objects are not created inside SendingMetrics but, for example, by a factory or outside of the class, and are injected as IMetric instances.
In addition to enabling polymorphism and achieving OCP by introducing the interface IMetric, you can also use inheritance to remove redundancy: introduce an abstract base class for all metric types that implements the behaviour they have in common.
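A minimal sketch of that idea follows, under stated assumptions: `IMetric` is the interface named above, while the `Name` property, `IMetricSender` (a stand-in for the CloudWatch client), `ThresholdMetric`, `FixedReadingMetric`, and `InMemorySender` are hypothetical names introduced only for illustration.

```csharp
using System.Collections.Generic;

// The abstraction SendingMetrics depends on. Name is an assumption added
// so that the sender needs no per-metric code.
public interface IMetric
{
    string Name { get; }
    int GetValue();
}

// Hypothetical stand-in for the AWS client from the question.
public interface IMetricSender
{
    void Send(string name, int value);
}

// Optional base class holding the threshold logic shared by all metrics.
public abstract class ThresholdMetric : IMetric
{
    private readonly int _threshold;

    protected ThresholdMetric(string name, int threshold)
    {
        Name = name;
        _threshold = threshold;
    }

    public string Name { get; }

    // Each concrete metric only supplies its own reading.
    protected abstract int ReadCurrentValue();

    public int GetValue() => ReadCurrentValue() > _threshold ? 1 : 0;
}

// Example metric with a fixed reading, only to show the shape of a concrete class.
public sealed class FixedReadingMetric : ThresholdMetric
{
    private readonly int _reading;

    public FixedReadingMetric(string name, int threshold, int reading)
        : base(name, threshold)
    {
        _reading = reading;
    }

    protected override int ReadCurrentValue() => _reading;
}

// Sender that records what it was asked to send, useful for trying the sketch out.
public sealed class InMemorySender : IMetricSender
{
    public List<string> Sent { get; } = new List<string>();

    public void Send(string name, int value) => Sent.Add(name + "=" + value);
}

public sealed class SendingMetrics
{
    private readonly IReadOnlyCollection<IMetric> _metrics;
    private readonly IMetricSender _sender;

    // Metrics are injected, so adding a new one never changes this class.
    public SendingMetrics(IReadOnlyCollection<IMetric> metrics, IMetricSender sender)
    {
        _metrics = metrics;
        _sender = sender;
    }

    public void SendMetrics()
    {
        foreach (var metric in _metrics)
        {
            _sender.Send(metric.Name, metric.GetValue());
        }
    }
}
```

Adding a metric then means writing one new class and passing an instance of it to the constructor; `SendingMetrics` itself stays untouched.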

Your design is almost correct. You have three data retrievers and one data sender, so it is easy to add more metrics (more retrievers, open for extension) without affecting the current metrics (closed for modification). You just need a bit more refactoring to reduce the duplicated code.
Right now you have three metric classes that look very similar. Only the line below differs:
var currentCpuLoad = ... //logic for getting machine CPU load
Instead, you can create a generic metric like this:
internal interface IGetMetric
{
int GetData();
}
internal sealed class Metric
{
    private int _threshold;
    private IGetMetric _getDataService;
    public Metric(IGetMetric getDataService)
    {
        _threshold = 95;
        _getDataService = getDataService;
    }
    public int GetValue()
    {
        // The reading comes from the injected service, so this class works for any metric.
        var currentValue = _getDataService.GetData();
        if (currentValue > _threshold)
        {
            return 1;
        }
        else
        {
            return 0;
        }
    }
}
Then create three classes that implement that interface, one per metric. This is just one way to reduce the code duplication. You could also use inheritance (although I don't like inheritance), or pass a Func parameter.
UPDATED: added classes to get the CPU and RAM metrics
internal class CPUMetricService : IGetMetric
{
public int GetData() { return ....; }
}
internal class RAMMetricService : IGetMetric
{
public int GetData() { return ....; }
}
public class AllMetrics
{
private List<Metric> _metrics = new List<Metric>()
{
new Metric(new CPUMetricService()),
new Metric(new RAMMetricService())
};
public void SendMetrics()
{
_metrics.ForEach(m => ....);
}
}
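The Func-based variant mentioned in this answer could look roughly like the sketch below. `FuncMetric` and the optional `threshold` parameter are hypothetical; the default of 95 is taken from the question.

```csharp
using System;

// Variant of the Metric class that takes a delegate instead of an IGetMetric object.
internal sealed class FuncMetric
{
    private readonly int _threshold;
    private readonly Func<int> _readValue;

    public FuncMetric(Func<int> readValue, int threshold = 95)
    {
        _readValue = readValue ?? throw new ArgumentNullException(nameof(readValue));
        _threshold = threshold;
    }

    // Returns 1 when the current reading exceeds the threshold, otherwise 0.
    public int GetValue() => _readValue() > _threshold ? 1 : 0;
}
```

A metric is then created with a lambda, for example `new FuncMetric(() => ReadCpuLoad())`, where `ReadCpuLoad` is whatever method you already use to obtain the reading. This avoids one small class per metric, at the cost of losing a named type for each retriever.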

Strategy pattern or no strategy pattern?

Without going into academic definitions, let's say that the Strategy Pattern is used when you have client code (the Context) that executes an operation, and this operation can be implemented in different ways (algorithms). For instance: https://www.dofactory.com/net/strategy-design-pattern
Which Strategy (or algorithm) is used often depends on some input conditions. That is why the Strategy Pattern is sometimes combined with the Factory Pattern. The Client passes the input conditions to the Factory, the Factory knows which Strategy it has to create, and the Client then executes the operation of the Strategy that was created.
However, on several occasions I have come across a problem that seems to me to be the opposite. The operation to be executed is always the same, but it is only executed depending on a family of input conditions. For example:
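For reference, the classic combination described above can be sketched as follows. The shipping example and all names in it (`IShippingStrategy`, `ShippingStrategyFactory`, the cost formulas) are invented for illustration and do not come from the linked page.

```csharp
using System;

// One operation (calculating a cost), several interchangeable algorithms.
public interface IShippingStrategy
{
    decimal CalculateCost(decimal weightKg);
}

public sealed class StandardShipping : IShippingStrategy
{
    public decimal CalculateCost(decimal weightKg) => 5m + 1m * weightKg;
}

public sealed class ExpressShipping : IShippingStrategy
{
    public decimal CalculateCost(decimal weightKg) => 10m + 2m * weightKg;
}

// The factory turns an input condition into the strategy to use.
public static class ShippingStrategyFactory
{
    public static IShippingStrategy Create(string serviceLevel)
    {
        switch (serviceLevel)
        {
            case "standard":
                return new StandardShipping();
            case "express":
                return new ExpressShipping();
            default:
                throw new ArgumentException("Unknown service level: " + serviceLevel, nameof(serviceLevel));
        }
    }
}
```

The client only calls `ShippingStrategyFactory.Create(level).CalculateCost(weight)`; which algorithm runs is decided by the input condition, which is the part that differs from the situation described next.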
public interface IStrategy
{
string FileType { get; }
bool CanProcess(string text);
}
public class HomeStrategy : IStrategy
{
public string FileType => ".txt";
public bool CanProcess(string text)
{
return text.Contains("house") || text.Contains("flat");
}
}
public class OfficeStrategy : IStrategy
{
public string FileType => ".doc";
public bool CanProcess(string text)
{
return text.Contains("office") || text.Contains("work") || text.Contains("meeting");
}
}
public class StragetyFactory
{
private List<IStrategy> _strategies = new List<IStrategy>{ new HomeStrategy(), new OfficeStrategy() };
public IStrategy CreateStrategy(string fileType)
{
return _strategies.Single(s => s.FileType == fileType);
}
}
The client code gets the files from some repository and saves them in the database. That is the operation: store the files in the database, depending only on the type of the file and the specific conditions for each file.
public class Client
{
public void Execute()
{
var files = repository.GetFilesFromDisk();
var factory = new StragetyFactory();
foreach (var file in files)
{
var strategy = factory.CreateStrategy(file.Type);
if (strategy.CanProcess(file.ContentText))
{
service.SaveInDatabase(file);
}
}
}
}
Am I wrong to believe that this is a different pattern from the Strategy Pattern? (I have called it Strategy in the code above because I have seen it named like this on several occasions.)
If this problem is different from the one the Strategy Pattern solves, then which pattern is it?

It is not really a strategy pattern because, as the definition of the strategy pattern in Wikipedia says:
"In computer programming, the strategy pattern (also known as the policy pattern) is a behavioral software design pattern that enables selecting an algorithm at runtime. Instead of implementing a single algorithm directly, code receives run-time instructions as to which in a family of algorithms to use."
You're not selecting an algorithm to execute at runtime. You just check whether the file satisfies the conditions for its type and then execute the same algorithm.
Do you expect this to change at some point? Do you need it to be extensible, so that in the future you can easily execute different code based on the file type?
If the answer to those questions is yes, then you can keep the strategies and apply a few changes.
First, define a base strategy class that holds the code to execute:
public abstract class StrategyBase
{
    // _service is the database service from the question; it is assumed to be available here (for example, injected).
    public abstract string FileType { get; }
    public abstract bool CanProcess(File file);
    public virtual void Execute(File file)
    {
        _service.SaveInDatabase(file);
    }
}
Your strategies now derive from the base class:
public class HomeStrategy : StrategyBase
{
    public override string FileType => ".txt";
    public override bool CanProcess(File file)
    {
        // Both the file type and the content conditions have to match.
        return file.Type == FileType
            && (file.ContentText.Contains("house") || file.ContentText.Contains("flat"));
    }
}
// implement the same for the rest of the strategies...
As mentioned in the comment, it's not really a factory, because it doesn't create a new strategy on every call. It's more like a provider that returns the strategy to execute for a given file.
public class StragetyProvider
{
    private List<StrategyBase> _strategies = new List<StrategyBase>{ new HomeStrategy(), new OfficeStrategy() };
    public StrategyBase GetStrategy(File file)
    {
        return _strategies.FirstOrDefault(s => s.CanProcess(file));
    }
}
As a result, the client code becomes much simpler:
public class Client
{
    public void Execute()
    {
        var files = repository.GetFilesFromDisk();
        var provider = new StragetyProvider();
        foreach (var file in files)
        {
            // GetStrategy returns null when no strategy accepts the file.
            var strategy = provider.GetStrategy(file);
            strategy?.Execute(file);
        }
    }
}
Notice that when you need to add a new condition, you just implement a new class that derives from StrategyBase and add it to the list of strategies in the provider; no other changes are required. If you need to execute different logic for some new file type, you create a new strategy and override the Execute method, and that's it.
If this looks like overkill, you never need to extend the solution with new behavior, and the only thing you want is to be able to add new conditions, then go with another approach:
public interface ISatisfyFileType
{
    bool Satisfies(string text);
}
public class HomeCondition : ISatisfyFileType
{
    public string FileType => ".txt";
    public bool Satisfies(string text)
    {
        return text.Contains("house") || text.Contains("flat");
    }
}
// the rest of the conditions
Compose all conditions into one:
public class FileConditions
{
    private List<ISatisfyFileType> _conditions = new List<ISatisfyFileType>{ new HomeCondition(), new OfficeCondition() };
    public bool Satisfies(string text) =>
        _conditions.Any(condition => condition.Satisfies(text));
}
And the client:
public class Client
{
public void Execute()
{
var files = repository.GetFilesFromDisk();
var fileTypeConditions = new FileConditions();
foreach (var file in files)
{
if (fileTypeConditions.Satisfies(file.ContentText))
{
service.SaveInDatabase(file);
}
}
}
}
This has the same benefit: if you need a new condition, you implement it and add it to the FileConditions class without touching the client code.

Some design-pattern suggestions needed

C#. I have a base class called FileProcessor:
class FileProcessor {
    public string Path { get { return m_sPath; } }
    public FileProcessor(string path)
    {
        m_sPath = path;
    }
    public virtual void Process() {}
    protected string m_sPath;
}
Now I'd like to create two other classes, ExcelProcessor and PDFProcessor:
class ExcelProcessor : FileProcessor
{
    public ExcelProcessor(string path) : base(path) {}
    public override void Process()
    {
        // do different stuff from PDFProcessor
    }
}
The same goes for PDFProcessor. A file is an Excel file if Path ends with ".xlsx" and a PDF file if it ends with ".pdf". I could have a ProcessingManager class:
class ProcessingManager
{
    public void AddProcessJob(string path)
    {
        m_list.Add(path);
    }
    public ProcessingManager()
    {
        m_list = new BlockingQueue();
        m_thread = new Thread(ThreadFunc);
        m_thread.Start(this);
    }
    public static void ThreadFunc(object param) // this is the thread function
    {
        ProcessingManager _this = (ProcessingManager)param;
        while (some_condition) {
            string fPath = _this.m_list.Dequeue();
            if (fPath.EndsWith(".pdf")) {
                new PDFProcessor(fPath).Process();
            }
            if (fPath.EndsWith(".xlsx")) {
                new ExcelProcessor(fPath).Process();
            }
        }
    }
    protected BlockingQueue m_list;
    protected Thread m_thread;
}
I am trying to make this as modular as possible. Suppose, for example, that I want to add ".doc" processing: I would have to add a check inside the manager and implement another DOCProcessor.
How could I do this without modifying ProcessingManager? I also don't know whether my manager is good enough, so please give me any suggestions you have.

I'm not sure I fully understand your problem, but I'll give it a shot.
You could use the Factory pattern.
class FileProcessorFactory {
    public IFileProcessor getFileProcessor(string extension) {
        switch (extension) {
            case ".pdf":
                return new PdfFileProcessor();
            case ".xlsx":
                return new ExcelFileProcessor();
            default:
                // Every code path has to return or throw.
                throw new NotSupportedException("No processor for " + extension);
        }
    }
}
interface IFileProcessor {
    object processFile(Stream inputFile);
}
class PdfFileProcessor : IFileProcessor {
    public object processFile(Stream inputFile) {
        // do things with your inputFile
        throw new NotImplementedException();
    }
}
class ExcelFileProcessor : IFileProcessor {
    public object processFile(Stream inputFile) {
        // do things with your inputFile
        throw new NotImplementedException();
    }
}
This way you use the FileProcessorFactory to get the correct processor, and the IFileProcessor interface makes sure that every processor exposes the same operation.
As for "and implement another DOCProcessor": just add a new case to the FileProcessorFactory and a new class called DocFileProcessor that implements the IFileProcessor interface.

You could decorate your processors with custom attributes like this:
[FileProcessorExtension(".doc")]
public class DocProcessor
{
}
Then your processing manager could find the processor whose FileProcessorExtension attribute matches your extension and instantiate it via reflection.
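A rough sketch of how such a lookup might work is shown below. The attribute class, the `IFileProcessor` shape used here, and `ProcessorLocator` are all assumptions made for the example; only `FileProcessorExtension(".doc")` and `DocProcessor` come from the answer. The sketch also assumes every processor has a public parameterless constructor.

```csharp
using System;
using System.Linq;
using System.Reflection;

[AttributeUsage(AttributeTargets.Class, AllowMultiple = false)]
public sealed class FileProcessorExtensionAttribute : Attribute
{
    public FileProcessorExtensionAttribute(string extension)
    {
        Extension = extension;
    }

    public string Extension { get; }
}

// Hypothetical processor contract for this example.
public interface IFileProcessor
{
    void Process(string path);
}

[FileProcessorExtension(".doc")]
public sealed class DocProcessor : IFileProcessor
{
    public void Process(string path)
    {
        // Real processing would go here.
    }
}

public static class ProcessorLocator
{
    // Scans the given assembly for a concrete IFileProcessor tagged with the
    // requested extension. Returns null when no such type exists.
    public static IFileProcessor Create(string extension, Assembly assembly)
    {
        var type = assembly.GetTypes()
            .Where(t => typeof(IFileProcessor).IsAssignableFrom(t) && !t.IsAbstract && !t.IsInterface)
            .FirstOrDefault(t =>
            {
                var attribute = t.GetCustomAttribute<FileProcessorExtensionAttribute>();
                return attribute != null
                    && string.Equals(attribute.Extension, extension, StringComparison.OrdinalIgnoreCase);
            });

        // Activator.CreateInstance needs a public parameterless constructor.
        return type == null ? null : (IFileProcessor)Activator.CreateInstance(type);
    }
}
```

The manager would call something like `ProcessorLocator.Create(Path.GetExtension(fPath), typeof(DocProcessor).Assembly)`. Scanning the assembly on every call is slow, so caching the extension-to-type map once at startup is probably worthwhile.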

I agree with Highmastdon; his factory is a good solution. The core idea is that your ProcessingManager no longer references any concrete FileProcessor implementation, only the IFileProcessor interface. ProcessingManager therefore does not know which type of file it is dealing with; it only knows that it has an IFileProcessor that implements processFile(Stream inputFile).
In the long run, you only have to write new FileProcessor implementations, and voila: ProcessingManager does not change over time.

Use one more method called CanHandle for example:
abstract class FileProcessor
{
    public FileProcessor()
    {
    }
    public abstract void Process(string path);
    public abstract bool CanHandle(string path);
}
For Excel files, you can implement CanHandle as below:
class ExcelProcessor: FileProcessor
{
public override void Process(string path)
{
}
public override bool CanHandle(string path)
{
return path.EndsWith(".xlsx");
}
}
In ProcessingManager, you need a list of processors, which you can extend at runtime through a RegisterProcessor method:
class ProcessingManager
{
private List<FileProcessor> _processors = new List<FileProcessor>();
public void RegisterProcessor(FileProcessor processor)
{
_processors.Add(processor);
}
....
LINQ can then be used to find the appropriate processor:
while(some_condition)
{
string fPath= _this.m_list.Dequeue();
var processor = _processors.SingleOrDefault(p => p.CanHandle(fPath));
if (processor != null)
processor.Process(fPath);
}
If you want to add another processor, just define it and add it to ProcessingManager with the RegisterProcessor method. You don't have to change any other class, not even a FileProcessorFactory as in @Highmastdon's answer.

You could use the Factory pattern (a good choice).
With the Factory pattern it is possible to avoid changing the existing code (following the SOLID principles).
If support for a new Doc file type has to be added in the future, you could use a dictionary instead of modifying the switch statement.
//Some abstract code to get you started (it's 2 am... not a good time to write working code)
1. Define a new dictionary of {FileType, IFileProcessor}.
2. Add the available classes to the dictionary.
3. When a new requirement comes up tomorrow, simply do this:
Dictionary.Add(FileType.Docx, new DocFileProcessor());
4. Parse the user input value into an enum (TryParse).
5. Use the enum value to look up the object that does your work!
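The numbered steps above can be sketched roughly as follows. This version is an illustration under assumptions: it keys the dictionary by file extension string instead of a `FileType` enum, stores factory delegates rather than instances, and uses a simplified `IFileProcessor` whose `Process` method returns a string. `FileProcessorRegistry` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Simplified processor contract for this example.
public interface IFileProcessor
{
    string Process(string path);
}

public sealed class PdfFileProcessor : IFileProcessor
{
    public string Process(string path) => "pdf:" + path;
}

public sealed class DocFileProcessor : IFileProcessor
{
    public string Process(string path) => "doc:" + path;
}

public sealed class FileProcessorRegistry
{
    // Extension -> factory delegate; lookups ignore case so ".PDF" matches ".pdf".
    private readonly Dictionary<string, Func<IFileProcessor>> _factories =
        new Dictionary<string, Func<IFileProcessor>>(StringComparer.OrdinalIgnoreCase);

    // Supporting a new file type is one Register call; no switch to edit.
    public void Register(string extension, Func<IFileProcessor> factory)
    {
        _factories[extension] = factory;
    }

    // Returns false (and a null processor) when the extension is not registered.
    public bool TryCreate(string path, out IFileProcessor processor)
    {
        var extension = Path.GetExtension(path);
        if (!string.IsNullOrEmpty(extension) && _factories.TryGetValue(extension, out var factory))
        {
            processor = factory();
            return true;
        }

        processor = null;
        return false;
    }
}
```

Registration happens once at startup, for example `registry.Register(".pdf", () => new PdfFileProcessor())`, so the code that consumes the registry never changes when a new type is added.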
Another option, arguably a better one, is MEF (the Managed Extensibility Framework).
That way, the classes are discovered dynamically.
For example, if support for .doc needs to be implemented, you could use something like this:
[Export(typeof(IFileProcessor))]
class DocFileProcessor : IFileProcessor
{
DocFileProcessor(FileType type) { }
// Implement the functionality for the .docx document type in processFile() here
}
Advantages of this method:
Your DocFileProcessor class is discovered automatically because it is exported as an IFileProcessor.
The application always stays extensible: you import all the parts once, pick the matching ones, and execute them. It's that simple!

How to refer to multiple classes without creating a mess?

I have a situation where I have 8 steps (think of it as a wizard). Each step does something different, so I've created 8 classes. Each of the classes needs some information from the previous steps (classes). All the classes are called from one main class. The neatest way I've found to handle this situation is:
public void Main()
{
var step1 = new Step1();
step1.Process();
var step2 = new Step2(step1);
step2.Process();
var step3 = new Step3(step1, step2);
//...
var step8 = new Step8(step1, step2, step3, step4, step5, step6, step7);
step8.Process();
}
Obviously, this is a mess. I don't want to pass that many parameters, and I don't want to use static classes (probably not a good practice).
How would you handle such a situation?

This sounds to me like something that you could accomplish via a Chain of Responsibility Pattern. That is the direction that I would look into at least.
If you go down that path, then you will leave yourself open to a cleaner implementation of adding/removing steps in the future.
And, as far as the multiple data sets, John Koerner is correct in that you should have one data model that is updated in each step. This will allow you to implement a clean chain of responsibility.

Have a single class that is your datamodel that can be used throughout the processes. The steps update their piece of the datamodel and that is the only object passed to each subsequent step.

Seems like Java's inner classes are better suited for this than anything C# has. But, C# is so much better in so many other aspects, we'll let this one pass.
You should create one class that contains all your data. If your steps are simple, you should have one method per step in that one class. If your steps are complicated, separate them into classes, but give each of them access to the data class.

You can have an interface IProcess with a method Run(Wizard) and a property Name, several processes that each implement IProcess, and a class Wizard that keeps the processes to run in a list. So:
class Wizard
{
private IList<IProcess> processes = new List<IProcess>();
public T GetProcess<T>(string name)
where T : IProcess
{
return (T)processes.Single(x => x.Name == name);
}
public void Run()
{
foreach (var proc in processes)
proc.Run(this);
}
}
Every process can access the wizard through the argument of the Run method, or receive it in its constructor. By calling wizard.GetProcess<Process1>("some name") you can get a process that was executed previously (you can add a check for that).
Another option is to store the results in the Wizard class.
This is only one of many variants. You can also look at the Chain of Responsibility Pattern, as Justin suggests.

I would say this is a classic case for a variation of Chain of Responsibility.
Here is an example:
class Request
{
    private readonly List<string> _data = new List<string>();
    public IEnumerable<string> GetData()
    {
        return _data.AsReadOnly();
    }
    public void AddData(string value)
    {
        _data.Add(value);
    }
}
abstract class Step
{
protected Step _nextStep;
public void SetSuccessor(Step step)
{
_nextStep = step;
}
public abstract void Process(Request request);
}
sealed class Step1 : Step
{
public override void Process(Request request)
{
var data = request.GetData();
Console.Write("Request processed by");
foreach (var datum in data)
{
Console.Write(" {0} ", datum);
}
Console.WriteLine("Now is my turn!");
request.AddData("step1");
_nextStep.Process(request);
}
}
// Other steps omitted.
sealed class Step8 : Step
{
public override void Process(Request request)
{
var data = request.GetData();
Console.Write("Request processed by");
foreach (var datum in data)
{
Console.Write(" {0} ", datum);
}
Console.WriteLine("Guess we're through, huh?");
}
}
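void Main()
{
    var step1 = new Step1();
    // ...
    var step8 = new Step8();
    // Link each step to the one that follows it: step1 -> step2 -> ... -> step8.
    step1.SetSuccessor(step2);
    // ...
    step7.SetSuccessor(step8);
    var req = new Request();
    step1.Process(req);
}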

Create just one class and use different methods as steps:
class Wizard
{
int someIntInfo;
string someStringInfo;
...
public void ProcessStep1() { ... }
public void ProcessStep2() { ... }
public void ProcessStep3() { ... }
public void ProcessStep4() { ... }
}
Or create a step interface and an info interface, and declare the wizard so that it passes the same info to all steps:
interface IWizardInfo
{
int someIntInfo { get; set; }
string someStringInfo { get; set; }
...
}
interface IStep
{
void Process(IWizardInfo info);
}
class Wizard
{
IWizardInfo _info = ....;
IStep[] _steps = new IStep[] { new Step1(), new Step2(), ... };
int _currentStep;
void ProcessCurrentStep()
{
_steps[_currentStep++].Process(_info);
}
}
EDIT:
Create a compound class which can hold all previous steps
class Step1 { public Step1(AllSteps steps) { steps.Step1 = this; } ... }
class Step2 { public Step2(AllSteps steps) { steps.Step2 = this; } ... }
class Step3 { public Step3(AllSteps steps) { steps.Step3 = this; } ... }
class AllSteps
{
public Step1 Step1 { get; set; }
public Step2 Step2 { get; set; }
public Step3 Step3 { get; set; }
}
Pass the same info to all steps. The steps are responsible to add themselves to the info
AllSteps allSteps = new AllSteps();
var stepX = new StepX(allSteps);

Why not create one single class for all of your steps and implement state management within that class? For example:
class Steps
{
private int _stepIndex = 0;
public void Process()
{
switch(_stepIndex)
{
case 0: // First Step
... // Perform business logic for step 1.
break;
case 1: // Second Step
... // Perform business logic for step 2.
break;
}
_stepIndex++;
}
}

I would have two ArrayLists (or, depending on your class and method structure, they might be simple Lists): one for the methods (as delegates) and one for the results.
A foreach loop would then go through the delegates, invoke each one with the results list as a parameter (you might adapt your methods to accept such a parameter and work with it), and add its result to the results list.
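A minimal sketch of this idea, assuming generic lists instead of ArrayList and a hypothetical `StepRunner` helper, could look like this:

```csharp
using System;
using System.Collections.Generic;

public static class StepRunner
{
    // Runs the steps in order. Each step receives the results produced so far,
    // and its own result is appended to the list afterwards.
    public static List<object> Run(IEnumerable<Func<IReadOnlyList<object>, object>> steps)
    {
        var results = new List<object>();
        foreach (var step in steps)
        {
            results.Add(step(results));
        }
        return results;
    }
}
```

Each step is then a delegate such as `previous => (int)previous[0] * 10`. The casts show the price of this approach: results are untyped, so a step has to know the position and type of whatever it reads from earlier steps.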

Architectural/best-practices question about generics

I'm working my way through 'Head First Design Patterns' and want to put it into practice immediately.
I'm writing a piece of code that connects an application with other applications. Specifically, I need to generate an XML file and send it as an e-mail attachment. But other things might be required in the future.
Thus, I identified 'the things that change':
- The data for the transmission
- The means of transmitting (could be e-mail, but could be FTP or webservice for another data-exchange)
So, I:
- Created an abstract class DataObject
- Created an interface ITransmissionMethod
- Created an abstract class DataExchange:
abstract class DataExchange<T,U>
{
private T DataObject;
private U SendMethod;
}
And SendViaMail looks like this:
class SendViaMail : ISendMethod<System.Net.Mail.Attachment>
{
public void Send(System.Net.Mail.Attachment dataItem)
{
throw new NotImplementedException();
}
}
Now - I can create classes like:
class MyExchange : DataExchange<MyDataObject,SendViaMail> { }
What do you think about this approach? What I would really like to do now is create an abstract method in DataExchange that looks something like this:
protected abstract [the type of the T in ISendMethod<T>] PrepareObjectForSending(T dataObject);
Visual Studio would then force me to implement a method like:
protected override System.Net.Mail.Attachment PrepareObjectForSending(MyDataObject dataObject) {
// Serialize XML file and make it into attachment object
}
Wouldn't that be sweet? But what do you think about this approach? In the future, people can create new data objects and new send methods and the code would still work. What I've been trying to do is program against interfaces and extract the parts that change. How about it?
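One way the desired abstract method could be expressed, as a sketch under assumptions, is to make the payload type a type parameter of DataExchange instead of the sender class itself. To keep the example self-contained, a plain string stands in for System.Net.Mail.Attachment; `RecordingSender` and the XML format are invented for illustration.

```csharp
using System.Collections.Generic;

public interface ISendMethod<TPayload>
{
    void Send(TPayload payload);
}

// TPayload plays the role of "the type of the T in ISendMethod<T>".
public abstract class DataExchange<TData, TPayload>
{
    private readonly ISendMethod<TPayload> _sendMethod;

    protected DataExchange(ISendMethod<TPayload> sendMethod)
    {
        _sendMethod = sendMethod;
    }

    // Subclasses decide how a data object becomes something the sender accepts.
    protected abstract TPayload PrepareObjectForSending(TData dataObject);

    public void Exchange(TData dataObject)
    {
        _sendMethod.Send(PrepareObjectForSending(dataObject));
    }
}

public sealed class MyDataObject
{
    public string Id { get; set; }
}

// Stand-in sender that records payloads instead of mailing them.
public sealed class RecordingSender : ISendMethod<string>
{
    public List<string> Sent { get; } = new List<string>();

    public void Send(string payload) => Sent.Add(payload);
}

public sealed class MyExchange : DataExchange<MyDataObject, string>
{
    public MyExchange(ISendMethod<string> sendMethod) : base(sendMethod)
    {
    }

    // The compiler forces this signature: MyDataObject in, string (the payload type) out.
    protected override string PrepareObjectForSending(MyDataObject dataObject)
    {
        return "<data id=\"" + dataObject.Id + "\" />";
    }
}
```

Because the sender is passed in as `ISendMethod<TPayload>`, any sender that accepts the same payload type can be swapped in without touching `MyExchange`.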

That would work, but you could separate concerns even more. Here is another version: keep DataExchange very simple and delegate the real work to workers:
class DataExchange<TDataObject, TTransmissionObject>
{
    IConverter<TDataObject, TTransmissionObject> converter;
    ISendMethod<TTransmissionObject> sender;
    public DataExchange(IConverter<TDataObject, TTransmissionObject> converter, ISendMethod<TTransmissionObject> sender)
    {
        this.converter = converter;
        this.sender = sender;
    }
    public void Send(TDataObject dataObject)
    {
        TTransmissionObject tro = converter.Convert(dataObject);
        sender.Send(tro);
    }
}
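Converters would just convert data objects to objects suitable for transmission:
class DataToAttachmentConverter : IConverter<DataObject, Attachment>
{
    public Attachment Convert(DataObject dataObject) { /* build the attachment */ throw new NotImplementedException(); }
}
class DataToXmlConverter : IConverter<DataObject, XmlDocument>
{
    public XmlDocument Convert(DataObject dataObject) { /* build the XML document */ throw new NotImplementedException(); }
}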
Senders would only send:
class MailSender : ISendMethod<Attachment>
{
    public void Send(Attachment attachment) {}
}
class FtpPublisher : ISendMethod<XmlDocument>
{
    public void Send(XmlDocument document) {}
}
Putting it all together:
// The two exchanges have different closed generic types, so they cannot share one implicitly typed array.
var mailExchange = new DataExchange<DataObject, Attachment>(new DataToAttachmentConverter(), new MailSender());
var ftpExchange = new DataExchange<DataObject, XmlDocument>(new DataToXmlConverter(), new FtpPublisher());
mailExchange.Send(dataObject); // send as an attachment
ftpExchange.Send(dataObject); // and put on the FTP site
