I am writing an app that processes a bunch of ticker data from a page. The main class that I am working with is called Instrument, which is used to store all the relevant data pertaining to any instrument. The data is downloaded from a website, and parsed.
class Instrument
{
string Ticker {get; set;}
InstrumentType Type {get; set;}
DateTime LastUpdate {get; set;}
}
My issue is that I am not sure how to properly structure the classes that deal with the parsing of the data. Not only do I need to parse data to fill in many different fields (Tickers, InstrumentType, Timestamps etc.), but because the data is pulled from a variety of sources, there is no one standard pattern that will handle all of the parsing. There are even some parsing methods that need to make use of lower level parsing methods (situations where I regex parse the stock/type/timestamp from a string, and then need to individually parse the group matches).
My initial attempt was to create one big class ParsingHandler that contained a bunch of methods to deal with every particular parsing nuance, and add that as a field to the Instrument class, but I found that many times, as the project evolved, I was forced to either add methods, or add parameters to adapt the class for new unforeseen situations.
class ParsingHandler
{
string GetTicker(string haystack);
InstrumentType GetType(string haystack);
DateTime GetTimestamp(string haystack);
}
After trying to adopt a more interface-centric design methodology, I tried an alternate route and defined this interface:
interface IParser<outParam, inParam>
{
outParam Parse(inParam data);
}
And then using that interface I defined a bunch of parsing classes that deal with every particular parsing situation. For example:
class InstrumentTypeParser : IParser<InstrumentType, string>
{
InstrumentType Parse(string data);
}
class RegexMatchParser : IParser<Instrument, Match>
{
    public RegexMatchParser(
        IParser<string, string> tickerParser,
        IParser<InstrumentType, string> instrumentParser,
        IParser<DateTime, string> timestampParser)
    {
        // store into private fields
    }
    public Instrument Parse(Match haystack)
    {
        var instrument = new Instrument();
        //parse everything
        return instrument;
    }
}
This seems to work fine, but I am now in a situation where it seems like I have a ton of implementations that I will need to pass into class constructors. It seems to be dangerously close to being incomprehensible. My thoughts on dealing with it are to now define enums and dictionaries that will house all the particular parsing implementations, but I am worried that this is incorrect, or that I am heading down the wrong path in general with this fine-grained approach. Is my methodology too segmented? Would it be better to have one main parsing class with a ton of methods like I originally had? Are there alternative approaches for this particular type of situation?
I wouldn't agree with the attempt to make the parser as general as IParser<TOut, TIn>. I mean, something like InstrumentParser looks quite sufficient to deal with instruments.
Anyway, as you are parsing different things, like dates from Match objects and similar, then you can apply one interesting technique that deals with generic arguments. Namely, you probably want to have no generic arguments in cases when you know what you are parsing (like string to Instrument - why generics there?). In that case you can define special interfaces and/or classes with reduced generic arguments list:
interface IStringParser<T>: IParser<T, string> { }
You will probably parse data from strings anyway. In that case, you can provide a general-purpose class which parses from Match objects:
class RegexParser<T>: IStringParser<T>
{
    Regex regex;
    IParser<T, Match> parser;
    public RegexParser(Regex regex, IParser<T, Match> containedParser)
    {
        this.regex = regex;
        this.parser = containedParser;
    }
    ...
    public T Parse(string data)
    {
        return parser.Parse(regex.Match(data));
    }
}
By repeatedly applying this technique, you can make your top-most consuming classes only depend on non-generic interfaces or interfaces with one generic member. Intermediate classes would wrap around more complicated (and more specific) implementations and it all becomes just a configuration issue.
The goal is always to make the consuming class as simple as possible. Therefore, try to wrap specifics and hide them away from the consumer.
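As a rough sketch of how the wrapping described above might look end to end: the consumer below only sees a non-generic InstrumentParser, while the regex and the Match-level parser are hidden inside it. The "TICKER|Type|yyyy-MM-dd" input format, the InstrumentType values, and the InstrumentMatchParser and InstrumentParser classes are assumptions made up for illustration, not something taken from the question.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public enum InstrumentType { Stock, Bond } // hypothetical values

public class Instrument
{
    public string Ticker { get; set; }
    public InstrumentType Type { get; set; }
    public DateTime LastUpdate { get; set; }
}

public interface IParser<TOut, TIn> { TOut Parse(TIn data); }

// Reduced generic argument list: the input is always a string.
public interface IStringParser<T> : IParser<T, string> { }

// General-purpose adapter: string -> Match -> T.
public class RegexParser<T> : IStringParser<T>
{
    private readonly Regex regex;
    private readonly IParser<T, Match> parser;

    public RegexParser(Regex regex, IParser<T, Match> containedParser)
    {
        this.regex = regex;
        this.parser = containedParser;
    }

    public T Parse(string data) => parser.Parse(regex.Match(data));
}

// Hypothetical Match-level parser that reads named groups.
public class InstrumentMatchParser : IParser<Instrument, Match>
{
    public Instrument Parse(Match match)
    {
        if (!match.Success)
            throw new FormatException("Input did not match the expected pattern.");

        return new Instrument
        {
            Ticker = match.Groups["ticker"].Value,
            Type = (InstrumentType)Enum.Parse(
                typeof(InstrumentType), match.Groups["type"].Value, true),
            LastUpdate = DateTime.ParseExact(
                match.Groups["date"].Value, "yyyy-MM-dd", CultureInfo.InvariantCulture)
        };
    }
}

// Non-generic facade: the only type the consuming code needs to know.
public class InstrumentParser : IStringParser<Instrument>
{
    private readonly IStringParser<Instrument> inner;

    public InstrumentParser()
    {
        // Assumed line format: TICKER|Type|yyyy-MM-dd
        var regex = new Regex(
            @"^(?<ticker>[A-Z]+)\|(?<type>\w+)\|(?<date>\d{4}-\d{2}-\d{2})$");
        inner = new RegexParser<Instrument>(regex, new InstrumentMatchParser());
    }

    public Instrument Parse(string data) => inner.Parse(data);
}
```

With this in place, consuming code would call something like new InstrumentParser().Parse("MSFT|Stock|2024-01-15") and never touch Regex, Match, or the two-argument IParser directly. Supporting another source then means writing another small facade with a different regex, which is the "configuration issue" mentioned above.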
Related
I've been working on learning how to use interfaces correctly in C# and I think I mostly understand how they should be used, but I still feel confused about certain things.
I want to create a program that will create a CSV from Sales Orders or Invoices. Since they are both very similar I figured I could create an IDocument interface that could be used to make a CSV document.
class Invoice : IDocument
{
public Address billingAddress { get; set; }
public Address shippingAddress { get; set; }
public Customer customer { get; set; }
public List<DocumentLine> lines { get; set; }
// other class specific info for invoice goes here
}
I can create a method CreateCSV(IDocument) but how would I deal with the few fields that differ from Sales Orders and Invoices? Is this a bad use of interfaces?
You don't inherit interfaces, you implement them; and in this case the interface is an abstraction; it says "all things that implement this interface have the following common characteristics (properties, methods, etc)"
In your case, you have found that in fact Invoices and Sales Orders don't quite share the exact same characteristics.
Therefore from the point of view of representing them in CSV format, it's not a great abstraction (although for other things, like calculating the value of the document, it's an excellent one)
There are a number of ways you can work around this though, here are two (of many)
Delegate the work to the classes
You can declare an ICanDoCSVToo interface that returns the document in some kind of structure that represents CSV (let's say a CSVFormat class that wraps a collection of Fields and Values).
Then you can implement this on both Invoices and Sales Orders, specifically for those use cases, and when you want to turn either of them into CSV format, you pass them by the ICanDoCSVToo interface.
However I personally don't like that as you don't really want your Business Logic mixed up with your export/formatting logic - that's a violation of the SRP. Note you can achieve the same effect with abstract classes but ultimately it's the same concept - you allow someone to tell the class that knows about itself, to do the work.
Delegate the work to specialised objects via a factory
You can also create a Factory class - let's say a CSVFormatterFactory, which given an IDocument object figures out which formatter to return - here is a simple example
public class CSVFormatterFactory
{
    public ICSVFormatter GetFormatter(IDocument document)
    {
        //we've added DocType to IDocument to identify the document type.
        if (document.DocType == DocumentTypes.Invoice)
        {
            return new InvoiceCSVFormatter(document);
        }
        if (document.DocType == DocumentTypes.SalesOrders)
        {
            return new SalesOrderCSVFormatter(document);
        }
        //And so on
        //Fail loudly for document types that have no formatter yet.
        throw new NotSupportedException("No CSV formatter for " + document.DocType);
    }
}
In reality, you might make this generic and use an IoC library to decide which concrete implementation to return, but it's the same concept.
The individual formatters themselves can then cast the IDocument to the correct concrete type, and then do whatever is specifically required to produce a CSV representation of that specialised type.
There are other ways to handle this as well, but the factory option is reasonably simple and should get you up and running whilst you consider the other options.
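To make the "formatters cast the IDocument to the concrete type" step more concrete, here is a minimal sketch. The property names (CustomerName, InvoiceNumber, RequestedDelivery), the Format method on ICSVFormatter, and the column layout are all invented for illustration; the real documents in the question have more fields, and real CSV output would also need quoting and escaping, which is left out here.

```csharp
using System;
using System.Globalization;

public enum DocumentTypes { Invoice, SalesOrders }

public interface IDocument
{
    DocumentTypes DocType { get; }
    string CustomerName { get; } // stands in for the shared fields
}

public class Invoice : IDocument
{
    public DocumentTypes DocType => DocumentTypes.Invoice;
    public string CustomerName { get; set; }
    public string InvoiceNumber { get; set; } // hypothetical invoice-only field
}

public class SalesOrder : IDocument
{
    public DocumentTypes DocType => DocumentTypes.SalesOrders;
    public string CustomerName { get; set; }
    public DateTime RequestedDelivery { get; set; } // hypothetical order-only field
}

// Assumed shape of the formatter interface: one method returning a CSV line.
public interface ICSVFormatter
{
    string Format();
}

public class InvoiceCSVFormatter : ICSVFormatter
{
    private readonly Invoice invoice;

    public InvoiceCSVFormatter(IDocument document)
    {
        // The cast is confined to the one class that knows about invoices.
        invoice = document as Invoice
            ?? throw new ArgumentException("Expected an Invoice.", nameof(document));
    }

    // No quoting or escaping: a sketch only.
    public string Format() =>
        string.Join(",", "Invoice", invoice.InvoiceNumber, invoice.CustomerName);
}

public class SalesOrderCSVFormatter : ICSVFormatter
{
    private readonly SalesOrder order;

    public SalesOrderCSVFormatter(IDocument document)
    {
        order = document as SalesOrder
            ?? throw new ArgumentException("Expected a SalesOrder.", nameof(document));
    }

    public string Format() =>
        string.Join(",", "SalesOrder", order.CustomerName,
            order.RequestedDelivery.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
}
```

The document classes stay free of export logic, and each formatter is the only place that knows which extra fields its document type carries.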
I have a class that holds some details in a large data structure, accepts an algorithm to perform some calculations on it, and has methods to validate inputs to the data structure.
But then I would like to return the data structure, so that it can be transformed into various output forms (string / C# DataTable / custom file output) by the View Model.
class MyProductsCollection {
private IDictionary<string, IDictionary<int, ISet<Period>>> products;
// ctors, verify input, add and run_algorithm methods
}
I know that you are supposed to use the "depend on interface not implementation" design principle, so I want to create an interface for the class.
How can I avoid writing the following interface?
Reason being it would expose implementation details and bind any other concrete implementations to return the same form.
interface IProductsCollection {
IDictionary<string, IDictionary<int, ISet<IPeriod>>> GetData();
// other methods
}
How can I easily iterate over the data structure to form different varieties of outputs without bluntly exposing it like this?
EDIT:
Since the class takes in IFunc<IDictionary<string, IDictionary<int, ISet<IPeriod>>>> in the constructor to iterate over the data structure and perform calculations, I could supply it with another IFunc, which would construct the output instead of running calculations. However, I don't know how I could do this aside from the concrete class constructor.
The structure of the IDictionary<string,IDictionary<int,ISet<Period>>> is very suspicious indeed - when you see a dictionary of dictionaries, good chances are that you have missed an opportunity or two to create a class to encapsulate the inner dictionary.
Without knowing much of the domain of your problem, I would suggest defining an interface to encapsulate the inner dictionary. It looks like something that associates a number to a set of periods, so you would define an interface like this:
interface IYearlyPeriods {
    bool HasPeriodsForYear(int year);
    ISet<Period> GetPeriodsForYear(int year);
}
I have no idea what's in the periods, so you would need to choose a domain-specific name for the interface.
Moreover, you can wrap the next level of IDictionary too:
interface IProductDataSource {
IEnumerable<string> ProductNames { get; }
IYearlyPeriods GetProductData(string productName);
}
Now you can define an interface like this:
interface IProductsCollection {
IProductDataSource GetDataSource();
// other methods
}
The main idea is to use domain-specific interfaces in place of generic collections, so that the readers and implementers of your code would have some idea of what's inside without referring to the documentation.
You could go even further, and replace the complex IDictionary structure that you keep internally with an IDictionary of IYearlyPeriods implementations. If you would like to keep the IYearlyPeriods that you expose to the users immutable, but would like to be able to make modifications yourself, you can make a mutable implementation, and keep it internal to the implementing class.
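One possible way to implement these interfaces, as a sketch: the mutable YearlyPeriods class stays internal, and callers only ever receive it through the read-only IYearlyPeriods interface. The class names YearlyPeriods and ProductDataSource, the Add methods, and the empty Period placeholder are assumptions for illustration.

```csharp
using System.Collections.Generic;

public class Period { } // placeholder for the real domain type

public interface IYearlyPeriods
{
    bool HasPeriodsForYear(int year);
    ISet<Period> GetPeriodsForYear(int year);
}

public interface IProductDataSource
{
    IEnumerable<string> ProductNames { get; }
    IYearlyPeriods GetProductData(string productName);
}

// Mutable implementation, kept internal to the implementing assembly.
internal class YearlyPeriods : IYearlyPeriods
{
    private readonly Dictionary<int, ISet<Period>> periodsByYear =
        new Dictionary<int, ISet<Period>>();

    public void Add(int year, Period period)
    {
        if (!periodsByYear.TryGetValue(year, out var set))
        {
            set = new HashSet<Period>();
            periodsByYear[year] = set;
        }
        set.Add(period);
    }

    public bool HasPeriodsForYear(int year) => periodsByYear.ContainsKey(year);

    // Returns a copy so callers cannot modify the internal set.
    public ISet<Period> GetPeriodsForYear(int year) =>
        new HashSet<Period>(periodsByYear[year]);
}

public class ProductDataSource : IProductDataSource
{
    private readonly Dictionary<string, YearlyPeriods> products =
        new Dictionary<string, YearlyPeriods>();

    public void Add(string productName, int year, Period period)
    {
        if (!products.TryGetValue(productName, out var periods))
        {
            periods = new YearlyPeriods();
            products[productName] = periods;
        }
        periods.Add(year, period);
    }

    public IEnumerable<string> ProductNames => products.Keys;

    public IYearlyPeriods GetProductData(string productName) => products[productName];
}
```

The nested dictionary still exists, but it is now an implementation detail: another implementation of IProductDataSource could store its data in a database or a flat list without changing any consumer.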
I would suggest keeping the IDictionary private and providing a simple IEnumerable in the interface.
In your case this could be a custom DTO that hides all the nastiness of the IDictionary<int, ISet<IPeriod>> - which is already quite complex and could (probably) easily change as you need to implement new features.
This could be something like:
class ExposedPeriod
{
public int PeriodIdentifier { get; set; }
public IEnumerable<IPeriod> Periods { get; set; }
}
The ExposedPeriod and PeriodIdentifier probably need better names though. Good names might be found in your domain vocabulary.
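A minimal sketch of how the private dictionary could be projected into that DTO, assuming a hypothetical ProductPeriods class that owns the dictionary and a GetPeriods method on it (neither is from the question):

```csharp
using System.Collections.Generic;
using System.Linq;

public interface IPeriod { } // placeholder for the real domain interface

public class ExposedPeriod
{
    public int PeriodIdentifier { get; set; }
    public IEnumerable<IPeriod> Periods { get; set; }
}

public class ProductPeriods
{
    // Stays private; its shape can change without affecting callers.
    private readonly Dictionary<int, ISet<IPeriod>> periods =
        new Dictionary<int, ISet<IPeriod>>();

    public void Add(int identifier, IPeriod period)
    {
        if (!periods.TryGetValue(identifier, out var set))
        {
            set = new HashSet<IPeriod>();
            periods[identifier] = set;
        }
        set.Add(period);
    }

    // Projects each dictionary entry into a DTO; ToList() copies the set
    // so the caller does not hold a reference to internal state.
    public IEnumerable<ExposedPeriod> GetPeriods() =>
        periods.Select(pair => new ExposedPeriod
        {
            PeriodIdentifier = pair.Key,
            Periods = pair.Value.ToList()
        });
}
```

A view model can then loop over GetPeriods() to build a string, a DataTable, or a file, without knowing that a dictionary sits behind it.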
I'm refactoring a class that represents the data in some XML. Currently, the class loads the XML itself and property implementations parse the XML every time. I'd like to factor out the XML logic and use a factory to create these objects. But there are several 'optional' properties and I'm struggling to find an elegant way to handle this.
Let's say the XML looks like this:
<data>
<foo>a</foo>
<bar>b</bar>
</data>
Assume both foo and bar are optional. The class implementation looks something like this:
interface IOptionalFoo
{
public bool HasFoo();
public string Foo { get; }
}
// Assume IOptionalBar is similar
public class Data : IOptionalFoo, IOptionalBar
{
// ...
}
(Don't ask me why there's a mix of methods and properties for it. I didn't design that interface and it's not changing.)
So I've got a factory and it looks something like this:
class DataFactory
{
    public static Data Create(string xml)
    {
        var dataXml = new DataXml(xml);
        if (dataXml.HasFoo())
        {
            // ???
        }
        // Create and return the object based on the data that was gathered
    }
}
This is where I can't seem to settle on an elegant solution. I've done some searching and found some solutions I don't like. Suppose I leave out all of the optional properties from the constructor:
I can implement Foo and Bar as read/write on Data. This satisfies the interface but I don't like it from a design standpoint. The properties are meant to be immutable and this fudges that.
I could provide SetFoo() and SetBar() methods in Data. This is just putting lipstick on the last method.
I could use the internal access specifier; for the most part I don't believe this class is being used outside of its assembly so again it's just a different way to do the first technique.
The only other solution I can think of involves adding some methods to the data class:
class Data : IOptionalFoo, IOptionalBar
{
public static Data WithFoo(Data input, string foo)
{
input.Foo = foo;
return input;
}
}
If I do that, the setter on Foo can be private and that makes me happier. But I don't really like littering the data object with a lot of creation methods, either. There's a LOT of optional properties. I've thought about making some kind of DataInitialization object with a get/set API of nullable versions for each property, but so many of the properties are optional it'd end up more like the object I am refactoring becomes a facade over a read/write version. Maybe that's the best solution: an internal read/write version of the class.
Have I enumerated the options? Do I need to quit being so picky and settle on one of the techniques above? Or is there some other solution I haven't thought of?
You can look into keywords such as virtual, Castle DynamicProxy, reflection, and T4 scripts; each one can solve the problem from a slightly different angle.
On another note, this seems perfectly reasonable, unless I misunderstood you:
private void CopyFrom(DataXml dataXml) // in Data class
{
if (dataXml.HasFoo()) Foo = dataXml.Foo;
//etc
}
What I did:
I created a new class that represents a read/write interface for all of the properties. Now the constructor of the Data class takes an instance of that type and wraps the read/write properties with read-only versions. It was a little tedious, but wasn't as bad as I thought.
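That approach might look roughly like the sketch below. The DataValues name is made up, IOptionalBar is assumed to mirror IOptionalFoo, and a null value is assumed to mean "not present in the XML". This version copies the values in the constructor rather than holding on to the mutable object, which is a slight variation on "wrapping" chosen so that later changes to the carrier cannot leak into Data.

```csharp
public interface IOptionalFoo
{
    bool HasFoo();
    string Foo { get; }
}

// Assumed to mirror IOptionalFoo, as the question says.
public interface IOptionalBar
{
    bool HasBar();
    string Bar { get; }
}

// Hypothetical read/write carrier that the factory fills in.
// A null property means the element was absent from the XML.
public class DataValues
{
    public string Foo { get; set; }
    public string Bar { get; set; }
}

public class Data : IOptionalFoo, IOptionalBar
{
    private readonly string foo;
    private readonly string bar;

    public Data(DataValues values)
    {
        // Copy once; Data itself exposes no setters.
        foo = values.Foo;
        bar = values.Bar;
    }

    public bool HasFoo() => foo != null;
    public string Foo => foo;

    public bool HasBar() => bar != null;
    public string Bar => bar;
}
```

The factory would then set only the properties it finds, for example values.Foo = dataXml.Foo inside the HasFoo() check, and finish with return new Data(values).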
I have some code that gets passed a class derived from a certain class. Let's call this a parameter class.
The code uses reflection to walk the class' members and analyze certain custom attributes given to them. Basically, it's a configurable parser which will analyze input according to the attributes and put what it found into the data members.
This is used in several places in our code. You specify the parameter class, putting in attributed data members, and pass this to the parser. Something like this:
public class MyFancyParameters : ParametersBase
{
[SomeAttribute(Name="blah", AnotherParam=true)]
public string Blah { get; set; }
// ... more such stuff
}
var parameters = new MyFancyParameters();
Parser.Parse(input, parameters);
In many places there are similar groups of attributed data members that need to get parsed. So the parameter classes are, in some places, similar. That's redundant and that, of course, hurts. Whenever I need a change in such an area, I need to make that change in half a dozen places, all clones. It's just a matter of time before these parts start to drift apart.
However, the similarities cannot be grouped in acyclic graphs, so I can't use single inheritance to group them.
What I would do in C++ is to put these chunks of similar stuff into their own classes, just inherit a bunch of them that contain whatever I need, and be done. (I think that's referred to as mix-in inheritance.)
C#, however, doesn't have multiple inheritance. So I was thinking of putting these chunks into data members and change the parser to recurse into data members. But that would considerably complicate the parser.
What else is there?
Can you have your parser accept a collection of parameter classes instead of a single parameter class? Alternately, you could allow the parser to recurse into your parameter class and have it supply additional parameter classes as properties. Basically, every property of a ParametersBase derived class that inherits from type ParametersBase is recursed into and flattened into a single list of parameters.
Actually, I just saw that you already mentioned the recursive solution. I think this is probably your best bet and it's not too complex to support. You should be able to create a helper function for enumerating the parameter properties that makes a hierarchy look like a flat class.
Here's some code that would provide a 'flattened' view of your properties, if I understand your requirement correctly. You'll probably want to augment the production code with additional safeguards (such as keeping a stack of types to detect circular references).
public class ParametersParser
{
    public static IEnumerable<PropertyInfo> GetAllParameterProperties(Type parameterType)
    {
        foreach (var property in parameterType.GetProperties())
        {
            if (Attribute.IsDefined(property, typeof(SomeAttribute)))
                yield return property;
            if (typeof(ParametersBase).IsAssignableFrom(property.PropertyType))
            {
                foreach (var subProperty in GetAllParameterProperties(property.PropertyType))
                    yield return subProperty;
            }
        }
    }
}
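The circular-reference safeguard mentioned above could be sketched like this: the walker keeps a stack of the types currently being expanded and throws if a type shows up inside itself. The ParameterWalker name, the simplified SomeAttribute, and the sample Address, Order, and Node classes are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

[AttributeUsage(AttributeTargets.Property)]
public class SomeAttribute : Attribute
{
    public string Name { get; set; } // simplified stand-in for the real attribute
}

public abstract class ParametersBase { }

public static class ParameterWalker
{
    public static IEnumerable<PropertyInfo> GetAllParameterProperties(Type parameterType)
    {
        return Walk(parameterType, new Stack<Type>());
    }

    private static IEnumerable<PropertyInfo> Walk(Type type, Stack<Type> path)
    {
        // A type that is already on the current path would recurse forever.
        if (path.Contains(type))
            throw new InvalidOperationException(
                "Circular parameter reference involving " + type.Name);

        path.Push(type);
        try
        {
            foreach (var property in type.GetProperties())
            {
                if (Attribute.IsDefined(property, typeof(SomeAttribute)))
                    yield return property;

                if (typeof(ParametersBase).IsAssignableFrom(property.PropertyType))
                {
                    foreach (var sub in Walk(property.PropertyType, path))
                        yield return sub;
                }
            }
        }
        finally
        {
            // Runs even if the caller stops enumerating early.
            path.Pop();
        }
    }
}

// Hypothetical reusable chunk of parameters.
public class Address : ParametersBase
{
    [Some(Name = "street")]
    public string Street { get; set; }
}

// Hypothetical parameter class that reuses the chunk by composition.
public class Order : ParametersBase
{
    [Some(Name = "id")]
    public string Id { get; set; }

    public Address Shipping { get; set; }
}

// Hypothetical class that refers to itself and should be rejected.
public class Node : ParametersBase
{
    public Node Next { get; set; }
}
```

Enumerating ParameterWalker.GetAllParameterProperties(typeof(Order)) yields the Id and Street properties, while doing the same for typeof(Node) throws. Because the method is a lazy iterator, the exception surfaces when the sequence is enumerated, not when the method is called. The same chunk may still appear twice side by side (for example a shipping and a billing address), since only types nested inside themselves are rejected.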
I am trying to create a web-based tool for my company that, in essence, uses geographic input to produce tabular results. Currently, three different business areas use my tool and receive three different kinds of output. Luckily, all of the outputs are based on the same idea of Master Table - Child Table, and they even share a common Master Table.
Unfortunately, in each case the related rows of the Child Table contain vastly different data. Because this is the only point of contention I extracted a FetchChildData method into a separate class called DetailFinder. As a result, my code looks like this:
DetailFinder DetailHandler;
if (ReportType == "Planning")
    DetailHandler = new PlanningFinder();
else if (ReportType == "Operations")
    DetailHandler = new OperationsFinder();
else if (ReportType == "Maintenance")
    DetailHandler = new MaintenanceFinder();
DataTable ChildTable = DetailHandler.FetchChildData(Master);
Where PlanningFinder, OperationsFinder, and MaintenanceFinder are all subclasses of DetailFinder.
I have just been asked to add support for another business area and would hate to continue this if block trend. What I would prefer is to have a parse method that would look like this:
DetailFinder DetailHandler = DetailFinder.Parse(ReportType);
However, I am at a loss as to how to have DetailFinder know what subclass handles each string, or even what subclasses exist without just shifting the if block to the Parse method. Is there a way for subclasses to register themselves with the abstract DetailFinder?
You could use an IoC container; many of them allow you to register multiple services with different names or policies.
For instance, with a hypothetical IoC container you could do this:
IoC.Register<DetailFinder, PlanningFinder>("Planning");
IoC.Register<DetailFinder, OperationsFinder>("Operations");
...
and then:
DetailFinder handler = IoC.Resolve<DetailFinder>("Planning");
or some variation on this theme.
You can look at the following IoC implementations:
Autofac
Unity
Castle Windsor
You might want to use a map from report type names to creational methods:
public class DetailFinder
{
    private static Dictionary<string,Func<DetailFinder>> Creators;
    static DetailFinder()
    {
        Creators = new Dictionary<string,Func<DetailFinder>>();
        Creators.Add( "Planning", CreatePlanningFinder );
        Creators.Add( "Operations", CreateOperationsFinder );
        ...
    }
    public static DetailFinder Create( string type )
    {
        return Creators[type].Invoke();
    }
    private static DetailFinder CreatePlanningFinder()
    {
        return new PlanningFinder();
    }
    private static DetailFinder CreateOperationsFinder()
    {
        return new OperationsFinder();
    }
    ...
}
Used as:
DetailFinder detailHandler = DetailFinder.Create( ReportType );
I'm not sure this is much better than your if statement, but it does make it trivially easy to both read and extend. Simply add a creational method and an entry in the Creators map.
Another alternative would be to store a map of report types and finder types, then use Activator.CreateInstance on the type if you are always simply going to invoke the constructor. The factory method approach detailed above would probably be more appropriate if there were more complexity in the creation of the object.
public class DetailFinder
{
    private static Dictionary<string,Type> Creators;
    static DetailFinder()
    {
        Creators = new Dictionary<string,Type>();
        Creators.Add( "Planning", typeof(PlanningFinder) );
        ...
    }
    public static DetailFinder Create( string type )
    {
        Type t = Creators[type];
        return Activator.CreateInstance(t) as DetailFinder;
    }
}
As long as the big if block or switch statement or whatever it is appears in only one place, it isn't bad for maintainability, so don't worry about it for that reason.
However, when it comes to extensibility, things are different. If you truly want new DetailFinders to be able to register themselves, you may want to take a look at the Managed Extensibility Framework which essentially allows you to drop new assemblies into an 'add-ins' folder or similar, and the core application will then automatically pick up the new DetailFinders.
However, I'm not sure that this is the amount of extensibility you really need.
To avoid an ever-growing if..else block, you could switch it round so that the individual finders register which type they handle with the factory class.
The factory class on initialisation will need to discover all the possible finders and store them in a hashmap (dictionary). This could be done by reflection and/or using the managed extensibility framework as Mark Seemann suggests.
However - be wary of making this overly complex. Prefer to do the simplest thing that could possibly work now, with a view to refactoring when you need it. Don't go and build a complex self-configuring framework if you'll only ever need one more finder type ;)
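For reference, a reflection-based version of that discovery step could be sketched as follows. Each finder declares the report type it handles through an attribute, and the factory scans the assembly once to build its dictionary. The ReportTypeAttribute and DetailFinderFactory names are invented for this sketch, and FetchChildData is simplified to take and return a string instead of the DataTable used in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical marker that lets a finder state which report type it handles.
[AttributeUsage(AttributeTargets.Class)]
public sealed class ReportTypeAttribute : Attribute
{
    public ReportTypeAttribute(string name) { Name = name; }
    public string Name { get; }
}

public abstract class DetailFinder
{
    // Simplified signature for the sketch.
    public abstract string FetchChildData(string master);
}

[ReportType("Planning")]
public class PlanningFinder : DetailFinder
{
    public override string FetchChildData(string master) => "planning:" + master;
}

[ReportType("Operations")]
public class OperationsFinder : DetailFinder
{
    public override string FetchChildData(string master) => "operations:" + master;
}

public static class DetailFinderFactory
{
    // Built once: every concrete DetailFinder in this assembly that
    // carries a ReportTypeAttribute is registered under its name.
    private static readonly Dictionary<string, Type> Finders =
        typeof(DetailFinder).Assembly.GetTypes()
            .Where(t => !t.IsAbstract && typeof(DetailFinder).IsAssignableFrom(t))
            .Select(t => new { Type = t, Attribute = t.GetCustomAttribute<ReportTypeAttribute>() })
            .Where(x => x.Attribute != null)
            .ToDictionary(x => x.Attribute.Name, x => x.Type);

    public static DetailFinder Create(string reportType)
    {
        if (!Finders.TryGetValue(reportType, out var type))
            throw new ArgumentException("Unknown report type: " + reportType, nameof(reportType));

        // Requires a public parameterless constructor on the finder.
        return (DetailFinder)Activator.CreateInstance(type);
    }
}
```

Adding a new business area then means adding one attributed subclass; the factory itself does not change. The trade-off is that the mapping is no longer visible in one place, and two finders claiming the same name would make ToDictionary throw when the factory is first used.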
You can use reflection.
Here is some sample code for the Parse method of DetailFinder (remember to add error checking to it):
public DetailFinder Parse(ReportType reportType)
{
    string detailFinderClassName = GetDetailFinderClassNameByReportType(reportType);
    return Activator.CreateInstance(Type.GetType(detailFinderClassName)) as DetailFinder;
}
The GetDetailFinderClassNameByReportType method can get a class name from a database, from a configuration file, etc.
I think information about the "Plugin" pattern will be useful in your case: P of EAA: Plugin
Like Mark said, a big if/switch block isn't bad since it will all be in one place (all of computer science is basically about getting similarity in some kind of space).
That said, I would probably just use polymorphism (thus making the type system work for me). Have each report implement a FindDetails method (I'd have them inherit from a Report abstract class), since you're going to end up with several kinds of detail finders anyway. This also simulates pattern matching and algebraic datatypes from functional languages.