I am writing a log file decoder which should be capable of reading many different structures of files. My question is how best to represent this data. I am using C#, but am new to OOP.
An example:
The log files have a range of sensor values. One sensor reading can be called A, another B. Obviously, there are many more than 2 entry types.
In different log files, they could be stored either as ABABABABAB or AAAAABBBBB.
I was thinking of describing this as blocks of entries. So in the first case, a block would be 'AB', with 5 blocks. In the second case, the first block is 'A', read 5 times. This is followed by a block of 'B', read 5 times.
This is quite a simplification (there are actually 40 different types of log file, each with up to 40 sensor values in a block). No log has more than 300 blocks.
At the moment, I store all of this in a datatable. I have a column for each entry, with a property of how many to read. If this is set to -1, it continues to the next column in the block. If not, it will assume that it has reached the end of the block.
This all seems quite clumsy. Can anyone suggest a better way of doing this?
I think you should first start here, and then here to learn a little bit about what object oriented programming is. Don't worry about your current problem while learning about OOP.
As you are learning about OO concepts, you should begin to understand code is not data, and data is not code. It does not matter how you represent your data from an OOP stance. You can write OO code to consume your data, or you could write procedural code to consume your data; that part is irrelevant to the format of the data.
So then getting back to your question
My question is how best to represent this data
It depends on your needs. What is writing the log file? Do you have control over both the writer and the reader? If I did, I would rely on the built-in serialization methods to minimize the amount of code I need to write. Is the log file going to be really long? If so, the "datatable" approach you described is usually better. If the log file isn't going to be huge, XML is really easy to work with.
Very basic and straightforward:
Define an interface IEntry with properties like string EntryBlock and int Count
Define a class Entry which represents an entry and implements IEntry
Code which does the binary serialization should depend on the interface; for instance, it should refer to IEnumerable<IEntry>
Class Entry could override ToString() to return something like [ABAB-2], if that would be helpful during serialization
Interface IEntry could provide a method void CreateFromRawString(string rawDataFromLog) if it would be helpful; decide for yourself
If you want more info, please share the code you are using for serialization/deserialization
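To make the list above a bit more concrete, here is a minimal sketch of how IEntry and Entry might describe the two layouts from the question. The LogLayout helper and its ExpectedSequence method are hypothetical names added for illustration only, and the ToString() format is just one guess at the "[ABAB-2]" idea.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One block of a log layout: which sensor codes it holds and how often it repeats.
public interface IEntry
{
    string EntryBlock { get; }  // e.g. "AB" or "A"
    int Count { get; }          // number of repetitions of the block
}

public sealed class Entry : IEntry
{
    public Entry(string entryBlock, int count)
    {
        EntryBlock = entryBlock;
        Count = count;
    }

    public string EntryBlock { get; }
    public int Count { get; }

    // Debug-friendly text in the spirit of the "[ABAB-2]" suggestion.
    public override string ToString() => $"[{EntryBlock}-{Count}]";
}

// Hypothetical helper, not part of the original answer.
public static class LogLayout
{
    // Expands a layout into the order in which sensor codes appear in the file.
    public static string ExpectedSequence(IEnumerable<IEntry> layout) =>
        string.Concat(layout.Select(e =>
            string.Concat(Enumerable.Repeat(e.EntryBlock, e.Count))));
}
```

With this shape, the interleaved file is a single entry, new Entry("AB", 5), and the grouped file is two entries, new Entry("A", 5) followed by new Entry("B", 5). The decoder only needs to walk an IEnumerable<IEntry>, so it does not care which of the 40 file types it is reading.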
In addition to what Bob has offered, I highly recommend Head First Design Patterns as a gentle, but robust introduction to OO for a C# programmer. The samples are in Java, which translate easily to C#.
As for OOP, you want to learn SOLID.
I would suggest you build this using Test Driven Development.
Start small, with a simple fragment of your log data and write a test like (you'll find a better way to do this with experience and apply it to your situation):
[Test]
public void ReadSequence_FiveA_ReturnsProperList()
{
    // Arrange
    string sequenceStub = "AAAAA";
    // Act
    MyFileDecoder decoder = new MyFileDecoder();
    List<string> results = decoder.ReadSequence(sequenceStub);
    // Assert
    Assert.AreEqual(5, results.Count);
    Assert.AreEqual("A", results[0]);
}
That test code snippet is just a starting point, and I've tried to be rather verbose in the assertions. You can come up with more creative ways over time. The point is to start small. Once this test passes, add another test where you mix "AB" and change your decoder to handle this properly. Eventually, you'll have a large set of tests that handle your different formats. Using TDD, you'll be on the path to using SOLID properly. Whenever you find something you can't test, you should review the rules and see if you can't make it simpler and inject dependencies.
Eventually you'll get into mocking. For example, you might find that you'd rather INJECT the ability for your MyFileDecoder class to have a dependency that will read your log file. In that case, you would create a mock object and pass that into the constructor and set the mock to return the sequenceStub when a method is called.
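As a rough sketch of that injection idea, the following uses a hand-written stub instead of a mocking library, so nothing here depends on a particular framework. ILogSource and StubLogSource are hypothetical names, and the "one character per entry" decoding is a simplifying assumption rather than the real log format.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical abstraction over "where the raw log text comes from".
public interface ILogSource
{
    string ReadAll();
}

// Hand-written test double; a mocking library could generate the equivalent.
public sealed class StubLogSource : ILogSource
{
    private readonly string _content;

    public StubLogSource(string content) { _content = content; }

    public string ReadAll() => _content;
}

public sealed class MyFileDecoder
{
    private readonly ILogSource _source;

    // The dependency is injected, so tests never have to touch the file system.
    public MyFileDecoder(ILogSource source) { _source = source; }

    // Simplifying assumption: every character in the source is one entry code.
    public List<string> ReadSequence() =>
        _source.ReadAll().Select(c => c.ToString()).ToList();
}
```

In a test you would then construct new MyFileDecoder(new StubLogSource("AAAAA")) and assert on ReadSequence() exactly as before, while production code passes in an implementation that reads the real file.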
I've been driving myself crazy for hours trying to figure this one out, and I'm not moving anywhere with it.
I'm creating a checkout till for a cashier, specifically I need to sum the items, then apply the promotional discounts. I'm trying to do it without violating any design principles (impossible, I know, I can let things slide when it makes sense).
Promotional discounts could be anything, from a black friday deal flat discount, to 'Orders over £100 save 10%' to '3 for 2 on these items', or 'Buy at least two cans of coke, and the price for them all drops to £0.50!'
I cannot see how to fit the promotional deals in. Each may require a different set of data from different locations. For instance one of the big problems being the '3 for 2' deal. Getting access to the items in the Checkout has been a plague on my mind.
So far, my best approach has been to use the Decorator pattern: wrap the checkout in a chain of promotional deals when the price is calculated. As each decorator holds an instance of the checkout, we'll have access to the original checkout with its list of items.
In the future, the only thing I'd need to do is write the new rule, add it to the factory, and update any DB data, which is perfect: the minimal change.
This kind of works. I can justify it in my head that it's still a checkout, and therefore all the rules being able to access the checkout make sense, gives me a nice way to chain the discount. But there is a problem, in that I'm sure I shouldn't be using it the way I'm suggesting. For instance, if one of the promotions wears off, you shouldn't really 'unwrap' it, and realistically while it's nice to be able to add promotions dynamically to extend an instance, it's not necessary.
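For reference, the decorator arrangement described above might look roughly like this. It is only a sketch of the idea: Item, ICheckout, BasicCheckout and the two promotion classes are hypothetical names, and the factory and database parts are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Item
{
    public Item(string name, decimal price) { Name = name; Price = price; }
    public string Name { get; }
    public decimal Price { get; }
}

public interface ICheckout
{
    IReadOnlyList<Item> Items { get; }
    decimal Total();
}

public sealed class BasicCheckout : ICheckout
{
    public BasicCheckout(IEnumerable<Item> items) { Items = items.ToList(); }
    public IReadOnlyList<Item> Items { get; }
    public decimal Total() => Items.Sum(i => i.Price);
}

// "Orders over a threshold save a percentage", applied to the wrapped total.
public sealed class PercentOffOverThreshold : ICheckout
{
    private readonly ICheckout _inner;
    private readonly decimal _threshold;
    private readonly decimal _percent;

    public PercentOffOverThreshold(ICheckout inner, decimal threshold, decimal percent)
    {
        _inner = inner; _threshold = threshold; _percent = percent;
    }

    public IReadOnlyList<Item> Items => _inner.Items;

    public decimal Total()
    {
        decimal total = _inner.Total();
        return total > _threshold ? total * (1 - _percent / 100m) : total;
    }
}

// "3 for 2" on one product: every third matching item is free.
public sealed class ThreeForTwo : ICheckout
{
    private readonly ICheckout _inner;
    private readonly string _name;

    public ThreeForTwo(ICheckout inner, string name) { _inner = inner; _name = name; }

    public IReadOnlyList<Item> Items => _inner.Items;

    public decimal Total()
    {
        var matching = Items.Where(i => i.Name == _name).ToList();
        int free = matching.Count / 3;
        decimal unitPrice = matching.Count > 0 ? matching[0].Price : 0m;
        return _inner.Total() - free * unitPrice;
    }
}
```

Because every decorator forwards Items from the wrapped checkout, the '3 for 2' rule can see the item list without the list being exposed anywhere else, which is the property I'm after.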
I've read through more design patterns but can't seem to find anything that applies. I saw the following article:
https://levelup.gitconnected.com/rules-design-pattern-in-c-6c62f0e20ee0
This is basically what I want to do, but the implementation feels clumsy to me.
public bool IsValid(FileInfo fileInfo)
{
    var rules = new List<IFileValidationRule> { new FileExtensionRule(new string[] { "txt", "html" }) };
    if (AdminConfig.CheckFileSize)
    {
        rules.Add(new FileSizeRule("txt", 5 * 1024 * 1024));
        rules.Add(new FileSizeRule("html", 10 * 1024 * 1024));
    }
    if (User.Status != UserStatus.Premium)
    {
        rules.Add(new MaxFileLengthRule(50));
    }
    bool isValid = rules.All(rule => rule.IsValid(fileInfo));
    return isValid;
}
Specifically, this part seems to violate a few key principles: Open-Closed, Dependency Inversion, etc.
The other big problem I can't wrap my head around is as below:
Imagine that, for the above example, a new rule needs to be added that reads the file data and checks whether there are any bad characters in there; it doesn't matter what.
Implementing this is easy, you inject the file or the file data, whatever into the 'FileValidator' class, you then instantiate your rule and pass the file data into it. You then run the rule and return the success, great! But is this ok?
Reading this says no: http://wiki.c2.com/?TellDontAsk
"Tell, don't ask" - It is okay to use accessors to get the state of an object, as long as you don't use the result to make decisions outside the object.
That would be exactly what the code is doing! I guess the alternative to this is to update the 'FileData' object to essentially take a list of bad characters, check the file data, return false to the rule, which then fails the whole process, but this would start to throw a bunch more rules out the door. You're now breaking the Open-Closed principle, Single Responsibility principle, and it feels like you're building a rod for your own back, adding these custom methods for singular rules, bloating your object. (The link does discuss how you can pass a function into the method, which is pretty nice, but still not perfect, at the end of the day, aren't you just indirectly handing control to the caller?)
The above alone wouldn't be enough to stop me, but I'm struggling to justify making a private set of items public so one rule out of the bunch can make use of that data.
I'm in OOP recursion, tumbling towards a stack overflow. Can anyone pull me out and help me consolidate my thoughts? None of the design patterns seem to work but I'm sure this is a basic problem solved many times in the past. What am I missing?
I would like to do a very simple test for the Constructor of my class,
[Test]
public void InitLensShadingPluginTest()
{
_lensShadingStory.WithScenario("Init Lens Shading plug-in")
.Given(InitLensShadingPlugin)
.When(Nothing)
.Then(PluginIsCreated)
.Execute();
}
The initialization can go in Given or When. I think it should be in When(), but it doesn't really matter.
private void InitLensShadingPlugin()
{
_plugin = new LSCPlugin(_imagesDatabaseProvider, n_iExternalToolImageViewerControl);
}
Since the constructor is what is being tested, I have nothing to do inside the When() statement,
and in Then() I assert that the plugin was created.
private void PluginIsCreated()
{
Assert.NotNull(_plugin);
}
My question is about StoryQ: since I do not want to do anything inside When(),
I tried to use When(()=>{}); however, this is not supported by StoryQ.
This means I need to implement something like
private void Nothing()
{
}
and call When(Nothing).
Is there a better practice?
It's strange that StoryQ doesn't support missing steps; your scenario is actually pretty typical of other examples I've used of starting applications, games etc. up:
Given the chess program is running
Then the pieces should be in the starting positions
for instance. So your desire to use a condition followed by an outcome is perfectly valid.
Looking at StoryQ's API, it doesn't look as if it supports these empty steps. You could always make your own method and call both the Given and When steps inside it, returning the operation from the When:
.GivenIStartedWith(InitLensShadingPlugin)
.Then(PluginIsCreated)
If that seems too clunky, I'd do as you suggested and move the Given to a When, initializing the Given with an empty method with a more meaningful name instead:
Given(NothingIsInitializedYet)
.When(InitLensShadingPlugin)
.Then(PluginIsCreated)
Either of these will solve your problem.
However, if all you're testing is a class, rather than an entire application, using StoryQ is probably overkill. The natural-language BDD frameworks like StoryQ, Cucumber, JBehave etc. are intended to help business and development teams collaborate in their exploration of requirements. They incur significant setup and maintenance overhead, so if the audience of your class-level scenarios / examples is technical, there may be an easier way.
For class-level examples of behaviour I would just go with a plain unit testing tool like NUnit or MSpec. I like using NUnit and putting my "Given / When / Then" in comments:
// Given I initialized the lens shading plugin on startup
_plugin = new LSCPlugin(_imagesDatabaseProvider, n_iExternalToolImageViewerControl);
// Then the plugin should have been created
Assert.NotNull(_plugin);
Steps at a class level aren't reused in the same way they are in full-system scenarios, because classes have much smaller, more encapsulated responsibilities; and developers benefit from reading the code rather than having it hidden away in the step definitions.
Your Given/When/Then comments here might still echo scenarios at a higher level, if the class is directly driving the functionality that the user sees.
Normally for full-system scenarios we would derive the steps from conversations with the "3 amigos":
a business representative (PO, SME, someone who has a problem to be solved)
a tester (who spots scenarios we might otherwise miss)
the dev (who's going to solve the problem).
There might be a pair of devs. UI designers can get involved if they want to. Matt Wynne says it's "3 amigos, where 3 is any number between 3 and 7". The best time to have the conversations is right before the devs pick up the work to begin coding it.
However, if you're working on your own, whether it's a toy or a real application, you might benefit just from having imaginary conversations. I use a pixie called Thistle for mine.
I am self-training on the TPL-Dataflow, and I have read that using immutable objects for messages is the way to go.
To comply with this, I have designed specific classes for every block inputs and outputs.
Unfortunately, when I link my blocks to each other, the blocks' input and output types are so different that it leads to a proliferation of TransformBlocks:
var proc1 = new TransformBlock<proc1In,proc1Out>(...
var convertOut1ToIn2 = new TransformBlock<proc1Out,proc2In>(p1 => new proc2In { ...
var proc2 = new TransformBlock<proc2In,proc2Out>(...
proc1.LinkTo(convertOut1ToIn2);
convertOut1ToIn2.LinkTo(proc2);
Using Batch and Join blocks later to merge results together leaves me struggling with very messy code.
Every sample I have read on the internet uses simple types such as int or string. I have not found anything that deals with slightly more complex types.
I feel the urge to use a single big object and pass its reference through all the blocks. Before making this mistake, I would like to know if there is a better way to do it.
After some time musing with TPL-Dataflow, it turns out that:
Envisioning Dataflow as a conveyor belt carrying manufactured items towards different work stations where items are enriched and built is completely wrong: doing it this way leads to excruciatingly hard concurrency issues. Dataflow is a messaging system.
Instead, I find it better to picture it as a mesh of people who deal with external facilities to make things (IO, database persistence, calculation engines...).
The problem of message types I dealt with is easily circumvented using Tuples. In general I dislike the ugliness of Tuples, but in this very situation I feel they really fit.
My problem is multiple picture analysis. Instead of having blocks pass a "WorkItem" object to each other and mess with it, I use a separate "WorkItemSupplier" class. This class uses a ConcurrentDictionary of WorkItems and exposes methods to deal with work items.
This way, my blocks in Dataflow only pass the ID of a work item to each other, so they can use the WorkItemSupplier as an external facility to store, retrieve, or change the state of any work item.
This way, the code runs much more smoothly, is well separated, and is easier to read.
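A minimal sketch of that arrangement could look like the following. The member names (Store, Retrieve, ChangeState, WithState) and the string-valued State are assumptions made for illustration, and the Stages.Analyze function only stands in for the body of a TransformBlock<int, int> so that the example does not need the Dataflow package.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Immutable snapshot of a work item; a state change produces a new instance.
public sealed class WorkItem
{
    public WorkItem(int id, string state) { Id = id; State = state; }
    public int Id { get; }
    public string State { get; }
    public WorkItem WithState(string state) => new WorkItem(Id, state);
}

public sealed class WorkItemSupplier
{
    private readonly ConcurrentDictionary<int, WorkItem> _items =
        new ConcurrentDictionary<int, WorkItem>();

    public void Store(WorkItem item) => _items[item.Id] = item;

    public WorkItem Retrieve(int id) => _items[id];

    // Replaces the stored snapshot with an updated one. The update delegate may be
    // retried under contention, so it must stay free of side effects.
    public WorkItem ChangeState(int id, string newState) =>
        _items.AddOrUpdate(
            id,
            key => throw new KeyNotFoundException($"No work item {key}"),
            (key, current) => current.WithState(newState));
}

public static class Stages
{
    // Receives an ID, asks the supplier to record the work, and forwards the ID.
    public static int Analyze(WorkItemSupplier supplier, int id)
    {
        supplier.ChangeState(id, "analyzed");
        return id;
    }
}
```

The messages flowing between blocks are then plain ints, which are trivially immutable, and all shared state lives behind the supplier.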
Not being an experienced programmer, I was wondering if you could help me to find the most efficient way to refactor a part of source code.
Indeed, I have taken over a project where in one class I have one (public) static method which is around 3000 lines long.
I would like to refactor it especially with regards to the fact that I will incorporate some multithreading in it.
Basically the code is as follows:
> - Initialisation of the different kinds of parameters needed by the method
> - Monte-Carlo routine with random number generation and business logic
> - Output of results.
In my opinion the best way is to make the method non-static, build a "plain" class with a constructor, and divide the Monte-Carlo routine into smaller functions.
However, I will have around 50 class members, which does not seem appropriate.
Still, that is the only "not too disgusting" implementation that I came up with.
What would be your advice?
Many Thanks,
Your idea of creating a new class from the function is probably the best one.
Use the extract method feature to break the function down into the 3 parts you just described. Then take each part and break it down even more by finding the logically independent parts of the code. But you can do more: define a Monte Carlo class that holds the independent Monte Carlo logic.
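As an illustration of that split, here is a small sketch under heavy assumptions: all type and member names are hypothetical, the "business logic" is reduced to drawing one uniform random number per path, and the roughly 50 inputs are represented by a single parameters object with two properties.

```csharp
using System;
using System.Globalization;

// Part 1: inputs grouped in one object instead of many loose class members.
public sealed class SimulationParameters
{
    public int Iterations { get; set; } = 10000;
    public int Seed { get; set; } = 42;
}

public sealed class SimulationResult
{
    public SimulationResult(double mean) { Mean = mean; }
    public double Mean { get; }
}

// Part 2: the Monte Carlo routine, isolated from setup and output.
public sealed class MonteCarloSimulation
{
    private readonly SimulationParameters _parameters;

    public MonteCarloSimulation(SimulationParameters parameters)
    {
        _parameters = parameters;
    }

    public SimulationResult Run()
    {
        var random = new Random(_parameters.Seed);
        double sum = 0;
        for (int i = 0; i < _parameters.Iterations; i++)
        {
            sum += SimulateOnePath(random);
        }
        return new SimulationResult(sum / _parameters.Iterations);
    }

    // Placeholder for the business logic of a single path.
    private static double SimulateOnePath(Random random) => random.NextDouble();
}

// Part 3: output of results, kept separate from the calculation.
public static class ResultWriter
{
    public static string Format(SimulationResult result) =>
        "Mean: " + result.Mean.ToString("F4", CultureInfo.InvariantCulture);
}
```

Grouping related inputs into one or more parameter objects is one way to avoid the 50 class members. With multithreading in mind, note that a System.Random instance should not be shared between threads, so each worker would probably need its own instance.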
I'm writing the simple card game "War" for homework and now that the game works, I'm trying to make it more modular and organized. Below is a section of Main() containing the bulk of the program. I should mention, the course is being taught in C#, but it is not a C# course. Rather, we're learning basic logic and OOP concepts so I may not be taking advantage of some C# features.
bool sameCard = true;
while (sameCard)
{
    sameCard = false;
    card1.setVal(random.Next(1,14)); // set card value
    val1 = determineFace(card1.getVal()); // assign 'face' cards accordingly
    suit = suitArr[random.Next(0,4)]; // choose suit string from array
    card1.setSuit(suit); // set card suit
    card2.setVal(random.Next(1,14)); // rinse, repeat for card2...
    val2 = determineFace(card2.getVal());
    suit = suitArr[random.Next(0,4)];
    card2.setSuit(suit);
    // check if same card is drawn twice:
    catchDuplicate(ref card1, ref card2, ref sameCard);
}
Console.WriteLine ("Player: {0} of {1}", val1, card1.getSuit());
Console.WriteLine ("Computer: {0} of {1}", val2, card2.getSuit());
// compare card values, display winner:
determineWinner(card1, card2);
So here are my questions:
Can I use loops in Main() and still consider it modular?
Is the card-drawing process written well/contained properly?
Is it considered bad practice to print messages in a method (i.e.: determineWinner())?
I've only been programming for two semesters and I'd like to form good habits at this stage. Any input/advice would be much appreciated.
Edit:
catchDuplicate() is now a boolean method and the call looks like this:
sameCard = catchDuplicate(card1, card2);
thanks to #Douglas.
Can I use loops in Main() and still consider it modular?
Yes, you can. However, more often than not, Main in OOP-programs contains only a handful of method-calls that initiate the core functionality, which is then stored in other classes.
Is the card-drawing process written well/contained properly?
Partially. If I understand your code correctly (you only show Main), you undertake some actions that, when done in the wrong order or with the wrong values, may not end up well. Think of it this way: if you sell your class library (not the whole product, but only your classes), what would be the clearest way to use your library for an uninitiated user?
For example, consider a class Deck that contains a deck of cards. On creation it creates all cards and shuffles them. Give it a method Shuffle to shuffle the deck when the user of your class needs to, and add methods like DrawCard for dealing cards.
Further: you have methods that are not contained within a class of their own yet have functionality that would be better off in a class. For example, determineFace is better suited to be a method on class Card (assuming card2 is of type Card).
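A rough sketch of such a Deck and Card pair is shown below. The suit names and the mapping of 1, 11, 12 and 13 to face names are assumptions, since the original determineFace and suitArr are not shown in the question.

```csharp
using System;
using System.Collections.Generic;

public sealed class Card
{
    public Card(int value, string suit) { Value = value; Suit = suit; }
    public int Value { get; }     // 1 to 13, matching random.Next(1,14)
    public string Suit { get; }

    // Equivalent of determineFace, now living on the card itself.
    public string Face
    {
        get
        {
            switch (Value)
            {
                case 1: return "Ace";
                case 11: return "Jack";
                case 12: return "Queen";
                case 13: return "King";
                default: return Value.ToString();
            }
        }
    }
}

public sealed class Deck
{
    private static readonly string[] Suits = { "Hearts", "Diamonds", "Clubs", "Spades" };
    private readonly List<Card> _cards = new List<Card>();
    private readonly Random _random;

    // Creates all 52 cards and shuffles them.
    public Deck(Random random)
    {
        _random = random;
        foreach (string suit in Suits)
        {
            for (int value = 1; value <= 13; value++)
            {
                _cards.Add(new Card(value, suit));
            }
        }
        Shuffle();
    }

    public int Count => _cards.Count;

    // Fisher-Yates shuffle of the cards still in the deck.
    public void Shuffle()
    {
        for (int i = _cards.Count - 1; i > 0; i--)
        {
            int j = _random.Next(i + 1);
            Card tmp = _cards[i];
            _cards[i] = _cards[j];
            _cards[j] = tmp;
        }
    }

    // Removes and returns the top card, so the same card cannot be drawn twice.
    public Card DrawCard()
    {
        if (_cards.Count == 0) throw new InvalidOperationException("The deck is empty.");
        Card top = _cards[_cards.Count - 1];
        _cards.RemoveAt(_cards.Count - 1);
        return top;
    }
}
```

With a deck like this, the duplicate check and the retry loop in Main are no longer needed, because drawing removes the card from the deck.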
Is it considered bad practice to print messages in a method (i.e.: determineWinner())?
Yes and no. If you only want messages to be visible during testing, use Debug.WriteLine. In a production build, these will be no-ops. However, when you write messages in a production version, make sure that this is clear from the name of the method, e.g. WriteWinnerToConsole or something.
It's more common to not do this because: in what format would you print the information? What text should come with it? How do you handle localization? However, when you write a program, obviously it must contain methods that write stuff to the screen (or form, or web page). These are usually contained in specific classes for that purpose. Here, that could be the class CardGameX for instance.
General thoughts
Think about the principle "one method/function should have one task and one task only, and it should not have side effects" (for example, if a method calculates a square and also prints it, the printing is the side effect).
The principle for classes is, very high-level: a class contains methods that logically belong together and operate on the same set of properties/fields. An example of the opposite: Shuffle should not be a method in class Card. However, it would belong logically in the class Deck.
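Applied to determineWinner, that principle could be sketched as follows. RoundResult, WarRules and ConsoleView are hypothetical names, and the rule "higher value wins" is assumed from the description of the game.

```csharp
using System;

public enum RoundResult { PlayerWins, ComputerWins, Tie }

public static class WarRules
{
    // One task only: decide the outcome. No console access, no side effects.
    public static RoundResult DetermineWinner(int playerValue, int computerValue)
    {
        if (playerValue > computerValue) return RoundResult.PlayerWins;
        if (playerValue < computerValue) return RoundResult.ComputerWins;
        return RoundResult.Tie;
    }
}

public static class ConsoleView
{
    // Turns a result into text; kept separate so the wording can change freely.
    public static string Describe(RoundResult result)
    {
        switch (result)
        {
            case RoundResult.PlayerWins: return "Player wins!";
            case RoundResult.ComputerWins: return "Computer wins!";
            default: return "It's a tie!";
        }
    }

    // The name makes the side effect explicit.
    public static void WriteWinnerToConsole(RoundResult result)
    {
        Console.WriteLine(Describe(result));
    }
}
```

Main would then call ConsoleView.WriteWinnerToConsole(WarRules.DetermineWinner(card1.getVal(), card2.getVal())), and the rule itself can be tested without capturing console output.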
If the main goal of your homework is to create a modular application, you must encapsulate all logic in specialized classes.
Each class must do only one job.
Functions that work with a card must be in a Card class.
Functions that draw cards should be in another class.
I think it is the goal of your homework, good luck!
Take all advice on "best practices" with a grain of salt. Always think for yourself.
That said:
Can I use loops in Main() and still consider it modular?
The two concepts are independent. If your Main() only does high-level logic (i.e. calls other methods) then it does not matter if it does so in a loop, after all the algorithm requires a loop. (you wouldn't add a loop unnecessarily, no?)
As a rule of thumb, if possible/practical, make your program self-documenting. Make it "readable" so, if a new person (or even you, a few months from now) looks at it they can understand it at any level.
Is the card-drawing process written well/contained properly?
No. First of all, a card should never be selected twice. For a more "modular" approach I would have something like this:
while ( Deck.NumCards >= 2 )
{
    Card card1 = Deck.GetACard();
    Card card2 = Deck.GetACard();
    PrintSomeStuffAboutACard( GetWinner( card1, card2 ) );
}
Is it considered bad practice to print messages in a method (ie: determineWinner())?
Is the purpose of determineWinner to print a message? If the answer is "No" then it is not a matter of "bad practice"; your function is plain wrong.
That said, there is such a thing as a "debug" build and a "release" build. To aid you in debugging the application and figuring out what works and what doesn't it is a good idea to add logging messages.
Make sure they are relevant and that they are not executed in the "release" build.
Q: Can I use loops in Main() and still consider it modular?
A: Yes, you can use loops, that doesn't really have an impact on modularity.
Q: Is the card-drawing process written well/contained properly?
A: If you want to be more modular, turn DrawCard into a function/method. Maybe just write DrawCards instead of DrawCard, but then there's an optimization-versus-modularity question there.
Q: Is it considered bad practice to print messages in a method (ie: determineWinner())?
A: I wouldn't say printing messages in a method is bad practice, it just depends on context. Ideally, the game itself doesn't handle anything but game logic. The program can have some kind of game object and it can read state from the game object. This way, you could technically change the game from being text-based to being graphical. I mean, that's ideal for modularity, but it may not be practical given a deadline. You always have to decide when you have to sacrifice a best practice because there isn't enough time. Sadly, this is all too often a common occurrence.
Separate game logic from the presentation of it. With a simple game like this, it's an unnecessary dependency.