C# refactoring considerations - c#

I have the following question.
I have read that, when refactoring, it is good to create methods that have one very specific responsibility, so where possible it is a good idea to split a complex method into several smaller methods.
But imagine that I have this case:
I have to create a list of objects, and inside each of these objects I have to create another object. Something like this:
public void myComplexMethod(List<MyTypeA> paramObjectsA)
{
foreach(MyTypeA iteratorA in paramObjectsA)
{
//Create myObjectB of type B
//Create myObjectC of type C
myObjectB.MyPropertyTypeC = myObjectC;
}
}
I can split this method into two methods:
public void myMethodCreateB(List<MyTypeA> paramObjectsA)
{
foreach(MyTypeA iteratorA in paramObjectsA)
{
//Create myObjectB of type B
}
}
public void myMethodCreateC(List<MyTypeB> paramObjectsB)
{
foreach(MyTypeB iteratorB in paramObjectsB)
{
//Create myObjectC of type C
iteratorB.PropertyC = myObjectC;
}
}
In the second option, where I use two methods instead of one, the unit tests are less complex, but the problem is that I use two foreach loops, so it is less efficient than using only one loop as in the first option.
So what is the best practice, at least in general: to use one slightly more complex method in order to be more efficient, or to use more methods?
Thanks so much.

I generally put readability at a higher priority than performance, until proven otherwise. I'm generalizing now a bit but in my experience, when people focus on performance too much at the code level, the result is less maintainable code, it distracts them from creating functionally correct code, it takes longer (=more money), and possibly results in even less performant code.
So don't worry about it and use the more readable approach. If your app is really too slow in the end, run it through a profiler and pinpoint (and prove) the one or two places where it requires optimization. I can guarantee you it won't be this code.
Making the correct choices at the architectural level early on is much more critical because you won't be able to easily make changes at that level once your app is built.

Usually I would keep using one loop in this case.
It seems you are just creating and decorating the objects of MyTypeB.
I would prefer to create a factory method in class MyTypeB:
static MyTypeB Create(MyTypeA a) { // if the creation of MyTypeB depends on A
//Create myObjectB of type B
//Create myObjectC of type C
myObjectB.MyPropertyTypeC = myObjectC;
return myObjectB;
}
then your complex method will become:
public void myComplexMethod(List<MyTypeA> paramObjectsA)
{
foreach(MyTypeA iteratorA in paramObjectsA)
{
MyTypeB myObjectB = MyTypeB.Create(iteratorA);
}
}
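A minimal, runnable sketch of the factory-method idea above. The members Name and Source, the Builder class and the CreateAll method are assumptions made up for illustration (the question does not show the real types), and the method returns the created objects so they can be checked in a test:

```csharp
using System.Collections.Generic;

public class MyTypeA
{
    public string Name { get; set; }      // hypothetical member
}

public class MyTypeC
{
    public string Source { get; set; }    // hypothetical member
}

public class MyTypeB
{
    public MyTypeC PropertyC { get; set; }

    // Factory method: builds one B, together with its C, from one A.
    public static MyTypeB Create(MyTypeA a)
    {
        var myObjectC = new MyTypeC { Source = a.Name };
        return new MyTypeB { PropertyC = myObjectC };
    }
}

public static class Builder
{
    // Still a single loop; the per-item work lives in MyTypeB.Create.
    public static List<MyTypeB> CreateAll(List<MyTypeA> paramObjectsA)
    {
        var result = new List<MyTypeB>();
        foreach (MyTypeA iteratorA in paramObjectsA)
        {
            result.Add(MyTypeB.Create(iteratorA));
        }
        return result;
    }
}
```

With this shape, MyTypeB.Create can be unit tested with a single MyTypeA, and the loop only needs a test that every input produces one output.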

Related

Passing constructor delegate or object for unmanaged resources

In my (simplified) problem I have a method "Reading" that can use many different implementations of some IDisposableThing. I am passing delegates to the constructor right now so I can use the using statement.
Is this approach of passing a delegate of the constructor of my object appropriate?
My problem is that things like List<Func<IDisposable>> etc start looking a bit scary (because delegates look like crap in c#) and passing in an object seems more usual and a clearer statement of intent.
Is there a better/different way of managing this situation without delegates?
public void Main()
{
Reading(() => new DisposableThingImplementation());
Reading(() => new AnotherDisposableThingImplementation());
}
public void Reading(Func<IDisposableThing> constructor)
{
using (IDisposableThing streamReader = constructor())
{
//do things
}
}
As I said in the comment, it's difficult to say what's best for your situation, so instead I'll just list your options so you can make an informed decision:
Continue doing what you're doing
Having to pass around objects with an unpleasantly complicated-looking type is maybe not ideal visually, but in your situation it may well be perfectly appropriate.
Use a custom delegate type
You can define a delegate like:
public delegate IDisposableThing DisposableThingConstructor();
Then anywhere you would write Func<IDisposableThing>, you can just write DisposableThingConstructor instead. For a commonly used delegate type, this may improve code readability, though this too is a matter of taste.
Move the using statements out of Reading
This really depends on whether it's sensible for the lifecycle management of these objects to be a responsibility of the Reading method or not. Given what we have of your code at the moment, we can't really judge this for you. An implementation with the lifecycle management moved out would look like:
public void Main()
{
using(var disposableThing = new DisposableThingImplementation())
Reading(disposableThing);
}
public void Reading(IDisposableThing disposableThing)
{
//do things
}
Use a factory pattern
In this option, you create a class which returns new IDisposableThing implementations. Lots of information can be found on the factory pattern which you may well already know, so I won't repeat it all here. This option may well be overkill for your purposes here, adding a lot of pointless complexity, but depending on how those DisposableThings are constructed, it may have additional benefits which make it worthwhile.
Use a generic argument
This option will only work if all of your IDisposableThing implementations have a parameterless constructor. I'm guessing that's not the case, but in case it is, it's a relatively straightforward approach:
public void Reading<T>() where T : IDisposableThing, new()
{
using(var disposableThing = new T())
{
//do things
}
}
Use an Inversion of Control container
This is another option which would certainly be overkill if used for this purpose alone. I include it mostly for completeness. Inversion of control containers like Ninject will give you easy ways to manage the lifecycles of objects passed into others.
I very much doubt this would be an appropriate solution in your case, especially since the disposable objects are not being used in another class's constructor. If you later run into a situation where you're trying to manage object lifecycle in a larger, complex object graph, this option might be worth revisiting.
Construct the objects outside of the using statement
This is specifically described as "not a best practice" in the MSDN documentation, but it is an option. You can do:
public void Main()
{
Reading(new DisposableThingImplementation());
}
public void Reading(IDisposableThing disposableThing)
{
using (disposableThing)
{
//do things
}
}
At the end of the using statement, the Dispose method will be called, but the object will not be garbage collected because it is still in scope. Trying to use the object after that would be very likely to cause problems because it is no longer fully initialized: its resources have already been released. So again, while this is an option, it's unlikely to be a good one.
Is this approach of passing a delegate of the constructor of my object appropriate? My problem is that things like List<Func<IDisposable>> etc start looking a bit scary (because delegates look like crap in c#) and passing in an object seems more usual and a clearer statement of intent.
Yes, it's fine. However I understand your concern about passing a list of those things... Perhaps creating a custom delegate with the same signature as Func<IDisposable> and a more explicit name (e.g. SomethingFactory) would be clearer.
Is there a better/different way of managing this situation without delegates?
You could pass a factory or a list of factories to the method. I don't think it's really "better", though; it's mostly the same, since your factory would typically be represented as an interface with a single method, which is essentially the same as a delegate.
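To illustrate why a single-method factory interface and a delegate end up nearly identical, here is a rough sketch. The Read member, the IDisposableThingFactory interface, the DisposableThingFactory class and the ReadingWithFactory method are hypothetical names introduced only for this example, and Reading returns a string here purely so the result can be inspected:

```csharp
using System;

public interface IDisposableThing : IDisposable
{
    string Read();                        // hypothetical member
}

public class DisposableThingImplementation : IDisposableThing
{
    public string Read() { return "data"; }
    public void Dispose() { /* release resources here */ }
}

// Hypothetical factory interface: one method, just like the delegate.
public interface IDisposableThingFactory
{
    IDisposableThing Create();
}

public class DisposableThingFactory : IDisposableThingFactory
{
    public IDisposableThing Create() { return new DisposableThingImplementation(); }
}

public static class Reader
{
    // Delegate version, as in the question.
    public static string Reading(Func<IDisposableThing> constructor)
    {
        using (IDisposableThing thing = constructor())
        {
            return thing.Read();
        }
    }

    // Factory version: the factory's only method converts straight to the delegate.
    public static string ReadingWithFactory(IDisposableThingFactory factory)
    {
        return Reading(factory.Create);
    }
}
```

Both versions keep the using statement inside Reading, so lifecycle management stays in one place; the choice is mostly about which call site reads better to you.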

Method in method in method c#

Is it bad to have a lot of methods referring to each other, like the following example?
public void MainMethod()
{
GetProducts();
}
public void GetProducts()
{
var _products = new Products();
var productlist = _products.GetProductList;
GetExcelFile(productlist);
}
public void GetExcelFile(List<product> productlist)
{
var _getExcelFile = new GetExcelFile();
var excelfile = _getExcelFile.GetExcelFileFromProductList(productlist);
//create excel file and so on...
}
So I am creating a new method for every little action. It would be just as easy to call GetProducts from the MainMethod and do all actions in that method, but I think that isn't the way to create re-usable code.
The following tells me that there should be no more than 7 statements in a method:
Only 7 statements in a method
So the advantages of using methods with a minimal amount of code:
Code is re-usable
Every task can get its own method
The disadvantages of using methods with a minimal amount of code:
It's like spaghetti code
You get methods that refer to methods that refer to other methods, and so on
My question:
Should my methods be bigger, or should I keep creating small methods, that do little and refer to a lot of other methods?
The guideline is right. Methods should be small, and you are doing the right thing by giving each operation not only its own method but also a well-defined name. If those methods have a clear name, one responsibility and a clear intention (and don't forget to separate commands from queries), your code will not be spaghetti. On top of that, try to order methods like a news article: the most important methods at the top of the file, the methods with the most detail at the bottom. This way anyone else can start reading at the top and stop reading when they're bored (or have enough information).
I advise you to get a copy of Robert Martin's Clean Code. There's no one in the industry who describes this more clearly than he does.
The guideline is generally a good one, not so much for reuse, but for readability of your code.
Your example, though, is not a good one. What you're doing is basically creating a long list of methods where each one stops when you feel it's too long and calls another one to perform the rest of the operations.
I would follow more this kind of approach, where reading the main method tells you the "story" that your code needs to tell by steps and the detail of each step is in smaller methods.
public void MainMethod()
{
var productlist=GetProducts();
string excelfile=GetExcelFile(productlist);
// do something in the excel file
}
public List<product> GetProducts()
{
var _products = new Products();
return _products.GetProductList;
}
public string GetExcelFile(List<product> productlist)
{
var _getExcelFile = new GetExcelFile();
var excelfile = _getExcelFile.GetExcelFileFromProductList(productlist);
return excelfile;
}
I don't really agree with '7 statements in a method'. A method can have dozens of statements. As long as the method performs a single function and the specific logic will only be used in one place, I don't really see the point of cutting it into parts just because some guideline says so; it should be separated based on what makes sense.
Re-use of code is good if it makes sense, but I don't think you should make everything everywhere re-usable when you have no plans to actually re-use it in the near future. It increases the development time needlessly, it often makes the software more complex and harder to interpret than it needs to be, and (at least in my company) the majority of the code never gets re-used, and even when it is, it still needs modifications in new products. Only the most generic parts of our codebase actually get used in several applications without modifications for each product.
And I think that's fine.
This is mostly an opinion-based question, but I'll tell you one thing:
if you're not going to use a method from more than one place, it might be better not to create a method for it.
You can use regions for clarity, and you might not want a method that is larger than a full page, but not every 2-3 statements should get their own method.

Enhancing testability by decomposing batch tasks

I can't seem to find much information on this so I thought I'd bring it up here. One of the issues I often find myself running into is unit testing the creation of a single object while processing a list. For example, I'd have a method signature such as IEnumerable<Output> Process(IEnumerable<Input> inputs). When unit testing a single input I would create a list of one input and simply call First() on the results and ensure it is what I expect it to be. This would lead to something such as:
public class BatchCreator
{
public IEnumerable<Output> Create(IEnumerable<Input> inputs)
{
foreach (var input in inputs)
{
Console.WriteLine("Creating Output...");
yield return new Output();
}
}
}
My current thinking is that maybe one class should be responsible for the objects creation while another class be responsible for orchestrating my list of inputs. See example below.
public interface ICreator<in TInput, out TReturn>
{
TReturn Create(TInput input);
}
public class SingleCreator : ICreator<Input, Output>
{
public Output Create(Input input)
{
Console.WriteLine("Creating Output...");
return new Output();
}
}
public class CompositeCreator : ICreator<IEnumerable<Input>, IEnumerable<Output>>
{
private readonly ICreator<Input, Output> _singleCreator;
public CompositeCreator(ICreator<Input, Output> singleCreator)
{
_singleCreator = singleCreator;
}
public IEnumerable<Output> Create(IEnumerable<Input> inputs)
{
return inputs.Select(input => _singleCreator.Create(input));
}
}
With what's been posted above, I can easily test that I'm able to create one single instance of Output given an Input. Note that I do not need to call SingleCreator anywhere else in the code base other than from CompositeCreator. Creating ICreator would also give me the benefit of reusing it whenever I need to do similar tasks, which currently happens in 2-3 other places in my project.
Anyone have any experience with this that could shed some light? Am I simply overthinking this? Suggestions are greatly appreciated.
Generally speaking, there's nothing inherently wrong with your reasoning. More or less that's how the issue can be solved.
However, your CompositeCreator isn't actually composite, since it uses precisely one "creation method".
It's difficult to say anything more, because we don't know your project internals, but if it integrates well into your use cases, then it's fine. What I'd try is stay with ICreator<Tin, Tout> only and make an extension method IEnumerable<Tout> CreateMany(this IEnumerable<Tin> c) to deal with collections. You can test both easily, independently (fake ICreator and check whether collection of inputs is processed). This way you get rid of ICreator<IEnumerable, ...>, which is usually good, because operating on collection as a whole and operating on individual items often don't go well together.
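A rough sketch of that extension-method idea. The signature in the answer does not say where the creator comes from, so this version assumes the creator is the receiver and the inputs are passed as an argument; FakeCreator is a hypothetical test double, not something from the question:

```csharp
using System.Collections.Generic;
using System.Linq;

public interface ICreator<in TInput, out TReturn>
{
    TReturn Create(TInput input);
}

public static class CreatorExtensions
{
    // Assumed shape: applies a single-item creator to every input, lazily.
    public static IEnumerable<TOut> CreateMany<TIn, TOut>(
        this ICreator<TIn, TOut> creator, IEnumerable<TIn> inputs)
    {
        return inputs.Select(input => creator.Create(input));
    }
}

// Hypothetical fake that records what it was asked to create.
public class FakeCreator : ICreator<int, string>
{
    public List<int> Seen { get; } = new List<int>();

    public string Create(int input)
    {
        Seen.Add(input);
        return "out" + input;
    }
}
```

The single-item creator and the collection handling can then be tested separately: the fake shows that every input was passed through, without any real creation logic involved. Because Select is lazy, the fake only records inputs once the result is enumerated.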
I'm not entirely sure why you need the IEnumerable input/output option, the composite creator, unless it is more than just a collection, as that's a problem solved by LINQ, which would look something like:
var singleCreator = new SingleCreator();
var outputs = InputEnumerable.Select(singleCreator.Create);
I think this is subjective, and depends on the complexity of the classes you are passing around - if it's not just an IEnumerable then it's worthwhile having some sort of multiple creator, which may or may not need to be a class.

Should I create a class or use ifs?

I have a situation:
I need to do something with a class.
Which would be more efficient: modifying the method this way with ifs, or creating a method for each action?
public value Value(int command)
{
if (command == 1)
{
return DoSomething1();
}
else if (command == 2) // else if, so command 1 does not also fall into the final else
{
return DoSomething2();
}
else
{
return empty();
}
}
There are going to be 50 or more of these commands.
Which is better in terms of execution performance and size of the executable?
At a high-level, it looks like you're trying to implement some kind of dynamic-dispatch system? Or are you just wanting to perform a specified operation without any polymorphism? It's hard to tell.
Anyway, based on the example you've given, a switch block would likely be the most performant, as the compiler can typically turn a switch over contiguous integer or enum values into a jump table instead of a series of comparisons, so just do this:
enum Command { // observe how I use an enum instead of "magic" integers
DoSomethingX = 1,
DoSomethingY = 2
}
public Value GetValue(Command command) {
switch(command) {
case Command.DoSomethingX: return DoSomethingX();
case Command.DoSomethingY: return DoSomethingY();
default: return GetEmpty();
}
}
I also note that the switch block gives you more compact code.
This isn't a performance problem as much as it is a paradigm problem.
In C# a method should be an encapsulation of a task. What you have here is a metric boatload of tasks, each unrelated. That should not be in a single method. Imagine trying to maintain this method in the future. Imagine trying to debug this, wondering where you are in the method as each bit is called.
Your life will be much easier if you split this out, though the performance will probably make no difference.
Although separate methods will almost certainly be better in terms of performance, it is highly unlikely that you would notice the difference. However, having separate methods should definitely improve readability a lot, which is much more important.

Create new instance or just set internal variables

I've written a helper class that takes a string in the constructor and provides a lot of Get properties to return various aspects of the string. Currently the only way to set the line is through the constructor and once it is set it cannot be changed. Since this class only has one internal variable (the string) I was wondering if I should keep it this way or should I allow the string to be set as well?
Some example code may help explain why I'm asking:
StreamReader stream = new StreamReader("ScannedFile.dat");
ScannerLine line = null;
int responses = 0;
while (!stream.EndOfStream)
{
line = new ScannerLine(stream.ReadLine());
if (line.IsValid && !line.IsKey && line.HasResponses)
responses++;
}
Above is a quick example of counting the number of valid responses in a given scanned file. Would it be more advantageous to code it like this instead?
StreamReader stream = new StreamReader("ScannedFile.dat");
ScannerLine line = new ScannerLine();
int responses = 0;
while (!stream.EndOfStream)
{
line.RawLine = stream.ReadLine();
if (line.IsValid && !line.IsKey && line.HasResponses)
responses++;
}
This code is used in the back end of an ASP.NET web application and needs to be somewhat responsive. I am aware that this may be a case of premature optimization, but I'm coding this for responsiveness on the client side and for maintainability.
Thanks!
EDIT - I decided to include the constructor of the class as well (Yes, this is what it really is.) :
public class ScannerLine
{
private string line;
public ScannerLine(string line)
{
this.line = line;
}
/// <summary>Gets the date the exam was scanned.</summary>
public DateTime ScanDate
{
get
{
DateTime test = DateTime.MinValue;
DateTime.TryParseExact(line.Substring(12, 6).Trim(), "MMddyy", CultureInfo.InvariantCulture, DateTimeStyles.None, out test);
return test;
}
}
/// <summary>Gets a value indicating whether to use raw scoring.</summary>
public bool UseRaw { get { return (line.Substring(112, 1) == "R" ? true : false); } }
/// <summary>Gets the raw points per question.</summary>
public float RawPoints
{
get
{
float test = float.MinValue;
float.TryParse(line.Substring(113, 4).Insert(2, "."), out test);
return test;
}
}
}
EDIT 2 - I included some sample properties of the class to help clarify. As you can see, the class takes a fixed string from a scanner and simply makes it easier to break the line apart into more useful chunks. The file is a line-delimited file from a Scantron machine and the only way to parse it is a bunch of string.Substring calls and conversions.
I would definitely stick with the immutable version if you really need the class at all. Immutability makes it easier to reason about your code - if you store a reference to a ScannerLine, it's useful to know that it's not going to change. The performance difference is almost certain to be insignificant - the IO involved in reading the line is likely to be more significant than creating a new object. If you're really concerned about performance, you should benchmark/profile the code before you make a design decision based on those performance worries.
However, if your state is just a string, are you really providing much benefit over just storing the strings directly and having appropriate methods to analyse them later? Does ScannerLine analyse the string and cache that analysis, or is it really just a bunch of parsing methods?
Your first approach is clearer. Performance-wise you might gain something with the second, but I don't think it is worth it.
I would go with the second option. It's more efficient, and they're both equally easy to understand IMO. Plus, you probably have no way of knowing how many times those statements in the while loop are going to be called. So who knows? It could be a .01% performance gain, or a 50% performance gain (not likely, but maybe)!
Immutable classes have a lot of advantages. It makes sense for a simple value class like this to be immutable. The object creation time for classes is small for modern VMs. The way you have it is just fine.
I'd actually ditch the "instance" nature of the class entirely, and use it as a static class, not an instance as you are right now. Every property is entirely independent from each other, EXCEPT for the string used. If these properties were related to each other, and/or there were other "hidden" variables that were set up every time that the string was assigned (pre-processing the properties for example), then there'd be reasons to do it one way or the other with re-assignment, but from what you're doing there, I'd change it to be 100% static methods of the class.
If you insist on having the class be an instance, then for pure performance reasons I'd allow re-assignment of the string, as then the CLR isn't creating and destroying instances of the same class continually (except for the string itself obviously).
At the end of the day, IMO this is something you can really do any way you want since there are no other class instance variables. There may be style reasons to do one or the other, but it'd be hard to be "wrong" when solving that problem. If there were other variables in the class that were set upon construction, then this'd be a whole different issue, but right now, code for what you see as the most clear.
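A minimal sketch of what the static variant could look like, reusing two of the substring offsets from the question. The class name ScannerLineParser and the method names are assumptions for illustration only:

```csharp
using System;
using System.Globalization;

// Hypothetical static helper: no instance state, the line is passed to each call.
public static class ScannerLineParser
{
    // Returns the scan date, or DateTime.MinValue when the field cannot be parsed.
    public static DateTime GetScanDate(string line)
    {
        DateTime result;
        DateTime.TryParseExact(line.Substring(12, 6).Trim(), "MMddyy",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out result);
        return result;
    }

    // True when the line asks for raw scoring.
    public static bool GetUseRaw(string line)
    {
        return line.Substring(112, 1) == "R";
    }
}
```

Each call re-parses the string, exactly as the properties in the question do, so this trades the per-line object allocation for having to pass the line to every call.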
I'd go with your first option. There's no reason for the class to be mutable in your example. Keep it simple unless you actually have a need to make it mutable. If you're really that concerned with performance, then run some performance analysis tests and see what the differences are.
