Best performance for checking character - c#

Very simple question. Which would test faster? This:
var myString = "goodTimes";
if (myString.StartsWith("g"))
{
// do stuff
}
Or this:
var myString = "goodTimes";
if (myString[0] == 'g')
{
// do stuff
}

They do different things - they behave differently for zero length strings, in particular. Other than that, hypothetically the myString[0] should be marginally faster (it does less), but: whether this actually matters is hugely contextual. In most cases, it won't, and you'll have spent more time asking yourself the question than it will ever save. If you're in a scenario where it matters, you'll also know that you need to benchmark with actual realistic data to have a good answer. And only you can do that, with your own particular data.
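To make the zero-length difference concrete, here is a minimal sketch. The class and method names (`FirstCharCheck`, `StartsWithG`, `IndexerIsG`, `SafeIndexerIsG`) are made up for illustration and are not part of the question. It also passes `StringComparison.Ordinal` explicitly, since the single-argument `StartsWith(string)` overload performs a culture-sensitive comparison, which is another way the two checks differ.

```csharp
using System;

public static class FirstCharCheck
{
    // StartsWith copes with an empty string: it simply returns false.
    // Ordinal comparison is requested explicitly; the one-argument
    // string overload would compare using the current culture.
    public static bool StartsWithG(string s) =>
        s.StartsWith("g", StringComparison.Ordinal);

    // The indexer throws IndexOutOfRangeException on an empty string.
    public static bool IndexerIsG(string s) => s[0] == 'g';

    // Guarded indexer: agrees with StartsWithG for any non-null input.
    public static bool SafeIndexerIsG(string s) => s.Length > 0 && s[0] == 'g';

    public static void Main()
    {
        Console.WriteLine(StartsWithG("goodTimes"));   // True
        Console.WriteLine(StartsWithG(""));            // False
        Console.WriteLine(SafeIndexerIsG(""));         // False

        try
        {
            IndexerIsG("");
        }
        catch (IndexOutOfRangeException)
        {
            Console.WriteLine("indexer threw on empty string");
        }
    }
}
```

If the string can be empty, the guarded form is the one that is actually comparable to `StartsWith`; whether it is measurably faster still has to be benchmarked on real data.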

They would both test at almost the exact same time, for all practical purposes.
The second one might be several clock cycles faster, but who cares.
If you were to write the most successful app ever, downloaded by millions of users and run on millions of devices every day, and if your app executed the above code once a second on each installation, the total number of seconds you would save for all your users combined would never exceed the total amount of time we just spent discussing this.

Related

Why is using a pointer for a for loop more performant in this case?

I don't have a background in C/C++ or related lower-level languages and so I've never run into pointers before. I'm a game dev working primarily in C# and I finally decided to move to an unsafe context this morning for some performance-critical sections of code (and please no "don't use unsafe" answers as I've read so many times while doing research, as it's already yielding me around 6 times the performance in certain areas, with no issues so far, plus I love the ability to do stuff like reverse arrays with no allocation). Anyhow, there's a certain situation where I expected no difference, or even a possible decrease in speed, and I'm saving a lot of ticks in reality (I'm talking about double the speed in some instances). This benefit seems to decrease with the number of iterations, which I don't fully understand.
This is the situation:
int x = 0;
for(int i = 0; i < 100; i++)
x++;
Takes, on average about 15 ticks.
EDIT: The following is unsafe code, though I assumed that was a given.
int x = 0, i = 0;
int* i_ptr;
for(i_ptr = &i; *i_ptr < 100; (*i_ptr)++)
x++;
Takes about 7 ticks, on average.
As I mentioned, I don't have a low-level background and I literally just started using pointers this morning, at least directly, so I'm probably missing quite a bit of info. So my first query is- why is the pointer more performant in this case? It isn't an isolated instance, and there are a lot of other variables of course, at that specific point in time in relation to the PC, but I'm getting these results very consistently across a lot of tests.
In my head, the operations are as such:
No pointer:
- Get address of i
- Get value at address
Pointer:
- Get address of i_ptr
- Get address of i from i_ptr
- Get value at address
In my head, there must surely be more overhead, however ridiculously negligible, from using a pointer here. How is it that a pointer is consistently more performant than the direct variable in this case? These are all on the stack as well, of course, so it's not dependent on where they end up being stored, from what I can tell.
As touched on earlier, the caveat is that this bonus decreases with the number of iterations, and pretty fast. I took out the extremes from the following data to account for background interference.
At 1000 iterations, they are both identical at 30 to 34 ticks.
At 10000 iterations, the pointer is slower by about 20 ticks.
Jump up to 10000000 iterations, and the pointer is slower by about 10000 ticks or so.
My assumption is that the decrease comes from the extra step I covered earlier, given that there is an additional lookup, which brings me back to wonder why it's more performant with a pointer than without at low loop counts. At the very least, I'd assume they would be more or less identical (which they are in practice, I suppose, but a difference of 8 ticks from millions of repeated tests is pretty definitive to me) up until the very rough threshold I found somewhere between 100 and 1000 iterations.
Apologies if I'm nitpicking somewhat, or if this is a poor question, but I feel as though it will be beneficial to know exactly what is going on under the hood. And if nothing else, I think it's pretty interesting!
Some users suggested that the test results were most likely due to measurement inaccuracies, and that seems to be the case, at least up to a point. When averaged across ten million continuous tests, the mean of both is typically equal, though in some cases the use of pointers averages out to an extra tick. Interestingly, when testing as a single case, the use of pointers has a consistently lower execution time than without. There are of course a lot of additional variables at play at the specific points in time at which a test is tried, which makes it somewhat of a pointless pursuit to track this down any further. But the result is that I've learned some more about pointers, which was my primary goal, and so I'm pleased with the test.
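Since the differences discussed here are only a handful of ticks, it helps to know how long one tick actually is and to average over many repetitions. The following is a rough sketch, not a rigorous benchmark: `TickResolution`, `NanosecondsPerTick` and `AverageTicks` are names invented for this example, it only covers the plain (non-pointer) loop, and the JIT is free to simplify such a trivial loop, so the numbers are indicative at best.

```csharp
using System;
using System.Diagnostics;

public static class TickResolution
{
    // Written after each timed run so the loop result is observably used.
    private static int _sink;

    // Length of one Stopwatch tick in nanoseconds on the current machine.
    public static double NanosecondsPerTick() =>
        1_000_000_000.0 / Stopwatch.Frequency;

    // Mean number of Stopwatch ticks taken by the counting loop,
    // averaged over the given number of repetitions.
    public static double AverageTicks(int iterations, int repetitions)
    {
        long totalTicks = 0;
        for (int r = 0; r < repetitions; r++)
        {
            int x = 0;
            long start = Stopwatch.GetTimestamp();
            for (int i = 0; i < iterations; i++)
                x++;
            totalTicks += Stopwatch.GetTimestamp() - start;
            _sink = x;
        }
        return (double)totalTicks / repetitions;
    }

    public static void Main()
    {
        Console.WriteLine($"High resolution timer: {Stopwatch.IsHighResolution}");
        Console.WriteLine($"One tick = {NanosecondsPerTick():F1} ns");

        AverageTicks(100, 10_000); // warm-up so JIT compilation is not measured
        Console.WriteLine(
            $"Average for 100 iterations: {AverageTicks(100, 1_000_000):F2} ticks");
    }
}
```

When a single measurement is only a few ticks long, the cost of reading the timer itself is a significant part of the result, which is one reason single-shot numbers and long-run averages can disagree.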

Why does running a method once take almost as long as running it many times in a for loop? - c#

I wrote a method that does some simple operations like +, -, *, /.
I need to run this method 1513 times.
First I ran the method only once, to check that it works correctly and to see how long it takes to finish.
Stopwatch st = new Stopwatch();
st.Start();
DiagramValue dv = new DiagramValue();
double pixel = dv.CalculateYPixel(23.46, diction);
st.Stop();
When the stopwatch stops, it tells me the elapsed time is 0.06s.
Then I ran the same method 1513 times in a for loop, like this:
Stopwatch st = new Stopwatch();
st.Start();
for (int i = 0; i < 1513; i++)
{
DiagramValue dv = new DiagramValue();
double pixel = dv.CalculateYPixel(23.46, diction);
}
st.Stop();
This time the Stopwatch tells me it took around 0.14s, or 0.14s / 1513 = 0.00009s per call.
My question is: why is a single call so slow, when running the method over a thousand times in a for loop takes almost the same total time?
Writing benchmarks is hard.
First, Stopwatch isn't infinitely accurate. When you run the method just once, you're very much limited by the accuracy of the underlying stopwatch. On the other hand, running the method multiple times alleviates this - you can get arbitrary precision by using a big enough loop. Instead of 1 vs 1513, compare e.g. 1500 vs. 3000. You'll get around 100% time increase, as expected.
Second, there's usually some cost with the first call in particular (e.g. JIT compilation) or with the memory pressure at the time of the call. That's why you usually need to do "preheating" - run the method outside of the stopwatch first to isolate these, and measure (multiple invocations) later.
Third, in a garbage collected environment like .NET, the guy who ordered the beer isn't necessarily the guy who pays the bill. Most of the cost of memory allocation in .NET is in the collection, rather than the allocation itself (which is about as cheap as a stack allocation). The collection usually happens outside of the code that caused the allocations in the first place, pointing you in the entirely wrong direction when searching for performance issues. That's why most .NET memory trackers display garbage collection separately - it's important to take account of, but can easily mislead you as to the cause if you're not careful.
There's many more issues, but these should cover your particular scenario well enough.
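The first two points can be sketched roughly as follows. This is only an illustration under stated assumptions: `Work` is a hypothetical stand-in for `CalculateYPixel` (whose body is not shown in the question), and `WarmBenchmark` and `Measure` are names made up for this example.

```csharp
using System;
using System.Diagnostics;

public static class WarmBenchmark
{
    // Accumulates results so the calls are observably used.
    private static double _sink;

    // Hypothetical stand-in for CalculateYPixel: a few arithmetic operations.
    public static double Work(double value) => (value * 3.0 + 1.5) / 2.0 - 0.25;

    // Times the given number of calls to Work with a Stopwatch.
    public static TimeSpan Measure(int iterations)
    {
        double sum = 0;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            sum += Work(i);
        sw.Stop();
        _sink = sum;
        return sw.Elapsed;
    }

    public static void Main()
    {
        // "Preheating": run once outside the measurement so JIT compilation
        // and other first-call costs are not attributed to the timed runs.
        Measure(1500);

        // Compare two loop sizes instead of 1 call vs 1513 calls.
        TimeSpan small = Measure(1500);
        TimeSpan large = Measure(3000);
        Console.WriteLine($"1500 calls: {small.TotalMilliseconds:F4} ms");
        Console.WriteLine($"3000 calls: {large.TotalMilliseconds:F4} ms");
    }
}
```

For a method this cheap, even a few thousand calls finish close to the timer's resolution, so in practice the iteration counts would need to be much larger before the roughly doubled time becomes visible.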
Some possible reasons include:
Timing resolution. You get a more accurate figure when you find the mean over a large number of iterations.
Noise. The percentage of stuff that isn't what you actually want to record, will be different.
Jitting. .NET will create code the first time a method is used. As such, the first time it is run in a program's lifetime it will take longer, by a large factor (try running it once and then measuring the second attempt).
Branch prediction. If you keep doing the same thing with the same data, the CPU's branch predictor is going to get better at predicting which branches are taken.
GC stability. Not likely in this case, but possible. Often at the start of a set of operations that requires particular objects to be created and then released the program ends up having to get more memory from the OS. When it's a bit into that set of operations it's more likely to have reached a steady state where it can just get that memory by cleaning out objects it isn't using any more, which is faster.

C# Linq slower than PHP? Solving riddle #236A

I'm practicing by solving programming olympiad problems on a site.
I have written two solutions:
- C#
http://ideone.com/exF1HJ
- PHP
http://ideone.com/WbaPHY
I was confused when the online judge showed that the PHP version was faster!
Why?
C#: 109 ms 3000 Kb
PHP: 45 ms 0 Kb
How could it be?
Given the programs shown, the execution time of the important bit of the program - finding the unique characters - would definitely not take 109ms. It sounds like whatever "online judgement" is involved is measuring total execution time including process startup, JITting in the case of .NET, etc.
It's a bit like asking which car gets out of a garage faster, and thinking that represents the speed of the car.
Now it's entirely possible that PHP's array_unique function really is very fast, possibly faster than LINQ... but basically you can't get any useful information out of the benchmark results. You should be looking for benchmarks which execute for seconds rather than milliseconds, and which don't include startup/warm-up time, unless that's what you're particularly interested in.
Your C# version creates three arrays that you don't seem to need. You could replace it with:
string input = Console.ReadLine();
int charCount = input.Distinct().Count();
if(charCount % 2 == 0) ...
The following is probably quicker still:
int charCount = new HashSet<char>(input).Count;

Performance and Memory Consumption in C#

I have two questions:
1) I need an expert view on writing code that is sound in terms of performance and memory consumption.
2) In terms of performance and memory consumption, how good or bad is the following piece of code, and why?
Need to increment the counter that could go maximum by 100 and writing code like this:
Some Sample Code is as follows:
for(int i=0;i=100;i++)
{
Some Code
}
for(long i=0;i=1000;i++)
{
Some Code
}
How good is it to use Int16 or another type instead of int or long if the requirement is the same?
Need to increment the counter that could go maximum by 100 and writing code like this:
Options given:
for(int i=0;i=100;i++)
for(long i=0;i=1000;i++)
EDIT: As noted, neither of these would even actually compile, due to the middle expression being an assignment rather than an expression of type bool.
This demonstrates a hugely important point: get your code working before you make it fast. Your two loops don't do the same thing - one has an upper bound of 1000, the other has an upper bound of 100. If you have to choose between "fast" and "correct", you almost always want to pick "correct". (There are exceptions to this, of course - but that's usually in terms of absolute correctness of results across large amounts of data, not code correctness.)
Changing between the variable types here is unlikely to make any measurable difference. That's often the case with micro-optimizations. When it comes to performance, architecture is usually much more important than in-method optimizations - and it's also a lot harder to change later on. In general, you should:
1) Write the cleanest code you can, using types that represent your data most correctly and simply
2) Determine reasonable performance requirements
3) Measure your clean implementation
4) If it doesn't perform well enough, use profiling etc to work out how to improve it
DateTime dtStart = DateTime.Now;
for (int i = 0; i < 10000; i++)
{
    // some code
}
Response.Write((DateTime.Now - dtStart).TotalMilliseconds.ToString());
Do the same for long as well and you will know which one is better... ;)
When you are doing things that require a number representing iterations, or the quantity of something, you should always use int unless you have a good semantic reason to use a different type (i.e. the data can never be negative, or it could be bigger than 2^31). Additionally, this sort of nano-optimization will basically never matter when writing C# code.
That being said, if you are wondering about the differences between things like this (incrementing a 4 byte register versus incrementing 8 bytes), you can always consult Agner Fog's wonderful instruction tables.
On an Amd64 machine, incrementing long takes the same amount of time as incrementing int.**
On a 32 bit x86 machine, incrementing int will take less time.
** The same is true for almost all logic and math operations, as long as the value is not both memory bound and unaligned. In .NET a long will always be aligned, so the two will always be the same.
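If you want to check this on your own machine rather than take it on trust, a rough sketch along the following lines can be used. The names (`CounterWidth`, `SumWithIntCounter`, `SumWithLongCounter`, `TimeMs`) are invented for this example, the loop bodies are deliberately trivial, and the result depends on whether the process runs as 32-bit or 64-bit, so treat the output as a hint rather than a proper benchmark.

```csharp
using System;
using System.Diagnostics;

public static class CounterWidth
{
    // Same loop twice; only the type of the counter differs.
    public static long SumWithIntCounter(int limit)
    {
        long sum = 0;
        for (int i = 0; i <= limit; i++)
            sum += i;
        return sum;
    }

    public static long SumWithLongCounter(long limit)
    {
        long sum = 0;
        for (long i = 0; i <= limit; i++)
            sum += i;
        return sum;
    }

    // Runs the delegate once and returns the elapsed milliseconds.
    private static double TimeMs(Func<long> body)
    {
        var sw = Stopwatch.StartNew();
        long result = body();
        sw.Stop();
        GC.KeepAlive(result); // keep the result observably used
        return sw.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        Console.WriteLine($"sizeof(int) = {sizeof(int)}, sizeof(long) = {sizeof(long)}");
        Console.WriteLine($"64-bit process: {Environment.Is64BitProcess}");

        // Warm-up so JIT compilation is not part of the timed runs.
        SumWithIntCounter(1000);
        SumWithLongCounter(1000);

        const int limit = 100_000_000;
        Console.WriteLine($"int counter:  {TimeMs(() => SumWithIntCounter(limit)):F1} ms");
        Console.WriteLine($"long counter: {TimeMs(() => SumWithLongCounter(limit)):F1} ms");
    }
}
```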

Is Linq Faster, Slower or the same?

Is this:
Box boxToFind = AllBoxes.FirstOrDefault(box => box.BoxNumber == boxToMatchTo.BoxNumber);
Faster or slower than this:
Box boxToFind = null; // initialised so it can be read after the loop
foreach (Box box in AllBoxes)
{
    if (box.BoxNumber == boxToMatchTo.BoxNumber)
    {
        boxToFind = box;
    }
}
Both give me the result I am looking for (boxToFind). This is going to run on a mobile device that I need to be performance conscientious of.
It should be about the same, except that you need to call First (or, to match your code, Last), not Where.
Calling Where will give you a set of matching items (an IEnumerable<Box>); you only want one matching item.
In general, when using LINQ, you need to be aware of deferred execution. In your particular case, it's irrelevant, since you're getting a single item.
The difference is not important unless you've identified this particular loop as a performance bottleneck through profiling.
If profiling does find it to be a problem, then you'll want to look into alternate storage. Store the data in a dictionary which provides faster lookup than looping through an array.
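A minimal sketch of the dictionary idea follows. The `Box` class here is a hypothetical reduction of the asker's type (only `BoxNumber` comes from the question; `Label` and the `BoxLookup` helpers are invented), and it assumes box numbers are unique, because `ToDictionary` throws an `ArgumentException` on duplicate keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal version of the asker's Box type.
public sealed class Box
{
    public int BoxNumber { get; set; }
    public string Label { get; set; }
}

public static class BoxLookup
{
    // Linear scan: stops at the first match, returns null if none is found.
    public static Box FindWithLinq(IEnumerable<Box> boxes, int number) =>
        boxes.FirstOrDefault(b => b.BoxNumber == number);

    // Build the index once, then reuse it for every lookup.
    // Assumes BoxNumber is unique across all boxes.
    public static Dictionary<int, Box> BuildIndex(IEnumerable<Box> boxes) =>
        boxes.ToDictionary(b => b.BoxNumber);

    // Hash lookup: returns null if the number is not in the index.
    public static Box FindWithIndex(Dictionary<int, Box> index, int number) =>
        index.TryGetValue(number, out var box) ? box : null;

    public static void Main()
    {
        var allBoxes = new List<Box>
        {
            new Box { BoxNumber = 1, Label = "first" },
            new Box { BoxNumber = 2, Label = "second" },
            new Box { BoxNumber = 3, Label = "third" },
        };

        var index = BuildIndex(allBoxes);
        Console.WriteLine(FindWithLinq(allBoxes, 2).Label);   // second
        Console.WriteLine(FindWithIndex(index, 2).Label);     // second
        Console.WriteLine(FindWithIndex(index, 42) == null);  // True
    }
}
```

Building the index costs a full pass over the collection, so it only pays off when the same collection is searched many times.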
If micro-optimization is your thing, LINQ performs worse. This is just one article; there are a lot of other posts you can find.
Micro optimization will kill you.
First, finish the whole class, then, if you have performance problems, run a profiler and check for the hotspots of the application.
Make sure you're using the best algorithms you can, then turn to micro optimizations like this.
In case you have already done that:
Slow -> Fast
LINQ < foreach < for < unsafe for (The last option is not recommended).
Abstractions will make your code slower, 95% of the time.
The fastest option is a for loop, but the difference is so small that you can ignore it. It will only matter if you are building a real-time application, but then for those applications maybe C# is not the best choice anyway!
If AllBoxes is an IQueryable, it can be faster than the loop, because the queryable could have an optimized implementation of the Where-operation (for example an indexed access).
LINQ is absolutely 100% slower
Depends on what you are trying to accomplish in your program, but for the most part this is most certainly what I would call LAZY PROGRAMMER CODE...
You are going to essentially "stall-out" if you are performing any complex queries, joins etc... total p.o.s for those types of functions/methods- just don't use it. If you do this the hard/long way you will be much happier in the long run...and performance will be a world apart.
NOTE:
I would definitely not recommend LINQ for any program built for speed/synchronization tasks/computation
(i.e. HFT trading &/or AT trading i-0-i for starters).
TESTED:
It took nearly 10 seconds to complete a join in "LINQ" vs. < 1 millisecond.
LINQ vs Loop – A performance test
LINQ: 00:00:04.1052060, avg. 00:00:00.0041052
Loop: 00:00:00.0790965, avg. 00:00:00.0000790
References:
http://ox.no/posts/linq-vs-loop-a-performance-test
http://www.schnieds.com/2009/03/linq-vs-foreach-vs-for-loop-performance.html
