ToString method and return static string - c#

Class example:
public class SomeType
{
private int type;
// some code...
public override string ToString ()
{
if (type == 1) return "One";
if (type == 2) return "Two";
}
}
Now imagine application calls thousand times ToString() method in a second.
My question is: when I use inline created string in code like something = myClass.ToString() is in every call created a new string or compiler optimize it somehow? (because strings are immutable it could be returned only referense to a static string).
And if not, should I make static private string fields and return them in ToString method for performance reasons?
Ofcourse I will test it using Stopwatch, but I need an expert answer anyway.

You're using string literals - which means you're returning a reference to the same string each time. This is guaranteed by the language specification. From section 2.4.4.5 of the C# 5 specification:
When two or more string literals that are equivalent according to the string equality operator (§7.10.7) appear in the same program, these string literals refer to the same string instance.
So as a simpler example:
string x = "One";
string y = "One";
Console.WriteLine(object.ReferenceEquals(x, y)); // Prints True
In your code, the ToString() method will still be called - but it won't create a new string object each time. You might consider using a switch statement instead of all those if statements, by the way.
Note that even if it did create a new string each time, creating thousands of strings per second won't make a modern CPU break into a sweat. Both the allocator and garbage collector are pretty efficient, and modern computers can do an awful lot of work in a second.

Related

What does _= mean in C#? [duplicate]

While going through new C# 7.0 features, I stuck up with discard feature. It says:
Discards are local variables which you can assign but cannot read
from. i.e. they are “write-only” local variables.
and, then, an example follows:
if (bool.TryParse("TRUE", out bool _))
What is real use case when this will be beneficial? I mean what if I would have defined it in normal way, say:
if (bool.TryParse("TRUE", out bool isOK))
The discards are basically a way to intentionally ignore local variables which are irrelevant for the purposes of the code being produced. It's like when you call a method that returns a value but, since you are interested only in the underlying operations it performs, you don't assign its output to a local variable defined in the caller method, for example:
public static void Main(string[] args)
{
// I want to modify the records but I'm not interested
// in knowing how many of them have been modified.
ModifyRecords();
}
public static Int32 ModifyRecords()
{
Int32 affectedRecords = 0;
for (Int32 i = 0; i < s_Records.Count; ++i)
{
Record r = s_Records[i];
if (String.IsNullOrWhiteSpace(r.Name))
{
r.Name = "Default Name";
++affectedRecords;
}
}
return affectedRecords;
}
Actually, I would call it a cosmetic feature... in the sense that it's a design time feature (the computations concerning the discarded variables are performed anyway) that helps keeping the code clear, readable and easy to maintain.
I find the example shown in the link you provided kinda misleading. If I try to parse a String as a Boolean, chances are I want to use the parsed value somewhere in my code. Otherwise I would just try to see if the String corresponds to the text representation of a Boolean (a regular expression, for example... even a simple if statement could do the job if casing is properly handled). I'm far from saying that this never happens or that it's a bad practice, I'm just saying it's not the most common coding pattern you may need to produce.
The example provided in this article, on the opposite, really shows the full potential of this feature:
public static void Main()
{
var (_, _, _, pop1, _, pop2) = QueryCityDataForYears("New York City", 1960, 2010);
Console.WriteLine($"Population change, 1960 to 2010: {pop2 - pop1:N0}");
}
private static (string, double, int, int, int, int) QueryCityDataForYears(string name, int year1, int year2)
{
int population1 = 0, population2 = 0;
double area = 0;
if (name == "New York City")
{
area = 468.48;
if (year1 == 1960) {
population1 = 7781984;
}
if (year2 == 2010) {
population2 = 8175133;
}
return (name, area, year1, population1, year2, population2);
}
return ("", 0, 0, 0, 0, 0);
}
From what I can see reading the above code, it seems that the discards have a higher sinergy with other paradigms introduced in the most recent versions of C# like tuples deconstruction.
For Matlab programmers, discards are far from being a new concept because the programming language implements them since very, very, very long time (probably since the beginning, but I can't say for sure). The official documentation describes them as follows (link here):
Request all three possible outputs from the fileparts function:
helpFile = which('help');
[helpPath,name,ext] = fileparts('C:\Path\data.txt');
The current workspace now contains three variables from fileparts: helpPath, name, and ext. In this case, the variables are small. However, some functions return results that use much more memory. If you do not need those variables, they waste space on your system.
Ignore the first output using a tilde (~):
[~,name,ext] = fileparts(helpFile);
The only difference is that, in Matlab, inner computations for discarded outputs are normally skipped because output arguments are flexible and you can know how many and which one of them have been requested by the caller.
I have seen discards used mainly against methods which return Task<T> but you don't want to await the output.
So in the example below, we don't want to await the output of SomeOtherMethod() so we could do something like this:
//myClass.cs
public async Task<bool> Example() => await SomeOtherMethod()
// example.cs
Example();
Except this will generate the following warning:
CS4014 Because this call is not awaited, execution of the
current method continues before the call is completed. Consider
applying the 'await' operator to the result of the call.
To mitigate this warning and essentially ensure the compiler that we know what we are doing, you can use a discard:
//myClass.cs
public async Task<bool> Example() => await SomeOtherMethod()
// example.cs
_ = Example();
No more warnings.
To add another use case to the above answers.
You can use a discard in conjunction with a null coalescing operator to do a nice one-line null check at the start of your functions:
_ = myParam ?? throw new MyException();
Many times I've done code along these lines:
TextBox.BackColor = int32.TryParse(TextBox.Text, out int32 _) ? Color.LightGreen : Color.Pink;
Note that this would be part of a larger collection of data, not a standalone thing. The idea is to provide immediate feedback on the validity of each field of the data they are entering.
I use light green and pink rather than the green and red one would expect--the latter colors are dark enough that the text becomes a bit hard to read and the meaning of the lighter versions is still totally obvious.
(In some cases I also have a Color.Yellow to flag something which is not valid but neither is it totally invalid. Say the parser will accept fractions and the field currently contains "2 1". That could be part of "2 1/2" so it's not garbage, but neither is it valid.)
Discard pattern can be used with a switch expression as well.
string result = shape switch
{
Rectangule r => $"Rectangule",
Circle c => $"Circle",
_ => "Unknown Shape"
};
For a list of patterns with discards refer to this article: Discards.
Consider this:
5 + 7;
This "statement" performs an evaluation but is not assigned to something. It will be immediately highlighted with the CS error code CS0201.
// Only assignment, call, increment, decrement, and new object expressions can be used as a statement
https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/compiler-messages/cs0201?f1url=%3FappId%3Droslyn%26k%3Dk(CS0201)
A discard variable used here will not change the fact that it is an unused expression, rather it will appear to the compiler, to you, and others reviewing your code that it was intentionally unused.
_ = 5 + 7; //acceptable
It can also be used in lambda expressions when having unused parameters:
builder.Services.AddSingleton<ICommandDispatcher>(_ => dispatcher);

is there shorter way of calling non static method in c#?

I have a class StringFormatter which contains method RemoveCharFromString.
For a long time, I have been creating a new instance of a class and then use it like the following:
[...]
StringFormat sf = new StringFormat();
string exampleString = sf.RemoveCharFromString(inputString, '%');
[...]
Now I came to a point where I just have to use this method a single time in one class. I thought there might be a shorter way of accomplishing the above code such as:
[...]
string exampleString = new StringFormat.RemoveCharFromString(inputString, '%');
[...]
Is there something for that?
You can instantiate a class and call one of it's methods directly - your second code sample just needs a parenthesis after the constructor:
string exampleString = new StringFormatter().RemoveCharFromString(inputString, '%');
However - there are things to consider here, without knowing the insides of the method:
The method's name suggests it's basically removing a specific char from the string - If it removes all occurrences of said char, why not just use string.Replace()?
Since this method seems to be getting all the information it needs from it's arguments and does not rely on, nor changes the state of the StringFormatter instance, why not make it a static method?
Sounds to me like the StringFormatter class is a bunch of methods which works on the type string. One option, could therefore be to consider to use extensions methods on the string type instead
public static class StringFormatter
{
public static string RemoveCharFromString(this string value, char charToRemove)
{
//do your logic and then return a string
}
}
Then use it
var exampleString = inputString.RemoveCharFromString('%');

Avoidable boxing in string interpolation

Using string interpolation makes my string format looks much more clear, however I have to add .ToString() calls if my data is a value type.
class Person
{
public string Name { get; set; }
public int Age { get; set; }
}
var person = new Person { Name = "Tom", Age = 10 };
var displayText = $"Name: {person.Name}, Age: {person.Age.ToString()}";
The .ToString() makes the format longer and uglier. I tried to get rid of it, but string.Format is a built-in static method and I can't inject it. Do you have any ideas about this? And since string interpolation is a syntax sugar of string.Format, why don't they add .ToString() calls when generating the code behind the syntax sugar? I think it's doable.
I do not see how you can avoid that boxing by compiler. Behavior of string.Format is not part of C# specification. You can not rely on that it will call Object.ToString(). In fact it does not:
using System;
public static class Test {
public struct ValueType : IFormattable {
public override string ToString() => "Object.ToString";
public string ToString(string format, IFormatProvider formatProvider) => "IFormattable.ToString";
}
public static void Main() {
ValueType vt = new ValueType();
Console.WriteLine($"{vt}");
Console.WriteLine($"{vt.ToString()}");
}
}
With the recent release of C# 10 and .NET 6 things changed. The compiler has been optimised to handle interpolated strings better. What I write here I took from a post by Stephen Toub.
In essence: earlier versions of .NET translated an interpolated string $"{major}.{minor}.{build}.{revision}" (with the variables all being integers) into something like this
var array = new object[4];
array[0] = major;
array[1] = minor;
array[2] = build;
array[3] = revision;
string.Format("{0}.{1}.{2}.{3}", array);
Since C#10 the compiler can use "interpolated string handlers". The above string can be translated into:
var handler = new DefaultInterpolatedStringHandler(literalLength: 3, formattedCount: 4);
handler.AppendFormatted(major);
handler.AppendLiteral(".");
handler.AppendFormatted(minor);
handler.AppendLiteral(".");
handler.AppendFormatted(build);
handler.AppendLiteral(".");
handler.AppendFormatted(revision);
return handler.ToStringAndClear();
According to the author that does not only remove the need for boxing, it introduces further enhancements. As a result the new approach achieves "40% throughput improvement and an almost 5x reduction in memory allocation" (Stephen Toub).
And this is only the tip of the iceberg. Besides interpolated string handlers further optimisations in the interpretation of interpolated strings were introduced to C#10.
So I stopped worrying too much about performance and focus on clarity and readability instead. Chances are that we spend a lot of time trying to outsmart the compiler at the cost of code clarity without achieving anything substantial. I'd say, unless we have serious performance issues, we can use interpolated strings without taking complicated detours.
Limitation. Situation may be different when we translate strings. Then our format string becomes a dynamic resource, right? Now the compiler is unable to interpret the string before runtime and we're back to string.Format again.

Inline use of function returning more than one value in C#

I am used to using functions that return a single value "in-line" like so:
Label1.Text = firstString + functionReturnSecondString(aGivenParameter);
Can this be done for a function that returns two values?
Hypothetical example:
label1.Text = multipleReturnFunction(parameter).firstValue
I have been looking into returning more than one value and it looks like the best options are using a tuple, struct, or an array list.
I made a working function that retuns a struct. However the way I got it to work I need to first call the function, then I can use the values. It doesn't seem possible to make it happen all on the same line without writing another function.
multipleReturnFunction(parameter);
Label1.Text = firstString + classOfStruct.secondString;
I haven't made a function that returns a tuple or array list yet, so I'm not sure. Is it possible to call those functions and reference the return values "inline"?
I appreciate your feedback.
I have a grotty hack for exactly this type of scenario - when you want to perform multiple operations on the return value without defining an extra variable to store it:
public static TResult Apply<TInput, TResult>(this TInput input, Func<TInput, TResult> transformation)
{
return transformation(input);
}
... and here's the reason it came about in the first place:
var collection = Enumerable.Range(1, 3);
// Average reimplemented with Aggregate.
double average = collection
.Aggregate(
new { Count = 0, Sum = 0 },
(acc, i) => new { Count = acc.Count + 1, Sum = acc.Sum + i })
.Apply(a => (double)a.Sum / (double)a.Count); // Note: we have access to both Sum and Count despite never having stored the result of the call to .Aggregate().
Console.WriteLine("Average: {0}", average);
Needless to say this is better suited for academic exercises than actual production code.
Alternatively, use the ref or they out keyword.
Example:
int a = 0, b = 0;
void DoSomething(ref int a, ref int b) {
a = 1;
b = 2;
}
Console.WriteLine(a); // Prints 1
Console.WriteLine(b); // Prints 2
It's not inline and I personally would consider a class or a struct before using the ref or the out keyword. Let's consider the theory: when you want to return multiple things, you have in fact an object that has multiple properties which you want to make available to the caller of your function.
Therefore it is much more correct to actually create an object (either by using a class or a struct) that represents what you want to make available and returning that.
The only time I use the ref or the out keyword is when using DLL imports because those functions often have pointers as their calling arguments and I personally don't see any benefit in using them in your typical normal application.
To do this inline, I think you would have to have another method that takes your struct and gives you the string you are looking for.
public string NewMethod(object yourStruct)
{
return string.Format("{0} {1}", yourStruct.value1, yourStruct.value2);
}
Then in the page, you do this:
Label1.Text = NewMethod(multipleReturnFunction(parameter));
C# doesn't have Inline functions, but it does support anonymous functions which can be closures.
With these techniques, you can say:
var firstString=default(String);
var secondString=default(String);
((Action<String>)(arg => {
firstString="abc"+arg;
secondString="xyz";
}))("wtf");
label1.Text=firstString+secondString;
Debug.Print("{0}", label1.Text);
((Action<String>)(arg => {
firstString="123"+arg;
secondString="456";
}))("???");
label1.Text=firstString+secondString;
Debug.Print("{0}", label1.Text);
or name the delegate and reuse it:
var firstString=default(String);
var secondString=default(String);
Action<String> m=
arg => {
firstString="abc"+arg;
secondString="xyz";
};
m("wtf");
label1.Text=firstString+secondString;
Debug.Print("{0}", label1.Text);
m("???");
label1.Text=firstString+secondString;
Debug.Print("{0}", label1.Text);
So, do you really need a method returns multiple values?
Each method can return only one value. Thats how methods defined in .NET
Methods are declared in a class or struct by specifying the access
level such as public or private, optional modifiers such as abstract
or sealed, the return value, the name of the method, and any method
parameters
If you need to return more than one value from method, then you have three options:
Return complex type which will hold all values. That cannot help you in this case, because you will need local variable to store value returned by method.
Use out parameters. Also not your case - you will need to declare parameters before method call.
Create another method, which does all work and returns single value.
Third option looks like
Label1.Text = AnotherMethod(parameters);
And implementation
public string AnotherMethod(parameters)
{
// use option 1 or 2 to get both values
// return combined string which uses both values and parameters
}
BTW One more option - do not return values at all - you can use method which sets several class fields.

c# performance: type comparison vs. string comparison

Which is faster? This:
bool isEqual = (MyObject1 is MyObject2)
Or this:
bool isEqual = ("blah" == "blah1")
It would be helpful to figure out which one is faster. Obviously, if you apply .ToUpper() to each side of the string comparison like programmers often do, that would require reallocating memory which affects performance. But how about if .ToUpper() is out of the equation like in the above sample?
I'm a little confused here.
As other answers have noted, you're comparing apples and oranges. ::rimshot::
If you want to determine if an object is of a certain type use the is operator.
If you want to compare strings use the == operator (or other appropriate comparison method if you need something fancy like case-insensitive comparisons).
How fast one operation is compared to the other (no pun intended) doesn't seem to really matter.
After closer reading, I think that you want to compare the speed of string comparisions with the speed of reference comparisons (the type of comparison used in the System.Object base type).
If that's the case, then the answer is that reference comparisons will never be slower than any other string comparison. Reference comparison in .NET is pretty much analogous to comparing pointers in C - about as fast as you can get.
However, how would you feel if a string variable s had the value "I'm a string", but the following comparison failed:
if (((object) s) == ((object) "I'm a string")) { ... }
If you simply compared references, that might happen depending on how the value of s was created. If it ended up not being interned, it would not have the same reference as the literal string, so the comparison would fail. So you might have a faster comparison that didn't always work. That seems to be a bad optimization.
According to the book Maximizing .NET Performance
the call
bool isEqual = String.Equals("test", "test");
is identical in performance to
bool isEqual = ("test" == "test");
The call
bool isEqual = "test".Equals("test");
is theoretically slower than the call to the static String.Equals method, but I think you'll need to compare several million strings in order to actually detect a speed difference.
My tip to you is this; don't worry about which string comparison method is slower or faster. In a normal application you'll never ever notice the difference. You should use the way which you think is most readable.
The first one is used to compare types not values.
If you want to compare strings with a non-sensitive case you can use:
string toto = "toto";
string tata = "tata";
bool isEqual = string.Compare(toto, tata, StringComparison.InvariantCultureIgnoreCase) == 0;
Console.WriteLine(isEqual);
How about you tell me? :)
Take the code from this Coding Horror post, and insert your code to test in place of his algorithm.
Comparing strings with a "==" operator compares the contents of the string vs. the string object reference. Comparing objects will call the "Equals" method of the object to determine whether they are equal or not. The default implementation of Equals is to do a reference comparison, returning True if both object references are the same physical object. This will likely be faster than the string comparison, but is dependent on the type of object being compared.
I'd assume that comparing the objects in your first example is going to be about as fast as it gets since its simply checking if both objects point to the same address in memory.
As it has been mentioned several times already, it is possible to compare addresses on strings as well, but this won't necessarily work if the two strings are allocated from different sources.
Lastly, its usually good form to try and compare objects based on type whenever possible. Its typically the most concrete method of identification. If your objects need to be represented by something other than their address in memory, its possible to use other attributes as identifiers.
If I understand the question and you really want to compare reference equality with the plain old "compare the contents": Build a testcase and call object.ReferenceEquals compared against a == b.
Note: You have to understand what the difference is and that you probably cannot use a reference comparison in most scenarios. If you are sure that this is what you want it might be a tiny bit faster. You have to try it yourself and evaluate if this is worth the trouble at all..
I don't feel like any of these answers address the actual question. Let's say the string in this example is the type's name and we're trying to see if it's faster to compare a type name or the type to determine what it is.
I put this together and to my surprise, it's about 10% faster to check the type name string than the type in every test I ran. I intentionally put the simplest strings and classes into play to see if it was possible to be faster, and turns out it is possible. Not sure about more complicated strings and type comparisons from heavily inherited classes. This is of course a micro-op and may possibly change at some point in the evolution of the language I suppose.
In my case, I was considering a value converter that switches based on this name, but it could also switch over the type since each type specifies a unique type name. The value converter would figure out the font awesome icon to show based on the type of item presented.
using System;
using System.Diagnostics;
using System.Linq;
namespace ConsoleApp1
{
public sealed class A
{
public const string TypeName = "A";
}
public sealed class B
{
public const string TypeName = "B";
}
public sealed class C
{
public const string TypeName = "C";
}
class Program
{
static void Main(string[] args)
{
var testlist = Enumerable.Repeat(0, 100).SelectMany(x => new object[] { new A(), new B(), new C() }).ToList();
int count = 0;
void checkTypeName()
{
foreach (var item in testlist)
{
switch (item)
{
case A.TypeName:
count++;
break;
case B.TypeName:
count++;
break;
case C.TypeName:
count++;
break;
default:
break;
}
}
}
void checkType()
{
foreach (var item in testlist)
{
switch (item)
{
case A _:
count++;
break;
case B _:
count++;
break;
case C _:
count++;
break;
default:
break;
}
}
}
Stopwatch sw = Stopwatch.StartNew();
for (int i = 0; i < 100000; i++)
{
checkTypeName();
}
sw.Stop();
Console.WriteLine(sw.Elapsed);
sw.Restart();
for (int i = 0; i < 100000; i++)
{
checkType();
}
sw.Stop();
Console.WriteLine(sw.Elapsed);
}
}
}

Categories