Why does `String.Trim()` not trim the object itself?

Not often, but sometimes, I need to use String.Trim() to remove whitespace from a string.
If it has been a while since I last wrote trimming code, I write:
string s = " text ";
s.Trim();
and am surprised that s has not changed. I need to write:
string s = " text ";
s = s.Trim();
Why are some string methods designed in this (not very intuitive) way? Is there something special with strings?

Strings are immutable. Any string operation generates a new string without changing the original string.
From MSDN:
Strings are immutable--the contents of a string object cannot be
changed after the object is created, although the syntax makes it
appear as if you can do this.

s.Trim() creates a new trimmed version of the original string and returns it instead of storing the new version in s. So, what you have to do is to store the trimmed instance in your variable:
s = s.Trim();
This pattern is followed by all the string methods and extension methods.
The decision to use this pattern does not follow from immutability alone; immutability has to do with how strings are kept in memory. These methods could have been designed to create the new modified string instance in memory and point the variable to the new instance.
It's also good to remember that if you need to make lots of modifications to a string, it's much better to use a StringBuilder, which behaves like a "mutable" string and is much more efficient for this kind of operation.
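A minimal sketch of the behaviour described above. The class and helper names (`TrimDemo`, `Trimmed`) are made up for illustration; only `String.Trim()` comes from the discussion.

```csharp
using System;

public static class TrimDemo
{
    // Hypothetical helper: returns the trimmed copy; the caller's string is untouched.
    public static string Trimmed(string s) => s.Trim();

    public static void Main()
    {
        string s = " text ";
        s.Trim();                       // result is discarded; s still has its spaces
        Console.WriteLine($"[{s}]");    // prints "[ text ]"

        s = s.Trim();                   // s now refers to the new, trimmed string
        Console.WriteLine($"[{s}]");    // prints "[text]"
    }
}
```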

As it is written in MSDN Library:
A String object is called immutable (read-only), because its value
cannot be modified after it has been created. Methods that appear to
modify a String object actually return a new String object that
contains the modification.
Because strings are immutable, string manipulation routines that
perform repeated additions or deletions to what appears to be a single
string can exact a significant performance penalty.

In addition to all the good answers, I also feel that one reason is thread safety.
Let's say
string s = " any text ";
s.Trim();
If Trim() modified s in place, nothing would stop another thread from modifying s at the same time. If the same string were being modified, say the other thread removed 'a' from s, then what would the result of s.Trim() be?
But because it returns a new string, Trim can work on a local copy and return the modified string, even if s is being changed by the other thread.

Related

Most efficient way of adding/removing a character to beginning of string?

I was doing a small 'scalable' C# MVC project, with quite a bit of read/write to a database.
From this, I would need to add/remove the first letter of the input string.
'Removing' the first character is quite easy (using a Substring method) - using something like:
String test = "HHello world";
test = test.Substring(1,test.Length-1);
'Adding' a character efficiently seems to be messy/awkward:
String test = "ello World";
test = "H" + test;
Seeing as this will be done for a lot of records, would this be the most efficient way of doing these operations?
I am also testing whether a string starts with the letter 'T', and adding 'T' if it doesn't, by using:
String test = "Hello World";
if (test[0] != 'T')
{
    test = "T" + test;
}
and would like to know if this would be suitable for this.
If you have several records and need to prepend a character to a field in each of them, you can use String.Insert with an index of 0 http://msdn.microsoft.com/it-it/library/system.string.insert(v=vs.110).aspx
yourString = yourString.Insert(0, "C");
This will do pretty much the same as what you wrote in your original post, but since it seems you prefer to use a method and not an operator...
If you have to prepend a character several times to a single string, then you're better off using a StringBuilder http://msdn.microsoft.com/it-it/library/system.text.stringbuilder(v=vs.110).aspx
I think both are equally efficient, since both require a new string to be created, because string is immutable.
When doing this on the same string multiple times, a StringBuilder might come in handy. That will improve performance over repeated concatenation.
You could also opt to move this operation to the database side if possible. That might increase performance too.
For removing, I would use the Remove method, as it doesn't require knowing the length of the string:
test = test.Remove(0, 1);
For adding, you could use Insert at index 0:
test = test.Insert(0, "H");
If you are always removing and then adding a character, you can copy the string to a char array, replace the first character, and build a new string from the array:
char[] chars = test.ToCharArray(); chars[0] = 'H';
test = new string(chars);
When doing lots of operations on the same string I would use a StringBuilder though: more expensive to create, but faster operations on the string.
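The add/remove operations discussed above could be wrapped up roughly as follows. This is only a sketch: `PrefixHelpers`, `RemoveFirst` and `EnsurePrefix` are hypothetical names, and `EnsurePrefix` assumes a non-null input.

```csharp
using System;

public static class PrefixHelpers
{
    // Hypothetical helper: drops the first character; null or empty input is returned as-is.
    public static string RemoveFirst(string s) =>
        string.IsNullOrEmpty(s) ? s : s.Substring(1);

    // Hypothetical helper: prepends prefix unless s already starts with it.
    // The length check avoids the exception that s[0] throws on an empty string.
    public static string EnsurePrefix(string s, char prefix) =>
        s.Length > 0 && s[0] == prefix ? s : prefix + s;

    public static void Main()
    {
        Console.WriteLine(RemoveFirst("HHello world"));      // prints "Hello world"
        Console.WriteLine(EnsurePrefix("Hello World", 'T')); // prints "THello World"
        Console.WriteLine(EnsurePrefix("Test", 'T'));        // prints "Test"
    }
}
```

Each call still allocates one new string, which is unavoidable with an immutable type; the helpers only remove the need to track lengths and guard against empty input.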

Replacing backslash in a string

I am having a few problems trying to replace backslashes in a date string in C# .NET.
So far I am using:
string.Replace(@"\","-")
but it hasn't done the replacement. Could anyone please help?
string.Replace does not modify the string itself but returns a new string, which most likely you are throwing away. Do this instead:
myString = myString.Replace(@"\","-");
On a side note, this kind of operation is usually seen in code that manually mucks around with formatted date strings. Most of the time there is a better way to do what you want (which is?) than things like this.
As everyone is saying, you need to assign the value back to the variable,
so it should be
val1 = val1.Replace(@"\","-");
or
val1 = val1.Replace("\\","-");
The following on its own will not work:
val1.Replace(@"\","-");
Use it this way:
oldstring = oldstring.Replace(@"\","-");
Look at the return type of String.Replace.
It's a function which returns a corrected string. If it simply changed the old string, it would have a void return type.
You could also use:
myString = myString.Replace('\\', '-');
but just letting you know, date slashes are usually forward ones (/), not backslashes (\).
As others have suggested, String.Replace doesn't update the original string object; it returns a new string instead.
myString = myString.Replace(@"\","-");
It's worthwhile to understand that string is immutable in C#, basically to make it thread-safe.
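To tie the answers in this thread together, here is a small sketch showing that the verbatim literal `@"\"` and the escaped literal `"\\"` denote the same one-character string, and that the result of `Replace` has to be captured. `BackslashDemo` and `ToDashes` are illustrative names, and the sample date is made up.

```csharp
using System;

public static class BackslashDemo
{
    // Hypothetical helper: returns a copy of the input with every backslash turned into a dash.
    public static string ToDashes(string date) => date.Replace(@"\", "-");

    public static void Main()
    {
        string date = @"01\02\2013";

        date.Replace(@"\", "-");              // return value thrown away; date is unchanged
        Console.WriteLine(date);              // prints "01\02\2013"

        date = ToDashes(date);
        Console.WriteLine(date);              // prints "01-02-2013"

        Console.WriteLine(@"\" == "\\");      // prints "True"
    }
}
```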

How to generate a unique string from a string collection?

I need a way to convert a collection of strings into a unique string. This means that I need to get a different string if any of the strings inside the collection has changed.
I'm working on a big solution, so I may not be able to adopt some of the better ideas. The required unique string will be used to compare the two collections, so different strings mean different collections. I cannot compare the strings inside one by one because the order may change, and the solution is already built to return a result based on a comparison of two strings. This is an add-on. The generated string will be passed as a parameter for this comparison.
Thank you!
These both work by deciding to use the separator character ":" and also using an escape character to make it clear when we mean something else by the separator character. We therefore just need to escape all our strings before concatenating them with our separator in between. This gives us unique strings for every collection. All we need to do if we want to make collections the same regardless of order is to sort our collection before we do anything. I should add that my sample uses LINQ and thus assumes the collection implements IEnumerable<string> and that you have a using directive for System.Linq.
You can wrap that up in a function as follows
string GetUniqueString(IEnumerable<string> Collection, bool OrderMatters = true, string Escape = "/", string Separator = ":")
{
    if (Escape == Separator)
        throw new ArgumentException("Escape character should never equal separator character because it fails in the case of empty strings");
    if (!OrderMatters)
        Collection = Collection.OrderBy(v => v); // Sorting fixes ordering issues.
    // Escape each string, then join. string.Join also copes with an empty
    // collection, where Aggregate would throw.
    return string.Join(Separator, Collection
        .Select(v => v.Replace(Escape, Escape + Escape).Replace(Separator, Escape + Separator)));
}
What about using a hash function?
Considering your constraints, use a delimited approach:
pick a delimiter and an escape method.
E.g. use ; and escape it within strings by \;, also escape \ by \\
So this list of strings...
"A;bc"
"D\ef;"
...becomes "A\;bc;D\\ef\;"
It ain't pretty, but considering that it has to be a string, then the good old ways of csv and its brethren isn't all too bad.
By a "collection string" you mean "collection of strings"?
Here's a naive (but working) approach: sort the collection (to eliminate dependency on order), concat them, and take a hash of that (MD5 for instance).
Trivial to implement, but not very clever performance-wise.
Are you saying that you need to encode a string collection as a string? So, for example, the collection {"abc", "def"} may be encoded as "sDFSDFSDFSD" but {"a", "b"} might be encoded as "SDFeg". If so, and you don't care about unique keys, then you could use something like SHA or MD5.
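The "sort, escape, join, hash" idea from the answers above might be sketched like this. The names `CollectionFingerprint` and `Compute` are hypothetical, SHA-256 is swapped in for the MD5 mentioned above, and the "/" and ":" characters follow the earlier escaping answer. Note that a hash can in principle collide, and that an empty collection and a collection holding one empty string produce the same canonical form here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class CollectionFingerprint
{
    // Hypothetical helper: order-insensitive fingerprint of a string collection.
    public static string Compute(IEnumerable<string> items)
    {
        // Ordinal sort gives a culture-independent canonical order;
        // escaping keeps {"a:b"} distinct from {"a", "b"}.
        string canonical = string.Join(":", items
            .OrderBy(v => v, StringComparer.Ordinal)
            .Select(v => v.Replace("/", "//").Replace(":", "/:")));

        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(canonical));
            return BitConverter.ToString(hash).Replace("-", ""); // 64 hex characters
        }
    }

    public static void Main()
    {
        // Same items in a different order give the same fingerprint.
        Console.WriteLine(Compute(new[] { "a", "b" }) == Compute(new[] { "b", "a" })); // prints "True"
    }
}
```

If the fixed-length result is not needed, the canonical string itself can be compared directly, which avoids the collision question entirely.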

Why string.Replace("X","Y") works only when assigned to new string?

I guess it has something to do with string being a reference type, but I don't get why simply string.Replace("X","Y") does not work.
Why do I need to do string A = stringB.Replace("X","Y")? I thought it is just a method to be done on specified instance.
EDIT: Thank you so far. I extend my question: Why does b+="FFF" work but b.Replace does not?
Because strings are immutable. Any time you change a string, .NET creates a new string object. It's a property of the class.
Immutable objects
String Object
Why doesn't stringA.Replace("X","Y") work?
Why do I need to do stringB = stringA.Replace("X","Y"); ?
Because strings are immutable in .NET. You cannot change the value of an existing string object, you can only create new strings. string.Replace creates a new string which you can then assign to something if you wish to keep a reference to it. From the documentation:
Returns a new string in which all occurrences of a specified string in the current instance are replaced with another specified string.
Emphasis mine.
So if strings are immutable, why does b += "FFF"; work?
Good question.
First note that b += "FFF"; is equivalent to b = b + "FFF"; (except that b is only evaluated once).
The expression b + "FFF" creates a new string with the correct result without modifying the old string. The reference to the new string is then assigned to b replacing the reference to the old string. If there are no other references to the old string then it will become eligible for garbage collection.
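The rebinding described above can be observed directly by keeping a second reference to the original string. This is an illustrative sketch; `AppendDemo` and `SameObjectAfterAppend` are made-up names.

```csharp
using System;

public static class AppendDemo
{
    // Hypothetical helper: reports whether b still refers to the same object after +=.
    public static bool SameObjectAfterAppend(string b)
    {
        string original = b;    // second reference to the same string object
        b += "FFF";             // builds a new string and rebinds the local b
        return ReferenceEquals(original, b);
    }

    public static void Main()
    {
        string b = "abc";
        string original = b;

        b += "FFF";

        Console.WriteLine(original);                      // prints "abc"
        Console.WriteLine(b);                             // prints "abcFFF"
        Console.WriteLine(ReferenceEquals(original, b));  // prints "False"
    }
}
```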
Strings are immutable, which means that once they are created, they cannot be changed anymore. This has several reasons, as far as I know mainly for performance (how strings are represented in memory).
See also (among many):
http://en.wikipedia.org/wiki/Immutable_object
http://channel9.msdn.com/forums/TechOff/58729-Why-are-string-types-immutable-in-C/
As a direct consequence of that, each string operation creates a new string object. In particular, if you do things like
foreach (string msg in messages)
{
    totalMessage = totalMessage + msg;
    totalMessage = totalMessage + "\n";
}
you actually create potentially dozens or hundreds of string objects. So, if you want to do more sophisticated string manipulation, follow GvS's hint and use the StringBuilder.
Strings are immutable. Any operation changing them has to create a new string.
A StringBuilder supports the inline Replace method.
Use the StringBuilder if you need to do a lot of string manipulation.
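A short sketch of both points, assuming nothing beyond `System.Text.StringBuilder`; the class and method names (`BuilderDemo`, `JoinLines`, `ReplaceInPlace`) are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class BuilderDemo
{
    // Appends every message plus a newline into one buffer instead of
    // allocating a new string on each loop iteration.
    public static string JoinLines(IEnumerable<string> messages)
    {
        var sb = new StringBuilder();
        foreach (string msg in messages)
        {
            sb.Append(msg).Append('\n');
        }
        return sb.ToString();
    }

    // StringBuilder.Replace changes the builder itself, so no assignment is needed.
    public static string ReplaceInPlace(string text)
    {
        var sb = new StringBuilder(text);
        sb.Replace("X", "Y");
        return sb.ToString();
    }

    public static void Main()
    {
        Console.Write(JoinLines(new[] { "one", "two" }));   // prints "one" and "two" on separate lines
        Console.WriteLine(ReplaceInPlace("aXbX"));          // prints "aYbY"
    }
}
```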
Why does b += "FFF" work but b.Replace does not?
Because the += operator assigns the results back to the left hand operand, of course. It's just a short hand for b = b + "FFF";.
The simple fact is that you can't change any string in .Net. There are no instance methods for strings that alter the content of that string - you must always assign the results of an operation back to a string reference somewhere.
Yes its a method of System.String. But you can try
a = a.Replace("X","Y");
String.Replace is an instance method of the string class that returns a new string. It does not modify the current object. b.Replace("a","b") would be similar to a line that only has c+1. So just like c=c+1 actually sets the value of c+1 to c, b=b.Replace("a","b") sets the new string returned to b.
As everyone above has said, strings are immutable.
This means that when you do your replace, you get a new string, rather than changing the existing string.
If you don't store this new string in a variable (such as in the variable that it was declared as) your new string won't be saved anywhere.
To answer your extended question, b+="FFF" is equivalent to b = b + "FFF", so basically you are creating a new string here also.
Just to be more explicit. string.Replace("X","Y") returns a new string...but since you are not assigning the new string to anything the new string is lost.

Declaring a looooong single line string in C#

Is there a decent way to declare a long single line string in C#, such that it isn't impossible to declare and/or view the string in an editor?
The options I'm aware of are:
1: Let it run. This is bad because your string trails way off to the right of the screen, forcing a developer reading the message to scroll annoyingly to read it.
string s = "this is my really long string. this is my really long string. this is my really long string. this is my really long string. this is my really long string. this is my really long string. this is my really long string. this is my really long string. ";
2: @ + newlines. This looks nice in code, but introduces newlines to the string. Furthermore, if you want it to look nice in code, not only do you get newlines, but you also get awkward spaces at the beginning of each line of the string.
string s = @"this is my really long string. this is my long string.
             this line will be indented way too much in the UI.
This line looks silly in code. All of them suffer from newlines in the UI.";
3: "" + ... This works fine, but is super frustrating to type. If I need to add half a line's worth of text somewhere I have to update all kinds of +'s and move text all around.
string s = "this is my really long string. this is my long string. " +
"this will actually show up properly in the UI and looks " +
"pretty good in the editor, but is just a pain to type out " +
"and maintain";
4: string.Format or string.Concat. Basically the same as above, but without the plus signs. Has the same benefits and downsides.
Is there really no way to do this well?
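For reference, option 4 might look roughly like the sketch below, using the `string.Concat(params string[])` overload. `LongStringDemo` and `Build` are illustrative names, and the text is borrowed from the option 3 example.

```csharp
using System;

public static class LongStringDemo
{
    // Each piece sits on its own line as an argument; there are no "+" operators,
    // but commas still have to be maintained when lines are added or moved.
    public static string Build() => string.Concat(
        "this is my really long string. this is my long string. ",
        "this will actually show up properly in the UI and looks ",
        "pretty good in the editor");

    public static void Main()
    {
        Console.WriteLine(Build());
    }
}
```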
There is a way. Put your very long string in resources. You can even put there long pieces of text because it's where the texts should be. Having them directly in code is a real bad practice.
If you really want this long string in the code, and you really don't want to type the end-quote-plus-begin-quote, then you can try something like this.
string longString = @"Some long string,
    with multiple whitespace characters
    (including newlines and carriage returns)
    converted to a single space
    by a regular expression replace.";
longString = Regex.Replace(longString, @"\s+", " ");
If using Visual Studio
Tools > Options > Text Editor > All Languages > Word Wrap
I'm sure any other text editor (including notepad) will be able to do this!
It depends on how the string is going to wind up being used. All the answers here are valid, but context is important. If long string "s" is going to be logged, it should be surrounded with a logging guard test, such as this Log4net example:
if (log.IsDebugEnabled)
{
    // Whatever concatenation you think looks best can be used here,
    // since it's guarded...
    string s = "blah blah blah" +
        "blah blah blah";
    log.Debug(s);
}
If the long string s is going to be displayed to a user, then Developer Art's answer is the best choice: those should be in a resource file.
For other uses (generating SQL query strings, writing to files [but consider resources again for these], etc...), where you are concatenating more than just literals, consider StringBuilder as Wael Dalloul suggests, especially if your string might possibly wind up in a function that just may, at some date in the distant future, be called many many times in a time-critical application (All those invocations add up). I do this, for example, when building a SQL query where I have parameters that are variables.
Other than that, no, I don't know of anything that both looks pretty and is easy to type (though the word wrap suggestion is a nice idea, it may not translate well to diff tools, code print outs, or code review tools). Those are the breaks. (I personally use the plus-sign approach to make the line-wraps neat for our print outs and code reviews).
you can use StringBuilder like this:
StringBuilder str = new StringBuilder();
str.Append("this is my really long string. this is my long string. ");
str.Append("this is my really long string. this is my long string. ");
str.Append("this is my really long string. this is my long string. ");
str.Append("this is my really long string. this is my long string. ");
string s = str.ToString();
You can also use: Text files, resource file, Database and registry.
Does it have to be defined in the source file? Otherwise, define it in a resource or config file.
Personally I would read a string that big from a file perhaps an XML document.
You could use StringBuilder
For really long strings, I'd store it in XML (or a resource). For occasions where it makes sense to have it in the code, I use the multiline string concatenation with the + operator. The only place I can think of where I do this, though, is in my unit tests for code that reads and parses XML where I'm actually trying to avoid using an XML file for testing. Since it's a unit test I almost always want to have the string right there to refer to as well. In those cases I might segregate them all into a #region directive so I can show/hide it as needed.
I either just let it run, or use string.Format and write the string on one line (the let-it-run method) but put each of the arguments on a new line, which either makes it easier to read or at least gives the reader some idea of what to expect in the long string without reading it in detail.
Use the Project / Properties / Settings from the top menu of Visual Studio. Make the scope = "Application".
In the Value box you can enter very long strings and as a bonus line feeds are preserved. Then your code can refer to that string like this:
string sql = Properties.Settings.Default.xxxxxxxxxxxxx;
