Is string.ToCharArray() necessary for enumerating?

Is string.ToCharArray() necessary for enumerating? - c#

Why on Earth would someone convert a string to a char[] before enumerating the characters in it? The regular pattern for initializing a System.Security.SecureString found all around the net follows:
SecureString secureString = new SecureString();
foreach (char c in "fizzbuzz".ToCharArray())
{
secureString.AppendChar(c);
}
Calling ToCharArray() makes no sense for me. Could someone tell whether I am wrong here?

Since string implements IEnumerable<char>, it's not necessary in this context. The only time you need ToCharArray is when you actually need an array.
My guess is that most people who call ToCharArray don't know that string implements IEnumerable (even though as far as I know it always has).

Actually that is bad because it
enumerates on the string
creates an array with a copy of all chars
enumerates on that
A lot of unnecessary work...
That's because strings are immutable, and giving out a pointer to its internal array is not allowed, so ToCharArray() makes a copy instead of a cast.
That is:
you can enumerate a string as a char[] but:
you cannot:
var chars = (char[])"string";
If you would go enumerating and would want to break at some point, by using the toCharArray() would already have enumerated the whole string nad made a copy of all chars. If the string would be very big, that is quite expensive...

Related

c# Remove elements from List containing string [duplicate]

What would be the fastest way to check if a string contains any matches in a string array in C#? I can do it using a loop, but I think that would be too slow.

Using LINQ:
return array.Any(s => s.Equals(myString))
Granted, you might want to take culture and case into account, but that's the general idea.
Also, if equality is not what you meant by "matches", you can always you the function you need to use for "match".

I really couldn't tell you if this is absolutely the fastest way, but one of the ways I have commonly done this is:
This will check if the string contains any of the strings from the array:
string[] myStrings = { "a", "b", "c" };
string checkThis = "abc";
if (myStrings.Any(checkThis.Contains))
{
MessageBox.Show("checkThis contains a string from string array myStrings.");
}
To check if the string contains all the strings (elements) of the array, simply change myStrings.Any in the if statement to myStrings.All.
I don't know what kind of application this is, but I often need to use:
if (myStrings.Any(checkThis.ToLowerInvariant().Contains))
So if you are checking to see user input, it won't matter, whether the user enters the string in CAPITAL letters, this could easily be reversed using ToLowerInvariant().
Hope this helped!

That works fine for me:
string[] characters = new string[] { ".", ",", "'" };
bool contains = characters.Any(c => word.Contains(c));

You could combine the strings with regex or statements, and then "do it in one pass," but technically the regex would still performing a loop internally. Ultimately, looping is necessary.

If the "array" will never change (or change only infrequently), and you'll have many input strings that you're testing against it, then you could build a HashSet<string> from the array. HashSet<T>.Contains is an O(1) operation, as opposed to a loop which is O(N).
But it would take some (small) amount of time to build the HashSet. If the array will change frequently, then a loop is the only realistic way to do it.

String character array

Isn't a string already a character array in c#? Why is there a explicit ToCharacterArray function? I stumbled upon this when I was looking upon ways to reverse a string and saw a few answers converting the string to a character array first before proceeding with the loop to reverse the string. I am a beginner in coding.
Sorry if this seems stupid, but I didn't get the answer by googling.

Isn't a string already a character array in c# ?
The underlying implementation is, yes.
But you are not allowed to directly access that. String is using encapsulation to be an immutable object.
The actual array is private and hidden from view. You can use an indexer (property) to read characters but you cannot change them. The indexer is read only.
So yes, you do need ToCharacterArray() for reversing and similar actions. Note that you always end up with a different string, you cannot alter the original.

Isn't a string already a character array in c# ?
No, a string is a CLASS that encapsulates a "sequential collection of characters" (see Docs). Notice it doesn't explicitly say an "Array of Char". Now, it may be true that the string class currently uses a character array to accomplish this, but that doesn't mean it ~must~ use a character array to achieve that end. This is a fundamental concept of Object Oriented Programming that combines information hiding and the idea of a "black box" that does something. It doesn't matter how the black box (class) accomplishes its task under the hood, as long is it doesn't change the public interface presented to the end user. Perhaps, in the next version of .Net, some new-fangled magical structure that is not an array of characters will be used to implement the string class. The end user may not be aware that this change has even occurred because they can still use the string class in the same way, and if they so desire, could still output the characters to an array with ToCharArray()...even though internally the string is no longer an array of characters.

Yes String type is a character array but string array is not an character array you must have to convert each string in your array in char type so that you can easily reverse its indexes and then convert it into temporary string and then add that string to array to be reversed

Internally, the text is stored as a sequential read-only collection of Char objects.
See Programming Guide Docs

Console.WriteLine(StringHelper.ReverseString("framework"));
Console.WriteLine(StringHelper.ReverseString("samuel"));
Console.WriteLine(StringHelper.ReverseString("example string"));
OR
public static string ReverseString(string s)
{
char[] arr = s.ToCharArray();
Array.Reverse(arr);
return new string(arr);
}

Most efficient way of adding/removing a character to beginning of string?

I was doing a small 'scalable' C# MVC project, with quite a bit of read/write to a database.
From this, I would need to add/remove the first letter of the input string.
'Removing' the first character is quite easy (using a Substring method) - using something like:
String test = "HHello world";
test = test.Substring(1,test.Length-1);
'Adding' a character efficiently seems to be messy/awkward:
String test = "ello World";
test = "H" + test;
Seeing as this will be done for a lot of records, would this be be the most efficient way of doing these operations?
I am also testing if a string starts with the letter 'T' by using, and adding 'T' if it doesn't by:
String test = "Hello World";
if(test[0]!='T')
{
test = "T" + test;
}
and would like to know if this would be suitable for this

If you have several records and to each of the several records field you need to append a character at the beginning, you can use String.Insert with an index of 0 http://msdn.microsoft.com/it-it/library/system.string.insert(v=vs.110).aspx
string yourString = yourString.Insert( 0, "C" );
This will pretty much do the same of what you wrote in your original post, but since it seems you prefer to use a Method and not an operator...
If you have to append a character several times, to a single string, then you're better using a StringBuilder http://msdn.microsoft.com/it-it/library/system.text.stringbuilder(v=vs.110).aspx

Both are equally efficient I think since both require a new string to be initialized, since string is immutable.
When doing this on the same string multiple times, a StringBuilder might come in handy when adding. That will increase performance over adding.
You could also opt to move this operation to the database side if possible. That might increase performance too.

For removing I would use the remove command as this doesn't require to know the length of the string:
test = test.Remove(0, 1);
You could also treat the string as an array for the Add and use
test = test.Insert(0, "H");
If you are always removing and then adding a character you can treat the string as an array again and just replace the character.
test = (test.ToCharArray()[0] = 'H').ToString();
When doing lots of operations to the same string I would use a StringBuilder though, more expensive to create but faster operations on the string.

Why does `String.Trim()` not trim the object itself?

Not often but sometimes I need to use String.Trim() to remove whitespaces of a string.
If it was a longer time since last trim coding I write:
string s = " text ";
s.Trim();
and be surprised why s is not changed. I need to write:
string s = " text ";
s = s.Trim();
Why are some string methods designed in this (not very intuitive) way? Is there something special with strings?

Strings are immutable. Any string operation generates a new string without changing the original string.
From MSDN:
Strings are immutable--the contents of a string object cannot be
changed after the object is created, although the syntax makes it
appear as if you can do this.

s.Trim() creates a new trimmed version of the original string and returns it instead of storing the new version in s. So, what you have to do is to store the trimmed instance in your variable:
s = s.Trim();
This pattern is followed in all the string methods and extension methods.
The fact that string is immutable doesn't have to do with the decision to use this pattern, but with the fact of how strings are kept in memory. This methods could have been designed to create the new modified string instance in memory and point the variable to the new instance.
It's also good to remember that if you need to make lots of modifications to a string, it's much better to use an StringBuilder, which behaves like a "mutable" string, and it's much more eficient doing this kind of operations.

As it is written in MSDN Library:
A String object is called immutable (read-only), because its value
cannot be modified after it has been created. Methods that appear to
modify a String object actually return a new String object that
contains the modification.
Because strings are immutable, string manipulation routines that
perform repeated additions or deletions to what appears to be a single
string can exact a significant performance penalty.
See this link.

In addition to all the good answers, I also feel that the reason being Threadsaftey.
Lets say
string s = " any text ";
s.Trim();
When you say this there is nothing stopping the other thread from modifying s. If the same string is modified, lets say the other thread remove 'a' from s, then what is the result of s.Trim()?
But when it returns the new string, though it is being modified by the other thread, the trim can make a local copy modify it and return modified string.

StringBuilder: how to get the final String?

Someone told me that it's faster to concatenate strings with StringBuilder. I have changed my code but I do not see any Properties or Methods to get the final build string.
How can I get the string?

You can use .ToString() to get the String from the StringBuilder.

Once you have completed the processing using the StringBuilder, use the ToString method to return the final result.
From MSDN:
using System;
using System.Text;
public sealed class App
{
static void Main()
{
// Create a StringBuilder that expects to hold 50 characters.
// Initialize the StringBuilder with "ABC".
StringBuilder sb = new StringBuilder("ABC", 50);
// Append three characters (D, E, and F) to the end of the StringBuilder.
sb.Append(new char[] { 'D', 'E', 'F' });
// Append a format string to the end of the StringBuilder.
sb.AppendFormat("GHI{0}{1}", 'J', 'k');
// Display the number of characters in the StringBuilder and its string.
Console.WriteLine("{0} chars: {1}", sb.Length, sb.ToString());
// Insert a string at the beginning of the StringBuilder.
sb.Insert(0, "Alphabet: ");
// Replace all lowercase k's with uppercase K's.
sb.Replace('k', 'K');
// Display the number of characters in the StringBuilder and its string.
Console.WriteLine("{0} chars: {1}", sb.Length, sb.ToString());
}
}
// This code produces the following output.
//
// 11 chars: ABCDEFGHIJk
// 21 chars: Alphabet: ABCDEFGHIJK

When you say "it's faster to concatenate strings with a StringBuilder", this is only true if you are repeatedly (I repeat - repeatedly) concatenating to the same object.
If you're just concatenating 2 strings and doing something with the result immediately as a string, there's no point to using StringBuilder.
I just stumbled on Jon Skeet's nice write up of this:
https://jonskeet.uk/csharp/stringbuilder.html
If you are using StringBuilder, then to get the resulting string, it's just a matter of calling ToString() (unsurprisingly).

I would just like to throw out that is may not necessarily faster, it will definitely have a better memory footprint. This is because string are immutable in .NET and every time you change a string you have created a new one.

About it being faster/better memory:
I looked into this issue with Java, I assume .NET would be as smart about it.
The implementation for String is pretty impressive.
The String object tracks "length" and "shared" (independent of the length of the array that holds the string)
So something like
String a = "abc" + "def" + "ghi";
can be implemented (by the compiler/runtime) as:
- Extend the array holding "abc" by 6 additional spaces.
- Copy def in right after abc
- copy ghi in after def.
- give a pointer to the "abc" string to a
- leave abc's length at 3, set a's length to 9
- set the shared flag in both.
Since most strings are short-lived, this makes for some VERY efficient code in many cases.
The case where it's absolutely NOT efficient is when you are adding to a string within a loop, or when your code is like this:
a = "abc";
a = a + "def";
a += "ghi";
In this case, you are much better off using a StringBuilder construct.
My point is that you should be careful whenever you optimize, unless you are ABSOLUTELY sure that you know what you are doing, AND you are absolutely sure it's necessary, AND you test to ensure the optimized code makes a use case pass, just code it in the most readable way possible and don't try to out-think the compiler.
I wasted 3 days messing with strings, caching/reusing string-builders and testing speed before I looked at the string source code and figured out that the compiler was already doing it better than I possibly could for my use case. Then I had to explain how I didn't REALLY know what I was doing, I only thought I did...

It's not faster to concat - As smaclell pointed out, the issue is the immutable string forcing an extra allocation and recopying of existing data.
"a"+"b"+"c" is no faster to do with string builder, but repeated concats with an intermediate string gets faster and faster as the # of concat's gets larger like:
x = "a"; x+="b"; x+="c"; ...

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.