Hey, I have an array of strings and I want to replace a certain substring in each of those elements. Is there an easy way to do that besides iterating the array explicitly?
Thanks :-)
Ultimately, anything you do is going to do exactly that anyway. A simple for loop should be fine. There are pretty solutions involving lambdas, such as Array.ConvertAll / Enumerable.Select, but tbh it isn't necessary:
for(int i = 0 ; i < arr.Length ; i++) arr[i] = arr[i].Replace("foo","bar");
(The for loop has the most efficient handling for arrays, and foreach isn't an option here because its iteration variable is read-only, so you can't assign the replaced string back to it.)
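For comparison, the Array.ConvertAll route mentioned above might look roughly like this. It's only a sketch: the ReplaceDemo class and ReplaceAll method are names made up for illustration, and unlike the in-place for loop it allocates a new array.

```csharp
using System;

public static class ReplaceDemo
{
    // Hypothetical helper: returns a NEW array in which every element has had
    // oldValue replaced by newValue. The source array is left untouched.
    public static string[] ReplaceAll(string[] source, string oldValue, string newValue)
    {
        return Array.ConvertAll(source, s => s.Replace(oldValue, newValue));
    }
}
```

Usage would be something like arr = ReplaceDemo.ReplaceAll(arr, "foo", "bar");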
You could iterate the array implicitly
arrayOfStrings = arrayOfStrings.Select(s => s.Replace("abc", "xyz")).ToArray();
Related
In the C++ Standard Template Library (STL), it is possible for example to create a vector consisting of multiple copies of the same element, using this constructor:
std::vector<double> v(10, 2.0);
This would create a vector of 10 doubles, initially set to 2.0.
I want to do a similar thing in C#, more specifically creating an array of n doubles with all elements initialized to the same value x.
I have come up with the following one-liner, relying on generic collections and LINQ:
double[] v = new double[n].Select(item => x).ToArray();
However, if an outsider would read this code I don't think it would be immediately apparent what the code actually does. I am also concerned about the performance, I suppose it would be faster to initialize the array elements via a for loop (although I haven't checked). Does anybody know of a cleaner and/or more efficient way to perform this task?
What about this?
double[] v = Enumerable.Repeat(x, n).ToArray();
EDIT: I just did a small benchmark; to create 1000 arrays of 100000 elements each, using a loop is about 3 times faster than Enumerable.Repeat.
Repeat: 00:00:18.6875488
Loop:   00:00:06.1628806
So if performance is critical, you should prefer the loop.
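The benchmark code itself isn't shown, so here is a rough sketch of how a comparison like that could be set up with Stopwatch. The FillBenchmark class and its method names are assumptions for illustration, not the code that produced the numbers above, and timings will differ by machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class FillBenchmark
{
    // LINQ version: builds the array from a repeated sequence.
    public static double[] FillWithRepeat(double x, int n)
    {
        return Enumerable.Repeat(x, n).ToArray();
    }

    // Loop version: allocates the array once and assigns each element.
    public static double[] FillWithLoop(double x, int n)
    {
        var arr = new double[n];
        for (int i = 0; i < arr.Length; i++) arr[i] = x;
        return arr;
    }

    // Runs the given fill method `iterations` times and returns the elapsed time.
    public static TimeSpan Time(Func<double, int, double[]> fill, int iterations, int n)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) fill(2.0, n);
        sw.Stop();
        return sw.Elapsed;
    }
}
```

You would call it as FillBenchmark.Time(FillBenchmark.FillWithRepeat, 1000, 100000) and again with FillWithLoop, ideally in a release build and after a warm-up run.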
var arr = Enumerable.Repeat(x, n).ToArray();
Personally, I'd just use a regular array loop, though:
var arr = new double[n];
for(int i = 0 ; i < arr.Length ; i++) arr[i] = x;
More characters, but the array is demonstrably the right size from the outset - no iterative growth List<T>-style and final copy back. Also; simply more direct - and the JIT can do a lot to optimise the for(int i = 0 ; i < arr.Length ; i++) pattern (for arrays).
double[] theSameValues = Enumerable.Repeat(2.0, 10).ToArray();
Later versions of .NET have introduced an Array.Fill method. See usage:
double[] v = new double[n];
Array.Fill(v, 2.0);
A foreach loop (or, better, a classic for loop) is generally faster than LINQ for this kind of work.
You should use the LINQ expression only if it makes the code more readable.
In VB.NET
Imports System.Linq
Dim n As Integer = 10
Dim colorArray = New Color(n - 1) {}.[Select](Function(item) Color.White).ToArray()
What is the easiest way to build an array of integers starting at 0 and increasing until a given point?
Background:
I have a struct that holds an int[] representing indexes of other arrays.
I would like to signify I want to use all indexes by filling this array with ints starting at 0 and increasing until int numTotalIndexes; I am sure there is a better way to do this than using a for loop.
Someone here showed me this little Linq trick
int[] numContacts = new int[]{ 32, 48, 24, 12};
String[][] descriptions = numContacts.Select(c => new string[c]).ToArray();
to build a jagged 2D array without loops (well it does, but it hides them and makes my code pretty) and I think there might be a nice little trick to accomplish what I want above.
You can use Enumerable.Range:
int[] intArray = Enumerable.Range(0, numTotalIndexes).ToArray();
You:
I am sure there is a better way to do this than using a for loop
Note that LINQ also uses loops, you simply don't see them. It's also not the most efficient way since ToArray doesn't know how large the array must be. However, it is a readable and short way.
So here is the (possibly premature-)optimized, classic way to initialize the array:
int[] intArray = new int[numTotalIndexes];
for (int i = 0; i < numTotalIndexes; i++)
    intArray[i] = i;
I'm not sure I understand your question at all, but going by your first line, what you want is
var MySequencialArray = Enumerable.Range(0, howmanyyouwant).ToArray();
I'm trying to parse a large text string. I need to split the original string into blocks of 15 characters (the next block might contain white space, so the Trim function is used). I'm using two strings, the original and a temporary one. The temp string is used to store each 15-character block.
I wonder if I could fall into a performance issue because strings are immutable. This is the code:
string original = "THIS IS SUPPOSE TO BE A LONG STRING AN I NEED TO SPLIT IT IN BLOCKS OF 15 CHARACTERS.SO";
string temp = string.Empty;
while (original.Length != 0)
{
temp = original.Substring(0, 14).Trim();
original = original.Substring(14, (original.Length -14)).Trim();
}
I appreciate your feedback in order to find a best way to achieve this functionality.
You'll get slightly better performance like this (but whether the performance gain will be significant is another matter entirely):
for (var startIndex = 0; startIndex < original.Length; startIndex += 15)
{
temp = original.Substring(startIndex, Math.Min(original.Length - startIndex, 15)).Trim();
}
This performs better because you're not copying the last all-but-15-characters of the original string with each loop iteration.
EDIT
To advance the index to the next non-whitespace character, you can do something like this:
for (var startIndex = 0; startIndex < original.Length; )
{
if (char.IsWhiteSpace(original, startIndex))
{
startIndex++;
continue;
}
temp = original.Substring(startIndex, Math.Min(original.Length - startIndex, 15)).Trim();
startIndex += 15;
}
I think you are right about the immutable issue - recreating 'original' each time is probably not the fastest way.
How about passing 'original' into a StringReader class?
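A minimal sketch of the StringReader idea, assuming fixed-size blocks that are trimmed after reading. BlockSplitter and SplitIntoBlocks are hypothetical names, and note that this does not skip leading whitespace before cutting the next block the way the code in the question does, so a block made up only of spaces comes back as an empty string.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class BlockSplitter
{
    // Hypothetical helper: reads the text in blocks of blockSize characters
    // and trims each block. The original string is never re-created.
    public static List<string> SplitIntoBlocks(string text, int blockSize)
    {
        var blocks = new List<string>();
        var buffer = new char[blockSize];
        using (var reader = new StringReader(text))
        {
            int read;
            // Read returns the number of characters copied into the buffer,
            // and 0 once the end of the string has been reached.
            while ((read = reader.Read(buffer, 0, blockSize)) > 0)
            {
                blocks.Add(new string(buffer, 0, read).Trim());
            }
        }
        return blocks;
    }
}
```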
If your original string is longer than a few thousand characters, you'll have noticeable (>0.1s) processing time and a lot of GC pressure. The first Substring call is fine, and I don't think you can avoid it unless you go deep inside System.String and mess around with m_FirstChar. The second Substring can be avoided completely by going through the string char by char with an int index.
In general, code like this can become a problem if you run it on bigger data; it depends, of course, on your needs.
In general, it might be a good idea to use the StringBuilder class, which lets you operate on strings in a "more mutable" way without the performance hit, for example removing characters from its beginning without reallocating the whole string.
In your example, however, I would consider throwing out the line that takes the substring from original and replacing it with code that updates an index pointing to where the next substring starts. The while condition would then just check whether the index is at the end of the string, and temp would take its substring not from 0 to 14 but starting at i, where i is that index.
However, don't optimize code if you don't have to. I'm assuming here that you need more performance and are willing to sacrifice some time and/or write slightly less understandable code for more efficiency.
I'm aiming for speed; this must be ultra fast.
string s = something;
for (int j = 0; j < s.Length; j++)
{
    if (s[j] == 'ь')
        if (s.Length > (j + 1))
            if (s[j + 1] != 'о')
                s[j] = 'ъ';
}
It gives me the error "Property or indexer 'string.this[int]' cannot be assigned to -- it is read only".
How do I do it the fastest way?
Fast way? Use a StringBuilder.
Fastest way? Always pass around a char* and a length instead of a string so you can modify the buffer in-place, but make sure you don't ever modify any string object.
There are at least two options:
Use a StringBuilder and keep track of the previous character.
You could just use a regular expression "ь(?!о)" or a simple string replacement of "ьо" depending on what your needs are (your question seems self-contradictory).
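As a rough sketch, the regular expression option could look like this (SoftSignRegex and ReplaceSoftSigns are illustrative names only). One thing to watch: the negative lookahead also matches a 'ь' at the very end of the string, whereas the loop in the question leaves a trailing 'ь' alone.

```csharp
using System.Text.RegularExpressions;

public static class SoftSignRegex
{
    // Replaces every 'ь' that is not immediately followed by 'о',
    // including a 'ь' that is the last character of the string.
    public static string ReplaceSoftSigns(string s)
    {
        return Regex.Replace(s, "ь(?!о)", "ъ");
    }
}
```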
I tested the performance of a StringBuilder approach versus regular expressions and there is very little difference - at most a factor of 2:
Method               Iterations per second
StringBuilder        153480.094
Regex (uncompiled)   90021.978
Regex (compiled)     136355.787
string.Replace       1427605.174
If performance is critical for you I would strongly recommend making some performance measurements before jumping to conclusions about what the fastest approach is.
Strings in .NET are read-only. You could use a StringBuilder.
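Something along these lines, as a sketch that mirrors the loop from the question (SoftSignBuilder and ReplaceSoftSigns are made-up names). StringBuilder's indexer is writable, so the assignment that fails on a string works here; it looks at the next character rather than tracking the previous one.

```csharp
using System.Text;

public static class SoftSignBuilder
{
    // Same rule as the loop in the question: 'ь' becomes 'ъ' only when a next
    // character exists and that character is not 'о'.
    public static string ReplaceSoftSigns(string s)
    {
        var sb = new StringBuilder(s);
        for (int j = 0; j + 1 < sb.Length; j++)
        {
            if (sb[j] == 'ь' && sb[j + 1] != 'о')
                sb[j] = 'ъ';
        }
        return sb.ToString();
    }
}
```

Whether this actually beats string.Replace or a regex for your data is worth measuring rather than assuming.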