Formatting alphanumeric string - c#

I have a string with 16 alphanumeric characters, e.g. F4194E7CC775F003. I'd like to format it as F419-4E7C-C775-F003.
I tried using
string.Format("{0:####-####-####-####}","F4194E7CC775F003");
but this doesn't work since it's not a numeric value.
So I came up with the following:
public class DashFormatter : IFormatProvider, ICustomFormatter
{
public object GetFormat(Type formatType)
{
return this;
}
public string Format(string format, object arg, IFormatProvider formatProvider)
{
char[] chars = arg.ToString().ToCharArray();
StringBuilder sb = new StringBuilder();
for (int i = 0; i < chars.Length; i++)
{
if (i > 0 && i % 4 == 0)
{
sb.Append('-');
}
sb.Append(chars[i]);
}
return sb.ToString();
}
}
and by using
string.Format(new DashFormatter(), "{0}", "F4194E7CC775F003");
I was able to solve the problem, however I was hoping there is a better/simpler way to do it? Perhaps some LINQ magic?
Thanks.

You can do it in one line without Linq:
StringBuilder splitMe = new StringBuilder("F4194E7CC775F003");
string joined = splitMe.Insert(12, "-").Insert(8, "-").Insert(4, "-").ToString();

You could do it with a regular expression, though I don't know what the performance of this would be compared to the other methods.
string formattedString = Regex.Replace(yourString, "(\\S{4})\\B", "$1-");
You could put this in an extension method for string too, if you want to do:
yourString.ToDashedFormat();
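A minimal sketch of what such an extension method might look like. Only the name ToDashedFormat comes from the answer; the containing class name is made up for illustration, and a null input will simply throw from Regex.Replace.

```csharp
using System.Text.RegularExpressions;

// Hypothetical container class; only the method name appears in the answer above.
public static class DashedFormatExtensions
{
    public static string ToDashedFormat(this string value)
    {
        // Append a dash after every run of four non-whitespace characters,
        // except where the run ends at a word boundary (such as the end of the string).
        return Regex.Replace(value, @"(\S{4})\B", "$1-");
    }
}
```

With this in scope, "F4194E7CC775F003".ToDashedFormat() should give F419-4E7C-C775-F003.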

If you want it linq:
var formatted = string.Join("-", Enumerable.Range(0,4).Select(i=>s.Substring(i*4,4)).ToArray());
And if you want it efficient:
var sb = new StringBuilder(19);
sb.Append(s,0,4);
for(var i = 1; i < 4; i++ )
{
sb.Append('-');
sb.Append(s,i*4, 4);
}
return sb.ToString();
I did not benchmark this one, but I think it would be faster than StringBuilder.Insert because it does not move the rest of the string many times; it just writes 4 chars at a time.
Also, it would not reallocate the underlying buffer, because it's preallocated to 19 chars at the beginning.
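If the length or group size is not fixed, the same append-only idea can be generalized. This is only a sketch: InsertSeparators and its parameters are names I made up, not something from the answers here.

```csharp
using System;
using System.Text;

public static class GroupingSketch
{
    // Hypothetical helper: puts `separator` between groups of `groupSize` characters.
    public static string InsertSeparators(string s, int groupSize, char separator)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        if (groupSize <= 0) throw new ArgumentOutOfRangeException(nameof(groupSize));
        if (s.Length <= groupSize) return s;

        // One separator between each pair of adjacent groups,
        // so the final length is known up front and the buffer never grows.
        int separators = (s.Length - 1) / groupSize;
        var sb = new StringBuilder(s.Length + separators);

        for (int i = 0; i < s.Length; i += groupSize)
        {
            if (i > 0) sb.Append(separator);
            // The last group may be shorter than groupSize.
            sb.Append(s, i, Math.Min(groupSize, s.Length - i));
        }
        return sb.ToString();
    }
}
```

For the 16-character input and a group size of 4 this preallocates the same 19 chars as the loop above.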

Based on Carra's answer I made this little utility method:
private static string ToDelimitedString(string input, int position, string delimiter)
{
StringBuilder sb = new StringBuilder(input);
int x = input.Length / position;
while (--x > 0)
{
sb = sb.Insert(x * position, delimiter);
}
return sb.ToString();
}
You can use it like this:
string result = ToDelimitedString("F4194E7CC775F003", 4, "-");
And a test case:
[Test]
public void ReturnsDelimitedString()
{
string input = "F4194E7CC775F003";
string actual = ToDelimitedString(input, 4, "-");
Assert.AreEqual("F419-4E7C-C775-F003", actual);
}

char[] chars = "F4194E7CC775F003".ToCharArray();
var str = string.Format("{0}-{1}-{2}-{3}"
, new string(chars.Take(4).ToArray())
, new string(chars.Skip(4).Take(4).ToArray())
, new string(chars.Skip(8).Take(4).ToArray())
, new string(chars.Skip(12).Take(4).ToArray())
);

Simplest solution I can think of is
var text = "F4194E7CC775F003";
var formattedText = string.Format(
"{0}-{1}-{2}-{3}",
text.Substring(0, 4),
text.Substring(4, 4),
text.Substring(8, 4),
text.Substring(12, 4));

Only 9 years later, a minor variation from Carra's answer. This yields about a 2.5x speed improvement based on my tests (change all "-" to '-'):
StringBuilder initial = new StringBuilder("F4194E7CC775F003");
return initial.Insert(12, '-').Insert(8, '-').Insert(4, '-').ToString();


C# Concatenate strings or array of chars

I'm facing a problem while developing an application.
Basically,
I have a fixed string, let's say "IHaveADream"
I now want the user to enter another string (of a fixed length, for my purposes) and then interleave the characters of the fixed string with the characters of the string entered by the user.
e.g.
The user inserts "ByeBye"
then the output would be:
"IBHyaevBeyAeDream".
How to accomplish this?
I have tried with String.Concat and String.Join, inside a for statement, with no luck.
One memory-efficient option is to use a string builder, since both the original string and the user input could potentially be rather large. As mentioned by Kris, you can initialize your StringBuilder capacity to the combined length of both strings.
void Main()
{
var start = "IHaveADream";
var input = "ByeBye";
var sb = new StringBuilder(start.Length + input.Length);
for (int i = 0; i < start.Length; i++)
{
sb.Append(start[i]);
if (input.Length >= i + 1)
sb.Append(input[i]);
}
sb.ToString().Dump();
}
This only safely accounts for the input string being shorter or equal in length to the starting string. If you had a longer input string, you'd want to take the longer length as the end point for your for loop iteration and check that each array index is not out of bounds.
void Main()
{
var start = "IHaveADream";
var input = "ByeByeByeByeBye";
var sb = new StringBuilder(start.Length + input.Length);
var length = start.Length >= input.Length ? start.Length : input.Length;
for (int i = 0; i < length; i++)
{
if (start.Length >= i + 1)
sb.Append(start[i]);
if (input.Length >= i + 1)
sb.Append(input[i]);
}
sb.ToString().Dump();
}
You can create an array of characters and then re-combine them in the order you want.
char[] chars1 = "IHaveADream".ToCharArray();
char[] chars2 = "ByeBye".ToCharArray();
// you can create a custom algorithm to handle
// different size strings.
char[] c = new char[17];
c[0] = chars1[0];
c[1] = chars2[0];
...
c[16] = chars1[10]; // last of the 17 characters
string s = new string(c);
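The "custom algorithm to handle different size strings" mentioned in the comment could look roughly like this. It is a sketch under my own naming (Interleave is not from the answer) and it assumes neither argument is null.

```csharp
using System;

public static class InterleaveSketch
{
    // Hypothetical helper: alternates characters from `first` and `second`,
    // then appends whatever is left of the longer string.
    public static string Interleave(string first, string second)
    {
        var result = new char[first.Length + second.Length];
        int pos = 0;
        int longest = Math.Max(first.Length, second.Length);

        for (int i = 0; i < longest; i++)
        {
            if (i < first.Length) result[pos++] = first[i];
            if (i < second.Length) result[pos++] = second[i];
        }
        return new string(result);
    }
}
```

Interleave("IHaveADream", "ByeBye") should produce the IBHyaevBeyAeDream from the question.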
var firstString = "Ihaveadream";
var secondString = "ByeBye";
var stringBuilder = new StringBuilder();
for (int i = 0; i < firstString.Length; i++) {
stringBuilder.Append(firstString[i]);
if (i < secondString.Length) {
stringBuilder.Append(secondString[i]);
}
}
var result = stringBuilder.ToString();
If you don't care much about memory usage or performance you can just use:
public static string ConcatStrings(string value, string value2)
{
string result = "";
int i = 0;
for (i = 0; i < Math.Max(value.Length, value2.Length) ; i++)
{
if (i < value.Length) result += value[i].ToString();
if (i < value2.Length) result += value2[i].ToString();
}
return result;
}
Usage:
string conststr = "IHaveADream";
string input = "ByeBye";
var result = ConcatStrings(conststr, input);
Console.WriteLine(result);
Output: IBHyaevBeyAeDream
P.S.
I just checked the performance of both methods (StringBuilder and simple concatenation), and both appear to take about the same time for a single operation. The main reason is that a StringBuilder takes considerable time to initialize, whereas concatenation has no such setup cost.
But if you have to process something like 1500 strings, it's a different story, and StringBuilder becomes the better option.
For 100 000 method executions it showed 85 ms (StringBuilder) vs 22 ms (concatenation).

Better way to clean a string?

I am using this method to clean a string:
public static string CleanString(string dirtyString)
{
string removeChars = " ?&^$#@!()+-,:;<>’\'-_*";
string result = dirtyString;
foreach (char c in removeChars)
{
result = result.Replace(c.ToString(), string.Empty);
}
return result;
}
This method gives the correct result. However, there is a performance glitch in this method. Every time I pass the string, every character goes into the loop. If I have a large string then it will take too much time to return the object.
Is there a better way of doing the same thing? Maybe using LINQ or jQuery/JavaScript?
Any suggestions would be appreciated.
OK, consider the following test:
public class CleanString
{
//by MSDN http://msdn.microsoft.com/en-us/library/844skk0h(v=vs.71).aspx
public static string UseRegex(string strIn)
{
// Replace invalid characters with empty strings.
return Regex.Replace(strIn, @"[^\w\.@-]", "");
}
// by Paolo Tedesco
public static String UseStringBuilder(string strIn)
{
const string removeChars = " ?&^$#@!()+-,:;<>’\'-_*";
// specify capacity of StringBuilder to avoid resizing
StringBuilder sb = new StringBuilder(strIn.Length);
foreach (char x in strIn.Where(c => !removeChars.Contains(c)))
{
sb.Append(x);
}
return sb.ToString();
}
// by Paolo Tedesco, but using a HashSet
public static String UseStringBuilderWithHashSet(string strIn)
{
var hashSet = new HashSet<char>(" ?&^$#@!()+-,:;<>’\'-_*");
// specify capacity of StringBuilder to avoid resizing
StringBuilder sb = new StringBuilder(strIn.Length);
foreach (char x in strIn.Where(c => !hashSet.Contains(c)))
{
sb.Append(x);
}
return sb.ToString();
}
// by SteveDog
public static string UseStringBuilderWithHashSet2(string dirtyString)
{
HashSet<char> removeChars = new HashSet<char>(" ?&^$#@!()+-,:;<>’\'-_*");
StringBuilder result = new StringBuilder(dirtyString.Length);
foreach (char c in dirtyString)
if (!removeChars.Contains(c)) // keep only characters that are not in the removal set
result.Append(c);
return result.ToString();
}
// original by patel.milanb
public static string UseReplace(string dirtyString)
{
string removeChars = " ?&^$#@!()+-,:;<>’\'-_*";
string result = dirtyString;
foreach (char c in removeChars)
{
result = result.Replace(c.ToString(), string.Empty);
}
return result;
}
// by L.B
public static string UseWhere(string dirtyString)
{
return new String(dirtyString.Where(Char.IsLetterOrDigit).ToArray());
}
}
static class Program
{
/// <summary>
/// The main entry point for the application.
/// </summary>
[STAThread]
static void Main()
{
var dirtyString = "sdfdf.dsf8908()=(=(sadfJJLef#ssyd€sdöf////fj()=/§(§&/(\"&sdfdf.dsf8908()=(=(sadfJJLef#ssyd€sdöf////fj()=/§(§&/(\"&sdfdf.dsf8908()=(=(sadfJJLef#ssyd€sdöf";
var sw = new Stopwatch();
var iterations = 50000;
sw.Start();
for (var i = 0; i < iterations; i++)
CleanString.<SomeMethod>(dirtyString);
sw.Stop();
Debug.WriteLine("CleanString.<SomeMethod>: " + sw.ElapsedMilliseconds.ToString());
sw.Reset();
....
<repeat>
....
}
}
Output
CleanString.UseReplace: 791
CleanString.UseStringBuilder: 2805
CleanString.UseStringBuilderWithHashSet: 521
CleanString.UseStringBuilderWithHashSet2: 331
CleanString.UseRegex: 1700
CleanString.UseWhere: 233
Conclusion
It probably does not matter which method you use.
The difference in time between the fastest (UseWhere: 233ms) and the slowest (UseStringBuilder: 2805ms) method is 2572ms when called 50000 (!) times in a row. If you don't run the method that often, the difference does not really matter.
But if performance is critical, use the UseWhere method (written by L.B). Note, however, that its behavior is slightly different.
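To make the "slightly different" behavior concrete, here is a small side-by-side sketch. The class and method names are mine, and the removal list is shortened for illustration rather than copied from the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class CleanComparison
{
    // Blacklist approach: drops only the listed characters (shortened, illustrative list).
    public static string RemoveListed(string input)
    {
        var remove = new HashSet<char>(" ?&!()");
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (!remove.Contains(c)) sb.Append(c);
        }
        return sb.ToString();
    }

    // Whitelist approach (as in UseWhere): keeps only letters and digits.
    public static string KeepLettersAndDigits(string input)
    {
        return new string(input.Where(c => char.IsLetterOrDigit(c)).ToArray());
    }
}
```

For the input "a.b=c d!" the blacklist version keeps the dot and the equals sign ("a.b=cd"), while the whitelist version strips every non-alphanumeric character ("abcd").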
If it's purely speed and efficiency you are after, I would recommend doing something like this:
public static string CleanString(string dirtyString)
{
HashSet<char> removeChars = new HashSet<char>(" ?&^$#@!()+-,:;<>’\'-_*");
StringBuilder result = new StringBuilder(dirtyString.Length);
foreach (char c in dirtyString)
if (!removeChars.Contains(c)) // prevent dirty chars
result.Append(c);
return result.ToString();
}
RegEx is certainly an elegant solution, but it adds extra overhead. By specifying the starting length of the string builder, it will only need to allocate the memory once (and a second time for the ToString at the end). This will cut down on memory usage and increase the speed, especially on longer strings.
However, as L.B. said, if you are using this to properly encode text that is bound for HTML output, you should be using HttpUtility.HtmlEncode instead of doing it yourself.
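A quick sketch of the encoding route. Note that I am using System.Net.WebUtility.HtmlEncode here instead of HttpUtility.HtmlEncode, purely because it does not need a System.Web reference; the wrapper class and method names are made up.

```csharp
using System.Net;

public static class HtmlEncodingSketch
{
    // Hypothetical wrapper: escapes markup-significant characters instead of deleting them.
    public static string Encode(string text)
    {
        return WebUtility.HtmlEncode(text);
    }
}
```

Unlike the cleaning methods above, nothing is lost: angle brackets and ampersands come back as character entities, and WebUtility.HtmlDecode restores the original text.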
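Use the regex [ ?&^$#@!()+\-,:;<>’'_*] and replace each match with an empty string. The hyphen is escaped so that it is read as a literal character rather than as part of a range such as '-_, which would also match digits and uppercase letters.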
I don't know if, performance-wise, using a Regex or LINQ would be an improvement.
Something that could be useful, would be to create the new string with a StringBuilder instead of using string.Replace each time:
using System.Linq;
using System.Text;
static class Program {
static void Main(string[] args) {
const string removeChars = " ?&^$#@!()+-,:;<>’\'-_*";
string result = "x&y(z)";
// specify capacity of StringBuilder to avoid resizing
StringBuilder sb = new StringBuilder(result.Length);
foreach (char x in result.Where(c => !removeChars.Contains(c))) {
sb.Append(x);
}
result = sb.ToString();
}
}
This one is even faster!
use:
string dirty = @"tfgtf$#$%gttg%$% 664%$";
string clean = dirty.Clean();
public static string Clean(this String name)
{
var namearray = new Char[name.Length];
var newIndex = 0;
for (var index = 0; index < namearray.Length; index++)
{
var letter = (Int32)name[index];
if (!((letter > 96 && letter < 123) || (letter > 64 && letter < 91) || (letter > 47 && letter < 58)))
continue;
namearray[newIndex] = (Char)letter;
++newIndex;
}
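// Only the first newIndex entries were filled; the rest are '\0', which TrimEnd() does not remove.
return new String(namearray, 0, newIndex);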
}
Give this a try: http://msdn.microsoft.com/en-us/library/xwewhkd1.aspx
Perhaps it helps to first explain the 'why' and then the 'what'. The reason you're getting slow performance is that C# copies the whole string for each replacement. In my experience, using Regex in .NET isn't always better, although in most scenarios (I think including this one) it'll probably work just fine.
If I really need performance I usually don't leave it up to luck and just tell the compiler exactly what I want, that is: create a buffer with the upper-bound number of characters and copy in all the chars that you need. It's also possible to replace the HashSet with a switch/case or an array, in which case you might end up with a jump table or array lookup, which is even faster.
A pragmatic yet fast solution is:
char[] data = new char[dirtyString.Length];
int ptr = 0;
HashSet<char> hs = new HashSet<char>() { /* all your excluded chars go here */ };
foreach (char c in dirtyString)
if (!hs.Contains(c))
data[ptr++] = c;
return new string(data, 0, ptr);
BTW: this solution is incorrect when you want to process high surrogate Unicode characters - but can easily be adapted to include these characters.
-Stefan.
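The array-lookup variant mentioned above might look something like this. Treat it as a sketch: the class name, the table size (one entry per possible char) and the shortened removal list are my assumptions, not part of the answer.

```csharp
public static class LookupTableClean
{
    // One flag per possible char value; true means "remove this character".
    // The list passed in here is shortened and purely illustrative.
    private static readonly bool[] Remove = BuildTable(" ?&!()+,:;<>*");

    private static bool[] BuildTable(string chars)
    {
        var table = new bool[char.MaxValue + 1];
        foreach (char c in chars) table[c] = true;
        return table;
    }

    public static string Clean(string dirtyString)
    {
        var data = new char[dirtyString.Length]; // upper bound on the result length
        int ptr = 0;
        foreach (char c in dirtyString)
        {
            // A plain array index replaces the HashSet lookup.
            if (!Remove[c]) data[ptr++] = c;
        }
        return new string(data, 0, ptr);
    }
}
```

The table costs 64 K flags once per process, which is the price for avoiding hashing on every character.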
I use this in my current project and it works fine. It takes a sentence, it removes all the non alphanumerical characters, it then returns the sentence with all the words in the first letter upper case and everything else in lower case. Maybe I should call it SentenceNormalizer. Naming is hard :)
internal static string StringSanitizer(string whateverString)
{
whateverString = whateverString.Trim().ToLower();
Regex cleaner = new Regex("(?:[^a-zA-Z0-9 ])", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled);
var listOfWords = (cleaner.Replace(whateverString, string.Empty).Split(' ', StringSplitOptions.RemoveEmptyEntries)).ToList();
string cleanString = string.Empty;
foreach (string word in listOfWords)
{
cleanString += $"{word.First().ToString().ToUpper() + word.Substring(1)} ";
}
return cleanString;
}
I am not able to spend time on acid testing this but this line did not actually clean slashes as desired.
HashSet<char> removeChars = new HashSet<char>(" ?&^$#@!()+-,:;<>’\'-_*");
I had to add slashes individually and escape the backslash
HashSet<char> removeChars = new HashSet<char>(" ?&^$#@!()+-,:;<>’'-_*");
removeChars.Add('/');
removeChars.Add('\\');

Replace multiple characters in a C# string

Is there a better way to replace strings?
I am surprised that Replace does not take in a character array or string array. I guess that I could write my own extension but I was curious if there is a better built in way to do the following? Notice the last Replace is a string not a character.
myString.Replace(';', '\n').Replace(',', '\n').Replace('\r', '\n').Replace('\t', '\n').Replace(' ', '\n').Replace("\n\n", "\n");
You can use a replace regular expression.
s/[;,\t\r ]|[\n]{2}/\n/g
s/ at the beginning means a search
The characters between [ and ] are the characters to search for (in any order)
The second / delimits the search-for text and the replace text
In English, this reads:
"Search for ; or , or \t or \r or (space) or exactly two sequential \n and replace it with \n"
In C#, you could do the following: (after importing System.Text.RegularExpressions)
Regex pattern = new Regex("[;,\t\r ]|[\n]{2}");
myString = pattern.Replace(myString, "\n"); // strings are immutable, so keep the returned value
If you are feeling particularly clever and don't want to use Regex:
char[] separators = new char[]{' ',';',',','\r','\t','\n'};
string s = "this;is,\ra\t\n\n\ntest";
string[] temp = s.Split(separators, StringSplitOptions.RemoveEmptyEntries);
s = String.Join("\n", temp);
You could wrap this in an extension method with little effort as well.
Edit: Or just wait 2 minutes and I'll end up writing it anyway :)
public static class ExtensionMethods
{
public static string Replace(this string s, char[] separators, string newVal)
{
string[] temp;
temp = s.Split(separators, StringSplitOptions.RemoveEmptyEntries);
return String.Join( newVal, temp );
}
}
And voila...
char[] separators = new char[]{' ',';',',','\r','\t','\n'};
string s = "this;is,\ra\t\n\n\ntest";
s = s.Replace(separators, "\n");
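One thing worth checking before swapping the Split/Join version in for the original chain: the two do not behave identically at the edges. A small sketch to compare them (class and method names are mine):

```csharp
using System;

public static class SplitJoinCaveat
{
    private static readonly char[] Separators = { ' ', ';', ',', '\r', '\t', '\n' };

    // Split/Join approach from the answer above.
    public static string ViaSplitJoin(string s)
    {
        return string.Join("\n", s.Split(Separators, StringSplitOptions.RemoveEmptyEntries));
    }

    // Replace chain from the question.
    public static string ViaReplaceChain(string s)
    {
        return s.Replace(';', '\n').Replace(',', '\n').Replace('\r', '\n')
                .Replace('\t', '\n').Replace(' ', '\n').Replace("\n\n", "\n");
    }
}
```

For ";a,,b;" the Split/Join version returns "a\nb" (leading and trailing separators vanish), while the chain returns "\na\nb\n". The chain also leaves a doubled newline behind when three or more separators are adjacent, because Replace("\n\n", "\n") makes a single pass.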
You could use Linq's Aggregate function:
string s = "the\nquick\tbrown\rdog,jumped;over the lazy fox.";
char[] chars = new char[] { ' ', ';', ',', '\r', '\t', '\n' };
string snew = chars.Aggregate(s, (c1, c2) => c1.Replace(c2, '\n'));
Here's the extension method:
public static string ReplaceAll(this string seed, char[] chars, char replacementCharacter)
{
return chars.Aggregate(seed, (str, cItem) => str.Replace(cItem, replacementCharacter));
}
Extension method usage example:
string snew = s.ReplaceAll(chars, '\n');
This is the shortest way:
myString = Regex.Replace(myString, @"[;,\t\r ]|[\n]{2}", "\n");
Strings are just immutable char arrays.
You just need to make it mutable:
either by using StringBuilder,
or by going into the unsafe world and playing with pointers (dangerous though),
and try to iterate through the array of characters as few times as possible. Note the HashSet here, as it avoids traversing the character sequence inside the loop. Should you need an even faster lookup, you can replace the HashSet with an optimized lookup for char (based on an array[256]).
Example with StringBuilder
public static void MultiReplace(this StringBuilder builder,
char[] toReplace,
char replacement)
{
HashSet<char> set = new HashSet<char>(toReplace);
for (int i = 0; i < builder.Length; ++i)
{
var currentCharacter = builder[i];
if (set.Contains(currentCharacter))
{
builder[i] = replacement;
}
}
}
Edit - Optimized version (only valid for ASCII)
public static void MultiReplace(this StringBuilder builder,
char[] toReplace,
char replacement)
{
var set = new bool[256];
foreach (var charToReplace in toReplace)
{
set[charToReplace] = true;
}
for (int i = 0; i < builder.Length; ++i)
{
var currentCharacter = builder[i];
if (set[currentCharacter])
{
builder[i] = replacement;
}
}
}
Then you just use it like this:
var builder = new StringBuilder("my bad,url&slugs");
builder.MultiReplace(new []{' ', '&', ','}, '-');
var result = builder.ToString();
Ohhh, the performance horror!
The answer is a bit outdated, but still...
public static class StringUtils
{
#region Private members
[ThreadStatic]
private static StringBuilder m_ReplaceSB;
private static StringBuilder GetReplaceSB(int capacity)
{
var result = m_ReplaceSB;
if (null == result)
{
result = new StringBuilder(capacity);
m_ReplaceSB = result;
}
else
{
result.Clear();
result.EnsureCapacity(capacity);
}
return result;
}
#endregion
public static string ReplaceAny(this string s, char replaceWith, params char[] chars)
{
if (null == chars)
return s;
if (null == s)
return null;
StringBuilder sb = null;
for (int i = 0, count = s.Length; i < count; i++)
{
var temp = s[i];
var replace = false;
for (int j = 0, cc = chars.Length; j < cc; j++)
if (temp == chars[j])
{
if (null == sb)
{
sb = GetReplaceSB(count);
if (i > 0)
sb.Append(s, 0, i);
}
replace = true;
break;
}
if (replace)
sb.Append(replaceWith);
else
if (null != sb)
sb.Append(temp);
}
return null == sb ? s : sb.ToString();
}
}
You may also simply write these string extension methods, and put them somewhere in your solution:
using System.Text;
public static class StringExtensions
{
public static string ReplaceAll(this string original, string toBeReplaced, string newValue)
{
if (string.IsNullOrEmpty(original) || string.IsNullOrEmpty(toBeReplaced)) return original;
if (newValue == null) newValue = string.Empty;
StringBuilder sb = new StringBuilder();
foreach (char ch in original)
{
if (toBeReplaced.IndexOf(ch) < 0) sb.Append(ch);
else sb.Append(newValue);
}
return sb.ToString();
}
public static string ReplaceAll(this string original, string[] toBeReplaced, string newValue)
{
if (string.IsNullOrEmpty(original) || toBeReplaced == null || toBeReplaced.Length <= 0) return original;
if (newValue == null) newValue = string.Empty;
foreach (string str in toBeReplaced)
if (!string.IsNullOrEmpty(str))
original = original.Replace(str, newValue);
return original;
}
}
Call them like this:
"ABCDE".ReplaceAll("ACE", "xy");
xyBxyDxy
And this:
"ABCDEF".ReplaceAll(new string[] { "AB", "DE", "EF" }, "xy");
xyCxyF
Use Regex.Replace, something like this:
string input = "This is text with far too much " +
"whitespace.";
string pattern = "[;,]";
string replacement = "\n";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(input, replacement);
More information is available in the MSDN documentation for Regex.Replace.
Performance-wise this is probably not the best solution, but it works.
var str = "filename:with&bad$separators.txt";
char[] charArray = new char[] { '#', '%', '&', '{', '}', '\\', '<', '>', '*', '?', '/', ' ', '$', '!', '\'', '"', ':', '@' };
foreach (var singleChar in charArray)
{
str = str.Replace(singleChar, '_');
}
string ToBeReplaceCharacters = @"~()@#$%&+,'""<>|;\/*?";
string fileName = "filename;with<bad:separators?";
foreach (var RepChar in ToBeReplaceCharacters)
{
fileName = fileName.Replace(RepChar.ToString(), "");
}
A .NET Core version for replacing a defined set of string chars to a specific char. It leverages the recently introduced Span type and string.Create method.
The idea is to prepare a replacement array, so that no actual comparison operations are required for each char of the string. The replacement process thus resembles the way a state machine works. To avoid initializing every item of the replacement array, let's store oldChar ^ newChar (XOR'ed) values there, which gives the following benefits:
If a char is not changing: ch ^ ch = 0 - no need to initialize non-changing items
The final char can be found by XOR'ing: ch ^ repl[ch]:
ch ^ 0 = ch - not changed chars case
ch ^ (ch ^ newChar) = newChar - replaced char
So the only requirement would be to ensure that the replacement array is zero-ed when initialized. We'll be using ArrayPool<char> to avoid allocations each time the ReplaceAll method is called. And, in order to ensure that the arrays are zero-ed without expensive call to Array.Clear method, we'll be maintaining a pool dedicated for the ReplaceAll method. We'll be clearing the replacement array (exact items only) before returning it to the pool.
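The XOR bookkeeping is easy to check in isolation. A tiny sketch, with helper names that are mine and not part of the answer:

```csharp
public static class XorTrickSketch
{
    // The value stored in the replacement array for a character that must change.
    public static char MaskFor(char oldChar, char newChar)
    {
        return (char)(oldChar ^ newChar);
    }

    // What the copy loop computes for every character of the input.
    public static char Apply(char ch, char mask)
    {
        return (char)(ch ^ mask);
    }
}
```

Apply(';', MaskFor(';', '\n')) gives '\n', while Apply('a', '\0') gives 'a' back, which is why untouched (zeroed) slots need no initialization.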
public static class StringExtensions
{
private static readonly ArrayPool<char> _replacementPool = ArrayPool<char>.Create();
public static string ReplaceAll(this string str, char newChar, params char[] oldChars)
{
// If nothing to do, return the original string.
if (string.IsNullOrEmpty(str) ||
oldChars is null ||
oldChars.Length == 0)
{
return str;
}
// If only one character needs to be replaced,
// use the more efficient `string.Replace`.
if (oldChars.Length == 1)
{
return str.Replace(oldChars[0], newChar);
}
// Get a replacement array from the pool.
var replacements = _replacementPool.Rent(char.MaxValue + 1);
try
{
// Initialize the replacement array so that the element for each
// old char holds `oldChar ^ newChar`.
foreach (var oldCh in oldChars)
{
replacements[oldCh] = (char)(newChar ^ oldCh);
}
// Create a string with replaced characters.
return string.Create(str.Length, (str, replacements), (dst, args) =>
{
var repl = args.replacements;
foreach (var ch in args.str)
{
dst[0] = (char)(repl[ch] ^ ch);
dst = dst.Slice(1);
}
});
}
finally
{
// Clear the replacement array.
foreach (var oldCh in oldChars)
{
replacements[oldCh] = char.MinValue;
}
// Return the replacement array back to the pool.
_replacementPool.Return(replacements);
}
}
}
I know this question is super old, but I want to offer 2 options that are more efficient:
1st off, the extension method posted by Paul Walls is good but can be made more efficient by using the StringBuilder class, which is like the string data type but made especially for situations where you will be changing string values more than once. Here is a version I made of the extension method using StringBuilder:
public static string ReplaceChars(this string s, char[] separators, char newVal)
{
StringBuilder sb = new StringBuilder(s);
foreach (var c in separators) { sb.Replace(c, newVal); }
return sb.ToString();
}
I ran this operation 100,000 times and using StringBuilder took 73ms compared to 81ms using string. So the difference is typically negligible, unless you're running many operations or using a huge string.
Secondly, here is a 1 liner loop you can use:
foreach (char c in separators) { s = s.Replace(c, '\n'); }
I personally think this is the best option. It is highly efficient and doesn't require writing an extension method. In my testing this ran the 100k iterations in only 63ms, making it the most efficient.
Here is an example in context:
string s = "this;is,\ra\t\n\n\ntest";
char[] separators = new char[] { ' ', ';', ',', '\r', '\t', '\n' };
foreach (char c in separators) { s = s.Replace(c, '\n'); }
Credit to Paul Walls for the first 2 lines in this example.
I also fiddled around with that problem, and found that most of the solutions here are very slow. The fastest one was actually the LINQ + Aggregate method that dodgy_coder posted.
But I thought, well that might be also quite heavy in memory allocations depending upon how many old characters there are. So I came out with this:
The idea here is to keep a cached replacement map of the old characters for the current thread, to save allocations. Other than that, it just works on a character array copied from the input, which is later returned as a string again, with the character array modified as little as possible.
[ThreadStatic]
private static bool[] replaceMap;
public static string Replace(this string input, char[] oldChars, char newChar)
{
if (input == null) throw new ArgumentNullException(nameof(input));
if (oldChars == null) throw new ArgumentNullException(nameof(oldChars));
if (oldChars.Length == 1) return input.Replace(oldChars[0], newChar);
if (oldChars.Length == 0) return input;
replaceMap = replaceMap ?? new bool[char.MaxValue + 1];
foreach (var oldChar in oldChars)
{
replaceMap[oldChar] = true;
}
try
{
var count = input.Length;
var output = input.ToCharArray();
for (var i = 0; i < count; i++)
{
if (replaceMap[input[i]])
{
output[i] = newChar;
}
}
return new string(output);
}
finally
{
foreach (var oldChar in oldChars)
{
replaceMap[oldChar] = false;
}
}
}
For me this is at most two allocations for the actual input string to work on. A StringBuilder turned out to be much slower for me for some reason. And it is 2 times faster than the LINQ variant.
No "Replace" (Linq only):
string myString = ";,\r\t \n\n=1;;2,,3\r\r4\t\t5 6\n\n\n\n7=";
char NoRepeat = '\n';
string ByeBye = ";,\r\t ";
string myResult = myString.ToCharArray().Where(t => !"STOP-OUTSIDER".Contains(t))
.Select(t => "" + ( ByeBye.Contains(t) ? '\n' : t))
.Aggregate((all, next) => (
next == "" + NoRepeat && all.Substring(all.Length - 1) == "" + NoRepeat
? all : all + next ) );
Having built my own solution, and looking at the solution used here, I leveraged an answer that isn't using complex code and is generally efficient for most parameters.
Cover base cases where other methods are more appropriate. If there are no chars to replace, return the original string. If there is only one, just use the Replace method.
Use a StringBuilder and initialize its capacity to the length of the original string. After all, the new string will have the same length as the original if only chars are being replaced. This ensures only one memory allocation is used for the new string.
Whether the set of chars is small or large will impact performance. Large collections are better served by hash sets, while smaller collections are not. This is a near-perfect use case for a HybridDictionary, which switches to a hash-based lookup once the collection gets too large. We don't care about the dictionary's values, so I just set them to "true".
Having different methods for StringBuilder versus a plain string prevents unnecessary memory allocation. If it's just a string, don't instantiate a StringBuilder until the base cases have been checked. If it's already a StringBuilder, perform the replacements and return the StringBuilder itself (as other StringBuilder methods like Append do).
I put the replacement char first, and the chars to check at the end. This way, I can leverage the params keyword for easily passing additional strings. However, you don't have to do this if you prefer the other order.
using System.Collections.Specialized; // HybridDictionary
using System.Text; // StringBuilder
namespace Test.Extensions
{
public static class StringExtensions
{
public static string ReplaceAll(this string str, char replacementCharacter, params char[] chars)
{
if (chars.Length == 0)
return str;
if (chars.Length == 1)
return str.Replace(chars[0], replacementCharacter);
StringBuilder sb = new StringBuilder(str.Length);
var searcher = new HybridDictionary(chars.Length);
for (int i = 0; i < chars.Length; i++)
searcher[chars[i]] = true;
foreach (var c in str)
{
if (searcher.Contains(c))
sb.Append(replacementCharacter);
else
sb.Append(c);
}
return sb.ToString();
}
public static StringBuilder ReplaceAll(this StringBuilder sb, char replacementCharacter, params char[] chars)
{
if (chars.Length == 0)
return sb;
if (chars.Length == 1)
return sb.Replace(chars[0], replacementCharacter);
var searcher = new HybridDictionary(chars.Length);
for (int i = 0; i < chars.Length; i++)
searcher[chars[i]] = true;
for (int i = 0; i < sb.Length; i++)
{
var val = sb[i];
if (searcher.Contains(val))
sb[i] = replacementCharacter;
}
return sb;
}
}
}

In C#, perform "static" copy into a substring of a StringBuilder object

To build a sparsely populated fixed width record, I would like to copy a string field into a StringBuilder object, starting at a given position. A nice syntax for this would have been
StringBuilder sb = new StringBuilder(new string(' ', 100)); // 100 spaces
string fieldValue = "12345";
int startPos = 16;
int endPos = startPos + fieldValue.Length - 1;
sb[startPos..endPos] = fieldValue; // no such syntax
I could obviously do this C style, one character at a time:
for (int ii = 0; ii < fieldValue.Length; ii++)
sb[startPos + ii] = fieldValue[ii];
But this seems way too cumbersome for c#, plus it uses a loop where the resulting machine code could more efficiently use a bulk copy, which can make a difference if the strings involved were long. Any ideas for a better way?
Your original algorithm can be supported in the following way
var builder = new StringBuilder(new string(' ', 100));
string toInsert = "HELLO WORLD";
int atIndex = 10;
builder.Remove(atIndex, toInsert.Length);
builder.Insert(atIndex, toInsert);
Debug.Assert(builder.Length == 100);
Debug.Assert(builder.ToString().IndexOf(toInsert) == 10);
You can write your own specialized string builder class that uses the efficient machinery of char[] and string underneath the hood, in particular String.CopyTo:
public class FixedStringBuilder
{
char[] buffer;
public FixedStringBuilder(int length)
{
buffer = new string(' ', length).ToCharArray();
}
public FixedStringBuilder Replace(int index, string value)
{
value.CopyTo(0, buffer, index, value.Length);
return this;
}
public override string ToString()
{
return new string(buffer);
}
}
class Program
{
static void Main(string[] args)
{
FixedStringBuilder sb = new FixedStringBuilder(100);
string fieldValue = "12345";
int startPos = 16;
sb.Replace(startPos, fieldValue);
string buffer = sb.ToString();
}
}
The closest solution to your goal is to convert the source string into a char array and then overwrite the cells. Any char array can be converted back to a string.
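As a rough sketch of that char-array route, using String.CopyTo for the bulk copy (the class, method and parameter names are only for illustration):

```csharp
public static class FixedWidthRecord
{
    // Hypothetical helper: builds a space-filled record of `width` chars
    // and writes `fieldValue` into it starting at `startPos`.
    public static string Build(int width, int startPos, string fieldValue)
    {
        char[] record = new string(' ', width).ToCharArray();

        // CopyTo(sourceIndex, destination, destinationIndex, count) copies the
        // whole field in one call; it throws if the field does not fit.
        fieldValue.CopyTo(0, record, startPos, fieldValue.Length);

        return new string(record);
    }
}
```

Build(100, 16, "12345") yields a 100-character string with "12345" at positions 16 to 20 and spaces everywhere else.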
Why are you pre-allocating memory in the StringBuilder? (That only helps performance.)
I would append the known prefix, then the actual value, and then the postfix to the string.
something like:
StringBuilder sb = new StringBuilder();
sb.Append(prefix).Append(value).Append(postfix);

Multiplying strings in C# [duplicate]

In Python I can do this:
>>> i = 3
>>> 'hello' * i
'hellohellohello'
How can I multiply strings in C# similarly to in Python? I could easily do it in a for loop but that gets tedious and non-expressive.
Ultimately I'm writing out to console recursively with an indented level being incremented with each call.
parent
    child
    child
    child
        grandchild
And it'd be easiest to just do "\t" * indent.
There is an extension method for it in this post.
public static string Multiply(this string source, int multiplier)
{
StringBuilder sb = new StringBuilder(multiplier * source.Length);
for (int i = 0; i < multiplier; i++)
{
sb.Append(source);
}
return sb.ToString();
}
string s = "</li></ul>".Multiply(10);
If you just need a single character you can do:
new string('\t', i)
See this post for more info.
Here's how I do it...
string value = new string(' ',5).Replace(" ","Apple");
There's nothing built-in to the BCL to do this, but a bit of LINQ can accomplish the task easily enough:
var multiplied = string.Join("", Enumerable.Repeat("hello", 5).ToArray());
int indent = 5;
string s = new string('\t', indent);
One way of doing this is the following - but it's not that nice.
String.Join(String.Empty, Enumerable.Repeat("hello", 3).ToArray())
UPDATE
Ahhhh ... I remember ... for chars ...
new String('x', 3)
how about with a linq aggregate...
var combined = Enumerable.Repeat("hello", 5).Aggregate("", (agg, current) => agg + current);
There is no such statement in C#; your best bet is probably your own MultiplyString() function.
Per mmyers:
public static string times(this string str, int count)
{
StringBuilder sb = new StringBuilder();
for(int i=0; i<count; i++)
{
sb.Append(str);
}
return sb.ToString();
}
As long as it's only one character that you want to repeat, there is a String constructor that you can use:
string indentation = new String('\t', indent);
I don't think that you can extend System.String with an operator overload, but you could make a string wrapper class to do it.
public class StringWrapper
{
public string Value { get; set; }
public StringWrapper()
{
this.Value = string.Empty;
}
public StringWrapper(string value)
{
this.Value = value;
}
public static StringWrapper operator *(StringWrapper wrapper,
int timesToRepeat)
{
StringBuilder builder = new StringBuilder();
for (int i = 0; i < timesToRepeat; i++)
{
builder.Append(wrapper.Value);
}
return new StringWrapper(builder.ToString());
}
}
Then call it like...
var helloTimesThree = new StringWrapper("hello") * 3;
And get the value from...
helloTimesThree.Value;
Of course, the sane thing to do would be to have your function track and pass in the current depth and dump tabs out in a for loop based off of that.
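Passing the depth along could be sketched like this. Node and TreePrinter are made-up types, and I am using StringBuilder.Append(char, int) to emit the tabs in place of a hand-written for loop; the output goes to a StringBuilder so it is easy to inspect, but Console.Write would work the same way.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical tree node, just enough to demonstrate the recursion.
public class Node
{
    public string Name;
    public List<Node> Children = new List<Node>();
    public Node(string name) { Name = name; }
}

public static class TreePrinter
{
    // Writes `node` indented by `depth` tabs, then recurses one level deeper.
    public static void Write(Node node, int depth, StringBuilder output)
    {
        output.Append('\t', depth).AppendLine(node.Name);
        foreach (var child in node.Children)
        {
            Write(child, depth + 1, output);
        }
    }
}
```

Call it with depth 0 for the root; each recursive call adds one tab, so no string multiplication is needed at all.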
If you need the string three times, just do:
string x = "hello";
string combined = x + x + x;
