This question already has answers here:
Fastest way to trim a string and convert it to lower case
(6 answers)
Closed 6 years ago.
I am searching for a simple way to remove underscores from a string and replace the character following each underscore with its upper-case letter.
For example:
From: "data" to: "Data"
From: "data_first" to: "DataFirst"
From: "data_first_second" to: "DataFirstSecond"
Who needs more than one line of code?
var output = Regex.Replace(input, "(?:^|_)($|.)", m => m.Groups[1].Value.ToUpper());
This approach is a small "finite-state machine" that iterates through the string, in that it has a finite set of states ("first letter of a word, at the start or right after an underscore" vs "character inside a word"). It does little more than the work the task itself requires. You can use a regular expression for the same effect, but the regex engine will generally do at least as much work at runtime, so the hand-written loop is unlikely to be slower.
The advantage of this approach is performance: it allocates no intermediate strings, and it iterates through the input string only once, giving a time complexity of O(n) and a space complexity of O(n) for the output. Asymptotically this cannot be improved upon, since every character has to be read at least once.
public static String ConvertUnderscoreSeparatedStringToPascalCase(String input) {
Boolean isFirstLetter = true;
StringBuilder output = new StringBuilder( input.Length );
foreach(Char c in input) {
if( c == '_' ) {
isFirstLetter = true;
continue;
}
if( isFirstLetter ) {
output.Append( Char.ToUpper( c ) );
isFirstLetter = false;
}
else {
output.Append( c );
}
}
return output.ToString();
}
You can use String.Split and following LINQ query:
IEnumerable<string> newStrings = "data_first_second".Split('_')
.Select(t => new String(t.Select((c, index) => index == 0 ? Char.ToUpper(c) : c).ToArray()));
string result = String.Join("", newStrings);
All the other answers are valid; for a culture-aware way:
var textInfo = CultureInfo.CurrentCulture.TextInfo;
var modifiedString = textInfo.ToTitleCase(originalString).Replace("_", "");
I've made a fiddle: https://dotnetfiddle.net/NAr5PP
I would do something like this:
string test = "data_first_second";
string[] testArray=test.Split('_');
StringBuilder modifiedString = new StringBuilder();
foreach (string t in testArray)
{
if (t.Length == 0) continue; // skip empty segments (e.g. from "a__b")
modifiedString.Append(t.First().ToString().ToUpper() + t.Substring(1));
}
test = modifiedString.ToString();
Use LINQ and Split method like this:
var result = string.Join("",str.Split('_')
.Select(c => c.First().ToString()
.ToUpper() + String.Join("", c.Skip(1))));
I'm building a string based on an IEnumerable, and doing something like this:
public string BuildString()
{
var enumerable = GetEnumerableFromSomewhere(); // actually an in parameter,
// but this way you don't have to care
// about the type :)
var interestingParts = enumerable.Select(v => v.TheInterestingStuff).ToArray();
stringBuilder.Append("This is it: ");
foreach(var part in interestingParts)
{
stringBuilder.AppendPart(part);
if (part != interestingParts.Last())
{
stringBuilder.Append(", ");
}
}
}
private static void AppendPart(this StringBuilder stringBuilder, InterestingPart part)
{
stringBuilder.Append("[");
stringBuilder.Append(part.Something);
stringBuilder.Append("]");
if (someCondition(part))
{
// this is in reality done in another extension method,
// similar to the else clause
stringBuilder.Append(" = #");
stringBuilder.Append(part.SomethingElse);
}
else
{
// this is also an extension method, similar to this one
// it casts the part to an IEnumerable, and iterates over
// it in much the same way as the outer method.
stringBuilder.AppendInFilter(part);
}
}
I'm not entirely happy with this idiom, but I'm struggling to formulate something more succinct.
This is, of course, part of a larger string building operation (where there are several blocks similar to this one, as well as other stuff in between) - otherwise I'd probably drop the StringBuilder and use string.Join(", ", ...) directly.
My closest attempt at simplifying the above, though, is constructs like this for each iterator:
stringBuilder.Append(string.Join(", ", propertyNames.Select(prop => "[" + prop + "]")));
but here I'm still concatenating strings with +, which makes it feel like the StringBuilder doesn't really contribute much.
How could I simplify this code, while keeping it efficient?
You can replace this:
string.Join(", ", propertyNames.Select(prop => "[" + prop + "]"))
With c# 6 string interpolation:
string.Join(", ", propertyNames.Select(prop => $"[{prop}]"))
In both cases the difference is only syntactic and it doesn't really matter. String concatenation like yours inside the Select isn't a problem: the compiler turns "[" + prop + "]" into a single String.Concat call, so only one new string is created (not one for each segment plus another for the joined string).
Putting it all together:
var result = string.Join(", ", enumerable.Select(v => $"[{v.TheInterestingStuff}]"));
Because the body of the foreach is too complex to fit in a string interpolation expression, you can instead append the separator after every part and remove the last N characters of the string once it is built, as KooKiz suggested.
string separator = ", ";
foreach(var part in interestingParts)
{
stringBuilder.Append("[");
stringBuilder.Append(part);
stringBuilder.Append("]");
if (someCondition(part))
{
// Append more stuff
}
else
{
// Append other things
}
stringBuilder.Append(separator);
}
stringBuilder.Length -= separator.Length; // drop the trailing separator (assumes at least one part was appended)
In any case I think that for better encapsulation the content of the loop's scope should sit in a separate function that will receive a part and the separator and will return the output string. It can also be an extension method for StringBuilder as suggested by user734028
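The encapsulation suggested above could be sketched as an extension method. This is only an illustration: AppendJoined and its parameters are hypothetical names, not part of the framework. It writes the separator between items rather than after each one, so nothing has to be trimmed afterwards and an empty sequence is handled without a special case.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class StringBuilderJoinExtensions
{
    // Hypothetical helper: appends every item through appendItem and
    // writes the separator only between consecutive items.
    public static StringBuilder AppendJoined<T>(
        this StringBuilder builder,
        IEnumerable<T> items,
        string separator,
        Action<StringBuilder, T> appendItem)
    {
        bool first = true;
        foreach (T item in items)
        {
            if (!first)
            {
                builder.Append(separator);
            }
            appendItem(builder, item);
            first = false;
        }
        return builder;
    }
}
```

A call would then look like stringBuilder.AppendJoined(interestingParts, ", ", (sb, part) => sb.Append('[').Append(part).Append(']')); with the more complex per-part logic living inside the delegate.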
Use the Aggregate extension method with a StringBuilder.
This will be more efficient than concatenating strings if your collection is big.
var builder = new StringBuilder();
list.Aggregate(builder, (sb, person) =>
{
sb.Append(",");
sb.Append("[");
sb.Append(person.Name);
sb.Append("]");
return sb;
});
if (builder.Length > 0) builder.Remove(0, 1); // Remove first comma (nothing to remove if the list was empty)
A plain foreach is generally more efficient than LINQ, so you can just change the logic for the delimiter:
var builder = new StringBuilder();
foreach(var part in enumerable.Select(v => v.TheInterestingStuff))
{
builder.Append(", ");
builder.Append("[");
builder.Append(part);
builder.Append("]");
}
if (builder.Length > 0) builder.Remove(0, 2); // Remove first comma and space (nothing to remove if there were no parts)
Aggregate solution:
var answer = interestingParts.Select(v => "[" + v + "]").Aggregate((a, b) => a + ", " + b);
Serialization solution:
var temp = JsonConvert.SerializeObject(interestingParts.Select(x => new[] { x }));
var answer = temp.Substring(1, temp.Length - 2).Replace(",", ", ");
the code:
public string BuildString()
{
var enumerable = GetEnumerableFromSomewhere();
var interestingParts = enumerable.Select(v => v.TheInterestingStuff).ToArray();
stringBuilder.Append("This is it: ");
foreach(var part in interestingParts)
{
stringBuilder.AppendPart(part);
}
if (stringBuilder.Length>0)
stringBuilder.Length--;
}
private static void AppendPart(this StringBuilder stringBuilder, InterestingPart part)
{
if (someCondition(part))
{
stringBuilder.AppendFormat("[{0}] = #{1}", part.Something, part.SomethingElse);
}
else
{
stringBuilder.AppendFormat("[{0}]", part.Something);
stringBuilder.AppendInFilter(part);
}
}
much better now IMO.
Now a little discussion on making it very fast. We can use Parallel.For. You might object that all the Appends go to a single shared resource, the StringBuilder, which you would have to lock before appending to it; not so efficient! But if each iteration of the for loop in the outer function produces one single string, then we can allocate a single array of strings, sized to the count of interestingParts, before the Parallel.For starts, and have each iteration store its string at its own index.
Something like:
string[] iteration_buckets = new string[interestingParts.Length];
System.Threading.Tasks.Parallel.For(0, interestingParts.Length,
(index) =>
{
iteration_buckets[index] = AppendPart(interestingParts[index]);
});
Your function AppendPart will have to be adjusted: instead of an extension method on StringBuilder, make it a plain function that takes just a string and returns a string.
After the loop ends you can do a string.Join to get a string, which is what you may be doing with the stringBuilder.ToString() too.
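Putting those pieces together, a minimal sketch might look like the following. It assumes the parts are plain strings; ParallelPartBuilder and formatPart are hypothetical names, with formatPart standing in for the adjusted AppendPart. Whether the parallel version is actually faster depends on how expensive each part is to format.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelPartBuilder
{
    // formatPart is a hypothetical stand-in for the adjusted AppendPart:
    // it takes one part and returns its string without touching shared state.
    public static string Build(string[] interestingParts, Func<string, string> formatPart)
    {
        var iterationBuckets = new string[interestingParts.Length];

        Parallel.For(0, interestingParts.Length, index =>
        {
            // Each iteration writes only to its own slot, so no lock is needed.
            iterationBuckets[index] = formatPart(interestingParts[index]);
        });

        // Joining by index preserves the original order of the parts.
        return string.Join(", ", iterationBuckets);
    }
}
```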
This question already has answers here:
How would you count occurrences of a string (actually a char) within a string?
(34 answers)
Closed 9 years ago.
I am trying to get the number of occurrences of a certain character such as & in the following string.
string test = "key1=value1&key2=value2&key3=value3";
How do I determine that there are 2 ampersands (&) in the above test string variable?
You could do this:
int count = test.Split('&').Length - 1;
Or with LINQ:
test.Count(x => x == '&');
Because LINQ can do everything...:
string test = "key1=value1&key2=value2&key3=value3";
var count = test.Where(x => x == '&').Count();
Or if you like, you can use the Count overload that takes a predicate :
var count = test.Count(x => x == '&');
The most straightforward, and most efficient, way would be to simply loop through the characters in the string:
int cnt = 0;
foreach (char c in test) {
if (c == '&') cnt++;
}
You can use Linq extensions to make a simpler, and almost as efficient version. There is a bit more overhead, but it's still surprisingly close to the loop in performance:
int cnt = test.Count(c => c == '&');
Then there is the old Replace trick, however that is better suited for languages where looping is awkward (SQL) or slow (VBScript):
int cnt = test.Length - test.Replace("&", "").Length;
Why use regex for that? String implements IEnumerable<char>, so you can just use LINQ.
test.Count(c => c == '&')
Your string example looks like the query string part of a GET. If so, note that HttpContext has some help for you
int numberOfArgs = HttpContext.Current.Request.QueryString.Count;
For more of what you can do with QueryString, see NameValueCollection
Here is the most inefficient way to get the count in all answers. But you'll get a Dictionary that contains key-value pairs as a bonus.
string test = "key1=value1&key2=value2&key3=value3";
var keyValues = Regex.Matches(test, @"([\w\d]+)=([\w\d]+)[&$]*")
.Cast<Match>()
.ToDictionary(m => m.Groups[1].Value, m => m.Groups[2].Value);
var count = keyValues.Count - 1;
If the title isn't clear enough, here's a procedural way of approaching the problem:
[TestMethod]
public void Foo()
{
var start = "9954-4740-4491-4414";
var sb = new StringBuilder();
var j = 0;
for (var i = 0 ; i < start.Length; i++)
{
if ( start[i] != '-')
{
if (j == 2)
{
sb.AppendFormat(":{0}", start[i]);
j = 1;
}
else
{
sb.Append(start[i]);
j++;
}
}
}
var end = sb.ToString();
Assert.AreEqual(end, "99:54:47:40:44:91:44:14");
}
If you're using C# 4 all you need is this:
string result = string.Join(":", Regex.Matches(start, @"\d{2}").Cast<Match>());
For C# 3 you need to provide a string[] to Join:
string[] digitPairs = Regex.Matches(start, @"\d{2}")
.Cast<Match>()
.Select(m => m.Value)
.ToArray();
string result = string.Join(":", digitPairs);
I agree with "why bother with regular expressions?"
string.Join(":", str.Split('-').Select(s => s.Insert(2, ":")));
Regex.Replace version, although I like Mark's answer better:
string res = Regex.Replace(start,
@"(\d{2})(\d{2})-(\d{2})(\d{2})-(\d{2})(\d{2})-(\d{2})(\d{2})",
@"$1:$2:$3:$4:$5:$6:$7:$8");
After a while of experimenting, I've found a way to do it by using a single regular expression that works with input of unlimited length:
Regex.Replace(start, @"(?'group'\d\d)-|(?'group'\d\d)(?!$)", @"$1:")
When using named groups (the (?'name') stuff) with same name, captures are stored in the same group. That way, it is possible to replace distinct matches with same value.
It also makes use of negative lookahead (the (?!) stuff).
You don't need them: strip the '-' characters and then insert a colon between each pair of numbers. Unless I've misunderstood the desired output format.
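A minimal sketch of that idea, assuming the input consists only of digits and '-' separators (PairFormatter and InsertColons are hypothetical names used for illustration):

```csharp
using System.Text;

public static class PairFormatter
{
    public static string InsertColons(string input)
    {
        // Step 1: strip the '-' characters.
        string digits = input.Replace("-", "");

        // Step 2: copy the characters, writing a colon before every pair
        // except the first one.
        var sb = new StringBuilder(digits.Length + digits.Length / 2);
        for (int i = 0; i < digits.Length; i++)
        {
            if (i > 0 && i % 2 == 0)
            {
                sb.Append(':');
            }
            sb.Append(digits[i]);
        }
        return sb.ToString();
    }
}
```

For the input in the question, InsertColons("9954-4740-4491-4414") yields "99:54:47:40:44:91:44:14".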
Which is the best way to skip the characters before an underscore (_) in a string using C#?
E.g., case 1: the string contains _
string s="abc_defg"
I want to get defg into another string.
Case 2:
Sometimes the string does not contain _. In that case I need to get the whole string.
E.g.
s="defg"
In both cases I want to get "defg". The filtering should only be applied if there is an underscore in the string. How can I do that?
string s="abc_defg";
int ix = s.IndexOf('_');
s = ix != -1 ? s.Substring(ix + 1) : s;
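Using the ternary operator here is not really necessary; you can write:
s = s.Substring(ix + 1);
directly, because IndexOf returns -1 when there is no underscore, so ix + 1 is 0 and Substring(0) gives back the whole string (Substring is probably optimized for the case index == 0).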
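To make the two cases concrete, here is a small sketch (UnderscoreHelper and AfterFirstUnderscore are hypothetical names). Note that with several underscores it keeps everything after the first one.

```csharp
public static class UnderscoreHelper
{
    public static string AfterFirstUnderscore(string s)
    {
        // IndexOf returns -1 when there is no underscore, so ix + 1 is 0
        // and Substring(0) returns the whole string.
        int ix = s.IndexOf('_');
        return s.Substring(ix + 1);
    }
}
```

With this, both "abc_defg" and "defg" produce "defg", while "a_b_c" produces "b_c".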
Is this what you want?
BUT someone has suggested using LINQ cannons, so
var temp = s.SkipWhile(p => p != '_').Skip(1);
s = temp.Any() ? new string(temp.ToArray()) : s;
In .NET 4.0 there is a new string.Concat method.
s = temp.Any() ? string.Concat(temp) : s;
(note that in general the LINQ way is slower and more complex to read)
I'll add the ultrakill: the Regular Expressions!!! There is a school of thought that anything can be done with Regular Expressions OR jQuery! :-)
var rx = new Regex("(?:[^_]*_)?(.*)", RegexOptions.Singleline);
var res = rx.Match(s).Groups[1].Value;
I won't even try to explain this beast to anyone, so don't ask. It's useless. (both the Regex and to ask :-) )
I do not know how to define the best way but this works as long as there is only one underscore.
string s1 = "abc_defg";
string s2 = "defg";
Console.WriteLine( s1.Split('_').Last() );
Console.WriteLine( s2.Split('_').Last() );
If you need more flexibility you can take a look at regular expressions.
Regex.Replace(s1, "^.*_", "")
simple function perhaps:
public string TakeStringAfterUnderscoreOrWholeString(string input)
{
var underscorePos = input.IndexOf("_");
return underscorePos == -1 ? input : input.Substring(underscorePos+1);
}
Live test: http://rextester.com/rundotnet?code=AVQO61388
You can leverage LINQ SkipWhile():
textValue.SkipWhile(c => c != separator)
.Skip(1)
Final solution for both cases:
string textValue = "abc_defg";
char separator = '_';
string result = textValue.IndexOf(separator) >= 0
? new String(textValue.SkipWhile(c => c != separator)
.Skip(1)
.ToArray())
: textValue;
EDIT:
I agree with xanatos's comment, but anyway it is sometimes pretty interesting to explore different solutions :)
I would suggest simply using a string[] arr and the Split function for this purpose.
string[] arr = s.Split('_');
string result = arr[arr.Length - 1]; // "out" is a reserved keyword in C#, so use another name
I have a very simple question, and I shouldn't be hung up on this, but I am. Haha!
I have a string that I receive in the following format(s):
123
123456-D53
123455-4D
234234-4
123415
The desired output, post formatting, is:
123-455-444
123-455-55
123-455-5
or
123-455
The format is ultimately dependent upon the total number of characters in the original string..
I have several ideas of how to do this, but I keep thinking there's a better way than string.Replace and concatenation...
Thanks for the suggestions..
Ian
Tanascius is right, but I can't comment or upvote due to my lack of rep. If you want additional info on string.Format, I've found this helpful:
http://blog.stevex.net/string-formatting-in-csharp/
I assume this does not merely rely upon the inputs always being numeric? If so, I'm thinking of something like this
private string ApplyCustomFormat(string input)
{
StringBuilder builder = new StringBuilder(input.Replace("-", ""));
int index = 3;
while (index < builder.Length)
{
builder.Insert(index, "-");
index += 4;
}
return builder.ToString();
}
Here's a method that uses a combination of regular expressions and LINQ to extract groups of three letters at a time and then joins them together again. Note: it assumes that the input has already been validated. The validation can also be done with a regular expression.
string s = "123456-D53";
string[] groups = Regex.Matches(s, @"\w{1,3}")
.Cast<Match>()
.Select(match => match.Value)
.ToArray();
string result = string.Join("-", groups);
Result:
123-456-D53
EDIT: See history for old versions.
You could use char.IsDigit() for finding digits, only.
var output = new StringBuilder();
var digitCount = 0;
foreach( var c in input )
{
if( char.IsDigit( c ) )
{
output.Append( c );
digitCount++;
if( digitCount % 3 == 0 )
{
output.Append( "-" );
}
}
}
// Remove possible last -
return output.ToString().TrimEnd('-');
This code should fill from left to right (now I got it, first read, then code) ...
Sorry, I still can't test this right now.
Not the fastest, but easy on the eyes (ed: to read):
string Normalize(string value)
{
if (String.IsNullOrEmpty(value)) return value;
int appended = 0;
var builder = new StringBuilder(value.Length + value.Length/3);
for (int ii = 0; ii < value.Length; ++ii)
{
if (Char.IsLetterOrDigit(value[ii]))
{
builder.Append(value[ii]);
if ((++appended % 3) == 0) builder.Append('-');
}
}
return builder.ToString().TrimEnd('-');
}
Uses a guess to pre-allocate the StringBuilder's length. This will accept any Alphanumeric input with any amount of junk being added by the user, including excess whitespace.