Substrings according to some sign - C#

Q:
I want to get substrings according to some sign, such as -.
Example:
If I have a string like this:
saturday-sa-0-
and I want to get:
saturday
sa
0
I searched and found the following method:
string substring = name.Split('-')[i];
My code block sample:
foreach (string name in q)
{
    for (int i = 0; i < 3; i++)
    {
        string substring = name.Split('-')[i];
    }
}
But I read the comments about the performance drawbacks when I have a long string.
My question is: is there any way to get substrings according to a specific sign without hurting the performance of the code?

Splitting a string is O(N), which is also the complexity of String.Split, so even a hand-written procedure cannot be substantially faster, only slightly faster at best. In any case, first make sure that the performance of String.Split is actually unsatisfactory for you.
And yes, if you split it over and over in a loop, that will be a performance issue. Split once first and then iterate over the array; see the other answers.

First, you should execute the Split operation only once. I.e., instead of
some loop {
...
string substring = name.Split('-')[i];
...
}
use
string[] substrings = name.Split('-');
some loop {
...
string substring = substrings[i];
...
}
Second, don't worry about the performance of String.Split too much unless
you have a real, measurable performance problem and
you know that String.Split is the culprit.
For example, if you have some database operation that takes 1 second, it does not really matter if the subsequent Split operation takes 0.001 or 0.002 seconds.
EDIT: Regarding the code in your comment: You can refactor
foreach (string name in q) {
    for (int i = 0; i < 3; i++) {
        string substring = name.Split('-')[i];
        // do something with substring
    }
}
to
foreach (string name in q) {
    string[] substrings = name.Split('-');
    for (int i = 0; i < 3; i++) {
        string substring = substrings[i];
        // do something with substring
    }
}

Split only becomes a problem when it is overused. If you need to split a string on a character, then split the string. Regex is the next most common approach, but it comes with its own set of performance gotchas. If you really need to keep your footprint small, scanning the string and processing it in place is your best option. However, this is fraught with peril as well: .NET strings are immutable, so you may well run into the same issues you run into with Split. So the long and the short of it is: use Split, and if that doesn't meet your needs, reevaluate.
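To make the "scan the string yourself" option concrete, here is a minimal sketch. The class and method names (DelimitedScanner, Fields) are made up for illustration and do not come from any library. It walks the string with IndexOf and yields one field at a time, so it avoids the array that Split builds, but each Substring call still allocates a new string, which is the immutability caveat mentioned above.

```csharp
using System;
using System.Collections.Generic;

public static class DelimitedScanner
{
    // Hypothetical helper: yields each non-empty field between separators
    // without first building the full array that String.Split returns.
    public static IEnumerable<string> Fields(string text, char separator)
    {
        int start = 0;
        while (start <= text.Length)
        {
            int end = text.IndexOf(separator, start);
            if (end < 0) end = text.Length;      // last field runs to the end of the string
            if (end > start)                     // skip empty fields, e.g. after a trailing '-'
                yield return text.Substring(start, end - start);
            start = end + 1;                     // continue after the separator
        }
    }
}
```

For "saturday-sa-0-" and '-', this should yield saturday, sa and 0 in that order. Whether it actually beats Split is something to measure, not assume.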

I'd say it depends on your data. However, you should not repeatedly do ...
string substring = name.Split('-')[i];
... since this will split your string into parts every time you need to access just one of the parts. Instead, cache the split result like this ...
string[] parts = name.Split('-');
... and then use ...
string substring = parts[i];
... to access the respective parts.

Related

Split string with plus sign as a delimiter

I have an issue with a string containing the plus sign (+).
I want to split that string (or if there is some other way to solve my problem)
string ColumnPlusLevel = "+-J10+-J10+-J10+-J10+-J10";
string strpluslevel = "";
strpluslevel = ColumnPlusLevel;
string[] strpluslevel_lines = Regex.Split(strpluslevel, "+");
foreach (string line in strpluslevel_lines)
{
MessageBox.Show(line);
strpluslevel_summa = strpluslevel_summa + line;
}
MessageBox.Show(strpluslevel_summa, "summa sumarum");
The MessageBox is for my testing purpose.
Now... The ColumnPlusLevel string can have very varied content, but it is always a repeated pattern starting with the plus sign,
e.g. "+MJ+MJ+MJ" or "+PPL14.1+PPL14.1+PPL14.1".
(It comes from another piece of software, and I can't edit that software's output.)
How can I find out what that pattern is that is being repeated?
In these examples that is +-J10, +MJ or +PPL14.1.
In my case above I have tested it using only a MessageBox to show the result, but I want the repeated pattern stored in a string later on.
Maybe I'm doing it wrong by using Split, and there is another solution.
Maybe I use Split in the wrong way.
Hope you understand my problem and the result I want.
Thanks for any advice.
/Tomas
How can I find out what that pattern is that is being repeated?
Maybe I didn't understand the requirement fully, but isn't it as easy as:
string[] tokens = ColumnPlusLevel.Split(new[]{'+'}, StringSplitOptions.RemoveEmptyEntries);
string first = tokens[0];
bool repeatingPattern = tokens.Skip(1).All(s => s == first);
If repeatingPattern is true you know that the pattern itself is first.
Can you maybe explain how the logic works?
The line which contains tokens.Skip(1) is a LINQ query, so you need to add using System.Linq; at the top of your code file. Since tokens is a string[], which implements IEnumerable<string>, you can use any LINQ extension method. Skip(1) skips the first token, because I have already stored it in a variable and I want to know whether all the others are the same. Therefore I use All, which returns false as soon as one item doesn't match the condition (i.e. one string is different from the first). If all are the same, you know that there is a repeating pattern, which is already stored in the variable first.
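Wrapped into a method, the Split/Skip/All logic described above might look like the sketch below. RepeatCheck and FindRepeatedToken are hypothetical names chosen for this example, and returning null for "no repeated pattern" is my own assumption about how the caller wants to be told.

```csharp
using System;
using System.Linq;

public static class RepeatCheck
{
    // Hypothetical helper: returns the repeated token (without the leading '+'),
    // or null when the input has no tokens or the tokens are not all identical.
    public static string FindRepeatedToken(string input)
    {
        string[] tokens = input.Split(new[] { '+' }, StringSplitOptions.RemoveEmptyEntries);
        if (tokens.Length == 0) return null;

        string first = tokens[0];
        // All stops at the first token that differs from the first one.
        return tokens.Skip(1).All(t => t == first) ? first : null;
    }
}
```

Note that the returned token does not include the plus sign; prepend "+" if you need the pattern exactly as it appears in the input.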
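You should use the String.Split function:
// RemoveEmptyEntries skips the empty entry before the leading '+', so index 0 is the pattern without its plus sign
string pattern = ColumnPlusLevel.Split(new[] { '+' }, StringSplitOptions.RemoveEmptyEntries)[0];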
...but it is always a repeated pattern starting with the plus sign.
Why do you even need String.Split() here if the pattern always only repeats itself?
string input = @"+MJ+MJ+MJ";
int indexOfSecondPlus = input.IndexOf('+', 1);
string pattern = input.Remove(indexOfSecondPlus, input.Length - indexOfSecondPlus);
//pattern is now "+MJ"
No need for String.Split, no need to use LINQ.
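One thing to watch with the IndexOf approach: if the pattern occurs only once, there is no second plus sign, IndexOf returns -1, and Remove throws. A guarded variant could look like this sketch (PatternExtractor and FirstPattern are illustrative names, and returning the whole input when nothing repeats is an assumption about the desired behaviour):

```csharp
using System;

public static class PatternExtractor
{
    // Hypothetical helper: returns everything up to (not including) the second '+'.
    // If there is no second '+', the whole input is treated as the pattern.
    public static string FirstPattern(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        // Start at index 1 so the leading '+' itself is not found.
        int secondPlus = input.IndexOf('+', 1);
        return secondPlus < 0 ? input : input.Substring(0, secondPlus);
    }
}
```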
String has a method called Split which lets you split/divide the string based on a given character/character set:
string givenString = "+-J10+-J10+-J10+-J10+-J10";
// '+' is the character to split on; RemoveEmptyEntries skips the empty entry before the leading '+', so index 0 is "-J10"
string splittedString = givenString.Split(new[] { '+' }, StringSplitOptions.RemoveEmptyEntries)[0];
// Replace swaps every occurrence of the first string for the second; here it removes the '-' and leaves "J10"
string result = splittedString.Replace("-", "");
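As for the Regex.Split call in the question: "+" on its own is a quantifier in regular-expression syntax, so passing it as the pattern throws an ArgumentException instead of splitting on the plus sign. If you do want to stay with Regex, the plus has to be escaped. A minimal sketch, with PlusSplitter and SplitOnPlus as made-up names:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PlusSplitter
{
    // Hypothetical helper: splits on a literal '+' using Regex.
    public static string[] SplitOnPlus(string input)
    {
        // Regex.Escape("+") produces "\+", which matches a literal plus sign.
        return Regex.Split(input, Regex.Escape("+"))
                    .Where(part => part.Length > 0)   // drop the empty entry before the leading '+'
                    .ToArray();
    }
}
```

For a single-character delimiter, String.Split is the simpler tool; this only shows why the original call failed.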

Inverse Count Issue C#

Not quite what the title suggests; what I need is a way to index a string backwards, like
string i = "3027"
i[0] = label1.Text
Result = 7, not 3. Is there a way?
Not sure if you need my code or not; it's not really important.
You can reverse the string using a number of approaches including
public static string ReverseString(string s)
{
char[] arr = s.ToCharArray();
Array.Reverse(arr);
return new string(arr);
}
http://www.dotnetperls.com/reverse-string
then access the portion of the reversed string that you are interested in.
Note that you cannot assign to i[0] as shown in your example code because strings are immutable in C#. If you want to construct a string a piece at a time, it is often most efficient to use StringBuilder.
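If all you need is to read characters counting from the end, reversing is not strictly necessary; you can compute the index from the length instead. A small sketch (FromEnd and CharFromEnd are names invented for this example):

```csharp
using System;

public static class FromEnd
{
    // Hypothetical helper: offset 0 is the last character, 1 the one before it, and so on.
    // Throws IndexOutOfRangeException if offset is outside the string.
    public static char CharFromEnd(string s, int offset)
    {
        return s[s.Length - 1 - offset];
    }
}
```

With the string "3027", offset 0 gives '7' and offset 3 gives '3'. To show it in a label you would convert it with ToString().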

Most efficient way to parse a delimited string in C#

This has been asked a few different ways but I am debating on "my way" vs "your way" with another developer. Language is C#.
I want to parse a pipe delimited string where the first 2 characters of each chunk is my tag.
The rules. Not my rules but rules I have been given and must follow.
I can't change the format of the string.
This function will be called possibly many times so efficiency is key.
I need to keep it simple.
The input string and tag I am looking for may/will change during runtime.
Example input string: AOVALUE1|ABVALUE2|ACVALUE3|ADVALUE4
Example tag I may need value for: AB
My way: I split the string into an array on the delimiter and loop through the array each time the function is called. I look at the first 2 characters of each chunk and return the value minus those first 2 characters.
The "other guy's" way is to use a combination of IndexOf and Substring to find the start and end of the field I am looking for, then use Substring again to pull out the value minus the first 2 characters. So he would call IndexOf("|AB") and then find the next pipe in the string. These would be the start and end. Then Substring that out.
Now I would think that IndexOf and Substring scan the string character by character each time, so this would be less efficient than splitting it into large chunks and reading each chunk minus its first 2 characters. Or is there another way that is better than what both of us have proposed?
The other guy's approach is going to be more efficient in time, given that the input string needs to be reevaluated each time. If the input string is long, it also won't require the extra memory that splitting the string would.
If I'm trying to code a really tight loop I prefer to directly use array/string operators rather than LINQ to avoid that additional overhead:
static string inputString = "AOVALUE1|ABVALUE2|ACVALUE3|ADVALUE4";
static string FindString(string tag)
{
    int startIndex;
    if (inputString.StartsWith(tag))
    {
        // The tag is on the first chunk, which has no leading pipe.
        startIndex = tag.Length;
    }
    else
    {
        startIndex = inputString.IndexOf(string.Format("|{0}", tag));
        if (startIndex == -1)
            return string.Empty; // tag not found
        startIndex += tag.Length + 1; // skip the pipe and the tag
    }
    // The value ends at the next pipe, or at the end of the string.
    int endIndex = inputString.IndexOf('|', startIndex);
    if (endIndex == -1)
        endIndex = inputString.Length;
    return inputString.Substring(startIndex, endIndex - startIndex);
}
I've done a lot of parsing in C# and I would probably take the approach suggested by the "other guys" just because it would be a bit lighter on resources used and likely to be a little faster as well.
That said, as long as the data isn't too big, there's nothing wrong with the first approach and it will be much easier to program.
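For comparison, the first approach (split, then loop) is only a few lines. This is a sketch of what I understand the asker's version to be, not their actual code; TagLookup and FindValue are hypothetical names, and returning an empty string when the tag is missing is an assumption.

```csharp
using System;

public static class TagLookup
{
    // Hypothetical helper: splits on '|' and returns the value of the first chunk
    // that starts with the given tag, or an empty string if no chunk matches.
    public static string FindValue(string input, string tag)
    {
        foreach (string chunk in input.Split('|'))
        {
            if (chunk.StartsWith(tag, StringComparison.Ordinal))
                return chunk.Substring(tag.Length);   // drop the tag, keep the value
        }
        return string.Empty;
    }
}
```

It allocates an array and one string per chunk on every call, which is the cost the IndexOf version avoids.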
Something like this may work ok
string myString = "AOVALUE1|ABVALUE2|ACVALUE3|ADVALUE4";
string selector = "AB";
var results = myString.Split('|').Where(x => x.StartsWith(selector)).Select(x => x.Substring(selector.Length));
Returns: a sequence of the matches, in this case just one, "VALUE2".
If you are just looking for the first or only match, this will work:
string result = myString.Split('|').Where(x => x.StartsWith(selector)).Select(x => x.Substring(selector.Length)).FirstOrDefault();
SubString does not parse the string.
IndexOf does parse the string.
My preference would be the Split method, primarily for coding efficiency:
string[] inputArr = input.Split('|').Select(s => s.Substring(2)).ToArray();
is pretty concise. How many LoC does the Substring/IndexOf method take?

C# Regex.Match to decimal

I have a string "-4.00 %" which I need to convert to a decimal so that I can declare it as a variable and use it later. The string itself is found in string[] rows. My code is as follows:
foreach (string[] row in rows)
{
    string row1 = row[0].ToString();
    Match rownum = Regex.Match(row1.ToString(), @"\-?\d+\.+?\d+[^%]");
    string act = Convert.ToString(rownum); //wouldn't convert match to decimal
    decimal actual = Convert.ToDecimal(act);
    textBox1.Text = (actual.ToString());
}
This results in "Input string was not in a correct format." Any ideas?
Thanks.
I see two things happening here that could contribute.
You are treating the result of Regex.Match as though it were a string, but Regex.Match returns a Match object.
Rather than converting rownum to a string, you need to look at rownum.Groups[0].Value.
Secondly, you have no parenthesised group to capture. @"(\-?\d+\.+?\d+)%" will create a capture group around the number. This may not matter, I don't know how C# behaves in this circumstance exactly, but if you start stretching your regexes you will want to use bracketed capture groups, so you might as well start as you mean to go on.
Here's a modified version of your code that changes the regex to use a capturing group and explicitly look for a %. As a consequence, this also simplifies the parsing to decimal (no longer need an intermediary string):
EDIT : check rownum.Success as per executor's suggestion in comments
string[] rows = new [] {"abc -4.01%", "def 6.45%", "monkey" };
foreach (string row in rows)
{
    //regex captures number but not %
    Match rownum = Regex.Match(row, @"(\-?\d+\.+?\d+)%");
    //check for match
    if(!rownum.Success) continue;
    //get value of first (and only) capture
    string capture = rownum.Groups[1].Value;
    //convert to decimal
    decimal actual = decimal.Parse(capture);
    //TODO: do something with actual
}
If you're going to use the Match class to handle this, you have to go through the Match.Groups property, which holds the groups captured by the match; Groups[0] is the entire match. If you can guarantee that there is always exactly one match, you could get it with:
string act = rownum.Groups[0].Value;
Otherwise you'll need to parse through it as in the MSDN documentation.
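Putting the pieces from these answers together, a self-contained version might look like the sketch below. PercentParser and TryParsePercent are hypothetical names. The pattern is a slightly stricter variant of the ones above and also allows whitespace before the %, as in the "-4.00 %" input from the question. Parsing with CultureInfo.InvariantCulture is my own assumption, made so that "." is always read as the decimal separator regardless of the machine's regional settings.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class PercentParser
{
    // Hypothetical helper: extracts a number that is followed by '%' and parses it as a decimal.
    // Returns false when the text contains no such number.
    public static bool TryParsePercent(string text, out decimal value)
    {
        value = 0m;
        // Group 1 captures an optional minus sign, digits, and an optional fractional part.
        Match match = Regex.Match(text, @"(-?\d+(?:\.\d+)?)\s*%");
        if (!match.Success) return false;

        return decimal.TryParse(match.Groups[1].Value, NumberStyles.Number,
                                CultureInfo.InvariantCulture, out value);
    }
}
```

Using TryParse instead of Parse means a row without a percentage is simply skipped by the caller instead of raising a FormatException.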

Find if string contains at least 2 characters similar to another? C#

I need a method to check if a string contains one or more similar characters to another. I don't want to find all strings containing the letter "D".
For example, if I have a string "Christopher" and want to see if "Chris" is contained in "Christopher", I want that to return. However, if I want to see if "Candy" is in the string "Christopher", I don't want it to return just because it has a "C" in common.
I have tried the .Contains() method but can't give it rules for 2 or more similar characters, and I have thought about using regular expressions, but that might be a bit overkill. The similar letters must be next to each other.
Thank you :)
This takes each 2-character substring (bigram) of s1 and looks for it in s2.
string s1 = "Chrx";
string s2 = "Christopher";
IsMatchOn2Characters(s1, s2);
static bool IsMatchOn2Characters(string a, string b)
{
    string s1 = a.ToLowerInvariant();
    string s2 = b.ToLowerInvariant();
    for (int i = 0; i < s1.Length - 1; i++)
    {
        if (s2.IndexOf(s1.Substring(i,2)) >= 0)
            return true; // match
    }
    return false; // no match
}
This looks a lot like the longest common substring problem, which can be solved with dynamic programming in O(m*n).
If you are not worried about performance and don't really want to implement this, you can also go with the brute-force solution of searching for every substring of s1 in s2.
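For reference, the dynamic-programming version is short. The sketch below computes only the length of the longest common substring, which is enough to answer "do they share at least 2 adjacent characters". CommonSubstring and LongestCommonSubstringLength are made-up names, and the case-insensitive comparison is an assumption carried over from the answer above.

```csharp
using System;

public static class CommonSubstring
{
    // Hypothetical helper: length of the longest run of adjacent characters shared by a and b,
    // compared case-insensitively. Runs in O(m*n) time and O(n) extra space.
    public static int LongestCommonSubstringLength(string a, string b)
    {
        int best = 0;
        int[] previous = new int[b.Length + 1];   // row for a[0..i-2]
        for (int i = 1; i <= a.Length; i++)
        {
            int[] current = new int[b.Length + 1];
            for (int j = 1; j <= b.Length; j++)
            {
                if (char.ToLowerInvariant(a[i - 1]) == char.ToLowerInvariant(b[j - 1]))
                {
                    // Extend the common run that ended at the previous character of both strings.
                    current[j] = previous[j - 1] + 1;
                    best = Math.Max(best, current[j]);
                }
            }
            previous = current;
        }
        return best;
    }
}
```

The check from the question then becomes LongestCommonSubstringLength(s1, s2) >= 2: "Chris" against "Christopher" passes, while "Candy" against "Christopher" shares only a single "C" and does not.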
