Split string with plus sign as a delimiter - c#

I have an issue with a string containing the plus sign (+).
I want to split that string (or solve my problem some other way, if there is one):
string ColumnPlusLevel = "+-J10+-J10+-J10+-J10+-J10";
string strpluslevel = "";
strpluslevel = ColumnPlusLevel;
string[] strpluslevel_lines = Regex.Split(strpluslevel, "+");
foreach (string line in strpluslevel_lines)
{
    MessageBox.Show(line);
    strpluslevel_summa = strpluslevel_summa + line;
}
MessageBox.Show(strpluslevel_summa, "summa sumarum");
The MessageBox calls are only for testing.
The ColumnPlusLevel string can vary a lot, but it is always a repeated pattern starting with the plus sign,
e.g. "+MJ+MJ+MJ" or "+PPL14.1+PPL14.1+PPL14.1".
(It comes from another piece of software, and I can't edit that software's output.)
How can I find out what that pattern is that is being repeated?
In these examples that is +-J10, +MJ or +PPL14.1.
Above I only show the result in a MessageBox, but later on I want the repeated pattern stored in a string.
Maybe I'm doing it wrong by using Split, or I'm using Split in the wrong way.
Maybe there is another solution.
Hope you understand my problem and the result I want.
Thanks for any advice.
/Tomas

How can I find out what that pattern is that is being repeated?
Maybe I didn't understand the requirement fully, but isn't it as easy as:
string[] tokens = ColumnPlusLevel.Split(new[]{'+'}, StringSplitOptions.RemoveEmptyEntries);
string first = tokens[0];
bool repeatingPattern = tokens.Skip(1).All(s => s == first);
If repeatingPattern is true you know that the pattern itself is first.
Can you maybe explain how the logic works?
The line that contains tokens.Skip(1) is a LINQ query, so you need to add using System.Linq; at the top of your code file. Since tokens is a string[], which implements IEnumerable<string>, you can use any LINQ extension method on it. Enumerable.Skip(1) skips the first token, because I have already stored it in a variable and want to know whether all the others are the same. For that I use All, which returns false as soon as one item doesn't match the condition (that is, one string differs from the first). If all are the same, you know there is a repeating pattern, and it is already stored in the variable first (without its leading plus sign, which Split removed).
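The check described above can be wrapped in a small helper. This is only a minimal sketch: the names PatternFinder and FindRepeatedPattern are hypothetical, and it assumes the input starts with the plus sign, as stated in the question. It puts the '+' back in front of the token because Split drops it.

```csharp
using System;
using System.Linq;

public static class PatternFinder
{
    // Returns the repeated unit including its leading '+',
    // or null when the tokens are not all identical (or there are none).
    public static string FindRepeatedPattern(string input)
    {
        string[] tokens = input.Split(new[] { '+' }, StringSplitOptions.RemoveEmptyEntries);
        if (tokens.Length == 0)
            return null;

        string first = tokens[0];
        // All() stops at the first token that differs from the first one.
        bool repeating = tokens.Skip(1).All(t => t == first);
        return repeating ? "+" + first : null;
    }

    public static void Main()
    {
        Console.WriteLine(FindRepeatedPattern("+-J10+-J10+-J10+-J10+-J10")); // +-J10
        Console.WriteLine(FindRepeatedPattern("+PPL14.1+PPL14.1+PPL14.1"));  // +PPL14.1
    }
}
```

Returning null for a non-repeating input is a design choice of this sketch; the caller could just as well get an empty string or an exception.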

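You should use the String.Split method:
// RemoveEmptyEntries drops the empty entry before the leading '+', so index 0 is the first pattern without its '+'
string pattern = ColumnPlusLevel.Split(new[] { '+' }, StringSplitOptions.RemoveEmptyEntries)[0];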

...but it is always a repeated pattern starting with the plus sign.
Why do you even need String.Split() here if the pattern always only repeats itself?
string input = @"+MJ+MJ+MJ";
int indexOfSecondPlus = input.IndexOf('+', 1);
string pattern = input.Remove(indexOfSecondPlus, input.Length - indexOfSecondPlus);
//pattern is now "+MJ"
No need for String.Split and no need for LINQ.
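One caveat with the IndexOf approach: when the pattern occurs only once there is no second plus sign, IndexOf returns -1, and Remove would throw. A hedged sketch of a guarded version follows; the names FirstPattern and Extract are hypothetical, and it assumes the input begins with '+'.

```csharp
using System;

public static class FirstPattern
{
    public static string Extract(string input)
    {
        if (string.IsNullOrEmpty(input))
            return input;

        // Start searching at index 1 so the leading '+' itself is skipped.
        int indexOfSecondPlus = input.IndexOf('+', 1);

        // No second '+': the pattern occurs once, so the whole input is the pattern.
        return indexOfSecondPlus < 0 ? input : input.Substring(0, indexOfSecondPlus);
    }

    public static void Main()
    {
        Console.WriteLine(Extract("+MJ+MJ+MJ")); // +MJ
        Console.WriteLine(Extract("+PPL14.1"));  // +PPL14.1
    }
}
```

Unlike the Split-based check, this does not verify that the rest of the string really repeats the same pattern; it relies on the question's guarantee that it does.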

String has a method called Split, which lets you split the string on a given character or set of characters:
string givenString = "+-J10+-J10+-J10+-J10+-J10";
// '+' is the delimiter; RemoveEmptyEntries drops the empty entry before the first '+', so index 0 is "-J10"
string splitString = givenString.Split(new[] { '+' }, StringSplitOptions.RemoveEmptyEntries)[0];
// Replace swaps every occurrence of the first argument for the second; here it strips the '-' and leaves "J10"
string result = splitString.Replace("-", "");

Related

How to remove word from the string in a generic way

I have a string that is basically a URL, something like APIPAth/resources/customers/SSNNumber/authorizations/contracts.
SSNNumber can be any value. It is the actual SSN, which I want to remove so that the string looks like APIPAth/resources/customers/authorizations/contracts.
I can't find a proper solution that removes it without hardcoding the value.
I tried Find and Replace, but I think those functions require the particular word.
Looking at the URL you provided, it appears you only want to get rid of digits. You could accomplish that with this line:
var output = Regex.Replace(input, @"[\d]", string.Empty);
There are many ways to skin this cat depending on what stays static.
One way would be to split the url by separators and join them back
var url = @"APIPAth/resources/customers/SSNNumber/authorizations/contracts";
var items = new List<string>(url.Split('/'));
items.RemoveAt(3);
url = string.Join("/", items);
Another way would be to use Regex
var url = @"APIPAth/resources/customers/SSNNumber/authorizations/contracts";
url = Regex.Replace(url, @"/customers/[^/]+/authorizations/", "/customers/authorizations/");
If you elaborate on what you expect in a generic solution, i.e. what part stays static, then I can help you out better
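One possible reading of "generic" is that the segment name in front of the SSN (customers) stays static while its position in the URL may change. Under that assumption, the split-and-join idea can be sketched as follows; UrlCleaner and RemoveSegmentAfter are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;

public static class UrlCleaner
{
    // Removes the path segment that directly follows anchorSegment.
    // The URL is returned unchanged when the anchor is missing or is the last segment.
    public static string RemoveSegmentAfter(string url, string anchorSegment)
    {
        var items = new List<string>(url.Split('/'));
        int anchorIndex = items.IndexOf(anchorSegment);
        if (anchorIndex >= 0 && anchorIndex + 1 < items.Count)
            items.RemoveAt(anchorIndex + 1);
        return string.Join("/", items);
    }

    public static void Main()
    {
        string url = "APIPAth/resources/customers/123456789/authorizations/contracts";
        Console.WriteLine(RemoveSegmentAfter(url, "customers"));
        // APIPAth/resources/customers/authorizations/contracts
    }
}
```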
If the SSN is a number, it has the form 000 00 0000, that is, nine consecutive digits.
Take the string, split it by /, and you get an array of elements, say parsed:
for (int i = 0; i < parsed.Length; i++)
{
    if (parsed[i].Length == 9)
    {
        // ...keep this i...
    }
}
Then remove parsed[i] from parsed and join the remaining elements with /.

Is there a way to get the length of a return value in the same line of code?

I'm not sure if I worded that right, but here's what I'm looking for.
I would like to do something like this:
string lastWord = words.Split(':')[splitResult.Length -1];
Is there any way to make that happen, or must I store the array first?
Use LINQ's LastOrDefault extension method:
string lastword = words.Split(':').LastOrDefault();
If I use Split, wouldn't I be splitting it twice?
It depends.
If you do the following, yes, you are splitting twice:
string lastWord = words.Split(':')[words.Split(':').Length - 1];
If you use a temporary variable for the result, you need to call Split only once:
var splits = words.Split(':');
string lastWord = splits[splits.Length - 1];
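If only the last element is needed, the array can be avoided altogether. The sketch below swaps Split for LastIndexOf plus Substring, which is a different technique from the answers above; the names LastWord and Of are hypothetical.

```csharp
using System;

public static class LastWord
{
    public static string Of(string words, char separator)
    {
        // LastIndexOf returns -1 when the separator is absent,
        // so adding 1 gives 0 and the whole string is returned.
        int lastSeparator = words.LastIndexOf(separator);
        return words.Substring(lastSeparator + 1);
    }

    public static void Main()
    {
        Console.WriteLine(Of("alpha:beta:gamma", ':')); // gamma
        Console.WriteLine(Of("single", ':'));           // single
    }
}
```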

how to remove last part of string in c#

I was trying to remove the last part of a string but failed. The string is D:\software\VS2012\newtext.txt and I want to trim its last section, here newtext.txt, so that I get D:\software\VS2012. How can I do this in C#? When I tried, it removed everything from the first '\' onward. Here is what I did:
string str = @"D:\softwares\VS2012\newtext.txt";
str= str.Remove(str.IndexOf('\\'));
Console.WriteLine(str);
There is a premade function for this in the framework
string str = @"D:\softwares\VS2012\newtext.txt";
string path = System.IO.Path.GetDirectoryName(str);
Note that your original code does not work because you are removing from the first backslash, not the last. Substitute this line to make your code work
str = str.Remove(str.LastIndexOf('\\'));
Try using System.IO.Path.GetDirectoryName(string):
string dirname = System.IO.Path.GetDirectoryName(@"D:\softwares\VS2012\newtext.txt");
To remove a known portion of a string you can simply use Replace.
In your case:
str = str.Replace("\\newtext.txt", ""); // gives the same result as System.IO.Path.GetDirectoryName, already suggested by gmiley, but stays in a plain string context as per your question
If you want to remove the last part of a string starting at the last occurrence of a known character, use the already suggested LastIndexOf('\\') together with Remove.
If you want a delimiter-based method that depends on the delimiter character rather than on the string format (a path, in your case), LastIndexOf(char) is the best option.
You could also split the string into an array and rejoin it after removing the last element:
var delimiter = '\\';
var strAy = str.Split(delimiter);
str = String.Join(delimiter.ToString(), strAy.Take(strAy.Length - 1)); // Take requires using System.Linq;
With this method you don't need to rely on the existence of the delimiter char in the string, and the result comes back without the delimiter char at the end.
Besides, you can easily create an extension method that takes the delimiter as a parameter.
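Such an extension with the delimiter as a parameter might look like the sketch below. It is built on LastIndexOf and Remove rather than on Split and Join, and the names StringExtensions and RemoveLastSegment are hypothetical.

```csharp
using System;

public static class StringExtensions
{
    // Cuts the string at the last occurrence of delimiter.
    // The input is returned unchanged when the delimiter does not occur.
    public static string RemoveLastSegment(this string value, char delimiter)
    {
        int lastDelimiter = value.LastIndexOf(delimiter);
        return lastDelimiter < 0 ? value : value.Remove(lastDelimiter);
    }
}

public static class Demo
{
    public static void Main()
    {
        string str = @"D:\softwares\VS2012\newtext.txt";
        Console.WriteLine(str.RemoveLastSegment('\\')); // D:\softwares\VS2012
    }
}
```

Leaving the input untouched when the delimiter is missing is a choice made for this sketch; the Split and Join variant would return an empty string in that case.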
We should also check that the char exists:
string str = @"D:\softwares\VS2012\newtext.txt";
int rstr = str.LastIndexOf('\\');
if (rstr>0) str= str.Remove(rstr);
Console.WriteLine(str);

Most efficient way to parse a delimited string in C#

This has been asked a few different ways, but I am debating "my way" vs "your way" with another developer. The language is C#.
I want to parse a pipe-delimited string where the first 2 characters of each chunk are my tag.
The rules (not my rules, but rules I have been given and must follow):
I can't change the format of the string.
This function will possibly be called many times, so efficiency is key.
I need to keep it simple.
The input string and the tag I am looking for may/will change during runtime.
Example input string: AOVALUE1|ABVALUE2|ACVALUE3|ADVALUE4
Example tag I may need the value for: AB
My way: I split the string into an array on the delimiter and loop through the array each time the function is called. I then look at the first 2 characters of each element and return the value minus those 2 characters.
The "other guy's" way is to use a combination of IndexOf and Substring to find the start and end of the field I am looking for, then use Substring again to pull out the value minus the first 2 characters. So he would call IndexOf("|AB") and then find the next pipe in the string. These are the start and the end, and he extracts that range with Substring.
I would think that IndexOf and Substring scan the string character by character on every call, so this would be less efficient than working with large chunks and reading each chunk minus its first 2 characters. Or is there another way that is better than what both of us have proposed?
The other guy's approach is going to be more efficient in time, given that the input string needs to be reevaluated each time. If the input string is long, it also won't require the extra memory that splitting the string would.
If I'm trying to code a really tight loop I prefer to directly use array/string operators rather than LINQ to avoid that additional overhead:
// static so that the static method below can read it
static string inputString = "AOVALUE1|ABVALUE2|ACVALUE3|ADVALUE4";
static string FindString(string tag)
{
    int startIndex;
    if (inputString.StartsWith(tag))
    {
        // the tag is the very first chunk, so there is no leading pipe
        startIndex = tag.Length;
    }
    else
    {
        startIndex = inputString.IndexOf(string.Format("|{0}", tag));
        if (startIndex == -1)
            return string.Empty;
        // skip the pipe and the tag itself
        startIndex += tag.Length + 1;
    }
    int endIndex = inputString.IndexOf('|', startIndex);
    if (endIndex == -1)
        endIndex = inputString.Length;
    return inputString.Substring(startIndex, endIndex - startIndex);
}
I've done a lot of parsing in C# and I would probably take the approach suggested by the "other guys" just because it would be a bit lighter on resources used and likely to be a little faster as well.
That said, as long as the data isn't too big, there's nothing wrong with the first approach and it will be much easier to program.
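For comparison, the first approach (split, loop, compare the tag, strip it off) might be sketched like this. The names TagParser and FindValue are hypothetical, and the sketch assumes, as in the question, that every chunk starts with its tag.

```csharp
using System;

public static class TagParser
{
    // Returns the value of the first chunk that starts with tag,
    // or an empty string when no chunk matches.
    public static string FindValue(string input, string tag)
    {
        foreach (string chunk in input.Split('|'))
        {
            // Ordinal comparison: tags are compared character by character, ignoring culture rules.
            if (chunk.StartsWith(tag, StringComparison.Ordinal))
                return chunk.Substring(tag.Length);
        }
        return string.Empty;
    }

    public static void Main()
    {
        Console.WriteLine(FindValue("AOVALUE1|ABVALUE2|ACVALUE3|ADVALUE4", "AB")); // VALUE2
    }
}
```

Which variant is faster for real inputs is best settled by measuring both with representative data.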
Something like this may work ok
string myString = "AOVALUE1|ABVALUE2|ACVALUE3|ADVALUE4";
string selector = "AB";
var results = myString.Split('|').Where(x => x.StartsWith(selector)).Select(x => x.Substring(selector.Length));
Returns a list of the matches, in this case just one: "VALUE2".
If you are just looking for the first or only match, this will work:
string result = myString.Split('|').Where(x => x.StartsWith(selector)).Select(x => x.Substring(selector.Length)).FirstOrDefault();
Substring does not parse the string.
IndexOf does parse the string.
My preference would be the Split method, primarily for coding efficiency:
string[] inputArr = input.Split('|').Select(s => s.Substring(2)).ToArray();
is pretty concise. How many lines of code does the Substring/IndexOf method take?

C# Regex.Match to decimal

I have a string "-4.00 %" which I need to convert to a decimal so that I can declare it as a variable and use it later. The string itself is found in string[] rows. My code is as follows:
foreach (string[] row in rows)
{
    string row1 = row[0].ToString();
    Match rownum = Regex.Match(row1.ToString(), @"\-?\d+\.+?\d+[^%]");
    string act = Convert.ToString(rownum); //wouldn't convert match to decimal
    decimal actual = Convert.ToDecimal(act);
    textBox1.Text = (actual.ToString());
}
This results in "Input string was not in a correct format." Any ideas?
Thanks.
I see two things happening here that could contribute.
You are treating the Regex Match as though it were a string, but Regex.Match returns a Match object, which exposes the matched text through its groups.
Rather than converting rownum to a string, you need to look at rownum.Groups[0].Value.
Secondly, you have no parenthesised group to capture. @"(\-?\d+\.+?\d+)%" will create a capture group from the whole number. This may not matter, I don't know how C# behaves in this circumstance exactly, but if you start stretching your regexes you will want to use bracketed capture groups, so you might as well start as you mean to go on.
Here's a modified version of your code that changes the regex to use a capturing group and explicitly look for a %. As a consequence, this also simplifies the parsing to decimal (no longer need an intermediary string):
EDIT: check rownum.Success, as per executor's suggestion in the comments
string[] rows = new[] { "abc -4.01%", "def 6.45%", "monkey" };
foreach (string row in rows)
{
    //regex captures number but not %
    Match rownum = Regex.Match(row.ToString(), @"(\-?\d+\.+?\d+)%");
    //check for match
    if (!rownum.Success) continue;
    //get value of first (and only) capture
    string capture = rownum.Groups[1].Value;
    //convert to decimal
    decimal actual = decimal.Parse(capture);
    //TODO: do something with actual
}
If you're going to use the Match class to handle this, then you have to access the Match.Groups property to get the collection of captured groups. This class assumes that more than one group may appear. If you can guarantee that you'll always get one and only one, you could get it with:
string act = rownum.Groups[0].Value;
Otherwise you'll need to iterate through the collection as in the MSDN documentation.
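Note that the string in the question, "-4.00 %", has a space between the number and the percent sign, which a pattern ending directly in % would not match. The sketch below allows optional whitespace and parses with the invariant culture, so the '.' is read as the decimal separator regardless of the machine's regional settings. PercentParser and TryParsePercent are hypothetical names, and the pattern is an assumption about the input format rather than something given in the question.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class PercentParser
{
    // Optional minus sign, digits, optional fractional part, optional whitespace, then '%'.
    private static readonly Regex PercentPattern = new Regex(@"(-?\d+(?:\.\d+)?)\s*%");

    public static bool TryParsePercent(string text, out decimal value)
    {
        value = 0m;
        Match match = PercentPattern.Match(text);
        if (!match.Success)
            return false;

        // Groups[1] is the number without the percent sign.
        return decimal.TryParse(match.Groups[1].Value, NumberStyles.Number,
                                CultureInfo.InvariantCulture, out value);
    }

    public static void Main()
    {
        decimal actual;
        if (TryParsePercent("-4.00 %", out actual))
            Console.WriteLine(actual.ToString(CultureInfo.InvariantCulture)); // -4.00
    }
}
```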
