C# Get string between two characters in a string

I have a string like below:
{{"textA","textB","textC"}}
And currently, I'm using below code to split them:
string stringIWantToSplit = "{{\"textA\",\"textB\",\"textC\"}}";
string[] result = stringIWantToSplit.Split(',');
And I can get the below result:
{{"textA"
"textB"
"textC"}}
After that, I can manually trim the '{' and '}' characters to get the final result. The problem is that if the string looks like this:
{{"textA","textB,textD","textC"}}
then the result differs from what I expect.
Expected result:
"textA"
"textB,textD"
"textC"
Actual result:
{{"textA"
"textB
textD"
"textC"}}
How can I get the string between two double quotes?
Update:
When I checked the data just now, I found that some of it contains numbers, including decimals, e.g.
{{"textA","textB","",0,9.384,"textC"}}
Currently I'm trying Jenish Rabadiya's approach, and the regex I'm using is
(["'])(?:(?=(\\?))\2.)*?\1
but with this regex the numbers aren't matched. How can I modify it so that integers and decimals are matched as well?

Try using a regex like the following.
Regex regex = new Regex(@"([""'])(?:(?=(\\?))\2.)*?\1");
foreach (var match in regex.Matches("{{\"textA\",\"textB\",\"textC\"}}"))
{
    Console.WriteLine(match);
}
Here is a working dotnet fiddle => Link

Assuming your string will always look like your examples, you can use a simple regular expression to get your strings out:
string s = "{{\"textA\",\"textB,textD\",\"textC\"}}";
foreach (Match m in Regex.Matches(s, "\\\".*?\\\""))
{
//do stuff
}

I think this will help you,
List<string> specialChars = new List<string>() {",", "{{","}}" };
string stringIWantToSplit = "{{\"textA\",\"textB,textD\",\"textC\"}}";
string[] result = stringIWantToSplit.Split(new char[] {'"'}, StringSplitOptions.RemoveEmptyEntries)
.Where(text => !specialChars.Contains(text)).ToArray();

Using this regex makes it simple:
text = Regex.Replace(text, @"^[\s,]+|[\s,]+$", "");

I finally modified the regex to this:
(["'])(?:(?=(\\?))\2.)*?\1|(\d*\.?\d*)[^"' {},]
And this finally works:
Sample:
https://dotnetfiddle.net/vg4jUh
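
For the updated input with empty strings and bare numbers, a simpler pattern without back-references may be easier to maintain. The sketch below assumes that quoted fields never contain an escaped quote and that numbers are plain integers or decimals. The FieldExtractor class and its Extract method are hypothetical names, not taken from any answer above.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FieldExtractor
{
    // Either a double-quoted field (assumption: no escaped quotes inside)
    // or a bare integer/decimal such as 0 or 9.384.
    private static readonly Regex FieldPattern =
        new Regex(@"""(?<text>[^""]*)""|(?<number>-?\d+(?:\.\d+)?)");

    // Returns quoted fields without their quotes and numbers as written.
    public static List<string> Extract(string input)
    {
        var fields = new List<string>();
        foreach (Match m in FieldPattern.Matches(input))
        {
            // The "text" group also succeeds for an empty field ("").
            fields.Add(m.Groups["text"].Success
                ? m.Groups["text"].Value
                : m.Groups["number"].Value);
        }
        return fields;
    }
}
```

Calling FieldExtractor.Extract on {{"textA","textB,textD","",0,9.384,"textC"}} should give six entries: textA, textB,textD, an empty string, 0, 9.384 and textC. If the data can contain escaped quotes or nested braces, a real parser is probably the safer choice.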

Related

Using regex to remove everything that is not in between '<#'something'#>' and replace it with commas

I have a string, for example
<#String1#> + <#String2#> , <#String3#> --<#String4#>
And I want to use regex/string manipulation to get the following result:
<#String1#>,<#String2#>,<#String3#>,<#String4#>
I don't really have any experience doing this, any tips?

There are multiple ways to do something like this, and it depends on exactly what you need. However, if you want to use a single regex operation to do it, and you only want to fix stuff that comes between the bracketed strings, then you could do this:
string input = "<#String1#> + <#String2#> , <#String3#> --<#String4#>";
string pattern = "(?<=>)[^<>]+(?=<)";
string replacement = ",";
string result = Regex.Replace(input, pattern, replacement);
The pattern uses [^<>]+ to match any non-pointy-bracket characters, but it combines it with a look-behind statement ((?<=>)) and a look-ahead statement (?=<) to make sure that it only matches text that occurs between a closing and another opening set of brackets.
If you need to remove text that comes before the first < or after the last >, or if you find the look-around statements confusing, you may want to consider simply matching the text that comes between the brackets and then loop through all the matches and build a new string yourself, rather than using the Regex.Replace method. For instance:
string input = "sdfg<#String1#> + <#String2#> , <#String3#> --<#String4#>ag";
string pattern = @"<[^<>]+>";
List<string> values = new List<string>();
foreach (Match m in Regex.Matches(input, pattern))
values.Add(m.Value);
string result = String.Join(",", values);
Or, the same thing using LINQ:
string input = "sdfg<#String1#> + <#String2#> , <#String3#> --<#String4#>ag";
string pattern = @"<[^<>]+>";
string result = String.Join(",", Regex.Matches(input, pattern).Cast<Match>().Select(x => x.Value));

If you're just after string manipulation and don't necessarily need a regex, you could simply use the string.Replace method.
yourString = yourString.Replace("#> + <#", "#>,<#");

Get only Whole Words from a .Contains() statement

I've used .Contains() to check whether a sentence contains a specific word, but I found something odd.
I wanted to find out whether the word "hi" is present in each of the following sentences:
The child wanted to play in the mud
Hi there
Hector had a hip problem
if (sentence.Contains("hi"))
{
    //
}
I only want the SECOND sentence to match; however, all three match, because "child" contains "hi" and so does "hip". How do I use .Contains() so that only whole words are picked out?

Try using Regex:
if (Regex.Match(sentence, @"\bhi\b", RegexOptions.IgnoreCase).Success)
{
    //
}
This works just fine for me on your input text.

Here's a Regex solution:
Regex has a Word Boundary Anchor using \b
Also, if the search string might come from user input, you might consider escaping the string using Regex.Escape
This example should filter a list of strings the way you want.
string findme = "hi";
string pattern = @"\b" + Regex.Escape(findme) + @"\b";
Regex re = new Regex(pattern,RegexOptions.IgnoreCase);
List<string> data = new List<string> {
"The child wanted to play in the mud",
"Hi there",
"Hector had a hip problem"
};
var filtered = data.Where(d => re.IsMatch(d));
DotNetFiddle Example

You could split your sentence into words - you could split at each space and then trim any punctuation. Then check if any of these words are 'hi':
var punctuation = sentence.Where(Char.IsPunctuation).Distinct().ToArray();
var words = sentence.Split().Select(x => x.Trim(punctuation));
var containsHi = words.Contains("hi", StringComparer.OrdinalIgnoreCase);
See a working demo here: https://dotnetfiddle.net/AomXWx

You could write your own extension method for string like:
static class StringExtension
{
public static bool ContainsWord(this string s, string word)
{
string[] ar = s.Split(' ');
foreach (string str in ar)
{
if (str.ToLower() == word.ToLower())
return true;
}
return false;
}
}

Splitting string on multi-character delimiter

string IdStr = "ID03I010102010210AEMPD4677EID03I020102020208L8159734ID03I030102030210IPS1406974PT03T010109981815938030202PT03T0201109899488666030201PT03T0301109818159381030203PT03T040112919818159381030201";
string[] stringSeparators = new string[] { "ID03I0" };
string[] result;
result = IdStr.Split(stringSeparators, StringSplitOptions.RemoveEmptyEntries);
This is the result:
result[0]=10102010210AEMPD4677E
result[1]=20102020208L8159734
result[2]=30102030210IPS1406974PT03T010109981815938030202PT03T0201109899488666030201PT03T0301109818159381030203PT03T040112919818159381030201
Desired result:
result[0]=ID03I010102010210AEMPD4677E
result[1]=ID03I020102020208L8159734
result[2]=ID03I030102030210IPS1406974PT03T010109981815938030202PT03T0201109899488666030201PT03T0301109818159381030203PT03T040112919818159381030201
As you can see I want to include delimiter ID03I0 to the elements.
NOTE: I know I can include it by hardcoding it. But that's not the way I want to do it.

result = IdStr.Split(stringSeparators, StringSplitOptions.RemoveEmptyEntries)
.Select(x => stringSeparators[0] + x).ToArray();
This adds the separator to the beginning of every element in your array.
EDIT: Unfortunately, this approach limits you to a single delimiter. If you need more than one, you'd use Regex instead.
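
To illustrate the multi-delimiter case mentioned in the edit, here is a rough sketch that splits before each occurrence of any delimiter, so every piece keeps its leading delimiter. The DelimiterSplitter class and SplitKeepingDelimiters method are hypothetical names, and the sketch assumes at least one non-empty delimiter is passed.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class DelimiterSplitter
{
    // Splits before every occurrence of any delimiter, so each piece
    // keeps the delimiter it starts with.
    public static string[] SplitKeepingDelimiters(string input, params string[] delimiters)
    {
        // Escape each delimiter so it is matched literally.
        string alternation = string.Join("|", delimiters.Select(Regex.Escape));

        // (?!^) skips the very start of the string (avoids an empty first piece);
        // (?=...) is a zero-width look-ahead, so no characters are consumed.
        string pattern = "(?!^)(?=" + alternation + ")";
        return Regex.Split(input, pattern);
    }
}
```

For example, SplitKeepingDelimiters("ID03I0AID03I0BPT03T0C", "ID03I0", "PT03T0") should return the three pieces ID03I0A, ID03I0B and PT03T0C.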

The following regex pattern should work.
string input = "ID03I010102010210AEMPD4677EID03I020102020208L8159734ID03I030102030210IPS1406974PT03T010109981815938030202PT03T0201109899488666030201PT03T0301109818159381030203PT03T040112919818159381030201";
string delimiter = "ID03I0";//Modify it as you need
string pattern = string.Format("(?<=.)(?={0})", delimiter);
string[] result = Regex.Split(input, pattern);
Online Demo
Adapted from this answer.

Regular expression to split string and number

I have a string of the form:
codename123
Is there a regular expression that can be used with Regex.Split() to split the alphabetic part and the numeric part into a two-element string array?

I know you asked for the Split method, but as an alternative you could use named capturing groups:
var numAlpha = new Regex("(?<Alpha>[a-zA-Z]*)(?<Numeric>[0-9]*)");
var match = numAlpha.Match("codename123");
var alpha = match.Groups["Alpha"].Value;
var num = match.Groups["Numeric"].Value;

splitArray = Regex.Split("codename123", @"(?<=\p{L})(?=\p{N})");
will split between a Unicode letter and a Unicode digit.

Regex is a little heavy handed for this, if your string is always of that form. You could use
"codename123".IndexOfAny(new char[] {'1','2','3','4','5','6','7','8','9','0'})
and two calls to Substring.
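
A minimal sketch of that IndexOfAny plus Substring idea could look like the following. The AlphaNumericSplitter class name is made up for illustration, and the sketch assumes the input has the form letters followed by digits; if no digit is found, the second element is returned empty.

```csharp
using System;

public static class AlphaNumericSplitter
{
    private static readonly char[] Digits = "0123456789".ToCharArray();

    // Splits "codename123" style input at the first digit.
    public static string[] Split(string input)
    {
        int firstDigit = input.IndexOfAny(Digits);

        // No digit at all: the whole input is the alphabetic part.
        if (firstDigit < 0)
            return new[] { input, string.Empty };

        return new[]
        {
            input.Substring(0, firstDigit), // alphabetic part
            input.Substring(firstDigit)     // numeric part
        };
    }
}
```

AlphaNumericSplitter.Split("codename123") gives "codename" and "123".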

A little verbose, but
Regex.Split("codename123", @"(?<=[a-zA-Z])(?=\d)");
Can you be more specific about your requirements? Maybe a few other input examples.

IMO, it would be a lot easier to find matches, like:
Regex.Matches("codename123", @"[a-zA-Z]+|\d+")
.Cast<Match>()
.Select(m => m.Value)
.ToArray();
rather than to use Regex.Split.

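Well, it's a one-liner: Regex.Split("codename123", "^([a-z]+)");
Note that the resulting array also contains an empty first element, because the match starts at the beginning of the string.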

Another simpler way is
string originalstring = "codename123";
string alphabets = string.Empty;
string numbers = string.Empty;
foreach (char item in originalstring)
{
    if (Char.IsLetter(item))
        alphabets += item;
    if (Char.IsNumber(item))
        numbers += item;
}

This code is written in Java; the logic should be the same elsewhere.
public String splitStringAndNumber(String string) {
String pattern = "(?<Alpha>[a-zA-Z]*)(?<Numeric>[0-9]*)";
Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(string);
if (m.find()) {
return (m.group(1) + " " + m.group(2));
}
return "";
}

How can I split a string using regex to return a list of values?

How can I take the string foo[]=1&foo[]=5&foo[]=2 and return a collection with the values 1,5,2 in that order. I am looking for an answer using regex in C#. Thanks

In C# you can use capturing groups
private void RegexTest()
{
String input = "foo[]=1&foo[]=5&foo[]=2";
String pattern = @"foo\[\]=(\d+)";
Regex regex = new Regex(pattern);
foreach (Match match in regex.Matches(input))
{
Console.Out.WriteLine(match.Groups[1]);
}
}

I don't know C#, but...
In Java:
String[] nums = yourString.split("&?foo\\[\\]=");
The argument to the split() method is a regex telling the method where to split the String. The brackets must be escaped, and the result starts with an empty string because the input begins with a match.

I'd use this particular pattern:
string re = @"foo\[\]=(?<value>\d+)";
So something like (not tested):
Regex reValues = new Regex(re, RegexOptions.Compiled);
List<int> values = new List<int>();
foreach (Match m in reValues.Matches(input)) // input: the string to parse
{
    values.Add(int.Parse(m.Groups["value"].Value));
}

Use the Regex.Split() method with an appropriate regex. This will split on parts of the string that match the regular expression and return the results as a string[].
Assuming you want all the values in your querystring without checking if they're numeric, (and without just matching on names like foo[]) you could use this: "&?[^&=]+="
string[] values = Regex.Split("foo[]=1&foo[]=5&foo[]=2", "&?[^&=]+=");
Incidentally, if you're playing with regular expressions the site http://gskinner.com/RegExr/ is fantastic (I'm just a fan).

Assuming you're dealing with numbers this pattern should match:
/=(\d+)&?/
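
In C#, that pattern could be applied roughly as follows. This is only a sketch: the QueryValueParser class and ParseValues method are hypothetical names, and it assumes every value fits in an int.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class QueryValueParser
{
    // Collects the digits after each '=' in order of appearance.
    public static List<int> ParseValues(string query)
    {
        var values = new List<int>();
        foreach (Match m in Regex.Matches(query, @"=(\d+)&?"))
        {
            // Group 1 holds only the digits, without '=' or '&'.
            // Assumption: the number is small enough for int.Parse.
            values.Add(int.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture));
        }
        return values;
    }
}
```

QueryValueParser.ParseValues("foo[]=1&foo[]=5&foo[]=2") returns 1, 5, 2 in that order. Because the trailing & is optional, the last value is picked up as well.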

This should do:
using System.Text.RegularExpressions;
Regex.Replace(s, @"[^0-9]", "");
Where s is the string from which you want the numbers to be extracted; the call removes every character that is not a digit.

Just make sure to escape the ampersand like so:
/=(\d+)\&/

Here's an alternative solution using the built-in string.Split function:
string x = "foo[]=1&foo[]=5&foo[]=2";
string[] separator = new string[2] { "foo[]=", "&" };
string[] vals = x.Split(separator, StringSplitOptions.RemoveEmptyEntries);

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.
