Remove First and Last Specific Char - c#

I don't know whether this is a duplicate, but I couldn't find the same question (maybe because it is hard to title).
I have this string:
string a = "(Hello(World),World(Hello))";
I need to remove the first opening bracket and the last closing bracket, to get this output:
Hello(World),World(Hello)
I don't want to remove the first and last characters of the string.
I want to remove the first occurrence of a specific character (the opening bracket) and the last occurrence of another (the closing bracket).
That is, if the string is:
string a = "gyfw(Hello(World),World(Hello))";
the output should be:
gyfw Hello(World),World(Hello)

To remove first specific char:
a = a.Remove(a.IndexOf("("), 1);
To remove last specific char:
a = a.Remove(a.LastIndexOf(")"), 1);
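Both calls above throw an ArgumentOutOfRangeException when the character is absent, because IndexOf and LastIndexOf return -1 in that case. A minimal sketch that guards against this, assuming a hypothetical helper named RemoveOuter (not part of the original answer):

```csharp
using System;

public static class BracketTrimmer
{
    // Hypothetical helper: removes the first occurrence of `open` and the
    // last occurrence of `close`. The string is returned unchanged if either
    // character is missing or the last `close` comes before the first `open`.
    public static string RemoveOuter(string s, char open, char close)
    {
        int first = s.IndexOf(open);
        int last = s.LastIndexOf(close);
        if (first < 0 || last < 0 || last <= first)
            return s;

        // Remove the later index first so the earlier index stays valid.
        return s.Remove(last, 1).Remove(first, 1);
    }

    public static void Main()
    {
        Console.WriteLine(RemoveOuter("(Hello(World),World(Hello))", '(', ')'));
    }
}
```

Removing the closing bracket first avoids having to adjust its index after the opening bracket is gone.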

In a balanced way, it can be done with this regex
Find @"\(((?>[^()]+|\((?<Depth>)|\)(?<-Depth>))*(?(Depth)(?!)))\)"
Replace @"$1"
If it is required to have inner parens change the * to a +.
If it is required that it should only match once and span the string, add ^ and $ to beginning / end respectively to the regex.
Here is the regex explained
\( # Match ( a open parenth
( # (1 start), Capture the core, to be written back
(?> # Then either match (possessively):
[^()]+ # Any character except parenths
| # or
\( # Open ( increase the paren counter
(?<Depth> )
| # or
\) # Close ) decrease the paren counter
(?<-Depth> )
)* # Repeat as needed.
(?(Depth) # Assert that the paren counter is at zero.
(?!)
)
) # (1 end)
\) # Match ) a closing parenth
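As a rough sketch of how the find/replace pair above might be applied in C#, the following wraps it in Regex.Replace. The class and method names (BalancedParens, StripOuter) are assumptions made for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalancedParens
{
    // The balanced-parentheses pattern from the answer, as a verbatim string.
    private static readonly Regex Outer = new Regex(
        @"\(((?>[^()]+|\((?<Depth>)|\)(?<-Depth>))*(?(Depth)(?!)))\)");

    // Hypothetical helper: replaces every outermost balanced (...) with its
    // contents, i.e. with capture group 1.
    public static string StripOuter(string s)
    {
        return Outer.Replace(s, "$1");
    }

    public static void Main()
    {
        Console.WriteLine(StripOuter("(Hello(World),World(Hello))"));
    }
}
```

Note that Regex.Replace rewrites every match, so a string with several top-level groups loses the outer pair of each one.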

Use the String.Substring() method if the brackets are always the first and last characters of the string.
So, if your string is stored in a variable myval:
myval = myval.Substring(1, myval.Length - 2);

Related

Regex to get square brackets containing numbers only but are not within square brackets themselves

Sample String
"[] [ds*[000112]] [1448472995] sample string [1448472995] ***";
The regex should match
[1448472995] [1448472995]
and should not match [000112] since there is outer square bracket.
Currently I have this regex that is matching [000112] as well
const string unixTimeStampPattern = @"\[([0-9]+)]";
This is a good way to do it using balanced text.
( \[ \d+ \] ) # (1)
| # or,
\[ # Opening bracket
(?> # Then either match (possessively):
[^\[\]]+ # non - brackets
| # or
\[ # [ increase the bracket counter
(?<Depth> )
| # or
\] # ] decrease the bracket counter
(?<-Depth> )
)* # Repeat as needed.
(?(Depth) # Assert that the bracket counter is at zero
(?!)
)
\] # Closing bracket
C# sample
string sTestSample = "[] [ds*[000112]] [1448472995] sample string [1448472995] ***";
Regex RxBracket = new Regex(@"(\[\d+\])|\[(?>[^\[\]]+|\[(?<Depth>)|\](?<-Depth>))*(?(Depth)(?!))\]");
Match bracketMatch = RxBracket.Match(sTestSample);
while (bracketMatch.Success)
{
if (bracketMatch.Groups[1].Success)
Console.WriteLine("{0}", bracketMatch);
bracketMatch = bracketMatch.NextMatch();
}
Output
[1448472995]
[1448472995]
You need to use balancing groups to handle this - it looks a bit daunting but isn't all that complicated:
Regex regexObj = new Regex(
@"\[ # Match opening bracket.
\d+ # Match a number.
\] # Match closing bracket.
(?= # Assert that the following can be matched ahead:
(?> # The following group (made atomic to avoid backtracking):
[^\[\]]+ # One or more characters except brackets
| # or
\[ (?<Depth>) # an opening bracket (increase bracket counter)
| # or
\] (?<-Depth>) # a closing bracket (decrease bracket counter, can't go below 0).
)* # Repeat ad libitum.
(?(Depth)(?!)) # Assert that the bracket counter is now zero.
[^\[\]]* # Match any remaining non-bracket characters
\z # until the end of the string.
) # End of lookahead.",
RegexOptions.IgnorePatternWhitespace);
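A short usage sketch for the lookahead pattern above, compacted onto one line so it does not need IgnorePatternWhitespace. The class and method names (TopLevelBrackets, Find) are hypothetical; the sample input is the one from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TopLevelBrackets
{
    // [digits] followed, up to the end of the string, only by balanced brackets.
    private static readonly Regex TopLevel = new Regex(
        @"\[\d+\](?=(?>[^\[\]]+|\[(?<Depth>)|\](?<-Depth>))*(?(Depth)(?!))[^\[\]]*\z)");

    // Hypothetical helper: returns every [digits] group that is not nested.
    public static List<string> Find(string input)
    {
        var result = new List<string>();
        foreach (Match m in TopLevel.Matches(input))
            result.Add(m.Value);
        return result;
    }

    public static void Main()
    {
        string sample = "[] [ds*[000112]] [1448472995] sample string [1448472995] ***";
        foreach (string s in Find(sample))
            Console.WriteLine(s);
    }
}
```

The nested [000112] is rejected because the text after it begins with an unmatched closing bracket, so the lookahead cannot reach the end of the string.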
Are you just trying to capture the Unix timestamp? Then you can try a simpler pattern that fixes the number of digits matched in the group.
\[([0-9]{10})\]
Here I limit it to 10 digits since I doubt the timestamp will reach 11 digits anytime soon. To protect against that:
\[([0-9]{10,11})\]
Of course this could lead to false positives if you have a 10-length number in an enclosing bracket.
This will match your expression as expected: http://regexr.com/3csg3 it uses lookahead.

How can i match inner expression on nested expression with regular expressions?

I have this code in C#.
This works:
string code = "dqwdSTART12sdaSTART12312ENDsdfSTARTasdsaENDasdaENDqwe";
string pattern = "START[^(START)(END)]*END";
But this does not:
string code = "dqwdstart12sdastart12312endsdfstartasdsaendasdaendqwe";
string pattern = "start[^(start)(end)]*end";
How can I match the inner blocks?
(preferably in C#)
The pattern [^(start)(end)] does not mean what you think. It does not mean "none of these words" but "none of the characters listed between [ and ]".
The only reason it worked is that the text between START and END contained none of those characters; if you add a letter such as S, it won't match.
Use this pattern instead:
START((?:(?!START|END).)*)END
with the g and i options (in C#, use Regex.Matches with RegexOptions.IgnoreCase).
START # "START"
( # Capturing Group (1)
(?: # Non Capturing Group
(?! # Negative Look-Ahead
START # "START"
| # OR
END # "END"
) # End of Negative Look-Ahead
. # Any character except line break
) # End of Non Capturing Group
* # (zero or more)(greedy)
) # End of Capturing Group (1)
END # "END"
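A minimal C# sketch of the pattern explained above, run case-insensitively so it handles both sample strings from the question. The class and method names (InnerBlocks, Find) are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class InnerBlocks
{
    // Hypothetical helper: returns the text of every START...END block that
    // contains no further START or END, i.e. the innermost blocks.
    public static List<string> Find(string code)
    {
        var inner = new List<string>();
        MatchCollection matches = Regex.Matches(
            code, @"START((?:(?!START|END).)*)END", RegexOptions.IgnoreCase);
        foreach (Match m in matches)
            inner.Add(m.Groups[1].Value); // group 1 is the content between the markers
        return inner;
    }

    public static void Main()
    {
        foreach (string s in Find("dqwdstart12sdastart12312endsdfstartasdsaendasdaendqwe"))
            Console.WriteLine(s);
    }
}
```

The lookahead is tested before each character is consumed, which is why the match can never run across another START or END.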
(?<=start)(?:(?!start|end).)*(?=end)
You can try this as well if you don't want to capture start and end, just the content between them. See the demo:
http://regex101.com/r/yP3iB0/23

RegEx to match a number in the second line

I need a regex to match a number in the second line. Similar input is like this:
^C1.1
xC20
SS3
M 4
The decimal pattern (-?\d+(\.\d+)?) matches all numbers, and the second one could be picked out with a loop in the code-behind, but I need a regular expression that directly matches the number in the second line.
/^[^\r\n]*\r?\n\D*?(-?\d+(\.\d+)?)/
This operates by capturing a single line at the beginning of the input:
^ Beginning of the string
[^\r\n]* Anything that isn't a line terminator
\r?\n A newline, optionally preceded by a carriage return
Then all the non digit characters, then your numbers.
Since you've now repeatedly changed your needs, try this on for size:
/(?<=\n\D*)-?\d+(\.\d+)?/
I was able to capture it with this regex.
.*\n\D*(\d*).*\n
Check out group 1 of anything that this matches:
^.*?\r\n.*?(\d+)
If that doesn't work, try this:
^.*?\r\n.*?(\d+)
Both are with multiline NOT set...
I would probably use the captured group in /^.*?\r?\n.*?(-?\d+(?:\.\d+)?)/ where…
^ # beginning of string
.*? # anything...
\r?\n # followed by a new line
.*? # anything...
( # followed by...
-? # an optional negative sign (minus)
\d+ # a number
(?: # -this part not captured explicitly-
\.\d+ # a dot and a number
)? # -and is optional-
)
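To show how the captured group might be read in C#, here is a small sketch using the pattern explained above. The names SecondLineNumber and Find are assumptions, and the input is the sample from the question joined with \n.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SecondLineNumber
{
    // Hypothetical helper: returns the first number on the second line,
    // or null when the pattern does not match.
    public static string Find(string input)
    {
        // No RegexOptions.Multiline, so ^ anchors at the start of the string only.
        Match m = Regex.Match(input, @"^.*?\r?\n.*?(-?\d+(?:\.\d+)?)");
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(Find("^C1.1\nxC20\nSS3\nM 4"));
    }
}
```

Because . does not match \n, the second .*? cannot wander onto the third line when the second line contains no number.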
If it is a flavor that supports lookbehind then there are other alternatives.

C# Regular Expression excluding a string

I have a collection of strings, and all I want from the regex is to collect everything that starts with http:
href="http://www.test.com/cat/1-one_piece_episodes/"href="http://www.test.com/cat/2-movies_english_subbed/"href="http://www.test.com/cat/3-english_dubbed/"href="http://www.exclude.com"
This is my regular expression pattern:
href="(.*?)[^#]"
and it returns this:
href="http://www.test.com/cat/1-one_piece_episodes/"
href="http://www.test.com/cat/2-movies_english_subbed/"
href="http://www.xxxx.com/cat/3-english_dubbed/"
href="http://www.exclude.com"
What is the pattern for excluding the last match, or for excluding matches that contain the excluded domain, like href="http://www.exclude.com"?
EDIT:
For multiple exclusions:
href="((?:(?!"|\bexclude\b|\bxxxx\b).)*)[^#]"
@ridgerunner and I would change the regex to:
href="((?:(?!\bexclude\b)[^"])*)[^#]"
It matches all href attributes as long as they don't end in # and don't contain the word exclude.
Explanation:
href=" # Match href="
( # Capture...
(?: # the following group:
(?! # Look ahead to check that the next part of the string isn't...
\b # the entire word
exclude # exclude
\b # (\b are word boundary anchors)
) # End of lookahead
[^"] # If successful, match any character except for a quote
)* # Repeat as often as possible
) # End of capturing group 1
[^#]" # Match a non-# character and the closing quote.
To allow multiple "forbidden words":
href="((?:(?!\b(?:exclude|this|too)\b)[^"])*)[^#]"
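The multi-word pattern above could be used from C# roughly as follows. This is a sketch with a hypothetical class and method name (HrefFilter, Find); the quotes in the pattern are doubled because it is a verbatim string.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HrefFilter
{
    // href="..." whose value contains none of the forbidden whole words
    // and does not end in #.
    private const string Pattern =
        @"href=""((?:(?!\b(?:exclude|this|too)\b)[^""])*)[^#]""";

    // Hypothetical helper: returns the full text of every matching href attribute.
    public static List<string> Find(string html)
    {
        var hrefs = new List<string>();
        foreach (Match m in Regex.Matches(html, Pattern))
            hrefs.Add(m.Value); // whole match; group 1 lacks the final character
        return hrefs;
    }

    public static void Main()
    {
        string input = "href=\"http://www.test.com/cat/3-english_dubbed/\"href=\"http://www.exclude.com\"";
        foreach (string h in Find(input))
            Console.WriteLine(h);
    }
}
```

The sketch collects m.Value rather than group 1 because the trailing [^#] consumes the last character of the URL, so group 1 alone would be one character short.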
Your input doesn't look like a valid string (unless you escape the quotes in it), but you can do it without regex too:
string input = "href=\"http://www.test.com/cat/1-one_piece_episodes/\"href=\"http://www.test.com/cat/2-movies_english_subbed/\"href=\"http://www.test.com/cat/3-english_dubbed/\"href=\"http://www.exclude.com\"";
List<string> matches = new List<string>();
// Splitting on a string[] separator requires the StringSplitOptions overload.
foreach (var match in input.Split(new string[] { "href" }, StringSplitOptions.RemoveEmptyEntries)) {
    if (!match.Contains("exclude.com"))
        matches.Add("href" + match);
}
Will this do the job?
href="(?!http://[^/"]+exclude.com)(.*?)[^#]"

Regular expression to find separator dots in formula

The C# expression library I am using will not directly support my table/field parameter syntax:
The following are table/field parameter names that are not directly supported:
TableName1.FieldName1
[TableName1].[FieldName1]
[Table Name 1].[Field Name 1]
It accepts alphanumeric parameters without spaces, or most characters enclosed within square brackets. I would like to use C# regular expressions to replace the dot separators and neighboring brackets to a different delimiter, so the results would be as follows:
[TableName1|FieldName1]
[TableName1|FieldName1]
[Table Name 1|Field Name 1]
I also need to skip any string literals within single quotes, like:
'TableName1.FieldName1'
And, of course, ignore any numeric literals like:
12345.6789
EDIT: Thank you for your feedback on improving my question. Hopefully it is clearer now.
I've written a completely new answer, now that the problem is clarified:
You can do this in a single regex. It is quite bulletproof, I think, but as you can see, it's not exactly self-explanatory, which is why I've commented it liberally. Hope it makes sense.
You're lucky that .NET allows re-use of named capturing groups, otherwise you would have had to do this in several steps.
resultString = Regex.Replace(subjectString,
@"(?: # Either match...
(?<before> # (and capture into backref <before>)
(?=\w*\p{L}) # (as long as it contains at least one letter):
\w+ # one or more alphanumeric characters,
) # (End of capturing group <before>).
\. # then a literal dot,
(?<after> # (now capture again, into backref <after>)
(?=\w*\p{L}) # (as long as it contains at least one letter):
\w+ # one or more alphanumeric characters.
) # (End of capturing group <after>) and end of match.
| # Or:
\[ # Match a literal [
(?<before> # (now capture into backref <before>)
[^\]]+ # one or more characters except ]
) # (End of capturing group <before>).
\]\.\[ # Match literal ].[
(?<after> # (capture into backref <after>)
[^\]]+ # one or more characters except ]
) # (End of capturing group <after>).
\] # Match a literal ]
) # End of alternation. The match is now finished, but
(?= # only if the rest of the line contains
(?: # any number of
[^']*'[^']*' # pairs of quote characters (an even number in total),
)* # (end of the repeated pair)
[^']* # plus any number of non-quote characters
$ # until the end of the line.
) # End of the lookahead assertion.",
"[${before}|${after}]", RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace);
