How to take only first line from the multiline text

How can I get only the first line of multiline text using regular expressions?
string test = @"just take this first line
even there is
some more
lines here";
Match m = Regex.Match(test, "^", RegexOptions.Multiline);
if (m.Success)
    Console.Write(m.Groups[0].Value);

If you just need the first line, you can do it without a regex, like this:
var firstline = test.Substring(0, test.IndexOf(Environment.NewLine));
Note that IndexOf returns -1 when the text contains no Environment.NewLine, and Substring then throws, so guard against that if single-line input is possible.
As much as I like regexes, you don't really need them for everything, so unless this is part of some larger regex exercise, I would go for the simpler solution in this case.
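A minimal sketch of a more defensive variant of the Substring approach. FirstLine is a hypothetical helper name, not something from the answer above, and the sketch assumes that a line ends at the first '\r' or '\n' rather than at Environment.NewLine specifically. It returns the whole string when there is no line break at all.

```csharp
using System;

// Sample inputs, made up for illustration.
string multi = "just take this first line\r\neven there is\nsome more";
string single = "no line break here";

string firstOfMulti = FirstLine(multi);
string firstOfSingle = FirstLine(single);

Console.WriteLine(firstOfMulti);
Console.WriteLine(firstOfSingle);

// Hypothetical helper: returns the first line of text, or the whole
// string when it contains no line-break character.
string FirstLine(string text)
{
    // Search for either line-break character so "\r\n" and "\n" both work.
    int end = text.IndexOfAny(new[] { '\r', '\n' });
    return end < 0 ? text : text.Substring(0, end);
}
```

Searching for the individual characters instead of Environment.NewLine keeps the result independent of the platform the code runs on.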

string test = @"just take this first line
even there is
some more
lines here";
Match m = Regex.Match(test, "^(.*)", RegexOptions.Multiline);
if (m.Success)
    Console.Write(m.Groups[0].Value);
. is often said to match any character, but that is not quite true. It matches any character only if you use the RegexOptions.Singleline option. Without this option, it matches any character except '\n' (line feed).
That said, a better option is likely to be:
string test = @"just take this first line
even there is
some more
lines here";
string firstLine = test.Split(new string[] {Environment.NewLine}, StringSplitOptions.None)[0];
And better yet, is Brian Rasmussen's version:
string firstline = test.Substring(0, test.IndexOf(Environment.NewLine));
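To see the point about . and RegexOptions.Singleline in action, here is a small sketch. The sample strings and variable names are made up for illustration. It also shows a side effect worth knowing: because . only excludes '\n', a "\r\n" line ending leaves a trailing '\r' in the match unless the pattern excludes it explicitly.

```csharp
using System;
using System.Text.RegularExpressions;

string text = "first line\nsecond line";

// Default: '.' stops at '\n', so only the first line is matched.
string withoutSingleline = Regex.Match(text, "^.*").Value;

// RegexOptions.Singleline: '.' also matches '\n', so the match runs to the end.
string withSingleline = Regex.Match(text, "^.*", RegexOptions.Singleline).Value;

// With "\r\n" line endings, '.' still matches the '\r'.
string windowsText = "first line\r\nsecond line";
string keepsCarriageReturn = Regex.Match(windowsText, "^.*").Value;

// Excluding both characters explicitly avoids the stray '\r'.
string cleanFirstLine = Regex.Match(windowsText, @"^[^\r\n]*").Value;

Console.WriteLine(withoutSingleline);
Console.WriteLine(withSingleline == text);
Console.WriteLine(keepsCarriageReturn.EndsWith("\r", StringComparison.Ordinal));
Console.WriteLine(cleanFirstLine);
```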

Try this one:
Match m = Regex.Match(test, @".*\n", RegexOptions.Multiline);

This line replaces everything from the first line feed onward with an empty string:
test = Regex.Replace(test, "(\n.*)$", "", RegexOptions.Singleline);
It also works properly if the string has no line feed; in that case no replacement is made.

Related

Regular expression to find all words which starts with white space and ends with white space

I need to find words in a string that start and end with white space. I am having trouble matching the white space. However, I did manage the version below, where each word starts and ends with ##. Any help with the white space would be great.
string input = "##12## ##13##";
foreach (Match match in Regex.Matches(input, @"##\b\S+?\b##"))
{
    MessageBox.Show(match.Value);
}

From MSDN doc:
// Define a regular expression for repeated words.
Regex rx = new Regex(@"\b(?<word>\w+)\s+(\k<word>)\b",
    RegexOptions.Compiled | RegexOptions.IgnoreCase);

\s+(?=</)
is the expression you're after. It means one or more white-space characters followed by the literal text </ (the lookahead checks for it without including it in the match).

In my opinion it is better to use string.Split() instead of Regex:
var wordsArray = s.Split(new []{' '},StringSplitOptions.RemoveEmptyEntries);
It is better to avoid regex if you can achieve the same result more easily with standard string methods.

I can't tell exactly what you have in mind, but I hope this code helps:
string[] ha = input.Split(new[] { '#' }, StringSplitOptions.RemoveEmptyEntries);

How to detect dot (.) at the start of a line in C#?

I want to remove a dot (.) if it appears at the start of a line, for example:
hi,
<new line> .
<new line> How are you.
How can I remove this line?

Remove a dot at the start of a line:
resultString = Regex.Replace(subjectString, @"^\.", "", RegexOptions.Multiline);
Remove an entire line if it starts with a dot:
resultString = Regex.Replace(subjectString, @"^\..*\r\n", "", RegexOptions.Multiline);
Remove an entire line if it contains only a dot:
resultString = Regex.Replace(subjectString, @"^\.\r\n", "", RegexOptions.Multiline);
Remove an entire line if it starts with a dot and possibly contains trailing whitespace:
resultString = Regex.Replace(subjectString, @"^\.[^\r\n\S]*\r\n", "", RegexOptions.Multiline);

string result = "hi,\n.\nHow are you.".Replace("\n.\n", "\n");

You could use the built-in string comparisons; there are plenty available.
You could use Regex.
But it all comes down to one point: what have you tried? And please read the docs.
A short possible answer:
// allLines is a List<string> holding all lines that need checking
for (int i = 0; i < allLines.Count; i++)
{
    string trimmed = allLines[i].TrimStart();
    if (trimmed.StartsWith(".", StringComparison.Ordinal))
    {
        // Drop the leading dot (and any whitespace in front of it).
        allLines[i] = trimmed.Substring(1);
    }
}

What does "detect" mean in your case? If it's just to return true/false depending on whether a line starts with a dot, you can search for "\n." (a line break followed by a dot) and also check the very first character. You don't even need regex for it:
bool weHaveDot = myString.StartsWith(".", StringComparison.Ordinal) || myString.IndexOf("\n.", StringComparison.Ordinal) > -1;

What about:
MyLine.Substring(MyLine.IndexOf("<new line> ") + 11).StartsWith(".")

I'm afraid the answer depends on what the question means.
My guess is that you have several lines, and you want to remove those lines that consist of only a dot. Right?
Then the solution would be:
Split the string into a string array, each of which contains one line
Remove all array elements that consist of just a dot
Join the string array back together.
And you don't need a regex for that. (Unless you want to get some experience in regexes; in that case, sorry.)
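The three steps above can be sketched as follows. This is one possible reading, assuming the lines are separated by '\n' and that a line with only white space around the dot also counts as "just a dot"; the variable names are illustrative.

```csharp
using System;
using System.Linq;

string input = "hi,\n.\nHow are you.";

// 1. Split the string into an array with one element per line.
string[] lines = input.Split('\n');

// 2. Drop every line that consists of just a dot.
// 3. Join the remaining lines back together.
string cleaned = string.Join("\n", lines.Where(line => line.Trim() != "."));

Console.WriteLine(cleaned);
```

For text with "\r\n" line endings, the Trim() call also removes the leftover '\r' before the comparison, but the kept lines would still carry their '\r' and be joined with "\n" only.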

Regex - Find from both sides only if it has spaces

I need some help with a regex. I need to find a word that is surrounded by some delimiter, for example *. But I need to match it only if it has spaces or nothing on either side. For example, if it is at the start of the text there can't be a space before it, and the same goes for the end.
Here is what I came up with:
string myString = "You will find *me*, and *me* also!";
string findString = @"(\*(.*?)\*)";
string foundText;
MatchCollection matchCollection = Regex.Matches(myString, findString);
foreach (Match match in matchCollection)
{
foundText = match.Value.Replace("*", "");
myString = myString.Replace(match.Value, "->" + foundText + "<-");
match.NextMatch();
}
Console.WriteLine(myString);
You will find ->me<-, and ->me<- also!
Works correct, the problem is when I add * in the middle of text, I don't want it to match then.
Example: You will find *m*e*, and *me* also!
Output: You will find ->m<-e->, and <-me* also!
How can I fix that?

Try the following pattern:
string findString = @"(?<=\s|^)\*(.*?)\*(?=\s|$)";
(?<=\s|^)X will match any X only if preceded by a space-char (\s), or the start-of-input, and
X(?=\s|$) matches any X if followed by a space-char (\s), or the end-of-input.
Note that it will not match *me* in foo *me*, bar since the second * has a , after it! If you want to match that too, you need to include the comma like this:
string findString = @"(?<=[\s,]|^)\*(.*?)\*(?=[\s,]|$)";
You'll need to expand the set [\s,] as you see fit, of course. You might want to add !, ? and . at the very least: [\s,!?.] (and no, . and ? do not need to be escaped inside a character-set!).
EDIT
A small demo:
string Txt = "foo *m*e*, bar";
string Pattern = @"(?<=[\s,]|^)\*(.*?)\*(?=[\s,]|$)";
Console.WriteLine(Regex.Replace(Txt, Pattern, ">$1<"));
which would print:
foo >m*e<, bar
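As a further illustration, here is the same lookaround pattern applied to the sentence from the question and to a string where the * characters sit inside a word. The second input and the variable names are made up for this sketch.

```csharp
using System;
using System.Text.RegularExpressions;

// Lookbehind: preceded by white space, a comma, or the start of the input.
// Lookahead: followed by white space, a comma, or the end of the input.
string pattern = @"(?<=[\s,]|^)\*(.*?)\*(?=[\s,]|$)";

// Delimited words are rewritten.
string delimited = Regex.Replace("You will find *me*, and *me* also!", pattern, "->$1<-");

// Stars embedded in a word have no white space around them, so nothing matches.
string embedded = Regex.Replace("a*b*c", pattern, "->$1<-");

Console.WriteLine(delimited);
Console.WriteLine(embedded);
```

Because the lookarounds do not consume the surrounding characters, the replacement only has to reference group 1.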

You can add "beginning of line or space" and "space or end of line" around your match:
(^|\s)\*(.*?)\*(\s|$)
You'll now need to refer to the middle capture group for the match string.

problem in regular expression

I have a regular expression
Regex r = new Regex(@"(\s*)([A|B|C|E|G|H|J|K|L|M|N|P|R|S|T|V|Y|X]\d(?!.*[DFIOQU])(?:[A-Z](\s?)\d[A-Z]\d))(\s*)",RegexOptions.IgnoreCase);
and a string
string test="LJHLJHL HJGJKDGKJ JGJK C1C 1C1 LKJLKJ";
I have to fetch C1C 1C1. This works fine.
But if I modify the test string to
string test="LJHLJHL HJGJKDGKJ JGJK C1C 1C1 ON";
then it is unable to find the pattern, i.e. C1C 1C1.
Any idea why this expression is failing?

You have a negative look ahead:
(?!.*[DFIOQU])
That matches the "O" in "ON" and since it is a negative look ahead, the whole pattern fails. And, as an aside, I think you want to replace this:
[A|B|C|E|G|H|J|K|L|M|N|P|R|S|T|V|Y|X]
With this:
[A-CEGHJ-NPR-TVYX]
A pipe (|) is a literal character inside a character class, not an alternation, and you can use ranges to help highlight the characters that you're leaving out.
A single regex might not be the best way to parse that string. Or perhaps you just need a looser regex.

Your negative lookahead (?!.*[DFIOQU]) requires that none of D, F, I, O, Q or U appears anywhere after that position.
In your second string there is an O at the end, in ON, so the match fails.
If you remove the .* from the negative lookahead, it checks only the character directly following instead of the rest of the string (is this what you want?):
\s*([ABCEGHJKLMNPRSTVYX]\d(?![DFIOQU])(?:[A-Z]\s?\d[A-Z]\d))\s*
Then it works; see it on Regexr. It now checks that the character directly after the digit is not one of the characters in the class. I don't know if this is intended.
By the way, I removed the | from your first character class, since it's not needed, and also the brackets around your whitespace, which are not needed either.
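A small sketch comparing the two lookaheads on the failing input. The patterns here are simplified versions of the ones above (capture groups and surrounding \s* dropped, character class written with ranges), so treat them as an illustration rather than a drop-in replacement.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "LJHLJHL HJGJKDGKJ JGJK C1C 1C1 ON";

// Original lookahead: fails if D, F, I, O, Q or U occurs anywhere later in the string.
string scanning = @"[A-CEGHJ-NPR-TVYX]\d(?!.*[DFIOQU])[A-Z]\s?\d[A-Z]\d";

// Adjusted lookahead: only inspects the character directly after the digit.
string adjacent = @"[A-CEGHJ-NPR-TVYX]\d(?![DFIOQU])[A-Z]\s?\d[A-Z]\d";

bool scanningFound = Regex.IsMatch(input, scanning, RegexOptions.IgnoreCase);
Match adjacentMatch = Regex.Match(input, adjacent, RegexOptions.IgnoreCase);

Console.WriteLine(scanningFound);
Console.WriteLine(adjacentMatch.Value);
```

The trailing ON is enough to make the scanning variant fail at every candidate position, while the adjacent variant still finds C1C 1C1.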

As I understand it, you need to find the C1C 1C1 text in your string.
I used this regex to do that:
string strRegex = @"^.*(?<c1c>C1C)\s*(?<c1c2>1C1).*$";
After that you can extract the text from the named groups:
string strRegex = @"^.*(?<c1c>C1C)\s*(?<c1c2>1C1).*$";
RegexOptions myRegexOptions = RegexOptions.Multiline;
Regex myRegex = new Regex(strRegex, myRegexOptions);
string strTargetString = @"LJHLJHL HJGJKDGKJ JGJK C1C 1C1 LKJLKJ";
string secondStr = "LJHLJHL HJGJKDGKJ JGJK C1C 1C1 ON";
Match match = myRegex.Match(strTargetString);
string c1c = match.Groups["c1c"].Value;
string c1c2 = match.Groups["c1c2"].Value;
Console.WriteLine(c1c + " " +c1c2);

Why doesn't $ always match to an end of line

Below is a simple code snippet that demonstrates the seemingly buggy behavior of end of line matching ("$") in .Net regular expressions. Am I missing something obvious?
string input = "Hello\nWorld\n";
string regex = @"^Hello\n^World\n"; //Match
//regex = @"^Hello\nWorld\n"; //Match
//regex = @"^Hello$"; //Match
//regex = @"^Hello$World$"; //No match!!!
//regex = @"^Hello$^World$"; //No match!!!
Match m = Regex.Match(input, regex, RegexOptions.Multiline | RegexOptions.CultureInvariant);
Console.WriteLine(m.Success);

$ does not consume the newline character(s). @"^Hello$\s+^World$" should match.
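A short sketch of that difference, reusing the input and options from the question; the variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "Hello\nWorld\n";
RegexOptions options = RegexOptions.Multiline | RegexOptions.CultureInvariant;

// '$' asserts a position just before '\n' but leaves the '\n' unmatched,
// so "World" cannot follow it directly.
bool withoutNewline = Regex.IsMatch(input, @"^Hello$World$", options);

// Consuming the line break between the two anchors lets the match continue.
bool withNewline = Regex.IsMatch(input, @"^Hello$\s+^World$", options);

Console.WriteLine(withoutNewline);
Console.WriteLine(withNewline);
```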

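The $ doesn't match a newline; it is a zero-width anchor. Without multiline mode it matches at the end of the string (or just before a final '\n'); with RegexOptions.Multiline it also matches just before every '\n', but it never consumes that character. There isn't much sense in having two ends in a string.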
