C# Get line of multiline String starting with specific word - c#

I have a multiline string, say
abcde my first line
fghij my second line
klmno my third line
All of this is one String, but what I want to do now is get the line (substring) of this string that starts with a specific word, for example "fghij". So if I write a method and pass "fghij" to it, it should return "fghij my second line" in that case.
I tried the following, but it does not work; unfortunately m.Success is always false:
String getLineBySubstring(String myInput, String mySubstring)
{
Match m = Regex.Match(myInput, "^(" + mySubstring + "*)", RegexOptions.Multiline);
Console.WriteLine("getLineBySubstring operation: " + m.Success);
if (m.Success == true)
{
return m.Groups[0].Value;
}
else
{
return "NaN";
}
}

The * operator is currently quantifying the last letter in mySubstring. You need to precede the operator with . to eat up the rest of the characters on the given line. No need for grouping either.
Match m = Regex.Match(myInput, "^" + mySubstring + ".*", RegexOptions.Multiline);
if (m.Success) {
// return m.Value
}

You are almost there, just change the * char to [^\r\n]+
Match m = Regex.Match(myInput, "^(" + mySubstring + "[^\n\r]+)", RegexOptions.Multiline);
[^\r\n]+ will match one or more characters other than \r and \n, which are the characters used to mark a new line.

Try adding the end-of-line anchor $ to your regex. Also, a * appended to mySubstring only repeats the last character of mySubstring; you need .* to capture the rest of the line.
Regex.Match(myInput, "^(" + mySubstring + ".*)$", RegexOptions.Multiline);

If you need to check that string starts with some substring, then you should avoid Regex. Just split whole string to lines and check each line with StartsWith.
String getLineBySubstring(String myInput, String mySubstring)
{
string[] lines = myInput.Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
foreach (var line in lines)
if (line.StartsWith(mySubstring))
return line;
return "NaN";
}
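The regex answers above concatenate mySubstring into the pattern unescaped, and in .NET the dot also matches \r, so with Windows line endings .* can return a trailing carriage return. The following is a minimal sketch that addresses both points, not a definitive implementation. The names LineFinder and GetLineByPrefix are hypothetical, and returning null instead of "NaN" when nothing matches is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineFinder
{
    // Returns the first line that starts with prefix, or null when no line does.
    public static string GetLineByPrefix(string input, string prefix)
    {
        // Regex.Escape keeps characters such as '.' or '*' in the prefix literal.
        // [^\r\n]* stops before either line-ending character, so a trailing '\r'
        // from Windows line endings is not part of the result.
        string pattern = "^" + Regex.Escape(prefix) + @"[^\r\n]*";
        Match m = Regex.Match(input, pattern, RegexOptions.Multiline);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        string text = "abcde my first line\r\nfghij my second line\r\nklmno my third line";
        Console.WriteLine(GetLineByPrefix(text, "fghij")); // prints "fghij my second line"
    }
}
```

If the prefix is always plain text, the StartsWith approach above avoids the escaping question entirely.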

Related

How to find and get string after a string known values in a text file c#

I want to find known marker strings in a text file and get the text that follows each of them, using C#.
My text file:
function PreloadFiles takes nothing returns nothing
call Preload( "=== Save ===" )
call Preload( "Player: Michael" )
call Preload( "-load1 UvjkiJyjLlPN1o7FCAwQ0en80t769u5uBKAL1t0u0Cajk86WNmp83F" )
call Preload( "-load2 IMdOIPKGSDFXStx4Zd4LAvAaBmHW19rxsvSNF6kaObSFyBzGq8skYGuq0T1eW" )
call Preload( "-load3 Bd6MoyqnfDydBbwqGApWii3mabJpwNvjcwrKLI0r6UU2wadrMV1h7WQ8D6" )
call Preload( "-load4 D5kI18Flk5bJ4Oi7vQw33b5LHDXHGgJNYsiC6VNJDAHe1" )
call Preload( "KEY PASS: 3568" )
endfunction
I want to get the strings that follow "-load1", "-load2", "-load3", "-load4" and "KEY PASS: ", and put them into 5 TextBox controls,
like this:
UvjkiJyjLlPN1o7FCAwQ0en80t769u5uBKAL1t0u0Cajk86WNmp83F
IMdOIPKGSDFXStx4Zd4LAvAaBmHW19rxsvSNF6kaObSFyBzGq8skYGuq0T1eW
Bd6MoyqnfDydBbwqGApWii3mabJpwNvjcwrKLI0r6UU2wadrMV1h7WQ8D6
D5kI18Flk5bJ4Oi7vQw33b5LHDXHGgJNYsiC6VNJDAHe1
3568
Please help me.
Thank you!
you can use
string Substring (int startIndex);
like:
string in1 = "-load1 UvjkiJyjLlPN1o7FCAwQ0en80t769u5uBKAL1t0u0Cajk86WNmp83F";
string result = in1.Substring(7); // "out" is a reserved word in C#; the prefix "-load1 " is 7 characters long
it returns:
"UvjkiJyjLlPN1o7FCAwQ0en80t769u5uBKAL1t0u0Cajk86WNmp83F"
It is possible to do with Regex class (from System.Text.RegularExpressions namespace).
Patterns examples:
for -loadN ... string: " [A-Za-z0-9]*\" ". This tells Regex to look for a substring that starts with a space " ", contains any number of letters (A-Z, either case) or digits (0-9), and ends with a double quote \" followed by a space " ". Such as yours UvjkiJyjLlP..." .
for KEY PASS: ... string: @"KEY PASS: (\d{4})". This tells Regex to find a substring that contains the text "KEY PASS: " followed by a run of 4 digits.
But be aware that this is fragile, because regex patterns are very sensitive to the exact shape of the input.
For example,
"-loaddd1 AbCdEfG..." (extra chars)
"-load1 AbCdEfG..." (multiple whitespaces)
"KEY PASS: 12345" (pattern in example below looks strictly only for 4 digits, not 5 or more or less)
"-LOAD1 AbCdEfG..." (uppercased)
etc.
These will be ignored (the last one, by the way, can be solved by passing RegexOptions.IgnoreCase, as in Regex.Match(line, pattern, RegexOptions.IgnoreCase)). The others can be solved too, but you should know that these cases are possible.
For the example provided in the question, this code works fine:
string loadPattern = " [A-Za-z0-9]*\" ";
string keyPassPattern = @"KEY PASS: (\d{4})";
List<string> capturedValues = new List<string>();
foreach (string line in File.ReadAllLines("Preload.txt"))
{
string s;
if (Regex.IsMatch(line, loadPattern) && line.Contains("-load"))
{
// Getting captured substring and trimming from trailing whitespace and quote
s = Regex.Match(line, loadPattern, RegexOptions.IgnoreCase).Value.Trim('\"', ' ');
capturedValues.Add(s);
}
else if (Regex.IsMatch(line, keyPassPattern))
{
// Just replacing "KEY PASS: " to empty string
s = Regex.Match(line, keyPassPattern).Value.Replace("KEY PASS: ", "");
capturedValues.Add(s);
}
}
Result:
string s1 = "-load1 UvjkiJyjLlPN1o7FCAwQ0en80t769u5uBKAL1t0u0Cajk86WNmp83F";
String filter = s1.ToString();
String[] filterRemove = filter.Split(' ');
String Value1= filterRemove[1];
In this way, you will get
"UvjkiJyjLlPN1o7FCAwQ0en80t769u5uBKAL1t0u0Cajk86WNmp83F" in Value1.
You can do the same for each of the other strings and combine the results.
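The extraction can also be done with one pattern and a capture group, so that no trimming or replacing is needed afterwards. This is only a sketch under the assumption that every value sits between a marker ("-loadN" or "KEY PASS:") and the closing double quote, as in the sample file. The names PreloadParser and ExtractValues are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PreloadParser
{
    // Inside a verbatim string, "" stands for one double quote.
    // Group 1 captures everything between the marker and the closing quote.
    private static readonly Regex ValuePattern =
        new Regex(@"""(?:-load\d+|KEY PASS:)\s+([^""]+)""");

    public static List<string> ExtractValues(string fileContents)
    {
        var values = new List<string>();
        foreach (Match m in ValuePattern.Matches(fileContents))
        {
            values.Add(m.Groups[1].Value);
        }
        return values;
    }

    public static void Main()
    {
        // Shortened copy of the sample file from the question.
        string sample =
            "call Preload( \"Player: Michael\" )\n" +
            "call Preload( \"-load1 UvjkiJyjLlPN1o7FCAwQ0en80t769u5uBKAL1t0u0Cajk86WNmp83F\" )\n" +
            "call Preload( \"KEY PASS: 3568\" )\n";
        foreach (string value in ExtractValues(sample))
        {
            Console.WriteLine(value);
        }
    }
}
```

For the real file, the input could come from File.ReadAllText("Preload.txt"), and the returned list could then be assigned to the five text boxes in order.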

C# Filter a word with an undefined number of spaces between characters

For example:
I can create a word with spaces between its characters, for example:
string example = "**example**";
List<string> outputs = new List<string>();
string example_output = "";
foreach (char c in example)
{
    example_output += c + " ";
}
And then I can loop over it to remove all spaces and add the results to the outputs list.
The problem is that I need it to work in scenarios where there are double spaces and more.
For example:
string text = "This is a piece of text for this **example**.";
I basically want to detect AND remove 'example'.
But I want to do that even when it says e xample, e x ample or example.
And in my scenario, since it's a spam filter, I can't just remove the spaces from the whole sentence as below, because I'd need to .Replace( the word with exactly the same spaces as the user typed it ).
.Replace(" ", "");
How would I achieve this?
TLDR:
I want to filter out a word in any combination of spaces without altering any other part of the line.
So example, e xample, e x ample, e x a m ple
should all match the filter word.
I wouldn't mind a method that generates the word with all space combinations as a plan B.
You can use this regex to achieve that:
(e[\s]*x[\s]*a[\s]*m[\s]*p[\s]*l[\s]*e)
You could use a regex for that: e\s*x\s*a\s*m\s*p\s*l\s*e
\s means any whitespace character and the * means 0-n count of that whitespace.
Small snippet:
const string myInput = "e x ample";
var regex = new Regex(@"e\s*x\s*a\s*m\s*p\s*l\s*e"); // verbatim string, so \s is not treated as a C# escape
var match = regex.Match(myInput);
if (match.Success)
{
// We have a match! Bad word
}
Here the link for the regex: https://regex101.com/r/VFjzTg/1
I see that the problem is to ignore the spaces in the matchstring, but not touch them anywhere else in the string.
You could create a regular expression out of your matchword, allowing arbitrary whitespace between each character.
// prepare regex. Need to do this only once for many applications.
string findword = "example";
// TODO: would need to escape special chars like * ( ) \ . + ? here.
string[] tmp = new string[findword.Length];
for(int i=0;i<tmp.Length;i++)tmp[i]=findword.Substring(i,1);
System.Text.RegularExpressions.Regex r = new System.Text.RegularExpressions.Regex(string.Join("\\s*",tmp));
// on each text to filter, do this:
string inp = "A text with the exa mple word in it.";
string outp;
outp = r.Replace(inp,"");
System.Console.WriteLine(outp);
Left out the escaping of regex-special-chars for brevity.
You can try regular expressions:
using System.Linq;
using System.Text.RegularExpressions;
....
// Having a word to find
string toFind = "Example";
// we build the regular expression
Regex regex = new Regex(
    @"\b" + string.Join(@"\s*", toFind.Select(c => Regex.Escape(c.ToString()))) + @"\b",
    RegexOptions.IgnoreCase);
// Then we apply regex built for the required text:
string text = "This is a piece of text for this **example**. And more (e X amp le)";
string result = regex.Replace(text, "");
Console.Write(result);
Outcome:
This is a piece of text for this ****. And more ()
Edit: if you want to ignore diacritics, modify the regular expression so that each letter may be followed by combining marks (\p{Mn}):
string toFind = "Example";
Regex regex = new Regex(@"\b" + string.Join(@"\s*",
    toFind.Select(c => Regex.Escape(c.ToString()) + @"\p{Mn}*")),
    RegexOptions.IgnoreCase);
and Normalize text before matching:
string text = "This is a piece of text for this **examplé**. And more (e X amp le)";
string result = regex.Replace(text.Normalize(NormalizationForm.FormD), "");

Replacing only single "\n" in string occurrence

I have a string in C# which can have multiple \n characters. For example:
string tp = "Hello\nWorld \n\n\n !!";
If there is a single occurrence of \n, I want to replace it with something, but if more than one \n appear together in the same place, I want to leave them alone. So for the tp string above I want to replace the \n between Hello and World, because there is only one at that place, and leave the three \n nearer the end of the string alone, because they appear in a group.
If I try to use the Replace() method in C# it replaces all of them. How can I resolve this issue?
You can try using regular expressions: let's change \n into "*" whenever \n is single:
using System.Text.RegularExpressions;
...
string tp = "Hello\nWorld \n\n\n !!";
// "Hello*World \n\n\n !!";
string result = Regex.Replace(tp, "\n+", match =>
match.Value.Length > 1
? match.Value // multiple (>1) \n in row: leave intact
: "*"); // single occurrence: change into "*"
A solution using loops:
char[] c = ("\t" + tp + "\t").ToCharArray(); // parentheses needed: otherwise ToCharArray() applies only to the last "\t"
for(int i = 1; i < c.Length - 1; i++)
if(c[i] == '\n' && c[i-1] != '\n' && c[i+1] != '\n')
c[i] = 'x';
tp = new string(c, 1, c.Length-2);
Use regular expressions and combine negative lookbehind and lookahead:
var test = "foo\nbar...foo\n\nbar\n\n\nfoo\r\nbar";
var replaced = System.Text.RegularExpressions.Regex.Replace(test, "(?<!\n)\n(?!\n)", "_");
// only first and last \n have been replaced
The regex matches a "\n" only when the character immediately before it and the character immediately after it are not "\n".
Thus only single "\n" characters are replaced.

Find hashtags in string

I am working on a Xamarin.Forms PCL project in C# and would like to detect all the hashtags.
I tried splitting at spaces and checking whether each word begins with a #, but if the post contains two consecutive spaces, as in "Hello #World Test", the double space would be lost.
string body = "Example string with a #hashtag in it";
string newbody = "";
foreach (var word in body.Split(' '))
{
if (word.StartsWith("#"))
newbody += "[" + word + "]";
newbody += word;
}
Goal output:
Example string with a [#hashtag] in it
I also only want it to have A-Z a-z 0-9 and _ stopping at any other character
Test #H3ll0_W0rld$%Test => Test [#H3ll0_W0rld]$%Test
Other Stack Overflow questions try to detect the hashtag and extract it. I would like to work with it and put it back in the string without losing anything that methods such as splitting by certain characters would lose.
You can use Regex with #\w+ and $&
Explanation
# matches the character # literally (case sensitive)
\w matches any word character ([a-zA-Z0-9_]; in .NET it also matches non-ASCII letters and digits unless RegexOptions.ECMAScript is set)
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
$& Includes a copy of the entire match in the replacement string.
Example
var input = "asdads sdfdsf #burgers, #rabbits dsfsdfds #sdf #dfgdfg";
var regex = new Regex(@"#\w+");
var matches = regex.Matches(input);
foreach (var match in matches)
{
Console.WriteLine(match);
}
or
var result = regex.Replace(input, "[$&]" );
Console.WriteLine(result);
Output
#burgers
#rabbits
#sdf
#dfgdfg
asdads sdfdsf [#burgers], [#rabbits] dsfsdfds [#sdf] [#dfgdfg]
Another Example
Use a regular expression: \#\w*
string pattern = @"\#\w*"; // verbatim string, so the backslashes reach the regex engine
Regex rgx = new Regex(pattern, RegexOptions.IgnoreCase);
MatchCollection matches = rgx.Matches(input);
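To get the exact output asked for in the question, with hashtags limited to A-Z, a-z, 0-9 and _, the two ideas above can be combined. This is a minimal sketch; the names HashtagMarker and WrapHashtags are hypothetical. It assumes an explicit ASCII character class is wanted rather than \w.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HashtagMarker
{
    // Explicit ASCII class: in .NET, \w would also accept non-ASCII letters and digits.
    private static readonly Regex Hashtag = new Regex("#[A-Za-z0-9_]+");

    public static string WrapHashtags(string body)
    {
        // $& re-inserts the whole match; text outside the matches is left untouched,
        // so runs of spaces survive.
        return Hashtag.Replace(body, "[$&]");
    }

    public static void Main()
    {
        Console.WriteLine(WrapHashtags("Test #H3ll0_W0rld$%Test")); // Test [#H3ll0_W0rld]$%Test
        Console.WriteLine(WrapHashtags("Hello  #World Test"));      // Hello  [#World] Test
    }
}
```

Because the replacement works on the original string instead of on split pieces, nothing between the hashtags is rebuilt or lost.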

Regex help for this expression with digits wanted

I am splitting given text wherever eachDELETEDDELETED occurs; however, some of my files contain text like:
each2,DELETED6,DELETED
eachDELETED2,DELETED
each5,DELETED15,DELETED
each5,DELETED5,DELETED2
I want to do a regex replace and turn these expressions into eachDELETEDDELETED.
I have tried using the following code:
Regex ra = new Regex(@"eachDELETED\d, DELETED");
MatchCollection mcMatches = ra.Matches(extracted);
foreach (Match m in mcMatches)
{
if (m.Success)
{
// MessageBox.Show(m.Value.ToString());
richTextBox5.Text += "JJJJ------>" +m.Value + "\n";
}
}
But I'm not getting any matches.
The regex each\d*,*DELETED\d*,DELETED\d* matches all the sample data:
each2,DELETED6,DELETED
eachDELETED2,DELETED
each5,DELETED15,DELETED
each5,DELETED5,DELETED2
If the lack of the comma in the second line is a typo, use each\d*,DELETED\d*,DELETED\d*
Basically, \d matches a digit and * means zero or more times.
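The replacement the question asks for can then be sketched with Regex.Replace, using the first pattern above. The names DelimiterNormalizer and Normalize are hypothetical. Note the assumption that any digits directly after the final DELETED belong to the delimiter, since the trailing \d* consumes them.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DelimiterNormalizer
{
    // Pattern taken from the answer above: optional digits and commas around the keywords.
    private static readonly Regex Variant =
        new Regex(@"each\d*,*DELETED\d*,DELETED\d*");

    public static string Normalize(string text)
    {
        // Every matched variant is rewritten to the one delimiter the split expects.
        return Variant.Replace(text, "eachDELETEDDELETED");
    }

    public static void Main()
    {
        string[] samples =
        {
            "each2,DELETED6,DELETED",
            "eachDELETED2,DELETED",
            "each5,DELETED15,DELETED",
            "each5,DELETED5,DELETED2"
        };
        foreach (string sample in samples)
        {
            Console.WriteLine(Normalize(sample)); // prints eachDELETEDDELETED each time
        }
    }
}
```

After normalizing, the text can be split on eachDELETEDDELETED as before.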
