Regex, ignore non-letter characters before the first capital letter - c#

I need to make a Regex string which matches a server address taken from a file. The address always starts with a capital letter. The lines in the file are in the form:
# note: first entry will be initial default
London, lonxx:33333
New York, NyC:222222
~CloudLondon, Clon:55555
I want to make a regex which takes each line starting from the upper case letter, so in the case of CloudLondon it should match only "CloudLondon, Clon:55555" without the "~".
I have the regex for the rest:
^[A-Z](?<Location>[\w\s]+)\s*,\s*(?<Server>\w+):(?<Port>\d+)$
But how can I ignore the characters at the beginning of the line until the first capital letter?
Thanks to anybody who is going to answer.

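You can remove the anchor ^, move the character class [A-Z] into the group Location, and start the pattern with a word boundary \b so that leading non-word characters such as ~ are skipped.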
\b(?<Location>[A-Z][\w\s]+)\s*,\s*(?<Server>\w+):(?<Port>\d+)$
See a regex demo for the group values.
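As a rough illustration of the answer above, the following sketch applies the pattern line by line to the sample file content from the question. The class name ServerLineParser, the Parse helper, and the "Location|Server|Port" output format are assumptions made for this example only; they do not come from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ServerLineParser
{
    // Pattern from the answer: no ^ anchor, capital letter inside the Location group.
    private static readonly Regex LineRegex = new Regex(
        @"\b(?<Location>[A-Z][\w\s]+)\s*,\s*(?<Server>\w+):(?<Port>\d+)$");

    // Hypothetical helper: returns "Location|Server|Port" for a matching line, or null.
    public static string Parse(string line)
    {
        Match m = LineRegex.Match(line);
        if (!m.Success) return null;
        return m.Groups["Location"].Value + "|"
             + m.Groups["Server"].Value + "|"
             + m.Groups["Port"].Value;
    }

    public static void Main()
    {
        string[] lines =
        {
            "# note: first entry will be initial default",
            "London, lonxx:33333",
            "New York, NyC:222222",
            "~CloudLondon, Clon:55555",
        };

        foreach (string line in lines)
        {
            // The comment line contains no capital letter, so it yields no match.
            Console.WriteLine(Parse(line) ?? "(no match)");
        }
    }
}
```

Each line is matched separately here, so $ means the end of that line. If the whole file were matched as one string, RegexOptions.Multiline would be needed for $ to match at each line end.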

Related

Trying to space words using Regex

I have a regex that is able to space words correctly, however, if something has a capitalized shortcode, it will not work.
What I'm trying to do is turn something like "TSTApplicationType" into "TST Application Type".
Currently, I'm using Regex.Replace(value, "([a-z])_?([A-Z])", "$1 $2") to add the spaces to the words, however this just turns it into "TSTApplication Type".
You may use either of the two:
// Details on Approach 1
Regex.Replace(text, @"\p{Lu}{2,}(?=\p{Lu})|(?>\p{Lu}\p{Ll}*)(?!$)", "$& ")
// Details on Approach 2
Regex.Replace(text, @"(?<=\p{Lu})(?=\p{Lu}\p{Ll})|(?<=\p{Ll})(?=\p{Lu})", " ")
See regex demo #1 and regex demo #2
Details on Approach 1
\p{Lu}{2,}(?=\p{Lu})|(?>\p{Lu}\p{Ll}*)(?!$) matches
\p{Lu}{2,}(?=\p{Lu}) - 2 or more uppercase letters followed with an uppercase letter
| - or
(?>\p{Lu}\p{Ll}*)(?!$) - an uppercase letter and then 0 or more lowercase letters not at the end of string.
The replacement is the whole match (referenced with $&) and a space.
Details on Approach 2
This is a common approach that is basically inserting a space in between an uppercase letter and an uppercase letter followed with a lowercase letter ((?<=\p{Lu})(?=\p{Lu}\p{Ll})) or (|) between a lowercase letter and an uppercase letter (see (?<=\p{Ll})(?=\p{Lu})).
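To make the two approaches easier to compare, here is a minimal sketch that runs both against the input from the question. The class name WordSpacer and the method names Approach1 and Approach2 are placeholders chosen for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordSpacer
{
    // Approach 1: match each word (or uppercase run) that is not at the end
    // of the string and append a space to it.
    public static string Approach1(string text) =>
        Regex.Replace(text, @"\p{Lu}{2,}(?=\p{Lu})|(?>\p{Lu}\p{Ll}*)(?!$)", "$& ");

    // Approach 2: match the empty positions between words and insert a space there.
    public static string Approach2(string text) =>
        Regex.Replace(text, @"(?<=\p{Lu})(?=\p{Lu}\p{Ll})|(?<=\p{Ll})(?=\p{Lu})", " ");

    public static void Main()
    {
        Console.WriteLine(Approach1("TSTApplicationType")); // TST Application Type
        Console.WriteLine(Approach2("TSTApplicationType")); // TST Application Type
    }
}
```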
If you don't mind using Humanizer, it offers this too when you call .Humanize() on a string. It doesn't preserve casing, however, so it is another option if you actually want to change the casing.
"TSTApplicationType".Humanize(LetterCasing.Title); // TST Application Type

Regex pattern for search first letters of the first and last name

I have a problem with regex pattern. Every day I get names and surnames. Example:
Darkholme Van Tadashi
Herrington Billy Aniki
Johny
Walker Sam Cooler
etc..
The fact is that they are specific and do not consist of just one last name and first name.
From this list, I need to select one person (whose last name and first name I know). To do this, I found pattern:
"Darkholme|\b[vt]"
As I said, I know the person's data in advance (before the list arrives). But I only know his last name. The second and third names (Van Tadashi) are unknown to me; I only know the first letters of these names ("V" and "T"). I ran into this problem: when the regex analyzes incoming data (I use Regex.IsMatch), it returns true if the input string is "Van Dungeonmaster". How do I create a pattern that will only return true if the surname is Darkholme and the first letters of the second and third names match (V and T)?
Perhaps I'm not making myself clear.. But in the end, it should turn out that I passed only the last name and the first letters of the first name and patronymic to pattern, and regex gave a match for input string.
If there is a comma present and the names can start with either V or T where the third name can be optional, you could use an optional group matching any non whitespace char except a comma.
\bDarkholme\s+[VT][^\s,]+(?:\s+[VT][^\s,]+)?
\b Word boundary, to prevent Darkholme being part of a larger word
Darkholme Match literally
\s+[VT] Match 1+ whitespace chars followed by either V or T
[^\s,]+ Match 1+ times any char except a whitespace char or comma
(?: Non capture group
\s+[VT] Match 1+ whitespace chars followed by either V or T
[^\s,]+ Match 1+ times any char except a whitespace char or comma
)? Close the group to make the 3rd part optional
.NET regex demo
If you know that the name starts with V for the second and T for the third:
\bDarkholme\s+V[^\s,]+(?:\s+T[^\s,]+)?
.NET regex demo
If the name can also be a single V or T, the quantifier could be an asterisk: [^\s,]*
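The two patterns from this answer can be tried out with Regex.IsMatch, as in the sketch below. The class name NameMatcher, the method names, and the input "Darkholme Tom Vance" are made up for illustration; the other inputs come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameMatcher
{
    // Second and third names may each start with V or T.
    private static readonly Regex Either = new Regex(
        @"\bDarkholme\s+[VT][^\s,]+(?:\s+[VT][^\s,]+)?");

    // Second name must start with V; the optional third name with T.
    private static readonly Regex Ordered = new Regex(
        @"\bDarkholme\s+V[^\s,]+(?:\s+T[^\s,]+)?");

    public static bool MatchesEither(string input) => Either.IsMatch(input);
    public static bool MatchesOrdered(string input) => Ordered.IsMatch(input);

    public static void Main()
    {
        string[] inputs =
        {
            "Darkholme Van Tadashi",
            "Van Dungeonmaster",
            "Herrington Billy Aniki",
            "Darkholme Tom Vance", // hypothetical input, not from the question
        };

        foreach (string input in inputs)
        {
            Console.WriteLine("{0}: either={1}, ordered={2}",
                input, MatchesEither(input), MatchesOrdered(input));
        }
    }
}
```

Because the group for the third name is optional, both patterns still report a match when the third name starts with some other letter; dropping the trailing ? would make the third name mandatory.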
Your pattern as is means "match any string that contains Darkholme, or any string where any word starts with a v or a t", which isn't quite what you want.
Perhaps
Darkholme\s+V\S*\s+T
would suit you better. It means "Darkholme, followed by at least one whitespace character, then V, followed by any number of non-whitespace characters, then at least one whitespace character, followed by T".

Regex : How can I match a path which starts with '$' and ends in a blank space?

I am trying to match a path from a description with regex so that it only selects the path which starts with '$' has '/' and Alphanumerics. It ends in a blank or \n.
However, it might contain a blank space within the path, which should be matched.
Can anyone suggest one?
My working Regex is : \$[\/\w\s]+
This does not stop the match where the path ends.
Trial run :
String =
"Create a folder At Below Location
Path:-$/LoremIpsum/Main/Source/Dolores
Central/Libraries/Umbridge
Folder Name:-Umbridge"
Output:
$/LoremIpsum/Main/Source/Dolores
Central/Libraries/Umbridge
Folder Name
Required:
$/LoremIpsum/Main/Source/Dolores Central/Libraries/Umbridge
Try this brother:
\$[\w\W]+\/+\S+
You can see the Demo of working regex here.
I hope it helps.
You might match a repeating pattern that matches a forward slash followed by a whitespace character or a word character.
For the last part you could match a forward slash, then zero or more word characters or spaces, followed by a single word character and an optional whitespace character, so it can end in a blank space or a newline.
\$(?:/[\w\s]+)*/[\w ]*\w\s?
Explanation
\$ Match $
(?:/ Non capturing group
[\w\s]+ Match one or more times a whitespace character or a word character
)* Close non capturing group and repeat zero or more times
/[\w ]* Match a forward slash, then a word character or a space zero or more times
\w\s? Match a word character and an optional whitespace character
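The sketch below applies this pattern to the sample description from the question. It assumes the lines of the description are separated by "\n", and it adds one step that is not part of the answer: the matched line break is collapsed into a single space and the trailing whitespace is trimmed, so the result has the form shown under "Required". The class name PathExtractor and the Extract helper are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PathExtractor
{
    // Pattern from the answer above.
    private static readonly Regex PathRegex = new Regex(
        @"\$(?:/[\w\s]+)*/[\w ]*\w\s?");

    // Hypothetical helper: returns the path on one line, or null if none is found.
    public static string Extract(string description)
    {
        Match m = PathRegex.Match(description);
        if (!m.Success) return null;

        // Assumption: a line break inside the path should become a single space.
        return Regex.Replace(m.Value, @"\s+", " ").Trim();
    }

    public static void Main()
    {
        string description =
            "Create a folder At Below Location\n" +
            "Path:-$/LoremIpsum/Main/Source/Dolores\n" +
            "Central/Libraries/Umbridge\n" +
            "Folder Name:-Umbridge";

        // Prints: $/LoremIpsum/Main/Source/Dolores Central/Libraries/Umbridge
        Console.WriteLine(Extract(description));
    }
}
```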

Regex to match trimmed string consisting of words separated by only 1 space char

I am looking for a regex to validate input in C#. The regex has to match an arbitrary number of words which are separated with only 1 space character in between. The matched string cannot start or end with whitespace characters (this is where my problem is).
Example: some sample input 123
What I've tried: /^(\S+[ ]{0,1})+$/gm this pattern almost does what is required but it also matches 1 trailing space.
Any ideas? Thanks.
I tried this one and it seems to work:
Regex regex = new Regex(@"^\S+([ ]{1}\S+)*$");
It checks if your string starts with a word followed by zero or more occurrences of a single space followed by a word. So leading and trailing white spaces are not allowed, and neither are two spaces in a row.
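A quick way to check this pattern is to run it against a few inputs, as sketched here. The class name TrimmedWordsValidator and the test strings other than "some sample input 123" are assumptions for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrimmedWordsValidator
{
    // Pattern from the answer above.
    private static readonly Regex Pattern = new Regex(@"^\S+([ ]{1}\S+)*$");

    public static bool IsValid(string input) => Pattern.IsMatch(input);

    public static void Main()
    {
        string[] inputs =
        {
            "some sample input 123", // valid
            "trailing ",             // trailing space
            " leading",              // leading space
            "two  spaces",           // double space between words
            "",                      // empty string
        };

        foreach (string input in inputs)
        {
            Console.WriteLine("\"{0}\" -> {1}", input, IsValid(input));
        }
    }
}
```

Only the first input is reported as valid. One caveat: in .NET, $ also matches just before a final "\n", so an input ending in a newline would pass; replacing $ with \z is one way to rule that out if it matters.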

Regex misunderstanding

I'm trying to use regex to check for letters only, so I used the method below. The problem is that if I have a number before or after the letter, the number is ignored and nothing happens, and that's not what I'm trying to do. I'm trying to check for letters ONLY, so if I have anything other than letters an error message pops up. If I have letters only it works fine, and if I have numbers only it also works fine; the problem is that if I have a letter and a number it doesn't work correctly.
Regex _regex = new Regex("[A-Z]");
Match Instruction_match = _regex.Match(Instruction_Seperator[1]);
if (!Instruction_match.Success) // "A," or "B," or "C,"...etc.
{
MessageBox.Show("Error, Please letters only");
}
Note that Instruction_Seperator[1] is taken from the user through a textbox, where the user MUST input only letters, with nothing before or after them. Do you have any idea why the message box doesn't pop up when I input letters and numbers?
Looking forward to your replies :)
Can I have a specific size where an error pops up if the user exceeds it? For example, if the user is allowed to input only 3 Latin letters and nothing else, is there a length constraint in regex? :)
That pattern will match any string that contains a capital Latin letter; if it happens to contain any other characters they will be ignored. If you want a pattern that will match only if the string consists entirely of capital Latin letters, you'll want to use start (^) and end ($) anchors, as well as a one-or-more quantifier (+) after your character class, like this:
^[A-Z]+$
In the end your code should look like this:
Regex _regex = new Regex("^[A-Z]+$");
Match Instruction_match = _regex.Match(Instruction_Seperator[1]);
if (!Instruction_match.Success) // "A," or "B," or "C,"...etc.
{
MessageBox.Show("Error, Please letters only");
}
Given the update to your question and some other comments you've made, here are some more patterns you might need to use instead:
^[A-Z]{3}$ - This pattern will match exactly three capital Latin characters
^[A-Z]{1,3}$ - This pattern will match one, two, or three capital Latin characters
^[A-Z]([A-Z]{2})?$ - This pattern will match one or three capital Latin characters
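The difference between the three length-constrained patterns is easiest to see side by side. This is a small sketch with made-up names (LetterValidator, ExactlyThree, OneToThree, OneOrThree) and made-up inputs.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LetterValidator
{
    // Exactly three capital Latin letters.
    public static bool ExactlyThree(string s) => Regex.IsMatch(s, "^[A-Z]{3}$");

    // One, two, or three capital Latin letters.
    public static bool OneToThree(string s) => Regex.IsMatch(s, "^[A-Z]{1,3}$");

    // One or three capital Latin letters, but not two.
    public static bool OneOrThree(string s) => Regex.IsMatch(s, "^[A-Z]([A-Z]{2})?$");

    public static void Main()
    {
        string[] inputs = { "A", "AB", "ABC", "ABCD", "A1" };

        foreach (string input in inputs)
        {
            Console.WriteLine("{0}: {{3}}={1}, {{1,3}}={2}, 1-or-3={3}",
                input, ExactlyThree(input), OneToThree(input), OneOrThree(input));
        }
    }
}
```

"ABCD" and "A1" fail all three checks: the first is too long, and the second contains a character outside [A-Z].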
Change your pattern to:
Regex _regex = new Regex("^[A-Z]+$");
The regex you have used [A-Z] matches only a single capital letter. Use [A-Z]+ for any length of continuous capital lettered substring of the input. Use ^[A-Z]+$ which implies that substring is anchored at both start and end position of the input string.
I am assuming that you would like to match one to three letters, so the matched strings in the following example are "D", "A", "AB" and "ABC". If you want any number of letters, use ^[A-Z]+$
var patterns = new string[] { "12ABC", "D", "A","AB","ABC","A2B3","A1BC", "A123", "123ABC12" };
var regex = new Regex(@"^[A-Z]{1,3}$");
foreach (var pattern in patterns)
{
var isMatch = regex.Match(pattern);
if (isMatch.Success)
Console.WriteLine("Found Matching string {0}", pattern);
}
Please look at the modified code. The change for your question is to add {1,3} to the regex pattern, which means one to three capital Latin letters.
