Match names with Unicode chars - C#

Can somebody help me match strings such as "BEREŽALINS" and "GŽIBOVSKIS" in C# and JS? I've tried
\A\w+\z
(?>\P{M}\p{M}*)+
^[-a-zA-Z\p{L}']{2,50}$
and so on, but nothing works.
Thanks

Just wrote a little console app to do it:
private static void Main(string[] args) {
var list = new List<string> {
"BEREŽALINS",
"GŽIBOVSKIS",
"TEST"
};
var pat = new Regex(@"[^\u0000-\u007F]");
foreach (var name in list) {
Console.WriteLine(string.Concat(name, " = ", pat.IsMatch(name) ? "Match" : "Not a Match"));
}
Console.ReadLine();
}
Works with the two examples you gave me, but not sure about all scenarios :)

Can you give an example of what it should not match?
Reading your question, it seems you want to match any whole string (perhaps one per line). If that's the case, just use
^.*$
In C# this becomes
foundMatch = Regex.IsMatch(SubjectString, "^.*$", RegexOptions.Multiline);
And in javascript this is
if (/^.*$/m.test(subject)) {
// Successful match
} else {
// Match attempt failed
}
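Since the question also asks about JavaScript, here is a minimal sketch of one way to accept names containing letters such as Ž. It assumes an engine that supports Unicode property escapes (the `u` flag). The function name `isName` is made up for illustration, and the 2 to 50 length limit is taken from the asker's third attempt; this is not a definitive validation rule.

```javascript
// Hypothetical helper: accepts 2-50 characters made of Unicode letters,
// combining marks, apostrophes, and hyphens.
function isName(str) {
  // \p{L} = any letter, \p{M} = combining mark; both require the "u" flag.
  const re = /^[\p{L}\p{M}'-]{2,50}$/u;
  return re.test(str);
}

console.log(isName("BEREŽALINS")); // true
console.log(isName("GŽIBOVSKIS")); // true
console.log(isName("TEST123"));    // false (digits are not letters)
```

Unlike the non-ASCII check above, this also accepts plain ASCII names such as "TEST", which may or may not be what is wanted.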

Regular expressions - Equal number of characters in left and right

So I have this regular expression
[a+][a-z-[a]]{1}[a+]
which will match string "aadaa"
but it will also match string "aaaaaaaadaa"
Is there any way to force it to match only strings in which the number of a's on the left equals the number of a's on the right,
so that it matches "aadaa" but not "aaaaaaaadaa"?
Edit
With the help of Peter's answer I got it working; this is the version that meets my requirement:
(a+)[a-z-[a]]{1}\1
You can use a back reference, as follows:
console.log(check("ada"));
console.log(check("aadaa"));
console.log(check("aaaaaaaadaa"));
console.log(check("aaadaaaaaaa"));
function check(str) {
var re = /^(.*).\1$/;
return re.test(str);
}
Or to only match a's and d's:
console.log(check("aca"));
console.log(check("aadaa"));
console.log(check("aaaaaaaadaa"));
console.log(check("aaadaaaaaaa"));
function check(str) {
var re = /^(a*)d\1$/;
return re.test(str);
}
Or to only match a's that surround not-an-a:
console.log(check("aca"));
console.log(check("aadaa"));
console.log(check("aaaaaaaadaa"));
console.log(check("aaadaaaaaaa"));
function check(str) {
var re = /^(a*)[b-z]\1$/;
return re.test(str);
}
I realize all of the above is JavaScript, which was easy for quick demoing within the context of SO.
I made a working DotNetFiddle with the following C# code that is similar to all the above:
public static Regex re = new Regex(@"^(a+)[b-z]\1$");
public static void Main()
{
check("aca");
check("ada");
check("aadaa");
check("aaddaa");
check("aadcaa");
check("aaaaaaaadaa");
check("aadaaaaaaaa");
}
public static void check(string str)
{
Console.WriteLine(str + " -> " + re.IsMatch(str));
}
You can also use the following regex for the same purpose, although I would prefer the one suggested by @PeterB:
console.log(check("aca"));
console.log(check("aadaa"));
console.log(check("aaaaaaaadaa"));
console.log(check("aaadaaaaaaa"));
function check(str) {
var re = /^(\w+)[A-Za-z]\1$/;
return re.test(str);
}
The code is the same as in Peter B's answer; only the regex is changed.
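The character class subtraction `[a-z-[a]]` used in the question is .NET syntax and does not carry over to JavaScript as written. As a rough sketch, a negative lookahead can express the same "any lowercase letter except a" idea there. The function name `checkNotA` is made up for this illustration.

```javascript
// Hypothetical helper: equal runs of a's around exactly one lowercase non-a letter.
function checkNotA(str) {
  // (a+)      captures the left run of a's
  // (?!a)[a-z] one lowercase letter that is not "a"
  // \1        the right run must repeat the captured left run exactly
  const re = /^(a+)(?!a)[a-z]\1$/;
  return re.test(str);
}

console.log(checkNotA("aadaa"));       // true
console.log(checkNotA("aaaaaaaadaa")); // false
console.log(checkNotA("aaddaa"));      // false
```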

Need to split something dynamically from a string

I have a string which is somewhat like this:
string data = "I have a {apple} and a {orange}";
I need to extract the content inside {}, possibly up to, say, 10 times.
I tried this
string[] split = data.Split(new char[] { '{', '}' }, StringSplitOptions.RemoveEmptyEntries);
The problem is that my data is dynamic, so I don't know where the {<>} placeholders will appear. It could also be something like this:
Give {Pen} {Pencil}
I guess the above method wouldn't work, so I would really like to know a dynamic way to do this. Any input would be really helpful.
Thanks and Regards
Try this:
string data = "I have a {apple} and a {orange}";
Regex rx = new Regex("{(.*?)}");
foreach (Match item in rx.Matches(data))
{
Console.WriteLine(item.Groups[1].Value);
}
You need to use Regex to get all values you need.
If the string between {} does not contain nested {} you can use a regex to perform this task:
string data = "I have a {apple} and a {orange}";
Regex reg = new Regex(@"\{(?<Name>[A-Za-z0-9]*)\}");
var matches = reg.Matches(data);
foreach (var m in matches.OfType<Match>())
{
Console.WriteLine($"Found {m.Groups["Name"].Value} at {m.Index}");
}
To replace the strings between {} you can use Regex.Replace:
reg.Replace(data, m => m.Groups["Name"].Value + "_")
// Will produce "I have a apple_ and a orange_"
To get the rest of the string, you can use Regex.Split:
Regex reg2 = new Regex(@"\{[A-Za-z0-9]*\}");
var result = reg2.Split(data);
// will contain "I have a ", " and a ", "", you might want to remove ""
As I understand, you want to split that string into parts like this:
I have a
{apple}
and a
{orange}
And then you want to go over those parts and do something with them, and that something is different depending on whether part is enclosed in {} or not. If so - you need Regex.Split:
string data = "I have a {apple} and a {orange}";
var parts = Regex.Split(data, @"({.*?})");
foreach (var part in parts) {
if (part.StartsWith("{") && part.EndsWith("}")) {
var trimmed = part.TrimStart('{').TrimEnd('}');
// "apple" and "orange" go here
// do something with {} part
}
else {
// "I have a " and " and a " go here
// do something with other part
}
}
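For comparison, the same extraction can be sketched in JavaScript. This assumes the placeholders are never nested and that `String.prototype.matchAll` is available; `extractPlaceholders` is a hypothetical name used only for this example.

```javascript
// Hypothetical helper: returns the text inside each {...} pair, in order.
function extractPlaceholders(data) {
  // [^{}]* keeps each match inside a single pair of braces (no nesting).
  const re = /\{([^{}]*)\}/g;
  // matchAll yields one match object per placeholder; m[1] is the captured content.
  return Array.from(data.matchAll(re), (m) => m[1]);
}

console.log(extractPlaceholders("I have a {apple} and a {orange}")); // [ 'apple', 'orange' ]
console.log(extractPlaceholders("Give {Pen} {Pencil}"));             // [ 'Pen', 'Pencil' ]
```

Because the pattern looks for the braces themselves, it does not matter where in the string the placeholders appear or how many there are.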

How to read between a specified character in a string?

I was trying to create a list from a user input with something like this:
Create new list: word1, word2, word3, etc...,
How do I get those words one by one, in order, using only the commas as separators, and place them into an array or similar? Example:
string Input = Console.ReadLine();
if (Input.Contains("Create new list:"))
{
foreach (char character in Input)
{
if (character == ',')//when it reach a comma
{
//code goes here, where I got stuck...
}
}
}
Edit: I didn't know "Split" existed, my mistake... but it would be great if you could explain how to use it for the problem above.
You can use this:
String words = "word1, word2, word3";
List:
List<string> wordsList= words.Split(',').ToList<string>();
Array:
string[] namesArray = words.Split(',');
@patrick Artner beat me to it, but you can just split the input with the comma as the argument, or whatever you want the delimiter to be.
Here is an example; the documentation explains the details.
using System;
public class Example {
public static void Main() {
String value = "This is a short string.";
Char delimiter = 's';
String[] substrings = value.Split(delimiter);
foreach (var substring in substrings)
Console.WriteLine(substring);
}
}
The example displays the following output:
Thi
i
a
hort
tring.
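Note that splitting "word1, word2, word3" on ',' alone leaves a leading space on every word after the first, and a trailing comma produces an empty last piece. A minimal sketch of handling both, written in JavaScript for brevity; the helper name `parseList` and the exact prefix string are assumptions based on the question's example.

```javascript
// Hypothetical helper: parses "Create new list: a, b, c," into ["a", "b", "c"].
function parseList(input) {
  const prefix = "Create new list:";
  if (!input.startsWith(prefix)) {
    return []; // not a list command
  }
  return input
    .slice(prefix.length)               // drop the command prefix
    .split(",")                         // cut at every comma
    .map((word) => word.trim())         // remove the spaces around each word
    .filter((word) => word.length > 0); // skip the empty piece after a trailing comma
}

console.log(parseList("Create new list: word1, word2, word3,")); // [ 'word1', 'word2', 'word3' ]
```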

How to split at every second quotation mark

I have a string that looks like this
2,"E2002084700801601390870F"
3,"E2002084700801601390870F"
1,"E2002084700801601390870F"
4,"E2002084700801601390870F"
3,"E2002084700801601390870F"
This is one whole string; imagine it all on one row.
I want to split it into the pieces shown above, like this:
2,"E2002084700801601390870F"
I cannot change the way it is formatted, so my best bet is to split at every second quotation mark, but I haven't found a good way to do this. I've tried this https://stackoverflow.com/a/17892392/2914876 but I only get an error about invalid arguments.
Another issue is that this project is running .NET 2.0 so most LINQ functions aren't available.
Thank you.
Try this
var regEx = new Regex(@"\d+\,"".*?""");
var lines = regEx.Matches(txt).OfType<Match>().Select(m => m.Value).ToArray();
On .NET 2.0, use foreach instead of the LINQ Select:
Regex regEx = new Regex(@"\d+\,"".*?""");
foreach(Match m in regEx.Matches(txt))
{
var curLine = m.Value;
}
I see three possibilities, none of them particularly exciting.
1. As @dvnrrs suggests, if there's no comma where you have line breaks, you should be in great shape. Replace ," with something novel, replace the remaining "s with what you need, then replace the "something novel" with ," to restore them. This is probably the most solid approach: it solves the problem without much room for bugs.
2. Iterate through the string, looking for the index of the next " after the previous index, and maintain a state machine to decide whether to manipulate it or not.
3. Split the string on "s and rejoin the pieces in whatever way works best for your application.
I realize regular expressions will handle this but here's a pure 2.0 way to handle as well. It's much more readable and maintainable in my humble opinion.
using System;
using System.Collections.Generic;
namespace ConsoleApplication1
{
internal class Program
{
private static void Main(string[] args)
{
const string data = @"2,""E2002084700801601390870F""3,""E2002084700801601390870F""1,""E2002084700801601390870F""4,""E2002084700801601390870F""3,""E2002084700801601390870F""";
var parsedData = ParseData(data);
foreach (var parsedDatum in parsedData)
{
Console.WriteLine(parsedDatum);
}
Console.ReadLine();
}
private static IEnumerable<string> ParseData(string data)
{
var results = new List<string>();
var split = data.Split(new [] {'"'}, StringSplitOptions.RemoveEmptyEntries);
if (split.Length % 2 != 0)
{
throw new Exception("Data Formatting Error");
}
for (var index = 0; index < split.Length; index += 2) // step through (number, value) pairs across the whole array
{
results.Add(string.Format(@"{0}""{1}""", split[index], split[index + 1])); // rebuilds e.g. 2,"E2002084700801601390870F"
}
return results;
}
}
}
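The regex idea from the first answer can also be sketched outside .NET. The JavaScript version below assumes every record has the shape "digits, comma, quoted value" and that the quoted value never contains a quote itself; `splitRecords` is a hypothetical name.

```javascript
// Hypothetical helper: cuts the run-together records after every second quote.
function splitRecords(text) {
  // One record = digits, a comma, then a quoted value with no quotes inside it.
  return text.match(/\d+,"[^"]*"/g) || [];
}

const sample = '2,"E2002084700801601390870F"3,"E2002084700801601390870F"1,"E2002084700801601390870F"';
const records = splitRecords(sample);
console.log(records.length); // 3
console.log(records[0]);     // 2,"E2002084700801601390870F"
```

Matching each record directly avoids having to count quotation marks by hand.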

Using regex to match multiple times using capturing group

I am attempting to use regex matching to get a list of optional params out of an MVC route and dynamically inject values into the placeholders where variables have been used. See the code below. Unfortunately the sample doesn't find both values but repeats the first. Can anyone offer any help?
using System;
using System.Text.RegularExpressions;
namespace regexTest
{
class Program
{
static void Main(string[] args)
{
var inputstr = "http://localhost:12345/Controller/Action/{route:value1}/{route:value2}";
var routeRegex = new Regex(@"(?<RouteVals>{route:[\w]+})");
var routeMatches = routeRegex.Match(inputstr);
for (var i = 0; i < routeMatches.Groups.Count; i++)
{
Console.WriteLine(routeMatches.Groups[i].Value);
}
Console.ReadLine();
}
}
}
This outputs
{route:value1}
{route:value1}
where I was hoping to get
{route:value1}
{route:value2}
I know nothing about C#, but it may help if you put the quantifier after the closing parenthesis, no?
Update: That post may help you.
Just make a global match :
var inputstr = "http://localhost:12345/Controller/Action/{route:value1}/{route:value2}";
StringCollection resultList = new StringCollection();
Regex regexObj = new Regex(@"\{route:\w+\}");
Match matchResult = regexObj.Match(inputstr);
while (matchResult.Success) {
resultList.Add(matchResult.Value);
matchResult = matchResult.NextMatch();
}
Your results will be stored in resultList.
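// routeMatches must be a MatchCollection here, i.e. the result of Matches() rather than Match()
var routeMatches = routeRegex.Matches(inputstr);
foreach (Match match in routeMatches) {
for (var i = 1; i < match.Groups.Count; ++i)
Console.WriteLine(match.Groups[i].Value);
}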
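The repeated output in the question appears to come from looping over the groups of a single match: group 0 is the whole match and the named group is the same text again, so the second placeholder is never visited. A minimal JavaScript sketch of the "find every match" idea follows; `findRouteValues` is a hypothetical name.

```javascript
// Hypothetical helper: returns every {route:...} placeholder in a URL template.
function findRouteValues(url) {
  // The "g" flag makes match() return all matches instead of only the first one.
  return url.match(/\{route:\w+\}/g) || [];
}

console.log(findRouteValues("http://localhost:12345/Controller/Action/{route:value1}/{route:value2}"));
// [ '{route:value1}', '{route:value2}' ]
```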
