I am using a string list in C#, which contains a list of subjects.
E.g. art, science, music.
I then have the user input "I would like to study science and art."
I would like to store the matched subjects in a variable, but I get lots of duplicates like "science, sciencemusic" (that's not a typo).
I think it's caused by the looping of the foreach statement. Could there be an easier way to do this, or is there something wrong in my code? I can't figure it out.
Here's my code:
string input = "I would like to study science and art.";
string result = "";
foreach (string sub in SubjectsClass.SubjectsList)
{
Regex rx = new Regex(sub, RegexOptions.IgnoreCase);
MatchCollection matches = rx.Matches(input);
foreach (Match match in matches)
{
result += match.Value;
}
}
The SubjectsClass property "SubjectsList" is read from a CSV file that contains only subject names, one per line:
CSV File:
Computing
English
Maths
Art
Science
Engineering
private List<string> subjects = new List<string>();
//Read data from csv file to list...
public List<string> SubjectsList
{
get { return subjects; }
}
Currently the output I get is this:
"input": "art science",
"Subject": "artscienceartscienceartscience"
If I change:
result += match.Value;
to
result += match.Value + " ";
I get lots of spaces.
Edit: I should mention that this code runs on a WPF C# button press and then shows the result.
Using your code, and with the following test data:
List<string> subjects = new List<string>{"Science", "Art", "Maths"};
string input = "I would like to study science and art.";
I don't get duplicates.
To avoid blank matches, check that the matched value is not empty:
foreach (Match match in matches)
{
if (!string.IsNullOrEmpty(match.Value))
{
result += match.Value + " ";
}
}
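One possible source of the blank matches is an empty entry in the subject list (for example a trailing blank line in the CSV), because an empty pattern matches at every position of the input. The following is a minimal sketch, assuming the goal is to report each subject at most once; `SubjectMatcher` and `FindSubjects` are hypothetical names, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class SubjectMatcher
{
    // Returns every subject that occurs in the input as a whole word, at most once,
    // in the order of the subject list.
    public static string FindSubjects(string input, IEnumerable<string> subjects)
    {
        var found = subjects
            .Where(s => !string.IsNullOrWhiteSpace(s))      // skip blank CSV lines
            .Select(s => s.Trim())
            .Distinct(StringComparer.OrdinalIgnoreCase)     // ignore repeated list entries
            .Where(s => Regex.IsMatch(
                input,
                @"\b" + Regex.Escape(s) + @"\b",            // whole-word match, subject treated literally
                RegexOptions.IgnoreCase));

        return string.Join(", ", found);
    }
}
```

With the CSV subjects from the question and the input "I would like to study science and art.", `SubjectMatcher.FindSubjects(input, subjects)` should return "Art, Science". Because the result is built from scratch by `string.Join`, nothing accumulates between button presses.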
I have to find whether a string contains one of the exact words present in a list.
E.g.:
List<string> KeyWords = new List<string>(){"Test","Re Test","ACK"};
String s1 = "Please give the Test";
String s2 = "Please give Re Test";
String s3 = "Acknowledge my work";
Now,
when I use KeyWords.Where(x=>x.Contains(s1)), it gives me a match, which is correct. But for s3 it should not.
Is there any workaround for this?
Split the string on spaces and match the individual words.
I hope that works.
How about using regular expressions?
public static class Program
{
public static void Main(string[] args)
{
var keywords = new List<string>() { "Test", "Re Test", "ACK" };
var targets = new[] {
"Please give the Test",
"Please give Re Test",
"Acknowledge my work"
};
foreach (var target in targets)
{
Console.WriteLine($"{target}: {AnyMatches(target, keywords)}");
}
Console.ReadKey();
}
private static bool AnyMatches(string target, IEnumerable<string> keywords)
{
foreach (var keyword in keywords)
{
var regex = new Regex($"\\b{Regex.Escape(keyword)}\\b", RegexOptions.IgnoreCase);
if (regex.IsMatch(target))
return true;
}
return false;
}
}
Creating the regular expressions on the fly for every call is probably not the best option in production, so consider building a list of Regex objects from your keywords once, instead of storing only the keywords in a plain string list.
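A minimal sketch of that idea, assuming the keyword list does not change after start-up; the `KeywordMatcher` class is a hypothetical name introduced here for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical wrapper that builds the whole-word patterns once and reuses them.
public sealed class KeywordMatcher
{
    private readonly List<Regex> _patterns;

    public KeywordMatcher(IEnumerable<string> keywords)
    {
        // Each keyword is escaped and wrapped in word boundaries a single time.
        _patterns = keywords
            .Select(k => new Regex(@"\b" + Regex.Escape(k) + @"\b", RegexOptions.IgnoreCase))
            .ToList();
    }

    // True if at least one keyword occurs in the target as a whole word or phrase.
    public bool AnyMatches(string target)
    {
        return _patterns.Any(p => p.IsMatch(target));
    }
}
```

Usage would look like `var matcher = new KeywordMatcher(keywords);` followed by `matcher.AnyMatches(target)` for each target string.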
A slightly different solution:
void Main()
{
var KeyWords = new List<string>(){ "Test","Re Test","ACK" };
var array = new string[] {
"Please give the Test",
"Please give Re Test",
"Acknowledge my work"
};
foreach(var c in array)
{
Contains(c,KeyWords); // Your result.
}
}
private bool Contains(string sentence, List<string> keywords) {
var result = keywords.Select(keyWord=>{
var parts3 = Regex.Split(sentence, keyWord, RegexOptions.IgnoreCase).Where(x=>!string.IsNullOrWhiteSpace(x)).First().Split((char[])null); // Split by the keywords and get the rest of the words splitted by empty space
var splitted = sentence.Split((char[])null); // split the original string.
return parts3.Where(t=>!string.IsNullOrWhiteSpace(t)).All(x=>splitted.Any(t=>t.Trim().Equals(x.Trim(),StringComparison.InvariantCultureIgnoreCase)));
}); // Check if all remaining words from parts3 are inside the existing splitted string, thus verifying if full words.
return result.All(x=>x);// if everything matches then it was a match on full word.
}
The idea is to split by the word you are looking for, e.g. split by ACK, and then check whether the remaining words match words from the original string split on whitespace. If they all match, the keyword was matched as a whole word and the result is true. If the split cut through a word, meaning only a substring was taken out, the remaining fragments won't match and the result will be false.
Your usage of Contains is backwards:
var foundKW = KeyWords.Where(kw => s1.Contains(kw)).ToList();
How about using a regex?
Use a pattern like \bthe\b, where \b represents a word boundary.
List<string> KeyWords = new List<string>(){"Test","Re Test","ACK"};
String s1 = "Please give the Test";
String s2 = "Please give Re Test";
String s3 = "Acknowledge my work";
bool result = false;
foreach(string str in KeyWords)
{
// verbatim strings (@"...") keep \b as a regex word boundary instead of a backspace character
result = Regex.IsMatch(s1, @"\b" + Regex.Escape(str) + @"\b");
if(result)
break;
}
I'm using Regex to match values from a file. I want to match two different strings, and each appears more than once, which is why I am using a loop. I can match a single string, but not both together.
Regex celcius = new Regex(@"""temp"":\d*\.?\d{1,3}");
foreach (Match match in celcius.Matches(htmlcode))
{
Regex date = new Regex(@"\d{4}-\d{2}-\d{2}");
foreach (Match match1 in date.Matches(htmlcode))
{
string date1 = Convert.ToString(match1.Value);
string temperature = Convert.ToString(match.Value);
Console.Write(temperature + "\t" + date1);
}
}
htmlcode:
{"temp":287.05,"temp_min":286.932,"temp_max":287.05,"pressure":1019.04,"sea_level":1019.04,"grnd_level":1001.11,"humidity":89,"temp_kf":0.12},"weather":[{"id":804,"main":"Clouds","description":"overcast
clouds","icon":"04n"}],"clouds":{"all":100},"wind":{"speed":0.71,"deg":205.913},"sys":{"pod":"n"},"dt_txt":"2019-09-22
21:00:00"},{"dt":1569196800,"main":{"temp":286.22,"temp_min":286.14,"temp_max":286.22,"pressure":1019.27,"sea_level":1019.27,"grnd_level":1001.49,"humidity":90,"temp_kf":0.08},"weather":[{"id":804,"main":"Clouds","description":"overcast
clouds","icon":"04n"}],"clouds":{"all":99},"wind":{"speed":0.19,"deg":31.065},"sys":{"pod":"n"},"dt_txt":"2019-09-23
00:00:00"},{"dt":1569207600,"main":{"temp":286.04,"temp_min":286,"temp_max":286.04,"pressure":1019.38,"sea_level":1019.38,"grnd_level":1001.03,"humidity":89,"temp_kf":0.04},"weather":
You can use a single Regex pattern with two capturing groups for temperature and date. The pattern can look something like this:
("temp":\d*\.?\d{1,3}).*?(\d{4}-\d{2}-\d{2})
C# example:
string htmlcode = // ...
var matches = Regex.Matches(htmlcode, @"(""temp"":\d*\.?\d{1,3}).*?(\d{4}-\d{2}-\d{2})");
foreach (Match m in matches)
{
Console.WriteLine(m.Groups[1].Value + "\t" + m.Groups[2].Value);
}
Output:
"temp":287.05 2019-09-22
"temp":286.22 2019-09-23
I don't think you have HTML. I think you have a collection of JSON (JavaScript Object Notation) objects, which is a lightweight format for passing data.
So, this is one of your "HTML" objects:
{
"temp":287.05,
"temp_min":286.932,
"temp_max":287.05,
"pressure":1019.04,
"sea_level":1019.04,
"grnd_level":1001.11,
"humidity":89,
"temp_kf":0.12,
"weather":[{
"id":804,
"main":"Clouds",
"description":"overcast clouds",
"icon":"04n"
}],
"clouds":{
"all":100
},
"wind":{
"speed":0.71,"deg":205.913
},
"sys":{
"pod":"n"
},
"dt_txt":"2019-09-22 21:00:00"
}
So, I would recommend decoding the JSON with the C# web helpers and reading the values from the object directly.
//reference the System.Web.Helpers assembly
using System.Web.Helpers;
//assumes htmlcode is a collection of strings, each holding one JSON object like the one above
foreach(var line in htmlcode)
{
dynamic data = Json.Decode(line);
//in the full response "temp" is nested under "main", i.e. data.main.temp
string temperature = Convert.ToString(data.temp);
string date = Convert.ToDateTime(data.dt_txt).ToString("yyyy-MM-dd");
Console.WriteLine($"temperature: {temperature} date: {date}");
}
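As an alternative to the web helpers, the same values can be read with System.Text.Json. This is a sketch under stated assumptions: it needs .NET Core 3.0 or later (or the System.Text.Json package), it expects one complete forecast entry in which "temp" is nested under "main" as in the question's data, and `ForecastReader` and `Describe` are hypothetical names.

```csharp
using System;
using System.Globalization;
using System.Text.Json;

// Hypothetical helper, for illustration only.
public static class ForecastReader
{
    // Reads the temperature and the date part of "dt_txt" from one forecast entry.
    public static string Describe(string entryJson)
    {
        using (JsonDocument doc = JsonDocument.Parse(entryJson))
        {
            JsonElement root = doc.RootElement;

            // "temp" sits inside the "main" object of each entry.
            double temp = root.GetProperty("main").GetProperty("temp").GetDouble();

            // "dt_txt" looks like "2019-09-23 00:00:00"; the first 10 characters are the date.
            string dtTxt = root.GetProperty("dt_txt").GetString();
            string date = dtTxt.Substring(0, 10);

            return temp.ToString(CultureInfo.InvariantCulture) + "\t" + date;
        }
    }
}
```

Parsing the structure avoids relying on the order in which "temp" and the date happen to appear in the text, which the regex approaches depend on.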
I have a string which is somewhat like this:
string data = "I have a {apple} and a {orange}";
I need to extract the content inside the {}, possibly many times (let's say 10).
I tried this
string[] split = data.Split(new char[] { '{', '}' }, StringSplitOptions.RemoveEmptyEntries);
The problem is that my data is going to be dynamic, and I won't know where the {...} placeholders will appear. It can also be something like this:
Give {Pen} {Pencil}
I guess the above method wouldn't work, so I would really like to know a dynamic way to do this. Any input would be really helpful.
Thanks and Regards
Try this:
string data = "I have a {apple} and a {orange}";
Regex rx = new Regex("{(.*?)}");
foreach (Match item in rx.Matches(data))
{
Console.WriteLine(item.Groups[1].Value);
}
You need to use Regex to get all values you need.
If the string between {} does not contain nested {} you can use a regex to perform this task:
string data = "I have a {apple} and a {orange}";
Regex reg = new Regex(@"\{(?<Name>[A-Za-z0-9]*)\}");
var matches = reg.Matches(data);
foreach (var m in matches.OfType<Match>())
{
Console.WriteLine($"Found {m.Groups["Name"].Value} at {m.Index}");
}
To replace the strings between {} you can use Regex.Replace:
reg.Replace(data, m => m.Groups["Name"].Value + "_")
// Will produce "I have a apple_ and a orange_"
To get the rest of the string, you can use Regex.Split:
Regex reg2 = new Regex(@"\{[A-Za-z0-9]*\}");
var result = reg2.Split(data);
// will contain "I have a ", " and a ", "", you might want to remove ""
As I understand, you want to split that string into parts like this:
I have a
{apple}
and a
{orange}
And then you want to go over those parts and do something with them, and that something is different depending on whether part is enclosed in {} or not. If so - you need Regex.Split:
string data = "I have a {apple} and a {orange}";
var parts = Regex.Split(data, @"({.*?})");
foreach (var part in parts) {
if (part.StartsWith("{") && part.EndsWith("}")) {
var trimmed = part.TrimStart('{').TrimEnd('}');
// "apple" and "orange" go here
// do something with {} part
}
else {
// "I have a " and " and a " go here
// do something with other part
}
}
I have this piece of code, which repeats about 250,000 times in a loop searching through the records. There are 28 different Regex patterns (this being one of them). Is there an easier way than writing each match to a file, reading it back into a string, and using each string towards the end of my code?
if (CSV_Radio_Button.Checked)
{
string pattern = @"(?<=\*\*Course : )(.*?)(?=\*\*Going)";
Regex myRegex = new Regex(pattern, RegexOptions.IgnoreCase);
Match m = myRegex.Match(text);
while (m.Success)
{
string CourseToString = m.ToString();
System.IO.File.WriteAllText(CourseFile, UppercaseWords(CourseToString));
m = m.NextMatch();
}
}
string Course = File.ReadLines(CourseFile).ElementAtOrDefault(0);
I haven't tested this, but this is how I've done something similar. A list would work but you'd be iterating over the list when you are done to build the string. You could also use one of the various Streams.
StringBuilder CourseBuilder = new StringBuilder();
while (m.Success)
{
CourseBuilder.AppendLine(m.ToString());
m = m.NextMatch();
}
string Course = CourseBuilder.ToString();
If you are intentionally overwriting the file here:
System.IO.File.WriteAllText(CourseFile, UppercaseWords(CourseToString));
you can replace that line with an assignment to a string declared before your block, like this:
string CSV_regex_result = null;
if (CSV_Radio_Button.Checked)
{
...
while(m.Success) {
CSV_regex_result = UppercaseWords(m.ToString());
m = m.NextMatch();
}
}
Now you can access the last match in CSV_regex_result.
If the overwriting is a mistake and you want all matches, it depends on whether you want them separated or not.
If you want a single string, David Green's answer is the way to go. But be careful about the string size limit.
If you want separate results:
In my example, replace the declaration of CSV_regex_result with List<string> CSV_regex_result = new List<string>(); and in the loop replace CSV_regex_result = UppercaseWords(m.ToString()); with CSV_regex_result.Add(UppercaseWords(m.ToString()));.
If you want the results accessible by regex name, you can use:
Dictionary<string, List<string>> result = new Dictionary<string, List<string>>();
...
List<string> Course_result = new List<string>();
...
//in loop
Course_result.Add(UppercaseWords(m.ToString()));
...
//after loop
if (!result.ContainsKey("Course")) result.Add("Course",Course_result);
else result["Course"]=Course_result;
Of course, if you want the merged results of each regex, you can create a Dictionary<string,string> and add the results generated with a StringBuilder.
If you might run out of memory (it depends on your memory size and the amount of data), it can be better to stick with your current approach (saving parts to files).
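The dictionary idea can be extended so that all 28 patterns go through one loop. This is a minimal sketch, assuming each pattern is registered under a name of your choosing; `FieldExtractor` and `ExtractAll` are hypothetical names, and the question's UppercaseWords step is left out because its definition is not shown.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class FieldExtractor
{
    // Runs every named pattern over the text and collects all matches per name.
    public static Dictionary<string, List<string>> ExtractAll(
        string text, IDictionary<string, Regex> patterns)
    {
        var results = new Dictionary<string, List<string>>();
        foreach (var pair in patterns)
        {
            var values = new List<string>();
            foreach (Match m in pair.Value.Matches(text))
            {
                values.Add(m.Value);
            }
            results[pair.Key] = values;   // an empty list means the pattern did not match
        }
        return results;
    }
}
```

The Regex objects can be created once, outside the 250,000-iteration loop, and the pattern dictionary reused for every record.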
I am coding a converter program that converts an old version of code to the new version: you put the old text in a text box and it converts TXT to XML. I'm trying to get each item between two characters; below is the string I'm trying to split. I have put just the names of the params in the " " to protect my users' credentials. So I want to get every part of the code between the ",".
["Id","Username","Cash","Password"],["Id","Username","Cash","Password"]
And then add each string to a list so it would be like
Item 1
["Id","Username","Cash","Password"]
Item 2
["Id","Username","Cash","Password"]
I would split it using ",", but that would mess up because there is also a "," between the params inside each group, so I tried using "],":
string input = textBox1.Text;
string[] parts1 = input.Split(new string[] { "]," }, StringSplitOptions.None);
foreach (string str in parts1)
{
//Params is a list...
Params.Add(str);
}
MessageBox.Show(string.Join("\n\n", Params));
But it sort of takes the ] off the end of each one, and it messes up in other ways.
This looks like a great opportunity for Regular Expressions.
My approach would be to get the row parts first, then get the column parts. I'm sure there are about 30 ways to do this, but this is my (simplistic) approach.
using System;
using System.Text.RegularExpressions;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
var rowPattern = new Regex(@"(?<row>\[[^]]+\])", RegexOptions.Multiline | RegexOptions.ExplicitCapture);
var columnPattern = new Regex(@"(?<column>\"".+?\"")", RegexOptions.Multiline | RegexOptions.ExplicitCapture);
var data = "[\"Id\",\"Username\",\"Cash\",\"Password\"],[\"Id\",\"Username\",\"Cash\",\"Password\"]";
var rows = rowPattern.Matches(data);
var rowCounter = 0;
foreach (var row in rows)
{
Console.WriteLine("Row #{0}", ++rowCounter);
var columns = columnPattern.Matches(row.ToString());
foreach (var column in columns)
Console.WriteLine("\t{0}", column);
}
Console.ReadLine();
}
}
}
Hope this helps!!
You can use Regex.Split() together with positive lookbehind and lookahead to do this:
var parts = Regex.Split(input, "(?<=]),(?=\\[)");
Basically this says “split on , with ] right before it and [ right after it”.
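A small sketch of the lookaround split applied to data shaped like the question's; `RowSplitter` and `SplitRows` are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class RowSplitter
{
    // Splits only on commas that sit between "]" and "[".
    // The lookbehind and lookahead consume no characters, so the brackets stay in the parts.
    public static List<string> SplitRows(string input)
    {
        return new List<string>(Regex.Split(input, @"(?<=\]),(?=\[)"));
    }
}
```

For an input with two bracketed groups this yields two items, each still wrapped in its own [ and ], and the commas inside a group are left untouched.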
Assuming that the character '|' does not occur in your original data, you can try:
input.Replace("],[", "]|[").Split(new char[]{'|'});
If the pipe character does occur, use another (non-occurring) character.