Comparing and combining a List of similar strings - c#

I have a list of strings that I have to run through multiple times to try to reduce duplicates.
List<string> EventsList = BuildList.Distinct().ToList();
This removes exact copies but occasionally there will be a duplicate event message that contains different variations on the exact same event.
For instance:
Error code [123]: Failure in the [X] directory.
Error code [123]: Failure in the [Y] directory.
The intent being that I can compare these strings again and come up with the output:
Error code [123]: Failure in the [X, Y] directory.
Since the varying input is always in brackets, I created the following regex:
string pattern = @"\[([^\]]+)";
RegexOptions options = RegexOptions.IgnoreCase | RegexOptions.Compiled;
Regex ConsolidatorRegex = new Regex(pattern, options);
BuildList = EventsList;
foreach (string singleEvent in BuildList)
{
ConsolidatorRegex.Replace(singleEvent, "");
}
Thinking that I could then compare the strings and remove the duplicates again.
But now I'm stuck. I want to preserve the original order of the chronological events as much as possible but I can't figure the best way to go about this. Running BuildList.Distinct().ToList(); again doesn't help me capture the (often multiple) removed Matches so I can add them back in.
I thought I could run a loop that calls String.Equals and puts all the hits into a dictionary, then compare the dictionary to the EventsList, but I couldn't get the index of the loop in order to create a dictionary key.
Is there a better way to go about this that I'm missing?

You can build your own comparer as mentioned in the docs.
From the docs:
To compare a custom data type, you need to implement this interface
and provide your own GetHashCode and Equals methods for the type.
See the docs for that one.
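As a rough illustration of the comparer idea, the sketch below implements IEqualityComparer<string> so that two messages count as equal when they differ only inside square brackets. The class name BracketInsensitiveComparer and the bracket pattern are assumptions made for this example, not something taken from the question.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical comparer: treats strings as equal when they differ only inside [...].
public sealed class BracketInsensitiveComparer : IEqualityComparer<string>
{
    // Assumed pattern: an opening bracket, anything but a closing bracket, a closing bracket.
    private static readonly Regex Bracketed = new Regex(@"\[[^\]]*\]");

    // Blank out every bracketed value so only the surrounding template is compared.
    private static string Normalize(string s)
    {
        return Bracketed.Replace(s ?? string.Empty, "[]");
    }

    public bool Equals(string x, string y)
    {
        return Normalize(x) == Normalize(y);
    }

    // Equal strings must produce equal hash codes, so hash the normalized form.
    public int GetHashCode(string s)
    {
        return Normalize(s).GetHashCode();
    }
}
```

It could be passed to Distinct, for example EventsList.Distinct(new BracketInsensitiveComparer()). Note that this only drops the variants; it does not merge the bracketed values back into one message.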

You can use the LINQ GroupBy function to group your strings.
var eventListGrouping = BuildList.GroupBy(eventString => ConsolidatorRegex.Replace(eventString, ""));
Then you can iterate over the groups:
foreach(var variation in eventListGrouping)
{
// Use variation.Key to find your 'template string'.
// Iterate over variation to find all the strings you want to combine.
// You can reuse your regex to extract the values you want to combine.
// Make sure every string in the group yields the same number of regex matches.
}
For more info on the IGrouping interface, see MSDN
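The loop body above is left as comments; one possible way to fill it in is sketched here. It assumes a slightly different pattern from the question's (it also consumes the closing bracket), and the names EventConsolidator and Consolidate are made up for the example. Each group is rebuilt from its first member, with every bracket position replaced by the distinct values seen at that position.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class EventConsolidator
{
    // Assumed pattern: a bracketed value, including both brackets.
    private static readonly Regex Bracketed = new Regex(@"\[([^\]]+)\]");

    public static List<string> Consolidate(IEnumerable<string> events)
    {
        return events
            // Key: the message with every bracketed value blanked out.
            .GroupBy(e => Bracketed.Replace(e, "[]"))
            .Select(group =>
            {
                // One list of bracket values per message in the group.
                var valuesPerEvent = group
                    .Select(e => Bracketed.Matches(e).Cast<Match>()
                                          .Select(m => m.Groups[1].Value)
                                          .ToList())
                    .ToList();

                int slot = 0;
                // Rewrite the first message, merging the values found at each bracket position.
                return Bracketed.Replace(group.First(), match =>
                {
                    int current = slot++;
                    var merged = valuesPerEvent.Select(v => v[current]).Distinct();
                    return "[" + string.Join(", ", merged) + "]";
                });
            })
            .ToList();
    }
}
```

For the two example messages from the question this yields "Error code [123]: Failure in the [X, Y] directory.". GroupBy yields groups in the order their keys first appear, so the result follows the order of the first occurrence of each event.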

var memo = new Dictionary<int, List<string>>();
var event_list = new List<string>
{
"Error code [123]: Failure in the [X] directory.",
"Error code [123]: Failure in the [Y] directory.",
"Error code [456]: Failure in the [Y] service.",
};
var pattern = new Regex(@"(code\s\[(?'code'\d+)\]).*\[(?'message'.*)\]");
foreach(var item in event_list)
{
var match = pattern.Match(item);
var code = Int32.Parse(match.Groups["code"].Value);
var msg = match.Groups["message"].Value;
var messages = default(List<string>);
if(!memo.TryGetValue(code, out messages))
memo.Add(code, messages = new List<string>());
messages.Add(msg);
}
var directory_errors = from x in memo where x.Key == 123 select x;
foreach(var error in directory_errors)
Console.WriteLine(string.Format("Error code [{0}]: Failure in the [{1}] directory", error.Key, string.Join(",", from err in error.Value select "'" + err + "'")));
The idea is to use a dictionary of type Dictionary<int, List<string>> where the key is the error code (assumed to be an int) and the value is a List<string>.
For each event, we use a regex to extract the code and the message. We then check the dictionary to see if there is already a list of messages associated with that code. If not, we create the list and add it to the dictionary (using the error code as the key). Either way, we then add the message to the list.
Rextester Demo.

Related

If string in list occurs in string, then add to list

I had a look around and found many similar questions, but none matching mine exactly.
public bool checkInvalid()
{
invalidMessage = filterWords.Any(s => appmessage.Contains(s));
return invalidMessage;
}
If a string is found that matches a string in the list the boolean invalidMessage is set to true.
After this, though, I would like to be able to add each string found to a list. Is there a way I can do this using .Contains(), or can someone recommend another way to go about this?
Many thanks.
From your description, I think this is what you want:
// Set of filtered words
string[] filterWords = {"AAA", "BBB", "EEE"};
// The app message
string appMessage = "AAA CCC BBB DDD";
// The list contains filtered words from the app message
List<string> result = new List<string>();
// Normally, here is what you do
// 1. With each word in the filtered words set
foreach (string word in filterWords)
{
// Check if it exists in the app message
if (appMessage.Contains(word))
{
// If it does, add to the list
result.Add(word);
}
}
But as you said, you want to use LINQ, so instead of doing a loop, you can do it like this:
// If you want to use LINQ, here is the way
result.AddRange(filterWords.Where(word => appMessage.Contains(word)));
If what you want is to get the words in filterWords that are contained in appmessage, you can use Where:
var words = filterWords.Where(s => appmessage.Contains(s)).ToList();

c# - Reading a complex file into a comboBox

So I tried some research, but I just don't know how to google this..
For example, I got a .db (works same as .txt for me) file, written like this:
DIRT: 3;
STONE: 6;
So far, I have code that can fill a comboBox from a file like this:
DIRT,
STONE,
That puts DIRT and STONE in the comboBox. This is the code I'm using for that:
string[] lineOfContents = System.IO.File.ReadAllLines(dbfile);
foreach (var line in lineOfContents)
{
string[] tokens = line.Split(',');
comboBox1.Items.Add(tokens[0]);
}
How do I expand this so it puts, e.g., DIRT and STONE in the comboBox and keeps the rest (3) in variables (ints, like int varDIRT = 3)?
If you want, it doesn't have to be txt or db files. I heard XML files can be used as config files too.
Try doing something like this:
cmb.DataSource = File.ReadAllLines("filePath").Select(d => new
{
// The file separates name and value with ':' (e.g. "DIRT: 3;").
Name = d.Split(':').First().Trim(),
Value = Convert.ToInt32(d.Split(':').Last().Replace(";", "").Trim())
}).ToList();
cmb.DisplayMember = "Name";
cmb.ValueMember= "Value";
Remember that this requires using System.Linq;
If you want to reference the selected value of the combobox, you can use
cmb.SelectedValue;
cmb.SelectedText;
I think you've really got two questions, so I'll try to answer them separately.
The first question is "How can I parse a file that looks like this...
DIRT: 3;
STONE: 6;
into names and integers?" You could remove all the whitespace and semicolons from each line, and then split on colon. A cleaner way, in my opinion, would be to use a regular expression:
// load your file
var fileLines = new[]
{
"DIRT: 3;",
"STONE: 6;"
};
// This regular expression will match anything that
// begins with some letters, then has a colon followed
// by optional whitespace ending in a number and a semicolon.
var regex = new Regex(@"(\w+):\s*([0-9]+);", RegexOptions.Compiled);
foreach (var line in fileLines)
{
// Puts the tokens into an array.
// The zeroth token will be the entire matching string.
// Further tokens will be the contents of the parentheses in the expression.
var tokens = regex.Match(line).Groups;
// This is the name from the line, i.e "DIRT" or "STONE"
var name = tokens[1].Value;
// This is the numerical value from the same line.
var value = int.Parse(tokens[2].Value);
}
If you're not familiar with regular expressions, I encourage you to check them out; they make it very easy to format strings and pull out values. http://regexone.com/
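For comparison, the non-regex route mentioned above (strip the semicolon and whitespace, then split on the colon) might look roughly like this. ItemFileParser and Parse are hypothetical names, and the sketch assumes one "NAME: number;" entry per line; lines that do not fit that shape are skipped.

```csharp
using System.Collections.Generic;

public static class ItemFileParser
{
    public static Dictionary<string, int> Parse(IEnumerable<string> lines)
    {
        var result = new Dictionary<string, int>();
        foreach (var line in lines)
        {
            // "DIRT: 3;" -> "DIRT: 3" -> ["DIRT", " 3"]
            var parts = line.Trim().TrimEnd(';').Split(':');
            if (parts.Length != 2)
            {
                continue; // not a "NAME: value" line
            }

            int value;
            // TryParse avoids an exception when the value is not a number.
            if (int.TryParse(parts[1].Trim(), out value))
            {
                result[parts[0].Trim()] = value;
            }
        }
        return result;
    }
}
```

In practice the lines would come from something like File.ReadAllLines(dbfile), as in the question.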
For the second question, "how do I store the value alongside the name?", I'm not sure I fully understand. If what you want to do is back each item with the numerical value specified in the file, then dub stylee's advice is good for you. You'll need to set the name as the display member and the value as the value member. However, since your data is not in a table, you'll have to put the data somewhere accessible so that the properties you want to use can be named. I recommend a dictionary:
// This is your ComboBox.
var comboBox = new ComboBox();
// load your file
var fileLines = new[]
{
"DIRT: 3;",
"STONE: 6;"
};
// This regular expression will match anything that
// begins with some letters, then has a colon followed
// by optional whitespace ending in a number and a semicolon.
var regex = new Regex(@"(\w+):\s*([0-9]+);", RegexOptions.Compiled);
// This does the same as the foreach loop did, but it puts the results into a dictionary.
var dictionary = fileLines.Select(line => regex.Match(line).Groups)
.ToDictionary(tokens => tokens[1].Value, tokens => int.Parse(tokens[2].Value));
// When you enumerate a dictionary, you get the entries as KeyValuePair objects.
foreach (var kvp in dictionary) comboBox.Items.Add(kvp);
// DisplayMember and ValueMember need to be set to
// the names of usable properties on the item type.
// KeyValuePair has "Key" and "Value" properties.
comboBox.DisplayMember = "Key";
comboBox.ValueMember = "Value";
In this version, I have used Linq to construct the dictionary. If you don't like the Linq syntax, you can use a loop instead:
var dictionary = new Dictionary<string, int>();
foreach (var line in fileLines)
{
var tokens = regex.Match(line).Groups;
dictionary.Add(tokens[1].Value, int.Parse(tokens[2].Value));
}
You could also use FileHelpers library. First define your data record.
[DelimitedRecord(":")]
public class Record
{
public string Name;
[FieldTrim(TrimMode.Right,';')]
public int Value;
}
Then you read in your data like so:
FileHelperEngine engine = new FileHelperEngine(typeof(Record));
//Read from file
Record[] res = engine.ReadFile("FileIn.txt") as Record[];
// write to file
engine.WriteFile("FileOut.txt", res);

Linq query, select everything from one lists property that starts with a string in another list

Hello I'm new to linq and lambda
I have two lists
fl.LocalOpenFiles ...
List<string> f....
there is a property (string) for example taking index 0
fl.LocalOpenFiles[0].Path
I want to select everything from the first list, fl.LocalOpenFiles, where fl.LocalOpenFiles.Path starts with a string from the List<string> f.
I finally got this...
List<LocalOpenFile> lof = new List<LocalOpenFile>();
lof = fl.LocalOpenFiles.Join(
folders,
first => first.Path,
second => second,
(first, second) => first)
.ToList();
But it's just selecting folders that meet the requirement first.Path == second, and I couldn't find a way to get the data that I want, which is something meeting this "braindump" requirement:
f[<any>] == fl.LocalOpenFiles[<any>].Path.Substring(0, f[<any>].Length)
Another Example...
List<string> f = new List<string>{ "abc", "def" };
List<LocalOpenFile> lof = new List<LocalOpenFile>{
new LocalOpenFile("abc"),
new LocalOpenFile("abcc"),
new LocalOpenFile("abdd"),
new LocalOpenFile("defxsldf") };
// Result should be
// abc
// abcc
// defxsldf
I hope I explained it in an understandable way :)
Thank you for your help
Do you mean something like this :
List<LocalOpenFile> result =
lof.Where(file => f.Any(prefix => file.Path.StartsWith(prefix)))
.ToList();
You can use a regular where instead of a join, which gives you more straightforward control over the selection criteria:
var result =
from file in lof
from prefix in f
where file.Path.StartsWith(prefix)
select file.Path; // ...or just file if you want the LocalOpenFile objects
Note that a file matching multiple prefixes may show up more than once. If that is a problem, you can just add a call to Distinct to eliminate duplicates.
EDIT:
If you - as it seems in this case - only want to know the matching path and not the prefix it matches (i.e. you only want data from one collection as in this case), I'd go for @har07's Any solution instead.

get list of files in directory that are 3 extensions and only numbers

I am using this code:
var list = Directory.GetFiles(AppDomain.CurrentDomain.BaseDirectory, _globalSetting.CompanyCode + "trn*.???", SearchOption.TopDirectoryOnly).ToList();
foreach (var listitem in list)
{
listBox_Files.Items.Add(Path.GetFileName(listitem));
}
but it's giving me more than I need. I'd like it to only give me files with 3-character extensions and, if possible, only those whose extensions are all digits. I tried the ??? above but it's giving me this:
WEBTRN25.000
WEBTRN25.001
WEBTRN25.000_copy
WEBTRN34.ABC
I also tried ### but that gave me no results.
This is what I would like it to give back:
WEBTRN25.000
WEBTRN25.001
Any suggestions?
You could combine a regular expression with a LINQ Where clause:
Regex r = new Regex(@"^\.\d\d\d$");
var list = Directory.EnumerateFiles(AppDomain.CurrentDomain.BaseDirectory,
_globalSetting.CompanyCode + "trn*.*",
SearchOption.TopDirectoryOnly)
.Where(x => r.IsMatch(Path.GetExtension(x)));
Notice that I have replaced your call to GetFiles with EnumerateFiles. This method lets you start enumerating the collection before the whole directory listing has been read, so EnumerateFiles could be more efficient if you have many files in the directory.
You could use a regex to make sure the file name ends in a dot followed by three digits:
.*\\.[0-9]{3}$
or something like that
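A minimal sketch of how such a pattern could be applied to a list of file names is shown below. ExtensionFilter and KeepNumericExtensions are hypothetical names, and the pattern is written as a verbatim string, so the dot needs only a single backslash.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class ExtensionFilter
{
    // A literal dot followed by exactly three digits at the end of the name.
    private static readonly Regex ThreeDigitExtension = new Regex(@"\.[0-9]{3}$");

    public static List<string> KeepNumericExtensions(IEnumerable<string> fileNames)
    {
        return fileNames.Where(name => ThreeDigitExtension.IsMatch(name)).ToList();
    }
}
```

Applied to the four names from the question, this keeps WEBTRN25.000 and WEBTRN25.001 and drops WEBTRN25.000_copy and WEBTRN34.ABC.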
Try using LINQ:
var list = Directory.GetFiles("<DIRECTORY>").Where(a => Regex.IsMatch(a, @"\.\d\d\d$")).Select(b => Path.GetFileName(b)).ToList();
The list then already holds the file names, so you wouldn't need to call Path.GetFileName in the foreach loop.

GetCookie extract information to a String

I'm trying to extract a number from a cookie I receive via Set-Cookie. The cookie contains &om=-&lv=1341532178340&xrs= and I need the digits between &lv= and &xrs=.
This is what I came up with:
string key = "";
ArrayList list = new ArrayList();
foreach (Cookie cookieValue in agent.LastResponse.Cookies)
{
list.Add(cookieValue);
}
String[] myArr = (String[])list.ToArray(typeof(string));
foreach (string i in myArr)
{
// Here we call Regex.Match.
Match match = Regex.Match(i, #"&lv=(.*)&xrs=",
RegexOptions.IgnoreCase);
// Here we check the Match instance.
if (match.Success)
{
// Finally, we get the Group value and display it.
key = match.Groups[1].Value;
}
}
agent.GetURL("http://site.com/" + key + ".php");
The issue I'm having is that I cannot convert the ArrayList to a String array (the error is: "At least one element in the source array could not be cast down to the destination array type."). Can you suggest a way to fix it, or better code to do this?
Thanks a lot!
With the first loop, you are building an ArrayList that contains Cookie instances. It's not possible to simply convert from Cookie to string as you are attempting to do just before the second loop.
A simple way to get values of all cookies is to use LINQ:
IEnumerable<string> cookieValues = agent.LastResponse.Cookies.Cast<Cookie>().Select(x => x.Value);
If you are still using .NET Framework 2.0, you will need to use a loop:
List<string> cookieValues = new List<string>();
foreach (Cookie cookie in agent.LastResponse.Cookies)
{
cookieValues.Add(cookie.Value);
}
Then, you can iterate over this collection just like you previously were. However, are you aware that if multiple cookies match your regex, the last one that matches will be stored to the key? Don't know how exactly you want this to work when there are multiple cookies that match, but if you simply want the first one, you can again employ LINQ to make your code simpler and do almost everything you need in a single query:
var cookies = agent.LastResponse.Cookies;
string key = cookies.Cast<Cookie>()
.Select(x => Regex.Match(x.Value, #"&lv=(.*)&xrs=", RegexOptions.IgnoreCase))
.Where(x => x.Success)
.Select(x => x.Groups[1].Value)
.FirstOrDefault();
If there was no match, the key will be null, otherwise, it will contain the first match.
The Cast<Cookie>() bit is necessary for type inference to kick in - I believe that agent.LastResponse.Cookies returns an instance of CookieCollection which does not implement IEnumerable<Cookie>.
