Check if dictionary contains substring? - c#

Say I have two dictionaries with the following key value pairs:
1, "Hello"
2, "Example"
And another dictionary as follows:
1, "HelloWorld"
2, "Example2"
I want to find out if these dictionaries contain the substring "hello" within them. dictionary.ContainsValue("Hello") will work for the first example but not the second one. How can I check for existence of a substring in all values in a dictionary?

Just use Any to to check for the first Value that contains "Hello"
dictionary.Any(kvp=>kvp.Value.Contains("Hello"))

Dictionary doesn't allow to search for substrings. To find it, you need to enumerate all values and check each for substring, as suggested by juharr. However, this method is highly inefficient. Use it only if you don't care about search performance at all.
If you need good performance, use suffix array algorithm. https://en.wikipedia.org/wiki/Suffix_array

dictionary.Values.Any(v => v.Contains("Hello"));
Dictionary isn't an IEnumerable itself so it won't have LINQ extensions apply to it.

Related

Splitting large string in c# and adding values in it to a List

I have a string as shown below
string names = "<?startname; Max?><?startname; Alex?><?startname; Rudy?>";
is there any way I can split this string and add Max , Alex and Rudy into a separate list ?
Sure, split on two strings (all that consistently comes before, and all that consistently comes after) and specify that you want Split to remove the empties:
var r = names.Split(new[]{ "<?startname; ", "?>" }, StringSplitOptions.RemoveEmptyEntries);
If you take out the RemoveEmptyEntries it will give you a more clear idea of how the splitting is working, but in essence without it you'd get your names interspersed with array entries that are empty strings because split found a delimiter (the <?...) immediately following another (the ?>) with an empty string between the delimiters
You can read the volumes of info about this form of split here - that's a direct link to netcore3.1, you can change your version in the table of contents - this variant of Split has been available since framework2.0
You did also say "add to a separate list" - didn't see any code for that so I guess you will either be happy to proceed with r here being "a separate list" (an array actually, but probably adequately equivalent and easy to convert with LINQ's ToList() if not) or if you have another list of names (that really is a List<string>) then you can thatList.AddRange(r) it
Another Idea is to use Regex
The following regex should work :
(?<=; )(.*?)(?=\s*\?>)

c# Remove elements from List containing string [duplicate]

What would be the fastest way to check if a string contains any matches in a string array in C#? I can do it using a loop, but I think that would be too slow.
Using LINQ:
return array.Any(s => s.Equals(myString))
Granted, you might want to take culture and case into account, but that's the general idea.
Also, if equality is not what you meant by "matches", you can always you the function you need to use for "match".
I really couldn't tell you if this is absolutely the fastest way, but one of the ways I have commonly done this is:
This will check if the string contains any of the strings from the array:
string[] myStrings = { "a", "b", "c" };
string checkThis = "abc";
if (myStrings.Any(checkThis.Contains))
{
MessageBox.Show("checkThis contains a string from string array myStrings.");
}
To check if the string contains all the strings (elements) of the array, simply change myStrings.Any in the if statement to myStrings.All.
I don't know what kind of application this is, but I often need to use:
if (myStrings.Any(checkThis.ToLowerInvariant().Contains))
So if you are checking to see user input, it won't matter, whether the user enters the string in CAPITAL letters, this could easily be reversed using ToLowerInvariant().
Hope this helped!
That works fine for me:
string[] characters = new string[] { ".", ",", "'" };
bool contains = characters.Any(c => word.Contains(c));
You could combine the strings with regex or statements, and then "do it in one pass," but technically the regex would still performing a loop internally. Ultimately, looping is necessary.
If the "array" will never change (or change only infrequently), and you'll have many input strings that you're testing against it, then you could build a HashSet<string> from the array. HashSet<T>.Contains is an O(1) operation, as opposed to a loop which is O(N).
But it would take some (small) amount of time to build the HashSet. If the array will change frequently, then a loop is the only realistic way to do it.

Search list of objects

I get a list of security groups a user belongs to as array of objects. I use DirectoryEntry to get active directory properties and one of the properties is "memberOf" (de.properties["memberOf"].value). The return values is an "array of objects". Each element of this array of objects look something like:
"CN=SITE_MAINTENANCE,OU=CMS,OU=SD,OU=ESM,OU=Engineering Systems,DC=usa,DC=abc,DC=domain,DC=com"
I can loop through the elements, cast each element as "string" and search this way. I just thought there might be an easier way that does not require looping.
I need to be able to find the one(s) with OU=CMS in it.
Thanks.
Loop through the array and then use indexOf or Regexp search for the string "OU=CMS". If it exists in the string, then you've "found the one(s) with OU=CMS in it."
You can do anything like throwing the items into a new list or whatever you want.
list.Where(a=>a.ToString().Contains("OU=CMS")).ToList();
You can use like follows
string listString="CN=SITE_MAINTENANCE,OU=CMS,OU=SD,OU=ESM,"+
"OU=Engineering Systems,DC=usa,DC=abc,DC=domain,DC=com"
Using linq:
listString.Split(',').Contains("OU=CMS")
W/o linq:
Array.IndexOf(listString.Split(','), "OU=CMS") >= 0
you can search required value by foreach loop

Default String Sort Order

Is the default sort order an implementation detail? or how is it that the Default Comparer is selected?
It reminds me of the advice. "Don't store HashCodes in the database"
Is the following Code guaranteed to sort the string in the same order?
string[] randomStrings = { "Hello", "There", "World", "The", "Secrete", "To", "Life", };
randomStrings.ToList().Sort();
Strings are always sorted in alphabetical order.
The default (string.CompareTo()) uses the Unicode comparison rules of the current culture:
public int CompareTo(String strB) {
if (strB==null) {
return 1;
}
return CultureInfo.CurrentCulture.CompareInfo.Compare(this, strB, 0);
}
This overload of List<T>.Sort uses the default comparer for strings, which is implemented like this:
This method performs a word (case-sensitive and culture-sensitive)
comparison using the current culture. For more information about word,
string, and ordinal sorts, see System.Globalization.CompareOptions.
If you do not specify a comparer then sort will use the default comparer which sorts alphabetically. So to answer your question yes that code will always return the strings in the same order.
There is an overload to the sort method that allows you to specify your own comparer if you wish to sort the data in a different order.
I wanted to post a reply related to cultural things while you sort your strings but Jon already added it:-(. Yes, I think you may take into acc the issue too because it is selected by default the alphabetical order, the random strings if existed as foreign besides English (i.e Spanish) will be placed after English after all, they appear in the same first letters though. That means you need a globalization namespace to deal with it.
By the way, Timothy, it is secret not secrete :D
Your code creates a new list copying from the array, sorts that list, and then discards it. It does not change the array at all.
try:
Array.Sort(randomStrings);

Easiest way of working with a comma separated list

I'm about to build a solution to where I receive a comma separated list every night. It's a list with around 14000 rows, and I need to go through the list and select some of the values in the list.
The document I receive is built up with around 50 semicolon separated values for every "case". How the document is structured:
"";"2010-10-17";"";"";"";Period-Last24h";"Problem is that the customer cant find....";
and so on, with 43 more semicolon statements. And every "case" ends with the value "Total 515";
What I need to do is go through all these "cases" and withdraw some of the values in the "cases". The "cases" is always built up in the same order and I know that it's always the 3, 15 and 45'th semicolon value that I need to withdraw.
How can I do this in the easiest way?
I think you should decompose this problem into smaller problems. Here are the steps I'd take:
Each semi-colon separated record represents a single object. C# is an object-oriented language. Stop thinking in terms of .csv records and start thinking in terms of objects. Break up the input into semi-colon delimited records.
Given a single comma-separated record, the values represent the properties of your object. Give them meaningful names.
Parse a comma-separated record into an object. When you're done, you'll have a collection of objects that you can deal with.
Use C#'s collections and LINQ to filter your list based on those cases that you need to withdraw. When you're done, you'll have a collection of objects with the desired cases removed.
Don't worry about the "easiest" way. You need one way that works. Whatever you do, get something working and worry about optimizing it to make it easiest, fastest, smallest, etc. later on.
Assuming the "rows" are lines and that you read line by line, your main tool should be string.Split:
foreach (string line in ... )
{
string [] parts = line.split (';');
string part3 = parts[2];
string part15 = parts[14];
// etc
}
Note that this is a simple approach that will fail if the content of any column can contain ';'
You could use String.Split twice.
The first time using "Total 515"; as the split string using this overload. This will give you an array of cases.
The second time using ";" as the split character using this overload on each of the cases. This will give you a data array for each case. As the data is consistent you can extract the 3rd, 15th and 45th elements of this array.
I'd search for an existing csv library. The escaping rules are probably not that easily mapped to regex.
If writing a library myself I'd first parse each line into a list/an array of strings. And then in a second step(probably outside of the csv library itself) convert the stringlist to a strongly typed object.
A simple but slow approach would be reading single characters from the input (StringReader class, for example). Write a ReadItem method that reads a quote, continues to read until the next quote, and then looks for the next character. If it is a newline of semicolon, one item has been read. If it is another quote, add a single quote to the item being read. Otherwise, throw an exception. Then use this method to split up the input data into a series of items, each line stored e.g. in a string[number of items in a row], lines stored in a List<>. Then you can use this class to read the CSV data inside another class that decodes the data read into objects that you can get your data out of.

Categories