Dictionary look up where we want the keys contained in a string - c#

I have a dictionary containing keys, e.g.
"Car"
"Card Payment"
I have a string description, e.g. "Card payment to tesco" and I want to find the item in the dictionary that corresponds to the string.
I have tried this:
var category = dictionary.SingleOrDefault(p => description.ToLowerInvariant().Contains(p.Key)).Value;
This currently results in both "Car" and "Card Payment" being returned from the dictionary and my code blows up as I have SingleOrDefault.
How can I achieve what I want? I thought about prefixing and suffixing the keys with spaces, but I'd have to do the same to the descriptions. I think this would work, but it is a bit dirty. Are there any better ways? I have no objection to changing the Dictionary to some other type as long as performance is not impacted too much.
Required Result for above example: only get "Card Payment"

You can use the LINQ methods OrderByDescending and Take after your Where condition to pick the longest, and therefore most specific, matching key.
var category = dictionary
.Where(p => description.ToLowerInvariant().Contains(p.Key.ToLowerInvariant()))
.OrderByDescending(x => x.Key.Length)
.Take(1);
If you only need the matching key and not an associated value, I would use a List<string> to hold your keys, because there is no reason to use a key/value collection.
List<string> keys = new List<string>();
keys.Add("Car");
keys.Add("Card Payment");
string description = "Card payment to tesco";
var category = keys
.Where(p => description.ToLowerInvariant().Contains(p.ToLowerInvariant()))
.OrderByDescending(x => x.Length)
.Take(1)
.FirstOrDefault();
NOTE
Ordering by key length in descending order ensures that the longest matching key, i.e. the most specific match, comes first.
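If you do need the dictionary value rather than just the key, the same longest-match idea can be sketched as below. This is only an illustration: the category values and the FindCategory helper are made-up names, and the case-insensitive Contains overload assumes .NET Core 2.1 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: keys are phrases, values are category names.
var dictionary = new Dictionary<string, string>
{
    ["Car"] = "Transport",
    ["Card Payment"] = "Card spending",
};

// Returns the value of the longest key contained in the description,
// or null when no key matches. The comparison ignores case.
string FindCategory(string description) =>
    dictionary
        .Where(p => description.Contains(p.Key, StringComparison.OrdinalIgnoreCase))
        .OrderByDescending(p => p.Key.Length)
        .Select(p => p.Value)
        .FirstOrDefault();

Console.WriteLine(FindCategory("Card payment to tesco")); // Card spending
```

Note that this still scans every key, and a short key such as "Car" would still match a description like "Carpet shop", so it does not check word boundaries.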

Here I'm using a List<string> of keys and System.Text.RegularExpressions to find the desired key. Try it.
string description = "Card payment to tesco";
List<string> keys = new List<string> {
{"Car" }, {"Card Payment" }
};
string desc = description.ToLowerInvariant( );
string pattern = @"([{0}]+) (\S+)";
var resp = keys.FirstOrDefault( a => {
var regx = new Regex( string.Format( pattern, a.ToLowerInvariant( ) ) );
return regx.Match( desc ).Success;
} );

You are abusing dictionaries. You will get no performance gain from dictionaries by scanning the keys. Even worse, a simple list would be faster in this case. Dictionaries approach a constant time access (O(1)) if you look up a value by the key.
if (dictionary.TryGetValue(key, out var value)) { ...
To be able to use this advantage you will need a more subtle approach. The main difficulty is that sometimes keys might consist of more than a single word. Therefore I would suggest a two level approach where at the first level you store single word keys and at the second level you store the composed keys and values.
Example: Key value pairs to be stored:
["car"]: categoryA
["card payment"]: categoryB
["payment"]: categoryC
We build a dictionary as
var dictionary = new Dictionary<string, List<KeyValuePair<string, TValue>>> {
    ["car"] = new List<KeyValuePair<string, TValue>> {
        new KeyValuePair<string, TValue>("car", categoryA)
    },
    ["card"] = new List<KeyValuePair<string, TValue>> {
        new KeyValuePair<string, TValue>("card payment", categoryB)
    },
    ["payment"] = new List<KeyValuePair<string, TValue>> {
        new KeyValuePair<string, TValue>("card payment", categoryB),
        new KeyValuePair<string, TValue>("payment", categoryC)
    }
};
Of course, in reality, we would do this using an algorithm. But the point here is to show the structure. As you can see, the third entry for the main key "payment" contains two entries: One for "card payment" and one for "payment".
The algorithm for adding values goes like this:
1. Split the key to be entered into single words.
2. For each word, create a dictionary entry using this word as the main key and store a key-value pair in a list as the dictionary value. The key of this pair is the original key, possibly consisting of several words.
As you can imagine, step 2 requires you to test whether an entry with the same main key is already there. If so, add the new entry to the existing list. Otherwise, create a new list with a single entry and insert it into the dictionary.
Retrieve an entry like this:
1. Split the description to be searched into single words.
2. For each word, retrieve the existing dictionary entries using a true and therefore fast dictionary lookup(!) into a List<List<KeyValuePair<string, TValue>>>.
3. Flatten this list of lists using SelectMany into a single List<KeyValuePair<string, TValue>>.
4. Sort the entries by key length in descending order and test whether the description contains the key. The first entry found is the result.
You can also combine steps 2 and 3 and directly add the list entries of the individual dictionary entries to a main list.
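The add and retrieve steps could be sketched roughly as follows. This is a minimal illustration, not a tuned implementation: the Add and Find helpers are hypothetical names, string stands in for TValue, words are split on spaces only, and the Split(char, StringSplitOptions) overload assumes .NET Core 2.0 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// First level: single word -> list of (full key, value) pairs containing that word.
var index = new Dictionary<string, List<KeyValuePair<string, string>>>();

// Registers the full key under each of its words.
void Add(string key, string value)
{
    key = key.ToLowerInvariant();
    foreach (var word in key.Split(' ', StringSplitOptions.RemoveEmptyEntries))
    {
        if (!index.TryGetValue(word, out var list))
        {
            list = new List<KeyValuePair<string, string>>();
            index[word] = list;
        }
        list.Add(new KeyValuePair<string, string>(key, value));
    }
}

// Looks up each word of the description, then picks the longest
// candidate key that the description actually contains.
string Find(string description)
{
    var text = description.ToLowerInvariant();
    return text.Split(' ', StringSplitOptions.RemoveEmptyEntries)
        .Distinct()
        .SelectMany(word => index.TryGetValue(word, out var list)
            ? list
            : Enumerable.Empty<KeyValuePair<string, string>>())
        .OrderByDescending(p => p.Key.Length)
        .Where(p => text.Contains(p.Key))
        .Select(p => p.Value)
        .FirstOrDefault();
}

Add("Car", "categoryA");
Add("Card Payment", "categoryB");
Add("Payment", "categoryC");

Console.WriteLine(Find("Card payment to tesco")); // categoryB
```

Because the first-level lookup is by whole word, "car" is never even considered for a description that only contains "card".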

Related

Searching dictionary for matching key from a textbox C#

I'm reading a csv file that contains abbreviations and the full version for example LOL,Laughing out Loud. I've created a dictionary where the abbreviation is the key and full version is the value.
private void btnFilter_Click(object sender, RoutedEventArgs e)
{
    var keys = new List<string>();
    var values = new List<string>();
    using (var rd = new StreamReader("textwords.csv"))
    {
        while (!rd.EndOfStream)
        {
            var splits = rd.ReadLine().Split(',');
            keys.Add(splits[0]);
            values.Add(splits[1]);
        }
    }
    var dictionary = keys.Zip(values, (k, v) => new { Key = k, Value = v }).ToDictionary(x => x.Key, x => x.Value);
    foreach (var k in dictionary)
    {
        //
    }
    aMessage.MessageContent = txtContent.Text;
}
I'm using a button that will check if the textbox txtContent contains any of the abbreviations in the text and change this to the full version. So if the textbox contained the following. "Going to be AFK" after the button click it would change this to "Going to be away from keyboard".
I want to write a foreach loop that would check for any abbreviations, elongate them and then save this to a string variable MessageContent.
Would anyone be able to show me the best way to go about this as I'm not sure how to do this with the input from a textbox?
Thanks
You can just use LINQ to read and create a dictionary object:
var dictionary = File.ReadAllLines(@"textwords.csv")
.Select(x => x.Split(",",StringSplitOptions.RemoveEmptyEntries))
.ToDictionary(key => key.FirstOrDefault().Trim(),
value => value.Skip(1).FirstOrDefault().Trim());
If the abbrevs are correct, i.e. you don't need fuzzy logic, you can use a variety of .NET objects and search the keys rather quickly.
if(dict.ContainsKey(myKey))
{
}
I wrote that freehand, so it might be dict.Keys.Contains() or similar. The point is that you can search the dictionary directly rather than loop over it.
If you need a fuzzier search, you can use LINQ to write concise queries over collections of objects.
As for your code, you would iterate over dictionary.Keys to search the keys, but I still don't see why you need the extra code.
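One way to apply the direct lookup to the text box content is to split the text into words and look each word up in the dictionary. The sketch below uses two hard-coded entries in place of the CSV file and a hypothetical Expand helper; it assumes abbreviations are separated by single spaces and have no punctuation attached.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entries standing in for the contents of textwords.csv.
var abbreviations = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
    ["AFK"] = "away from keyboard",
    ["LOL"] = "Laughing out Loud",
};

// Replaces every space-separated word that is a known abbreviation.
string Expand(string text) =>
    string.Join(" ", text.Split(' ').Select(word =>
        abbreviations.TryGetValue(word, out var full) ? full : word));

Console.WriteLine(Expand("Going to be AFK")); // Going to be away from keyboard
```

In the click handler, the result of Expand(txtContent.Text) could then be assigned to aMessage.MessageContent.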

Is there a way in LINQ wherein I can insert a row (from a dictionary) into a DataTable using the list of column names in C#?

I have a List<Dictionary<string,string>> something like this:
[0] key1 val,key2 val,key3 val
[1] key1 val,key2 val,key3 val
[2] key1 val,key2 val,key3 val
I also have a list of column names in the same order as the columns in the DataTable.
I want to pick from each dictionary only those keys that are in the list, and insert the values in the proper order.
I'm able to filter the required keys, but how do I insert them in the proper order with LINQ?
var colList = new List<string>() { "key3", "key1"};
dict.ForEach(p => jsonDataTable.Rows.Add(p.Where(q => colList.Contains(q.Key)).Select(r => r.Value).ToArray()));
I cannot do like this because number of columns will vary and also the method must work when we pass any list of column names:
foreach(var item in dict)
jsonDatatable.Rows.Add(item[colList[0]], item[colList[1]]);
Please suggest some ways.
LINQ never changes the input sources. You can only extract data from them.
Divide problems into subproblems
The only way to change the input sources is to use the extracted data to update them. Make sure that you have materialized your query (ToList() etc.) before you update the source.
You can divide your problem into subproblems:
1. Convert the table into a sequence of columns in the correct order.
2. Convert the sequence of columns into a sequence of column names (still in the correct order).
3. Use the column names and the dictionary to fetch the requested data.
By separating your problem into these steps, you prepare your solution for reusability. If in future you change your table to a DataGridView, or a table in an entity framework database, or a CSV file, or maybe even JSON, you can reuse the latter steps. If in future you need to use the column names for something else, you can still use the earlier steps.
To be able to use the code in a LINQ-like way, my advice would be to create extension methods. If you are unfamiliar with extension methods, read Extension Methods Demystified.
You will be more familiar with the layout of your table (System.Data.DataTable? Windows.Forms.DataGridView? DataGrid in Windows.Controls?) and your columns, so you'll have to create the first ones yourself. In the example I use MyTable and MyColumn; replace them with your own Table and Column classes.
public static IEnumerable<MyColumn> ToColumns(this MyTable table)
{
// TODO: return the columns of the table
}
public static IEnumerable<string> ToColumnNames(this IEnumerable<MyColumn> columns)
{
return columns.Select(column => ...);
}
If the column name is just a property of the column, I wouldn't bother creating the second procedure. However, the nice thing is that it hides where you get the name from. So to be future-changes-proof, maybe create the method anyway.
You said these columns were sorted. If you want to be able to use ThenBy(...) consider returning an IOrderedEnumerable<MyColumn>. If you won't sort the sorted result, I wouldn't bother.
Usage:
MyTable table = ...
IEnumerable<string> columnNames = table.ToColumns().ToColumnNames();
or:
IEnumerable<string> columnNames = table.ToColumns()
.Select(column => column.Name);
The third subproblem is the interesting one.
Join and GroupJoin
In LINQ, whenever you have two tables and you want to use a property of the elements in one table to match them with the properties of another table, consider using (Group-)Join.
If you only want items of the first table that match exactly one item of the other table, use Join: "Get Customer with his Address", "Get Product with its Supplier", "Book with its Author".
On the other hand, if you expect that one item of the first table matches zero or more items from the other table, use GroupJoin: "Schools, each with their Students", "Customers, each with their Orders", "Authors, each with their Books"
Some people still think in database terms. They tend to use some kind of Left Outer Join to fetch "Schools with their Students". The disadvantage of this is that if a School has 2000 Students, then the same data of the School is transferred 2000 times, once for every Student. GroupJoin will transfer the data of the School only once, and the data of every Student only once.
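To make the difference concrete, here is a small sketch with made-up in-memory data (tuples instead of real entity classes; the school and student names are purely illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data, using tuples instead of entity classes.
var schools = new[] { (Id: 1, Name: "North"), (Id: 2, Name: "South") };
var students = new[]
{
    (Name: "Ann", SchoolId: 1),
    (Name: "Bob", SchoolId: 1),
};

// Join: one result per matching (school, student) pair; "South" has no students and disappears.
var pairs = schools.Join(students,
    school => school.Id,
    student => student.SchoolId,
    (school, student) => $"{school.Name}: {student.Name}")
    .ToList();

// GroupJoin: one result per school, each with zero or more students.
var groups = schools.GroupJoin(students,
    school => school.Id,
    student => student.SchoolId,
    (school, matched) => new { school.Name, Students = matched.Select(s => s.Name).ToList() })
    .ToList();

foreach (var line in pairs) Console.WriteLine(line);
foreach (var g in groups) Console.WriteLine($"{g.Name}: {g.Students.Count} student(s)");
```

With this data the Join yields two pairs for "North" only, while the GroupJoin yields both schools, "South" with an empty student list.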
Back to your question
In your problem: every column name is the key of exactly one item in the Dictionary.
What do you want to do with column names without keys? If you want to discard them, use Join. If you still want to use the column names that have nothing in the Dictionary, use GroupJoin.
IEnumerable<string> columnNames = ...
var result = columnNames.Join(myDictionary,
columnName => columnName, // from every columnName take the columnName,
dictionaryItem => dictionaryItem.Key, // from every dictionary keyValuePair take the key
// parameter resultSelector: from every columnName and its matching dictionary keyValuePair
// make one new object:
(columnName, keyValuePair) => new
{
// Select the properties that you want:
Name = columnName,
// take the whole dictionary value:
Value = keyValuePair.Value,
// or select only the properties that you plan to use:
Address = new
{
Street = keyValuePair.Value.Street,
City = keyValuePair.Value.City,
PostCode = keyValuePair.Value.PostCode
...
},
});
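If you use this more often, consider creating an extension method for it.
Note: Enumerable.Join (LINQ to Objects) preserves the order of the outer sequence, but other LINQ providers may not guarantee any order. If you need a specific order there, order the result after the Join.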
Usage:
Table myTable = ...
var result = myTable.ToColumns()
.Select(column => column.Name)
.Join(...)
.OrderBy(joinResult => joinResult.Name)
.ToList();
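Applied to the question's data, a Join-based version might look like the sketch below. It assumes a System.Data.DataTable whose columns are created from colList, uses made-up sample values, and relies on Enumerable.Join keeping the order of its outer sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

var colList = new List<string> { "key3", "key1" };
// Hypothetical sample rows.
var dict = new List<Dictionary<string, string>>
{
    new Dictionary<string, string> { ["key1"] = "a1", ["key2"] = "a2", ["key3"] = "a3" },
    new Dictionary<string, string> { ["key1"] = "b1", ["key2"] = "b2", ["key3"] = "b3" },
};

var jsonDataTable = new DataTable();
foreach (var name in colList) jsonDataTable.Columns.Add(name, typeof(string));

foreach (var row in dict)
{
    // colList is the outer sequence, so the values come out in column order.
    object[] values = colList
        .Join(row, name => name, pair => pair.Key, (name, pair) => (object)pair.Value)
        .ToArray();
    jsonDataTable.Rows.Add(values);
}

foreach (DataRow r in jsonDataTable.Rows)
    Console.WriteLine(string.Join(",", r.ItemArray));
```

One caveat: Join drops column names that have no key in a given dictionary, which would shift the remaining values to the left. If that can happen, a GroupJoin or a TryGetValue per column name is the safer choice.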
Instead of filtering each Dictionary<string, string> in the list, filter on colList, so that you get the values in the same order as colList and only for column names that actually occur as keys in the List<Dictionary<string, string>>.
This is as per my understanding; please comment if you need the result in any other way.
var dictAllKeys = dict.SelectMany(x => x.Keys).Distinct().ToList();
// Now you can filter the colList using the above keys
var filteredList = colList.Where(x => dictAllKeys.Contains(x)).ToList();
// then add one row per dictionary, with the values in colList order
// (a key missing from a particular dictionary yields null)
foreach (var item in dict)
jsonDataTable.Rows.Add(filteredList.Select(c => item.TryGetValue(c, out var v) ? v : null).ToArray());

How to get a values list from a dictionary where the key matches in a list in c#

I have a dictionary and I want to retrieve a list of values from it based on a condition on the key, i.e. I want to retrieve only the values whose key also appears in a list.
Example: dictionary is as follows
IDictionary<string, string> maskingValues = new Dictionary<string, string>();
maskingValues.Add("cat", "Me#ena");
maskingValues.Add("dog", "N&avya");
maskingValues.Add("llama", "vivek!a");
maskingValues.Add("iguana", "sh^ams");
and I have list of strings as
List<string> keysString = new List<string>();
keysString.Add("cat");
keysString.Add("fox");
Now my requirement is to get the values list from the dictionary where the key matches from the keysString list.
The output should be
Me#ena
What I have done so far is:
var all_parameters = maskingValues.Keys.ToList();
var parameters_to_mask = all_parameters.Intersect(keysString);
var values_list = parameters_to_mask.Where(k => maskingValues.ContainsKey(k)).Select(k => maskingValues[k]);
So values_list will contain the output Me#ena. I retrieved all the keys from the dictionary, intersected them with the keysString list, and then retrieved the values for the keys in that intersection. Can I do this in a more optimized way, so that both the code and the performance are good?
Thanks in advance. Please help me out.
This should work:
var q = maskingValues.Where(x => keysString.Contains(x.Key)).Select(x => x.Value);
foreach (var item in q)
{
Console.WriteLine(item);
}
There are a lot of solutions. This one for example:
var newlist = maskingValues.Where(x => keysString.Contains(x.Key)).Select(y=>y.Value).ToList();
I came up with a quick bit of code to do this fully using linq.
keysString.Where(x => maskingValues.Keys.Contains(x)).ToList().ForEach(x => Console.WriteLine(maskingValues[x]));
Not sure I got the spec right but this is faster than linq:
var matches = new List<string>();
foreach (var key in keysString)
{
if (maskingValues.TryGetValue(key, out string value))
{
matches.Add(value);
}
}
If your dictionary is large, you can improve performance by taking advantage of the fact that accessing an element by key in a dictionary is O(1):
var result = keysString
.Select(k =>
{ string value;
maskingValues.TryGetValue(k, out value);
return value;
})
.Where(v => v != null);
... etc ...
Note that using TryGetValue is more efficient than calling Contains then the dictionary indexer.
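For reference, a self-contained version of the TryGetValue approach with the question's sample data might look like this. ValuesFor is a hypothetical helper name. Unlike the Select/Where variant, it keeps entries whose stored value happens to be null, because it tests the return value of TryGetValue rather than the value itself.

```csharp
using System;
using System.Collections.Generic;

var maskingValues = new Dictionary<string, string>
{
    ["cat"] = "Me#ena",
    ["dog"] = "N&avya",
    ["llama"] = "vivek!a",
    ["iguana"] = "sh^ams",
};
var keysString = new List<string> { "cat", "fox" };

// One hash lookup per requested key; keys that are not in the dictionary are skipped.
List<string> ValuesFor(IReadOnlyDictionary<string, string> source, IEnumerable<string> keys)
{
    var found = new List<string>();
    foreach (var key in keys)
    {
        if (source.TryGetValue(key, out var value))
            found.Add(value);
    }
    return found;
}

var matches = ValuesFor(maskingValues, keysString);
Console.WriteLine(string.Join(", ", matches)); // Me#ena
```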
This should work:
The solution below is for when you know the key name and want to retrieve the value for that key.
string objectValue;
maskingValues.TryGetValue("cat", out objectValue);
If you want to retrieve all values from the dictionary, a single line of code is enough: maskingValues.Values

string grouping algorithm c#

I need something like a grouping algorithm for strings in C#.
I've tried for days, and before I go mad, I should maybe ask someone :)
(no adjacency matrix ^^)
What I have is data in a Dictionary,
something like this:
key|value
"bla","AAA;BBB;CCC" // ';' is split sign
"whatever","BBB;DDD;EEE;FF"
"hmm","ZZZ;YYY;XXX"
"foo","CCC;JJJ;VVV"
....
value1 and value2 both contain "BBB", so group them into a new string (in a new dictionary; the key can be whatever... a counter?):
"AAA;BBB;CCC;DDD;EEE;FF" (or without distinct: "AAA;BBB;CCC;BBB;DDD;EEE;FF")
value3 is its own group
value4 contains "CCC", so group it with the others:
"AAA;BBB;CCC;DDD;EEE;FF;JJJ;VVV" (or without distinct: "AAA;BBB;CCC;BBB;DDD;EEE;FF;CCC;JJJ;VVV")
I need that string for SQL update
update item set group = bar
where group in ('','',... )
I do it with split and join, this part works :-P
thanks
So first organize the data. Have a map keys["bla"] = some_set("AAA", "BBB", "CCC"); and so on. Then build a reverse map that looks like reverse["BBB"] = ["bla", "whatever"]. Both maps should be about the same size as the original data.
Next you can do a DFS over the implicit graph (pseudocode):
merge = some_set()
DFS(string key) {
if (key in merge) return; // Been here already.
merge.insert(key);
for (string edge : keys[key])
for (string other_key : reverse[edge])
DFS(other_key)
}
So you can now call DFS("bla"). When it returns, merge should contain "bla", "whatever", and any other keys that belong to the group, and you can concatenate their strings from keys to get the result you wanted.
You can call DFS for every key to get the group each key belongs to (complexity O(N^2*set_op)). Or, better, keep track of which keys you have already processed to avoid working on them again (complexity O(N*set_op)).
If you use hash based sets/maps your set_op is O(average string length). If you use tree based structures then set_op is O(logN). This shouldn't matter unless you have very long strings or lots of keys.
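In C#, the two maps and the traversal could be sketched as follows. This is one possible rendering of the pseudocode, not the only one: it uses an explicit stack instead of recursion, assumes ';' as the only separator, and returns the distinct values of each group in sorted order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var data = new Dictionary<string, string>
{
    ["bla"] = "AAA;BBB;CCC",
    ["whatever"] = "BBB;DDD;EEE;FF",
    ["hmm"] = "ZZZ;YYY;XXX",
    ["foo"] = "CCC;JJJ;VVV",
};

// keys[key] = set of tokens; reverse[token] = keys that contain the token.
var keys = data.ToDictionary(p => p.Key, p => new HashSet<string>(p.Value.Split(';')));
var reverse = new Dictionary<string, List<string>>();
foreach (var (key, tokens) in keys)
    foreach (var token in tokens)
    {
        if (!reverse.TryGetValue(token, out var owners))
            reverse[token] = owners = new List<string>();
        owners.Add(key);
    }

var visited = new HashSet<string>();   // keys already assigned to a group
var groups = new List<string>();
foreach (var start in keys.Keys)
{
    if (visited.Contains(start)) continue;
    var merged = new SortedSet<string>(StringComparer.Ordinal);
    var stack = new Stack<string>();
    stack.Push(start);
    while (stack.Count > 0)
    {
        var key = stack.Pop();
        if (!visited.Add(key)) continue; // been here already
        foreach (var token in keys[key])
        {
            merged.Add(token);
            foreach (var other in reverse[token]) stack.Push(other);
        }
    }
    groups.Add(string.Join(";", merged));
}

foreach (var g in groups) Console.WriteLine(g);
```

Because visited is shared across all start keys, every key is expanded only once, which corresponds to the cheaper of the two variants described above.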

is there a way to get the Order number of a Dictionary Item?

Dictionary<string,string> items = new Dictionary<string,string>();
IEnumerable<string> textvalues = from c in......//using linQ to query
string s = textvalues[items["book"]];
In this case the textvalues array accepts an integer index and returns the string value. How can I get the item number? Say "items" has 5 item names and "book" is at the first position; then I must get 0. So textvalues[items["book"]] would be translated as textvalues[item[0]].
OK, I was trying to use OpenXML 2.0 to read Excel. The point is that there is no way I can specify a field name and get the value.
So I was trying to iterate over the first row of a worksheet and add the values to a dictionary, Dictionary fieldItems, so that when I say fieldItems["Status"] it retrieves the cell value based on the column number, looked up by the column header name. Here's the code for it.
Dictionary<string, int> headers = new Dictionary<string, int>();
IEnumerable<string> ColumnHeaders = from cell in (from row in worksheet.Descendants<Row>()
where row.RowIndex == 1
select row).First().Descendants<Cell>()
where cell.CellValue != null
select
(cell.DataType != null
&& cell.DataType.HasValue
&& cell.DataType == CellValues.SharedString
? sharedString.ChildElements[
int.Parse(cell.CellValue.InnerText)].InnerText
: cell.CellValue.InnerText);
int i=0;
Parallel.ForEach(ColumnHeaders, x => { headers.Add(x,i++); });
order.Number = textValues[headers["Number"]];
(Darn, I didn't misread it after all, and there's no edit history in the first five minutes.)
Your question is somewhat confusingly presented... but it seems like you're basically trying to find the "position" of an item within a dictionary. There's no such concept in Dictionary<TKey, TValue>. You should regard it as a set of mappings from keys to values.
Obviously when you iterate over the entries in that set of mappings they will come out in some order - but there's no guarantee that it will bear any relation to the order in which the entries were added. If you use SortedDictionary<,> or SortedList<,> (both of which are really still dictionaries), you can get the entries in sorted key order... but it's not clear whether that would be good enough for you.
It's also not clear what you're really trying to achieve - you call textvalues an array in the text, but declare it as an IEnumerable<string> - and it looks like you're then trying to use an indexer with a string parameter...
EDIT: Okay, now the question has been edited, you've got a Dictionary<string, int> rather than a Dictionary<string, string>... so the whole thing makes more sense. Now it's easy:
order.Number = textValues.ElementAt(headers["Number"]);
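As a side note, the header-to-position dictionary can be built sequentially with the index overload of Select, as in the sketch below (the header and cell values are made up). Dictionary<TKey, TValue> is not safe for concurrent writes and Parallel.ForEach does not process items in order, so a sequential build is the safer way to get stable column numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical header row and data row standing in for the worksheet cells.
IEnumerable<string> columnHeaders = new[] { "Number", "Status", "Date" };
IEnumerable<string> textValues = new[] { "42", "Open", "2011-01-01" };

// Select's index overload pairs each header with its position, in order.
Dictionary<string, int> headers = columnHeaders
    .Select((name, index) => new { name, index })
    .ToDictionary(x => x.name, x => x.index);

Console.WriteLine(textValues.ElementAt(headers["Status"])); // Open
```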
