Dictionary change of order - C#

I have a dictionary.
Dictionary<int, string> inboxMessages = new Dictionary<int, string>();
This dictionary contains messages with their own unique ID (the newer the message, the higher the ID). I put the messages in a picker (Xamarin) but it shows the oldest messages first. How can I change this?
The Picker:
inboxPicker = new Picker
{
WidthRequest = 320,
};
foreach (string inboxMessage in inboxMessages.Values)
{
inboxPicker.Items.Add(inboxMessage);
}
How I get my messages:
private async Task getMessages()
{
await Task.Run(async () => {
MailModel[] mails = await api.GetMails(App.userInfo.user_id);
foreach (MailModel mail in mails)
{
inboxMessages.Add(mail.message_id,mail.sender_user_id +" "+ mail.subject +" "+ mail.time_send);
}
});
}

The Values property of a dictionary is not ordered. Quote from the documentation:
The order of the values in the Dictionary<TKey, TValue>.ValueCollection is unspecified [...]
If you want to retrieve the values in some specific order, you need to sort it yourself. For example:
var sorted = inboxMessages.OrderByDescending(kv => kv.Key).Select(kv => kv.Value);
foreach (string inboxMessage in sorted)
{
inboxPicker.Items.Add(inboxMessage);
}
This retrieves the KeyValuePairs from the dictionary, sorts them descending on their int key and then returns an enumeration of the values.
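If the messages are keyed by an ever-increasing ID anyway, another option (a sketch, not specific to Xamarin; the sample messages are made up) is to store them in a SortedDictionary, which enumerates its entries in ascending key order, and walk it in reverse:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A SortedDictionary enumerates entries in ascending key order,
// so Reverse() yields the newest (highest ID) message first.
var inboxMessages = new SortedDictionary<int, string>
{
    [1] = "oldest message",
    [2] = "newer message",
    [3] = "newest message"
};
foreach (var message in inboxMessages.Reverse())
{
    Console.WriteLine(message.Value); // newest first
}
```

Note that Reverse() here is the LINQ operator, so it buffers the sequence; for an inbox-sized collection that cost is negligible.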

You should sort the dictionary entries while you still have access to their keys:
foreach (string inboxMessage in inboxMessages
.OrderByDescending(m => m.Key)
.Select(m => m.Value))
{
inboxPicker.Items.Add(inboxMessage);
}

Related

Efficient way to create a new list based of the differences in values in 2 dictionaries?

I currently have two strings formatted as XML that are later converted into dictionaries for comparison.
So, I have two Dictionary<string, object> instances, dict1 and dict2, that I need to compare. I need to:
Add the key to a list of strings if the values of these two dictionaries do not match
Add the key of dict2 to the list if dict1 does not contain this key
Currently, I have a simple foreach loop
foreach (string propName in dict2.Keys)
{
if (dict1.ContainsKey(propName))
{
string newDicValue = dict1[propName].ToString();
string oldDictValue = dict2[propName].ToString();
if (oldDictValue != newDicValue)
list.Add(propName);
}
else
{
list.Add(propName);
}
}
I would like a faster solution to this problem, if possible.
I don't claim that this is any faster, but it should be on par and it's less code:
List<string> list =
dict2
.Keys
.Where(k => !(dict1.ContainsKey(k) && dict1[k].Equals(dict2[k])))
.ToList();
I did do some testing with this:
List<string> list =
dict2
.Keys
.AsParallel()
.Where(k => !(dict1.ContainsKey(k) && dict1[k].Equals(dict2[k])))
.ToList();
That produced a significantly faster run.
Here's how I produced my test data:
var dict1 = Enumerable.Range(0, 10000000).Select(x => Random.Shared.Next(2000000)).Distinct().ToDictionary(x => x.ToString(), x => (object)Random.Shared.Next(20));
var dict2 = Enumerable.Range(0, 10000000).Select(x => Random.Shared.Next(2000000)).Distinct().ToDictionary(x => x.ToString(), x => (object)Random.Shared.Next(20));
You could make it faster by avoiding separate lookups of dict1[propName] and dict2[propName]. You can get the value along with the key, either by enumerating the KeyValuePairs stored in the dictionary directly, or by calling the TryGetValue method:
foreach (var (key, value2) in dict2)
{
if (!dict1.TryGetValue(key, out var value1)
|| value1.ToString() != value2.ToString())
{
list.Add(key);
}
}
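As a self-contained sketch of that loop with toy data (the dictionary contents here are made up for illustration):

```csharp
using System;
using System.Collections.Generic;

var dict1 = new Dictionary<string, object> { ["a"] = 1, ["b"] = 2 };
var dict2 = new Dictionary<string, object> { ["a"] = 1, ["b"] = 3, ["c"] = 4 };

var list = new List<string>();
foreach (var (key, value2) in dict2)
{
    // Record the key if it is missing from dict1 or the values differ;
    // TryGetValue does the existence check and the lookup in one pass.
    if (!dict1.TryGetValue(key, out var value1)
        || value1.ToString() != value2.ToString())
    {
        list.Add(key);
    }
}
// list now holds "b" (values differ) and "c" (missing from dict1)
```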

Build Dictionary with LINQ

Let's say we have a variable 'data', which is a list of Ids and ChildIds:
var data = new List<Data>
{
new()
{
Id = 1,
ChildIds = new List<int> {123, 234, 345}
},
new()
{
Id = 1,
ChildIds = new List<int> {123, 234, 345}
},
new()
{
Id = 2,
ChildIds = new List<int> {678, 789}
},
};
I would like to have a dictionary of ChildIds and the related Ids. If the ChildId is already in the dictionary, it should be overwritten with the new Id.
Currently I have this code:
var dict = new Dictionary<int, int>();
foreach (var dataItem in data)
{
foreach (var child in dataItem.ChildIds)
{
dict[child] = dataItem.Id;
}
}
This works fine, but I don't like the fact that I am using two loops. I would prefer to use LINQ's ToDictionary to build up the dictionary in a functional way.
What is the best way to build up the dictionary by using Linq?
Why? I prefer functional code over mutating state. Besides that, I was just curious how to build up the dictionary using LINQ ;-)
In this case your foreach approach is both readable and efficient, so even though I'm a fan of LINQ, I would use that. The loop has the bonus that you can debug it easily or add logging if necessary (for example, for invalid ids).
However, if you want to use LINQ, I would probably use SelectMany and ToLookup. The former is used to flatten child collections like ChildIds, and the latter creates a collection that is very similar to your dictionary. One difference is that it allows duplicate keys; you get multiple values in that case:
ILookup<int, int> idLookup = data
.SelectMany(d => d.ChildIds.Select(c => (Id:d.Id, ChildId:c)))
.ToLookup(x => x.ChildId, x => x.Id);
Now you already have everything you need, since it can be used like a dictionary with the same lookup performance. If you want to create the dictionary anyway, you can use:
Dictionary<int, int> dict = idLookup.ToDictionary(x => x.Key, x => x.First());
If you want to overwrite duplicates with the new Id, as mentioned, simply use Last() instead of First().
.NET-Fiddle: https://dotnetfiddle.net/mUBZPi
The SelectMany LINQ operator actually has a few lesser-known overloads. One of them takes a result selector, which is a perfect fit for your scenario.
The following example code snippet turns that into a dictionary. Note that I had to use Distinct, since you had two entries with Id 1 sharing duplicated child ids, which would pose problems for a dictionary.
void Main()
{
// Get the data
var list = GetData();
// Turn it into a dictionary
var dict = list
.SelectMany(d => d.ChildIds, (data, childId) => new {data.Id, childId})
.Distinct()
.ToDictionary(x => x.childId, x => x.Id);
// show the content of the dictionary
dict.Keys
.ToList()
.ForEach(k => Console.WriteLine($"{k} {dict[k]}"));
}
public List<Data> GetData()
{
return
new List<Data>
{
new Data
{
Id = 1,
ChildIds = new List<int> {123, 234, 345}
},
new Data
{
Id = 1,
ChildIds = new List<int> {123, 234, 345}
},
new Data
{
Id = 2,
ChildIds = new List<int> {678, 789}
},
};
}
public class Data
{
public int Id { get; set; }
public List<int> ChildIds { get; set; }
}
The approach is to create pairs of each combination of Id and ChildId, and build a dictionary of these:
var list = new List<(int Id, int[] ChildIds)>()
{
(1, new []{10, 11}),
(2, new []{11, 12})
};
var result = list
.SelectMany(pair => pair.ChildIds.Select(childId => (childId, pair.Id)))
.ToDictionary(p => p.childId, p => p.Id);
ToDictionary will throw if there are duplicate keys; to avoid this you can look at this answer and create your own ToDictionary:
public static Dictionary<K, V> ToDictionaryOverWriting<TSource, K, V>(
this IEnumerable<TSource> source,
Func<TSource, K> keySelector,
Func<TSource, V> valueSelector)
{
Dictionary<K, V> output = new Dictionary<K, V>();
foreach (TSource item in source)
{
output[keySelector(item)] = valueSelector(item);
}
return output;
}
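A quick usage sketch of that extension (the method body is repeated here so the snippet stands alone; the pair data is hypothetical): with a duplicate key, the later value wins.

```csharp
using System;
using System.Collections.Generic;

var pairs = new[]
{
    (ChildId: 123, Id: 1),
    (ChildId: 123, Id: 2), // duplicate key: this later Id wins
    (ChildId: 678, Id: 2)
};
var dict = pairs.ToDictionaryOverWriting(p => p.ChildId, p => p.Id);
// dict[123] == 2, dict[678] == 2

public static class DictionaryExtensions
{
    // Same overwrite-on-duplicate semantics as the answer above.
    public static Dictionary<K, V> ToDictionaryOverWriting<TSource, K, V>(
        this IEnumerable<TSource> source,
        Func<TSource, K> keySelector,
        Func<TSource, V> valueSelector)
    {
        var output = new Dictionary<K, V>();
        foreach (TSource item in source)
        {
            // Indexer assignment overwrites instead of throwing.
            output[keySelector(item)] = valueSelector(item);
        }
        return output;
    }
}
```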
With LINQ you can achieve the result like this:
Dictionary<int, int> dict = (from item in data
from childId in item.ChildIds
select new { item.Id, childId}
).Distinct()
.ToDictionary(kv => kv.childId, kv => kv.Id);
Update:
A version fully compatible with the foreach loop would use group by with Last(), instead of Distinct():
Dictionary<int, int> dict2 = (from item in data
from childId in item.ChildIds
group new { item.Id, childId } by childId into g
select g.Last()
).ToDictionary(kv => kv.childId, kv => kv.Id);
As some have already pointed out, depending on the order of input elements does not feel "functional", and the LINQ expression becomes more convoluted than the original foreach loop.
There is an overload of SelectMany which not only flattens the collection but also lets you shape the result however you like.
var all = data.SelectMany(
data => data.ChildIds, //collectionSelector
(data, ChildId) => new { data.Id, ChildId } //resultSelector
);
Now, if you want to transform all into a Dictionary, you have to remove the duplicate ChildIds first. You can use GroupBy as below, and then pick the last item from each group (as you stated in your question, you want to overwrite Ids as you go). The key of your dictionary should be the unique ChildId:
var dict = all.GroupBy(x => x.ChildId)
.Select(x => x.Last())
.ToDictionary(x => x.ChildId, x => x.Id);
Or you can write a new class that implements IEquatable<> and use it as the return type of the resultSelector (instead of new { data.Id, ChildId }). Then write all.Reverse().Distinct().ToDictionary(x => x.ChildId); so duplicates are detected by your own implementation of the Equals method. Reverse, because you said you want the last occurrence of the duplicates.
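With C# 9 records (which get value-based equality for free) and .NET 6's DistinctBy, that idea can be sketched without hand-writing Equals. DistinctBy keeps the first occurrence it sees, so Reverse first to keep the last one (the data here is made up):

```csharp
using System.Collections.Generic;
using System.Linq;

var all = new List<IdPair> { new(123, 1), new(234, 1), new(123, 2) };

// Reverse so the *last* occurrence of each ChildId is the one
// DistinctBy keeps, then key the dictionary by ChildId.
var dict = Enumerable.Reverse(all)
    .DistinctBy(p => p.ChildId)
    .ToDictionary(p => p.ChildId, p => p.Id);
// dict[123] == 2: the later Id replaced the earlier one

record IdPair(int ChildId, int Id);
```

Enumerable.Reverse is called explicitly because List<T> has its own in-place void Reverse() that would otherwise win overload resolution.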

How can I combine data from rows in a list using LINQ?

I have a list whose items contain two properties, Sequence and Term.
termData <int,string>
For each Sequence there can be multiple Terms.
Is there a way I can combine the terms for each Sequence number so that it creates another list looking something like:
1438690 "weather; the elements; fair weather"
var _result = termData.GroupBy(x => x.Sequence)
.Select(x => new
{
seq = x.Key,
term = x.Select(y => y.Term).ToList()
});
var list = new List<termData>();
list.Add(new termData() { Sequence = 1438690, Terms = "weather" });
list.Add(new termData() { Sequence = 1438690, Terms = "the elements" });
list.Add(new termData() { Sequence = 9672410, Terms = "dogs" });
list.Add(new termData() { Sequence = 9672410, Terms = "cats" });
var result = list
.GroupBy(t => t.Sequence, t => t.Terms)
.Select(g => g.Key + ";" + String.Join(";", g));
foreach (var item in result)
{
Console.WriteLine(item);
}
Output:
1438690;weather;the elements
9672410;dogs;cats
Whenever you have a series of items with a single key referencing multiple items, you can use a Lookup object:
var lookup = list.ToLookup( item => item.Sequence, item => item.Terms);
This code tells C# to create a lookup, which is just like a dictionary where item.Sequence is the key and item.Terms is the value. Each entry is itself a sequence that can be enumerated:
foreach (var item in lookup)
{
Console.WriteLine("Sequence {0} has these terms: {1}", item.Key, string.Join(",", item));
}
Output:
Sequence 1438690 has these terms: weather,the elements
Sequence 9672410 has these terms: dogs,cats
See my working example on DotNetFiddle

How to aggregate millions of rows using EF Core

I'm trying to aggregate approximately two million rows based on user.
One user has several Transactions; each Transaction has a Platform and a TransactionType. I aggregate the Platform and TransactionType columns as JSON and save them as a single row.
But my code is slow.
How can I improve the performance?
public static void AggregateTransactions()
{
using (var db = new ApplicationDbContext())
{
db.ChangeTracker.AutoDetectChangesEnabled = false;
//Get a list of users who have transactions
var users = db.Transactions
.Select(x => x.User)
.Distinct();
foreach (var user in users.ToList())
{
//Get all transactions for a particular user
var _transactions = db.Transactions
.Include(x => x.Platform)
.Include(x => x.TransactionType)
.Where(x => x.User == user)
.ToList();
//Aggregate Platforms from all transactions for user
Dictionary<string, int> platforms = new Dictionary<string, int>();
foreach (var item in _transactions.Select(x => x.Platform).GroupBy(x => x.Name).ToList())
{
platforms.Add(item.Key, item.Count());
};
//Aggregate TransactionTypes from all transactions for user
Dictionary<string, int> transactionTypes = new Dictionary<string, int>();
foreach (var item in _transactions.Select(x => x.TransactionType).GroupBy(x => x.Name).ToList())
{
transactionTypes.Add(item.Key, item.Count());
};
db.Add<TransactionByDay>(new TransactionByDay
{
User = user,
Platforms = platforms, //The dictionary list is represented as json in table
TransactionTypes = transactionTypes //The dictionary list is represented as json in table
});
db.SaveChanges();
}
}
}
Update
So a basic view of the data would look like the following:
Transactions data:
Id: b11c6b67-6c74-4bbe-f712-08d609af20cf,
UserId: 1,
PlatformId: 3,
TransactionTypeId: 1
Id: 4782803f-2f6b-4d99-f717-08d609af20cf,
UserId: 1,
PlatformId: 3,
TransactionTypeId: 4
Aggregated data as TransactionByDay:
Id: 9df41ef2-2fc8-441b-4a2f-08d609e21559,
UserId: 1,
Platforms: {"p3":2},
TransactionsTypes: {"t1":1,"t4":1}
So in this case, two transactions are aggregated into one. You can see that the platforms and transaction types are aggregated as JSON.
You probably should not be calling db.SaveChanges() within the loop. Moving it outside the loop, so the changes are persisted once, may help.
But having said this, when dealing with large volumes of data where performance is key, I've found that ADO.NET is probably a better choice. This does not mean you have to stop using Entity Framework, but perhaps for this method you could use ADO.NET. If you go down this path you could either:
Create a stored procedure to return the data you need to work on, populate a DataTable, manipulate the data, and then persist everything in bulk using SqlBulkCopy.
Use a stored procedure to completely perform this operation. This avoids the need to shuttle the data to your application and the entire processing can happen within the database itself.
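A minimal sketch of the SqlBulkCopy route. The table name, column names, and connection string below are all hypothetical placeholders, and this assumes the Microsoft.Data.SqlClient package:

```csharp
using System.Data;
using Microsoft.Data.SqlClient;

// Stage the pre-aggregated rows in a DataTable...
var table = new DataTable();
table.Columns.Add("UserId", typeof(int));
table.Columns.Add("Platforms", typeof(string));        // json
table.Columns.Add("TransactionTypes", typeof(string)); // json
table.Rows.Add(1, "{\"p3\":2}", "{\"t1\":1,\"t4\":1}");

// ...then push them all to the server in one bulk operation.
static void BulkInsert(DataTable rows, string connectionString)
{
    using var bulk = new SqlBulkCopy(connectionString)
    {
        DestinationTableName = "dbo.TransactionByDay" // hypothetical table
    };
    bulk.WriteToServer(rows);
}
```

One bulk insert replaces thousands of per-row INSERT round trips, which is where most of the SaveChanges-in-a-loop time goes.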
LINQ to EF is not built for speed (LINQ to SQL is easier and faster IMHO, or you could run direct SQL commands with LINQ to EF/SQL). Anyway, I don't know how this would fare speed-wise:
using (var db = new MyContext(connectionstring))
{
var tbd = (from t in db.Transactions
group t by t.User
into g
let platforms = g.GroupBy(tt => tt.Platform.Name)
let trantypes = g.GroupBy(tt => tt.TransactionType.Name)
select new {
User = g.Key,
Platforms = platforms,
TransactionTypes = trantypes
}).ToList()
.Select(u => new TransactionByDay {
User=u.User,
Platforms=u.Platforms.ToDictionary(tt => tt.Key, tt => tt.Count()),
TransactionTypes = u.TransactionTypes.ToDictionary(tt => tt.Key, tt => tt.Count())
});
//...
}
The idea is to do fewer queries and Includes by getting as much of the needed data up front. There is no need to include the Platform and TransactionType with every transaction when you can query them once into a Dictionary and look the data up. Furthermore, we can do the processing in parallel, then save all the data at once.
public static void AggregateTransactions()
{
using (var db = new ApplicationDbContext())
{
db.ChangeTracker.AutoDetectChangesEnabled = false;
//Get a list of users who have transactions
var transactionsByUser = db.Transactions
.GroupBy(x => x.User) //Not sure if EF Core supports this kind of grouping
.ToList();
var platformsById = db.Platforms.ToDictionary(ks => ks.PlatformId);
var transactionTypesById = db.TransactionTypes.ToDictionary(ks => ks.TransactionTypeId);
var bag = new ConcurrentBag<TransactionByDay>();
Parallel.ForEach(transactionsByUser, userTransactions =>
{
//Aggregate Platforms from all transactions for user
Dictionary<string, int> platforms = new Dictionary<string, int>(); //This can be converted to a ConcurrentDictionary
//This can be converted to Parallel.ForEach
foreach (var item in userTransactions.Select(x => platformsById[x.PlatformId]).GroupBy(x => x.Name).ToList())
{
platforms.Add(item.Key, item.Count());
};
//Aggregate TransactionTypes from all transactions for user
Dictionary<string, int> transactionTypes = new Dictionary<string, int>(); //This can be converted to a ConcurrentDictionary
//This can be converted to Parallel.ForEach
foreach (var item in userTransactions.Select(x => transactionTypesById[x.TransactionTypeId]).GroupBy(x => x.Name).ToList())
{
transactionTypes.Add(item.Key, item.Count());
};
bag.Add(new TransactionByDay
{
User = userTransactions.Key,
Platforms = platforms, //The dictionary is represented as json in the table
TransactionTypes = transactionTypes //The dictionary is represented as json in the table
});
});
//Before calling this we may need to check the status of the Parallel.ForEach, or just convert it back to a regular foreach loop if you see no benefit.
db.AddRange(bag);
db.SaveChanges();
}
}
Variation #2
public static void AggregateTransactions()
{
using (var db = new ApplicationDbContext())
{
db.ChangeTracker.AutoDetectChangesEnabled = false;
//Get a list of users who have transactions
var users = db.Transactions
.Select(x => x.User)
.Distinct().ToList();
var platforms = db.Platforms.ToDictionary(ks => ks.PlatformId);
var Transactiontypes = db.TransactionTypes.ToDictionary(ks => ks.TransactionTypeId);
var bag = new ConcurrentBag<TransactionByDay>();
Parallel.ForEach(users, user =>
{
//Note: a DbContext instance is not thread-safe, so sharing db across
//parallel iterations is risky; ideally create one context per iteration.
var _transactions = db.Transactions
.Where(x => x.User == user)
.ToList();
//Aggregate Platforms from all transactions for user
Dictionary<string, int> userPlatforms = new Dictionary<string, int>();
Dictionary<string, int> userTransactions = new Dictionary<string, int>();
foreach(var transaction in _transactions)
{
if(platforms.TryGetValue(transaction.PlatformId, out var platform))
{
if(userPlatforms.TryGetValue(platform.Name, out var tmp))
{
userPlatforms[platform.Name] = tmp + 1;
}
else
{
userPlatforms.Add(platform.Name, 1);
}
}
if(Transactiontypes.TryGetValue(transaction.TransactionTypeId, out var type))
{
if(userTransactions.TryGetValue(type.Name, out var tmp))
{
userTransactions[type.Name] = tmp + 1;
}
else
{
userTransactions.Add(type.Name, 1);
}
}
}
bag.Add(new TransactionByDay
{
User = user,
Platforms = userPlatforms, //The dictionary list is represented as json in table
TransactionTypes = userTransactions //The dictionary list is represented as json in table
});
});
db.AddRange(bag);
db.SaveChanges();
}
}
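A further option worth measuring is to let GroupBy do the counting before anything is materialized per user, so only flat (user, name, count) rows come back. This is sketched against an in-memory array standing in for db.Transactions (property names assumed; with EF Core 3+ a grouping with an aggregate like this can typically translate to a server-side GROUP BY):

```csharp
using System;
using System.Linq;

// Flat stand-in for db.Transactions joined to Platform.
var transactions = new[]
{
    new { UserId = 1, PlatformName = "p3" },
    new { UserId = 1, PlatformName = "p3" },
    new { UserId = 2, PlatformName = "p1" }
};

// Count per (user, platform) first...
var counts = transactions
    .GroupBy(t => new { t.UserId, t.PlatformName })
    .Select(g => new { g.Key.UserId, g.Key.PlatformName, Count = g.Count() })
    .ToList();

// ...then fold the flat counts into one name->count dictionary per user.
var platformsByUser = counts
    .GroupBy(x => x.UserId)
    .ToDictionary(
        g => g.Key,
        g => g.ToDictionary(x => x.PlatformName, x => x.Count));
// platformsByUser[1]["p3"] == 2, platformsByUser[2]["p1"] == 1
```

The same two-step shape would apply to TransactionType counts; only the aggregated rows cross the wire instead of two million transactions.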

Get Key values from a file by using linq query

There are multiple files in a folder that the code should read one by one. I have to extract some key values from each file to perform some business logic.
The file looks like this:
x-sender:
x-receiver:
Received:
X-AuditID:
Received:
Received:
From:
To:
Subject:
Thread-Topic:
Thread-Index:
Date:
Message-ID:
Accept-Language:
Content-Language:
X-MS-Has-Attach:
There are multiple keys, and their number can increase or decrease per file. The order of the keys can also change. Every key has some value.
Code:
private void BtnStart_Click(object sender, EventArgs e)
{
// searches current directory
foreach (string file in Directory.EnumerateFiles(NetWorkPath, "*.eml"))
{
var dic = File.ReadAllLines(file)
.Select(l => l.Split(new[] { ':' }))
.ToDictionary(s => s[0].Trim(), s => s[1].Trim());
string myUser = dic["From"];
}
}
I was trying to read the file and convert it into a dictionary, so that I can access values by their keys. But it is giving me the error "An item with the same key has already been added.".
Any help??
Instead of ToDictionary, you can use ToLookup:
......same code....
.Where(s => s.Length>1)
.ToLookup(s => s[0].Trim(), s => s[1].Trim());
Then you can check it as:
string myUser = dic["From"].FirstOrDefault();
That's because Received is in there multiple times, and Dictionary doesn't allow duplicate entries for its key.
You could use a list of Tuple<string, string>; that would allow duplicates.
If you don't want to return it though, you could just use an anonymous type:
foreach (string file in Directory.EnumerateFiles(NetWorkPath, "*.eml"))
{
var items = File.ReadAllLines(file)
.Select(l => l.Split(new [] {':' }, StringSplitOptions.RemoveEmptyEntries))
.Where(l => l != null && l.Count() == 2)
.Select(l => new
{
Key = l[0],
Value = l[1],
})
.ToList();
string myUser = items.First(i => i.Key == "From").Value;
}
You have two elements with the same name: Received.
That means you have already added the same key to the dictionary twice; for the content of your file, that key is Received.
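One more pitfall: header values such as Date: Mon, 1 Jan 2018 10:00:00 +0000 contain colons themselves, so splitting on every ':' leaves s[1] holding only the fragment between the first two colons. Passing a count of 2 to Split keeps the whole value, and ToLookup tolerates the repeated Received key (the sample lines below are made up):

```csharp
using System;
using System.Linq;

var lines = new[]
{
    "From: alice@example.com",
    "Received: by mail.example.com",
    "Received: by relay.example.com",
    "Date: Mon, 1 Jan 2018 10:00:00 +0000" // value itself contains ':'
};

var headers = lines
    .Select(l => l.Split(new[] { ':' }, 2)) // split on the FIRST colon only
    .Where(parts => parts.Length == 2)
    .ToLookup(p => p[0].Trim(), p => p[1].Trim());

string from = headers["From"].FirstOrDefault();  // "alice@example.com"
string date = headers["Date"].FirstOrDefault();  // full timestamp survives
int receivedCount = headers["Received"].Count(); // duplicates are fine: 2
```

In the real code, lines would come from File.ReadAllLines(file) as in the question.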
