Working with data held in a ConcurrentQueue<T> - C#

I have a background worker that streams data and saves it to a ConcurrentQueue<T>, which is what I need since it is a thread-safe first-in, first-out collection. But I also need to do things like perform simple calculations on, or pull data from, this collection, and I'm not sure what to use at that point. Here is some example pseudo code:
public class ExampleData
{
    public DateTime Date { get; set; }
    public decimal Value { get; set; }
}

public ConcurrentQueue<ExampleData> QueueCol { get; set; } = new();

public void AddToQueue(DateTime date, decimal value)
{
    QueueCol.Enqueue(new ExampleData() { Date = date, Value = value });
}

public void DisplayPastData()
{
    var count = QueueCol.Count();
    var prev1Data = count >= 2 ? QueueCol.ElementAt(count - 2) : null;
    var prev2Data = count >= 3 ? QueueCol.ElementAt(count - 3) : null;
    var prev3Data = count >= 4 ? QueueCol.ElementAt(count - 4) : null;
    if (prev1Data != null)
    {
        Console.WriteLine($"Date: {prev1Data.Date} Value: {prev1Data.Value}");
    }
    if (prev2Data != null)
    {
        Console.WriteLine($"Date: {prev2Data.Date} Value: {prev2Data.Value}");
    }
    if (prev3Data != null)
    {
        Console.WriteLine($"Date: {prev3Data.Date} Value: {prev3Data.Value}");
    }
}
This is a very rough example, but even when just displaying data, most of it looks correct and then I get dates completely out of left field, like a date from the previous day appearing between dates from the current day. Because of ordering issues like that, I know the data isn't correct. My question is: how do I convert the concurrent queue to a new collection that keeps the order and lets me work with the data without getting incorrect results?

The usage pattern you describe in your question makes a ConcurrentQueue<T> unsuitable for your scenario. As far as I can understand, the requirements are:
The producer(s) should be able to enqueue items in the collection without being blocked for any amount of time.
The consumer(s) should be able to perform calculations on a snapshot of the collection, without creating an expensive copy of the collection, and without interfering in any way with the producer(s).
The collection that seems most suitable for your scenario out of the box is the ImmutableList<T>. This collection can be updated with lock-free Interlocked operations, and it is essentially a snapshot by itself (because it is immutable). Here is how you could use it in a multithreading scenario, with thread safety and without blocking any thread:
private ImmutableList<ExampleData> _data = ImmutableList<ExampleData>.Empty;

public ImmutableList<ExampleData> Data => Volatile.Read(ref _data);

public void AddToQueue(DateTime date, decimal value)
{
    var newData = new ExampleData() { Date = date, Value = value };
    ImmutableInterlocked.Update(ref _data, (x, y) => x.Add(y), newData);
}

public void DisplayPastData()
{
    ImmutableList<ExampleData> snapshot = Volatile.Read(ref _data);
    int count = snapshot.Count;
    var prev1Data = count >= 2 ? snapshot[count - 2] : null;
    var prev2Data = count >= 3 ? snapshot[count - 3] : null;
    var prev3Data = count >= 4 ? snapshot[count - 4] : null;
    if (prev1Data != null)
    {
        Console.WriteLine($"Date: {prev1Data.Date} Value: {prev1Data.Value}");
    }
    if (prev2Data != null)
    {
        Console.WriteLine($"Date: {prev2Data.Date} Value: {prev2Data.Value}");
    }
    if (prev3Data != null)
    {
        Console.WriteLine($"Date: {prev3Data.Date} Value: {prev3Data.Value}");
    }
}
The immutable collections are not without disadvantages. They are a lot slower in comparison with the normal collections, they require significantly more memory, and they create significantly more garbage every time they are updated.
An optimal solution to your specific scenario could be a combination of a ConcurrentQueue<ExampleData> (recent data) and a List<ExampleData> (historic data). The producer(s) would enqueue items in the ConcurrentQueue<T>, and the single consumer would dequeue all the items from the ConcurrentQueue<T> and then add them in the List<T>. Then it would use the List<T> to do the calculations.
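For illustration, here is a minimal sketch of that combination, assuming a single consumer and reusing the ExampleData class from the question; the DataAggregator and DrainAndGetHistory names are made up for the example:
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public class DataAggregator
{
    private readonly ConcurrentQueue<ExampleData> _recent = new ConcurrentQueue<ExampleData>();
    private readonly List<ExampleData> _history = new List<ExampleData>();

    // Producer side: never blocks.
    public void AddToQueue(DateTime date, decimal value)
    {
        _recent.Enqueue(new ExampleData { Date = date, Value = value });
    }

    // Single consumer side: move everything queued so far into the ordered history,
    // then work with the plain List (indexing, calculations, display).
    public IReadOnlyList<ExampleData> DrainAndGetHistory()
    {
        while (_recent.TryDequeue(out var item))
        {
            _history.Add(item);
        }
        return _history;
    }
}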

Related

Sort List Of Objects By Properties C# (Custom Sort Order)

I currently have a list of objects that I am trying to sort for a custom-made grid view. I am hoping I can achieve this without creating several customized algorithms. Currently I have a method, called on page load, that sorts the list by customer name, then status. I have a customized status order (new, in progress, has issues, completed, archived), and no matter which sort is used (customer, dates, and so on) it should sort the status in that order. For example:
I have two customers with two orders each; the first customer is Betty White, the second is Mickey Mouse. Currently, Betty has a new order and a completed order, and Mickey has an order in progress and another one that has issues. So the display order should be:
Betty, New :: Betty, Completed
Mickey, In Progress :: Mickey, Has Issues
I am currently using Packages.OrderBy(o => o.Customer).ThenBy(o => o.Status). This works effectively to get the customers sorted; however, it doesn't apply my custom sort order for the status property.
What would be the most efficient and standards-acceptable way to achieve this result?
case PackageSortType.Customer:
    Packages = Packages.OrderBy(o => o.Customer).ThenBy(o => o.Status).ToList<Package>();
    break;
I previously created a method that sorted by status only, however it is my belief that throwing the OrderBy into that algorithm would just jumble the status back up in the end.
private void SortByStatus() {
    // Default sort order is: New, In Progress, Has Issues, Completed, Archived
    List<Package> tempPackages = new List<Package>();
    string[] statusNames = new string[5] { "new", "inProgress", "hasIssue", "completed", "archived" };
    string currentStatus = string.Empty;
    for (int x = 0; x < 5; x++) {
        currentStatus = statusNames[x];
        for (int y = 0; y < Packages.Count; y++) {
            if (tempPackages.Contains(Packages[y])) continue;
            else {
                if (Packages[y].Status == currentStatus)
                    tempPackages.Add(Packages[y]);
            }
        }
    }
    Packages.Clear();
    Packages = tempPackages;
}
Also, I'm not sure if it is relevant or not; however, the Packages list is stored in Session.
EDIT
Thanks to Alex Paven I have resolved the issue of custom sorting my status. I ended up creating a new class for the status, making it implement IComparable<PackageStatus>, and then writing a CompareTo method that forces the proper ordering of the status.
For those who are curious about the solution I came up with (it still needs to be cleaned up), it's located below:
public class PackageStatus : IComparable<PackageStatus> {
    public string Value { get; set; }
    int id = 0;
    static string[] statusNames = new string[5] { "new", "inProgress", "hasIssue", "completed", "archived" };

    public int CompareTo(PackageStatus b) {
        if (b == null) {
            return 1; // by convention, any instance sorts after null
        }
        if (this == b) {
            return 0;
        }
        for (int i = 0; i < 5; i++) {
            if (this.Value == statusNames[i]) { id = i; }
            if (b.Value == statusNames[i]) { b.id = i; }
        }
        return Comparer<int>.Default.Compare(id, b.id);
    }
}
Use:
Packages.OrderBy(o => o.Customer).ThenBy(o => o.Status).ToList<Package>();
I'm not sure what exactly you're asking; why can't you use the Linq expressions in your first code sample? There's OrderByDescending in addition to OrderBy, so you can mix and match the sort order as you desire.
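As a sketch of that idea (an alternative to the IComparable solution above, not what the asker ultimately used), you can encode the custom status order directly in the ThenBy key. This assumes the same status strings and Package properties as in the question and that System.Linq is in scope:
// Rank the statuses explicitly and sort by that rank.
string[] statusOrder = { "new", "inProgress", "hasIssue", "completed", "archived" };

Packages = Packages
    .OrderBy(o => o.Customer)
    .ThenBy(o => Array.IndexOf(statusOrder, o.Status))   // custom status order; unknown statuses (-1) sort first
    .ToList();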

Is there a concurrent sorted dictionary or something similar?

For a project we've been working on we used a concurrent dictionary, which was fine until a new specification came up which required the dictionary to be sorted (it should remain in the order it was added, kind of like a FIFO).
This is currently what we do: we take x items (5 in this case) out of the dictionary:
private Dictionary<PriorityOfMessage, ConcurrentDictionary<Guid, PriorityMessage>> mQueuedMessages = new Dictionary<PriorityOfMessage, ConcurrentDictionary<Guid, PriorityMessage>>();
var messages = new List<KeyValuePair<Guid, PriorityMessage>>();
messages.AddRange(mQueuedMessages[priority].Take(5));
then we do some things with them and eventually, if everything succeeds, we remove them.
mQueuedMessages[priority].TryRemove(messageOfPriority.Key, out _);
However, if things fail we won't remove them and will try again later. Unfortunately there is no concurrent sorted dictionary, but are there ways to ensure the messages stay in the order they were added?
It is very important we can take multiple objects from the list/dictionary without removing them (or we need to be able to add them to the front later).
How often will you take items per second?
It could be a thousand times a second.
1,000 lock operations per second are absolutely nothing; they will consume almost no time at all.
My colleague has already tried using locks and lists and he deemed it too slow.
In all likelihood this means that the locked region was too big. My guess is it went something like that:
lock (...) {
    var item = TakeFromQueue();
    Process(item);
    DeleteFromQueue(item);
}
This does not work because Process is too slow. It must be:
PriorityMessage item;
lock (...) {
    item = TakeFromQueue();
}
Process(item);
lock (...) {
    DeleteFromQueue(item);
}
You will not have any perf problems with that at all.
You can now pick any data structure that you like. You are no longer bound by the capabilities of the built-in concurrent data structures. Besides picking a data structure that you like you also can perform any operation on it that you like such as taking multiple items atomically.
I have not fully understood your needs but it sounds like SortedList might go in the right direction.
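As a rough sketch of that locking approach (the PendingMessages class and its method names are invented for illustration), the critical sections only touch the list, items can be read in batches without being removed, and the actual processing happens outside the lock:
using System.Collections.Generic;
using System.Linq;

public class PendingMessages<T>
{
    private readonly object _gate = new object();
    private readonly List<T> _items = new List<T>();

    public void Add(T item)
    {
        lock (_gate) { _items.Add(item); }            // keeps insertion (FIFO) order
    }

    // Atomically copy out up to 'count' items without removing them.
    public List<T> TakeBatch(int count)
    {
        lock (_gate) { return _items.Take(count).ToList(); }
    }

    // Remove the items only after processing succeeded; the processing itself
    // happens outside any lock.
    public void RemoveProcessed(IEnumerable<T> processed)
    {
        lock (_gate)
        {
            foreach (var item in processed)
            {
                _items.Remove(item);
            }
        }
    }
}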
You could also go for another solution (haven't tested it performance-wise):
public class ConcurrentIndexableQueue<T> {
    private long tailIndex;
    private long headIndex;
    private readonly ConcurrentDictionary<long, T> dictionary;

    public ConcurrentIndexableQueue() {
        tailIndex = -1;
        headIndex = 0;
        dictionary = new ConcurrentDictionary<long, T>();
    }

    public long Count { get { return tailIndex - headIndex + 1; } }

    public bool IsEmpty { get { return Count == 0; } }

    public void Enqueue(T item) {
        // Reserve the next slot atomically, then store the item under that index.
        var enqueuePosition = Interlocked.Increment(ref tailIndex);
        dictionary.AddOrUpdate(enqueuePosition, k => item, (k, v) => item);
    }

    public T Peek(long index) {
        T item;
        return dictionary.TryGetValue(index, out item) ?
            item :
            default(T);
    }

    public long TryDequeue(out T item) {
        if (headIndex > tailIndex) {
            item = default(T);
            return -1;
        }
        var dequeuePosition = Interlocked.Increment(ref headIndex) - 1;
        dictionary.TryRemove(dequeuePosition, out item);
        return dequeuePosition;
    }

    public List<T> GetSnapshot() {
        // Note: this drains the items between the head and tail observed at call time;
        // it removes them from the queue rather than copying them.
        List<T> snapshot = new List<T>();
        long snapshotTail = tailIndex;
        long snapshotHead = headIndex;
        for (long i = snapshotHead; i <= snapshotTail; i++) {
            T item;
            if (TryDequeue(out item) >= 0) {
                snapshot.Add(item);
            }
        }
        return snapshot;
    }
}

Element not getting added in List

I have a class called Estimate and it has the following field and property:
private IList<RouteInformation> _routeMatrix;

public virtual IList<RouteInformation> RouteMatrix
{
    get
    {
        if (_routeMatrix != null && _routeMatrix.Count > 0)
        {
            var routeMatrix = _routeMatrix.ToList();
            routeMatrix =
                routeMatrix.OrderBy(tm => tm.Level.LevelType).ThenBy(tm => tm.Level.LevelValue).ToList();
            return routeMatrix;
        }
        else return _routeMatrix;
    }
    set { _routeMatrix = value; }
}
So, in the getter method, I am just sorting the _routeMatrix by Level Type and then by Level Value and returning the sorted list.
In one of my programs, I have the following code:
public void SaveApprovers(string[] approvers)
{
    int i = 1;
    foreach (var approver in approvers)
    {
        var role = Repository.Get<Role>(long.Parse(approver));
        var level = new Models.Level
        {
            LevelType = LevelType.Approver,
            LevelValue = (LevelValue)i,
            Role = role
        };
        Repository.Save(level);
        var routeInformation = new Models.RouteInformation
        {
            Level = level,
            RouteObjectType = RouteObjectType.Estimate,
            RouteObjectId = _estimate.Id
        };
        Repository.Save(routeInformation);
        _estimate.RouteMatrix.Add(routeInformation); // <--- The problem is here
        Repository.Save(_estimate);
        i++;
    }
}
The problem is that, if there are multiple approvers (i.e., the length of the approvers array is greater than 1), only the first routeInformation is added to the RouteMatrix. I don't know what happens to the rest of them, but the Add method doesn't give any error.
Earlier, RouteMatrix was a public field. This problem started occurring after I made it private and encapsulated it in a public property.
Your getter returns a different list; you are adding to that temporary list.
get
{
    if (_routeMatrix != null && _routeMatrix.Count > 0)
    {
        var routeMatrix = _routeMatrix.ToList(); // ToList creates a _copy_ of the list
        ...
        return routeMatrix;
    }
    else return _routeMatrix;
}
.....
_estimate.RouteMatrix.Add(routeInformation); // add to the result of ToList()
I think the moral here is not to make getters too complicated. The sorting is wasted effort anyway when you just want to Add().
Also, bad things will happen when _routeMatrix == null. That may not happen but then the if (_routeMatrix != null && ...) part is misleading noise.
When you apply ToList(), a completely new list is created that is not related to the original _routeMatrix list. They share the same elements, but adding or removing elements from one list does not affect the other.
From MSDN:
You can append this method to your query in order to obtain a cached
copy of the query results.
So you are successfully modifying a cached copy of _routeMatrix, not _routeMatrix itself.
To solve this issue you can return IEnumerable instead of IList (to prevent collection modifications outside of the Estimate class), and create an AddRouteInformation method on the Estimate class that adds the route information to _routeMatrix. Use that method to add new items:
_estimate.AddRouteInformation(routeInformation);
Repository.Save(_estimate);
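A minimal sketch of that suggestion, reusing the member names from the question (the null/empty handling from the original getter is omitted for brevity, and System.Linq is assumed to be in scope):
private IList<RouteInformation> _routeMatrix = new List<RouteInformation>();

public virtual IEnumerable<RouteInformation> RouteMatrix
{
    get
    {
        return _routeMatrix
            .OrderBy(tm => tm.Level.LevelType)
            .ThenBy(tm => tm.Level.LevelValue)
            .ToList();
    }
}

public virtual void AddRouteInformation(RouteInformation routeInformation)
{
    _routeMatrix.Add(routeInformation);   // modifies the real backing list, not a copy
}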
The problem is that you're not actually modifying _routeMatrix, you're modifying a copy of it. Don't issue the ToList on _routeMatrix, just sort it. Change the get to this:
get
{
    if (_routeMatrix != null && _routeMatrix.Count > 0)
    {
        _routeMatrix =
            _routeMatrix.OrderBy(tm => tm.Level.LevelType).ThenBy(tm => tm.Level.LevelValue).ToList();
        return _routeMatrix;
    }
    else return _routeMatrix;
}

Fastest way to execute LINQ against a BindingList?

I'm writing a WinForms app that contains a simple object like this:
public class MyObject : INotifyPropertyChanged // for two-way data binding
{
    public event PropertyChangedEventHandler PropertyChanged;

    private void RaisePropertyChanged([CallerMemberName] string caller = "")
    {
        if (PropertyChanged != null)
        {
            PropertyChanged(this, new PropertyChangedEventArgs(caller));
        }
    }

    private int _IndexValue;
    public int IndexValue
    {
        get { return _IndexValue; }
        set
        {
            if (value != _IndexValue)
            {
                _IndexValue = value;
                RaisePropertyChanged();
            }
        }
    }

    private string _StringValue;
    public string StringValue
    {
        get { return _StringValue; }
        set
        {
            if (value != _StringValue)
            {
                _StringValue = value;
                _Modified = true;
                RaisePropertyChanged();
            }
        }
    }

    private bool _Modified;
    public bool Modified
    {
        get { return _Modified; }
        set
        {
            if (value != _Modified)
            {
                _Modified = value;
                RaisePropertyChanged();
            }
        }
    }

    public MyObject(int indexValue)
    {
        IndexValue = indexValue;
        StringValue = string.Empty;
        Modified = false;
    }
}
I have a BindingList that will contain a fixed number (100,000) of my objects as well as a BindingSource. Both of those are defined like this:
BindingList<MyObject> myListOfObjects = new BindingList<MyObject>();
BindingSource bindingSourceForObjects = new BindingSource();
bindingSourceForObjects .DataSource = myListOfObjects;
Finally, I have my DataGridView control. It has a single column ("STRINGVALUECOLUMN") which displays the StringValue property of my objects, and it is bound to the BindingSource I just mentioned:
dataGridViewMyObjects.DataSource = bindingSourceForObjects;
When my application starts, I add 100,000 objects to myListOfObjects. Since I only have one column in my DGV and the property it displays is initialized to string.Empty, I basically have a DGV that contains 100,000 "blank" rows. At this point, my user can begin editing the rows to enter strings. They don't have to edit them in any order, so they might put one string in the first row, the next string in row 17, the next string in row 24581, etc.
Sometimes, my users will want to import strings from a text file. Since I have a fixed number of objects (100,000) and there may or may not be some existing strings already entered, I have a few checks to perform during the import process before I add a new string. In the code below, I've removed those checks, but they don't seem to impact the performance of my application. However, if I import tens of thousands of strings using the code below, it's very slow (like 4 or 5 minutes to import 50k lines). I have narrowed it down to something in this block of code:
// this code is inside the loop that reads each line from a file...
// does this string already exist?
int count = myListOfObjects.Count(i => i.StringValue == stringFromFile);
if (count > 0)
{
    Debug.WriteLine("String already exists!"); // don't insert strings that already exist
}
else
{
    // find the first object in myListOfObjects that has a .StringValue property == string.Empty
    // and then update it with the string read from the file
    MyObject myObject = myListOfObjects.FirstOrDefault(i => i.StringValue == string.Empty);
    myObject.StringValue = stringFromFile;
}
It's my understanding that I need two-way binding so I can update the underlying data and have it reflect in the DGV control but I've also read that INotifyPropertyChanged can be slow sometimes. Has anyone ever run into this problem before? If so, how did you solve it?
-- UPDATE --
Just for testing purposes, I replaced:
// does this string already exist?
int count = myListOfObjects.Count(i => i.StringValue == stringFromFile);
if (count > 0)
{
    Debug.WriteLine("String already exists!"); // don't insert strings that already exist
}
else
{
    // find the first object in myListOfObjects that has a .StringValue property == string.Empty
    // and then update it with the string read from the file
    MyObject myObject = myListOfObjects.FirstOrDefault(i => i.StringValue == string.Empty);
    myObject.StringValue = stringFromFile;
}
with a for loop containing:
myListOfObjects[counter].StringValue = "some random string";
This is extremely fast even with 100,000 objects. However, I've now lost the ability to 1) check to see if the string that I read from the file is already assigned to an object in the list before I assign it and 2) find the first available object in the list whose StringValue property == string.Empty and then update that value accordingly. So it seems that:
int count = myListOfObjects.Count(i => i.StringValue == stringFromFile);
and
MyObject myObject = myListOfObjects.FirstOrDefault(i => i.StringValue == string.Empty);
...are the source of my performance problems. Is there a faster, more efficient way to perform these two operations against my BindingList?
The thing about LINQ is that it's really just standard loops, optimized of course, but still regular old loops, behind the scenes.
One thing that may speed your code up is this:
myListOfObjects.Any(i => i.StringValue.Equals(stringFromFile));
This returns a simple boolean: does X exist? It exits early, so it won't scan the entire collection if it doesn't have to. .Count() not only scans the whole collection but also keeps a running count.
Another thing to point out: since you are using FirstOrDefault, the result could be null. Make sure you null-check myObject before trying to use it.
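Putting those two suggestions together, a sketch using the names from the question:
// does this string already exist? (Any() exits early instead of counting everything)
if (!myListOfObjects.Any(i => i.StringValue == stringFromFile))
{
    MyObject myObject = myListOfObjects.FirstOrDefault(i => i.StringValue == string.Empty);
    if (myObject != null)   // FirstOrDefault returns null when no empty slot is left
    {
        myObject.StringValue = stringFromFile;
    }
}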
Finally, as suggested by Mr Saunders, check the event stack and make sure there isn't more code running than you think there is. This is a danger in operations like this. You might need to borrow some code from the initialization engine and use this.SuspendLayout() and this.ResumeLayout()
The problem may be that when you update the underlying data, events fire to cause the grid to update. Lots of data changing == lots of updates.
It's been a long time since I've done much with Windows Forms, but check out the SuspendLayout method.
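One way to apply that idea during the bulk import is to suppress the per-item notifications and layout work until the import finishes, then refresh once. This sketch pairs the SuspendLayout suggestion with BindingList<T>'s RaiseListChangedEvents flag and ResetBindings() (my addition, not something the answer mentions); ImportStringsFromFile() is a hypothetical stand-in for the question's file-reading loop:
dataGridViewMyObjects.SuspendLayout();
myListOfObjects.RaiseListChangedEvents = false;   // stop ListChanged firing for every assignment
try
{
    ImportStringsFromFile();                      // hypothetical: the import loop from the question
}
finally
{
    myListOfObjects.RaiseListChangedEvents = true;
    myListOfObjects.ResetBindings();              // one reset notification for the whole import
    dataGridViewMyObjects.ResumeLayout();
}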

Sortable linked list of objects

For a school lab I have to make a linked list of messages and then sort those messages by priority, with "High" priority being pulled out first, then medium, then low. I've been trying to figure this out for days and I can't wrap my mind around the sorting. I've been trying to get it to sort without adding anything other than a head and a size field in my ListOfMessages class, but all I seem to do is add garbage code. I wanted to figure this out myself, but right now I'm stumped.
Here's what I have so far. Hopefully you can make sense of it:
class ListOfMessages
{
    private int m_nSize;
    public Message m_cListStart;
    //public Message m_cNextItem;
    public Message m_cLastItem;

    public ListOfMessages()
    {
        m_nSize = 0;
        m_cListStart = null;
        //m_cNextItem = null;
    }

    public int Count
    {
        get { return m_nSize; }
    }

    public string Display(Message cMessage)
    {
        return "Message: " + cMessage.m_strMessage + "\nPriority: " + cMessage.m_strPriority;
    }

    //list additions
    public int Add(Message newMessage)
    {
        Message nextMessage = new Message();
        //inserts objects at the end of the list
        if (m_nSize == 0)
        {
            m_cListStart = newMessage;
            //m_cLastItem = newMessage;
        }
        else
        {
            Message CurrentMessage = m_cListStart;
            if (newMessage.m_strPriority == "High")
            {
                if (m_cListStart.m_strPriority != "High")
                {
                    //Make the object at the start of the list point to itself
                    CurrentMessage.m_cNext = m_cListStart;
                    //Replace the object at the start of the list with the new message
                    m_cListStart = newMessage;
                }
                else
                {
                    Message LastMessage = null;
                    for (int iii = 0; iii < m_nSize; iii++) //(newmessage.m_strpriority == iii.m_strpriority)
                                                            //&& (iii.m_cnext == null);)
                    {
                        if (m_cListStart.m_strPriority != "High")
                        {
                            nextMessage = newMessage;
                            nextMessage.m_cNext =
                                CurrentMessage = nextMessage;
                            //LastMessage.m_cNext = CurrentMessage;
                        }
                        LastMessage = CurrentMessage;
                        if (m_cListStart.m_cNext != null)
                            m_cListStart = m_cListStart.m_cNext;
                    }
                }
                //iii = iii.m_cnext;
            }
            // for (int iii = m_cListStart; ; iii = iii.m_cNext)//(newMessage.m_strPriority == iii.m_strPriority)
            // //&& (iii.m_cNext == null);)
            //{
            //    //Message lastMessage = iii;
            //    if (iii.m_strPriority != iii.m_strPriority)
            //    {
            //        //iii.m_cNext = newMessage;
            //        newMessage.m_cNext = iii.m_cNext;
            //        iii.m_cNext = newMessage;
            //    }
            //m_cLastItem.m_cNext = newMessage;
        }
        //m_cLastItem = newMessage;
        return m_nSize++;
    }

    public Message Pop()
    {
        //Message Current = m_cListStart;
        //if the object at the start of the list has another object after it, make that object the start of the list
        //and decrease the size by 1 after popping an object off if there is more than 1 object after the start of the list
        if (m_cListStart.m_cNext != null)
        {
            m_cListStart = m_cListStart.m_cNext;
        }
        if (m_nSize > 0)
            m_nSize--;
        else
            m_cListStart = null;
        return m_cListStart;
        //if (m_cListStart.m_cNext != null)
        //    m_cListStart = m_cListStart.m_cNext;
        //if (m_nSize > 1)
        //    m_nSize--;
        //return m_cListStart;
    }
}
My pop function to retrieve the messages might need some refining but most of the work right now lies in the Add function. I'm really just stumbling through the dark there.
Does anyone know of a simple way of doing what I'm asking?
Why don't you write a custom linked list as follows:
class Node<T> : IComparable<T>
{
    public int Priority { set; get; }
    public T Data { set; get; }
    public Node<T> Next { set; get; }
    public Node<T> Previous { set; get; }
    // you need to implement IComparable here for sorting.
}
That is your node definition. Now we need to implement a LinkedList class.
Your linked list can be a doubly linked list, since you don't have any specs against it, and a doubly linked list makes the operations easier.
Here is the definition:
class LinkedList<T> : IEnumerable<T> where T : IComparable
{
    public Node<T> Head { set; get; }
    public Node<T> Tail { set; get; }

    // set of constructors
    //.....

    public void Insert(Node<T> node)
    {
        // you can do recursive or iterative impl. very easy.
    }

    // other public methods such as remove, insertAfter, insert before, insert last etc.

    public void Sort()
    {
        // easiest solution is to use insertion sort based on priority.
    }
}
If you can get away with creating extra memory (i.e., another linked list), insertion sort would be fine. For that you need to implement the insert-after functionality as well.
I have a LinkedList implementation you can check out; you just need to implement sorting based on priority, using bubble sort, insertion sort, or merge sort.
Also, you might want to look at a heap, which you can use to implement a priority queue; it serves this purpose well. I have a heap data structure implementation you can check out as well.
The easiest solution would be to have three singly-linked lists, one for each priority.
When you add, you add to the end of the correct list. When you remove, you first try to remove from the highest priority list. If that is empty, try the middle one. If even that is empty, use the lowest list.
If you have constant number of priorities, the time complexities are O(1) in both cases.
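A sketch of that idea, using the framework's LinkedList<T> in place of the hand-rolled message list and assuming the Message class and its m_strPriority field from the question; the class name, method names, and the "Medium"/"Low" priority strings are illustrative:
using System.Collections.Generic;

public class PriorityMessageList
{
    private readonly LinkedList<Message>[] _lists =
    {
        new LinkedList<Message>(),   // "High"
        new LinkedList<Message>(),   // "Medium" (assumed string)
        new LinkedList<Message>()    // "Low" (assumed string)
    };

    private static int IndexOf(string priority)
    {
        return priority == "High" ? 0 : priority == "Medium" ? 1 : 2;
    }

    // Append to the end of the list for the message's priority: O(1).
    public void Add(Message message)
    {
        _lists[IndexOf(message.m_strPriority)].AddLast(message);
    }

    // Take from the highest-priority non-empty list: O(1) with a fixed number of priorities.
    public Message Pop()
    {
        foreach (var list in _lists)
        {
            if (list.Count > 0)
            {
                Message first = list.First.Value;
                list.RemoveFirst();
                return first;
            }
        }
        return null;   // all three lists are empty
    }
}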
