I've been working on improving a library that is built with EF 5. I would like to use the Task Parallel Library to speed performance up. I understand that a DbContext is not thread-safe and that a new DbContext needs to be created for every transaction.
The problem, I believe, is how to get a new context generated each time my IQueryable is iterated. Here is a truncated version of my implementation:
public virtual void ProcessAttachments(IQueryable<File> results)
{
var uniqueOrderKeys = results.Select(r => r.ForeignKey).Distinct();
//process each order
Parallel.ForEach(uniqueOrderKeys, key =>
{
var key1 = key;
var resultsForKey = results.Where(result => result.ForeignKey == key1);
//process File objects for the order
Parallel.ForEach(resultsForKey, result =>
{
string orderNum;
using (var da = new DataAccess()) //DataAccess creates the DbContext and implements IDisposable
{
orderNum = da.GetOrderNumberByOrderKey(key);
}
});
});
}
Is there a way to specify a new DbContext to be used as my IQueryable results are looped through and retrieved?
I just put this together and I think it may help you:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Text;
using WhereverYourObjectContextLives;
/// <summary>
/// Provides an iterator pattern over a collection such that the results may be processed in parallel.
/// </summary>
public abstract class ParallelSkipTakeIterator<T>
{
private int currentIndex = 0;
private int batchSize;
private Expression<Func<T, int>> orderBy;
private ParallelQuery<T> currentBatch;
/// <summary>
/// Build the iterator, specifying an Order By function, and optionally a <code>batchSize</code>.
/// </summary>
/// <param name="orderBy">Function which selects the id to sort by</param>
/// <param name="batchSize">number of rows to return at once - defaults to 1000</param>
/// <remarks>
/// <code>batchSize</code> balances overhead with cost of parallelizing and instantiating
/// new database contexts. This should be scaled based on observed performance.
/// </remarks>
public ParallelSkipTakeIterator(Expression<Func<T, int>> orderBy, int batchSize = 1000)
{
this.batchSize = batchSize;
this.orderBy = orderBy;
}
/// <summary>
/// Accesses the materialized result of the most recent iteration (execution of the query).
/// </summary>
public ParallelQuery<T> CurrentBatch
{
get
{
if (this.currentBatch == null)
{
throw new InvalidOperationException("Must call HasNext at least once before accessing the CurrentBatch.");
}
return this.currentBatch;
}
}
/// <summary>
/// Does the current iterator have another batch of data to process?
/// </summary>
/// <returns>true if more data can be accessed via <code>CurrentBatch</code></returns>
/// <remarks>
/// Creates a new database context, issues a query, and places a materialized collection in <code>CurrentBatch</code>.
/// Context is disposed once the query is issued.
/// Materialized collection is specified by <code>BuildIQueryable</code>. Use of any associated navigation properties
/// must be accounted for by using the appropriate <code>.Include</code> operator where the query is
/// built in <code>BuildIQueryable</code>.
/// </remarks>
public bool HasNext()
{
using (YourObjectContextHere db = new YourObjectContextHere())
{
this.currentBatch = this.BuildIQueryable(db)
.OrderBy(this.orderBy)
.Skip(this.currentIndex)
.Take(this.batchSize)
.ToList()
.AsParallel();
this.currentIndex += this.batchSize;
return currentBatch.Count() > 0;
}
}
/// <summary>
/// Given a Database Context, builds a query which can be executed in batches.
/// </summary>
/// <param name="db">context on which to build and execute the query</param>
/// <returns>a query which will be executed and materialized</returns>
/// <remarks>Context will be disposed as soon as HasNext has been executed.</remarks>
protected abstract IQueryable<T> BuildIQueryable(YourObjectContextHere db);
}
You may then subclass this thing and implement BuildIQueryable as follows:
class MyObjectIterator : ParallelSkipTakeIterator<MyObject>
{
private List<int> instanceIds;
public MyObjectIterator(List<int> someExtraInfoNeededByQuery)
: base(f => f.InstanceId)
{
this.instanceIds = someExtraInfoNeededByQuery;
}
protected override IQueryable<MyObject> BuildIQueryable(YourObjectContextHere db)
{
IQueryable<MyObject> myObjects = db.SomeCollection.Where(x => this.instanceIds.Contains(x.InstanceId)).Include("SomethingImportant");
return myObjects;
}
}
Finally, you can loop over the data sets like so:
MyObjectIterator myIterator = new MyObjectIterator(someExtraInfoNeededByQuery);
while (myIterator.HasNext())
{
ParallelQuery<MyObject> query = myIterator.CurrentBatch;
query.ForAll(item =>
doSomethingCool(item));
}
I need a data structure like a queue: only the first element can exit, and new elements are added to the end of the queue. But I also need access to the last element.
System.Collections.Queue has all the functionality I need except the last one.
I'm wondering, is there any built-in data structure like this?
The C# LinkedList class is exactly what you need.
See https://stackoverflow.com/a/1798358/2394945
Based on the comments and other answers, the best way is to create my own queue class; implementing a high-performance queue from scratch would be duplicated effort (it already exists in C#).
So I simply created a MyQueue class that uses a System.Collections.Generic.Queue<T> as its inner data structure. Here is the code:
/// <summary>
/// A generic first in first out data structure with access to last element.
/// </summary>
/// <typeparam name="T">The specific data type for this queue.</typeparam>
public class MyQueue<T>
{
#region Variables
private Queue<T> _inner;
private T _last;
#endregion
#region Properties
/// <summary>
/// Number of elements of this Queue.
/// </summary>
public int Count
{
get
{
return _inner.Count;
}
}
/// <summary>
/// The inner Queue in this structure.
/// </summary>
public Queue<T> Inner
{
get
{
return _inner;
}
}
#endregion
public MyQueue()
{
_inner = new Queue<T>();
}
#region Methods
/// <summary>
/// Adds an object to the end of the queue.
/// </summary>
/// <param name="item">Specific item for add.</param>
public void Enqueue(T item)
{
_inner.Enqueue(item);
_last = item;
}
/// <summary>
/// Returns and removes the first item in the queue.
/// </summary>
/// <returns>The first item in the queue.</returns>
/// <exception cref="InvalidOperationException">Thrown when the queue is empty.</exception>
public T Dequeue()
{
if (_inner.Count > 0)
return _inner.Dequeue();
else
throw new InvalidOperationException("The Queue is empty.");
}
/// <summary>
/// Returns the first item in the queue without removing it.
/// </summary>
/// <returns>The first item in the queue.</returns>
public T Peek()
{
return _inner.Peek();
}
/// <summary>
/// Returns the last item in the queue without removing it.
/// </summary>
/// <returns>The last item in the queue.</returns>
/// <exception cref="InvalidOperationException">Thrown when the queue is empty.</exception>
public T Last()
{
if (_inner.Count == 0)
throw new InvalidOperationException("The Queue is empty.");
return _last;
}
/// <summary>
/// Clears all items in the queue.
/// </summary>
public void Clear()
{
_inner.Clear();
_last = default(T); // reset so Last() cannot return a stale item
}
#endregion
}
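A quick usage sketch (purely illustrative):
var queue = new MyQueue<string>();
queue.Enqueue("first");
queue.Enqueue("second");
Console.WriteLine(queue.Peek());    // "first" - the head of the queue
Console.WriteLine(queue.Last());    // "second" - the most recently enqueued item
Console.WriteLine(queue.Dequeue()); // "first"
Console.WriteLine(queue.Last());    // still "second"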
I need to add cache functionality and found a shiny new class called MemoryCache. However, I find MemoryCache a little bit crippled as it is (I'm in need of regions functionality). Among other things I need to add something like ClearAll(region). The authors made a great effort to keep this class without regions support; code like:
if (regionName != null)
{
throw new NotSupportedException(R.RegionName_not_supported);
}
appears in almost every method.
I don't see an easy way to override this behaviour. The only way to add region support that I can think of is to add a new class as a wrapper around MemoryCache rather than as a class that inherits from it, then in this new class maintain a Dictionary and let each method "buffer" region calls. Sounds nasty and wrong, but eventually...
Do you know of better ways to add regions to MemoryCache?
I know it is a long time since you asked this question, so this is not really an answer to you, but rather an addition for future readers.
I was also surprised to find that the standard implementation of MemoryCache does NOT support regions. It would have been so easy to provide right away. I therefore decided to wrap the MemoryCache in my own simple class to provide the functionality I often need.
I enclose my code here to save time for others having the same need!
/// <summary>
/// =================================================================================================================
/// This is a static encapsulation of the Framework provided MemoryCache to make it easier to use.
/// - Keys can be of any type, not just strings.
/// - A typed Get method is provided for the common case where type of retrieved item actually is known.
/// - Exists method is provided.
/// - Except for the Set method with custom policy, some specific Set methods are also provided for convenience.
/// - One SetAbsolute method with remove callback is provided as an example.
/// The Set method can also be used for custom remove/update monitoring.
/// - Domain (or "region") functionality missing in default MemoryCache is provided.
/// This is very useful when adding items with identical keys but belonging to different domains.
/// Example: "Customer" with Id=1, and "Product" with Id=1
/// =================================================================================================================
/// </summary>
public static class MyCache
{
private const string KeySeparator = "_";
private const string DefaultDomain = "DefaultDomain";
private static MemoryCache Cache
{
get { return MemoryCache.Default; }
}
// -----------------------------------------------------------------------------------------------------------------------------
// The default instance of the MemoryCache is used.
// Memory usage can be configured in standard config file.
// -----------------------------------------------------------------------------------------------------------------------------
// cacheMemoryLimitMegabytes: The amount of maximum memory size to be used. Specified in megabytes.
// The default is zero, which indicates that the MemoryCache instance manages its own memory
// based on the amount of memory that is installed on the computer.
// physicalMemoryPercentage: The percentage of physical memory that the cache can use. It is specified as an integer value from 1 to 100.
// The default is zero, which indicates that the MemoryCache instance manages its own memory
// based on the amount of memory that is installed on the computer.
// pollingInterval: The time interval after which the cache implementation compares the current memory load with the
// absolute and percentage-based memory limits that are set for the cache instance.
// The default is two minutes.
// -----------------------------------------------------------------------------------------------------------------------------
// <configuration>
// <system.runtime.caching>
// <memoryCache>
// <namedCaches>
// <add name="default" cacheMemoryLimitMegabytes="0" physicalMemoryPercentage="0" pollingInterval="00:02:00" />
// </namedCaches>
// </memoryCache>
// </system.runtime.caching>
// </configuration>
// -----------------------------------------------------------------------------------------------------------------------------
/// <summary>
/// Store an object and let it stay in cache until manually removed.
/// </summary>
public static void SetPermanent(string key, object data, string domain = null)
{
CacheItemPolicy policy = new CacheItemPolicy { };
Set(key, data, policy, domain);
}
/// <summary>
/// Store an object and let it stay in cache x minutes from write.
/// </summary>
public static void SetAbsolute(string key, object data, double minutes, string domain = null)
{
CacheItemPolicy policy = new CacheItemPolicy { AbsoluteExpiration = DateTime.Now + TimeSpan.FromMinutes(minutes) };
Set(key, data, policy, domain);
}
/// <summary>
/// Store an object and let it stay in cache x minutes from write.
/// callback is a method to be triggered when item is removed
/// </summary>
public static void SetAbsolute(string key, object data, double minutes, CacheEntryRemovedCallback callback, string domain = null)
{
CacheItemPolicy policy = new CacheItemPolicy { AbsoluteExpiration = DateTime.Now + TimeSpan.FromMinutes(minutes), RemovedCallback = callback };
Set(key, data, policy, domain);
}
/// <summary>
/// Store an object and let it stay in cache x minutes from last write or read.
/// </summary>
public static void SetSliding(object key, object data, double minutes, string domain = null)
{
CacheItemPolicy policy = new CacheItemPolicy { SlidingExpiration = TimeSpan.FromMinutes(minutes) };
Set(key, data, policy, domain);
}
/// <summary>
/// Store an item and let it stay in cache according to specified policy.
/// </summary>
/// <param name="key">Key within specified domain</param>
/// <param name="data">Object to store</param>
/// <param name="policy">CacheItemPolicy</param>
/// <param name="domain">NULL will fallback to default domain</param>
public static void Set(object key, object data, CacheItemPolicy policy, string domain = null)
{
// Use Set rather than Add so that an existing entry with the same key is overwritten.
Cache.Set(CombinedKey(key, domain), data, policy);
}
/// <summary>
/// Get typed item from cache.
/// </summary>
/// <param name="key">Key within specified domain</param>
/// <param name="domain">NULL will fallback to default domain</param>
public static T Get<T>(object key, string domain = null)
{
return (T)Get(key, domain);
}
/// <summary>
/// Get item from cache.
/// </summary>
/// <param name="key">Key within specified domain</param>
/// <param name="domain">NULL will fallback to default domain</param>
public static object Get(object key, string domain = null)
{
return Cache.Get(CombinedKey(key, domain));
}
/// <summary>
/// Check if item exists in cache.
/// </summary>
/// <param name="key">Key within specified domain</param>
/// <param name="domain">NULL will fallback to default domain</param>
public static bool Exists(object key, string domain = null)
{
return Cache[CombinedKey(key, domain)] != null;
}
/// <summary>
/// Remove item from cache.
/// </summary>
/// <param name="key">Key within specified domain</param>
/// <param name="domain">NULL will fallback to default domain</param>
public static void Remove(object key, string domain = null)
{
Cache.Remove(CombinedKey(key, domain));
}
#region Support Methods
/// <summary>
/// Parse domain from combinedKey.
/// This method is exposed publicly because it can be useful in callback methods.
/// The key property of the callback argument will in our case be the combinedKey.
/// To be interpreted, it needs to be split into domain and key with these parse methods.
/// </summary>
public static string ParseDomain(string combinedKey)
{
return combinedKey.Substring(0, combinedKey.IndexOf(KeySeparator));
}
/// <summary>
/// Parse key from combinedKey.
/// This method is exposed publicly because it can be useful in callback methods.
/// The key property of the callback argument will in our case be the combinedKey.
/// To be interpreted, it needs to be split into domain and key with these parse methods.
/// </summary>
public static string ParseKey(string combinedKey)
{
return combinedKey.Substring(combinedKey.IndexOf(KeySeparator) + KeySeparator.Length);
}
/// <summary>
/// Create a combined key from given values.
/// The combined key is used when storing and retrieving from the inner MemoryCache instance.
/// Example: Product_76
/// </summary>
/// <param name="key">Key within specified domain</param>
/// <param name="domain">NULL will fallback to default domain</param>
private static string CombinedKey(object key, string domain)
{
return string.Format("{0}{1}{2}", string.IsNullOrEmpty(domain) ? DefaultDomain : domain, KeySeparator, key);
}
#endregion
}
You can create more than just one MemoryCache instance, one for each partition of your data.
http://msdn.microsoft.com/en-us/library/system.runtime.caching.memorycache.aspx :
you can create multiple instances of the MemoryCache class for use in the same application and in the same AppDomain instance
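As a sketch (cache names and items here are made up), one named instance per region means ClearAll(region) reduces to disposing and recreating that instance:
var customerCache = new MemoryCache("Customers");
var productCache = new MemoryCache("Products");
// Identical keys no longer collide, because each "region" is its own cache instance
customerCache.Set("1", "customer #1 data", new CacheItemPolicy());
productCache.Set("1", "product #1 data", new CacheItemPolicy());
// "ClearAll" for one region: dispose the instance and start afresh
customerCache.Dispose();
customerCache = new MemoryCache("Customers");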
I just recently came across this problem. I know this is an old question, but maybe this will be useful for some folks. Here is my iteration of the solution by Thomas F. Abraham:
namespace CLRTest
{
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Globalization;
using System.Linq;
using System.Runtime.Caching;
class Program
{
static void Main(string[] args)
{
CacheTester.TestCache();
}
}
public class SignaledChangeEventArgs : EventArgs
{
public string Name { get; private set; }
public SignaledChangeEventArgs(string name = null) { this.Name = name; }
}
/// <summary>
/// Cache change monitor that allows an app to fire a change notification
/// to all associated cache items.
/// </summary>
public class SignaledChangeMonitor : ChangeMonitor
{
// Shared across all SignaledChangeMonitors in the AppDomain
private static ConcurrentDictionary<string, EventHandler<SignaledChangeEventArgs>> ListenerLookup =
new ConcurrentDictionary<string, EventHandler<SignaledChangeEventArgs>>();
private string _name;
private string _key;
private string _uniqueId = Guid.NewGuid().ToString("N", CultureInfo.InvariantCulture);
public override string UniqueId
{
get { return _uniqueId; }
}
public SignaledChangeMonitor(string key, string name)
{
_key = key;
_name = name;
// Register instance with the shared event
ListenerLookup[_uniqueId] = OnSignalRaised;
base.InitializationComplete();
}
public static void Signal(string name = null)
{
// Raise shared event to notify all subscribers
foreach (var subscriber in ListenerLookup.ToList())
{
subscriber.Value?.Invoke(null, new SignaledChangeEventArgs(name));
}
}
protected override void Dispose(bool disposing)
{
// Set delegate to null so it can't be accidentally called in Signal() while being disposed
ListenerLookup[_uniqueId] = null;
EventHandler<SignaledChangeEventArgs> outValue = null;
ListenerLookup.TryRemove(_uniqueId, out outValue);
}
private void OnSignalRaised(object sender, SignaledChangeEventArgs e)
{
if (string.IsNullOrWhiteSpace(e.Name) || string.Compare(e.Name, _name, true) == 0)
{
// Cache objects are obligated to remove entry upon change notification.
base.OnChanged(null);
}
}
}
public static class CacheTester
{
private static Stopwatch _timer = new Stopwatch();
public static void TestCache()
{
MemoryCache cache = MemoryCache.Default;
int size = (int)1e6;
Start();
for (int idx = 0; idx < size; idx++)
{
cache.Add(idx.ToString(), "Value" + idx.ToString(), GetPolicy(idx, cache));
}
long prevCnt = cache.GetCount();
Stop($"Added {prevCnt} items");
Start();
SignaledChangeMonitor.Signal("NamedData");
Stop($"Removed {prevCnt - cache.GetCount()} entries");
prevCnt = cache.GetCount();
Start();
SignaledChangeMonitor.Signal();
Stop($"Removed {prevCnt - cache.GetCount()} entries");
}
private static CacheItemPolicy GetPolicy(int idx, MemoryCache cache)
{
string name = (idx % 10 == 0) ? "NamedData" : null;
CacheItemPolicy cip = new CacheItemPolicy();
cip.AbsoluteExpiration = System.DateTimeOffset.UtcNow.AddHours(1);
var monitor = new SignaledChangeMonitor(idx.ToString(), name);
cip.ChangeMonitors.Add(monitor);
return cip;
}
private static void Start()
{
_timer.Start();
}
private static void Stop(string msg = null)
{
_timer.Stop();
Console.WriteLine($"{msg} | {_timer.Elapsed.TotalSeconds} sec");
_timer.Reset();
}
}
}
His solution involved using an event to keep track of ChangeMonitors, but the Dispose method was slow when there were more than 10k entries. My guess is that the code SignaledChangeMonitor.Signaled -= OnSignalRaised removes a delegate from the invocation list by doing a linear search, so removing a lot of entries becomes slow. I decided to use a ConcurrentDictionary instead of an event, in the hope that Dispose becomes faster. I ran some basic performance tests and here are the results:
Added 10000 items | 0.027697 sec
Removed 1000 entries | 0.0040669 sec
Removed 9000 entries | 0.0105687 sec
Added 100000 items | 0.5065736 sec
Removed 10000 entries | 0.0338991 sec
Removed 90000 entries | 0.1418357 sec
Added 1000000 items | 6.5994546 sec
Removed 100000 entries | 0.4176233 sec
Removed 900000 entries | 1.2514225 sec
I am not sure whether my code has some critical flaws. I would like to know if that is the case.
Another approach is to implement a wrapper around MemoryCache that implements regions by composing the key and region name, e.g.
public interface ICache
{
...
object Get(string key, string regionName = null);
...
}
public class MyCache : ICache
{
private readonly MemoryCache cache;
public MyCache(MemoryCache cache)
{
this.cache = cache;
}
...
public object Get(string key, string regionName = null)
{
var regionKey = RegionKey(key, regionName);
return cache.Get(regionKey);
}
private string RegionKey(string key, string regionName)
{
// NB Implements region as a suffix, for prefix, swap order in the format
return string.IsNullOrEmpty(regionName) ? key : string.Format("{0}{1}{2}", key, "::", regionName);
}
...
}
It's not perfect but it works for most use cases.
I've implemented this and it's available as a NuGet package: Meerkat.Caching
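Usage is then just a matter of passing the optional region name; a sketch against the wrapper above (assuming the elided members follow the same pattern):
var cache = new MyCache(MemoryCache.Default);
var customer = cache.Get("1", "Customer"); // stored under "1::Customer"
var product = cache.Get("1", "Product");   // stored under "1::Product"
var other = cache.Get("1");                // no region: stored under plain "1"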
I am experiencing some weird behavior that disappears/reappears based on whether this dictionary is a new instance of the object, or the old instance of the object. Let me provide all the code first.
/// <summary>
/// Removes a control from monitoring/Session/Database based on ID.
/// </summary>
public static void Remove<T>(ICormantControl<T> control)
{
_logger.InfoFormat("Removing {0}", control.ID);
SerializableDictionary<string, T> states = new SerializableDictionary<string,T>(GetStates<SerializableDictionary<string, T>>());
((IDictionary)states).Remove(control.ID);
SetStates(states);
}
/// <summary>
/// Retrieves information on an object. If the object is cached to Session then the
/// cached object is retrieved. Else, it is retrieved from the database.
/// </summary>
/// <typeparam name="T"> The type of object expected to get back.</typeparam>
/// <returns> Collection of data for the specific object type.</returns>
public static T GetStates<T>() where T : new()
{
T states = new T();
string stateName = GetStateNameFromType(typeof(T));
if (!Equals(SessionRepository.Instance.GetSession(stateName), null))
{
states = (T)SessionRepository.Instance.GetSession(stateName);
}
else
{
XmlSerializer serializer = new XmlSerializer(states.GetType());
string data = DatabaseRepository.Instance.GetWebLayoutData(stateName);
if (!string.IsNullOrEmpty(data))
{
byte[] dataAsArray = Convert.FromBase64String(data);
MemoryStream stream = new MemoryStream(dataAsArray);
states = (T)serializer.Deserialize(stream);
}
SessionRepository.Instance.SetSession(stateName, states);
}
return states;
}
public static void SetStates<T>(T states) where T : new()
{
string stateName = GetStateNameFromType(typeof(T));
SessionRepository.Instance.SetSession(stateName, states);
if (shouldWriteToDatabase) DatabaseRepository.Instance.SaveToDatabase(stateName, states);
}
/// <summary>
/// Recreates the page state recursively by creating a control and looking for its known children.
/// </summary>
/// <param name="pane"> Pane having children added to it.</param>
private void RegeneratePaneChildren(CormantRadPane pane)
{
_logger.InfoFormat("Initializing paneToResize children for paneToResize {0}", pane.ID);
foreach (var splitterState in StateManager.GetStates<SerializableDictionary<string, RadSplitterSetting>>())
{
RadSplitterSetting splitterSetting = splitterState.Value;
if (!splitterSetting.ParentID.Contains(pane.ID)) continue;
CormantRadSplitter splitter = new CormantRadSplitter(splitterSetting);
pane.UpdatePanel.ContentTemplateContainer.Controls.AddAt(0, splitter); //Visibility will fight with splitter if you don't re-add like this.
RegenerateSplitterChildren(splitter);
}
}
/// <summary>
/// Recreates the page state recursively by creating a control and looking for its known children.
/// </summary>
/// <param name="splitter"> Splitter having children added to it. </param>
public void RegenerateSplitterChildren(RadSplitter splitter)
{
_logger.InfoFormat("Initializing splitter children for splitter {0}", splitter.ID);
foreach (var paneState in StateManager.GetStates<SerializableDictionary<string, RadPaneSetting>>()
.Where(paneState => paneState.Value.ParentID.Contains(splitter.ID)))
{
RadPaneSetting paneSetting = paneState.Value;
CormantRadPane pane = new CormantRadPane(paneSetting);
StyledUpdatePanel updatePanel = pane.CreateUpdatePanel(paneSetting.UpdatePanelID);
pane.Controls.Add(updatePanel);
splitter.Controls.Add(pane);
RegeneratePaneChildren(pane);
InsertSplitBar(splitter);
}
}
The key line to look at in all of this is: SerializableDictionary<string, T> states = new SerializableDictionary<string,T>(GetStates<SerializableDictionary<string, T>>());
If this line of code is modified such that it does not create a new instance of states (instead using the object saved in Session), my code gets 'desynched' and I experience odd behavior in my Regeneration methods: an object that is supposed to have 'ObjectA' as a parent instead has 'ObjectB' as a parent.
There's a lot of collection modification going on... I'm removing a control from states and re-saving it... but I can't see where I do anything explicitly incorrect in this code. Yet I still feel that I should be able to express the above line of code without creating a new instance of the object.
If anyone sees an obvious blunder I'd love to hear it. Thanks.
How do I work with a queue in C#?
I want one thread to enqueue data to the queue while another thread dequeues data from the queue. Those threads should run simultaneously.
Is that possible?
If you need thread safety use ConcurrentQueue<T>.
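For example, a minimal sketch with one producer and one consumer running simultaneously:
using System;
using System.Collections.Concurrent;
using System.Threading;

class Example
{
    static void Main()
    {
        var queue = new ConcurrentQueue<int>();
        var producer = new Thread(() =>
        {
            for (int i = 0; i < 100; i++)
                queue.Enqueue(i);
        });
        var consumer = new Thread(() =>
        {
            int received = 0;
            while (received < 100)
            {
                int item;
                // TryDequeue returns false when the queue is momentarily empty
                if (queue.TryDequeue(out item))
                {
                    Console.WriteLine(item);
                    received++;
                }
            }
        });
        producer.Start();
        consumer.Start();
        producer.Join();
        consumer.Join();
    }
}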
If you use System.Collections.Queue, thread safety is provided through the synchronized wrapper returned by Queue.Synchronized (create the wrapper once and reuse it):
var queue = Queue.Synchronized(new Queue());
queue.Enqueue(new WorkItem());
queue.Enqueue(new WorkItem());
queue.Clear();
If you want to use System.Collections.Generic.Queue<T>, then create your own wrapper class. I already did this with System.Collections.Generic.Stack<T>:
using System;
using System.Collections.Generic;
[Serializable]
public class SomeStack
{
private readonly object stackLock = new object();
private readonly Stack<WorkItem> stack;
public SomeStack()
{
this.stack = new Stack<WorkItem>();
}
public WorkItem Push(WorkItem context)
{
lock (this.stackLock)
{
this.stack.Push(context);
}
return context;
}
public WorkItem Pop()
{
lock (this.stackLock)
{
return this.stack.Pop();
}
}
}
One possible implementation is to use a ring buffer with separate read and write pointers. On each read/write operation you copy the opposite pointer (this must be done thread-safely) into your local context and then perform batched reads or writes.
On each read or write you update your own pointer and pulse an event.
If the reading or writing thread finds it has no more work to do, it waits on the other thread's event before re-reading the appropriate pointer.
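A minimal single-producer/single-consumer sketch of that idea (without the batching and event signalling described above, which you would add for efficiency):
public class RingBuffer<T>
{
    private readonly T[] _buffer;
    private volatile int _readIndex;  // advanced only by the consumer thread
    private volatile int _writeIndex; // advanced only by the producer thread

    public RingBuffer(int capacity)
    {
        _buffer = new T[capacity];
    }

    public bool TryWrite(T item)
    {
        int next = (_writeIndex + 1) % _buffer.Length;
        if (next == _readIndex)
            return false; // buffer is full
        _buffer[_writeIndex] = item;
        _writeIndex = next; // publish only after the slot is filled
        return true;
    }

    public bool TryRead(out T item)
    {
        if (_readIndex == _writeIndex)
        {
            item = default(T);
            return false; // buffer is empty
        }
        item = _buffer[_readIndex];
        _readIndex = (_readIndex + 1) % _buffer.Length;
        return true;
    }
}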
You can implement a thread-safe queue using atomic operations. I once wrote the following class for a multi-player game. It allows multiple threads to safely write to the queue, and a single other thread to safely read from the queue:
/// <summary>
/// The WaitFreeQueue class implements the Queue abstract data type through a linked list. The WaitFreeQueue
/// allows thread-safe addition and removal of elements using atomic operations. Multiple threads can add
/// elements simultaneously, and another thread can remove elements from the queue at the same time. Only one
/// thread can remove elements from the queue at any given time.
/// </summary>
/// <typeparam name="T">The type parameter</typeparam>
public class WaitFreeQueue<T>
{
// Private fields
// ==============
#region Private fields
private Node<T> _tail; // The tail of the queue.
private Node<T> _head; // The head of the queue.
#endregion
// Public methods
// ==============
#region Public methods
/// <summary>
/// Removes the first item from the queue. This method returns a value to indicate if an item was
/// available, and passes the item back through an argument.
/// This method is not thread-safe in itself (only one thread can safely access this method at any
/// given time) but it is safe to call this method while other threads are enqueueing items.
///
/// If no item was available at the time of calling this method, the returned value is initialised
/// to the default value that matches this instance's type parameter. For reference types, this is
/// a Null reference.
/// </summary>
/// <param name="value">The value.</param>
/// <returns>A boolean value indicating if an element was available (true) or not.</returns>
public bool Dequeue(ref T value)
{
bool succeeded = false;
value = default(T);
// If there is an element on the queue then we get it.
if (null != _head)
{
// Set the head to the next element in the list, and retrieve the old head.
Node<T> head = System.Threading.Interlocked.Exchange<Node<T>>(ref _head, _head.Next);
// Sever the element we just pulled off the queue.
head.Next = null;
// We have succeeded.
value = head.Value;
succeeded = true;
}
return succeeded;
}
/// <summary>
/// Adds another item to the end of the queue. This operation is thread-safe, and multiple threads
/// can enqueue items while a single other thread dequeues items.
/// </summary>
/// <param name="value">The value to add.</param>
public void Enqueue(T value)
{
// We create a new node for the specified value, and point it to itself.
Node<T> newNode = new Node<T>(value);
// In one atomic operation, set the tail of the list to the new node, and remember the old tail.
Node<T> previousTail = System.Threading.Interlocked.Exchange<Node<T>>(ref _tail, newNode);
// Link the previous tail to the new tail.
if (null != previousTail)
previousTail.Next = newNode;
// If this is the first node in the list, we save it as the head of the queue.
System.Threading.Interlocked.CompareExchange<Node<T>>(ref _head, newNode, null);
} // Enqueue()
#endregion
// Public constructor
// ==================
#region Public constructor
/// <summary>
/// Constructs a new WaitFreeQueue instance.
/// </summary>
public WaitFreeQueue() { }
/// <summary>
/// Constructs a new WaitFreeQueue instance based on the specified list of items.
/// The items will be enqueued. The list can be a Null reference.
/// </summary>
/// <param name="items">The items</param>
public WaitFreeQueue(IEnumerable<T> items)
{
if(null!=items)
foreach(T item in items)
this.Enqueue(item);
}
#endregion
// Private types
// =============
#region Private types
/// <summary>
/// The Node class represents a single node in the linked list of a WaitFreeQueue.
/// It contains the queued-up value and a reference to the next node in the list.
/// </summary>
/// <typeparam name="U">The type parameter.</typeparam>
private class Node<U>
{
// Public fields
// =============
#region Public fields
public Node<U> Next;
public U Value;
#endregion
// Public constructors
// ===================
#region Public constructors
/// <summary>
/// Constructs a new node with the specified value.
/// </summary>
/// <param name="value">The value</param>
public Node(U value)
{
this.Value = value;
}
#endregion
} // Node generic class
#endregion
} // WaitFreeQueue class
If the restriction of having only a single thread de-queueing while multiple threads can en-queue is OK with you then you could use that. It was great for the game because it meant no thread synchronisation was required.
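For example (a sketch of the intended usage pattern):
var queue = new WaitFreeQueue<string>();
// Any number of threads may enqueue concurrently:
queue.Enqueue("message");
// ...but only ONE thread at a time may dequeue:
string message = null;
while (queue.Dequeue(ref message))
    Console.WriteLine(message);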
A simple example usage, with a plain Queue<T> protected by locking, would be:
using System;
using System.Collections.Generic;
using System.Threading;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
ExampleQueue eq = new ExampleQueue();
eq.Run();
// Wait...
System.Threading.Thread.Sleep(100000);
}
}
class ExampleQueue
{
private Queue<int> _myQueue = new Queue<int>();
public void Run()
{
ThreadPool.QueueUserWorkItem(new WaitCallback(PushToQueue), null);
ThreadPool.QueueUserWorkItem(new WaitCallback(PopFromQueue), null);
}
private void PushToQueue(object Dummy)
{
for (int i = 0; i <= 1000; i++)
{
lock (_myQueue)
{
_myQueue.Enqueue(i);
}
}
System.Console.WriteLine("END PushToQueue");
}
private void PopFromQueue(object Dummy)
{
int dataElementFromQueue = -1;
while (dataElementFromQueue < 1000)
{
lock (_myQueue)
{
if (_myQueue.Count > 0)
{
dataElementFromQueue = _myQueue.Dequeue();
// Do something with dataElementFromQueue...
System.Console.WriteLine("Dequeued " + dataElementFromQueue);
}
}
}
System.Console.WriteLine("END PopFromQueue");
}
}
}
You might want to use a blocking queue, in which the thread that is popping from the queue will wait until some data is available.
See: Creating a blocking Queue<T> in .NET?
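On .NET 4 and later, BlockingCollection<T> (which wraps a ConcurrentQueue<T> by default) gives you exactly that; a minimal sketch:
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class BlockingExample
{
    static void Main()
    {
        var queue = new BlockingCollection<int>();
        var producer = Task.Factory.StartNew(() =>
        {
            for (int i = 0; i < 100; i++)
                queue.Add(i);
            queue.CompleteAdding(); // signal that no more items will arrive
        });
        // GetConsumingEnumerable blocks until an item is available and
        // completes once CompleteAdding has been called and the queue is drained
        foreach (int item in queue.GetConsumingEnumerable())
            Console.WriteLine(item);
        producer.Wait();
    }
}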
Does anyone have any advice on which approach is better when caching data in a C# ASP.NET application?
I am currently using a combination of two approaches, with some data (lists, dictionaries, the usual domain-specific information) being put directly into the cache and cast back when needed, and some data being kept inside a GlobalData class and retrieved through that class (i.e. the GlobalData class is cached, and its properties are the actual data).
Is either approach preferable?
I get the feeling that caching each item separately would be more sensible from a concurrency point of view; however, it creates a lot more work in the long run, with more functions in a utility class that deal purely with getting data out of a cache location.
Suggestions would be appreciated.
Generally the cache's performance is so much better than that of the underlying source (e.g. a DB) that the performance of the cache itself is not a problem. The main goal is rather to get as high a cache-hit ratio as possible (unless you are developing at really large scale, because then it pays off to optimize the cache as well).
To achieve this I usually try to make it as straightforward as possible for the developer to use the cache (so that we don't miss any chances of cache hits just because the developer is too lazy to use the cache). In some projects we've used a modified version of the CacheHandler available in Microsoft's Enterprise Library.
With CacheHandler (which uses Policy Injection) you can easily make a method "cacheable" by just adding an attribute to it. For instance this:
[CacheHandler(0, 30, 0)]
public Object GetData(Object input)
{
}
would make all calls to that method cached for 30 minutes. All invocations get a unique cache key based on the input data and the method name, so if you call the method twice with different input it doesn't get cached, but if you call it more than once within the timeout interval with the same input then the method only gets executed once.
Our modified version looks like this:
using System;
using System.Diagnostics;
using System.IO;
using System.Reflection;
using System.Runtime.Remoting.Contexts;
using System.Text;
using System.Web;
using System.Web.Caching;
using System.Web.UI;
using Microsoft.Practices.EnterpriseLibrary.Common.Configuration;
using Microsoft.Practices.Unity.InterceptionExtension;
namespace Middleware.Cache
{
/// <summary>
/// An <see cref="ICallHandler"/> that implements caching of the return values of
/// methods. This handler stores the return value in the ASP.NET cache or the Items object of the current request.
/// </summary>
[ConfigurationElementType(typeof (CacheHandler)), Synchronization]
public class CacheHandler : ICallHandler
{
/// <summary>
/// The default expiration time for the cached entries: 5 minutes
/// </summary>
public static readonly TimeSpan DefaultExpirationTime = new TimeSpan(0, 5, 0);
private readonly object cachedData;
private readonly DefaultCacheKeyGenerator keyGenerator;
private readonly bool storeOnlyForThisRequest = true;
private TimeSpan expirationTime;
private GetNextHandlerDelegate getNext;
private IMethodInvocation input;
public CacheHandler(TimeSpan expirationTime, bool storeOnlyForThisRequest)
{
keyGenerator = new DefaultCacheKeyGenerator();
this.expirationTime = expirationTime;
this.storeOnlyForThisRequest = storeOnlyForThisRequest;
}
/// <summary>
/// This constructor is used when we wrap cached data in a CacheHandler so that
/// we can reload the object after it has been removed from the cache.
/// </summary>
/// <param name="expirationTime"></param>
/// <param name="storeOnlyForThisRequest"></param>
/// <param name="input"></param>
/// <param name="getNext"></param>
/// <param name="cachedData"></param>
public CacheHandler(TimeSpan expirationTime, bool storeOnlyForThisRequest,
IMethodInvocation input, GetNextHandlerDelegate getNext,
object cachedData)
: this(expirationTime, storeOnlyForThisRequest)
{
this.input = input;
this.getNext = getNext;
this.cachedData = cachedData;
}
/// <summary>
/// Gets or sets the expiration time for cache data.
/// </summary>
/// <value>The expiration time.</value>
public TimeSpan ExpirationTime
{
get { return expirationTime; }
set { expirationTime = value; }
}
#region ICallHandler Members
/// <summary>
/// Implements the caching behavior of this handler.
/// </summary>
/// <param name="input"><see cref="IMethodInvocation"/> object describing the current call.</param>
/// <param name="getNext">delegate used to get the next handler in the current pipeline.</param>
/// <returns>Return value from target method, or cached result if previous inputs have been seen.</returns>
public IMethodReturn Invoke(IMethodInvocation input, GetNextHandlerDelegate getNext)
{
lock (input.MethodBase)
{
this.input = input;
this.getNext = getNext;
return loadUsingCache();
}
}
public int Order
{
get { return 0; }
set { }
}
#endregion
private IMethodReturn loadUsingCache()
{
//We need to synchronize calls to the CacheHandler on method level
//to prevent duplicate calls to methods that could be cached.
lock (input.MethodBase)
{
if (TargetMethodReturnsVoid(input) || HttpContext.Current == null)
{
return getNext()(input, getNext);
}
var inputs = new object[input.Inputs.Count];
for (int i = 0; i < inputs.Length; ++i)
{
inputs[i] = input.Inputs[i];
}
string cacheKey = keyGenerator.CreateCacheKey(input.MethodBase, inputs);
object cachedResult = getCachedResult(cacheKey);
if (cachedResult == null)
{
var stopWatch = Stopwatch.StartNew();
var realReturn = getNext()(input, getNext);
stopWatch.Stop();
if (realReturn.Exception == null && realReturn.ReturnValue != null)
{
AddToCache(cacheKey, realReturn.ReturnValue);
}
return realReturn;
}
var cachedReturn = input.CreateMethodReturn(cachedResult, input.Arguments);
return cachedReturn;
}
}
private object getCachedResult(string cacheKey)
{
//When the method uses input that is not serializable
//we cannot create a cache key and can therefore not
//cache the data.
if (cacheKey == null)
{
return null;
}
object cachedValue = !storeOnlyForThisRequest ? HttpRuntime.Cache.Get(cacheKey) : HttpContext.Current.Items[cacheKey];
var cachedValueCast = cachedValue as CacheHandler;
if (cachedValueCast != null)
{
//This is an object that is reloaded when it is being removed.
//It is therefore wrapped in a CacheHandler-object and we must
//unwrap it before returning it.
return cachedValueCast.cachedData;
}
return cachedValue;
}
private static bool TargetMethodReturnsVoid(IMethodInvocation input)
{
var targetMethod = input.MethodBase as MethodInfo;
return targetMethod != null && targetMethod.ReturnType == typeof (void);
}
private void AddToCache(string key, object valueToCache)
{
if (key == null)
{
//When the method uses input that is not serializable
//we cannot create a cache key and can therefore not
//cache the data.
return;
}
if (!storeOnlyForThisRequest)
{
HttpRuntime.Cache.Insert(
key,
valueToCache,
null,
System.Web.Caching.Cache.NoAbsoluteExpiration,
expirationTime,
CacheItemPriority.Normal, null);
}
else
{
HttpContext.Current.Items[key] = valueToCache;
}
}
}
/// <summary>
/// This interface describes classes that can be used to generate cache key strings
/// for the <see cref="CacheHandler"/>.
/// </summary>
public interface ICacheKeyGenerator
{
/// <summary>
/// Creates a cache key for the given method and set of input arguments.
/// </summary>
/// <param name="method">Method being called.</param>
/// <param name="inputs">Input arguments.</param>
/// <returns>A (hopefully) unique string to be used as a cache key.</returns>
string CreateCacheKey(MethodBase method, object[] inputs);
}
/// <summary>
/// The default <see cref="ICacheKeyGenerator"/> used by the <see cref="CacheHandler"/>.
/// </summary>
public class DefaultCacheKeyGenerator : ICacheKeyGenerator
{
private readonly LosFormatter serializer = new LosFormatter(false, "");
#region ICacheKeyGenerator Members
/// <summary>
/// Create a cache key for the given method and set of input arguments.
/// </summary>
/// <param name="method">Method being called.</param>
/// <param name="inputs">Input arguments.</param>
/// <returns>A (hopefully) unique string to be used as a cache key.</returns>
public string CreateCacheKey(MethodBase method, params object[] inputs)
{
try
{
var sb = new StringBuilder();
if (method.DeclaringType != null)
{
sb.Append(method.DeclaringType.FullName);
}
sb.Append(':');
sb.Append(method.Name);
TextWriter writer = new StringWriter(sb);
if (inputs != null)
{
foreach (var input in inputs)
{
sb.Append(':');
if (input != null)
{
//Different instances of DateTime which represent the same value
//sometimes serialize differently due to some internal variables.
//We therefore serialize it using Ticks instead.
var inputDateTime = input as DateTime?;
if (inputDateTime.HasValue)
{
sb.Append(inputDateTime.Value.Ticks);
}
else
{
//Serialize the input and write it to the key StringBuilder.
serializer.Serialize(writer, input);
}
}
}
}
return sb.ToString();
}
catch
{
//Something went wrong when generating the key (probably an input value was not serializable).
//Return a null key.
return null;
}
}
#endregion
}
}
Microsoft deserves most of the credit for this code. We've only added things like caching at request level instead of across requests (more useful than you might think) and fixed some bugs (e.g. equal DateTime objects serializing to different values).
Under what conditions do you need to invalidate your cache? Objects should be stored so that, when they are invalidated, repopulating the cache only requires re-caching the items that were actually invalidated.
For example, say you have cached a Customer object that contains the delivery details for an order along with the shopping basket. Invalidating the shopping basket because an item was added or removed would then also require repopulating the delivery details unnecessarily.
(NOTE: This is an obtuse example and I'm not advocating this, just trying to demonstrate the principle; my imagination is a bit off today.)
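In code, that separation might look like this (keys and variables are purely illustrative):
// Cache the pieces separately so they can be invalidated independently
HttpRuntime.Cache.Insert("Customer:42:Delivery", deliveryDetails);
HttpRuntime.Cache.Insert("Customer:42:Basket", basket);
// The basket changed: only the basket entry needs to be evicted and rebuilt
HttpRuntime.Cache.Remove("Customer:42:Basket");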
Ed, I assume those lists and dictionaries contain almost-static data with little chance of expiring. Then there's data that gets frequent hits but also changes more frequently, so you're caching it using the HttpRuntime cache.
Now, you should think about all that data and all the dependencies between the different types. If you logically find that the HttpRuntime-cached data somehow depends on your GlobalData items, you should move those into the cache as well and set up the appropriate dependencies there, so you'll benefit from the "cascading expiration".
Even if you do use your custom caching mechanism, you'd still have to provide all the synchronization, so you won't save on that by avoiding the other.
If you need (pre-ordered) lists of items with a really low change frequency, you can still do that using the HttpRuntime cache. You could just cache a dictionary and use it either to list your items or to index and access them by your custom key.
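Key-based cache dependencies are what give you that cascading expiration; a sketch (key names and variables are made up):
// Parent item
HttpRuntime.Cache.Insert("GlobalData", globalData);
// Child item is evicted automatically whenever "GlobalData" is removed or replaced
HttpRuntime.Cache.Insert(
    "CustomerList",
    customerList,
    new System.Web.Caching.CacheDependency(null, new[] { "GlobalData" }));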
How about the best (worst?) of both worlds?
Have the globaldata class manage all the cache access internally. The rest of your code can then just use globaldata, meaning that it doesn't need to be cache-aware at all.
You could change the cache implementation as/when you like just by updating globaldata, and the rest of your code won't know or care what's going on inside.
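A sketch of that shape (the key name and the loader method are hypothetical):
using System;
using System.Collections.Generic;
using System.Web;

public static class GlobalData
{
    // Generic helper so callers never touch the cache directly
    private static T GetOrLoad<T>(string key, Func<T> loader) where T : class
    {
        var value = (T)HttpRuntime.Cache[key];
        if (value == null)
        {
            value = loader();
            HttpRuntime.Cache.Insert(key, value);
        }
        return value;
    }

    // Example property; LoadCustomers stands in for the real data access
    public static List<string> Customers
    {
        get { return GetOrLoad("GlobalData.Customers", LoadCustomers); }
    }

    private static List<string> LoadCustomers()
    {
        return new List<string>(); // placeholder for the real database query
    }
}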
There's much more than that to consider when architecting your caching strategy. Think of your cache store as if it were your in-memory DB: carefully handle the dependencies and the expiration policy for each and every type stored in there. It really doesn't matter what you use for caching (System.Web, another commercial solution, or rolling your own...).
I'd try to centralize it, though, and also use some sort of pluggable architecture. Make your data consumers access it through a common API (an abstract cache that exposes it) and plug in your caching layer at runtime (say, the ASP.NET cache).
You should really take a top-down approach when caching data to avoid any kind of data integrity problems (proper dependencies, like I said) and then take care of providing synchronization.
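A sketch of such a pluggable layer (the interface shape is illustrative):
using System;
using System.Web;
using System.Web.Caching;

public interface ICacheProvider
{
    void Set(string key, object value, TimeSpan slidingExpiration);
    object Get(string key);
}

// ASP.NET-backed implementation, chosen at runtime by your container or factory
public class AspNetCacheProvider : ICacheProvider
{
    public void Set(string key, object value, TimeSpan slidingExpiration)
    {
        HttpRuntime.Cache.Insert(key, value, null,
            Cache.NoAbsoluteExpiration, slidingExpiration);
    }

    public object Get(string key)
    {
        return HttpRuntime.Cache.Get(key);
    }
}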