I store all of my profiles in a profileCache, which eats up a ton of memory in the Large Object Heap. Therefore, I have implemented a method to help delete unused cache entries. The problem is that the method doesn't seem to be clearing the cache correctly and is throwing a stack overflow error. Here are the two methods I have implemented.
private static void OnScavengeProfileCache(object data)
{
// loop until the runtime is shutting down
while(HostingEnvironment.ShutdownReason == ApplicationShutdownReason.None)
{
// NOTE: Right now we only do the scavenge when traffic is temporarily low,
// to try to amortize the overhead of scavenging the cache into a low utilization period.
// We also only scavenge if the process memory usage is very high.
if (s_timerNoRequests.ElapsedMilliseconds >= 10000)
{
// We don't want to scavenge under lock, to avoid slowing down requests,
// so we get the list of keys under lock and then incrementally scan them
IEnumerable<string> profileKeys = null;
lock (s_profileCache)
{
profileKeys = s_profileCache.Keys.ToList();
}
ScavengeProfileCacheIncremental(profileKeys.GetEnumerator());
}
// wait for a bit
Thread.Sleep(60 * 1000);
}
}
My method constantly monitors traffic, and when traffic is low, it collects all of my profile cache keys and stores them in an IEnumerable called profileKeys. I then invoke this method to delete unused keys -
private static void ScavengeProfileCacheIncremental(IEnumerator<string> profileKeys)
{
if (s_thisProcess.PrivateMemorySize64 >= (200 * 1024 * 1024) ) // 3Gb at least
{
int numProcessed = 0;
while(profileKeys.MoveNext())
{
var key = profileKeys.Current;
Profile profile = null;
if (s_profileCache.TryGetValue(key, out profile))
{
// safely check/remove under lock; it's fast but makes sure we don't blow away an entry currently being added
lock (s_profileCache)
{
if (DateTime.UtcNow.Subtract(profile.CreateTime).TotalMinutes > 5)
{
// can clear it out
s_profileCache.Remove(key);
}
}
}
if (++numProcessed >= 5)
{
// stop this scan and check memory again
break;
}
}
// Check again to see if we freed up memory, if not continue scanning the profiles?
ScavengeProfileCacheIncremental(profileKeys);
}
}
The method is not clearing up memory and is throwing a stack overflow error with this trace:
192. ProfileHelper.ScavengeProfileCacheIncremental(
193. ProfileHelper.ScavengeProfileCacheIncremental(
194. ProfileHelper.ScavengeProfileCacheIncremental(
195. ProfileHelper.ScavengeProfileCacheIncremental(
196. ProfileHelper.OnScavengeProfileCache(...)
197. ExecutionContext.RunInternal(...)
198. ExecutionContext.Run(...)
199. IThreadPoolWorkItem.ExecuteWorkItem(...)
200. ThreadPoolWorkQueue.Dispatch(...)
EDIT:
So would this be a possible solution to remove unused profile keys and clear the LOH...
private static void ScavengeProfileCacheIncremental(IEnumerator<string> profileKeys)
{
if (s_thisProcess.PrivateMemorySize64 >= (200 * 1024 * 1024) ) // 3Gb at least
{
int numProcessed = 0;
while(profileKeys.MoveNext())
{
var key = profileKeys.Current;
Profile profile = null;
if (s_profileCache.TryGetValue(key, out profile))
{
// safely check/remove under lock; it's fast but makes sure we don't blow away an entry currently being added
lock (s_profileCache)
{
if (DateTime.UtcNow.Subtract(profile.CreateTime).TotalMinutes > 5)
{
// can clear it out
s_profileCache.Remove(key);
}
}
}
if (++numProcessed >= 5)
{
// stop this scan and check memory again
break;
}
}
}
GC.Collect();
}
I believe your code is suffering from a problem known as infinite recursion.
You are calling method ScavengeProfileCacheIncremental, which in turn calls itself internally. At some point, you call into it enough times that you run out of stack, causing an overflow.
Either your condition is not being met before you run out of stack, or your condition is never met at all. Debugging should show you why.
You can read more on the subject here.
There is no exit condition in ScavengeProfileCacheIncremental.
It does its stuff and then calls itself. It then does its stuff and calls itself. It then does its stuff and calls itself. It then does its stuff and calls itself. It then does its stuff and calls itself.
After a while it uses all the stack space and the process crashes.
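A minimal sketch of one way to avoid that (keeping the same s_profileCache, Profile, and s_thisProcess members as in the question) is to replace the tail call with a loop that also stops once the enumerator runs out of keys, which is the exit the recursive version is missing:

private static void ScavengeProfileCacheIncremental(IEnumerator<string> profileKeys)
{
    // Scan in small batches while memory is still high, but give up for good
    // once there are no keys left to examine.
    bool moreKeys = true;
    while (moreKeys && s_thisProcess.PrivateMemorySize64 >= (200 * 1024 * 1024))
    {
        int numProcessed = 0;
        while ((moreKeys = profileKeys.MoveNext()))
        {
            var key = profileKeys.Current;
            Profile profile = null;
            if (s_profileCache.TryGetValue(key, out profile))
            {
                // check/remove under lock so we don't remove an entry that is being added
                lock (s_profileCache)
                {
                    if (DateTime.UtcNow.Subtract(profile.CreateTime).TotalMinutes > 5)
                    {
                        s_profileCache.Remove(key);
                    }
                }
            }
            if (++numProcessed >= 5)
            {
                // stop this batch and re-check memory
                break;
            }
        }
        // Note: Process memory values are snapshots, so an up-to-date reading may
        // require something like s_thisProcess.Refresh() here.
    }
}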
I'm doing performance testing to compare various algorithms and want to eliminate scatter in the results caused by garbage collection possibly running during the critical phase of the test. I'm turning off garbage collection using GC.TryStartNoGCRegion(long) (see https://learn.microsoft.com/en-us/dotnet/api/system.gc.trystartnogcregion?view=net-6.0) before the critical test phase and reactivating it immediately afterwards.
My code looks like this:
long allocatedBefore;
int collectionsBefore;
long allocatedAfter;
int collectionsAfter;
bool noGCSucceeded;
try
{
// Just in case end "no GC region"
if (GCSettings.LatencyMode == GCLatencyMode.NoGCRegion)
{
GC.EndNoGCRegion();
}
// Exception is thrown sometimes in this line
noGCSucceeded = GC.TryStartNoGCRegion(solveAllocation);
allocatedBefore = GC.GetAllocatedBytesForCurrentThread();
collectionsBefore = getTotalGCCollectionCount();
stopwatch.Restart();
doMyTest();
stopwatch.Stop();
allocatedAfter = GC.GetAllocatedBytesForCurrentThread();
collectionsAfter = getTotalGCCollectionCount();
}
finally
{
// Reactivate garbage collection
if (GCSettings.LatencyMode == GCLatencyMode.NoGCRegion)
{
GC.EndNoGCRegion();
}
}
//...
private int getTotalGCCollectionCount()
{
int collections = 0;
for (int i = 0; i < GC.MaxGeneration; i++)
{
collections += GC.CollectionCount(i);
}
return collections;
}
The following exception is thrown from time to time (about once in 1500 tests):
System.InvalidOperationException
The NoGCRegion mode was already in progress
at System.GC.StartNoGCRegionWorker(Int64 totalSize, Boolean hasLohSize, Int64 lohSize, Boolean disallowFullBlockingGC)
at MyMethod.cs:line 409.
at MyCaller.cs:line 155.
The test might start a second thread that creates some objects it needs in a pool.
As far as I can see, the finally should always turn the GC back on, and the (theoretically unnecessary) check at the beginning should also do it in any case, but nevertheless, there is an error that NoGCRegion was already active.
The question C# TryStartNoGCRegion 'The NoGCRegion mode was already in progress' exception when the GC is in LowLatency mode got the same error message, but there was a clear code path there that could have activated NoGCRegion more than once. I can't see how that could happen here.
The test itself is not accessing GC operations except for GC.SuppressFinalize in some Dispose() methods.
The test itself does not run in parallel with any other test; my Main() method loops over a set of input files and calls the test method for each one.
The test method uses an external C++ library, whose allocations count as unmanaged memory in the .NET context.
What could be causing the exception, and why doesn't the call to GC.EndNoGCRegion(); prevent the problem?
I have the following code in C#:
(_StoreQueue is a ConcurrentQueue)
var S = _StoreQueue.FirstOrDefault(_ => _.TimeStamp == T);
if (S == null)
{
lock (_QueueLock)
{
// try again
S = _StoreQueue.FirstOrDefault(_ => _.TimeStamp == T);
if (S == null)
{
S = new Store(T);
_StoreQueue.Enqueue(S);
}
}
}
The system is collecting data in real time (fairly high frequency, around 300-400 calls / second) and puts it in bins (Store objects) that represent a 5 second interval. These bins are in a queue as they get written and the queue gets emptied as data is processed and written.
So, when data is arriving, a check is done to see if there is a bin for that timestamp (rounded by 5 seconds), if not, one is created.
Since this is quite heavily multi-threaded, the system goes with the following logic:
If there is a bin, it is used to put data.
If there is no bin, a lock gets taken, and within that lock the check is done again to make sure a bin wasn't created by another thread in the meantime; if there is still no bin, one gets created.
With this system, the lock is used roughly once every 2k calls.
I am trying to see if there is a way to remove the lock, mostly because I'm thinking there has to be a better solution than the double check.
An alternative I have been thinking about is to create empty bins ahead of time, which would entirely remove the need for any locks, but the search for the right bin would become slower as it would have to scan the list of pre-built bins to find the proper one.
Using a ConcurrentDictionary can fix the issue you are having. Here I assumed a type of double for your TimeStamp property, but it can be anything, as long as you make the ConcurrentDictionary key match the type.
class Program
{
    static ConcurrentDictionary<double, Store> _StoreQueue = new ConcurrentDictionary<double, Store>();

    static void Main(string[] args)
    {
        var T = 17d;
        // add the Store for timestamp 17 only if it does not already exist
        _StoreQueue.GetOrAdd(T, new Store(T));
    }

    public class Store
    {
        public double TimeStamp { get; set; }

        public Store(double timeStamp)
        {
            TimeStamp = timeStamp;
        }
    }
}
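If constructing a Store is not free, a small variation worth considering is the factory overload of GetOrAdd, which only builds a Store when the key is actually missing (under contention the factory can still run more than once, but only one resulting Store is kept and returned):

// Only allocates a new Store when no bin exists yet for this timestamp.
var store = _StoreQueue.GetOrAdd(T, t => new Store(t));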
Can you please explain to me what happens in memory while executing the following code:
Case 1:
public static void Execute()
{
foreach(var text in DownloadTexts())
{
Console.WriteLine(text);
}
}
public static IEnumerable<string> DownloadTexts()
{
foreach(var url in _urls)
{
using (var webClient = new WebClient())
{
yield return webClient.DownloadText(url);
}
}
}
Let's assume after the first iteration I get html1.
When will html1 be cleared from memory?
On the next iteration?
When the foreach ends?
When the function ends?
Thanks
** Edit **
Case 2:
public static void Execute()
{
var values = DownloadTexts();
foreach(var text in values)
{
Console.WriteLine(text);
}
}
public static IEnumerable<string> DownloadTexts()
{
foreach(var url in _urls)
{
using (var webClient = new WebClient())
{
yield return webClient.DownloadText(url);
}
}
}
To my understanding, Case 1 is better for memory than Case 2, right?
In Case 2 we will still keep a reference to the texts we already downloaded, while in Case 1 every text becomes eligible for garbage collection once it is no longer used. Am I correct?
_urls will stay alive indefinitely, because it appears to be stored in a field.
DownloadTexts() (the iterator returned by it) is kept alive until the end of the loop.
the WebClient and the html it produces stay alive for one iteration. If you want to know the absolute precise lifetime of it, you need to use Reflector and mentally simulate where the reference travels around. You'll find that the IEnumerator used in the loop references it until the next iteration has begun.
All objects that are not alive can be GC'ed. This happens whenever the GC thinks that is a good idea.
Regarding your Edit: The cases are equivalent. If you don't put the enumerator into a variable, the compiler will do that for you. It has to keep a reference till the end of the loop. It does not matter how many references there are. There is at least one.
Actually, the loop only requires the enumerator to be kept alive. The additional variable you added will also keep the enumerable alive. On the other hand you are not using the variable so the GC does not keep it alive.
You can test this easily:
//allocate 1TB of memory:
var items =
Enumerable.Range(0, 1024 * 1024 * 1024)
.Select(x => new string('x', 1024));
foreach (var _ in items) { } //constant memory usage
It will be cleared from memory when the garbage collector runs and determines that it's no longer in use.
The value will no longer be in use at the moment when the foreach causes the IEnumerator.MoveNext() method to be invoked. So, effectively, #1.
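For reference, a rough sketch of what the compiler turns the foreach in Case 1 into (simplified; the real generated code wraps the enumerator in a try/finally) makes that timing visible:

// Roughly equivalent to the foreach in Execute():
using (IEnumerator<string> e = DownloadTexts().GetEnumerator())
{
    while (e.MoveNext())         // advancing the iterator replaces the previously yielded html
    {
        string text = e.Current; // the current html is referenced here...
        Console.WriteLine(text);
    }                            // ...until the next MoveNext()/Current overwrites it
}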
It will be cleared from memory whenever the garbage collector feels like doing so.
But the starting point is when the code holds no more references to the object instance.
So the answer in this case is: sometime after the block in which you created the object ends.
Trust the GC; it is good at its job.
I tried searching for this but did not find a suggestion well suited to the issue I am facing.
My issue is that we have a list/stack of available resources (calculation engines). These resources are used to perform certain calculations.
The request to perform the calculation is triggered by an external process. So when a calculation request is made, I need to check whether any of the available resources is currently free; if none is, wait for some time and check again.
I was wondering what the best way to implement this is. I have the following code in place, but I am not sure it is very safe.
If you have any further suggestions, that will be great:
void Process(int retries = 0) {
CalcEngineConnection connection = null;
bool securedConnection = false;
foreach (var calcEngineConnection in _connections) {
securedConnection = Monitor.TryEnter(calcEngineConnection);
if (securedConnection) {
connection = calcEngineConnection;
break;
}
}
if (securedConnection) {
//Dequeue the next request
var calcEnginePool = _pendingPool.Dequeue();
//Perform the operation and exit.
connection.RunCalc(calcEnginePool);
Monitor.Exit(connection);
}
else {
if (retries < 10)
retries += 1;
Thread.Sleep(200);
Process(retries);
}
}
I'm not sure that using Monitor is the best approach here anyway, but if you do decide to go that route, I'd refactor the above code to:
bool TryProcessWithRetries(int retries) {
for (int attempt = 0; attempt < retries; attempt++) {
if (TryProcess()) {
return true;
}
Thread.Sleep(200);
}
// Throw an exception here instead?
return false;
}
bool TryProcess() {
foreach (var connection in _connections) {
if (TryProcess(connection)) {
return true;
}
}
return false;
}
bool TryProcess(CalcEngineConnection connection) {
if (!Monitor.TryEnter(connection)) {
return false;
}
try {
var calcEnginePool = _pendingPool.Dequeue();
connection.RunCalc(calcEnginePool);
} finally {
Monitor.Exit(connection);
}
return true;
}
This decomposes the three pieces of logic:
Retrying several times
Trying each connection in a collection
Trying a single connection
It also avoids using recursion for the sake of it, and puts the Monitor.Exit call into a finally block, which it absolutely should be in.
You could replace the middle method implementation with:
return _connections.Any(TryProcess);
... but that may be a little too "clever" for its own good.
Personally I'd be tempted to move TryProcess into CalcEngineConnection itself - that way this code doesn't need to know about whether or not the connection is able to process something - it's up to the object itself. It means you can avoid having publicly visible locks, and also it would be flexible if some resources could (say) process two requests at a time in the future.
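A minimal sketch of that last idea (the _gate field, the CalcRequest type, and the Queue<CalcRequest> parameter are illustrative assumptions about what _pendingPool holds, not the poster's actual types):

public class CalcEngineConnection
{
    // private lock object, so callers can no longer lock on the connection itself
    private readonly object _gate = new object();

    public bool TryProcess(Queue<CalcRequest> pendingPool)
    {
        if (!Monitor.TryEnter(_gate))
        {
            return false;   // busy with another calculation
        }
        try
        {
            var request = pendingPool.Dequeue();
            RunCalc(request);
            return true;
        }
        finally
        {
            Monitor.Exit(_gate);
        }
    }

    public void RunCalc(CalcRequest request)
    {
        // actual calculation goes here
    }
}

The outer loop in the calling code then shrinks to something like return _connections.Any(c => c.TryProcess(_pendingPool));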
There are multiple issues that could potentially occur, but let's simplify your code first:
void Process(int retries = 0)
{
foreach (var connection in _connections)
{
if(Monitor.TryEnter(connection))
{
try
{
//Dequeue the next request
var calcEnginePool = _pendingPool.Dequeue();
//Perform the operation and exit.
connection.RunCalc(calcEnginePool);
}
finally
{
// Release the lock
Monitor.Exit(connection);
}
return;
}
}
if (retries < 10)
{
Thread.Sleep(200);
Process(retries+1);
}
}
This will correctly protect your connection, but note that one of the assumptions here is that your _connections list is safe and it will not be modified by another thread.
Furthermore, you might want to use a thread safe queue for the _connections because at certain load levels you might end up using only the first few connections (not sure if that will make a difference). In order to use all of your connections relatively evenly, I would place them in a queue and dequeue them. This will also guarantee that no two threads are using the same connection and you don't have to use the Monitor.TryEnter().
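A rough sketch of that queue-based idea (ConcurrentQueue from System.Collections.Concurrent is assumed here, initially filled with the same connections that _connections holds; as in the original code, _pendingPool itself would still need to be protected or replaced with a thread-safe queue):

private readonly ConcurrentQueue<CalcEngineConnection> _freeConnections =
    new ConcurrentQueue<CalcEngineConnection>();

bool TryProcess()
{
    CalcEngineConnection connection;
    if (!_freeConnections.TryDequeue(out connection))
    {
        return false;   // every connection is currently busy
    }
    try
    {
        var request = _pendingPool.Dequeue();
        connection.RunCalc(request);
        return true;
    }
    finally
    {
        // hand the connection back so other threads can reuse it
        _freeConnections.Enqueue(connection);
    }
}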
I have a thread which produces data in the form of simple object (record). The thread may produce a thousand records for each one that successfully passes a filter and is actually enqueued. Once the object is enqueued it is read-only.
I have one lock, which I acquire once the record has passed the filter, and I add the item to the back of the producer_queue.
On the consumer thread, I acquire the lock, confirm that the producer_queue is not empty,
set consumer_queue to equal producer_queue, create a new (empty) queue, and set it on producer_queue. Without any further locking I process consumer_queue until it's empty and repeat.
Everything works beautifully on most machines, but on one particular dual-quad server I see in ~1/500k iterations an object that is not fully initialized when I read it out of consumer_queue. The condition is so fleeting that when I dump the object after detecting the condition the fields are correct 90% of the time.
So my question is this: how can I assure that the writes to the object are flushed to main memory when the queue is swapped?
Edit:
On the producer thread:
(producer_queue above is m_fillingQueue; consumer_queue above is m_drainingQueue)
private void FillRecordQueue() {
while (!m_done) {
int count;
lock (m_swapLock) {
count = m_fillingQueue.Count;
}
if (count > 5000) {
Thread.Sleep(60);
} else {
DataRecord rec = GetNextRecord();
if (rec == null) break;
lock (m_swapLock) {
m_fillingQueue.AddLast(rec);
}
}
}
}
In the consumer thread:
private DataRecord Next(bool remove) {
bool drained = false;
while (!drained) {
if (m_drainingQueue.Count > 0) {
DataRecord rec = m_drainingQueue.First.Value;
if (remove) m_drainingQueue.RemoveFirst();
if (rec.Time < FIRST_VALID_TIME) {
throw new InvalidOperationException("Detected invalid timestamp in Next(): " + rec.Time + " from record " + rec);
}
return rec;
} else {
lock (m_swapLock) {
m_drainingQueue = m_fillingQueue;
m_fillingQueue = new LinkedList<DataRecord>();
if (m_drainingQueue.Count == 0) drained = true;
}
}
}
return null;
}
The consumer is rate-limited, so it can't get ahead of the producer.
The behavior I see is that sometimes the Time field is reading as DateTime.MinValue; by the time I construct the string to throw the exception, however, it's perfectly fine.
Have you tried the obvious: is the microcode update applied on the fancy 8-core box (via BIOS update)? Did you run Windows Update to get the latest processor driver?
At first glance, it looks like you're locking your containers. So I am recommending the systems approach, as it sounds like you're not seeing this issue on a good ol' dual-core box.
Assuming these are in fact the only methods that interact with the m_fillingQueue variable, and that DataRecord cannot be changed after GetNextRecord() creates it (read-only properties hopefully?), then the code at least on the face of it appears to be correct.
In which case I suggest that GregC's answer would be the first thing to check; make sure the failing machine is fully updated (OS / drivers / .NET Framework), because the lock statement should involve all the required memory barriers to ensure that the rec variable is fully flushed out of any caches before the object is added to the list.
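To illustrate the "read-only properties hopefully?" remark, here is a minimal sketch of an immutable DataRecord (the Value member is a hypothetical payload, not the poster's actual field); every field is assigned in the constructor, so once a reference has been published under the lock the object can never be observed half-written because of later mutation:

public sealed class DataRecord
{
    private readonly DateTime m_time;
    private readonly double m_value;    // hypothetical payload

    public DataRecord(DateTime time, double value)
    {
        m_time = time;
        m_value = value;
    }

    public DateTime Time { get { return m_time; } }
    public double Value { get { return m_value; } }
}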