I have an ArrayList which is continuously updated every second. I have to use the same ArrayList in two other threads and make local copies of it. I have done all of this, but I get weird index-out-of-range exceptions. What I have found out so far is that I have to use some synchronization mechanism for the ArrayList to be used across multiple threads.
This is how I am making it synchronized:
for (int i = 0; i < Globls.iterationCount; i++)
{
if (bw_Obj.CancellationPending)
{
eve.Cancel = true;
break;
}
byte[] rawData4 = DMM4.IO.Read(4 * numReadings);
TempDisplayData_DMM4.Add(rawData4);
Globls.Display_DataDMM4 = ArrayList.Synchronized(TempDisplayData_DMM4);
Globls.Write_DataDMM4 = ArrayList.Synchronized(TempDisplayData_DMM4);
}
In the other thread I do the following to make local copies:
ArrayList Local_Write_DMM4 = new ArrayList();
Local_Write_DMM4 = new ArrayList(Globls.Write_DataDMM4);
Am I synchronizing the ArrayList in the right way? Also, do I need to lock while copying the ArrayList as well:
lock (Globls.Display_DataDMM4.SyncRoot){Local_Temp_Display1 = new ArrayList(Globls.Display_DataDMM4);}
or is it safe for single operations? I haven't actually run this code; I need to run it over the weekend and I don't want to see another exception :(. Please help me with this!
As @Trickery stated, the assignment needs to be locked, since the source ArrayList Globls.Write_DataDMM4 can be modified by another thread during enumeration.
It is therefore essential to lock both when populating the original ArrayList and when making your copy:
for (int i = 0; i < Globls.iterationCount; i++)
{
if (bw_Obj.CancellationPending)
{
eve.Cancel = true;
break;
}
byte[] rawData4 = DMM4.IO.Read(4 * numReadings);
TempDisplayData_DMM4.Add(rawData4);
lock (Globls.Display_DataDMM4.SyncRoot)
{
Globls.Write_DataDMM4 = ArrayList.Synchronized(TempDisplayData_DMM4);
}
}
and
lock (Globls.Display_DataDMM4.SyncRoot)
{
Local_Temp_Display1 = new ArrayList(Globls.Display_DataDMM4);
}
Yes, all operations on your ArrayList need to use lock.
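For illustration, here is a minimal sketch of what that looks like, reusing the names from the question and assuming the ArrayList reference itself is never replaced (otherwise its SyncRoot changes with it):
// Writer thread: every mutation goes through the same SyncRoot.
lock (Globls.Write_DataDMM4.SyncRoot)
{
    Globls.Write_DataDMM4.Add(rawData4);
}

// Reader thread: the snapshot copy is taken under the same lock.
ArrayList localCopy;
lock (Globls.Write_DataDMM4.SyncRoot)
{
    localCopy = new ArrayList(Globls.Write_DataDMM4);
}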
EDIT: Sorry, the site won't let me add a comment to your question for some reason.
Related
I'm still somewhat new to threads and was wondering: if I have an array of threads that get passed data from a loop, is there any chance that the initial data passed to a thread could change before the thread actually starts processing it?
const int TOTAL_THREADS = 32;
Thread[] _threadList = new Thread[TOTAL_THREADS];
List<class> OIDS = new List<class>();
OIDS.Add(new class());
...
...
for (int i = 0; i < OIDS.Count; i++)
{
threadWait = true;
while (threadWait == true)
{
for (int t = 0; t < TOTAL_THREADS; t++)
{
if (_threadList[t] == null || _threadList[t].IsAlive == false)
{
class oid = OIDS[i];
_threadList[t] = new Thread(() => Worder.ProcessData(oid));
_threadList[t].Start();
threadWait = false;
break;
}
}
if (threadWait == true)
{
Thread.Sleep(1000);
}
}
}
Your code is trying to replicate thread-coordination functionality that already exists in the platform, in various ways and forms. For example you could use the Parallel class:
Parallel.ForEach(OIDS,
new ParallelOptions() { MaxDegreeOfParallelism = 32 },
Worder.ProcessData);
This could be a valid approach if your workload is CPU-bound (you are doing calculations, not requests to a database or to the web) and your machine has at least 32 available cores¹. It is also required that the processing of each element of the OIDS list is independent, or, if it's not, that you synchronize the dependencies inside Worder.ProcessData by using locks or other means to prevent concurrent access to shared state.
(¹ This is advice for the general case, not a hard assertion.)
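If Worder.ProcessData does touch shared state, a minimal sketch of serializing just that part (the counter and lock object below are illustrative, not from the original code):
int processed = 0;                  // hypothetical shared state
object sync = new object();

Parallel.ForEach(OIDS,
    new ParallelOptions() { MaxDegreeOfParallelism = 32 },
    oid =>
    {
        Worder.ProcessData(oid);    // independent, CPU-bound work runs in parallel
        lock (sync)                 // only the access to shared state is serialized
        {
            processed++;
        }
    });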
I have to query my company's CRM solution (Oracle's Right Now) for our 600k users, and update them there if they exist or create them if they don't. To know whether a user already exists in Right Now, I consume a third-party WS, and with 600k users this can be a real pain due to the time it takes to get each response (around 1 second). So I changed my code to use Parallel.ForEach, querying each record in just 0.35 seconds and adding it to a List<User> of records to be created or updated (Right Now is kinda dumb, so I need to separate them into 2 lists and call 2 distinct WS methods).
My code used to run perfectly before multithreading, but it took too long. The problem is that I can't make the batch too large or I get a timeout when I try to update or create via the Web Service. So I'm sending around 500 records at once, and when it reaches the critical code part, it executes many times.
Parallel.ForEach(boDS.USERS.AsEnumerable(), new ParallelOptions { MaxDegreeOfParallelism = -1 }, row =>
{
...
user = null;
user = QueryUserById(row["USER_ID"].Trim());
if (user == null)
{
isUpdate = false;
gObject.ID = new ID();
}
else
{
isUpdate = true;
gObject.ID = user.ID;
}
... fill user attributes as generic fields ...
gObject.GenericFields = listGenericFields.ToArray();
if (isUpdate)
listUserUpdate.Add(gObject);
else
listUserCreate.Add(gObject);
if (i == batchSize - 1 || i == (boDS.USERS.Rows.Count - 1))
{
UpdateProcessingOptions upo = new UpdateProcessingOptions();
CreateProcessingOptions cpo = new CreateProcessingOptions();
upo.SuppressExternalEvents = false;
upo.SuppressRules = false;
cpo.SuppressExternalEvents = false;
cpo.SuppressRules = false;
RNObject[] results = null;
// <Critical_code>
if (listUserCreate.Count > 0)
{
results = _service.Create(_clientInfoHeader, listUserCreate.ToArray(), cpo);
}
if (listUserUpdate.Count > 0)
{
_service.Update(_clientInfoHeader, listUserUpdate.ToArray(), upo);
}
// </Critical_code>
listUserUpdate = new List<RNObject>();
listUserCreate = new List<RNObject>();
}
i++;
});
I thought about using a lock or mutex, but that isn't going to help me, since the threads will just wait and then execute afterwards. I need some way to execute that part of the code only ONCE, in only ONE thread. Is it possible? Can anyone shed some light?
Thanks and kind regards,
Leandro
As you stated in the comments, you're declaring the variables outside of the loop body. That's where your race conditions originate.
Let's take the variable listUserUpdate as an example. It is accessed randomly by the parallel threads: while one thread is still adding to it in listUserUpdate.Add(gObject);, another thread could already be resetting the list in listUserUpdate = new List<RNObject>(); or enumerating it in listUserUpdate.ToArray().
You really need to refactor that code to:
- make each loop iteration as independent from the others as you can, by moving the variables inside the loop body, and
- access shared data in a synchronized way, using locks and/or concurrent collections (see the sketch below).
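A rough sketch of that direction is below. Type names such as GenericObject and the exact fill-in steps are assumptions based on the snippet in the question, not the original code:
// Requires System.Collections.Concurrent and System.Threading.Tasks.
var listUserCreate = new ConcurrentBag<RNObject>();
var listUserUpdate = new ConcurrentBag<RNObject>();

Parallel.ForEach(boDS.USERS.AsEnumerable(), row =>
{
    // Everything this iteration needs lives inside the lambda,
    // so no other thread can overwrite it.
    var gObject = new GenericObject();   // placeholder type name
    var user = QueryUserById(row["USER_ID"].ToString().Trim());

    bool isUpdate = user != null;
    gObject.ID = isUpdate ? user.ID : new ID();
    // ... fill gObject.GenericFields locally ...

    if (isUpdate)
        listUserUpdate.Add(gObject);
    else
        listUserCreate.Add(gObject);
});

// Call _service.Create / _service.Update after the loop (or in chunks of ~500),
// on a single thread, so the critical section is no longer shared between threads.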
You can use the Double-checked locking pattern. This is usually used for singletons, but you're not making a singleton here so generic singletons like Lazy<T> do not apply.
It works like this:
Separate out your shared data into some sort of class:
class QuerySharedData
{
    // All the write-once-read-many fields that need to be shared between threads
    public QuerySharedData()
    {
        // Compute all the write-once-read-many fields. Or use a static Create method if that's handy.
    }
}
In your outer class add the following:
object padlock = new object();
volatile QuerySharedData data;
In your thread's callback delegate, do this:
if (data == null)
{
    lock (padlock)
    {
        if (data == null)
        {
            data = new QuerySharedData(); // this does all the work to initialize the shared fields
        }
    }
}
var localData = data;
Then use the shared query data from localData. By grouping the shared query data into a subordinate class, you avoid the need to make its individual fields volatile.
More about volatile here: Part 4: Advanced Threading.
Update: my assumption here is that all the classes and fields held by QuerySharedData are read-only once initialized. If this is not true, for instance if you initialize a list once but add to it from many threads, this pattern will not work for you. You will have to consider using things like thread-safe collections.
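For completeness, a tiny hedged sketch of that route, using ConcurrentBag as one example of a thread-safe collection:
// Requires System.Collections.Concurrent.
var sharedResults = new ConcurrentBag<string>();

Parallel.For(0, 100, i =>
{
    sharedResults.Add("result " + i);   // safe to call from many threads without a lock
});

Console.WriteLine(sharedResults.Count); // 100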
This is a small program that only I am writing and using.
Below is the code for all the places where I use the HashSet that caused this problem.
I don't understand how this is possible. This item is used only in MainWindow.
hsProxyList is a HashSet:
HashSet<string> hsProxyList = new HashSet<string>();
The error happened in the iteration below:
lock (hsProxyList)
{
int irRandomProxyNumber = GenerateRandomValue.GenerateRandomValueMin(hsProxyList.Count, 0);
int irLocalCounter = 0;
foreach (var vrProxy in hsProxyList)
{
if (irLocalCounter == irRandomProxyNumber)
{
srSelectedProxy = vrProxy;
break;
}
irLocalCounter++;
}
}
}
The other places where I use hsProxyList:
I don't lock the object when I am getting its count. I suppose this would not cause any error, but it may not be correct; it's not fatally important.
lblProxyCount.Content = "remaining proxy count: " + hsProxyList.Count;
Another place:
lock (hsProxyList)
{
hsProxyList.Remove(srSelectedProxy);
}
Another place:
lock (hsProxyList)
{
hsProxyList = new HashSet<string>();
foreach (var vrLine in File.ReadLines(cmbBoxSelectProxy.SelectedItem.ToString()))
{
hsProxyList.Add(vrLine);
}
}
As you can see, I am using lock everywhere. This is multi-threaded software; all uses of hsProxyList are in MainWindow.xaml.cs. It is a C# WPF application.
The problem is where you have
lock (hsProxyList)
{
hsProxyList = new HashSet<string>();
// etc
}
All locks are taken on a particular object; however, you're changing the object when you do hsProxyList = new HashSet<string>();, so the object that the variable hsProxyList now refers to is no longer locked.
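One straightforward fix is to lock on a dedicated object that is never reassigned, and to Clear() and refill the existing set instead of replacing it. A sketch (the field name _proxyLock is an assumption):
// Declared once at class level and never reassigned.
private readonly object _proxyLock = new object();

lock (_proxyLock)
{
    hsProxyList.Clear();   // keep the same HashSet instance
    foreach (var vrLine in File.ReadLines(cmbBoxSelectProxy.SelectedItem.ToString()))
    {
        hsProxyList.Add(vrLine);
    }
}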
There are two issues here. The first, which has already been pointed out, is that you're locking on the hash set whilst also changing the object hsProxyList points to:
lock (hsProxyList)
{
hsProxyList = new HashSet<string>();
// hsProxyList is no longer locked.
}
The second (and more subtle) problem, is that you're assuming that Count does not require a lock. This is not a safe assumption. Firstly, you don't know how HashSet has implemented it. The fact that Count is an O(1) operation indicates there is a member variable that keeps track of the count. This means that on Add or Remove this variable must be updated. An implementation of Add might look something like:
bool Add( T item ) {
this.count++;
// Point A.
addItemToHashSet(item);
}
Note that the count variable is incremented and then the item is added. If the thread calling Add is interrupted at point A and your other thread calls Count, you will receive a count that is higher than the actual number of elements (count has been incremented, but addItemToHashSet has not run).
This may not have any serious consequences, but if you're iterating over Count elements it could cause a crash. Similar behaviour is also likely when calling Remove.
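So the Count read from earlier should also go under whatever single lock guards the set, for example (a sketch reusing the question's names, with _proxyLock as a hypothetical dedicated lock object):
int remaining;
lock (_proxyLock)                   // the same lock used around Add and Remove
{
    remaining = hsProxyList.Count;  // read the count while no other thread can mutate the set
}
lblProxyCount.Content = "remaining proxy count: " + remaining;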
I am trying to deep clone a list of 100 multi-property objects, using the code below to perform the deep clone. The list creation and the list cloning happen in a loop, so each time around the loop the list changes its contents but stays fixed at 100 objects.
The problem is that each time around the loop, cloning the list takes what seems to be exponentially longer than the previous iteration.
public static void Main()
{
List<Chromosome<Gene>> population = Population.randomHalfAndHalf(rand, data, 100, opset);
for (int i = 0; i < numOfGenerations; i++)
{
offSpring = epoch.runEpoch(rand, population, SelectionMethod.Tournament, opset);
clone = DeepClone<List<Chromosome<Gene>>>(population);
clone.AddRange(offSpring);
population = fixPopulation(rand, clone, SelectionMethod.Tournament, 100);
}
//....REST OF CODE....
}
public static T DeepClone<T>(T obj)
{
object result = null;
using (var ms = new MemoryStream())
{
var formatter = new BinaryFormatter();
formatter.Serialize(ms, obj);
ms.Position = 0;
result = (T)formatter.Deserialize(ms);
ms.Close();
}
return (T)result;
}
Some of you may be wondering why I am even cloning the object if I could just write over the original population. That is a valid point and one I have explored, but when I do that, the loop executes perfectly for about 8 iterations the first time I run it, then it idles and does nothing, so I stop it. The next time I execute it, it goes to 9 iterations and then stops, idles, does nothing, and so on each time around the loop. If anyone has any ideas as to why this is happening, please share, as I really don't get it.
But my main problem is that the time to clone the object grows noticeably each time around the above loop, first by a few seconds and eventually up to 5 minutes or so.
Does anybody have any ideas as to why this is happening?
EDIT: I have profiled the application while it was running; the majority of the work, over 90%, is being done by BinaryFormatter.Deserialize(memoryStream). Here is fixPopulation; it's doing nothing overly complex that would contribute to this problem.
private static List<Chromosome<Gene>> fixPopulation(Random rand, List<Chromosome<Gene>> oldPopulation, SelectionMethod selectionMethod, int populationSize)
{
if (selectionMethod == SelectionMethod.Tournament)
{
oldPopulation.Sort();
}
else
{
//NSGAII sorting method
}
oldPopulation.RemoveRange(populationSize, (oldPopulation.Count - populationSize));
for (int i = 0, n = populationSize / 2; i < n; i++)
{
int c1 = rand.Next(populationSize);
int c2 = rand.Next(populationSize);
// swap two chromosomes
Chromosome<Gene> temp = oldPopulation[c1];
oldPopulation[c1] = oldPopulation[c2];
oldPopulation[c2] = temp;
}
return oldPopulation;
}
You can use binary serialization to create a fast clone of your objects. Look at this:
public Entity Copy()
{
    System.IO.MemoryStream memoryStream = new System.IO.MemoryStream();
    System.Runtime.Serialization.Formatters.Binary.BinaryFormatter bFormatter = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
    bFormatter.Serialize(memoryStream, this);
    memoryStream.Seek(0, System.IO.SeekOrigin.Begin);
    Entity entity = (Entity)bFormatter.Deserialize(memoryStream);
    return entity;
}
Really easy, and it works!
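One caveat worth adding: BinaryFormatter can only round-trip types that are marked serializable, so the whole object graph needs the attribute. A minimal sketch, with Entity as a stand-in type:
[Serializable]                       // required on Entity and on every type it references
public class Entity
{
    public string Name { get; set; }
}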
I have an application that, before it creates a thread, calls the database to pull X amount of records. When the records are retrieved from the database, a locked flag is set so those records are not pulled again.
Once a thread has completed, it will pull some more records from the database. When I call the database from a thread, should I set a lock on that section of code so it is called by only one thread at a time? Here is an example of my code (I commented the area where I have the lock):
private void CreateThreads()
{
for(var i = 1; i <= _threadCount; i++)
{
var adapter = new Dystopia.DataAdapter();
var records = adapter.FindAllWithLocking(_recordsPerThread,_validationId,_validationDateTime);
if(records != null && records.Count > 0)
{
var paramss = new ArrayList { i, records };
ThreadPool.QueueUserWorkItem(ThreadWorker, paramss);
}
this.Update();
}
}
private void ThreadWorker(object paramList)
{
try
{
var parms = (ArrayList) paramList;
var stopThread = false;
var threadCount = (int) parms[0];
var records = (List<Candidates>) parms[1];
var runOnce = false;
var adapter = new Dystopia.DataAdapter();
var lastCount = records.Count;
var runningCount = 0;
while (_stopThreads == false)
{
if (records.Count > 0)
{
foreach (var record in records)
{
var proc = new ProcRecords();
proc.Validate(ref rec);
adapter.Update(rec);
if (_stopThreads)
{
break;
}
}
//This is where I think I may need to sync the threads.
//Is this correct?
lock(this){
records = adapter.FindAllWithLocking(_recordsPerThread, _validationId, _validationDateTime);
}
}
}
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
SQL to Pull records:
WITH cte AS (
    SELECT TOP (@topCount) *
    FROM Candidates WITH (READPAST)
    WHERE
        isLocked = 0 and
        isTested = 0 and
        validated = 0
)
UPDATE cte
SET
    isLocked = 1,
    validationID = @validationId,
    validationDateTime = @validationDateTime
OUTPUT INSERTED.*;
You shouldn't need to lock your threads as the database should be doing this on the request for you.
I see a few issues.
First, you are testing _stopThreads == false, but you have not revealed whether this is a volatile read. Read the second half of this answer for a good description of what I am talking about.
Second, the lock is pointless because adapter is a local reference to a non-shared object and records is a local reference that is just being replaced. I am assuming that the adapter makes a separate connection to the database, but if it shares an existing connection then some type of synchronization may need to take place, since ADO.NET connection objects are not typically thread-safe.
Now, you will probably need locking somewhere to publish the results from the work item. I do not see where the results are being published to the main thread, so I cannot offer any guidance here.
By the way, I would avoid showing a message box from a ThreadPool thread. The reason being that this will hang that thread until the message box closes.
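To make the volatile-read and result-publishing points concrete, here is a rough sketch; the field names below are assumptions, not the original code:
// Fields on the owning class.
private volatile bool _stopThreads;                       // volatile so worker threads see updates promptly
private readonly object _publishLock = new object();      // dedicated lock object, not lock(this)
private readonly List<Candidates> _completed = new List<Candidates>();

// Inside ThreadWorker, after a batch has been processed:
while (!_stopThreads)
{
    // ... validate and update the current batch of records ...

    lock (_publishLock)
    {
        _completed.AddRange(records);   // publish results to shared state under the lock
    }

    records = adapter.FindAllWithLocking(_recordsPerThread, _validationId, _validationDateTime);
}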
You shouldn't lock(this), since it makes it really easy to create deadlocks; you should create a separate lock object instead. If you search for "lock(this)" you can find numerous articles on why.
Here's an SO question on lock(this)
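A minimal sketch of that pattern, applied to the snippet above:
// A private lock object that no code outside this class can lock on.
private readonly object _syncRoot = new object();

lock (_syncRoot)
{
    records = adapter.FindAllWithLocking(_recordsPerThread, _validationId, _validationDateTime);
}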