C# List.AddRange in Parallel.For occur ArgumentException - c#

My code like that.
List<MyPanel> list_panel = new List<MyPanel>();
.......
List<string> list_sql = new List<string>();
Parallel.For(0, list_panel.Count, i =>
{
if (list_panel[i].R == 0)
{
list_sql.AddRange(list_panel[i].MakeSqlForSave()); // it returns two string
}
});
But AddRange occur System.ArgumentException sometimes.
I found 'list isn't for multi write'. So I fix it using lock.
string[] listLock = new string[2];
Parallel.For(0, list_panel.Count, i =>
{
if (list_panel[i].R == 0)
{
listLock = list_panel[i].MakeSqlForSave();
lock(listLock)
list_sql.AddRange(listLock);
}
});
But it still occur System.ArgumentException that 'Source array was not long enough. Check srcIndex and length, and the array's lower bounds.' sometimes.
An error occurred in list_sql. If the count is 34, but the IndexOfRangeException occurs when you call list_sql[32] and list_sql[33].
How can I handle it?

Use ConcurrentBag<T> as a thread-safe collection that you can safely append to from multiple threads:
ConcurrentBag<String> result = new ConcurrentBag<String>();
Parallel.For(0, list_panel.Count, i =>
{
if (list_panel[i].R == 0)
{
foreach( String s in list_panel[i].MakeSqlForSave() )
{
result.Add( s );
}
}
});
List<String> list_sql = result.Select( s => s ).ToList(); // Serialize to a single List<T> after the concurrent operations are complete.

You must use a dedicated lock for the specific List, and use it every time you access this list (for both read and write).
List<MyPanel> list_panel = new List<MyPanel>();
List<string> list_sql = new List<string>();
object listSqlLock = new object();
Parallel.For(0, list_panel.Count, i =>
{
if (list_panel[i].R == 0)
{
var sqlCommands = list_panel[i].MakeSqlForSave();
lock (listSqlLock)
list_sql.AddRange(sqlCommands);
}
});

Related

How to find the placement of a List within another List?

I am working with two lists. The first contains a large sequence of strings. The second contains a smaller list of strings. I need to find where the second list exists in the first list.
I worked with enumeration, and due to the large size of the data, this is very slow, I was hoping for a faster way.
List<string> first = new List<string>() { "AAA","BBB","CCC","DDD","EEE","FFF" };
List<string> second = new List<string>() { "CCC","DDD","EEE" };
int x = SomeMagic(first,second);
And I would need x to = 2.
Ok, here is my variant with old-good-for-each-loop:
private int SomeMagic(IEnumerable<string> source, IEnumerable<string> target)
{
/* Some obvious checks for `source` and `target` lenght / nullity are ommited */
// searched pattern
var pattern = target.ToArray();
// candidates in form `candidate index` -> `checked length`
var candidates = new Dictionary<int, int>();
// iteration index
var index = 0;
// so, lets the magic begin
foreach (var value in source)
{
// check candidates
foreach (var candidate in candidates.Keys.ToArray()) // <- we are going to change this collection
{
var checkedLength = candidates[candidate];
if (value == pattern[checkedLength]) // <- here `checkedLength` is used in sense `nextPositionToCheck`
{
// candidate has match next value
checkedLength += 1;
// check if we are done here
if (checkedLength == pattern.Length) return candidate; // <- exit point
candidates[candidate] = checkedLength;
}
else
// candidate has failed
candidates.Remove(candidate);
}
// check for new candidate
if (value == pattern[0])
candidates.Add(index, 1);
index++;
}
// we did everything we could
return -1;
}
We use dictionary of candidates to handle situations like:
var first = new List<string> { "AAA","BBB","CCC","CCC","CCC","CCC","EEE","FFF" };
var second = new List<string> { "CCC","CCC","CCC","EEE" };
If you are willing to use MoreLinq then consider using Window:
var windows = first.Window(second.Count);
var result = windows
.Select((subset, index) => new { subset, index = (int?)index })
.Where(z => Enumerable.SequenceEqual(second, z.subset))
.Select(z => z.index)
.FirstOrDefault();
Console.WriteLine(result);
Console.ReadLine();
Window will allow you to look at 'slices' of the data in chunks (based on the length of your second list). Then SequenceEqual can be used to see if the slice is equal to second. If it is, the index can be returned. If it doesn't find a match, null will be returned.
Implemented SomeMagic method as below, this will return -1 if no match found, else it will return the index of start element in first list.
private int SomeMagic(List<string> first, List<string> second)
{
if (first.Count < second.Count)
{
return -1;
}
for (int i = 0; i <= first.Count - second.Count; i++)
{
List<string> partialFirst = first.GetRange(i, second.Count);
if (Enumerable.SequenceEqual(partialFirst, second))
return i;
}
return -1;
}
you can use intersect extension method using the namepace System.Linq
var CommonList = Listfirst.Intersect(Listsecond)

Task.Run in a for loop

I have a for loop inside of which
First : I want to compute the SQL required to run
Second : Run the SQL asynchronously without waiting for them individually to finish in a loop
My code looks like:
for (
int i = 0;
i < gm.ListGroupMembershipUploadDetailsInput.GroupMembershipUploadInputList.Count;
i++)
{
// Compute
SQL.Upload.UploadDetails.insertGroupMembershipRecords(
gm.ListGroupMembershipUploadDetailsInput.GroupMembershipUploadInputList[i],max_seq_key++,max_unit_key++,
out strSPQuery,
out listParam);
//Run the out SPQuery async
Task.Run(() => rep.ExecuteStoredProcedureInputTypeAsync(strSPQuery, listParam));
}
The insertGroupMembershipRecords method in a separate DAL class looks like :
public static GroupMembershipUploadInput insertGroupMembershipRecords(GroupMembershipUploadInput gm, List<ChapterUploadFileDetailsHelper> ch, long max_seq_key, long max_unit_key, out string strSPQuery, out List<object> parameters)
{
GroupMembershipUploadInput gmHelper = new GroupMembershipUploadInput();
gmHelper = gm;
int com_unit_key = -1;
foreach(var item in com_unit_key_lst){
if (item.nk_ecode == gm.nk_ecode)
com_unit_key = item.unit_key;
}
int intNumberOfInputParameters = 42;
List<string> listOutputParameters = new List<string> { "o_outputMessage" };
strSPQuery = SPHelper.createSPQuery("dw_stuart_macs.strx_inst_cnst_grp_mbrshp", intNumberOfInputParameters, listOutputParameters);
var ParamObjects = new List<object>();
ParamObjects.Add(SPHelper.createTdParameter("i_seq_key", max_seq_key, "IN", TdType.BigInt, 10));
ParamObjects.Add(SPHelper.createTdParameter("i_chpt_cd", "S" + gm.appl_src_cd.Substring(1), "IN", TdType.VarChar, 4));
ParamObjects.Add(SPHelper.createTdParameter("i_nk_ecode", gm.nk_ecode, "IN", TdType.Char, 5));
// rest of the method
}
But in case of list Count of 2k which I tried,
It did not insert 2k records in DB but only 1.
Why this does not insert all the records the input list has ?
What am I missing ?
Task.Run in a for loop
Even though this is not the question, the title itself is what I'm going to address. For CPU bound operations you could use Parallel.For or Parallel.ForEach, but since we are IO bound (i.e.; database calls) we should rethink this approach.
The obvious answer here is to create a list of tasks that represent the asynchronous operations and then await them using the Task.WhenAll API like this:
public async Task InvokeAllTheSqlAsync()
{
var list = gm.ListGroupMembershipUploadDetailsInput.GroupMembershipUploadInputList;
var tasks = Enumerable.Range(0, list.Count).Select(i =>
{
var value = list[i];
string strSPQuery;
List<SqlParameter> listParam;
SQL.Upload.UploadDetails.insertGroupMembershipRecords(
value,
max_seq_key++,
max_unit_key++,
out strSPQuery,
out listParam
);
return rep.ExecuteStoredProcedureInputTypeAsync(strSPQuery, listParam);
});
await Task.WhenAll(tasks);
}

Use Linq to break a list by special values?

I'm trying to use Linq to convert IEnumerable<int> to IEnumerable<List<int>> - the input stream will be separated by special value 0.
IEnumerable<List<int>> Parse(IEnumerable<int> l)
{
l.Select(x => {
.....; //?
return new List<int>();
});
}
var l = new List<int> {0,1,3,5,0,3,4,0,1,4,0};
Parse(l) // returns {{1,3,5}, {3, 4}, {1,4}}
How to implement it using Linq instead of imperative looping?
Or is Linq not good for this requirement because the logic depends on the order of the input stream?
Simple loop would be good option.
Alternatives:
Enumerable.Aggregate and start new list on 0
Write own extension similar to Create batches in linq or Use LINQ to group a sequence of numbers with no gaps
Aggregate sample
var result = list.Aggregate(new List<List<int>>(),
(sum,current) => {
if(current == 0)
sum.Add(new List<int>());
else
sum.Last().Add(current);
return sum;
});
Note: this is only sample of the approach working for given very friendly input like {0,1,2,0,3,4}.
One can even make aggregation into immutable lists but that will look insane with basic .Net types.
Here's an answer that lazily enumerates the source enumerable, but eagerly enumerates the contents of each returned list between zeroes. It properly throws upon null input or upon being given a list that does not start with a zero (though allowing an empty list through--that's really an implementation detail you have to decide on). It does not return an extra and empty list at the end like at least one other answer's possible suggestions does.
public static IEnumerable<List<int>> Parse(this IEnumerable<int> source, int splitValue = 0) {
if (source == null) {
throw new ArgumentNullException(nameof (source));
}
using (var enumerator = source.GetEnumerator()) {
if (!enumerator.MoveNext()) {
return Enumerable.Empty<List<int>>();
}
if (enumerator.Current != splitValue) {
throw new ArgumentException(nameof (source), $"Source enumerable must begin with a {splitValue}.");
}
return ParseImpl(enumerator, splitValue);
}
}
private static IEnumerable<List<int>> ParseImpl(IEnumerator<int> enumerator, int splitValue) {
var list = new List<int>();
while (enumerator.MoveNext()) {
if (enumerator.Current == splitValue) {
yield return list;
list = new List<int>();
}
else {
list.Add(enumerator.Current);
}
}
if (list.Any()) {
yield return list;
}
}
This could easily be adapted to be generic instead of int, just change Parse to Parse<T>, change int to T everywhere, and use a.Equals(b) or !a.Equals(b) instead of a == b or a != b.
You could create an extension method like this:
public static IEnumerable<IEnumerable<T>> SplitBy<T>(this IEnumerable<T> source, T value)
{
using (var e = source.GetEnumerator())
{
if (e.MoveNext())
{
var list = new List<T> { };
//In case the source doesn't start with 0
if (!e.Current.Equals(value))
{
list.Add(e.Current);
}
while (e.MoveNext())
{
if ( !e.Current.Equals(value))
{
list.Add(e.Current);
}
else
{
yield return list;
list = new List<T> { };
}
}
//In case the source doesn't end with 0
if (list.Count>0)
{
yield return list;
}
}
}
}
Then, you can do the following:
var l = new List<int> { 0, 1, 3, 5, 0, 3, 4, 0, 1, 4, 0 };
var result = l.SplitBy(0);
You could use GroupBy with a counter.
var list = new List<int> {0,1,3,5,0,3,4,0,1,4,0};
int counter = 0;
var result = list.GroupBy(x => x==0 ? counter++ : counter)
.Select(g => g.TakeWhile(x => x!=0).ToList())
.Where(l => l.Any());
Edited to fix possibility of zeroes within numbers
Here is a semi-LINQ solution:
var l = new List<int> {0,1,3,5,0,3,4,0,1,4,0};
string
.Join(",", l.Select(x => x == 0 ? "|" : x.ToString()))
.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries)
.Select(x => x.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries));
This is probably not preferable to using a loop due to performance and other reasons, but it should work.

Parallel.For loop for spell checking using NHunspell and c#

I have a list of string and if I run spell checking using NHunspell in sequential manner then everything works fine; but if I use Parallel.For loop against the List the application stops working in the middle( some address violation error )
public static bool IsSpellingRight(string inputword, byte[] frDic, byte[] frAff, byte[] enDic, byte[] enAff)
{
if (inputword.Length != 0)
{
bool correct;
if (IsEnglish(inputword))
{
using (var hunspell = new Hunspell(enAff, enDic))
{
correct = hunspell.Spell(inputword);
}
}
else
{
using (var hunspell = new Hunspell(frAff, frDic))
{
correct = hunspell.Spell(inputword);
}
}
return correct ;
}
return false;
}
Edit:
var tokenSource = new CancellationTokenSource();
CancellationToken ct = tokenSource.Token;
var poptions = new ParallelOptions();
// Keep one core/CPU free...
poptions.MaxDegreeOfParallelism = Environment.ProcessorCount - 1;
Task task = Task.Factory.StartNew(delegate
{
Parallel.For(0, total, poptions, i =>
{
if (words[i] != "")
{
_totalWords++;
if (IsSpellingRight(words[i],dictFileBytes,
affFileBytes,dictFileBytesE,affFileBytesE))
{
// do something
}
else
{
BeginInvoke((Action) (() =>
{
//do something on UI thread
}));
}
}
});
}, tokenSource.Token);
task.ContinueWith((t) => BeginInvoke((Action) (() =>
{
MessaageBox.Show("Done");
})));
I think you should forget about your parallel loop implement the things right.
Are you aware of the fact that this code loads and constructs the dictionary:
using (var hunspell = new Hunspell(enAff, enDic))
{
correct = hunspell.Spell(inputword);
}
Your are loading and construction the dictionary over and over again with your code. This is awfully slow! Load Your dictionary once and check all words, then dispose it. And don't do this in parallel because Hunspell objects are not thread safe.
Pseodocode:
Hunspell hunspell = null;
try
{
hunspell = new Hunspell(enAff, enDic)
for( ... )
{
hunspell.Spell(inputword[i]);
}
}
}
finally
{
if( hunspell != null ) hunspell.Dispose();
}
If you need to check words massive in parallel consider to read this article:
http://www.codeproject.com/Articles/43769/Spell-Check-Hyphenation-and-Thesaurus-for-NET-with
Ok, now I can see a potential problem. In line
_totalWords++;
you're incrementing a value, that (I suppose) is declared somewhere outside the loop. Use locking mechanism.
edit:
Also, you could use Interlocked.Increment(ref val);, which would be faster, than simple locking.
edit2:
Here's how the locking I described in comment should look like for the problem you encounter:
static object Locker = new object(); //anywhere in the class
//then in your method
if (inputword.Length != 0)
{
bool correct;
bool isEnglish;
lock(Locker) {isEnglish = IsEnglish(inputword);}
if(isEnglish)
{
//..do your stuff
}
//rest of your function
}

Null Reference while handling a List in multiple threads

Basically, i have a collection of objects, i am chopping it into small collections, and doing some work on a thread over each small collection simultaneously.
int totalCount = SomeDictionary.Values.ToList().Count;
int singleThreadCount = (int)Math.Round((decimal)(totalCount / 10));
int lastThreadCount = totalCount - (singleThreadCount * 9);
Stopwatch sw = new Stopwatch();
Dictionary<int,Thread> allThreads = new Dictionary<int,Thread>();
List<rCode> results = new List<rCode>();
for (int i = 0; i < 10; i++)
{
int count = i;
if (i != 9)
{
Thread someThread = new Thread(() =>
{
List<rBase> objects = SomeDictionary.Values
.Skip(count * singleThreadCount)
.Take(singleThreadCount).ToList();
List<rCode> result = objects.Where(r => r.ZBox != null)
.SelectMany(r => r.EffectiveCBox, (r, CBox) => new rCode
{
RBox = r,
// A Zbox may refer an object that can be
// shared by many
// rCode objects even on different threads
ZBox = r.ZBox,
CBox = CBox
}).ToList();
results.AddRange(result);
});
allThreads.Add(i, someThread);
someThread.Start();
}
else
{
Thread someThread = new Thread(() =>
{
List<rBase> objects = SomeDictionary.Values
.Skip(count * singleThreadCount)
.Take(lastThreadCount).ToList();
List<rCode> result = objects.Where(r => r.ZBox != null)
.SelectMany(r => r.EffectiveCBox, (r, CBox) => new rCode
{
RBox = r,
// A Zbox may refer an object that
// can be shared by many
// rCode objects even on different threads
ZBox = r.ZBox,
CBox = CBox
}).ToList();
results.AddRange(result);
});
allThreads.Add(i, someThread);
someThread.Start();
}
}
sw.Start();
while (allThreads.Values.Any(th => th.IsAlive))
{
if (sw.ElapsedMilliseconds >= 60000)
{
results = null;
allThreads.Values.ToList().ForEach(t => t.Abort());
sw.Stop();
break;
}
}
return results != null ? results.OrderBy(r => r.ZBox.Name).ToList():null;
so, My issue is that SOMETIMES, i get a null reference exception while performing the OrderBy operation before returning the results, and i couldn't determine where is the exception exactly, i press back, click the same button that does this operation on the same data again, and it works !! .. If someone can help me identify this issue i would be more than gratefull. NOTE :A Zbox may refer an object that can be shared by many rCode objects even on different threads, can this be the issue ?
as i can't determine this upon testing, because the error happening is not deterministic.
The bug is correctly found in the chosen answer although I do not agree with the answer. You should switch to using a concurrent collection. In your case a ConcurrentBag or ConcurrentQueue. Some of which are (partially) lockfree for better performance. And they provide more readable and less code since you do not need manual locking.
Your code would also more than halve in size and double in readability if you keep from manually created threads and manual paritioning;
Parallel.ForEach(objects, MyObjectProcessor);
public void MyObjectProcessor(Object o)
{
// Create result and add to results
}
Use a ParallelOptions object if you want to limit the number of threads with Parallel.ForEach............
Well, one obvious problem is here:
results.AddRange(result);
where you're updating a list from multiple threads. Try using a lock:
object resultsLock = new object(); // globally visible
...
lock(resultsLock)
{
results.AddRange(result);
}
I suppose the problem in results = null
while (allThreads.Values.Any(th => th.IsAlive))
{ if (sw.ElapsedMilliseconds >= 60000) { results = null; allThreads.Values.ToList().ForEach(t => t.Abort());
if the threads not finised faster than 60000 ms you results become equal null and you can't call results.OrderBy(r => r.ZBox.Name).ToList(); it's throws exception
you should add something like that
if (results != null)
return results.OrderBy(r => r.ZBox.Name).ToList();
else
return null;

Categories