public void Register(decimal groupId, decimal deptId, decimal employeeId)
{
    using (EmployeeEntities db = new EmployeeEntities())
    {
        using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required,
            new TransactionOptions { IsolationLevel = IsolationLevel.Snapshot }))
        {
            var course = db.Courses.Where(s => s.GroupId == groupId && s.DepartmentId == deptId).ToList();
            foreach (var item in course)
            {
                var filledSeats = db.CourseRegistrations.Count(c => c.CourseId == item.CourseId
                    && c.DepartmentId == deptId && (c.CancelledFl == null || c.CancelledFl == false));
                if (item.AllotedSeats <= filledSeats)
                {
                    throw new Exception("Sorry! Seats are not available for "
                        + db.Groups.Where(s => s.GroupId == item.GroupId).Select(s => s.GroupName).FirstOrDefault());
                }
                if (!db.CourseRegistrations.Any(s => s.EmployeeId == employeeId && s.CourseId == item.CourseId
                    && (s.CancelledFl == false || s.CancelledFl == null)))
                {
                    var courseRegister = new CourseRegistration();
                    db.CourseRegistrations.Add(courseRegister);
                    courseRegister.CourseId = item.CourseId;
                    courseRegister.EmployeeId = employeeId;
                    courseRegister.CreatedBy = 1;
                    courseRegister.CreatedDt = DateTime.Now;
                    courseRegister.RecordVa = 1;
                    item.FilledSeats = item.FilledSeats + 1;
                }
            }
            db.SaveChanges();
            scope.Complete();
        }
    }
}
Consider the code above. It is a function that is eventually called when a request is sent to an ASP.NET Web API controller.
The function simply registers an employee for a course after checking seat availability. Some 200 employees may register for courses at the same time.
I am using Snapshot isolation for every transaction.
My problem is performance: it is slow, and sometimes it times out.
My question is: why? Where have I gone wrong in my code? What really happens in all these transactions? What waits on what locks?
You have multiple calls to the database inside your foreach loop, which means you pay the full database request latency 2 × course.Count times, when you should only need to pay it once, maybe twice. See if you can bring the necessary data from db.CourseRegistrations outside the loop, possibly as part of the same query joined with the data from Courses. Then you can do the operations inside the loop in memory, which will be orders of magnitude faster.
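As a rough sketch of that idea (names mirror the question's model; treat this as an assumption-laden outline, not a drop-in replacement):

```csharp
// One round trip: filled-seat counts for the whole department, grouped by course.
var filledSeatsByCourse = db.CourseRegistrations
    .Where(c => c.DepartmentId == deptId
             && (c.CancelledFl == null || c.CancelledFl == false))
    .GroupBy(c => c.CourseId)
    .Select(g => new { CourseId = g.Key, Filled = g.Count() })
    .ToDictionary(x => x.CourseId, x => x.Filled);

// One more round trip: the employee's existing (non-cancelled) registrations.
var alreadyRegistered = new HashSet<decimal>(
    db.CourseRegistrations
      .Where(s => s.EmployeeId == employeeId
               && (s.CancelledFl == null || s.CancelledFl == false))
      .Select(s => s.CourseId));

foreach (var item in course)
{
    int filled;
    filledSeatsByCourse.TryGetValue(item.CourseId, out filled);
    if (item.AllotedSeats <= filled)
        throw new Exception("Sorry! Seats are not available for group " + item.GroupId);
    if (!alreadyRegistered.Contains(item.CourseId))
    {
        // ...add the CourseRegistration and bump FilledSeats in memory, as before...
    }
}
```

Note that the db.Groups lookup inside the exception message is yet another per-iteration query; the group name could be fetched up front the same way.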
If you are doing many updates/inserts, EF tracks these changes automatically by default, which can really hurt performance when you are updating many records. To help with this, you can turn off the AutoDetectChangesEnabled feature just before you do your bulk update and turn it back on again as soon as you are finished:
try
{
    db.Configuration.AutoDetectChangesEnabled = false;
    // loop through your updates here
}
finally
{
    db.Configuration.AutoDetectChangesEnabled = true;
}
db.SaveChanges();
I have this function:
[HttpGet]
[Route("See/{OID}")]
public IActionResult See([FromRoute] long OID)
{
    // the time must also be checked
    orders order = _context.orders.FirstOrDefault(e => e.OID == OID);
    lastViewed lv = _context.lastViewed.FirstOrDefault(e => e.UID.ToString() == User.Identity.Name);
    if (DateTime.Compare(lv.sentTime.AddSeconds(order.seconds), GetUTCDateTime()) < 0)
        return Content("1");
    unViewed UV = _context.unViewed.FirstOrDefault(e => e.UID.ToString() == User.Identity.Name && e.OID == OID);
    if (UV != null)
        _context.unViewed.Remove(UV);
    coins coin = _context.coins.FirstOrDefault(e => e.UID.ToString() == User.Identity.Name);
    if (order.type > 50)
        coin.Coin += order.seconds * 8;
    else
        coin.Coin += order.seconds;
    order.view--;
    if (order.view == 0)
    {
        var uv = _context.unViewed.Where(e => e.OID == OID);
        foreach (unViewed uvv in uv)
            _context.unViewed.Remove(uvv);
        pastOrders po = new pastOrders
        {
            link = order.link,
            OID = order.OID,
            UID = order.UID
        };
        _context.pastOrders.Add(po);
        _context.orders.Remove(order);
    }
    _context.SaveChanges();
    return showlink();
}
The showlink() function returns an IActionResult and is fully independent from the rest of the code.
My question is: how can I first send the showlink() response to the user, and only after that response run the other code?
Thanks a lot.
You can schedule the rest of the code to run on a separate thread. Note, though, that this is not entirely safe as you are not checking to see if the thread properly executed. In order to do that, you would need to add other checks to your code along with a way to be notified if something fails. Adopting an asynchronous model is a big undertaking and is the long term solution.
However, the following should unblock you for this particular case:
var result = showlink();
Task.Factory.StartNew(() =>
{
    // Rest of your code that needs to run in the background.
});
return result;
I get this exception when I try to process 270k records. It fails at 12k. Can someone explain to me what I am missing?
The database is SQL Server and I am using EF 6. I am using PredicateBuilder to build my where clause.
The idea being:
select * from table where ((a = 'v1' and b = 'v2') or (a = 'v11' and b = 'v21') or (a = 'v12' and b = 'v22') ..)
I don't see anywhere that I still hold a reference to the object that represents the EF class. I am creating a POCO for the result I want to send back to the view.
Any ideas?
Also, I am using a CommandTimeout of 10000, and at the point where it fails, when I run the query with the same parameters in SQL Management Studio, it returns 400 rows.
When I ran the profiler, I noticed that a few seconds before I got the error, memory usage shot up to 1 GB+.
Thanks
public List<SearchResult> SearchDocuments(List<SearchCriteria> searchCriterias)
{
    List<SearchResult> results = new List<SearchResult>();
    var fieldSettings = GetData(); // make a call to database to get this data
    using (var context = CreateContext())
    {
        var theQuery = PredicateBuilder.False<ViewInSqlDatabase>();
        int skipCount = 0;
        const int recordsToProcessInOneBatch = 100;
        while (searchCriterias.Skip(skipCount).Any())
        {
            var searchCriteriasBatched = searchCriterias.Skip(skipCount).Take(recordsToProcessInOneBatch);
            foreach (var searchCriteria in searchCriteriasBatched)
            {
                var queryBuilder = PredicateBuilder.True<ViewInSqlDatabase>();
                // theQuery
                if (searchCriteria.State.HasValue)
                    queryBuilder = queryBuilder.And(a => a.State == searchCriteria.State.Value);
                if (!string.IsNullOrWhiteSpace(searchCriteria.StateFullName))
                    queryBuilder = queryBuilder.And(a => a.StateName.Equals(searchCriteria.StateFullName, StringComparison.CurrentCultureIgnoreCase));
                if (searchCriteria.County.HasValue)
                    queryBuilder = queryBuilder.And(a => a.County == searchCriteria.County.Value);
                if (!string.IsNullOrWhiteSpace(searchCriteria.CountyFullName))
                    queryBuilder = queryBuilder.And(a => a.CountyName.Equals(searchCriteria.CountyFullName, StringComparison.CurrentCultureIgnoreCase));
                if (!string.IsNullOrWhiteSpace(searchCriteria.Township))
                    queryBuilder = queryBuilder.And(a => a.Township == searchCriteria.Township);
                // and so on... for another 10 parameters
                theQuery = theQuery.Or(queryBuilder.Expand());
            }
            // this is where I get the error after 12k to 15k criterias have been processed
            var searchQuery = context.ViewInSqlDatabase.AsExpandable().Where(theQuery).Distinct().ToList();
            foreach (var query in searchQuery)
            {
                var newResultItem = SearchResult.Create(query, fieldSettings); // POCO object with no relation to database
                if (!results.Contains(newResultItem))
                    results.Add(newResultItem);
            }
            skipCount += recordsToProcessInOneBatch;
        }
    }
    return results.Distinct().OrderBy(a => a.State).ThenBy(a => a.County).ThenBy(a => a.Township).ToList();
}
Fourat is correct that you can modify your query to context.SearchResults.Where(x => ((x.a == "v1" && x.b == "v2") || (x.a == "v11" && x.b == "v21") || (x.a == "v12" && x.b == "v22"))).Distinct().OrderBy(a => a.State).ThenBy(a => a.County).ThenBy(a => a.Township).ToList(); What this does is make the database do the heavy lifting for you.
I would also suggest that you use lazy evaluation instead of forcing it into a list if you can.
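For instance, the method could return a deferred sequence instead of building the full List&lt;SearchResult&gt; up front (a sketch reusing the question's CreateContext and SearchResult.Create helpers; rows then stream from the server as the caller enumerates):

```csharp
// Deferred: no SQL runs until the caller enumerates, and results are
// yielded one at a time instead of being materialized into one big list.
public IEnumerable<SearchResult> SearchDocumentsLazy(
    Expression<Func<ViewInSqlDatabase, bool>> theQuery)
{
    var fieldSettings = GetData();
    using (var context = CreateContext())
    {
        foreach (var row in context.ViewInSqlDatabase.AsExpandable().Where(theQuery))
            yield return SearchResult.Create(row, fieldSettings);
    }
}
```

The using block is disposed only when enumeration completes, so the caller must finish (or abandon) the enumeration before the context goes away.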
I run a build system. Data-wise, the simplified description is that I have Configurations, and each config has 0..n Builds.
Now, builds produce artifacts, and some of these are stored on the server. What I am doing is writing a kind of rule that sums all the bytes produced per configuration's builds and checks whether that is too much.
The code for the routine at the moment is the following:
private void CalculateExtendedDiskUsage(IEnumerable<Configuration> allConfigurations)
{
    var sw = new Stopwatch();
    sw.Start();
    // Let's take only confs that have been updated within the last 7 days
    var items = allConfigurations.AsParallel().Where(x =>
        x.artifact_cleanup_type != null && x.build_cleanup_type != null &&
        x.updated_date > DateTime.UtcNow.AddDays(-7)
    ).ToList();
    using (var ctx = new LocalEntities())
    {
        Debug.WriteLine("Context: " + sw.Elapsed);
        var allBuilds = ctx.Builds;
        var ruleResult = new List<Notification>();
        foreach (var configuration in items)
        {
            // all builds for the current configuration
            var configurationBuilds = allBuilds.Where(x => x.configuration_id == configuration.configuration_id)
                .OrderByDescending(z => z.build_date);
            Debug.WriteLine("Filter conf builds: " + sw.Elapsed);
            // Since I don't know which builds/artifacts have been cleaned up, calculate it manually
            if (configuration.build_cleanup_count != null)
            {
                var buildCleanupCount = "30"; // default
                if (configuration.build_cleanup_type.Equals("ReserveBuildsByDays"))
                {
                    var buildLastCleanupDate = DateTime.UtcNow.AddDays(-int.Parse(buildCleanupCount));
                    configurationBuilds = configurationBuilds.Where(x => x.build_date > buildLastCleanupDate)
                        .OrderByDescending(z => z.build_date);
                }
                if (configuration.build_cleanup_type.Equals("ReserveBuildsByCount"))
                {
                    var buildLastCleanupCount = int.Parse(buildCleanupCount);
                    configurationBuilds =
                        configurationBuilds.Take(buildLastCleanupCount).OrderByDescending(z => z.build_date);
                }
            }
            if (configuration.artifact_cleanup_count != null)
            {
                // skipped, similar to previous block
            }
            Debug.WriteLine("Done cleanup: " + sw.Elapsed);
            const int maxDiscAllocationPerConfiguration = 1000000000; // 1 GB
            // Sum all disc usage per configuration
            var confDiscSizePerConfiguration = configurationBuilds
                .GroupBy(c => new { c.configuration_id })
                .Where(c => (c.Sum(z => z.artifact_dir_size) > maxDiscAllocationPerConfiguration))
                .Select(groupedBuilds =>
                    new
                    {
                        configurationId = groupedBuilds.FirstOrDefault().configuration_id,
                        configurationPath = groupedBuilds.FirstOrDefault().configuration_path,
                        Total = groupedBuilds.Sum(c => c.artifact_dir_size),
                        Average = groupedBuilds.Average(c => c.artifact_dir_size)
                    }).ToList();
            Debug.WriteLine("Done db query: " + sw.Elapsed);
            ruleResult.AddRange(confDiscSizePerConfiguration.Select(iter => new Notification
            {
                ConfigurationId = iter.configurationId,
                CreatedDate = DateTime.UtcNow,
                RuleType = (int)RulesEnum.TooMuchDisc,
                ConfigrationPath = iter.configurationPath
            }));
            Debug.WriteLine("Finished loop: " + sw.Elapsed);
        }
        // find owners and insert...
    }
}
This does exactly what I want, but I am wondering if I could make it any faster. Currently I see:
Context: 00:00:00.0609067
// first round
Filter conf builds: 00:00:00.0636291
Done cleanup: 00:00:00.0644505
Done db query: 00:00:00.3050122
Finished loop: 00:00:00.3062711
// avg round
Filter conf builds: 00:00:00.0001707
Done cleanup: 00:00:00.0006343
Done db query: 00:00:00.0760567
Finished loop: 00:00:00.0773370
The SQL generated by .ToList() looks very messy. (Everything used in the WHERE is covered by an index in the DB.)
I am testing with 200 configurations, so this adds up to 00:00:18.6326722. I have a total of ~8k items that need to be processed daily (so the whole routine takes more than 10 minutes to complete).
I have been randomly googling around the internet, and it seems to me that Entity Framework is not very good with parallel processing. Knowing that, I still decided to give this async/await approach a try (the first time I've tried it, so sorry for any nonsense).
Basically, if I move all the processing out of scope like this:
foreach (var configuration in items)
{
    var confDiscSizePerConfiguration = await GetData(configuration, allBuilds);
    ruleResult.AddRange(confDiscSizePerConfiguration.Select(iter => new Notification
    {
        // ... skipped
    }));
}
And:
private async Task<List<Tmp>> GetData(Configuration configuration, IQueryable<Build> allBuilds)
{
    var configurationBuilds = allBuilds.Where(x => x.configuration_id == configuration.configuration_id)
        .OrderByDescending(z => z.build_date);
    // ..skipped
    var confDiscSizePerConfiguration = configurationBuilds
        .GroupBy(c => new { c.configuration_id })
        .Where(c => (c.Sum(z => z.artifact_dir_size) > maxDiscAllocationPerConfiguration))
        .Select(groupedBuilds =>
            new Tmp
            {
                ConfigurationId = groupedBuilds.FirstOrDefault().configuration_id,
                ConfigurationPath = groupedBuilds.FirstOrDefault().configuration_path,
                Total = groupedBuilds.Sum(c => c.artifact_dir_size),
                Average = groupedBuilds.Average(c => c.artifact_dir_size)
            }).ToListAsync();
    return await confDiscSizePerConfiguration;
}
This, for some reason, drops the execution time for 200 items from 18 to 13 seconds. Anyway, from what I understand, since I am awaiting each .ToListAsync(), it is still processed in sequence; is that correct?
So the "can't process in parallel" claim starts to show when I replace the foreach (var configuration in items) with Parallel.ForEach(items, async configuration =>. Doing this change results in:
A second operation started on this context before a previous
asynchronous operation completed. Use 'await' to ensure that any
asynchronous operations have completed before calling another method
on this context. Any instance members are not guaranteed to be thread
safe.
It was a bit confusing to me at first, as I await practically everywhere the compiler allows it, but possibly the data gets seeded too fast.
I tried to overcome this by being less greedy and added new ParallelOptions { MaxDegreeOfParallelism = 4 } to that parallel loop; my naive assumption was that since the default connection pool size is 100 and all I want to use is 4, it should be plenty. But it still fails.
I have also tried to create new DbContexts inside the GetData method, but it still fails. If I remember correctly (I can't test now), I got:
Underlying connection failed to open
What possibilities there are to make this routine go faster?
Before going parallel, it is worth optimizing the query itself. Here are some suggestions that might improve your times:
1) Use the Key when working with GroupBy. This might solve the issue of the complex and nested SQL query, as that way you instruct LINQ to use the same keys defined in GROUP BY and not to create a sub-select.
var confDiscSizePerConfiguration = configurationBuilds
    .GroupBy(c => new { ConfigurationId = c.configuration_id, ConfigurationPath = c.configuration_path })
    .Where(c => (c.Sum(z => z.artifact_dir_size) > maxDiscAllocationPerConfiguration))
    .Select(groupedBuilds =>
        new
        {
            configurationId = groupedBuilds.Key.ConfigurationId,
            configurationPath = groupedBuilds.Key.ConfigurationPath,
            Total = groupedBuilds.Sum(c => c.artifact_dir_size),
            Average = groupedBuilds.Average(c => c.artifact_dir_size)
        })
    .ToList();
2) It seems that you are bitten by the N+1 problem. In simple words, you execute one SQL query to get all configurations and N more to get the build information. In total that would be ~8k small queries, where 2 bigger queries would suffice. If memory is not a constraint, fetch all build data into memory and optimize for fast lookup using ToLookup.
var allBuilds = ctx.Builds.ToLookup(x => x.configuration_id);
Later you can lookup builds by:
var configurationBuilds = allBuilds[configuration.configuration_id].OrderByDescending(z => z.build_date);
3) You are calling OrderBy on configurationBuilds multiple times. Filtering does not affect record order, so you can safely remove the extra calls to OrderBy:
...
configurationBuilds = configurationBuilds.Where(x => x.build_date > buildLastCleanupDate);
...
configurationBuilds = configurationBuilds.Take(buildLastCleanupCount);
...
4) There is no point in doing GroupBy, as the builds are already filtered for a single configuration.
UPDATE:
I took it one step further and created code that retrieves the same results as your provided code with a single request. It should be more performant and use less memory.
private void CalculateExtendedDiskUsage()
{
    // assumed constants/defaults, mirroring the original code
    const int maxDiscAllocationPerConfiguration = 1000000000; // 1 GB
    const int buildCleanupCount = 30;
    var buildLastCleanupDate = DateTime.UtcNow.AddDays(-buildCleanupCount);
    using (var ctx = new LocalEntities())
    {
        var ruleResult = ctx.Configurations
            .Where(x => x.build_cleanup_count != null &&
                (
                    (x.build_cleanup_type == "ReserveBuildsByDays" && ctx.Builds.Where(y => y.configuration_id == x.configuration_id).Where(y => y.build_date > buildLastCleanupDate).Sum(y => y.artifact_dir_size) > maxDiscAllocationPerConfiguration) ||
                    (x.build_cleanup_type == "ReserveBuildsByCount" && ctx.Builds.Where(y => y.configuration_id == x.configuration_id).OrderByDescending(y => y.build_date).Take(buildCleanupCount).Sum(y => y.artifact_dir_size) > maxDiscAllocationPerConfiguration)
                )
            )
            .Select(x => new Notification
            {
                ConfigurationId = x.configuration_id,
                ConfigrationPath = x.configuration_path,
                CreatedDate = DateTime.UtcNow,
                RuleType = (int)RulesEnum.TooMuchDisc
            })
            .ToList();
    }
}
First, create a new context for every iteration if you are going to go the Parallel.ForEach route. But you really need to write a query that gets all the needed data in one trip. To speed up EF, you can also disable change tracking or proxies on the context when you are only reading data.
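A minimal sketch of both points, assuming the LocalEntities context and types from the question (context-per-iteration plus no-tracking reads):

```csharp
Parallel.ForEach(items, new ParallelOptions { MaxDegreeOfParallelism = 4 },
    configuration =>
{
    // Each iteration gets its own context: DbContext is not thread-safe.
    using (var ctx = new LocalEntities())
    {
        // AsNoTracking skips the change tracker for read-only queries.
        var configurationBuilds = ctx.Builds
            .AsNoTracking()
            .Where(x => x.configuration_id == configuration.configuration_id)
            .ToList();
        // ...compute the notifications for this configuration in memory...
    }
});
```

Keep in mind that the result collection the iterations write into must then be thread-safe (or guarded by a lock).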
There are a lot of places for optimizations...
There are places where you should use .ToArray() to avoid querying the server multiple times...
I did a lot of refactoring, but I'm unable to check it due to the lack of more information.
Maybe this can lead you to a better solution...
private void CalculateExtendedDiskUsage(IEnumerable<Configuration> allConfigurations)
{
    var sw = new Stopwatch();
    sw.Start();
    using (var ctx = new LocalEntities())
    {
        Debug.WriteLine("Context: " + sw.Elapsed);
        var allBuilds = ctx.Builds;
        var ruleResult = GetRulesResult(sw, allConfigurations, allBuilds); // Clean Code!!!
        // find owners and insert...
    }
}

private static IEnumerable<Notification> GetRulesResult(Stopwatch sw, IEnumerable<Configuration> allConfigurations, IEnumerable<Build> allBuilds)
{
    // Let's take only confs that have been updated within the last 7 days
    var ruleResult = allConfigurations
        .AsParallel() // Check if you really need this right here...
        .Where(IsConfigEligible) // Clean Code!!!
        .SelectMany(x => CreateNotifications(sw, allBuilds, x))
        .ToArray();
    Debug.WriteLine("Finished loop: " + sw.Elapsed);
    return ruleResult;
}

private static bool IsConfigEligible(Configuration x)
{
    return x.artifact_cleanup_type != null &&
           x.build_cleanup_type != null &&
           x.updated_date > DateTime.UtcNow.AddDays(-7);
}

private static IEnumerable<Notification> CreateNotifications(Stopwatch sw, IEnumerable<Build> allBuilds, Configuration configuration)
{
    // all builds for the current configuration
    var configurationBuilds = allBuilds
        .Where(x => x.configuration_id == configuration.configuration_id);
    // .OrderByDescending(z => z.build_date); <<< You should order only when needed (mostly at the end)
    Debug.WriteLine("Filter conf builds: " + sw.Elapsed);
    configurationBuilds = BuildCleanup(configuration, configurationBuilds); // Clean Code!!!
    configurationBuilds = ArtifactCleanup(configuration, configurationBuilds); // Clean Code!!!
    Debug.WriteLine("Done cleanup: " + sw.Elapsed);
    const int maxDiscAllocationPerConfiguration = 1000000000; // 1 GB
    // Sum all disc usage per configuration
    var confDiscSizePerConfiguration = configurationBuilds
        .OrderByDescending(z => z.build_date) // I think you can put this even later (or drop it entirely)
        .GroupBy(c => c.configuration_id) // No need to create a new object, just use the property
        .Where(c => (c.Sum(z => z.artifact_dir_size) > maxDiscAllocationPerConfiguration))
        .Select(CreateSumPerConfiguration);
    Debug.WriteLine("Done db query: " + sw.Elapsed);
    // Extracting to a variable to be able to return it as the function result
    var notifications = confDiscSizePerConfiguration
        .Select(CreateNotification);
    return notifications;
}

private static IEnumerable<Build> BuildCleanup(Configuration configuration, IEnumerable<Build> builds)
{
    // Since I don't know which builds/artifacts have been cleaned up, calculate it manually
    if (configuration.build_cleanup_count == null) return builds;
    const int buildCleanupCount = 30; // Why 'string' if you always need it as an integer?
    builds = GetDiscardBelow(configuration, buildCleanupCount, builds); // Clean Code (almost)
    builds = GetDiscardAbove(configuration, buildCleanupCount, builds); // Clean Code (almost)
    return builds;
}

private static IEnumerable<Build> ArtifactCleanup(Configuration configuration, IEnumerable<Build> configurationBuilds)
{
    if (configuration.artifact_cleanup_count != null)
    {
        // skipped, similar to previous block
    }
    return configurationBuilds;
}

private static SumPerConfiguration CreateSumPerConfiguration(IGrouping<object, Build> groupedBuilds)
{
    var build = groupedBuilds.First();
    return new SumPerConfiguration
    {
        configurationId = build.configuration_id,
        configurationPath = build.configuration_path,
        Total = groupedBuilds.Sum(c => c.artifact_dir_size),
        Average = groupedBuilds.Average(c => c.artifact_dir_size)
    };
}

private static IEnumerable<Build> GetDiscardBelow(Configuration configuration,
    int buildCleanupCount,
    IEnumerable<Build> configurationBuilds)
{
    if (!configuration.build_cleanup_type.Equals("ReserveBuildsByDays"))
        return configurationBuilds;
    var buildLastCleanupDate = DateTime.UtcNow.AddDays(-buildCleanupCount);
    var result = configurationBuilds
        .Where(x => x.build_date > buildLastCleanupDate);
    return result;
}

private static IEnumerable<Build> GetDiscardAbove(Configuration configuration,
    int buildLastCleanupCount,
    IEnumerable<Build> configurationBuilds)
{
    if (!configuration.build_cleanup_type.Equals("ReserveBuildsByCount"))
        return configurationBuilds;
    var result = configurationBuilds
        .Take(buildLastCleanupCount);
    return result;
}

private static Notification CreateNotification(SumPerConfiguration iter)
{
    return new Notification
    {
        ConfigurationId = iter.configurationId,
        CreatedDate = DateTime.UtcNow,
        RuleType = (int)RulesEnum.TooMuchDisc,
        ConfigrationPath = iter.configurationPath
    };
}

internal class SumPerConfiguration
{
    public object configurationId { get; set; } //
    public object configurationPath { get; set; } // I used 'object' because I don't know your data types
    public long Total { get; set; }
    public double Average { get; set; }
}
I have the following method, which filters 2 million records, but most of the time, if I want to get the last page, it causes Entity Framework to time out. Is there any way I could improve the following code so that it runs faster?
public virtual ActionResult GetData(DataTablesParamsModel param)
{
    try
    {
        int totalRowCount = 0;
        // Generate data
        var allRecords = _echoMediaRepository.GetMediaList();
        // Apply search criteria to data
        var predicate = PredicateBuilder.True<MediaChannelModel>();
        if (!String.IsNullOrEmpty(param.sSearch))
        {
            var wherePredicate = PredicateBuilder.False<MediaChannelModel>();
            int i;
            if (int.TryParse(param.sSearch, out i))
            {
                wherePredicate = wherePredicate.Or(m => m.ID == i);
            }
            wherePredicate = wherePredicate.Or(m => m.Name.Contains(param.sSearch));
            predicate = predicate.And(wherePredicate);
        }
        if (param.iMediaGroupID > 0)
        {
            var wherePredicate = PredicateBuilder.False<MediaChannelModel>();
            var mediaTypes = new NeptuneRepository<Lookup_MediaTypes>();
            var mediaGroups = mediaTypes.FindWhere(m => m.MediaGroupID == param.iMediaGroupID)
                .Select(m => m.Name)
                .ToArray();
            wherePredicate = wherePredicate.Or(m => mediaGroups.Contains(m.NeptuneMediaType) || mediaGroups.Contains(m.MediaType));
            predicate = predicate.And(wherePredicate);
        }
        var filteredRecord = allRecords.Where(predicate);
        var columnCriteria = param.sColumns.Split(',').ToList();
        if (!String.IsNullOrEmpty(columnCriteria[param.iSortCol_0]))
        {
            filteredRecord = filteredRecord.ApplyOrder(
                columnCriteria[param.iSortCol_0],
                param.sSortDir_0 == "asc" ? QuerySortOrder.OrderBy : QuerySortOrder.OrderByDescending);
        }
        totalRowCount = filteredRecord.Count();
        var finalQuery = filteredRecord.Skip(param.iDisplayStart).Take(param.iDisplayLength).ToList();
        // Create response
        return Json(new
        {
            sEcho = param.sEcho,
            aaData = finalQuery,
            iTotalRecords = allRecords.Count(),
            iTotalDisplayRecords = totalRowCount
        }, JsonRequestBehavior.AllowGet);
    }
    catch (Exception ex)
    {
        Logger.Error(ex);
        throw;
    }
}
Your code and queries look optimized, so the problem is likely the lack of indexes in the database, which degrades the performance of your OrderBy (used by the Skip).
Using test code very similar to yours, I've done some tests on a local test DB with a table of 5 million rows (with XML-type columns all filled) and, as expected, queries ordered by indexed columns were really fast, but ordering by unindexed columns could take a very, very long time.
I recommend you analyse the columns most commonly used in the dynamic Where and Order functions and do some performance tests after creating the corresponding indexes.
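If the schema is managed with EF Code First migrations, such an index can be added in a migration; the table and column names below are assumptions based on the question's model, not known schema:

```csharp
using System.Data.Entity.Migrations;

public partial class AddMediaChannelIndexes : DbMigration
{
    public override void Up()
    {
        // Index the columns most often used in the dynamic Where/OrderBy,
        // so that OrderBy + Skip/Take can seek instead of scanning.
        CreateIndex("dbo.MediaChannels", "Name");
        CreateIndex("dbo.MediaChannels", "MediaType");
    }

    public override void Down()
    {
        DropIndex("dbo.MediaChannels", new[] { "MediaType" });
        DropIndex("dbo.MediaChannels", new[] { "Name" });
    }
}
```

The same indexes can of course be created directly in SQL Server Management Studio if the database is not migration-managed.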
Basically, I have a collection of objects; I am chopping it into small collections and doing some work over each small collection on its own thread, simultaneously.
int totalCount = SomeDictionary.Values.ToList().Count;
int singleThreadCount = (int)Math.Round((decimal)(totalCount / 10));
int lastThreadCount = totalCount - (singleThreadCount * 9);
Stopwatch sw = new Stopwatch();
Dictionary<int, Thread> allThreads = new Dictionary<int, Thread>();
List<rCode> results = new List<rCode>();
for (int i = 0; i < 10; i++)
{
    int count = i;
    if (i != 9)
    {
        Thread someThread = new Thread(() =>
        {
            List<rBase> objects = SomeDictionary.Values
                .Skip(count * singleThreadCount)
                .Take(singleThreadCount).ToList();
            List<rCode> result = objects.Where(r => r.ZBox != null)
                .SelectMany(r => r.EffectiveCBox, (r, CBox) => new rCode
                {
                    RBox = r,
                    // A ZBox may refer to an object that can be
                    // shared by many rCode objects,
                    // even on different threads
                    ZBox = r.ZBox,
                    CBox = CBox
                }).ToList();
            results.AddRange(result);
        });
        allThreads.Add(i, someThread);
        someThread.Start();
    }
    else
    {
        Thread someThread = new Thread(() =>
        {
            List<rBase> objects = SomeDictionary.Values
                .Skip(count * singleThreadCount)
                .Take(lastThreadCount).ToList();
            List<rCode> result = objects.Where(r => r.ZBox != null)
                .SelectMany(r => r.EffectiveCBox, (r, CBox) => new rCode
                {
                    RBox = r,
                    // A ZBox may refer to an object that can be
                    // shared by many rCode objects,
                    // even on different threads
                    ZBox = r.ZBox,
                    CBox = CBox
                }).ToList();
            results.AddRange(result);
        });
        allThreads.Add(i, someThread);
        someThread.Start();
    }
}
sw.Start();
while (allThreads.Values.Any(th => th.IsAlive))
{
    if (sw.ElapsedMilliseconds >= 60000)
    {
        results = null;
        allThreads.Values.ToList().ForEach(t => t.Abort());
        sw.Stop();
        break;
    }
}
return results != null ? results.OrderBy(r => r.ZBox.Name).ToList() : null;
So, my issue is that SOMETIMES I get a NullReferenceException while performing the OrderBy operation before returning the results, and I couldn't determine exactly where the exception occurs. I press back, click the same button that does this operation on the same data again, and it works!! If someone can help me identify this issue I would be more than grateful. NOTE: a ZBox may refer to an object that can be shared by many rCode objects, even on different threads; can this be the issue?
I can't pin this down by testing, because the error is not deterministic.
The bug is correctly found in the chosen answer, although I do not agree with that answer. You should switch to using a concurrent collection, in your case a ConcurrentBag or ConcurrentQueue. Some of these are (partially) lock-free for better performance, and they give you more readable code with less of it, since you do not need manual locking.
Your code would also more than halve in size and double in readability if you refrain from manually created threads and manual partitioning:
Parallel.ForEach(objects, MyObjectProcessor);
public void MyObjectProcessor(Object o)
{
// Create result and add to results
}
Use a ParallelOptions object if you want to limit the number of threads with Parallel.ForEach.
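Putting both suggestions together, the whole manual-thread block could collapse to something like this (a sketch against the question's rBase/rCode types, which would replace the return at the end of the original method):

```csharp
// Thread-safe result collection: no manual lock needed for Add.
var results = new ConcurrentBag<rCode>();

Parallel.ForEach(SomeDictionary.Values,
    new ParallelOptions { MaxDegreeOfParallelism = 10 },
    r =>
{
    if (r.ZBox == null) return;
    foreach (var cBox in r.EffectiveCBox)
    {
        // ConcurrentBag.Add is safe to call from many threads at once.
        results.Add(new rCode { RBox = r, ZBox = r.ZBox, CBox = cBox });
    }
});

return results.OrderBy(r => r.ZBox.Name).ToList();
```

Parallel.ForEach also handles the partitioning for you, so the Skip/Take arithmetic and the 10-thread bookkeeping disappear entirely.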
Well, one obvious problem is here:
results.AddRange(result);
where you're updating a list from multiple threads. Try using a lock:
object resultsLock = new object(); // globally visible
...
lock(resultsLock)
{
results.AddRange(result);
}
I suppose the problem is in results = null:
while (allThreads.Values.Any(th => th.IsAlive))
{
    if (sw.ElapsedMilliseconds >= 60000)
    {
        results = null;
        allThreads.Values.ToList().ForEach(t => t.Abort());
    }
}
If the threads do not finish within 60000 ms, results becomes null and you can't call results.OrderBy(r => r.ZBox.Name).ToList(); it throws an exception.
You should add something like this:
if (results != null)
    return results.OrderBy(r => r.ZBox.Name).ToList();
else
    return null;