I'm a beginner in C# and LINQ, and I wrote this query in C#:
var query1 = (from p in behzad.Customer_Care_Database_Analysis_Centers
select p).ToArray();
for (Int64 i = 0; i < query1.Count(); i++)
{
var query2 = (from tt in behzad.Customer_Care_Database_Analysis_DETAILs
where tt.fileid == FILE_ID && tt.code_markaz ==query1[i].code_markaz //"1215" //query1[i].code_markaz.ToString().Trim() //&& tt.code_markaz.ToString().Trim() == query1[i].code_markaz.ToString().Trim()
select new
{
tt.id
}).ToArray();
if (query2.Count() > 0)
{
series1.Points.Add(new SeriesPoint(query1[i].name_markaz, new double[] { query2.Count() }));
counter += 15;
}
}//end for
But the code above is very slow. I have about 1,000,000 records in Customer_Care_Database_Analysis_Centers and about 20 million records in the Customer_Care_Database_Analysis_DETAILs table. What is the best query to replace the code above? Thanks.
Your current code first gets a lot of records into memory, then executes a new query for each record - where you only use the count of items, even though you again get everything.
I think (untested) that the following will perform better:
var query = from center in behzad.Customer_Care_Database_Analysis_Centers
            join details in behzad.Customer_Care_Database_Analysis_DETAILs
                on center.code_markaz equals details.code_markaz
            where details.fileid == FILE_ID
            // one group per center; the inner join already drops centers without details
            group details by new { center.code_markaz, center.name_markaz } into g
            select new { Name = g.Key.name_markaz, Count = g.Count() };
foreach (var point in query)
{
    series1.Points.Add(new SeriesPoint(point.Name, new double[] { point.Count }));
    counter += 15;
}
Instead of a lot of queries, execute just one query that will get just the data needed
Instead of getting everything into memory first (with ToArray()), loop through it as it arrives - this saves a lot of memory
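To illustrate the shape of that single query without a database, here is a minimal in-memory sketch. The class name CenterCounts, the method CountPerCenter and the tuple shapes are hypothetical stand-ins for the two tables, not part of the original code; against a real context the provider would translate the query to SQL rather than run it over lists.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CenterCounts
{
    // Hypothetical stand-ins: a center is (Code, Name), a detail row is (FileId, Code).
    public static List<(string Name, int Count)> CountPerCenter(
        IEnumerable<(string Code, string Name)> centers,
        IEnumerable<(int FileId, string Code)> details,
        int fileId)
    {
        var query = from center in centers
                    // Filter the detail rows first, then match them to their center.
                    join detail in details.Where(d => d.FileId == fileId)
                        on center.Code equals detail.Code
                    // One group per center; centers without matching rows never appear.
                    group detail by new { center.Code, center.Name } into g
                    select (Name: g.Key.Name, Count: g.Count());
        return query.ToList();
    }
}
```

Because the join is an inner join, the `if (query2.Count() > 0)` check from the question is no longer needed: a center only shows up when it has at least one matching detail row.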
I have two large lists and I need to get the difference between them.
The first list comes from another system via a web service; the second list comes from a database (the destination of the data).
I want to find the items in the first list that are not in the second list and insert them into the database (the source of the second list).
Is there another solution with better performance?
Using List.Any(), the process runs for many hours and does not finish.
Using a for loop, the process takes 10 hours or more.
Each list has 1,300,000 records.
newItensForInsert = List1.Where(item1 => !List2.Any(item2 => item1.prop1 == item2.prop1 && item1.prop2 == item2.prop2)).ToList();
//or
for (int i = 0; i < List1.Count; i++)
{
if (!List2.Any(x => x.prop1 == List1[i].prop1 && x.prop2 == List1[i].prop2))
{
ListForInsert.Add(List1[i]);
}
}
//or
ListForInsert = List1.AsParallel().Except(List2.AsParallel(), IEqualityComparer).ToList();
You could use LINQ's Except:
List<object> webservice = new List<object>();
List<object> database = new List<object>();
IEnumerable<object> toPutIntoDatabase = webservice.Except(database);
database.AddRange(toPutIntoDatabase);
EDIT:
You can even use the new PLINQ (parallel LINQ) like this
IEnumerable<object> toPutIntoDatabase = webservice.AsParallel().Except(database.AsParallel());
EDIT:
Maybe you could use a HashSet to speed up lookups.
HashSet<object> databaseHash = new HashSet<object>(database);
foreach (var item in webservice)
{
if (databaseHash.Contains(item) == false)
{
database.Add(item);
}
}
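The question compares items on two properties, so a HashSet of the objects themselves only helps if they define equality. A minimal sketch that hashes a composite key instead is shown here; the Item class, the property types and the names ListDiff and NewItems are assumptions made for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ListDiff
{
    // Hypothetical item type; Prop1 and Prop2 mirror prop1/prop2 from the question.
    public class Item
    {
        public int Prop1 { get; set; }
        public string Prop2 { get; set; }
    }

    // Returns the web-service items whose (Prop1, Prop2) pair is not in the database list.
    public static List<Item> NewItems(IEnumerable<Item> webservice, IEnumerable<Item> database)
    {
        // Build the key set once; value tuples compare by their components.
        var existingKeys = new HashSet<(int, string)>(
            database.Select(d => (d.Prop1, d.Prop2)));

        // Each Contains call is a hash lookup instead of a scan over the whole list.
        return webservice
            .Where(w => !existingKeys.Contains((w.Prop1, w.Prop2)))
            .ToList();
    }
}
```

This turns the nested scan (roughly N x M comparisons) into one pass over each list, at the cost of holding one key per database row in memory.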
If both lists have the same data type, you can use List.Exists.
Otherwise, it is better to go with a join:
var newdata = from c in dblist
join p in list1 on c.Category equals p.Category into ps
from p in ps.DefaultIfEmpty()
The idea is to select the items whose data is not present in dblist.
HashSet<T> is optimized for executing this kind of set operation. In many cases it's worth the effort to create HashSets from Lists and do the set operation on the HashSets. I demonstrated this with a little LINQPad program.
The program creates two lists containing 1,300,000 objects. It uses your method to get the difference (or rather, attempts to, because I ran out of patience), and it uses LINQ's Except and HashSet's ExceptWith, both with an IEqualityComparer. The program is listed below.
The result was:
Lists created: 00:00:00.9221369
Hashsets created: 00:00:00.1057532
Except: 00:00:00.2564191
ExceptWith: 00:00:00.0696830
So creating the HashSets and executing ExceptWith (together about 0.18 s) beat Except (0.26 s).
One caveat: creating HashSets may take too much memory since the large lists already take a fair amount of memory.
void Main()
{
var sw = Stopwatch.StartNew();
var amount = 1300000;
//amount = 50000;
var list1 = Enumerable.Range(0, amount).Select(i => new Demo(i)).ToList();
var list2 = Enumerable.Range(10, amount).Select(i => new Demo(i)).ToList();
sw.Stop();
sw.Elapsed.Dump("Lists created");
sw.Restart();
var hs1 = new HashSet<Demo>(list1, new DemoComparer());
var hs2 = new HashSet<Demo>(list2, new DemoComparer());
sw.Stop();
sw.Elapsed.Dump("Hashsets created");
sw.Restart();
// var list3 = list1.Where(item1 => !list2.Any(item2 => item1.ID == item2.ID)).ToList();
// sw.Stop();
// sw.Elapsed.Dump("Any");
// sw.Restart();
var list4 = list1.Except(list2, new DemoComparer()).ToList();
sw.Stop();
sw.Elapsed.Dump("Except");
sw.Restart();
hs1.ExceptWith(hs2);
sw.Stop();
sw.Elapsed.Dump("ExceptWith");
// list3.Count.Dump();
list4.Count.Dump();
hs1.Count.Dump();
}
// Define other methods and classes here
class Demo
{
public Demo(int id)
{
ID = id;
Name = id.ToString();
}
public int ID { get; set; }
public string Name { get; set; }
}
class DemoComparer : IEqualityComparer<Demo>
{
public bool Equals(Demo x, Demo y)
{
return (x == null && y == null)
|| (x != null && y != null) && x.ID.Equals(y.ID);
}
public int GetHashCode(Demo obj)
{
return obj.ID.GetHashCode();
}
}
Use List.Exists; it can be faster than Any on a List<T>, although each call is still a linear scan of the list.
How can I refactor this code?
Is it possible to compute aktuelKurs and kursDagenFor on the same line?
EDIT 2
if (aktiekurser != null)
{
int idDato = aktiekurser.Last().IdDato;
for (int i = 0; i < antalDage; i++)
{
aktuelKurs = (from a in aktiekurser
where a.IdDato == idDato - i
select a.Lukkekurs
).Sum();
kursDagenFor = (from a in aktiekurser
where a.IdDato == idDato - (i + 1)
select a.Lukkekurs
).Sum();
gnsOp += aktuelKurs > kursDagenFor ? aktuelKurs :0m;
}
}
This isn't very efficient. First, you query each individual sum separately and, second, in each iteration you calculate a sum that was already calculated in the previous iteration.
You can make this much more efficient by querying all required sums in one grouping query, ordered by date (the loop touches the dates from idDato - antalDage up to idDato):
var aktuelKurs = (from a in aktiekurser
                  where a.IdDato >= idDato - antalDage
                  group a by a.IdDato into grp
                  orderby grp.Key
                  select grp.Sum(x => x.Lukkekurs)).ToList();
Now you have a list of decimals, oldest date first, in which you have to determine whether each element is greater than its predecessor and sum the results according to your rule. Note that a date without any rows produces no element here, whereas the original loop treats it as a sum of 0.
var gnsOp = aktuelKurs.Zip(aktuelKurs.Skip(1),
                           (prev, act) => act > prev ? act : 0m).Sum();
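As a small worked example of the Zip step on its own, the sketch below pairs each value with its predecessor and sums the values that rose. The names RisingSum and SumOfRises are made up for illustration; the input is assumed to be ordered oldest first.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RisingSum
{
    // Sums every value that is greater than the value immediately before it.
    public static decimal SumOfRises(IReadOnlyList<decimal> dailySums)
    {
        return dailySums
            // Zip pairs element i with element i + 1 and stops at the shorter sequence.
            .Zip(dailySums.Skip(1), (prev, act) => act > prev ? act : 0m)
            .Sum();
    }
}
```

For the sums 10, 12, 11, 15 the pairs are (10, 12), (12, 11) and (11, 15), which contribute 12, 0 and 15, so the result is 27.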
What I'm trying to do is create a method in a WCF service that goes through one database table (Transactions), grabs every row that has today's date, and adds the totals to my daily sales table, which will have one row per date showing the profit, the daily takings, the expenses, etc.
I've tried to do it like this:
public void CalculateProfit(string Date)
{
decimal takings = 0;// not needed
decimal Expenses = 0;// not needed
using (transactionClassDataContext cont = new transactionClassDataContext())
{
int counter = 0;
DailySale d = new DailySale();
var query = (from q in cont.DailySales where q.Date.Equals(Date) select q);
var query2 = (from r in cont.Transactions where r.Date.Equals(Date) select r);
foreach (var z in query)
{
counter++;
}
if (counter>0)
{
foreach (var y in query2)
{
takings = takings + y.Price;
Expenses = Expenses + 0;
d.Expenses += 0;
d.Takings += y.Price;
d.Profit = d.Takings - d.Expenses;
d.Date = Date;
cont.DailySales.InsertOnSubmit(d);// update the value
cont.SubmitChanges();
}
}
else
{
d.Date = Date;
cont.DailySales.InsertOnSubmit(d);// if there isnt an entry for todays date, add one
cont.SubmitChanges();
}
}
}
}
}
But all it does is throw this error "Cannot add an entity that already exists."
Most similar questions say I need to create a new instance of d inside the foreach, but all that seems to do is add loads of records to my daily sales table, when all I want is one row with an updated total.
Any ideas?
You should move the DailySale d = new DailySale(); into the scope(s) where it is used.
You're adding the same object on every pass of the foreach statement.
1) You can move these lines outside of the foreach:
cont.DailySales.InsertOnSubmit(d);// update the value
cont.SubmitChanges();
2) Or insert the object on the first pass and update the same object on the following passes. With a LINQ to SQL DataContext, an object that is already tracked needs no separate update call; changing its properties and calling SubmitChanges is enough:
cont.SubmitChanges();
You should insert only once
using (transactionClassDataContext cont = new transactionClassDataContext())
{
int counter = 0;
DailySale d = new DailySale();
cont.DailySales.InsertOnSubmit(d);// **INSERT** the value
And update the rest of the times
var query = (from q in cont.DailySales where q.Date.Equals(Date) select q);
var query2 = (from r in cont.Transactions where r.Date.Equals(Date) select r);
foreach (var z in query)
{
counter++;
}
if (counter>0)
{
foreach (var y in query2)
{
takings = takings + y.Price;
Expenses = Expenses + 0;
d.Expenses += 0;
d.Takings += y.Price;
d.Profit = d.Takings - d.Expenses;
d.Date = Date;
cont.SubmitChanges(); // *** left it here, but better to move it outside the foreach
}
The cont.SubmitChanges(); call can even be moved completely to the end, so you have only one transaction.
It turns out I was changing d (the daily sales object) instead of changing var r:
DailySale d = new DailySale();
var query = (from q in cont.DailySales where q.Date.Equals(Date) select q);
foreach (var z in query)
{
counter++;
}
if (counter>0)
{
foreach (var r in query)
{
r.Takings = r.Takings + (decimal)price; // here i was using d instead of r
r.Profit = r.Takings - r.Expenses;
}
cont.SubmitChanges();
Thanks for the help .
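The "one row per date, insert once or update" idea from this thread can be sketched without a database. In this hedged example a plain List plays the role of the DailySales table and Add stands in for InsertOnSubmit; the names DailySales, DailySale and Upsert, and the tuple shape of a transaction, are assumptions for illustration only.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DailySales
{
    // Hypothetical stand-in for the DailySale entity.
    public class DailySale
    {
        public string Date { get; set; }
        public decimal Takings { get; set; }
        public decimal Expenses { get; set; }
        public decimal Profit { get; set; }
    }

    // Ensures there is exactly one row for the date and refreshes its totals.
    public static DailySale Upsert(
        List<DailySale> dailySales,
        IEnumerable<(string Date, decimal Price)> transactions,
        string date)
    {
        // Compute the total in one pass instead of adding inside a loop.
        decimal takings = transactions.Where(t => t.Date == date).Sum(t => t.Price);

        // Reuse the existing row if there is one; otherwise create it once.
        var row = dailySales.FirstOrDefault(s => s.Date == date);
        if (row == null)
        {
            row = new DailySale { Date = date };
            dailySales.Add(row); // in-memory stand-in for InsertOnSubmit
        }

        row.Takings = takings;
        row.Profit = row.Takings - row.Expenses;
        return row;
    }
}
```

Setting Takings to the recomputed total, rather than adding to it, means the method can be called repeatedly for the same date without inflating the numbers.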
I have a table as below. I need to get all the manager IDs for the given user ID.
userid managerid
10 1
9 10
6 9
2 6
4 1
If I pass 2 to my method, I need to get 1, 10, 9 and 6. I have written the query below, which returns only the first-level parents, i.e. it returns only 6 and 9.
public List<int?> mymethod (int userId){
return (from e in mycontext.EmployeeManagers
join e1 in m_context.EmployeeManagers
on e.UserId equals e1.ManagerId
where e1.UserId == userId
select e1.ManagerId).AsQueryable().ToList()
}
How can I modify the query to return the whole manager hierarchy?
Please help.
You cannot do this in a single LINQ expression. You have to run it in a loop.
A better option is to do this in the database and then return the results to LINQ.
See:
Hierarchical data in Linq - options and performance
Fill a Recursive Data Structure from a Self-Referential Database Table
Linq-to-Sql: recursively get children
I would simply run a short loop like this (sorry for invalid capitalization, coded from scratch):
public List<int> GetAllManagers(int userID)
{
    var result = new List<int>();
    int index = 0;
    result.Add(userID); // start with the user (it is removed again below)
    while (index < result.Count)
    {
        int currentId = result[index]; // copy to a local so the query uses a plain value
        var moreData = from e in mycontext.EmployeeManagers
                       where e.UserId == currentId
                       select e.ManagerId;
        foreach (int id in moreData)
            if (!result.Contains(id))
                result.Add(id);
        index++;
    }
    result.RemoveAt(0);
    return result;
}
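or recursively (this assumes the data contains no cycles):
private void AddUniqueIds(List<int> elements, List<int> list)
{
    foreach (int id in elements)
        if (!list.Contains(id))
            list.Add(id);
}
public List<int> GetAllManagers(int userID)
{
    var result = new List<int>();
    // materialize first so the query is finished before the recursive calls run
    var moreData = (from e in mycontext.EmployeeManagers
                    where e.UserId == userID
                    select e.ManagerId).ToList();
    foreach (int id in moreData)
    {
        if (!result.Contains(id))
            result.Add(id);                       // the direct manager
        AddUniqueIds(GetAllManagers(id), result); // and everyone above that manager
    }
    return result;
}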
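If the table is small enough to load once, the chain can also be walked in memory, which avoids one query per level. This is a minimal sketch under the assumption that every user has at most one manager; the names ManagerChain and GetAllManagers and the tuple row shape are hypothetical, and the rows would come from a single query in practice.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ManagerChain
{
    // Walks from userId up the hierarchy and returns the managers, nearest first.
    public static List<int> GetAllManagers(
        IEnumerable<(int UserId, int ManagerId)> rows, int userId)
    {
        // Assumption: one row per user, so UserId is a unique key.
        var managerOf = rows.ToDictionary(r => r.UserId, r => r.ManagerId);

        var result = new List<int>();
        var seen = new HashSet<int> { userId }; // guards against cycles in the data
        int current = userId;

        // Stop when the current id has no manager row or a cycle is detected.
        while (managerOf.TryGetValue(current, out int manager) && seen.Add(manager))
        {
            result.Add(manager);
            current = manager;
        }
        return result;
    }
}
```

With the sample table from the question, starting from user 2 the walk visits 6, then 9, then 10, then 1, and stops because 1 has no row of its own.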
You need to use a different pattern.
Let's say you start with your query:
var query = myContext.EmployeeManagers
Then, you could join it however you want to
for (int i = 0; i < 5; i++)
{
query = query.Join( ... ..., i, ... ); // Can't recall all
// the parameters right now.
}
And then just execute it:
var result = query.ToList();
I am clueless about what happens to the performance of a foreach loop when I iterate over an IEnumerable.
The following code causes a serious performance issue:
foreach (IEdge ed in edcol)
{
IEnumerable<string> row =
from r in dtRow.AsEnumerable()
where (((r.Field<string>("F1") == ed.Vertex1.Name) &&
(r.Field<string>("F2") == ed.Vertex2.Name))
|| ((r.Field<string>("F1") == ed.Vertex2.Name) &&
(r.Field<string>("F2") == ed.Vertex1.Name)))
select r.Field<string>("EdgeId");
int co = row.Count();
//foreach (string s in row)
//{
//}
x++;
}
The outer foreach (IEdge ed in edcol) has about 11,000 iterations to complete.
It runs in a fraction of a second if I remove the line
int co = row.Count();
from the code.
row.Count() has a maximum value of 10 across all iterations.
If I uncomment the
//foreach (string s in row)
//{
//}
it takes about 10 minutes to complete the execution of the code.
Does the IEnumerable type really have such serious performance issues?
This answer is for the implicit question of "how do I make this much faster"? Apologies if that's not actually what you were after, but...
You can go through the rows once, grouping by the names. (I haven't done the ordering like Marc has - I'm just looking up twice when querying :)
var lookup = dtRow.AsEnumerable()
.ToLookup(r => new { F1 = r.Field<string>("F1"),
F2 = r.Field<string>("F2") });
Then:
foreach (IEdge ed in edcol)
{
// Need to check both ways round...
var first = new { F1 = ed.Vertex1.Name, F2 = ed.Vertex2.Name };
var second = new { F1 = ed.Vertex2.Name, F2 = ed.Vertex1.Name };
var firstResult = lookup[first];
var secondResult = lookup[second];
// Due to the way Lookup works, this is quick - much quicker than
// calling query.Count()
var count = firstResult.Count() + secondResult.Count();
var query = firstResult.Concat(secondResult);
foreach (var row in query)
{
...
}
}
At the moment you have O(N*M) performance, which could be problematic if both N and M are large. I would be inclined to pre-compute some of the DataTable info. For example, we could try:
var lookup = dtRow.AsEnumerable().ToLookup(
row => string.Compare(row.Field<string>("F1"),row.Field<string>("F2"))<0
? Tuple.Create(row.Field<string>("F1"), row.Field<string>("F2"))
: Tuple.Create(row.Field<string>("F2"), row.Field<string>("F1")),
row => row.Field<string>("EdgeId"));
then we can iterate that:
foreach(IEdge ed in edcol)
{
var name1 = string.Compare(ed.Vertex1.Name,ed.Vertex2.Name) < 0
? ed.Vertex1.Name : ed.Vertex2.Name;
var name2 = string.Compare(ed.Vertex1.Name,ed.Vertex2.Name) < 0
? ed.Vertex2.Name : ed.Vertex1.Name;
var matches = lookup[Tuple.Create(name1,name2)];
// ...
}
(note I enforced ascending alphabetical pairs in there, for convenience)
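The ordered-pair trick can be tried in isolation with plain tuples in place of DataRow and IEdge. This is a sketch only: the names EdgeLookup, Key and Build and the tuple row shape are assumptions, and ordinal string comparison is used here so that the key order does not depend on the current culture.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EdgeLookup
{
    // Orders the two names so that (a, b) and (b, a) produce the same key.
    public static (string, string) Key(string a, string b) =>
        string.CompareOrdinal(a, b) < 0 ? (a, b) : (b, a);

    // Builds the lookup in one pass over the rows: normalized pair -> edge ids.
    public static ILookup<(string, string), string> Build(
        IEnumerable<(string F1, string F2, string EdgeId)> rows) =>
        rows.ToLookup(r => Key(r.F1, r.F2), r => r.EdgeId);
}
```

Each edge then needs a single indexer call such as `lookup[EdgeLookup.Key(name1, name2)]`, and a key with no matching rows yields an empty sequence rather than throwing.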