iterating through IEnumerable<string> causing serious performance issue - c#

I am clueless about what has happened to the performance of my loop when I iterate through an IEnumerable.
The following code causes a serious performance issue:
foreach (IEdge ed in edcol)
{
IEnumerable<string> row =
from r in dtRow.AsEnumerable()
where (((r.Field<string>("F1") == ed.Vertex1.Name) &&
(r.Field<string>("F2") == ed.Vertex2.Name))
|| ((r.Field<string>("F1") == ed.Vertex2.Name) &&
(r.Field<string>("F2") == ed.Vertex1.Name)))
select r.Field<string>("EdgeId");
int co = row.Count();
//foreach (string s in row)
//{
//}
x++;
}
The outer foreach (IEdge ed in edcol) has about 11,000 iterations to complete.
It runs in a fraction of a second if I remove the line
int co = row.Count();
from the code.
row.Count() has a maximum value of 10 across all iterations.
If I uncomment the
//foreach (string s in row)
//{
//}
it takes about 10 minutes to complete.
Does the IEnumerable type really have such serious performance issues?

This answer is for the implicit question of "how do I make this much faster"? Apologies if that's not actually what you were after, but...
You can go through the rows once, grouping by the names. (I haven't done the ordering like Marc has - I'm just looking up twice when querying :)
var lookup = dtRow.AsEnumerable()
.ToLookup(r => new { F1 = r.Field<string>("F1"),
F2 = r.Field<string>("F2") });
Then:
foreach (IEdge ed in edcol)
{
// Need to check both ways round...
var first = new { F1 = ed.Vertex1.Name, F2 = ed.Vertex2.Name };
var second = new { F1 = ed.Vertex2.Name, F2 = ed.Vertex1.Name };
var firstResult = lookup[first];
var secondResult = lookup[second];
// Due to the way Lookup works, this is quick - much quicker than
// calling query.Count()
var count = firstResult.Count() + secondResult.Count();
var query = firstResult.Concat(secondResult);
foreach (var row in query)
{
...
}
}

At the moment you have O(N*M) performance, which could be problematic if both N and M are large. I would be inclined to pre-compute some of the DataTable info. For example, we could try:
var lookup = dtRow.AsEnumerable().ToLookup(
row => string.Compare(row.Field<string>("F1"),row.Field<string>("F2"))<0
? Tuple.Create(row.Field<string>("F1"), row.Field<string>("F2"))
: Tuple.Create(row.Field<string>("F2"), row.Field<string>("F1")),
row => row.Field<string>("EdgeId"));
then we can iterate that:
foreach (IEdge ed in edcol)
{
var name1 = string.Compare(ed.Vertex1.Name,ed.Vertex2.Name) < 0
? ed.Vertex1.Name : ed.Vertex2.Name;
var name2 = string.Compare(ed.Vertex1.Name,ed.Vertex2.Name) < 0
? ed.Vertex2.Name : ed.Vertex1.Name;
var matches = lookup[Tuple.Create(name1,name2)];
// ...
}
(note I enforced ascending alphabetical pairs in there, for convenience)
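To see the lookup idea in isolation, here is a minimal in-memory sketch. The Row type, the Key and Build helpers, and the sample values are hypothetical stand-ins for the DataTable rows and IEdge objects in the question, and the use of string.CompareOrdinal to order each pair is an assumption of this sketch rather than what the answers above use. The lookup is built once, whereas the original query re-scans every row each time it is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a DataTable row with F1, F2 and EdgeId columns.
public class Row
{
    public string F1, F2, EdgeId;
    public Row(string f1, string f2, string edgeId) { F1 = f1; F2 = f2; EdgeId = edgeId; }
}

public static class EdgeLookupDemo
{
    // Order the two names so that (a, b) and (b, a) produce the same key.
    public static Tuple<string, string> Key(string a, string b)
    {
        return string.CompareOrdinal(a, b) <= 0 ? Tuple.Create(a, b) : Tuple.Create(b, a);
    }

    // One pass over the rows builds the lookup; each edge query is then a hash probe.
    public static ILookup<Tuple<string, string>, string> Build(IEnumerable<Row> rows)
    {
        return rows.ToLookup(r => Key(r.F1, r.F2), r => r.EdgeId);
    }

    public static void Main()
    {
        var rows = new List<Row>
        {
            new Row("A", "B", "e1"),
            new Row("B", "A", "e2"),
            new Row("A", "C", "e3")
        };
        var lookup = Build(rows);

        // Both orientations of the A-B pair land under the same key.
        Console.WriteLine(lookup[Key("B", "A")].Count());
        // A key that is not present yields an empty sequence, not an exception.
        Console.WriteLine(lookup[Key("X", "Y")].Count());
    }
}
```

With 11,000 edges and a table of M rows, this replaces roughly 11,000 full scans of the table with one scan plus 11,000 hash lookups.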

Related

How to find the placement of a List within another List?

I am working with two lists. The first contains a large sequence of strings. The second contains a smaller list of strings. I need to find where the second list exists in the first list.
I worked with enumeration, and due to the large size of the data, this is very slow, I was hoping for a faster way.
List<string> first = new List<string>() { "AAA","BBB","CCC","DDD","EEE","FFF" };
List<string> second = new List<string>() { "CCC","DDD","EEE" };
int x = SomeMagic(first,second);
And I would need x to = 2.
Ok, here is my variant with old-good-for-each-loop:
private int SomeMagic(IEnumerable<string> source, IEnumerable<string> target)
{
/* Some obvious checks for `source` and `target` length / nullity are omitted */
// searched pattern
var pattern = target.ToArray();
// candidates in form `candidate index` -> `checked length`
var candidates = new Dictionary<int, int>();
// iteration index
var index = 0;
// so, lets the magic begin
foreach (var value in source)
{
// check candidates
foreach (var candidate in candidates.Keys.ToArray()) // <- we are going to change this collection
{
var checkedLength = candidates[candidate];
if (value == pattern[checkedLength]) // <- here `checkedLength` is used in sense `nextPositionToCheck`
{
// candidate has match next value
checkedLength += 1;
// check if we are done here
if (checkedLength == pattern.Length) return candidate; // <- exit point
candidates[candidate] = checkedLength;
}
else
// candidate has failed
candidates.Remove(candidate);
}
// check for new candidate
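if (value == pattern[0])
{
// a one-element pattern is already fully matched; without this check
// the next iteration would read pattern[1] and throw
if (pattern.Length == 1) return index;
candidates.Add(index, 1);
}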
index++;
}
// we did everything we could
return -1;
}
We use dictionary of candidates to handle situations like:
var first = new List<string> { "AAA","BBB","CCC","CCC","CCC","CCC","EEE","FFF" };
var second = new List<string> { "CCC","CCC","CCC","EEE" };
If you are willing to use MoreLinq then consider using Window:
var windows = first.Window(second.Count);
var result = windows
.Select((subset, index) => new { subset, index = (int?)index })
.Where(z => Enumerable.SequenceEqual(second, z.subset))
.Select(z => z.index)
.FirstOrDefault();
Console.WriteLine(result);
Console.ReadLine();
Window will allow you to look at 'slices' of the data in chunks (based on the length of your second list). Then SequenceEqual can be used to see if the slice is equal to second. If it is, the index can be returned. If it doesn't find a match, null will be returned.
I implemented SomeMagic as below. It returns -1 if no match is found; otherwise it returns the index in the first list where the second list starts.
private int SomeMagic(List<string> first, List<string> second)
{
if (first.Count < second.Count)
{
return -1;
}
for (int i = 0; i <= first.Count - second.Count; i++)
{
List<string> partialFirst = first.GetRange(i, second.Count);
if (Enumerable.SequenceEqual(partialFirst, second))
return i;
}
return -1;
}
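The GetRange version above copies a new list for every start position. A variant that compares in place avoids those allocations. This is only a sketch: the name IndexOfSubList and the choice to return 0 for an empty pattern are assumptions, not part of the original answers.

```csharp
using System;
using System.Collections.Generic;

public static class SubListSearch
{
    // Returns the index in `source` where `pattern` first occurs as a
    // contiguous run, or -1 if it does not occur.
    public static int IndexOfSubList(IList<string> source, IList<string> pattern)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (pattern == null) throw new ArgumentNullException("pattern");
        if (pattern.Count == 0) return 0; // assumption: an empty pattern matches at index 0

        for (int start = 0; start <= source.Count - pattern.Count; start++)
        {
            // Compare element by element without copying a sub-list.
            int offset = 0;
            while (offset < pattern.Count && source[start + offset] == pattern[offset])
                offset++;

            if (offset == pattern.Count) return start;
        }
        return -1;
    }

    public static void Main()
    {
        var first = new List<string> { "AAA", "BBB", "CCC", "DDD", "EEE", "FFF" };
        var second = new List<string> { "CCC", "DDD", "EEE" };
        Console.WriteLine(IndexOfSubList(first, second)); // prints 2
    }
}
```

The worst case is still proportional to the product of the two lengths, but for typical data the inner loop stops at the first mismatch.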
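You can use the Intersect extension method from the System.Linq namespace. Note that it returns the elements the two lists have in common, not the index at which the second list starts:
var CommonList = Listfirst.Intersect(Listsecond);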

What is the best and fastest way to calculate this query?

I'm a beginner in C# and LINQ, and I wrote this query in C#:
var query1 = (from p in behzad.Customer_Care_Database_Analysis_Centers
select p).ToArray();
for (Int64 i = 0; i < query1.Count(); i++)
{
var query2 = (from tt in behzad.Customer_Care_Database_Analysis_DETAILs
where tt.fileid == FILE_ID && tt.code_markaz ==query1[i].code_markaz //"1215" //query1[i].code_markaz.ToString().Trim() //&& tt.code_markaz.ToString().Trim() == query1[i].code_markaz.ToString().Trim()
select new
{
tt.id
}).ToArray();
if (query2.Count() > 0)
{
series1.Points.Add(new SeriesPoint(query1[i].name_markaz, new double[] { query2.Count() }));
counter += 15;
}
}//end for
But the code above is very slow. I have about 1,000,000 rows in Customer_Care_Database_Analysis_Centers and about 20 million records in the Customer_Care_Database_Analysis_DETAILs table. What is the best query to replace the code above? Thanks.
Your current code first gets a lot of records into memory, then executes a new query for each record - where you only use the count of items, even though you again get everything.
I think (untested) that the following will perform better:
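var query = from center in behzad.Customer_Care_Database_Analysis_Centers
// group join: "matches" holds every detail row for this center and file,
// so Any() and Count() apply to the group rather than to a single row
join details in behzad.Customer_Care_Database_Analysis_DETAILs
.Where(d => d.fileid == FILE_ID)
on center.code_markaz equals details.code_markaz into matches
where matches.Any()
select new { Name = center.name_markaz, Count = matches.Count() };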
foreach(var point in query)
{
series1.Points.Add(new SeriesPoint(point.Name, new double[] { point.Count }));
counter += 15;
}
Instead of a lot of queries, execute just one query that will get just the data needed
Instead of getting everything into memory first (with ToArray()), loop through it as it arrives - this saves a lot of memory
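The shape of such a single query can be tried out against in-memory data. The sketch below uses hypothetical Center and Detail classes in place of the two database tables and runs as LINQ to Objects; how a particular LINQ provider translates the same query to SQL is not shown here and would need to be checked separately.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the two database tables.
public class Center { public string Code; public string Name; }
public class Detail { public string Code; public int FileId; }

public static class CenterCounts
{
    // For each center with at least one detail row for the given file,
    // returns the center name and the number of matching detail rows.
    public static List<KeyValuePair<string, int>> CountPerCenter(
        IEnumerable<Center> centers, IEnumerable<Detail> details, int fileId)
    {
        var query =
            from c in centers
            join d in details.Where(x => x.FileId == fileId)
                on c.Code equals d.Code into matches   // group join
            where matches.Any()
            select new KeyValuePair<string, int>(c.Name, matches.Count());
        return query.ToList();
    }

    public static void Main()
    {
        var centers = new[]
        {
            new Center { Code = "10", Name = "North" },
            new Center { Code = "20", Name = "South" }
        };
        var details = new[]
        {
            new Detail { Code = "10", FileId = 1 },
            new Detail { Code = "10", FileId = 1 },
            new Detail { Code = "20", FileId = 2 }
        };
        foreach (var pair in CountPerCenter(centers, details, 1))
            Console.WriteLine(pair.Key + ": " + pair.Value);
    }
}
```

Only the name and the count leave the query, which is the point of the answer above: the detail rows themselves are never materialized by the caller.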

Join two List<T> with foreach loop

I am attempting to move slightly away from LINQ which has proven very useful overall, but also quite difficult to read at times.
I used to use LINQ to perform joins (full outer join) but would prefer to do so using for/foreach loops for their simplicity. I just converted one LINQ statement (not PLINQ) into a nested foreach loop and the performance took a severe hit. What used to take seconds is now taking around a minute, see code below.
foreach (var p in PortfolioELT)
{
double meanloss;
double expvalue;
double stddevc;
double stddevi;
bool matched = false;
foreach (var a in AccountELT)
{
if (a.eventid == p.eventid)
{ DO SOME MATH HERE <-----
Any ideas on either
Why this is slower than LINQ Join and
How can I speed it up?
The program fairly obviously does what it needs to, but is too slow.
EDIT:
OLD CODE FULL
public static ConcurrentList<Event> CreateNewELTSUB(IList<Event> AccountELT, IList<Event> PortfolioELT)
{
if (AccountELT == null)
{
return (ConcurrentList<Event>)PortfolioELT;
}
else
{
//Subtract the Account ELT from the Portfolio ELT
var newELT = from p in PortfolioELT
join a in AccountELT
on p.eventid equals a.eventid into g
from e in g.DefaultIfEmpty()
select new
{
EventID = p.eventid,
Rate = p.rate,
meanloss = p.meanloss - (e == null ? 0d : e.meanloss),
expValue = p.expValue - (e == null ? 0d : e.expValue),
stddevc = Math.Sqrt(Math.Pow(p.stddevc, 2) - (e == null ? 0d : Math.Pow(e.stddevc, 2))),
stddevi = Math.Sqrt(Math.Pow(p.stddevi, 2) - (e == null ? 0d : Math.Pow(e.stddevi, 2)))
};
ConcurrentList<Event> list = new ConcurrentList<Event>();
foreach (var x in newELT)
{
list.Add(new Event(x.meanloss, x.EventID, x.expValue, x.Rate, x.stddevc, x.stddevi));
}
return list;
}
}
NEW CODE FULL:
public static ConcurrentList<Event> CreateNewELTSUB(IList<Event> AccountELT, IList<Event> PortfolioELT)
{
if (AccountELT == null)
{
return (ConcurrentList<Event>)PortfolioELT;
}
else
{
//Subtract the Account ELT from the Portfolio ELT
ConcurrentList<Event> newlist = new ConcurrentList<Event>();
//Outer Join on Portfolio ELT
foreach (var p in PortfolioELT)
{
double meanloss;
double expvalue;
double stddevc;
double stddevi;
bool matched = false;
foreach (var a in AccountELT)
{
if (a.eventid == p.eventid)
{
matched = true;
meanloss = p.meanloss - a.meanloss;
expvalue = p.expValue - a.expValue;
stddevc = Math.Sqrt((Math.Pow(p.stddevc, 2)) - (Math.Pow(a.stddevc, 2)));
stddevi = Math.Sqrt((Math.Pow(p.stddevi, 2)) - (Math.Pow(a.stddevi, 2)));
newlist.Add(new Event(meanloss, p.eventid, expvalue, p.rate, stddevc, stddevi));
}
else if (a.eventid != p.eventid) //Outer Join on Account
{
newlist.Add(a);
}
}
if (!matched)
{
newlist.Add(p);
}
}
return newlist;
}
Why this is slower than LINQ Join and
I'm skipping answering this on purpose.
How can I speed it up?
You're looping over the entire AccountELT collection for every PortfolioELT item. You should loop over one, and have the other converted to a Dictionary to make finding a specific record easier. Something like:
var accountELTIdx = AccountELT.ToDictionary(k => k.eventid);
then
foreach (var p in PortfolioELT)
{
double meanloss;
double expvalue;
double stddevc;
double stddevi;
bool matched = false;
Event acct;
if (accountELTIdx.TryGetValue(p.eventid, out acct)) // one lookup instead of ContainsKey plus indexer
{
// some maths
}
....
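Put together, the dictionary approach might look like the sketch below. The Evt class is a simplified, hypothetical stand-in for the Event type in the question (only two numeric fields are kept), and the code reproduces the behaviour of the original LINQ query, a left outer join on the portfolio, rather than the nested loops. It assumes event ids are unique within the account list, because ToDictionary throws on duplicate keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified, hypothetical stand-in for the Event type in the question.
public class Evt
{
    public int EventId;
    public double MeanLoss;
    public double StdDevC;
    public Evt(int eventId, double meanLoss, double stdDevC)
    {
        EventId = eventId; MeanLoss = meanLoss; StdDevC = stdDevC;
    }
}

public static class EltSubtract
{
    public static List<Evt> Subtract(IList<Evt> portfolio, IList<Evt> account)
    {
        // Index the account events once; assumes event ids are unique.
        Dictionary<int, Evt> byId = account.ToDictionary(a => a.EventId);

        var result = new List<Evt>(portfolio.Count);
        foreach (Evt p in portfolio)
        {
            Evt a;
            if (byId.TryGetValue(p.EventId, out a))
            {
                // Matched: subtract the account values from the portfolio values.
                result.Add(new Evt(
                    p.EventId,
                    p.MeanLoss - a.MeanLoss,
                    Math.Sqrt(p.StdDevC * p.StdDevC - a.StdDevC * a.StdDevC)));
            }
            else
            {
                // Unmatched portfolio events pass through unchanged.
                result.Add(p);
            }
        }
        return result;
    }

    public static void Main()
    {
        var portfolio = new List<Evt> { new Evt(1, 10, 5), new Evt(2, 7, 1) };
        var account = new List<Evt> { new Evt(1, 4, 3) };
        foreach (Evt e in Subtract(portfolio, account))
            Console.WriteLine(e.EventId + " " + e.MeanLoss + " " + e.StdDevC);
    }
}
```

Each portfolio event now costs one hash lookup instead of a scan of the whole account list, which is also roughly what a LINQ to Objects join does internally.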
You're creating local variables every iteration, that may or may not ever be used.
double meanloss;
double expvalue;
double stddevc;
double stddevi;
bool matched = false;
You are doing a linear search for matching event IDs. If even one of the lists is ordered by "eventid", you could use a binary search instead of the wasted effort of a full linear scan.
foreach (var a in AccountELT)
{
if (a.eventid == p.eventid)

Creating a two-dimensional array

I am trying to create a two dimensional array and I am getting so confused. I was told by a coworker that I need to create a dictionary within a dictionary for the array list but he couldn't stick around to help me.
I have been able to create the first array, which lists the programs like this:
+ project 1
+ project 2
+ project 3
+ project 4
The code that accomplishes this task is below-
var PGList = from x in db.month_mapping
where x.PG_SUB_PROGRAM == SP
select x;
//select x.PG.Distinct().ToArray();
var PGRow = PGList.Select(x => new { x.PG }).Distinct().ToArray();
So that takes care of my vertical array and now I need to add my horizontal array so that I can see the total amount spent in each accounting period. So the final output would look like this but without the dashes of course.
+ program 1-------100---200---300---400---500---600---700---800---900---1000---1100---1200
+ program 2-------100---200---300---400---500---600---700---800---900---1000---1100---1200
+ program 3-------100---200---300---400---500---600---700---800---900---1000---1100---1200
+ program 4-------100---200---300---400---500---600---700---800---900---1000---1100---1200
I have tried to use a foreach to cycle through the accounting periods, but it doesn't work. I think I might be on the right track, and I was hoping SO could provide some guidance or at the very least a tutorial for me to follow. I have posted the code that I have written so far for the second array below. I am using C# and MVC 3. You might notice that there is no dictionary within a dictionary. If my coworker is correct, how would I do something like that? I took a look at this question, using dictionary as a key in other dictionary, but I don't understand how I would use it in this situation.
Dictionary<string, double[]> MonthRow = new Dictionary<string, double[]>();
double[] PGContent = new double[12];
string lastPG = null;
foreach (var item in PGRow)
{
if (lastPG != item.PG)
{
PGContent = new double[12];
}
var MonthList = from x in db.Month_Web
where x.PG == PG
group x by new { x.ACCOUNTING_PERIOD, x.PG, x.Amount } into pggroup
select new { accounting_period = pggroup.Key.ACCOUNTING_PERIOD, amount = pggroup.Sum(x => x.Amount) };
foreach (var P in MonthList)
{
int accounting_period = int.Parse(P.accounting_period) - 1;
PAContent[accounting_period] = (double)P.amount;
MonthRow[item.PG] = PGContent;
lastPG = item.PG;
}
I hope I have clearly explained my issue, please feel free to ask for any clarification needed as I need to solve this problem and will be checking back often. Thanks for your help!
hope this helps.
// sample data
var data = new Dictionary<string, List<int>>();
data.Add("program-1", new List<int>() { 100, 110, 130 });
data.Add("program-2", new List<int>() { 200, 210, 230 });
data.Add("brogram-3", new List<int>() { 300, 310, 330 });
// query data
var newData = (from x in data
where x.Key.Contains("pro")
select x).ToDictionary(v => v.Key, v=>v.Value);
// display selected data
foreach (var kv in newData)
{
Console.Write(kv.Key);
foreach (var val in kv.Value)
{
Console.Write(" ");
Console.Write(val.ToString());
}
Console.WriteLine();
}
output is:
program-1 100 110 130
program-2 200 210 230
Don't try to use anonymous types or LINQ projection to create new data types, especially if you're a beginner, you will just get confused. If you want a specialized data type, define one; e.g.:
public class Account
{
public string Name { get; private set; }
public decimal[] MonthAmount { get; private set; }
readonly int maxMonths = 12;
public Account(string name, ICollection<decimal> monthAmounts)
{
if (name == null)
throw new ArgumentNullException("name");
if (monthAmounts == null)
throw new ArgumentNullException("monthAmounts");
if (monthAmounts.Count > maxMonths)
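// the one-argument constructor takes a parameter name, so pass the name and the message separately
throw new ArgumentOutOfRangeException("monthAmounts", string.Format("monthAmounts must be <= {0}", maxMonths));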
this.Name = name;
this.MonthAmount = new decimal[maxMonths];
int i = 0;
foreach (decimal d in monthAmounts)
{
this.MonthAmount[i] = d;
i++;
}
}
}
Use instances of this type directly, you do not have to convert them to arrays, dictionaries, lists, or anything else:
var accountPeriods = new List<Account>();
accountPeriods.Add(new Account("program-1", new decimal[] { 1, 2, 3, 4 }));
You can use LINQ or whatever to query or alter instances of your new type:
foreach (Account a in accountPeriods)
foreach (decimal d in a.MonthAmount)
DoSomethingWith(d);
That should be enough to get you started.
I want to thank @Ray Cheng and @Dour High Arch for their help, but I have figured out another way to accomplish this task, and I wanted to post my code so that the next person having the same trouble can solve their problem faster.
Above, I split my code into more manageable sections to explain my problem as clearly as I could. The code below has all those parts combined so you can see the big picture. This code returns an array that contains the program and the amounts for every month.
public virtual ActionResult getAjaxPGs(string SP = null)
{
if (SP != null)
{
var PGList = from x in db.month_mapping
where x.PG_SUB_PROGRAM == SP
select x;
var PGRow = PGList.Select(x => new { x.PG }).Distinct().ToArray();
float[] PGContent = new float[12];
Dictionary<string,float[]> MonthRow = new Dictionary<string, float[]>();
foreach (var item in PGRow)
{
PGContent = new float[12];
var MonthList = from x in db.month_Web
where x.PG == item.PG
group x by new { x.ACCOUNTING_PERIOD, x.PG, x.Amount } into pggroup
select new { accounting_period = pggroup.Key.ACCOUNTING_PERIOD, amount = pggroup.Sum(x => x.Amount) };
foreach (var mon in MonthList)
{
int accounting_period = int.Parse(mon.accounting_period) - 1;
PGContent[accounting_period] = (float)mon.amount/1000000;
}
MonthRow[item.PG] = PGContent;
}
return Json(MonthRow, JsonRequestBehavior.AllowGet);
}
return View();
}
This code worked great for me since I am pulling from a LINQ to SQL query instead of adding data directly into the code. My problems stemmed mainly from putting the data pulls outside of the foreach loops, so it only pulled one piece of data from SQL instead of all twelve months. I hope this helps someone else who is trying to pull data from SQL data sources into multidimensional arrays.
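The same program-by-month table can also be built in one pass over already-loaded data. The following is a sketch only: the Entry class and the Pivot method are hypothetical names, the data is in memory rather than coming from LINQ to SQL, and it assumes accounting periods are numbered 1 to 12.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical flat record: one amount for one program in one accounting period.
public class Entry
{
    public string Program;
    public int Period;     // assumed to be in the range 1..12
    public double Amount;
}

public static class MonthTable
{
    // Builds one 12-element row per program, summing amounts per period.
    public static Dictionary<string, double[]> Pivot(IEnumerable<Entry> entries)
    {
        var table = new Dictionary<string, double[]>();
        foreach (Entry e in entries)
        {
            double[] row;
            if (!table.TryGetValue(e.Program, out row))
            {
                row = new double[12];      // a fresh row the first time a program is seen
                table[e.Program] = row;
            }
            row[e.Period - 1] += e.Amount; // amounts in the same period are added together
        }
        return table;
    }

    public static void Main()
    {
        var entries = new List<Entry>
        {
            new Entry { Program = "program 1", Period = 1, Amount = 100 },
            new Entry { Program = "program 1", Period = 1, Amount = 50 },
            new Entry { Program = "program 1", Period = 2, Amount = 200 },
            new Entry { Program = "program 2", Period = 12, Amount = 1200 }
        };
        foreach (var pair in Pivot(entries))
            Console.WriteLine(pair.Key + ": " + string.Join(" ", pair.Value));
    }
}
```

A Dictionary keyed by program with a fixed-size array per row is enough here; a dictionary inside a dictionary would only be needed if the set of periods were not known in advance.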

Change the foreach loop into the LINQ query

I'm stuck on a simple scenario. I have a List<string> object, and all of its items have this form:
item_1_2_generatedGUID //generatedGUID is Guid.NewGuid()
but there may be much more numbers
item_1_2_3_4_5_generatedGUID etc
Now I'm wondering how to turn this loop into a LINQ query. Any ideas?
string one = "1"; //an example
string two = "2"; //an example
foreach (var item in myStringsList)
{
string[] splitted = item.Split(new char[] { '_' },
StringSplitOptions.RemoveEmptyEntries);
if(splitted.Length >= 3)
{
if(splitted[1] == one && splitted[2] == two)
{
resultList.Add(item);
}
}
}
var result = from s in lst
let spl = s.Split('_')
where spl.Length >= 3 && spl[1] == one && spl[2] == two
select s;
Try this:
var query = from item in myStringsList
let splitted = item.Split(new[] { '_' }, StringSplitOptions.RemoveEmptyEntries)
where splitted.Length >= 3
where splitted[1] == one && splitted[2] == two
select item;
var resultList = query.ToList();
This is a different approach:
var items = myStringsList.
Where(x => x.Substring(x.IndexOf("_") + 1).StartsWith(one + "_" + two + "_"));
The + 1 skips the first underscore, so the substring starts at the second segment rather than at "_".
What it does is:
Removes the first item (that's the substring for). In your example, it should be "1_2_3_4_5_generatedGUID"
Checks the string starts with what you are expecting. In your example: 1_2_
Edited: Added the pattern for anything at the first "position"
var result = myStringsList.Where(i => Regex.IsMatch(i, "^[^_]+_1_2_")); // [^_]+ matches a first segment of any length
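For a quick check of the query-syntax version against sample input, here is a small self-contained sketch. The class and method names (SegmentFilter, Filter) and the sample strings are made up for illustration; the filter itself is the same split-and-compare logic as the loop in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SegmentFilter
{
    // Keeps the items whose second and third underscore-separated
    // segments equal `one` and `two`.
    public static List<string> Filter(IEnumerable<string> items, string one, string two)
    {
        var query =
            from item in items
            let parts = item.Split(new[] { '_' }, StringSplitOptions.RemoveEmptyEntries)
            where parts.Length >= 3 && parts[1] == one && parts[2] == two
            select item;
        return query.ToList();
    }

    public static void Main()
    {
        var items = new List<string>
        {
            "item_1_2_abc",
            "item_1_2_3_4_5_def",
            "item_2_1_ghi",   // segments in the wrong order
            "item_1"          // too few segments
        };
        foreach (string match in Filter(items, "1", "2"))
            Console.WriteLine(match);
    }
}
```

The length check comes first in the where clause, so the indexers are never evaluated for strings with fewer than three segments.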
