Adding 90000 XElement to XDocument - c#

I have a Dictionary<int, MyClass>.
It contains 100,000 items.
10,000 of the items have a value populated, while 90,000 are null.
I have this code:
var nullitems = MyInfoCollection.Where(x => x.Value == null).ToList();
nullitems.ForEach(x => LogMissedSequenceError(x.Key + 1));
private void LogMissedSequenceError(long SequenceNumber)
{
DateTime recordTime = DateTime.Now;
var errors = MyXDocument.Descendants("ERRORS").FirstOrDefault();
if (errors != null)
{
errors.Add(
new XElement("ERROR",
new XElement("DATETIME", DateTime.Now.ToString("dd/MM/yyyy HH:mm:ss:fff")),
new XElement("DETAIL", "No information was read for expected sequence number " + SequenceNumber),
new XAttribute("TYPE", "MISSED"),
new XElement("PAGEID", SequenceNumber)
)
);
}
}
This takes about 2 minutes to complete. I can't find where the bottleneck is, or tell whether this timing sounds about right.
Can anyone see why it's taking so long?

If your MyInfoCollection is huge, I wouldn't call ToList() on it just so you can use the ForEach method. Calling ToList() creates and populates a huge list. I'd remove the ToList() call and turn the .ForEach into a foreach statement, or write a ForEach extension method for IEnumerable<T>.
Then profile it and see how long it takes. Another thing to do is move the lookup and null check of the ERRORS element out of the method. If the element isn't there, don't run the foreach loop above at all. That way you null check it one time instead of 90,000 times.
Plus as Michael Stum pointed out, I'd define a string to hold the value DateTime.Now.ToString("dd/MM/yyyy HH:mm:ss:fff"), then reference it or pass it in. Plus, you don't even use this call:
DateTime recordTime = DateTime.Now;
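The suggestions above can be sketched roughly as follows: look up the ERRORS element once, format the timestamp once, and iterate with a plain foreach. The class and method names and the Dictionary<int, string> value type are assumptions made for illustration, not the asker's actual types. Note that every error then shares a single timestamp.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class MissedSequenceLogger
{
    // Hypothetical helper: adds one ERROR element per null entry
    // and returns how many elements were added.
    public static int AddMissedErrors(XDocument doc, Dictionary<int, string> items)
    {
        // Find the container once instead of once per missing item.
        var errors = doc.Descendants("ERRORS").FirstOrDefault();
        if (errors == null) return 0;

        // Format the timestamp once and reuse it for every element.
        string stamp = DateTime.Now.ToString("dd/MM/yyyy HH:mm:ss:fff");
        int added = 0;

        foreach (var pair in items)
        {
            if (pair.Value != null) continue;

            long sequenceNumber = pair.Key + 1;
            errors.Add(new XElement("ERROR",
                new XAttribute("TYPE", "MISSED"),
                new XElement("DATETIME", stamp),
                new XElement("DETAIL",
                    "No information was read for expected sequence number " + sequenceNumber),
                new XElement("PAGEID", sequenceNumber)));
            added++;
        }
        return added;
    }
}
```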

This is what I would most likely do.
private void BuildErrorNodes()
{
const string nodeFormat = @"<ERROR TYPE=""MISSED""><DATETIME>{0}</DATETIME><DETAIL>No information was read for expected sequence number {1}</DETAIL><PAGEID>{1}</PAGEID></ERROR>";
var sb = new StringBuilder("<ERRORS>");
foreach (var item in MyInfoCollection)
{
if (item.Value == null)
{
sb.AppendFormat(
nodeFormat,
DateTime.Now.ToString("dd/MM/yyyy HH:mm:ss:fff"),
item.Key + 1
);
}
}
sb.Append("</ERRORS>");
var errorsNode = MyXDocument.Descendants("ERRORS").FirstOrDefault();
errorsNode.ReplaceWith(XElement.Parse(sb.ToString()));
}

How about replacing the method call with a LINQ query?
static void Main(string[] args)
{
var MyInfoCollection = (from key in Enumerable.Range(0, 100000)
let value = (MoreRandom() % 10 != 0)
? (string)null
: "H"
select new { Value = value, Key = key }
).ToDictionary(k => k.Key, v => v.Value);
var MyXDocument = new XElement("ROOT",
new XElement("ERRORS")
);
var sw = Stopwatch.StartNew();
//===
var errorTime = DateTime.Now.ToString("dd/MM/yyyy HH:mm:ss:fff");
var addedIndex = MyInfoCollection.Select((item, index) =>
new
{
Value = item.Value,
Key = item.Key,
Index = index
});
var errorQuery = from item in addedIndex
where string.IsNullOrEmpty(item.Value)
let sequenceNumber = item.Key + 1
let detail = "No information was read for expected " +
"sequence number " + sequenceNumber
select new XElement("ERROR",
new XElement("DATETIME", errorTime),
new XElement("DETAIL", detail),
new XAttribute("TYPE", "MISSED"),
new XElement("PAGEID", sequenceNumber)
);
var errors = MyXDocument.Descendants("ERRORS").FirstOrDefault();
if (errors != null)
errors.Add(errorQuery);
//===
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds); //623
}
static RandomNumberGenerator rand = RandomNumberGenerator.Create();
static int MoreRandom()
{
var buff = new byte[1];
rand.GetBytes(buff);
return buff[0];
}


creating array of bad names to check and replace in c#

I'm looking to create a method that loops through a list and replaces matched values with a new value. I have something working below, but it doesn't really follow the DRY principle and looks ugly.
How could I create a dictionary of value pairs that would hold the values to match and their replacements?
var match = acreData.data;
foreach(var i in match)
{
if (i.county_name == "DE KALB")
{
i.county_name = "DEKALB";
}
if (i.county_name == "DU PAGE")
{
i.county_name = "DUPAGE";
}
}
For the values in your question, you can use LINQ and Replace:
var match = acreData.data.ToList();
match.ForEach(x =>
x.county_name = x.county_name.Replace(" ", "")
);
Or you can create a mapping table that maps each bad value to its replacement, as @user2864740 says.
Dictionary<string, string> dict = new Dictionary<string, string>();
dict.Add("DE KALB", "DEKALB");
dict.Add("DU PAGE", "DUPAGE");
var match = acreData.data;
string val = string.Empty;
foreach (var i in match)
{
if (dict.TryGetValue(i.county_name, out val))
i.county_name = val;
}
If this were my problem and a county could have more than one common misspelling, I would create a class to hold the correct name and the common misspellings. Then you could easily determine whether a misspelling exists and correct it. Something like this:
public class County
{
public string CountyName { get; set; }
public List<string> CommonMisspellings { get; set; }
public County()
{
CommonMisspellings = new List<string>();
}
}
Usage:
//most likely populate from db
var counties = new List<County>();
var dekalb = new County { CountyName = "DEKALB" };
dekalb.CommonMisspellings.Add("DE KALB");
dekalb.CommonMisspellings.Add("DE_KALB");
counties.Add(dekalb); // without this the list stays empty and nothing matches
var test = "DE KALB";
if (counties.Any(c => c.CommonMisspellings.Contains(test)))
{
test = counties.First(c => c.CommonMisspellings.Contains(test)).CountyName;
}
If you are simply removing the spaces from every word in a list, you can use the following:
var newList = match.ConvertAll(word => word.Replace(" ", ""));
ConvertAll returns a new list.
Also, I suggest not using variable names like i, j, k, etc., but names like temp instead.
Sample code below:
var oldList = new List<string> {"DE KALB", "DE PAGE"};
var newList = oldList.ConvertAll(word => word.Replace(" ", ""));
We can try removing all characters except letters and apostrophes (Cote d'Ivoire has one):
...
i.country_name = String.Concat(i.country_name
.Where(c => char.IsLetter(c) || c == '\''));
...
I made a comment under @Kevin's answer, and it seems it needs further explanation. Sequential searching in a list does not scale well, and unfortunately for Kevin that is not my opinion: asymptotic computational complexity is math. While searching in a dictionary is more or less O(1), searching in a list is O(n). To show the practical impact for a solution with 100 countries with 100 misspellings each, let's run a test:
public class Country
{
public string CountryName { get; set; }
public List<string> CommonMisspellings { get; set; }
public Country()
{
CommonMisspellings = new List<string>();
}
}
static void Main()
{
var counties = new List<Country>();
Dictionary<string, string> dict = new Dictionary<string, string>();
Random rnd = new Random();
List<string> allCountryNames = new List<string>();
List<string> allMissNames = new List<string>();
for (int state = 0; state < 100; ++state)
{
string countryName = state.ToString() + rnd.NextDouble();
allCountryNames.Add(countryName);
var country = new Country { CountryName = countryName };
counties.Add(country);
for (int miss = 0; miss < 100; ++miss)
{
string missname = countryName + miss;
allMissNames.Add(missname);
country.CommonMisspellings.Add(missname);
dict.Add(missname, countryName);
}
}
List<string> testNames = new List<string>();
for (int i = 0; i < 100000; ++i)
{
if (rnd.Next(20) == 1)
{
testNames.Add(allMissNames[rnd.Next(allMissNames.Count)]);
}
else
{
testNames.Add(allCountryNames[rnd.Next(allCountryNames.Count)]);
}
}
System.Diagnostics.Stopwatch st = new System.Diagnostics.Stopwatch();
st.Start();
List<string> repairs = new List<string>();
foreach (var test in testNames)
{
if (counties.Any(c => c.CommonMisspellings.Contains(test)))
{
repairs.Add(counties.First(c => c.CommonMisspellings.Contains(test)).CountryName);
}
}
st.Stop();
Console.WriteLine("List approach: " + st.ElapsedMilliseconds.ToString() + "ms");
st = new System.Diagnostics.Stopwatch();
st.Start();
List<string> repairsDict = new List<string>();
foreach (var test in testNames)
{
if (dict.TryGetValue(test, out var val))
{
repairsDict.Add(val);
}
}
st.Stop();
Console.WriteLine("Dict approach: " + st.ElapsedMilliseconds.ToString() + "ms");
Console.WriteLine("Repaired count: " + repairs.Count
+ ", check " + (repairs.SequenceEqual(repairsDict) ? "OK" : "ERROR"));
Console.ReadLine();
}
And the result is
List approach: 7264ms
Dict approach: 4ms
Repaired count: 4968, check OK
The list approach is about 1800x slower, that is, more than a thousand times slower in this case. The results are as expected. Whether that is a problem is another question; it depends on the concrete usage pattern in the concrete application and is out of scope for this post.
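The two approaches can be combined: keep a class per county for maintainability, then flatten it into a dictionary once for fast lookups. The following is a minimal sketch under that assumption; CountyLookup, Build, and Normalize are hypothetical names, and the case-insensitive comparer is an illustrative choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class County
{
    public string CountyName { get; set; }
    public List<string> CommonMisspellings { get; } = new List<string>();
}

public static class CountyLookup
{
    // Flattens every (misspelling -> correct name) pair into one dictionary.
    // ToDictionary throws if the same misspelling is listed twice.
    public static Dictionary<string, string> Build(IEnumerable<County> counties)
    {
        return counties
            .SelectMany(c => c.CommonMisspellings,
                        (c, miss) => new { Misspelling = miss, c.CountyName })
            .ToDictionary(x => x.Misspelling, x => x.CountyName,
                          StringComparer.OrdinalIgnoreCase);
    }

    // Returns the corrected name, or the input unchanged when it is not a known misspelling.
    public static string Normalize(Dictionary<string, string> lookup, string name)
    {
        return lookup.TryGetValue(name, out var corrected) ? corrected : name;
    }
}
```

Building the dictionary costs one pass over all misspellings; after that each correction is a single hash lookup instead of a scan over every county.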

I have two large lists and I need get the diff between them

I have two large lists and I need to get the diff between them.
The first list comes from another system via a web service; the second list comes from a database (the destination of the data).
I will compare them, take the items from the first list that are not in the second list, and insert them into the database (the source of the second list).
Is there another solution with better performance?
Using List.Any(), the process takes many hours and does not finish.
Using a for loop, the process takes 10 hours or more.
Each list has 1,300,000 records.
newItensForInsert = List1.Where(item1 => !List2.Any(item2 => item1.prop1 == item2.prop1 && item1.prop2 == item2.prop2)).ToList();
//or
for (int i = 0; i < List1.Count; i++)
{
if (!List2.Any(x => x.prop1 == List1[i].prop1 && x.prop2 == List1[i].prop2))
{
ListForInsert.Add(List1[i]);
}
}
//or
ListForInsert = List1.AsParallel().Except(List2.AsParallel(), IEqualityComparer).ToList();
You could use LINQ's Except:
List<object> webservice = new List<object>();
List<object> database = new List<object>();
IEnumerable<object> toPutIntoDatabase = webservice.Except(database);
database.AddRange(toPutIntoDatabase);
EDIT:
You can even use the new PLINQ (parallel LINQ) like this
IEnumerable<object> toPutIntoDatabase = webservice.AsParallel().Except(database.AsParallel());
EDIT:
Maybe you could use a Hashset to speed up lookups.
HashSet<object> databaseHash = new HashSet<object>(database);
foreach (var item in webservice)
{
if (databaseHash.Contains(item) == false)
{
database.Add(item);
}
}
If both lists hold the same data type, you can use List.Exists.
Otherwise, it is better to go with a join:
var newdata = from c in dblist
join p in list1 on c.Category equals p.Category into ps
from p in ps.DefaultIfEmpty()
It will select the items whose data is not present in dblist.
HashSet<T> is optimized for executing this kind of set operation. In many cases it's worth the effort to create HashSets from Lists and do the set operation on the HashSets. I demonstrated this with a little LINQPad program.
The program creates two lists containing 1,300,000 objects each. It uses your method to get the difference (or rather, attempts to, because I ran out of patience). It also uses LINQ's Except, and HashSets with ExceptWith, both with an IEqualityComparer. The program is listed below.
The result was:
Lists created: 00:00:00.9221369
Hashsets created: 00:00:00.1057532
Except: 00:00:00.2564191
ExceptWith: 00:00:00.0696830
So creating the HashSets and executing ExceptWith (together 0.18s) beat Except (0.26s).
One caveat: creating HashSets may take too much memory since the large lists already take a fair amount of memory.
void Main()
{
var sw = Stopwatch.StartNew();
var amount = 1300000;
//amount = 50000;
var list1 = Enumerable.Range(0, amount).Select(i => new Demo(i)).ToList();
var list2 = Enumerable.Range(10, amount).Select(i => new Demo(i)).ToList();
sw.Stop();
sw.Elapsed.Dump("Lists created");
sw.Restart();
var hs1 = new HashSet<Demo>(list1, new DemoComparer());
var hs2 = new HashSet<Demo>(list2, new DemoComparer());
sw.Stop();
sw.Elapsed.Dump("Hashsets created");
sw.Restart();
// var list3 = list1.Where(item1 => !list2.Any(item2 => item1.ID == item2.ID)).ToList();
// sw.Stop();
// sw.Elapsed.Dump("Any");
// sw.Restart();
var list4 = list1.Except(list2, new DemoComparer()).ToList();
sw.Stop();
sw.Elapsed.Dump("Except");
sw.Restart();
hs1.ExceptWith(hs2);
sw.Stop();
sw.Elapsed.Dump("ExceptWith");
// list3.Count.Dump();
list4.Count.Dump();
hs1.Count.Dump();
}
// Define other methods and classes here
class Demo
{
public Demo(int id)
{
ID = id;
Name = id.ToString();
}
public int ID { get; set; }
public string Name { get; set; }
}
class DemoComparer : IEqualityComparer<Demo>
{
public bool Equals(Demo x, Demo y)
{
return (x == null && y == null)
|| (x != null && y != null) && x.ID.Equals(y.ID);
}
public int GetHashCode(Demo obj)
{
return obj.ID.GetHashCode();
}
}
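For the two-property comparison in the question (prop1 and prop2), the same idea works without writing a comparer, assuming a compiler and runtime with value tuple support (C# 7 or later). This is only a sketch; Record, Prop1, Prop2, and MissingFrom are placeholder names standing in for the asker's types.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Record
{
    public string Prop1 { get; set; }
    public string Prop2 { get; set; }
}

public static class ListDiff
{
    // Returns the items of source whose (Prop1, Prop2) pair does not occur in existing.
    public static List<Record> MissingFrom(List<Record> source, List<Record> existing)
    {
        // Value tuples compare and hash by value, so they can serve as composite keys.
        var seen = new HashSet<(string, string)>(
            existing.Select(r => (r.Prop1, r.Prop2)));

        // One hash lookup per source item instead of a scan over the second list.
        return source.Where(r => !seen.Contains((r.Prop1, r.Prop2))).ToList();
    }
}
```

This does one pass over each list, so the work grows with the sum of the list sizes rather than their product.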
Use List.Exists, it is better than List.Any Performance-wise

System.Collections.Generic.KeyNotFoundException: The given key was not present in the dictionary

I receive the above error message when running a unit test on a method. I know where the problem is; I just don't know why the key is not present in the dictionary.
Here is the dictionary:
var nmDict = xelem.Descendants(plantNS + "Month").ToDictionary(
k => new Tuple<int, int, string>(int.Parse(k.Ancestors(plantNS + "Year").First().Attribute("Year").Value), Int32.Parse(k.Attribute("Month1").Value), k.Ancestors(plantNS + "Report").First().Attribute("Location").Value.ToString()),
v => {
var detail = v.Descendants(plantNS + "Details").First();
return new HoursContainer
{
BaseHours = detail.Attribute("BaseHours").Value,
OvertimeHours = detail.Attribute("OvertimeHours").Value,
TotalHours = float.Parse(detail.Attribute("BaseHours").Value) + float.Parse(detail.Attribute("OvertimeHours").Value)
};
});
var mergedDict = new Dictionary<Tuple<int, int, string>, HoursContainer>();
foreach (var item in nmDict)
{
mergedDict.Add(Tuple.Create(item.Key.Item1, item.Key.Item2, "NM"), item.Value);
}
var thDict = xelem.Descendants(plantNS + "Month").ToDictionary(
k => new Tuple<int, int, string>(int.Parse(k.Ancestors(plantNS + "Year").First().Attribute("Year").Value), Int32.Parse(k.Attribute("Month1").Value), k.Ancestors(plantNS + "Report").First().Attribute("Location").Value.ToString()),
v => {
var detail = v.Descendants(plantNS + "Details").First();
return new HoursContainer
{
BaseHours = detail.Attribute("BaseHours").Value,
OvertimeHours = detail.Attribute("OvertimeHours").Value,
TotalHours = float.Parse(detail.Attribute("BaseHours").Value) + float.Parse(detail.Attribute("OvertimeHours").Value)
};
});
foreach (var item in thDict)
{
mergedDict.Add(Tuple.Create(item.Key.Item1, item.Key.Item2, "TH"), item.Value);
}
return mergedDict;
and here is the method that is being tested:
protected IList<DataResults> QueryData(HarvestTargetTimeRangeUTC ranges,
IDictionary<Tuple<int, int, string>, HoursContainer> mergedDict)
{
var startDate = new DateTime(ranges.StartTimeUTC.Year, ranges.StartTimeUTC.Month, 1);
var endDate = new DateTime(ranges.EndTimeUTC.Year, ranges.EndTimeUTC.Month, 1);
const string IndicatorName = "{6B5B57F6-A9FC-48AB-BA4C-9AB5A16F3745}";
DataResults endItem = new DataResults();
List<DataResults> ListOfResults = new List<DataResults>();
var allData =
(from vi in context.vDimIncidents
where vi.IncidentDate >= startDate.AddYears(-3) && vi.IncidentDate <= endDate
select new
{
vi.IncidentDate,
LocationName = vi.LocationCode,
GroupingName = vi.Location,
vi.ThisIncidentIs, vi.Location
});
var finalResults =
(from a in allData
group a by new { a.IncidentDate.Year, a.IncidentDate.Month, a.LocationName, a.GroupingName, a.ThisIncidentIs, a.Location }
into groupItem
select new
{
Year = String.Format("{0}", groupItem.Key.Year),
Month = String.Format("{0:00}", groupItem.Key.Month),
groupItem.Key.LocationName,
GroupingName = groupItem.Key.GroupingName,
Numerator = groupItem.Count(),
Denominator = mergedDict[Tuple.Create(groupItem.Key.Year, groupItem.Key.Month, groupItem.Key.LocationName)].TotalHours,
IndicatorName = IndicatorName,
}).ToList();
for (int counter = 0; counter < finalResults.Count; counter++)
{
var item = finalResults[counter];
endItem = new DataResults();
ListOfResults.Add(endItem);
endItem.IndicatorName = item.IndicatorName;
endItem.LocationName = item.LocationName;
endItem.Year = item.Year;
endItem.Month = item.Month;
endItem.GroupingName = item.GroupingName;
endItem.Numerator = item.Numerator;
endItem.Denominator = item.Denominator;
}
foreach(var item in mergedDict)
{
if(!ListOfResults.Exists(l=> l.Year == item.Key.Item1.ToString() && l.Month == item.Key.Item2.ToString()
&& l.LocationName == item.Key.Item3))
{
for (int counter = 0; counter < finalResults.Count; counter++)
{
var data = finalResults[counter];
endItem = new DataResults();
ListOfResults.Add(endItem);
endItem.IndicatorName = data.IndicatorName;
endItem.LocationName = item.Key.Item3;
endItem.Year = item.Key.Item1.ToString();
endItem.Month = item.Key.Item2.ToString();
endItem.GroupingName = data.GroupingName;
endItem.Numerator = 0;
endItem.Denominator = item.Value.TotalHours;
}
}
}
return ListOfResults;
}
The error occurs here:
Denominator = mergedDict[Tuple.Create(groupItem.Key.Year, groupItem.Key.Month, groupItem.Key.LocationName)].TotalHours,
I do not understand why the key is not present. The key consists of an int, int, string (year, month, location) and that is what I have assigned to it.
I've looked at all of the other threads concerning this error message but I didn't see anything that applied to my situation.
I was unsure of what tags to put on this but from my understanding the dictionary was created with linq to xml, the query is linq to sql and it's all part of C# so I used all the tags. if this was incorrect then I apologize in advance.
The problem is with comparisons between the keys you are storing in the Dictionary and the keys you are trying to look up.
When you add something to a Dictionary or access the indexer of a Dictionary it uses the GetHashCode() method to get a hash value of the key. The hashcode for a Tuple is unique to that instance of the Tuple. This means that unless you are passing in the exact same instance of the Tuple class into the indexer, it will not find the previously stored value. Your usage of mergedDict[Tuple.Create(... creates a brand new Tuple with a different hash code than is stored in the Dictionary.
I would recommend creating your own class to use as the key and implementing GetHashCode() and the Equality methods on that class. That way the Dictionary will be able to find what you previously stored there.
More:
The reason this is confusing to a lot of people is that for something like String or Int32, String.GetHashCode() will return the same hash code for two different instances that have the same value. A more specialized class such as Tuple doesn't always work the same. The implementor of Tuple could have gotten the hash code of each input to the Tuple and added them together (or something), but running Tuple through a decompiler you can see that this is not the case.
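A minimal sketch of the kind of dedicated key type suggested above, with value-based Equals and GetHashCode. HoursKey and its property names are hypothetical; they are not part of the asker's code.

```csharp
using System;

public sealed class HoursKey : IEquatable<HoursKey>
{
    public int Year { get; }
    public int Month { get; }
    public string Location { get; }

    public HoursKey(int year, int month, string location)
    {
        Year = year;
        Month = month;
        Location = location;
    }

    // Two keys are equal when all three components are equal.
    public bool Equals(HoursKey other)
    {
        return other != null
            && Year == other.Year
            && Month == other.Month
            && string.Equals(Location, other.Location, StringComparison.Ordinal);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as HoursKey);
    }

    // Combines the component hashes so that equal keys produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 23 + Year;
            hash = hash * 23 + Month;
            hash = hash * 23 + (Location == null ? 0 : Location.GetHashCode());
            return hash;
        }
    }
}
```

With any key type, a lookup succeeds only when all three components match what was stored. In the question's code the stored location component is the literal "NM" or "TH", so the LocationName used in the lookup has to be one of those values.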

Example to use a hashcode to detect if an element of a List<string> has changed C#

I have a List that updates every minute based on a LINQ query of some XML elements.
The XML changes from time to time. It was suggested to me that I could use a hash code to determine whether any of the strings in the list have changed.
I have seen some examples of MD5 hash calculations for a single string, but not for a list. Could someone show me a way of doing this with a list?
I tried something simple like int test = list1.GetHashCode(); but the code is the same no matter what is in the list.
Here is the entire method with the LINQ query and all. Note the SequenceEqual at the end:
private void GetTrackInfo()
{
_currentTitles1.Clear();
var savedxmltracks = new XDocument();
listBox1.Items.Clear();
WebClient webClient = new WebClient();
XmlDocument xmltracks = new XmlDataDocument();
try
{
xmltracks.Load(_NPUrl);
xmltracks.Save("xmltracks.xml");
}
catch (WebException ex)
{
StatusLabel1.Text = ex.Message;
}
try
{
savedxmltracks = XDocument.Load("xmltracks.xml");
}
catch (Exception ex)
{
StatusLabel1.Text = ex.Message;
}
var dateQuery = from c in savedxmltracks.Descendants("content")
select c;
_count = savedxmltracks.Element("content").Element("collection").Attribute("count").Value;
var tracksQuery1 = from c in savedxmltracks.Descendants("data")
select new
{
title = c.Attribute("title").Value,
imageurl = c.Attribute("image").Value,
price = c.Attribute("price").Value,
description = c.Attribute("productdescription").Value,
qualifier = c.Attribute("pricequalifier").Value
};
var xml = new XDocument(new XDeclaration("1.0", "utf-8", "yes"),
new XElement("LastUsedSettings",
new XElement("TimerInterval",
new XElement("Interval", Convert.ToString(numericUpDown1.Value))),
new XElement("NowPlayingURL",
new XElement("URL", _NPUrl)),
new XElement("Email", emailAddress),
new XElement("LastUpdated", DateTime.Now.ToString())));
XElement StoreItems = new XElement("StoreItems");
int i = 0;
foreach (var c in tracksQuery1)
{
if (c.title.Length <= 40 & c.qualifier.Length <= 12 & i < 10)
{
if (c.title != null) _title1 = c.title;
if (c.imageurl != null) _imageUrl = c.imageurl;
if (c.price != null) _price = c.price;
if (c.description != null) _productDescription = c.description;
if (c.qualifier != null) _priceQualifier = c.qualifier;
//}
StoreItems.Add(new XElement("Title" + i.ToString(), _title1));
_currentTitles1.Add(_title1);
if (_oldTitles1.Count > 0)
{
Console.WriteLine("OldTitle: {0}, NewTitle: {1}", _oldTitles1[i], _currentTitles1[i]);
}
StoreItems.Add(new XElement("Price" + i.ToString(), _price));
StoreItems.Add(new XElement("Description" + i.ToString(), _productDescription));
StoreItems.Add(new XElement("PriceQualifier" + i.ToString(), _priceQualifier));
listBox1.Items.Add("Title: " + _title1);
listBox1.Items.Add("Image URL: " + _imageUrl);
listBox1.Items.Add("Price: " + _price);
listBox1.Items.Add("Description: " + _productDescription);
listBox1.Items.Add("PriceQualifier: " + _priceQualifier);
try
{
imageData = webClient.DownloadData(_imageUrl);
}
catch (WebException ex)
{
StatusLabel1.Text = ex.Message;
}
MemoryStream stream = new MemoryStream(imageData);
Image img = Image.FromStream(stream);
//Image saveimage = img;
//saveimage.Save("pic.jpg");
img.Save("pic" + i.ToString() + ".jpg");
stream.Close();
i++;
}
}
//Console.WriteLine("Count: " + _count);
Console.WriteLine("oldTitles Count: " + _oldTitles1.Count.ToString());
Console.WriteLine("currentTitles Count: " + _currentTitles1.Count.ToString());
if (_oldTitles1.Count == 0) _oldTitles1 = _currentTitles1;
if (!_oldTitles1.SequenceEqual(_currentTitles1))
{
Console.WriteLine("Items Changed!");
SendMail();
_oldTitles1 = _currentTitles1;
}
xml.Root.Add(StoreItems);
xml.Save("settings.xml");
}
Why not just use an ObservableCollection and monitor changes to the list?
If you really wanted to hash an entire list, you might do something like this:
List<String> words;
int hash = String.Join("", words.ToArray()).GetHashCode();
I think MD5 may be overkill, you don't need a cryptographically secure hashing function for this task.
Reference: String.Join and String.GetHashCode
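One detail of the join approach: with an empty separator, the lists { "ab", "c" } and { "a", "bc" } join to the same string and therefore get the same hash code. The sketch below joins with a separator instead, assuming that character never occurs in the items; JoinedHash and the choice of the unit separator U+001F are illustrative only.

```csharp
using System.Collections.Generic;

public static class JoinedHash
{
    // Assumption: the separator character does not appear in any item.
    private const string Separator = "\u001F";

    // Order-sensitive hash over all items of the sequence.
    public static int Compute(IEnumerable<string> words)
    {
        return string.Join(Separator, words).GetHashCode();
    }
}
```

On newer .NET runtimes string hash codes can differ between runs of a program, so values computed this way should only be compared within the same process, not stored and compared later.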
I don't think you need to bother with the whole hash code discussion unless you are going to have hundreds of thousands of elements or call this function thousands of times a second.
Here is a small program that shows how long it takes to compare 10,000 elements using your approach, which is the correct way of doing this.
class Program
{
static void Main(string[] args)
{
var list1 = new List<string>();
var list2 = new List<string>();
for (int i = 0; i < 10000; i++)
{
list1.Add("Some very very very very very very very long email" + i);
list2.Add("Some very very very very very very very long email" + i);
}
var timer = new Stopwatch();
timer.Start();
list1.SequenceEqual(list2);
timer.Stop();
Console.WriteLine(timer.Elapsed);
Console.ReadKey();
}
}
On my PC it took 0.001 seconds.
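A SequenceEqual comparison only detects changes when the previous values are held in a separate copy; if the "old" and "current" variables refer to the same List instance, clearing and refilling one changes both. A minimal sketch of keeping such a snapshot, with TitleWatcher and HasChanged as hypothetical names:

```csharp
using System.Collections.Generic;
using System.Linq;

public class TitleWatcher
{
    private List<string> _previous = new List<string>();

    // Returns true when the new titles differ from the last snapshot.
    public bool HasChanged(IEnumerable<string> current)
    {
        // ToList makes an independent copy, so later changes to the
        // caller's list cannot alter the stored snapshot.
        var snapshot = current.ToList();
        bool changed = !_previous.SequenceEqual(snapshot);
        _previous = snapshot;
        return changed;
    }
}
```

In this sketch the first call with a non-empty list reports a change; if the initial load should not count, that case needs to be handled separately.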
Here is Jon Skeet's GetHashCode() implementation just for reference. Note that you'll have to figure out how to work this into what you need for comparing the list/list items.
What is the best algorithm for an overridden System.Object.GetHashCode?
I used this in a recent project and it worked great. You don't necessarily need to use a cryptographic hash to get a good hash code, you can calculate it yourself, but it should not be done naively.
You need to do something like this:
public static class ListExtensions {
private readonly static int seed = 17;
private readonly static int multiplier = 23;
public static int GetHashCodeByElements<T>(this List<T> list) {
int hashCode = seed;
for(int index = 0; index < list.Count; index++) {
hashCode = hashCode * multiplier + list[index].GetHashCode();
}
return hashCode;
}
}
Now you can say:
int previousCode = list.GetHashCodeByElements();
A few minutes later:
int currentCode = list.GetHashCodeByElements();
if(previousCode != currentCode) {
// list changed
}
Note that this is subject to false negatives (the list changed, but the hash code won't detect it). Any method of detecting changes in a list via hash codes is subject to this.
Finally, depending on what you are doing (if there are multiple threads hitting the list), you might want to consider locking access to the list while computing the hash code and updating the list. It depends on what you're doing whether or not this is appropriate.
You will get better performance if you use a HashSet instead of a List. A HashSet uses the hash codes of its elements to look them up. That's probably what you were told about.
The next example demonstrates how to update your list and detect changes in it every time your XML changes, using a HashSet.
HashSet<T> implements ICollection<T> and IEnumerable<T>, as List<T> does, so you can use it in most places where you used your List. It does not implement IList<T>, so there is no access by index.
public class UpdatableList
{
public HashSet<string> TheList { get; private set; }
//Returns true if new list contains different elements
//and updates the collection.
//Otherwise returns false.
public bool Update(List<String> newList)
{
if (TheList == null)
{
TheList = new HashSet<string>(newList);
return true;
}
foreach (var item in newList)
{
//Contains locates candidates by hash code and then
//confirms the match with Equals.
if (!TheList.Contains(item))
{
TheList = new HashSet<string>(newList);
return true;
}
}
//It gets here only if both collections contain identical strings.
return false;
}
}
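A loop that checks Contains for each new item notices added strings but not removed ones. HashSet<T>.SetEquals compares in both directions, so a variant of the class above could be sketched like this; TrackedSet is a hypothetical name, and the comparison ignores both order and duplicates.

```csharp
using System.Collections.Generic;

public class TrackedSet
{
    private HashSet<string> _items = new HashSet<string>();

    // Returns true and replaces the stored set when the contents differ;
    // returns false when both hold exactly the same distinct strings.
    public bool Update(IEnumerable<string> newItems)
    {
        if (_items.SetEquals(newItems))
        {
            return false;
        }
        _items = new HashSet<string>(newItems);
        return true;
    }
}
```

If the order of the titles matters, a set is the wrong tool and a sequence comparison is the better fit.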

C# LINQ question about foreach

Is there any way to write this foreach in LINQ, or another, better way?
int itemNr = -1;
foreach(ItemDoc itemDoc in handOverDoc.Assignment.Items) {
itemNr++;
foreach(ItemDetailDoc detail in itemDoc.Details) {
int eventDocNr = -1;
foreach(EventDoc eventDoc in detail.Events) {
eventDocNr++;
if(!eventDoc.HasEAN) {
HideShowPanels(pMatch);
txt_EAN.Text = String.Empty;
lbl_Match_ArtName.Text = itemDoc.Name;
lbl_ArtNr.Text = itemDoc.Number;
lbl_unitDesc.Text = eventDoc.Description;
m_tempItemNr = itemNr;
m_tempEventNr = eventDocNr;
txt_EAN.Focus();
return;
}
}
}
}
I just think this is not the correct way to write it. Please advise.
If itemNr and eventDocNr are not needed, you could use:
var item =
(from itemDoc in handOverDoc.Assignment.Items
from detail in itemDoc.Details
from eventDoc in detail.Events
where !eventDoc.HasEAN
select new
{
Name = itemDoc.Name,
Number = itemDoc.Number,
Description = eventDoc.Description
}).FirstOrDefault();
if (item != null)
{
HideShowPanels(pMatch);
txt_EAN.Text = String.Empty;
lbl_Match_ArtName.Text = item.Name;
lbl_ArtNr.Text = item.Number;
lbl_unitDesc.Text = item.Description;
txt_EAN.Focus();
}
No, I don't think there is a better way to do that. LINQ is about queries, and you do quite a lot of processing in there. Unless you have a shortcut that is not obvious here, this seems to be the only way.
If you COULD start from the eventDoc, you could filter for those without an EAN and then work backward from there, but I cannot say how feasible that is, as I am missing the complete model (as in: maybe you have no back links, so you would be stuck with the eventDoc and could not get up to the item).
At first look, that looks fine.
You could try the following LINQ:
var nonEANs = from ItemDoc itemDocs in itemDocList
from ItemDetailDoc itemDetailDocs in itemDocs.Details
from EventDoc eventDocs in itemDetailDocs.Events
where !eventDocs.HasEAN
select eventDocs;
foreach (var i in nonEANs)
{
System.Diagnostics.Debug.WriteLine( i.HasEAN);
}
This should return 7 false EANs. I recreated your nested structures like this:
List<ItemDoc> itemDocList = new List<ItemDoc>()
{
new ItemDoc()
{
Details = new List<ItemDetailDoc>()
{
new ItemDetailDoc()
{
Events = new List<EventDoc>()
{
new EventDoc()
{HasEAN=false},
new EventDoc()
{HasEAN=false}
}
},
new ItemDetailDoc()
{
Events = new List<EventDoc>()
{
new EventDoc()
{HasEAN=true},
new EventDoc()
{HasEAN=false}
}
}
}
},
new ItemDoc()
{
Details = new List<ItemDetailDoc>()
{
new ItemDetailDoc()
{
Events = new List<EventDoc>()
{
new EventDoc()
{HasEAN=false},
new EventDoc()
{HasEAN=false}
}
},
new ItemDetailDoc()
{
Events = new List<EventDoc>()
{
new EventDoc()
{HasEAN=false},
new EventDoc()
{HasEAN=false}
}
}
}
}
};
I think you are stuck with the foreach loops, as you need the itemNr and eventDocNr. You can use for loops to avoid incrementing the itemNr and eventDocNr manually, but this does not reduce the number of loops.
Edit:
And if you do need the itemNr and eventDocNr try this:
var query = handOverDoc.Assignment.Items
.SelectMany(
(item, itemIndex) => item.Details.SelectMany(
detail => detail.Events
// Index the events before filtering so EventIndex is the original position.
.Select((evt, eventIndex) => new {
ItemIndex = itemIndex,
EventIndex = eventIndex,
Item = item,
Event = evt
})
// Keep only the events that have no EAN, as in the original loop.
.Where(x => !x.Event.HasEAN)
)
);
foreach (var eventInfo in query) {
HideShowPanels(pMatch);
txt_EAN.Text = String.Empty;
lbl_Match_ArtName.Text = eventInfo.Item.Name;
lbl_ArtNr.Text = eventInfo.Item.Number;
lbl_unitDesc.Text = eventInfo.Event.Description;
m_tempItemNr = eventInfo.ItemIndex;
m_tempEventNr = eventInfo.EventIndex;
txt_EAN.Focus();
return;
}
If you need only the first event without an EAN, you could also use the following on the above query:
var item = query.FirstOrDefault();
if (item != null) {
// do your stuff here
}
You can get the index in LINQ quite easily, for example :-
var itemDocs = handOverDoc.Assignment.Items.Select((h, i) => new { item = h, itemindex = i })
You can repeat this process for your inner loops also and I suspect you could then use SelectMany() to simplify it even further.
You're trying to do two different things here. Firstly you're trying to find a document, and secondly you're trying to change things based upon it. The first stage in the process is simply to clarify the code you already have, e.g.
(Note this takes into account previous comments that the computed indexes in the original code are not needed. The exact same type of split into two methods could be done whether or not the computed indexes are required, and it would still improve the original code.)
public void FindAndDisplayEventDocWithoutEAN(HandOverDoc handOverDoc)
{
var eventDoc = FindEventDocWithoutEAN(handOverDoc);
if (eventDoc != null)
{
Display(eventDoc);
}
}
public EventDoc FindEventDocWithoutEAN(HandOverDoc handOverDoc)
{
foreach(ItemDoc itemDoc in handOverDoc.Assignment.Items)
foreach(ItemDetailDoc detail in itemDoc.Details)
foreach(EventDoc eventDoc in detail.Events)
if(!eventDoc.HasEAN)
return eventDoc;
return null;
}
public void Display(EventDoc eventDoc)
{
HideShowPanels(pMatch);
txt_EAN.Text = String.Empty;
lbl_Match_ArtName.Text = itemDoc.Name;
lbl_ArtNr.Text = itemDoc.Number;
lbl_unitDesc.Text = eventDoc.Description;
m_tempItemNr = itemNr;
m_tempEventNr = eventDocNr;
txt_EAN.Focus();
}
Once you've done that, you should be able to see that one method is a query over the main document, and the other is a method to display the results of the query. This is what's known as the single responsibility principle, where each method does one thing, and is named after what it does.
The transformation of the nested foreach loops to a linq query is now almost trivial. Compare the method below with the method above, and you can see how mechanical it is to translate nested foreach loops into a linq query.
public EventDoc FindEventDocWithoutEAN(HandOverDoc handOverDoc)
{
return (from itemDoc in handOverDoc.Assignment.Items
from detail in itemDoc.Details
from eventDoc in detail.Events
where !eventDoc.HasEAN
select eventDoc).FirstOrDefault();
}
yet another spin...
var query = from itemDocVI in handOverDoc.Assignment
.Items
.Select((v, i) => new { v, i })
let itemDoc = itemDocVI.v
let itemNr = itemDocVI.i
from detail in itemDoc.Details
from eventDocVI in detail.Events
.Select((v, i) => new { v, i })
let eventDoc = eventDocVI.v
let eventDocNr = eventDocVI.i
where !eventDoc.HasEAN
select new
{
itemDoc,
itemNr,
detail,
eventDoc,
eventDocNr
};
var item = query.FirstOrDefault();
if (item != null)
{
HideShowPanels(pMatch);
txt_EAN.Text = String.Empty;
lbl_Match_ArtName.Text = item.itemDoc.Name;
lbl_ArtNr.Text = item.itemDoc.Number;
lbl_unitDesc.Text = item.eventDoc.Description;
m_tempItemNr = item.itemNr;
m_tempEventNr = item.eventDocNr;
txt_EAN.Focus();
}
