Release memory between foreach iterations in release builds? - c#

UPDATE
Obviously, the original question was confusing so I'll try to simplify it.
I'm working with a complex algorithm that provides a list of objects (let's say companies).
For each of those companies I will have to load a large amount of data (let's say a list of employees).
public class Company
{
    public string Name { get; set; } = "";
    public List<Employee> EmployeeList { get; set; } = new List<Employee>();
}
public class Employee
{
    public string FirstName { get; set; } = "Random first name";
    public string LastName { get; set; } = "Random last name";
}
public MemoryTest()
{
    // Simulate the complex algorithm...
    // I can't change how I get that list, and my question isn't about this part.
    List<Company> companyList = new List<Company>();
    for (int i = 0; i < 50000; i++)
    {
        companyList.Add(new Company() { Name = "Random company name " + i });
    }
    // Simulate the details loading. This is where the memory gets filled.
    foreach (Company company in companyList)
    {
        company.EmployeeList.AddRange(new Employee[25000]);
        // Do some calculation and save to DB...
    }
}
The problem with this code is that the memory allocated during each iteration won't be released until the end of the loop.
After reading this article I had hopes that the JIT would be able to determine that a company reference is not used after its iteration, since companyList isn't used beyond the foreach:
In release builds, the JIT is able to look at the program structure to work out the last point
within the execution that a variable can be used by the method and will discard it when it is
no longer required.
... but sadly, the JIT doesn't extrapolate that far.
In order to use as little memory as possible, my question is the following: is there a way to loop through a collection AND remove the reference to each element between iterations?
Here's a more generic example if you don't want to work with Company / Employee:
Dictionary<int, List<string>> dict = new Dictionary<int, List<string>>();
for (int i = 0; i < 100000; i++)
{
    dict.Add(i, new List<string>());
}
foreach (var item in dict)
{
    item.Value.AddRange(new string[25000]);
}

I'm going to revise my original answer here. Not to beat the proverbial dead horse, but I just want to emphasize that if you have to pop items out of a collection to keep the collection from rooting the objects in memory, it's almost certain there are some issues with the design of your application. There might be scenarios where that's a great design, but most often it's not.
Let's take Smurf's company scenario and change it slightly so that it doesn't hold onto large objects in the collection at all.
I'm going to ignore the dictionary of Company objects. It's never used as a dictionary, only as a collection. It's also important to note that what's in the dictionary to begin with is not a complete company object; we had to use the original company object to retrieve the extra data. We usually call that a key. So instead of that dictionary, we'll have a stream of keys. Here's the key object:
public class CompanyKey
{ }
And a data source to produce keys. This might actually be Smurf's dictionary, but for our purposes we'll make it an iterator method; that way nothing roots these things in memory. If the keys are small it doesn't really matter, but it's better not to use a collection if you don't need one.
public class CompanyKeySource
{
    public IEnumerable<CompanyKey> GetKeys()
    {
        for (int i = 0; i < 10; ++i)
            yield return new CompanyKey();
    }
}
And here's the actual company object:
public class Company
{
    public EmployeeData Employees { get; set; }
}
And the big glob of data, which lives in the employee data object:
public class EmployeeData
{
    public string[] LotOfData { get; set; }
}
Finally we need something that'll load the big glob of data into the company object. That's usually a repository of some type:
public class CompanyDataRepository
{
    public IEnumerable<Company> GetCompanyDetails(IEnumerable<CompanyKey> keys)
    {
        foreach (var key in keys)
        {
            yield return new Company() { Employees = GetEmployees(key) };
        }
    }
    public EmployeeData GetEmployees(CompanyKey key) =>
        new EmployeeData() { LotOfData = new string[2500] };
}
Now we wire everything together and iterate over our company instances.
static void Main(string[] _)
{
    CompanyDataRepository repository = new CompanyDataRepository();
    CompanyKeySource keySource = new CompanyKeySource();
    var keys = keySource.GetKeys();
    foreach (var company in repository.GetCompanyDetails(keys))
    {
        // do whatever it is you're doing with your companies...
    }
}
Now there's no need to pop items off a dictionary to keep them out of memory. The large chunks of data are used where they're needed and then can be eligible for collection right away.

foreach uses the enumerator pattern: the .MoveNext() method advances through the items in the collection one after another, and the .Current property exposes each of them. In other words, your code is translated to something like:
List<Company>.Enumerator enumerator = companyList.GetEnumerator();
try
{
    while (enumerator.MoveNext())
    {
        Company current = enumerator.Current;
        LoadContacts(current);
        current.Log = "Very large string...";
    }
}
finally
{
    ((IDisposable)enumerator).Dispose();
}
As you see, each var company is in the end just a single local variable of reference type. This reference is replaced on every iteration, so from this point of view the current variable becomes an additional root for some Company object, for the duration of the iteration.
The difference here is the scope of such a root. In a Debug build, it is indeed the whole {} block of the underlying while loop (in fact, the whole method, from the CIL perspective). In a Release build, the JIT reports roots more aggressively: it will report current as no longer needed as soon as it is indeed no longer needed. In your case, you use current until the end of the {}, so it makes no difference.
The difference could be seen if you didn't use company until the end of the block, like:
foreach (var company in companyList)
{
    this.LoadContacts(company);
    this.LoadNotes(company.ContactList);
    company.Log = "Very large string...";
    CallingHereLongLastingMethod(); // Debug - company is still a root
                                    // Release - company is no longer a root
}
BUT... none of this matters, because current is just an additional root. The object itself (the company) is still rooted by companyList.
The question "would the GC be smart enough to understand that the company object would not be reused beyond the foreach iteration and mark it as a candidate for garbage collection (despite the companyList that would remain on the stack)?" suggests some gaps in understanding reference types and GC fundamentals. List<> is a reference type: it consists of a reference (living "on the stack" as a local variable) and the data, the collection of companies, on the heap.
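You can make the rooting visible with a small WeakReference sketch (GC timing is non-deterministic, so treat the observed values as typical for a Release build rather than guaranteed):

```csharp
// Sketch: the loop variable is not what keeps the object alive -
// the list is. Clearing the list's slot makes the object collectible.
var list = new List<object> { new object() };
var weak = new WeakReference(list[0]);

GC.Collect();
Console.WriteLine(weak.IsAlive); // typically true: still rooted by the list

list[0] = null;                  // drop the list's reference
GC.Collect();
Console.WriteLine(weak.IsAlive); // typically false: no roots remain
```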

Based on MikeJ's answer, I created a generic method that loops through a list and removes the reference to each element right after its iteration:
public static IEnumerable<T> PopEnumerable<T>(this List<T> list)
{
    while (list.Count > 0)
    {
        yield return list[0];
        list.RemoveAt(0);
    }
}
That way you can still call the foreach loop...
foreach (Company company in companyList.PopEnumerable())
{
    company.EmployeeList.AddRange(new Employee[25000]);
}
... but only the company of the current iteration remains in memory. I ran some benchmarks, and memory allocation that used to reach 16 GB dropped to less than 100 MB for the whole process.
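One caveat: List<T>.RemoveAt(0) shifts every remaining element, so the loop above is O(n²) overall. A variant (just a sketch, not benchmarked against the original) that clears each slot instead of removing it keeps the same memory behavior at O(n):

```csharp
// Sketch: overwrite each slot with default(T) instead of removing from
// the front, so the list never has to shift its remaining elements.
public static IEnumerable<T> PopEnumerable<T>(this List<T> list)
{
    for (int i = 0; i < list.Count; i++)
    {
        yield return list[i];
        list[i] = default(T); // drop the list's reference to the element
    }
    list.Clear();
}
```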

Related

Check if list of objects single property value exists in Hashset

I have a list of objects in which the objects have a Guid Id property.
I also have a Hashset containing a bunch of Guids.
What is the fastest way to check whether each object's Guid in the list exists in the HashSet, and then update another property on the object in the list if it does exist? I do have the ability to change the HashSet to a different data type if needed, but the list must remain the same.
Here are the classes/collections:
public class Test
{
    public Guid Id { get; set; }
    public bool IsResponded { get; set; }
}
var clientResponses = new HashSet<Guid>();
var testRecords = new List<Test>();
This is what I currently am doing
foreach (var test in testRecords)
{
    if (clientResponses.Contains(test.Id))
        test.IsResponded = true;
}
You can do so
foreach (var test in testRecords)
{
    if (clientResponses.Remove(test.Id))
        test.IsResponded = true;
}
Or, more briefly
foreach (var test in testRecords)
{
    test.IsResponded = clientResponses.Remove(test.Id);
}
Every found value is removed from the HashSet, so each subsequent iteration is a little faster. Of course, this is only worthwhile for very large volumes of data. And besides, the HashSet has to be recreated afterwards, since it ends up empty.
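If the original HashSet must survive, a minimal sketch of the workaround is to run the destructive pass against a copy:

```csharp
// Sketch: copy the set so the Remove calls empty the copy, not the original.
var working = new HashSet<Guid>(clientResponses);
foreach (var test in testRecords)
{
    test.IsResponded = working.Remove(test.Id);
}
// clientResponses is untouched and can be reused.
```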
Also you can try this optimization (it is assumed that the properties IsResponded are false by default)
foreach (var test in testRecords)
{
    if (clientResponses.Remove(test.Id))
    {
        test.IsResponded = true;
        if (clientResponses.Count == 0)
            break; // the remaining IsResponded values will remain unchanged
    }
}
This approach is advantageous when the testRecords collection is significantly larger than the HashSet and, with high probability, every value in the HashSet will match a value in the collection. Once all of them have been found, there is no reason to keep iterating over the collection, so break out of the loop.

Avoid object instantiation in loops c# - How this can be avoided?

Here is the scenario
I know that class are reference types and Structures are value types
Below is Code 1, which correctly produces Output 1. This is the expected behavior: each time a new object is created, a new reference is created and added to the person list.
In Code 2, the same object is assigned every time: the loop keeps updating the one object that Obj points to the whole time. At the end of the loop, the final value shows up for every list item, as in Output 2.
For Code 1, a CAST tool review gives us "Avoid object instantiation in loops". I know instantiating objects in a for loop takes extra memory and affects performance, which I guess is what the CAST tool is telling us. In such scenarios, is there any solution that avoids instantiating new objects inside the loop?
Using structures is one solution in the present scenario, but I would like to hear other ideas.
Code 1
public class person
{
    public string name { get; set; }
    public int age { get; set; }
}
class Program
{
    static void Main(string[] args)
    {
        List<person> personList = new List<person>();
        for (int i = 0; i < 10; i++)
        {
            person Obj = new person();
            Obj.name = "Robert" + i;
            Obj.age = i * 10;
            personList.Add(Obj);
        }
        foreach (person indv in personList)
        {
            Console.WriteLine(indv.name + indv.age);
        }
    }
}
Output 1
Robert00
Robert110
Robert220
Robert330
Robert440
Robert550
Robert660
Robert770
Robert880
Robert990
Code 2
List<person> personList = new List<person>();
person Obj = new person();
for (int i = 0; i < 10; i++)
{
    Obj.name = "Robert" + i;
    Obj.age = i * 10;
    personList.Add(Obj);
}
foreach (person indv in personList)
{
    Console.WriteLine(indv.name + indv.age);
}
Output 2
Robert990
Robert990
Robert990
Robert990
Robert990
Robert990
Robert990
Robert990
Robert990
Robert990
I know instantiation objects in for loop takes extra memory and performance too which is what I guess CAST tool is telling us.
That's incorrect. An allocation has the same "price" whether it happens inside or outside a loop. I assume your tool is warning you because allocating objects on each iteration of a loop may cause a lot of objects to be instantiated, but that's exactly what's needed here. There is absolutely no need to avoid object allocation in this case.
I'd be more worried about that particular tool you're using and the advice it gives.
There is nothing wrong with instantiating those objects so I can't think why your tool is telling you that. At the end of the day the whole point of your code is to create a list of "person" objects. Whether you did it in a loop, or typed out all 10 instantiations in a row, it wouldn't make a difference. The loop is obviously better.
On another note though, you can really simplify this code by using linq, try writing it this way and see if your tool gives you the same warning:
List<person> personList = Enumerable.Range(0, 10).Select(x =>
    new person { name = "Robert" + x, age = x * 10 }).ToList();
I mainly avoid instantiation in loops in cases where I want to use the object outside of the loop and it isn't being added to a collection. Additionally it wouldn't be necessary if I were instantiating in the loop and passing the object to another method within the loop. In that case you can instantiate outside of the loop and pass the values to the method within the loop. If this is all of the code that you're going to use, move the Console.WriteLine inside of the loop and don't bother instantiating inside of the loop.
But I get the impression that you're trying to create a collection of objects inside a loop to be used outside of that loop. In that case, your collection of objects isn't going to have any bigger memory footprint simply because you instantiated the objects inside the loop. As you can see, if you instantiate the object outside of the loop, you're simply assigning a new value to the same object and then adding a reference to the same object multiple times to the array. You'll need to instantiate each object in the array no matter what you do.

Why is dictionary so much faster than list?

I am testing the speed of getting data from Dictionary VS list.
I've used this code to test :
internal class Program
{
    private static void Main(string[] args)
    {
        var stopwatch = new Stopwatch();
        List<Grade> grades = Grade.GetData().ToList();
        List<Student> students = Student.GetStudents().ToList();
        stopwatch.Start();
        foreach (Student student in students)
        {
            student.Grade = grades.Single(x => x.StudentId == student.Id).Value;
        }
        stopwatch.Stop();
        Console.WriteLine("Using list {0}", stopwatch.Elapsed);
        stopwatch.Reset();
        students = Student.GetStudents().ToList();
        stopwatch.Start();
        Dictionary<Guid, string> dic = Grade.GetData().ToDictionary(x => x.StudentId, x => x.Value);
        foreach (Student student in students)
        {
            student.Grade = dic[student.Id];
        }
        stopwatch.Stop();
        Console.WriteLine("Using dictionary {0}", stopwatch.Elapsed);
        Console.ReadKey();
    }
}
public class GuidHelper
{
    public static List<Guid> ListOfIds = new List<Guid>();
    static GuidHelper()
    {
        for (int i = 0; i < 10000; i++)
        {
            ListOfIds.Add(Guid.NewGuid());
        }
    }
}
public class Grade
{
    public Guid StudentId { get; set; }
    public string Value { get; set; }
    public static IEnumerable<Grade> GetData()
    {
        for (int i = 0; i < 10000; i++)
        {
            yield return new Grade
            {
                StudentId = GuidHelper.ListOfIds[i],
                Value = "Value " + i
            };
        }
    }
}
public class Student
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public string Grade { get; set; }
    public static IEnumerable<Student> GetStudents()
    {
        for (int i = 0; i < 10000; i++)
        {
            yield return new Student
            {
                Id = GuidHelper.ListOfIds[i],
                Name = "Name " + i
            };
        }
    }
}
There is list of students and grades in memory they have StudentId in common.
In the first approach I find the Grade of each student using LINQ on a list, which takes nearly 7 seconds on my machine. In the other approach I first convert the list into a dictionary and then look up each student's grade by key, which takes less than a second.
When you do this:
student.Grade = grades.Single(x => x.StudentId == student.Id).Value;
As written it has to enumerate the entire List until it finds the entry in the List that has the correct studentId (does entry 0 match the lambda? No... Does entry 1 match the lambda? No... etc etc). This is O(n). Since you do it once for every student, it is O(n^2).
However when you do this:
student.Grade = dic[student.Id];
If you want to find a certain element by key in a dictionary, it can jump straight to where the element is stored - this is O(1), so O(n) for doing it for every student. (If you want to know how this works: the Dictionary runs a mathematical operation on the key, which turns it into a position inside the dictionary - the same position it used when the entry was inserted.)
So, dictionary is faster because you used a better algorithm.
The reason is because a dictionary is a lookup, while a list is an iteration.
Dictionary uses a hash lookup, while your list requires walking through the list until it finds the result from beginning to the result each time.
To put it another way: the list will be faster than the dictionary for the first item, because there's nothing to look up - it's the first item, boom, it's done. But the second time, the list has to look through the first item, then the second. The third time through, the first, second, and third items, and so on.
So with each iteration the lookup takes more and more time; the larger the list, the longer it takes. The dictionary has a more or less fixed lookup time (it also increases as the dictionary gets larger, but at a much slower pace, so by comparison it's almost fixed).
When using a Dictionary you retrieve your information by key, which enables it to be found more efficiently; with a List you are using a Single LINQ expression, which, since it is a list, has no option other than to look through the entire list for the wanted item.
Dictionary uses hashing to search for the data. Each item in the dictionary is stored in buckets of items that contain the same hash. It's a lot quicker.
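The bucket idea can be sketched in a couple of lines (the real Dictionary implementation is more involved, but the core index computation looks roughly like this):

```csharp
// Sketch: a hash code picks a bucket directly, so only the few entries
// that landed in the same bucket need to be compared, not the whole set.
int bucketCount = 17; // illustrative; real implementations resize as needed
Guid key = Guid.NewGuid();
int bucket = (key.GetHashCode() & 0x7FFFFFFF) % bucketCount;
// only entries stored in 'bucket' are candidates for this key
```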
Try sorting your list and using a binary search; it will be a bit quicker then.
A dictionary uses a hash table, it is a great data structure as it maps an input to a corresponding output almost instantaneously, it has a complexity of O(1) as already pointed out which means more or less immediate retrieval.
The downside is that, for the sake of performance, you need lots of space in advance (depending on the implementation - separate chaining or linear/quadratic probing - you may need at least as much space as you plan to store, possibly double in the latter case), and you need a good hashing algorithm that maps your input ("John Smith") uniquely to a corresponding output, such as a position in an array (hash_array[34521]).
Also listing the entries in a sorted order is a problem. If I may quote Wikipedia:
Listing all n entries in some specific order generally requires a
separate sorting step, whose cost is proportional to log(n) per entry.
Have a read on linear probing and separate chaining for some gorier details :)
Dictionary is based on a hash table which is a rather efficient algorithm to look up things. In a list you have to go element by element in order to find something.
It's all a matter of data organization...
When it comes to lookup of data, a keyed collection is always faster than a non-keyed collection. This is because a non-keyed collection will have to enumerate its elements to find what you are looking for. While in a keyed collection you can just access the element directly via the key.
These are some nice articles for comparing list to dictionary.
Here. And this one.
From MSDN - Dictionary mentions close to O(1) but I think it depends on the types involved.
The Dictionary(TKey,TValue) generic class provides a mapping from a set of keys to a set of values. Each addition to the dictionary consists of a value and its associated key. Retrieving a value by using its key is very fast, close to O(1), because the Dictionary class is implemented as a hash table.
Note:
The speed of retrieval depends on the quality of the hashing algorithm of the type specified for TKey.
List(T) does not implement a hash lookup, so it is sequential and the performance is O(n). It also depends on the types involved, and boxing/unboxing needs to be considered.

List threading issue

I'm trying to make my application thread safe. I hold my hands up and admit I'm new to threading so not sure what way to proceed.
To give a simplified version, my application contains a list.
Most of the application accesses this list and doesn't change it, but may enumerate through it. All this happens on the UI thread.
Thread one will periodically look for items to be added to and removed from the list.
Thread two will enumerate the list and update the items with extra information. This has to run at the same time as thread one, as it can take anything from seconds to hours.
The first question is: does anyone have a recommended strategy for this?
Secondly, I was trying to make separate copies of the list that the main application will use, periodically getting a new copy when something is updated/added/removed, but this doesn't seem to be working.
I have my list and a copy......
public class MDGlobalObjects
{
    public List<T> mainList = new List<T>();
    public List<T> copyList
    {
        get
        {
            return new List<T>(mainList);
        }
    }
}
If I get copyList, modify it, save mainlist, restart my application, load mainlist and look again at copylist then the changes are present. I presume I've done something wrong as copylist seems to still refer to mainlist.
I'm not sure if it makes a difference but everything is accessed through a static instance of the class.
public static MDGlobalObjects CacheObjects = new MDGlobalObjects();
This is the gist using a ConcurrentDictionary:
public class Element
{
    public string Key { get; set; }
    public string Property { get; set; }
    public Element CreateCopy()
    {
        return new Element
        {
            Key = this.Key,
            Property = this.Property,
        };
    }
}
var d = new ConcurrentDictionary<string, Element>();

// thread 1
// prune
foreach (var kv in d)
{
    if (kv.Value.Property == "ToBeRemoved")
    {
        Element dummy = null;
        d.TryRemove(kv.Key, out dummy);
    }
}

// thread 1
// add
Element toBeAdded = new Element();
// set basic properties here
d.TryAdd(toBeAdded.Key, toBeAdded);

// thread 2
// populate element
Element unPopulated = null;
if (d.TryGetValue("ToBePopulated", out unPopulated))
{
    Element nowPopulated = unPopulated.CreateCopy();
    nowPopulated.Property = "Populated";
    // either
    d.TryUpdate(unPopulated.Key, nowPopulated, unPopulated);
    // or
    d.AddOrUpdate(unPopulated.Key, nowPopulated, (key, value) => nowPopulated);
}

// read threads
// enumerate
foreach (Element element in d.Values)
{
    // do something with each element
}

// read threads
// try to get specific element
Element specific = null;
if (d.TryGetValue("SpecificKey", out specific))
{
    // do something with specific element
}
In thread 2, if you can set properties so that the whole object is consistent after each atomic write, then you can skip making a copy and just populate the properties with the object in place in the collection.
There are a few race conditions in this code, but they should be benign in that readers always have a consistent view of the collection.
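For that in-place variant, a sketch might look like this (relying on the fact that reference writes are atomic in .NET; whether a single property write leaves the object consistent depends on your model):

```csharp
// Sketch: populate the element in place instead of copying - valid only
// if one atomic property write is enough to keep the object consistent.
Element found;
if (d.TryGetValue("ToBePopulated", out found))
{
    found.Property = "Populated"; // immediately visible to reader threads
}
```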
Actually, copyList is just a shallow copy of mainList: the list is new, but the references to the objects contained in the list are still the same. To achieve what you are trying to do, you have to make a deep copy of the list.
something like this
public static IEnumerable<T> Clone<T>(this IEnumerable<T> collection) where T : ICloneable
{
    return collection.Select(item => (T)item.Clone());
}
and use it like
return mainList.Clone();
Looking at your question again, I would like to suggest an overall change of approach.
You should use ConcurrentDictionary, since you are using .NET 4.0. With it you won't have to use locks, as a concurrent collection always maintains a valid state.
So your code will look something like this.
Thread 1's code:
var item = download_the_object();
dic.TryAdd("SomeUniqueKeyOfTheObject", item);
// TryAdd will return false if the key already exists, so implement some sort of retry mechanism
Thread 2's code:
foreach (var pair in dic)
{
    var item = pair.Value;
    var extraInfo = downloadExtraInfoforObject(item);
    // replace the entry by using TryUpdate
    dic.TryUpdate(pair.Key, someNewObjectWithExtraInfoAdded, item);
}

c# looping object creation

I'm very new to C#, and was previously trying to ignore classes and build my small program structurally closer to PHP. After reaching a roadblock, I'm starting over and approaching the problem properly OO. I'm reading a long file and, in a loop, every time certain conditions are met, I want to make a new object. How can I create a new object without having to specify a unique name?
Referral ObjectName = new Referral(string, string, int);
Secondly, once this is done and the strings & int have set their appropriate object properties, how can I make the collection unique by one property, and then sort it by another?
I'm sorry if these are basic questions; I have spent a large amount of time first trying to figure it out on my own with Google and a textbook. If only C# allowed multi-dimensional arrays with different types!
Thank you so much!
PS. I do mean to extract a list of unique objects.
All these answers, while helpful, seem to involve creating a shadow set of IEnumerables. Is there no way to do this with the class itself?
Trying the first solution, provided by Earwicker, adding each object to a List from within the loop: when I try to write a property of an element to the console, I get ClassName+Referral. What could I be doing wrong? --solved: I still needed .property
still working. . .
C# does allow untyped arrays. All objects are derived ultimately from object, so you use an array or container of objects. But it's rarely necessary. How many types of object do you have?
Within the loop block, you can create an object exactly as you do in that line of code (except with the syntax fixed), and it will be a new object each time around the loop. To keep all the objects available outside the loop, you would add it to a container:
List<Referral> referrals = new List<Referral>();
// in the loop:
Referral r = new Referral(str1, str2, num1);
referrals.Add(r);
Suppose Referral has a numeric property called Cost.
referrals.Sort((l, r) => l.Cost - r.Cost);
That sorts by the cost.
For ensuring uniqueness by some key, you may find it easier to pick a more suitable container.
Dictionary<string, Referral> referrals = new Dictionary<string, Referral>();
// in the loop:
Referral r = new Referral(str1, str2, num1);
referrals[str1] = r;
This stores the referral in a "slot" named after the value of str1. Duplicates will overwrite each other silently.
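If silent overwrites are not what you want, a small sketch of the alternative is to test for the key first (or use Add, which throws on duplicates):

```csharp
// Sketch: keep the first referral for each key and skip later duplicates,
// instead of letting them overwrite earlier entries.
if (!referrals.ContainsKey(str1))
{
    referrals.Add(str1, r);
}
```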
First, you're going to need to spend some time familiarizing yourself with the basics of the language to be productive. I recommend you take a little time to read up on C# before getting in too deep - otherwise you'll spend a lot of your time spinning your wheels - or reinventing them :)
But here's some info to get you started.
Typically, in C# you create classes to represent elements of your program - including those that are used to represent information (data) that your program intends to manipulate. You should really consider using one, as it will make data manipulation clearer and more manageable. I would advise avoiding untyped, multi-dimensions array structures as some may suggest, as these rapidly become very difficult to work with.
You can easily create a Referral class in C# using automatic properties and a simple constructor:
public class Referral
{
    // these should be named in line with what they represent...
    public string FirstString { get; set; }
    public string AnotherString { get; set; }
    public int SomeValue { get; set; }
    public Referral(string first, string another, int value)
    {
        FirstString = first;
        AnotherString = another;
        SomeValue = value;
    }
}
You can add these to a dictionary as you create them - the dictionary can be keyed by which ever property is unique. Dictionaries allow you to store objects based on a unique key:
Dictionary<string, Referral> dict = new Dictionary<string, Referral>();
As you process items, you can add them to the dictionary:
Referral referral = new Referral(v1, v2, v3);
// add to the dictionary, keying on FirstString...
dict.Add(referral.FirstString, referral);
If you need to sort items in the dictionary when you're done, you can use LINQ in C# 3.0:
IEnumerable<Referral> sortedResults =
    dict.Values.OrderBy(x => x.AnotherString);
You can sort by multiple dimensions using ThenBy() as well:
IEnumerable<Referral> sortedResults =
    dict.Values.OrderBy(x => x.AnotherString)
               .ThenBy(x => x.SomeValue);
List<Referral> referrals = new List<Referral>();
for (...)
{
    referrals.Add(new Referral(string1, string2, number1));
}
Then, if you're using Linq (which I highly suggest), you can do this:
IEnumerable<Referral> sorted = referrals.OrderBy(x => x.string1).ThenBy(x => x.string2);
Otherwise, you can use the Sort() method on List<Referral>.
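For completeness, the non-LINQ route is an in-place sort with a Comparison<T> delegate (a sketch, assuming string1 is a string property as in the snippet above):

```csharp
// Sketch: List<T>.Sort with a comparison delegate sorts in place,
// unlike LINQ's OrderBy, which produces a new sequence.
referrals.Sort((a, b) =>
    string.Compare(a.string1, b.string1, StringComparison.Ordinal));
```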
You can create an object without a reference, but you won't have any access to it later:
new Referral(string, string, int);
If you wish to put them in an array/list, these different types need to have a common base class. This is called polymorphism, which is a very important concept in OO programming.
You cannot ignore classes while using c#. Don't resist the change!
Do you really not need to create a class here? Do you really not need to give it a name? C# does allow loose typing, but type safety is a good thing.
I don't fully understand what you're trying to do. But maybe LINQ is what you're looking for. There's tons of documentation around, but as a quick 'teaser' have a look at the 101 Linq samples on MSDN
C# includes a wonderful feature called "iterator blocks". What you want to do is use the yield keyword to create an Enumerable of your Referral objects, something like this (note that I'm making the file format and property names up, because you didn't share them):
public class Referral
{
    public Guid id { get; private set; } // "uniquify"
    public int ReferringId { get; set; }
    public string ReferrerText { get; set; }
    public string ReferrerDescription { get; set; }
    private Referral()
    {
        id = Guid.NewGuid();
    }
    private Referral(string Text, string Description, int ReferringId) : this()
    {
        this.ReferrerText = Text;
        this.ReferrerDescription = Description;
        this.ReferringId = ReferringId;
    }
    public static IEnumerable<Referral> GetReferrals(string fileName)
    {
        using (var rdr = new StreamReader(fileName))
        {
            var next = new Referral();
            int state = 0;
            string line;
            while ((line = rdr.ReadLine()) != null)
            {
                switch (state)
                {
                    case 0:
                        next.ReferrerText = line;
                        state = 1;
                        break;
                    case 1:
                        next.ReferrerDescription = line;
                        state = 2;
                        break;
                    case 2:
                        next.ReferringId = int.Parse(line);
                        yield return next;
                        next = new Referral();
                        state = 0;
                        break;
                }
            }
        }
    }
}
Now you want to sort the referrals and presumably enumerate over them for some purpose. You can do that easily like this:
foreach (var referral in Referral.GetReferrals(@"C:\referralfile.txt").OrderBy(r => r.ReferrerText))
{
    OutputReferral(referral);
}
