Where does "i" get its value in this LINQ statement? - c#

I'm a little perplexed by the behavior of this select LINQ statement. Just below the LOOK HERE comment you can see a select LINQ statement. That select statement is on the employees collection, so it should accept only x as the input parameter. Out of curiosity I passed i to the delegate, and it works. When it iterates through the select, it assigns 0 first and then increments by 1. The result can be seen at the end of this post.
Where does the variable i get its value from? First of all, why does it allow me to use a variable i that is nowhere in scope? It is neither in the global scope nor in the local Main method. Any help in understanding this mystery is appreciated.
namespace ConsoleApplication
{
using System;
using System.Collections.Generic;
using System.Linq;
public class Employee
{
public int EmployeedId { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
}
class Program
{
static void Main(string[] args)
{
var employees = new List<Employee>()
{
new Employee() { FirstName = "John", LastName = "Doe" },
new Employee() { FirstName = "Jacob", LastName = "Doe" }
};
// LOOK HERE...
var newEmployees = employees.Select((x, i) => new { id = i, name = x.FirstName + " " + x.LastName });
newEmployees.ToList().ForEach(x => { Console.Write(x.id); Console.Write(" "); Console.WriteLine(x.name); });
Console.ReadKey();
}
}
}
The result is
0 John Doe
1 Jacob Doe

Enumerable.Select has an overload that projects the current index of the element in the sequence. Also Enumerable.Where and Enumerable.SkipWhile/TakeWhile have it. You can use it like a loop variable in a for-loop which is sometimes handy.
One example which uses the index to create an anonymous type to group a long list into groups of 4:
var list = Enumerable.Range(1, 1000).ToList();
List<List<int>> groupsOf4 = list
.Select((num, index) => new { num, index })
.GroupBy(x => x.index / 4).Select(g => g.Select(x => x.num).ToList())
.ToList(); // 250 groups of 4
or one with Where which only selects even indices:
var evenIndices = list.Where((num, index) => index % 2 == 0);
It might also be important to mention that you can use these overloads that project the index only in method-syntax. LINQ query-syntax does not support it.
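As a rough sketch of how the two syntaxes can be combined: attach the index with the method-syntax overload first, then carry on in query syntax. The class and method names here (IndexedSelectDemo, Number) are made up for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexedSelectDemo
{
    // Hypothetical helper: keeps only the items at even positions and
    // prefixes each one with its zero-based index.
    public static List<string> Number(IEnumerable<string> names)
    {
        // Method syntax: the indexed Select overload supplies i.
        var indexed = names.Select((name, i) => new { name, i });

        // Query syntax: the index is now just an ordinary property.
        var query = from x in indexed
                    where x.i % 2 == 0
                    select x.i + " " + x.name;

        return query.ToList();
    }

    public static void Main()
    {
        foreach (var line in Number(new[] { "John Doe", "Jacob Doe", "Jane Doe" }))
        {
            Console.WriteLine(line);
        }
        // prints:
        // 0 John Doe
        // 2 Jane Doe
    }
}
```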

Related

Linq Union ordering - how do i ensure items from the first IEnumerable remain first in result?

I'd like to figure out if it's possible (or if it's already being done) to ensure the items from the first IEnumerable are kept, while duplicates from a union with another IEnumerable are discarded.
For example:
using System.Collections.Generic;
using System.Linq;
namespace MyApp.ExampleStuff
{
public class SomeDto
{
public string name {get; set;}
public int classId {get; set;}
public int notComparedObject {get; set;}
}
public class test {
public void DoSomething()
{
IEnumerable<SomeDto> firstDto = new List<SomeDto>() { new SomeDto() {name = "Dave", classId = 1, notComparedObject = 12}};
IEnumerable<SomeDto> secondDto = new List<SomeDto>() { new SomeDto() {name = "Dave", classId = 1, notComparedObject = 16}, new SomeDto() {name = "Brad", classId = 1, notComparedObject = 77}};
var result = GetUnionedLists(firstDto, secondDto);
}
public ILookup<int, SomeDto> GetUnionedLists (IEnumerable<SomeDto> dtoA, IEnumerable<SomeDto> dtoB)
{
return dtoA.Union(dtoB, new SomeDtoComparer()).ToLookup(x => x.classId);
}
}
public class SomeDtoComparer : IEqualityComparer<SomeDto>
{
public bool Equals(SomeDto SomeDtoA, SomeDto SomeDtoB)
{
if (SomeDtoA == null && SomeDtoB == null)
{
return true;
} else if (SomeDtoA == null || SomeDtoB == null)
{
return false;
}
return (SomeDtoA.name == SomeDtoB.name && SomeDtoA.classId == SomeDtoB.classId);
}
public int GetHashCode(SomeDto SomeDtoX)
{
int hashName = SomeDtoX.name == null ? 0 : SomeDtoX.name.GetHashCode();
int hashClassId = SomeDtoX.classId.GetHashCode();
return hashName ^ hashClassId;
}
}
}
If this is run, I would hope that the value of result in DoSomething() is a Lookup containing only the following SomeDto objects under classId "1":
SomeDto() {name = "Dave", classId = 1, notComparedObject = 12}
SomeDto() {name = "Brad", classId = 1, notComparedObject = 77}
As you can see, if "name" and "classId" are the same, the items are considered equal, and I'd then like to keep the item from the original IEnumerable and discard the "duplicate", which in this case was:
SomeDto() {name = "Dave", classId = 1, notComparedObject = 16}
If the result were to come out like this - it would be considered wrong (as the items from the second Enumerable were placed first in the result):
SomeDto() {name = "Brad", classId = 1, notComparedObject = 77}
SomeDto() {name = "Dave", classId = 1, notComparedObject = 12}
The Enumerable.Union method already yields items in the order you've described. The docs state:
When the object returned by this method is enumerated, Union enumerates first and second in that order and yields each element that has not already been yielded.
On the other hand, the Lookup type and the IGrouping interface do not give any guarantees about element order (the current implementation of ToLookup appears to keep the original order, but this could change). So if it really matters, you should add some additional logic, such as using a custom type instead of Lookup, adding a custom index property and ordering by it, or, probably, using GroupBy, which does guarantee the order, as stated in the docs:
The IGrouping objects are yielded in an order based on the order of the elements in source that produced the first key of each IGrouping. Elements in a grouping are yielded in the order that the elements that produced them appear in source.
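A minimal sketch of the Union-then-GroupBy approach, assuming a simplified stand-in for SomeDto (the Dto, DtoComparer and UnionGrouped names are hypothetical and not from the question):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnionOrderDemo
{
    // Hypothetical stand-in for the question's SomeDto.
    public sealed class Dto
    {
        public string Name { get; set; }
        public int ClassId { get; set; }
        public int NotCompared { get; set; }
    }

    // Treats two items as equal when Name and ClassId match.
    public sealed class DtoComparer : IEqualityComparer<Dto>
    {
        public bool Equals(Dto a, Dto b)
        {
            if (ReferenceEquals(a, b)) return true;
            if (a == null || b == null) return false;
            return a.Name == b.Name && a.ClassId == b.ClassId;
        }

        public int GetHashCode(Dto d)
        {
            int hashName = d.Name == null ? 0 : d.Name.GetHashCode();
            return hashName ^ d.ClassId.GetHashCode();
        }
    }

    // Union keeps the first occurrence of each item; GroupBy keeps source order.
    public static List<List<Dto>> UnionGrouped(IEnumerable<Dto> first, IEnumerable<Dto> second)
    {
        return first.Union(second, new DtoComparer())
                    .GroupBy(d => d.ClassId)
                    .Select(g => g.ToList())
                    .ToList();
    }

    public static void Main()
    {
        var first = new[] { new Dto { Name = "Dave", ClassId = 1, NotCompared = 12 } };
        var second = new[]
        {
            new Dto { Name = "Dave", ClassId = 1, NotCompared = 16 },
            new Dto { Name = "Brad", ClassId = 1, NotCompared = 77 }
        };

        foreach (var dto in UnionGrouped(first, second).SelectMany(g => g))
        {
            Console.WriteLine(dto.Name + " " + dto.NotCompared);
        }
        // prints:
        // Dave 12
        // Brad 77
    }
}
```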
I think you could do this using the FullJoin function available within the MoreLinq library (available on NuGet).
https://morelinq.github.io/3.0/ref/api/html/M_MoreLinq_MoreEnumerable_FullJoin__3_1.htm
Example:
public ILookup<int, SomeDto> GetUnionedLists (IEnumerable<SomeDto> dtoA, IEnumerable<SomeDto> dtoB)
{
return dtoA
.FullJoin(dtoB,
e => e,
first => first,
second => second,
(first, second) => first,
new SomeDtoComparer())
.ToLookup(x => x.classId);
}

How to pass the current index iteration inside a select new MyObject

This is my code:
infoGraphic.chartData = (from x in db.MyDataSource
group x by x.Data.Value.Year into g
select new MyObject
{
index = "", // here I need a string such as "index is:" + index
counter = g.Count()
});
I need the current index iteration inside the select new. Where do I pass it?
EDIT - My current query:
var test = db.MyData
.GroupBy(item => item.Data.Value.Year)
.Select((item, index ) => new ChartData()
{
index = ((double)(3 + index ) / 10).ToString(),
value = item.Count().ToString(),
fill = index.ToString(),
label = item.First().Data.Value.Year.ToString(),
}).ToList();
public class ChartData
{
public string index { get; set; }
public string value { get; set; }
public string fill { get; set; }
public string label { get; set; }
}
Use IEnumerable extension methods, I think the syntax is more straightforward.
You need the 2nd overload, that receives the IEnumerable item and the index.
infoGraphic.chartData.Select((item, index) => {
//what you want to do here
});
You want to apply grouping to your chartData and afterwards select a subset / generate a projection of the resulting data?
Your solution should look like:
infoGraphic.chartData
.GroupBy(...)
.Select((item, index) => {
//what you want to do here
});
abstracting the dataSource as x:
x.GroupBy(item => item.Data.Value.Year)
.Select((item, index) => new { index = index, counter = item.Count() });
As a follow up to your new question...
here is a simple working scenario with a custom type (like your ChartData):
class Program
{
static void Main(string[] args)
{
List<int> data = new List<int> { 1, 872, -7, 271 ,-3, 7123, -721, -67, 68 ,15 };
IEnumerable<A> result = data
.GroupBy(key => Math.Sign(key))
.Select((item, index) => new A { groupCount = item.Count(), str = item.Where(i => Math.Sign(i) > 0).Count() == 0 ? "negative" : "positive" });
foreach(A a in result)
{
Console.WriteLine(a);
}
}
}
public class A
{
public int groupCount;
public string str;
public override string ToString()
{
return string.Format("Group Count: [{0}], String: [{1}].", groupCount, str);
}
}
/* Output:
* -------
* Group Count: [6], String: [positive].
* Group Count: [4], String: [negative].
*/
Important: Make sure the type you call the extension method on is an IEnumerable (implements IEnumerable<T>); otherwise the Select overload described here will not be available.
you can do something like this:
let currIndex = collection.IndexOf(collectionItem)
Your code would then become:
infoGraphic.chartData =
(from x in db.MyDataSource group x by x.Data.Value.Year into g
// Get Iterator Index Here
let currIndex = db.MyDataSource.IndexOf(x)
select new MyObject
{index = currIndex.ToString(), // Your Iterator Index
counter = g.Count()
});

Group by in LINQ on a property in an array

I have an array for which I want to group the items based on a property. I tried the code below, but it is not grouping correctly. MyArray is the array and Id is the property on which I want to do the grouping.
var docGroup = (from x in MyArray
group x by x.Id).Select(grp => new
{
Id = grp.Key,
Results = grp.ToList(),
})
.Results
.ToList());
To keep it simple if I just make it
var docGroup = from x in MyArray group x by x.Id;
where Id is a string ("123") and two items in MyArray have the same Id. When I check docGroup, it has two entries and both have the key 123, instead of just one entry with the 123 key.
Here's a very simple example:
class Program
{
static void Main(string[] args)
{
Test[] tArray = new Test[3];
Test t = new Test() { Id = "123", Val="First" };
Test t1 = new Test() { Id = "123", Val="Second" };
Test t2 = new Test() { Id = "1234", Val="Third" };
tArray[0] = t;
tArray[1] = t1;
tArray[2] = t2;
var g = from x in tArray group x by x.Id;
}
}
class Test
{
public string Id { get; set; }
public string Val { get; set; }
}
Now if I look at g it has count 2 of which one is the Id 123 and the second is the Id 1234. I am not sure what is going wrong with my array. So this seems to work, but I am not sure what is going on with my array. I'll do some research on it.
Sorry guys, I found the issue. The Id was in a value property in MyArray which I was not using and so it was not grouping correctly. Thanks for the help everyone.
Everything works as expected.
GroupBy produces an enumerable of IGrouping. Since you have two distinct keys ("123" and "1234"), you will get an enumerable of two elements. Each grouping has a unique key and is itself an enumerable.
So
g.Single(x => x.Key == "123").ToList();
will contain two elements (First, Second) and
g.Single(x => x.Key == "1234").ToList();
will contain one element (Third).
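To see this grouping behavior end to end, here is a small self-contained sketch built on the question's Test class; the wrapper class GroupByKeyDemo and the helper Summarize are illustrative names only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByKeyDemo
{
    public class Test
    {
        public string Id { get; set; }
        public string Val { get; set; }
    }

    // Hypothetical helper: one "key: values" line per group, in source order.
    public static List<string> Summarize(IEnumerable<Test> items)
    {
        return items
            .GroupBy(x => x.Id)
            .Select(g => g.Key + ": " + string.Join(", ", g.Select(x => x.Val)))
            .ToList();
    }

    public static void Main()
    {
        var tArray = new[]
        {
            new Test { Id = "123", Val = "First" },
            new Test { Id = "123", Val = "Second" },
            new Test { Id = "1234", Val = "Third" }
        };

        foreach (var line in Summarize(tArray))
        {
            Console.WriteLine(line);
        }
        // prints:
        // 123: First, Second
        // 1234: Third
    }
}
```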

group items by range of values using linq (IGrouping)

Say I have a list of Person class
public class Person
{
public int Age { get; set; }
public string Name { get; set; }
}
How can I group by dynamic ranges? (For example starting from the youngest person I would like to group by ranges of 5 so if the youngest person is 12 the groups would be 12-17, 18-23 ....)
How can I determine the Key of IGrouping interface? (Set the Key of each group to be the ages average in that group for example)
To get the key to group by you can create a function:
String GetAgeInterval(Int32 age, Int32 minimumAge, Int32 intervalSize) {
var group = (age - minimumAge)/intervalSize;
var startAge = group*intervalSize + minimumAge;
var endAge = startAge + intervalSize - 1;
return String.Format("{0}-{1}", startAge, endAge);
}
Assuming that the minimum age is 12 and the interval size is 5 then for ages between 12 and 16 (inclusive) the function will return the string 12-16, for ages between 17 and 21 (inclusive) the function will return the string 17-21 etc. Or you can use an interval size of 6 to get the intervals 12-17, 18-23 etc.
You can then create the groups:
var minimumAge = persons.Min(person => person.Age);
var personsByAgeIntervals = persons
.GroupBy(person => GetAgeInterval(person.Age, minimumAge, 5));
To get the average age in each group you can do something like this:
var groups = personsByAgeIntervals.Select(
grouping => new {
AgeInterval = grouping.Key,
AverageAge = grouping.Average(person => person.Age),
Persons = grouping.ToList()
}
);
This will create a sequence of groups represented by an anonymous type with properties AgeInterval, AverageAge and Persons.
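Putting the pieces together, a minimal runnable sketch might look like the following. The sample people and the wrapper names (AgeIntervalDemo, Describe) are assumptions made for illustration; GetAgeInterval is the function from above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AgeIntervalDemo
{
    public class Person
    {
        public int Age { get; set; }
        public string Name { get; set; }
    }

    // Maps an age to the label of the interval it falls into.
    public static string GetAgeInterval(int age, int minimumAge, int intervalSize)
    {
        var group = (age - minimumAge) / intervalSize;
        var startAge = group * intervalSize + minimumAge;
        var endAge = startAge + intervalSize - 1;
        return string.Format("{0}-{1}", startAge, endAge);
    }

    // Hypothetical helper: one "interval: names" line per group.
    // Assumes persons is non-empty (Min throws on an empty sequence).
    public static List<string> Describe(IEnumerable<Person> persons, int intervalSize)
    {
        var list = persons.ToList();
        var minimumAge = list.Min(p => p.Age);
        return list
            .GroupBy(p => GetAgeInterval(p.Age, minimumAge, intervalSize))
            .Select(g => g.Key + ": " + string.Join(", ", g.Select(p => p.Name)))
            .ToList();
    }

    public static void Main()
    {
        var persons = new List<Person>
        {
            new Person { Age = 12, Name = "Joe" },
            new Person { Age = 17, Name = "Bob" },
            new Person { Age = 21, Name = "Sally" },
            new Person { Age = 15, Name = "Jim" }
        };

        foreach (var line in Describe(persons, 5))
        {
            Console.WriteLine(line);
        }
        // prints:
        // 12-16: Joe, Jim
        // 17-21: Bob, Sally
    }
}
```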
Using Linq but not IGrouping (I've never used this interface, so I didn't think helping you would be the best time to start). I added a configuration class to set the min/max age as well as a basic descriptor.
public class GroupConfiguration {
public int MinimumAge { get; set; }
public int MaximumAge { get; set; }
public string Description { get; set; }
}
I created a list of Person (people) and populated it with a few sample records.
List<Person> people = new List<Person>() {
new Person { Age = 12, Name = "Joe" },
new Person { Age = 17, Name = "Bob" },
new Person { Age = 21, Name = "Sally" },
new Person { Age = 15, Name = "Jim" }
};
Then I created a list of GroupConfiguration (configurations) and populated it with 3 logical-for-me records.
List<GroupConfiguration> configurations = new List<GroupConfiguration>() {
new GroupConfiguration() {MinimumAge = 0, MaximumAge=17, Description="Minors"},
new GroupConfiguration() {MinimumAge = 18, MaximumAge=20, Description="Adult-No Alcohol"},
new GroupConfiguration() {MinimumAge = 21, MaximumAge=999, Description="Adult-Alcohol"},
};
I then load them into a dictionary to maintain the relationship between each configuration and the results that match that configuration. This uses Linq to find the records from people that match MinimumAge <= age <= MaximumAge. This would allow someone to be placed in multiple results if the MinimumAge and MaximumAge ranges overlap.
Dictionary<GroupConfiguration, IEnumerable<Person>> groupingDictionary = configurations.ToDictionary(groupConfiguration => groupConfiguration, groupConfiguration
=> people.Where(x => x.Age >= groupConfiguration.MinimumAge && x.Age <= groupConfiguration.MaximumAge));
Throwing this in a console program, I validated that 3 people exist in the Minors group, 0 in the Adult-No Alcohol group, and 1 in the Adult-Alcohol group.
foreach (var kvp in groupingDictionary) {
Console.WriteLine(kvp.Key.Description + " " + kvp.Value.Count());
}
Console.ReadLine();

Raven returning wrong document in OrderByDescending Statement

I have 50,000 documents in my Raven database, but when I run this query the Id of the latestProfile object is returned as 9999 (the first id in the db is 0, so this is the ten thousandth item).
//find the profile with the highest ID now existing in the collection
var latestProfile = session.Query<SiteProfile>()
.Customize(c => c.WaitForNonStaleResults())
.OrderByDescending(p => p.Id)
.FirstOrDefault();
//latestProfile.Id is 9999 here
//See how many items there are in the collection. This returns 50,000
var count = session.Query<SiteProfile>()
.Customize(c => c.WaitForNonStaleResults()).Count();
My guess is that Raven is paging before my OrderByDescending statement, but:
The default page size is 10, and even the max is 1024
All the Parts of this are either IRavenQueryable or IQueryable
It is also not a stale index as I have tested this with WaitForNonStaleResults()
My expected result here is the most recent id I added (50,000) to be the item returned here, but yet it is not.
Why not? This looks like a bug in Raven to me.
EDIT:
Ok, so I now know exactly why, but it still looks like a bug. Here is a list of the items from that same list actualised by a ToArray()
{ Id = 9999 },
{ Id = 9998 },
{ Id = 9997 },
{ Id = 9996 },
{ Id = 9995 },
{ Id = 9994 },
{ Id = 9993 },
{ Id = 9992 },
{ Id = 9991 },
{ Id = 9990 },
{ Id = 999 }, //<-- Whoops! This is text order not int order
{ Id = 9989 },
So even though my Id column is an integer, because Raven stores it internally as a string, it is ordering by that representation. Clearly Raven's Queryable implementation is resolving the ordering before checking types.
I have read that you can define sort order to use integer sorting on defined indexes but really, this should not matter. In a strongly typed language integers should be sorted as integers.
Is there a way to make this Id ordering correct? Do I actually have to resort to creating a special index on the id column just to get integers ordered correctly?
UPDATE 2:
I am now using an index as follows:
public SiteProfiles_ByProfileId()
{
Map = profiles => from profile in profiles
select new
{
profile.Id
};
Sort(x => x.Id, SortOptions.Int);
}
To try and force it to understand integers. I can see that my index is called via the Raven server console as follows:
Request # 249: GET - 3 ms - Bede.Profiles - 200 - /indexes/SiteProfiles/ByProfileId?&pageSize=1&sort=-__document_id&operationHeadersHash=-1789353429
Query:
Time: 3 ms
Index: SiteProfiles/ByProfileId
Results: 1 returned out of 20,000 total.
but still it comes back with string ordered results. I have seen advice not to use integers as the id, but that would cause massive issues on this project as there are 3rd parties referencing the current ids (in the old service this is designed to replace).
UPDATE 3: I have a specific unit test that shows the issue. It appears to work fine for any integer property except for the Id.
[TestMethod]
public void Test_IndexAllowsCorrectIntSortingWhenNotId()
{
using (var store = new EmbeddableDocumentStore() {RunInMemory = true})
{
store.Initialize();
IndexCreation.CreateIndexes(typeof(MyFakeProfiles_ByProfileId).Assembly, store);
using (var session = store.OpenSession())
{
var profiles = new List<MyFakeProfile>()
{
new MyFakeProfile() { Id=80, Age = 80, FirstName = "Grandpa", LastName = "Joe"},
new MyFakeProfile() { Id=9, Age = 9,FirstName = "Jonny", LastName = "Boy"},
new MyFakeProfile() { Id=22, Age = 22, FirstName = "John", LastName = "Smith"}
};
foreach (var myFakeProfile in profiles)
{
session.Store(myFakeProfile, "MyFakeProfiles/" + myFakeProfile.Id);
}
session.SaveChanges();
var oldestPerson = session.Query<MyFakeProfile>().Customize(c => c.WaitForNonStaleResults())
.OrderByDescending(p => p.Age).FirstOrDefault();
var youngestPerson = session.Query<MyFakeProfile>().Customize(c => c.WaitForNonStaleResults())
.OrderBy(p => p.Age).FirstOrDefault();
var highestId = session.Query<MyFakeProfile>("MyFakeProfiles/ByProfileId").Customize(c => c.WaitForNonStaleResults())
.OrderByDescending(p => p.Id).FirstOrDefault();
var lowestId = session.Query<MyFakeProfile>("MyFakeProfiles/ByProfileId").Customize(c => c.WaitForNonStaleResults())
.OrderBy(p => p.Id).FirstOrDefault();
//sanity checks for ordering in Raven
Assert.AreEqual(80,oldestPerson.Age); //succeeds
Assert.AreEqual(9, youngestPerson.Age);//succeeds
Assert.AreEqual(80, highestId.Id);//fails
Assert.AreEqual(9, lowestId.Id);//fails
}
}
}
private void PopulateTestValues(IDocumentSession session)
{
var profiles = new List<MyFakeProfile>()
{
new MyFakeProfile() { Id=80, Age = 80, FirstName = "Grandpa", LastName = "Joe"},
new MyFakeProfile() { Id=9, Age = 9,FirstName = "Jonny", LastName = "Boy"},
new MyFakeProfile() { Id=22, Age = 22, FirstName = "John", LastName = "Smith"}
};
foreach (var myFakeProfile in profiles)
{
session.Store(myFakeProfile, "MyFakeProfiles/" + myFakeProfile.Id);
}
}
}
public class MyFakeProfile
{
public int Id { get; set; }
public int Age { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
}
public class MyFakeProfiles_ByProfileId : AbstractIndexCreationTask<MyFakeProfile>
{
// The index name generated by this is going to be MyFakeProfiles/ByProfileId
public MyFakeProfiles_ByProfileId()
{
Map = profiles => from profile in profiles
select new
{
profile.Id
};
Sort(x => (int)x.Id, SortOptions.Int);
}
}
You need to specify the type of the field on the index, see http://ravendb.net/docs/2.5/client-api/querying/static-indexes/customizing-results-order
Side note, IDs in RavenDB are always strings. You seem to be trying to use integer IDs - don't do that.
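To illustrate why string IDs produce the ordering seen in the question, here is a plain LINQ-to-Objects sketch (it does not touch RavenDB at all; the class and method names are invented for the example). Sorting the same numbers as text versus as integers gives different results:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class TextVersusNumericOrder
{
    // Sorts the ids descending after converting them to text (ordinal comparison).
    public static List<string> SortAsText(IEnumerable<int> ids)
    {
        return ids
            .Select(id => id.ToString(CultureInfo.InvariantCulture))
            .OrderByDescending(s => s, StringComparer.Ordinal)
            .ToList();
    }

    // Sorts the ids descending by their numeric value.
    public static List<int> SortAsNumbers(IEnumerable<int> ids)
    {
        return ids.OrderByDescending(id => id).ToList();
    }

    public static void Main()
    {
        var ids = new[] { 998, 9989, 999, 9990, 10000 };

        Console.WriteLine(string.Join(", ", SortAsText(ids)));
        // prints: 9990, 999, 9989, 998, 10000

        Console.WriteLine(string.Join(", ", SortAsNumbers(ids)));
        // prints: 10000, 9990, 9989, 999, 998
    }
}
```

In the text ordering, 10000 comes last because its first character, '1', sorts below '9'. That is the same effect that made 9999 look like the highest Id.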
You can provide multiple Sort fields, as you have only defined one for Id:
public SiteProfiles_ByProfileId()
{
Map = profiles => from profile in profiles
select new
{
profile.Id
};
Sort(x => x.Id, SortOptions.Int);
Sort(x => x.Age, SortOptions.Int);
}
BUT ... I am unsure of the effects of applying a sort on a field that isn't mapped.
You may have to extend the mapping to select both fields, like this:
public SiteProfiles_ByProfileId()
{
Map = profiles => from profile in profiles
select new
{
profile.Id,
profile.Age
};
Sort(x => x.Id, SortOptions.Int);
Sort(x => x.Age, SortOptions.Int);
}
