LINQ C# - Combining multiple groups - c#

A LINQ GroupBy query creates a new group for each unique key. I would like to combine multiple groups into a single group based on the key value.
e.g.
var customersData = new[]
{
new { id = 1, company = "ABC" },
new { id = 2, company = "AAA" },
new { id = 3, company = "ABCD" },
new { id = 4, company = "XYZ" },
new { id = 5, company = "X.Y.Z." },
new { id = 6, company = "QQQ" },
};
var groups = from d in customersData
group d by d.company;
Let's say I want ABC, AAA, and ABCD in the same group, and XYZ, X.Y.Z. in the same group.
Is there any way to achieve this through LINQ queries?

You will need to use the overload of GroupBy that takes an IEqualityComparer.
var groups = customersData.GroupBy(k => k.company, new KeyComparer());
where KeyComparer could look like
public class KeyComparer : IEqualityComparer<string>
{
public bool Equals(string x, string y)
{
// put your comparison logic here
}
public int GetHashCode(string obj)
{
// same comparison logic here
}
}
You can compare the strings any way you like in the Equals method of KeyComparer.
EDIT:
You also need to make sure that the implementation of GetHashCode obeys the same rules as the Equals method. For example, if you simply strip the "." characters, as in the other answers, you need to do it in both methods, like this:
public class KeyComparer : IEqualityComparer<string>
{
public bool Equals(string x, string y)
{
return x.Replace(".", "") == y.Replace(".", "");
}
public int GetHashCode(string obj)
{
return obj.Replace(".", "").GetHashCode();
}
}
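To see the comparer approach end to end, here is a minimal sketch. The names DotInsensitiveComparer, ComparerDemo and GroupSummaries are made up for this illustration, and the sample data is trimmed to plain strings. It assumes no company name is null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: two names are equal when they match once the dots are removed.
public sealed class DotInsensitiveComparer : IEqualityComparer<string>
{
    private static string Normalize(string s) => s.Replace(".", "");

    public bool Equals(string x, string y) => Normalize(x) == Normalize(y);

    // Hash the normalized form so that keys considered equal also share a hash code.
    public int GetHashCode(string obj) => Normalize(obj).GetHashCode();
}

public static class ComparerDemo
{
    // Returns one line per group: the group key followed by its members.
    public static List<string> GroupSummaries(IEnumerable<string> companies) =>
        companies
            .GroupBy(c => c, new DotInsensitiveComparer())
            .Select(g => g.Key + ": " + string.Join(", ", g))
            .ToList();

    public static void Main()
    {
        var companies = new[] { "ABC", "XYZ", "X.Y.Z.", "QQQ" };
        foreach (var line in GroupSummaries(companies))
            Console.WriteLine(line);
    }
}
```

With this data, "XYZ" and "X.Y.Z." end up in one group, so three groups are produced instead of four.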

I am assuming the following:
You meant to have quotes surrounding the company "names" (as below).
Your problem is simply solved by removing the '.'s from each company name.
If these assumptions are correct, the solution is simply the following:
var customersData = new[] {
new { id = 1, company = "ABC" },
new { id = 2, company = "A.B.C." },
new { id = 3, company = "A.B.C." },
new { id = 4, company = "XYZ" },
new { id = 5, company = "X.Y.Z." },
new { id = 6, company = "QQQ" },
};
var groups = from d in customersData
group d by d.company.Replace(".", "");
If these assumptions are not correct, please clarify and I can help work closer to a solution.

var groups = from d in customersData
group d by d.company.Replace(".", "");

public int GetId(string company)
{
int result = 0; // placeholder: compute whatever group id you want from company
return result;
}
then later:
var groups = from d in customersData
group d by GetId(d.company);
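One possible way to fill in such a key function is a lookup table that maps each company name to the group it should land in. This is only a sketch: the Aliases table, the group names "A-group" and "X-group", and the GetGroupKey helper are all invented for the example, and names missing from the table simply form their own group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupKeyDemo
{
    // Hypothetical alias table: company name -> name of the group it belongs to.
    private static readonly Dictionary<string, string> Aliases = new Dictionary<string, string>
    {
        { "ABC", "A-group" }, { "AAA", "A-group" }, { "ABCD", "A-group" },
        { "XYZ", "X-group" }, { "X.Y.Z.", "X-group" },
    };

    // Companies without an alias fall back to their own name as the key.
    public static string GetGroupKey(string company) =>
        Aliases.TryGetValue(company, out var key) ? key : company;

    public static Dictionary<string, List<string>> Group(IEnumerable<string> companies) =>
        companies.GroupBy(GetGroupKey).ToDictionary(g => g.Key, g => g.ToList());

    public static void Main()
    {
        var companies = new[] { "ABC", "AAA", "ABCD", "XYZ", "X.Y.Z.", "QQQ" };
        foreach (var pair in Group(companies))
            Console.WriteLine(pair.Key + ": " + string.Join(", ", pair.Value));
    }
}
```

Unlike the custom comparer, this keeps the grouping rule in data, so adding a new alias does not require touching the query.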

I think this is what you want:
var customersData = new[]
{
new { id = 1, company = "ABC" },
new { id = 2, company = "AAA" },
new { id = 3, company = "ABCD" },
new { id = 4, company = "XYZ" },
new { id = 5, company = "X.Y.Z." },
new { id = 6, company = "QQQ" },
};
var groups = from d in customersData
group d by d.company[0];
foreach (var group in groups)
{
Console.WriteLine("Group " + group.Key);
foreach (var item in group)
{
Console.WriteLine("Item " + item.company);
}
}
Console.ReadLine();

Related

Data Set to Tree Structure

I have the below set of data
Where each City belongs to a specific Department, which belongs to a specific Region, which belongs to a specific Country (in this case there is only one country: France).
This data is contained in a CSV file which I can read from on a row-by-row basis, however my goal is to convert this data into a tree structure (with France being at the root).
Each of these nodes will be given a specific Id value, which is something I've already gone and done, but the tricky part is that each node here must also contain a ParentId (for instance Belley and Gex need the ParentId of Ain, but Moulins and Vichy need the ParentId of Allier).
Below is a snippet of code I've written that has assigned an Id value to each name in this data set, along with some other values:
int id = 0;
List<CoverageAreaLevel> coverageAreas = GetCoverageAreaDataFromCsv(path, true);
List<LevelList> levelLists = new List<LevelList>
{
new LevelList { Names = coverageAreas.Select(a => a.Level1).Distinct().ToList(), Level = "1" },
new LevelList { Names = coverageAreas.Select(a => a.Level2).Distinct().ToList(), Level = "2" },
new LevelList { Names = coverageAreas.Select(a => a.Level3).Distinct().ToList(), Level = "3" },
new LevelList { Names = coverageAreas.Select(a => a.Level4).Distinct().ToList(), Level = "4" }
};
List<CoverageArea> newCoverageAreas = new List<CoverageArea>();
foreach (LevelList levelList in levelLists)
{
foreach (string name in levelList.Names)
{
CoverageArea coverageArea = new CoverageArea
{
Id = id++.ToString(),
Description = name,
FullDescription = name,
Level = levelList.Level
};
newCoverageAreas.Add(coverageArea);
}
}
The levelLists variable contains a sort-of hierarchical structure of the data that I'm looking for, but none of the items in that list are linked together by anything.
Any idea of how this could be implemented? I can manually figure out each ParentId, but I'd like to automate this process, especially if this needs to be done in the future.
The solution from @Camilo is really good and pragmatic. I would also suggest the use of a tree.
A sample implementation:
var countries = models.GroupBy(xco => xco.Country)
.Select((xco, index) =>
{
var country = new Tree<String>();
country.Value = xco.Key;
country.Children = xco.GroupBy(xr => xr.Region)
.Select((xr, xrIndex) =>
{
var region = new Tree<String>();
region.Value = xr.Key;
region.Parent = country;
region.Children =
xr.GroupBy(xd => xd.Department)
.Select((xd, xdIndex) =>
{
var department = new Tree<String>();
department.Value = xd.Key;
department.Parent = region;
department.Children = xd
.Select(xc => new Tree<String> { Value = xc.City, Parent = department });
return department;
});
return region;
});
return country;
});
public class Tree<T>
{
public IEnumerable<Tree<T>> Children;
public T Value;
public Tree<T> Parent;
}
One way you could solve this is by building dictionaries with the names and IDs of each level.
Assuming you have data like this:
var models = new List<Model>
{
new Model { Country = "France", Region = "FranceRegionA", Department = "FranceDept1", City = "FranceA" },
new Model { Country = "France", Region = "FranceRegionA", Department = "FranceDept1", City = "FranceB" },
new Model { Country = "France", Region = "FranceRegionA", Department = "FranceDept2", City = "FranceC" },
new Model { Country = "France", Region = "FranceRegionB", Department = "FranceDept3", City = "FranceD" },
new Model { Country = "Italy", Region = "ItalyRegionA", Department = "ItalyDept1", City = "ItalyA" },
new Model { Country = "Italy", Region = "ItalyRegionA", Department = "ItalyDept2", City = "ItalyB" },
};
You could do something like this, which can probably be improved further if needed:
var countries = models.GroupBy(x => x.Country)
.Select((x, index) => Tuple.Create(x.Key, new { Id = index + 1 }))
.ToDictionary(x => x.Item1, x => x.Item2);
var regions = models.GroupBy(x => x.Region)
.Select((x, index) => Tuple.Create(x.Key, new { ParentId = countries[x.First().Country].Id, Id = index + 1 }))
.ToDictionary(x => x.Item1, x => x.Item2);
var departments = models.GroupBy(x => x.Department)
.Select((x, index) => Tuple.Create(x.Key, new { ParentId = regions[x.First().Region].Id, Id = index + 1 }))
.ToDictionary(x => x.Item1, x => x.Item2);
var cities = models
.Select((x, index) => Tuple.Create(x.City, new { ParentId = departments[x.Department].Id, Id = index + 1 }))
.ToDictionary(x => x.Item1, x => x.Item2);
The main idea is to leverage the index parameter of the Select method and the speed of dictionaries to find the parent ID.
Sample output from a fiddle:
countries:
[France, { Id = 1 }],
[Italy, { Id = 2 }]
regions:
[FranceRegionA, { ParentId = 1, Id = 1 }],
[FranceRegionB, { ParentId = 1, Id = 2 }],
[ItalyRegionA, { ParentId = 2, Id = 3 }]
departments:
[FranceDept1, { ParentId = 1, Id = 1 }],
[FranceDept2, { ParentId = 1, Id = 2 }],
[FranceDept3, { ParentId = 2, Id = 3 }],
[ItalyDept1, { ParentId = 3, Id = 4 }],
[ItalyDept2, { ParentId = 3, Id = 5 }]
cities:
[FranceA, { ParentId = 1, Id = 1 }],
[FranceB, { ParentId = 1, Id = 2 }],
[FranceC, { ParentId = 2, Id = 3 }],
[FranceD, { ParentId = 3, Id = 4 }],
[ItalyA, { ParentId = 4, Id = 5 }],
[ItalyB, { ParentId = 5, Id = 6 }]
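Note that the dictionaries above restart the Id at 1 for each level and key each level by name alone. If one Id sequence across all levels is needed, or if the same name can appear under two different parents, a variation is to key nodes by their full path. The sketch below is one way to do that, not the answer's method: Node, TreeIds and Build are hypothetical names, the rows are illustrative, and it assumes no name contains the "/" separator.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Node
{
    public int Id;
    public int? ParentId;   // null for the root
    public string Name;
    public int Level;
}

public static class TreeIds
{
    // Each row lists the names from the root down, e.g. Country, Region, Department, City.
    public static List<Node> Build(IEnumerable<string[]> rows)
    {
        var nodes = new List<Node>();
        var byPath = new Dictionary<string, Node>();
        foreach (var row in rows)
        {
            Node parent = null;
            var path = "";
            for (int level = 0; level < row.Length; level++)
            {
                // The full path identifies a node, so equal names under different parents stay distinct.
                path += "/" + row[level];
                if (!byPath.TryGetValue(path, out var node))
                {
                    node = new Node { Id = nodes.Count + 1, ParentId = parent?.Id, Name = row[level], Level = level + 1 };
                    byPath[path] = node;
                    nodes.Add(node);
                }
                parent = node;
            }
        }
        return nodes;
    }

    public static void Main()
    {
        var rows = new[]
        {
            new[] { "France", "Auvergne-Rhone-Alpes", "Ain", "Belley" },
            new[] { "France", "Auvergne-Rhone-Alpes", "Ain", "Gex" },
            new[] { "France", "Auvergne-Rhone-Alpes", "Allier", "Moulins" },
        };
        foreach (var n in Build(rows))
            Console.WriteLine(n.Id + " " + n.Name + " parent=" + n.ParentId);
    }
}
```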

Cartesian Product of an arbitrary number of objects [duplicate]

This question already has answers here:
Is there a good LINQ way to do a cartesian product?
(3 answers)
Closed 4 years ago.
I'm looking to get the Cartesian Product of an arbitrary number of objects in c#. My situation is slightly unusual - my inputs are not lists of base types, but objects which have a property that's a list of base types.
My input and output objects are as follows:
public class Input
{
public string Label;
public List<int> Ids;
}
public class Result
{
public string Label;
public int Id;
}
Some sample input data:
var inputs = new List<Input>
{
new Input { Label = "List1", Ids = new List<int>{ 1, 2 } },
new Input { Label = "List2", Ids = new List<int>{ 2, 3 } },
new Input { Label = "List3", Ids = new List<int>{ 4 } }
};
And my expected output object:
var expectedResult = new List<List<Result>>
{
new List<Result>
{
new Result{Label = "List1", Id = 1},
new Result{Label = "List2", Id = 2},
new Result{Label = "List3", Id = 4}
},
new List<Result>
{
new Result{Label = "List1", Id = 1},
new Result{Label = "List2", Id = 3},
new Result{Label = "List3", Id = 4}
},
new List<Result>
{
new Result{Label = "List1", Id = 2},
new Result{Label = "List2", Id = 2},
new Result{Label = "List3", Id = 4}
},
new List<Result>
{
new Result{Label = "List1", Id = 2},
new Result{Label = "List2", Id = 3},
new Result{Label = "List3", Id = 4}
}
};
If I knew the number of items in 'inputs' in advance I could do this:
var knownInputResult =
from id1 in inputs[0].Ids
from id2 in inputs[1].Ids
from id3 in inputs[2].Ids
select
new List<Result>
{
new Result { Id = id1, Label = inputs[0].Label },
new Result { Id = id2, Label = inputs[1].Label },
new Result { Id = id3, Label = inputs[2].Label },
};
I'm struggling to adapt this to an arbitrary number of inputs - is there a possible way to do this?
I consider this a duplicate of the question linked in the comments, but since it was reopened and you are struggling to adapt that answer to your case, here is how.
First, take Eric Lippert's function from the duplicate question as is (how it works is explained there):
public static class Extensions {
public static IEnumerable<IEnumerable<T>> CartesianProduct<T>(this IEnumerable<IEnumerable<T>> sequences)
{
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
from accseq in accumulator
from item in sequence
select accseq.Concat(new[] { item })
);
}
}
Then flatten your input. Basically, just attach the corresponding label to each id:
var flatten = inputs.Select(c => c.Ids.Select(r => new Result {Label = c.Label, Id = r}));
Then run the Cartesian product and you are done:
// your expected result
var result = flatten.CartesianProduct().Select(r => r.ToList()).ToList();
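Put together, the three steps might look like the following self-contained sketch. The Input and Result classes and the CartesianProduct extension are the ones from the question and the answer; CartesianDemo and Combine are hypothetical wrapper names added so the pieces can be run and checked.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Input { public string Label; public List<int> Ids; }
public class Result { public string Label; public int Id; }

public static class Extensions
{
    // Eric Lippert's Cartesian product, as quoted above.
    public static IEnumerable<IEnumerable<T>> CartesianProduct<T>(this IEnumerable<IEnumerable<T>> sequences)
    {
        IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
        return sequences.Aggregate(
            emptyProduct,
            (accumulator, sequence) =>
                from accseq in accumulator
                from item in sequence
                select accseq.Concat(new[] { item }));
    }
}

public static class CartesianDemo
{
    public static List<List<Result>> Combine(IEnumerable<Input> inputs)
    {
        // Attach the label to every id, giving one sequence of Result per input.
        var flattened = inputs.Select(c => c.Ids.Select(id => new Result { Label = c.Label, Id = id }));
        return flattened.CartesianProduct().Select(r => r.ToList()).ToList();
    }

    public static void Main()
    {
        var inputs = new List<Input>
        {
            new Input { Label = "List1", Ids = new List<int> { 1, 2 } },
            new Input { Label = "List2", Ids = new List<int> { 2, 3 } },
            new Input { Label = "List3", Ids = new List<int> { 4 } },
        };
        foreach (var combo in Combine(inputs))
            Console.WriteLine(string.Join(", ", combo.Select(r => r.Label + "=" + r.Id)));
    }
}
```

For the sample input this yields four combinations, one per pairing of the List1 and List2 ids, each ending with List3=4.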
I'm not proud of the amount of time I spent messing with this, but it works.
It's basically black magic, and I would replace it the first chance you get.
public static List<List<Result>> Permutate(IEnumerable<Input> inputs)
{
var inputList = inputs.ToList();
List<List<Result>> results = new List<List<Result>>();
// The number of combinations is the product of the list sizes.
var size = inputList.Aggregate(1, (carry, inp) => carry * inp.Ids.Count);
if (size == 0) return results; // an empty Ids list means there are no combinations
for (int i = 0; i < size; i++) results.Add(new List<Result>());
// repeat = how many consecutive rows share the same Id of the current input
var repeat = size;
foreach (var input in inputList)
{
repeat /= input.Ids.Count;
for (int row = 0; row < size; row++)
{
var id = input.Ids[(row / repeat) % input.Ids.Count];
results[row].Add(new Result() { Label = input.Label, Id = id });
}
}
return results;
}

Linq query - List within List [duplicate]

This question already has answers here:
Group by in LINQ
(11 answers)
Closed 5 years ago.
I'm trying to select a list that contains Fund.Name and List<Investment>.
var funds = new List<Fund>
{
new Fund { Id = 1 , Name = "good" },
new Fund { Id = 2, Name = "bad" }
};
var investments = new List<Investment>
{
new Investment { Fund = funds[0], Value = 100 },
new Investment { Fund = funds[0], Value = 200 },
new Investment { Fund = funds[1], Value = 300 }
};
Then I'm trying to create the query with this:
var query = from f in funds
join i in investments
on f.Id equals i.Fund.Id
select new { f.Name, i };
I wanted something like this:
{ Name = good, {{ Id = 1, Value = 100 }, { Id = 1, Value = 200 }}},
{ Name = bad, { Id = 2, Value = 300 }}
But I'm getting something like this:
{ Name = good, { Id = 1, Value = 100 }},
{ Name = good, { Id = 1, Value = 200 }},
{ Name = bad, { Id = 2, Value = 300 }}
Try using GroupJoin.
var query = funds.GroupJoin(investments, f => f.Id, i => i.Fund.Id, (f, result) => new { f.Name, result });
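As a runnable illustration of that GroupJoin call, here is a small sketch using the Fund and Investment shapes implied by the question. GroupJoinDemo and ValuesByFund are hypothetical names, and the result is projected to the investment values only to keep the output short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Fund { public int Id; public string Name; }
public sealed class Investment { public Fund Fund; public int Value; }

public static class GroupJoinDemo
{
    // One entry per fund, carrying the values of all investments that reference it.
    public static List<KeyValuePair<string, List<int>>> ValuesByFund(List<Fund> funds, List<Investment> investments) =>
        funds.GroupJoin(
                investments,
                f => f.Id,
                i => i.Fund.Id,
                (f, matches) => new KeyValuePair<string, List<int>>(f.Name, matches.Select(i => i.Value).ToList()))
            .ToList();

    public static void Main()
    {
        var funds = new List<Fund>
        {
            new Fund { Id = 1, Name = "good" },
            new Fund { Id = 2, Name = "bad" },
        };
        var investments = new List<Investment>
        {
            new Investment { Fund = funds[0], Value = 100 },
            new Investment { Fund = funds[0], Value = 200 },
            new Investment { Fund = funds[1], Value = 300 },
        };
        foreach (var pair in ValuesByFund(funds, investments))
            Console.WriteLine(pair.Key + ": " + string.Join(", ", pair.Value));
    }
}
```

A fund with no investments still appears in the result, with an empty list, which is the main difference from a plain join.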

At least one object must implement IComparable error in linq query

var pairs = new [] { new { id = 1, name = "ram", dept = "IT", sal = "3000" }, new { id = 2, name = "ramesh", dept = "IT", sal = "5000" }, new { id = 3, name = "rahesh", dept = "NONIT", sal = "2000" },
new { id = 5, name = "rash", dept = "NONIT", sal = "7000" } };
var query = from stud in pairs
where (stud.name.StartsWith("r") && stud.id % 2 != 0)
//orderby stud.sal descending
group stud by stud.dept into grps
select new { Values = grps, Key = grps.Key, maxsal=grps.Max() };
////select new { id = stud.id };
foreach (dynamic result in query)
{
Console.WriteLine(result.Key);
Console.WriteLine(result.maxsal);
foreach (dynamic result2 in result.Values)
{
Console.WriteLine(result2.id + "," + result2.sal);
}
}
Console.Read();
I am getting the error "At least one object must implement IComparable." Can someone explain why I am getting this error?
You are calling grps.Max() to get the maximum item in each group, but your anonymous objects are not comparable. How would LINQ know which one is the maximum? Should it use the id property for the comparison, or name?
I believe you want to select max salary:
maxsal = grps.Max(s => Int32.Parse(s.sal))
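A compact sketch of the fixed query is shown below. It swaps the anonymous objects for a small hypothetical Employee class so the logic can sit in a reusable method (MaxDemo and MaxSalaryByDept are invented names), and it assumes every sal string holds a valid integer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Employee { public int Id; public string Name; public string Dept; public string Sal; }

public static class MaxDemo
{
    // Same filter as the question; Max is given a selector, so it compares ints rather than objects.
    public static Dictionary<string, int> MaxSalaryByDept(IEnumerable<Employee> employees) =>
        employees
            .Where(e => e.Name.StartsWith("r") && e.Id % 2 != 0)
            .GroupBy(e => e.Dept)
            .ToDictionary(g => g.Key, g => g.Max(e => int.Parse(e.Sal)));

    public static void Main()
    {
        var employees = new[]
        {
            new Employee { Id = 1, Name = "ram", Dept = "IT", Sal = "3000" },
            new Employee { Id = 2, Name = "ramesh", Dept = "IT", Sal = "5000" },
            new Employee { Id = 3, Name = "rahesh", Dept = "NONIT", Sal = "2000" },
            new Employee { Id = 5, Name = "rash", Dept = "NONIT", Sal = "7000" },
        };
        foreach (var pair in MaxSalaryByDept(employees))
            Console.WriteLine(pair.Key + ": " + pair.Value);
    }
}
```

Because the filter keeps only odd ids, ramesh (id 2) is excluded and the IT maximum comes from ram.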

Can a single LINQ Query Expression be framed in this scenario?

I am facing a scenario where I have to filter a single object based on many objects.
For the sake of example, I have a Grocery object which comprises both Fruit and Vegetable properties. Then I have the individual Fruit and Vegetable objects.
My objective is this:
var groceryList = from grocery in Grocery.ToList()
from fruit in Fruit.ToList()
from veggie in Vegetable.ToList()
where (grocery.fruitId == fruit.fruitId)
where (grocery.vegId == veggie.vegId)
select (grocery);
The problem I am facing is when Fruit and Vegetable objects are empty.
By empty, I mean their list count is 0 and I want to apply the filter only if the filter list is populated.
I am also NOT able to use something like the following, since the objects are null:
var groceryList = from grocery in Grocery.ToList()
from fruit in Fruit.ToList()
from veggie in Vegetable.ToList()
where (grocery.fruitId == fruit.fruitId || fruit.fruitId == String.Empty)
where (grocery.vegId == veggie.vegId || veggie.vegId == String.Empty)
select (grocery);
So, I intend to check for Fruit and Vegetable list count...and filter them as separate expressions on successively filtered Grocery objects.
But is there a way to still get the list in case of null objects in a single query expression?
I think the LINQ GroupJoin operator will help you here. It's similar to the TSQL LEFT OUTER JOIN.
IEnumerable<Grocery> query = Grocery;
if (Fruit != null)
{
query = query.Where(grocery =>
Fruit.Any(fruit => fruit.FruitId == grocery.FruitId));
}
if (Vegetable != null)
{
query = query.Where(grocery =>
Vegetable.Any(veggie => veggie.VegetableId == grocery.VegetableId));
}
List<Grocery> results = query.ToList();
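The question asks for a filter to be skipped when its list is empty as well as when it is null, so the conditional approach can be extended along these lines. This is a sketch under simplifying assumptions: the filters are passed as plain collections of ids rather than Fruit and Vegetable objects, and Grocery, GroceryFilter and Filter are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Grocery { public string Name; public string FruitId; public string VegId; }

public static class GroceryFilter
{
    // A null or empty id collection means "do not filter on this property".
    public static List<Grocery> Filter(IEnumerable<Grocery> groceries, ICollection<string> fruitIds, ICollection<string> vegIds)
    {
        IEnumerable<Grocery> query = groceries;
        if (fruitIds != null && fruitIds.Count > 0)
        {
            var wanted = new HashSet<string>(fruitIds);
            query = query.Where(g => wanted.Contains(g.FruitId));
        }
        if (vegIds != null && vegIds.Count > 0)
        {
            var wanted = new HashSet<string>(vegIds);
            query = query.Where(g => wanted.Contains(g.VegId));
        }
        return query.ToList();
    }

    public static void Main()
    {
        var groceries = new List<Grocery>
        {
            new Grocery { Name = "a", FruitId = "f1", VegId = "v1" },
            new Grocery { Name = "b", FruitId = "f2", VegId = "v1" },
            new Grocery { Name = "c", FruitId = "f1", VegId = "v2" },
        };
        // Only the fruit filter is populated, so the vegetable filter is skipped.
        foreach (var g in Filter(groceries, new List<string> { "f1" }, new List<string>()))
            Console.WriteLine(g.Name);
    }
}
```

The Where calls are only composed when needed, so the query is still evaluated in a single pass by the final ToList.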
Try something like the following:
var joined = grocery.Join(fruit, g => g.fruitId,
f => f.fruitId,
(g, f) => new Grocery() { /*set grocery properties*/ }).
Join(veggie, g => g.vegId,
v => v.vegId,
(g, v) => new Grocery() { /*set grocery properties*/ });
Where I have said "set grocery properties", you can set the properties of the grocery object from the g, f, v variables of the selector. Of interest will obviously be setting g.fruitId = f.fruitId and g.vegId = v.vegId.
var groceryList =
from grocery in Grocery.ToList()
join fruit in Fruit.ToList()
on grocery.fruitId equals fruit.fruitId
into groceryFruits
join veggie in Vegetable.ToList()
on grocery.vegId equals veggie.vegId
into groceryVeggies
where ... // filter as needed
select new
{
Grocery = grocery,
GroceryFruits = groceryFruits,
GroceryVeggies = groceryVeggies
};
You have to use a left outer join (as in T-SQL) for this. The query below does the trick:
private void test()
{
var grocery = new List<groceryy>() { new groceryy { fruitId = 1, vegid = 1, name = "s" }, new groceryy { fruitId = 2, vegid = 2, name = "a" }, new groceryy { fruitId = 3, vegid = 3, name = "h" } };
var fruit = new List<fruitt>() { new fruitt { fruitId = 1, fname = "s" }, new fruitt { fruitId = 2, fname = "a" } };
var veggie = new List<veggiee>() { new veggiee { vegid = 1, vname = "s" }, new veggiee { vegid = 2, vname = "a" } };
//var fruit= new List<fruitt>();
//var veggie = new List<veggiee>();
var result = from g in grocery
join f in fruit on g.fruitId equals f.fruitId into tempFruit
join v in veggie on g.vegid equals v.vegid into tempVegg
from joinedFruit in tempFruit.DefaultIfEmpty()
from joinedVegg in tempVegg.DefaultIfEmpty()
select new { g.fruitId, g.vegid, fname = ((joinedFruit == null) ? string.Empty : joinedFruit.fname), vname = ((joinedVegg == null) ? string.Empty : joinedVegg.vname) };
foreach (var outt in result)
Console.WriteLine(outt.fruitId + " " + outt.vegid + " " + outt.fname + " " + outt.vname);
}
public class groceryy
{
public int fruitId;
public int vegid;
public string name;
}
public class fruitt
{
public int fruitId;
public string fname;
}
public class veggiee
{
public int vegid;
public string vname;
}
EDIT:
This is the sample result:
1 1 s s
2 2 a a
3 3
