Convert data collection into nested hierarchical object list - c#

I want to make an API call that returns every unique survey ID, each with its answers grouped by distinct answer value, a count per answer, and the list of users who gave that answer. For example, given an ICollection<Survey>:
ID Survey_Id Answer User
1 Apple_Survey 1 Jones
2 Apple_Survey 1 Smith
3 Banana_Survey 2 Smith
4 Apple_Survey 3 Jane
5 Banana_Survey 2 John
The API result I currently have:
{Data: [
{
survey_id: "Apple_Survey",
answer: "1",
user: "Jones"
},
...
]}
Where I get stuck is in the code to process the data:
foreach (var info in data
.GroupBy(x => x.Survey_Id)
.Select(group => new { SurveyId = group.Key,
Count = group.Count() }) )
{
Console.WriteLine("{0} {1}", info.SurveyId, info.Count);
//Result: Apple_Survey 3 Banana_Survey 2
}
Ideal results:
{Data: [
{
survey_id: "Apple_Survey",
answers: [//Example: rating answer would be 1-10, not an ID
{answer: "1", count: 2, users: ["Jones", "Smith"]},
{answer: "3", count: 1, users: ["Jane"]}
]
},
...
]}
How can I get the distinct answers based on survey_id and the list of users based on the answer? Any help would be greatly appreciated!

See if the following helps:
class Program
{
static void Main(string[] args)
{
List<Survey> surveys = new List<Survey>() {
new Survey() { ID = 1, Survey_Id = "Apple_Survey", Answer = 1, User = "Jones"},
new Survey() { ID = 2, Survey_Id = "Apple_Survey", Answer = 1, User = "Smith"},
new Survey() { ID = 3, Survey_Id = "Banana_Survey", Answer = 2, User = "Smith"},
new Survey() { ID = 4, Survey_Id = "Apple_Survey", Answer = 3, User = "Jane"},
new Survey() { ID = 5, Survey_Id = "Banana_Survey", Answer = 2, User = "John"}
};
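// Keep the survey ID next to its grouped answers so each result can be traced back to its survey
var results = surveys.GroupBy(x => x.Survey_Id)
.Select(x => new { survey_id = x.Key, answers = x.GroupBy(y => y.Answer)
.Select(y => new { answer = y.Key, count = y.Count(), users = y.Select(z => z.User).ToList()}).ToList() })
.ToList();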
}
}
public class Survey
{
public int ID { get; set; }
public string Survey_Id { get; set; }
public int Answer { get; set; }
public string User { get; set; }
}

A simple way is to do it in SQL alone. In MySQL, for example (group_concat is not available in every database), you could use a query such as:
select Survey_Id, Answer, COUNT(*) answer_count, group_concat(user) answer_user
from my_table
group by Survey_Id, Answer

I'd go for
table.GroupBy( x => x.Survey_Id ).Select( x => new { Survey_Id=x.Key, Answers=x.GroupBy( y => y.Answer ).Select( y => new { Answer=y.Key, Count=y.Count(), Users=y.Select( z => z.User)})} )
That creates an IEnumerable of pairs of a survey and an IEnumerable of answers, each with its count and an IEnumerable of the users who gave that answer.
Try it out on dotnetfiddle.net!
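If the result has to be returned from an API, named types may be easier to serialize and reuse than anonymous ones. The following is a minimal sketch of the same two-level GroupBy, assuming hypothetical result classes SurveySummary and AnswerSummary (neither name comes from the question) and the Survey class from the answer above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Survey
{
    public int ID { get; set; }
    public string Survey_Id { get; set; }
    public int Answer { get; set; }
    public string User { get; set; }
}

// Hypothetical result types, shaped like the "ideal results" JSON in the question.
public class AnswerSummary
{
    public int Answer { get; set; }
    public int Count { get; set; }
    public List<string> Users { get; set; }
}

public class SurveySummary
{
    public string SurveyId { get; set; }
    public List<AnswerSummary> Answers { get; set; }
}

public static class SurveyGrouping
{
    // Group by survey first, then by answer value inside each survey.
    public static List<SurveySummary> Summarize(IEnumerable<Survey> rows)
    {
        return rows
            .GroupBy(r => r.Survey_Id)
            .Select(s => new SurveySummary
            {
                SurveyId = s.Key,
                Answers = s
                    .GroupBy(r => r.Answer)
                    .Select(a => new AnswerSummary
                    {
                        Answer = a.Key,
                        Count = a.Count(),
                        Users = a.Select(r => r.User).ToList()
                    })
                    .ToList()
            })
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var rows = new List<Survey>
        {
            new Survey { ID = 1, Survey_Id = "Apple_Survey", Answer = 1, User = "Jones" },
            new Survey { ID = 2, Survey_Id = "Apple_Survey", Answer = 1, User = "Smith" },
            new Survey { ID = 3, Survey_Id = "Banana_Survey", Answer = 2, User = "Smith" },
            new Survey { ID = 4, Survey_Id = "Apple_Survey", Answer = 3, User = "Jane" },
            new Survey { ID = 5, Survey_Id = "Banana_Survey", Answer = 2, User = "John" }
        };

        foreach (var survey in SurveyGrouping.Summarize(rows))
        {
            Console.WriteLine(survey.SurveyId);
            foreach (var a in survey.Answers)
            {
                Console.WriteLine("  answer {0}: count {1}, users {2}",
                    a.Answer, a.Count, string.Join(", ", a.Users));
            }
        }
    }
}
```

With the five sample rows, Apple_Survey ends up with two answer entries (answer 1 with Jones and Smith, answer 3 with Jane) and Banana_Survey with one (answer 2 with Smith and John).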

Related

Data Set to Tree Structure

I have the below set of data
Where each City belongs to a specific Department, which belongs to a specific Region, which belongs to a specific Country (in this case there is only one country: France).
This data is contained in a CSV file which I can read from on a row-by-row basis, however my goal is to convert this data into a tree structure (with France being at the root).
Each of these nodes will be given a specific Id value, which is something I've already gone and done, but the tricky part is that each node here must also contain a ParentId (for instance Belley and Gex need the ParentId of Ain, but Moulins and Vichy need the ParentId of Allier).
Below is a snippet of code I've written that has assigned an Id value to each name in this data set, along with some other values:
int id = 0;
List<CoverageAreaLevel> coverageAreas = GetCoverageAreaDataFromCsv(path, true);
List<LevelList> levelLists = new List<LevelList>
{
new LevelList { Names = coverageAreas.Select(a => a.Level1).Distinct().ToList(), Level = "1" },
new LevelList { Names = coverageAreas.Select(a => a.Level2).Distinct().ToList(), Level = "2" },
new LevelList { Names = coverageAreas.Select(a => a.Level3).Distinct().ToList(), Level = "3" },
new LevelList { Names = coverageAreas.Select(a => a.Level4).Distinct().ToList(), Level = "4" }
};
List<CoverageArea> newCoverageAreas = new List<CoverageArea>();
foreach (LevelList levelList in levelLists)
{
foreach (string name in levelList.Names)
{
CoverageArea coverageArea = new CoverageArea
{
Id = id++.ToString(),
Description = name,
FullDescription = name,
Level = levelList.Level
};
newCoverageAreas.Add(coverageArea);
}
}
The levelLists variable contains a sort-of hierarchical structure of the data that I'm looking for, but none of the items in that list are linked together by anything.
Any idea of how this could be implemented? I can manually figure out each ParentId, but I'd like to automate this process, especially if this needs to be done in the future.
The solution from @Camilo is really good and pragmatic. I would also suggest the use of a tree.
A sample implementation:
var countries = models.GroupBy(xco => xco.Country)
.Select(xco =>
{
var country = new Tree<String>();
country.Value = xco.Key;
country.Children = xco.GroupBy(xr => xr.Region)
.Select(xr =>
{
var region = new Tree<String>();
region.Value = xr.Key;
region.Parent = country;
region.Children =
xr.GroupBy(xd => xd.Department)
.Select(xd =>
{
var department = new Tree<String>();
department.Value = xd.Key;
department.Parent = region;
// ToList() materializes the children once, so every enumeration sees the same node objects
department.Children = xd
.Select(xc => new Tree<String> { Value = xc.City, Parent = department })
.ToList();
return department;
})
.ToList();
return region;
})
.ToList();
return country;
})
.ToList();
public class Tree<T>
{
public IEnumerable<Tree<T>> Children;
public T Value;
public Tree<T> Parent;
}
One way you could solve this is by building dictionaries with the names and IDs of each level.
Assuming you have data like this:
var models = new List<Model>
{
new Model { Country = "France", Region = "FranceRegionA", Department = "FranceDept1", City = "FranceA" },
new Model { Country = "France", Region = "FranceRegionA", Department = "FranceDept1", City = "FranceB" },
new Model { Country = "France", Region = "FranceRegionA", Department = "FranceDept2", City = "FranceC" },
new Model { Country = "France", Region = "FranceRegionB", Department = "FranceDept3", City = "FranceD" },
new Model { Country = "Italy", Region = "ItalyRegionA", Department = "ItalyDept1", City = "ItalyA" },
new Model { Country = "Italy", Region = "ItalyRegionA", Department = "ItalyDept2", City = "ItalyB" },
};
You could do something like this, which can probably be improved further if needed:
var countries = models.GroupBy(x => x.Country)
.Select((x, index) => Tuple.Create(x.Key, new { Id = index + 1 }))
.ToDictionary(x => x.Item1, x => x.Item2);
var regions = models.GroupBy(x => x.Region)
.Select((x, index) => Tuple.Create(x.Key, new { ParentId = countries[x.First().Country].Id, Id = index + 1 }))
.ToDictionary(x => x.Item1, x => x.Item2);
var departments = models.GroupBy(x => x.Department)
.Select((x, index) => Tuple.Create(x.Key, new { ParentId = regions[x.First().Region].Id, Id = index + 1 }))
.ToDictionary(x => x.Item1, x => x.Item2);
var cities = models
.Select((x, index) => Tuple.Create(x.City, new { ParentId = departments[x.Department].Id, Id = index + 1 }))
.ToDictionary(x => x.Item1, x => x.Item2);
The main idea is to leverage the index parameter of the Select method and the speed of dictionaries to find the parent ID.
Sample output from a fiddle:
countries:
[France, { Id = 1 }],
[Italy, { Id = 2 }]
regions:
[FranceRegionA, { ParentId = 1, Id = 1 }],
[FranceRegionB, { ParentId = 1, Id = 2 }],
[ItalyRegionA, { ParentId = 2, Id = 3 }]
departments:
[FranceDept1, { ParentId = 1, Id = 1 }],
[FranceDept2, { ParentId = 1, Id = 2 }],
[FranceDept3, { ParentId = 2, Id = 3 }],
[ItalyDept1, { ParentId = 3, Id = 4 }],
[ItalyDept2, { ParentId = 3, Id = 5 }]
cities:
[FranceA, { ParentId = 1, Id = 1 }],
[FranceB, { ParentId = 1, Id = 2 }],
[FranceC, { ParentId = 2, Id = 3 }],
[FranceD, { ParentId = 3, Id = 4 }],
[ItalyA, { ParentId = 4, Id = 5 }],
[ItalyB, { ParentId = 5, Id = 6 }]
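Note that the dictionaries above are keyed on the name alone and restart their Ids at 1 for every level, so two departments with the same name in different regions would collide, and an Id is only unique within its level. If a single flat list with globally unique Ids is needed, as the question's CoverageArea list suggests, one possible sketch is to key each node by its full path. The Node type, the FlatTree.Build helper and the "/" path separator are assumptions made for illustration, not part of the original answer:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Model
{
    public string Country { get; set; }
    public string Region { get; set; }
    public string Department { get; set; }
    public string City { get; set; }
}

// Hypothetical flat node type; the question's CoverageArea has more fields.
public class Node
{
    public int Id { get; set; }
    public int? ParentId { get; set; }   // null for root nodes (countries)
    public string Name { get; set; }
    public int Level { get; set; }
}

public static class FlatTree
{
    public static List<Node> Build(IEnumerable<Model> rows)
    {
        var nodes = new List<Node>();
        // Maps a full path such as "/France/FranceRegionA" to the Id assigned to it.
        // Assumption: names never contain the "/" separator.
        var idByPath = new Dictionary<string, int>();

        foreach (var row in rows)
        {
            var names = new[] { row.Country, row.Region, row.Department, row.City };
            int? parentId = null;
            string path = "";

            for (int level = 0; level < names.Length; level++)
            {
                path = path + "/" + names[level];
                int id;
                if (!idByPath.TryGetValue(path, out id))
                {
                    // First time this path is seen: create the node with the next free Id.
                    id = nodes.Count + 1;
                    idByPath[path] = id;
                    nodes.Add(new Node
                    {
                        Id = id,
                        ParentId = parentId,
                        Name = names[level],
                        Level = level + 1
                    });
                }
                parentId = id;   // this node is the parent of the next level down
            }
        }
        return nodes;
    }
}

public static class Program
{
    public static void Main()
    {
        var models = new List<Model>
        {
            new Model { Country = "France", Region = "FranceRegionA", Department = "FranceDept1", City = "FranceA" },
            new Model { Country = "France", Region = "FranceRegionA", Department = "FranceDept1", City = "FranceB" },
            new Model { Country = "Italy", Region = "ItalyRegionA", Department = "ItalyDept1", City = "ItalyA" }
        };

        foreach (var n in FlatTree.Build(models))
        {
            Console.WriteLine("Id={0} ParentId={1} Level={2} Name={3}", n.Id, n.ParentId, n.Level, n.Name);
        }
    }
}
```

Because the lookup key is the whole path rather than the bare name, a node is only reused when all of its ancestors match as well.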

Linq group by and then get only one of the items

I have the following class which is used to assign tasks to employees.
public class TaskDetails
{
public int TaskGroupId { get; set; }
public int TaskId { get; set; }
public string AssignedTo { get; set; }
public string TaskName { get; set; }
}
Typically we get a list of task groups and who they are assigned to, like the following. Tasks are grouped under a TaskGroupId, and within each group every task has its own TaskId and the person responsible for it.
List<TaskDetails> tasks = new List<TaskDetails>
{
new TaskDetails
{
AssignedTo = "JOHN",
TaskGroupId = 100,
TaskId = 1,
TaskName = "FA"
},
new TaskDetails
{
AssignedTo = "TOM",
TaskGroupId = 100,
TaskId = 1,
TaskName = "FA"
},
new TaskDetails
{
AssignedTo = "JOHN",
TaskGroupId = 100,
TaskId = 2,
TaskName = "GH"
},
new TaskDetails
{
AssignedTo = "TOM",
TaskGroupId = 100,
TaskId = 2,
TaskName = "GH"
},
new TaskDetails
{
AssignedTo = "JOHN",
TaskGroupId = 99,
TaskId = 1,
TaskName = "XY"
},
new TaskDetails
{
AssignedTo = "TOM",
TaskGroupId = 99,
TaskId = 1,
TaskName = "XY"
},
new TaskDetails
{
AssignedTo = "JOHN",
TaskGroupId = 99,
TaskId = 2,
TaskName = "YX"
},
new TaskDetails
{
AssignedTo = "TOM",
TaskGroupId = 99,
TaskId = 2,
TaskName = "YX"
}
};
What I am trying to do is group the tasks by TaskGroupId and AssignedTo. However, if a task is assigned to more than one person I only need one of them back. In the example above, task 1 is assigned to John and Tom, but I only need one of them, and it does not matter which. I have tried the following, but it returns 4 results with both John and Tom, as seen in the screenshot. I could remove x.AssignedTo from the GroupBy statement, which gives me 2 results, but then the tasks are repeated in the TaskLegs section, so that is not useful either.
var result = tasks
.GroupBy(x => new {
x.TaskGroupId,
x.AssignedTo })
.Select(group => new {
GroupDetails = group.Key,
TaskLegs = group
.OrderBy(x => x.TaskId)
.ToList() })
.ToList();
Is there any way of grouping the results so that I retrieve only one of the results from each grouped result set? Based on the above example I am trying to get 2 results, one for task group 100 and one for task group 99.
Thanks
If you only want one item per TaskGroupId then you only group by that field. Grouping by TaskGroupId and AssignedTo means that you will get one group for each combination of the two which is why you are getting four items.
So your query must start:
tasks.GroupBy (x => x.TaskGroupId)
You then have two groups (for TaskGroupId 99 and 100).
You then need to select the data into the form you want. I'm a little unclear on what form this data should take. Should it just have one task under each group?
If so something like:
.Select(group => new { TaskGroupId = group.Key, TaskLegs = group.OrderBy(x => x.TaskId).First() })
If it should have each distinct task in there then you will need to do some more grouping:
.Select(group => new {
TaskGroupId = group.Key,
TaskLegs = group
.GroupBy(x=>x.TaskId)
.Select(y => y.First())
.OrderBy(y=>y.TaskId)
})
This will give you two items, one for each taskgroup. Each of those items will have two items in the TaskLegs, for TaskIds 1 and 2.
Bonus thought:
If you wanted to list all people assigned to a task you could change your definition of TaskLegs to:
TaskLegs = group
.GroupBy(x=>x.TaskId)
.Select(y => new {
TaskId = y.Key,
AssignedTo = y.Select(z => z.AssignedTo)
})
.OrderBy(y=>y.TaskId)
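Putting the pieces of this answer together, a self-contained sketch might look like the following. The TaskGrouping.OnePerTask helper and its dictionary return type are assumptions chosen for illustration; the grouping itself is the GroupBy on TaskGroupId followed by a nested GroupBy on TaskId described above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TaskDetails
{
    public int TaskGroupId { get; set; }
    public int TaskId { get; set; }
    public string AssignedTo { get; set; }
    public string TaskName { get; set; }
}

public static class TaskGrouping
{
    // One dictionary entry per TaskGroupId; inside it, one TaskDetails per TaskId.
    // When a task has several assignees, First() keeps whichever appears first in the input.
    public static Dictionary<int, List<TaskDetails>> OnePerTask(IEnumerable<TaskDetails> tasks)
    {
        return tasks
            .GroupBy(t => t.TaskGroupId)
            .ToDictionary(
                g => g.Key,
                g => g.GroupBy(t => t.TaskId)
                      .Select(sameTask => sameTask.First())
                      .OrderBy(t => t.TaskId)
                      .ToList());
    }
}

public static class Program
{
    public static void Main()
    {
        var tasks = new List<TaskDetails>
        {
            new TaskDetails { AssignedTo = "JOHN", TaskGroupId = 100, TaskId = 1, TaskName = "FA" },
            new TaskDetails { AssignedTo = "TOM",  TaskGroupId = 100, TaskId = 1, TaskName = "FA" },
            new TaskDetails { AssignedTo = "JOHN", TaskGroupId = 100, TaskId = 2, TaskName = "GH" },
            new TaskDetails { AssignedTo = "TOM",  TaskGroupId = 100, TaskId = 2, TaskName = "GH" },
            new TaskDetails { AssignedTo = "JOHN", TaskGroupId = 99,  TaskId = 1, TaskName = "XY" },
            new TaskDetails { AssignedTo = "TOM",  TaskGroupId = 99,  TaskId = 1, TaskName = "XY" }
        };

        foreach (var group in TaskGrouping.OnePerTask(tasks))
        {
            Console.WriteLine("Group {0}", group.Key);
            foreach (var leg in group.Value)
            {
                Console.WriteLine("  Task {0} ({1}) -> {2}", leg.TaskId, leg.TaskName, leg.AssignedTo);
            }
        }
    }
}
```

For this input the dictionary has two keys, 100 and 99, with two legs under group 100 and one under group 99.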
Try this:
var result = tasks.GroupBy(x => x.TaskGroupId) .Select(group => new { GroupDetails = group.Key, TaskLegs = group.Take(1) }) .ToList();

Add duplicates together in List

First question :)
I have a List<Materiau> (where Materiau implements IComparable<Materiau>), and I would like to remove all duplicates and add them together
(if two Materiau are the same according to the comparator, merge the second into the first and remove the second from the list).
A Materiau contains an ID and a quantity. When I merge two Materiau using += or +, the result keeps the same ID and the quantities are added.
I cannot control the input of the list.
I would like something like this:
List<Materiau> materiaux = getList().mergeDuplicates();
Thank you for your time :)
Check out Linq! Specifically the GroupBy method.
I don't know how familiar you are with sql, but Linq lets you query collections similarly to how sql works.
It's a bit in depth to explain if you are totally unfamiliar, but Code Project has a wonderful example
To sum it up:
Imagine we have this
List<Product> prodList = new List<Product>
{
new Product
{
ID = 1,
Quantity = 1
},
new Product
{
ID = 2,
Quantity = 2
},
new Product
{
ID = 3,
Quantity = 7
},
new Product
{
ID = 4,
Quantity = 3
}
};
and we wanted to group all the duplicate products, and sum their quantities.
We can do this:
var groupedProducts = prodList.GroupBy(item => item.ID)
and then select the values out of the grouping, with the aggregates as needed
var results = groupedProducts.Select( i => new Product
{
ID = i.Key, // this is what we Grouped By above
Quantity = i.Sum(prod => prod.Quantity) // we want to sum up all the quantities in this grouping
});
and boom! we have a list of aggregated products
Let's say you have a class
class Foo
{
public int Id { get; set; }
public int Value { get; set; }
}
and a bunch of them inside a list
var foocollection = new List<Foo> {
new Foo { Id = 1, Value = 1, },
new Foo { Id = 2, Value = 1, },
new Foo { Id = 2, Value = 1, },
};
then you can group them and build the aggregate on each group
var foogrouped = foocollection
.GroupBy( f => f.Id )
.Select( g => new Foo { Id = g.Key, Value = g.Aggregate( 0, ( a, f ) => a + f.Value ) } )
.ToList();
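To get the getList().mergeDuplicates() call shape the question asks for, the grouping can be wrapped in an extension method. This is only a sketch: the Materiau class below is a hypothetical stand-in with just an Id and a Quantity, and it assumes that two Materiau count as duplicates exactly when their Ids match (the asker's real comparator may differ):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the asker's class; only the two fields mentioned in the question.
public class Materiau
{
    public int Id { get; set; }
    public int Quantity { get; set; }
}

public static class MateriauExtensions
{
    // Returns a new list with one Materiau per Id, its Quantity being the sum over all duplicates.
    // The input sequence is not modified.
    public static List<Materiau> MergeDuplicates(this IEnumerable<Materiau> source)
    {
        return source
            .GroupBy(m => m.Id)
            .Select(g => new Materiau { Id = g.Key, Quantity = g.Sum(m => m.Quantity) })
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var materiaux = new List<Materiau>
        {
            new Materiau { Id = 1, Quantity = 1 },
            new Materiau { Id = 2, Quantity = 2 },
            new Materiau { Id = 2, Quantity = 5 },
            new Materiau { Id = 1, Quantity = 3 }
        };

        foreach (var m in materiaux.MergeDuplicates())
        {
            Console.WriteLine("Id={0} Quantity={1}", m.Id, m.Quantity);
        }
    }
}
```

If the class already defines a + operator, the Sum could be swapped for g.Aggregate((a, b) => a + b), which reuses the existing merge logic instead of rebuilding the object by hand.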
List<Materiau> distinctList = getList().Distinct(EqualityComparer<Materiau>.Default).ToList();

Code to collapse duplicate and semi-duplicate records?

I have a list of models of this type:
public class TourDude {
public int Id { get; set; }
public string Name { get; set; }
}
And here is my list:
public IEnumerable<TourDude> GetAllGuides {
get {
List<TourDude> guides = new List<TourDude>();
guides.Add(new TourDude() { Name = "Dave Et", Id = 1 });
guides.Add(new TourDude() { Name = "Dave Eton", Id = 1 });
guides.Add(new TourDude() { Name = "Dave EtZ5", Id = 1 });
guides.Add(new TourDude() { Name = "Danial Maze A", Id = 2 });
guides.Add(new TourDude() { Name = "Danial Maze B", Id = 2 });
guides.Add(new TourDude() { Name = "Danial", Id = 3 });
return guides;
}
}
I want to retrieve these records:
{ Name = "Dave Et", Id = 1 }
{ Name = "Danial Maze", Id = 2 }
{ Name = "Danial", Id = 3 }
The goal is mainly to collapse duplicates and near duplicates (confirmable by the ID), taking the shortest possible value (when compared) as the name.
Where do I start? Is there a complete LINQ that will do this for me? Do I need to code up an equality comparer?
Edit 1:
var result = from x in GetAllGuides
group x.Name by x.Id into g
select new TourDude {
Test = Exts.LongestCommonPrefix(g),
Id = g.Key,
};
IEnumerable<IEnumerable<char>> test = result.First().Test;
string str = test.First().ToString();
If you want to group the items by Id and then find the longest common prefix of the Names within each group, then you can do so as follows:
var result = from x in guides
group x.Name by x.Id into g
select new TourDude
{
Name = LongestCommonPrefix(g),
Id = g.Key,
};
using the algorithm for finding the longest common prefix from here.
Result:
{ Name = "Dave Et", Id = 1 }
{ Name = "Danial Maze ", Id = 2 }
{ Name = "Danial", Id = 3 }
static string LongestCommonPrefix(IEnumerable<string> xs)
{
return new string(xs
.Transpose()
.TakeWhile(s => s.All(d => d == s.First()))
.Select(s => s.First())
.ToArray());
}
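Transpose() in the snippet above is not a built-in LINQ method; it comes from the linked algorithm. As an alternative, here is a minimal sketch of a longest-common-prefix helper that uses only the base class library. The Prefixes class name is an assumption, and the TrimEnd() call is added here to drop the trailing space seen in the "Danial Maze " result:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TourDude
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class Prefixes
{
    // Shrinks a candidate prefix against each string in turn.
    // Returns "" for an empty input sequence.
    public static string LongestCommonPrefix(IEnumerable<string> values)
    {
        var list = values.ToList();
        if (list.Count == 0) return "";

        string prefix = list[0];
        foreach (var s in list.Skip(1))
        {
            int i = 0;
            while (i < prefix.Length && i < s.Length && prefix[i] == s[i])
            {
                i++;
            }
            prefix = prefix.Substring(0, i);
        }
        return prefix;
    }
}

public static class Program
{
    public static void Main()
    {
        var guides = new List<TourDude>
        {
            new TourDude { Name = "Dave Et", Id = 1 },
            new TourDude { Name = "Dave Eton", Id = 1 },
            new TourDude { Name = "Dave EtZ5", Id = 1 },
            new TourDude { Name = "Danial Maze A", Id = 2 },
            new TourDude { Name = "Danial Maze B", Id = 2 },
            new TourDude { Name = "Danial", Id = 3 }
        };

        var result = guides
            .GroupBy(g => g.Id)
            .Select(g => new TourDude
            {
                Id = g.Key,
                // TrimEnd() removes the trailing space left when names differ after a space.
                Name = Prefixes.LongestCommonPrefix(g.Select(x => x.Name)).TrimEnd()
            });

        foreach (var dude in result)
        {
            Console.WriteLine("{{ Name = \"{0}\", Id = {1} }}", dude.Name, dude.Id);
        }
    }
}
```

The comparison is character by character and case-sensitive, so "dave" and "Dave" share no prefix under this sketch.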
I was able to achieve this by grouping the records on the ID then selecting the first record from each group ordered by the Name length:
var result = GetAllGuides.GroupBy(td => td.Id)
.Select(g => g.OrderBy(td => td.Name.Length).First());
foreach (var dude in result)
{
Console.WriteLine("{{Name = {0}, Id = {1}}}", dude.Name, dude.Id);
}

Count of flattened parent child association in LINQ

I'm trying to get a count of parents with no children plus the parents' children. As I write this I realize it is better explained with code, so here it goes:
With these example types:
public class Customer
{
public int Id { get; set; }
public string Name { get; set; }
public List<Order> Orders { get; set; }
}
public class Order
{
public int Id { get; set; }
public string Description { get; set; }
}
And this data:
var customers = new List<Customer>
{
new Customer
{
Id = 2,
Name = "Jane Doe"
},
new Customer
{
Id = 1,
Name = "John Doe",
Orders = new List<Order>
{
new Order { Id = 342, Description = "Ordered a ball" },
new Order { Id = 345, Description = "Ordered a bat" }
}
}
};
// I'm trying to get a count of customer orders added with customers with no orders
// In the above data, I would expect a count of 3 as detailed below
//
// CId Name OId
// ---- -------- ----
// 2 Jane Doe
// 1 John Doe 342
// 1 John Doe 345
int customerAndOrdersCount = {linq call here}; // equals 3
I am trying to get a count of 3 back.
Thank you in advance for your help.
-Jessy Houle
ADDED AFTER:
I was truly impressed with all the great (and quick) answers. For others coming to this question, looking for a few options, here is a Unit Test with a few of the working examples from below.
[TestMethod]
public void TestSolutions()
{
var customers = GetCustomers(); // data from above
var count1 = customers.Select(customer => customer.Orders).Sum(orders => (orders != null) ? orders.Count() : 1);
var count2 = (from c in customers from o in (c.Orders ?? Enumerable.Empty<Order>() ).DefaultIfEmpty() select c).Count();
var count3 = customers.Sum(c => c.Orders == null ? 1 : c.Orders.Count());
var count4 = customers.Sum(c => c.Orders==null ? 1 : Math.Max(1, c.Orders.Count()));
Assert.AreEqual(3, count1);
Assert.AreEqual(3, count2);
Assert.AreEqual(3, count3);
Assert.AreEqual(3, count4);
}
Again, thank you all for your help!
How about
int customerAndOrdersCount = customers.Sum(c => c.Orders==null ? 1 : Math.Max(1, c.Orders.Count()));
If you would initialize that Order property with an empty list instead of a null, you could do:
int count =
(
from c in customers
from o in c.Orders.DefaultIfEmpty()
select c
).Count();
If you decide to keep the uninitialized property around, then instead do:
int count =
(
from c in customers
from o in (c.Orders ?? Enumerable.Empty<Order>() ).DefaultIfEmpty()
select c
).Count();
customers
.Select(customer => customer.Orders)
.Sum(orders => (orders != null) ? orders.Count() : 1)
This works if you want to count "no orders" as 1 and count the orders otherwise:
int customerOrders = customers.Sum(c => c.Orders == null ? 1 : c.Orders.Count());
By the way, the question is very exemplary.
You are probably looking for something like this:
customers.GroupBy(customer=>customer). //group by the object itself
Select(c=> //select
new
{
ID = c.Key.Id,
Name = c.Key.Name,
Count = (c.Key.Orders!=null)? c.Key.Orders.Count():0
}
);
var orderFreeCustomers = customers.Where(c=>c.Orders== null || c.Orders.Any()==false);
var totalOrders = customers.Where (c => c.Orders !=null).
Aggregate (0,(v,e)=>(v+e.Orders.Count) );
The result is orderFreeCustomers.Count() + totalOrders, i.e. the number of customers without orders plus the total number of orders.
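The answers above differ in how they treat a customer whose Orders list is empty rather than null: the variants without Math.Max count such a customer as 0. A compact sketch that treats null and empty the same way, using the null-conditional operator (C# 6 or later), could look like this. The CustomerCounting.CountRows helper name is an assumption:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Order
{
    public int Id { get; set; }
    public string Description { get; set; }
}

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<Order> Orders { get; set; }
}

public static class CustomerCounting
{
    // c.Orders?.Count is null when Orders is null; "?? 0" turns that into 0,
    // and Math.Max(1, ...) makes a customer without orders count as one row.
    public static int CountRows(IEnumerable<Customer> customers)
    {
        return customers.Sum(c => Math.Max(1, c.Orders?.Count ?? 0));
    }
}

public static class Program
{
    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { Id = 2, Name = "Jane Doe" },
            new Customer
            {
                Id = 1,
                Name = "John Doe",
                Orders = new List<Order>
                {
                    new Order { Id = 342, Description = "Ordered a ball" },
                    new Order { Id = 345, Description = "Ordered a bat" }
                }
            }
        };

        Console.WriteLine(CustomerCounting.CountRows(customers)); // prints 3
    }
}
```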
