Data Set to Tree Structure - c#

I have a set of data in which each City belongs to a specific Department, which belongs to a specific Region, which belongs to a specific Country (in this case there is only one country: France).
This data is contained in a CSV file which I can read row by row; however, my goal is to convert this data into a tree structure (with France at the root).
Each of these nodes will be given a specific Id value, which I've already done, but the tricky part is that each node must also contain a ParentId (for instance, Belley and Gex need the ParentId of Ain, but Moulins and Vichy need the ParentId of Allier).
Below is a snippet of code I've written that has assigned an Id value to each name in this data set, along with some other values:
int id = 0;
List<CoverageAreaLevel> coverageAreas = GetCoverageAreaDataFromCsv(path, true);
List<LevelList> levelLists = new List<LevelList>
{
new LevelList { Names = coverageAreas.Select(a => a.Level1).Distinct().ToList(), Level = "1" },
new LevelList { Names = coverageAreas.Select(a => a.Level2).Distinct().ToList(), Level = "2" },
new LevelList { Names = coverageAreas.Select(a => a.Level3).Distinct().ToList(), Level = "3" },
new LevelList { Names = coverageAreas.Select(a => a.Level4).Distinct().ToList(), Level = "4" }
};
List<CoverageArea> newCoverageAreas = new List<CoverageArea>();
foreach (LevelList levelList in levelLists)
{
foreach (string name in levelList.Names)
{
CoverageArea coverageArea = new CoverageArea
{
Id = id++.ToString(),
Description = name,
FullDescription = name,
Level = levelList.Level
};
newCoverageAreas.Add(coverageArea);
}
}
The levelLists variable contains a sort-of hierarchical structure of the data that I'm looking for, but none of the items in that list are linked together by anything.
Any idea how this could be implemented? I can manually figure out each ParentId, but I'd like to automate this process, especially if it needs to be done again in the future.

The solution from @Camilo is really good and pragmatic. I would also suggest the use of a tree.
A sample implementation:
var countries = models.GroupBy(xco => xco.Country)
    .Select(xco =>
    {
        var country = new Tree<String>();
        country.Value = xco.Key;
        country.Children = xco.GroupBy(xr => xr.Region)
            .Select(xr =>
            {
                var region = new Tree<String>();
                region.Value = xr.Key;
                region.Parent = country;
                region.Children = xr.GroupBy(xd => xd.Department)
                    .Select(xd =>
                    {
                        var department = new Tree<String>();
                        department.Value = xd.Key;
                        department.Parent = region;
                        department.Children = xd
                            .Select(xc => new Tree<String> { Value = xc.City, Parent = department })
                            .ToList();
                        return department;
                    })
                    .ToList();
                return region;
            })
            // ToList materializes the nodes once; a lazy query would build
            // new Tree objects every time Children is enumerated.
            .ToList();
        return country;
    })
    .ToList();
public class Tree<T>
{
public IEnumerable<Tree<T>> Children;
public T Value;
public Tree<T> Parent;
}

One way you could solve this is by building dictionaries with the names and IDs of each level.
Assuming you have data like this:
var models = new List<Model>
{
new Model { Country = "France", Region = "FranceRegionA", Department = "FranceDept1", City = "FranceA" },
new Model { Country = "France", Region = "FranceRegionA", Department = "FranceDept1", City = "FranceB" },
new Model { Country = "France", Region = "FranceRegionA", Department = "FranceDept2", City = "FranceC" },
new Model { Country = "France", Region = "FranceRegionB", Department = "FranceDept3", City = "FranceD" },
new Model { Country = "Italy", Region = "ItalyRegionA", Department = "ItalyDept1", City = "ItalyA" },
new Model { Country = "Italy", Region = "ItalyRegionA", Department = "ItalyDept2", City = "ItalyB" },
};
You could do something like this, which can probably be improved further if needed:
var countries = models.GroupBy(x => x.Country)
.Select((x, index) => Tuple.Create(x.Key, new { Id = index + 1 }))
.ToDictionary(x => x.Item1, x => x.Item2);
var regions = models.GroupBy(x => x.Region)
.Select((x, index) => Tuple.Create(x.Key, new { ParentId = countries[x.First().Country].Id, Id = index + 1 }))
.ToDictionary(x => x.Item1, x => x.Item2);
var departments = models.GroupBy(x => x.Department)
.Select((x, index) => Tuple.Create(x.Key, new { ParentId = regions[x.First().Region].Id, Id = index + 1 }))
.ToDictionary(x => x.Item1, x => x.Item2);
var cities = models
.Select((x, index) => Tuple.Create(x.City, new { ParentId = departments[x.Department].Id, Id = index + 1 }))
.ToDictionary(x => x.Item1, x => x.Item2);
The main idea is to leverage the index parameter of the Select method and the speed of dictionaries to find the parent ID.
Sample output from a fiddle:
countries:
[France, { Id = 1 }],
[Italy, { Id = 2 }]
regions:
[FranceRegionA, { ParentId = 1, Id = 1 }],
[FranceRegionB, { ParentId = 1, Id = 2 }],
[ItalyRegionA, { ParentId = 2, Id = 3 }]
departments:
[FranceDept1, { ParentId = 1, Id = 1 }],
[FranceDept2, { ParentId = 1, Id = 2 }],
[FranceDept3, { ParentId = 2, Id = 3 }],
[ItalyDept1, { ParentId = 3, Id = 4 }],
[ItalyDept2, { ParentId = 3, Id = 5 }]
cities:
[FranceA, { ParentId = 1, Id = 1 }],
[FranceB, { ParentId = 1, Id = 2 }],
[FranceC, { ParentId = 2, Id = 3 }],
[FranceD, { ParentId = 3, Id = 4 }],
[ItalyA, { ParentId = 4, Id = 5 }],
[ItalyB, { ParentId = 5, Id = 6 }]
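If the Ids need to be unique across all levels (as with the single id counter in the question), one possible variation is to walk each row from country down to city and remember which paths have already been given an Id. This is only a sketch under assumptions: the tuple layout, the "/"-separated path key, and the names rows, nodes and idByPath are made up for illustration, and the sample rows simply mirror the models list above.

```csharp
using System;
using System.Collections.Generic;

// Sample rows mirroring the models list above (Country, Region, Department, City).
var rows = new List<(string Country, string Region, string Department, string City)>
{
    ("France", "FranceRegionA", "FranceDept1", "FranceA"),
    ("France", "FranceRegionA", "FranceDept1", "FranceB"),
    ("France", "FranceRegionA", "FranceDept2", "FranceC"),
    ("France", "FranceRegionB", "FranceDept3", "FranceD"),
    ("Italy", "ItalyRegionA", "ItalyDept1", "ItalyA"),
    ("Italy", "ItalyRegionA", "ItalyDept2", "ItalyB"),
};

// Nodes in creation order; ParentId is null for root nodes (hypothetical layout).
var nodes = new List<(int Id, int? ParentId, string Name, int Level)>();
// Maps a full path such as "France/FranceRegionA" to the Id already assigned to it.
var idByPath = new Dictionary<string, int>();

foreach (var row in rows)
{
    var names = new[] { row.Country, row.Region, row.Department, row.City };
    int? parentId = null;
    string path = "";

    for (int level = 0; level < names.Length; level++)
    {
        // Keying on the full path keeps same-named nodes under different parents apart.
        path = level == 0 ? names[level] : path + "/" + names[level];

        if (!idByPath.TryGetValue(path, out int id))
        {
            id = nodes.Count + 1; // one running counter shared by all levels
            idByPath[path] = id;
            nodes.Add((id, parentId, names[level], level + 1));
        }

        // The node just found or created is the parent of the next level down.
        parentId = id;
    }
}

foreach (var node in nodes)
{
    Console.WriteLine($"Id={node.Id}, ParentId={node.ParentId?.ToString() ?? "null"}, Level={node.Level}, Name={node.Name}");
}
```

Because the dictionary is keyed on the whole path rather than on the name alone, two departments or cities that share a name but sit under different parents would still get separate Ids.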

Related

Linq query to filter multi level classes

I have my departments data coming from the database. I want to filter this data based on certain criteria.
[
{
"Id":10,
"Name":"Name 10",
"Teachers":[
{
"TeacherId":100,
"TeacherName":null,
"DepartmentId":100,
"Students":[
{
"StudentId":1001,
"StudentName":null,
"TeacherId":10,
"DepartmentId":100
}
]
},
{
"TeacherId":101,
"TeacherName":null,
"DepartmentId":100,
"Students":[
{
"StudentId":1001,
"StudentName":null,
"TeacherId":10,
"DepartmentId":100
}
]
}
]
},
{
"Id":100,
"Name":"Name 10",
"Teachers":[
{
"TeacherId":0,
"TeacherName":null,
"DepartmentId":100,
"Students":[
{
"StudentId":5000,
"StudentName":null,
"TeacherId":50,
"DepartmentId":100
}
]
}
]
},
{
"Id":50,
"Name":"Name 10",
"Teachers":[
{
"TeacherId":0,
"TeacherName":null,
"DepartmentId":100,
"Students":[
{
"StudentId":2000,
"StudentName":null,
"TeacherId":50,
"DepartmentId":100
}
]
}
]
}
]
Now I have to filter the departments based on some values, as shown below:
var departmentIds = new List<int>() { 10, 20, 30 };
var teachers = new List<int>() { 100, 200, 300 };
var students = new List<int>() { 1000, 2000, 3000 };
I am looking for a query that returns the data in the following fashion:
If all the department ids exist in the JSON, it should return the entire data. If a department with a particular teacher is in the list, then it should return only that teacher and the department, and likewise for the students.
I tried this to test whether it at least works at the second level, but I am getting all the teachers:
var list = allDepartments.Where(d => d.Teachers.Any(t => teachers.Contains(t.TeacherId))).ToList();
var list = allDepartments
.Where(d => departmentIds.Contains(d.Id))
.Select(d => new Department() {
Id = d.Id,
Name = d.Name,
Teachers = (d.Teachers.Any(t => teacherIds.Contains(t.TeacherId))
? d.Teachers.Where(t => teacherIds.Contains(t.TeacherId))
: d.Teachers)
.Select(t => new Teacher() {
TeacherId = t.TeacherId,
TeacherName = t.TeacherName,
DepartmentId = d.Id,
Students = t.Students.Any(s => studentIds.Contains(s.StudentId))
? t.Students.Where(s => studentIds.Contains(s.StudentId))
: t.Students
})
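});
Would something like this work for you? Here teacherIds and studentIds stand for the teachers and students lists from the question. Note that a department with no matching teacher keeps all of its teachers, and a teacher with no matching student keeps all of their students.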

Linq how to find max repeat items?

I have a list of objects (ProductInfo).
ProductInfo contains an id, a name, and an option.
Imagine this sample:
ProductInfo Id => 1, Name => XXX, Option = A
ProductInfo Id => 1, Name => XXX, Option = B
ProductInfo Id => 2, Name => DEB, Option = A
ProductInfo Id => 2, Name => DEB, Option = B
ProductInfo Id => 2, Name => DEB, Option = C
ProductInfo Id => 3, Name => ZZZ, Option = D
....
....
We can see that options A and B each appear twice, for products 1 and 2.
My goal is to obtain the maximum repeat count for each product in the list.
I would like to obtain this as the result:
Id = 1, Name = XXX = A, count = 2
Id = 2, Name = DEB, count = 2
How can I do that?
Thanks for your time.
Try this code:
var list = new List<ProductInfo> {
new ProductInfo { Id = 1, Name = "XXX", Option = "A"},
new ProductInfo { Id = 1, Name = "XXX", Option = "B" },
new ProductInfo { Id = 2, Name = "DEB", Option = "A" },
new ProductInfo { Id = 2, Name = "DEB", Option = "B"},
new ProductInfo { Id = 2, Name = "DEB", Option = "C" },
new ProductInfo { Id = 3, Name = "ZZZ", Option = "D" }
};
var x = from p in list
group p by new { p.Id, p.Name, p.Option } into g
select new
{
Id = g.Key.Id,
Name = g.Key.Name,
Count = list.Count(m => m.Name == g.Key.Name)
};
var t = x.Distinct();
You can use GroupBy on the Id and Name properties. (Sorry, I read that wrong at first.)
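The GroupBy suggestion could look roughly like the sketch below. It assumes that "max repeat" means the number of rows (options) per product, which the question does not fully pin down, and it uses tuples in place of the ProductInfo class only to stay self-contained; the names products and counts are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same sample data as above, as (Id, Name, Option) tuples.
var products = new List<(int Id, string Name, string Option)>
{
    (1, "XXX", "A"), (1, "XXX", "B"),
    (2, "DEB", "A"), (2, "DEB", "B"), (2, "DEB", "C"),
    (3, "ZZZ", "D"),
};

// One group per product; Count() is the number of rows in that group.
var counts = products
    .GroupBy(p => new { p.Id, p.Name })
    .Select(g => new { g.Key.Id, g.Key.Name, Count = g.Count() })
    .OrderByDescending(x => x.Count) // most repeated product first
    .ToList();

foreach (var c in counts)
{
    Console.WriteLine($"Id = {c.Id}, Name = {c.Name}, Count = {c.Count}");
}
```

Grouping once and calling Count() on each group avoids rescanning the whole list for every group, which the list.Count(...) call inside the query above does.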

Code to collapse duplicate and semi-duplicate records?

I have a list of models of this type:
public class TourDude {
public int Id { get; set; }
public string Name { get; set; }
}
And here is my list:
public IEnumerable<TourDude> GetAllGuides {
get {
List<TourDude> guides = new List<TourDude>();
guides.Add(new TourDude() { Name = "Dave Et", Id = 1 });
guides.Add(new TourDude() { Name = "Dave Eton", Id = 1 });
guides.Add(new TourDude() { Name = "Dave EtZ5", Id = 1 });
guides.Add(new TourDude() { Name = "Danial Maze A", Id = 2 });
guides.Add(new TourDude() { Name = "Danial Maze B", Id = 2 });
guides.Add(new TourDude() { Name = "Danial", Id = 3 });
return guides;
}
}
I want to retrieve these records:
{ Name = "Dave Et", Id = 1 }
{ Name = "Danial Maze", Id = 2 }
{ Name = "Danial", Id = 3 }
The goal is mainly to collapse duplicates and near duplicates (confirmable by the ID), taking the shortest possible value (when compared) as the name.
Where do I start? Is there a complete LINQ query that will do this for me? Do I need to code up an equality comparer?
Edit 1:
var result = from x in GetAllGuides
group x.Name by x.Id into g
select new TourDude {
Test = Exts.LongestCommonPrefix(g),
Id = g.Key,
};
IEnumerable<IEnumerable<char>> test = result.First().Test;
string str = test.First().ToString();
If you want to group the items by Id and then find the longest common prefix of the Names within each group, then you can do so as follows:
var result = from x in guides
group x.Name by x.Id into g
select new TourDude
{
Name = LongestCommonPrefix(g),
Id = g.Key,
};
using the algorithm for finding the longest common prefix from here.
Result:
{ Name = "Dave Et", Id = 1 }
{ Name = "Danial Maze ", Id = 2 }
{ Name = "Danial", Id = 3 }
static string LongestCommonPrefix(IEnumerable<string> xs)
{
return new string(xs
.Transpose()
.TakeWhile(s => s.All(d => d == s.First()))
.Select(s => s.First())
.ToArray());
}
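Transpose in the code above is not part of System.Linq; it has to come from the linked answer or from a helper library. A minimal self-contained sketch of the same idea without Transpose might look like this. The loop-based implementation, the tuple sample data, and the TrimEnd call (which drops the trailing space seen in the result for Id 2) are assumptions, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Longest prefix shared by every string; never reads past the shortest string.
static string LongestCommonPrefix(IEnumerable<string> values)
{
    var list = values.ToList();
    if (list.Count == 0) return "";

    string first = list[0];
    int length = list.Min(s => s.Length);

    for (int i = 0; i < length; i++)
    {
        char c = first[i];
        // Stop at the first position where any string differs.
        if (list.Any(s => s[i] != c))
        {
            return first.Substring(0, i);
        }
    }
    return first.Substring(0, length);
}

// Same sample data as the question, as (Name, Id) tuples.
var guides = new List<(string Name, int Id)>
{
    ("Dave Et", 1), ("Dave Eton", 1), ("Dave EtZ5", 1),
    ("Danial Maze A", 2), ("Danial Maze B", 2),
    ("Danial", 3),
};

var result = guides
    .GroupBy(g => g.Id, g => g.Name)
    .Select(g => new { Id = g.Key, Name = LongestCommonPrefix(g).TrimEnd() })
    .ToList();

foreach (var r in result)
{
    Console.WriteLine($"{{ Name = \"{r.Name}\", Id = {r.Id} }}");
}
```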
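I was able to achieve this by grouping the records on the ID and then selecting the first record from each group ordered by the Name length. Note that this returns one of the existing names rather than a common prefix, so for Id 2 it yields "Danial Maze A":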
var result = GetAllGuides.GroupBy(td => td.Id)
.Select(g => g.OrderBy(td => td.Name.Length).First());
foreach (var dude in result)
{
Console.WriteLine("{{Name = {0}, Id = {1}}}", dude.Name, dude.Id);
}

Select multiple fields group by and sum

I want to do a query with LINQ (on a list of objects) and I really don't know how to do it. I can do the grouping and the sum, but I can't select the rest of the fields.
Example:
ID Value Name Category
1 5 Name1 Category1
1 7 Name1 Category1
2 1 Name2 Category2
3 6 Name3 Category3
3 2 Name3 Category3
I want to group by ID, SUM by Value and return all fields like this.
ID Value Name Category
1 12 Name1 Category1
2 1 Name2 Category2
3 8 Name3 Category3
Updated :
If you're trying to avoid grouping for all the fields, you can group just by Id:
data.GroupBy(d => d.Id)
.Select(
g => new
{
Key = g.Key,
Value = g.Sum(s => s.Value),
Name = g.First().Name,
Category = g.First().Category
});
But this code assumes that for each Id, the same Name and Category apply. If so, you should consider normalizing as @Aron suggests. That would mean keeping Id and Value in one class and moving Name, Category (and whichever other fields are the same for the same Id) to another class, which also carries the Id for reference. Normalization reduces data redundancy and dependency.
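As a rough illustration of the normalized shape, the per-Id fields can live in a lookup while the repeating rows carry only the Id and the Value. This is a sketch, not the design @Aron proposed in detail: the dictionary, the tuple layout, and the names details, values and totals are assumptions made to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Per-Id details stored once (hypothetical lookup table).
var details = new Dictionary<int, (string Name, string Category)>
{
    [1] = ("Name1", "Category1"),
    [2] = ("Name2", "Category2"),
    [3] = ("Name3", "Category3"),
};

// The repeating rows keep only the Id reference and the value.
var values = new List<(int Id, int Value)>
{
    (1, 5), (1, 7), (2, 1), (3, 6), (3, 2),
};

// Group and sum on the narrow rows, then look up the shared fields by Id.
var totals = values
    .GroupBy(v => v.Id)
    .Select(g => new
    {
        Id = g.Key,
        Value = g.Sum(v => v.Value),
        Name = details[g.Key].Name,
        Category = details[g.Key].Category,
    })
    .ToList();

foreach (var t in totals)
{
    Console.WriteLine($"{t.Id} {t.Value} {t.Name} {t.Category}");
}
```

With this layout there is no need for g.First(), because Name and Category can no longer differ between rows that share an Id.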
void Main()
{
//Me being lazy in init
var foos = new []
{
new Foo { Id = 1, Value = 5},
new Foo { Id = 1, Value = 7},
new Foo { Id = 2, Value = 1},
new Foo { Id = 3, Value = 6},
new Foo { Id = 3, Value = 2},
};
foreach(var x in foos)
{
x.Name = "Name" + x.Id;
x.Category = "Category" + x.Id;
}
//end init.
var result = from x in foos
group x.Value by new { x.Id, x.Name, x.Category}
into g
select new { g.Key.Id, g.Key.Name, g.Key.Category, Value = g.Sum()};
Console.WriteLine(result);
}
// Define other methods and classes here
public class Foo
{
public int Id {get;set;}
public int Value {get;set;}
public string Name {get;set;}
public string Category {get;set;}
}
If your class is really long and you don't want to copy all the stuff, you can try something like this:
l.GroupBy(x => x.id).
Select(x => {
var ret = x.First();
ret.value = x.Sum(xt => xt.value);
return ret;
}).ToList();
With great power comes great responsibility, so you need to be careful. The line ret.value = x.Sum(xt => xt.value) will change an item in your original collection, because x.First() returns a reference to the existing object, not a new one. If you want to avoid that, add a Clone method to your class, for example one based on MemberwiseClone (but again, this creates a shallow copy, so be careful). After that, just replace the line with: var ret = x.First().Clone();
Try this:
var objList = new List<SampleObject>();
objList.Add(new SampleObject() { ID = 1, Value = 5, Name = "Name1", Category = "Category1"});
objList.Add(new SampleObject() { ID = 1, Value = 7, Name = "Name1", Category = "Category1"});
objList.Add(new SampleObject() { ID = 2, Value = 1, Name = "Name2", Category = "Category2"});
objList.Add(new SampleObject() { ID = 3, Value = 6, Name = "Name3", Category = "Category3"});
objList.Add(new SampleObject() { ID = 3, Value = 2, Name = "Name3", Category = "Category3"});
// The grouping key holds ID, Name and Category; Sum needs a selector for Value.
var newList = from val in objList
group val by new { val.ID, val.Name, val.Category } into grouped
select new SampleObject() { ID = grouped.Key.ID, Value = grouped.Sum(x => x.Value), Name = grouped.Key.Name, Category = grouped.Key.Category };
To check with LINQPad:
newList.Dump();

Select top N records after filtering in each group

I am an old bee in .NET but very new to LINQ! After some basic reading I decided to check my skill and I failed completely! I don't know where I am making a mistake.
I want to select the 2 highest orders for each person, considering only orders where Amount % 100 == 0.
Here is my code:
var crecords = new[] {
new {
Name = "XYZ",
Orders = new[]
{
new { OrderId = 1, Amount = 340 },
new { OrderId = 2, Amount = 100 },
new { OrderId = 3, Amount = 200 }
}
},
new {
Name = "ABC",
Orders = new[]
{
new { OrderId = 11, Amount = 900 },
new { OrderId = 12, Amount = 800 },
new { OrderId = 13, Amount = 700 }
}
}
};
var result = crecords
.OrderBy(record => record.Name)
.ForEach
(
person => person.Orders
.Where(order => order.Amount % 100 == 0)
.OrderByDescending(t => t.Amount)
.Take(2)
);
foreach (var record in result)
{
Console.WriteLine(record.Name);
foreach (var order in record.Orders)
{
Console.WriteLine("-->" + order.Amount.ToString());
}
}
Can anyone tell me what the correct query would be?
Thanks in advance.
Try this query:
var result = crecords.Select(person =>
new
{
Name = person.Name,
Orders = person.Orders.Where(order => order.Amount%100 == 0)
.OrderByDescending(x => x.Amount)
.Take(2)
});
Printing the resulting IEnumerable with your foreach loop gives:
XYZ
-->200
-->100
ABC
-->900
-->800
This has already been answered, but if you don't want to create new objects and would rather modify your existing crecords, the code could look like the following instead. You wouldn't be able to use anonymous types as in your example, though, because their properties are read-only. That means you would have to create People and Order classes:
private class People
{
public string Name;
public IEnumerable<Order> Orders;
}
private class Order
{
public int OrderId;
public int Amount;
}
public void PrintPeople()
{
IEnumerable<People> crecords = new[] {
new People{
Name = "XYZ",
Orders = new Order[]
{
new Order{ OrderId = 1, Amount = 340 },
new Order{ OrderId = 2, Amount = 100 },
new Order{ OrderId = 3, Amount = 200 }
}
},
new People{
Name = "ABC",
Orders = new Order[]
{
new Order{ OrderId = 11, Amount = 900 },
new Order{ OrderId = 12, Amount = 800 },
new Order{ OrderId = 13, Amount = 700 }
}
}
};
crecords = crecords.OrderBy(record => record.Name);
crecords.ToList().ForEach(
person =>
{
person.Orders = person.Orders
.Where(order => order.Amount%100 == 0)
.OrderByDescending(t => t.Amount)
.Take(2);
}
);
foreach (People record in crecords)
{
Console.WriteLine(record.Name);
foreach (var order in record.Orders)
{
Console.WriteLine("-->" + order.Amount.ToString());
}
}
}
