Best approach to compare if one list is subset of another in C#

I have the below two classes:
public class FirstInner
{
public int Id { get; set; }
public string Type { get; set; }
public string RoleId { get; set; }
}
public class SecondInner
{
public int Id { get; set; }
public string Type { get; set; }
}
Again, there are lists of those types inside the below two classes:
public class FirstOuter
{
public int Id { get; set; }
public string Name { get; set; }
public string Title { get; set; }
public List<FirstInner> Inners { get; set; }
}
public class SecondOuter
{
public int Id { get; set; }
public string Name { get; set; }
public List<SecondInner> Inners { get; set; }
}
Now, I have a list of FirstOuter and a list of SecondOuter. I need to check whether the FirstOuter list is a subset of the SecondOuter list.
Please note:
1. The names of the classes cannot be changed, as they come from different systems.
2. Some additional properties are present in FirstOuter but not in SecondOuter. When checking for a subset, those extra properties can be ignored.
3. Point 2 also applies to FirstInner and SecondInner.
4. List items can be in any order: FirstOuterList[1] could match SecondOuterList[3] by Id, and within that pair FirstOuterList[1].Inners[3] could match SecondOuterList[3].Inners[2], again by Id.
I tried Intersect, but that fails because the property names do not match. Another option is a crude foreach iteration, which I would like to avoid.
Should I convert the SecondOuter list to a FirstOuter list, ignoring the additional properties?
Here is some test data:
var firstInnerList = new List<FirstInner>();
firstInnerList.Add(new FirstInner
{
Id = 1,
Type = "xx",
RoleId = "5"
});
var secondInnerList = new List<SecondInner>();
secondInnerList.Add(new SecondInner
{
Id = 1,
Type = "xx"
});
var firstOuter = new FirstOuter
{
Id = 1,
Name = "John",
Title = "Cena",
Inners = firstInnerList
};
var secondOuter = new SecondOuter
{
Id = 1,
Name = "John",
Inners = secondInnerList
};
var firstOuterList = new List<FirstOuter> { firstOuter };
var secondOuterList = new List<SecondOuter> { secondOuter };
I need to check whether firstOuterList is a subset of secondOuterList (ignoring the additional properties).

The foreach approach I have is:
foreach (var item in firstOuterList)
{
var secondItem = secondOuterList.Find(so => so.Id == item.Id);
//if secondItem is null->throw exception
if (item.Name == secondItem.Name)
{
foreach (var firstInnerItem in item.Inners)
{
var secondInnerItem = secondItem.Inners.Find(sI => sI.Id == firstInnerItem.Id);
//if secondInnerItem is null,throw exception
if (firstInnerItem.Type != secondInnerItem.Type)
{
//throw exception
}
}
}
else
{
//throw exception
}
}
//move with normal flow
Please let me know if there is any better approach.

First, join firstOuterList and secondOuterList on the shared properties, then do the same for each pair of inner lists. If a join yields fewer rows than the first list has items, some item in the first list has no match:
bool isSubset = true;
var firstOuterList = new List<FirstOuter> { firstOuter };
var secondOuterList = new List<SecondOuter> { secondOuter };
// Pair each FirstOuter with the SecondOuter that has the same Id and Name.
var jointOuterList = firstOuterList.Join(
    secondOuterList,
    p => new { p.Id, p.Name },
    m => new { m.Id, m.Name },
    (p, m) => new { FOuter = p, SOuter = m }
).ToList();
if (jointOuterList.Count != firstOuterList.Count)
{
    isSubset = false;
}
else
{
    foreach (var item in jointOuterList)
    {
        // Pair the inner items that agree on Id and Type.
        var jointInnerList = item.FOuter.Inners.Join(
            item.SOuter.Inners,
            p => new { p.Id, p.Type },
            m => new { m.Id, m.Type },
            (p, m) => p.Id
        ).ToList();
        if (jointInnerList.Count != item.FOuter.Inners.Count)
        {
            isSubset = false;
            break;
        }
    }
}
Note: I am assuming Id is unique within each list, i.e. no list contains multiple entries with the same Id. If that does not hold, duplicate matches can inflate the counts, and the query needs a GroupBy.
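
For comparison, here is a minimal sketch of the same check using dictionary lookups instead of joins. It assumes Ids are unique within each list (as in the note above), that Inners is never null, and that Name and Type are the only shared properties to compare. SubsetChecker and IsSubset are hypothetical names; the four model classes are repeated from the question only so the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class FirstInner { public int Id { get; set; } public string Type { get; set; } public string RoleId { get; set; } }
public class SecondInner { public int Id { get; set; } public string Type { get; set; } }
public class FirstOuter
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Title { get; set; }
    public List<FirstInner> Inners { get; set; }
}
public class SecondOuter
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<SecondInner> Inners { get; set; }
}

// Hypothetical helper, not part of either source system.
public static class SubsetChecker
{
    public static bool IsSubset(List<FirstOuter> first, List<SecondOuter> second)
    {
        // Index the second list by Id; ToDictionary throws if an Id repeats.
        var secondById = second.ToDictionary(s => s.Id);

        return first.All(f =>
        {
            SecondOuter s;
            if (!secondById.TryGetValue(f.Id, out s) || s.Name != f.Name)
                return false;

            // Every inner item of f needs a match in s with the same Id and Type.
            var innerTypes = s.Inners.ToDictionary(i => i.Id, i => i.Type);
            return f.Inners.All(fi =>
            {
                string type;
                return innerTypes.TryGetValue(fi.Id, out type) && type == fi.Type;
            });
        });
    }
}
```

With the question's test data, SubsetChecker.IsSubset(firstOuterList, secondOuterList) should come out true. Extra properties such as Title and RoleId are never read, so they are ignored by construction.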

I think it helps to break the question down.
We have two sets of Ids, the Inners and the Outers.
We have two instances of those sets, the Firsts and the Seconds.
We want Second's inner Ids to be a subset of First's inner Ids.
We want Second's outer Ids to be a subset of First's outer Ids.
If that's the case, these are a couple of working test cases:
[TestMethod]
public void ICanSeeWhenInnerAndOuterCollectionsAreSubsets()
{
HashSet<int> firstInnerIds = new HashSet<int>(GetFirstOuterList().SelectMany(outer => outer.Inners.Select(inner => inner.Id)).Distinct());
HashSet<int> firstOuterIds = new HashSet<int>(GetFirstOuterList().Select(outer => outer.Id).Distinct());
HashSet<int> secondInnerIds = new HashSet<int>(GetSecondOuterList().SelectMany(outer => outer.Inners.Select(inner => inner.Id)).Distinct());
HashSet<int> secondOuterIds = new HashSet<int>(GetSecondOuterList().Select(outer => outer.Id).Distinct());
bool isInnerSubset = secondInnerIds.IsSubsetOf(firstInnerIds);
bool isOuterSubset = secondOuterIds.IsSubsetOf(firstOuterIds);
Assert.IsTrue(isInnerSubset);
Assert.IsTrue(isOuterSubset);
}
[TestMethod]
public void ICanSeeWhenInnerAndOuterCollectionsAreNotSubsets()
{
HashSet<int> firstInnerIds = new HashSet<int>(GetFirstOuterList().SelectMany(outer => outer.Inners.Select(inner => inner.Id)).Distinct());
HashSet<int> firstOuterIds = new HashSet<int>(GetFirstOuterList().Select(outer => outer.Id).Distinct());
HashSet<int> secondInnerIds = new HashSet<int>(GetSecondOuterList().SelectMany(outer => outer.Inners.Select(inner => inner.Id)).Distinct());
HashSet<int> secondOuterIds = new HashSet<int>(GetSecondOuterList().Select(outer => outer.Id).Distinct());
firstInnerIds.Clear();
firstInnerIds.Add(5);
firstOuterIds.Clear();
firstOuterIds.Add(5);
bool isInnerSubset = secondInnerIds.IsSubsetOf(firstInnerIds);
bool isOuterSubset = secondOuterIds.IsSubsetOf(firstOuterIds);
Assert.IsFalse(isInnerSubset);
Assert.IsFalse(isOuterSubset);
}
private List<FirstOuter> GetFirstOuterList() { ... }
private List<SecondOuter> GetSecondOuterList() { ... }
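
One thing worth double-checking with this approach is the direction of the call: a.IsSubsetOf(b) is true when every element of a is also in b. The question asks whether the first list is contained in the second, so the receiver matters. A small sketch with made-up Ids (the values are illustrative only):

```csharp
using System;
using System.Collections.Generic;

public static class IsSubsetOfDemo
{
    public static void Main()
    {
        // Made-up Ids for illustration only.
        var firstIds = new HashSet<int> { 1, 2 };
        var secondIds = new HashSet<int> { 1, 2, 3 };

        Console.WriteLine(firstIds.IsSubsetOf(secondIds));  // True
        Console.WriteLine(secondIds.IsSubsetOf(firstIds));  // False: 3 is missing from firstIds
    }
}
```

Note also that comparing Ids alone does not look at Name or Type, so this only covers the Id part of the comparison.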

Related

Compare list against other list and modify

Suppose I have these classes:
public class Subject
{
public int Id { get; set; }
public string Category { get; set; }
public string Type { get; set; }
}
public class Student
{
public int Id { get; set; }
public List<MySubject> MySubjects { get; set; }
}
public class MySubject
{
public int Id { get; set; }
public string Category { get; set; }
public string Type { get; set; }
public string Schedule { get; set; }
public string RoomNumber { get; set; }
}
Sample data:
var subjects = new List<Subject>()
{
new Subject(){ Id = 1, Category = "Mathematics", Type = "Algebra" },
new Subject(){ Id = 2, Category = "Computer Science", Type = "Pascal" }
};
var student = new Student()
{ Id = 1, MySubjects = new List<MySubject>() {
new MySubject() {Id = 1, Category = "Mathematics", Type = "Algebra" },
new MySubject() {Id = 3, Category = "Mathematics", Type = "Trigonometry"},
}
};
//TODO: Update list here
student.MySubjects.ForEach(i => Console.WriteLine("{0}-{1}-{2}\t", i.Id, i.Category, i.Type));
The above line of code prints:
1-Mathematics-Algebra
3-Mathematics-Trigonometry
which is incorrect. I need it to print this:
1-Mathematics-Algebra
2-Computer Science-Pascal
Basically I would like to modify and iterate the student.MySubjects and check its contents against subjects.
I would like to remove the subjects (3-Mathematics-Trigonometry) that are not present in the subjects and also ADD subjects that are missing (2-Computer Science-Pascal).
Can you suggest an efficient way to do this by searching/comparing using Category + Type?

Try something like this:
// Remove those subjects which are not present in subjects list
student.MySubjects.RemoveAll(x => !subjects.Any(y => y.Category == x.Category && y.Type == x.Type));
// Retrieve list of subjects which are not added in students.MySubjects
var mySubjectsToAdd = subjects.Where(x => !student.MySubjects.Any(y => y.Category == x.Category && y.Type == x.Type))
.Select(x => new MySubject() {
Id = x.Id,
Category = x.Category,
Type = x.Type
}).ToList();
// If mySubjectsToAdd has any value then add it into student.MySubjects
if (mySubjectsToAdd.Any())
{
student.MySubjects.AddRange(mySubjectsToAdd);
}
student.MySubjects.ForEach(i => Console.WriteLine("{0}-{1}-{2}\t", i.Id, i.Category, i.Type));

// make an inner join based on mutual values to filter out wrong subjects.
var filteredList =
from mySubject in student.MySubjects
join subject in subjects
on new { mySubject.Category, mySubject.Type }
equals new { subject.Category, subject.Type }
select new MySubject { Id = mySubject.Id, Category = mySubject.Category, Type = mySubject.Type };
// make a left outer join to find absent subjects.
var absentList =
from subject in subjects
join mySubject in filteredList
on new { subject.Category, subject.Type }
equals new { mySubject.Category, mySubject.Type } into sm
from s in sm.DefaultIfEmpty()
where s == null
select new MySubject { Id = subject.Id, Category = subject.Category, Type = subject.Type };
student.MySubjects = filteredList.ToList();
student.MySubjects.AddRange(absentList.ToList());
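
Since the question asks for an efficient comparison on Category + Type, here is a sketch that precomputes the composite keys in a HashSet, so each membership test is a hash lookup rather than a scan of the other list. SubjectSync and Sync are hypothetical names, and Tuple<string, string> is just one possible composite key; the model classes are repeated from the question so the block stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Subject { public int Id { get; set; } public string Category { get; set; } public string Type { get; set; } }
public class MySubject
{
    public int Id { get; set; }
    public string Category { get; set; }
    public string Type { get; set; }
    public string Schedule { get; set; }
    public string RoomNumber { get; set; }
}

// Hypothetical helper name.
public static class SubjectSync
{
    public static void Sync(List<MySubject> mySubjects, List<Subject> subjects)
    {
        // Tuple<string, string> compares by value, so it works as a composite key.
        var wanted = new HashSet<Tuple<string, string>>(
            subjects.Select(s => Tuple.Create(s.Category, s.Type)));

        // Drop entries whose (Category, Type) is not in the master list.
        mySubjects.RemoveAll(m => !wanted.Contains(Tuple.Create(m.Category, m.Type)));

        var present = new HashSet<Tuple<string, string>>(
            mySubjects.Select(m => Tuple.Create(m.Category, m.Type)));

        // Add the master subjects that the student does not have yet.
        mySubjects.AddRange(subjects
            .Where(s => !present.Contains(Tuple.Create(s.Category, s.Type)))
            .Select(s => new MySubject { Id = s.Id, Category = s.Category, Type = s.Type }));
    }
}
```

Calling SubjectSync.Sync(student.MySubjects, subjects) on the sample data should leave the Algebra entry in place (keeping any Schedule or RoomNumber it already had), remove Trigonometry, and append Pascal.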

Custom File Parser

I am building a parser for a custom pipe-delimited file format and my code is getting very bulky. Could someone suggest a better way to parse this data?
Each line of the file is a record whose fields are delimited by a pipe (|). Each line starts with a record type, followed by an ID, followed by a varying number of columns.
Ex:
CDI|11111|OTHERDATA|somemore|other
CEX001|123131|DATA|data
CCC|123131|DATA|data1|data2|data3|data4|data5|data6
I am splitting by pipe, grabbing the first two columns, and then using a switch on the first column (the record type) to call a function that parses the remaining columns into an object purpose-built for that record type. I would really like a more elegant method.
public Dictionary<string, DataRecord> Parse()
{
var data = new Dictionary<string, DataRecord>();
var rawDataDict = new Dictionary<string, List<List<string>>>();
foreach (var line in File.ReadLines(_path))
{
var split = line.Split('|');
var Id = split[1];
if (!rawDataDict.ContainsKey(Id))
{
rawDataDict.Add(Id, new List<List<string>> {split.ToList()});
}
else
{
rawDataDict[Id].Add(split.ToList());
}
}
rawDataDict.ToList().ForEach(pair =>
{
var key = pair.Key.ToString();
var values = pair.Value;
foreach (var value in values)
{
var recordType = value[0];
switch (recordType)
{
case "CDI":
var cdiRecord = ParseCdi(value);
if (!data.ContainsKey(key))
{
data.Add(key, new DataRecord
{
Id = key, CdiRecords = new List<CdiRecord>() { cdiRecord }
});
}
else
{
data[key].CdiRecords.Add(cdiRecord);
}
break;
case "CEX015":
var cexRecord = ParseCex(value);
if (!data.ContainsKey(key))
{
data.Add(key, new DataRecord
{
Id = key,
CexRecords = new List<Cex015Record>() { cexRecord }
});
}
else
{
data[key].CexRecords.Add(cexRecord);
}
break;
case "CPH":
CphRecord cphRecord = ParseCph(value);
if (!data.ContainsKey(key))
{
data.Add(key, new DataRecord
{
Id = key,
CphRecords = new List<CphRecord>() { cphRecord }
});
}
else
{
data[key].CphRecords.Add(cphRecord);
}
break;
}
}
});
return data;
}

Try out FileHelpers; here is your exact example - http://www.filehelpers.net/example/QuickStart/ReadFileDelimited/
Given your data of
CDI|11111|OTHERDATA|Datas
CEX001|123131|DATA
CCC|123131
You could create a class to model this to allow FileHelpers to parse the delimited file:
[DelimitedRecord("|")]
public class Record
{
public string Type { get; set; }
public string[] Fields { get; set; }
}
Then we could allow FileHelpers to parse in to this object type:
var engine = new FileHelperEngine<Record>();
var records = engine.ReadFile("Input.txt");
After we've got all the records loaded into Record objects, we can use a bit of LINQ to pull them into their given types:
var cdis = records.Where(x => x.Type == "CDI")
.Select(x => new Cdi(x.Fields[0], x.Fields[1], x.Fields[2]))
.ToArray();
var cexs = records.Where(x => x.Type == "CEX001")
.Select(x => new Cex(x.Fields[0], x.Fields[1]))
.ToArray();
var cccs = records.Where(x => x.Type == "CCC")
.Select(x => new Ccc(x.Fields[0]))
.ToArray();
You could also simplify the above using something like AutoMapper - http://automapper.org/
Alternatively, you could use ConditionalRecord attributes, which only parse lines that match a given criterion. This gets slower the more record types you have, because the input is read once per type, but your code will be cleaner and FileHelpers will do most of the heavy lifting:
[DelimitedRecord("|")]
[ConditionalRecord(RecordCondition.IncludeIfMatchRegex, "^CDI")]
public class Cdi
{
public string Type { get; set; }
public int Number { get; set; }
public string Data1 { get; set; }
public string Data2 { get; set; }
public string Data3 { get; set; }
}
[DelimitedRecord("|")]
[ConditionalRecord(RecordCondition.IncludeIfMatchRegex, "^CEX001")]
public class Cex001
{
public string Type { get; set; }
public int Number { get; set; }
public string Data1 { get; set; }
}
[DelimitedRecord("|")]
[ConditionalRecord(RecordCondition.IncludeIfMatchRegex, "^CCC")]
public class Ccc
{
public string Type { get; set; }
public int Number { get; set; }
}
var input =
@"CDI|11111|Data1|Data2|Data3
CEX001|123131|Data1
CCC|123131";
var CdiEngine = new FileHelperEngine<Cdi>();
var cdis = CdiEngine.ReadString(input);
var cexEngine = new FileHelperEngine<Cex001>();
var cexs = cexEngine.ReadString(input);
var cccEngine = new FileHelperEngine<Ccc>();
var cccs = cccEngine.ReadString(input);

Your first loop isn't really doing anything other than organizing your data differently. You should be able to eliminate it and use the data as it is from the file. Something like this should give you what you want:
foreach (var line in File.ReadLines(_path))
{
var split = line.Split('|');
var key = split[1];
var value = split;
var recordType = value[0];
switch (recordType)
{
case "CDI":
var cdiRecord = ParseCdi(value.ToList());
if (!data.ContainsKey(key))
{
data.Add(key, new DataRecord
{
Id = key, CdiRecords = new List<CdiRecord>() { cdiRecord }
});
}
else
{
data[key].CdiRecords.Add(cdiRecord);
}
break;
case "CEX015":
var cexRecord = ParseCex(value.ToList());
if (!data.ContainsKey(key))
{
data.Add(key, new DataRecord
{
Id = key,
CexRecords = new List<Cex015Record>() { cexRecord }
});
}
else
{
data[key].CexRecords.Add(cexRecord);
}
break;
case "CPH":
CphRecord cphRecord = ParseCph(value.ToList());
if (!data.ContainsKey(key))
{
data.Add(key, new DataRecord
{
Id = key,
CphRecords = new List<CphRecord>() { cphRecord }
});
}
else
{
data[key].CphRecords.Add(cphRecord);
}
break;
}
}
Caveat: This is just put together here and hasn't been properly checked for syntax.
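
The three case blocks differ only in which list they append to, so the switch can be replaced by a dictionary of handlers and a single get-or-create step. The sketch below shows the idea under simplifying assumptions: DataRecord here is a stand-in that stores the raw split fields (the real one would hold typed records built by ParseCdi, ParseCex and ParseCph), PipeParser is a hypothetical name, and the input is a sequence of lines rather than a file path. It uses C# 6 property initializers.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the question's DataRecord; it keeps raw fields only.
public class DataRecord
{
    public string Id { get; set; }
    public List<string[]> CdiRecords { get; } = new List<string[]>();
    public List<string[]> CexRecords { get; } = new List<string[]>();
    public List<string[]> CphRecords { get; } = new List<string[]>();
}

// Hypothetical class name.
public static class PipeParser
{
    // One handler per record type replaces the switch.
    private static readonly Dictionary<string, Action<DataRecord, string[]>> Handlers =
        new Dictionary<string, Action<DataRecord, string[]>>
        {
            { "CDI",    (record, fields) => record.CdiRecords.Add(fields) },
            { "CEX015", (record, fields) => record.CexRecords.Add(fields) },
            { "CPH",    (record, fields) => record.CphRecords.Add(fields) },
        };

    public static Dictionary<string, DataRecord> Parse(IEnumerable<string> lines)
    {
        var data = new Dictionary<string, DataRecord>();
        foreach (var line in lines)
        {
            var fields = line.Split('|');
            Action<DataRecord, string[]> handler;

            // Skip malformed lines and record types without a handler.
            if (fields.Length < 2 || !Handlers.TryGetValue(fields[0], out handler))
                continue;

            // Get or create the record for this Id once, instead of in every case.
            DataRecord record;
            if (!data.TryGetValue(fields[1], out record))
            {
                record = new DataRecord { Id = fields[1] };
                data.Add(fields[1], record);
            }
            handler(record, fields);
        }
        return data;
    }
}
```

Usage would be along the lines of PipeParser.Parse(File.ReadLines(path)). Adding a new record type then means adding one dictionary entry rather than another case block.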

LINQ Query to Filter Items By Criteria From Multiple Lists

I'm having trouble conceptualizing something that should be fairly simple using LINQ. I have a collection that I want to narrow down, or filter, based on the id values of child objects.
My primary collection consists of a List of Spots. This is what a spot looks like:
public class Spot
{
public virtual int? ID { get; set; }
public virtual string Name { get; set; }
public virtual string Description { get; set; }
public virtual string TheGood { get; set; }
public virtual string TheBad { get; set; }
public virtual IEnumerable<Season> Seasons { get; set; }
public virtual IEnumerable<PhotographyType> PhotographyTypes { get; set; }
}
I'm trying to filter the list of Spots by PhotographyType and Season. I have a list of ids for PhotographyTypes and Seasons, each in an int[] array. Those lists look like this:
criteria.PhotographyTypeIds //an int[]
criteria.SeasonIds //an int[]
I want to build a collection that only contains Spots with child objects (ids) matching those in the above lists. The goal of this functionality is filtering a set of photography spots by type and season and only displaying those that match. Any suggestions would be greatly appreciated.

Thanks everyone for the suggestions. I ended up solving the problem. It's not the best way I'm sure but it's working now. Because this is a search filter, there are a lot of conditions.
private List<Spot> FilterSpots(List<Spot> spots, SearchCriteriaModel criteria)
{
if (criteria.PhotographyTypeIds != null || criteria.SeasonIds != null)
{
List<Spot> filteredSpots = new List<Spot>();
if (criteria.PhotographyTypeIds != null)
{
foreach (int id in criteria.PhotographyTypeIds)
{
var matchingSpots = spots.Where(x => x.PhotographyTypes.Any(p => p.ID == id));
filteredSpots.AddRange(matchingSpots.ToList());
}
}
if (criteria.SeasonIds != null)
{
foreach (int id in criteria.SeasonIds)
{
if (filteredSpots.Count() > 0)
{
filteredSpots = filteredSpots.Where(x => x.Seasons.Any(p => p.ID == id)).ToList();
}
else
{
var matchingSpots = spots.Where(x => x.Seasons.Any(p => p.ID == id));
filteredSpots.AddRange(matchingSpots.ToList());
}
}
}
return filteredSpots;
}
else
{
return spots;
}
}

You have an array of IDs that has a Contains extension method that will return true when the ID is in the list. Combined with LINQ Where you'll get:
List<Spot> spots; // List of spots
int[] seasonIDs; // List of season IDs
var seasonSpots = from s in spots
where s.ID != null
where seasonIDs.Contains((int)s.ID)
select s;
You can then convert the returned IEnumerable<Spot> into a list if you want:
var seasonSpotsList = seasonSpots.ToList();

This may help you:
List<Spot> spots = new List<Spot>();
Spot s1 = new Spot();
s1.Seasons = new List<Season>()
{ new Season() { ID = 1 },
new Season() { ID = 2 },
new Season() { ID = 3 }
};
s1.PhotographyTypes = new List<PhotographyType>()
{ new PhotographyType() { ID = 1 },
new PhotographyType() { ID = 2 }
};
Spot s2 = new Spot();
s2.Seasons = new List<Season>()
{ new Season() { ID = 3 },
new Season() { ID = 4 },
new Season() { ID = 5 }
};
s2.PhotographyTypes = new List<PhotographyType>()
{ new PhotographyType() { ID = 2 },
new PhotographyType() { ID = 3 }
};
List<int> PhotographyTypeIds = new List<int>() { 1, 2};
List<int> SeasonIds = new List<int>() { 1, 2, 3, 4 };
spots.Add(s1);
spots.Add(s2);
Then:
var result = spots
.Where(input => input.Seasons.All
(i => SeasonIds.Contains(i.ID))
&& input.PhotographyTypes.All
(j => PhotographyTypeIds.Contains(j.ID))
).ToList();
// it will return 1 value
Assuming:
public class Season
{
public int ID { get; set; }
//some codes
}
public class PhotographyType
{
public int ID { get; set; }
//some codes
}
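
Note that All keeps a spot only if every one of its seasons and photography types is in the criteria. If the intent is instead "the spot has at least one matching season and at least one matching type", Any is the operator to use. A sketch of that variant, assuming a null id array means "do not filter on this dimension" (as in the asker's own solution); SpotFilter is a hypothetical name and Spot is trimmed to the properties that matter here:

```csharp
using System.Collections.Generic;
using System.Linq;

public class Season { public int ID { get; set; } }
public class PhotographyType { public int ID { get; set; } }
public class Spot
{
    public string Name { get; set; }
    public IEnumerable<Season> Seasons { get; set; }
    public IEnumerable<PhotographyType> PhotographyTypes { get; set; }
}

// Hypothetical helper name.
public static class SpotFilter
{
    // A null id array means "no filter on this dimension".
    public static List<Spot> Filter(IEnumerable<Spot> spots, int[] photographyTypeIds, int[] seasonIds)
    {
        var query = spots;

        // Keep spots with at least one photography type in the criteria.
        if (photographyTypeIds != null)
            query = query.Where(s => s.PhotographyTypes.Any(p => photographyTypeIds.Contains(p.ID)));

        // Then keep spots with at least one season in the criteria.
        if (seasonIds != null)
            query = query.Where(s => s.Seasons.Any(x => seasonIds.Contains(x.ID)));

        return query.ToList();
    }
}
```

Because each Where is only appended when its criteria are present, the two filters combine with AND, and the query is evaluated once at ToList().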

Build hierarchy from strings C#

I have a collection of strings:
"Alberton;Johannesburg"
"Allendale;Phoenix"
"Brackenhurst;Alberton"
"Cape Town;"
"Durban;"
"Johannesburg;"
"Mayville;Durban"
"Phoenix;Durban"
"Sandton;Johannesburg"
that I want to structure into a hierarchical structure in the fastest possible manner, like:
Johannesburg
    Alberton
        Brackenhurst
    Sandton
Cape Town
Durban
    Phoenix
        Allendale
    Mayville
Currently I have nested for loops and checks, but I was hoping I could achieve this with a single lambda (LINQ) query.
The strings above are in a List<string>.

I prepared a lambda-based solution, but you should really consider whether it's more readable/efficient than your current one:
Helper extension method:
public static class ChildrenGroupExtensions
{
public static List<CityInfo> GetChildren(this IEnumerable<IGrouping<string, City>> source, string parentName)
{
var cities = source.SingleOrDefault(g => g.Key == parentName);
if (cities == null)
return new List<CityInfo>();
return cities.Select(c => new CityInfo { Name = c.Name, Children = source.GetChildren(c.Name) }).ToList();
}
}
Helper Classes:
public class City
{
public string Name { get; set; }
public string Parent { get; set; }
}
public class CityInfo
{
public string Name { get; set; }
public List<CityInfo> Children { get; set; }
}
Usage:
var groups = (from i in items
let s = i.Split(new[] { ';' })
select new City { Name = s[0], Parent = s[1] }).GroupBy(e => e.Parent);
var root = groups.GetChildren(string.Empty);
Where items is your List<string>.
You can inspect the results with a simple helper method like this one:
private static void PrintTree(List<CityInfo> source, int level)
{
if (source != null)
{
source.ForEach(c =>
{
Enumerable.Range(1, level).ToList().ForEach(i => Console.Write("\t"));
Console.WriteLine(c.Name);
PrintTree(c.Children, level + 1);
});
}
}
And the results are:
Cape Town
Durban
    Mayville
    Phoenix
        Allendale
Johannesburg
    Alberton
        Brackenhurst
    Sandton

You haven't specified a data structure, so I used a class called Area that holds a list of child Areas. The core is two LINQ statements. Note that, as written, the code does not check whether an area is a child of two separate parents. Here's the test I used (the relevant lines are between the ===== comments):
[TestFixture]
public class CitiesTest
{
[Test]
public void Test()
{
var strings = new List<string>
{
"Alberton;Johannesburg",
"Allendale;Phoenix",
"Brackenhurst;Alberton",
"Cape Town;",
"Durban;",
"Johannesburg;",
"Mayville;Durban",
"Phoenix;Durban",
"Sandton;Johannesburg"
};
//===================================================
var allAreas = strings.SelectMany(x=>x.Split(';')).Where(x=>!string.IsNullOrWhiteSpace(x)).Distinct().ToDictionary(x=>x, x=>new Area{Name = x});
strings.ForEach(area =>
{
var areas = area.Split(';');
if (string.IsNullOrWhiteSpace(areas[1]))
return;
var childArea = allAreas[areas[0]];
if (!allAreas[areas[1]].Children.Contains(childArea))
allAreas[areas[1]].Children.Add(childArea);
childArea.IsParent = false;
});
var result = allAreas.Select(x=>x.Value).Where(x => x.IsParent);
//===================================================
}
public class Area
{
public string Name;
public bool IsParent;
public List<Area> Children { get; set; }
public Area()
{
Children = new List<Area>();
IsParent = true;
}
}
}
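
Another option, sketched below, is to group the child names by parent with ToLookup and build the tree recursively. A lookup returns an empty sequence for a key it does not contain, which ends the recursion at the leaves without a null check. The sketch assumes every string has exactly one ';' and that the data contains no cycles; Node and HierarchyBuilder are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Node
{
    public string Name { get; set; }
    public List<Node> Children { get; set; }
}

// Hypothetical helper name.
public static class HierarchyBuilder
{
    public static List<Node> Build(IEnumerable<string> items)
    {
        // Group child names by parent name; roots have an empty parent.
        var byParent = items
            .Select(i => i.Split(';'))
            .ToLookup(parts => parts[1], parts => parts[0]);

        return BuildLevel(byParent, string.Empty);
    }

    private static List<Node> BuildLevel(ILookup<string, string> byParent, string parent)
    {
        // A missing key yields an empty sequence, so leaves get an empty Children list.
        return byParent[parent]
            .Select(name => new Node { Name = name, Children = BuildLevel(byParent, name) })
            .ToList();
    }
}
```

HierarchyBuilder.Build(strings) returns the root nodes (Cape Town, Durban and Johannesburg for the sample data), each with its Children populated.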

LINQ - GroupBy and project to a new type?

I have a list of items, i.e. a List<SearchFilter>, and this is the SearchFilter object:
public class SearchFilter
{
public int ItemID { get { return ValueInt("ItemID"); } }
public string ItemName { get { return ValueString("ItemName"); } }
public string Type { get { return ValueString("Type"); } }
}
How do I group by the Type, and project the grouped item into a new list of GroupedFilter, i.e:
public class Filter
{
public int ItemID { get; set; }
public string ItemName { get; set; }
}
public class GroupedFilter
{
public int Type { get; set; }
public List<Filter> Filters { get; set; }
}
Thanks.

var result = items.GroupBy(
sf => sf.Type,
sf => new Filter() { ItemID = sf.ItemID, ItemName = sf.ItemName },
(t, f) => new GroupedFilter() { Type = t, Filters = new List<Filter>(f) });
But you need to make sure your GroupedFilter.Type property is a string to match your SearchFilter.Type property.

With LINQ query syntax it is longer and more complex, but just for reference:
var grpFilters = (from itm in list group itm by itm.Type into grp select
new GroupedFilter
{
Type = grp.Key,
Filters = grp.Select(g => new Filter
{
ItemID = g.ItemID,
ItemName = g.ItemName
}).ToList()
}).ToList();
Somebody may find it more readable because they don't know all the possible parameters to GroupBy().
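
For anyone who wants to run either version, here is a self-contained sketch. The ValueInt/ValueString helpers behind SearchFilter are not shown in the question, so this stand-in uses plain auto-properties instead, and GroupedFilter.Type is declared as string as suggested above. GroupingDemo is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the question's SearchFilter, with plain auto-properties.
public class SearchFilter
{
    public int ItemID { get; set; }
    public string ItemName { get; set; }
    public string Type { get; set; }
}
public class Filter
{
    public int ItemID { get; set; }
    public string ItemName { get; set; }
}
public class GroupedFilter
{
    public string Type { get; set; }   // string, to match SearchFilter.Type
    public List<Filter> Filters { get; set; }
}

// Hypothetical helper name.
public static class GroupingDemo
{
    public static List<GroupedFilter> Group(IEnumerable<SearchFilter> items)
    {
        return items
            .GroupBy(sf => sf.Type)
            .Select(g => new GroupedFilter
            {
                Type = g.Key,
                // Project each grouped SearchFilter into the slimmer Filter type.
                Filters = g.Select(sf => new Filter { ItemID = sf.ItemID, ItemName = sf.ItemName }).ToList()
            })
            .ToList();
    }
}
```

Groups come back in the order in which each Type first appears in the input, and items keep their original order within a group.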
