Using Linq to remove from set where key exists in other set? - c#

What is the proper way to do set subtraction using Linq? I have a List of 8000+ banks where I want to remove a portion of those based on the routing number. The portion is in another List and routing number is the key property to both. Here is a simplification:
public class Bank
{
public string RoutingNumber { get; set; }
public string Name { get; set; }
}
var removeThese = new List<string>() { "111", "444", "777" };
var banks = new List<Bank>()
{
new Bank() { RoutingNumber = "111", Name = "First Federal" },
new Bank() { RoutingNumber = "222", Name = "Second Federal" },
new Bank() { RoutingNumber = "333", Name = "Third Federal" },
new Bank() { RoutingNumber = "444", Name = "Fourth Federal" },
new Bank() { RoutingNumber = "555", Name = "Fifth Federal" },
new Bank() { RoutingNumber = "666", Name = "Sixth Federal" },
new Bank() { RoutingNumber = "777", Name = "Seventh Federal" },
new Bank() { RoutingNumber = "888", Name = "Eighth Federal" },
new Bank() { RoutingNumber = "999", Name = "Ninth Federal" },
};
var query = banks.Remove(banks.Where(x => removeThese.Contains(x.RoutingNumber)));

This should do the trick:
var toRemove = banks.Where(x => removeThese.Contains(x.RoutingNumber)).ToList();
var query = banks.RemoveAll(x => toRemove.Contains(x));
The first step is to make sure that you don't have to re-run that first query over and over again, whenever banks changes.
This should work too:
var query = banks.Except(toRemove);
as your second line.
EDIT
Tim Schmelter pointed out that for Except to work, you need to override Equals and GetHashCode.
So you could implement it like so:
public override string ToString()
{
// Any representation that uniquely identifies the object will do
// (JSON, CSV, XML, ...); the routing number alone is enough here:
return "Bank: " + this.RoutingNumber;
}
public override bool Equals(object obj)
{
// Two banks are equal when their identifying strings match.
return (obj is Bank) && this.ToString().Equals(obj.ToString());
}
public override int GetHashCode()
{
// Must be consistent with Equals, so it is derived from the same string.
return this.ToString().GetHashCode();
}

Generally it's less work to just pull out the ones you need rather than deleting the ones you don't, i.e.:
var query = banks.Where(x => !removeThese.Contains(x.RoutingNumber));

Filtering of this type is generally done with generic LINQ constructs:
banks = banks.Where(bank => !removeThese.Contains(bank.RoutingNumber)).ToList();
In this specific case you can also use List<T>.RemoveAll to do the filtering in-place, which will be faster:
banks.RemoveAll(bank => removeThese.Contains(bank.RoutingNumber));
Also, for performance reasons, if the number of routing numbers to remove is large, you should consider putting them into a HashSet<string> instead.
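A minimal sketch of the HashSet<string> suggestion, assuming the Bank class from the question. The BankFilter class and the RemoveByRoutingNumber method are names made up for this illustration. List<T>.Contains scans the removal list on every call, while HashSet<T>.Contains is a hash lookup, which is where the gain comes from once the removal list gets long.

```csharp
using System.Collections.Generic;

public class Bank
{
    public string RoutingNumber { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper, not part of the original question.
public static class BankFilter
{
    // Removes, in place, every bank whose routing number is in routingNumbers.
    // Returns the number of banks that were removed.
    public static int RemoveByRoutingNumber(List<Bank> banks, IEnumerable<string> routingNumbers)
    {
        // Build the lookup once; each Contains call is then a hash lookup
        // instead of a linear scan of the removal list.
        var lookup = new HashSet<string>(routingNumbers);
        return banks.RemoveAll(bank => lookup.Contains(bank.RoutingNumber));
    }
}
```

With the data from the question, BankFilter.RemoveByRoutingNumber(banks, removeThese) would leave six banks in the list and return the count of removed ones.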

Either use the Linq extension methods Where and ToList to create a new list or use List.RemoveAll which is more efficient since it modifies the original list:
banks = banks.Where(x => !removeThese.Contains(x.RoutingNumber)).ToList();
banks.RemoveAll(x => removeThese.Contains(x.RoutingNumber));
Of course you have to reverse the condition since the former keeps what Where leaves and the latter removes what the predicate in RemoveAll returns.

Have you tried using RemoveAll()?
var query = banks.RemoveAll(p => removeThese.Contains(p.RoutingNumber));
This will remove any values from banks where a matching record is present in removeThese.
query will contain the number of records removed from the list.
Note: The original variable banks will be updated directly by this query; a reassignment is not required.

You can use RemoveAll()
var removedCount = banks.RemoveAll(x => removeThese.Contains(x.RoutingNumber));
or
banks = banks.Where(bank => !removeThese.Contains(bank.RoutingNumber)).ToList();

Related

How to mock which is returning list of integers on some where condition?

I am stuck at the step below:
var ids = _repository.GetIQueryable<Customers>().Where(lrt => lrt.IsActive == true &&
lrt.NextRoleId == defaultRoleSetting.RoleId &&
lrt.NextUserId == null).Select(x => x.MasterId).Distinct().Take(100).ToHashSet();
I tried this but I don't find the right Returns syntax
_mockRepository.Setup(s => s.GetIQueryable<Customers>()).Returns<List<int>>(ids =>
{
return ????;
});
As you are creating a setup for GetIQueryable<Customers> you'd not return a list of integers, but instead an IQueryable of Customers objects that are filtered afterwards:
IQueryable<Customers> models = new Customers[] {
new Customers() { MasterId = 1, IsActive = true, NextRoleId = nextRoleId, ... },
new Customers() { MasterId = 2, IsActive = false, NextRoleId = nextRoleId, ... },
new Customers() { MasterId = 3, IsActive = true, NextRoleId = nextRoleId, ... },
}.AsQueryable();
_mockRepository
.Setup(s => s.GetIQueryable<Customers>())
.Returns(models);
In this sample, you create an array of Customers objects and set the properties of the customers so that the filter afterwards works on the IQueryable. Which properties to set on the Customers objects depends on the classes and your test case.
The code below solved my issue. Thanks, everyone.
var fakeCustomers = FakeCustomers();
_repository.Setup(s => s.GetIQueryable<Customers>()).Returns(fakeCustomers.AsQueryable());
private List<Customers> FakeCustomers()
{
string fakeData = @"[{
'Id': '118',
'CreatedBy':'00000000-0000-0000-0000-000000000000',
'SId':'4',
'UserId':'00000000-0000-0000-0000-000000000000',
'NId':'7'
}]";
return JsonConvert.DeserializeObject<List<Customers>>(fakeData);
}

Updating property values in one list with a property value average of matching items in another list

I have two Lists and need to update a property value of all the items in the 1st list with a property value average of all the matching items in another list.
class transaction
{
public string orderId;
public string parentOrderId;
public int quantity;
public decimal marketPrice;
public decimal fillPrice;
}
List<transaction> makerTransactions = new List<transaction>()
{
new transaction(){
orderId = "1",
parentOrderId = "1",
quantity = 100,
marketPrice = 75.87M,
fillPrice = 75.87M
}
};
List<transaction> takerTransactions = new List<transaction>()
{
new transaction(){
orderId = "2",
parentOrderId = "1",
quantity = 50,
marketPrice = 75.97M,
fillPrice = 75.97M
},
new transaction(){
orderId = "3",
parentOrderId = "1",
quantity = 50,
marketPrice = 75.85M,
fillPrice = 75.85M
}
};
I am trying to make this work with LINQ extension methods but can't figure out the correct way.
makerTransactions.All(mt => mt.fillPrice = takerTransactions
.Where(tt => tt.parentOrderId == mt.orderId)
.Average(ta => ta.fillPrice));
Try this:
makerTransactions.ForEach(mt => mt.fillPrice = takerTransactions
.Where(tt => tt.parentOrderId == mt.orderId)
.Average(ta => ta.fillPrice));
All is an extension method that tells you whether every element in a collection satisfies a condition. It is not meant for updating elements, so it is not what you need here.
To make it more efficient, first create a dictionary and use that to take the averages from:
var priceDictionary = takerTransactions
.GroupBy(tt => tt.parentOrderId)
.ToDictionary(grp => grp.Key, grp => grp.Average(ta => ta.fillPrice));
makerTransactions.ForEach(mt => mt.fillPrice = priceDictionary[mt.orderId]);
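One caveat with both variants above: a maker transaction that has no matching taker transactions makes Average throw on the empty sequence, and makes the dictionary indexer throw a KeyNotFoundException. A minimal sketch that skips such makers instead, assuming that leaving their fill price untouched is the desired behaviour. The Transaction class is a trimmed copy of the one in the question, and FillPriceUpdater and ApplyAverageFillPrices are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Transaction
{
    public string OrderId;
    public string ParentOrderId;
    public decimal FillPrice;
}

// Hypothetical helper for illustration only.
public static class FillPriceUpdater
{
    public static void ApplyAverageFillPrices(List<Transaction> makers, List<Transaction> takers)
    {
        // One pass over the takers: parent order id -> average fill price.
        var averages = takers
            .GroupBy(t => t.ParentOrderId)
            .ToDictionary(g => g.Key, g => g.Average(t => t.FillPrice));

        foreach (var maker in makers)
        {
            // TryGetValue avoids an exception for makers without any takers;
            // their FillPrice is left as it was (an assumption of this sketch).
            if (averages.TryGetValue(maker.OrderId, out var average))
            {
                maker.FillPrice = average;
            }
        }
    }
}
```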

Group items by the items it holds

Please note: My question contains pseudo code!
In my army I have foot soldiers.
Every soldier is unique: name, strength etc...
All soldiers have inventory. It can be empty.
Inventory can contain: weapons, shields, other items.
I want to group my footsoldiers by their exact inventory.
Very simple example:
I have a collection of:
Weapons: {"AK-47", "Grenade", "Knife"}
Shields: {"Aegis"}
OtherItems: {"KevlarVest"}
Collection of footsoldiers. (Count = 6)
"Joe" : {"AK-47", "Kevlar Vest"}
"Fred" : {"AK-47"}
"John" : {"AK-47", "Grenade"}
"Rambo" : {"Knife"}
"Foo" : {"AK-47"}
"Bar" : {"KevlarVest"}
These are the resulting groups (count=5) : (already in specific order now)
{"AK-47"}
{"AK-47", "Grenade"}
{"AK-47", "Kevlar Vest"}
{"Knife"}
{"KevlarVest"}
I want to sort the groups by: Weapons, then by shields, then by other items in specific order in which they are declared within their collection.
When I open the inventorygroup {"Knife"} I will find a collection with 1 footsoldier named "Rambo".
Please note: I have made this simplified version, in order not to distract you with the complexity of the data at hand. In my business case I am working with ConditionalActionFlags, that may hold Conditions of a certain type.
Hereby I supply a TestMethod that still fails now.
Can you rewrite the GetSoldierGroupings method so that the TestSoldierGroupings method succeeds ?
public class FootSoldier
{
public string Name { get; set; }
public string[] Inventory { get; set; }
}
public class ArrayComparer<T> : IEqualityComparer<T[]>
{
public bool Equals(T[] x, T[] y)
{
return x.SequenceEqual(y);
}
public int GetHashCode(T[] obj)
{
return obj.Aggregate(string.Empty, (s, i) => s + i.GetHashCode(), s => s.GetHashCode());
}
}
[TestMethod]
public void TestSoldierGroupings()
{
//Arrange
var weapons = new[] { "AK-47", "Grenade", "Knife" };
var shields = new[] { "Aegis" };
var otherItems = new[] { "KevlarVest" };
var footSoldiers = new FootSoldier[]
{
new FootSoldier() { Name="Joe" , Inventory= new string[]{ "AK-47", "Kevlar Vest" } },
new FootSoldier() { Name="Fred" , Inventory= new string[]{ "AK-47" } },
new FootSoldier() { Name="John" , Inventory= new string[]{ "AK-47", "Grenade" } },
new FootSoldier() { Name="Rambo" , Inventory= new string[]{ "Knife" } },
new FootSoldier() { Name="Foo" , Inventory= new string[]{ "AK-47" } },
new FootSoldier() { Name="Bar" , Inventory= new string[]{ "Kevlar Vest" } }
};
//Act
var result = GetSoldierGroupings(footSoldiers, weapons, shields, otherItems);
//Assert
Assert.AreEqual(result.Count, 5);
Assert.AreEqual(result.First().Key, new[] { "AK-47" });
Assert.AreEqual(result.First().Value.Count(), 2);
Assert.AreEqual(result.Last().Key, new[] { "Kevlar Vest" });
Assert.AreEqual(result[new[] { "Knife" }].First().Name, "Rambo");
}
public Dictionary<string[], FootSoldier[]> GetSoldierGroupings(FootSoldier[] footSoldiers,
string[] weapons,
string[] shields,
string[] otherItems)
{
//var result = new Dictionary<string[], FootSoldier[]>();
var result = footSoldiers
.GroupBy(fs => fs.Inventory, new ArrayComparer<string>())
.ToDictionary(x => x.Key, x => x.ToArray());
//TODO: the actual sorting.
return result;
}
You need to group your soldiers by a key that combines all of their items. This can be done with a custom comparer.
Personally, I would keep it simpler and use String.Join with a separator that cannot occur in the name of any weapon, shield, etc.
Assuming that a soldier has a property Items which is an array of strings (like ["AK-47", "Kevlar Vest"]), you can do something like this:
var groups = soldiers
.GroupBy(s => String.Join("~~~", s.Items))
.ToDictionary(g => g.First().Items, g => g.ToArray());
The result is a Dictionary where each key is a unique item set and each value is an array of all soldiers holding exactly that set.
You may change this code so that it returns an IGrouping, an array of classes or structs, a Dictionary, or whatever else is convenient for you.
I would go for a Dictionary or an array of something like SoldiersItemGroup[] with items and soldiers as properties.
Make sure to choose a join separator that no item name can possibly contain.
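The sorting part of the question can be sketched on top of the String.Join grouping. This is only one possible approach, under several assumptions: the declared order is passed in as a single array (weapons, then shields, then other items), every inventory lists its items in that same order, and the "~~~" separator never occurs in an item name. SoldierGrouping, GroupByInventory and CompareRanks are hypothetical names. A list of pairs is returned rather than a Dictionary, because a Dictionary does not guarantee any enumeration order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class FootSoldier
{
    public string Name { get; set; }
    public string[] Inventory { get; set; }
}

// Hypothetical helper for illustration only.
public static class SoldierGrouping
{
    // Returns one entry per distinct inventory, ordered by the position of
    // its items in itemOrder. Items missing from itemOrder get rank -1 and
    // therefore sort first.
    public static List<KeyValuePair<string[], FootSoldier[]>> GroupByInventory(
        IEnumerable<FootSoldier> soldiers, string[] itemOrder)
    {
        const string Separator = "~~~"; // assumed never to occur inside an item name

        return soldiers
            .GroupBy(s => string.Join(Separator, s.Inventory))
            .Select(g => new KeyValuePair<string[], FootSoldier[]>(g.First().Inventory, g.ToArray()))
            .OrderBy(
                pair => pair.Key.Select(item => Array.IndexOf(itemOrder, item)).ToArray(),
                Comparer<int[]>.Create(CompareRanks))
            .ToList();
    }

    // Lexicographic comparison of rank sequences; a shorter sequence sorts
    // before a longer one that starts with the same ranks.
    private static int CompareRanks(int[] x, int[] y)
    {
        int shared = Math.Min(x.Length, y.Length);
        for (int i = 0; i < shared; i++)
        {
            int byRank = x[i].CompareTo(y[i]);
            if (byRank != 0) return byRank;
        }
        return x.Length.CompareTo(y.Length);
    }
}
```

A caller would pass something like weapons.Concat(shields).Concat(otherItems).ToArray() as itemOrder.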

Raven returning wrong document in OrderByDescending Statement

I have 50,000 documents in my Raven database, but when I run this query the Id of the latestProfile object is returned as 9999 (the first id in the db is 0, so this is the ten thousandth item).
//find the profile with the highest ID now existing in the collection
var latestProfile = session.Query<SiteProfile>()
.Customize(c => c.WaitForNonStaleResults())
.OrderByDescending(p => p.Id)
.FirstOrDefault();
//latestProfile.Id is 9999 here
//See how many items there are in the collection. This returns 50,000
var count = session.Query<SiteProfile>()
.Customize(c => c.WaitForNonStaleResults()).Count();
My guess is that Raven is paging before my OrderByDescending statement, but:
The default page size is 10, and even the max is 1024.
All the parts of this are either IRavenQueryable or IQueryable.
It is also not a stale index, as I have tested this with WaitForNonStaleResults().
I expect the most recently added Id (50,000) to be the item returned here, but it is not.
Why not? This looks like a bug in Raven to me.
EDIT:
Ok, so I now know exactly why, but it still looks like a bug. Here is a list of the items from that same list actualised by a ToArray()
{ Id = 9999 },
{ Id = 9998 },
{ Id = 9997 },
{ Id = 9996 },
{ Id = 9995 },
{ Id = 9994 },
{ Id = 9993 },
{ Id = 9992 },
{ Id = 9991 },
{ Id = 9990 },
{ Id = 999 }, //<-- Whoops! This is text order not int order
{ Id = 9989 },
So even though my Id column is an integer, Raven stores it internally as a string and orders by that representation. Clearly Raven's Queryable implementation is resolving the ordering before checking types.
I have read that you can define the sort order to use integer sorting on defined indexes, but really, this should not matter. In a strongly typed language integers should be sorted as integers.
Is there a way to make this Id ordering correct? Do I actually have to resort to creating a special index on the Id column just to get integers ordered correctly?
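The text-versus-number ordering seen above can be reproduced in plain LINQ, without RavenDB. The sketch below says nothing about Raven's internals; it only shows that sorting the string form of the ids puts 999 between 9990 and 9989, exactly as in the listing. OrderingDemo and its two methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical demo class, unrelated to the RavenDB client API.
public static class OrderingDemo
{
    // Sorts by the text form of each id, character by character.
    public static List<int> DescendingAsText(IEnumerable<int> ids) =>
        ids.OrderByDescending(id => id.ToString(CultureInfo.InvariantCulture), StringComparer.Ordinal)
           .ToList();

    // Sorts by numeric value.
    public static List<int> DescendingAsNumbers(IEnumerable<int> ids) =>
        ids.OrderByDescending(id => id).ToList();
}
```

For the ids 999, 9989, 9990 and 9999, DescendingAsText yields 9999, 9990, 999, 9989, while DescendingAsNumbers yields 9999, 9990, 9989, 999.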
UPDATE 2:
I am now using an index as follows:
public SiteProfiles_ByProfileId()
{
Map = profiles => from profile in profiles
select new
{
profile.Id
};
Sort(x => x.Id, SortOptions.Int);
}
To try and force it to understand integers. I can see that my index is called via the Raven server console as follows:
Request # 249: GET - 3 ms - Bede.Profiles - 200 - /indexes/SiteProfiles/ByProfileId?&pageSize=1&sort=-__document_id&operationHeadersHash=-1789353429
Query:
Time: 3 ms
Index: SiteProfiles/ByProfileId
Results: 1 returned out of 20,000 total.
but still it comes back with string ordered results. I have seen advice not to use integers as the id, but that would cause massive issues on this project as there are 3rd parties referencing the current ids (in the old service this is designed to replace).
UPDATE 3: I have a specific unit test that shows the issue. It appears to work fine for any integer property except for the Id.
[TestMethod]
public void Test_IndexAllowsCorrectIntSortingWhenNotId()
{
using (var store = new EmbeddableDocumentStore() {RunInMemory = true})
{
store.Initialize();
IndexCreation.CreateIndexes(typeof(MyFakeProfiles_ByProfileId).Assembly, store);
using (var session = store.OpenSession())
{
var profiles = new List<MyFakeProfile>()
{
new MyFakeProfile() { Id=80, Age = 80, FirstName = "Grandpa", LastName = "Joe"},
new MyFakeProfile() { Id=9, Age = 9,FirstName = "Jonny", LastName = "Boy"},
new MyFakeProfile() { Id=22, Age = 22, FirstName = "John", LastName = "Smith"}
};
foreach (var myFakeProfile in profiles)
{
session.Store(myFakeProfile, "MyFakeProfiles/" + myFakeProfile.Id);
}
session.SaveChanges();
var oldestPerson = session.Query<MyFakeProfile>().Customize(c => c.WaitForNonStaleResults())
.OrderByDescending(p => p.Age).FirstOrDefault();
var youngestPerson = session.Query<MyFakeProfile>().Customize(c => c.WaitForNonStaleResults())
.OrderBy(p => p.Age).FirstOrDefault();
var highestId = session.Query<MyFakeProfile>("MyFakeProfiles/ByProfileId").Customize(c => c.WaitForNonStaleResults())
.OrderByDescending(p => p.Id).FirstOrDefault();
var lowestId = session.Query<MyFakeProfile>("MyFakeProfiles/ByProfileId").Customize(c => c.WaitForNonStaleResults())
.OrderBy(p => p.Id).FirstOrDefault();
//sanity checks for ordering in Raven
Assert.AreEqual(80,oldestPerson.Age); //succeeds
Assert.AreEqual(9, youngestPerson.Age);//succeeds
Assert.AreEqual(80, highestId.Id);//fails
Assert.AreEqual(9, lowestId.Id);//fails
}
}
}
private void PopulateTestValues(IDocumentSession session)
{
var profiles = new List<MyFakeProfile>()
{
new MyFakeProfile() { Id=80, Age = 80, FirstName = "Grandpa", LastName = "Joe"},
new MyFakeProfile() { Id=9, Age = 9,FirstName = "Jonny", LastName = "Boy"},
new MyFakeProfile() { Id=22, Age = 22, FirstName = "John", LastName = "Smith"}
};
foreach (var myFakeProfile in profiles)
{
session.Store(myFakeProfile, "MyFakeProfiles/" + myFakeProfile.Id);
}
}
}
public class MyFakeProfile
{
public int Id { get; set; }
public int Age { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
}
public class MyFakeProfiles_ByProfileId : AbstractIndexCreationTask<MyFakeProfile>
{
// The index name generated by this is going to be MyFakeProfiles/ByProfileId
public MyFakeProfiles_ByProfileId()
{
Map = profiles => from profile in profiles
select new
{
profile.Id
};
Sort(x => (int)x.Id, SortOptions.Int);
}
}
You need to specify the type of the field on the index, see http://ravendb.net/docs/2.5/client-api/querying/static-indexes/customizing-results-order
Side note, IDs in RavenDB are always strings. You seem to be trying to use integer IDs - don't do that.
You can provide multiple Sort fields; so far you have only defined one for Id:
public SiteProfiles_ByProfileId()
{
Map = profiles => from profile in profiles
select new
{
profile.Id
};
Sort(x => x.Id, SortOptions.Int);
Sort(x => x.Age, SortOptions.Int);
}
BUT ... I am unsure of the effects of applying a sort on a field that isn't mapped.
You may have to extend the mapping to select both fields, like this:
public SiteProfiles_ByProfileId()
{
Map = profiles => from profile in profiles
select new
{
profile.Id,
profile.Age
};
Sort(x => x.Id, SortOptions.Int);
Sort(x => x.Age, SortOptions.Int);
}

Removing Duplicates from bottom of Generic List

I am trying to remove duplicate items from the bottom of a generic list. I have a class defined as below
public class Identifier
{
public string Name { get; set; }
}
And I have defined another class which implements IEqualityComparer to remove the duplicates from List
public class DistinctIdentifierComparer : IEqualityComparer<Identifier>
{
public bool Equals(Identifier x, Identifier y)
{
return x.Name == y.Name;
}
public int GetHashCode(Identifier obj)
{
return obj.Name.GetHashCode();
}
}
However, I am trying to remove the old items and keep the latest. For example if I have list of identifier defined as below
Identifier idn1 = new Identifier { Name = "X" };
Identifier idn2 = new Identifier { Name = "Y" };
Identifier idn3 = new Identifier { Name = "Z" };
Identifier idn4 = new Identifier { Name = "X" };
Identifier idn5 = new Identifier { Name = "P" };
Identifier idn6 = new Identifier { Name = "X" };
List<Identifier> list = new List<Identifier>();
list.Add(idn1);
list.Add(idn2);
list.Add(idn3);
list.Add(idn4);
list.Add(idn5);
list.Add(idn6);
And I have implemented
var res = list.Distinct(new DistinctIdentifierComparer());
How do I make sure by using distinct that I am keeping idn6 and removing idn1 and idn4?
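Most LINQ operators are order-preserving, and in practice Distinct() keeps the first instance of each item it comes across. If you want the last instance, reverse the sequence first. Note that list.Reverse() on a List<T> binds to the in-place List<T>.Reverse method, which returns void, so call the LINQ version explicitly:
var res = Enumerable.Reverse(list).Distinct(new DistinctIdentifierComparer());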
Another option that would avoid you having to define an explicit comparer would be:
var res = list.GroupBy(i => i.Name).Select(g => g.Last());
From MSDN:
The IGrouping objects are yielded in an order based on
the order of the elements in source that produced the first key of
each IGrouping. Elements in a grouping are yielded in
the order they appear in source.
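The two "keep the last duplicate" approaches differ in the order of their results, which a small sketch can make visible. It is written generically so it stands alone; KeepLatest, ByGrouping and ByReversing are hypothetical names, and the second method tracks seen keys with a HashSet instead of calling Distinct, so it does not rely on which duplicate Distinct happens to keep.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only.
public static class KeepLatest
{
    // Keeps the last item per key; the result follows the order in which
    // each key FIRST appears in source.
    public static List<T> ByGrouping<T, TKey>(IEnumerable<T> source, Func<T, TKey> keySelector) =>
        source.GroupBy(keySelector).Select(g => g.Last()).ToList();

    // Keeps the last item per key; the result follows the order in which
    // each key LAST appears in source.
    public static List<T> ByReversing<T, TKey>(IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var seen = new HashSet<TKey>();
        var kept = new List<T>();
        // source is IEnumerable<T>, so this is the LINQ Enumerable.Reverse.
        foreach (var item in source.Reverse())
        {
            // HashSet.Add returns false for a key that was already seen.
            if (seen.Add(keySelector(item))) kept.Add(item);
        }
        kept.Reverse(); // List<T>.Reverse: restores the original direction in place
        return kept;
    }
}
```

For the names X, Y, Z, X, P, X, the first method keeps the final X but reports it in first position (X, Y, Z, P), while the second reports it last (Y, Z, P, X).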
You could also implement a custom add method to maintain the latest records:
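public class IdentifierList : List<Identifier>
{
// "new" hides List<T>.Add, so this only runs when the variable is typed as IdentifierList.
public new void Add(Identifier item)
{
// Drop any older item with the same name, then append the latest one.
this.RemoveAll(x => x.Name == item.Name);
base.Add(item);
}
}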
Identifier idn1 = new Identifier { Name = "X" };
Identifier idn2 = new Identifier { Name = "Y" };
Identifier idn3 = new Identifier { Name = "Z" };
Identifier idn4 = new Identifier { Name = "X" };
Identifier idn5 = new Identifier { Name = "P" };
Identifier idn6 = new Identifier { Name = "X" };
IdentifierList list = new IdentifierList ();
list.Add(idn1);
list.Add(idn2);
list.Add(idn3);
list.Add(idn4);
list.Add(idn5);
list.Add(idn6);
You could Group and check if any Count is > 1
var distinctWorked = !(res
.GroupBy(a => a.Name)
.Select(g => new{g.Key, Count = g.Count()})
.Any(a => a.Count > 1));
