Having trouble with LINQ left join on List<FileInfo> - c#

I have two List<FileInfo> lists, SourceFiles and DestFiles. I want to build a LINQ query that will return a list of the items whose filenames are in Source but not in Dest, i.e. a left join.
My data set for SourceFiles is:
folder1\a.txt
folder1\b.txt
folder1\c.txt
folder1\d.txt
DestFiles is:
folder2\a.txt
folder2\b.txt
folder2\c.txt
so the query should return folder1\d.txt.
Following the MSDN example, I've tried using LINQ syntax:
var queryX = from s in SourceFiles
join d in DestFiles
on s.Name equals d.Name
into SourceJoinDest
from joinRow in SourceJoinDest.DefaultIfEmpty()
select new
{
joinRow.FullName
};
and using extension methods:
var query = SourceFiles.GroupJoin(DestFiles,
source => source.Name,
dest => dest.Name,
(source,dest) => new
{
path = source.FullName
}).Select(x => x.path.DefaultIfEmpty())
But neither of these works; the LINQ syntax version throws "Object reference not set to an instance of an object" and the extension method version returns "Enumeration yielded no results".
I realize that these queries are only returning sets of FullName properties and not the full FileInfo objects; I have code that takes each FullName and returns a FileInfo, and does this for each item in the query to rebuild the list. But if there's a way to return a FileInfo directly from the query, that would be great.

I don't think Join is the ideal tool here. What you are looking for is essentially an Except. The built-in Except has no overload that takes a key-selector lambda, so to use it you would have to write your own IEqualityComparer. You can get the same result without one, though:
var excepts = SourceFiles.Where(c => !DestFiles.Any(p => p.Name == c.Name)).ToList();
Or, to select just the full path, you can use Select at the end.
var excepts = SourceFiles.Where(c => !DestFiles.Any(p => p.Name == c.Name))
.Select(f => f.FullName).ToList();
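The Where/Any query above rescans DestFiles for every source file. For larger lists, a minimal sketch of the same idea with a HashSet lookup could look like the following. The names FileListDiff and MissingFrom are made up for illustration, and the case-insensitive comparer is an assumption that suits Windows file names. On .NET 6 or later, Enumerable.ExceptBy covers similar ground.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class FileListDiff
{
    // Returns the files in source whose Name does not appear in dest.
    // OrdinalIgnoreCase is an assumption; use Ordinal for case-sensitive file systems.
    public static List<FileInfo> MissingFrom(IEnumerable<FileInfo> source, IEnumerable<FileInfo> dest)
    {
        var destNames = new HashSet<string>(dest.Select(d => d.Name), StringComparer.OrdinalIgnoreCase);
        return source.Where(s => !destNames.Contains(s.Name)).ToList();
    }

    public static void Main()
    {
        // FileInfo objects can be created for paths that do not exist on disk.
        var sourceFiles = new[] { "a.txt", "b.txt", "c.txt", "d.txt" }
            .Select(n => new FileInfo(Path.Combine("folder1", n))).ToList();
        var destFiles = new[] { "a.txt", "b.txt", "c.txt" }
            .Select(n => new FileInfo(Path.Combine("folder2", n))).ToList();

        foreach (var f in MissingFrom(sourceFiles, destFiles))
            Console.WriteLine(f.Name); // d.txt
    }
}
```

This returns the FileInfo objects themselves, so there is no need to rebuild them from FullName afterwards.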
I would suggest having extension methods to do quick Except and Intersect.
public static IEnumerable<U> Except<R, S, T, U>(this IEnumerable<R> mainList,
IEnumerable<S> toBeSubtractedList,
Func<R, T> mainListFunction,
Func<S, T> toBeSubtractedListFunction,
Func<R, U> resultSelector)
{
return EnumerateToCheck(mainList, toBeSubtractedList, mainListFunction,
toBeSubtractedListFunction, resultSelector, false);
}
static IEnumerable<U> EnumerateToCheck<R, S, T, U>(IEnumerable<R> mainList,
IEnumerable<S> secondaryList,
Func<R, T> mainListFunction,
Func<S, T> secondaryListFunction,
Func<R, U> resultSelector,
bool ifFound)
{
foreach (var r in mainList)
{
bool found = false;
foreach (var s in secondaryList)
{
if (object.Equals(mainListFunction(r), secondaryListFunction(s)))
{
found = true;
break;
}
}
if (found == ifFound)
yield return resultSelector(r);
}
//or may be just
//return mainList.Where(r => secondaryList.Any(s => object.Equals(mainListFunction(r), secondaryListFunction(s))) == ifFound)
// .Select(r => resultSelector(r));
//but I like the verbose way.. easier to debug..
}
public static IEnumerable<U> Intersect<R, S, T, U>(this IEnumerable<R> mainList,
IEnumerable<S> toIntersectList,
Func<R, T> mainListFunction,
Func<S, T> toIntersectListFunction,
Func<R, U> resultSelector)
{
return EnumerateToCheck(mainList, toIntersectList, mainListFunction,
toIntersectListFunction, resultSelector, true);
}
Now in your case you can do just:
var excepts = SourceFiles.Except(DestFiles, p => p.Name, p => p.Name, p => p.FullName)
.ToList();

Instead of using a join you might be able to handle this with .Except()
var enumerable = sourceFiles.Except(destFiles, new FileInfoComparer<FileInfo>((f1, f2)=>f1.Name == f2.Name, f=>f.Name.GetHashCode()));
.Except() takes an IEqualityComparer<T> which you can write yourself or use a wrapper that takes a lambda.
class FileInfoComparer<T> : IEqualityComparer<T>
{
public FileInfoComparer(Func<T, T, bool> equals, Func<T, int> getHashCode)
{
_equals = equals;
_getHashCode = getHashCode;
}
readonly Func<T, T, bool> _equals;
public bool Equals(T x, T y)
{
return _equals(x, y);
}
readonly Func<T, int> _getHashCode;
public int GetHashCode(T obj)
{
return _getHashCode(obj);
}
}
Running it against the sample data returns a single FileInfo object, the one for "d.txt".

You almost did it. But you need to take only those source files, which do not have joined destination files:
var query = from s in SourceFiles
join d in DestFiles
on s.Name equals d.Name into g
where !g.Any() // empty group!
select s;
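For comparison, here is a sketch of what the same group join might look like in extension method syntax, since the question tried that form as well. The class and method names (GroupJoinDemo, Unmatched) are hypothetical; only GroupJoin, Where, Select and Any come from LINQ itself.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class GroupJoinDemo
{
    // Keeps only the source files whose group of matching destination files is empty.
    public static List<FileInfo> Unmatched(List<FileInfo> sourceFiles, List<FileInfo> destFiles)
    {
        return sourceFiles
            .GroupJoin(destFiles,
                       s => s.Name,
                       d => d.Name,
                       (s, matches) => new { Source = s, Matches = matches })
            .Where(x => !x.Matches.Any())
            .Select(x => x.Source)
            .ToList();
    }

    public static void Main()
    {
        var sourceFiles = new List<FileInfo> { new FileInfo("a.txt"), new FileInfo("d.txt") };
        var destFiles = new List<FileInfo> { new FileInfo("a.txt") };

        foreach (var f in Unmatched(sourceFiles, destFiles))
            Console.WriteLine(f.Name); // d.txt
    }
}
```

The result selector has to keep the group around (here as Matches) so that the later Where can test whether it is empty.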

Related

Merge two list based on Id [duplicate]

I am taking a union of two lists using Linq to Sql. Using List1 and List2:
var tr = List1.Union(List2).ToList();
Union works fine, but the problem is that it compares every column and removes some of the rows that I want to keep. So I was wondering if there is a way to perform a union based on one column only, say id, of each list?
Something Like:
var t = List1.id.Union(List2.id).ToList();
This doesn't work, but I was wondering if there is a way to do this, either with LINQ or T-SQL
You should use the Union() overload that takes a custom equality comparer, or something like this:
list1.Concat(list2).GroupBy(x => x.DateProperty).Select(m => m.First());
The first solution is certainly more efficient.
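To make the Concat/GroupBy variant concrete, here is a small sketch that assumes the key column is an int Id. The Item class and the UnionById name are invented for the example and are not part of the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for the asker's model.
public class Item
{
    public int Id { get; set; }
    public string Value { get; set; }
}

public static class UnionDemo
{
    // Concatenates both lists and keeps the first row seen for each Id,
    // so rows from the first list win when an Id appears in both.
    public static List<Item> UnionById(IEnumerable<Item> first, IEnumerable<Item> second)
    {
        return first.Concat(second)
                    .GroupBy(x => x.Id)
                    .Select(g => g.First())
                    .ToList();
    }

    public static void Main()
    {
        var list1 = new List<Item> { new Item { Id = 1, Value = "a" }, new Item { Id = 2, Value = "b" } };
        var list2 = new List<Item> { new Item { Id = 2, Value = "B" }, new Item { Id = 3, Value = "c" } };

        foreach (var item in UnionById(list1, list2))
            Console.WriteLine(item.Id + " " + item.Value);
        // 1 a
        // 2 b
        // 3 c
    }
}
```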
Sure, you need a custom IEqualityComparer with Union. I have one that's really dynamic, big block of code incoming though:
public class PropertyEqualityComparer<TObject, TProperty>
: IEqualityComparer<TObject>
{
Func<TObject, TProperty> _selector;
IEqualityComparer<TProperty> _internalComparer;
public PropertyEqualityComparer(Func<TObject, TProperty> propertySelector,
IEqualityComparer<TProperty> innerEqualityComparer = null)
{
_selector = propertySelector;
_internalComparer = innerEqualityComparer;
}
public int GetHashCode(TObject obj)
{
return _selector(obj).GetHashCode();
}
public bool Equals(TObject x, TObject y)
{
IEqualityComparer<TProperty> comparer =
_internalComparer ?? EqualityComparer<TProperty>.Default;
return comparer.Equals(_selector(x), _selector(y));
}
}
public static class PropertyEqualityComparer
{
public static PropertyEqualityComparer<TObject, TProperty>
GetNew<TObject, TProperty>(Func<TObject, TProperty> propertySelector)
{
return new PropertyEqualityComparer<TObject, TProperty>
(propertySelector);
}
public static PropertyEqualityComparer<TObject, TProperty>
GetNew<TObject, TProperty>
(Func<TObject, TProperty> propertySelector,
IEqualityComparer<TProperty> comparer)
{
return new PropertyEqualityComparer<TObject, TProperty>
(propertySelector, comparer);
}
}
Now, all you need to do is call Union with that equality comparer (instantiated with a lambda that fits your circumstance):
var tr = List1.Union(List2, PropertyEqualityComparer.GetNew(n => n.Id)).ToList();
Try something like this:
var List3 = List1.Join(
List2,
l1 => l1.Id,
l2 => l2.Id,
(l1, l2) => new Model
{
Id = l1.Id,
Val1 = l1.Val1, // or another value
Val2 = l2.Val2 // or another value
});
For more detail, please show your model.
Try this:
var merged = new List<Person>(list1);
merged.AddRange(list2.Where(p2 =>
list1.All(p1 => p1.Id != p2.Id)));

How to pass delegates as <T> parameter in function

public delegate bool CompareValue<in T1, in T2>(T1 val1, T2 val2);
public static bool CompareTwoLists<T1, T2>(IEnumerable<T1> list1, IEnumerable<T2> list2, CompareValue<T1, T2> compareValue)
{
return list1.Select(item1 => list2.Any(item2 => compareValue(item1, item2))).All(search => search)
&& list2.Select(item2 => list1.Any(item1 => compareValue(item1, item2))).All(search => search);
}
In the above function, how do I pass "compareValue" as a parameter when calling "CompareTwoLists"?
With a lambda expression that matches the delegate:
var people = new List<Person>();
var orders = new List<Order>();
bool result = CompareTwoLists(people, orders,
(person, order) => person.Id == order.PersonId);
Or as a reference to a method that matches the delegate:
static bool PersonMatchesOrder(Person person, Order order)
{
return person.Id == order.PersonId;
}
bool result = CompareTwoLists(people, orders, PersonMatchesOrder);
You need to create a method (Normal or Anonymous) that matches that delegate's signature. Below is a sample:
var list1 = new List<string>();
var list2 = new List<int>();
CompareValue<string, int> compareValues = (x, y) => true;
CompareTwoLists(list1, list2, compareValues);
You can also replace the lambda with a named method:
CompareValue<string, int> compareValues = SomeComparingMethod;
static bool SomeComparingMethod(string str, int number)
{
// code here
}
Another Approach
You can change your method to use Func:
public static bool CompareTwoLists<T1, T2>(IEnumerable<T1> list1, IEnumerable<T2> list2,
Func<T1, T2, bool> compareValue)
{
return list1.All(x => list2.Any(y => compareValue(x, y)))
&& list2.All(x => list1.Any(y => compareValue(y, x)));
}
And change the caller to:
Func<User, Role, bool> compareValues =
(u, r) => r.Active
&& u.Something == r.Something
&& u.SomethingElse != r.SomethingElse;
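Putting the Func-based version together, a complete sketch might look like this. The string and int lists and the sameLength comparison are made-up sample data, chosen only so the example runs on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListComparison
{
    // True when every item in each list has at least one match in the other list.
    public static bool CompareTwoLists<T1, T2>(IEnumerable<T1> list1, IEnumerable<T2> list2,
                                               Func<T1, T2, bool> compareValue)
    {
        return list1.All(x => list2.Any(y => compareValue(x, y)))
            && list2.All(y => list1.Any(x => compareValue(x, y)));
    }

    public static void Main()
    {
        var names = new List<string> { "one", "three" };
        var lengths = new List<int> { 3, 5 };

        // The lambda is converted to Func<string, int, bool> at the call site.
        Func<string, int, bool> sameLength = (s, n) => s.Length == n;

        Console.WriteLine(CompareTwoLists(names, lengths, sameLength));             // True
        Console.WriteLine(CompareTwoLists(names, new List<int> { 3 }, sameLength)); // False
    }
}
```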

How to sort list on multiple properties in one line of code in Linq

I want to sort a list on multiple properties. I know I can use
List<Order> l = source.OrderBy(c=> c.Property1).ThenBy(c=> c.Property2).ToList();
However, that is for ascending order only. If I want to sort the list in descending order, I need different code:
List<Order> l = source.OrderByDescending(c=> c.Property1).ThenByDescending(c=> c.Property2).ToList();
And if I want to sort Property1 ascending and Property2 descending, I have to use
List<Order> l = source.OrderBy(c=> c.Property1).ThenByDescending(c=> c.Property2).ToList();
To sort on 2 properties I need 4 different versions of the code, and on 3 properties I need 8. That is not good. Is there a way to do the sorting with a single piece of code? Thanks.
The System.Linq.Dynamic package provides an OrderBy extension method that just takes a string:
using System.Linq.Dynamic;
//...
var l = source.OrderBy("Property1 ascending, Property2 descending").ToList();
You can use it to build elaborate expressions on the fly:
string orderByClause = string.Format("Property1 {0}, Property2 {1}", "ascending", "descending");
var l = source.OrderBy(orderByClause).ToList();
See IQueryable Extension Methods for more info in the project Wiki
You can write your own LINQ extension to determine which method to pick. For example:
public static class LinqExtensions
{
public static IOrderedQueryable<TSource> OrderBy<TSource, TKey>(this IQueryable<TSource> source, Expression<Func<TSource, TKey>> keySelector, bool ascending)
{
return ascending ? source.OrderBy(keySelector) : source.OrderByDescending(keySelector);
}
}
And then you can use it like this:
source.OrderBy(a => a.Id, ascending);
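The same trick extends to secondary keys. Below is a sketch for in-memory sequences (IEnumerable rather than IQueryable, which is an assumption about the source) that adds a matching ThenBy overload so that each property gets its own direction flag. The DirectionalOrdering class name and the sample rows are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DirectionalOrdering
{
    // Primary sort; the flag picks between OrderBy and OrderByDescending.
    public static IOrderedEnumerable<T> OrderBy<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> key, bool ascending)
    {
        return ascending ? Enumerable.OrderBy(source, key)
                         : Enumerable.OrderByDescending(source, key);
    }

    // Secondary sort; the flag picks between ThenBy and ThenByDescending.
    public static IOrderedEnumerable<T> ThenBy<T, TKey>(
        this IOrderedEnumerable<T> source, Func<T, TKey> key, bool ascending)
    {
        return ascending ? Enumerable.ThenBy(source, key)
                         : Enumerable.ThenByDescending(source, key);
    }
}

public static class SortDemo
{
    public static void Main()
    {
        var rows = new[]
        {
            new { A = 1, B = "x" },
            new { A = 1, B = "y" },
            new { A = 0, B = "z" }
        };

        // A ascending, then B descending; both flags could come from user input.
        var sorted = rows.OrderBy(r => r.A, true).ThenBy(r => r.B, false);

        foreach (var r in sorted)
            Console.WriteLine(r.A + " " + r.B);
        // 0 z
        // 1 y
        // 1 x
    }
}
```

With this shape, one chain of calls covers every ascending/descending combination, because the direction becomes data instead of a method name.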
As I mentioned in the comments, you can create something like this:
public static class LinqExtension
{
    public static IOrderedEnumerable<TSource> OrderByAsc<TSource>(
        this IEnumerable<TSource> source,
        Expression<Func<TSource, object[]>> expression)
    {
        return Order(source, expression, false);
    }
    public static IOrderedEnumerable<TSource> OrderByDesc<TSource>(
        this IEnumerable<TSource> source,
        Expression<Func<TSource, object[]>> expression)
    {
        return Order(source, expression, true);
    }
    static IOrderedEnumerable<TSource> Order<TSource>(
        IEnumerable<TSource> source,
        Expression<Func<TSource, object[]>> expression,
        bool descending)
    {
        // The lambda body must be an array initializer: item => new object[] { item.A, item.B }
        NewArrayExpression array = expression.Body as NewArrayExpression;
        if (array == null || array.Expressions.Count == 0)
            throw new ArgumentException("Expected an array initializer with at least one element.", "expression");
        IOrderedEnumerable<TSource> ordered = null;
        foreach (Expression element in array.Expressions)
        {
            // Turn each array element into its own compiled key selector.
            Func<TSource, object> key = Expression.Lambda<Func<TSource, object>>(
                Expression.Convert(element, typeof(object)), expression.Parameters).Compile();
            if (ordered == null)
                ordered = descending ? source.OrderByDescending(key) : source.OrderBy(key);
            else
                ordered = descending ? ordered.ThenByDescending(key) : ordered.ThenBy(key);
        }
        return ordered;
    }
}
And use it like this (the key values must implement IComparable):
using System.Linq.Expressions; // needed by the extension class
...
var ascending = l.OrderByAsc(item => new object[] { item.Prop1, item.Prop2, item.Prop3 }).ToList();
var descending = l.OrderByDesc(item => new object[] { item.Prop1, item.Prop2, item.Prop3 }).ToList();
Of course, you can try creating an extension that combines ascending and descending ordering into a single method. Just an idea.
DISCLAIMER: I wrote this in Notepad and haven't tested it due to lack of resources.

LINQ: is there a way to supply a predicate with more than one parameter to where clause

I am wondering if there is a way to do the following:
I basically want to supply a predicate with more than one parameter to a where clause, like this:
public bool Predicate (string a, object obj)
{
// blah blah
}
public void Test()
{
var obj = "Object";
var items = new string[]{"a", "b", "c"};
var result = items.Where(Predicate); // here I want to somehow supply obj to Predicate as the second argument
}
var result = items.Where(i => Predicate(i, obj));
The operation you want is called "partial evaluation"; it is logically related to "currying" a two-parameter function into two one-parameter functions.
static class Extensions
{
public static Func<A, R> PartiallyEvaluateRight<A, B, R>(this Func<A, B, R> f, B b)
{
return a => f(a, b);
}
}
...
Func<int, int, bool> isGreater = (x, y) => x > y;
Func<int, bool> isGreaterThanTwo = isGreater.PartiallyEvaluateRight(2);
And now you can use isGreaterThanTwo in a where clause.
If you wanted to supply the first argument then you could easily write PartiallyEvaluateLeft.
Make sense?
The currying operation (which partially applies to the left) is usually written:
static class Extensions
{
public static Func<A, Func<B, R>> Curry<A, B, R>(this Func<A, B, R> f)
{
return a => b => f(a, b);
}
}
And now you can make a factory:
Func<int, int, bool> greaterThan = (x, y) => x > y;
Func<int, Func<int, bool>> factory = greaterThan.Curry();
Func<int, bool> withTwo = factory(2); // makes y => 2 > y
Is that all clear?
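As a sketch of how the partially evaluated function plugs into a where clause, the following puts the pieces into one runnable file. The FuncExtensions and PartialDemo class names and the sample array are assumptions added for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FuncExtensions
{
    // Fixes the second argument of f, leaving a one-parameter function.
    public static Func<A, R> PartiallyEvaluateRight<A, B, R>(this Func<A, B, R> f, B b)
    {
        return a => f(a, b);
    }
}

public static class PartialDemo
{
    public static List<int> GreaterThan(IEnumerable<int> items, int threshold)
    {
        Func<int, int, bool> isGreater = (x, y) => x > y;
        Func<int, bool> isGreaterThanThreshold = isGreater.PartiallyEvaluateRight(threshold);

        // Where accepts the one-parameter delegate directly.
        return items.Where(isGreaterThanThreshold).ToList();
    }

    public static void Main()
    {
        var result = GreaterThan(new[] { 1, 2, 3, 4 }, 2);
        Console.WriteLine(string.Join(", ", result)); // 3, 4
    }
}
```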
Do you expect something like this
public bool Predicate (string a, object obj)
{
// blah blah
}
public void Test()
{
var obj = "Object";
var items = new string[]{"a", "b", "c"};
var result = items.Where(x => Predicate(x, obj)); // the lambda captures obj and passes it as the second argument
}
