I have a List<Thing> things, where a number of Things need to be retrieved frequently by looking up a combination of two variables T1 f1 and T2 f2, which are value types. The way I do that now is simply things.Where(t => t.Field1 == f1 && t.Field2 == f2). However, I do a very large number of those lookups, and need a more efficient method.
Fortunately, things does not need to have elements removed or added, so I thought of parsing the list on construction and adding the elements to a Dictionary<T1, Lookup<T2, Thing>>. However, this feels messy, especially with the added parsing. And it gets really hairy if I need to look up even more fields. Three fields would look like Dictionary<T1, Dictionary<T2, Lookup<T3, Thing>>>.
My next thought was to make a Lookup<Tuple<T1,T2,T3,...>,Thing>. But in this case, I am not sure whether the keys will actually work because Tuple is a reference type.
Even if I make a Lookup<ValueType<T1,T2,T3,...>,Thing> things, the lookup statement will be something like things[new ValueType<T1,T2,T3,...>(f1, f2, f3, ...)] which is pretty ugly (and I am still not sure whether I could trust those keys).
Is there a more elegant solution to this which keeps the performance benefits of a hashtable and where I could simply type something like IEnumerable<Thing> found = things[f1, f2, f3, ...];?
Lookup<Tuple<T1,T2,T3,...>,Thing> will work, since Tuple overrides Equals and GetHashCode.
To make the lookup syntax less ugly, you can use Tuple.Create which supports type inference. Your code becomes things[Tuple.Create(f1, f2, f3, ...)]. If that's still too ugly, it's trivial to add a helper method that takes the individual values as parameters.
I'd also consider creating my own immutable class (or value type) for the key, so you get clean field names instead of ItemX. You just need to override Equals and GetHashCode consistently.
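A minimal sketch of the helper idea, assuming a hypothetical Thing with an int Field1 and a string Field2, and a hypothetical wrapper class ThingIndex (neither name comes from the question). It swaps Tuple.Create for a C# 7 value tuple as the key, which also compares component by component, and an indexer provides the things[f1, f2] syntax the question asks for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type; Field1/Field2 stand in for the question's fields.
public class Thing
{
    public int Field1 { get; set; }
    public string Field2 { get; set; }
    public string Name { get; set; }
}

// Hypothetical wrapper: builds the lookup once and hides the key construction.
public class ThingIndex
{
    private readonly ILookup<(int, string), Thing> _lookup;

    public ThingIndex(IEnumerable<Thing> things)
    {
        // Value tuples implement Equals/GetHashCode component-wise.
        _lookup = things.ToLookup(t => (t.Field1, t.Field2));
    }

    // A key that is not present yields an empty sequence rather than throwing.
    public IEnumerable<Thing> this[int f1, string f2] => _lookup[(f1, f2)];
}

public static class Program
{
    public static void Main()
    {
        var things = new ThingIndex(new[]
        {
            new Thing { Field1 = 1, Field2 = "a", Name = "first" },
            new Thing { Field1 = 1, Field2 = "a", Name = "second" },
            new Thing { Field1 = 2, Field2 = "b", Name = "third" },
        });

        // Prints "first" and "second".
        foreach (var thing in things[1, "a"])
            Console.WriteLine(thing.Name);
    }
}
```

The lookup is built once in the constructor, which fits the stated constraint that the list never changes after construction.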
You can create multiple lookups, and then intersect them to do your searches. Here is a somewhat oversimplified example, but it should illustrate the idea:
class Test {
public string A { get; set; }
public string B { get; set; }
public string C { get; set; }
}
var list = new List<Test> {
new Test {A = "quick", B = "brown", C = "fox"}
, new Test {A = "jumps", B = "over", C = "the"}
, new Test {A = "lazy", B = "dog", C = "quick"}
, new Test {A = "brown", B = "fox", C = "jumps"}
, new Test {A = "over", B = "the", C = "lazy"}
, new Test {A = "dog", B = "quick", C = "brown"}
, new Test {A = "fox", B = "jumps", C = "over"}
, new Test {A = "the", B = "lazy", C = "dog"}
, new Test {A = "fox", B = "brown", C = "quick"}
, new Test {A = "the", B = "over", C = "jumps"}
, new Test {A = "quick", B = "dog", C = "lazy"}
, new Test {A = "jumps", B = "fox", C = "brown"}
, new Test {A = "lazy", B = "the", C = "over"}
, new Test {A = "brown", B = "quick", C = "dog"}
, new Test {A = "over", B = "jumps", C = "fox"}
, new Test {A = "dog", B = "lazy", C = "the"}
};
var byA = list.ToLookup(v => v.A);
var byB = list.ToLookup(v => v.B);
var byC = list.ToLookup(v => v.C);
var all = byA["quick"].Intersect(byB["dog"]);
foreach (var test in all) {
Console.WriteLine("{0} {1} {2}", test.A, test.B, test.C);
}
all = byA["fox"].Intersect(byC["over"]);
foreach (var test in all) {
Console.WriteLine("{0} {1} {2}", test.A, test.B, test.C);
}
This prints
quick dog lazy
fox jumps over
Have you considered using a hash table with some kind of combination of the fields as the key? I don't know enough about your data set to say whether this is viable, since the keys would need to be unique. But since you're not doing additions or removals, a hash table for in-memory lookups is about as fast as you can get.
If I understood you correctly, you can use a Hashtable with a Tuple key; example below:
// populate Hashtable
var hash = new Hashtable();
var tuple = Tuple.Create("string", 1, 1.0);
hash.Add(tuple,tuple);
// search for item you want
var anotherTuple = Tuple.Create("string", 1, 1.0);
// result will be tuple declared above
var result = hash[anotherTuple];
A more complex solution (if duplicate keys are needed):
public class Thing
{
public int Value1 { get; set; }
public double Value2 { get; set; }
public string Value3 { get; set; }
// preferable to create own Equals and GetHashCode methods
public Tuple<int, double> GetKey()
{
// create key on fields you want
return Tuple.Create(Value1, Value2);
}
}
Usage:
var t1 = new Thing() {Value1 = 1, Value2 = 1.0, Value3 = "something"};
var t2 = new Thing() {Value1 = 1, Value2 = 2.0, Value3 = "something"};
var hash = new [] { t1, t2 }.ToLookup(item => item.GetKey());
var criteria = new Thing() { Value1 = 1, Value2 = 2.0, Value3 = "bla-bla-bla" };
var r = hash[criteria.GetKey()]; // will give you a sequence containing t2
The LINQ Where or a Dictionary of Dictionaries is probably the prettiest you are going to get. But it may be more a question of how you are organising your data.
E.g., this is never going to be a pretty way of accessing people data:
people["FirstName"]["LastName"]
It is usually better to try to come up with a simpler key.
I have the code below
var allA = // holds a List<classA>
var allB = //holds a List<ClassB>
var res = from A in allA
join B in allB on A.Id equals B.Id
select new Tuple<string,string,string,string,string>
(B.val1,B.val2,A.val1,A.val2,A.val3);
var resList = res as List<Tuple<string, string, string, string, string>>;
Now the issue is that, the way I'm doing it, I'd have to remember which item in my tuples holds which value. I also don't know why resList = res as List<Tuple<...>> doesn't work; it doesn't hold any values.
How can I structure this where I have a List<Tuple<ClassA,ClassB>> and in each tuple, ClassA and ClassB are the joined pair in Linq select statement?
Let's consider the following two classes and lists
class A {
public int Id {get;set;}
public string Name {get;set;}
}
class B {
public int Id {get;set;}
public decimal Size {get;set;}
}
(...)
var la = new A[]{ new A { Id = 1, Name = "Snake"}, new A { Id = 2, Name = "Adam"}};
var lb = new B[]{ new B { Id = 1, Size = 0.8m}, new B { Id = 2, Size = 1}};
You can create an object with two properties:
var lab = from a in la
join b in lb on a.Id equals b.Id
select new {a, b}; // or select new { A = a, B = b};
I used an anonymous type, but you can create a type that has two properties A and B and use that.
If you wish for a tuple, use a modern tuple with named fields:
select (A: a, B: b);
Having said that, maybe an object with the properties that you need is the best choice.
var lab = from a in la
join b in lb on a.Id equals b.Id
select new
{
Id = a.Id,
Name = a.Name,
Size = b.Size
};
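On the "as List<...>" part of the question: as does not convert anything. It yields null whenever the object is not already of the requested type, and a LINQ query is a lazily evaluated sequence, not a List<T>. The sketch below reuses the A and B classes from above; the JoinPairs helper is a hypothetical name introduced only for illustration. Calling ToList() is what materializes the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class A { public int Id { get; set; } public string Name { get; set; } }
public class B { public int Id { get; set; } public decimal Size { get; set; } }

public static class Program
{
    // Hypothetical helper: joins on Id and materializes the pairs as named tuples.
    public static List<(A A, B B)> JoinPairs(IEnumerable<A> la, IEnumerable<B> lb)
    {
        var query = from a in la
                    join b in lb on a.Id equals b.Id
                    select (A: a, B: b);
        return query.ToList();
    }

    public static void Main()
    {
        var la = new[] { new A { Id = 1, Name = "Snake" }, new A { Id = 2, Name = "Adam" } };
        var lb = new[] { new B { Id = 1, Size = 0.8m }, new B { Id = 2, Size = 1 } };

        var query = from a in la
                    join b in lb on a.Id equals b.Id
                    select (A: a, B: b);

        // The query object is an iterator, not a List, so "as" yields null.
        var wrong = query as List<(A A, B B)>;
        Console.WriteLine(wrong == null); // prints True

        // ToList() runs the query and copies the results into a real list.
        List<(A A, B B)> pairs = JoinPairs(la, lb);
        foreach (var pair in pairs)
            Console.WriteLine($"{pair.A.Name}: {pair.B.Size}");
    }
}
```

Each element keeps both joined objects whole, so there is no need to remember which tuple position holds which value.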
I need to make a union between two LINQ queries, but the second query needs to have more fields than the first. How can I do it?
Example:
public void Dummy() // not static: the body uses this.Db
{
var query1 = this.Db.Table1.Select(s => new MyObject() { A = s.Field1, B = s.Field2 });
var query2 = this.Db.Table2.Select(s => new MyObject() { A = s.Field1, B = s.Field2, C = s.Field3 });
var result = query1.Union(query2);
}
When I call result.ToList(), the following error occurs:
The type 'MyObject' appears in two structurally incompatible
initializations within a single LINQ to Entities query. A type can be
initialized in two places in the same query, but only if the same
properties are set in both places and those properties are set in the
same order.
How can I resolve this problem?
Note: I can't put Field3 in query1 (I don't have access to query1, so I can't change it).
You don't have to put Field3 in the first query, but Union requires the same number of columns in the same order. Specify a dummy value for the third column/field C, like:
var query1 = this.Db.Table1.Select(s => new MyObject()
{ A = s.Field1, B = s.Field2 , C= ""});
Assign C whatever the default value of Field3 is: null for a reference type, 0 for numbers, etc.
If you can't modify query1, create a new query from query1, like:
var newQuery = query1.Select(s=> new MyObject()
{ A = s.A, B = s.B , C= ""});
and then use that in the Union:
var result = newQuery.Union(query2);
As-is, you can't. You can only union 2 sets that have the same structure. If you don't mind modifying query1, however:
var query1 = this.Db.Table1.Select(s => new MyObject()
{ A = s.Field1, B = s.Field2, C = null });
This would allow them to union properly, as they have the same structure.
You can do it like this:
Create a class derived from MyObject:
class MyObjectUnion : MyObject{
}
Then the method looks like this:
public void Dummy()
{
var query1 = this.Db.Table1.Select(s => new MyObject() { A = s.Field1, B = s.Field2 });
// Rebuild the first query from Table1 with the same shape as query2
var query1modified = this.Db.Table1.Select(s => new MyObjectUnion() { A = s.Field1, B = s.Field2, C = null });
var query2 = this.Db.Table2.Select(s => new MyObjectUnion() { A = s.Field1, B = s.Field2, C = s.Field3 });
var result = query1modified.Union(query2);
}
It works.
Because records in query1 will never have a property "C", and all records in query2 will have a property "C", it is unlikely that a record in query1 will be equivalent to a record in query2. The only reason for using Union over Concat is to remove duplicates and since you can't have any, you should likely be using Concat instead of Union.
public void Dummy() // not static: the body uses this.Db
{
var query1 = this.Db.Table1.Select(s => new MyObject() { A = s.Field1, B = s.Field2 });
var query2 = this.Db.Table2.Select(s => new MyObject() { A = s.Field1, B = s.Field2, C = s.Field3 });
var result = query1.ToList().Concat(query2);
}
There are exceptions: you might have a custom IEqualityComparer for MyObject that ignores the "C" property; the default value for the "C" property might occur in a record from Table2 and you want that duplicate removed; or duplicates might exist within query1 or query2 themselves and you want them removed. In those cases you can still use Concat together with Distinct: before the Concat to remove duplicates within a single query, after it to remove duplicates across both.
Edited to force query1 to be materialized before concatenation via .ToList().
Double-checked with LINQPad: the following program ran without issues against a data source that had both Categories and Cities tables with completely different schemas:
void Main()
{
var query1 = Categories.Select(s => new MyObject { A = s.id, B = s.name });
var query2 = Cities.Select(s => new MyObject { A = s.id, B = s.city_name, C = s.location });
var result = query1.ToList().Concat(query2);
result.Dump();
}
public class MyObject
{
public int A {get;set;}
public string B {get;set;}
public object C {get;set;}
}
Please consider the following code segment:
var list = new string[] { "ab", "ab", "cd", "cd", "cd" };
var groups = list.GroupBy(l => l);
var count = groups.Count();
The results:
count: 2,
groups: [{ Key: "ab", elements: ["ab", "ab"] }, { Key: "cd", elements: ["cd", "cd", "cd"] }]
When I do the same for class X:
public class X
{
public int A { get; set; }
public string B { get; set; }
}
And the same algorithm is used in order to create the grouped results:
var list2 = new X[]
{
new X { A = 1, B = "b1" },
new X { A = 1, B = "b1" },
new X { A = 2, B = "b2" },
new X { A = 2, B = "b2" },
new X { A = 2, B = "b2" },
};
var groups2 = list2.GroupBy(l => l);
var count2 = groups2.Count();
I would expect the same behavior. I would say count2 is 2, and groups2 contains the two different distinct data sets with 2 and 3 elements respectively.
However, when I run this, I get 5 as the count and a list of groups containing one item each. Why the different behavior? I would expect the same aggregation algorithm to behave the same way.
Thanks in advance for the explanation.
GroupBy uses the default equality comparer for the type unless you provide one. For reference types that do not override Equals (such as X), the default comparer only returns true if both operands are the same instance, meaning they have the same reference. string does override Equals, which is why the first example groups by value. If this is not the behaviour you want, you have two choices:
Override the Equals and GetHashCode methods in your class
Implement an IEqualityComparer for your type and pass it to GroupBy
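A sketch of the second option, using the X class from the question. The XComparer name is hypothetical, and the value-tuple key shown alongside it is an extra alternative that is not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class X
{
    public int A { get; set; }
    public string B { get; set; }
}

// Hypothetical comparer: two X instances are equal when A and B both match.
public class XComparer : IEqualityComparer<X>
{
    public bool Equals(X left, X right)
    {
        if (ReferenceEquals(left, right)) return true;
        if (left is null || right is null) return false;
        return left.A == right.A && left.B == right.B;
    }

    // Must agree with Equals: equal objects produce equal hash codes.
    public int GetHashCode(X x) => (x.A, x.B).GetHashCode();
}

public static class Program
{
    public static void Main()
    {
        var list2 = new[]
        {
            new X { A = 1, B = "b1" }, new X { A = 1, B = "b1" },
            new X { A = 2, B = "b2" }, new X { A = 2, B = "b2" }, new X { A = 2, B = "b2" },
        };

        // Pass the comparer to GroupBy: groups by value.
        Console.WriteLine(list2.GroupBy(x => x, new XComparer()).Count()); // prints 2

        // Alternative: group by a value-tuple key, which compares by component.
        Console.WriteLine(list2.GroupBy(x => (x.A, x.B)).Count()); // prints 2

        // Default reference equality, as in the question.
        Console.WriteLine(list2.GroupBy(x => x).Count()); // prints 5
    }
}
```

Overriding Equals and GetHashCode on X itself changes equality everywhere X is used; the comparer keeps the change local to this one GroupBy call.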
Suppose I have two Lists<myObject> where myObject consists of the two properties
Id (of type Int) and
Value (of type Double)
I need to get a list out of these two lists that is made of (anonymous) objects like this:
Id, [Double value from List 1], [Double value from List 2]
So if for a given Id both lists contain a value, it should look like this example:
12, 21.75, 19.87
If one list does not contain an object with an Id that is present in the other list, the value should be null:
15, null, 22.52
How can I achieve that?
Update: I know how I could get such a list, of course, but I'm looking for the most performant way to do it, preferably by using some witty Linq magic.
Not sure how optimized this is, but should suit your needs - Assuming I understood what you wanted:
var enumerable1 = new[]
{
new {Id = "A", Value = 1.0},
new {Id = "B", Value = 2.0},
new {Id = "C", Value = 3.0},
new {Id = "D", Value = 4.0},
new {Id = "E", Value = 5.0},
};
var enumerable2 = new[]
{
new {Id = "A", Value = 6.0},
new {Id = "NOT PRESENT", Value = 542.23},
new {Id = "C", Value = 7.0},
new {Id = "D", Value = 8.0},
new {Id = "E", Value = 9.0},
};
var result = enumerable1.Join(enumerable2, arg => arg.Id, arg => arg.Id,
(first, second) => new {Id = first.Id, Value1 = first.Value, Value2 = second.Value});
foreach (var item in result)
Console.WriteLine("{0}: {1} - {2}", item.Id, item.Value1, item.Value2);
Console.ReadLine();
The resulting output would be something akin to:
A: 1 - 6
C: 3 - 7
D: 4 - 8
E: 5 - 9
Don't really see why you would want null values returned unless you absolutely need them (besides, double is non-nullable, so it would have to be the resulting combined entry as a whole that is null instead).
The requirement is slightly unclear. Do you want a Cartesian product or a join on Id? If the latter, then this should work:
var result = from l1 in list1
join l2 in list2
on l1.Id equals l2.Id
select new {l1.Id, Value1 = l1.Value, Value2 = l2.Value};
If you actually want a full outer join, see this.
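For the full outer join case, here is a minimal sketch under stated assumptions: MyObject is reconstructed from the question's description (int Id, double Value), Ids are assumed unique within each list, and FullOuterJoin is a hypothetical helper name. It uses double? so that a missing side can be null, and dictionary lookups so the work is roughly linear in the combined list size.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyObject
{
    public int Id { get; set; }
    public double Value { get; set; }
}

public static class Program
{
    // Full outer join on Id; a side with no matching Id becomes null.
    // Assumes Ids are unique within each list (ToDictionary throws otherwise).
    public static List<(int Id, double? Value1, double? Value2)> FullOuterJoin(
        IEnumerable<MyObject> list1, IEnumerable<MyObject> list2)
    {
        var byId1 = list1.ToDictionary(o => o.Id, o => o.Value);
        var byId2 = list2.ToDictionary(o => o.Id, o => o.Value);

        return byId1.Keys.Union(byId2.Keys)
            .OrderBy(id => id)
            .Select(id => (
                Id: id,
                Value1: byId1.TryGetValue(id, out var v1) ? v1 : (double?)null,
                Value2: byId2.TryGetValue(id, out var v2) ? v2 : (double?)null))
            .ToList();
    }

    public static void Main()
    {
        var list1 = new[] { new MyObject { Id = 12, Value = 21.75 } };
        var list2 = new[]
        {
            new MyObject { Id = 12, Value = 19.87 },
            new MyObject { Id = 15, Value = 22.52 },
        };

        // One row per Id; "null" marks the list that lacks that Id.
        foreach (var row in FullOuterJoin(list1, list2))
            Console.WriteLine("{0}, {1}, {2}", row.Id,
                row.Value1?.ToString() ?? "null",
                row.Value2?.ToString() ?? "null");
    }
}
```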
Let's say tempAllocationR is list 1 and tempAllocationV is list 2:
var tempAllocation = new List<TempAllocation>();
if (tempAllocationR.Count > 0 && tempAllocationV.Count > 0)
{
foreach (TempAllocation tv in tempAllocationV)
{
var rec = tempAllocationR.FirstOrDefault(tr => tr.TERR_ID == tv.TERR_ID && tr.TERR == tv.TERR && tr.Team == tv.Team);
if (rec != null)
{
rec.Vyzulta = tv.Vyzulta;
}
else
{
tempAllocationR.Add(tv);
}
}
tempAllocation = tempAllocationR;
}
I have multiple sets of arrays that contain additional arrays that have values attached that I use for figuring out math. In order to find the best combination of these things, I need to mix and match from these arrays. I've seen "solutions" similar to this around, but they're usually 1 array deep with no real combinations/possibilities. So to give an example.
I have sets A, B, and C. Set A contains Aa, Ab, Ac, and Ad. Aa contains a set of values. Extrapolate that out for the others. Aa can only be compared with Ba and Ca. How do I go about writing a program to find all combinations(i.e. Aa, Ab, Cc, Bd compared with Ba, Cb, Ac, Bd and etc) so I can compare the math on each combination to find the best one? Note: this is just an example, I don't need it for specifically 3 sets of 4 sets of 4, it needs to be able to expand.
Now I know I didn't use very meaningful names for my variables, but I would appreciate if any code given does have meaningful names in it(I'd really rather not follow around variables of x and c around in code).
The accepted answer appears to be correct but is a very strange way to do a Cartesian product in C#. If you have a given number of sequences you can take their Cartesian product idiomatically like this:
var aList = new[] { "a1", "a2", "a3" };
var bList = new[] { "b1", "b2", "b3" };
var cList = new[] { "c1", "c2", "c3" };
var product = from a in aList
from b in bList
from c in cList
select new[] { a, b, c };
foreach (var p in product)
Console.WriteLine(string.Join(",", p));
If you have arbitrarily many sequences whose Cartesian product you need, you can do it like this:
static class Extensions
{
public static IEnumerable<IEnumerable<T>> CartesianProduct<T>(
this IEnumerable<IEnumerable<T>> sequences)
{
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
from accseq in accumulator
from item in sequence
select accseq.Concat(new[] {item}));
}
}
And then:
var aList = new[] { "a1", "a2", "a3" };
var bList = new[] { "b1", "b2", "b3" };
var cList = new[] { "c1", "c2", "c3" };
var lists = new[] { aList, bList, cList };
var product = lists.CartesianProduct();
foreach (var p in product)
Console.WriteLine(string.Join(",", p));
See http://ericlippert.com/2010/06/28/computing-a-cartesian-product-with-linq/ and my answer to "Generating all Possible Combinations" for more discussion of this problem.
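To connect the product back to the question's goal of finding the best combination, here is a sketch that scores every combination and keeps the highest. The Option type, the BestCombination helper and the "largest total Value wins" rule are all assumptions made for illustration; the question does not say how combinations are compared.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item: a named option carrying one numeric value.
public class Option
{
    public string Name { get; set; }
    public double Value { get; set; }
}

public static class Extensions
{
    // Same Aggregate-based Cartesian product as in the answer above.
    public static IEnumerable<IEnumerable<T>> CartesianProduct<T>(
        this IEnumerable<IEnumerable<T>> sequences)
    {
        IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
        return sequences.Aggregate(
            emptyProduct,
            (accumulator, sequence) =>
                from accseq in accumulator
                from item in sequence
                select accseq.Concat(new[] { item }));
    }

    // Assumed scoring rule: the best combination has the largest total Value.
    // Throws if any set is empty, because then there are no combinations.
    public static List<Option> BestCombination(this IEnumerable<IEnumerable<Option>> sets)
    {
        return sets.CartesianProduct()
            .Select(combination => combination.ToList())
            .OrderByDescending(combination => combination.Sum(option => option.Value))
            .First();
    }
}

public static class Program
{
    public static void Main()
    {
        var setA = new[] { new Option { Name = "Aa", Value = 1 }, new Option { Name = "Ab", Value = 4 } };
        var setB = new[] { new Option { Name = "Ba", Value = 3 }, new Option { Name = "Bb", Value = 2 } };
        var setC = new[] { new Option { Name = "Ca", Value = 5 }, new Option { Name = "Cb", Value = 0 } };

        var best = new[] { setA, setB, setC }.BestCombination();
        Console.WriteLine(string.Join(", ", best.Select(option => option.Name))); // prints Ab, Ba, Ca
    }
}
```

Because the product takes any number of sets of any size, the same code expands beyond three sets of two without changes; only the scoring lambda needs replacing.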
Assuming you are using a version of C# which supports LINQ:
static void Main(string[] args)
{
// declare some lists
var aList = new string[] { "a1", "a2", "a3" };
var bList = new string[] { "b1", "b2", "b3" };
var cList = new string[] { "c1", "c2", "c3" };
// do the equivalent of a SQL CROSS JOIN
var permutations = aList
.Join(bList, a => "", b => "", (a, b) => new string[] { a, b })
.Join(cList, ab => "", c => "", (ab, c) => new string[] { ab[0], ab[1], c });
// print the results
Console.WriteLine("Permutations:");
foreach (var p in permutations)
Console.WriteLine(string.Join(", ", p));
}
The Join calls use lambda expressions that map every element to the same empty-string key, so Join treats all elements as matching, emulating a SQL CROSS JOIN.