Join two tables using linq, and fill a Dictionary of them - c#

I've been searching for how to join two tables (Data and DataValues, one-to-many) and fill a dictionary of type Dictionary<Data, List<DataValue>>.
There may be hundreds of thousands of Data records (e.g. 500,000 or more), and each Data may have 10 to 20 DataValues, which makes this a heavy query, so performance is really important here.
Here is the code I've written:
// Passed in via the arguments; for example, sensorIDs would contain:
int[] sensorIDs = { 0, 1, 2, 3, 4, 5, 6, 17, 18 };
Dictionary<Data, List<DataValue>> dict = new Dictionary<Data, List<DataValue>>();
foreach (Data Data in dt.Datas)
{
    var dValues = from d in dt.Datas
                  join dV in dt.DataValues on d.DataID equals dV.DataID
                  where sensorIDs.Contains(dV.SensorID)
                  select dV;
    dict.Add(Data, dValues.ToList<DataValue>());
}
But this approach has a significant performance issue and takes a long time to execute.
I'm not sure whether I need to use SQL views. Any suggestions?

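You're querying way too many times. You can do this in one query.
var dict = (from d in dt.Datas
            join dV in dt.DataValues on d.DataID equals dV.DataID
            where sensorIDs.Contains(dV.SensorID)
            select new { d, dV })
           .AsEnumerable()                  // switch to LINQ to Objects for the grouping
           .GroupBy(o => o.d, o => o.dV)    // one group of DataValues per Data
           .ToDictionary(g => g.Key, g => g.ToList());
In your foreach loop, you fetch all the Data rows and then, for each of them, run the same query again. That query does not depend on the current Data, so every iteration repeats the same work.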
Edit: That wasn't very clear, but I think you want to join only the DataValues whose SensorID is in the sensorIDs array. In that case:
var dict = (from d in dt.Datas
            let dV = (from dataValue in dt.DataValues
                      where sensorIDs.Contains(dataValue.SensorID) &&
                            dataValue.DataID == d.DataID
                      select dataValue)
            select new { d, dV }).ToDictionary(o => o.d, o => o.dV.ToList());

You do not need a foreach loop in this case. You can group the values and join the groups to Datas, creating the dictionary straight from LINQ, which should give you better performance.
dict = (from DataValue d in dt.DataValues
        where sensorIDs.Contains(d.SensorID)
        group d by d.DataID
        into datavalues
        join data in dt.Datas
            on datavalues.Key equals data.DataID
        select new {
            Key = data,
            Value = datavalues
        }).ToDictionary(a => a.Key, a => a.Value.ToList());
Or you can use the LINQ extension methods:
dict = dt.DataValues.Where(d => sensorIDs.Contains(d.SensorID))
         .GroupBy(a => a.DataID)
         .Join(dt.Datas, a => a.Key, a => a.DataID,
               (a, b) => new { Key = b, Value = a.ToList() })
         .ToDictionary(a => a.Key, a => a.Value);
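If the filtered rows fit in memory, the same "group once, then attach" idea can be sketched with ToLookup. This is only an illustration: the Data and DataValue records below are hypothetical, simplified stand-ins for the mapped entities, and DictionaryBuilder.Build is a made-up helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the mapped entities in the question.
public record Data(int DataID);
public record DataValue(int DataID, int SensorID, double Value);

public static class DictionaryBuilder
{
    // Groups the filtered values once, then attaches each group to its Data.
    public static Dictionary<Data, List<DataValue>> Build(
        IEnumerable<Data> datas, IEnumerable<DataValue> values, int[] sensorIDs)
    {
        var wanted = new HashSet<int>(sensorIDs);

        // Single pass over the values; a lookup returns an empty
        // sequence for keys that have no matching values.
        var byDataId = values.Where(v => wanted.Contains(v.SensorID))
                             .ToLookup(v => v.DataID);

        return datas.ToDictionary(d => d, d => byDataId[d.DataID].ToList());
    }

    public static void Main()
    {
        var datas = new[] { new Data(1), new Data(2) };
        var values = new[]
        {
            new DataValue(1, 0, 1.5),
            new DataValue(1, 99, 2.5),
            new DataValue(2, 17, 3.5)
        };

        var dict = Build(datas, values, new[] { 0, 17 });
        foreach (var pair in dict)
            Console.WriteLine($"{pair.Key.DataID}: {pair.Value.Count} value(s)");
    }
}
```

Unlike the join above, this variant keeps every Data, giving an empty list to those with no matching values.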

You don't need a foreach loop. Try something like this in general:
var columns = dt.Columns.Cast<DataColumn>();
dt.AsEnumerable().Select(dataRow => columns.Select(column =>
        new { Column = column.ColumnName, Value = dataRow[column] })
    .ToDictionary(data => data.Column, data => data.Value));
Also, consider reading this: http://blogs.teamb.com/craigstuntz/2010/01/13/38525/

Related

Data in Linq query not in join is not in output to json only those that are related in 2 classes are showing up

The basis of this question is from this question:
Combine 2 classes with adding data and 1 table has a colllection list of the other table and wanting to use linq to display
In which I "thought" the problem was solved.
However, after I added a new object to the list, the join query no longer outputs it:
reportData.Add(new ReportData() { ReportGroupId = 3, ReportGroupName = "Straggler", SortOrder = 3, Type = 1 });
var reports = reportDefinition.GroupBy(r => r.ReportGroupId);
var query = reportData.Join(reports, d => d.ReportGroupId, gr => gr.Key,
    (r, gr) => new
    {
        r.ReportGroupName,
        items = gr.ToList(),
        r.ReportGroupId
    });
Here is the dotNetFiddle: https://dotnetfiddle.net/IIBFKG
Why doesn't the item that I added to reportData show up? Is it the type of join in LINQ?
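I think the linked question was not answered correctly.
Join is an inner join, so a reportData item with no matching entries in reportDefinition is dropped from the result. It looks like all you need is a simple group join, which keeps every reportData item and gives it an empty items list when nothing matches:
var query =
    from d in reportData
    join r in reportDefinition on d.ReportGroupId equals r.ReportGroupId into items
    select new
    {
        d.ReportGroupName,
        items = items.ToList(),
        d.ReportGroupId
    };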
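The difference between the two operators can be seen in a small in-memory sketch. Plain integers stand in for the ReportGroupId values here, and the JoinDemo class and its method names are hypothetical.

```csharp
using System;
using System.Linq;

public static class JoinDemo
{
    // Inner join against grouped definitions: left items without a match are dropped.
    public static int InnerJoinCount(int[] groupIds, int[] definitionGroupIds) =>
        groupIds.Join(definitionGroupIds.GroupBy(id => id),
                      g => g, grp => grp.Key,
                      (g, grp) => new { g, items = grp.ToList() })
                .Count();

    // Group join: every left item yields exactly one row, possibly with no items.
    public static int GroupJoinCount(int[] groupIds, int[] definitionGroupIds) =>
        groupIds.GroupJoin(definitionGroupIds,
                           g => g, id => id,
                           (g, items) => new { g, items = items.ToList() })
                .Count();

    public static void Main()
    {
        int[] groupIds = { 1, 2, 3 };        // group 3 plays the "Straggler"
        int[] definitions = { 1, 1, 2 };     // no definition refers to group 3

        Console.WriteLine(InnerJoinCount(groupIds, definitions));
        Console.WriteLine(GroupJoinCount(groupIds, definitions));
    }
}
```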

Unioning two LINQ queries

I just need to make a full outer join with LINQ, but when I union the two queries I get this error:
Instance argument: cannot convert from 'System.Linq.IQueryable' to 'System.Linq.ParallelQuery
And here is my full code:
using (GoodDataBaseEntities con = new GoodDataBaseEntities())
{
    var LeftOuterJoin = from MyCustomer in con.Customer
                        join MyAddress in con.Address
                            on MyCustomer.CustomerId equals MyAddress.CustomerID into gr
                        from g in gr.DefaultIfEmpty()
                        select new { MyCustomer.CustomerId, MyCustomer.Name, g.Address1 };
    var RightOuterJoin = from MyAddress in con.Address
                         join MyCustomer in con.Customer
                             on MyAddress.CustomerID equals MyCustomer.CustomerId into gr
                         from g in gr.DefaultIfEmpty()
                         select new { MyAddress.Address1, g.Name };
    var FullOuterJoin = LeftOuterJoin.Union(RightOuterJoin);
    IEnumerable myList = FullOuterJoin.ToList();
    GridView1.DataSource = myList;
    GridView1.DataBind();
}
The element types of your two sequences are not the same, so you can't do a Union. The first query projects three properties and the second projects two, so they are different anonymous types:
    new { MyCustomer.CustomerId, MyCustomer.Name, g.Address1 };
    new { MyAddress.Address1, g.Name };
Make sure both projections have the same property names and types, in the same order.
Why not select it all as one thing? Depending on your setup (i.e., if you have foreign keys properly set up on your tables), you shouldn't need to do explicit joins:
var fullJoin = from MyCustomer in con.Customer
               select new {
                   MyCustomer.CustomerId,
                   MyCustomer.Name,
                   MyCustomer.Address.Address1,
                   MyCustomer.Address.Name
               };
Method syntax:
var fullJoin = con.Customer.Select(x => new
{
    x.CustomerId,
    x.Name,
    x.Address.Address1,
    x.Address.Name
});
Union appends the items of one collection to the end of another (dropping duplicates), so if each collection had 5 distinct items, the new collection would have 10 items.
What you seem to want is to end up with 5 rows with more information in each. That's not a job for Union. You might be able to do it with Zip(), but you'll really be best off with the single query as shown by DLeh.
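If a genuine full outer join is still needed, one common in-memory approach is a left outer join plus the right-side rows that found no match, with both halves projected to the same type so they can be concatenated. The sketch below assumes hypothetical Customer, Address, and JoinedRow records and runs on LINQ to Objects, not against the entity context from the question.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-ins for the entities in the question.
public record Customer(int CustomerId, string Name);
public record Address(int CustomerID, string Address1);
public record JoinedRow(int CustomerId, string? Name, string? Address1);

public static class OuterJoinDemo
{
    public static List<JoinedRow> FullOuterJoin(
        IEnumerable<Customer> customers, IEnumerable<Address> addresses)
    {
        var customerList = customers.ToList();
        var addressList = addresses.ToList();

        // Left outer join: every customer, with a null address when none matches.
        var left = from c in customerList
                   join a in addressList on c.CustomerId equals a.CustomerID into gr
                   from match in gr.DefaultIfEmpty()
                   select new JoinedRow(c.CustomerId, c.Name, match?.Address1);

        // Right-only rows: addresses whose customer does not exist.
        var knownIds = new HashSet<int>(customerList.Select(c => c.CustomerId));
        var rightOnly = addressList
            .Where(a => !knownIds.Contains(a.CustomerID))
            .Select(a => new JoinedRow(a.CustomerID, null, a.Address1));

        // Both halves share one element type, so they can be combined.
        return left.Concat(rightOnly).ToList();
    }

    public static void Main()
    {
        var customers = new[] { new Customer(1, "Ann"), new Customer(2, "Bob") };
        var addresses = new[] { new Address(1, "A St"), new Address(3, "C St") };

        foreach (var row in FullOuterJoin(customers, addresses))
            Console.WriteLine(row);
    }
}
```

Concat is used rather than Union because the two halves cannot overlap: the second half contains only addresses that the left join never matched.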

Controlling LINQ query order based on other list of integers

I'm doing a LINQ query where I select the video info from the Videos table. The query selects only those videos whose IDs are present in the following list:
List<int> results; // Has some values
var query = from l in dataContext.Videos
            where results.Contains(l.ID)
            select l;
Now how do I order the items (video infos) in the query such that their IDs follow the same order as the list results?
I am able to do this as:
List<int> results; // Has some values
var query = from k in results
            from l in dataContext.Videos
            where k == l.ID
            select l;
But this is slow; I need something faster.
Use a join; it's much faster:
var orderedByIDList = from k in results
                      join l in dataContext.Videos
                          on k equals l.ID
                      select l;
Addendum/edit prompted by comments from @MarcinJuraszek and @Phil; thanks, guys.
Basically, grab your data first, then sort it locally. Here's what I got:
var myList = (from l in dataContext.Videos
              where results.Contains(l.ID)
              select l).ToList(); // grab data and resolve to list or array
var orderedByIDList = from k in results
                      join l in myList
                          on k equals l.ID
                      select l; // result type IEnumerable<Video>
Here's my alternative attempt (probably not as fast as a join), which retrieves the minimum set of rows and then orders the data locally.
var results = new List<int> { 9, 2, 3, 6, 8 };
// record the original order
var results2 = results.Select((r, index) => new { r, index });
// get results and convert to list
var videos = dataContext.Videos.Where(v => results.Contains(v.ID)).ToList();
// order according to results order
var ordered = videos.Select(v =>
        new { v, results2.Single(r => r.r == v.ID).index })
    .OrderBy(v => v.index).Select(v => v.v);
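Once the rows are in memory, the position lookup can also be done with a dictionary, which avoids rescanning the ID list for every video. This is a minimal sketch under assumptions: the Video record and the OrderLike helper are hypothetical, and the input stands for rows already fetched with a Contains filter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Video entity.
public record Video(int ID, string Title);

public static class VideoOrdering
{
    // Orders already-fetched videos by the position of their ID in ids.
    public static List<Video> OrderLike(IEnumerable<Video> videos, IList<int> ids)
    {
        // Map each ID to its first position in the list.
        var position = new Dictionary<int, int>();
        for (int i = 0; i < ids.Count; i++)
        {
            if (!position.ContainsKey(ids[i]))
                position[ids[i]] = i;
        }

        return videos.Where(v => position.ContainsKey(v.ID))
                     .OrderBy(v => position[v.ID])
                     .ToList();
    }

    public static void Main()
    {
        var ids = new List<int> { 9, 2, 3, 6, 8 };
        var fetched = new[]
        {
            new Video(2, "b"), new Video(3, "c"), new Video(6, "d"), new Video(9, "a")
        };

        foreach (var video in OrderLike(fetched, ids))
            Console.WriteLine(video.ID);
    }
}
```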

Count occurrences of values across multiple columns

I am having a terrible time finding a solution to what I am sure is a simple problem.
I started an app with data in lists of objects. Its pertinent objects used to look like this (very simplified):
class A {
    int[] Nums;
}
and
List<A> myListOfA;
I wanted to count occurrences of values in the member array across the whole list.
I found this solution somehow:
var results =
    from a in myListOfA
    from n in a.Nums
    group n by n into g
    orderby g.Key
    select new { number = g.Key, Occurences = g.Count() };
int NumberOfValues = results.Count();
That worked well and I was able to generate the histogram I wanted from the query.
Now I have converted to using an SQL database. The table I am using now looks like this:
MyTable {
int Value1;
int Value2;
int Value3;
int Value4;
int Value5;
int Value6;
}
I have a DataContext that maps to the DB.
I cannot figure out how to translate the previous LINQ statement to work with this. I have tried this:
MyDataContext myContext;
var results =
    from d in myContext.MyTable
    from n in new{ d.Value1, d.Value2, d.Value3, d.Value4, d.Value5, d.Value6 }
    group n by n into g
    orderby g.Key
    select new { number = g.Key, Occurences = g.Count() };
I have tried some variations on the constructed array, like adding .AsQueryable() at the end, which is something I saw somewhere else. I have tried using group to create the array of values, but nothing works. I am a relative newbie when it comes to database languages, and I cannot find any clue anywhere on the web. Maybe I am not asking the right question. Any help is appreciated.
I received help on a Microsoft site. The problem is mixing LINQ to SQL with LINQ to Objects: the array of column values can only be built in memory, so the table has to be switched to LINQ to Objects with AsEnumerable() first.
This is how the query should be stated:
var results =
    from d in myContext.MyTable.AsEnumerable()
    from n in new[] { d.Value1, d.Value2, d.Value3, d.Value4, d.Value5, d.Value6 }
    group n by n into g
    orderby g.Key
    select new { number = g.Key, Occureneces = g.Count() };
Works like a charm.
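The flatten-then-group step can be tried in isolation on in-memory data. In this sketch the Sample record (with only three value columns) and the ValueCounter helper are hypothetical; the rows stand in for what AsEnumerable() would hand to LINQ to Objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type with three value columns instead of six.
public record Sample(int Value1, int Value2, int Value3);

public static class ValueCounter
{
    // Flattens the columns of every row into one sequence, then counts each value.
    public static Dictionary<int, int> CountValues(IEnumerable<Sample> rows) =>
        (from r in rows
         from n in new[] { r.Value1, r.Value2, r.Value3 }
         group n by n into g
         orderby g.Key
         select new { Number = g.Key, Count = g.Count() })
        .ToDictionary(x => x.Number, x => x.Count);

    public static void Main()
    {
        var rows = new[] { new Sample(1, 2, 2), new Sample(2, 3, 1) };

        foreach (var pair in CountValues(rows))
            Console.WriteLine($"{pair.Key}: {pair.Value}");
    }
}
```

Note that AsEnumerable() pulls every row to the client before counting, which may matter for large tables.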
If you wish to stay in LINQ to SQL, you could try this "hack" that I recently discovered. It isn't the prettiest or cleanest code, but at least you won't have to fall back to LINQ to Objects.
var query =
    from d in myContext.MyTable
    let v1 = myContext.MyTable.Where(dd => dd.ID == d.ID).Select(dd => dd.Value1)
    let v2 = myContext.MyTable.Where(dd => dd.ID == d.ID).Select(dd => dd.Value2)
    // ...
    let v6 = myContext.MyTable.Where(dd => dd.ID == d.ID).Select(dd => dd.Value6)
    from n in v1.Concat(v2).Concat(v3).Concat(v4).Concat(v5).Concat(v6)
    group 1 by n into g
    orderby g.Key
    select new
    {
        number = g.Key,
        Occureneces = g.Count(),
    };
How about creating your int array on the fly?
var results =
    from d in myContext.MyTable
    from n in new int[] { d.Value1, d.Value2, d.Value3, d.Value4, d.Value5, d.Value6 }
    group n by n into g
    orderby g.Key
    select new { number = g.Key, Occurences = g.Count() };
In a relational database such as SQL Server, collections are represented as tables. So you should actually have two tables: Samples and Values. The Samples table would represent a single "A" object, while the Values table would represent each element in A.Nums, with a foreign key pointing to one of the records in the Samples table. LINQ to SQL's O/R mapper will then create a "Values" property for each Sample object, which contains a queryable collection of the attached Values. You would then use the following query:
var results =
    from sample in myContext.Samples
    from value in sample.Values
    group value by value into values
    orderby values.Key
    select new { Value = values.Key, Frequency = values.Count() };

C# LINQ Query - Group By

I'm having a hard time understanding how I can form a LINQ query to do the following:
I have a table CallLogs and I want to get back a single result: the remote party with the longest total call duration.
The row looks like this:
[ID] [RemoteParty] [Duration]
There can be multiple rows for the same RemoteParty, each of which represents a call of a particular duration. I want to know which RemoteParty has the longest total duration.
Using LINQ, I got this far:
var callStats = (from c in database.CallLogs
                 group c by c.RemoteParty into d
                 select new
                 {
                     RemoteParty = d.Key,
                     TotalDuration = d.Sum(x => x.Duration)
                 });
So now I have a grouped result with the total duration for each RemoteParty, but I need the single result with the maximum total.
[DistinctRemoteParty1] [Duration]
[DistinctRemoteParty2] [Duration]
[DistinctRemotePartyN] [Duration]
How can I modify the query to achieve this?
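Order the result and return the first one.
var callStats = (from c in database.CallLogs
                 group c by c.RemoteParty into d
                 select new
                 {
                     RemoteParty = d.Key,
                     TotalDuration = d.Sum(x => x.Duration)
                 });
// A single element, so it needs its own variable rather than callStats.
var longest = callStats.OrderByDescending(a => a.TotalDuration)
                       .FirstOrDefault();
Have a look at the Max extension method from LINQ. Note that it returns only the largest TotalDuration value, not the RemoteParty it belongs to:
var maxDuration = callStats.Max(g => g.TotalDuration);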
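The group, sum, and pick-the-top steps can be checked end to end on in-memory data. This sketch assumes a hypothetical CallLog record mirroring the three columns in the question; CallStats.LongestTotalParty is a made-up helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a CallLogs row.
public record CallLog(int ID, string RemoteParty, int Duration);

public static class CallStats
{
    // Returns the remote party with the largest summed duration,
    // or null when there are no calls at all.
    public static string LongestTotalParty(IEnumerable<CallLog> logs) =>
        logs.GroupBy(c => c.RemoteParty)
            .Select(g => new { RemoteParty = g.Key, TotalDuration = g.Sum(x => x.Duration) })
            .OrderByDescending(s => s.TotalDuration)
            .Select(s => s.RemoteParty)
            .FirstOrDefault();

    public static void Main()
    {
        var logs = new[]
        {
            new CallLog(1, "alice", 10),
            new CallLog(2, "bob", 25),
            new CallLog(3, "alice", 20)
        };

        Console.WriteLine(LongestTotalParty(logs));
    }
}
```

Here "bob" has the longest single call, but "alice" has the longest total, which is what the question asks for.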
