What's the fastest way to write a join in LINQ - C#

I have two classes, User (about 10k instances) and Address (about 1 million instances), in a one-to-many relationship.
I am trying to map the addresses to their users.
Using List (takes a few minutes):
List<User> us = usrs.Select(u => new User { id = u.id ,email=u.email,name=u.name,addresses=adrs.Where(a=>a.userid==u.id).ToList()}).ToList();
The above works, but it's very slow.
I changed it to use a dictionary, and it's fast.
Using Dictionary (takes a few seconds):
var dusrs = usrs.ToDictionary(usr => usr.id);
var daddrs = adrs.ToDictionary(adr => Tuple.Create(adr.id,adr.userid));
foreach (var addr in daddrs)
{
var usr = dusrs[addr.Value.userid];
if (usr.addresses == null)
{
usr.addresses = new List<Address>();
}
usr.addresses.Add(addr.Value);
}
Is there any way I can write a better query using lists rather than dictionaries?
I am just trying to see whether I can write better LINQ using lists.
thanks...
vamsee

Assuming you are keeping users and addresses in Lists for some reason, you can use a join in LINQ which will combine the two lists and use a hashed data structure internally to put them together:
var us2 = (from u in usrs
           join a in adrs on u.id equals a.userid into aj
           select new User { id = u.id, email = u.email, name = u.name, addresses = aj.ToList() }).ToList();
Alternatively, you can convert the addresses into a Lookup and use that, but it would probably be best to just keep the addresses in a Lookup or create them in a Lookup initially if possible:
var addressLookup = adrs.ToLookup(a => a.userid);
List<User> us = usrs.Select(u => new User { id = u.id, email=u.email, name=u.name, addresses=addressLookup[u.id].ToList() }).ToList();
In my tests, which approach is faster seems to depend on how many users have matching addresses.
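As a minimal, self-contained sketch of the Lookup approach: the User and Address types below are hypothetical stand-ins with the same field names as in the question, and the helper name AttachAddresses is made up for illustration. It shows that indexing a Lookup with a key that has no entries gives an empty sequence, so users without addresses simply get an empty list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal types, mirroring the field names used in the question.
public class Address { public int id; public int userid; }
public class User { public int id; public string name; public List<Address> addresses; }

public static class LookupJoinSketch
{
    // Hypothetical helper: returns new User objects with their addresses attached.
    public static List<User> AttachAddresses(List<User> usrs, List<Address> adrs)
    {
        // One pass over adrs builds a hash-based index keyed by userid.
        ILookup<int, Address> byUser = adrs.ToLookup(a => a.userid);

        // Indexing a Lookup with a missing key yields an empty sequence, not an exception.
        return usrs
            .Select(u => new User { id = u.id, name = u.name, addresses = byUser[u.id].ToList() })
            .ToList();
    }
}
```

Each user costs one hash lookup here instead of a scan of the whole address list, which is where the speedup over the Where-per-user version comes from.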

Related

Join not working as expected (Entity Framework)

I can't for the life of me figure out how to join these two tables on UserName using Entity Framework.
I tried both the query syntax and the method syntax, and neither worked.
The tables definitely have the same user in them.
var employees = _context.Employees.Include(e => e.Loc);
//Only show employees with a user role of manager
var managerUsers = await _userManager.GetUsersInRoleAsync("Manager");
var match = (from e in employees
join m in managerUsers on e.UserName equals m.UserName
select new { Employee = e }).ToList();
A short breakdown of the code: I get a list of all employees from the database context, then look in the user roles to find the users with the Manager role. Employee also has a UserName field, so I tried to join the two on UserName. There is currently one manager present in both tables with a matching username, yet after this code runs, match has 0 results.
I also tried it like this:
employees.Join(managerUsers,
e => e.UserName,
m => m.UserName,
(e,m) => new { e }).ToList();
But that also doesn't return any records. What am I doing wrong?
I figured out a solution myself:
var managerEmployees = new List<Employee>();
foreach (var manager in managerUsers)
{
    // Copy the name into a local so the query only captures a plain string.
    var userName = manager.UserName;
    var found = await _context.Employees.FirstOrDefaultAsync(u => u.UserName == userName);
    if (found != null)
    {
        managerEmployees.Add(found);
    }
}
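The loop above issues one query per manager. A possible alternative, sketched here under assumptions, is to reduce the manager list to plain usernames and filter with Contains, which query providers can generally translate into a single IN-style filter. Employee and AppUser below are hypothetical stand-ins for the real entity and identity user types, ManagersOnly is a made-up helper name, and the sketch has only been exercised against an in-memory IQueryable, not against Entity Framework itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entity type and the identity user type.
public class Employee { public string UserName; }
public class AppUser { public string UserName; }

public static class ManagerFilterSketch
{
    // Hypothetical helper: keeps only employees whose UserName belongs to a manager.
    public static List<Employee> ManagersOnly(IQueryable<Employee> employees, IEnumerable<AppUser> managerUsers)
    {
        // Reduce the in-memory users to plain strings first.
        List<string> managerNames = managerUsers.Select(m => m.UserName).ToList();

        // Contains over a list of primitive values keeps the filter inside one query.
        return employees.Where(e => managerNames.Contains(e.UserName)).ToList();
    }
}
```

With a real context, the first argument would be something like _context.Employees, and the Include from the question could stay in place.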

The LINQ expression contains references to queries that are associated with different contexts

Here's my code:
var myStrings = (from x in db1.MyStrings.Where(x => homeStrings.Contains(x.Content))
join y in db2.MyStaticStringTranslations on x.Id equals y.id
select new MyStringModel()
{
Id = x.Id,
Original = x.Content,
Translation = y.translation
}).ToList();
And I get the error that the specified LINQ expression contains references to queries that are associated with different contexts. I know that the problem is that I try to access tables from both db1 and db2, but how do I fix this?
MyStrings is a small table
Load filtered MyStrings in memory, then join with MyStaticStringTranslations using LINQ:
// Read the small table into memory, and make a dictionary from it.
// The last step will use this dictionary for joining.
var byId = db1.MyStrings
.Where(x => homeStrings.Contains(x.Content))
.ToDictionary(s => s.Id);
// Extract the keys. We will need them to filter the big table
var ids = byId.Keys.ToList();
// Bring in only the relevant records
var myStrings = db2.MyStaticStringTranslations
.Where(y => ids.Contains(y.id))
.AsEnumerable() // Make sure the joining is done in memory
.Select(y => new {
Id = y.id
// Use y.id to look up the content from the dictionary
, Original = byId[y.id].Content
, Translation = y.translation
});
You are right that db1 and db2 can't be used in the same LINQ expression. x and y have to be joined in memory by this process rather than by a LINQ provider. Try this:
var x = db1.MyStrings.Where(xx => homeStrings.Contains(xx.Content)).AsEnumerable();
var y = db2.MyStaticStringTranslations.AsEnumerable();
var myStrings = (from a in x
                 join b in y on a.Id equals b.id
                 select new MyStringModel()
                 {
                     Id = a.Id,
                     Original = a.Content,
                     Translation = b.translation
                 }).ToList();
Refer to this answer for more details: The specified LINQ expression contains references to queries that are associated with different contexts
dasblinkenlight's answer has a better overall approach than this. In this answer I'm trying to minimize the diff against your original code.
I also faced the same problem:
"The specified LINQ expression contains references to queries that are associated with different contexts."
This happens because a single query cannot span two contexts at once, so I found the solution below.
In this example I want to list the lottery cards together with the owner name, but the table that holds the owner name is in another database. So I created two contexts, DB1Context and DB2Context, and first wrote the code as follows:
var query = from lc in db1.LotteryCardMaster
from om in db2.OwnerMaster
where lc.IsActive == 1
select new
{
lc.CashCardID,
lc.CashCardNO,
om.PersonnelName,
lc.Status
};
AB.LottryList = new List<LotteryCardMaster>();
foreach (var result in query)
{
AB.LottryList.Add(new LotteryCardMaster()
{
CashCardID = result.CashCardID,
CashCardNO = result.CashCardNO,
PersonnelName =result.PersonnelName,
Status = result.Status
});
}
But this gives me the above error, so I found another way to join two tables from different databases, shown below.
var query = from lc in db1.LotteryCardMaster
where lc.IsActive == 1
select new
{
lc.CashCardID,
lc.CashCardNO,
lc.OwnerID, // needed for the owner lookup against db2 below
lc.Status
};
AB.LottryList = new List<LotteryCardMaster>();
foreach (var result in query)
{
AB.LottryList.Add(new LotteryCardMaster()
{
CashCardID = result.CashCardID,
CashCardNO = result.CashCardNO,
PersonnelName =db2.OwnerMaster.FirstOrDefault(x=>x.OwnerID== result.OwnerID).OwnerName,
Status = result.Status
});
}

Join tables in NHibernate without mapping

I have the following two objects:
User
class User {
public int role;
}
Role
class Role {
public int id;
public string name;
}
Note that the role property inside User is an int and not a Role; that is our limitation.
I want to join every user with his role. As you can see, the mapped objects hold no reference to each other, just a simple type (int).
How do I write that join statement?
It's called a theta join:
var a = (from u in session.Query<User>()
from r in session.Query<Role>()
where u.role == r.id
select new { u.Username, Role = r.name }).ToList();
Assuming you have a Username property on the User class.
Yes, this "theta join" (I just learned the term) is very handy and lets us not worry about putting in pointless mapping relationships.
WARNING HOWEVER IN USING THIS!!! This tripped me up a lot.
Adding to the above example...
var list = new List<int> { 2, 3 }; // pretend in-memory data from something.
var a =
    (from u in session.Query<User>()
     from x in list
     from r in session.Query<Role>()
     where u.role == r.id
     where r.id == x // pretend list wants to limit to only certain roles.
     select new { u.Username, Role = r.name }).ToList();
THIS WILL BOMB with some NotSupported exception.
The trick is that anything coming from NHibernate Session must come LAST. So this alteration WILL work:
var a =
(from x in list
from u in session.Query<User>()
from r in session.Query<Role>()
where u.role == r.id
where r.id == x // pretend list wants to limit to only certain roles.
select new { u.Username, Role = r.name }).ToList();
By the way, you can use join as well; however, if any of the key types are nullable, make sure you use .Value when joining to something non-nullable.
var a =
(from x in list
from u in session.Query<User>()
join r in session.Query<Role>() on u.role equals r.id
where r.id == x // pretend list wants to limit to only certain roles.
select new { u.Username, Role = r.name }).ToList();
And while we're at it, let's say you have a method with some dynamic condition. In this example, 'list' could be a list of roles to filter by, but there should be no filtering at all if the list is not there. If you call .ToList() right away, you cause the query to execute immediately. Instead, you can add the condition first and execute the query later:
var a =
    from u in session.Query<User>()
    join r in session.Query<Role>() on u.role equals r.id
    select new { u.Username, Role = r.name, RoleID = r.id }; // Adding the Role ID into this output.
if (list != null) // assume if the list given is null, that means no filter.
{
    a = a.Where(x => list.Contains(x.RoleID));
    // WARNING. Unfortunately using the "theta" format here will not work. Not sure why.
}
var b = a.ToList(); // actually execute it.
var c = a.Select(x => new { x.Username, x.Role }).ToList(); // if you insist on removing that extra RoleID in the output.
One last thing: sometimes some simple logic will fail when executed in the select new { .. } part. I don't have an explanation. In our case the logic was just converting a DB value from a uint to an enum of a model. To get around that, I avoided doing the conversion while reading the data and just saved the raw value. Then in a later step, after the data was loaded, I did the conversion in another LINQ statement.
DISCLAIMER: While I wrote many of these things over the past several weeks, I did not put this code into my compiler to verify it 100%.

Getting DISTINCT values from a JOIN

I currently have the following LINQ statement:
using (MYEntities ctx = CommonMY.GetMYContext())
{
List<datUser> lstC = (from cObj in ctx.datUser
join fs in ctx.datFS on cObj.UserID equals fs.datUser.UserID
where userOrg.Contains(fs.userOrg.OrgName)
select cObj).ToList();
foreach (datUser c in lstC)
{
Claim x = new Claim
{
UserID = c.UserID,
FirstName = c.FirstName,
LastName = c.LastName,
MiddleName = c.MiddleName,
};
}
}
Right now it returns all users, but it duplicates them if they have more than one org associated with them.
How can I ensure that it only returns distinct UserIDs?
Each user can have multiple orgs, but I really just need to return users that have at least one org from the userOrg list.
Right before your ToList, put in .Distinct().
In response to @DJ BURB: you should probably use the Distinct overload that takes an IEqualityComparer, to be sure the comparison is based on the unique id of each record.
Look at this blog post for an example.
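A minimal sketch of that overload, assuming a simplified DatUser type (hypothetical, with only two fields) and made-up names UserIdComparer and DistinctById. Note that this overload runs in memory; query providers generally cannot translate a custom comparer, so it would be applied after the results have been loaded.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified version of the entity from the question.
public class DatUser { public int UserID; public string FirstName; }

// Treats two users as equal when their UserID matches.
public sealed class UserIdComparer : IEqualityComparer<DatUser>
{
    public bool Equals(DatUser x, DatUser y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.UserID == y.UserID;
    }

    // Must agree with Equals: equal users produce the same hash code.
    public int GetHashCode(DatUser obj) => obj.UserID.GetHashCode();
}

public static class DistinctSketch
{
    // Hypothetical helper: removes rows that repeat a UserID.
    public static List<DatUser> DistinctById(IEnumerable<DatUser> users) =>
        users.Distinct(new UserIdComparer()).ToList();
}
```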
use group by.
syntax:
var result= from p in <any collection> group p by p.<property/attribute> into grps
select new
{
Key=grps.Key,
Value=grps
}
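Applied to the duplicate-user problem, the grouping idea might look like the sketch below: group the joined rows by UserID and keep one row per group. DatUser is again a hypothetical simplified type and OnePerUser is a made-up helper name; the sketch runs over in-memory data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified version of the entity from the question.
public class DatUser { public int UserID; public string FirstName; }

public static class GroupBySketch
{
    // Hypothetical helper: one row per UserID, keeping the first row seen for each id.
    public static List<DatUser> OnePerUser(IEnumerable<DatUser> rows) =>
        rows.GroupBy(u => u.UserID)
            .Select(g => g.First())
            .ToList();
}
```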
You will have to call Distinct(); there is no LINQ query-syntax equivalent of that method.

How to LINQ Join when one key is an ArrayOfInt and the other an int

I'm trying to get a list of messages from the database where one of the recipients of the message matches a user. Normally, if there was only one recipient, you would have something like the following
var res = db.messages.Where(m => m.id == message_id)
.Join(db.persons, m => m.recipients, p => p.id, (m, p) => new {m, p})
.Select(x => new Message(){ msg = x.m, person = x.p})
But what if recipients is a comma-separated string of integers and id is an integer?
You would need to convert recipients into a list of elements as a start. I'm assuming that recipients is a list of ids from the person table. As such from your question you have to pass in the person id to do a select on it?
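var messages = db.messages
    .Where(m => m.id == message_id)
    .AsEnumerable() // Split cannot be translated to SQL, so the rest runs in memory
    .Where(m => m.recipients.Split(',')
        .Any(recipient => int.Parse(recipient) == person_id));
var person = db.Persons.Where(p => p.id == person_id);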
Note that doing this in LINQ will carry a performance penalty, because calls like .Split are C# methods and will not work on an IQueryable. The database will therefore have to send a lot of data to the client to perform this query, depending on the size of your table. If you have a view in the database where the recipients are already tokenized, or you can create a new table in the DB that lists the recipients of each message by message ID, you could do this much more easily (not to mention normalising your DB in the process).
Option 1 - using Contains:
var res = from m in db.messages
where m.id == message_id
from p in db.persons
where m.recipients.Split(",").Select(i => Int32.Parse(i))
.Contains(p.id)
select new Message() {
msg = m,
person = p
};
The idea:
Get the messages as in your original query
Get all persons where contained in the recipient list
Continue with your original query
Option 2 - using a LINQ join (maybe more complicated than it needs to be):
var res = from m in db.messages
          where m.id == message_id
          from r in m.recipients.Split(",")
          select new {
              Message = m,
              Recipient = Int32.Parse(r)
          } into rec
          join p in db.persons on rec.Recipient equals p.id
          select new Message () {
              msg = rec.Message,
              person = p
          };
The idea:
Get the messages as in your original query
Split the string into a list of Int32
Join this list against db.persons
Continue with your original query
Not sure which of these is faster, you'll have to check.
I would also shy away from using comma-delimited strings as a foreign key reference namely because of the troubles you're having here. They're ugly to look at and a true pain to manipulate. Instead, if you have control over the schema, consider a one-to-many relation table having MessageId and PersonId. Of course if you don't have control, you're stuck.
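To illustrate the relation-table suggestion, here is a minimal in-memory sketch. The MessageRecipient and Person types and the RecipientNames helper are hypothetical; with a real schema the two sequences would be tables on the context, and the join would be an ordinary key-to-key join with no string splitting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types: one MessageRecipient row per (message, person) pair.
public class Person { public int Id; public string Name; }
public class MessageRecipient { public int MessageId; public int PersonId; }

public static class RecipientJoinSketch
{
    // Hypothetical helper: names of everyone who received the given message.
    public static List<string> RecipientNames(
        int messageId,
        IEnumerable<MessageRecipient> links,
        IEnumerable<Person> persons)
    {
        return (from mr in links
                where mr.MessageId == messageId
                join p in persons on mr.PersonId equals p.Id
                select p.Name).ToList();
    }
}
```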
Disclaimer: These are untested, so may need some tweaking for the code to work. The algorithms however should be ok.
