MongoDB C# driver 2.0 update collection and ignore duplicates

I am very new to MongoDB (I have only spent a day learning it). I have a relatively simple problem to solve, and I chose to take the opportunity to learn about this popular NoSQL database.
In C# I have the following classes:
    public class Item
    {
        [BsonId]
        public string ItemId { get; set; }
        public string Name { get; set; }
        public ICollection<Detail> Details { get; set; }
    }

    public class Detail
    {
        //[BsonId]
        public int DetailId { get; set; }
        public DateTime StartDate { get; set; }
        public double Qty { get; set; }
    }
I want to be able to add multiple objects (Details) to the Details collection. However, I know that some of the items I have (coming from a REST API) will already be stored in the database, and I want to avoid duplicates.
So far I can think of 2 ways of doing it, but I am not really happy with either:
1. Get all stored details (per item) from MongoDB, then filter in .NET to find the new items and add only those to the db. This way I can be sure that there will be no duplicates. That is, however, far from an ideal solution.
2. Add the [BsonId] attribute to DetailId (without this attribute this solution does not work) and then use AddToSetEach. This works, and my only problem with it is that I don't quite understand it. It is supposed to add only the new objects if they do not already exist in the database, but how does it know? How does it compare the objects? Do I have any control over that comparison process? Can I supply custom comparers? I also noticed that if I pass 2 objects with the same DetailId (this should never happen in the real app), it still adds both, so the BsonId attribute does not guarantee uniqueness?
Is there an elegant solution for this problem? Basically, I just want to update the Details collection by passing in another collection (which I know contains some objects already stored in the db, i.e. in the first collection) and ignore all duplicates.

The AddToSetEach-based version is certainly the way to go, since it is the only one that scales properly.
I would, however, recommend that you drop the DetailId field entirely unless it is really required by some other part of your application. Judging from a distance, it would appear that any entry in your list of item details is uniquely identifiable by its StartDate field (plus potentially Qty, too). So why would you need the DetailId on top of that?
That leads directly to your question of why adding a [BsonId] attribute to the DetailId property does not result in guaranteed uniqueness inside your collection of Detail elements. One reason is that MongoDB simply cannot do it (see this link). The second reason is that the MongoDB C# driver does not create a unique index or attempt other magic in order to ensure uniqueness here, probably because of reason #1. ;) All the [BsonId] attribute does is tell the driver to serialize the attributed property as the "_id" field (and to map "_id" back to that property upon deserialization).
On the topic of "how does MongoDB know which objects are already present", the documentation is pretty clear:
If the value is a document, MongoDB determines that the document is a
duplicate if an existing document in the array matches the to-be-added
document exactly; i.e. the existing document has the exact same fields
and values and the fields are in the same order. As such, field order
matters and you cannot specify that MongoDB compare only a subset of
the fields in the document to determine whether the document is a
duplicate of an existing array element.
And, no, there is no option to specify custom comparers.
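For illustration, here is a minimal sketch of the AddToSetEach approach with the 2.x driver. The DetailWriter class and its AddDetailsAsync method are hypothetical names, and the collection handle is assumed to be obtained elsewhere; the model classes are the ones from the question. Whether an element counts as a duplicate is decided by the server according to the exact-match rule quoted above, not by any C# equality logic.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using MongoDB.Bson.Serialization.Attributes;
using MongoDB.Driver;

public class Detail
{
    public int DetailId { get; set; }
    public DateTime StartDate { get; set; }
    public double Qty { get; set; }
}

public class Item
{
    [BsonId]
    public string ItemId { get; set; }
    public string Name { get; set; }
    public ICollection<Detail> Details { get; set; }
}

// Hypothetical helper, not part of the driver.
public static class DetailWriter
{
    public static Task<UpdateResult> AddDetailsAsync(
        IMongoCollection<Item> items, string itemId, IEnumerable<Detail> incoming)
    {
        // Select the parent document by its _id.
        var filter = Builders<Item>.Filter.Eq(i => i.ItemId, itemId);

        // Translates to { $addToSet: { Details: { $each: [...] } } }.
        // The server skips every element that exactly matches an existing one.
        var update = Builders<Item>.Update.AddToSetEach(i => i.Details, incoming);

        return items.UpdateOneAsync(filter, update);
    }
}
```

Because the comparison covers every serialized field, two Detail objects with the same DetailId but a different Qty are treated as distinct elements and both end up in the array.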

Related

Entity Framework's "Index" decoration is not creating an Index

Here is the super simple class I'm trying to create.
    public class Company
    {
        public int ID { get; set; }
        [Column(TypeName = "VARCHAR(254)")]
        [Index]
        public string Name { get; set; }
        [Index]
        public int stupidField { get; set; }
    }
My goal was to force Name to be unique, so I added the decoration [Index(IsUnique = true)]. But no unique index was created, so I figured I'll first try to solve the simpler problem of creating any index. Because I read here that indices cannot be created for columns of type varchar(max), I limited the length of the Name field. Still no luck. I even tried a few different syntaxes for limiting the length of the field, but still no index.
To see if something other than string length was at play, I created the integer field stupidField, but I can't index that field either. So now I'm completely out of ideas as to what could be wrong. Please help me!
Check out this screenshot from MS SQL Server Management Studio that shows that my fields are being created but not the indices.
Note: I'm certain migrations are not the issue.
Some of the people I've read about on SO were updating their classes, but those changes were not reflected in the database because of problems with their migrations. That is not relevant here. I delete the database and recreate it every time I make a change. (I even make silly changes like renaming my fields, just to make sure that I can still affect the database.)

It turns out I'm actually using Entity Framework Core, not Entity Framework. In Entity Framework Core, indices cannot be created using attributes on properties (a class-level [Index] attribute only arrived with EF Core 5.0), although they can be created using the fluent API. See Microsoft's documentation.
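A minimal sketch of the fluent API route, assuming a context class named CompanyContext (a hypothetical name) and leaving the provider configuration out:

```csharp
using Microsoft.EntityFrameworkCore;

public class Company
{
    public int ID { get; set; }
    public string Name { get; set; }
}

// Hypothetical context; provider setup (OnConfiguring or DI) is omitted.
public class CompanyContext : DbContext
{
    public DbSet<Company> Companies { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Bound the column length so the column can be indexed.
        modelBuilder.Entity<Company>()
            .Property(c => c.Name)
            .HasMaxLength(254);

        // Declare the unique index in the model instead of via an attribute.
        modelBuilder.Entity<Company>()
            .HasIndex(c => c.Name)
            .IsUnique();
    }
}
```

The index is part of the model, so it is created the next time the database is created or migrated from that model.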

Updating relational entities

I have a scenario in which I need some help.
Let us assume that there is a User who listens to some type of Music.
    class User
    {
        public virtual List<UserMusicType> Music { get; set; }
    }

    public class UserMusicType
    {
        public int ID { get; set; }
        public MusicType name { get; set; }
    }

    public class MusicType
    {
        public int ID { get; set; }
        public string Name { get; set; }
    }
There is a form where I ask the user to check/select all the types of music he listens to. He selects 3 types, namely { Pop, Rock, and Electronic }.
CASE 1:
Now I want to update the User entity and insert these 3 new types. From my understanding, I first need to remove whatever MusicTypes were saved in the database for this user and then insert the new types. Is this a correct approach, removing all previous ones and inserting the new ones? Or is there another way to do it?
CASE 2:
I am taking the MusicType names as strings, of course. So while updating the User entity, I'll first have to fetch the MusicType.ID, and only after that will I be able to do this:
    user.Music.Add(new UserMusicType() { ID = SOME_ID });
Is there a better approach for this case?
I'll be glad to have some replies from experienced people in EF. I want to learn if there is an efficient way of doing it. Or even if my approach/Models are totally wrong or could be improved.

First of all, you don't need the UserMusicType class; you can just declare the User class as
    class User
    {
        public virtual List<MusicType> Music { get; set; }
    }
and Entity Framework will create a many-to-many relationship table in the database.
As for the first question, it depends. If you use this relationship anywhere else, like payment or an audit trail, then the best way would be to compare the posted values to the saved values, e.g.:
The user selected Music 1, Music 2 and Music 3 the first time and saved; in this case 3 records are inserted.
The user then edited his selection and chose Music 1, Music 3 and Music 4; in this case the submitted values are 1, 3, 4 and the values stored in the database are 1, 2, 3.
The new values are the items that exist in the submitted values but not in the stored ones; in this case that is Music 4.
The removed values are the items that exist in the stored values but not in the submitted ones; in this case that is Music 2.
The rest can be ignored.
So your update will be: add Music 4, remove Music 2.
If you don't depend on the relationship, then it is easier to just remove all user music and add the collection again.
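The comparison described above (added = posted minus stored, removed = stored minus posted) can be sketched with plain LINQ. SelectionDiff and Compute are hypothetical names, and the IDs stand for MusicType.ID values:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration.
public static class SelectionDiff
{
    // Returns the IDs that must be inserted and the IDs that must be deleted.
    public static (List<int> ToAdd, List<int> ToRemove) Compute(
        IEnumerable<int> storedIds, IEnumerable<int> postedIds)
    {
        var stored = storedIds.ToList();
        var posted = postedIds.ToList();

        // In the posted selection but not yet in the database.
        var toAdd = posted.Except(stored).ToList();

        // In the database but no longer in the posted selection.
        var toRemove = stored.Except(posted).ToList();

        return (toAdd, toRemove);
    }
}
```

For the example above, SelectionDiff.Compute(new[] { 1, 2, 3 }, new[] { 1, 3, 4 }) yields ToAdd containing 4 and ToRemove containing 2; IDs 1 and 3 are left untouched.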
As for the second part of your question, I assume you will display some checkboxes for the user. You should use the MusicType ID as the value of each checkbox control; this is what will be posted to the backend, and you can use it to link the type to the user,
e.g.:
    user.Music.Add(new MusicType { ID = selectedId }); // selectedId is the ID posted by the checkbox
You should not depend on the music name.

First question:
Actually, it is a matter of personal preference. I wouldn't want to delete all rows that belong to that user and then insert them again. I would compare the collection posted from the form with the rows stored in the database, then delete the entities from the database that no longer exist in the collection, and insert the new ones. You can even update the entities whose additional details have been modified.
By the way, you can easily achieve this with the newly released EntityGraphOperations for Entity Framework Code First. I am the author of this product, and I have published it on GitHub, CodeProject and NuGet. With the help of the InsertOrUpdateGraph method, it will automatically set your entities as Added or Modified. And with the help of the DeleteMissingEntities method, you can delete those entities which exist in the database but not in the current collection.
    // This will set the state of the main entity and all of its navigational
    // properties as `Added` or `Modified`.
    context.InsertOrUpdateGraph(user)
        .After(entity =>
        {
            // And this will delete missing UserMusicType objects.
            entity.HasCollection(p => p.Music)
                .DeleteMissingEntities();
        });
You can read my article on CodeProject, which has a step-by-step demonstration and a sample project ready for download.
Second question:
I don't know which platform you are developing your application on, but I generally store lookup data such as MusicType in a cache and use a DropDownList element to render all the types. When the user posts the form, I receive the values (IDs) rather than the names of the selected types, so no additional work is required.

HQL to insert entity references by passing property values?

I'm having problems sorting through all the Google results for my search terms; too much information that is close but not what I'm looking for, so... off to StackOverflow!
I have three tables, Stocks, StockProperties, and StockPropertyTypes.
One Stock record has zero or more StockProperties associated with it, and has a unique column symbol. Each StockProperty record has exactly one reference to a StockPropertyType, and exactly one reference to a Stock. The StockPropertyTypes table has a unique column code. The entities for these tables do not have the FK id's for the references, just the classes. For example, the StockProperty entity looks like:
    public class StockProperty
    {
        public virtual Guid StockPropertyId { get; set; }
        public virtual Stock Stock { get; set; }
        public virtual string PropertyValue { get; set; }
        public virtual StockPropertyType Type { get; set; }
    }
I want to pass into a method a stock symbol, a type code, and a value, and create a StockProperty record using HQL, but I'm not sure how to write that query. The Stocks table and the StockPropertyTypes have no relation. Seems like this should be some nested HQL query, but I'm not sure how to differentiate between a property and a referenced entity.
Can someone educate me what that HQL query should look like?
I should add, my goal here is to do this with one db trip; I don't want to load the Stock and StockPropertyType entities before creating the StockProperty record.

The typical way to do this is to load the Stock and StockPropertyType from the ISession. Then create a StockProperty to save using ISession.Save().
As you mention, this requires a few extra trips to the DB. One way to avoid this is to execute SQL directly as follows:
    session
        .CreateSQLQuery(@"insert
            into StockProperty(StockSymbol, Value, TypeCode)
            values (:stockSymbol, :value, :typeCode)")
        // Bind the named parameters used in the statement above.
        .SetParameter("stockSymbol", stockSymbol)
        .SetParameter("value", value)
        .SetParameter("typeCode", typeCode)
        .ExecuteUpdate();
You are kind of bypassing NHibernate here, but it is more efficient.
Personally, I would consider loading the related entities into memory unless you are experiencing a bottleneck. You can load both the Stock and StockPropertyType in a single DB call by using the Future<T>() paradigm.
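A rough sketch of the Future-based variant, assuming Stock has a Symbol property and StockPropertyType has a Code property (hypothetical property names derived from the column names in the question), and that all three classes are mapped. StockPropertyWriter is a hypothetical helper:

```csharp
using System;
using NHibernate;

public class Stock
{
    public virtual Guid StockId { get; set; }
    public virtual string Symbol { get; set; }   // assumed property name
}

public class StockPropertyType
{
    public virtual Guid StockPropertyTypeId { get; set; }
    public virtual string Code { get; set; }     // assumed property name
}

public class StockProperty
{
    public virtual Guid StockPropertyId { get; set; }
    public virtual Stock Stock { get; set; }
    public virtual string PropertyValue { get; set; }
    public virtual StockPropertyType Type { get; set; }
}

// Hypothetical helper for illustration.
public static class StockPropertyWriter
{
    public static void Add(ISession session, string symbol, string typeCode, string value)
    {
        // Both lookups are only queued here; nothing is sent to the database yet.
        var stock = session.QueryOver<Stock>()
            .Where(s => s.Symbol == symbol)
            .FutureValue();
        var type = session.QueryOver<StockPropertyType>()
            .Where(t => t.Code == typeCode)
            .FutureValue();

        // Reading .Value executes the queued queries, batched into one
        // round trip when the database driver supports multiple statements.
        session.Save(new StockProperty
        {
            Stock = stock.Value,
            Type = type.Value,
            PropertyValue = value
        });
    }
}
```

This still costs one trip for the lookups plus the insert when the session flushes, so it reduces rather than eliminates the extra round trips.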
Alternatively...
You could try fiddling with <sql-insert> inside of your hibernate mapping file. This allows you more control over how the insert is generated. You might want to add some properties StockId and StockPropertyTypeId that are only used during insert.

Linq: how return property of specific object

I have the following model:
    public partial class location
    {
        public int Id { get; set; }
        public double Lat { get; set; }
        public double Long { get; set; }
        public virtual ICollection<localserver> localserver { get; set; }
    }
When, in a controller, I do:
    List<location> l = db.location.ToList();
I also get the localserver objects. How can I, in LINQ, get only the properties of location (Id, Lat and Long) without using Select?

The way to extract part of an object is to project it to a new form, which is what .Select() is for:
    var result = db.location
        .Select(x => new
        {
            Id = x.Id,
            Lat = x.Lat,
            Long = x.Long
        })
        .ToList();
Now, you've specifically asked how to do this without using .Select(), so... really, the answer is "you don't". You've ruled out the tool that's specifically designed for the scenario you're presenting, so no useful answer remains.
To address the intent of your question, however, I'm going to make a guess that you don't want to load the collection of localserver objects, perhaps because it's large and uses a lot of memory. To solve that problem, I would suggest the following:
If you're using Entity Framework Code First, check your cascade options when creating the database (if you're setting any).
Check that the collection is declared as virtual (you've shown it as such here but check the actual source)
Check that you have lazy-loading enabled, by setting myContext.ContextOptions.LazyLoadingEnabled = true; at some point (on a DbContext the equivalent setting is myContext.Configuration.LazyLoadingEnabled).
This should allow the application to lazy-load that property, which means that the contents of that localserver property will only be retrieved from the database and loaded into memory when they're actually needed. If you don't ever access it, it won't be loaded and won't take up any memory.

When you fetch the list of location entities, the localserver data is not actually pulled along with it, because Entity Framework has a feature called lazy loading. Lazy loading means you can get an associated object whenever you need it, without writing a separate LINQ query for it. The data of the associated object is only pulled when you access it in your code; at that point Entity Framework makes one more call to the database to load it.
So just go with your code. If you want only selected columns from the location object itself, then you have to write a Select and supply the column names.

How can I store an Array of Ints in my database, in a single field, using Code First Entity Framework?

What I want to do is store an array of ints in my database. I am using Code First Entity Framework to create my database, and I can't seem to create a List or array element in the database. The List item doesn't create a field. I gather that this is because it would need a separate table.
What is the best way to store my array in a single field here? Should I convert the array to a string, or is there an easier way?
My model contains:
    public string Title { get; set; }
    public List<int> IdArray { get; set; }

NHibernate supports this, but EF doesn't.
If you need to do it, the only option is to store it serialized. For example, you can use:
an XML column, and store it serialized as XML
a char column, and store it serialized as JSON
EF will not automatically serialize / deserialize it. You should add a property that does the serialization/deserialization for you:
    public string IdsColumn { get; set; } // Mapped to DB

    [NotMapped]
    public List<int> Ids
    {
        get { return Deserialize(IdsColumn); }
        set { IdsColumn = Serialize(value); }
    }
Of course, you have to provide the implementation of the serialization functions, using XmlSerializer or Json.NET, for example.
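As a sketch of what those functions could look like, the following uses a plain comma-separated string instead of XML or JSON, purely to stay dependency-free. Article and IdCodec are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations.Schema;
using System.Globalization;
using System.Linq;

// Hypothetical entity for illustration.
public class Article
{
    public string Title { get; set; }

    // Mapped to the database as an ordinary string column.
    public string IdsColumn { get; set; }

    // Not mapped; a typed view over IdsColumn.
    [NotMapped]
    public List<int> Ids
    {
        get { return IdCodec.Deserialize(IdsColumn); }
        set { IdsColumn = IdCodec.Serialize(value); }
    }
}

// Hypothetical codec: comma-separated integers, e.g. "3,1,2".
public static class IdCodec
{
    public static string Serialize(IEnumerable<int> ids)
    {
        if (ids == null) return null;
        return string.Join(",", ids.Select(i => i.ToString(CultureInfo.InvariantCulture)));
    }

    public static List<int> Deserialize(string column)
    {
        // A null or empty column maps to an empty list.
        if (string.IsNullOrEmpty(column)) return new List<int>();
        return column.Split(',')
            .Select(s => int.Parse(s, CultureInfo.InvariantCulture))
            .ToList();
    }
}
```

One caveat of this design: the getter builds a new list on every access, so calling Ids.Add(5) changes only that temporary copy. To modify the stored value, assign a whole list back to Ids.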
If you want this functionality implemented in EF itself, you can vote for it on the EF data UserVoice site, for example under "Richer collection support, including ordered collections and dictionaries of value types".

I'm not too sure of your situation, but most of the time this is not necessary.
Basically, if you arrive at such a situation, you should look at normalizing your tables further. Data in tables needs to be "atomic"; once retrieved as objects, it can be put in a list.
