TLDR:
Is it possible to:
store a document (Article) with a relation (Tags) in one Store/StoreAsync call to RavenDB, but into separate collections?
then fetch the parent document (Article) including the related documents (Tags) in one query (without including/loading the tags separately)?
Explanation
AFAIK the only way to store related data in separate RavenDB collections is to store each document individually. When you read the data, you need to Include the related documents and then call Load to get them.
I wonder if there is a way to simplify this by storing and querying Articles and related Tags in one go.
I have an idea of how I wish it would work (but it does not), as well as a working but cumbersome example.
The examples below are split into these steps
POCOs
Storing data
Index definition
Querying the index
I put the broken-but-I-wish-it-would-work and the working examples next to each other. I think it is easier to understand it that way.
POCOs
broken-but-I-wish-it-would-work
namespace Articles
{
public class ArticlePersistance
{
public string Id { get; set; }
public string Title { get; set; }
public List<TagPersistance> Tags { get; set; } // Specify TagPersistance here
}
[DearRavenDBStoreToSeparateCollectionPlease] // Does not exist
public class TagPersistance
{
public string Id { get; set; }
public string Name { get; set; }
}
}
working
namespace Articles
{
public class ArticlePersistance
{
public string Id { get; set; }
public string Title { get; set; }
public List<string> Tags { get; set; } // Specify string here
}
public class TagPersistance
{
public string Id { get; set; }
public string Name { get; set; }
}
}
Storing data
broken-but-I-wish-it-would-work
The goal is to store ArticlePersistance and TagPersistance in their own collections with a single call to StoreAsync. AFAIK this instead embeds the Tags inside the Article document, so both end up in the same collection.
var tag = new TagPersistance() { Name = "Tag1" };
var article = new ArticlePersistance()
{
Title = "aaa",
Tags = new List<TagPersistance> { tag } // Embed the full Tag here
};
await session.StoreAsync(article); // Only one call to StoreAsync
await session.SaveChangesAsync();
working
Storing the Article and Tags separately:
var tag = new TagPersistance() { Name = "Tag1" };
await session.StoreAsync(tag); // Store Tag separately
var article = new ArticlePersistance()
{
Title = "aaa",
Tags = new List<string> { tag.Id } // Embed only the tag id
};
await session.StoreAsync(article);
await session.SaveChangesAsync();
Index definition
broken-but-I-wish-it-would-work
Index on ArticlePersistance which stores the full Tag objects
public class Articles_Test : AbstractIndexCreationTask<ArticlePersistance>
{
public Articles_Test()
{
Map = articles =>
from article in articles
let tags = article.Tags.Select(t => LoadDocument<TagPersistance>(t)) // Load the related Tags
select new
{
Title = article.Title,
Tags = tags // Store the full Tag objects here
};
}
}
working
Index which holds only the Tag names, not the full Tag objects:
public class Articles_Test : AbstractIndexCreationTask<ArticlePersistance>
{
public Articles_Test()
{
Map = articles =>
from article in articles
let tags = article.Tags.Select(t => LoadDocument<TagPersistance>(t)) // Load the related Tags
select new
{
Title = article.Title,
Tags = tags.Select(t => t.Name) // Store only the Tag name
};
}
}
Querying the index
broken-but-I-wish-it-would-work
Finally querying the index and getting the article with the tags back.
I hoped for fetching the Article and the Tags in one go here
// This does not work
var article = await session
.Query<ArticlePersistance, Articles_Test>()
.Where(a => a.Title == "aaa")
.ToListAsync();
working
This is working, but cumbersome.
You need to care about the relation between Article and Tags which could already be specified in the Index definition.
var article = await session
.Query<ArticlePersistance, Articles_Test>()
.Where(a => a.Title == "aaa")
.Include(t => t.Tags) // Include the tags
.ToListAsync();
// Load the tags separately (already included above, so no extra server round trip)
var tags = await session.LoadAsync<TagPersistance>(article.SelectMany(a => a.Tags));
Related
I need your help
I am trying to write a LINQ query with .Include, but my problem is that one of the properties of my class is a list. This is my class:
public partial class Document
{
public int ID { get; set; }
public string Amount { get; set; }
public List<Log> Log { get; set; }
}
This is the Log class:
public partial class Log
{
[Key]
public int ID { get; set; }
[Required]
public Status Status { get; set; }
[Column(TypeName = "text")]
public string Description { get; set; }
public DateTime? DateLog { get; set; }
public int? DocumentID{ get; set; }
[ForeignKey("DocumentID")]
public Document Document{ get; set; }
}
My problem is that I don't know how to filter the list of logs inside the document when including it. I need to get the whole document together with only the logs whose status is Recieved; a document can have many logs.
I tried the following, but it didn't work:
var Result = db.document
.Include(m => m.Log.Where(c => c.Status == Status.Recieved));
I received the following error:
"the include path expression must refer to a navigation property defined on the type. use dotted paths for reference navigation properties and the select operator for collection navigation properties.\r\nparameter name: path"
I appreciate your help
Include is used to include an entity's relationships and fetch the related entities' properties; it takes a navigation property, not a filter. See the documentation: Fetching related data.
If you select documents without Include like this
var documents = await db.document.ToListAsync();
you get the document data, but Log will be null.
You need something like this:
var result = await db.document
.Select(w=> new
{
document = w,
log = w.Log.Where(c => c.Status == Status.Recieved).ToList()
}).ToListAsync();
EF does support some automatic filtering rules to help with concepts like soft-delete (IsActive) and multi-tenancy (ClientId), but these are not really applicable to scenarios like this one, where you want to apply a situational filter such as "received" documents.
EF entities should be considered as models reflecting the data state. To filter results like that is more of a view model state which you can achieve through projection:
var result = db.document.Select(d => new DocumentViewModel
{
DocumentId = d.DocumentId,
// .. fill in other required details...
ReceivedLogs = d.Logs
.Where(l => l.Status == Status.Received)
.Select(l => new LogViewModel
{
// Fill needed log details...
}).ToList()
}).ToList();
Otherwise if you are doing something local with the entities and just want the document and the received log entries:
var documentDetails = db.document
.Where(d => d.DocumentId == documentId)
.Select(d => new
{
Document = d,
ReceivedLogs = d.Logs
.Where(l => l.Status == Status.Received)
.ToList()
}).Single();
documentDetails.Document.Logs will not be eager loaded, and would trigger lazy loading if you access it, but documentDetails does contain the relevant Received logs. Because it is an anonymous type, it is not suitable for being returned from a method, only for being consumed locally.
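The projection idea from the answers above can be tried without a database. The following is only a minimal LINQ-to-Objects sketch, not EF code: the `FilteredLogs` class and its `WithStatus` method are hypothetical names introduced for illustration, and the entities are cut down to the properties the filter needs. It pairs each document with just the logs in a given status, which is the same shape the `Select` projections above produce.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Status { Pending, Received }

public class Log
{
    public int ID { get; set; }
    public Status Status { get; set; }
}

public class Document
{
    public int ID { get; set; }
    public List<Log> Log { get; set; } = new List<Log>();
}

// Hypothetical helper, not part of EF: shows the projection on in-memory data.
public static class FilteredLogs
{
    // Pairs every document with only those of its logs that have the given status.
    public static List<(Document Document, List<Log> Logs)> WithStatus(
        IEnumerable<Document> documents, Status status)
    {
        return documents
            .Select(d => (Document: d, Logs: d.Log.Where(l => l.Status == status).ToList()))
            .ToList();
    }

    public static void Main()
    {
        var docs = new List<Document>
        {
            new Document
            {
                ID = 1,
                Log =
                {
                    new Log { ID = 1, Status = Status.Received },
                    new Log { ID = 2, Status = Status.Pending }
                }
            }
        };

        foreach (var (doc, logs) in WithStatus(docs, Status.Received))
        {
            Console.WriteLine($"Document {doc.ID}: {logs.Count} received log(s)");
        }
    }
}
```

A named tuple is used here instead of an anonymous type so that the result can be returned from a method; a small view model class would serve the same purpose.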
I have Places, each place can have many tags. Each tag can be assigned to many places.
public class Place {
public int Id { get; set; }
public string PlaceName { get; set; }
public IEnumerable<Tag> Tags { get; set; }
}
public class Tag {
public int Id { get; set; }
public string TagName { get; set; }
}
public class TagPlace {
public int Id { get; set; }
public int PlaceId { get; set; }
public int TagId { get; set; }
}
The database has equivalent tables with foreign keys as appropriate.
I want to get a collection of Places, and I want each Place to have an appropriate collection of Tags. I guess using Linq might be required.
I've found various articles on this, but they aren't quite the same / deal with a list of ints rather than two collections of objects.
eg
https://social.msdn.microsoft.com/Forums/en-US/fda19d75-b2ac-4fb1-801b-4402d4bd5255/how-to-do-in-linq-quotselect-from-employee-where-id-in-101112quot?forum=linqprojectgeneral
LINQ Where in collection clause
What's the best way of doing this?
The classical approach with Dapper is to use a Dictionary to store the main objects while the query enumerates the records
public IEnumerable<Place> SelectPlaces()
{
string query = @"SELECT p.id, p.PlaceName, t.id, t.tagname
FROM Place p INNER JOIN TagPlace tp ON tp.PlaceId = p.Id
INNER JOIN Tag t ON tp.TagId = t.Id";
var result = default(IEnumerable<Place>);
Dictionary<int, Place> lookup = new Dictionary<int, Place>();
using (IDbConnection connection = GetOpenedConnection())
{
// Each record is passed to the delegate where p is an instance of
// Place and t is an instance of Tag, delegate should return the Place instance.
result = connection.Query<Place, Tag, Place>(query, (p, t) =>
{
// Check if we have already stored the Place in the dictionary
if (!lookup.TryGetValue(p.Id, out Place placeFound))
{
// The dictionary doesnt have that Place
// Add it to the dictionary and
// set the variable where we will add the Tag
lookup.Add(p.Id, p);
placeFound = p;
// Probably it is better to initialize the IEnumerable
// directly in the class
placeFound.Tags = new List<Tag>();
}
// Add the tag to the current Place.
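// Tags is declared as IEnumerable<Tag>, so cast back to the List<Tag> assigned above.
((List<Tag>)placeFound.Tags).Add(t);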
return placeFound;
}, splitOn: "id");
// SplitOn is where we tell Dapper how to split the record returned
// in the two instances required, but here SplitOn
// is not really needed because "Id" is the default.
}
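// The query result holds one entry per row, so a Place with several tags
// appears several times; the dictionary holds each Place exactly once.
return lookup.Values;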
}
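The dictionary technique does not depend on Dapper itself, so it can be sketched on plain in-memory rows. This is only an illustration under assumptions: the `PlaceGrouping` class and the row tuple shape are hypothetical, and `Tags` is declared as `List<Tag>` (rather than `IEnumerable<Tag>` as in the question) so that tags can be added without a cast.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Tag
{
    public int Id { get; set; }
    public string TagName { get; set; }
}

public class Place
{
    public int Id { get; set; }
    public string PlaceName { get; set; }
    // Initialized in the class so callers never see a null collection.
    public List<Tag> Tags { get; set; } = new List<Tag>();
}

// Hypothetical helper: groups flat join rows into Place objects with their Tags.
public static class PlaceGrouping
{
    public static List<Place> Group(
        IEnumerable<(int PlaceId, string PlaceName, int TagId, string TagName)> rows)
    {
        var lookup = new Dictionary<int, Place>();
        foreach (var row in rows)
        {
            // Create the Place the first time its id is seen, reuse it afterwards.
            if (!lookup.TryGetValue(row.PlaceId, out var place))
            {
                place = new Place { Id = row.PlaceId, PlaceName = row.PlaceName };
                lookup.Add(place.Id, place);
            }
            place.Tags.Add(new Tag { Id = row.TagId, TagName = row.TagName });
        }
        return lookup.Values.ToList();
    }

    public static void Main()
    {
        var rows = new List<(int PlaceId, string PlaceName, int TagId, string TagName)>
        {
            (1, "Cafe", 10, "food"),
            (1, "Cafe", 11, "wifi"),
            (2, "Park", 12, "outdoor")
        };

        foreach (var place in Group(rows))
        {
            Console.WriteLine($"{place.PlaceName}: {string.Join(", ", place.Tags.Select(t => t.TagName))}");
        }
    }
}
```

Each row of the join contributes one tag, and the dictionary guarantees that every Place is created once no matter how many rows mention it.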
In Example II of Indexing Related Documents, an index is built over Authors by Name and Book title. The relevant entities look like so:
public class Book {
public string Id { get; set; }
public string Name { get; set; }
}
public class Author {
public string Id { get; set; }
public string Name { get; set; }
public IList<string> BookIds { get; set; }
}
I.e. only the Author holds information about the relation. This information is used in constructing said index.
But how would I construct an index over Books by Authors (assuming a book could have multiple authors)?
Edit:
The book/author analogy only goes so far. I'll make an example that's closer to my actual use case:
Suppose we have some tasks that are tied to locations:
public class Location {
public string Id { get; set; }
public double Latitude { get; set; }
public double Longitude { get; set; }
}
public class Task {
public string Id { get; set; }
public string Name { get; set; }
public string LocationId { get; set; }
public Status TaskStatus { get; set; }
}
I have an endpoint serving Locations as GeoJson to a map view in a client. I want to color the Locations depending on status of Tasks associated with them. The map would typically show 500-2000 locations.
The query on locations is implemented as a streaming query.
Using the query-method indicated in Ayende's initial answer, I might do something like:
foreach (var location in locationsInView)
{
var completedTaskIds = await RavenSession.Query<Task>()
.Where(t => t.LocationId == location.Id && t.TaskStatus == Status.Completed)
.ToListAsync();
//
// Construct geoJson from location and completedTaskIds
//
}
This results in 500-2000 queries being executed against RavenDB, which doesn't seem right.
This is why I initially thought I needed an index to construct my result.
I have since read that RavenDB caches everything by default, so that might be a non-issue. On the other hand, having implemented this approach, I get an error ("...maximum number of requests (30) allowed for this session...").
What is a good way of fixing this?
You cannot index them in this manner.
But you also don't need to.
If you want to find all the books by an author, you load the author and you have the full list.
You can do this using a multi map/reduce index.
All sources of truth about the objects of interest are mapped to a common object type (Result). This mapping is then reduced grouping by Id and keeping just the relevant pieces, creating a "merge" of truths about each object (Book in this case). So using the Book/Author example, where several Authors might have contributed to the same book, you could do something like the following.
Note that the map and reduce steps must output the same type of object, which is why author.Id is wrapped in a list during the mapping from author.
Author.Names are excluded for brevity, but could be included in the exact same way as Author.Id.
public class BooksWithAuthors : AbstractMultiMapIndexCreationTask<BooksWithAuthors.Result>
{
    public class Result
    {
        public string Id { get; set; }
        public string Title { get; set; }
        public IEnumerable<string> AuthorIds { get; set; }
    }

    public BooksWithAuthors()
    {
        // Map each Book to a Result; the authors are not known from this side.
        AddMap<Book>(books => from book in books
                              select new
                              {
                                  Id = book.Id,
                                  Title = book.Name, // Book exposes its title as Name
                                  AuthorIds = (IEnumerable<string>) null
                              });

        // Map each (author, book id) pair to a Result keyed by the book id.
        AddMap<Author>(authors => from author in authors
                                  from bookId in author.BookIds
                                  select new
                                  {
                                      Id = bookId,
                                      Title = (string) null,
                                      AuthorIds = new List<string> { author.Id }
                                  });

        // Merge all entries that describe the same book.
        Reduce = results => from result in results
                            group result by result.Id
                            into g
                            select new
                            {
                                Id = g.Key,
                                Title = g.Select(r => r.Title).Where(t => t != null).FirstOrDefault(),
                                AuthorIds = g.Where(r => r.AuthorIds != null).SelectMany(r => r.AuthorIds)
                            };
    }
}
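To see what the map and reduce steps are meant to produce, the same merge can be run on in-memory collections with plain LINQ. This is only a sketch of the logic, not a RavenDB index: the `BooksWithAuthorsSketch` class and its `Build` method are hypothetical names, and an empty list is used in place of null for the book-side map so the reduce step needs no null check.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Book
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public class Author
{
    public string Id { get; set; }
    public string Name { get; set; }
    public List<string> BookIds { get; set; } = new List<string>();
}

public class Result
{
    public string Id { get; set; }
    public string Title { get; set; }
    public List<string> AuthorIds { get; set; } = new List<string>();
}

// Hypothetical in-memory version of the multi map/reduce merge.
public static class BooksWithAuthorsSketch
{
    public static List<Result> Build(IEnumerable<Book> books, IEnumerable<Author> authors)
    {
        // "Map" 1: every book contributes its title and no authors.
        var fromBooks = books.Select(b => new Result { Id = b.Id, Title = b.Name });

        // "Map" 2: every (author, book id) pair contributes one author id.
        var fromAuthors = authors.SelectMany(
            a => a.BookIds,
            (a, bookId) => new Result { Id = bookId, AuthorIds = new List<string> { a.Id } });

        // "Reduce": merge everything known about the same book id.
        return fromBooks.Concat(fromAuthors)
            .GroupBy(r => r.Id)
            .Select(g => new Result
            {
                Id = g.Key,
                Title = g.Select(r => r.Title).FirstOrDefault(t => t != null),
                AuthorIds = g.SelectMany(r => r.AuthorIds).ToList()
            })
            .ToList();
    }

    public static void Main()
    {
        var books = new[] { new Book { Id = "books/1", Name = "Shared Book" } };
        var authors = new[]
        {
            new Author { Id = "authors/1", BookIds = { "books/1" } },
            new Author { Id = "authors/2", BookIds = { "books/1" } }
        };

        foreach (var r in Build(books, authors))
        {
            Console.WriteLine($"{r.Id} \"{r.Title}\" by {string.Join(", ", r.AuthorIds)}");
        }
    }
}
```

The same grouping would also serve the Location/Task case from the question: map each Task to its LocationId and reduce by that id, instead of issuing one query per location.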
I have the following entity collections in RavenDB:
public class EntityA
{
public string Id { get; set; }
public string Name { get; set; }
public string[] Tags { get; set; }
}
public class EntityB
{
public string Id { get; set; }
public string Name { get; set; }
public string[] Tags { get; set; }
}
The only thing shared is the Tags collection: a tag of EntityA may exist in EntityB, so that they may intersect.
How can I retrieve every EntityA that has intersecting tags with EntityB where the Name property of EntityB is equal to a given value?
Well, this is a difficult one. To do it right, you would need two levels of reducing - one by the tag which would expand out your results, and another by the id to collapse it back. Raven doesn't have an easy way to do this.
You can fake it out though using a Transform. The only problem is that you will have skipped items in your result set, so make sure you know how to deal with those.
public class TestIndex : AbstractMultiMapIndexCreationTask<TestIndex.Result>
{
public class Result
{
public string[] Ids { get; set; }
public string Name { get; set; }
public string Tag { get; set; }
}
public TestIndex()
{
AddMap<EntityA>(entities => from a in entities
from tag in a.Tags.DefaultIfEmpty("_")
select new
{
Ids = new[] { a.Id },
Name = (string) null,
Tag = tag
});
AddMap<EntityB>(entities => from b in entities
from tag in b.Tags
select new
{
Ids = new string[0],
b.Name,
Tag = tag
});
Reduce = results => from result in results
group result by result.Tag
into g
select new
{
Ids = g.SelectMany(x => x.Ids),
g.First(x => x.Name != null).Name,
Tag = g.Key
};
TransformResults = (database, results) =>
results.SelectMany(x => x.Ids)
.Distinct()
.Select(x => database.Load<EntityA>(x));
}
}
See also the full unit test here.
There is another approach, but I haven't tested it yet. That would be to use the Indexed Properties Bundle to do the first pass, and then map those results for the second pass. I am experimenting with this in general, and if it works, I will update this answer with the results.
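When testing an index like this, it can help to have the expected result computed independently. The following in-memory sketch does that with plain LINQ; it is not a RavenDB index and makes no claim about how Raven evaluates the query. The `TagIntersection` class and its `Find` method are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class EntityA
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string[] Tags { get; set; } = new string[0];
}

public class EntityB
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string[] Tags { get; set; } = new string[0];
}

// Hypothetical reference implementation of the requested query.
public static class TagIntersection
{
    // Returns every EntityA sharing at least one tag with an EntityB named bName.
    public static List<EntityA> Find(
        IEnumerable<EntityA> allA, IEnumerable<EntityB> allB, string bName)
    {
        // First pass: collect the tags of all matching EntityB documents.
        var tags = new HashSet<string>(
            allB.Where(b => b.Name == bName).SelectMany(b => b.Tags));

        // Second pass: keep each EntityA once if any of its tags is in that set.
        return allA.Where(a => a.Tags.Any(t => tags.Contains(t))).ToList();
    }

    public static void Main()
    {
        var allA = new[]
        {
            new EntityA { Id = "a/1", Tags = new[] { "red", "blue" } },
            new EntityA { Id = "a/2", Tags = new[] { "green" } }
        };
        var allB = new[]
        {
            new EntityB { Id = "b/1", Name = "target", Tags = new[] { "blue", "yellow" } }
        };

        foreach (var a in Find(allA, allB, "target"))
        {
            Console.WriteLine(a.Id);
        }
    }
}
```

The two passes correspond to the two levels described above: expanding by tag, then collapsing back to distinct EntityA documents.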
I am getting a primary key violation error when I attempt to add an item with a many-to-many relationship:
I have two classes - Articles and Tags which have a many-to-many relationship :
public class Article
{
public int ID { get; set; }
public string Text { get; set; }
public ICollection<Tag> Tags { get; set; }
}
public class Tag
{
[Key]
public string UrlSlug { get; set; }
public string Name { get; set; }
public ICollection<Article> Articles{ get; set; }
}
When I add a new Article I allow the user to input any Tags and then I want to create a new Tag if the Tag isn't created yet in the database or add the Tag to the Tags collection of the Article object if the Tag already exists.
Therefore when I am creating the new Article object I call the below function:
public static Tag GetOrLoadTag(String tagStr)
{
string tagUrl = Tag.CreateTagUrl(tagStr);
var db = new SnippetContext();
var tagFromDb = from tagdummy in db.Tags.Include(x => x.Articles)
where tagdummy.UrlSlug == tagUrl
select tagdummy;
if (tagFromDb.FirstOrDefault() != null)
{ return tagFromDb.FirstOrDefault(); }
else
{
//create and send back a new Tag
}
}
This function basically checks if there is an available Tag in the database and if so returns that Tag which is then added to the Tag collection of the Article object using article.Tags.Add().
However, when I attempt to save this using the below code I get a Violation of PRIMARY KEY constraint error
db.Entry(article).State = EntityState.Modified;
db.SaveChanges();
I can't figure out how I should go about just creating a relationship between the Article and the already existing Tag.
Use the same context instance for the whole processing of your operation and your life will be much easier:
using (var ctx = new MyContext())
{
    Article article = ctx.Articles.Single(a => a.ID == articleId);
    Tag tag = ctx.Tags.SingleOrDefault(t => t.UrlSlug == tagUrl);
    if (tag == null)
    {
        tag = new Tag() { ... };
        // DbSet.Add for a DbContext; the older ObjectContext API uses AddObject instead.
        ctx.Tags.Add(tag);
    }
    article.Tags.Add(tag);
    ctx.SaveChanges();
}
If you don't want to load the article from database (that query is redundant if you know that article exists) you can use:
using (var ctx = new MyContext())
{
    // Attach a stub instead of loading the article from the database.
    Article article = new Article() { ID = articleId };
    ctx.Articles.Attach(article);
    Tag tag = ctx.Tags.SingleOrDefault(t => t.UrlSlug == tagUrl);
    if (tag == null)
    {
        tag = new Tag() { ... };
        // DbSet.Add for a DbContext; the older ObjectContext API uses AddObject instead.
        ctx.Tags.Add(tag);
    }
    article.Tags.Add(tag);
    ctx.SaveChanges();
}
How do you go about creating new tags? And how do you attach the existing or created entity to the article?
Use something like
Article a = new Article(...);
a.Tags.Add(GetOrLoadTag("some tag"));
Read this article http://thedatafarm.com/blog/data-access/inserting-many-to-many-relationships-in-ef-with-or-without-a-join-entity/