I currently access a Web API endpoint serving up hierarchical objects (complex deals) using JSON/BSON. The objects are translated from entity framework objects stored as standard normalised data in a SQL Server database. This all works well.
However, as the number of these objects grows it becomes increasingly inefficient to serialise/deserialise them all across the wire before filtering out those required at the client. Having methods for all objects or object-by-id is fine, but there are often more complex filtering criteria which would require a myriad of different method signatures to capture fully. In an ideal world it would be possible to send a Func<Deal,bool> to the Deals endpoint, and this would provide the filtering mechanism from the client side to be enacted server-side. The premise is that different users will be interested in deals based on varying facets.
This may be mad, but is there any way that something along these lines can be achieved?
I do this by passing a "SearchCriteria" object to the search endpoint and then performing filtering at the server based on the values set in the various criteria properties. However, we do have a fairly well defined list of criteria and performing the filtering isn't too bad.
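For illustration, a minimal sketch of that pattern (the `DealSearchCriteria` and `Deal` shapes here are hypothetical): only the criteria that are actually set add a clause, and against an EF-backed IQueryable the composed query runs in the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical criteria DTO posted to the search endpoint.
public class DealSearchCriteria
{
    public string Client { get; set; }          // null = not filtered
    public decimal? MinValue { get; set; }      // null = not filtered
    public DateTime? CreatedAfter { get; set; } // null = not filtered
}

public class Deal
{
    public int Id { get; set; }
    public string Client { get; set; }
    public decimal Value { get; set; }
    public DateTime Created { get; set; }
}

public static class DealSearch
{
    // Each criterion that is set adds a Where clause; against an
    // EF IQueryable the composed query executes in the database.
    public static IQueryable<Deal> Apply(IQueryable<Deal> deals, DealSearchCriteria c)
    {
        if (!string.IsNullOrEmpty(c.Client))
            deals = deals.Where(d => d.Client == c.Client);
        if (c.MinValue.HasValue)
            deals = deals.Where(d => d.Value >= c.MinValue.Value);
        if (c.CreatedAfter.HasValue)
            deals = deals.Where(d => d.Created > c.CreatedAfter.Value);
        return deals;
    }
}
```

An action can then accept the criteria object from the request body and return the result of `Apply` over its data source.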
Alternatively, I've not used OData, but from what I understand this might be what you are looking for. If I was pondering this again I would investigate this.
https://learn.microsoft.com/en-us/aspnet/web-api/overview/odata-support-in-aspnet-web-api/odata-v4/create-an-odata-v4-endpoint
I'm reviewing OData again as I would like to use it in a new Rest project with EF but I have the same concerns I had a few years back.
Exposing a general IQueryable can be quite dangerous. Restricting potentially expensive queries must be done elsewhere. DB, connection level.
OData doesn't allow any interception/customisation of the behaviour by developers as it sits outside the interface.
OData doesn't play well with DI in general. While it is possible to DI an alternative IQueryable you can't intercept the OD calls and check, amend or modify.
My suggestion is that the tool be broken down into more distinct elements to allow far greater customisation and re-use: break open the black box :) It would also be better in terms of single responsibility. Would it be possible to have components that do the following?
Expression generators from URLs. Converts OData URL extensions into typed expressions usable with an IQueryable but independent of it; for example, generating an Expression<Func<T, bool>> for a where clause. This would be a super useful stand-alone component and would support OData URL formats being used more widely as a standard.
An EF adaptor to attach the expressions to an EF context, or to use in any other DI'ed code. Rather than exposing a public IQueryable, the service can encapsulate an interface and still get the benefits of OData functionality. REST GET -> expression generation -> map to IQueryable.
This approach would allow developers to intercept the query calls and customise the behaviour if required while maintaining the ease of use for simple cases. We could embed OData and EF within repository patterns where we add our own functionality.
There is a lot of misunderstanding in your post; it's not really well suited to this site, but it is a recurring line of speculation that does need to be addressed.
OData doesn't play well with DI in general. While it is possible to DI an alternative IQueryable you can't intercept the OD calls and check, amend or modify.
This statement is simply not accurate, on either the DI topic or query interception. Going into detail is too far out of scope, as there are many different ways to achieve this; it would be better to post a specific scenario that you are challenged by, and we can post a specific solution.
Exposing a general IQueryable can be quite dangerous. Restricting potentially expensive queries must be done elsewhere. DB, connection level.
Exposing raw IQueryable as a concept has some inherent dangers if you do not put in any restrictions, but in OData we are not exposing the IQueryable to the public at all, not in the traditional SDK or direct API sense. Yes, your controller method can (and should) return an IQueryable, but OData parses the path and query from the incoming HTTP request to compose the final query to serve the request, without pre-loading data into memory.
The inherent risk with IQueryable comes from when you allow external logic to compose or execute a query that is attached to your internal data context, but in OData the HTTP Host boundary prevents external operators from interacting with your query or code directly, so this risk is not present due to the hosting model.
OData gives you granularity over which fields are available for projecting, filtering or sorting, and although there is rich support for composing extended queries including functions and aggregates, the IQueryable expression itself does not pass the boundary of the executable interface. The IQueryable method response is itself fundamental to many of the features that drive us to choose OData in the first place.
However, you do not need to expose IQueryable at all if you really do not want to! You can return IEnumerable instead, but by doing so you will need to load enough data into memory to satisfy the query request, if you want to fulfil it. There are extension points to help you do this, as well as tools to parse the URL query parameters into simple strings or an expression tree that you can apply to your own data models if you need to.
The EnableQueryAttribute is an Action Filter that will compose a LINQ query over the results from your controller endpoints to apply any $filter criteria or $select/$expand projections or even $apply aggregations.
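Conceptually (a simplified stand-in, not the library's actual implementation), the composition the attribute performs over the action's result looks like this:

```csharp
using System.Linq;

public class Product
{
    public int Id { get; set; }
    public string Description { get; set; }
}

public static class QueryComposition
{
    // Stand-in for what an action filter like [EnableQuery] does:
    // it takes the IQueryable the controller action returned and
    // composes the parsed query options onto it, so the provider
    // sees one combined query before anything executes.
    public static IQueryable<string> ApplyOptions(IQueryable<Product> fromController, int key)
    {
        return fromController
            .Where(p => p.Id == key)      // key segment, e.g. products(101)
            .Select(p => p.Description);  // $select=Description
    }
}
```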
OData doesn't allow any interception/customisation of the behaviour by developers as it sits outside the interface.
EnableQueryAttribute is about as close to a Black Box as you can find in OData, but the OData Libraries are completely open source and you can extend or override the implementation or omit the attribute altogether. If you do so (omit it), you will then need to process and format the response to be OData compliant. The specification allows for a high degree of flexibility, the major caveat is that you need to make sure the $metadata document describes the inputs and outputs.
The very nature of the ASP request processing pipeline means that we can inject all sorts of middleware implementations at many different points; we can even implement our own custom query options, or pass the query through the request body if we need to.
If your endpoints do NOT return IQueryable, then the LINQ composition in the EnableQueryAttribute can only operate over the data that is in the IEnumerable feed. A simple example of the implication of this is if the URL Query includes a $select parameter for a single field, something like this:
http://my.service.net/api/products(101)?$select=Description
If you are only exposing IEnumerable, then you must manually load the data from the underlying store. You can use the ODataQueryOptions class to access the OData arguments through a structured interface; the specific syntax will vary based on your DAL, ORM and the actual model, of course. However, like most Repository or MVC implementations, many implementations that do not use IQueryable will default to simply loading the entire object into memory instead of only the specifically requested fields; they might end up loading the results of this comparative SQL query:
SELECT * FROM Product WHERE Id = @Id
If this Product has 20 fields, then all of that data will be materialised into memory to service the request, even though only 1 field was requested. Even without using IQueryable, OData still has significant benefits here by reducing the bytes sent across the wire to the client application. This reduces costs, but also the time it takes to fulfil a request.
By comparison, if the controller method returned an IQueryable expression that had been deferred or not yet materialised, then the final SQL that gets executed could be something much more specific:
SELECT Description FROM Product WHERE Id = @Id
This can have significant performance benefits, not just in the SQL execution but in the transport between the data store and the service layer as well as the serialization of the data that is received.
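The deferral itself is easy to demonstrate in plain LINQ to Objects; the counter below is a stand-in for rows materialised from the store:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    public static int Loads; // counts how many rows were "materialised"

    // Simulates a data source where each yielded row costs a load.
    public static IEnumerable<string> RowSource()
    {
        foreach (var desc in new[] { "A", "B", "C" })
        {
            Loads++;
            yield return desc;
        }
    }
}
```

Composing `RowSource().Where(...)` loads nothing; only enumerating the result does. With a translating provider, that same deferral is what lets the filter and projection become part of the SQL rather than run after the load.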
Serialization is often taken for granted as a necessary aspect of API development, but that doesn't mean there is no room to improve the process. In the cloud age where we pay for individual CPU cycles there is a lot of wasted processing that we can reclaim by only loading the information that we need, when we need it.
To fully realise the performance gains requires selective data calls from the client. If the end client makes a call that explicitly requests all fields, then there should be no difference between OData and a traditional API approach, but with OData the potential is there to be realised.
If the controller is exposing a complex view, so not a traditional table, then there is even more significance in supporting IQueryable. For custom business DTOs (views) that do not match the underlying storage model we are often forced to compromise between performance practicalities and data structures. Without OData's ability to let the caller trim the data schema, it is common for APIs to either implement some fully dynamic endpoints, or to see a sprawl of similar DTO models that have restricted scope or are potentially single purpose. OData provides a mechanism to expose a single common view that has more metadata than all callers need, while still allowing individual callers to retrieve only the sub-set that they need.
In aggregate views you can end up with individual columns adding significant impact to the overall query execution. In traditional REST APIs this becomes a common justification for having multiple similar DTO models; with OData we can define the view once and give callers the flexibility to choose when the extra data, which comes with a longer response wait time, should be queried and when it should not.
OData provides a way to balance between being 100% generic with your DTOs or resorting to single use DTOs.
The flexibility provided by OData can significantly reduce the overall time to market by reducing the iterative evolution of views and complex types that often comes up as front-end development teams start to consume your services. The nature of IQueryable and the conventions offered by the OData standard mean that front-end work can potentially begin before the API is fully implemented.
This was a very simple and contrived example; we haven't yet covered $expand or $apply, which can lead to very memory-intensive operations to support. I will, however, quickly mention $count: a seemingly simple requirement to return a count of all records for specific criteria, or for no criteria at all. An OData IQueryable implementation requires no additional code and almost zero processing to service this request, as it can be passed entirely to the underlying data store in the form of a SELECT COUNT(*) FROM...
With OData and the OData Libraries, we get a lot of functionality and flexibility OOTB, but the default functionality is just the start, you can extend your controllers with additional Functions and Actions and views as you need to.
Regarding the Dangers of IQueryable...
A key argument against exposing IQueryable from the DbContext is that it might allow callers to access more of your database than you might have intended. OData has a number of protections against this. The first is that for each field in the entire schema you can specify if the field is available at all, can be filtered, or can be sorted.
The next level of protection is that for each endpoint we can specify the overall expansion depth; by default this is 2.
It is worth mentioning that it is not necessary to expose your data model directly through OData, if your domain model is not in-line with your data model, it may be practical to only expose selected views or DTOs through the OData API, or only a sub-set of tables in your schema.
For more discussion on DTOs and over/under-posting protection, have a read over this post: How to deal with overposting/underposting when your setup is OData with Entity framework
Opening the Black Box
Expression generators from URLs. Converts OData URL extensions into typed expressions usable with an IQueryable but independent of it; for example, generating an Expression<Func<T, bool>> for a where clause.
This is a problematic concept if you're not open to IQueryable. That being said, you can use open types and have a completely dynamic schema that you validate in real time, or that is derived from the query routes entirely without validation. There is not a lot of published documentation on this, mainly because the scenarios where you want to implement it are highly specific, but it's not hard to sort out. While out of scope for this post, if you post a question to SO with a specific scenario in mind, we can post specific implementation advice...
An EF adaptor to attach the expressions to an EF context, or to use in any other DI'ed code. Rather than exposing a public IQueryable, the service can encapsulate an interface and still get the benefits of OData functionality. REST GET -> expression generation -> map to IQueryable.
What you are describing is pretty close to how the OData Context works. To configure OData, you need to specify the structure of the Entities that the OData Model exposes. There are convention-based mappers provided OOTB that can help you to expose an OData model that is close to a 1:1 representation of an Entity Framework DbContext model with minimal code, but OData is not dependent on EF at all. The only requirement is that you define the DTO models, including the actions and functions; from this model the OData runtime is able to validate and parse the incoming HTTP request into queryable expressions composed from the base expressions that your controllers provide.
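For illustration only, a configuration sketch (assuming the Microsoft.AspNetCore.OData package; `ProductDto` and `services` stand in for your own DTO type and the app's service collection):

```csharp
using Microsoft.AspNetCore.OData;
using Microsoft.OData.ModelBuilder;

// Convention-based model: the DTOs define the OData entity model;
// no Entity Framework types appear anywhere in the configuration.
var builder = new ODataConventionModelBuilder();
builder.EntitySet<ProductDto>("Products");

// Enable only the query options you want to allow.
services.AddControllers().AddOData(options => options
    .AddRouteComponents("odata", builder.GetEdmModel())
    .Select().Filter().OrderBy().Count());
```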
I don't recommend it, but I have seen many implementations that use AutoMapper to map between the EF Model to DTOs, and then the DTOs are mapped to the OData Entity model. The OData Model is itself an ORM that maps between your internal model and the model that you want to expose through the API. If this model is a significantly different structure or involves different relationships, then AutoMapper can be justified.
You don't have to implement the whole OData runtime, including the OData Entity Model configuration and inheriting from ODataController, if you don't want to.
The usual approach when you want to Support OData Query Options in ASP.NET Web API 2 without fully implementing the OData API is to use the EnableQueryAttribute in your standard API, it is after all just an Action Filter... and an example of how the OData libraries are already packaged in a way that you can implement OData query conventions within other API patterns.
OData Url "$filter=id eq 1"
becomes:
Func<TestModel, bool> filterLambda = x => x.id == 1;
where TestModel is 'implied' in some way (maybe this is the problem you refer to)
in terms of generating the code, it's something like:
var param = Expression.Parameter(typeof(TestModel), "x");
Expression.Lambda<Func<TestModel, bool>>(Expression.Equal(Expression.PropertyOrField(param, "id"), Expression.Constant(1)), param)
generalised into an OData expression parser
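As a stand-alone sketch of that generalisation (a toy parser handling only the `property eq <integer>` case; a real OData parser covers a full grammar):

```csharp
using System;
using System.Linq.Expressions;

public class TestModel
{
    public int id { get; set; }
}

public static class SimpleODataFilter
{
    // Toy parser: turns "<property> eq <integer>" into a typed
    // expression tree, independent of any IQueryable or data context.
    public static Expression<Func<T, bool>> Parse<T>(string filter)
    {
        var parts = filter.Split(' ');
        if (parts.Length != 3 || parts[1] != "eq")
            throw new FormatException("Only '<property> eq <value>' is supported.");

        var param = Expression.Parameter(typeof(T), "x");
        var body = Expression.Equal(
            Expression.PropertyOrField(param, parts[0]),
            Expression.Constant(int.Parse(parts[2])));
        return Expression.Lambda<Func<T, bool>>(body, param);
    }
}
```

The resulting expression can be handed to any IQueryable.Where, EF included, without the parser ever touching the data context.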
Thanks for the reply. Your time is much appreciated. This isn't really an answer rather a design discussion. Apologies if it's not appropriate here.
I'm no expert with this tech, so I may also be missing options. Manually exposing IEnumerables instead of IQueryable seems to require much more coding, if I'm reading that suggestion correctly. It would also lead to service-side processing of the data after the database queries. The idea of custom actions on a custom IQueryable may also be worth some investigation.
An example of not playing well with DI: if we have an HTTP context user whose token imposes some limits on the queries they can run, we would like to take the user info from the HTTP context and restrict the db context queries accordingly. Maybe the user can only see certain business units or certain clients. There are many other use-cases.
It should be possible to append/amend queries with the extra user context before they are presented to the database. This is where the decomposition idea comes in. If OData could generate a structure (a lambda expression may not be good here either) that can be manipulated, we could have the best of both by manipulating before query execution.
The $filter and $expand concepts could be added more generally to interfaces that would allow the database to be better encapsulated. The interface applies the filter behind the scenes. The OData implementation could make the filters available on the context rather than applying them to the results 'outside' of the controller.
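A hedged sketch of that interception point (all names here are illustrative): the repository receives the user context via DI and pre-filters the IQueryable before any OData options are composed on top, so callers can never widen the scope.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Deal
{
    public int Id { get; set; }
    public int BusinessUnitId { get; set; }
}

// Stand-in for whatever user info you pull from the HttpContext/token.
public class UserContext
{
    public HashSet<int> AllowedBusinessUnits { get; set; }
}

public class DealRepository
{
    private readonly IQueryable<Deal> _deals;
    private readonly UserContext _user;

    // Both dependencies would normally arrive via DI.
    public DealRepository(IQueryable<Deal> deals, UserContext user)
    {
        _deals = deals;
        _user = user;
    }

    // Every query is pre-filtered by the caller's permissions before
    // any external query options are composed on top of it.
    public IQueryable<Deal> Query() =>
        _deals.Where(d => _user.AllowedBusinessUnits.Contains(d.BusinessUnitId));
}
```

An [EnableQuery] action can then return `repository.Query()` instead of a raw DbSet.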
It would be interesting to know what you mean by "a problematic concept". This model seems so natural to me. I'm amazed it doesn't work this way already.
I have an application with several Web API controllers, and now I have a requirement to be able to filter GET results by object properties. I've been looking at using OData, but I'm not sure if it's a good fit, for a couple of reasons:
The Web API controller does not have direct access to the DataContext, instead it gets data from our database through our "domain" layer so it has no visibility into our Entity Framework models.
Tying into the first item, the Web API deals with lightweight DTO model objects which are produced in the domain layer. This is effectively what hides the EF models. The issue here is that I want these queries to be executed in our database, but by the time the Web API method gets a collection from the domain layer, all of the objects in the collection have been mapped to these DTO objects, so I don't see how the OData filter could possibly do its job when the objects are once-removed from EF in this way.
This item may be the most important one: We don't really want to allow arbitrary querying against our Web API/Database. We just sort of want to leverage this OData library to avoid writing our own filters, and filter parsers/builders for every type of object that could be returned by one of our Web API endpoints.
Am I on the wrong track based on #3? If not, would we be able to use this OData library without significant refactoring to how our Web API and our EF interact?
I haven't had experience with OData, but from what I can see it's designed to be fed a Context and manages the interaction and returning of those models. I am definitely not a fan of returning Entities in any form to a client.
It's an ugly situation to be in, but when faced with this, my first course of action is to push back to the clients to justify their searching needs. The default request is almost always "Well, it would be nice to be able to search against everything." My answer to that is that I don't want to know what you want, I want to know what you need, because I don't want to give you a loaded gun to shoot your own foot off with and then have you blame me because the system came grinding to a halt.

Searching is a huge performance killer if it's too open-ended. It's hard to test for accuracy/relevance, and to efficiently index for 100% of possible search cases, when users only need 25% of those scenarios. If the client cannot tell you what searching they will need, and they just want everything because they might need it, then they don't need it yet.
Personally I stick to specific search DTOs and translate those into the linq expressions.
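As a sketch of that translation step (illustrative, in the spirit of the well-known PredicateBuilder pattern): each populated criterion becomes a small lambda, and an ExpressionVisitor rebinds parameters so the combined tree remains translatable by a LINQ provider.

```csharp
using System;
using System.Linq.Expressions;

public static class PredicateBuilder
{
    // Combines two predicates into one expression tree by rebinding
    // the second lambda's parameter onto the first's, so the result
    // is a single translatable Expression<Func<T, bool>>.
    public static Expression<Func<T, bool>> AndAlso<T>(
        Expression<Func<T, bool>> left, Expression<Func<T, bool>> right)
    {
        var param = left.Parameters[0];
        var rightBody = new ReplaceParameter(right.Parameters[0], param)
            .Visit(right.Body);
        return Expression.Lambda<Func<T, bool>>(
            Expression.AndAlso(left.Body, rightBody), param);
    }

    private sealed class ReplaceParameter : ExpressionVisitor
    {
        private readonly ParameterExpression _from, _to;
        public ReplaceParameter(ParameterExpression from, ParameterExpression to)
        { _from = from; _to = to; }
        protected override Expression VisitParameter(ParameterExpression node)
            => node == _from ? _to : base.VisitParameter(node);
    }
}
```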
If I was faced with a hard requirement to implement something like that, I would:
1. Try to push for these searches/reports to be done off a reporting replica that is synchronized with the live database. (To minimize the bleeding when some idiot managers fire up some wacky non-indexed search criteria, so that it doesn't tie up the production DB where people are trying to do work.)
2. Create a new bounded DbContext specific to searching, with separate entity definitions that only expose the minimum # of properties to represent search criteria and IDs.
3. Hook this bounded context into the API and OData. It will return "search results". When a user selects a search result, use the ID(s) against the API to load the applicable domain, or initiate an action, etc.
No. 1 is optional, a nice-to-have, provided they can live with searches not "seeing" updated criteria until replicated (i.e. a few seconds to minutes, depending on replication strategy/size). Normally these searches are used for reporting-type queries, so I'd push to keep them separate from the normal day-to-day searching options that users use (i.e. an advanced search option or the like).
Coming from a SQL background: when I want to limit access to data based upon certain attributes of a user, I can create a view and use it as a filter, limiting what data a user sees based upon the criteria in the view. This approach relies upon relationships, and so far it has worked for me. Looking at NoSQL and the shift in strategy and concept, I am confused about how to implement this considering the nature of NoSQL. What is the NoSQL approach to a problem such as this, when users are only privy to certain rows based upon their user type? For example, say an administrator can see all of the records for a particular group, while a generic user can only see their own records and certain group-level items (group photos, group messaging, etc.) that are public within a group. I am really trying to wrap my head around not thinking in terms of the SQL approach to this problem, but I am new to NoSQL, so that has been a challenge.
NoSQL databases are conceptually different from relational databases in many aspects. Authorization and security in general is not their primary focus. But, most of them have evolved in that area, and they have fine-grained authorization. Basically, it depends on a particular database.
For example, Cassandra has column-level permissions planned (https://issues.apache.org/jira/browse/CASSANDRA-12859), and HBase has cell-level permissions (https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cdh_sg_hbase_authorization.html). On the other hand, MongoDB is generally schemaless and has a different (more complex) document-oriented data model, which makes it hard to implement low-level access control. Additionally, MongoDB has views.
So, if the DBMS you are using doesn't have built-in authorization at the expected level, it has to be implemented inside the application that interacts with the db (if there is more than one application, things can get tricky, and some usage rules have to be established). Using denormalised models is a common approach, so that different roles/groups interact with different tables/collections containing only the data that role/group can see (basically, a simulation of RDBMS views). This denormalised approach usually takes more space and requires keeping copies in sync. If the DBMS supports projection, a subset of columns/fields can be presented to different roles/groups (that way, at least some of the processing is done on the db side).
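As a sketch of the application-side enforcement (the `GroupItem` shape and rules are illustrative, matching the group/photos example above):

```csharp
using System.Collections.Generic;
using System.Linq;

public class GroupItem
{
    public string OwnerId { get; set; }
    public bool IsGroupPublic { get; set; } // group photos, messages, etc.
}

public static class GroupItemAccess
{
    // The visibility rule lives in the application, not the database:
    // admins see everything in the group; generic users see their own
    // records plus group-public items.
    public static IEnumerable<GroupItem> VisibleTo(
        IEnumerable<GroupItem> items, string userId, bool isAdmin)
    {
        return isAdmin
            ? items
            : items.Where(i => i.OwnerId == userId || i.IsGroupPublic);
    }
}
```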
I hope this helps, though it's a late answer.
Understanding that passing DataSets through web services is a bit heavy (and almost, if not completely, unconsumable to non-.NET clients), what is the best way to prep database query results that don't map to known types for transport through web services in C#?
Return your results as collections of Data Transfer Objects. These would be simple objects with nothing but properties. There would be one property for each "column" of your result.
I don't know what you mean about passing the query. That's not normally done. You might pass criteria for the query, but not the query itself.
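A minimal sketch of the DTO mapping (the `ProductDto` shape and column names are hypothetical):

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Simple DTO: one property per "column" of the result.
public class ProductDto
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class ProductMapper
{
    // Maps an ad-hoc query result into a plain, serialisable list
    // that any client (not just .NET) can consume.
    public static List<ProductDto> FromTable(DataTable table) =>
        table.Rows.Cast<DataRow>()
             .Select(r => new ProductDto
             {
                 Id = (int)r["Id"],
                 Name = (string)r["Name"],
             })
             .ToList();
}
```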
I know this is an age-old question, but this is my scenario.
This is in C# 2.0.
I have a Windows application which has a DataGridView control. This needs to be populated by making a web service call.
I want to achieve the same functionality on the data as if I were using a direct connection and DataSets, namely paging and applying filters to the returned data. I know returning DataSets is a bad idea and am looking to find a good solution.
I might look at ADO.NET Data Services, aka Astoria, in VS2008 SP1.
This allows you to expose data over a web-service (WCF exposing ATOM, IIRC) - but you don't need to know all these details: the tooling worries about that for you: you just get regular IQueryable<T> sources on a data-context (not quite the same as the LINQ-to-SQL data-context, but same concept).
The good thing here is that a LINQ query (such as filtering (Where), paging (Skip/Take) etc) can get composed all the way from the client, through the web-service, and down to the LINQ-enabled data store (LINQ-to-SQL or Entity Framework, etc). So only the right data comes over the wire: if you ask for the first 10 rows (of 20000) ordered by Name, then that is what you get: 10 rows out of the database; 10 rows over the wire, no messing.
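A sketch of that composition in plain LINQ (the helper is illustrative; with a LINQ-enabled provider the ordering and paging translate down to SQL):

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class Paging
{
    // The same composition that flows from client to store: order
    // first, then page, all before the provider executes anything,
    // so only the requested rows come back.
    public static IQueryable<T> Page<T, TKey>(
        IQueryable<T> source,
        Expression<Func<T, TKey>> orderBy,
        int pageIndex,
        int pageSize)
    {
        return source.OrderBy(orderBy)
                     .Skip(pageIndex * pageSize)
                     .Take(pageSize);
    }
}
```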
Write a custom class (MyDataItem) that'll hold your data. Then you can pass a List<MyDataItem> or some collection of MyDataItem and bind to your grid.
Paging, filtering, etc. would need to be implemented by you.
There is no way to get the binding behavior you automatically get with a DataSet if you are going through a Web Services data layer. You would have to create your own proxy class that supports all the databinding functions and persists them through your web service calls. Depending on your application's environment, you may want to batch up modifications to avoid excess round trips to the web services.
fallen888 has it right - you will need to create a collection class of List or a DataTable, fill it with the output from the web service data stream, and handle paging and filtering yourself.