Coming from a SQL background: when I want to limit access to data based on certain attributes of a user, I can create a view and use it as a filter, limiting what data a user sees according to the criteria in the view. This approach relies on relationships, and so far it has worked for me. Looking at NoSQL and the shift in strategy and concepts, I am confused about how to implement this, considering the nature of NoSQL.

What is the NoSQL approach to a problem like this, where users are only privy to certain rows based on their user type? For example, say an administrator can see all of the records for a particular group, while a generic user can only see their own records plus certain group-level items (group photos, group messaging, etc.) that are public within the group. I am really trying not to think in terms of the SQL approach to this problem, but I am new to NoSQL, so that has been a challenge.
NoSQL databases are conceptually different from relational databases in many respects. Authorization, and security in general, was not their primary focus, but most of them have evolved in that area and now offer fine-grained authorization. Ultimately, it depends on the particular database.
For example, Cassandra has column-level permissions planned (https://issues.apache.org/jira/browse/CASSANDRA-12859), and HBase has cell-level permissions (https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cdh_sg_hbase_authorization.html). On the other hand, MongoDB is generally schemaless and has a different (more complex) document-oriented data model, which makes it hard to implement low-level access control. MongoDB does, however, have views.
So, if the DBMS you are using doesn't have built-in authorization at the expected level, it has to be implemented inside the application that interacts with the database (if more than one application is involved, things can get tricky, and some usage rules have to be established). Using denormalized models is a common approach: different roles/groups interact with different tables/collections containing only the data that role/group can see (basically, a simulation of RDBMS views). This denormalized approach usually takes more space and requires keeping the copies in sync. If the DBMS supports projection, a subset of columns/fields can be exposed to different roles/groups (that way, at least some of the processing is done on the database side).
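For example, if the application layer has to enforce this, a minimal C# sketch using the official MongoDB .NET driver might look like the following. The collection name and the field names (groupId, ownerId, isPublic) are assumptions, not part of any standard schema:

```csharp
using System.Collections.Generic;
using MongoDB.Bson;
using MongoDB.Driver;

// Application-side "view": the service decides what a user may see,
// based on role, before any data reaches the caller.
public class RecordService
{
    private readonly IMongoCollection<BsonDocument> _records;

    public RecordService(IMongoDatabase database)
    {
        // "records" is an assumed collection name.
        _records = database.GetCollection<BsonDocument>("records");
    }

    public List<BsonDocument> GetVisibleRecords(string userId, string groupId, bool isAdmin)
    {
        var f = Builders<BsonDocument>.Filter;

        // Admins see every record in the group; generic users see their own
        // records plus items flagged public within the group.
        var filter = isAdmin
            ? f.Eq("groupId", groupId)
            : f.Eq("groupId", groupId) &
              (f.Eq("ownerId", userId) | f.Eq("isPublic", true));

        // Projection trims fields server-side, much like a view's column list.
        var projection = Builders<BsonDocument>.Projection
            .Include("title")
            .Include("createdAt")
            .Exclude("_id");

        return _records.Find(filter).Project<BsonDocument>(projection).ToList();
    }
}
```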
I hope this helps, though it's a late answer.
Disclaimer: what you are asking appears to venture into a use case that runs contrary to a foundational design element of Acumatica ERP: the isolation of tenants within a single database. While I have a similar use case in mind in which I need to do the same, any of these suggestions should be weighed seriously against your business requirements and operating constraints.
After discussion with some Acumatica Developer MVPs and Acumatica staff: there is no SUPPORTED method for doing this directly with tables that already contain a CompanyID field. However, here are a few possible ways to achieve your desired results.
Create alternate SQL tables / Acumatica DACs without CompanyID and CompanyMask so that the data can be shared. This requires keeping the supplemental table data in sync.
UNSUPPORTED (this will fail an ISV certification): create a SQL view that excludes CompanyID and CompanyMask, plus an Acumatica DAC for that view. This allows visibility into the data, regardless of tenant, without having to create or maintain duplicate data. Again, I was strongly advised against this approach in the same breath as being told it was possible.
Use an alternative tool to access the data in the SQL database, if you have direct access to the database. This tool would have to access the database completely independently of PXData, so that its queries do not automatically receive the "CompanyID = X" predicate in the WHERE clause (see the sketch below).
It was mentioned that you might do something with impersonating other users, but I don't have any experience with that. I'm not sure whether it would allow cross-tenant data access at all, and even if it does, I doubt it would allow viewing more than one tenant at a time.
My understanding is that the first option is the preferred way to stay within Acumatica in a supported manner. The third option could be valid if you already use other analytics tools such as Cognos.
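As a minimal illustration of the third option, the sketch below reads cross-tenant rows with plain ADO.NET, entirely outside the Acumatica Framework, so nothing injects a CompanyID filter. The connection string is a placeholder, and the InventoryItem column names are assumptions to verify against your actual schema:

```csharp
using System;
using System.Data.SqlClient;

// Minimal sketch: query the Acumatica database directly, bypassing PXData,
// so no "CompanyID = X" predicate is added automatically.
class CrossTenantReader
{
    static void Main()
    {
        // Placeholder connection string; adapt to your environment.
        const string connectionString =
            "Server=.;Database=AcumaticaDB;Integrated Security=true;";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            "SELECT CompanyID, InventoryCD, Descr FROM InventoryItem", connection))
        {
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    // CompanyID is visible here because we wrote (or omitted)
                    // the WHERE clause ourselves.
                    Console.WriteLine($"{reader["CompanyID"]}: {reader["InventoryCD"]}");
                }
            }
        }
    }
}
```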
I have an application with several Web API controllers, and now I have a requirement to be able to filter GET results by object properties. I've been looking at using OData, but I'm not sure if it's a good fit, for a couple of reasons:
The Web API controller does not have direct access to the DataContext; instead, it gets data from our database through our "domain" layer, so it has no visibility into our Entity Framework models.
Tied to the first item, the Web API deals with lightweight DTO model objects produced in the domain layer; this is effectively what hides the EF models. The issue here is that I want these queries to be executed in our database, but by the time the Web API method gets a collection from the domain layer, all of the objects in the collection have been mapped to these DTO objects, so I don't see how the OData filter could possibly do its job when the objects are once-removed from EF in this way.
This item may be the most important one: we don't really want to allow arbitrary querying against our Web API/database. We just want to leverage the OData library to avoid writing our own filters, and filter parsers/builders, for every type of object that could be returned by one of our Web API endpoints.
Am I on the wrong track based on #3? If not, would we be able to use this OData library without significant refactoring to how our Web API and our EF interact?
I haven't had experience with OData, but from what I can see, it's designed to be fed a context, and it manages the interaction with, and returning of, those models. I am definitely not a fan of returning entities in any form to a client.
It's an ugly situation to be in, but when faced with this, my first course of action is to push back and ask the clients to justify their searching needs. The default request is almost always "Well, it would be nice to be able to search against everything." My answer to that is: I don't want to know what you want, I want to know what you need, because I don't want to hand you a loaded gun to shoot your own foot off with and then have you blame me when the system comes grinding to a halt.

Searching is a huge performance killer if it's too open-ended. It's hard to test for accuracy/relevance, and hard to index efficiently for 100% of possible search cases when users only need 25% of those scenarios. If the client cannot tell you what searching they will need, and just wants everything because they might need it, then they don't need it yet.
Personally, I stick to specific search DTOs and translate those into LINQ expressions.
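As a minimal sketch of that approach (ProductSearch, Product, and ProductDto are hypothetical names): each property of the search DTO maps to one explicit, predictable predicate, so clients get filtering without arbitrary query power.

```csharp
using System.Linq;

// Hypothetical DTOs: the API accepts only these specific criteria,
// not arbitrary query expressions.
public class ProductSearch
{
    public string NameContains { get; set; }
    public decimal? MinPrice { get; set; }
    public decimal? MaxPrice { get; set; }
}

public class ProductDto
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public class Product // stand-in for the EF entity hidden inside the domain layer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public class ProductQueryService
{
    private readonly IQueryable<Product> _products; // backed by EF in the domain layer

    public ProductQueryService(IQueryable<Product> products)
    {
        _products = products;
    }

    public ProductDto[] Search(ProductSearch criteria)
    {
        var query = _products;

        // Each DTO field translates into one explicit, index-friendly predicate,
        // so the filtering still executes in the database.
        if (!string.IsNullOrEmpty(criteria.NameContains))
            query = query.Where(p => p.Name.Contains(criteria.NameContains));
        if (criteria.MinPrice.HasValue)
            query = query.Where(p => p.Price >= criteria.MinPrice.Value);
        if (criteria.MaxPrice.HasValue)
            query = query.Where(p => p.Price <= criteria.MaxPrice.Value);

        // Project to lightweight DTOs so EF entities never cross the API boundary.
        return query
            .Select(p => new ProductDto { Id = p.Id, Name = p.Name, Price = p.Price })
            .ToArray();
    }
}
```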
If I were faced with a hard requirement to implement something like that, I would:
Try to push for these searches/reports to be done off a reporting replica that is synchronized with the live database. (This minimizes the bleeding when some idiot manager fires up wacky non-indexed search criteria, so it doesn't tie up the production DB where people are trying to do work.)
Create a new bounded DbContext specific to searching, with separate entity definitions that only expose the minimum number of properties needed to represent search criteria and IDs.
Hook this bounded context into the API and OData. It will return "search results". When a user selects a search result, use the ID(s) against the regular API to load the applicable domain object, initiate an action, etc.
No. 1 is optional, a nice-to-have, provided they can live with searches not "seeing" updated criteria until replication catches up (i.e. a few seconds to minutes, depending on replication strategy/size). Normally these searches are used for reporting-type queries, so I'd push to keep them separate from the normal day-to-day searching options that users use (i.e. an advanced search option or the like).
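A minimal sketch of the bounded search context from no. 2, in EF 6 style; the entity, table, and connection-string names are assumptions:

```csharp
using System.Data.Entity;

// Slimmed-down entity exposing only what searching needs: criteria plus an ID.
public class ProductSearchResult
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

// Bounded context used only by the search/OData endpoints; the full
// domain model lives in a separate, richer context.
public class SearchContext : DbContext
{
    // "ReportingReplica" is an assumed connection-string name.
    public SearchContext() : base("name=ReportingReplica") { }

    public DbSet<ProductSearchResult> Products { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Map the slim entity onto the existing table; columns that are
        // not mapped are simply never loaded.
        modelBuilder.Entity<ProductSearchResult>().ToTable("Products");
    }
}
```

When a result is selected, only its ID goes back through the regular API to load the full domain object or trigger an action.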
As far as I understand it, DDD helps or guides you in structuring a complex application. In an application you should identify your Bounded Contexts; say you have more than 10 BCs.
I read somewhere (forgive me, I cannot give any links) that it's not ideal to have one big database for a complex application, and that it should instead be separated per BC, if that's the easier route to take. How should one structure an app if each BC has its own database?
I tried searching on GitHub but cannot find one.
It depends on whether they only share the same database, or also share some tables, i.e. data.

Sharing a database but not tables can be perfectly fine, except if you aim for scalability and intend to make your BCs independently deployable and runnable units, like microservices, in which case they should probably have their own data store instance.
I see a few more drawbacks to database tables shared by 2 or more Bounded Contexts :
Tight coupling. The reason we have distinct BCs is that they represent different domain spaces that are likely to diverge in their own directions. Changing a concept in one of the BCs might impact the underlying table, forcing the other BCs that use this table to change as well. You get rigidity where there should be suppleness. You might also get inconsistencies or "holes" in the data due to the multiple possible sources of change.
Concurrency. In highly concurrent systems, some entities, and the tables underneath them, are subject to strong contention. Bounded Contexts are one way to lighten the load by separating different types of writes, but that only works if they don't lock the same data at the end of the day. The same is true for reads in non-CQRS systems, where queries go against the same database where the writes are done.
ORM friendliness. Most ORMs won't let you map two or more classes to the same database table without a lot of convolutions and workarounds.
How should one structure an app if each BC has its own database?
To some extent (e.g. that may or may not include the UI layer), just as if you had multiple separate applications. Please be more specific if you have precise questions in mind.
The idea of having this vertical slice per bounded context is that the relationship of each BC to every other BC, and the communication between them, should be considered and designed based on domain knowledge, not on the technical merits of a persistence technology.
If you have a Customer in two different BCs, you get a kind of actor-pattern situation. If the Support BC needs to know about a new Customer when it is created in the Sales BC, then the Sales BC needs to connect to a known interface on the Support BC and pass it this new information: one domain talking to another. It models quite closely how things work in real life, when people from different departments talk to each other.
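A minimal sketch of that kind of explicit contract between contexts might look like this (all names are illustrative):

```csharp
// Contract published by the Support BC; this is the only thing
// the Sales BC is allowed to know about Support.
public interface ISupportCustomerRegistry
{
    void RegisterCustomer(int customerId, string name, string email);
}

// Inside the Sales BC: after a customer is created, notify Support
// through the published interface instead of touching its database.
public class CustomerSignupHandler
{
    private readonly ISupportCustomerRegistry _support;

    public CustomerSignupHandler(ISupportCustomerRegistry support)
    {
        _support = support;
    }

    public void OnCustomerCreated(int customerId, string name, string email)
    {
        // Support maintains its own representation of Customer;
        // Sales never writes into Support's tables.
        _support.RegisterCustomer(customerId, name, email);
    }
}
```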
If you share a big database (you're talking bespoke enterprise software here so there won't be many examples in the wild) then the temptation is to bypass all the domain expertise that is captured in the domain layers and meddle in another BC's database. Things become a big ball of mud very quickly.
Surprisingly, I see this sort of thing all too often in the real world, and I consider it very bad practice.
It depends a little bit on why they each have their own database. The idea of a bounded context is that you have a set of entities that are related and solve a problem together. If you look at the link Chaim Eliyah provided, you can have a Sales and a Support context: http://martinfowler.com/bliki/BoundedContext.html
Now, there is no reason a product for Sales and a product for Support should look the same in a database. What is important is that if Support wants to add a property (say, "low quality"), it can do so, while Sales might not want that property. Also, downtime in your sales application should probably not affect your support application.
That said, entities don't care where they are stored. If you already have a huge product database, you can certainly build your entities for different bounded contexts on top of the same database. The thing to remember is that a database table is not the same thing as an entity. Entities are what your business/application needs; the database is just what's needed to store things.
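For illustration, here is a minimal EF 6 style sketch (hypothetical names) of two bounded contexts, each shaping its own entity over the same Product table:

```csharp
using System.Data.Entity;

// Sales cares about pricing.
public class SalesProduct
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal ListPrice { get; set; }
}

public class SalesContext : DbContext
{
    public DbSet<SalesProduct> Products { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Entity<SalesProduct>().ToTable("Product");
    }
}

// Support cares about servicing, not price.
public class SupportProduct
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int WarrantyMonths { get; set; }
}

public class SupportContext : DbContext
{
    public DbSet<SupportProduct> Products { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Same table, different shape; ownership rules still have to
        // decide who is allowed to write to it.
        modelBuilder.Entity<SupportProduct>().ToTable("Product");
    }
}
```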
That said, separate if you can. If that's not feasible, try to define ownership. You make your life a lot easier if everyone agrees that Product is the product as defined by Sales, and that Support can have a ProductFactSheet table augmenting the product. That way you avoid conflicting changes from each bounded context (a natural follow-up rule is that Support can only read products, never write them). Table prefixes might help to make this clear.
And this problem already exists with two related bounded contexts; with 10, you'll have a nightmare if multiple contexts try to write to the same table.
I'm not sure if what I'm attempting to do is simply incorrect/impossible or if there is an easier way and that I'm missing the point.
I'm using SQL Server 2012
What I would like to do is have a table that can store rows with values relating to stored properties in another table; basically, a key-value pair setup. The thing is, I would like to determine which key values can be used by which entities.
For example,
I would like one table listing various companies; another storing 'files' created for each company (used to hold historical information); another listing various production departments (stages in production); another listing production figures (KGs, units, etc.); and one capturing the actual production recorded against these figures for each month. There are also tables in place to show which production departments can use which production figures, as well as which company has which production departments.
Some of the companies have the same stages in production as well as additional stages that the others don't.
These figures are captured on a monthly basis ONLY, so I have a table describing all the months of a year.
Each production department may have similar types of recordings to be captured, though they don't all have the same production readings.
Here's a link to a graphical representation of the table layouts:
http://tinypic.com/r/30a51mx/8
My end goal is to auto-populate/update the table with newly added figures as the user enters this section of the program (by passing through the FileID), and to allow the user to edit this using a DataGridView (or at least select a value to be edited from the DataGridView).
I will then need to write reports later on that will need to pivot on this information.
Any help or suggestions would be greatly appreciated.
Thanks
For an effective DB design, it is very important to understand one major question first: should the DB design be optimized for ease of use from the application's point of view, or for efficient storage?
This is by and large decided by keeping the following factors in mind:
How much data are we going to store? We need some idea about the cost of storage, factoring in redundancy. A well-normalised DB reduces redundancy.
Your DB is normalised very well, but is that really needed? The cost of storage is quite low these days, so a slightly more redundant design should be OK, unless of course you plan to use an edition of SQL Server that caps database size (such as Express).
Is data retrieval and update fast enough? The more normalised the DB, the more JOINs you should expect. In your case, if you want to return values for multiple properties, say n, in a single result, you'd need n joins on the ProductionProperty table (a sketch of this follows at the end of this answer), which will reduce query performance and hence slow the user experience. So, if your UI is not overly demanding and your users can live with a small lag, go ahead with the normalised DB design.
ORM mismatch: the relational model and the object model (assuming your programming language follows OOP concepts) usually mismatch, and they will do so heavily in a normalised scenario like this. You'll need to spend more hours coding through or troubleshooting scenarios that may make you squirm in pain when changing either of these models. I suggest you use a good ORM framework to counter this and stay aware of the ORM mismatch scenarios.
Will you have a separate reporting DB or reporting tables? Basically, is your DB an OLTP database or a reporting database? If this DB is going to be worked on heavily by data-entry staff day in and day out, the normalised form is suitable, provided point #1 is satisfied. If, however, reporting is a major need, a de-normalised form should be preferred (which means you would not need so many separate tables).
PS: master data should be kept in tables of its own. Months is definitely master data, and so is UoM, unless you plan to do CRUD on the UoM measures too. Also note that it hardly matters whether Month is kept in a separate table, especially when the same business logic/constraints can be enforced on columns in SQL.
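To make the join-per-property cost mentioned under data retrieval concrete, here is a minimal LINQ sketch; ProductionFile, ProductionProperty, and the property names are hypothetical stand-ins for your tables:

```csharp
using System.Linq;

// Key-value style property table: one row per (file, property) pair.
public class ProductionProperty
{
    public int FileId { get; set; }
    public string PropertyName { get; set; }
    public decimal Value { get; set; }
}

public class ProductionFile
{
    public int FileId { get; set; }
    public string CompanyName { get; set; }
}

public class MonthlyFigure
{
    public string CompanyName { get; set; }
    public decimal Kg { get; set; }
    public decimal Units { get; set; }
}

public static class ProductionQueries
{
    // Returning two properties ("KG" and "Units") in one row already needs
    // two joins against the same table; each extra property adds one more.
    public static MonthlyFigure[] MonthlyFigures(
        IQueryable<ProductionFile> files,
        IQueryable<ProductionProperty> props)
    {
        return (from f in files
                join kg in props.Where(p => p.PropertyName == "KG")
                    on f.FileId equals kg.FileId      // join #1
                join units in props.Where(p => p.PropertyName == "Units")
                    on f.FileId equals units.FileId   // join #2
                select new MonthlyFigure
                {
                    CompanyName = f.CompanyName,
                    Kg = kg.Value,
                    Units = units.Value
                }).ToArray();
    }
}
```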
Almost all of the applications I write at work get their data from a central MSSQL database. This database has about 70 tables, with on average 25 or so columns per table. The database has developed over 5-10 years (I'm not entirely sure) and is full of idiosyncrasies and quirks: foreign keys are irregularly implemented when it comes to naming and so on, and table and column names mix case and language.
I am not able to restructure the database itself as it would break a ton of backwards compatibility for applications needed in the daily work of most people in the office.
I've almost exclusively been using LINQ2SQL for interacting with the database, and it works fine, but it always requires a lot of manual joining of tables, either in some DB repository or 'inline' when coding. So I've finally decided that I have to do something to ease, once and for all, the pain of working with this leviathan. This would preferably include implementing a clear naming scheme, properly joining relevant tables with foreign keys, etc.
The three routes I can see are:
Creating a number of views, stored procedures, and functions in SQL to ease my interaction with the DB. This has the obvious bonus of being usable from many languages, as opposed to a solution implemented in e.g. C#. The biggest drawback I can see is that it would probably take a lot of time to do properly, and be a bit harder to maintain a year down the road when I haven't looked at the SQL queries for a while. I would also need to implement another DB abstraction step inside my applications, as I wouldn't want to work with straight-up DB calls (abstraction upon abstraction seems bad in this case, but maybe I'm wrong?).
Continuing down my LINQ2SQL road, but creating a once-and-for-all repository class that hides all the underlying tables behind abstracted calls only. This idea seems more feasible in terms of development time, maintenance, and single-point abstraction.
Pulling off some EF4 reverse-engineering magic, using the designer to hook up the relevant foreign keys and renaming the generated table classes to fit my taste.
Any input on how this should/could be done, as well as any recommended reading you might have, would be most appreciated.
We have a very similar situation with our database. We went the EF route, but we used Code First. I know it sounds odd to use Code First when your database already exists, but due to the size and number of the tables, trying to do it all in the designer was not feasible.
You can use the "Reverse Engineer Code First" option in the Entity Framework Power Tools to generate everything you need from your database.
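To give a flavor of what you end up with (a hand-written sketch in EF 6 Code First style, not actual tool output; the legacy table and column names are invented), the pattern pairs a clean POCO with a fluent mapping class, which is exactly where you can impose a clear naming scheme over the quirky schema:

```csharp
using System.Data.Entity;
using System.Data.Entity.ModelConfiguration;

// Clean class and property names of your choosing...
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// ...mapped onto the database's quirky legacy names.
public class CustomerMap : EntityTypeConfiguration<Customer>
{
    public CustomerMap()
    {
        ToTable("tblKUNDE"); // invented mixed-language legacy table name
        HasKey(c => c.Id);
        Property(c => c.Id).HasColumnName("KundeID");
        Property(c => c.Name).HasColumnName("kunde_NAME");
    }
}

public class LegacyContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Configurations.Add(new CustomerMap());
    }
}
```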
I think a well-thought-out abstraction layer suits the needs of the application better if it is not based on the physical schema of the DB. I mean, the main goal of a DAL is to hide the tables from users, leaving them only valid "activities" through stored procedures. In most cases this will outperform direct data access, and it gives you one more degree of freedom: to play with the T-SQL code and implement additional logic/schema changes without needing to change the application.