I've got a situation where I need to expose some services to an outside entity. All the services do is:
1. take arguments from the service caller
2. query the database
3. delete the queried data from the db
4. return the queried data to the caller
I'm trying to decide whether I need to build a server application that accesses the database, or just write database procedures. Functionally, both approaches satisfy me. What I'm concerned about is security, and I don't have much experience administering a PostgreSQL database.
If I expose database procedures, how much administration can I do? Can I limit the number of queries (procedure calls) a user can issue? Can I limit the time between two queries (procedure calls) and the amount of memory a user can use?
The services would return approximately 5 MB of data and would be called a few times per hour. Even though the service user is trusted and the connection between user and server would be VPN'd, I want some kind of query rate limiting, just to be safe.
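For what it's worth, steps 1-4 above map naturally onto PostgreSQL's DELETE ... RETURNING, whichever tier it ends up in. Here is a minimal sketch from the server-application side, assuming the Npgsql driver and a hypothetical outbox table with caller and payload columns:

    using System.Collections.Generic;
    using Npgsql;

    public class OutboxService
    {
        private readonly string _connString;

        public OutboxService(string connString)
        {
            _connString = connString;
        }

        public List<string> TakeAndDelete(string caller)
        {
            var results = new List<string>();
            using (var conn = new NpgsqlConnection(_connString))
            {
                conn.Open();

                // One atomic statement: the rows are queried, deleted, and
                // returned to the caller in a single round trip.
                using (var cmd = new NpgsqlCommand(
                    "DELETE FROM outbox WHERE caller = @caller RETURNING payload", conn))
                {
                    cmd.Parameters.AddWithValue("caller", caller);
                    using (var reader = cmd.ExecuteReader())
                    {
                        while (reader.Read())
                            results.Add(reader.GetString(0));
                    }
                }
            }
            return results;
        }
    }

On the administration questions: PostgreSQL can cap concurrent connections per role (CONNECTION LIMIT) and bound how long a statement may run (statement_timeout), but it has no built-in per-role call rate limit, which is one argument for putting a server application in front.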
I'm trying to come up with a design for an application (C#/Avalonia) that will allow creating views of data coming from multiple sources. The overall idea is to link the sources and present the outcome using various visualization components.
There are multiple sources of data:
Database 1
Database 2/3/4
SOAP
Database 1 is going to be used to store everything related to the application itself (users, permissions and so on).
Databases 2-4+ are only data feeds.
SOAP - this is where I struggle; I'm not quite sure how to handle it. There could be 10-50 concurrent instances of the application running, and each of them could request the same data update from SOAP (provider restrictions make that impossible).
What I was thinking was to take the following approach:
Request initial data from SOAP
Cache the data in database 1 with a timestamp
Define a delay between the requests
Once a user requests a data update from SOAP, check the timestamp and the delay value to decide whether to return cached or fresh data, as sketched below.
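A rough sketch of that freshness check, where CacheEntry, ICacheRepository, and ISoapClient are hypothetical stand-ins for a row in Database 1 and the SOAP wrapper:

    using System;
    using System.Threading.Tasks;

    public class CacheEntry
    {
        public string Payload { get; set; }
        public DateTime FetchedAtUtc { get; set; }
    }

    public interface ICacheRepository
    {
        Task<CacheEntry> FindAsync(string key);
        Task UpsertAsync(string key, CacheEntry entry);
    }

    public interface ISoapClient
    {
        Task<string> FetchAsync(string key);
    }

    public class SoapCache
    {
        private static readonly TimeSpan Delay = TimeSpan.FromMinutes(15); // the configured delay

        private readonly ICacheRepository _repo; // backed by Database 1
        private readonly ISoapClient _soap;      // wraps the SOAP provider

        public SoapCache(ICacheRepository repo, ISoapClient soap)
        {
            _repo = repo;
            _soap = soap;
        }

        public async Task<string> GetDataAsync(string key)
        {
            CacheEntry entry = await _repo.FindAsync(key);

            // Fresh enough: serve the cached copy and spare the SOAP provider.
            if (entry != null && DateTime.UtcNow - entry.FetchedAtUtc < Delay)
                return entry.Payload;

            // Stale or missing: fetch, overwrite the cache, and return.
            string fresh = await _soap.FetchAsync(key);
            await _repo.UpsertAsync(key, new CacheEntry
            {
                Payload = fresh,
                FetchedAtUtc = DateTime.UtcNow
            });
            return fresh;
        }
    }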
This approach leads to an issue when a user terminates the application in the middle of requesting new data:
User 1 requests new data and marks the database to ensure no further requests are processed
User 2 requests new data - nothing happens at this stage; wait and query again
User 1 terminates - no new data for any users (a lease with an expiry, sketched below, is one way around this)
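One common mitigation is to make the mark a lease with an expiry rather than a permanent flag, so a terminated client can never block others forever. A hedged sketch of the decision logic (the timeout and storage shape are assumptions, and the real acquire must be an atomic conditional UPDATE in Database 1, not an in-memory check):

    using System;

    public class RefreshLease
    {
        public string Holder { get; set; }
        public DateTime AcquiredAtUtc { get; set; }
    }

    public static class LeaseLogic
    {
        // If the holder dies, the lease simply expires and another client may refresh.
        private static readonly TimeSpan LeaseTimeout = TimeSpan.FromMinutes(2);

        // Returns true if the caller may start the SOAP refresh.
        public static bool TryAcquire(RefreshLease current, string me, DateTime nowUtc,
                                      out RefreshLease acquired)
        {
            bool free = current == null || nowUtc - current.AcquiredAtUtc > LeaseTimeout;
            acquired = free ? new RefreshLease { Holder = me, AcquiredAtUtc = nowUtc } : null;
            return free;
        }
    }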
Is the approach completely wrong, and would trying to handle this on the client alone be suicide?
I have an MVC and WebAPI application that needs to log activities performed by the users back to my database. This is almost always a single insert into a table that has fewer than 5 columns (i.e. very little data is crossing the wire). The data interface I am currently using is Entity Framework 6.
Every once in a while, I'll get a large number of users needing to log that they performed a single activity. In this case, "large number" could be a couple hundred requests per second. This typically lasts only a few minutes at most. The rest of the time, I see very manageable traffic to the site.
When the traffic spikes, some of my clients are getting timeout errors because the page doesn't finish loading until the server has inserted the data into the database. Now, actually inserting the data into the database isn't necessary for the user to continue using the application, so I can cache these requests somewhere locally and then batch-insert them later.
Are there any good solutions for ASP.NET MVC to buffer incoming request data and then batch-insert it into the database every few seconds?
As for my environment, I have several servers running Server 2012 R2 in a load-balanced web farm. I would prefer to stay stateless if at all possible, because users might hit a different server on each request.
When the traffic spikes, some of my clients are getting timeout errors because the page doesn't finish loading until the server has inserted the data into the database.
I would suggest using a message queue. Have the website rendering code simply post an object representing the action to the queue, and have a separate process (e.g. a Windows service) read from the queue and write to the database using Entity Framework.
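A hedged sketch of that shape using MSMQ via System.Messaging, where the queue path, the ActivityMessage type, and the LogContext EF 6 context are all assumptions:

    using System;
    using System.Collections.Generic;
    using System.Data.Entity;   // EF 6
    using System.Messaging;     // MSMQ

    public class ActivityMessage
    {
        public int Id { get; set; }
        public int UserId { get; set; }
        public string Action { get; set; }
        public DateTime OccurredAtUtc { get; set; }
    }

    public class LogContext : DbContext   // assumed EF 6 context
    {
        public DbSet<ActivityMessage> Activities { get; set; }
    }

    public static class ActivityQueue
    {
        private const string Path = @".\private$\activitylog"; // hypothetical queue

        // Called from the MVC action: enqueue and return immediately.
        public static void Publish(ActivityMessage msg)
        {
            using (var queue = new MessageQueue(Path))
            {
                queue.Send(msg); // fast local write; the page no longer waits on SQL
            }
        }

        // Called on a timer in the Windows service: drain and batch-insert.
        public static void Drain(int batchSize = 500)
        {
            var batch = new List<ActivityMessage>();
            using (var queue = new MessageQueue(Path))
            {
                queue.Formatter = new XmlMessageFormatter(new[] { typeof(ActivityMessage) });
                while (batch.Count < batchSize)
                {
                    try
                    {
                        var m = queue.Receive(TimeSpan.FromMilliseconds(100));
                        batch.Add((ActivityMessage)m.Body);
                    }
                    catch (MessageQueueException)
                    {
                        break; // queue is empty (receive timed out)
                    }
                }
            }

            if (batch.Count == 0) return;
            using (var db = new LogContext())
            {
                db.Activities.AddRange(batch); // one SaveChanges for the whole batch
                db.SaveChanges();
            }
        }
    }

This also keeps the web tier stateless, since the queue (not the web server) holds the pending writes.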
UPDATE
Alternatively, you could log access to a file (fast) and have a separate process read the file and write the information into your database.
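A minimal sketch of the file-based variant; the path and line format are assumptions, and each worker process would need its own file to avoid cross-process contention:

    using System;
    using System.IO;

    public static class FileActivityLog
    {
        private static readonly object Gate = new object();
        private const string LogPath = @"C:\logs\activity.log"; // hypothetical, per process

        public static void Append(int userId, string action)
        {
            var line = $"{DateTime.UtcNow:O}\t{userId}\t{action}";
            lock (Gate) // serialize writers within this process
            {
                File.AppendAllText(LogPath, line + Environment.NewLine);
            }
        }
    }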
I prefer the message queue option, but it does add another piece of architecture.
I am going through how transactions work in a WCF service, but I am seeking some more clarification. I am not sure which transaction manager WCF will use in the following scenarios:
If the WCF service performs an insert into a table in one SQL Server database and a delete from a table in another SQL Server database (on the same or a different server)
If the same WCF service performs an insert into a table in a SQL Server database and a delete from a table in an Oracle database.
If the WCF service calls 2 different WCF services that perform operations on the same SQL Server database.
Kindly help me understand these situations.
I think you're giving WCF more credit than it's due. WCF can do some amazing stuff, but there's nothing magical about it. It provides a set of interfaces for web services and allows you to provide an intermediary access layer for your data.
So let's tackle your scenarios:
If the WCF service performs an insert into a table in one SQL Server database and a delete from a table in another SQL Server database (on the same or a different server)
We've got two RDBMSs in use here, so you're going to have two transaction managers. The first transaction manager lives in the RDBMS handling the insert, and the second lives in the RDBMS handling the delete.
If the same WCF service performs an insert into a table in a SQL Server database and a delete from a table in an Oracle database.
Again, we've got two RDBMSs in use here, so you're going to have two transaction managers. The first transaction manager lives in the RDBMS handling the insert, and the second lives in the RDBMS handling the delete.
Note that we don't need to care about which type of RDBMS each one is; we just count how many are involved.
If the WCF service calls 2 different WCF services that perform operations on the same SQL Server database.
This one is a little trickier because we don't know what the 2 WCF services are doing, and there is some inadvisable voodoo magic that could be done to coordinate transactions across the 2 services. I'm going to assume you're smarter than that and didn't mean that case.
So in this case, we have 1 RDBMS performing 2 separate transactions. We'll have 1 transaction manager from the 1 RDBMS, but the operations will complete under different transactions.
To wrap that up: to know how many transaction managers are involved, look at the number of RDBMSs being used. And to know how many transactions will be required, look at the number of operations performed.
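To make scenarios 1 and 2 concrete, here is a hedged sketch using TransactionScope; connection strings and table names are placeholders. With only the first SQL Server connection open, the transaction can stay lightweight; enlisting the second connection (a different server, or Oracle via its ADO.NET provider) typically promotes it to a distributed transaction:

    using System.Data.SqlClient;
    using System.Transactions;

    public static class TwoDatabaseWrite
    {
        public static void InsertAndDelete()
        {
            using (var scope = new TransactionScope())
            {
                using (var conn1 = new SqlConnection(
                    "Server=serverA;Database=Db1;Integrated Security=true"))
                {
                    conn1.Open(); // first durable resource: a lightweight transaction suffices
                    using (var cmd = new SqlCommand("INSERT INTO T1 (Id) VALUES (1)", conn1))
                        cmd.ExecuteNonQuery();
                }

                using (var conn2 = new SqlConnection(
                    "Server=serverB;Database=Db2;Integrated Security=true"))
                {
                    conn2.Open(); // second durable resource: typically promoted to the DTC
                    using (var cmd = new SqlCommand("DELETE FROM T2 WHERE Id = 1", conn2))
                        cmd.ExecuteNonQuery();
                }

                scope.Complete(); // both operations commit together, or both roll back
            }
        }
    }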
Notice that the use of WCF has no bearing on your concern about the managers. WCF just happens to be a tool that provides an additional way of accessing the data through a service. WCF is cool, but it's not magic.
Additional note
You asked in a comment:
my concern is that in all of these conditions, which transaction manager will it use: a) the LTM, b) the KTM, c) the DTC?
And for the MS SQL Server transactions, it will either be the LTM or the DTC that handles the transaction. Per this MSDN Blog entry, it's not necessarily something you need to worry about until performance becomes a significant issue. And you should avoid premature optimization in favor of getting things working first.
And based upon this description of the KTM, it's very unclear how you think you'd be using the KTM in any of the cases you asked about.
The Kernel Transaction Manager (KTM) enables the development of applications that use transactions. The transaction engine itself is within the kernel, but transactions can be developed for kernel- or user-mode transactions, and within a single host or among distributed hosts.
Also note that Oracle DB has its own transaction manager for its RDBMS, which is different from the MS SQL Server transaction manager(s).
We are developing a multi-tenant application. With respect to architecture, we have designed a shared middle tier for business logic and one database per tenant for data persistence. That said, the business tier will establish a set of connections (a connection pool) with the database server per tenant, meaning the application maintains a separate connection pool for each tenant. If we expect around 5,000 tenants, this solution requires very high resource utilization (connections between the app server and the database server for every tenant), which leads to performance issues.
We have resolved that by keeping a common connection pool. In order to maintain a single connection pool across different databases, we created a new database called 'App-master'. Now we always connect to the 'App-master' database first and then switch to the tenant-specific database. That solved our connection pool issue.
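A minimal sketch of that switch, assuming ADO.NET SqlClient; the server and database names are placeholders:

    using System.Data.SqlClient;

    public static class TenantConnectionFactory
    {
        // Every connection uses the same string, so they all share one pool.
        private const string MasterConnString =
            "Server=dbserver;Database=App-master;Integrated Security=true";

        public static SqlConnection OpenForTenant(string tenantDb)
        {
            var conn = new SqlConnection(MasterConnString);
            conn.Open();
            conn.ChangeDatabase(tenantDb); // issues USE [tenantDb] on the open connection
            return conn;
        }
    }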
This solution works perfectly fine with an on-premise database server, but it does not work with Azure SQL, which does not support switching databases on an open connection.
I would appreciate suggestions on how to maintain the connection pool, or a better approach / best practice for dealing with such a multi-tenant scenario.
I have seen this problem before with multi-tenancy schemes that use separate databases. There are two overlapping problems: the number of web servers per tenant, and the total number of tenants. The first is the bigger issue - if you are caching database connections via ADO.NET connection pooling, then the likelihood of any specific customer's request landing on a web server that has an open connection to their database is inversely proportional to the number of web servers you have. The more you scale out, the more any given customer will notice a per-call (not initial login) delay as the web server makes the initial connection to the database on their behalf. Each call made to a non-sticky, highly scaled web server tier will be decreasingly likely to find an existing open database connection that can be reused.
The second problem is just one of having so many connections in your pool, and the likelihood of this creating memory pressure or poor performance.
You can "solve" the first problem by establishing a limited number of database application servers (simple WCF endpoints) which carry out database communications on behalf of your web server. Each WCF database application server serves a known pool of customer connections (Eastern Region go to Server A, Western Region go to Server B) which means a very high likelihood of a connection pool hit for any given request. This also allows you to scale access to the database separately to access to HTML rendering web servers (the database is your most critical performance bottleneck so this might not be a bad thing).
A second solution is to use content-specific routing via an NLB router. This routes traffic based on content and allows you to segment your web server tier by customer grouping (Western Region, Eastern Region, etc.); each set of web servers therefore has a much smaller number of active connections, with a corresponding increase in the likelihood of getting an open and unused connection.
Both of these problems are caching issues in general: the more you scale out as a completely "unsticky" architecture, the less likely any call is to hit cached data - whether that is a cached database connection or read-cached data. Managing user connections to maximize the likelihood of a cache hit is useful for maintaining high performance.
Another method of restricting the number of connection pools per app server is to use Application Request Routing (ARR) to divide up your tenants and assign them to subsets of the web tier. This lends itself to a more scalable "pod" architecture where a "pod" is a small collection of web/app servers coupled to a subset of the databases. A good article on this approach is here:
http://azure.microsoft.com/blog/2013/10/31/application-request-routing-in-csf/
If you are building a multi-tenant DB application on Azure, you should also check out the new Elastic Scale client libraries, which simplify data-dependent routing and facilitate cross-shard queries and management operations. http://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-documentation-map/
We have an application with approximately 60,000 client machines accessing it. Previously we had a distributed model, but we are moving to SaaS by creating a BO layer and having calls come up into it over the WAN. We use LINQ to Entities to access the database from the BO layer. Our multi-tenant model is federated so that 'enterprises' comprising multiple stores are on distinct SQL Servers (each server usually hosts about 200 'enterprises').
Each BO server is dual-processor, 8-core with HT (32 logical processors). IIS is set up with a maximum of 32 worker processes.
The BO layer is working pretty well: each call pulls the connection string associated with that enterprise and then talks to the correct database. The problem I am having, though, is that with 1/4 of our clients on and about 15 BO servers, I have noticed 3,000+ open connections to each database server, and the number is growing.
Any idea why it is growing like this? What am I supposed to set, and where, to make it reuse connections (connection pooling appears to be on) so that it stops flooding each db server like this? Any other suggestions?
It could be purely an architecture thing.
How many database servers do you have in total? And is the problem that the workload is heavy on certain database servers but not others?
If that's the case, then considering how to partition different enterprises across database servers will probably help, or you could further partition the data on the heavily loaded database servers. Another technique is to vertically partition different tables for enterprises into different databases, given that there are no joins across the vertically partitioned tables.
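As background for the pooling question above: ADO.NET keys its connection pools on the exact connection string, so per-enterprise connection strings multiply pools per worker process, and each pool can grow to 100 connections by default. A hedged sketch of the usual knobs (values are placeholders, not recommendations):

    using System.Data.SqlClient;

    public static class EnterpriseConnections
    {
        public static string BuildConnString(string server, string database)
        {
            var b = new SqlConnectionStringBuilder
            {
                DataSource = server,
                InitialCatalog = database,
                IntegratedSecurity = true,
                Pooling = true,
                MaxPoolSize = 10,        // default is 100 per pool; cap it
                LoadBalanceTimeout = 30  // max age (seconds) before a returned connection is destroyed
            };
            return b.ConnectionString;   // identical strings share one pool
        }
    }

Note also that pools are per process: with 32 worker processes per BO server, each process maintains its own set of pools, which by itself multiplies the open-connection count.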