I have a number of images that are stored as VARBINARY(MAX) (using FILESTREAM) in a database. I'm looking to retrieve about 10 images or so at a time.
The prescribed, most common way in ASP.NET is to use an HTTP handler and hit the database for each individual image. That seems fine, but it is a bit slow at times.
Is it best to download all images for a given page at the same time in one big data chunk? Or should I try to grab each individually? Best practice?
It's probably best to fetch them individually, from a domain that doesn't have cookies set, and to make sure your handler can serve multiple simultaneous requests. That way you can stream multiple results from the DB at the same time, and stream multiple images from your web server as it gets them.
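For reference, a minimal sketch of such a handler; ImageRepository.GetImageBytes is a placeholder for whatever data-access call pulls the VARBINARY(MAX) value for one id:

using System.Web;

public class ImageHandler : IHttpHandler
{
    // Stateless, so one instance can serve simultaneous requests.
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        int id = int.Parse(context.Request.QueryString["id"]);
        byte[] bytes = ImageRepository.GetImageBytes(id);   // hypothetical helper

        context.Response.ContentType = "image/jpeg";
        context.Response.Cache.SetCacheability(HttpCacheability.Public);   // let browsers cache repeat requests
        context.Response.BinaryWrite(bytes);
    }
}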
I think many people would have different opinions and reasons about what best practice is for them, but in reality it all depends on your hardware, software, data structure, and whether the data is normalized.
In general, SQL Server prefers set-based operations, meaning loops are generally slower. Loops, however, are safer with respect to IOPS-related issues and tend to cause fewer locks.
I am not sure which object mapper or built-in SQL library you are using (I have a feeling you may be using LINQ on top of a SQL class you built), but the library matters too, and I would definitely recommend Dapper.
I think reading them all at once would be faster, and here is why:
- If, as you say, you hit the database once per image, every request adds the latency of opening a connection and authenticating the session. With a single connection the data retrieval is straightforward, and the connection stays open for the whole batch without further session authentication.
I would recommend downloading them all at once and showing the end user a loading screen while that happens. For retrieving data, I find this link very helpful: https://technet.microsoft.com/en-us/library/dd425070(v=sql.100).aspx
Depending on your server's edition and features, you may be able to take advantage of different capabilities.
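As a rough sketch of the single-round-trip idea (assuming Dapper, and a hypothetical Images table with Id and Data columns):

using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using Dapper;

public class ImageRow
{
    public int Id { get; set; }
    public byte[] Data { get; set; }
}

static IDictionary<int, byte[]> LoadImages(string connectionString, IEnumerable<int> ids)
{
    using (var conn = new SqlConnection(connectionString))
    {
        // Dapper expands the @Ids list into an IN (...) clause, so all ~10 images
        // come back in one query over one open connection.
        return conn.Query<ImageRow>(
                "SELECT Id, Data FROM Images WHERE Id IN @Ids",
                new { Ids = ids })
            .ToDictionary(row => row.Id, row => row.Data);
    }
}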
In my client-server architecture I have a few API functions whose usage needs to be limited.
The server is written in C# (.NET) and runs on IIS.
Until now I didn't need to perform any synchronization. The code was written so that even if a client sent the same request multiple times (e.g. a create request), one call would succeed and all the others would fail (because of the server code and the DB structure).
What is the best way to enforce such limits? For example, I want no more than 1 call of the API method foo() per user per minute.
I thought about a SynchronizationTable with just one unique column, unique_text: before executing foo() I would write something like foo{userId}{date}{HH:mm} to this table. If the write succeeds, I know there was no foo call from that user in the current minute.
I think there must be a much better way, probably in server code, without using the DB for this. Of course, there could be thousands of users calling foo.
To clarify what I need: I think it could be some lightweight DictionaryMutex.
For example:
private static DictionaryMutex FooLock = new DictionaryMutex();
FooLock.lock(User.GUID);
try
{
...
}
finally
{
FooLock.unlock(User.GUID);
}
EDIT:
A solution in which one user cannot call foo twice at the same time would also be sufficient for me. By "at the same time" I mean that the server starts handling the second call before returning the result of the first.
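For illustration only, here is a minimal sketch of what such a DictionaryMutex could look like, assuming one SemaphoreSlim per user GUID held in a ConcurrentDictionary (the class and method names are mine, not an existing type):

using System;
using System.Collections.Concurrent;
using System.Threading;

public class DictionaryMutex
{
    // One semaphore per key; entries are never removed in this sketch.
    private readonly ConcurrentDictionary<Guid, SemaphoreSlim> locks =
        new ConcurrentDictionary<Guid, SemaphoreSlim>();

    public void Lock(Guid key)
    {
        // Under contention GetOrAdd may build a throw-away semaphore; that is harmless here.
        locks.GetOrAdd(key, _ => new SemaphoreSlim(1, 1)).Wait();
    }

    public void Unlock(Guid key)
    {
        SemaphoreSlim semaphore;
        if (locks.TryGetValue(key, out semaphore))
            semaphore.Release();
    }
}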
Note that keeping this state in memory in an IIS worker process means you can lose all of it at any instant; worker processes can restart for any number of reasons.
Also, you probably want two web servers for high availability. Keeping state inside worker processes means the application is no longer clustering-ready, which is often a no-go.
Web apps really should be stateless, for many reasons. If you can help it, don't manage your own data structures as suggested in the question and comments.
Depending on how big the call volume is, I'd consider these options:
SQL Server. Your queries are extremely simple and easy to optimize for. Expect thousands of such queries per second per CPU core; this can bear a lot of load. You can use SQL Server Express for free.
A specialized store like Redis. Stack Overflow is using Redis as a persistent, clustering-enabled cache. A good idea.
A distributed cache, like Microsoft Velocity, or others.
This storage problem is rather easy because it fits a key/value store model well, and the data is near worthless, so you don't even need backups.
I think you're overestimating how costly this rate limiting will be. Your web service is probably doing far more costly things than a single UPDATE by primary key against a simple table.
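For illustration, a rough sketch of that single-UPDATE check; the ApiCallLimit table, keyed by user and method with a LastCall datetime, is an assumption of this sketch, not an existing schema:

using System;
using System.Data.SqlClient;

// Returns true if the call is allowed, i.e. the last foo() by this user was over a minute ago.
// Assumes a row already exists for the (user, method) pair; a first-time caller would need
// an INSERT (or a MERGE) instead.
static bool TryAcquireCallSlot(SqlConnection conn, Guid userId, string method)
{
    const string sql = @"
        UPDATE ApiCallLimit
        SET LastCall = GETUTCDATE()
        WHERE UserId = @UserId
          AND Method = @Method
          AND LastCall < DATEADD(MINUTE, -1, GETUTCDATE());";

    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.Parameters.AddWithValue("@UserId", userId);
        cmd.Parameters.AddWithValue("@Method", method);
        return cmd.ExecuteNonQuery() == 1;   // exactly one row updated => call allowed
    }
}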
I know this may be considered a generic question, but I honestly don't have the first clue even where to start. I've tried searching, and have not found any results that fit the application.
I'm trying to develop a front-end for an Access 2010 database that will allow users to add/modify records. Several of the users connect to the DB over a VPN, and the current model we are using, an Access 2010 Navigation Form, is horrendously slow regardless of connection speed. I have verified that we can reach the DB over the VPN with no privilege issues or security concerns, but even through the OleDb engine there is significant latency on data access.
What I would like to do is to be able to have updates sent/received in a background process, say every 5-10 minutes, so that the end user will be able to update it as they need to, and have the changes written without the user really being aware of the latency. Would simply using a background worker suffice to do this, or is there a better way to send "packets" of updates over the connection?
Again, I know this is not exactly code-specific, but I've never worked with C# and DB updates before, so I'm kind of learning as I go. Nearly all the results I've found have dealt with engines other than OleDb, such as SQL Server, but we are locked into using Access (an .accdb file) as we don't have any other database engines available to us. I appreciate any and all help, in whatever form it comes in.
This is a new enough project that so far the only code that I've developed for this has consisted of initializing the connection to the DB to verify that it's even possible.
MS Access is not designed for concurrency. When you open an Access DB on a remote machine you are also pulling the entire DB into the client machine's memory, which is why the bigger the file gets, the slower it becomes. If you want concurrency, use MS SQL Server Express and link it to your MS Access application; it is much faster and better that way.
Besides, it's free, and it can be scaled up if you need to.
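If it helps, the switch on the C# side is mostly a connection-string change; the server, share, and database names below are placeholders:

using System.Data.OleDb;
using System.Data.SqlClient;

// Current setup: OLE DB straight to the .accdb over the VPN.
var accessConnection = new OleDbConnection(
    @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=\\server\share\frontend.accdb;");

// Suggested setup: the tables live in a SQL Server Express instance instead.
var sqlConnection = new SqlConnection(
    @"Server=dbserver\SQLEXPRESS;Database=FrontEndData;Integrated Security=True;");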
Greetings,
I've been working on a C#.NET app that interacts with a data logger. The user can query and obtain logs for a specified time period, and view plots of the data. Typically a new data log is created every minute and stores a measurement for a few parameters. To get meaningful information out of the logger, a reasonable number of logs need to be acquired - data for at least a few days. The hardware interface is a UART to USB module on the device, which restricts transfers to a maximum of about 30 logs/second. This becomes quite slow when reading in the data acquired over a number of days/weeks.
What I would like to do is improve the perceived performance for the user. I realize that with the hardware speed limitation the user will have to wait for the full download cycle at least the first time they acquire a larger set of data. My goal is to cache all data seen by the app, so that it can be obtained faster if ever requested again. The approach I have been considering is to use a light database, like SqlServerCe, that can store the data logs as they are received. I am then hoping to first search the cache prior to querying a device for logs. The cache would be updated with any logs obtained by the request that were not already cached.
Finally my question - would you consider this to be a good approach? Are there any better alternatives you can think of? I've tried to search SO and Google for reinforcement of the idea, but I mostly run into discussions of web request/content caching.
Thanks for any feedback!
Seems like a very reasonable approach. Personally I'd go with SQL CE for storage, make sure you index the column holding the datetime of the record, then use TableDirect on the index for getting and inserting data so it's blazing fast. Since your data is already chronological there's no need to get any slow SQL query processor involved, just seek to the date (or the end) and roll forward with a SqlCeResultSet. You'll end up being speed limited only by I/O. I profiled doing really, really similar stuff on a project and found TableDirect with SQLCE was just as fast as a flat binary file.
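To make that concrete, here is a rough sketch of a TableDirect read over a time range; the LogEntries table, its column ordinals, and the IX_LoggedAt index are assumptions of this sketch:

using System;
using System.Data;
using System.Data.SqlServerCe;

static void ReadCachedLogs(DateTime rangeStart, DateTime rangeEnd)
{
    using (var conn = new SqlCeConnection("Data Source=logcache.sdf"))
    using (var cmd = conn.CreateCommand())
    {
        conn.Open();
        cmd.CommandType = CommandType.TableDirect;
        cmd.CommandText = "LogEntries";   // a table name, not a SQL statement
        cmd.IndexName = "IX_LoggedAt";    // seek along the datetime index

        using (SqlCeResultSet rs = cmd.ExecuteResultSet(ResultSetOptions.Scrollable))
        {
            // Seek positions at the first entry at or after rangeStart; Read() makes it current.
            if (rs.Seek(DbSeekOptions.AfterEqual, rangeStart))
            {
                while (rs.Read())
                {
                    DateTime loggedAt = rs.GetDateTime(0);   // ordinal 0 assumed to be the datetime
                    if (loggedAt > rangeEnd) break;
                    // ...hand the remaining columns to the caching/plotting layer...
                }
            }
        }
    }
}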
I think you're on the right track wanting to store it locally in some queryable form.
I'd strongly recommend SQLite. There's a .NET class here.
Question: I currently store ASP.net application data in XML files.
Now the problem is that I have asynchronous operations, which means I ran into the problem of simultaneous write access to an XML file...
Now, I'm considering moving to an embedded database to solve the issue.
I'm currently considering SQLite and embedded Firebird.
I'm not sure, however, whether SQLite or Firebird can handle multiple concurrent write accesses.
And I certainly don't want the same problem again.
Anybody know?
SQLite is certainly better known, but which one is better - SQLite or Firebird? I tend to say Firebird, but I don't really know.
No MS Access or MS SQL Express recommendations please, I'm a sane person.
I would choose Firebird for many reasons, and for this one in particular:
"Although it is transactional, SQLite does not support concurrent transactions, so if your embedded application needs two or more connections, they must be serialized. An embedded Firebird database is simple to upgrade to a fully shared database - just change the shared library."
Maybe you can also check this.
SQLite can be configured to gracefully handle simultaneous writes in most situations. What happens is that when one thread or process begins a write to the DB, the file is locked. When a second write is attempted and encounters the lock, it backs off for a short period before attempting the write again, until it succeeds or times out. The timeout is configurable, but otherwise all this happens without the application code having to do anything special except enabling the option, like this:
// set SQLite to wait and retry for up to 100ms if database locked
sqlite3_busy_timeout( db, 100 );
All this works very well and without any difficulty, except in two circumstances:
If an application does a great many writes, say a thousand inserts, all in one transaction, then the database will be locked for a significant period and can cause problems for any other application attempting to write. The solution is to break up such large writes into separate transactions, so other applications can get access to the database.
If the database is shared by different processes running on different machines over a network-mounted disk. Many operating systems have bugs in their network-mounted disks that make file locking unreliable. There is no answer to this: if you need to share a DB on a network-mounted disk, you need another database engine such as MySQL.
I do not have any experience with Firebird. I have used SQLITE in situations like this for many applications over several years.
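If you are calling SQLite from C# rather than C, the same retry behaviour can be set through the busy_timeout pragma; this sketch assumes the System.Data.SQLite ADO.NET provider and a reasonably recent SQLite build:

using System.Data.SQLite;

using (var conn = new SQLiteConnection("Data Source=app.db"))
{
    conn.Open();
    // Ask SQLite to retry for up to 100 ms when the file is locked,
    // mirroring the sqlite3_busy_timeout(db, 100) call above.
    using (var cmd = new SQLiteCommand("PRAGMA busy_timeout = 100;", conn))
    {
        cmd.ExecuteNonQuery();
    }
}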
Have you looked into Berkeley DB with the SQLite API for SQL support?
It sounds like SQLite will be a good fit. We use SQLite in a number of production apps; it supports (in fact, prefers) transactions, which go a long way toward handling concurrency.
See also: transactional sqlite? in C#
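A rough sketch of that transactional pattern from C#, again assuming the System.Data.SQLite provider and a placeholder Items table:

using System.Collections.Generic;
using System.Data.SQLite;

static void InsertBatch(string connectionString, IEnumerable<string> names)
{
    using (var conn = new SQLiteConnection(connectionString))
    {
        conn.Open();
        using (SQLiteTransaction tx = conn.BeginTransaction())
        using (SQLiteCommand cmd = conn.CreateCommand())
        {
            cmd.Transaction = tx;
            cmd.CommandText = "INSERT INTO Items (Name) VALUES (@name)";
            var nameParam = new SQLiteParameter("@name");
            cmd.Parameters.Add(nameParam);

            // All inserts share one transaction: one journal sync instead of one per row.
            foreach (string name in names)
            {
                nameParam.Value = name;
                cmd.ExecuteNonQuery();
            }
            tx.Commit();
        }
    }
}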
I would add #3 to the list from ravenspoint above: if you have a large call-center or order-processing center, say, where dozens of people might be hitting the SAVE button at the same time, even if each is updating or inserting just one record, you can run into problems using the busy timeout approach.
For scenario #3, a true SQL engine that can serialize is ideal; less ideal but serviceable is a dbms that can do byte-range record locking of a shared-file. But be aware that even a byte-range record lock will be inadequate for a large number of concurrent writes when new records are appended to the end of the file like a caboose on the end of a freight train, so that multiple processes are trying at the same time to set a lock on the same byte-range. On the other hand, a byte-range record locking scheme coupled with a hashed-key sparse file approach (e.g. the old Revelation/OpenInsight database for LANs) will be far superior to ISAM for this scenario.
I'm working on a web service at the moment and there is the potential that the returned results could be quite large ( > 5mb).
It's perfectly valid for this set of data to be this large and the web service can be called either sync or async, but I'm wondering what people's thoughts are on the following:
- If the connection is lost, the entire resultset will have to be regenerated and sent again. Is there any way I can do any sort of "resume" if the connection is lost or reset?
- Is sending a result set this large even appropriate? Would it be better to implement some sort of "paging" where the resultset is generated and stored on the server and the client can then download chunks of the resultset in smaller amounts and re-assemble the set at their end?
I have seen all three approaches, paged, store and retrieve, and massive push.
I think the solution to your problem depends to some extent on why your result set is so large and how it is generated. Do your results grow over time, are they calculated all at once and then pushed, do you want to stream them back as soon as you have them?
Paging Approach
In my experience, using a paging approach is appropriate when the client needs quick access to reasonably sized chunks of the result set similar to pages in search results. Considerations here are overall chattiness of your protocol, caching of the entire result set between client page requests, and/or the processing time it takes to generate a page of results.
Store and retrieve
Store and retrieve is useful when the results are not random access and the result set grows in size as the query is processed. Issues to consider here are complexity for clients and if you can provide the user with partial results or if you need to calculate all results before returning anything to the client (think sorting of results from distributed search engines).
Massive Push
The massive push approach is almost certainly flawed. Even if the client needs all of the information and it needs to be pushed in a monolithic result set, I would recommend taking the approach of WS-ReliableMessaging (either directly or through your own simplified version) and chunking your results (a rough sketch follows at the end of this answer). By doing this you:
- ensure that the pieces reach the client
- can discard each chunk as soon as you get a receipt from the client
- can reduce possible memory-consumption issues from having to retain 5MB of XML, DOM, or whatever in memory (assuming you aren't processing the results in a streaming manner) on the server and client sides.
Like others have said though, don't do anything until you know your result set size, how it is generated, and overall performance to be actual issues.
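For what a simplified chunking contract might look like (ResultStore here is a hypothetical server-side cache of already-generated result sets, not a real API):

using System;
using System.Web.Services;

public class ResultService : WebService
{
    private const int ChunkSize = 64 * 1024;   // 64 KB per chunk

    [WebMethod]
    public int GetChunkCount(Guid resultId)
    {
        byte[] data = ResultStore.Get(resultId);            // hypothetical lookup
        return (data.Length + ChunkSize - 1) / ChunkSize;
    }

    [WebMethod]
    public byte[] GetChunk(Guid resultId, int index)
    {
        byte[] data = ResultStore.Get(resultId);
        int offset = index * ChunkSize;
        int length = Math.Min(ChunkSize, data.Length - offset);
        byte[] chunk = new byte[length];
        Buffer.BlockCopy(data, offset, chunk, 0, length);
        return chunk;   // the client re-assembles chunks in order and acknowledges each one
    }
}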
There's no hard law against 5 MB as a result set size, but over 400 MB can be hard to send.
You'll automatically get async handlers (since you're using .NET).
implement some sort of "paging" where the resultset is generated and stored on the server and the client can then download chunks of the resultset in smaller amounts and re-assemble the set at their end
That's already happening for you -- it's called tcp/ip ;-) Re-implementing that could be overkill.
Similarly --
entire resultset will have to be regenerated and sent again
If it's MS SQL, for example, that is generating most of the resultset, then regenerating it will take advantage of some implicit caching in SQL Server, and subsequent generations will be quicker.
To some extent you can get away with not worrying about these problems, until they surface as 'real' problems -- because the platform(s) you're using take care of a lot of the performance bottlenecks for you.
I somewhat disagree with secretGeek's comment:
That's already happening for you -- it's called tcp/ip ;-) Re-implementing that could be overkill.
There are times when you may want to do just this, but really only from a UI perspective. If you implement some way to either stream the data to the client (via something like a pushlets mechanism), or chunk it into pages as you suggest, you can then load some really small subset on the client and then slowly build up the UI with the full amount of data.
This makes for a slicker, speedier UI (from the user's perspective), but you have to evaluate if the extra effort will be worthwhile... because I don't think it will be an insignificant amount of work.
So it sounds like you'd be interested in a solution that adds 'starting record number' and 'final record number' parameters to your web method (or 'page number' and 'results per page').
This shouldn't be too hard if the backing store is SQL Server (or even MySQL), as they have built-in support for row numbering.
Even so, you should be able to avoid doing any session management on the server, avoid any explicit caching of the result set, and just rely on the backing store's caching to keep your life simple.
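For reference, a sketch of such a 'start row / end row' query using SQL Server's ROW_NUMBER(); the Results table and SortColumn are placeholders for whatever actually backs the service:

using System.Data;
using System.Data.SqlClient;

static DataTable GetPage(SqlConnection conn, int startRow, int endRow)
{
    const string sql = @"
        SELECT *
        FROM (
            SELECT r.*, ROW_NUMBER() OVER (ORDER BY r.SortColumn) AS RowNum
            FROM Results r
        ) AS numbered
        WHERE numbered.RowNum BETWEEN @StartRow AND @EndRow;";

    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.Parameters.AddWithValue("@StartRow", startRow);
        cmd.Parameters.AddWithValue("@EndRow", endRow);
        var page = new DataTable();
        new SqlDataAdapter(cmd).Fill(page);   // rows for just this page come back
        return page;
    }
}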