Partial file uploads being automatically deleted - c#

I have some C# code that uploads files to my Apache server via HttpWebRequest. While an upload is in progress, I can use ls -la to watch the file size grow.
Now, if I pull my computer's network cable, for example, the partial file upload remains on the server.
However, if I simply close my C# app, the partial file is deleted!
I assume this is caused by my streams being closed gracefully. How can I prevent this behavior? I want my partial file uploads to remain regardless of how the uploading app behaves.
I have tried using a destructor to abort my request stream, as well as calling System.Environment.Exit(1); neither had any effect.

Pulling the network cable will never be equivalent to aborting the stream or closing the socket, as it is a failure at a lower OSI layer.
Whenever the application is closed, the networking session is aborted and any pending operation is cancelled. I don't think there's any workaround, unless you programmatically split the file transfer into smaller chunks and save them as you go along (this way you'd have a manual incremental transfer, but it requires some server-side code).
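A minimal sketch of the client side of such a chunked transfer, assuming a server-side script that appends each request body at the position given in an "offset" query parameter. That parameter, the class name and the method names are hypothetical and not part of the original code:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;

static class ChunkedUploader
{
    // Splits a file of fileLength bytes into (offset, length) pieces
    // of at most chunkSize bytes each.
    public static List<KeyValuePair<long, int>> PlanChunks(long fileLength, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");
        var chunks = new List<KeyValuePair<long, int>>();
        for (long offset = 0; offset < fileLength; offset += chunkSize)
        {
            int length = (int)Math.Min(chunkSize, fileLength - offset);
            chunks.Add(new KeyValuePair<long, int>(offset, length));
        }
        return chunks;
    }

    // Sends each chunk as its own POST request. The "?offset=" parameter is an
    // assumed contract: the server-side script has to write the body at that offset.
    public static void Upload(string path, string baseUrl, int chunkSize)
    {
        using (var file = File.OpenRead(path))
        {
            var buffer = new byte[chunkSize];
            foreach (var chunk in PlanChunks(file.Length, chunkSize))
            {
                file.Seek(chunk.Key, SeekOrigin.Begin);
                int read = 0;
                while (read < chunk.Value)
                {
                    int n = file.Read(buffer, read, chunk.Value - read);
                    if (n == 0) throw new EndOfStreamException();
                    read += n;
                }

                var request = (HttpWebRequest)WebRequest.Create(baseUrl + "?offset=" + chunk.Key);
                request.Method = "POST";
                request.ContentLength = chunk.Value;
                using (var body = request.GetRequestStream())
                    body.Write(buffer, 0, chunk.Value);
                using (request.GetResponse()) { } // throws WebException on an error status
            }
        }
    }
}
```

With this layout, every chunk the server has acknowledged is already saved by the time the client exits, so closing the app loses at most the chunk in flight.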

Write a very simple HTTP proxy that keeps accepting connections but never closes a connection to your server
Even simpler, using netcat 1.10 (though this will accept just one connection)
nc -q $FOREVER -l -p 12345 -c 'nc $YOUR_SERVER 80'
Then connect your C# client to localhost:12345

This might be a silly suggestion but what if you call Process.GetCurrentProcess().Kill(); while the application is being closed?

Before looking at processing of partial uploads, start by testing whether turning keepalives on in Apache configuration solves your problem of receiving partial uploads.
This may have the effect of seeing fewer disconnects and thus less need to process their partial data. Such disconnects may be due to the client, the server, but often they are due to an intermediate node such as a firewall. The keepalives option has the effect of maintaining steady "dummy" traffic (0 byte long data payload), thus advertising to all parties that the connection is still alive.
For a large site with heavy concurrent load, keepalives are a bad thing which is why they are off by default. The option makes connection management for Apache much more complicated, preventing optimized connection reuse, and there is also a little bit of extra network traffic. But maybe you have a specialized use case where this is not a concern.
Keepalives will never help you at all if your clients simply tend to crash too soon (that is, if you see steady progress on the uploads at all times). They may help you considerably if the issue is network related.
They will help you tremendously if your clients generate the data gradually, with long delays in between uploaded chunks.

Have you checked whether your application steps into
void FinishUpload(IAsyncResult result) {…}
(line 240) when you abort or kill the app? If so, you could consider not entering the callback. This is a bit dirty, but it may give you a place to start digging.

Does Apache accept chunked uploads, as enabled by the SendChunked property of HttpWebRequest?
If so, it is worth trying out.
http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.sendchunked.aspx
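For reference, a minimal sketch of what trying SendChunked could look like. The class and method names are made up for illustration, and whether the server keeps the partial file after an aborted chunked request still depends on the server-side handler:

```csharp
using System;
using System.IO;
using System.Net;

static class StreamingUpload
{
    // Builds a request that streams its body with Transfer-Encoding: chunked.
    public static HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.SendChunked = true;                // no Content-Length needed up front
        request.AllowWriteStreamBuffering = false; // send bytes as they are written
        return request;
    }

    // Copies the file into the request stream, then waits for the response.
    public static void Upload(string path, string url)
    {
        var request = CreateRequest(url);
        using (var file = File.OpenRead(path))
        using (var body = request.GetRequestStream())
        {
            file.CopyTo(body);
        }
        using (request.GetResponse()) { } // throws WebException on an error status
    }
}
```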

Related

IIS7 stops working after 5 requests

Here is my problem:
I have just been brought onto a massive asp.net C# project and I've been charged with fixing some performance issues (not my area of expertise). More specifically after 5 - 7 redirects/ajax calls the web server stops responding and the whole page (and eventually the browser) freezes.
I don't think this is a coding issue as I've set up break points in a few pages (Page_Load method) and after the 5 requests it does not even reach the break points.
I don't believe this is related to this issue, as I've increased the browser's maximum connections per server parameter and I got the same behavior. Furthermore, after these 5 requests in one browser (IE), the application stops working in FF as well.
This is not a resource issue as the w3wp.exe process never exceeds 500MB memory.
One thing I've noticed when using Fiddler and other tools to monitor the requests is that the server takes a very long time when loading image files (png, jpg). I don't know if this is relevant.
I've enabled failed request tracing on the server, and the only thing I've noticed is that some requests fail with a 401 error even though I've set Anonymous Authentication to enabled.
Here is the exact message
MODULE_SET_RESPONSE_ERROR_STATUS
ModuleName ManagedPipelineHandler
Notification 128
HttpStatus 401
HttpReason Unauthorized
HttpSubStatus 0
ErrorCode 0
ConfigExceptionInfo
Notification EXECUTE_REQUEST_HANDLER
ErrorCode The operation completed successfully. (0x0)
This message is sometimes thrown with ModuleName: ScriptModule
I have already wasted 2 days on this thing and I'm running out of ideas so any suggestions would be appreciated.
Like any large generic problem, your best bet in diagnosing the issue is to figure out how to break it down into smaller parts, how to form hypotheses about the causes, and how to validate or invalidate those hypotheses. My first inclination would be to hypothesize that the server-side processes in this particular case are taking a long time, causing your client requests to block and making the whole thing seem frozen.
From there, I would attempt to replicate the long running server side processes by creating isolated client side tests - perhaps if the URLs are HTTP gets, I would test the same URLs individually. If they were HTTP posts, I'd create an isolated test form if feasible to see what happens with each request. If a long running server side process is found then you have a starting point.
If there are no long running server side processes, then it may be JavaScript / client side coding issues that need to be looked into. But when you're working on a large, unfamiliar project, your best bet is definitely to figure out how to break down the issue into smaller components that can then be tested.
I solved the issue finally. Here is what I did:
Experimented with IIS settings and App_Pool recycling and noticed that there is nothing wrong with the way it handles requests that actually reach it.
I focused on the Http.sys module and noticed that in the log files there were a lot of Timer_ConnectionIdle and Client_Reset errors.
After some more experimentation and a lot of Google searches, I accidentally found this answer and it solved my issue. As the answer suggests the problem was caused by the AVG antivirus installed and incorrectly configured on the server.
Thanks for all the help and suggestions.
If it's ajax calls that are causing your browser to freeze, make sure they are not blocking ajax calls.
Just appending to Shan's answer, which is a good one.
First off, there is obviously a code issue as this is by no means 'normal' behavior for IIS.
That said, you must isolate it as Shan indicated. For example, given the server itself no longer accepts connections then we can pretty well eliminate javascript as the source of the problem and relegate it to being just a symptom.
Typically when a worker process spins into space like this it is due to either an infinite loop or an issue where multiple threads are trying to lock the same resource. I bet if you let it run long enough IIS itself will timeout, kill and restart the process.
With that in mind you want to look for any type of multithreaded garbage (which I highly recommend you don't do in a web server) or for anything that indicates a tight infinite loop. A loop is going to become apparent if you execute the requests individually. A multi-threaded issue will only show up if you happen to get a collision.
Run various performance counters on the web server. Also, once it locks up, let it sit that way for a while. Once IIS performs its own reset on the worker process, go look for indicators in the event log.

FTP monitoring and downloading of new files

I have an FTP monitoring/downloading application using C# sockets. I got this error message:
421 Disconnecting you since you were inactive for 300 seconds.
Does someone have an explanation for this? I searched for this message but still can't find a good explanation. Thanks.
It says it disconnected you because your connection was inactive for 300 seconds. This is a common practice on FTP servers since (as opposed to HTTP) FTP is not stateless, connections stay open and connections that do nothing can easily fill the connection limit of the server.
The obvious solution is making sure you don't stay inactive for 300 seconds. Create a timer that does something every minute or so, like getting a list of files in the current directory or something.
EDIT: As ChaosPandion mentioned in a comment, maybe you should just close the connection when you're done and reopen it when you need it again.
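A rough sketch of the timer idea, assuming the application owns a connected control-channel Socket and serializes all writes to it through one lock object. The class name and the sendLock parameter are illustrative, not from the original application. NOOP is a standard FTP command, but whether a given server counts it as activity is implementation-specific:

```csharp
using System;
using System.Net.Sockets;
using System.Text;
using System.Threading;

sealed class FtpKeepAlive : IDisposable
{
    // FTP commands are ASCII lines terminated by CRLF.
    public static readonly byte[] NoopCommand = Encoding.ASCII.GetBytes("NOOP\r\n");

    private readonly Socket _control;
    private readonly object _sendLock;
    private readonly Timer _timer;

    // control:  the already connected control-channel socket
    // sendLock: the lock the rest of the app takes before writing to that socket
    public FtpKeepAlive(Socket control, object sendLock, TimeSpan interval)
    {
        _control = control;
        _sendLock = sendLock;
        _timer = new Timer(SendNoop, null, interval, interval);
    }

    private void SendNoop(object state)
    {
        try
        {
            lock (_sendLock) { _control.Send(NoopCommand); }
        }
        catch (SocketException) { }         // connection lost; the app should reconnect
        catch (ObjectDisposedException) { } // socket was closed by the app
    }

    public void Dispose() { _timer.Dispose(); }
}
```

The server answers every NOOP with a reply line, so whatever code reads the control channel has to expect and discard those replies. An interval such as TimeSpan.FromMinutes(1) stays well under the 300 second limit from the error message.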
I think this pretty much explains itself. The server is closing your connection because it wasn't active for 5 minutes. The question is: what counts as activity?
I'm afraid the answer won't be found in the FTP RFC, since this can be implementation-specific. Also, the timeout interval may vary (it may be configurable via the FTP administration utility).
If I'm correct, you'll simply have to design your application to work around this constraint, by reconnecting when disconnected, and performing any activities necessary to re-validate your application's inner state.

BizTalk server problem

We have a BizTalk server (a virtual one (1!)...) at our company, and a SQL Server where the data is kept.
Now, we have a lot of data traffic. I'm talking about hundreds of thousands. So I'm actually not even sure whether one server is safe enough, but our company is not that easy to convince.
Now recently we have a lot of problems.
Allow me to situate in detail, so I'm not missing anything:
Our server has 5 applications:
One with 3 orchestrations, 12 send ports, 16 receive locations.
One with 4 orchestrations, 32 send ports, 20 receive locations.
One with 4 orchestrations, 24 send ports, 20 receive locations.
One with 47 (yes 47) orchestrations, 37 send ports, 6 receive locations.
One common application with a couple of resources.
Our problems have occurred since we deployed the application with the 47 orchestrations.
A lot of these orchestrations use assign shapes which use C# code to do the mapping. This is because we use HL7 extensions and this is kind of special, so by using C# code and XPath it was a lot easier to do the mapping, because a lot of these schemas look alike. The C# code reads in XmlNodes received through XPath and returns XmlNodes, which are then assigned to BizTalk messages again. I'm not sure if this could be the cause, but I thought I'd mention it.
The send and receive ports have a lot of different types: File, MQSeries, SQL, MLLP, FTP.
Each of these types has a different host instance, to balance out the load.
Our orchestrations use the BiztalkApplication host.
A couple of scripts are also running on this server, mostly FTP upload scripts and a zipper script, which zips files every half hour into a daily zip and deletes the zip files after a month. We use this zip script on our backup files (we back up a lot, and the backups are also on our server). We did this because the server had problems sending files to a location that held a lot (A LOT) of files; once the files were reduced to zips, it went better.
Recently we have been having two major problems:
Our most important problem is the following. We kept a receive location with a lot of messages on a queue for testing. After we start this receive location, which uses the 47 orchestrations, the number of running service instances starts to skyrocket. OK, this is pretty normal. Let's say it reaches about 10000, and then we stop the receive location to see how BizTalk handles these 10000 instances. Normally they would go down pretty fast, and sometimes they do, but after a while BizTalk starts to "throttle", meaning the instances just stop being processed and their number stays the same. For example, in 30 seconds it goes down from 10000 to 4000, then it stays at 4000 and drops very, very slowly, like 30 in 5 minutes or so. This means that all the service instances of the other applications are also stuck and are not processed either.
We noticed that after restarting our host instances, the instance number went down fast again. So we tried to selectively restart different host instances to locate the problem. We noticed that eventually restarting the file send/receive host instance would do the trick. So we thought file sends would be the problem, considering that we make a lot of backups. So we replaced the file type backups with MQSeries backups. The same problem occurred and, funnily enough, restarting the file send/receive host still fixes the problem.
No errors can be found in the event viewer either.
A second problem we're having is that sometimes, at around 6 am, all or some of the host instances are stopped.
In the event viewer we noticed the following errors (these are more than one):
The receive location "MdnBericht SQL" with URL "SQL://ZNACDBPEG/mdnd0001/" is shutting down. Details:"The error threshold has been exceeded. The receive location is shutting down.".
The Messaging Engine failed to add a receive location "M2m Othello Export Start Bestand" with URL "\m2mservices\Othello_import$\DataFilter Start*.xml" to the adapter "FILE". Reason: "The FILE adapter cannot access the folder \m2mservices\Othello_import$\DataFilter Start.
Verify this folder exists.
Error: Logon failure: unknown user name or bad password.
".
The FILE adapter cannot access the folder \m2mservices\Othello_import$\DataFilter Start.
Verify this folder exists.
Error: Logon failure: unknown user name or bad password.
An attempt to connect to "BizTalkMsgBoxDb" SQL Server database on server "ZNACDBBTS" failed.
Error: "Login failed for user ''. The user is not associated with a trusted SQL Server connection."
It would seem that there's a login failure at this time and that, because of it, other services also experience problems and are eventually shut down.
The thing is, our user is an admin, and it's impossible that its password is wrong only "sometimes". We have considered that the problem could be an infrastructure problem, but that's not really our department.
I know it's a long post, but we're not sure what to do anymore. Would adding another server and balancing the load solve our problems? Is there a way to measure our load balance and know where to start splitting? What are normal load numbers, etc.?
I appreciate any answers because these issues are getting worse and we're also on a deadline.
Thanks a lot for replies!
Your immediate problem is BizTalk's throttling feature. It's supposed to help BizTalk survive temporary overload conditions. One of its many problems is that you can see the throttling kick in only in the performance monitor and not in the event log.
What you should do:
Separate the new application into a different host from the rest of the applications. Throttling is done at the host level, so the problematic application won't affect the rest of the applications.
Read about how to disable throttling in the link above.
What we have done is implement an external throttling service that feeds the BizTalk receive location in small, digestible packets. It's ugly, but the problem is ugly.
Update in response to the comment: you have enough host instances, so ignore that advice. You may reorder the applications between the instances, but there are no clear guidelines for doing that, so it's just shuffling and guessing.
About the safety of disabling throttling: this feature doesn't make much sense in many scenarios. You have to study it. Check which of the throttling parameters you are hitting (this can be seen in the performance monitor) and decide how to change the thresholds.
How many host instances do you have?
From the line:
"The send and receive ports have a lot of different types: File, MQSeries, SQL, MLLP, FTP. Each of these types have a different host instances, to balance out the load. Our orchestrations use the BiztalkApplication host"
It sounds like you have a lot - I recently did an audit of a system where BizTalk was self throttling and the issue was in part due to too many host instances. Each host instance places its own load upon the BizTalk messagebox, as well as chewing up a minimum of 200mb memory.
Reading your comment, you have 20 - this is too many and would be a big part of your problems.
A good starting host setup would be:
A dedicated tracking host
One host that contains all receive handlers for adapters
One host that contains all orchestrations
One host that contains all send handlers for adapters
One host for adapters that need to be clustered (like FTP and MSMQ)
You can then also consider things like introducing "real time" hosts and batched hosts, so you can tune the real time hosts for low latency.
You can also have hosts for specific applications if they are known to be unstable, but in general this should not be done.
I run a BizTalk system that has similar problems and can empathize with what you are seeing. I don't know if it's the same, but I thought I'd share my experience in case.
In the same manner restarting the send/receive seems to fix the problem. In my case I found a direct correlation to memory usage by the host processes. I used performance counters to see when a given host was throttled for memory. By creating extra hosts, and moving orchestrations and ports between them I was able to narrow down which business sets were causing the problem. Basically in my case restarting the hosts was the equivalent to the ultimate "garbage collection" to free up memory. This was of course until enough instances came through to gobble it up again.
I'm afraid I have not solved the issue yet, but a few things I found to alleviate the issue:
Raise the memory to a given process so that throttling does not occur or occurs later
Each host instance, while informative, adds overhead. Try combining hosts that are not your problem children to reduce the memory footprint.
Throw hardware at the problem, ram is cheap
I measure the following every few minutes in perfmon so I can diagnose where the problem is:
BizTalk:MessageAgent(*)\Process memory usage (MB)
BizTalk:MessageAgent(*)\Process memory usage threshold
Memory\Available MBytes
A few other things to take a look at. Make sure any custom pipelines use good BizTalk memory practices (i.e. no XML DOM manipulation hiding somewhere, etc). Also theoretically reducing the number of threads for a given host should lower the amount of memory it can seize at one time. I did not seem to have much luck with this one. Maybe the BizTalk throttling overrode it as others have mentioned, I don't know. Also, on a final note, if you dump the perfmon results to a csv, with Excel you can make some pretty memory usage graphs. These might be useful for talking to management about buying more hardware. That's assuming your issue fits this scenario as well.
We fixed the problem temporarily thanks to a combination of all your answers.
We set the process memory usage throttling parameters of some hosts higher.
We divided the balance of the host instances better after I analyzed all the memory usage of all hosts, thanks to performance counters and also with the use of a tool called MsgBoxViewer.
And now we're trying to get more physical memory & hopefully also an extra server or a 64bit server.
Thanks for all replies!
We recently installed a 64-bit server in a cluster with our older server. Thanks to this we can balance the memory even better, which solved a lot of problems.
That said, the 64-bit server didn't give us much improvement (except for a bit more memory), since IBM MQ, MLLP, the HL7 pipelines etc. can't run in 64-bit.
The other answers are helpful for run-time performance tuning, but I would recommend a design change as well.
You say that you do a lot of message manipulation in the message assignment shapes of the orchestration.
I would recommend moving that code to dedicated transforms. They are much more lightweight and execute faster. You can combine custom XSLT and C# in these maps to do the hard work. Orchestrations cost more in development, design and testing, and a whole lot more in run-time performance.
You can then use transforms for message transformation, and leave the orchestrating (what is left of it after moving the message assignment code) to the orchestrations.
The added benefit of using transforms over orchestrations is that they are much more testable.

Socket.BeginReceive Performance on Mono

I'm developing a server in C#. This server will act as a data server for a backup service: a client will send data, a lot of data, continuously; specifically, it will send chunks of up to five files over the same TCP channel. The client will send data to the server slowly, because I don't want to kill the customer's bandwidth, so I don't need to maximize send speed and, for this reason, I can use a single TCP channel for everything.
That said, the server currently uses the BeginReceive method to acquire data from the client and, on Windows, this means IOCP. My question is: how will BeginReceive perform on Linux/FreeBSD through Mono? From what I've read, it performs very well on Windows, but the server part of this software will run on Linux or FreeBSD through Mono and I don't know how these methods are implemented there!
Also, to reduce repeated allocation of an async state object for the (Begin|End)Receive methods, I maintain one per TCP connection, and in the BeginReceive callback I copy the data out before reusing it (naturally I don't clear the buffer, because I know how many bytes were read from the EndReceive return value). The buffer is set to 8 KB, so I'll copy out at most 8 KB of data; it shouldn't kill resources.
My target is to handle up to 400-500 connections at most. That isn't many, but the server machine will, in the meantime, handle files through its own filesystem (developed using FUSE, first in C# and later in C) on LVM + Linux software RAID mirror, and run antivirus checks using ClamAV, so the software must be as light as it can be!
EDIT: I forgot to say that the machine will (probably) be an Intel Core 2 Duo 2.66+ GHz (3 MB L2, FSB 1066 MHz) with 2 GB of RAM and a 64-bit OS.
Is Mono using epoll (libevent) or kqueue (on FreeBSD)? And should I do something specific to maximize performance? Can I do anything more to avoid killing resources when receiving data packets?
I know it's a little late, but I just found this question...
Mono is able to handle the number of connections that you need and much more. I regularly test xsp2 (the Mono ASP.NET standalone server) with over 1k simultaneous connections. If this is going to be a high load situation, you should play a bit with setting MONO_THREADS_PER_CPU until you find the right number of threads for the ThreadPool.
On linux, Mono uses epoll when available (which is always these days).
I can't speak specifically about the performance of that one function on Mono, but in general Mono performs very well these days. 400-500 connections is, as you say, not very many, so I doubt you'd have any issues.
In saying that, it shouldn't be very hard to set a test for this kind of thing up. I think that's probably the only way you'll get a conclusive answer for your situation.
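A test along those lines could start from a small receiver like the following sketch, which mirrors the pattern from the question: one reusable 8 KB buffer per connection, with the received bytes copied out before the buffer is reused. The Receiver class and its onData callback are illustrative names, not part of the original server:

```csharp
using System;
using System.Net.Sockets;

sealed class Receiver
{
    private readonly Socket _socket;
    private readonly byte[] _buffer = new byte[8 * 1024]; // reused for every receive
    private readonly Action<byte[]> _onData;

    public Receiver(Socket socket, Action<byte[]> onData)
    {
        _socket = socket;
        _onData = onData;
    }

    // Posts one asynchronous receive into the shared buffer.
    public void Start()
    {
        _socket.BeginReceive(_buffer, 0, _buffer.Length, SocketFlags.None, OnReceive, null);
    }

    private void OnReceive(IAsyncResult result)
    {
        int read;
        try { read = _socket.EndReceive(result); }
        catch (SocketException) { return; }         // connection error
        catch (ObjectDisposedException) { return; } // socket closed locally
        if (read == 0) return;                      // peer closed the connection

        // Copy out only the bytes actually received, then reuse the buffer.
        var copy = new byte[read];
        Buffer.BlockCopy(_buffer, 0, copy, 0, read);
        _onData(copy);
        Start();
    }
}
```

Running the same program under .NET on Windows and under Mono on the target machine, with a few hundred loopback or LAN clients, should give numbers specific to the actual hardware.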

Is it ok to never delete from a MSMQ?

I have inherited an application that pulls messages out of an MSMQ queue, does some processing on them, and then adds some data to a database depending on what is in the message. The messages are pushed into the queue by a third-party application I do not control.
I do not know much about MSMQ, although I do have a basic understanding of how to use the APIs.
Anyway, I have noticed that the messages never get deleted, our client definitely never explicitly deletes them, and I can look in Computer Management and see the messages going back to when the server was last rebooted.
Is this wrong? Will the messages start to automatically get deleted when the queue reaches some maximum size or will they just pile up there forever slowly taking up more memory?
Once a message has been processed, it is normal practice to remove it from a queue (transactionally or otherwise).
I'd suspect that while this isn't best practice, the queue is cleared on reboot, and as long as there's a sufficient amount of resources available, you'll never actually run into a problem.
That said, I'd opt for setting something up to periodically clean up the queue so you don't overwhelm the server. I'm not too familiar with MSMQ, but is there some way that you can tell if a message has been processed? Even if it's an additional service that runs, checks the messages in the queue and sees if they already appear in the database, and deletes them if they do? That way, you wouldn't need to modify the codebase you inherited, since it's working properly as-is.
Once you decide on a solution, please post an update here - I'm interested to know how you end up dealing with this problem. Thanks!
"Anyway, I have noticed that the messages never get deleted, our client definately never explictly deletes them, and I can look in computer management and see the messages back to when the server was last rebooted."
Sounds like the messages are Express if there are none around from before the last reboot. Express messages are only stored in RAM and not persisted to disk so restarting the MSMQ service will destroy them. This is probably why the volume of messages has never reached a critical level.
As MSMQ uses kernel memory and disk space for memory storage, eventually one of the two would give out and cause you server stability issues, so your plan to have a cleanup process is a good one.
Cheers,
John Breakwell (MSFT)
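For the cleanup discussed in these answers, here is a minimal sketch using System.Messaging from the .NET Framework. It relies on the fact that MessageQueue.Receive removes a message from the queue, while Peek leaves it there. The queue path and the process callback are placeholders, and it is only an assumption that removing each message after processing is acceptable for this application:

```csharp
using System;
using System.Messaging;

static class QueueDrainer
{
    // Receives (and thereby removes) messages until the queue stays empty
    // for one second. Returns the number of messages handled.
    public static int Drain(string queuePath, Action<Message> process)
    {
        int handled = 0;
        using (var queue = new MessageQueue(queuePath))
        {
            while (true)
            {
                Message message;
                try
                {
                    message = queue.Receive(TimeSpan.FromSeconds(1));
                }
                catch (MessageQueueException ex)
                {
                    // A timeout just means there is nothing left to read.
                    if (ex.MessageQueueErrorCode == MessageQueueErrorCode.IOTimeout) break;
                    throw;
                }
                process(message);
                handled++;
            }
        }
        return handled;
    }
}
```

On a non-transactional queue the message is already gone when process runs, so a failure during processing loses it. If that matters, a transactional receive is the safer variant.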
