We have a BizTalk server (a single virtual one!) at our company, and a SQL Server where the data is kept.
We have a lot of data traffic. I'm talking about hundreds of thousands. I'm not even sure one server is safe for that, but our company is not easy to convince.
Recently we have had a lot of problems.
Allow me to describe the situation in detail, so I don't miss anything:
Our server has 5 applications:
One with 3 orchestrations, 12 send ports, 16 receive locations.
One with 4 orchestrations, 32 send ports, 20 receive locations.
One with 4 orchestrations, 24 send ports, 20 receive locations.
One with 47 (yes 47) orchestrations, 37 send ports, 6 receive locations.
One common application with a couple of resources.
Our problems have occurred since we deployed the application with the 47 orchestrations.
A lot of these orchestrations use assign shapes that call C# code to do the mapping. We use HL7 extensions, which are kind of special, and because a lot of these schemas look alike it was a lot easier to do the mapping with C# code and XPath. The C# code reads in XmlNodes obtained through XPath and returns XmlNodes, which are then assigned to BizTalk messages again. I'm not sure if this could be the cause, but I thought I'd mention it.
The send and receive ports have a lot of different types: File, MQSeries, SQL, MLLP, FTP.
Each of these types has its own host instance, to balance out the load.
Our orchestrations use the BiztalkApplication host.
On this server also a couple of scripts are running, mostly ftp upload scripts & also a zipper script, which zips files every half an hour in a daily zip and deletes the zip files after a month. We use this zipscript on our backup files (we backup a lot, backups are also on our server), we did this because the server had problems with sending files to a location where there were a lot (A LOT) of files, so after the files were reduced to zips it went better.
Now the problems we are having recently are mainly two major problems:
Our most important problem is the following. We kept a receive location with a lot of messages on a queue for testing. When we start this receive location, which feeds the 47 orchestrations, the number of running service instances skyrockets. OK, this is pretty normal. Let's say it reaches about 10000; we then stop the receive location to see how BizTalk handles these 10000 instances. Normally the count would go down pretty fast, and sometimes it does, but after a while it starts to "throttle": instances stop being processed and the count stays put. For example, in 30 seconds it goes down from 10000 to 4000, then it stays at 4000 and drops very, very slowly, something like 30 in 5 minutes. This means that the service instances of all the other applications are stuck as well and are not processed either.
We noticed that after restarting our host instances the instance count went down fast again. So we tried selectively restarting different host instances to locate the problem, and found that restarting the file send/receive host instance would eventually do the trick. So we thought file sends were the problem, considering that we make a lot of backups. We replaced the file-type backups with MQSeries backups. The same problem occurred and, funny thing, restarting the file send/receive host still fixes it.
No errors can be found in the event viewer either.
A second problem we're having is that sometimes, at around 6 am, all or some of the host instances are stopped.
In the event viewer we noticed the following errors (these are more than one):
The receive location "MdnBericht SQL" with URL "SQL://ZNACDBPEG/mdnd0001/" is shutting down. Details:"The error threshold has been exceeded. The receive location is shutting down.".
The Messaging Engine failed to add a receive location "M2m Othello Export Start Bestand" with URL "\m2mservices\Othello_import$\DataFilter Start*.xml" to the adapter "FILE". Reason: "The FILE adapter cannot access the folder \m2mservices\Othello_import$\DataFilter Start.
Verify this folder exists.
Error: Logon failure: unknown user name or bad password.
".
The FILE adapter cannot access the folder \m2mservices\Othello_import$\DataFilter Start.
Verify this folder exists.
Error: Logon failure: unknown user name or bad password.
An attempt to connect to "BizTalkMsgBoxDb" SQL Server database on server "ZNACDBBTS" failed.
Error: "Login failed for user ''. The user is not associated with a trusted SQL Server connection."
It would seem that there's a login failure at this time, that because of it other services also experience problems, and that eventually they are shut down.
The thing is, our user is an admin, and it's impossible that its password is wrong only "sometimes". We have considered that this could be an infrastructure problem, but that's not really our department.
I know it's a long post, but we're not sure what to do anymore. Would adding another server and balancing the load solve our problems? Is there a way to measure our balance and know where to start splitting? What are normal load numbers, etc.?
I appreciate any answers because these issues are getting worse and we're also on a deadline.
Thanks a lot for replies!
Your immediate problem is BizTalk's throttling feature. It's supposed to help BizTalk survive temporary overload conditions. One of its many problems is that you can see the throttling kick in only in the performance monitor and not in the event log.
What you should do:
Separate the new application into a different host than the rest of the applications. Throttling is done at the host level, so the problematic application won't affect the rest of the applications.
Read about how to disable throttling in the link above.
What we have done is implement an external throttling service that feeds the BizTalk receive location in small, digestible packets. It's ugly, but the problem is ugly.
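To give an idea of what such an external throttling service can look like, here is a minimal sketch in Python. It assumes a FILE receive location and two folders (a staging folder and the folder the receive location polls); the function name `feed_batch` and the parameters `batch_size` and `max_backlog` are made up for illustration and are not part of BizTalk.

```python
import shutil
from pathlib import Path


def feed_batch(staging: Path, receive: Path, batch_size: int = 50, max_backlog: int = 100) -> int:
    """Move up to batch_size files from staging into the receive folder.

    Nothing is moved while the receive folder already holds max_backlog or
    more files, so the consumer is never handed more than it can digest.
    Returns the number of files moved.
    """
    backlog = sum(1 for p in receive.iterdir() if p.is_file())
    if backlog >= max_backlog:
        return 0
    room = min(batch_size, max_backlog - backlog)
    # Oldest files first, so messages keep roughly their arrival order.
    candidates = sorted(
        (p for p in staging.iterdir() if p.is_file()),
        key=lambda p: (p.stat().st_mtime, p.name),
    )
    moved = 0
    for path in candidates[:room]:
        shutil.move(str(path), str(receive / path.name))
        moved += 1
    return moved
```

You would call this from a scheduled task or a small loop every few seconds. The backlog check is the important part: the feeder backs off on its own when the receive side is not keeping up.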
Update to comment: You have enough host instances, so ignore that advice. You may redistribute the applications between the instances, but there are no clear guidelines for doing that, so it's just shuffling and guessing.
About the safety of disabling throttling: this feature doesn't make much sense in many scenarios. You have to study it. Check which of the throttling parameters you are hitting (this can be seen in the performance monitor) and decide how to change the thresholds.
How many host instances do you have?
From the line:
"The send and receive ports have a lot of different types: File, MQSeries, SQL, MLLP, FTP. Each of these types have a different host instances, to balance out the load. Our orchestrations use the BiztalkApplication host"
It sounds like you have a lot. I recently did an audit of a system where BizTalk was self-throttling, and the issue was in part due to too many host instances. Each host instance places its own load upon the BizTalk MessageBox, as well as chewing up a minimum of 200 MB of memory.
Reading your comment, you have 20. This is too many and would be a big part of your problems.
A good starting host setup would be:
A dedicated tracking host
One host that contains all receive handlers for adapters
One host that contains all orchestrations
One host that contains all send handlers for adapters
One host for adapters that need to be clustered (like FTP and MSMQ)
You can then also consider things like introducing "real time" hosts and batched hosts, so you can tune the real time hosts for low latency.
You can also have hosts for specific applications if they are known to be unstable, but in general this should not be done.
I run a BizTalk system that has similar problems and can empathize with what you are seeing. I don't know if it's the same, but I thought I'd share my experience in case.
In the same manner restarting the send/receive seems to fix the problem. In my case I found a direct correlation to memory usage by the host processes. I used performance counters to see when a given host was throttled for memory. By creating extra hosts, and moving orchestrations and ports between them I was able to narrow down which business sets were causing the problem. Basically in my case restarting the hosts was the equivalent to the ultimate "garbage collection" to free up memory. This was of course until enough instances came through to gobble it up again.
I'm afraid I have not solved the issue yet, but a few things I found to alleviate the issue:
Raise the memory to a given process so that throttling does not occur or occurs later
Each host instance, while informative, adds its own overhead. Try combining the hosts that are not your problem children to reduce the memory footprint.
Throw hardware at the problem, ram is cheap
I measure the following every few minutes in perfmon so I can diagnose where the problem is:
BizTalk:MessageAgent(*)\Process memory usage (MB)
BizTalk:MessageAgent(*)\Process memory usage threshold
Memory\Available MBytes
A few other things to take a look at. Make sure any custom pipelines use good BizTalk memory practices (i.e. no XML DOM manipulation hiding somewhere, etc). Also theoretically reducing the number of threads for a given host should lower the amount of memory it can seize at one time. I did not seem to have much luck with this one. Maybe the BizTalk throttling overrode it as others have mentioned, I don't know. Also, on a final note, if you dump the perfmon results to a csv, with Excel you can make some pretty memory usage graphs. These might be useful for talking to management about buying more hardware. That's assuming your issue fits this scenario as well.
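If you do dump the perfmon results to a csv, a few lines of script can pick out the samples where a host was at or over its memory threshold. This is only a sketch in Python: it assumes the first column of the export holds the timestamp, that both counters are logged in the same unit, and that you pass in the exact column headers from your own file (the function name `throttle_samples` is mine).

```python
import csv
import io


def throttle_samples(csv_text, usage_col, threshold_col):
    """Return (timestamp, usage, threshold) for samples where usage >= threshold."""
    reader = csv.DictReader(io.StringIO(csv_text))
    time_col = reader.fieldnames[0]  # assumption: timestamp is the first column
    hits = []
    for row in reader:
        try:
            usage = float(row[usage_col])
            threshold = float(row[threshold_col])
        except (TypeError, ValueError):
            continue  # skip samples with blank or non-numeric values
        if usage >= threshold:
            hits.append((row[time_col], usage, threshold))
    return hits
```

Run it once per host instance column pair; the hosts that show up most often are the ones worth splitting or giving a higher threshold.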
We fixed the problem temporarily thanks to a combination of all your answers.
We set the process memory usage throttling parameters of some hosts higher.
We divided the balance of the host instances better after I analyzed all the memory usage of all hosts, thanks to performance counters and also with the use of a tool called MsgBoxViewer.
And now we're trying to get more physical memory & hopefully also an extra server or a 64bit server.
Thanks for all replies!
We recently installed a 64-bit server in a cluster with our older server. Thanks to this we can balance the memory even better, which solved a lot of problems.
That said, the 64-bit server didn't give us much improvement (except for a bit more memory), since 64-bit can't be used for IBM MQ, MLLP, HL7 pipelines etc.
The other answers are helpful for run-time performance tuning, but I would recommend a design change as well.
You say that you do a lot of message manipulation in the orchestration in the message assignment shapes.
I would recommend moving that code to dedicated transforms. They are much more lightweight and execute faster. You can combine custom XSLT and C# in these maps to do the hard work. Orchestrations cost more in development, design and testing, and a whole lot more in run-time performance.
You can then use transforms for message transformation, and leave the orchestrating (what is left of it after moving the message assignment code) to the orchestrations.
The added benefit of using transforms over orchestrations is that they are much more testable.
My C# ASP.net MVC site allows users to upload many photos at a time (using dropzone.js). After uploading about 80 files the user started receiving a 503 Server Unavailable error.
They waited a few minutes, retried the files that failed and the rest uploaded (about 20).
How can I prevent the server from doing this?
When it happened, the rest of the site remained operational.
TIA!
A 503 Server Unavailable error under load typically means you've crossed some resource limit on your server. There are two ways to fix this:
Change the client, so it doesn't reach this limit (or at least not quickly). For dropzone specifically I'd add throttling logic to it, so that when a user drops 100 photos all at once your logic processes them in batches (e.g. five at a time).
Change the server, so this limit is higher. Without more information about your infrastructure it's difficult to tell what's causing the current bottleneck. It could be as simple as scaling up the CPUs/RAM of your host.
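The client-side batching idea can be sketched as follows. The real thing would live in the browser next to dropzone; this is a language-neutral illustration written in Python, and `upload_in_batches` and `upload_one` are hypothetical names (the caller supplies `upload_one`, which uploads a single file).

```python
from concurrent.futures import ThreadPoolExecutor


def upload_in_batches(files, upload_one, batch_size=5):
    """Upload files batch_size at a time; a new batch starts only when the previous one is done."""
    results = []
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        for start in range(0, len(files), batch_size):
            batch = files[start:start + batch_size]
            # Consuming the map() iterator blocks until the whole batch has finished.
            results.extend(pool.map(upload_one, batch))
    return results
```

The server never sees more than `batch_size` requests in flight from one user, which keeps a 100-photo drop from hitting the limit all at once.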
A 503 Server Unavailable error means your application is not scaling: it breaks down when asked to handle further requests. A couple of suggestions to fix this issue:
Vertical scaling
Horizontal Scaling
Revisit the Architecture
Even before going through these points, I suggest you load test your current app and create a benchmark. That means testing the application at 100 files per minute, then 200, 300 ... and identifying the number at which it breaks and throws a 503 error. You can use a simple tool such as JMeter for this.
Also define your target requirement, like 10K files per hour or 100 files per minute.
Vertical Scaling:
Based on the benchmark results from your load test, you can start fine-tuning the RAM/CPU/IO capacity of the existing machine. For example, if you are using containers, you can increase the allocated hardware resources. Repeat the test until it satisfies your target requirement.
Horizontal Scaling
Based on your bench marking results from load test, start adding more nodes to your cluster. And repeat the test.
Revisit the Architecture
The above two options tune the infrastructure without modifying the application architecture. But your requirement is file uploading, which is an I/O-bound task, not a CPU-bound one. So to achieve greater scalability with optimal resources you could consider re-architecting the application using either "NodeJS" or "C# Reactive X" (https://github.com/dotnet/reactive). These follow a reactive programming style, which gives you non-blocking async processing capabilities.
I'm writing a calculation intensive program in C# using the TPL. Some preliminary benchmarking shows good reduction in computation time through using processors with more cores/threads.
However, there is a limit to how many threads are available on a single CPU (I think even the best Xeons money can buy currently have about 16).
I've been reading about how render farms with a 'grid' of multiple inexpensive CPUs in their own machines is a good way to increase the overall core count, but I have no idea how I go about implementing one of these. Is it implemented at the OS level with Microsoft server technology (and if so, how?), or do I also need to modify the C# code itself?
Any help or links to existing information would be greatly appreciated.
If you want to do this at scale (100s of nodes) then developing your own system is hard. You have to handle nodes becoming unavailable, data replication to each node, tracking job progress... It's a long list. You also need to consider the sort of communication you're going to require between your nodes. Remember that the cost of sending a message (data) from one thread to another is tiny compared to the cost of sending it to another machine across a network (even a fast one). You may have to completely rewrite your multithreaded application to run well on a distributed system, even to the point of using a completely different algorithm.
Hadoop
Microsoft had plans to commercialize Dryad as LINQ to HPC but this project was sidelined a while back (I worked on this project before I left Microsoft). I believe you can still get the final "public preview", but it's unsupported. The SQL team opted to work with the Hadoop/Hortonworks people on getting a Windows/Azure/.NET friendly Hadoop distribution off the ground. As far as I know the only thing they have delivered is HDInsight, a Hadoop service running in Azure.
There is now a Microsoft .NET SDK For Hadoop which will allow you to manage a cluster and submit jobs etc. It does not seem to allow you to write code that executes on the Hadoop nodes. You can however use the Hadoop streaming API. This is fairly low level but is language agnostic so you can pretty much use it to integrate map reduce code written in any language with Hadoop. More details on this can be found in this blog post.
Hadoop for .NET Developers
If you want to do this at a smaller scale (10s of nodes) then I would look for something like MPI .NET. It looks like this project has been abandoned, but something similar is probably what you want.
You might look into something like Dryad - http://research.microsoft.com/en-us/projects/dryadlinq/default.aspx
On the other hand it might be a bit too much for your situation, but the ideas in Dryad could be simplified to your needs.
You might also look into making your own TaskScheduler, which could handle the distribution of threads to agents running on other boxes, but you would have to implement a simple socket client/server communication to get and push the data.
Another and a bit odd suggestion, which might be okay for investigating things, is to do the following.
Let the master of the calculation cut the problem into the number of available client computers.
Write the parameters to kick off the calculation for each client to a file shared by all on the network.
Let the clients look for files dedicated to them, and kick off the calculation for their piece when the file appears. The output is written back to a result file.
The server will sit and listen for all clients completing their jobs.
The files could be replaced with a database, low-level sockets, REST services, Web Services etc. depending on your needs.
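The file-based scheme above could be sketched roughly like this in Python. All names here (`publish_jobs`, `run_client`, `collect_results`, the `.job.json` / `.result.json` file naming) are invented for the example, and it assumes that a rename on your shared folder is atomic enough that a client never sees a half-written file.

```python
import json
from pathlib import Path


def split_work(items, n_clients):
    """Cut items into n_clients nearly equal contiguous slices."""
    size, extra = divmod(len(items), n_clients)
    slices, start = [], 0
    for i in range(n_clients):
        end = start + size + (1 if i < extra else 0)
        slices.append(items[start:end])
        start = end
    return slices


def _write_then_rename(target: Path, payload) -> None:
    """Write to a temporary name first, then rename into place."""
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_text(json.dumps(payload))
    tmp.replace(target)


def publish_jobs(shared: Path, items, clients) -> None:
    """Master: write one job file per client into the shared folder."""
    for client, chunk in zip(clients, split_work(items, len(clients))):
        _write_then_rename(shared / f"{client}.job.json", chunk)


def run_client(shared: Path, client: str, compute) -> bool:
    """Client: process our job file if it exists; return True if work was done."""
    job = shared / f"{client}.job.json"
    if not job.exists():
        return False
    chunk = json.loads(job.read_text())
    _write_then_rename(shared / f"{client}.result.json", [compute(x) for x in chunk])
    job.unlink()
    return True


def collect_results(shared: Path, clients):
    """Master: combined results in client order, or None while any client is unfinished."""
    combined = []
    for client in clients:
        result = shared / f"{client}.result.json"
        if not result.exists():
            return None
        combined.extend(json.loads(result.read_text()))
    return combined
```

Each client would call `run_client` in a polling loop, and the master would poll `collect_results` until it stops returning None. It does nothing about clients that die halfway, which is fine for investigating but not for production.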
This question continues from what I learnt from my question yesterday titled using git to distribute nightly builds.
In the answers to that question it was made clear that git would not suit my needs, and I was encouraged to re-examine using BitTorrent.
Short Version
Need to distribute nightly builds to 70+ people each morning, would like to use BitTorrent to load balance the transfer.
Long Version
NB. You can skip the below paragraph if you have read my previous question.
Each morning we need to distribute our nightly build to the studio of 70+ people (artists, testers, programmers, production etc). Up until now we have copied the build to a server and written a sync program that fetches it (using Robocopy underneath). Even with mirrors set up, the transfer is unacceptably slow: a sync takes up to an hour or longer at peak times (roughly 15 minutes off-peak), which points to a hardware I/O bottleneck and possibly network bandwidth.
What I know so far
What I have found so far:
I have found the excellent entry on Wikipedia about the BitTorrent protocol which was an interesting read (I had only previously known the basics of how torrents worked). Also found this StackOverflow answer on the BITFIELD exchange that happens after the client-server handshake.
I have also found the MonoTorrent C# Library (GitHub Source) that I can use to write our own tracker and client. We cannot use off the shelf trackers or clients (e.g. uTorrent).
Questions
In my initial design, I have our build system creating a .torrent file and adding it to the tracker. I would super-seed the torrent using our existing mirrors of the build.
Using this design, would I need to create a new .torrent file for each new build? In other words, would it be possible to create a "rolling" .torrent where, if only 20% of the build's content has changed, that is all that needs to be downloaded to get latest?
... Actually, in writing the above question, I think that I would need to create a new file; however, I would be able to download to the same location on the user's machine and the hashes will automatically determine what I already have. Is this correct?
In response to comments
For a completely fresh sync, the entire build (including the game, source code, localized data, and disc images for PS3 and X360) is ~37,000 files and comes in just under 50GB. This is going to increase as production continues. This sync took 29 minutes to complete at a time when there were only 2 other syncs happening, which is low-peak if you consider that at 9am we would have 50+ people wanting to get latest.
We have investigated the disk I/O and network bandwidth with the IT dept; the conclusion was that the network storage was being saturated. We are also recording statistics to a database of syncs, these records show even with handful of users we are getting unacceptable transfer rates.
In regard to not using off-the-shelf clients: it is a legal concern to have an application like uTorrent installed on users' machines, given that other items can easily be downloaded using that program. We also want to have a custom workflow for determining which build you want to get (e.g. only PS3 or X360 depending on what DEVKIT you have on your desk), notifications of new builds being available, etc. Creating a client using MonoTorrent is not the part that I'm concerned about.
To the question whether or not you need to create a new .torrent, the answer is: yes.
However, depending a bit on the layout of your data, you may be able to do some simple semi-delta-updates.
If the data you distribute is a large collection of individual files, of which only some change with each build, you can simply create a new .torrent file and have all clients download it to the same location as the old one (just like you suggest). The clients would first check the files that already exist on disk, update the ones that have changed and download new files. The main drawback is that removed files would not actually be deleted at the clients.
If you're writing your own client anyway, deleting files on the filesystem that aren't in the .torrent file is a fairly simple step that can be done separately.
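A rough sketch of that cleanup step in Python, assuming your client can hand you the list of relative file paths from the new .torrent (the function name `remove_stale_files` is made up, and how you read the file list depends on the library you use):

```python
from pathlib import Path


def remove_stale_files(download_dir: Path, torrent_paths) -> list:
    """Delete files under download_dir that are not listed in the new torrent.

    torrent_paths holds paths relative to download_dir, e.g. "game/data.pak".
    Returns the relative paths that were removed.
    """
    wanted = {Path(p) for p in torrent_paths}
    removed = []
    # sorted() takes a snapshot first, so we do not delete while still walking the tree.
    for path in sorted(download_dir.rglob("*")):
        relative = path.relative_to(download_dir)
        if path.is_file() and relative not in wanted:
            path.unlink()
            removed.append(relative.as_posix())
    return removed
```

Empty directories are left behind in this version; removing them is a second small pass if you care.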
Downloading over the old data does not work if you distribute an image file, since the bits that stayed the same across the versions may have moved, thus yielding different piece hashes.
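To see why moved bits defeat the hash check, here is a small illustration in Python. It hashes fixed-size pieces with SHA-1 the way a .torrent does; the data and the piece length are toy values chosen for the demo.

```python
import hashlib


def piece_hashes(data: bytes, piece_length: int) -> list:
    """SHA-1 digest of each fixed-size piece of data."""
    return [
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    ]


def unchanged_pieces(old: bytes, new: bytes, piece_length: int) -> int:
    """Count pieces whose hash is identical at the same index in both versions."""
    pairs = zip(piece_hashes(old, piece_length), piece_hashes(new, piece_length))
    return sum(a == b for a, b in pairs)


old = bytes(range(256)) * 64                  # 16 pieces of 1024 bytes
edited = old[:5000] + b"\xff" + old[5001:]    # one byte overwritten in place
shifted = b"\x00" + old                       # one byte inserted at the front

print(unchanged_pieces(old, edited, 1024))    # 15: only the touched piece differs
print(unchanged_pieces(old, shifted, 1024))   # 0: every piece boundary has moved
```

An in-place change costs you one piece, while a single inserted byte invalidates everything after it, even though almost all of the content is the same.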
I would not necessarily recommend using super-seeding. Depending on how strict the super seeding implementation you use is, it may actually harm transfer rates. Keep in mind that the purpose of super seeding is to minimize the number of bytes sent from the seed, not to maximize the transfer rate. If all your clients are behaving properly (i.e. using rarest first), the piece distribution shouldn't be a problem anyway.
Also, to create a torrent and to hash-check a 50 GiB torrent puts a lot of load on the drive, you may want to benchmark the bittorrent implementation you use for this, to make sure it's performant enough. At 50 GiB, the difference between different implementations may be significant.
Just wanted to add a few non-BitTorrent suggestions for your perusal:
If the delta between nightly builds is not significant, you may be able to use rsync to reduce your network traffic and decrease the time it takes to copy the build. At a previous company we used rsync to submit builds to our publisher, as we found our disc images didn't change much build-to-build.
Have you considered simply staggering the copy operations so that clients aren't slowing down the transfer for each other? We've been using a simple Python script internally when we do milestone branches: the script goes to sleep until a random time in a specified range, wakes up, downloads and checks-out the required repositories and runs a build. The user runs the script when leaving work for the day, when they return they have a fresh copy of everything ready to go.
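The staggering part is only a few lines. This is not our actual script, just a sketch of the idea in Python; the function names and the window are made up.

```python
import random
from datetime import datetime, timedelta


def pick_start_time(window_start: datetime, window_end: datetime, rng=random) -> datetime:
    """Pick a uniformly random moment inside [window_start, window_end]."""
    span = (window_end - window_start).total_seconds()
    return window_start + timedelta(seconds=rng.uniform(0, span))


def seconds_until(when: datetime, now: datetime) -> float:
    """Seconds to sleep before starting; zero if the moment has already passed."""
    return max(0.0, (when - now).total_seconds())
```

Each machine calls `pick_start_time` once, sleeps for `seconds_until(...)` and then starts its sync, so 70 clients spread themselves over the window instead of all hitting the server at 9am.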
You could use BitTorrent Sync, which is something of an alternative to Dropbox but without a server in the cloud. It allows you to synchronize any number of folders and files of any size with several people, and it uses the same algorithms as the BitTorrent protocol. You can create a read-only folder and share the key with others. This method removes the need to create a new torrent file for each build.
Just to throw another option into the mix, have you considered BITS? Not used it myself but from reading the documentation it supports a distributed peer caching model which sounds like it will achieve what you want.
The downside is that it is a background service so it will give up network bandwidth in favour of user initiated activity - nice for your users but possibly not what you want if you need data on a machine in a hurry.
Still, it's another option.
I've got a C# service that currently runs single-instance on a PC. I'd like to split this component so that it runs on multiple PCs. Each PC should be assigned a certain part of the work. If one PC fails, its work should be moved to a backup machine.
Data synchronization can be done by the DB, so that should not be much of an issue. My current idea is to use some kind of load balancer that splits and sends the incoming requests to the array of PCs and makes sure the work is actually processed.
How would I implement such a functionality? I'm not sure if I'm asking the right question. If my understanding of how this goal should be achieved is wrong, please give me a hint.
Edit:
I wonder if the idea given above (load balancer splits work packages to PCs and checks for results) is feasible at all. If there is some kind of already implemented solution to this seemingly common problem, I'd love to use that solution.
Availability is a critical requirement.
I'd recommend looking at a Pull model of load-sharing, rather than a Push model. When pushing work, the coordinating server(s)/load-balancer must be aware of all the servers that are currently running in your system so that it knows where to forward requests; this must either be set in config or dynamically set (such as in the Publisher-Subscriber model), then constantly checked to detect if any servers have gone offline. Whilst it's entirely feasible, it can complicate the scaling-out of your application.
With a Pull architecture, you have a central work queue (hosted in MSMQ, SQL Server Service Broker or similar) and each processing service pulls work off that queue. Expose a WCF service to accept external requests and place work onto the queue, safe in the knowledge that some server will do the work, even though you don't know exactly which one. This has the added benefits that each server monitors its own workload and picks up work as and when it is ready, and you can easily add or remove servers to/from this model without any change in config.
This architecture is supported by NServiceBus and the communication between Windows Azure Web & Worker roles.
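The Pull model can be illustrated in a few lines. This sketch is in Python and uses an in-process `queue.Queue` with threads as a stand-in for MSMQ and separate servers, so it only shows the shape of the idea; the names and the "work" (squaring a number) are placeholders.

```python
import queue
import threading


def worker(name, work_queue, results, lock):
    """Pull items off the shared queue until the stop sentinel arrives."""
    while True:
        item = work_queue.get()
        if item is None:              # sentinel: no more work for this worker
            work_queue.task_done()
            return
        outcome = item * item         # placeholder for the real processing
        with lock:
            results.append((name, item, outcome))
        work_queue.task_done()


def run_pull_demo(items, worker_count=3):
    """Start workers, enqueue the items and return (worker, item, outcome) tuples."""
    work_queue = queue.Queue()
    results, lock = [], threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}", work_queue, results, lock))
        for i in range(worker_count)
    ]
    for t in threads:
        t.start()
    for item in items:
        work_queue.put(item)
    for _ in threads:
        work_queue.put(None)          # one sentinel per worker
    work_queue.join()
    for t in threads:
        t.join()
    return results
```

Note that the producer never names a worker. Whichever one is free takes the next item, which is why adding or removing workers needs no config change.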
From what you said each PC will require a full copy of your service -
"Each PC should be assigned a certain part of the work. If one PC fails, its work should be moved to a backup machine"
Otherwise you won't be able to move its work to another PC.
I would be tempted to have a central server which farms out work to individual PCs. This means that you would need some form of communication between each machine and the central server, and to keep a record back on the central server of what work has been assigned where.
You'll also need each machine to measure its CPU load and reject work if it is too busy.
A multi-threaded approach to the service would make good use of the multiple processor cores that are ubiquitous nowadays.
How about using a server and multi-threading your processing? Or even multi-threading on a PC as you can get many cores on a standard desktop now.
This obviously doesn't deal with the machine going down, but could give you much more performance for less investment.
You can check out Windows clustering, but you will have to handle a set of issues that depend on the behaviour of the service (if you add more details about the service itself I can answer more precisely).
This depends on how you want to split your workload. This is usually done in one of two ways:
Splitting the same workload by multiple services
This means the same service is installed on different servers and does the same job. Assume your service reads huge amounts of data from the DB servers, processes it to produce huge client-specific data files, and finally sends these data files to the clients. In this approach the services installed on the different servers all do the same work, but they split it between them to increase performance.
Splitting the part of the workload by multiple services
In this approach each service is assigned its own individual job and works toward a different goal. In the above example, one service is responsible for reading data from the DB and generating the huge data files, and another service is configured only to read the data files and send them to clients.
I have implemented the 2nd approach in my own work, because it lets me isolate and debug errors in case of failures.
The usual approach for load balancer is to split service requests evenly between all service instances.
For each work item (request) you can store relative information in database. Then each service should also have at least one background thread checking database for abandoned work items.
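A minimal sketch of the "check the database for abandoned work items" idea, in Python with SQLite standing in for your real DB. The table layout, the column names and the heartbeat timeout are all assumptions made for the example.

```python
import sqlite3

SCHEMA = """
CREATE TABLE work_items (
    id        INTEGER PRIMARY KEY,
    payload   TEXT NOT NULL,
    status    TEXT NOT NULL DEFAULT 'pending',  -- pending | running | done
    owner     TEXT,
    heartbeat REAL
)
"""


def claim_next(conn, owner, now):
    """Claim the oldest pending item for owner; return its id, or None if nothing was claimed."""
    with conn:
        row = conn.execute(
            "SELECT id FROM work_items WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status check in the UPDATE keeps two services from claiming the same row.
        cur = conn.execute(
            "UPDATE work_items SET status = 'running', owner = ?, heartbeat = ? "
            "WHERE id = ? AND status = 'pending'",
            (owner, now, row[0]),
        )
        return row[0] if cur.rowcount == 1 else None


def release_abandoned(conn, now, timeout):
    """Put running items whose heartbeat is older than timeout back to pending."""
    with conn:
        cur = conn.execute(
            "UPDATE work_items SET status = 'pending', owner = NULL, heartbeat = NULL "
            "WHERE status = 'running' AND heartbeat < ?",
            (now - timeout,),
        )
        return cur.rowcount
```

The background thread on each service would call `release_abandoned` periodically, and a service that is still working on an item needs to refresh its heartbeat so the item is not taken away from it.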
I would suggest that you publish your service through WCF (Windows Communication Foundation).
Then implement a "central" client application which can keep track of available providers of your service and dish out work. The central app will act as scheduler and load balancer of the tasks to be performed.
Check out Juval Löwy's book on WCF ("Programming WCF Services") for a good introduction to this topic.
You can have a look at NGrid : http://ngrid.sourceforge.net/
or Alchemi : http://www.gridbus.org/~alchemi/index.html
both are grid computing framework with load balancers that will get you started in no time.
Cheers,
Florian
I am building a client-server based solution; client being a desktop application and the server being a web application.
Basically, I need to monitor the performance and resource utilization of the client, which is a .NET 2.0 based Windows Desktop application.
The most important thing I need to monitor is the network resources the client uses, i.e. what is the size of the data that flows out from the client to the server and what is the size of the data that the client downloads from the server.
Apart from this, general performance monitoring would help too.
Please guide.
Edit: A few people have suggested using perfmon, but aren't the values shown in perfmon system-wide? I need these network based stats for a single application only...bytes being sent and received by a single desktop application.
The standard tool for network monitoring is Wireshark.
It allows you to filter the network traffic very flexibly.
This could be quite an overkill for your application though.
If you are using pure .NET, I would suggest that you add performance logging to your networking classes on the server side. If you are using .NET library classes, derive your own classes from them and have those record statistics when sending and receiving data.
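The counting idea looks roughly like this. The sketch is in Python rather than .NET, and it wraps the socket instead of inheriting from it, but the principle is the same: every send and receive goes through your own class, which keeps the totals. `CountingSocket` is a name made up for the example.

```python
import socket


class CountingSocket:
    """Thin wrapper that tallies the bytes passing through a connected socket."""

    def __init__(self, sock: socket.socket):
        self._sock = sock
        self.bytes_sent = 0
        self.bytes_received = 0

    def sendall(self, data: bytes) -> None:
        self._sock.sendall(data)
        self.bytes_sent += len(data)      # counted only after the send succeeded

    def recv(self, bufsize: int) -> bytes:
        data = self._sock.recv(bufsize)
        self.bytes_received += len(data)
        return data

    def close(self) -> None:
        self._sock.close()
```

Because the counters live in your own process, they cover exactly one application, which is what system-wide tools do not give you. Keep in mind this counts payload bytes only, not protocol overhead.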
You need to split your monitoring in two parts:
How the system interacts with the server (number of calls performed)
Amount of network traffic (size of exchanged data for any call)
The first part is (in my experience) often neglected although it matters a lot, because acquiring a new connection is often much more expensive than the data traffic itself.
You do not tell us anything about the kind of connection you're using (low-level TCP/IP calls, web services, WCF or something else), but my suggestion is:
Find a way to determine how many times your application calls the server
Find how much any single call costs in terms of data exchanged
How to monitor these values depends a lot on the technology involved. For some it is very simple (if, for example, you're using a web service, setting up Fiddler to monitor the calls and examining the monitoring results is very simple); for others you need to work with a low-level traffic analyzer like Wireshark or MS Network Monitor and learn how to filter traffic according to the IP address of the server, the ports used and other parameters.
If you clarify your application architecture I can try to be more specific.
Regards
Massimo
You can also use Task Manager to do this. Go to the processes tab, then View->"select columns". Check "I/O read bytes" and "I/O write bytes". Then find your program in the processes list and you can observe the cumulative values.
Take a look at this article: http://www.codeproject.com/KB/IP/apptraffwatcher.aspx
You may be able to tear apart the source code and grab what you need to measure download/upload for your application's process ID.
It looks like he uses this library to get information about the amount of traffic: http://www.codeproject.com/KB/IP/trafficwatcher.aspx
I tried the perfmon and I was unable to watch our network traffic either. But I was able to in the Performance Explorer in Visual Studio 2005 Team suite.
If you have the Team edition of Visual Studio you can set up either Sampling or Instrumentation on your desktop application. Then go into the options of the session and select Events -> Windows Kernel Trace -> Network. Run your application and let Visual Studio log the data, then save the report. (I love Microsoft for this "feature".) Go to the command prompt, navigate to C:\Program Files\Microsoft Visual Studio 8\Team Tools\Performance Tools and run "vsperfreport /CALLTRACE (filename).vsp". This will produce a CSV file containing all network packets sent and received by the desktop application, with size, port, etc.
I know this was a long winded solution but I just tried it on my .Net 2.0 application and it captured all of our communication with Oracle Identity Manager and Oracle Database.
It is not clear by your post if you are using HTTP requests. You indicated that the server is a web application, which implies (perhaps incorrectly) to me that you might be using the HTTP protocol to send/receive data from server to client.
If so, one tool that might be of use is Fiddler. This tool will monitor all HTTP traffic in and out of your workstation and it can (I believe) watch specific sessions and applications. The nice part is that you can see individual requests and see the statistics for these requests, including bytes in/out, round trip times, and other useful bits of information.
If you are not HTTP based, then this tool won't help.
I'm surprised nobody has suggested SysInternals (now Microsoft) Process Explorer (technet.microsoft.com/en-us/sysinternals/bb896653.aspx). If you right click on the executable in question and left click properties it will bring up a dialog box. Then you switch to the performance tab and you can monitor I/O of the executable. The Performance Graph tab will show CPU usage and I/O bytes history graphed over time. It's a cool and free tool.
You want to look at perfmon (otherwise called Performance Monitor in admin tools off the start menu).
Open it in its default graph view, add a counter, select network interface, then bytes per second (or a similar counter), click ok and you're done.
You can experiment with the other networking counters as there are many, one of them will do exactly what you want. You can also save the perfmon logs to a file and view them afterwards - you'll see the graph in its entirety and you can "zoom in" on sections. Alternatively, you can save log-style files with just raw numbers.
Here's a quick guide through perfmon as an admin tool, once you understand that, the rest comes easily.
In Vista you can't add individual counters any more, you add the entire set of counters grouped under an object - so for my example, you'd add the Network Interface object, then you'd see all the individual counters on the graph after you click ok.
If you want this built into your client codebase, and not using an external tool, you can use Performance Counters to get access to this and most other things reported by the Performance Monitor, Task Manager, etc.
You should check out ACE Analyst for this use case - think of it as a superintelligent layer on top of Wireshark packet captures. You need to look at the packets to understand the true nature of the application's behavior as it runs across the network.