I am working on a program that sends/receives files over the network using TCP.
The program sends multiple files, so the stream is not closed until the user quits the program.
The problem I am facing is that when I am sending a 700 MB file, my server program's private memory grows to about 700,000 K and cripples my computer's performance badly. And when I try to send another 700 MB file the server throws a System.OutOfMemoryException.
Can someone tell me what I am doing wrong, or not doing?
Server-side code:
byte[] data;

using (FileStream fs = new FileStream("dracula.avi", FileMode.Open, FileAccess.Read))
{
    data = new byte[fs.Length];
    int remaining = data.Length;
    int offset = 0;

    strWriter.WriteLine("Content-Length: " + data.Length);
    strWriter.Flush();
    Thread.Sleep(1000);

    while (remaining > 0)
    {
        Thread.Sleep(10);
        int read = fs.Read(data, offset, remaining);
        remaining -= read;
        offset += read;
    }

    fs.Flush();
    fs.Close();
}

strm.Write(data, 0, data.Length);
strm.Flush();
GC.Collect();
You're currently reading the whole file into memory, even though you only want to copy it to another stream. Don't do that. Just iterate a chunk at a time: read a chunk, write a chunk, read a chunk, write a chunk, etc. If you're using .NET 4, you can use Stream.CopyTo for that purpose.
You're buffering your reads, but not your writes. The program is doing exactly what you're telling it to -- allocating a gigantic chunk of memory and filling it all before ever sending a single byte.
A much better approach is to read a small chunk from the file (for the sake of argument, 4096 bytes) and then write the chunk to the output stream. By doing this, you'll only use 4096 bytes per connection which is much more scalable.
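For illustration, here is a minimal sketch of that chunked copy, reusing the names from the question (fs is the source FileStream, strm is the network stream); on .NET 4 the whole loop collapses to fs.CopyTo(strm):
byte[] buffer = new byte[4096];
int read;
// Read a chunk, write a chunk, until the end of the file is reached.
while ((read = fs.Read(buffer, 0, buffer.Length)) > 0)
{
    strm.Write(buffer, 0, read);
}
strm.Flush();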
An OOM condition generally occurs when you are either running out of system memory, or, in a 32-bit process, out of address space (about 2 GB).
You say it can successfully copy one but not two? Is that two concurrently or consecutively? What is your threading model? Also, the example is a snippet, you seem to have a StreamWriter and a Stream for writing, are these objects going away?
Be careful with GC.Collect. Microsoft doesn't recommend explicit calls because if you don't use it correctly it can cause objects to stay alive longer than needed. This is because when you do a GC.Collect, you are promoting objects to a higher generation. In my experience it is best to make sure you are releasing objects and let the framework decide what/when to GC.Collect.
I would get familiar with WinDBG+SOS, which allows you to look at the objects on the heap.
Try this:
Startup WinDBG and attach to your process
Type ".loadby sos clr" if using 4.0, otherwise type ".loadby sos mscorwks"
Press F5 to continue
Copy one file, wait for it to complete
Press CTRL+BREAK
Type "!dumpheap -stat", look at the results, look for objects that should be gone
For each object that should be gone, grab the MT value
Type "!dumpheap -mt {0}" replacing {0} with the value from step above
This is a list of instances; grab one of the objects' addresses
Type "!gcroot {0}" replacing {0} with the object's address
This should tell you what is rooting the objects; you then need to find out how to unroot them, e.g. null out references that aren't needed.
Better to send the data chunks as soon as you read them. I didn't test the code, but it should be something like this:
var bufferLength = 1024;
byte[] buffer = new byte[bufferLength];

while (remaining > 0)
{
    // The second argument to Read is the offset into the buffer, not into the file.
    int len = fs.Read(buffer, 0, bufferLength);
    remaining -= len;
    offset += len;
    strm.Write(buffer, 0, len);
}
I've been trying to make a program to transfer a file with bandwidth throttling (after zipping it) to another computer on the same network.
I need to get its bandwidth throttled in order to avoid saturation (Kind of the way Robocopy does).
Recently, I found the ThrottledStream class, but it doesn't seem to be working: I can send a 9 MB file with the throttle limit set to 1 byte per second and it still arrives almost instantly, so I need to know if there's some misapplication of the class.
Here's the code:
using (FileStream originStream = inFile.OpenRead())
using (MemoryStream compressedFile = new MemoryStream())
{
    // Compress the whole file into the MemoryStream, then rewind it so it can be read back.
    using (GZipStream zippingStream = new GZipStream(compressedFile, CompressionMode.Compress, true))
    {
        originStream.CopyTo(zippingStream);
    }
    compressedFile.Seek(0, SeekOrigin.Begin);

    using (FileStream finalDestination = File.Create(destination.FullName + "\\" + inFile.Name + ".gz"))
    {
        ThrottledStream destinationStream = new ThrottledStream(finalDestination, bpsLimit);
        byte[] buffer = new byte[bufferSize];
        int readCount = compressedFile.Read(buffer, 0, bufferSize);
        while (readCount > 0)
        {
            destinationStream.Write(buffer, 0, readCount);
            readCount = compressedFile.Read(buffer, 0, bufferSize);
        }
    }
}
Any help would be appreciated.
The ThrottledStream class you linked to uses a delay calculation to determine how long to wait before performing the current write. This delay is based on the amount of data sent before the current write, and how much time has elapsed. Once the delay period has passed it writes the entire buffer in a single chunk.
The problem with this is that it doesn't do any checks on the size of the buffer being written in a particular write operation. If you ask it to limit throughput to 1 byte per second, then call the Write method with a 20MB buffer, it will write the entire 20MB immediately. If you then try to write another block of data that is 2 bytes long, it will wait for a very long time (20*2^20 seconds) before writing those two bytes.
In order to get the ThrottledStream class to work more smoothly, you have to call Write with very small blocks of data. Each block will still be written immediately, but the delays between the write operations will be smaller and the throughput will be much more even.
In your code you use a variable named bufferSize to determine the number of bytes to process per read/write in the internal loop. Try setting bufferSize to 256, which will result in many more reads and writes, but will give the ThrottledStream a chance to actually introduce some delays.
If you set bufferSize to be the same as bpsLimit you should see a single write operation complete every second. The smaller you set bufferSize, the more write operations you'll get per second, and the smoother the bandwidth throttling will be.
Normally we like to process as much of a buffer as possible in each operation to decrease the overheads, but in this case you're explicitly trying to add overheads to slow things down :)
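As a rough, untested sketch (assuming the linked ThrottledStream constructor takes the wrapped stream and a bytes-per-second limit, and reusing compressedFile, finalDestination and bpsLimit from the question), the write loop would look like this with a small chunk size:
const int chunkSize = 256; // small chunks give ThrottledStream a chance to insert delays
byte[] chunk = new byte[chunkSize];
ThrottledStream throttled = new ThrottledStream(finalDestination, bpsLimit);
int readCount;
while ((readCount = compressedFile.Read(chunk, 0, chunkSize)) > 0)
{
    throttled.Write(chunk, 0, readCount); // write only what was actually read
}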
I'm trying to write a simple C# application which downloads a large number of small files from an FTP server.
I've tried two approaches:
1 - generic socket programming
2 - using FtpWebRequest and FtpWebResponse objects
The download speed (for the same file) when using the first approach varies from 1.5 s to 7 s; the 2nd gives more or less the same results, about 2.5 s each time.
Considering that about 1.4 s out of those 2.5 s is taken by initiating the FtpWebRequest object (only 1.1 s for receiving data), the difference is quite significant.
The question is how to achieve for the 1st approach the same good stable download speed as for the 2nd one?
For the 1st approach the problem seems to lie in the loop below (as it takes about 90% of the download time):
Int32 intResponseLength = dataSocket.Receive(buffer, intBufferSize, SocketFlags.None);
while (intResponseLength != 0)
{
    localFile.Write(buffer, 0, intResponseLength);
    intResponseLength = dataSocket.Receive(buffer, intBufferSize, SocketFlags.None);
}
Equivalent part of code for the 2nd approach (always takes about 1.1s for particular file):
Int32 intResponseLength = ftpStream.Read(buffer, 0, intBufferSize);
while (intResponseLength != 0)
{
    localFile.Write(buffer, 0, intResponseLength);
    intResponseLength = ftpStream.Read(buffer, 0, intBufferSize);
}
I've tried buffers from 56b to 32kB - no significant difference.
Also creating a stream on the open data socket:
Stream str = new NetworkStream(dataSocket);
and reading it (instead of using dataSocket.Receive)
str.Read(buffer, 0, intBufferSize);
doesn't help... in fact it's even slower.
Thanks in advance for any suggestion!
You need to use the Socket.Poll or Socket.Select methods to check the availability of data. What you are doing now not only slows the operation down, but also causes extensive CPU load. Poll or Select will yield processor time until data is available or the timeout elapses. You can keep the same loop but include a call to one of the above methods, and play with timeouts (try values from 10 ms to 500 ms to find the timeout that is optimal for your task).
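A rough sketch of that idea, reusing dataSocket, buffer, intBufferSize and localFile from the question (Poll takes its timeout in microseconds, so 100 ms is 100 * 1000):
while (true)
{
    // Yield until data is available (or the 100 ms timeout elapses) instead of spinning.
    if (!dataSocket.Poll(100 * 1000, SelectMode.SelectRead))
        continue; // no data yet; loop again (a cancellation check could go here)

    int intResponseLength = dataSocket.Receive(buffer, intBufferSize, SocketFlags.None);
    if (intResponseLength == 0)
        break; // Poll signalled readability and Receive returned 0: the remote side has closed

    localFile.Write(buffer, 0, intResponseLength);
}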
I have created a Windows application that routinely downloads files from a load-balanced server; currently the speed is about 30 MB/second. However, when I try FastCopy or TeraCopy they can copy at about 100 MB/second. I want to know how to improve my copy speed so it can copy files faster than it currently does.
One common mistake when using streams is to copy a byte at a time, or to use a small buffer. Most of the time it takes to write data to disk is spent seeking, so using a larger buffer will reduce your average seek time per byte.
Operating systems write files to disk in clusters. This means that when you write a single byte to disk Windows will actually write a block between 512 bytes and 64 kb in size. You can get much better disk performance by using a buffer that is an integer multiple of 64kb.
Additionally, you can get a boost from using a buffer that is a multiple of your CPU's underlying memory page size. For x86/x64 machines this can be either 4 KB or 4 MB.
So you want to use an integer multiple of 4mb.
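For example, a FileStream can be given an explicit internal buffer size; a sketch (the 4 MB value and the path variable are just placeholders, not measured optimums):
const int bufferSize = 4 * 1024 * 1024; // an integer multiple of 4 MB, as suggested above
using (var output = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None, bufferSize))
{
    // write the downloaded data to 'output' here
}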
Additionally if you use asynchronous IO you can fully take advantage of the large buffer size.
using System;
using System.IO;
using System.Net.Sockets;
using System.Threading;

class Downloader
{
    const int size = 4096 * 1024; // 4 MB buffer per receive

    ManualResetEvent done = new ManualResetEvent(false);
    Socket socket;
    Stream stream;

    // Completion callback: write the chunk that just arrived and, if the buffer
    // was filled completely, start the next asynchronous receive.
    void InternalWrite(IAsyncResult ar)
    {
        var read = socket.EndReceive(ar);
        if (read == size)
            InternalRead();

        stream.Write((byte[])ar.AsyncState, 0, read);

        if (read != size)
            done.Set(); // a short read means the sender is finished
    }

    // Start an asynchronous receive into a fresh buffer.
    void InternalRead()
    {
        var buffer = new byte[size];
        socket.BeginReceive(buffer, 0, size, SocketFlags.None, InternalWrite, buffer);
    }

    public bool Save(Socket socket, Stream stream)
    {
        this.socket = socket;
        this.stream = stream;
        InternalRead();
        return done.WaitOne();
    }
}

bool Save(System.Net.Sockets.Socket socket, string filename)
{
    using (var stream = File.OpenWrite(filename))
    {
        var downloader = new Downloader();
        return downloader.Save(socket, stream);
    }
}
Possibly your application could use multiple threads to download the file, but the bandwidth is still limited by the speed of the devices that transfer the content.
The simplest way is to open the file in raw/binary mode (that's C speak, not sure what the C# equivalent is) and read and write very large blocks (several MB) at a time.
The trick TeraCopy uses is to make the reading and writing asynchronous. This means that a block of data can be written while another one is being read.
You have to fiddle around with the number of blocks and the size of those blocks to get the optimum for your situation. I used this method in C++, and for us the optimum was using four blocks of 256 KB when copying from a network share to a local disk.
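A rough C# sketch of that overlapped read/write idea (untested; source and destination stand for any readable and writable streams, and Stream.WriteAsync needs .NET 4.5 or later):
const int blockSize = 256 * 1024;
byte[] readBuffer = new byte[blockSize];
byte[] writeBuffer = new byte[blockSize];

int read = source.Read(readBuffer, 0, blockSize);
while (read > 0)
{
    // Swap buffers: write the block that was just read while reading the next one.
    byte[] tmp = readBuffer; readBuffer = writeBuffer; writeBuffer = tmp;

    System.Threading.Tasks.Task writing = destination.WriteAsync(writeBuffer, 0, read);
    read = source.Read(readBuffer, 0, blockSize);
    writing.Wait(); // make sure the write finished before its buffer is reused
}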
Regards,
Sebastiaan
If you run Process Monitor you can see the block sizes that Windows Explorer or TeraCopy are using.
In Vista the default block size for the local network is, as far as I recall, 2 MB, which makes copying files over a huge pipe a lot faster.
Why reinvent the wheel?
If your situation permits, you are probably better off shelling out to one of the existing "fast" copy utilities than trying to write one yourself. There are numerous non-obvious edge cases which need to be handled, and getting consistently good performance requires lots of trial-and-error experimentation.
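For example, a minimal sketch of shelling out to robocopy (the paths and file name here are placeholders; robocopy exit codes below 8 mean the copy succeeded):
var psi = new System.Diagnostics.ProcessStartInfo
{
    FileName = "robocopy",
    Arguments = "\"C:\\source\" \"\\\\server\\share\\dest\" bigfile.bin",
    UseShellExecute = false
};
using (var robocopy = System.Diagnostics.Process.Start(psi))
{
    robocopy.WaitForExit();
    bool succeeded = robocopy.ExitCode < 8; // robocopy uses exit codes 0-7 for success variants
}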
I'm developing a simple application to send files over TCP using the TCPListener and TCPClient classes. Here's the code that sends the file.
stop is a volatile boolean which allows stopping the process at any time, and WRITE_BUFFER_SIZE can be changed at runtime (another volatile).
while (remaining > 0 && !stop)
{
    DateTime current = DateTime.Now;
    int bufferSize = WRITE_BUFFER_SIZE;
    buffer = new byte[bufferSize];
    int readed = fileStream.Read(buffer, 0, bufferSize);
    stream.Write(buffer, 0, readed);
    stream.Flush();
    remaining -= readed;

    // Wait in order to guarantee send speed
    TimeSpan difference = DateTime.Now.Subtract(current);
    double seconds = (bufferSize / Speed);
    int wait = (int)Math.Floor(seconds * 1000);
    wait -= difference.Milliseconds;
    if (wait > 10)
        Thread.Sleep(wait);
}
stream.Close();
and this is the code that handles the receiver side:
do
{
    readed = stream.Read(buffer, 0, READ_BUFFER_SIZE);

    // write to .part file and flush to disk
    outputStream.Write(buffer, 0, readed);
    outputStream.Flush();
    offset += readed;
} while (!stop && readed > 0);
Now, when the speed is low (about 5 KBps) everything works fine but, as I increase the speed, the receiver side becomes more prone to raise a SocketException when reading from the stream. I'm guessing it has to do with the remote socket being closed before all data can be read, but what's the correct way to do this? When should I close the sending client?
I haven't found any good examples of file transmission on Google, and the ones that I've found have an implementation similar to what I'm doing, so I guess I'm missing something.
Edit: I get this error "Unable to read data from the transport connection". This is an IOException whose inner exception is a SocketException.
I've added this in the sender function, but I still get the same error; the code never reaches stream.Close() and of course the TcpClient never really gets closed... so I'm completely lost now.
buffer = new byte[1];
client.Client.Receive(buffer);
stream.Close();
Typically you want to set the LINGER option on the socket. Under C++ this would be SO_LINGER, but under Windows this doesn't actually work as expected. You really want to do this:
Finish sending data.
Call shutdown() with the how parameter set to 1.
Loop on recv() until it returns 0.
Call closesocket().
Taken from: http://tangentsoft.net/wskfaq/newbie.html#howclose
C# may have corrected this in its libraries, but I doubt it since they are built on top of the Winsock API.
Edit:
Looking at your code in more detail, I see that you are not sending any header across at all, so on the receiving side you have no idea how many bytes you are actually supposed to read. Knowing the number of bytes to read off the socket makes this a much easier problem to debug. Keep in mind that shutting down the socket can still snip off the last bit of data if you don't close it properly.
Additionally having your buffer size be volatile is not thread safe and really doesn't buy you anything. Using stop as a volatile is safe, but don't expect it to be instant. In other words the loop could run several more times before it gets the updated value of stop. This is especially true on multiprocessor machines.
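As a sketch of the header idea (hypothetical, reusing the variable names from the question): the sender writes the file length first, and the receiver reads exactly that many bytes before it stops.
// Sender: prefix the transfer with the file length as an 8-byte value.
byte[] header = BitConverter.GetBytes(fileStream.Length);
stream.Write(header, 0, header.Length);
// ...then send the file contents in chunks as before...

// Receiver: read the 8-byte length, then loop until that many bytes have arrived.
byte[] lengthBytes = new byte[8];
stream.Read(lengthBytes, 0, 8); // a robust version would loop until all 8 bytes are read
long expected = BitConverter.ToInt64(lengthBytes, 0);
long received = 0;
while (received < expected)
{
    int readed = stream.Read(buffer, 0, READ_BUFFER_SIZE);
    if (readed == 0)
        break; // connection closed before all the data arrived
    outputStream.Write(buffer, 0, readed);
    received += readed;
}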
Edit_02:
For the TCPClient class you want to do the following (as far as I can tell, without access to a C# compiler at the moment).
// write all the bytes
// Then do the following
client.Client.Shutdown(SocketShutdown.Send); // This assumes you have access to this member
while (stream.Read(buffer, 0, READ_BUFFER_SIZE) != 0) ;
client.Close();
I'd like to empty the read buffer of the socket, so I wrote the following code...
byte[] tempBuffer = new byte[1024];
int readCount = 0;
while ((readCount = tcpSocket.GetStream().Read(tempBuffer, 0, tempBuffer.Length)) != 0)
{
    // do with tempBuffer
}
But the Read() method blocks, so I added tcpSocket.ReceiveTimeout = 1;, and it behaves just like before.
As far as I know, this approach is usually used in C++. How can I solve this problem?
You can use the DataAvailable property to see if there is anything to be read before making a call into the Read method.
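A small sketch of that check, assuming tcpSocket is the TcpClient from the question:
NetworkStream ns = tcpSocket.GetStream();
byte[] tempBuffer = new byte[1024];
// Read only while the stream reports buffered data, so the call never blocks.
while (ns.DataAvailable)
{
    int readCount = ns.Read(tempBuffer, 0, tempBuffer.Length);
    // discard (or inspect) the readCount bytes in tempBuffer
}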
Use the NetworkStream.Read() method directly instead of calling GetStream() every time; from the documentation:
If no data is available for reading, the Read method returns 0. The Read operation reads as much data as is available, up to the number of bytes specified by the size parameter. If the remote host shuts down the connection, and all available data has been received, the Read method completes immediately and returns zero bytes.
Why do you want to empty the read buffer?
If you don't want the contents of the socket, close it.
If you don't want the current contents, but will want later data, how do you know when "later" starts? If the data is a non-encapsulated stream...
Sounds like you're solving the problem in the wrong fashion.