How to split a large file into chunks in C#?

I'm making a simple file transfer app (sender and receiver) that works over the wire. So far, the sender converts the file into a byte array and sends chunks of that array to the receiver.
This works with files of up to 256 MB, but for anything larger this line throws a System.OutOfMemoryException:
byte[] buffer = StreamFile(fileName); //This is where I convert the file
I'm looking for a way to read the file in chunks and write each chunk as I go, instead of loading the whole file into a byte array. How can I do this with a FileStream?
EDIT:
Sorry, here's my crappy code so far:
private void btnSend(object sender, EventArgs e)
{
Socket clientSock = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
byte[] fileName = Encoding.UTF8.GetBytes(fName); //file name
byte[] fileData = null;
try
{
fileData = StreamFile(textBox1.Text); //file
}
catch (OutOfMemoryException ex)
{
MessageBox.Show("Out of memory");
return;
}
byte[] fileNameLen = BitConverter.GetBytes(fileName.Length); //length of file name
clientData = new byte[4 + fileName.Length + fileData.Length];
fileNameLen.CopyTo(clientData, 0);
fileName.CopyTo(clientData, 4);
fileData.CopyTo(clientData, 4 + fileName.Length);
clientSock.Connect("172.16.12.91", 9050);
clientSock.Send(clientData, 0, 4 + fileName.Length, SocketFlags.None);
for (int i = 4 + fileName.Length; i < clientData.Length; i++)
{
clientSock.Send(clientData, i, 1 , SocketFlags.None);
}
clientSock.Close();
}
And here's how I receive (the code was from a tutorial)
public void ReadCallback(IAsyncResult ar)
{
int fileNameLen = 1;
String content = String.Empty;
StateObject state = (StateObject)ar.AsyncState;
Socket handler = state.workSocket;
int bytesRead = handler.EndReceive(ar);
if (bytesRead > 0)
{
if (flag == 0)
{
Thread.Sleep(1000);
fileNameLen = BitConverter.ToInt32(state.buffer, 0);
string fileName = Encoding.UTF8.GetString(state.buffer, 4, fileNameLen);
receivedPath = fileName;
flag++;
}
if (flag >= 1)
{
BinaryWriter writer = new BinaryWriter(File.Open(receivedPath, FileMode.Append));
if (flag == 1)
{
writer.Write(state.buffer, 4 + fileNameLen, bytesRead - (4 + fileNameLen));
flag++;
}
else
writer.Write(state.buffer, 0, bytesRead);
writer.Close();
handler.BeginReceive(state.buffer, 0, StateObject.BufferSize, 0,
new AsyncCallback(ReadCallback), state);
}
}
else
{
Invoke(new MyDelegate(LabelWriter));
}
}
I just really want to know how I can read the file in chunks so that I don't need to convert it to a byte array.
Thanks for the responses so far, I think I'm starting to get it :D

Just call Read repeatedly with a small buffer (I tend to use something like 16K). Note that the call to Read may end up reading a smaller amount than you request. If you're using a fixed chunk size and need the whole chunk in memory, you could just use an array of that size of course.
Without knowing how you're sending the file, it's hard to give much advice about how to structure your code, but it could be something like this:
byte[] chunk = new byte[MaxChunkSize];
while (true)
{
    int index = 0;
    // There are various different ways of structuring this bit of code.
    // Fundamentally we're trying to keep reading in to our chunk until
    // either we reach the end of the stream, or we've read everything we need.
    while (index < chunk.Length)
    {
        int bytesRead = stream.Read(chunk, index, chunk.Length - index);
        if (bytesRead == 0)
        {
            break;
        }
        index += bytesRead;
    }
    if (index != 0) // Our previous chunk may have been the last one
    {
        SendChunk(chunk, index); // index is the number of bytes in the chunk
    }
    if (index != chunk.Length) // We didn't read a full chunk: we're done
    {
        return;
    }
}
If I was more awake I'd probably find a more readable way of writing this, but it'll do for now. One option is to extract another method from the middle section:
// Attempts to read an entire chunk into the given array; returns the size of
// chunk actually read.
int ReadChunk(Stream stream, byte[] chunk)
{
    int index = 0;
    while (index < chunk.Length)
    {
        int bytesRead = stream.Read(chunk, index, chunk.Length - index);
        if (bytesRead == 0)
        {
            break;
        }
        index += bytesRead;
    }
    return index;
}
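Putting the helper to work on the original question could look like the following. This is a minimal sketch, assuming the chunks are written to some output Stream (for a socket, that could be a NetworkStream). The names ChunkedCopy and CopyInChunks and the 16K default are illustrative, not part of the answer above.

```csharp
using System;
using System.IO;

public static class ChunkedCopy
{
    // Same helper as above: fills as much of the chunk as the stream allows
    // and returns the number of bytes actually read.
    public static int ReadChunk(Stream stream, byte[] chunk)
    {
        int index = 0;
        while (index < chunk.Length)
        {
            int bytesRead = stream.Read(chunk, index, chunk.Length - index);
            if (bytesRead == 0)
            {
                break; // end of stream
            }
            index += bytesRead;
        }
        return index;
    }

    // Hypothetical driver: copies input to output one chunk at a time, so only
    // chunkSize bytes are held in memory. Returns the total number of bytes copied.
    public static long CopyInChunks(Stream input, Stream output, int chunkSize = 16 * 1024)
    {
        byte[] chunk = new byte[chunkSize];
        long total = 0;
        int size;
        while ((size = ReadChunk(input, chunk)) > 0)
        {
            output.Write(chunk, 0, size);
            total += size;
        }
        return total;
    }
}
```

To send a file, open it with File.OpenRead and pass it as input, with a NetworkStream created from the connected socket as output. Memory use stays at one chunk regardless of the file size.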

var b = new byte[1 << 15]; // 32 KB
int count;
while ((count = inStream.Read(b, 0, b.Length)) > 0)
{
    outStream.Write(b, 0, count);
}

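public static IEnumerable<byte[]> SplitStreamIntoChunks(Stream stream, int chunkSize)
{
    // Requires a seekable stream, because it relies on stream.Length.
    var bytesRemaining = stream.Length;
    while (bytesRemaining > 0)
    {
        // Take the minimum as long first so files over 2 GB don't overflow the cast.
        var size = (int) Math.Min(bytesRemaining, chunkSize);
        var buffer = new byte[size];
        var bytesRead = stream.Read(buffer, 0, size);
        if (bytesRead <= 0)
            break;
        // Read may return fewer bytes than requested; trim so callers never see unfilled bytes.
        if (bytesRead < size)
            Array.Resize(ref buffer, bytesRead);
        yield return buffer;
        bytesRemaining -= bytesRead;
    }
}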

Related

C# thread exits before receiving data on socket

I am trying to send some text over the network using sockets and memory streams. The full data length in my example is 20480 bytes long. Buffer size is 8192.
Before I can receive the last 4096 bytes, the socket receives only 3088 bytes and the whole thread exits without throwing an exception just before receiving the last chunk of data.
// Send
while (sentBytes < ms.Length)
{
if (streamSize < Convert.ToInt64(buffer.Length))
{
ms.Read(buffer, 0, Convert.ToInt32(streamSize));
count = socket.Send(buffer, 0, Convert.ToInt32(streamSize), SocketFlags.None);
sentBytes += Convert.ToInt64(count);
streamSize -= Convert.ToInt64(count);
}
else
{
ms.Read(buffer, 0, buffer.Length);
count = socket.Send(buffer, 0, buffer.Length, SocketFlags.None);
sentBytes += Convert.ToInt64(count);
streamSize -= Convert.ToInt64(count);
}
}
// Receive
while (readBytes < size)
{
if (streamSize < Convert.ToInt64(buffer.Length))
// exits after this, before receiving the last 1008 bytes
{
count = socket.Receive(buffer, 0, Convert.ToInt32(streamSize), SocketFlags.None);
if (count > 0)
{
ms.Write(buffer, 0, count);
readBytes += Convert.ToInt64(count);
streamSize -= Convert.ToInt64(count);
}
}
else
{
count = socket.Receive(buffer, 0, buffer.Length, SocketFlags.None);
if (count > 0)
{
ms.Write(buffer, 0, count);
readBytes += Convert.ToInt64(count);
streamSize -= Convert.ToInt64(count);
}
}
}
I use the exact same algorithm to send/receive files having bigger sizes (over 1 GB) and the transfer works perfectly, no files are corrupted (I use file streams for that).
Interestingly, this code works in the debugger if I add a breakpoint on the sender side.
Also works with this modification:
if (streamSize < Convert.ToInt64(buffer.Length))
{
if (count > 0)
{
ms.Write(buffer, 0, Convert.ToInt32(streamSize));
readBytes += streamSize;
streamSize -= streamSize;
}
}
but this comes with no checking on how much data is received and also doesn't work to transfer files.
Could anybody point out what is going on here?
Thread started like this:
public ClientConnection(Socket clientSocket, Server mainForm)
{
this.clientSocket = clientSocket;
clientThread = new Thread(ReceiveData);
clientConnected = true;
this.mainForm = mainForm;
clientThread.Start(clientSocket);
}
Added from comment by OP
// text is 10240 characters long
MemoryStream ms = new MemoryStream(UnicodeEncoding.Unicode.GetBytes(text));
// streamsize is 20480, which is sent prior to text in a header to the receiver
long streamSize = ms.Length;
Update:
Tested with more files, now the file transfer fails as well. The problem is with the last 1008 bytes in all cases.
I found it... When I expected to receive the header, I hadn't prepared the software to receive exactly header-sized data.
//byte[] buffer = new byte[1024];
byte[] buffer = new byte[16];
readBytes = socket.Receive(buffer, 0, buffer.Length, SocketFlags.None);
This somehow caused a rogue 16 bytes of data to be written to the socket every time I was receiving the last chunk of the payload; the socket disconnected and the thread exited without throwing any exception. I hope this answer will one day help someone else who runs into the same issue. All data transfer works properly now.
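More generally, a fixed-size header is safest to read with a loop that stops only once exactly that many bytes have arrived, because a single receive call may return fewer bytes than requested. Here is a minimal sketch over a Stream (a socket can be wrapped in a NetworkStream). HeaderReader and ReadExact are hypothetical names, and the header size is whatever your protocol defines (16 bytes in the snippet above).

```csharp
using System;
using System.IO;

public static class HeaderReader
{
    // Reads exactly count bytes, looping because a single Read may return fewer.
    // Throws if the stream ends first, so a short header is never mistaken for a full one.
    public static byte[] ReadExact(Stream stream, int count)
    {
        byte[] result = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int bytesRead = stream.Read(result, offset, count - offset);
            if (bytesRead == 0)
            {
                throw new EndOfStreamException(
                    "Stream ended after " + offset + " of " + count + " bytes");
            }
            offset += bytesRead;
        }
        return result;
    }
}
```

Because the method never consumes more than count bytes, the payload that follows the header stays in the stream for the next read.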
Please consider a simplified implementation that uses the NetworkStream class for the I/O operations instead of the Socket class: NetworkStream slightly raises the level of abstraction. An instance of the NetworkStream class can be created from an instance of the Socket class.
Sender
The implementation of the Sender is pretty straightforward using the Stream.CopyTo Method:
private static void CustomSend(Stream inputStream, Socket socket)
{
    using (var networkStream = new NetworkStream(socket))
    {
        inputStream.CopyTo(networkStream, BufferSize);
    }
}
Receiver
Let's introduce the following extension methods for the Stream class, which copy an exact number of bytes from one Stream instance to another using the specified buffer:
using System;
using System.IO;
public static class StreamExtensions
{
    public static bool TryCopyToExact(this Stream inputStream, Stream outputStream, byte[] buffer, int bytesToCopy)
    {
        if (inputStream == null)
        {
            throw new ArgumentNullException("inputStream");
        }
        if (outputStream == null)
        {
            throw new ArgumentNullException("outputStream");
        }
        // Check for null first so a missing buffer doesn't surface as a NullReferenceException.
        if (buffer == null)
        {
            throw new ArgumentNullException("buffer");
        }
        if (buffer.Length <= 0)
        {
            throw new ArgumentException("Invalid buffer specified", "buffer");
        }
        if (bytesToCopy <= 0)
        {
            throw new ArgumentException("Bytes to copy must be positive", "bytesToCopy");
        }
        int bytesRead;
        // Never ask for more than is still needed, and stop early if the stream ends.
        while (bytesToCopy > 0 && (bytesRead = inputStream.Read(buffer, 0, Math.Min(buffer.Length, bytesToCopy))) > 0)
        {
            outputStream.Write(buffer, 0, bytesRead);
            bytesToCopy -= bytesRead;
        }
        return bytesToCopy == 0;
    }
    public static void CopyToExact(this Stream inputStream, Stream outputStream, byte[] buffer, int bytesToCopy)
    {
        if (!TryCopyToExact(inputStream, outputStream, buffer, bytesToCopy))
        {
            throw new IOException("Failed to copy the specified number of bytes");
        }
    }
}
So, the Receiver can be implemented as follows:
private static void CustomReceive(Socket socket)
{
    // It seems your receiver implementation "knows" the "size to receive".
    const int SizeToReceive = 20480;
    var buffer = new byte[BufferSize];
    var outputStream = new MemoryStream(new byte[SizeToReceive], true);
    using (var networkStream = new NetworkStream(socket))
    {
        networkStream.CopyToExact(outputStream, buffer, SizeToReceive);
    }
    // Use the outputStream instance...
}
Important note
Please do not forget to call the Dispose() method of the instances of the Socket class (for both Sender and Receiver). The absence of the method call can be a root cause of the problems.

While reading large amount of data in bytes the while loop is getting tripped for the last few bytes

I'm sending 8254789 bytes. The loop runs, but when it reaches 8246597 bytes and has to read the last 8192, it drops out of the while loop without going anywhere. Can someone please explain what the problem is?
public static byte[] ReadFully(Stream stream, int initialLength)
{
// If we've been passed an unhelpful initial length, just
// use 32K.
if (initialLength < 1)
{
initialLength = 32768;
}
byte[] buffer = new byte[3824726];
int read = 0;
int chunk;
try
{
while ((chunk = stream.Read(buffer, read, 3824726 - read)) > 0)
{
Console.WriteLine("Length of chunk" + chunk);
read += chunk;
Console.WriteLine("Length of read" + read);
if (read == 0)
{
stream.Close();
return buffer;
}
// If we've reached the end of our buffer, check to see if there's
// any more information
if (read == buffer.Length)
{
Console.WriteLine("Length of Buffer" + buffer.Length);
int nextByte = stream.ReadByte();
// End of stream? If so, we're done
if (nextByte == -1)
{
return buffer;
}
// Nope. Resize the buffer, put in the byte we've just
// read, and continue
byte[] newBuffer = new byte[buffer.Length * 2];
Console.WriteLine("Length of newBuffer" + newBuffer.Length);
Array.Copy(buffer, newBuffer, buffer.Length);
newBuffer[read] = (byte)nextByte;
buffer = newBuffer;
read++;
}
}
// Buffer is now too big. Shrink it.
byte[] ret = new byte[read];
Array.Copy(buffer, ret, read);
return ret;
}
catch (Exception ex)
{ throw ex; }
}
Normally you don't write your stream loops like that. Instead try something like this:
byte[] buffer = new byte[BUFFER_SIZE];
int read = -1;
while((read = stream.Read(buffer, 0, buffer.Length)) > 0)
{
// ... use read bytes in buffer here
}
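You adjust the offset on each call, but the stream keeps its own cursor and advances it on every Read, so you never need to do that to move through the stream. The offset argument is only a position in your buffer; if you process each chunk as it arrives, read at offset 0 every time.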
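If the goal is still to end up with every byte in a single array, as ReadFully in the question does, one option is to let a MemoryStream do the growing. This is a sketch rather than a drop-in replacement; StreamReading and ReadAll are illustrative names. Note that on a network stream Read returns 0 only once the sender closes or shuts down the connection, so this loop blocks until then.

```csharp
using System.IO;

public static class StreamReading
{
    // Reads until end of stream and returns everything that was read.
    // The small buffer is reused; MemoryStream grows as needed.
    public static byte[] ReadAll(Stream stream, int bufferSize = 32 * 1024)
    {
        byte[] buffer = new byte[bufferSize];
        using (var collected = new MemoryStream())
        {
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                collected.Write(buffer, 0, read);
            }
            return collected.ToArray();
        }
    }
}
```

When the total size is known in advance and the connection stays open, reading a fixed number of bytes in a loop is the better fit, since waiting for end of stream would never finish.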

Asynchronous data call on TCP Client/Server for large data

I'm getting an image as bytes from a client device: 3824726 bytes of data. This while loop should read until the whole message is received, but I'm facing a problem. When it has read 3816534 bytes and the remaining 8192 bytes are left, it just drops out of the while loop. No exception, nothing; it just goes away. Could someone please help with this? I can't understand what the problem is.
public static byte[] ReadFully(Stream stream, int initialLength)
{
// If we've been passed an unhelpful initial length, just
// use 32K.
if (initialLength < 1)
{
initialLength = 32768;
}
byte[] buffer = new byte[3824726];
int read = 0;
int chunk;
try
{
while ((chunk = stream.Read(buffer, read, 3824726 - read)) > 0)
{
Console.WriteLine("Length of chunk" + chunk);
read += chunk;
Console.WriteLine("Length of read" + read);
if (read == 0)
{
stream.Close();
return buffer;
}
// If we've reached the end of our buffer, check to see if there's
// any more information
if (read == buffer.Length)
{
Console.WriteLine("Length of Buffer" + buffer.Length);
int nextByte = stream.ReadByte();
// End of stream? If so, we're done
if (nextByte == -1)
{
return buffer;
}
// Nope. Resize the buffer, put in the byte we've just
// read, and continue
byte[] newBuffer = new byte[buffer.Length * 2];
Console.WriteLine("Length of newBuffer" + newBuffer.Length);
Array.Copy(buffer, newBuffer, buffer.Length);
newBuffer[read] = (byte)nextByte;
buffer = newBuffer;
read++;
}
}
// Buffer is now too big. Shrink it.
byte[] ret = new byte[read];
Array.Copy(buffer, ret, read);
return ret;
}
catch (Exception ex)
{ throw ex; }
}

How to determine that packet receive is a part of first packet using tcp in C#

I'm writing a socket application using blocking sockets and multithreading. My problem is that I want to receive two large messages, and as we know, TCP splits large data into multiple packets. This is the scenario:
I send a 6 MB text file, and TCP splits it into 2 separate packets.
At the same time I send a 26 MB video file, and TCP also splits it into 2 separate packets.
In my application I receive the first part of the text file, say 3 MB of the 6 MB,
and then I receive the first part of the video, say 13 MB of the 26 MB.
The question is: how do I know that the first packet of the text file and the first packet of the video file are different data and should be handled differently (different buffers, maybe)?
Sorry for my bad English.
Thanks in advance.
Here is part of my code:
ns = client.GetStream();
while (isListen == true && client.Connected)
{
while (!ns.DataAvailable)
{
try
{
Thread.Sleep(1);
}
catch (Exception ex)
{
}
}
data = new byte[client.ReceiveBufferSize];
//client.Client.Receive(data);
int indx = ns.Read(data, 0, data.Length);
string message = Encoding.ASCII.GetString(data, 0, indx);
if (message == GetEnumDescription(TypeData.Disconnect))
{
isListen = false;
server.ClientKeluar = objClient;
if (ClientDisconnected != null)
{
ClientDisconnected(objClient);
}
thisThread.Abort();
Server.kumpulanThread.Remove(thisThread);
Server._serverConnections.Remove(this);
client.Close();
}
else if (message.Contains(GetEnumDescription(TypeData.GetFile)))
{
// run the data retrieval process
}
else if (message.Contains(GetEnumDescription(TypeData.ByteLength)))
{
string length = message.Substring(6, message.Length - 6);
int len = int.Parse(length);
expectedLength = client.ReceiveBufferSize = len;
data = new byte[len];
}
else if (message.Contains(GetEnumDescription(TypeData.Image)))
{
typeData = "Image";
dat1 = new byte[client.ReceiveBufferSize];
index = 0;
}
else if (message.Contains(GetEnumDescription(TypeData.Video)))
{
typeData = "Video";
dat2 = new byte[client.ReceiveBufferSize];
index = 0;
}
else
{
if (typeData == "Image")
{
expectedLength = expectedLength - message.Length;
if (expectedLength == 0)
{
Array.Copy(data, 0, dat1, index, message.Length);
if (ImageDelivered != null)
{
ImageDelivered(dat1);
}
}
else
{
Array.Copy(data, 0, dat1, index, message.Length);
index = message.Length;
}
}
else if (typeData == "Video")
{
expectedLength = expectedLength - message.Length;
if (expectedLength == 0)
{
Array.Copy(data, 0, dat2, index, message.Length);
if (VideoDelivered != null)
{
VideoDelivered(dat2);
}
}
else
{
Array.Copy(data, 0, dat2, index, message.Length);
index = message.Length;
}
}
else
{
expectedLength = expectedLength - message.Length;
if (expectedLength == 0)
{
dataToWrite = dataToWrite + message;
string text = dataToWrite;
if (MessageDelivered != null)
{
MessageDelivered(text);
}
dataToWrite = "";
}
else
{
dataToWrite += message;
}
}
}
}
Could anyone give some sample code so I can get inspiration for solving this problem?
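TCP takes care of splitting the data into segments and reassembling them in order, so you never see individual packets. What you get is a continuous byte stream with no message boundaries: a single Read may return only part of what was sent, or bytes from two sends back to back. To tell the text file from the video, either send each over its own connection or prefix every message with its type and length and keep reading until that many bytes have arrived.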
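One common way to tell two transfers apart on a single connection is to frame every message with a type tag and a length. The sketch below assumes a made-up wire format of one type byte, a 4-byte length, then the payload. Framing, WriteMessage and TryReadMessage are hypothetical names, and the reads are looped because a single Read may return fewer bytes than requested.

```csharp
using System;
using System.IO;

public static class Framing
{
    // Assumed wire format (not from the question): [1 byte type][4 byte length][payload].
    public static void WriteMessage(Stream stream, byte type, byte[] payload)
    {
        stream.WriteByte(type);
        // BitConverter uses the host byte order, so both ends must agree on it.
        byte[] length = BitConverter.GetBytes(payload.Length);
        stream.Write(length, 0, length.Length);
        stream.Write(payload, 0, payload.Length);
    }

    // Returns false on a clean end of stream before a new message starts.
    public static bool TryReadMessage(Stream stream, out byte type, out byte[] payload)
    {
        type = 0;
        payload = null;
        int first = stream.ReadByte();
        if (first == -1)
        {
            return false;
        }
        type = (byte)first;
        int length = BitConverter.ToInt32(ReadExact(stream, 4), 0);
        if (length < 0)
        {
            throw new InvalidDataException("Negative message length");
        }
        payload = ReadExact(stream, length);
        return true;
    }

    // Loops until exactly count bytes have been read or the stream ends.
    private static byte[] ReadExact(Stream stream, int count)
    {
        byte[] result = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(result, offset, count - offset);
            if (read == 0)
            {
                throw new EndOfStreamException("Stream ended in the middle of a message");
            }
            offset += read;
        }
        return result;
    }
}
```

The receiver can then switch on the type byte (for example, one value for text and another for video) and hand each complete payload to the matching handler. For payloads as large as the 26 MB video, the same header could instead be followed by a chunked copy to a file rather than a single in-memory array.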

Problem in splitting a file

int bufferlength = 12488;
int pointer = 1;
int offset = 0;
int length = 0;
FileStream fstwrite = new FileStream("D:\\Movie.wmv", FileMode.Create);
while (pointer != 0)
{
byte[] buff = new byte[bufferlength];
FileStream fst = new FileStream("E:\\Movie.wmv", FileMode.Open);
pointer = fst.Read(buff, 0, bufferlength);
fst.Close();
fstwrite.Write(buff, offset , pointer);
offset += pointer;
}
I used the above code to split a file and place it on another drive. I'm not able to set the correct offset and length for this routine. Can anyone help me fix this?
By splitting I mean that I split it into chunks of "x" KB, pass them somewhere, and rebuild the same file in some other location.
I found it at last, thanks to everyone who gave their valuable responses.
Currently you're always reading from the start of the file... and even if you weren't you'd just be copying the whole file.
Here's some code which will actually split a single file into multiple files:
public static void SplitFile(string inputFile,
                             string outputPrefix,
                             int chunkSize)
{
    byte[] buffer = new byte[chunkSize];
    using (Stream input = File.OpenRead(inputFile))
    {
        int index = 0;
        while (input.Position < input.Length)
        {
            using (Stream output = File.Create(outputPrefix + index))
            {
                int chunkBytesRead = 0;
                while (chunkBytesRead < chunkSize)
                {
                    int bytesRead = input.Read(buffer,
                                               chunkBytesRead,
                                               chunkSize - chunkBytesRead);
                    // End of input
                    if (bytesRead == 0)
                    {
                        break;
                    }
                    chunkBytesRead += bytesRead;
                }
                output.Write(buffer, 0, chunkBytesRead);
            }
            index++;
        }
    }
}
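You're reading bufferlength bytes at a time, but the offset you pass to Write is an index into buff, not a position in the file. It should stay 0 rather than grow with each pass:
fstwrite.Write(buff, 0, pointer);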
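Don't open your source file inside the loop, or you'll always read the first chunk.
Open it before the loop; the stream then advances its own position on every read, and the offset passed to Read and Write stays 0 because it indexes into the buffer, not the file.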
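Applied to the code in the question, opening the source once before the loop might look like the sketch below. FileCopier and CopyInChunks are illustrative names, and the default buffer length simply reuses the 12488 from the question. Both streams track their own positions, so the buffer offset is always 0.

```csharp
using System.IO;

public static class FileCopier
{
    // Copies sourcePath to destinationPath in bufferLength-sized pieces and
    // returns the total number of bytes copied.
    public static long CopyInChunks(string sourcePath, string destinationPath, int bufferLength = 12488)
    {
        byte[] buff = new byte[bufferLength];
        long total = 0;
        // Open both files once, outside the loop.
        using (FileStream source = new FileStream(sourcePath, FileMode.Open, FileAccess.Read))
        using (FileStream destination = new FileStream(destinationPath, FileMode.Create))
        {
            int read;
            while ((read = source.Read(buff, 0, buff.Length)) > 0)
            {
                // Write only the bytes actually read; the last piece is usually shorter.
                destination.Write(buff, 0, read);
                total += read;
            }
        }
        return total;
    }
}
```

Replacing the destination.Write call with a send to a socket or another sink gives the "pass it somewhere" variant while keeping the same loop.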
