There is probably no other way to do this, but is there a way to append the contents of one text file into another text file, while clearing the first after the move?
The only way I know is to just use a reader and writer, which seems inefficient for large files...
Thanks!
No, I don't think there's anything which does this.
If the two files use the same encoding and you don't need to verify that they're valid, you can treat them as binary files, e.g.
using (Stream input = File.OpenRead("file1.txt"))
using (Stream output = new FileStream("file2.txt", FileMode.Append,
FileAccess.Write, FileShare.None))
{
input.CopyTo(output); // Using .NET 4
}
File.Delete("file1.txt");
Note that if file1.txt contains a byte order mark, you should skip past this first to avoid having it in the middle of file2.txt.
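For instance, a rough sketch of that BOM check for UTF-8 (the EF BB BF preamble; other encodings have different preamble lengths, and the file names are the same placeholders as above):
using (Stream input = File.OpenRead("file1.txt"))
using (Stream output = new FileStream("file2.txt", FileMode.Append,
                                       FileAccess.Write, FileShare.None))
{
    // Peek at the first three bytes and skip them only if they are the UTF-8 BOM (EF BB BF).
    byte[] preamble = new byte[3];
    int read = input.Read(preamble, 0, 3);
    bool hasBom = read == 3 && preamble[0] == 0xEF && preamble[1] == 0xBB && preamble[2] == 0xBF;
    if (!hasBom)
    {
        input.Position = 0; // no BOM: rewind and copy the whole file
    }
    input.CopyTo(output);
}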
If you're not using .NET 4 you can write your own equivalent of Stream.CopyTo, even as an extension method so the transition is seamless when you do upgrade:
public static class StreamExtensions
{
public static void CopyTo(this Stream input, Stream output)
{
if (input == null)
{
throw new ArgumentNullException("input");
}
if (output == null)
{
throw new ArgumentNullException("output");
}
byte[] buffer = new byte[8192];
int bytesRead;
while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
{
output.Write(buffer, 0, bytesRead);
}
}
}
Ignoring error handling, encodings, and efficiency for the moment, something like this would probably work (but I haven't tested it)
File.AppendAllText("path/to/destination/file", File.ReadAllText("path/to/source/file"));
Then you just have to delete or clear out the first file once this step is complete.
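For instance (same placeholder paths as above):
File.Delete("path/to/source/file");
// or, to keep the file but empty it:
File.WriteAllText("path/to/source/file", string.Empty);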
The cmd.exe version of this is
type fileone.txt >>filetwo.txt
del fileone.txt
You could shell out to run this. It should be pretty efficient.
I'm not sure what you mean by "inefficient". Jon's answer is probably enough for most cases.
However, if you are concerned about extremely large source files, Memory-Mapped Files could be your friend. See this link for more info.
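Purely as an illustration (not the code from that link), a memory-mapped read combined with an ordinary append might look roughly like this; the file names are placeholders and it assumes the source file is non-empty, since CreateFromFile rejects zero-length files:
// requires System.IO and System.IO.MemoryMappedFiles
using (var source = MemoryMappedFile.CreateFromFile("file1.txt", FileMode.Open))
using (var sourceStream = source.CreateViewStream()) // maps and reads the whole source file
using (var output = new FileStream("file2.txt", FileMode.Append, FileAccess.Write, FileShare.None))
{
    sourceStream.CopyTo(output);
}
File.Delete("file1.txt");
Whether this actually beats a plain stream copy is something you would have to measure; for a straight sequential copy the difference may be negligible.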
Hello
I've been working on a terminal-like application to get better at programming in C#, just something to help me learn. I've decided to add a feature that will copy a file exactly as it is to a new file... It seems to work almost perfectly. When opened in Notepad++ the files are only a few lines apart in length, and very, very close to the same as far as actual file size goes. However, the duplicated copy of the file never runs. It says the file is corrupt. I have a feeling it's within the methods for reading and rewriting binary to files that I created. The code is as follows, thanks for the help. Sorry for the spaghetti code too, I get a bit sloppy when I'm messing around with new ideas.
Class that handles the file copying/writing
using System;
using System.IO;
//using System.Collections.Generic;
namespace ConsoleFileExplorer
{
class FileTransfer
{
private BinaryWriter writer;
private BinaryReader reader;
private FileStream fsc; // file to be duplicated
private FileStream fsn; // new location of file
int[] fileData;
private string _file;
public FileTransfer(String file)
{
_file = file;
fsc = new FileStream(file, FileMode.Open);
reader = new BinaryReader(fsc);
}
// Reads all the original files data to an array of bytes
public byte[] ReadAllDataToArray()
{
byte[] bytes = reader.ReadBytes((int)fsc.Length); // reading bytes from the original file
return bytes;
}
// writes the array of original byte data to a new file
public void WriteDataFromArray(byte[] fileData, string path) // got a feeling this is the problem :p
{
fsn = new FileStream(path, FileMode.Create);
writer = new BinaryWriter(fsn);
int i = 0;
while(i < fileData.Length)
{
writer.Write(fileData[i]);
i++;
}
}
}
}
Code that interacts with this class.
(Sleep(5000) is because I was expecting an error on the first attempt...)
case '3':
Console.Write("Enter source file: ");
string sourceFile = Console.ReadLine();
if (sourceFile == "")
{
Console.Clear();
Console.ForegroundColor = ConsoleColor.DarkRed;
Console.Error.WriteLine("Must input a proper file path.\n");
Console.ForegroundColor = ConsoleColor.White;
Menu();
} else {
Console.WriteLine("Copying Data"); System.Threading.Thread.Sleep(5000);
FileTransfer trans = new FileTransfer(sourceFile);
//copying the original files data
byte[] data = trans.ReadAllDataToArray();
Console.Write("Enter Location to store data: ");
string newPath = Console.ReadLine();
// Just for me to make sure it doesnt exit if i forget
if(newPath == "")
{
Console.Clear();
Console.ForegroundColor = ConsoleColor.DarkRed;
Console.Error.WriteLine("Cannot have empty path.");
Console.ForegroundColor = ConsoleColor.White;
Menu();
} else
{
Console.WriteLine("Writing data to file"); System.Threading.Thread.Sleep(5000);
trans.WriteDataFromArray(data, newPath);
Console.WriteLine("File stored.");
Console.ReadLine();
Console.Clear();
Menu();
}
}
break;
File compared to new file (screenshots attached: "Original File" and "New File").
You're not properly disposing the file streams and the binary writer. Both tend to buffer data (which is a good thing, especially when you're writing one byte at a time). Use using, and your problem should disappear. Unless somebody is editing the file while you're reading it, of course.
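For instance, a minimal sketch of the write side with using blocks (same member names as your code; error handling still omitted):
public void WriteDataFromArray(byte[] fileData, string path)
{
    using (var fsn = new FileStream(path, FileMode.Create))
    using (var writer = new BinaryWriter(fsn))
    {
        writer.Write(fileData); // buffered data is flushed and the file closed when the usings exit
    }
}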
BinaryReader and BinaryWriter do not just write "raw data". They also add metadata as needed - they're designed for serialization and deserialization, rather than reading and writing bytes. Now, in the specific case of ReadBytes and Write(byte[]), those really are just raw bytes; but there's not much point in using these classes just for that. Reading and writing bytes is something every Stream gives you - and that includes FileStream. There's no reason to use BinaryReader/BinaryWriter here whatsoever - the file streams give you everything you need.
A better approach would be to simply use
using (var fsn = ...)
{
fsn.Write(fileData, 0, fileData.Length);
}
or even just
File.WriteAllBytes(fileName, fileData);
Maybe you're thinking that writing a byte at a time is closer to "the metal", but that simply isn't the case. At no point during this does the CPU pass a byte at a time to the hard drive. Instead, the hard drive copies data directly from RAM, with no intervention from the CPU. And most hard drives still can't write (or read) arbitrary amounts of data from the physical media - instead, you're reading and writing whole sectors. If the system really did write a byte at a time, you'd just keep rewriting the same sector over and over again, just to write one more byte.
An even better approach would be to use the fact that you've got file streams open, and stream the files from source to destination rather than first reading everything into memory, and then writing it back to disk.
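For instance, a rough sketch of that idea using the source stream your class already holds open (fsc) and .NET 4's Stream.CopyTo; the method name is made up:
public void CopyTo(string path)
{
    using (var destination = new FileStream(path, FileMode.Create))
    {
        fsc.CopyTo(destination); // streams from the already-open source straight to the new file
    }
}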
There is a File.Copy() method in C#; you can see it here: https://msdn.microsoft.com/ru-ru/library/c6cfw35a(v=vs.110).aspx
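For example, the entire copy can be one call (the paths here are just placeholders):
File.Copy(@"C:\data\source.bin", @"C:\data\copy.bin", false); // false = don't overwrite an existing file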
If you want to implement it yourself, try placing a breakpoint inside your methods and using the debugger. It's like the story about the fisherman and the god who gave the fisherman a rod to catch fish with, rather than the fish itself.
Also, look at your int[] fileData field and the byte[] fileData parameter in the last method; maybe that mismatch is part of the problem.
I have code in an SSIS script task, written in C#, that zips a file.
I have a problem when zipping a file of roughly 1 GB.
I tried to implement this code and still get a 'System.OutOfMemoryException':
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
at ST_4cb59661fb81431abcf503766697a1db.ScriptMain.AddFileToZipUsingStream(String sZipFile, String sFilePath, String sFileName, String sBackupFolder, String sPrefixFolder) in c:\Users\dtmp857\AppData\Local\Temp\vsta\84bef43d323b439ba25df47c365b5a29\ScriptMain.cs:line 333
at ST_4cb59661fb81431abcf503766697a1db.ScriptMain.Main() in c:\Users\dtmp857\AppData\Local\Temp\vsta\84bef43d323b439ba25df47c365b5a29\ScriptMain.cs:line 131
This is the snippet of code when zipping file:
protected bool AddFileToZipUsingStream(string sZipFile, string sFilePath, string sFileName, string sBackupFolder, string sPrefixFolder)
{
bool bIsSuccess = false;
try
{
if (File.Exists(sZipFile))
{
using (ZipArchive addFile = ZipFile.Open(sZipFile, ZipArchiveMode.Update))
{
addFile.CreateEntryFromFile(sFilePath, sFileName);
//Move File after zipping it
BackupFile(sFilePath, sBackupFolder, sPrefixFolder);
}
}
else
{
//from https://stackoverflow.com/questions/28360775/adding-large-files-to-io-compression-ziparchiveentry-throws-outofmemoryexception
using (var zipFile = ZipFile.Open(sZipFile, ZipArchiveMode.Update))
{
var zipEntry = zipFile.CreateEntry(sFileName);
using (var writer = new BinaryWriter(zipEntry.Open()))
using (FileStream fs = File.Open(sFilePath, FileMode.Open))
{
var buffer = new byte[16 * 1024];
using (var data = new BinaryReader(fs))
{
int read;
while ((read = data.Read(buffer, 0, buffer.Length)) > 0)
writer.Write(buffer, 0, read);
}
}
}
//Move File after zipping it
BackupFile(sFilePath, sBackupFolder, sPrefixFolder);
}
bIsSuccess = true;
}
catch (Exception ex)
{
throw ex;
}
return bIsSuccess;
}
What am I missing? Please give me suggestions, maybe a tutorial or best practices for handling this problem.
I know this is an old post but what can I say, it helped me sort out some stuff and still comes up as a top hit on Google.
So there is definitely something wrong with the System.IO.Compression library!
First and Foremost...
You must make sure to turn off "Prefer 32-bit". Having this set (in my case with a build for "AnyCPU") causes so many inconsistent issues.
Now with that said, I took some demo files (several less than 500MB, one at 500MB, and one at 1GB), and created a sample program with 3 buttons that made use of the 3 methods.
Button 1 - ZipArchive.CreateFromDirectory(AbsolutePath, TargetFile);
Button 2 - ZipArchive.CreateEntryFromFile(AbsolutePath, RelativePath);
Button 3 - Using the [16 * 1024] Byte Buffer method from above
Now here is where it gets interesting. (Assuming that the program is built as "AnyCPU" and with "Prefer 32-bit" unchecked)... all 3 methods worked on a Windows 64-bit OS, regardless of how much memory it had.
However, as soon as I ran the same test on a 32-Bit OS, regardless of how much memory it had, ONLY method 1 worked!
Methods 2 and 3 blew up with the OutOfMemory error AND, to add salt to the wound, method 3 (the preferred method of chunking) actually corrupted more files than method 2!
By corrupted, I mean that of my files, the 500 MB and the 1 GB files ended up in the zipped archive but at a size less than the original (they were basically truncated).
So I dunno... since there are not many 32-bit OS around anymore, I guess maybe it is a moot point.
But it seems like there are some bugs in the System.IO.Compression framework!
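For what it's worth, ZipArchiveMode.Update is documented to hold the entire archive contents in memory until disposal, which by itself can exhaust a 32-bit address space with files this size. When the archive is being created from scratch (the else branch in the question), a Create-mode stream copy avoids that buffering. A rough sketch using the same variable names as the question's method:
using (var archiveStream = new FileStream(sZipFile, FileMode.Create))
using (var archive = new ZipArchive(archiveStream, ZipArchiveMode.Create))
using (var entryStream = archive.CreateEntry(sFileName).Open())
using (var source = File.OpenRead(sFilePath))
{
    source.CopyTo(entryStream); // compresses as it streams; the file is never held in memory whole
}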
I am trying to send a file over a NetworkStream and rebuild it on the client side. I can get the data over correctly (I think), but when I use either a BinaryWriter or a FileStream object to recreate the file, the file is cut off at the beginning at the same point no matter what methodology I use.
private void ReadandSaveFileFromServer(ref TcpClient clientATF,ref NetworkStream currentStream, string locationToSave)
{
int fileSize = 0;
string fileName = "";
fileName = ReadStringFromServer(ref clientATF,ref currentStream);
fileSize = ReadIntFromServer(ref clientATF,ref currentStream);
byte[] fileSent = new byte[fileSize];
if (currentStream.CanRead && clientATF.Connected)
{
currentStream.Read(fileSent, 0, fileSent.Length);
WriteToConsole("Log Recieved");
}
else
{
WriteToConsole("Log Transfer Failed");
}
FileStream fileToCreate = new FileStream(locationToSave + "\\" + fileName, FileMode.Create);
fileToCreate.Seek(0, SeekOrigin.Begin);
fileToCreate.Write(fileSent, 0, fileSent.Length);
fileToCreate.Close();
//binWriter = new BinaryWriter(File.Open(locationToSave + "\\" + fileName, FileMode.Create));
//binWriter.Write(fileSent);
//binWriter.Close();
}
When I step through and check fileName and fileSize, they are correct. The byte[] is also fully populated. Any clue as to what I can do next?
Thanks in advance...
Sean
EDIT!!!:
So I figured out what is happening. When I read a string and then the Int from stream, the byte array is 256 indices long. So my read for string is taking in the int, which then will clobber the other areas. Need to figure this out...
For one thing, you can use the convenience method File.WriteAllBytes to write the data more simply. But I doubt that that's the problem.
You're assuming you can read all the data in a single call to Read. You're ignoring the return value. Don't do that - instead, read multiple times until either you've read everything you expect to, or you've reached the end of the stream. See this article for more details. If you're using .NET 4, there's a new CopyTo method you may find useful.
(As an aside, your use of ref suggests that you don't understand what it really means. It's well worth making sure you understand how arguments are passed in C#.)
To add to Jon Skeet's answer, your reading code should be:
int bytesRead;
int readPos = 0;
do
{
bytesRead = currentStream.Read(fileSent, readPos, fileSent.Length - readPos); // only ask for what's left in the buffer
readPos += bytesRead;
} while (bytesRead > 0);
If you are looking for a general solution for sending and receiving files over a network have you considered using a C# network library? It has probably solved most of the issues you will come across when trying to do this.
Disclaimer: I'm one of the developers of this library.
I came across this piece of code today:
public static byte[] ReadContentFromFile(String filePath)
{
FileInfo fi = new FileInfo(filePath);
long numBytes = fi.Length;
byte[] buffer = null;
if (numBytes > 0)
{
try
{
FileStream fs = new FileStream(filePath, FileMode.Open);
BinaryReader br = new BinaryReader(fs);
buffer = br.ReadBytes((int)numBytes);
br.Close();
fs.Close();
}
catch (Exception e)
{
System.Console.WriteLine(e.StackTrace);
}
}
return buffer;
}
My first thought is to refactor it down to this:
public static byte[] ReadContentFromFile(String filePath)
{
return File.ReadAllBytes(filePath);
}
System.IO.File.ReadAllBytes is documented as:
Opens a binary file, reads the
contents of the file into a byte
array, and then closes the file.
... but am I missing some key difference?
The original code returns a null reference if the file is empty, and won't throw an exception if it can't be read. Personally I think it's better to return an empty array, and to not swallow exceptions, but that's the difference between refactoring and redesigning I guess.
Oh, also, if the file length is changed between finding out the length and reading it, then the original code will read the original length. Again, I think the File.ReadAllBytes behaviour is better.
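If you actually wanted to keep the original semantics while still shortening it, a rough (untested) equivalent would have to reproduce those quirks explicitly; note it still cannot reproduce the read-only-the-original-length race:
public static byte[] ReadContentFromFile(String filePath)
{
    // Keeps the original quirks: null for an empty file, swallow-and-log on read failure.
    if (new FileInfo(filePath).Length == 0)
        return null;
    try
    {
        return File.ReadAllBytes(filePath);
    }
    catch (Exception e)
    {
        System.Console.WriteLine(e.StackTrace);
        return null;
    }
}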
What do you want to happen if the file doesn't exist?
That's basically the same method if you add the try {...} catch{...} block. The method name, ReadContentFromFile, further proves the point.
Wait a minute... isn't that something a unit test should tell you?
In this case, no, you are not missing anything at all from a file-operation standpoint. Just be aware that the lack of exception handling will change the behavior of the system.
It is a streamlined way of reading the bytes of a file.
Note that if you need to set any custom options on the read, then you would need the long form.
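For example, a sketch of that longer form with a few explicit options (the share mode, buffer size, and FileOptions values here are illustrative, not required):
public static byte[] ReadContentFromFile(string filePath)
{
    using (var fs = new FileStream(filePath, FileMode.Open, FileAccess.Read,
                                   FileShare.Read, 4096, FileOptions.SequentialScan))
    using (var br = new BinaryReader(fs))
    {
        return br.ReadBytes((int)fs.Length); // read the whole file in one go
    }
}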
Is there a way to know how many bytes of a stream have been used by StreamReader?
I have a project where we need to read a file that has a text header followed by the start of the binary data. My initial attempt to read this file was something like this:
private long _dataOffset;
void ReadHeader(string path)
{
using (FileStream stream = File.OpenRead(path))
{
StreamReader textReader = new StreamReader(stream);
string line;
do
{
    line = textReader.ReadLine();
    handleHeaderLine(line);
} while (line != "DATA"); // Yes, they used "DATA" to mark the end of the header
_dataOffset = stream.Position;
}
}
private byte[] ReadDataFrame(string path, int frameNum)
{
using (FileStream stream = File.OpenRead(path))
{
stream.Seek(_dataOffset + frameNum * cbFrame, SeekOrigin.Begin);
byte[] data = new byte[cbFrame];
stream.Read(data, 0, cbFrame);
return data;
}
return null;
}
The problem is that when I set _dataOffset to stream.Position, I get the position that the StreamReader has read to, not the end of the header. As soon as I thought about it this made sense, but I still need to be able to know where the end of the header is and I'm not sure if there's a way to do it and still take advantage of StreamReader.
You can find out how many bytes the StreamReader has actually returned (as opposed to read from the stream) in a number of ways, none of them too straightforward I'm afraid.
Call textReader.CurrentEncoding.GetByteCount on all of the text read so far, and then seek to that position in the stream.
Use some reflection hackery to retrieve the value of the private variable of the StreamReader object that corresponds to the current byte position within the internal buffer (which differs from the stream's position - usually behind it, though never past it). Judging by .NET Reflector, this variable seems to be named bytePos.
Don't bother using a StreamReader at all but instead implement your custom ReadLine function built on top of the Stream or BinaryReader even (BinaryReader is guaranteed never to read further ahead than what you request). This custom function must read from the stream char by char, so you'd actually have to use the low-level Decoder object (unless the encoding is ASCII/ANSI, in which case things are a bit simpler due to single-byte encoding).
Option 1 is going to be the least efficient, I would imagine (since you're effectively re-encoding text you just decoded), and option 3 the hardest to implement, though perhaps the most elegant. I'd probably recommend against using the ugly reflection hack (option 2), even though it looks tempting, being the most direct solution and only taking a couple of lines. (To be quite honest, the StreamReader class really ought to expose this variable via a public property, but alas it does not.) So in the end, it's up to you, but either method 1 or 3 should do the job nicely enough...
Hope that helps.
So the data is UTF-8 (the default encoding for StreamReader). This is a multibyte encoding, so IndexOf would be inadvisable. You could call:
Encoding.UTF8.GetByteCount(string)
on your data so far, adding 1 or 2 bytes for the missing line ending.
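A rough sketch of that idea, applied to the ReadHeader method from the question (my assumptions: the header uses CRLF line endings, the encoding really is UTF-8, and a BOM, if present, is three bytes):
using (FileStream stream = File.OpenRead(path))
{
    // Check for a UTF-8 BOM up front so it can be counted in the offset.
    byte[] preamble = new byte[3];
    int read = stream.Read(preamble, 0, 3);
    bool hasBom = read == 3 && preamble[0] == 0xEF && preamble[1] == 0xBB && preamble[2] == 0xBF;
    stream.Position = 0;

    StreamReader textReader = new StreamReader(stream);
    StringBuilder headerText = new StringBuilder(); // requires System.Text
    string line;
    do
    {
        line = textReader.ReadLine();
        if (line == null) throw new IOException("End of file reached in header.");
        headerText.Append(line).Append("\r\n"); // assumes CRLF line endings in the header
        handleHeaderLine(line);
    } while (line != "DATA");

    _dataOffset = Encoding.UTF8.GetByteCount(headerText.ToString()) + (hasBom ? 3 : 0);
}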
If you're needing to count bytes, I'd go with the BinaryReader. You can take the results and cast them about as needed, but I find its idea of its current position to be more reliable (in that since it reads in binary, it's immune to character-set problems).
So your last line contains 'DATA' + an unknown amount of data bytes. You could extract the position by using IndexOf() with your last read line. Then readjust the stream.Position.
But I am not sure if you should use ReadLine() at all in this case. Maybe it would be better to read byte by byte until you reach the 'DATA' mark.
The line breaks are easily identifiable without needing to decode the stream first (except for some encodings rarely used for text files like EBCDIC, UTF-16, UTF-32), so you can just read each line as bytes and then decode the entire line:
using (FileStream stream = File.OpenRead(path)) {
List<byte> buffer = new List<byte>();
bool hasCr = false;
bool done = false;
while (!done) {
int b = stream.ReadByte();
if (b == -1) throw new IOException("End of file reached in header.");
if (b == 13) {
hasCr = true;
} else if (b == 10 && hasCr) {
string line = Encoding.UTF8.GetString(buffer.ToArray(), 0, buffer.Count);
if (line == "DATA") {
done = true;
} else {
HandleHeaderLine(line);
}
buffer.Clear();
hasCr = false;
} else {
if (hasCr) buffer.Add(13);
hasCr = false;
buffer.Add((byte)b);
}
}
_dataOffset = stream.Position;
}
Instead of closing the stream and opening it again, you could of course just keep on reading the data.