I wrote the method below to archive several files into one file using binary mode:
// Compile archive
public void CompileArchive(string FilePath, ListView FilesList, Label Status, ProgressBar Progress)
{
FileTemplate TempFile = new FileTemplate();
if (FilesList.Items.Count > 0)
{
BinaryWriter Writer = new BinaryWriter(File.Open(FilePath, FileMode.Create), System.Text.Encoding.ASCII);
Progress.Maximum = FilesList.Items.Count - 1;
Writer.Write((long)FilesList.Items.Count);
for (int i = 0; i <= FilesList.Items.Count - 1; i++)
{
TempFile.Name = FilesList.Items[i].SubItems[1].Text;
TempFile.Path = "%ARCHIVE%";
TempFile.Data = this.ReadFileData(FilesList.Items[i].SubItems[2].Text + "\\" + TempFile.Name);
Writer.Write(TempFile.Name);
Writer.Write(TempFile.Path);
Writer.Write(TempFile.Data);
Status.Text = "Status: Writing '" + TempFile.Name + "'";
Progress.Value = i;
}
Writer.Close();
Status.Text = "Status: None";
Progress.Value = 0;
}
}
I read each file's data with ReadFileData (called in the method above), which returns the data as a string (via a StreamReader). Next I extract my archive. Everything appears to run fine, but the data restored by the extraction method is wrong, so the extracted files don't have the right contents and lose their original functionality.
Extract method:
// Extract archive
public void ExtractArchive(string ArchivePath, string ExtractPath, ListView FilesList, Label Status, ProgressBar Progress)
{
FileTemplate TempFile = new FileTemplate();
BinaryReader Reader = new BinaryReader(File.Open(ArchivePath, FileMode.Open), System.Text.Encoding.ASCII);
long Count = Reader.ReadInt64();
if (Count > 0)
{
Progress.Maximum = (int)Count - 1;
FilesList.Items.Clear();
for (int i = 0; i <= Count - 1; i++)
{
TempFile.Name = Reader.ReadString();
TempFile.Path = Reader.ReadString();
TempFile.Data = Reader.ReadString();
Status.Text = "Status: Reading '" + TempFile.Name + "'";
Progress.Value = i;
if (!Directory.Exists(ExtractPath))
{
Directory.CreateDirectory(ExtractPath);
}
BinaryWriter Writer = new BinaryWriter(File.Open(ExtractPath + "\\" + TempFile.Name, FileMode.Create), System.Text.Encoding.ASCII);
Writer.Write(TempFile.Data);
Writer.Close();
string[] ItemArr = new string[] { i.ToString(), TempFile.Name, TempFile.Path };
ListViewItem ListItem = new ListViewItem(ItemArr);
FilesList.Items.Add(ListItem);
}
Reader.Close();
Status.Text = "Status: None";
Progress.Value = 0;
}
}
The structure:
struct FileTemplate
{
public string Name, Path, Data;
}
Thanks.
Consider using byte arrays to write and save the data.
Byte array (write):
Byte[] bytes = File.ReadAllBytes(..);
// Write it into your stream
myStream.Write(bytes.Length);
myStream.Write(bytes, 0, bytes.Length);
Byte array (read):
Int32 byteCount = myStream.ReadInt32();
Byte[] bytes = new Byte[byteCount];
myStream.Read(bytes, 0, byteCount);
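Putting the write and read snippets together, here is a minimal, self-contained sketch of a byte-array-based pack/unpack pair (the method names, file names, and layout here are illustrative, not taken from your code). Note that BinaryWriter.Write(byte[]) does not record the array length, so an Int32 length prefix is written explicitly before each file's data:

```csharp
using System;
using System.IO;

class ArchiveSketch
{
    // Write each entry as: name (length-prefixed string), byte count (Int32), raw bytes.
    public static void Pack(string archivePath, string[] files)
    {
        using (var writer = new BinaryWriter(File.Open(archivePath, FileMode.Create)))
        {
            writer.Write(files.Length);
            foreach (string file in files)
            {
                byte[] bytes = File.ReadAllBytes(file);   // raw binary, no text encoding involved
                writer.Write(Path.GetFileName(file));
                writer.Write(bytes.Length);
                writer.Write(bytes);
            }
        }
    }

    // Read the entries back in the same order they were written.
    public static void Unpack(string archivePath, string extractDir)
    {
        Directory.CreateDirectory(extractDir);
        using (var reader = new BinaryReader(File.Open(archivePath, FileMode.Open)))
        {
            int count = reader.ReadInt32();
            for (int i = 0; i < count; i++)
            {
                string name = reader.ReadString();
                int byteCount = reader.ReadInt32();
                byte[] bytes = reader.ReadBytes(byteCount);
                File.WriteAllBytes(Path.Combine(extractDir, name), bytes);
            }
        }
    }

    static void Main()
    {
        // Round-trip a small binary file, including bytes above 127.
        File.WriteAllBytes("a.bin", new byte[] { 0x00, 0x89, 0xFF, 0x10 });
        Pack("test.archive", new[] { "a.bin" });
        Unpack("test.archive", "out");
        Console.WriteLine(BitConverter.ToString(File.ReadAllBytes(Path.Combine("out", "a.bin"))));
    }
}
```

Because no string encoding ever touches the file contents, bytes outside the ASCII range survive the round trip unchanged.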
The example of an icon makes it clear; you are using string-based APIs to handle data that isn't strings (icons are not string-based). Moreover, you are using ASCII, so only characters in the 0-127 range would ever be correct. Basically, you can't do that. You need to handle binary data using binary methods (perhaps using the Stream API).
Other options:
use serialization to store instances of objects with the data properties and a BLOB (byte[]) for the content
use something like zip (maybe SharpZipLib), which does something very similar, essentially
If your Data can be binary data, then you shouldn't have them in a string. They should be a byte[].
When you write a string using the ASCII encoding like you do, and try to write binary data, many of the bytes (treated as Unicode characters) can't be encoded and so you end up with damaged data.
Moral of the story: never treat binary data as text.
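A quick way to see the damage for yourself: round-trip a few bytes through an ASCII string. The .NET ASCII encoder replaces every value above 127 with '?' (0x3F), so the original data is unrecoverable. A minimal demonstration (the sample bytes are arbitrary):

```csharp
using System;
using System.Text;

class AsciiDamage
{
    static void Main()
    {
        // Real binary data, e.g. the first bytes of a PNG file.
        byte[] original = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A };

        // Treat the bytes as ASCII text and convert back.
        string asText = Encoding.ASCII.GetString(original);
        byte[] roundTripped = Encoding.ASCII.GetBytes(asText);

        Console.WriteLine(BitConverter.ToString(original));     // 89-50-4E-47-0D-0A
        Console.WriteLine(BitConverter.ToString(roundTripped)); // 3F-50-4E-47-0D-0A: 0x89 became '?'
    }
}
```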
Related
I want to read from a "start" to a "stop" offset in a raw image file that I've created with FTK Imager.
I have code that works, but I don't know if it's the best way of doing it.
// Read file, byte at the time (example 00, 5A)
int start = 512;
int stop = 3345332;
FileStream fs = new FileStream("file.001", FileMode.Open, FileAccess.Read);
int hexIn;
String hex;
String data = "";
fs.Position = start;
for (int i = 0; i < stop; i++) { // i = offset in bytes
hexIn = fs.ReadByte();
hex = string.Format("{0:X2}", hexIn);
data = data + hex;
} //for
fs.Close();
Console.WriteLine("data=" + data);
You want to read a range of bytes from within a file. Why not read all the bytes in one go into an array and then do the transformation?
private string ReadFile(string filename, int offset, int length)
{
byte[] data = new byte[length];
using (FileStream fs = new FileStream(filename, FileMode.Open))
{
fs.Position = offset;
fs.Read(data, 0, length);
}
return string.Join("", data.Select(x => x.ToString("X2"))); // requires using System.Linq
}
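One caveat with the sketch above: Stream.Read is not guaranteed to fill the buffer in a single call; it may return fewer bytes than requested. For large ranges you may want to loop until everything has arrived. A small helper sketch (the method name is made up):

```csharp
using System;
using System.IO;

static class StreamUtil
{
    // Read exactly 'length' bytes starting at 'offset'. Stream.Read may
    // return fewer bytes than requested, so loop until done (or EOF).
    public static byte[] ReadExactly(Stream fs, long offset, int length)
    {
        byte[] data = new byte[length];
        fs.Position = offset;
        int total = 0;
        while (total < length)
        {
            int read = fs.Read(data, total, length - total);
            if (read == 0) break; // end of stream reached early
            total += read;
        }
        return data;
    }

    static void Main()
    {
        using (var ms = new MemoryStream(new byte[] { 1, 2, 3, 4, 5 }))
            Console.WriteLine(BitConverter.ToString(ReadExactly(ms, 1, 3))); // 02-03-04
    }
}
```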
I have a file which holds records of employee data and images. Each record contains one employee's data, his image, and his wife's image. I can't change the file structure.
There are separators between the text data and the images.
Here a sample of one record:
record number D01= employee name !=IMG1= employee image ~\IMG2= wife image ^! \r\n
(D01= & !=IMG1= & ~\IMG2= & ^!) are the separators
This is the code how the file was written:
FileStream fs = new FileStream(filePath, FileMode.Create);
StreamWriter sw = new StreamWriter(fs, Encoding.UTF8);
BinaryWriter bw = new BinaryWriter(fs);
sw.Write(employeeDataString);
sw.Write("!=IMG1=");
sw.Flush();
bw.Write(employeeImg, 0, employeeImg.Length);
bw.Flush();
sw.Write(@"~\IMG2=");
sw.Flush();
bw.Write(wifeImg, 0, wifeImg.Length);
bw.Flush();
sw.Write("^!");
sw.Flush();
sw.Write(@"\r\n");
sw.Flush();
So how to read that file?
There are many kinds of files; the three most common ways to store records are:
Fixed size records, ideally with fixed size fields. Very simple to implement random access.
Tagged files with tags and data interwoven. A bit complicated, but highly flexible and still rather efficiently readable, since the tags hold the positions and lengths of the data.
And then there are Separated files. Always a pain.
Two issues:
You must be sure that the separators never occur in the data. That is not 100% possible when you have binary data like images.
There is no efficient way to access individual records.
Ignoring the 1st issue, here is a piece of code that will read all records into a list of class ARecord.
FileStream fs;
BinaryReader br;
List<ARecord> theRecords;
class ARecord
{
public string name { get; set; }
public Image img1 { get; set; }
public Image img2 { get; set; }
}
int readFile(string filePath)
{
fs = new FileStream(filePath, FileMode.Open);
br = new BinaryReader(fs, Encoding.UTF8);
theRecords = new List<ARecord>();
ARecord record = getNextRecord();
while (record != null)
{
theRecords.Add(record);
record = getNextRecord();
}
return theRecords.Count;
}
ARecord getNextRecord()
{
ARecord record = new ARecord ();
MemoryStream ms;
System.Text.UTF8Encoding enc = new System.Text.UTF8Encoding();
byte[] sepImg1 = enc.GetBytes(@"!=IMG1=");
byte[] sepImg2 = enc.GetBytes(@"~\IMG2=");
byte[] sepRec = enc.GetBytes(@"^!\r\n");
record.name = enc.GetString(readToSep(sepImg1));
ms = new MemoryStream(readToSep(sepImg2));
if (ms.Length <= 0) return null; // check for EOF
record.img1 = Image.FromStream(ms);
ms = new MemoryStream(readToSep(sepRec));
record.img2 = Image.FromStream(ms);
return record;
}
byte[] readToSep(byte[] sep)
{
List<byte> data = new List<byte>();
bool eor = false;
int sLen = sep.Length;
int sPos = 0;
while (br.BaseStream.Position < br.BaseStream.Length && !eor )
{
byte b = br.ReadByte();
data.Add(b);
if (b != sep[sPos]) { sPos = 0; }
else if (sPos < sLen - 1) sPos++; else eor = true;
}
if (data.Count > sLen) data.RemoveRange(data.Count - sLen, sLen);
return data.ToArray();
}
Notes:
There is no error checking whatsoever.
Watch those separators! Is the @ really right?? (With @, "\r\n" is a literal backslash sequence, not a CR-LF newline.)
Expanding the code to create the record number is left to you
I'm creating a simple self-extracting archive, using a magic number to mark the beginning of the content.
For now the content is a text file:
MAGICNUMBER .... content of the text file
Next, the text file is copied to the end of the executable:
copy programm.exe /b + textfile.txt /b sfx.exe
I'm trying to find the second occurrence of the magic number (the first one would be a hardcoded constant obviously) using the following code:
string my_filename = System.Diagnostics.Process.GetCurrentProcess().MainModule.FileName;
StreamReader file = new StreamReader(my_filename);
const int block_size = 1024;
const string magic = "MAGICNUMBER";
char[] buffer = new Char[block_size];
Int64 count = 0;
Int64 glob_pos = 0;
bool flag = false;
while (file.ReadBlock(buffer, 0, block_size) > 0)
{
var rel_pos = buffer.ToString().IndexOf(magic);
if ((rel_pos > -1) & (!flag))
{
flag = true;
continue;
}
if ((rel_pos > -1) & (flag == true))
{
glob_pos = block_size * count + rel_pos;
break;
}
count++;
}
using (FileStream fs = new FileStream(my_filename, FileMode.Open, FileAccess.Read))
{
byte[] b = new byte[fs.Length - glob_pos];
fs.Seek(glob_pos, SeekOrigin.Begin);
fs.Read(b, 0, (int)(fs.Length - glob_pos));
File.WriteAllBytes("c:/output.txt", b);
}
but for some reason I'm copying almost the entire file, not the last few kilobytes. Is it because of compiler optimization inlining the magic constant in the while loop, or something similar?
How should I do self-extraction archive properly?
I guessed I should read the file backwards to avoid problems with the compiler inlining the magic constant multiple times.
So I've modified my code in the following way:
string my_filename = System.Diagnostics.Process.GetCurrentProcess().MainModule.FileName;
StreamReader file = new StreamReader(my_filename);
const int block_size = 1024;
const string magic = "MAGIC";
char[] buffer = new Char[block_size];
Int64 count = 0;
Int64 glob_pos = 0;
while (file.ReadBlock(buffer, 0, block_size) > 0)
{
var rel_pos = buffer.ToString().IndexOf(magic);
if (rel_pos > -1)
{
glob_pos = block_size * count + rel_pos;
}
count++;
}
using (FileStream fs = new FileStream(my_filename, FileMode.Open, FileAccess.Read))
{
byte[] b = new byte[fs.Length - glob_pos];
fs.Seek(glob_pos, SeekOrigin.Begin);
fs.Read(b, 0, (int)(fs.Length - glob_pos));
File.WriteAllBytes("c:/output.txt", b);
}
So I scanned the whole file once, found what I thought would be the last occurrence of the magic number, and copied from there to the end. While the file created by this procedure seems smaller than in the previous attempt, it is in no way the same file I attached to my "self-extracting" archive. Why?
My guess is that the position calculation of the beginning of the attached file is wrong due to the conversion from binary to string. If so, how should I modify my position calculation to make it correct?
Also, how should I choose a magic number when working with real files, PDFs for example? I won't be able to modify PDFs easily to include a predefined magic number.
Try this out. Some C# Stream IO 101:
public static void Main()
{
String path = @"c:\here is your path";
// Method A: Read all information into a Byte Stream
Byte[] data = System.IO.File.ReadAllBytes(path);
String[] lines = System.IO.File.ReadAllLines(path);
// Method B: Use a stream to do essentially the same thing. (More powerful)
// Using block essentially means 'close when we're done'. See 'using block' or 'IDisposable'.
using (FileStream stream = File.OpenRead(path))
using (StreamReader reader = new StreamReader(stream))
{
// This will read all the data as a single string
String allData = reader.ReadToEnd();
}
String outputPath = @"C:\where I'm writing to";
// Copy from one file-stream to another
using (FileStream inputStream = File.OpenRead(path))
using (FileStream outputStream = File.Create(outputPath))
{
inputStream.CopyTo(outputStream);
// Again, this will close both streams when done.
}
// Copy to an in-memory stream
using (FileStream inputStream = File.OpenRead(path))
using (MemoryStream outputStream = new MemoryStream())
{
inputStream.CopyTo(outputStream);
// Again, this will close both streams when done.
// If you want to hold the data in memory, just don't wrap your
// memory stream in a using block.
}
// Use serialization to store data.
var serializer = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
// We'll serialize a person to the memory stream.
MemoryStream memoryStream = new MemoryStream();
serializer.Serialize(memoryStream, new Person() { Name = "Sam", Age = 20 });
// Now the person is stored in the memory stream (just as easy to write to disk
// using a file stream as well).
// Now lets reset the stream to the beginning:
memoryStream.Seek(0, SeekOrigin.Begin);
// And deserialize the person
Person deserializedPerson = (Person)serializer.Deserialize(memoryStream);
Console.WriteLine(deserializedPerson.Name); // Should print Sam
}
// Mark Serializable stuff as serializable.
// This means that C# will automatically format this to be put in a stream
[Serializable]
class Person
{
public String Name { get; set; }
public Int32 Age { get; set; }
}
The easiest solution is to replace
const string magic = "MAGICNUMBER";
with
static string magic = "magicnumber".ToUpper();
But there are more problems with the whole magic-string approach. What if the file contains the magic string? I think the best solution is to put the file size after the file. The extraction is much easier that way: read the length from the last bytes, then read the required number of bytes from the end of the file.
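A sketch of that "length at the end" layout (all names here are made up for illustration): append the payload, then an Int64 holding its size; extraction seeks to the last eight bytes first and works backwards, so no magic number is needed at all.

```csharp
using System;
using System.IO;

class TrailerSketch
{
    // Append the payload to an existing file, followed by an 8-byte length trailer.
    public static void Append(string hostPath, byte[] payload)
    {
        using (var bw = new BinaryWriter(new FileStream(hostPath, FileMode.Append)))
        {
            bw.Write(payload);
            bw.Write((long)payload.Length); // trailer: payload size
        }
    }

    // Read the trailer from the end, then seek back to the payload itself.
    public static byte[] Extract(string hostPath)
    {
        using (var br = new BinaryReader(File.OpenRead(hostPath)))
        {
            br.BaseStream.Seek(-8, SeekOrigin.End);
            long payloadLength = br.ReadInt64();
            br.BaseStream.Seek(-8 - payloadLength, SeekOrigin.End);
            return br.ReadBytes((int)payloadLength);
        }
    }

    static void Main()
    {
        File.WriteAllBytes("host.bin", new byte[] { 1, 2, 3 }); // stand-in for the exe
        Append("host.bin", new byte[] { 0xCA, 0xFE });
        Console.WriteLine(BitConverter.ToString(Extract("host.bin"))); // CA-FE
    }
}
```

This works for any payload, PDFs included, because nothing inside the payload is ever searched for.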
Update: This should work unless your files are very big. (You'd need to use a revolving pair of buffers in that case (to read the file in small blocks)):
string inputFilename = System.Diagnostics.Process.GetCurrentProcess().MainModule.FileName;
string outputFilename = inputFilename + ".secret";
string magic = "magic".ToUpper();
byte[] data = File.ReadAllBytes(inputFilename);
byte[] magicData = Encoding.ASCII.GetBytes(magic);
for (int idx = magicData.Length - 1; idx < data.Length; idx++) {
bool found = true;
for (int magicIdx = 0; magicIdx < magicData.Length; magicIdx++) {
if (data[idx - magicData.Length + 1 + magicIdx] != magicData[magicIdx]) {
found = false;
break;
}
}
if (found) {
using (FileStream output = new FileStream(outputFilename, FileMode.Create)) {
output.Write(data, idx + 1, data.Length - idx - 1);
}
}
}
Update2: This should be much faster, use little memory, and work on files of any size, but your program must be a proper executable (with size being a multiple of 512 bytes):
string inputFilename = System.Diagnostics.Process.GetCurrentProcess().MainModule.FileName;
string outputFilename = inputFilename + ".secret";
string marker = "magic".ToUpper();
byte[] data = File.ReadAllBytes(inputFilename);
byte[] markerData = Encoding.ASCII.GetBytes(marker);
int markerLength = markerData.Length;
const int blockSize = 512; //important!
using(FileStream input = File.OpenRead(inputFilename)) {
long lastPosition = 0;
byte[] buffer = new byte[blockSize];
while (input.Read(buffer, 0, blockSize) >= markerLength) {
bool found = true;
for (int idx = 0; idx < markerLength; idx++) {
if (buffer[idx] != markerData[idx]) {
found = false;
break;
}
}
if (found) {
input.Position = lastPosition + markerLength;
using (FileStream output = File.OpenWrite(outputFilename)) {
input.CopyTo(output);
}
}
lastPosition = input.Position;
}
}
Read about some approaches here: http://www.strchr.com/creating_self-extracting_executables
You can add the compressed file as resource to the project itself:
Project > Properties
Set the property of this resource to Binary.
You can then retrieve the resource with
byte[] resource = Properties.Resources.NameOfYourResource;
Search backwards rather than forwards (assuming your file won't contain said magic number).
Or append your (text) file and then lastly its length (or the length of the original exe), so you only need read the last DWORD / few bytes to see how long the file is - then no magic number is required.
More robustly, store the file as an additional data section within the executable file. This is more fiddly without external tools as it requires knowledge of the PE file format used for NT executables, q.v. http://msdn.microsoft.com/en-us/library/ms809762.aspx
I need to copy the content of a one.xaml file into a byte clob. This is my code. It looks like I am not accessing the content of the file; can anyone tell me why? I am new to the C# APIs, but I am a programmer. The choice of 4000 is because of the maximum string size restriction, in case anyone wonders. I might have bugs around sizes etc., but the main thing is that I want to get the content of the xaml file into the clob. Thanks.
string LoadedFileName = @"C:\temp2\one.xaml";//Fd.FileName;
byte[] clobByteTotal ;
FileStream stream = new FileStream(LoadedFileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
if (stream.Length % 2 >= 1)
{
clobByteTotal = new byte[stream.Length + 1];
}
else clobByteTotal = new byte[stream.Length];
for (int i = 0; i <= stream.Length/4000; i++)
{
int x = (stream.Length / 4000 == 0) ? (int)stream.Length : 4000;
stream.Read(stringSizeClob, i*4000, x);
String tempString1 = stringSizeClob.ToString();
byte[] clobByteSection = Encoding.Unicode.GetBytes(stringSizeClob.ToString());
Buffer.BlockCopy(clobByteSection, 0, clobByteTotal, i * clobByteSection.Length, clobByteSection.Length);
}
If you just need to read the content of a text file into a byte array, you can do this:
string xamlText = File.ReadAllText(LoadedFileName);
byte[] xamlBytes = Encoding.Unicode.GetBytes(xamlText); //if this is a Unicode and not UTF8
//write byte data somewhere
This is a much shorter option, naturally suitable for files that are not too big.
Any reason not to use File.ReadAllBytes?
byte[] xamlBytes = File.ReadAllBytes(path);
I'm using the following code to convert a hex string written in a txt file to a byte file. The problem is that it doesn't handle large txt files and I get an "out of memory" exception. I know it should be done in "chunks" but I just can't get it right.
Please help! The code:
protected void Button1_Click(object sender, EventArgs e)
{
{
string tempFileName = (Server.MapPath("~\\Tempfolder\\" + FileUpload2.FileName));
using (FileStream fs = new FileStream(tempFileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (StreamReader sr = new StreamReader(fs))
{
string s = (sr.ReadToEnd());
if (s.Length % 2 == 1) { lblispis.Text = "String must have an even length"; }
else
{
string hexString = s;
File.WriteAllBytes(tempFileName + ".bin", StringToByteArray(hexString));
lblispis.Text = "Done.";
}
}
}
}
public static byte[] StringToByteArray(String hex)
{
int NumberChars = hex.Length;
byte[] bytes = new byte[NumberChars / 2];
for (int i = 0; i < NumberChars; i += 2)
bytes[i / 2] = Convert.ToByte(hex.Substring(i, 2), 16);
return bytes;
}
You could replace the ReadToEnd call with ReadLine and wrap it in a loop, if the file format allows that.
If that's not the case, there's always the option to read an even number of characters (Read(char[], int, int)) until you hit the end of the file. Of course that way you detect an uneven number of characters very late after having done quite some work already.
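A sketch of that second option, reading fixed-size chunks with Read(char[], int, int) (the file names and buffer size are placeholders). Because Read may return an odd number of characters mid-stream, a trailing character is carried over into the next chunk so hex pairs are never split:

```csharp
using System;
using System.IO;

class HexChunks
{
    public static void ConvertHexFile(string hexPath, string binPath)
    {
        char[] buffer = new char[4096];
        int leftover = 0; // carried-over odd trailing character, if any
        using (var reader = new StreamReader(hexPath))
        using (var writer = new BinaryWriter(File.Open(binPath, FileMode.Create)))
        {
            int read;
            while ((read = reader.Read(buffer, leftover, buffer.Length - leftover)) > 0)
            {
                int available = leftover + read;
                int usable = available - (available % 2); // only convert whole pairs
                writer.Write(StringToByteArray(new string(buffer, 0, usable)));
                leftover = available - usable;
                if (leftover == 1) buffer[0] = buffer[usable]; // carry the odd char
            }
            if (leftover == 1)
                throw new InvalidDataException("Hex input has an odd number of characters.");
        }
    }

    // Same pair-wise conversion as in the question, applied per chunk.
    static byte[] StringToByteArray(string hex)
    {
        byte[] bytes = new byte[hex.Length / 2];
        for (int i = 0; i < hex.Length; i += 2)
            bytes[i / 2] = Convert.ToByte(hex.Substring(i, 2), 16);
        return bytes;
    }

    static void Main()
    {
        File.WriteAllText("input.hex", "00FF10AB");
        ConvertHexFile("input.hex", "output.bin");
        Console.WriteLine(BitConverter.ToString(File.ReadAllBytes("output.bin"))); // 00-FF-10-AB
    }
}
```

This keeps memory usage bounded by the buffer size, so the input file can be arbitrarily large.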
To add to @Wormbo's answer, note that a hex string only contains twice as many characters as the resulting byte array. In .NET, the object size limit is 2GB (2GB is actually the process size limit on a 32-bit machine), but you can easily have problems allocating even ~800MB contiguous blocks due to heap fragmentation.
In other words, you will want to write directly to disk, immediately after converting it:
using (StreamReader reader = new StreamReader(hex))
using (BinaryWriter writer = new BinaryWriter(File.Open(bin, FileMode.Create)))
{
string line;
while ((line = reader.ReadLine()) != null)
writer.Write(StringToByteArray(line));
}
[Edit]
I've fixed it, parentheses had to be added around the assignment (check the while statement above).
Note that this is only a shorthand for something like:
string line = reader.ReadLine();
while (line != null)
{
writer.Write(...);
line = reader.ReadLine();
}