I have a file containing data that I want to read as a byte[] and divide into 3 blocks. The first line can be read as a string; the 2nd block may be 1-3 lines of unknown length; all remaining bytes form block 3.
How can I get block 1 and block 2 as strings built from the byte[], while keeping block 3 as a byte[]?
File:
00256000 12 // block 1 single line
a2#b2#c2#d2#e2# //
1# // block 2 readline doesn't fit, unknown length of lines
1# //
—q3л // block 3 left bytes
I was trying FileStream.Read(bytes, 0, file.Length), but that just reads all the bytes.
StreamReader.ReadLine() is suitable only for the 1st line, but it returns a plain string rather than bytes, and it strips '\n', '\r', etc.
I don't know which approach is better; ideally I would read all the bytes and somehow divide them into these 3 blocks, so that I know each block's exact size.
You can read all the bytes and iterate through the buffer searching for line endings. When you find a line ending, convert the text part with
string text = Encoding.UTF8.GetString(buffer, start, length);
P.S. Be sure to use the file's actual encoding... UTF8 is just an example...
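For illustration, a minimal sketch along those lines (the file name, the use of '\n' as the separator, and the rule that block 2 ends at the first line not ending with '#' are all my assumptions, not part of the question):

using System;
using System.IO;
using System.Text;

class BlockSplitter
{
    static void Main()
    {
        byte[] buffer = File.ReadAllBytes("data.bin"); // hypothetical file name

        // Block 1: everything up to the first line ending.
        int firstEol = Array.IndexOf(buffer, (byte)'\n');
        string block1 = Encoding.UTF8.GetString(buffer, 0, firstEol).TrimEnd('\r');

        // Block 2: the following lines; here I assume they end with '#'.
        int pos = firstEol + 1;
        var block2 = new StringBuilder();
        while (true)
        {
            int eol = Array.IndexOf(buffer, (byte)'\n', pos);
            if (eol < 0) break;
            string line = Encoding.UTF8.GetString(buffer, pos, eol - pos).TrimEnd('\r');
            if (!line.EndsWith("#")) break; // first line that doesn't match ends block 2
            block2.AppendLine(line);
            pos = eol + 1;
        }

        // Block 3: all remaining bytes, kept as raw bytes.
        byte[] block3 = new byte[buffer.Length - pos];
        Array.Copy(buffer, pos, block3, 0, block3.Length);
    }
}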
Create a class with the data members you want, mark it as serializable, then serialize the data (i.e. save it to a file) and deserialize it whenever you want the data.
[Serializable()]
public class Data1
{
    public Data1()
    {
    }

    public String[] Block { get; set; }
}
To load that data after you have saved it, use some technique like this:
public Data1 Load(string filename)
{
    if (System.IO.File.Exists(filename))
    {
        using (var stream = System.IO.File.OpenRead(filename))
        {
            var deserializer = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
            return deserializer.Deserialize(stream) as Data1;
        }
    }
    return null;
}
I won't do it all for you, though! You need to look into how to Serialize an instance of Data1.
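As a hint, the saving side could look roughly like this (a sketch mirroring the Load method above; the method shape and name are my own, not part of this answer):

public void Save(string filename, Data1 data)
{
    using (var stream = System.IO.File.Create(filename))
    {
        var serializer = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
        serializer.Serialize(stream, data);
    }
}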
There is a very convenient method for reading small files; it returns an array of lines.
string[] lines = File.ReadAllLines("filename");
Related
I want to know whether my byte array ends with a carriage return and, if not, I want to add it.
This is what I have tried:
byte[] fileContent = File.ReadAllBytes(openFileDialog.FileName);
byte[] endCharacter = fileContent.Skip(fileContent.Length - 2).Take(2).ToArray();
if (!(endCharacter.Equals(Encoding.ASCII.GetBytes(Environment.NewLine))))
{
fileContent = fileContent.Concat(Encoding.ASCII.GetBytes(Environment.NewLine)).ToArray();
}
But I don't get it... Is this the right approach? If so, what's wrong with Equals? Even if my byte array ends with {10, 13}, the if statement never detects it.
In this case, Equals checks for reference equality; while endCharacter and Encoding.ASCII.GetBytes(Environment.NewLine) may have the same contents, they are not the same array, so Equals returns false.
You're interested in value equality, so you should instead individually compare the values at each position in the arrays:
byte[] newLine = Encoding.ASCII.GetBytes(Environment.NewLine);
if (endCharacter[0] != newLine[0] || endCharacter[1] != newLine[1])
{
    // the bytes differ, so the file does not end with a newline
}
In general, if you want to compare arrays for value equality, you could use something like this method, provided by Marc Gravell.
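A straightforward version of such a helper could look like this (my own sketch, not the code from the linked answer):

static bool BytesEqual(byte[] a, byte[] b)
{
    if (ReferenceEquals(a, b)) return true;
    if (a == null || b == null || a.Length != b.Length) return false;
    for (int i = 0; i < a.Length; i++)
    {
        if (a[i] != b[i]) return false; // any differing position means the arrays are not equal
    }
    return true;
}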
However, a much more efficient solution to your problem would be to convert the last two bytes of your file into ASCII and do a string comparison (since System.String already overloads == to check for value equality):
string endCharacter = Encoding.ASCII.GetString(fileContent, fileContent.Length - 2, 2);
if (endCharacter == Environment.NewLine)
{
// ...
}
You may also need to be careful about reading the entire file into memory if it's likely to be large. If you don't need the full contents of the file, you could do this more efficiently by just reading in the final two bytes, inspecting them, and appending directly to the file as necessary. This can be achieved by opening a System.IO.FileStream for the file (through System.IO.File.Open).
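That approach might look roughly like this (a sketch assuming the file is at least two bytes long; requires System.IO, System.Text and System.Linq):

using (var fs = File.Open(openFileDialog.FileName, FileMode.Open, FileAccess.ReadWrite))
{
    byte[] newLine = Encoding.ASCII.GetBytes(Environment.NewLine);
    byte[] tail = new byte[newLine.Length];

    // Read only the last bytes instead of the whole file.
    fs.Seek(-tail.Length, SeekOrigin.End);
    fs.Read(tail, 0, tail.Length);

    if (!tail.SequenceEqual(newLine))
    {
        // Append the newline directly to the end of the file.
        fs.Seek(0, SeekOrigin.End);
        fs.Write(newLine, 0, newLine.Length);
    }
}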
I found the solution: I have to use SequenceEqual (http://www.dotnetperls.com/sequenceequal) instead of Equals. Thanks to everyone!
byte[] fileContent = File.ReadAllBytes(openFileDialog.FileName);
byte[] endCharacter = fileContent.Skip(fileContent.Length - 2).Take(2).ToArray();
if (!(endCharacter.SequenceEqual(Encoding.ASCII.GetBytes(Environment.NewLine))))
{
fileContent = fileContent.Concat(Encoding.ASCII.GetBytes(Environment.NewLine)).ToArray();
File.AppendAllText(openFileDialog.FileName, Environment.NewLine);
}
I would like to encode data into a binary format in a buffer which I will later either write to a file or transfer over a socket. What C# class or classes would be best for creating a List<byte> containing the binary data?
I will be storing integers, single-byte character strings (i.e., ASCII), floating-point numbers and other data in this buffer, using a custom encoding for the strings and the regular binary layout for the integer and floating-point types.
BinaryWriter looks like it has the methods I need, but it would have to manage a growing buffer for me, from which I can produce a List<byte> result when I am done encoding.
Thanks
BinaryWriter, writing to a MemoryStream. If you need more than the available memory, you can easily switch to a temporary file stream.
using (var myStream = new MemoryStream()) {
    // leaveOpen: true (available since .NET 4.5) keeps myStream usable after the writer is disposed
    using (var myWriter = new BinaryWriter(myStream, Encoding.UTF8, true)) {
        // write here
    }
    myStream.Position = 0; // rewind before reading back
    using (var myReader = new BinaryReader(myStream, Encoding.UTF8, true)) {
        // read here
    }
    // put the bytes into an array...
    var myBuffer = myStream.ToArray();
    // if you *really* want a List<Byte> (you probably don't - see my comment)
    var myBytesList = myStream.ToArray().ToList();
}
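For instance, the write/read placeholders above could be filled in along these lines (just a sketch of the kinds of values the question mentions; the actual layout is up to you):

// inside the writer block:
myWriter.Write(42);                               // Int32, 4 bytes
myWriter.Write(3.14f);                            // Single, 4 bytes
myWriter.Write(Encoding.ASCII.GetBytes("HELLO")); // raw ASCII bytes, no length prefix

// inside the reader block (after rewinding the stream):
int i = myReader.ReadInt32();
float f = myReader.ReadSingle();
byte[] s = myReader.ReadBytes(5);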
BinaryWriter writes to a stream. Give it a MemoryStream, and when you want your List<byte>, use new List<byte>(stream.ToArray()) (note that stream.GetBuffer() would return the internal buffer, which can be longer than the data actually written).
For certain reasons, I have to create a 1024 kb .txt file.
Below is my current code:
int size = 1024000; // 1024 kb
byte[] bytearray = new byte[size];
foreach (byte bit in bytearray)
{
bit = 0;
}
string tobewritten = string.Empty;
foreach (byte bit in bytearray)
{
tobewritten += bit.ToString();
}
//newPath is local directory, where I store the created file
using (System.IO.StreamWriter sw = File.CreateText(newPath))
{
sw.WriteLine(tobewritten);
}
I have to wait at least 30 minutes to execute this piece of code, which I consider too long.
Now, I would like to ask for advice on how to actually achieve my mentioned objective effectively. Are there any alternatives to do this task? Am I writing bad code? Any help is appreciated.
There are several misunderstandings in the code you provided:
byte[] bytearray = new byte[size];
foreach (byte bit in bytearray)
{
bit = 0;
}
You seem to think that you are initializing each byte in your array bytearray with zero. Instead, you just set the loop variable bit (unfortunate naming) to zero, size times. Actually, this code wouldn't even compile, since you cannot assign to the foreach iteration variable.
Also, you don't need initialization here in the first place: byte array elements are automatically initialized to 0.
string tobewritten = string.Empty;
foreach (byte bit in bytearray)
{
tobewritten += bit.ToString();
}
You want to append the string representation of each byte in your array to the string variable tobewritten. Since strings are immutable, you create a new string for each element that has to be garbage collected, along with the string you created for bit; this is relatively expensive, especially when you create 2,048,000 of them - use a StringBuilder instead.
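A StringBuilder version of that loop would look roughly like this (my sketch):

var sb = new StringBuilder(size); // pre-sized so the internal buffer doesn't have to grow
foreach (byte b in bytearray)
{
    sb.Append(b); // appends "0"; avoids allocating a new concatenated string each iteration
}
string tobewritten = sb.ToString();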
Lastly, none of that is needed anyway - it seems you just want to write a bunch of "0" characters to a text file. If you are not worried about creating a single large string of zeros (whether this makes sense depends on the value of size), you can create the string directly in one go - or alternatively write a smaller string to the stream a number of times.
using (var file = File.CreateText(newpath))
{
    file.WriteLine(new string('0', size));
}
Replace the string with a pre-sized StringBuilder to avoid unnecessary allocations.
Or, better yet, write each piece directly to the StreamWriter instead of pointlessly building a large in-memory string first.
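For example, a chunked version could look like this (the chunk size is arbitrary):

using (var sw = File.CreateText(newPath))
{
    const int chunkSize = 4096;
    string chunk = new string('0', chunkSize);
    for (int written = 0; written < size; written += chunkSize)
    {
        // write a full chunk, or a smaller final piece if size isn't a multiple of chunkSize
        sw.Write(written + chunkSize <= size ? chunk : new string('0', size - written));
    }
    sw.WriteLine();
}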
I've got some code which populates a stream. After population, the stream's length is 1000 (for instance), while the length of the string returned from Stream.ReadString is 997 and the value returned from StreamReader.ReadToEnd() is an empty string.
Here's some code showing what I mean (obviously, this isn't exactly my working code, but the issue is the same):
MemoryStream stream = MethodCreatingPopulatedStream(stream);
StreamReader reader = new StreamReader(stream);
if (stream.Length != reader.ReadToEnd().Length)
{
PostQuestionInStackOverFlow();
}
else if (!string.Equals(reader.ReadToEnd(), stream.ReadString()))
{
PostQuestionInStackOverFlow();
GetAnnoyedAtDotNet();
}
else
{
Smile();
}
What am I missing here?
P.S. Adding Stream.Flush anywhere made no difference.
The length of the string (characters) is not necessarily the same length as the stream (bytes). It depends entirely on the encoding and any other overhead associated with storing a string (such as storing its length).
As for your second test, stream.ReadString() doesn't even exist and would have to assume a certain encoding if it did.
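A quick way to see the character-count versus byte-count difference (example values are mine):

string text = "héllo";                      // 5 characters
byte[] utf8 = Encoding.UTF8.GetBytes(text); // 6 bytes: 'é' takes two bytes in UTF-8

var stream = new MemoryStream(utf8);
string roundTripped = new StreamReader(stream, Encoding.UTF8).ReadToEnd();

Console.WriteLine(text.Length);          // 5
Console.WriteLine(stream.Length);        // 6
Console.WriteLine(roundTripped.Length);  // 5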
I’m writing text to a binary file in C# and see a difference in the number of bytes written between writing a string and a character array. I’m using System.IO.BinaryWriter and watching BinaryWriter.BaseStream.Length as the writes occur. These are my results:
using (BinaryWriter bw = new BinaryWriter(File.Open("data.dat", FileMode.Create), Encoding.ASCII))
{
    string value = "Foo";
    // Writes 4 bytes
    bw.Write(value);
    // Writes 3 bytes
    bw.Write(value.ToCharArray());
}
I don’t understand why the string overload writes 4 bytes when I’m writing only 3 ASCII characters. Can anyone explain this?
The documentation for BinaryWriter.Write(string) states that it writes a length-prefixed string to this stream. The overload for Write(char[]) has no such prefixing.
It would seem to me that the extra data is the length.
EDIT:
Just to be a bit more explicit, use Reflector. You will see that it has this piece of code in there as part of the Write(string) method:
this.Write7BitEncodedInt(byteCount);
It is a way to encode an integer using the fewest possible bytes. For short strings (the everyday kind, less than 128 characters), the length can be represented in a single byte. For longer strings, it starts to use more bytes.
Here is the code for that function just in case you are interested:
protected void Write7BitEncodedInt(int value)
{
    uint num = (uint) value;
    while (num >= 0x80)
    {
        this.Write((byte) (num | 0x80));
        num = num >> 7;
    }
    this.Write((byte) num);
}
After prefixing the length using this encoding, it writes the bytes for the characters in the desired encoding.
From the BinaryWriter.Write(string) docs:
Writes a length-prefixed string to this stream in the current encoding of the BinaryWriter, and advances the current position of the stream in accordance with the encoding used and the specific characters being written to the stream.
This behavior is probably so that when reading the file back in using a BinaryReader the string can be identified. (e.g. 3Foo3Bar6Foobar can be parsed into the strings "Foo", "Bar" and "Foobar", but FooBarFoobar could not be.) In fact, BinaryReader.ReadString uses exactly this information to read a string from a binary file.
From the BinaryWriter.Write(char[]) docs:
Writes a character array to the current stream and advances the current position of the stream in accordance with the Encoding used and the specific characters being written to the stream.
It is hard to overstate how comprehensive and useful the docs on MSDN are. Always check them first.
As already stated, BinaryWriter.Write(String) writes the length of the string to the stream, before writing the string itself.
This allows the BinaryReader.ReadString() to know how long the string is.
using (BinaryReader br = new BinaryReader(File.OpenRead("data.dat")))
{
    string foo1 = br.ReadString();
    char[] foo2 = br.ReadChars(3);
}
Did you look at what was actually written? I'd guess a null terminator.