MemoryMappedFile doesn't work with 2 processes? - c#

I've made a simple test with a MemoryMappedFile, following the MSDN example:
2 processes, 1 memory-mapped file:
the first process adds the string "1"
the first process waits
the second process adds the string "2" and terminates
the first process now reads the whole memory mapped file
process A:
using (MemoryMappedFile mmf = MemoryMappedFile.CreateNew("testmap", 10000))
{
bool mutexCreated;
Mutex mutex = new Mutex(true, "testmapmutex", out mutexCreated);
using (MemoryMappedViewStream stream = mmf.CreateViewStream())
{
BinaryWriter writer = new BinaryWriter(stream, Encoding.UTF8);
writer.Write("1");
}
mutex.ReleaseMutex();
Console.WriteLine("Start Process B and press ENTER to continue.");
Console.ReadLine();
mutex.WaitOne();
using (MemoryMappedViewStream stream = mmf.CreateViewStream())
{
BinaryReader reader = new BinaryReader(stream, Encoding.UTF8);
Console.WriteLine("Process A says: {0}", reader.ReadString());
Console.WriteLine("Process B says: {0}", reader.ReadString());
}
mutex.ReleaseMutex();
}
process B:
using (MemoryMappedFile mmf = MemoryMappedFile.OpenExisting("testmap"))
{
Mutex mutex = Mutex.OpenExisting("testmapmutex");
mutex.WaitOne();
using (MemoryMappedViewStream stream = mmf.CreateViewStream(1, 0))
{
BinaryWriter writer = new BinaryWriter(stream, Encoding.UTF8);
writer.Write("2");
}
mutex.ReleaseMutex();
}
The result is:
Huh?
Where are "1" and "2"?
However, if I run ONLY the first process (without starting process B) I get:
What am I missing?
I expect to see :
Process A says: 1
Process B says: 2

You are battling an implementation detail of BinaryWriter.Write(string). It writes the length of the string first, required so that BinaryReader knows how many characters it needs to read when reading the string back. For short strings, like "1", it writes a single byte to store the length.
So the offset you pass to CreateViewStream() is wrong: passing 1 makes process B overwrite part of the string written by process A. The smiley character you see is the glyph for (char)1, which is the length byte of the string written by process B. An offset of 2 (one length byte plus one character byte) places B's string directly after A's.
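The length prefix is easy to see by writing the same two strings to a MemoryStream and looking at the bytes. This is only an illustrative sketch: the LengthPrefixDemo class and its Encode method are names made up for this example and are not part of the question's code.

```csharp
using System;
using System.IO;
using System.Text;

public static class LengthPrefixDemo
{
    // Returns the bytes BinaryWriter emits when the given strings
    // are written back to back with UTF-8 encoding.
    public static byte[] Encode(params string[] values)
    {
        using (var buffer = new MemoryStream())
        {
            // leaveOpen: true so the MemoryStream survives disposing the writer
            using (var writer = new BinaryWriter(buffer, Encoding.UTF8, leaveOpen: true))
            {
                foreach (string value in values)
                {
                    writer.Write(value); // length prefix first, then the encoded characters
                }
            }
            return buffer.ToArray();
        }
    }
}
```

Calling LengthPrefixDemo.Encode("1", "2") gives the four bytes 01 31 01 32: each one-character string occupies two bytes, so the second string starts at offset 2, not 1.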
Memory mapped files are troublesome in managed code. You normally read and write to them by declaring a struct to set the layout and using pointers to access the view but that requires unsafe code. Streams are a pretty poor abstraction for a chunk of memory but a necessary evil. Also the reason it took so long for MMFs to become available in .NET.
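If unsafe code is not an option, MemoryMappedViewAccessor can read and write a struct without pointers. The following is a minimal sketch of that alternative, not the pointer-based approach described above. SharedSlots and AccessorDemo are hypothetical names, and the map here is unnamed (null), so it stays inside one process; a cross-process version would pass a map name such as "testmap" instead.

```csharp
using System.IO.MemoryMappedFiles;
using System.Runtime.InteropServices;

// Hypothetical fixed layout for the shared region: two integer slots.
[StructLayout(LayoutKind.Sequential)]
public struct SharedSlots
{
    public int SlotA;
    public int SlotB;
}

public static class AccessorDemo
{
    // Writes a struct at offset 0 of a fresh map and reads it back.
    public static SharedSlots RoundTrip(int a, int b)
    {
        // null map name: an unnamed map, used here only to keep the sketch self-contained
        using (MemoryMappedFile mmf = MemoryMappedFile.CreateNew(null, 4096))
        using (MemoryMappedViewAccessor accessor = mmf.CreateViewAccessor())
        {
            var toWrite = new SharedSlots { SlotA = a, SlotB = b };
            accessor.Write(0, ref toWrite);   // copies the struct into the view

            SharedSlots readBack;
            accessor.Read(0, out readBack);   // copies it back out
            return readBack;
        }
    }
}
```

With a layout like this, each process gets its own slot at a fixed offset, so no length prefixes or stream positions are involved.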

EDIT
I noticed one apparently strange thing in the code of ProcessB. This code
using (MemoryMappedViewStream stream = mmf.CreateViewStream(1, 0))
creates a view starting at byte offset 1, but the string "1" written by ProcessA occupies 2 bytes in the file. I think it should be enough to change that offset from 1 to 2, so that the ProcessB view starts right after the "1" string already inserted by ProcessA.
As it stands, the two strings overlap.
Hope this helps.

Related

File Copy Program Doesn't Properly Copy File

Hello
I've been working on a terminal-like application to get better at programming in C#, just something to help me learn. I've decided to add a feature that copies a file exactly as it is to a new file. It seems to work almost perfectly: when opened in Notepad++, the two files are only a few lines apart in length and very close in actual file size. However, the duplicated copy never runs; it says the file is corrupt. I have a feeling the problem is in the methods I created for reading and rewriting binary data to files. The code is as follows; thanks for the help. Sorry for the spaghetti code too, I get a bit sloppy when I'm messing around with new ideas.
Class that handles the file copying/writing
using System;
using System.IO;
//using System.Collections.Generic;
namespace ConsoleFileExplorer
{
class FileTransfer
{
private BinaryWriter writer;
private BinaryReader reader;
private FileStream fsc; // file to be duplicated
private FileStream fsn; // new location of file
int[] fileData;
private string _file;
public FileTransfer(String file)
{
_file = file;
fsc = new FileStream(file, FileMode.Open);
reader = new BinaryReader(fsc);
}
// Reads all the original files data to an array of bytes
public byte[] ReadAllDataToArray()
{
byte[] bytes = reader.ReadBytes((int)fsc.Length); // reading bytes from the original file
return bytes;
}
// writes the array of original byte data to a new file
public void WriteDataFromArray(byte[] fileData, string path) // got a feeling this is the problem :p
{
fsn = new FileStream(path, FileMode.Create);
writer = new BinaryWriter(fsn);
int i = 0;
while(i < fileData.Length)
{
writer.Write(fileData[i]);
i++;
}
}
}
}
Code that interacts with this class.
(Sleep(5000) is because I was expecting an error on the first attempt.)
case '3':
Console.Write("Enter source file: ");
string sourceFile = Console.ReadLine();
if (sourceFile == "")
{
Console.Clear();
Console.ForegroundColor = ConsoleColor.DarkRed;
Console.Error.WriteLine("Must input a proper file path.\n");
Console.ForegroundColor = ConsoleColor.White;
Menu();
} else {
Console.WriteLine("Copying Data"); System.Threading.Thread.Sleep(5000);
FileTransfer trans = new FileTransfer(sourceFile);
//copying the original files data
byte[] data = trans.ReadAllDataToArray();
Console.Write("Enter Location to store data: ");
string newPath = Console.ReadLine();
// Just for me to make sure it doesnt exit if i forget
if(newPath == "")
{
Console.Clear();
Console.ForegroundColor = ConsoleColor.DarkRed;
Console.Error.WriteLine("Cannot have empty path.");
Console.ForegroundColor = ConsoleColor.White;
Menu();
} else
{
Console.WriteLine("Writing data to file"); System.Threading.Thread.Sleep(5000);
trans.WriteDataFromArray(data, newPath);
Console.WriteLine("File stored.");
Console.ReadLine();
Console.Clear();
Menu();
}
}
break;
File compared to new file: [screenshots of the original file and the new file]
You're not properly disposing the file streams and the binary writer. Both tend to buffer data (which is a good thing, especially when you're writing one byte at a time). Use using, and your problem should disappear. Unless somebody is editing the file while you're reading it, of course.
BinaryReader and BinaryWriter do not just write "raw data". They also add metadata as needed; they're designed for serialization and deserialization, rather than reading and writing bytes. In the particular case of ReadBytes and Write(byte[]), those really are just raw bytes, but there's not much point in using these classes just for that. Reading and writing bytes is what every Stream gives you, and that includes FileStreams. There's no reason to use BinaryReader/BinaryWriter here whatsoever; the file streams give you everything you need.
A better approach would be to simply use
using (var fsn = ...)
{
fsn.Write(fileData, 0, fileData.Length);
}
or even just
File.WriteAllBytes(fileName, fileData);
Maybe you're thinking that writing a byte at a time is closer to "the metal", but that simply isn't the case. At no point during this does the CPU pass a byte at a time to the hard drive. Instead, the hard drive copies data directly from RAM, with no intervention from the CPU. And most hard drives still can't write (or read) arbitrary amounts of data from the physical media - instead, you're reading and writing whole sectors. If the system really did write a byte at a time, you'd just keep rewriting the same sector over and over again, just to write one more byte.
An even better approach would be to use the fact that you've got file streams open, and stream the files from source to destination rather than first reading everything into memory, and then writing it back to disk.
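A minimal sketch of that streaming approach, assuming both paths are valid and the destination may be overwritten. FileCopySketch and CopyFile are names invented for this example.

```csharp
using System.IO;

public static class FileCopySketch
{
    // Streams the source file into the destination in chunks,
    // without loading the whole file into memory first.
    public static void CopyFile(string sourcePath, string destinationPath)
    {
        using (var source = new FileStream(sourcePath, FileMode.Open, FileAccess.Read))
        using (var destination = new FileStream(destinationPath, FileMode.Create, FileAccess.Write))
        {
            source.CopyTo(destination);
        } // disposing both streams flushes any buffered bytes to disk
    }
}
```

The using blocks matter as much as the CopyTo call: they are what guarantees that buffered data is flushed and the handles are released.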
There is a File.Copy() method in C#; see https://msdn.microsoft.com/ru-ru/library/c6cfw35a(v=vs.110).aspx
If you want to implement it yourself, try placing a breakpoint inside your methods and stepping through with the debugger. It is like the story of the fisherman who was given a rod rather than a fish.
Also, look at your int[] fileData field and the byte[] fileData parameter of the last method; maybe this is the problem.

Writing to memory in .NET and pick up data of process in labview

I am using the Kinect Face-Tracking Basics WPF example to collect data from an object.
I can access the data in a text file etc., but I want to write the data to managed memory in a C# process and have LabVIEW's .NET program pick up the data from the same location.
So far, this is what I've got:
this.facePoints3D = frame.Get3DShape();
// using (MemoryStream stream = new MemoryStream())
// {
//var sw = new StreamWriter(stream);
int n = 121;
foreach (Vector3DF[] vector in facePoints3D.GetSlices(n))
{
//convert from float to byte array before we pass on to memory
var bytearray = new byte[vector.Length * this.facePoints3D.Count];
Buffer.BlockCopy(vector, 0, bytearray, 0, bytearray.Length);
//Initialize unmanaged memory to hold array.
int size = Marshal.SizeOf(bytearray[0]) * bytearray.Length;
IntPtr pnt = Marshal.AllocHGlobal(size);
try
{
//copy the array to unmanaged memory.
Marshal.Copy(bytearray, 0, pnt, bytearray.Length);
// Copy the unmanaged array back to another managed array.
byte[] bytearray2 = new byte[bytearray.Length];
Marshal.Copy(pnt, bytearray2, 0, bytearray.Length);
//Console.WriteLine("The array was coppied to unmanaged memory and back.");
}
finally
{
// Free the unmanaged memory.
Marshal.FreeHGlobal(pnt);
}
}
Thus far, I have the LabVIEW program configured as follows:
Process A (C#): MemoryStream Buffer -> Marshal AllocHGlobal -> Marshal Copy -> Marshal Dispose -> IntPtr ToInt64
Process B (LabVIEW): IntPtr value -> Marshal AllocHGlobal -> Marshal Copy -> Destination
The LabVIEW end runs well, but it doesn't appear to be picking up the values from the memory location.
Advice, please?
Are those 2 separate processes? (2 separate exes). If so, you are not going to be able to share memory via direct allocation due to Process Isolation (1 process cannot see another process's memory).
Assuming that your answer is "yes" (2 separate processes), consider using Named Pipes to communicate cross process (or wrap it up with WCF)
Interprocess communication with Named Pipes: http://msdn.microsoft.com/en-us/library/bb546085(v=vs.110).aspx
WCF Tutorial: basic interprocess communication: http://tech.pro/tutorial/855/wcf-tutorial-basic-interprocess-communication
You could also use a memory-mapped file: http://www.abhisheksur.com/2012/02/inter-process-communication-using.html
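As a rough sketch of the memory-mapped option, the writer could store a count followed by the float values, and the reader could read them back in the same order. SharedFloats and the layout (an int count at offset 0, floats from offset 4) are assumptions made for this example. The caller is assumed to create the map in the writing process, for instance with MemoryMappedFile.CreateNew("kinectFacePoints", size), and to open it by the same hypothetical name in the reading process with MemoryMappedFile.OpenExisting.

```csharp
using System.IO.MemoryMappedFiles;

public static class SharedFloats
{
    // Layout assumed by this sketch: [int count][float 0][float 1]...
    public static void Write(MemoryMappedFile mmf, float[] values)
    {
        using (MemoryMappedViewAccessor accessor = mmf.CreateViewAccessor())
        {
            accessor.Write(0, values.Length);                           // element count
            accessor.WriteArray(sizeof(int), values, 0, values.Length); // payload
        }
    }

    public static float[] Read(MemoryMappedFile mmf)
    {
        using (MemoryMappedViewAccessor accessor = mmf.CreateViewAccessor())
        {
            int count = accessor.ReadInt32(0);
            var values = new float[count];
            accessor.ReadArray(sizeof(int), values, 0, count);
            return values;
        }
    }
}
```

Access from two processes would still need synchronization, for example a named Mutex, so that the reader never sees a half-written array.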
EDIT
Added an example of using the binary serializer instead of doing manual memory copies of the structure. You can write the resulting byte arrays to your memory mapped file. This solution does require that you apply the [Serializable] attribute to your Vector3DF structure, and it assumes that the code reading the memory has the same type definition for Vector3DF.
(Note: in your code in the comments, it looked as though you are working with an array of arrays of Vector3DF structs, so that's how I modeled the serialization code. Adjust as necessary.)
public byte[] SerializeVectors(Vector3DF[][] vectors)
{
var formatter = new BinaryFormatter();
// note: if you are using a stream to write to the memory mapped file,
// you could pass it in instead of using this memory stream as an intermediary
using (var stream = new MemoryStream())
{
formatter.Serialize(stream, vectors);
return stream.ToArray();
}
}
public Vector3DF[][] DeserializeVectors(byte[] vectorBuffer)
{
var formatter = new BinaryFormatter();
using (var stream = new MemoryStream(vectorBuffer, false))
{
return (Vector3DF[][])formatter.Deserialize(stream);
}
}
Here is a link to a Gist that contains working code and a unit test so that you can play around with it: https://gist.github.com/jsmarsch/d0dcade8c656b94f5c1c

Clearing contents of memory mapped file in C#

I am using MemoryMappedFile for communication between 2 programs. Program "A" creates the mmf and reads its contents on a timer. Program "B" writes xml data to the mmf on a timer. I have the memory map working, but I run into an issue where the previous iteration of the XML data is longer than the current one and old data gets carried over to the next round.
So for simplicity, let's say program B writes
aaaa
Program A will read correctly,
Then the next write from program B is:
b
Program A reads
baaa
It seems like there should be some simple way to flush the contents of the memory mapped file but I can't seem to figure it out. It's very possible that I'm totally wrong in the way I'm going about this.
Here's what I'm currently doing.
Program A:
using (MemoryMappedFile mmf = MemoryMappedFile.OpenExisting("testmap",MemoryMappedFileRights.ReadWrite))
{
Mutex mutex = Mutex.OpenExisting("testmapmutex");
mutex.WaitOne();
string outputtext;
using (MemoryMappedViewStream stream = mmf.CreateViewStream(0,0))
{
XmlSerializer deserializer = new XmlSerializer(typeof(MyObject));
TextReader textReader = new StreamReader(stream);
outputtext = textReader.ReadToEnd();
textReader.Close();
}
mutex.ReleaseMutex();
return outputtext; //ends up in a textbox for debugging
}
Program B
using (MemoryMappedFile mmf = MemoryMappedFile.OpenExisting("testmap", MemoryMappedFileRights.ReadWrite))
{
Mutex mutex = Mutex.OpenExisting("testmapmutex");
mutex.WaitOne();
using (MemoryMappedViewStream stream = mmf.CreateViewStream(0, 0))
{
XmlSerializer serializer = new XmlSerializer(typeof(MyObject));
TextWriter textWriter = new StreamWriter(stream);
serializer.Serialize(textWriter, myObjectToExport);
textWriter.Flush();
}
mutex.ReleaseMutex();
}
Assuming length (the number of bytes to clear) is reasonably small, you could clear the region by overwriting it with zeros:
textWriter.BaseStream.Seek(0, System.IO.SeekOrigin.Begin);
textWriter.BaseStream.Write(new byte[length], 0, length);
textWriter.BaseStream.Seek(0, System.IO.SeekOrigin.Begin);
EDIT: I think I misunderstood the OP's question. The problem he was having was not with clearing the contents of the MMF, but with stream manipulation. This should fix the problem:
textWriter.BaseStream.Seek(0, System.IO.SeekOrigin.Begin);
textWriter.Write("");
textWriter.Flush();
That being said, you might want to do both.
I haven't really worked with MemoryMappedStreams much but this question seemed interesting so I took a crack at it. I wrote a really basic windows example with two buttons (read/write) and a single text box. I didn't pass in "0, 0" to the CreateViewStream calls and I created the file with a fixed length using a call to "CreateOrOpen" and everything worked well! The following are the key pieces of code that I wrote:
WRITE The File
// create the file if it doesn't exist
if (sharedFile == null) sharedFile = MemoryMappedFile.CreateOrOpen("testmap", 1000, MemoryMappedFileAccess.ReadWrite);
// process safe handling
Mutex mutex = new Mutex(false, "testmapmutex");
if (mutex.WaitOne()) {
try {
using (MemoryMappedViewStream stream = sharedFile.CreateViewStream()) {
var writer = new StreamWriter(stream);
writer.WriteLine(txtResult.Text);
writer.Flush();
}
}
finally { mutex.ReleaseMutex(); }
}
READ The File
// create the file if it doesn't exist
if (sharedFile == null) sharedFile = MemoryMappedFile.CreateOrOpen("testmap", 1000, MemoryMappedFileAccess.ReadWrite);
// process safe handling
Mutex mutex = new Mutex(false, "testmapmutex");
if (mutex.WaitOne()) {
try {
using (MemoryMappedViewStream stream = sharedFile.CreateViewStream()) {
var textReader = new StreamReader(stream);
txtResult.Text = textReader.ReadToEnd();
textReader.Close();
}
}
finally { mutex.ReleaseMutex(); }
}
Dispose the file (after finished)
if (sharedFile != null) sharedFile.Dispose();
For the full example, see here: https://github.com/goopyjava/memory-map-test. Hope that helps!
EDIT/NOTE - If you look at the example provided you can write to the file as many times as you want and any time you read you will read exactly/only what was written last. I believe this was the original goal of the question.
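Another way to keep stale bytes out of the result, different from the answers above, is to store the payload length in front of the payload so the reader never looks past the current message. This is only a sketch: MappedMessage is a hypothetical helper, the 4-byte length header is an assumption of this example, and the caller is assumed to hold the mutex around each call.

```csharp
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Text;

public static class MappedMessage
{
    // Writes [int byteCount][UTF-8 bytes] at the start of the map.
    public static void Write(MemoryMappedFile mmf, string text)
    {
        byte[] payload = Encoding.UTF8.GetBytes(text);
        using (MemoryMappedViewStream stream = mmf.CreateViewStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(payload.Length);
            writer.Write(payload);
        }
    }

    // Reads exactly byteCount bytes, ignoring whatever older data follows.
    public static string Read(MemoryMappedFile mmf)
    {
        using (MemoryMappedViewStream stream = mmf.CreateViewStream())
        using (var reader = new BinaryReader(stream))
        {
            int byteCount = reader.ReadInt32();
            return Encoding.UTF8.GetString(reader.ReadBytes(byteCount));
        }
    }
}
```

With this layout, writing "aaaa" and then "b" leaves the old "aaa" bytes in the map, but Read returns only "b" because the header says the message is one byte long.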

Capturing binary output from Process.StandardOutput

In C# (.NET 4.0 running under Mono 2.8 on SuSE) I would like to run an external batch command and capture its output in binary form. The external tool I use is called 'samtools' (samtools.sourceforge.net) and among other things it can return records from an indexed binary file format called BAM.
I use Process.Start to run the external command, and I know that I can capture its output by redirecting Process.StandardOutput. The problem is, that's a text stream with an encoding, so it doesn't give me access to the raw bytes of the output. The almost-working solution I found is to access the underlying stream.
Here's my code:
Process cmdProcess = new Process();
ProcessStartInfo cmdStartInfo = new ProcessStartInfo();
cmdStartInfo.FileName = "samtools";
cmdStartInfo.RedirectStandardError = true;
cmdStartInfo.RedirectStandardOutput = true;
cmdStartInfo.RedirectStandardInput = false;
cmdStartInfo.UseShellExecute = false;
cmdStartInfo.CreateNoWindow = true;
cmdStartInfo.Arguments = "view -u " + BamFileName + " " + chromosome + ":" + start + "-" + end;
cmdProcess.EnableRaisingEvents = true;
cmdProcess.StartInfo = cmdStartInfo;
cmdProcess.Start();
// Prepare to read each alignment (binary)
var br = new BinaryReader(cmdProcess.StandardOutput.BaseStream);
while (!cmdProcess.StandardOutput.EndOfStream)
{
// Consume the initial, undocumented BAM data
br.ReadBytes(23);
// ... more parsing follows
But when I run this, the first 23 bytes that I read are not the first 23 bytes in the output, but rather somewhere several hundred or thousand bytes downstream. I assume that StreamReader does some buffering and so the underlying stream has already advanced, say, 4K into the output. The underlying stream does not support seeking back to the start.
And I'm stuck here. Does anyone have a working solution for running an external command and capturing its stdout in binary form? The output may be very large, so I would like to stream it.
Any help appreciated.
By the way, my current workaround is to have samtools return the records in text format, then parse those, but this is pretty slow and I'm hoping to speed things up by using the binary format directly.
Using StandardOutput.BaseStream is the correct approach, but you must not use any other property or method of cmdProcess.StandardOutput. For example, accessing cmdProcess.StandardOutput.EndOfStream will cause the StreamReader for StandardOutput to read part of the stream, removing the data you want to access.
Instead, simply read and parse the data from br (assuming you know how to parse the data, and won't read past the end of stream, or are willing to catch an EndOfStreamException). Alternatively, if you don't know how big the data is, use Stream.CopyTo to copy the entire standard output stream to a new file or memory stream.
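A minimal sketch of the Stream.CopyTo variant, touching only BaseStream and never the StreamReader's own members. RawOutput and Capture are hypothetical names, and the sketch buffers the whole output in memory, so for very large output the destination would more likely be a file stream or a parser reading from BaseStream directly.

```csharp
using System.Diagnostics;
using System.IO;

public static class RawOutput
{
    // Runs a command and returns its standard output as raw bytes.
    public static byte[] Capture(string fileName, string arguments)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = fileName,
            Arguments = arguments,
            RedirectStandardOutput = true,
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(startInfo))
        using (var buffer = new MemoryStream())
        {
            // Only BaseStream is used; EndOfStream, ReadLine etc. are never called,
            // so the StreamReader never consumes bytes behind our back.
            process.StandardOutput.BaseStream.CopyTo(buffer);
            process.WaitForExit();
            return buffer.ToArray();
        }
    }
}
```

Standard error is deliberately not redirected here; if it is redirected as in the question, it needs to be drained as well, otherwise the child can block once that pipe fills up.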
Since you explicitly specified running on Suse linux and mono, you can work around the problem by using native unix calls to create the redirection and read from the stream. Such as:
using System;
using System.Diagnostics;
using System.IO;
using Mono.Unix;
class Test
{
public static void Main()
{
int reading, writing;
Mono.Unix.Native.Syscall.pipe(out reading, out writing);
int stdout = Mono.Unix.Native.Syscall.dup(1);
Mono.Unix.Native.Syscall.dup2(writing, 1);
Mono.Unix.Native.Syscall.close(writing);
Process cmdProcess = new Process();
ProcessStartInfo cmdStartInfo = new ProcessStartInfo();
cmdStartInfo.FileName = "cat";
cmdStartInfo.CreateNoWindow = true;
cmdStartInfo.Arguments = "test.exe";
cmdProcess.StartInfo = cmdStartInfo;
cmdProcess.Start();
Mono.Unix.Native.Syscall.dup2(stdout, 1);
Mono.Unix.Native.Syscall.close(stdout);
Stream s = new UnixStream(reading);
byte[] buf = new byte[1024];
int bytes = 0;
int current;
while((current = s.Read(buf, 0, buf.Length)) > 0)
{
bytes += current;
}
Mono.Unix.Native.Syscall.close(reading);
Console.WriteLine("{0} bytes read", bytes);
}
}
Under unix, file descriptors are inherited by child processes unless marked otherwise (close on exec). So, to redirect stdout of a child, all you need to do is change the file descriptor #1 in the parent process before calling exec. Unix also provides a handy thing called a pipe which is a unidirectional communication channel, with two file descriptors representing the two endpoints. For duplicating file descriptors, you can use dup or dup2 both of which create an equivalent copy of a descriptor, but dup returns a new descriptor allocated by the system and dup2 places the copy in a specific target (closing it if necessary). What the above code does, then:
Creates a pipe with endpoints reading and writing
Saves a copy of the current stdout descriptor
Assigns the pipe's write endpoint to stdout and closes the original
Starts the child process so it inherits stdout connected to the write endpoint of the pipe
Restores the saved stdout
Reads from the reading endpoint of the pipe by wrapping it in a UnixStream
Note, in native code, a process is usually started by a fork+exec pair, so the file descriptors can be modified in the child process itself, but before the new program is loaded. This managed version is not thread-safe as it has to temporarily modify the stdout of the parent process.
Since the code starts the child process without managed redirection, the .NET runtime does not change any descriptors or create any streams. So the only reader of the child's output will be the user code, which uses a UnixStream to work around the StreamReader's encoding issue.
I checked what's happening with Reflector. It seems to me that StreamReader doesn't read until you call a read method on it, but it's created with a buffer size of 0x1000, so maybe it does. Luckily, until you actually read from it, you can safely get the buffered data out of it: it has a private field byte[] byteBuffer and two integer fields, byteLen and bytePos. The first says how many bytes are in the buffer; the second says how many you have consumed, which should be zero. So first read this buffer with reflection, then create the BinaryReader.
Maybe you can try like this:
public class ThirdExe
{
private static TongueSvr _instance = null;
private Diagnostics.Process _process = null;
private Stream _messageStream;
private byte[] _recvBuff = new byte[65536];
private int _recvBuffLen;
private Queue<TonguePb.Msg> _msgQueue = new Queue<TonguePb.Msg>();
void StartProcess()
{
try
{
_process = new Diagnostics.Process();
_process.EnableRaisingEvents = false;
_process.StartInfo.FileName = "d:/code/boot/tongueerl_d.exe"; // Your exe
_process.StartInfo.UseShellExecute = false;
_process.StartInfo.CreateNoWindow = true;
_process.StartInfo.RedirectStandardOutput = true;
_process.StartInfo.RedirectStandardInput = true;
_process.StartInfo.RedirectStandardError = true;
_process.ErrorDataReceived += new Diagnostics.DataReceivedEventHandler(ErrorReceived);
_process.Exited += new EventHandler(OnProcessExit);
_process.Start();
_messageStream = _process.StandardInput.BaseStream;
_process.BeginErrorReadLine();
AsyncRead();
}
catch (Exception e)
{
Debug.LogError("Unable to launch app: " + e.Message);
}
} // end of StartProcess
private void AsyncRead()
{
// Append after any bytes still waiting to be parsed instead of overwriting them
_process.StandardOutput.BaseStream.BeginRead(_recvBuff, _recvBuffLen, _recvBuff.Length - _recvBuffLen
, new AsyncCallback(DataReceived), null);
}
void DataReceived(IAsyncResult asyncResult)
{
int nread = _process.StandardOutput.BaseStream.EndRead(asyncResult);
if (nread == 0)
{
Debug.Log("process read finished"); // process exit
return;
}
_recvBuffLen += nread;
Debug.LogFormat("recv data size.{0} remain.{1}", nread, _recvBuffLen);
ParseMsg();
AsyncRead();
}
void ParseMsg()
{
// Each message is a 4-byte big-endian length followed by the payload;
// parse every complete message currently in the buffer
while (_recvBuffLen >= 4)
{
int len = IPAddress.NetworkToHostOrder(BitConverter.ToInt32(_recvBuff, 0));
if (len > _recvBuffLen - 4)
{
Debug.LogFormat("current call can't parse the NetMsg for data incomplete");
return;
}
TonguePb.Msg msg = TonguePb.Msg.Parser.ParseFrom(_recvBuff, 4, len);
Debug.LogFormat("recv msg count.{1}:\n {0} ", msg.ToString(), _msgQueue.Count + 1);
_msgQueue.Enqueue(msg);
// Move any bytes that follow this message to the front of the buffer
_recvBuffLen -= len + 4;
Buffer.BlockCopy(_recvBuff, len + 4, _recvBuff, 0, _recvBuffLen);
}
}
}
The key is to call BeginRead on _process.StandardOutput.BaseStream with DataReceived as the callback. What matters most is that the reads become asynchronous and callback-driven, much like the Process.OutputDataReceived event, while still delivering raw bytes.

Bytes consumed by StreamReader

Is there a way to know how many bytes of a stream have been used by StreamReader?
I have a project where we need to read a file that has a text header followed by the start of the binary data. My initial attempt to read this file was something like this:
private int _dataOffset;
void ReadHeader(string path)
{
using (FileStream stream = File.OpenRead(path))
{
StreamReader textReader = new StreamReader(stream);
string line;
do
{
line = textReader.ReadLine();
handleHeaderLine(line);
} while (line != "DATA"); // Yes, they used "DATA" to mark the end of the header
_dataOffset = (int)stream.Position;
}
}
private byte[] ReadDataFrame(string path, int frameNum)
{
using (FileStream stream = File.OpenRead(path))
{
stream.Seek(_dataOffset + frameNum * cbFrame, SeekOrigin.Begin);
byte[] data = new byte[cbFrame];
stream.Read(data, 0, cbFrame);
return data;
}
return null;
}
The problem is that when I set _dataOffset to stream.Position, I get the position that the StreamReader has read to, not the end of the header. As soon as I thought about it this made sense, but I still need to be able to know where the end of the header is and I'm not sure if there's a way to do it and still take advantage of StreamReader.
You can find out how many bytes the StreamReader has actually returned (as opposed to read from the stream) in a number of ways, none of them too straightforward I'm afraid.
1. Pass all the text read so far (including the line terminators that ReadLine strips) to textReader.CurrentEncoding.GetByteCount, and then seek to that position in the stream.
2. Use some reflection hackery to retrieve the value of the private variable of the StreamReader object that corresponds to the current byte position within the internal buffer (different from the stream's position: usually behind it, and never ahead of it). Judging by .NET Reflector, this variable seems to be named bytePos.
3. Don't bother using a StreamReader at all, but instead implement your own ReadLine function built on top of the Stream or even a BinaryReader (BinaryReader is guaranteed never to read further ahead than what you request). This custom function must read from the stream char by char, so you'd actually have to use the low-level Decoder object (unless the encoding is ASCII/ANSI, in which case things are a bit simpler due to single-byte encoding).
Option 1 is going to be the least efficient, I would imagine (since you're effectively re-encoding text you just decoded), and option 3 the hardest to implement, though perhaps the most elegant. I'd probably recommend against using the ugly reflection hack (option 2), even though it looks tempting, being the most direct solution and only taking a couple of lines. (To be quite honest, the StreamReader class really ought to expose this variable via a public property, but alas it does not.) So in the end it's up to you, but either option 1 or 3 should do the job nicely enough.
Hope that helps.
So the data is utf8 (the default encoding for StreamReader). This is a multibyte encoding, so IndexOf would be inadvisable. You could:
Encoding.UTF8.GetByteCount(string)
on your data so far, adding 1 or 2 bytes for the missing line ending.
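A sketch of that byte-counting idea, under several assumptions that are not in the original question: the header is UTF-8, every header line ends with the same terminator (2 bytes for CRLF, 1 for LF), and the stream is seekable and positioned at its start. HeaderOffset and Find are names made up for this example.

```csharp
using System.IO;
using System.Text;

public static class HeaderOffset
{
    // Returns the byte offset of the first byte after the "DATA" line.
    public static long Find(Stream stream, int newlineBytes)
    {
        // StreamReader silently skips a UTF-8 byte order mark, so count it by hand.
        long offset = StartsWithUtf8Bom(stream) ? Encoding.UTF8.GetPreamble().Length : 0;

        using (var reader = new StreamReader(stream, Encoding.UTF8, true, 1024, leaveOpen: true))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // Re-encode the decoded line to learn how many bytes it occupied,
                // then add the terminator that ReadLine stripped.
                offset += Encoding.UTF8.GetByteCount(line) + newlineBytes;
                if (line == "DATA") return offset;
            }
        }
        throw new InvalidDataException("No DATA line found in header.");
    }

    private static bool StartsWithUtf8Bom(Stream stream)
    {
        byte[] bom = Encoding.UTF8.GetPreamble(); // EF BB BF
        byte[] head = new byte[bom.Length];
        int read = stream.Read(head, 0, head.Length);
        stream.Position = 0; // rewind so the StreamReader sees the whole stream
        if (read < bom.Length) return false;
        for (int i = 0; i < bom.Length; i++)
        {
            if (head[i] != bom[i]) return false;
        }
        return true;
    }
}
```

The returned value can be stored instead of stream.Position, which reflects how far the StreamReader has buffered rather than where the header ends.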
If you need to count bytes, I'd go with the BinaryReader. You can take the results and cast them about as needed, but I find its idea of its current position to be more reliable (since it reads in binary, it's immune to character-set problems).
So your last line contains 'DATA' + an unknown amount of data bytes. You could extract the position by using IndexOf() with your last read line. Then readjust the stream.Position.
But I am not sure if you should use ReadLine() at all in this case. Maybe it would be better to read byte by byte until you reach the 'DATA' mark.
The line breaks are easily identifiable without needing to decode the stream first (except for some encodings rarely used for text files like EBCDIC, UTF-16, UTF-32), so you can just read each line as bytes and then decode the entire line:
using (FileStream stream = File.OpenRead(path)) {
List<byte> buffer = new List<byte>();
bool hasCr = false;
bool done = false;
while (!done) {
int b = stream.ReadByte();
if (b == -1) throw new IOException("End of file reached in header.");
if (b == 13) {
hasCr = true;
} else if (b == 10 && hasCr) {
string line = Encoding.UTF8.GetString(buffer.ToArray(), 0, buffer.Count);
if (line == "DATA") {
done = true;
} else {
HandleHeaderLine(line);
}
buffer.Clear();
hasCr = false;
} else {
if (hasCr) buffer.Add(13);
hasCr = false;
buffer.Add((byte)b);
}
}
_dataOffset = stream.Position;
}
Instead of closing the stream and opening it again, you could of course just keep on reading the data.
