Accessing the internal buffer of a BinaryReader - C#

I'm inheriting the BinaryReader class.
I have to override some essential methods like ReadUInt16.
The internal implementation of this method is:
public virtual ushort ReadUInt16() {
    FillBuffer(2);
    return (ushort)(m_buffer[0] | m_buffer[1] << 8);
}
The binary files I'm reading are organized high byte first (big-endian), and I've also inherited from BinaryReader because I had to add some more functionality.
Anyway, I want to implement the byte swapping in the subclass itself.
Is there another way to access m_buffer, or an alternative, without using reflection or other resource-consuming approaches?
Maybe I should override FillBuffer and back up the peeked bytes? Or should I just ignore it? Would that have side effects? Has anyone faced this before? Can anyone explain why FillBuffer is not internal? Is it required to always fill the buffer, or can it be skipped? And since it's not internal, why wasn't a protected getter for the m_buffer field implemented alongside it?
Here's the implementation of FillBuffer.
protected virtual void FillBuffer(int numBytes) {
    if (m_buffer != null && (numBytes < 0 || numBytes > m_buffer.Length)) {
        throw new ArgumentOutOfRangeException("numBytes",
            Environment.GetResourceString("ArgumentOutOfRange_BinaryReaderFillBuffer"));
    }
    int bytesRead = 0;
    int n = 0;
    if (m_stream == null) __Error.FileNotOpen();
    // Need to find a good threshold for calling ReadByte() repeatedly
    // vs. calling Read(byte[], int, int) for both buffered & unbuffered
    // streams.
    if (numBytes == 1) {
        n = m_stream.ReadByte();
        if (n == -1)
            __Error.EndOfFile();
        m_buffer[0] = (byte)n;
        return;
    }
    do {
        n = m_stream.Read(m_buffer, bytesRead, numBytes - bytesRead);
        if (n == 0) {
            __Error.EndOfFile();
        }
        bytesRead += n;
    } while (bytesRead < numBytes);
}

Not a good idea to try accessing the internal buffer. Why not do something like:
var val = base.ReadUInt16();
return (ushort)((val << 8) | ((val >> 8) & 0xFF));
Slightly slower than reading from the buffer directly, but I seriously doubt that this is going to make a material impact on the overall speed of your application.
FillBuffer is apparently an implementation detail that the Framework team decided to make protected, possibly because some other Framework class takes advantage of the internal workings of BinaryReader. Since all it does is fill the internal buffer, and your derived class doesn't have access to that buffer, I'd suggest you ignore the method if you decide to rewrite the reading implementation yourself. Calling it can't do you any good, and could do you great harm.
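To make that concrete, here is a minimal sketch of such a subclass; the class name and the ReadUInt32 override are my own additions for illustration, not from the question:

```csharp
using System;
using System.IO;

// Sketch: a BinaryReader subclass that swaps multi-byte reads from
// big-endian using only the public base methods -- no m_buffer access.
class BigEndianBinaryReader : BinaryReader
{
    public BigEndianBinaryReader(Stream input) : base(input) { }

    public override ushort ReadUInt16()
    {
        ushort val = base.ReadUInt16();
        return (ushort)((val << 8) | ((val >> 8) & 0xFF));
    }

    public override uint ReadUInt32()
    {
        uint val = base.ReadUInt32();
        return (val << 24)
             | ((val << 8) & 0x00FF0000)
             | ((val >> 8) & 0x0000FF00)
             | (val >> 24);
    }
}

class Demo
{
    static void Main()
    {
        // Stream bytes: 0x12 0x34 (ushort), then 0x00 0x00 0x01 0x85 (uint)
        var ms = new MemoryStream(new byte[] { 0x12, 0x34, 0x00, 0x00, 0x01, 0x85 });
        var r = new BigEndianBinaryReader(ms);
        Console.WriteLine(r.ReadUInt16()); // 0x1234 = 4660
        Console.WriteLine(r.ReadUInt32()); // 0x00000185 = 389
    }
}
```

The same pattern extends to ReadInt16, ReadInt32, ReadDouble, and so on, which keeps all the endianness handling in one place instead of at every call site.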
You might be interested in a series of articles I wrote some years ago, in which I implemented a BinaryReaderWriter class, which is essentially a BinaryReader and BinaryWriter joined together, and allows you random read/write access to the underlying stream.

Related

BinaryWriter/Reader - Getting unexpected readbacks

Put simply, I'm having trouble writing a binary file, then reading back that same file. This is a settings file, which contains a few strings and an int. I have no problem with writing the strings, but when I get to the int I just can't get a value back when the settings are read.
A little back story: I'm more of a c/c++ programmer, c# isn't totally foreign, but it's not my forte. For instance, I had no idea if I wrote a string, it takes care of the length and reads for you.. just. x.ReadString(). Neat.
Here's some rough code to give you an idea of what I'm doing. Note this is stripped of error checking, and most of the superfluous reads/writes. Just kept it to bare minimal:
public static int LDAPport;
public static string Username;
public static char Ver;

public static int LoadSettings()
{
    using (BinaryReader r = new BinaryReader(File.Open("data\\Settings.cfg", FileMode.Open)))
    {
        char cin = r.ReadChar();
        if (cin != Ver)
        {
            // Corrupted or older style settings
            return 1;
        }
        Username = r.ReadString();
        LDAPport = r.ReadInt32();
        r.Close();
    }
    return 0;
}

public static bool SaveSettings()
{
    using (BinaryWriter w = new BinaryWriter(File.Open("data\\Settings.cfg", FileMode.Create)))
    {
        w.Write(Ver);
        w.Write(Username);
        w.Write(LDAPport);
        w.Close();
    }
    return true;
}
So it writes 389 (I've verified it has the proper value when it goes to write), but when read back, LDAPport gets -65145. Which tells me it's either not reading all 4 bytes of the int (it is 4, isn't it?), or it's reading a number that's unsigned when I'm expecting signed. Casting seemed to do nothing for me.
I've also tried other read methods: .Read() does the same thing, and I've tried changing the data type for the number, including UInt32, to no avail. Everything I've read just says .Write(389) then .Read(), that simple! Well, no. There doesn't seem to be a method for just int; I assume those smarts are in .Read(), yes?
Can someone with a c# brain greater than mine shed some light as to what's going on here? Or has my brain officially turned to jelly at this point? :)
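For what it's worth, a minimal self-contained round-trip of the same three fields (char, string, int) through BinaryWriter/BinaryReader comes back intact, which suggests the mismatch lies in the surrounding file handling rather than in Write/Read themselves. The values below are made up:

```csharp
using System;
using System.IO;

// Round-trip of char + string + int through an in-memory stream,
// mirroring the settings code in the question.
class RoundTrip
{
    static void Main()
    {
        char ver = 'v';          // hypothetical version marker
        string username = "bob"; // hypothetical value
        int ldapPort = 389;

        byte[] saved;
        using (var ms = new MemoryStream())
        {
            using (var w = new BinaryWriter(ms))
            {
                w.Write(ver);      // one encoded char
                w.Write(username); // length-prefixed string
                w.Write(ldapPort); // 4 bytes, little-endian
            }
            saved = ms.ToArray();
        }

        using (var r = new BinaryReader(new MemoryStream(saved)))
        {
            Console.WriteLine(r.ReadChar());
            Console.WriteLine(r.ReadString());
            Console.WriteLine(r.ReadInt32());
        }
    }
}
```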

How to instance a C# class in UNmanaged memory? (Possible?)

UPDATE: There is now an accepted answer that "works". You should never, ever, ever, ever use it. Ever.
First let me preface my question by stating that I'm a game developer. There's a legitimate - if highly unusual - performance-related reason for wanting to do this.
Say I have a C# class like this:
class Foo
{
    public int a, b, c;
    public void MyMethod(int d) { a = d; b = d; c = a + b; }
}
Nothing fancy. Note that it is a reference type that contains only value types.
In managed code I'd like to have something like this:
Foo foo;
foo = Voodoo.NewInUnmanagedMemory<Foo>(); // <- ???
foo.MyMethod(1);
What would the function NewInUnmanagedMemory look like? If it can't be done in C#, could it be done in IL? (Or maybe C++/CLI?)
Basically: is there a way - no matter how hacky - to turn some totally arbitrary pointer into an object reference? And - short of making the CLR explode - damn the consequences.
(Another way to put my question is: "I want to implement a custom allocator for C#")
This leads to the follow-up question: What does the garbage collector do (implementation-specific, if need be) when faced with a reference that points outside of managed memory?
And, related to that, what would happen if Foo had a reference as a member field? What if it pointed at managed memory? What if it only ever pointed at other objects allocated in unmanaged memory?
Finally, if this is impossible: Why?
Update: Here are the "missing pieces" so far:
#1: How to convert an IntPtr to an object reference? It might be possible through unverifiable IL (see comments). So far I've had no luck with this. The framework seems to be extremely careful to prevent this from happening.
(It would also be nice to be able to get the size and layout information for non-blittable managed types at runtime. Again, the framework tries to make this impossible.)
#2: Assuming problem one can be solved - what does the GC do when it encounters an object reference that points outside of the GC heap? Does it crash? Anton Tykhyy, in his answer, guesses that it will. Given how careful the framework is to prevent #1, it does seem likely. Something that confirms this would be nice.
(Alternatively the object reference could point to pinned memory inside the GC heap. Would that make a difference?)
Based on this, I'm inclined to think that this idea for a hack is impossible - or at least not worth the effort. But I'd be interested to get an answer that goes into the technical details of #1 or #2 or both.
I have been experimenting with creating classes in unmanaged memory. It is possible, but it has a problem I am currently unable to solve: you can't assign objects to reference-type fields (see the edit at the bottom), so your custom class can have only structure fields.
This is evil:
using System;
using System.Reflection.Emit;
using System.Runtime.InteropServices;

public class Voodoo<T> where T : class
{
    static readonly IntPtr tptr;
    static readonly int tsize;
    static readonly byte[] zero;

    public static T NewInUnmanagedMemory()
    {
        IntPtr handle = Marshal.AllocHGlobal(tsize);
        Marshal.Copy(zero, 0, handle, tsize); // zero the memory
        IntPtr ptr = handle + 4;              // note: assumes a 32-bit process
        Marshal.WriteIntPtr(ptr, tptr);       // write the type handle
        return GetO(ptr);
    }

    public static void FreeUnmanagedInstance(T obj)
    {
        IntPtr ptr = GetPtr(obj);
        IntPtr handle = ptr - 4;
        Marshal.FreeHGlobal(handle);
    }

    // public so the reference-field workaround in the edit below can use them
    public delegate T GetO_d(IntPtr ptr);
    public static readonly GetO_d GetO;
    public delegate IntPtr GetPtr_d(T obj);
    public static readonly GetPtr_d GetPtr;

    static Voodoo()
    {
        Type t = typeof(T);
        tptr = t.TypeHandle.Value;
        tsize = Marshal.ReadInt32(tptr, 4); // base instance size
        zero = new byte[tsize];
        DynamicMethod m = new DynamicMethod("GetO", typeof(T), new[] { typeof(IntPtr) }, typeof(Voodoo<T>), true);
        var il = m.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ret);
        GetO = m.CreateDelegate(typeof(GetO_d)) as GetO_d;
        m = new DynamicMethod("GetPtr", typeof(IntPtr), new[] { typeof(T) }, typeof(Voodoo<T>), true);
        il = m.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ret);
        GetPtr = m.CreateDelegate(typeof(GetPtr_d)) as GetPtr_d;
    }
}
If you care about memory leaks, you should always call FreeUnmanagedInstance when you are done with your class.
If you want a more complex solution, you can try this:
using System;
using System.Reflection.Emit;
using System.Runtime.InteropServices;

public class ObjectHandle<T> : IDisposable where T : class
{
    bool freed;
    readonly IntPtr handle;
    readonly T value;
    readonly IntPtr tptr;

    public ObjectHandle() : this(typeof(T))
    {
    }

    public ObjectHandle(Type t)
    {
        tptr = t.TypeHandle.Value;
        int size = Marshal.ReadInt32(tptr, 4); // base instance size
        handle = Marshal.AllocHGlobal(size);
        byte[] zero = new byte[size];
        Marshal.Copy(zero, 0, handle, size); // zero the memory
        IntPtr ptr = handle + 4;             // note: assumes a 32-bit process
        Marshal.WriteIntPtr(ptr, tptr);      // write the type ptr
        value = GetO(ptr);                   // convert to a reference
    }

    public T Value
    {
        get { return value; }
    }

    public bool Valid
    {
        get { return Marshal.ReadIntPtr(handle, 4) == tptr; }
    }

    public void Dispose()
    {
        if (!freed)
        {
            Marshal.FreeHGlobal(handle);
            freed = true;
            GC.SuppressFinalize(this);
        }
    }

    ~ObjectHandle()
    {
        Dispose();
    }

    delegate T GetO_d(IntPtr ptr);
    static readonly GetO_d GetO;

    static ObjectHandle()
    {
        DynamicMethod m = new DynamicMethod("GetO", typeof(T), new[] { typeof(IntPtr) }, typeof(ObjectHandle<T>), true);
        var il = m.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ret);
        GetO = m.CreateDelegate(typeof(GetO_d)) as GetO_d;
    }
}

/* Usage */
using (var handle = new ObjectHandle<MyClass>())
{
    // do some work with handle.Value
}
I hope it will help you on your path.
Edit: Found a solution to reference-type fields:
class MyClass
{
    private IntPtr a_ptr;
    public object a
    {
        get { return Voodoo<object>.GetO(a_ptr); }
        set { a_ptr = Voodoo<object>.GetPtr(value); }
    }
    public int b;
    public int c;
}
Edit: Even better solution. Just use ObjectContainer<object> instead of object and so on.
public struct ObjectContainer<T> where T : class
{
    private readonly T val;

    public ObjectContainer(T obj)
    {
        val = obj;
    }

    public T Value
    {
        get { return val; }
    }

    public static implicit operator T(ObjectContainer<T> @ref)
    {
        return @ref.val;
    }

    public static implicit operator ObjectContainer<T>(T obj)
    {
        return new ObjectContainer<T>(obj);
    }

    public override string ToString()
    {
        return val.ToString();
    }

    public override int GetHashCode()
    {
        return val.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        return val.Equals(obj);
    }
}
"I want to implement a custom allocator for C#"
GC is at the core of the CLR. Only Microsoft (or the Mono team in case of Mono) can replace it, at a great cost in development effort. GC being at the core of the CLR, messing around with the GC or the managed heap will crash the CLR — quickly if you're very-very lucky.
What does the garbage collector do (implementation-specific, if need be) when faced with a reference that points outside of managed memory?
It crashes in an implementation-specific way ;)
Purely C# Approach
So, there are a few options. The easiest is to use new/delete in an unsafe context for structs. The second is to use built-in Marshalling services to deal with unmanaged memory (code for this is visible below). However, both of these deal with structs (though I think the latter method is very close to what you want). My code has a limitation in that you must stick to structures throughout and use IntPtrs for references (using ChunkAllocator.ConvertPointerToStructure to get the data and ChunkAllocator.StoreStructure to store the changed data). This is obviously cumbersome, so you'd better really want the performance if you use my approach. However, if you are dealing with only value-types, this approach is sufficient.
Detour: Classes in the CLR
Classes have an 8-byte "prefix" in their allocated memory (in a 32-bit process). Four bytes are for the sync block index used in multithreading, and four bytes identify their type (basically, the virtual method table and run-time reflection). This makes it hard to deal with unmanaged memory, since these are CLR-specific and since the sync block index can change at run-time. See here for details on run-time object creation and here for an overview of memory layout for a reference type. Also check out CLR via C# for a more in-depth explanation.
A Caveat
As usual, things are rarely so simple as yes/no. The real complexity of reference types has to do with how the garbage collector compacts allocated memory during a garbage collection. If you can somehow ensure that a garbage collection doesn't happen or that it won't affect the data in question (see the fixed keyword) then you can turn an arbitrary pointer into an object reference (just offset the pointer by 8 bytes, then interpret that data as a struct with the same fields and memory layout; perhaps use StructLayoutAttribute to be sure). I would experiment with non-virtual methods to see if they work; they should (especially if you put them on the struct) but virtual methods are no-go due to the virtual method table that you'd have to discard.
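As a small aside, the StructLayoutAttribute suggestion above can be sanity-checked cheaply. A sketch (the struct here is hypothetical, mirroring the Foo fields from the question):

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical mirror struct with an explicit sequential layout, so its
// marshalled size and field order are predictable.
[StructLayout(LayoutKind.Sequential, Pack = 4)]
struct FooData
{
    public int a, b, c;
}

class LayoutCheck
{
    static void Main()
    {
        // Three 4-byte ints, sequential layout, pack 4 -> 12 bytes marshalled
        Console.WriteLine(Marshal.SizeOf(typeof(FooData)));
    }
}
```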
One Does Not Simply Walk Into Mordor
Simply put, this means that managed reference types (classes) cannot be allocated in unmanaged memory. You could use managed reference types in C++, but those would be subject to garbage collection... and the process and code is more painful than the struct-based approach. Where does that leave us? Back where we started, of course.
There is a Secret Way
We could brave Shelob's Lair and do the memory allocation ourselves. Unfortunately, this is where our paths must part, because I am not that knowledgeable about it. I will provide you with a link or two - perhaps three or four in actuality. This is rather complicated and raises the question: are there other optimizations you could try? Cache coherency and better algorithms are one approach, as is judicious application of P/Invoke for performance-critical code. You could also apply the aforementioned structures-only memory allocation for key methods/classes.
Good luck, and let us know if you find a superior alternative.
Appendix: Source Code
ChunkAllocator.cs
using System;
using System.Runtime.InteropServices;

namespace MemAllocLib
{
    public sealed class ChunkAllocator : IDisposable
    {
        IntPtr m_chunkStart;
        int m_offset; // offset into the already allocated memory
        readonly int m_size;

        public ChunkAllocator(int memorySize = 1024)
        {
            if (memorySize < 1)
                throw new ArgumentOutOfRangeException("memorySize", "memorySize must be positive");
            m_size = memorySize;
            m_chunkStart = Marshal.AllocHGlobal(memorySize);
        }

        ~ChunkAllocator()
        {
            Dispose();
        }

        public IntPtr Allocate<T>() where T : struct
        {
            int reqBytes = Marshal.SizeOf(typeof(T)); // not highly performant
            return Allocate<T>(reqBytes);
        }

        public IntPtr Allocate<T>(int reqBytes) where T : struct
        {
            if (m_chunkStart == IntPtr.Zero)
                throw new ObjectDisposedException("ChunkAllocator");
            if (m_offset + reqBytes > m_size)
                throw new OutOfMemoryException("Too many bytes allocated: " + reqBytes + " needed, but only " + (m_size - m_offset) + " bytes available");
            T created = default(T);
            Marshal.StructureToPtr(created, m_chunkStart + m_offset, false);
            m_offset += reqBytes;
            return m_chunkStart + (m_offset - reqBytes);
        }

        public void Dispose()
        {
            if (m_chunkStart != IntPtr.Zero)
            {
                Marshal.FreeHGlobal(m_chunkStart);
                m_offset = 0;
                m_chunkStart = IntPtr.Zero;
            }
        }

        public void ReleaseAllMemory()
        {
            m_offset = 0;
        }

        public int AllocatedMemory
        {
            get { return m_offset; }
        }

        public int AvailableMemory
        {
            get { return m_size - m_offset; }
        }

        public int TotalMemory
        {
            get { return m_size; }
        }

        public static T ConvertPointerToStruct<T>(IntPtr ptr) where T : struct
        {
            return (T)Marshal.PtrToStructure(ptr, typeof(T));
        }

        public static void StoreStructure<T>(IntPtr ptr, T data) where T : struct
        {
            Marshal.StructureToPtr(data, ptr, false);
        }
    }
}
Program.cs
using System;

namespace MemoryAllocation
{
    class Program
    {
        static void Main(string[] args)
        {
            using (MemAllocLib.ChunkAllocator chunk = new MemAllocLib.ChunkAllocator())
            {
                Console.WriteLine(">> Simple data test");
                SimpleDataTest(chunk);
                Console.WriteLine();
                Console.WriteLine(">> Complex data test");
                ComplexDataTest(chunk);
            }
            Console.ReadLine();
        }

        private static void SimpleDataTest(MemAllocLib.ChunkAllocator chunk)
        {
            IntPtr ptr = chunk.Allocate<System.Int32>();
            Console.WriteLine(MemAllocLib.ChunkAllocator.ConvertPointerToStruct<Int32>(ptr));
            System.Diagnostics.Debug.Assert(MemAllocLib.ChunkAllocator.ConvertPointerToStruct<Int32>(ptr) == 0, "Data not initialized properly");
            System.Diagnostics.Debug.Assert(chunk.AllocatedMemory == sizeof(Int32), "Data not allocated properly");
            int data = MemAllocLib.ChunkAllocator.ConvertPointerToStruct<Int32>(ptr);
            data = 10;
            MemAllocLib.ChunkAllocator.StoreStructure(ptr, data);
            Console.WriteLine(MemAllocLib.ChunkAllocator.ConvertPointerToStruct<Int32>(ptr));
            System.Diagnostics.Debug.Assert(MemAllocLib.ChunkAllocator.ConvertPointerToStruct<Int32>(ptr) == 10, "Data not set properly");
            Console.WriteLine("All tests passed");
        }

        private static void ComplexDataTest(MemAllocLib.ChunkAllocator chunk)
        {
            IntPtr ptr = chunk.Allocate<Person>();
            Console.WriteLine(MemAllocLib.ChunkAllocator.ConvertPointerToStruct<Person>(ptr));
            System.Diagnostics.Debug.Assert(MemAllocLib.ChunkAllocator.ConvertPointerToStruct<Person>(ptr).Age == 0, "Data age not initialized properly");
            System.Diagnostics.Debug.Assert(MemAllocLib.ChunkAllocator.ConvertPointerToStruct<Person>(ptr).Name == null, "Data name not initialized properly");
            System.Diagnostics.Debug.Assert(chunk.AllocatedMemory == System.Runtime.InteropServices.Marshal.SizeOf(typeof(Person)) + sizeof(Int32), "Data not allocated properly");
            Person data = MemAllocLib.ChunkAllocator.ConvertPointerToStruct<Person>(ptr);
            data.Name = "Bob";
            data.Age = 20;
            MemAllocLib.ChunkAllocator.StoreStructure(ptr, data);
            Console.WriteLine(MemAllocLib.ChunkAllocator.ConvertPointerToStruct<Person>(ptr));
            System.Diagnostics.Debug.Assert(MemAllocLib.ChunkAllocator.ConvertPointerToStruct<Person>(ptr).Age == 20, "Data age not set properly");
            System.Diagnostics.Debug.Assert(MemAllocLib.ChunkAllocator.ConvertPointerToStruct<Person>(ptr).Name == "Bob", "Data name not set properly");
            Console.WriteLine("All tests passed");
        }

        struct Person
        {
            public string Name;
            public int Age;

            public Person(string name, int age)
            {
                Name = name;
                Age = age;
            }

            public override string ToString()
            {
                if (string.IsNullOrWhiteSpace(Name))
                    return "Age is " + Age;
                return Name + " is " + Age + " years old";
            }
        }
    }
}
You can write code in C++ and call it from .NET using P/Invoke, or you can write code in managed C++ (C++/CLI) that gives you full access to the native API from inside a .NET language. However, on the managed side you can only work with managed types, so you will have to encapsulate your unmanaged objects.
To give a simple example: Marshal.AllocHGlobal allows you to allocate memory on the Windows heap. The handle returned is not of much use in .NET but can be required when calling a native Windows API requiring a buffer.
This is not possible.
However you can use a managed struct and create a pointer of this struct type. This pointer can point anywhere (including to unmanaged memory).
The question is, why would you want to have a class in unmanaged memory? You wouldn't get GC features anyway. You can just use a pointer-to-struct.
Nothing like that is possible. You can access managed memory in unsafe context, but said memory is still managed and subject to GC.
Why?
Simplicity and security.
But now that I think about it, I think you can mix managed and unmanaged with C++/CLI. But I'm not sure about that, because I never worked with C++/CLI.
I don't know a way to hold a C# class instance in the unmanaged heap, not even in C++/CLI.
It's possible to design a value-type allocator entirely within .NET, without using any unmanaged code, which can allocate and free an arbitrary number of value-type instances without any significant GC pressure. The trick is to create a relatively small number of arrays (possibly one for each type) to hold the instances, and then pass around "instance reference" structs which hold the array index of the instance in question.
Suppose, for example, that I want to have a "creature" class which holds XYZ positions (float), XYZ velocity (also float), roll/pitch/yaw (ditto), damage (float), and kind (enumeration). An interface "ICreatureReference" would define getters and setters for all those properties. A typical implementation would be a struct CreatureReference with a single private field int _index, and property accessors like:
float Position {
    get { return Creatures[_index].Position; }
    set { Creatures[_index].Position = value; }
}
The system would keep a list of which array slots are used and vacant (it could, if desired, use one of the fields within Creatures to form a linked list of vacant slots). The CreatureReference.Create method would allocate an item from the vacant-items list; the Dispose method of a CreatureReference instance would add its array slot to the vacant-items list.
This approach ends up requiring an annoying amount of boilerplate code, but it can be reasonably efficient and avoids GC pressure. The biggest problems with it are probably that (1) it makes structs behave more like reference types than structs, and (2) it requires absolute discipline in calling IDisposable.Dispose, since non-disposed array slots will never be reclaimed. Another irksome quirk is that one will be unable to use property setters on read-only values of type CreatureReference, even though the property setters would not try to mutate any fields of the CreatureReference instance to which they are applied. Using the interface ICreatureReference may avoid this difficulty, but one must be careful to declare storage locations only of generic types constrained to ICreatureReference, rather than declaring storage locations of ICreatureReference itself.
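A bare-bones sketch of the index-as-reference pattern described above, with the free list threaded through a spare field (all names are hypothetical, and the record is trimmed to one float for brevity):

```csharp
using System;

// One pooled record; NextFree doubles as the free-list link while the
// slot is vacant, as the answer suggests.
struct Creature
{
    public float X;
    public int NextFree;
}

class CreaturePool
{
    Creature[] _items;
    int _freeHead = -1; // -1 means the free list is empty
    int _count;

    public CreaturePool(int capacity) { _items = new Creature[capacity]; }

    // Hand out a vacant slot, reusing freed ones first.
    public int Allocate()
    {
        if (_freeHead >= 0)
        {
            int i = _freeHead;
            _freeHead = _items[i].NextFree;
            _items[i] = default(Creature);
            return i;
        }
        return _count++;
    }

    // Return a slot to the free list; forgetting this leaks the slot.
    public void Free(int index)
    {
        _items[index].NextFree = _freeHead;
        _freeHead = index;
    }

    public float GetX(int index) { return _items[index].X; }
    public void SetX(int index, float v) { _items[index].X = v; }
}

class PoolDemo
{
    static void Main()
    {
        var pool = new CreaturePool(8);
        int a = pool.Allocate();
        int b = pool.Allocate();
        pool.SetX(b, 42f);
        pool.Free(a);
        int c = pool.Allocate();         // reuses slot a
        Console.WriteLine(a == c);
        Console.WriteLine(pool.GetX(b));
    }
}
```

A real CreatureReference struct would wrap the int index and expose the fields as properties, as in the answer's Position example.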

Can a 2D byte array be made into one huge continuous byte array?

I have an extremely large 2D byte array in memory,
byte[][] MyBA = new byte[int.MaxValue][]; // each row is a byte[10]
Is there any way (probably unsafe) that I can fool C# into thinking this is one huge continuous byte array? I want to do this such that I can pass it to a MemoryStream and then a BinaryReader.
MyReader = new BinaryReader(MemoryStream(*MyBA)) //Syntax obviously made-up here
I do not believe .NET provides this, but it should be fairly easy to write your own implementation of System.IO.Stream that seamlessly switches between backing arrays. Here are the (untested) basics:
using System;
using System.IO;

public class MultiArrayMemoryStream : Stream
{
    byte[][] _arrays;
    long _position;
    int _arrayNumber;
    int _posInArray;

    public MultiArrayMemoryStream(byte[][] arrays)
    {
        _arrays = arrays;
        _position = 0;
        _arrayNumber = 0;
        _posInArray = 0;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int read = 0;
        while (read < count)
        {
            if (_arrayNumber >= _arrays.Length)
            {
                return read;
            }
            if (count - read <= _arrays[_arrayNumber].Length - _posInArray)
            {
                Buffer.BlockCopy(_arrays[_arrayNumber], _posInArray, buffer, offset + read, count - read);
                _posInArray += count - read;
                _position += count - read;
                read = count;
            }
            else
            {
                Buffer.BlockCopy(_arrays[_arrayNumber], _posInArray, buffer, offset + read, _arrays[_arrayNumber].Length - _posInArray);
                read += _arrays[_arrayNumber].Length - _posInArray;
                _position += _arrays[_arrayNumber].Length - _posInArray;
                _arrayNumber++;
                _posInArray = 0;
            }
        }
        return count;
    }

    public override long Length
    {
        get
        {
            long res = 0;
            for (int i = 0; i < _arrays.Length; i++)
            {
                res += _arrays[i].Length;
            }
            return res;
        }
    }

    public override long Position
    {
        get { return _position; }
        set { throw new NotSupportedException(); }
    }

    public override bool CanRead
    {
        get { return true; }
    }

    public override bool CanSeek
    {
        get { return false; }
    }

    public override bool CanWrite
    {
        get { return false; }
    }

    public override void Flush()
    {
    }

    // Stream.Seek returns long; a void override would not compile.
    public override long Seek(long offset, SeekOrigin origin)
    {
        throw new NotSupportedException();
    }

    public override void SetLength(long value)
    {
        throw new NotSupportedException();
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        throw new NotSupportedException();
    }
}
Another way to work around the size limitation of 2^31 bytes is UnmanagedMemoryStream, which implements System.IO.Stream on top of an unmanaged memory buffer (which may be as large as the OS supports). Something like this might work (untested; the byte* cast requires compiling with /unsafe):
var fileStream = new FileStream("data",
FileMode.Open,
FileAccess.Read,
FileShare.Read,
16 * 1024,
FileOptions.SequentialScan);
long length = fileStream.Length;
IntPtr buffer = Marshal.AllocHGlobal(new IntPtr(length));
var memoryStream = new UnmanagedMemoryStream((byte*) buffer.ToPointer(), length, length, FileAccess.ReadWrite);
fileStream.CopyTo(memoryStream);
memoryStream.Seek(0, SeekOrigin.Begin);
// work with the UnmanagedMemoryStream
Marshal.FreeHGlobal(buffer);
Agreed. In any case, you are limited by the maximum size of a single array.
If you really need to operate on huge arrays through a stream, write your own custom memory stream class.
I think you can use a linear structure instead of a 2D structure, using the following approach.
Instead of having byte[int.MaxValue][10] you can have byte[int.MaxValue * 10]. You would address the item at [4,5] as 10*(4-1)+(5-1) (the general formula being (i-1)*numberOfColumns+(j-1), with 1-based indices).
Of course you could use the other convention.
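With 0-based indices (the usual C# convention) the same mapping is simply row * columns + column; a quick sanity check of the round trip:

```csharp
using System;

class LinearIndex
{
    const int Columns = 10; // width of each row, as in byte[...][10]

    // 0-based linear index for element [row, col]
    static int ToLinear(int row, int col) { return row * Columns + col; }

    static void Main()
    {
        Console.WriteLine(ToLinear(3, 4)); // 0-based [3,4] -> 34

        // and back again: linear index -> (row, col)
        int idx = 34;
        Console.WriteLine(idx / Columns + "," + idx % Columns);
    }
}
```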
If I understand your question correctly, you've got a massive file that you want to read into memory and then process. But you can't do this because the amount of data in the file exceeds that of any single-dimensional array.
You mentioned that speed is important, and that you have multiple threads running in parallel to process the data as quickly as possible. If you're going to have to partition the data for each thread anyway, why not base the number of threads on the number of byte[int.MaxValue] buffers required to cover everything?
You can create a MemoryStream and then pass the array in row by row using the Write method.
EDIT:
The limit of a MemoryStream is certainly the amount of memory available to your application. Maybe there is a limit below that, but if you need more memory then you should consider modifying your overall architecture. E.g. you could process your data in chunks, or you could implement a swapping mechanism to a file.
If you are using Framework 4.0, you have the option of working with a MemoryMappedFile. Memory mapped files can be backed by a physical file, or by the Windows swap file. Memory mapped files act like an in-memory stream, transparently swapping data to/from the backing storage if and when required.
If you are not using Framework 4.0, you can still use this option, but you will need to either write your own wrapper or find an existing one. I expect there are plenty on The Code Project.

VST plugin : using FFT on audio input buffer with arbitrary size, how?

I'm getting interested in programming a VST plugin, and I have a basic knowledge of audio dsp's and FFT's.
I'd like to use VST.Net, and I'm wondering how to implement an FFT-based effect.
The process-code looks like
public override void Process(VstAudioBuffer[] inChannels, VstAudioBuffer[] outChannels)
If I'm correct, normally the FFT would be applied on the input, some processing would be done on the FFT'd data, and then an inverse-FFT would create the processed soundbuffer.
But since the FFT works on a specified buffer size that will most probably be different from the (arbitrary) number of input/output samples, how would you handle this?
Most FFT implementations require that your buffer size is a power of two, but to get around this you should just implement an internal buffer and work with that instead. So for instance:
// MyNiftyPlugin.h
#define MY_NUM_CHANNELS 2
#define MY_FFT_BUFFER_SIZE 1024
class MyNiftyPlugin : public AudioEffectX {
  // ... stuff ...
private:
  float internalBuffer[MY_NUM_CHANNELS][MY_FFT_BUFFER_SIZE];
  long internalBufferIndex;
};
And then in your process loop:
// MyNiftyPlugin.cpp
void process(float **inputs, float **outputs, long sampleFrames) {
  for (int frame = 0; frame < sampleFrames; ++frame) {
    for (int channel = 0; channel < MY_NUM_CHANNELS; ++channel) {
      internalBuffer[channel][internalBufferIndex] = inputs[channel][frame];
    }
    if (++internalBufferIndex >= MY_FFT_BUFFER_SIZE) { // >=, not >, or the buffer overruns
      doFftStuff(...);
      internalBufferIndex = 0;
    }
  }
}
This will impose a bit of latency in your plugin, but the performance boost you can achieve by knowing the buffer size for FFT during compile time makes it worthwhile.
Also, this is a good workaround for hosts like FL Studio (aka "Fruity Loops") which are known to call process() with different blocksizes every time.
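The same accumulate-then-flush idea translates directly to C# for use inside a VST.NET Process override. The class and callback names below are illustrative, not from any real VST.NET API:

```csharp
using System;

// Accumulates arbitrarily-sized input blocks into fixed-size frames,
// invoking a callback for each full frame -- the same scheme as the
// C++ sketch above.
class FftFrameBuffer
{
    readonly float[] _frame;
    int _index;
    readonly Action<float[]> _onFrameReady;

    public FftFrameBuffer(int frameSize, Action<float[]> onFrameReady)
    {
        _frame = new float[frameSize];
        _onFrameReady = onFrameReady;
    }

    // Feed 'count' samples from an arbitrarily-sized host block.
    public void Feed(float[] samples, int count)
    {
        for (int i = 0; i < count; i++)
        {
            _frame[_index++] = samples[i];
            if (_index == _frame.Length) // flush exactly at the boundary
            {
                _onFrameReady(_frame);
                _index = 0;
            }
        }
    }
}

class FrameDemo
{
    static void Main()
    {
        int frames = 0;
        var buf = new FftFrameBuffer(1024, f => frames++);
        buf.Feed(new float[1500], 1500); // 1 full frame, 476 samples pending
        buf.Feed(new float[1000], 1000); // 2500 total -> 2 full frames so far
        Console.WriteLine(frames);
    }
}
```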

How can one simplify network byte-order conversion from a BinaryReader?

System.IO.BinaryReader reads values in a little-endian format.
I have a C# application connecting to a proprietary networking library on the server side. The server-side sends everything down in network byte order, as one would expect, but I find that dealing with this on the client side is awkward, particularly for unsigned values.
UInt32 length = (UInt32)IPAddress.NetworkToHostOrder(reader.ReadInt32());
is the only way I've come up with to get a correct unsigned value out of the stream, but this seems both awkward and ugly, and I have yet to test if that's just going to clip off high-order values so that I have to do fun BitConverter stuff.
Is there some way I'm missing short of writing a wrapper around the whole thing to avoid these ugly conversions on every read? It seems like there should be an endian-ness option on the reader to make things like this simpler, but I haven't come across anything.
There is no built-in converter. Here's my wrapper (as you can see, I only implemented the functionality I needed, but the structure is pretty easy to change to your liking):
/// <summary>
/// Utilities for reading big-endian files
/// </summary>
public class BigEndianReader
{
    private BinaryReader mBaseReader;

    public BigEndianReader(BinaryReader baseReader)
    {
        mBaseReader = baseReader;
    }

    public short ReadInt16()
    {
        return BitConverter.ToInt16(ReadBigEndianBytes(2), 0);
    }

    public ushort ReadUInt16()
    {
        return BitConverter.ToUInt16(ReadBigEndianBytes(2), 0);
    }

    public uint ReadUInt32()
    {
        return BitConverter.ToUInt32(ReadBigEndianBytes(4), 0);
    }

    public byte[] ReadBigEndianBytes(int count)
    {
        byte[] bytes = new byte[count];
        for (int i = count - 1; i >= 0; i--)
            bytes[i] = mBaseReader.ReadByte();
        return bytes;
    }

    public byte[] ReadBytes(int count)
    {
        return mBaseReader.ReadBytes(count);
    }

    public void Close()
    {
        mBaseReader.Close();
    }

    public Stream BaseStream
    {
        get { return mBaseReader.BaseStream; }
    }
}
Basically, ReadBigEndianBytes does the grunt work, and this is passed to a BitConverter. There will be a definite problem if you read a large number of bytes since this will cause a large memory allocation.
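The reversal at the heart of ReadBigEndianBytes can be sanity-checked with nothing but BitConverter and Array.Reverse:

```csharp
using System;

class EndianCheck
{
    static void Main()
    {
        byte[] data = { 0x00, 0x00, 0x01, 0x85 }; // 389 in big-endian (network) order

        // Reverse only when the machine is little-endian, so the check
        // is correct regardless of host byte order.
        byte[] copy = (byte[])data.Clone();
        if (BitConverter.IsLittleEndian)
            Array.Reverse(copy);
        Console.WriteLine(BitConverter.ToUInt32(copy, 0));
    }
}
```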
I built a custom BinaryReader to handle all of this. It's available as part of my Nextem library. It also has a very easy way of defining binary structs, which I think will help you here -- check out the Examples.
Note: It's only in SVN right now, but very stable. If you have any questions, email me at cody_dot_brocious_at_gmail_dot_com.
