Compare byte[] to T - c#

I want to make a list of pointers to locations that contains a certain value in the process memory of another process. The value can be a short, int, long, string, bool or something else.
My idea is to use Generics for this. I have one problem with making it, how can I tell the compiler to what type he needs to convert the byte array?
This is what I made:
public List<IntPtr> ScanProccessFor<T>(T ItemToScanFor)
{
List<IntPtr> Output = new List<IntPtr>();
IntPtr StartOffset = SelectedProcess.MainModule.BaseAddress;
int ScanSize = SelectedProcess.MainModule.ModuleMemorySize;
for (int i = 0; i < ScanSize; i++)
if (ReadMemory(SelectedProcess, StartOffset + i, (UInt16)Marshal.SizeOf(ItemToScanFor)) == ItemToScanFor)
Output.Insert(Output.Count,StartOffset + i);
return Output;
}
How can I tell the compiler that he needs to convert the byte[] to type T?

Your question is a little bit confusing, but I'll try to answer what I can
Instead of taking a generic type, I would probably write a method that takes an instance of an interface like IConvertableToByteArray or something.
public IConvertableToByteArray
{
public byte[] ToByteArray();
}
Then If you needed to allow a specific type to be compatible with that method, you could make an encapsulating class
public IntConvertableToByteArray : IConvertableToByteArray
{
public int Value{get; set;}
public byte[] ToByteArray()
{
insert logic here
}
}

You could use Marshal.StructureToPtr to get an unmanaged representation of the structure (which has to be a 'simple' structure). You might need to special case strings though.
You should also think about the alignment constraints on what you are searching for -- advancing through memory 1 byte at a time will be very slow and wasteful if the item must be 4 or 8 byte aligned.

Related

Exclude extra private field in struct with LayoutKind.Explicit from being part of the structure layout

Let's say we have one structure :
[StructLayout(LayoutKind.Explicit, Size=8)] // using System.Runtime.InteropServices;
public struct AirportHeader {
[FieldOffset(0)]
[MarshalAs(UnmanagedType.I4)]
public int Ident; // a 4 bytes ASCII : "FIMP" { 0x46, 0x49, 0x4D, 0x50 }
[FieldOffset(4)]
[MarshalAs(UnmanagedType.I4)]
public int Offset;
}
What I want to have : Both direct access to type string and int values, for the field Ident in this structure, without breaking the 8 bytes size of the structure, nor having to compute a string value each time from the int value.
The field Ident in that structure as int is interesting because I can fast compare with other idents if they match, other idents may come from datas that are unrelated to this structure, but are in the same int format.
Question : Is there a way to define a field that is not part of the struture layout ? Like :
[StructLayout(LayoutKind.Explicit, Size=8)]
public struct AirportHeader {
[FieldOffset(0)]
[MarshalAs(UnmanagedType.I4)]
public int Ident; // a 4 bytes ASCII : "FIMP" { 0x46, 0x49, 0x4D, 0x50 }
[FieldOffset(4)]
[MarshalAs(UnmanagedType.I4)]
public int Offset;
[NoOffset()] // <- is there something I can do the like of this
string _identStr;
public string IdentStr {
get { // EDIT ! missed the getter on this property
if (string.IsNullOrEmpty(_identStr)) _identStr =
System.Text.Encoding.ASCII.GetString(Ident.GetBytes());
// do the above only once. May use an extra private bool field to go faster.
return _identStr;
}
}
}
PS : I use pointers ('*' and '&', unsafe) because I need to deal with endianness (Local system, binary files/file format, network) and fast type conversions, fast arrays filling. I also use many flavours of Marshal methods (fixing structures on byte arrays), and a little of PInvoke and COM interop. Too bad some assemblies I'm dealing with doesn't have their dotNet counterpart yet.
TL;DR; For details only
The question is all it is about, I just don't know the answer. The following should answer most questions like "other approaches", or "why not do this instead", but could be ignored as the answer would be straightforward. Anyway, I preemptively put everything so it's clear from the start what am I trying to do. :)
Options/Workaround I'm currently using (or thinking of using) :
Create a getter (not a field) that computes the string value each time :
public string IdentStr {
get { return System.Text.Encoding.ASCII.GetString(Ident.GetBytes()); }
// where GetBytes() is an extension method that converts an int to byte[]
}
This approach, while doing the job, performs poorly : The GUI displays aircraft from a database of default flights, and injects other flights from the network with a refresh rate of one second (I should increase that to 5 seconds). I have around 1200 flights within a area, relating to 2400 airports (departure and arrival), meaning I have 2400 calls to the above code each second to display the ident in a DataGrid.
Create another struct (or class), which only purpose is to manage
data on GUI side, when not reading/writing to a stream or file. That means, read
the data with the explicit layout struct. Create another struct with
the string version of the field. Work with GUI. That will perform
better on an overall point of view, but, in the process of defining
structures for the game binaries, I'm already at 143 structures of
the kind (just with older versions of the game datas; there are a bunch I didn't write yet, and I plan to add structures for the newest datas types). ATM, more than half of them require one or more extra
fields to be of meaningful use. It's okay if I were the only one to use the assembly, but
other users will probably get lost with AirportHeader,
AirportHeaderEx, AirportEntry, AirportEntryEx,
AirportCoords, AirportCoordsEx.... I would avoid doing that.
Optimize option 1 to make computations perform faster (thanks to SO,
there are a bunch of ideas to look for - currently working on the idea). For the Ident field, I
guess I could use pointers (and I will). Already doing it for fields I must display in little endian and read/write in big
endian. There are other values, like 4x4 grid informations that are
packed in a single Int64 (ulong), that needs bit shifting to
expose the actual values. Same for GUIDs or objects pitch/bank/yaw.
Try to take advantage of overlapping fields (on study). That would work for GUIDs. Perhaps it may work for the Ident example, if MarshalAs can constrain the
value to an ASCII string. Then I just need to specify the same
FieldOffset, '0' in this case. But I'm unsure setting the field
value (entry.FieldStr = "FMEP";) actually uses the Marshal constrain on the managed code side. My undestanding is it will store the string in Unicode on managed side (?).
Furthermore, that wouldn't work for packed bits (bytes that contains
several values, or consecutive bytes hosting values that have to be
bit shifted). I believe it is impossible to specify value position, length and format
at bit level.
Why bother ? context :
I'm defining a bunch of structures to parse binary datas from array of bytes (IO.File.ReadAllBytes) or streams, and write them back, datas related to a game. Application logic should use the structures to quickly access and manipulate the datas on demand. Assembly expected capabilities is read, validate, edit, create and write, outside the scope of the game (addon building, control) and inside the scope of the game (API, live modding or monitoring). Other purpose is to understand the content of binaries (hex) and make use of that understanding to build what's missing in the game.
The purpose of the assembly is to provide a ready to use basis components for a c# addon contributor (I don't plan to make the code portable). Creating applications for the game or processing addon from source to compilation into game binaries. It's nice to have a class that loads the entire content of a file in memory, but some context require you to not do that, and only retrieve from the file what is necessary, hence the choice of the struct pattern.
I need to figure out the trust and legal issues (copyrighted data) but that's outside the scope of the main concern. If that matter, Microsoft did provide over the years public freely accessible SDKs exposing binaries structures on previous versions of the game, for the purpose of what I'm doing (I'm not the first and probably not the last to do so). Though, I wouldn't dare to expose undocumented binaries (for the latest game datas for instance), nor facilitate a copyright breach on copyrighted materials/binaries.
I'm just asking confirmation if there is a way or not to have private fields not being part of the structure layout. Naive belief ATM is "that's impossible, but there are workarounds". It's just that my c# experience is pretty sparce, so maybe I'm wrong, why I ask. Thanks !
As suggested, there are several ways to get the job done. Here are the getters/setters I came up with within the structure. I'll measure how each code performs on various scenarios later. The dict approach is very seducing as on many scenarios, I would need a directly accessible global database of (59000) airports with runways and parking spots (not just the Ident), but a fast check between struct fields is also interesting.
public string IdentStr_Marshal {
get {
var output = "";
GCHandle pinnedHandle; // CS0165 for me (-> c# v5)
try { // Fast if no exception, (very) slow if exception thrown
pinnedHandle = GCHandle.Alloc(this, GCHandleType.Pinned);
IntPtr structPtr = pinnedHandle.AddrOfPinnedObject();
output = Marshal.PtrToStringAnsi(structPtr, 4);
// Cannot use UTF8 because the assembly should work in Framework v4.5
} finally { if (pinnedHandle.IsAllocated) pinnedHandle.Free(); }
return output;
}
set {
value.PadRight(4); // Must fill the blanks - initial while loop replaced (Charlieface's)
IntPtr intValuePtr = IntPtr.Zero;
// Cannot use UTF8 because some users are on Win7 with FlightSim 2004
try { // Put a try as a matter of habit, but not convinced it's gonna throw.
intValuePtr = Marshal.StringToHGlobalAnsi(value);
Ident = Marshal.ReadInt32(intValuePtr, 0).BinaryConvertToUInt32(); // Extension method to convert type.
} finally { Marshal.FreeHGlobal(intValuePtr); // freeing the right pointer }
}
}
public unsafe string IdentStr_Pointer {
get {
string output = "";
fixed (UInt32* ident = &Ident) { // Fixing the field
sbyte* bytes = (sbyte*)ident;
output = new string(bytes, 0, 4, System.Text.Encoding.ASCII); // Encoding added (#Charlieface)
}
return output;
}
set {
// value must not exceed a length of 4 and must be in Ansi [A-Z,0-9,whitespace 0x20].
// value validation at this point occurs outside the structure.
fixed (UInt32* ident = &Ident) { // Fixing the field
byte* bytes = (byte*)ident;
byte[] asciiArr = System.Text.Encoding.ASCII.GetBytes(value);
if (asciiArr.Length >= 4) // (asciiArr.Length == 4) would also work
for (Int32 i = 0; i < 4; i++) bytes[i] = asciiArr[i];
else {
for (Int32 i = 0; i < asciiArr.Length; i++) bytes[i] = asciiArr[i];
for (Int32 i = asciiArr.Length; i < 4; i++) bytes[i] = 0x20;
}
}
}
}
static Dictionary<UInt32, string> ps_dict = new Dictionary<UInt32, string>();
public string IdentStr_StaticDict {
get {
string output; // logic update with TryGetValue (#Charlieface)
if (ps_dict.TryGetValue(Ident, out output)) return output;
output = System.Text.Encoding.ASCII.GetString(Ident.ToBytes(EndiannessType.LittleEndian));
ps_dict.Add(Ident, output);
return output;
}
set { // input can be "FMEE", "DME" or "DK". length of 2 characters is the minimum.
var bytes = new byte[4]; // Need to convert value to a 4 byte array
byte[] asciiArr = System.Text.Encoding.ASCII.GetBytes(value); // should be 4 bytes or less
// Put the valid ASCII codes in the array.
if (asciiArr.Length >= 4) // (asciiArr.Length == 4) would also work
for (Int32 i = 0; i < 4; i++) bytes[i] = asciiArr[i];
else {
for (Int32 i = 0; i < asciiArr.Length; i++) bytes[i] = asciiArr[i];
for (Int32 i = asciiArr.Length; i < 4; i++) bytes[i] = 0x20;
}
Ident = BitConverter.ToUInt32(bytes, 0); // Set structure int value
if (!ps_dict.ContainsKey(Ident)) // Add if missing
ps_dict.Add(Ident, System.Text.Encoding.ASCII.GetString(bytes));
}
}
As mentioned by others, it is not possible to exclude a field from a struct for marshalling.
You also cannot use a pointer as a string in most places.
If the number of different possible strings is relatively small (and it probably will be, given it's only 4 characters), then you could use a static Dictionary<int, string> as a kind of string-interning mechanism.
Then you write a property to add/retrieve the real string.
Note that dictionary access is O(1), and hashing an int just returns itself, so it will be very, very fast, but will take up some memory.
[StructLayout(LayoutKind.Explicit, Size=8)]
public struct AirportHeader
{
[FieldOffset(0)]
[MarshalAs(UnmanagedType.I4)]
public int Ident; // a 4 bytes ASCII : "FIMP" { 0x46, 0x49, 0x4D, 0x50 }
[FieldOffset(4)]
[MarshalAs(UnmanagedType.I4)]
public int Offset;
static Dictionary<int, string> _identStrings = new Dictionary<int, string>();
public string IdentStr =>
_identStrings.TryGetValue(Ident, out var ret) ? ret :
(_identStrings[Ident] = Encoding.ASCII.GetString(Ident.GetBytes());
}
This is not possible because a structure must contain all of its values ​​in a specific order. Usually this order is controlled by the CLR itself. If you want to change the order of the data order, you can use the StructLayout. However, you cannot exclude a field or that data would simply not exist in memory.
Instead of a string (which is a reference type) you can use a pointer to point directly to that string and use that in your structure in combination with the StructLayout. To get this string value, you can use a get-only property that reads directly from unmanaged memory.

Why does it appear that this string is stored inline by value in an explicit layout class or struct?

I have been doing some extremely unsafe and slightly useless messing with the System.Runtime.CompilerServices.Unsafe MSIL package that allows you to do a lot of things with pointers you can't in C#. I created an extension method that returns a ref byte, with that byte being the start of the Method Table pointer at the start of the object, which allows you to use any object in a fixed statement, taking a byte pointer to the start of the object:
public static unsafe ref byte GetPinnableReference(this object obj)
{
return ref *(byte*)*(void**)Unsafe.AsPointer(ref obj);
}
I then decided to test it, using this code:
[StructLayout(LayoutKind.Explicit, Pack = 0)]
public class Foo
{
[FieldOffset(0)]
public string Name = "THIS IS A STRING";
}
[StructLayout(LayoutKind.Explicit, Pack = 0)]
public struct Bar
{
[FieldOffset(0)]
public string Name;
}
And then in the method
var foo = new Foo();
//var foo = new Bar { Name = "THIS IS A STRING" };
fixed (byte* objPtr = foo)
{
char* stringPtr = (char*)(objPtr + (foo is Foo ? : 12));
for (var i = 0; i < foo.Name.Length; i++)
{
Console.Write(*(stringPtr + i /* Char offset */));
}
Console.WriteLine();
}
Console.ReadKey();
The really weird thing about this is that this successfully prints "THIS IS A STRING"? The code works like this:
Get a byte pointer, objPtr, to the very start of the object
Add 16 to get to the actual data
Add another 16 to get past the string header to the string's actual data
Add 4 to skip the first 4 bytes of the string, which are the int _stringLength (exposed to us as Length property)
Interpret the result as a char pointer
EDIT: Important point - when switching foo to type Bar, I only add 12 rather than 36 bytes on (36 = 16 + 16 + 4). Why does it only have 8 bytes of header in the struct rather than 32 in the class? It would make sense that the struct has a smaller header (no syncblk i believe), but then why doesn't the string still have a 16 byte head? I would expect the offset to be 8 + 16 + 4 (28) rather than just 8 + 4 (12)
However, this assumption makes a big flaw. It assumes the string is stored inline inside the class/struct. However, strings are reference types and only a reference to them is stored inside the object from my knowledge. Particularly, I thought reference types can only be put on the heap - and as this struct is a local variable I thought it was on the stack. If it wasn't, the code would surely look something more like this to get the stringPtr
byte** stringRefptr = objPtr + 16;
char* stringPtr = (char*)(*stringRefPtr + 20);
where you take the string reference as a byte** and then use it to get to the chars. And this still wouldn't make sense if the string internally was a char[] (I'm not sure if it is)
So why does this work, and print the string, even though it mistakenly assumes string is stored inline, when string is a reference type?
NOTE: Requires .NET Core 2.0+ with System.Runtime.CompilerServices.Unsafe nuGet package, and C# 7.3+.
Because strings are indeed stored inline. The problem with your assumption is that strings are not normal objects but handled as a special case by the CLR (probably for performance reasons).
And as for the objects, since the string is the only member this would naturally be the most efficient way to allocate the memory. Try adding more members after your string member and your code would break.
Here’s a few references in how strings are stored in the CLR
https://mattwarren.org/2016/05/31/Strings-and-the-CLR-a-Special-Relationship/
https://codeblog.jonskeet.uk/2011/04/05/of-memory-and-strings/
Edit: I didn’t check, but I believe your reasoning behind the offsets is off. 36 = 24 (size of object) + 8 (string header?) + 4 (size of int) while for the struct the24 bytes becomes 0 as it has no header.

Implementing `.CopyTo(Array,int)` for the array itself as a target array(to be copied into)

In a class definition, I implemented IList<T> to make it look like an array.
// Foo has C++ arrays inside for a
// fast communication with some hardware
public abstract class Foo<T> : IList<T(or uint for a derived class)>
{
public virtual void CopyTo(uint[] array, int arrayIndex)
{
int dL = Length;
if (dL == array.Length)
{
/* needs pinning the target before this?*/
Marshal.Copy(handleForFooUnmanagedArray,
(int[])(object) array,
arrayIndex,
dL - arrayIndex);
return;
}
throw new NotImplementedException();
}
}
so it can do this now:
uint [] bar = new uint[L];
foo.CopyTo(bar,0);
but now I want to make it work like an array with this:
uint [] bar = new uint[L];
bar.CopyTo(foo,0);
so I looked what interfaces an array implements in run-time(here) to find something like a private .CopyFrom that I thought should be called implicity in `.CopyTo',
IList
ICloneable
ICollection
IEnumerable
IStructuralComparable
IStructuralEquatable
non of these have any .CopyFrom.
Maybe there is some IntPtr property as a handle for Marshal copying in .CopyTo but I couldn't see it in intellisense.
Question:
How can I find that which method does the .CopyTo use to get necessary info about target array and what that necessary info would that be? Another method like a .CopyFrom or a handle pointing to start of target array, or some interpreter intermediate codes stored in somewhere? Is the target array pinned in the process?
Side question:
Do I need to implement some extra methods in IList<T> on top of important(unknown) ones?
I already implemented toArray, Count and [] but I havent done anything for others yet. Then Foo also has Length(with a custom interface) but it doesn't belong Array so an uint[] may not use it in its CopyTo.
I'm not experienced with IL so I may not understand if thats the solution but I can look back in time.
Also I tried to implement Array which refuses to be implemented because of being a special class.
Thank you very much for your time.
CopyTo is implemented in unmanaged code by runtime itself, and signature of method looks like this:
[MethodImpl(MethodImplOptions.InternalCall)]
internal static extern void Copy(Array sourceArray, int sourceIndex, Array destinationArray, int destinationIndex, int length, bool reliable);
As you see it still expects Array and not some pointer, so it's hard to do what you want.
But if you can have a managed array inside your Foo then it's easy to achieve the goal - just use implicit conversion to Array like this:
class MyFakeArray {
uint[] _realArray = new uint[10];
public MyFakeArray() {
}
public static implicit operator uint[](MyFakeArray a) {
return a._realArray;
}
}
Then CopyTo will work as expected:
var a = new uint[10];
var fa = new MyFakeArray();
a.CopyTo(fa, 0);

Allocate array in the same object as class

I have an array inside a class:
class MatchNode
{
public short X;
public short Y;
public NodeVal[] ControlPoints;
private MatchNode()
{
ControlPoints = new NodeVal[4];
}
}
The NodeVal is:
struct NodeVal
{
public readonly short X;
public readonly short Y;
public NodeVal(short x, short y)
{
X = x;
Y = y;
}
}
Now what if we wanted to take performance to next level and avoid having a separate object for the array. Actually it doesn't have to have an array. The only restriction is that the client code should be able to access NodeVal by index like:
matchNode.ControlPoints[i]
OR
matchNode[i]
and of course the solution should be faster or as fast as array access since it's supposed to be an optimization.
EDIT: As Ryan suggested it seems I should explain more about the motivation:
The MatchNode class is used heavily in the project. Millions of them are used in the project and each are accessed hundreds of times so having them as compact and concise as possible can lead to less cache misses and overall performance.
Let's consider a 64bit machine. In the current implementation the class the array takes 8 bytes for the ControlPoints reference and the size of the array object would be at least 16 bytes of object overhead (for method table and sync block) and 16 byte for the actual byte. So we have at least 24 overhead bytes beside 16 bytes of actual data.
These objects are used in bottlenecks of the project so it matters if we could optimize them more.
Of course we could just have a super big array of NodeVal and just save an index in MatchNode that would locate the actual data but again it will change every client codes that uses the MatchNodes, let alone be a dirty non-object oriented solution.
It is okay to have a messy MatchNode that uses every kind of nasty trick like unsafe or static cache code. It is not okay to leak these optimizations out to the client code.
You´re looking for indexers:
class MatchNode
{
public short X;
public short Y;
private NodeVal[] myField;
public NodeVal this[int i] { get { return myField[i]; } set { myField[i] = value; } }
public MatchNode(int size) { this.myField = new NodeVal[size]; }
}
Now you can simply use this:
var m = new MatchNode(10);
m[0] = new NodeVal();
However I doubt this will affect performance (at least in means of speed) in any way and you should consider the actual problems using a profiling tool (dotTrace for instance). Furthermore this approach will also create a private backing-field which will produce the same memory-footprint.

C# equivalent to VB6 'Type'

I am trying to port a rather large source from VB6 to C#. This is no easy task - especially for me being fairly new to C#.net. This source uses numerous Windows APIs as well as numerous Types. I know that there is no equivalent to the VB6 Type in C# but I'm sure there is a way to reach the same outcome. I will post some code below to further explain my request.
VB6:
Private Type ICONDIRENTRY
bWidth As Byte
bHeight As Byte
bColorCount As Byte
bReserved As Byte
wPlanes As Integer
wBitCount As Integer
dwBytesInRes As Long
dwImageOffset As Long
End Type
Dim tICONDIRENTRY() As ICONDIRENTRY
ReDim tICONDIRENTRY(tICONDIR.idCount - 1)
For i = 0 To tICONDIR.idCount - 1
Call ReadFile(lFile, tICONDIRENTRY(i), Len(tICONDIRENTRY(i)), lRet, ByVal 0&)
Next i
I have tried using structs and classes - but no luck so far.
I would like to see a conversion of this Type structure, but if someone had any clue as to how to convert the entire thing it would be unbelievably helpful. I have spent countless hours on this small project already.
If it makes any difference, this is strictly for educational purposes only.
Thank you for any help in advance,
Evan
struct is the equivalent. You'd express it like this:
struct IconDirEntry {
public byte Width;
public byte Height;
public byte ColorCount;
public byte Reserved;
public int Planes;
public int BitCount;
public long BytesInRes;
public long ImageOffset;
}
You declare a variable like this:
IconDirEntry entry;
Generally, in C#, type prefixes are not used, nor are all caps, except possibly for constants. structs are value types in C#, so that means that they are always passed by value. It looks like you're passing them in to a method that's populating them. If you want that usage, you'll have to use classes.
I'm not exactly sure what your issue is but this is a small ex of how to use a struct.
struct aStrt
{
public int A;
public int B;
}
static void Main(string[] args)
{
aStrt saStrt;
saStrt.A = 5;
}
Your question is not clear ..
What issues are you facing when you are using either struct or class and define those field members? Are you not able to access those members using an instance created for that class ??
Else, declare the class as static and make all the members inside the class also as static , so that u can access them without any instance being created!!
Maybe you trying to get something like this?
struct IconDirEntry
{
public byte Width;
// etc...
}
IconDirEntry[] tICONDIRENTRY = new IconDireEntry[tICONDIR.idCount - 1];

Categories