I'm trying to write a CPU emulator in C#. The machine's object looks like this:
class Machine
{
short a,b,c,d; //these are registers.
short[] ram=new short[0x10000]; //RAM organised as 65536 16-bit words
public void tick() { ... } //one instruction is processed
}
When I execute an instruction, I have a switch statement that decides where the result of the instruction will be stored (either in a register or in a word of RAM).
I want to be able to do this:
short* resultContainer;
if (destination == register)
{
switch (resultSymbol) //this is really an opcode, made a char for clarity
{
case 'a': resultContainer=&a; break;
case 'b': resultContainer=&b; break;
//etc
}
}
else
{
//must be a place in RAM
resultContainer = &ram[location];
}
then, when I've performed the instruction, I can simply store the result like:
*resultContainer = result;
I've been trying to figure out how to do this without upsetting C#.
How do I accomplish this using unsafe{} and fixed(){ } and perhaps other things I'm not aware of?
This is how I would implement it:
class Machine
{
short a, b, c, d;
short[] ram = new short[0x10000];
enum AddressType
{
Register,
DirectMemory,
}
// Gives the address for an operand or for the result.
// `addressType` and `addrCode` are extracted from the instruction opcode
// `regPointers` and `ramPointer` are fixed pointers to registers and RAM.
private unsafe short* GetAddress(AddressType addressType, short addrCode, short*[] regPointers, short* ramPointer)
{
switch (addressType)
{
case AddressType.Register:
return regPointers[addrCode]; //just an example implementation
case AddressType.DirectMemory:
return ramPointer + addrCode; //just an example implementation
default:
return null;
}
}
public unsafe void tick()
{
fixed (short* ap = &a, bp = &b, cp = &c, dp = &d, ramPointer = ram)
{
short*[] regPointers = new short*[] { ap, bp, cp, dp };
short* pOperand1, pOperand2, pResult;
AddressType operand1AddrType, operand2AddrType, resultAddrType;
short operand1AddrCode, operand2AddrCode, resultAddrCode;
// ... decipher the instruction and extract the address types and codes
pOperand1 = GetAddress(operand1AddrType, operand1AddrCode, regPointers, ramPointer);
pOperand2 = GetAddress(operand2AddrType, operand2AddrCode, regPointers, ramPointer);
pResult = GetAddress(resultAddrType, resultAddrCode, regPointers, ramPointer);
// execute the instruction, using `*pOperand1` and `*pOperand2`, storing the result in `*pResult`.
}
}
}
To get the address of the registers and the RAM array, you have to use the fixed statement. Also, you can only use the acquired pointers inside the fixed block, so you have to pass them around.
What if we look at the problem from another angle? Execute the instruction and store the result in a local variable (result), and only then decide where to put it. Wouldn't that work?
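For instance, here is a minimal sketch of that approach inside the Machine class from the question; WriteResult, destinationIsRegister and location are illustrative names, not part of the original code:
// Minimal sketch: compute `result` first, then route it with a plain switch.
// a, b, c, d and ram are the fields from the question; everything else is hypothetical.
private void WriteResult(bool destinationIsRegister, char resultSymbol, int location, short result)
{
    if (destinationIsRegister)
    {
        switch (resultSymbol)
        {
            case 'a': a = result; break;
            case 'b': b = result; break;
            case 'c': c = result; break;
            case 'd': d = result; break;
        }
    }
    else
    {
        ram[location] = result; // destination is a word of RAM
    }
}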
A good option, one I had planned for DevKit but never implemented, is to generate your emulation code. It's a time-honoured, tried-and-tested solution. For every opcode/register combination, or some efficient subset, generate the code for ADD A,B, SUB X,Y and so on separately. You can then use a bitmask to grab the specific case you're looking for and simply execute the correct code. The compiler should be able to use a jump table under the hood, and with no lookups and no conditionals it should be pretty efficient.
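As a rough sketch of what the generated dispatch could look like (the instruction encoding and the pc field below are invented purely for illustration, not taken from the question):
// Hypothetical pre-expanded dispatch for Machine.tick(); assumes an int `pc` field and an
// invented encoding where the high byte selects the operation and the low nibbles the registers.
public void tick()
{
    ushort instruction = (ushort)ram[pc++];
    switch (instruction & 0xFF0F) // mask off any immediate/address bits, keep op + registers
    {
        case 0x0101: a = (short)(a + b); break; // ADD A,B
        case 0x0102: a = (short)(a + c); break; // ADD A,C
        case 0x0201: a = (short)(a - b); break; // SUB A,B
        // ... one case per combination, normally emitted by a code generator rather than by hand
        default: throw new InvalidOperationException("unknown instruction");
    }
}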
Let's say we have this structure:
[StructLayout(LayoutKind.Explicit, Size=8)] // using System.Runtime.InteropServices;
public struct AirportHeader {
[FieldOffset(0)]
[MarshalAs(UnmanagedType.I4)]
public int Ident; // a 4 bytes ASCII : "FIMP" { 0x46, 0x49, 0x4D, 0x50 }
[FieldOffset(4)]
[MarshalAs(UnmanagedType.I4)]
public int Offset;
}
What I want: direct access to both the string and the int value of the Ident field in this structure, without breaking the 8-byte size of the structure, and without having to compute the string value from the int value each time.
The Ident field as an int is interesting because I can quickly compare it against other idents to see if they match; those other idents may come from data unrelated to this structure, but stored in the same int format.
Question: is there a way to define a field that is not part of the structure layout? Like:
[StructLayout(LayoutKind.Explicit, Size=8)]
public struct AirportHeader {
[FieldOffset(0)]
[MarshalAs(UnmanagedType.I4)]
public int Ident; // a 4 bytes ASCII : "FIMP" { 0x46, 0x49, 0x4D, 0x50 }
[FieldOffset(4)]
[MarshalAs(UnmanagedType.I4)]
public int Offset;
[NoOffset()] // <- is there something I can do the like of this
string _identStr;
public string IdentStr {
get { // EDIT ! missed the getter on this property
if (string.IsNullOrEmpty(_identStr)) _identStr =
System.Text.Encoding.ASCII.GetString(Ident.GetBytes());
// do the above only once. May use an extra private bool field to go faster.
return _identStr;
}
}
}
PS: I use pointers ('*' and '&', unsafe) because I need to deal with endianness (local system, binary files/file formats, network) and need fast type conversions and fast array filling. I also use many flavours of Marshal methods (fixing structures on byte arrays), and a little P/Invoke and COM interop. Too bad some assemblies I'm dealing with don't have .NET counterparts yet.
TL;DR; For details only
The question is all this is about; I just don't know the answer. The following should answer most follow-up questions like "other approaches" or "why not do this instead", but can be skipped if the answer is straightforward. Anyway, I preemptively put everything here so it's clear from the start what I'm trying to do. :)
Options/workarounds I'm currently using (or thinking of using):
Create a getter (not a field) that computes the string value each time:
public string IdentStr {
get { return System.Text.Encoding.ASCII.GetString(Ident.GetBytes()); }
// where GetBytes() is an extension method that converts an int to byte[]
}
This approach, while doing the job, performs poorly: the GUI displays aircraft from a database of default flights and injects other flights from the network with a refresh rate of one second (I should increase that to 5 seconds). I have around 1200 flights within an area, relating to 2400 airports (departure and arrival), meaning 2400 calls to the above code every second just to display the ident in a DataGrid.
Create another struct (or class) whose only purpose is to manage the data on the GUI side, when not reading/writing to a stream or file. That means: read the data with the explicit-layout struct, create another struct with the string version of the field, and work with that on the GUI side. That would perform better overall, but in the process of defining structures for the game binaries I'm already at 143 structures of this kind (just for older versions of the game data; there are a bunch I haven't written yet, and I plan to add structures for the newest data types). At the moment, more than half of them need one or more extra fields to be of meaningful use. It would be okay if I were the only one using the assembly, but other users will probably get lost with AirportHeader, AirportHeaderEx, AirportEntry, AirportEntryEx, AirportCoords, AirportCoordsEx.... I would rather avoid that.
Optimize option 1 to make the computation faster (thanks to SO there are a bunch of ideas to look at; I'm currently working on this). For the Ident field, I guess I could use pointers (and I will). I'm already doing that for fields I must display in little endian but read/write in big endian. There are other values, like 4x4 grid information packed in a single Int64 (ulong), that need bit shifting to expose the actual values; same for GUIDs and object pitch/bank/yaw.
Try to take advantage of overlapping fields (under study). That would work for GUIDs, and perhaps for the Ident example if MarshalAs can constrain the value to an ASCII string; then I would just need to specify the same FieldOffset, 0 in this case. But I'm unsure whether setting the field value (entry.FieldStr = "FMEP";) actually applies the Marshal constraint on the managed code side; my understanding is that it will store the string as Unicode on the managed side (?). Furthermore, that wouldn't work for packed bits (bytes that contain several values, or consecutive bytes hosting values that have to be bit shifted). I believe it is impossible to specify value position, length and format at the bit level.
Why bother? Context:
I'm defining a bunch of structures to parse binary data from byte arrays (IO.File.ReadAllBytes) or streams, and write it back; the data relates to a game. Application logic should use the structures to quickly access and manipulate the data on demand. The assembly's expected capabilities are read, validate, edit, create and write, both outside the scope of the game (addon building, control) and inside it (API, live modding or monitoring). Another purpose is to understand the content of the binaries (hex) and use that understanding to build what's missing in the game.
The purpose of the assembly is to provide ready-to-use base components for a C# addon contributor (I don't plan to make the code portable), whether for creating applications for the game or for processing an addon from source to compiled game binaries. It's nice to have a class that loads the entire content of a file into memory, but some contexts require you not to do that, and to retrieve from the file only what is necessary, hence the choice of the struct pattern.
I need to figure out the trust and legal issues (copyrighted data), but that's outside the scope of the main concern. If it matters, Microsoft has over the years provided publicly and freely accessible SDKs exposing binary structures of previous versions of the game, for the purpose of what I'm doing (I'm not the first and probably not the last to do so). Still, I wouldn't dare expose undocumented binaries (for the latest game data, for instance), nor facilitate a copyright breach of copyrighted materials/binaries.
I'm just asking for confirmation of whether there is a way to have private fields that are not part of the structure layout. My naive belief at the moment is "that's impossible, but there are workarounds". It's just that my C# experience is pretty sparse, so maybe I'm wrong, which is why I ask. Thanks!
As suggested, there are several ways to get the job done. Here are the getters/setters I came up with within the structure; I'll measure how each performs in various scenarios later. The dictionary approach is very appealing, as in many scenarios I would need a directly accessible global database of (59,000) airports with runways and parking spots (not just the ident), but a fast check between struct fields is also interesting.
public string IdentStr_Marshal {
get {
var output = "";
GCHandle pinnedHandle = default(GCHandle); // initialized, otherwise CS0165 in the finally block (-> c# v5)
try { // Fast if no exception, (very) slow if exception thrown
pinnedHandle = GCHandle.Alloc(this, GCHandleType.Pinned);
IntPtr structPtr = pinnedHandle.AddrOfPinnedObject();
output = Marshal.PtrToStringAnsi(structPtr, 4);
// Cannot use UTF8 because the assembly should work in Framework v4.5
} finally { if (pinnedHandle.IsAllocated) pinnedHandle.Free(); }
return output;
}
set {
value = value.PadRight(4); // Must fill the blanks - PadRight returns a new string, so reassign (initial while loop replaced, Charlieface's)
IntPtr intValuePtr = IntPtr.Zero;
// Cannot use UTF8 because some users are on Win7 with FlightSim 2004
try { // Put a try as a matter of habit, but not convinced it's gonna throw.
intValuePtr = Marshal.StringToHGlobalAnsi(value);
Ident = Marshal.ReadInt32(intValuePtr, 0).BinaryConvertToUInt32(); // Extension method to convert type.
} finally { Marshal.FreeHGlobal(intValuePtr); } // freeing the right pointer
}
}
public unsafe string IdentStr_Pointer {
get {
string output = "";
fixed (UInt32* ident = &Ident) { // Fixing the field
sbyte* bytes = (sbyte*)ident;
output = new string(bytes, 0, 4, System.Text.Encoding.ASCII); // Encoding added (#Charlieface)
}
return output;
}
set {
// value must not exceed a length of 4 and must be in Ansi [A-Z,0-9,whitespace 0x20].
// value validation at this point occurs outside the structure.
fixed (UInt32* ident = &Ident) { // Fixing the field
byte* bytes = (byte*)ident;
byte[] asciiArr = System.Text.Encoding.ASCII.GetBytes(value);
if (asciiArr.Length >= 4) // (asciiArr.Length == 4) would also work
for (Int32 i = 0; i < 4; i++) bytes[i] = asciiArr[i];
else {
for (Int32 i = 0; i < asciiArr.Length; i++) bytes[i] = asciiArr[i];
for (Int32 i = asciiArr.Length; i < 4; i++) bytes[i] = 0x20;
}
}
}
}
static Dictionary<UInt32, string> ps_dict = new Dictionary<UInt32, string>();
public string IdentStr_StaticDict {
get {
string output; // logic update with TryGetValue (#Charlieface)
if (ps_dict.TryGetValue(Ident, out output)) return output;
output = System.Text.Encoding.ASCII.GetString(Ident.ToBytes(EndiannessType.LittleEndian));
ps_dict.Add(Ident, output);
return output;
}
set { // input can be "FMEE", "DME" or "DK". length of 2 characters is the minimum.
var bytes = new byte[4]; // Need to convert value to a 4 byte array
byte[] asciiArr = System.Text.Encoding.ASCII.GetBytes(value); // should be 4 bytes or less
// Put the valid ASCII codes in the array.
if (asciiArr.Length >= 4) // (asciiArr.Length == 4) would also work
for (Int32 i = 0; i < 4; i++) bytes[i] = asciiArr[i];
else {
for (Int32 i = 0; i < asciiArr.Length; i++) bytes[i] = asciiArr[i];
for (Int32 i = asciiArr.Length; i < 4; i++) bytes[i] = 0x20;
}
Ident = BitConverter.ToUInt32(bytes, 0); // Set structure int value
if (!ps_dict.ContainsKey(Ident)) // Add if missing
ps_dict.Add(Ident, System.Text.Encoding.ASCII.GetString(bytes));
}
}
As mentioned by others, it is not possible to exclude a field from a struct for marshalling.
You also cannot use a pointer as a string in most places.
If the number of different possible strings is relatively small (and it probably will be, given it's only 4 characters), then you could use a static Dictionary<int, string> as a kind of string-interning mechanism.
Then you write a property to add/retrieve the real string.
Note that dictionary access is O(1), and hashing an int just returns itself, so it will be very, very fast, but will take up some memory.
[StructLayout(LayoutKind.Explicit, Size=8)]
public struct AirportHeader
{
[FieldOffset(0)]
[MarshalAs(UnmanagedType.I4)]
public int Ident; // a 4 bytes ASCII : "FIMP" { 0x46, 0x49, 0x4D, 0x50 }
[FieldOffset(4)]
[MarshalAs(UnmanagedType.I4)]
public int Offset;
static Dictionary<int, string> _identStrings = new Dictionary<int, string>();
public string IdentStr =>
_identStrings.TryGetValue(Ident, out var ret) ? ret :
(_identStrings[Ident] = Encoding.ASCII.GetString(Ident.GetBytes()));
}
This is not possible, because a structure must contain all of its values in a specific order. Usually this order is controlled by the CLR itself; if you want to change the data layout, you can use StructLayout. However, you cannot exclude a field, or that data simply would not exist in memory.
Instead of a string (which is a reference type), you can use a pointer to the string data and use that in your structure in combination with StructLayout. To get the string value back, you can use a get-only property that reads directly from unmanaged memory.
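A minimal sketch of the get-only-property part of that suggestion, reading the characters straight out of the bytes of Ident (names and the ASCII assumption are illustrative; it is essentially what the IdentStr_Pointer getter above already does):
[StructLayout(LayoutKind.Explicit, Size = 8)] // using System.Runtime.InteropServices;
public struct AirportHeader
{
    [FieldOffset(0)] public int Ident;
    [FieldOffset(4)] public int Offset;
    // Not a field, so it adds nothing to the 8-byte layout; the string is decoded on demand.
    public unsafe string IdentStr
    {
        get
        {
            fixed (int* p = &Ident)
                return new string((sbyte*)p, 0, 4, System.Text.Encoding.ASCII);
        }
    }
}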
So I am new to C#, but one thing I already like, coming from other higher-level languages, is the ability to do bitwise operations much as in C. I have a bunch of functions where some or all parameters are optional, and I like switches, so I built a function that converts boolean arrays to unsigned shorts, which lets me basically mux a boolean array into a single value for the switch:
namespace firstAsp.Helpers{
public class argMux{
public static ushort ba2ushort (bool[] parms){
//initialize position and output
ushort result = 0;
int i = parms.Length-1;
foreach (bool b in parms){
if (b)//put a one in byte at position of b
//bitwise or with position
result |= (ushort)(1<<i);
i--;
}
return result;
}
}
}
Here is an example use case:
public IActionResult Cheese(string fname,string lname)
{
bool[] tf = {fname!=null,lname!=null};
switch(argMux.ba2ushort(tf)){
case 3:
ViewData["Data"]=$"Hello, {fname} {lname}";
break;
case 2:
ViewData["Data"]=$"Hello, {fname}";
break;
case 1:
ViewData["Data"]=$"Hello, Dr. {lname}";
break;
case 0:
ViewData["Data"]="Hello, Dr. CheeseBurger";
break;
}
return View();
}
My question is: is this an efficient way to do it, or is there a superior way? I'm aiming for simplicity of use, which this definitely delivers for me, but I would also like the code to be efficient and fast at runtime. Any pointers? Is this a stupid way to do it? Any and all feedback is welcome; you can even call me an idiot if you believe it, I'm not too sensitive. Thanks!
This is all bad.
The correct way to write your method is to use none of this:
public IActionResult Cheese(string firstName, string lastName)
{
ViewData["Data"]=$"Hello, {firstName ?? "Dr."} {lastName ?? "Cheeseburger"}";
return View();
}
one thing I already like coming from other higher level languages is the ability to do bitwise operations in (close to) C.
If you are twiddling bits to solve a high level business problem, you are probably doing something wrong. Solve high-level business problems with high-level business code.
Also, if you are using unsigned types in C#, odds are good you are doing something wrong. Unsigned types are there for interoperability with unmanaged code. It is very rare to use a ushort or a uint or a ulong for anything else in C#. Quantities which are logically unsigned, like the length of an array, are always represented as signed quantities.
C# has many features which are designed to ensure that people who have COM libraries can continue to use their libraries, and so that people who need the raw, unchecked performance of pointer arithmetic can do so when it is reasonably safe. Don't mistake the existence of those low-level programming features as evidence that C# is typically used as a low-level programming language. Write your code so that it reads clearly as an implementation of a business workflow.
The business of your code is rendering a name as a string, so it should clearly read as rendering a name as a string. If I asked you to write down a name on a piece of paper, the first thing you'd do would not be to make a bit array to help you, so it shouldn't be here either.
Now, there might be some cases where this sort of thing is sensible, and in those cases you should use an enum rather than treating a short as a bit field:
[Flags]
enum Permissions
{
None = 0x00,
Read = 0x01,
Write = 0x02,
ReadWrite = 0x03,
Delete = 0x04,
ReadDelete = 0x05,
WriteDelete = 0x06,
ReadWriteDelete = 0x07
}
...
static Permissions GetPermission(bool read, bool write, bool delete) {
var p1 = read ? Permissions.Read : Permissions.None;
var p2 = write ? Permissions.Write : Permissions.None;
var p3 = delete ? Permissions.Delete : Permissions.None;
return p1 | p2 | p3;
}
And now you have a convenient
switch(p)
{
case Permissions.None: ...
case Permissions.Read: ...
case Permissions.Write: ...
case Permissions.ReadWrite: ...
But notice here that we keep everything in the business domain. What are we doing? Checking permissions. So what does the code look like? Like it is checking permissions. Not twiddling a bunch of bits and then switching on an integer.
I know that strings are immutable and any changes to a string simply creates a new string in memory (and marks the old one as free). However, I'm wondering if my logic below is sound in that you actually can, in a round-a-bout fashion, modify the contents of a string.
const string baseString = "The quick brown fox jumps over the lazy dog!";
//initialize a new string
string candidateString = new string('\0', baseString.Length);
//Pin the string
GCHandle gcHandle = GCHandle.Alloc(candidateString, GCHandleType.Pinned);
//Copy the contents of the base string to the candidate string
unsafe
{
char* cCandidateString = (char*) gcHandle.AddrOfPinnedObject();
for (int i = 0; i < baseString.Length; i++)
{
cCandidateString[i] = baseString[i];
}
}
Does this approach indeed change the contents of candidateString (without creating a new candidateString in memory), or does the runtime see through my tricks and treat it as a normal string?
Your example works just fine, thanks to several elements:
candidateString lives in the managed heap, so it's safe to modify. Compare this with baseString, which is interned. If you try to modify the interned string, unexpected things may happen. There's no guarantee that string won't live in write-protected memory at some point, although it seems to work today. That would be pretty similar to assigning a constant string to a char* variable in C and then modifying it. In C, that's undefined behavior.
You preallocate enough space in candidateString - so you're not overflowing the buffer.
Character data is not stored at offset 0 of the String class. It's stored at an offset equal to RuntimeHelpers.OffsetToStringData.
public static int OffsetToStringData
{
// This offset is baked in by string indexer intrinsic, so there is no harm
// in getting it baked in here as well.
[System.Runtime.Versioning.NonVersionable]
get {
// Number of bytes from the address pointed to by a reference to
// a String to the first 16-bit character in the String. Skip
// over the MethodTable pointer, & String
// length. Of course, the String reference points to the memory
// after the sync block, so don't count that.
// This property allows C#'s fixed statement to work on Strings.
// On 64 bit platforms, this should be 12 (8+4) and on 32 bit 8 (4+4).
#if WIN32
return 8;
#else
return 12;
#endif // WIN32
}
}
Except...
GCHandle.AddrOfPinnedObject is special cased for two types: string and array types. Instead of returning the address of the object itself, it lies and returns the offset to the data. See the source code in CoreCLR.
// Get the address of a pinned object referenced by the supplied pinned
// handle. This routine assumes the handle is pinned and does not check.
FCIMPL1(LPVOID, MarshalNative::GCHandleInternalAddrOfPinnedObject, OBJECTHANDLE handle)
{
FCALL_CONTRACT;
LPVOID p;
OBJECTREF objRef = ObjectFromHandle(handle);
if (objRef == NULL)
{
p = NULL;
}
else
{
// Get the interior pointer for the supported pinned types.
if (objRef->GetMethodTable() == g_pStringClass)
p = ((*(StringObject **)&objRef))->GetBuffer();
else if (objRef->GetMethodTable()->IsArray())
p = (*((ArrayBase**)&objRef))->GetDataPtr();
else
p = objRef->GetData();
}
return p;
}
FCIMPLEND
In summary, the runtime lets you play with its data and doesn't complain. You're using unsafe code after all. I've seen worse runtime messing than that, including creating reference types on the stack ;-)
Just remember to add one additional \0 after all the characters (at offset Length) if your final string is shorter than what's allocated. This won't overflow; each string has an implicit null character at the end to ease interop scenarios.
Now take a look at how StringBuilder creates a string, here's StringBuilder.ToString:
[System.Security.SecuritySafeCritical] // auto-generated
public override String ToString() {
Contract.Ensures(Contract.Result<String>() != null);
VerifyClassInvariant();
if (Length == 0)
return String.Empty;
string ret = string.FastAllocateString(Length);
StringBuilder chunk = this;
unsafe {
fixed (char* destinationPtr = ret)
{
do
{
if (chunk.m_ChunkLength > 0)
{
// Copy these into local variables so that they are stable even in the presence of race conditions
char[] sourceArray = chunk.m_ChunkChars;
int chunkOffset = chunk.m_ChunkOffset;
int chunkLength = chunk.m_ChunkLength;
// Check that we will not overrun our boundaries.
if ((uint)(chunkLength + chunkOffset) <= ret.Length && (uint)chunkLength <= (uint)sourceArray.Length)
{
fixed (char* sourcePtr = sourceArray)
string.wstrcpy(destinationPtr + chunkOffset, sourcePtr, chunkLength);
}
else
{
throw new ArgumentOutOfRangeException("chunkLength", Environment.GetResourceString("ArgumentOutOfRange_Index"));
}
}
chunk = chunk.m_ChunkPrevious;
} while (chunk != null);
}
}
return ret;
}
Yes, it uses unsafe code, and yes, you can optimize yours by using fixed, as this type of pinning is much more lightweight than allocating a GC handle:
const string baseString = "The quick brown fox jumps over the lazy dog!";
//initialize a new string
string candidateString = new string('\0', baseString.Length);
//Copy the contents of the base string to the candidate string
unsafe
{
fixed (char* cCandidateString = candidateString)
{
for (int i = 0; i < baseString.Length; i++)
cCandidateString[i] = baseString[i];
}
}
When you use fixed, the GC only discovers an object needs to be pinned when it stumbles upon it during a collection. If there's no collection going on, the GC isn't even involved. When you use GCHandle, a handle is registered in the GC each time.
As others have pointed out, mutating the String objects is useful in some rare cases. I give an example with a useful code snippet below.
Use-case/background
Although everyone should be a huge fan of the really excellent character Encoding support that .NET has always offered, sometimes it might be preferable to cut down that overhead, especially if doing a lot of roundtripping between 8-bit (legacy) characters and managed strings (i.e. typically interop scenarios).
As I hinted, .NET is particularly emphatic that you must explicitly specify a text Encoding for any/all conversions of non-Unicode character data to/from managed String objects. This rigorous control at the periphery is really commendable, since it ensures that once you have the string inside the managed runtime you never have to worry; everything is just Unicode (technically, UCS-2).
For contrast, consider a certain other popular scripting language that famously botched this whole area, resulting in an ongoing saga of parallel 2.x and 3.x versions, all due to adding support for Unicode in the latter.
By enforcing Unicode once you're inside, .NET pushes all that mess to the interop boundary where it is done "once-and-for-all", but this philosophy entails that that encoding/decoding work is eager and exhaustive, as opposed to "lazy" and more under your program's control. Because of this, the .NET Encoding/Encoder classes can be a performance bottleneck. If you're moving lots of text from wide (Unicode) to simple fixed 7- or 8-bit narrow ANSI, ASCII, etc. (note I'm not talking about MBCS or UTF-8, where you'll want to use the Encoders!), the .NET encoding paradigm might seem like overkill.
Furthermore, it could be the case that you don't know, or don't care to, specify an Encoding. Maybe all you care about is fast and accurate round-tripping for that low-byte of a 16-bit Char. If you look at the .NET source code, even the System.Text.ASCIIEncoding might be too bulky in some situations.
The code snippet...
Thin String: 8-bit characters directly stored in a managed
String, one 'thin char' per wide Unicode character, without
bothering with character encoding/decoding during round-tripping.
All of these methods just ignore/strip the upper byte of each 16-bit Unicode character, transmitting only each low byte exactly as-is. Obviously, successful recovery of the Unicode text after a round-trip will only be possible if those upper bits aren't relevant.
/// <summary> Convert byte array to "thin string" </summary>
public static unsafe String ToThinString(this byte[] src)
{
int c;
var ret = String.Empty;
if ((c = src.Length) > 0)
fixed (char* dst = (ret = new String('\0', c)))
do
dst[--c] = (char)src[c]; // fill new String by in-situ mutation
while (c > 0);
return ret;
}
In the direction just shown, which is typically bringing native data in to managed, you often don't have the managed byte array, so rather than allocate a temporary one just for the purpose of calling this function, you can process the raw native bytes directly into a managed string. As before, this bypasses all character encoding.
The (obvious) range checks that would be needed in this unsafe function are elided for clarity:
public static unsafe String ToThinString(byte* pSrc, int c)
{
var ret = String.Empty;
if (c > 0)
fixed (char* dst = (ret = new String('\0', c)))
do
dst[--c] = (char)pSrc[c]; // fill new String by in-situ mutation
while (c > 0);
return ret;
}
The advantage of String mutation here is that you avoid temporary allocations by writing directly to the final allocation. Even if you were to avoid the extra allocation by using stackalloc, there would be an unnecessary re-copying of the whole thing when you eventually call the String(Char*, int, int) constructor: clearly there's no way to associate data you just laboriously prepared with a String object that didn't exist until you were finished!
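For comparison, here is a rough sketch of the non-mutating route alluded to above: stage the characters with stackalloc, then let the String(char*, int, int) constructor perform that second copy (the method name is made up):
// Sketch of the alternative without string mutation: correct, but one full copy more
// than ToThinString, because the String(char*, int, int) constructor copies the buffer.
public static unsafe String ToThinStringCopying(byte[] src)
{
    int c = src.Length;
    char* buf = stackalloc char[c > 0 ? c : 1]; // fine for small inputs; use an array for large ones
    for (int i = 0; i < c; i++)
        buf[i] = (char)src[i]; // same low-byte widening as ToThinString
    return new String(buf, 0, c); // the extra copy happens here
}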
For completeness...
Here's the mirror-code which reverses operation to get back a byte array (even though this direction doesn't happen to illustrate the string-mutation technique). This is the direction you'd typically use to send Unicode text out of the managed .NET runtime, for use by a legacy app.
/// <summary> Convert "thin string" to byte array </summary>
public static unsafe byte[] ToByteArr(this String src)
{
int c;
byte[] ret = null;
if ((c = src.Length) > 0)
fixed (byte* dst = (ret = new byte[c]))
do
dst[--c] = (byte)src[c];
while (c > 0);
return ret ?? new byte[0];
}
I just tested changing a string literal, and the results are very scary:
var f1 = "paul";
var f2 = "paul";
fixed (char* bla = f1)
{
bla[0] = 'f';
}
This changes both f1 and f2 to "faul".
So I would never recommend messing with the immutability of strings unless you just created the string instance yourself and know exactly what you're doing.
I've done a few years of c# now, and I'm trying to learn some new stuff. So I decided to have a look at c++, to get to know programming in a different way.
I've been doing loads of reading, but I just started writing some code today.
On my Windows 7/64 bit machine, running VS2010, I created two projects:
1) A c# project that lets me write things the way I'm used to.
2) A c++ "makefile" project that let's me play around, trying to implement the same thing. From what I understand, this ISN'T a .NET project.
I got to trying to populate a dictionary with 10K values. For some reason, the c++ is orders of magnitude slower.
Here's the c# below. Note I put in a function after the time measurement to ensure it wasn't "optimized" away by the compiler:
var freq = System.Diagnostics.Stopwatch.Frequency;
int i;
Dictionary<int, int> dict = new Dictionary<int, int>();
var clock = System.Diagnostics.Stopwatch.StartNew();
for (i = 0; i < 10000; i++)
dict[i] = i;
clock.Stop();
Console.WriteLine(clock.ElapsedTicks / (decimal)freq * 1000M);
Console.WriteLine(dict.Average(x=>x.Value));
Console.ReadKey(); //Don't want results to vanish off screen
Here's the c++, not much thought has gone into it (trying to learn, right?)
int input;
LARGE_INTEGER frequency; // ticks per second
LARGE_INTEGER t1, t2; // ticks
double elapsedTime;
// get ticks per second
QueryPerformanceFrequency(&frequency);
int i;
boost::unordered_map<int, int> dict;
// start timer
QueryPerformanceCounter(&t1);
for (i=0;i<10000;i++)
dict[i]=i;
// stop timer
QueryPerformanceCounter(&t2);
// compute and print the elapsed time in millisec
elapsedTime = (t2.QuadPart - t1.QuadPart) * 1000.0 / frequency.QuadPart;
cout << elapsedTime << " ms insert time\n";
int input;
cin >> input; //don't want console to disappear
Now, some caveats. I managed to find this related SO question. One of the guys wrote a long answer mentioning WOW64 skewing the results. I've set the project to release and gone through the "properties" tab of the c++ project, enabling everything that sounded like it would make it fast. Changed the platform to x64, though I'm not sure whether that addresses his wow64 issue. I'm not that experienced with the compiler options, perhaps you guys have more of a clue?
Oh, and the results: c#:0.32ms c++: 8.26ms. This is a bit strange. Have I misinterpreted something about what .Quad means? I copied the c++ timer code from someplace on the web, going through all the boost installation and include/libfile rigmarole. Or perhaps I am actually using different instruments unwittingly? Or there's some critical compile option that I haven't used? Or maybe the c# code is optimized because the average is a constant?
Here's the c++ command line, from the Property page->C/C++->Command Line:
/I"C:\Users\Carlos\Desktop\boost_1_47_0" /Zi /nologo /W3 /WX- /MP /Ox /Oi /Ot /GL /D "_MBCS" /Gm- /EHsc /GS- /Gy- /arch:SSE2 /fp:fast /Zc:wchar_t /Zc:forScope /Fp"x64\Release\MakeTest.pch" /Fa"x64\Release\" /Fo"x64\Release\" /Fd"x64\Release\vc100.pdb" /Gd /errorReport:queue
Any help would be appreciated, thanks.
A simple allocator change will cut that time down a lot.
boost::unordered_map<int, int, boost::hash<int>, std::equal_to<int>, boost::fast_pool_allocator<std::pair<const int, int>>> dict;
0.9ms on my system (from 10ms before). This suggests to me that actually, the vast, vast majority of your time is not spent in the hash table at all, but in the allocator. The reason this is an unfair comparison is that your GC will never collect in such a trivial program, giving it an undue performance advantage, and native allocators do significant caching of free memory, but that will never come into play in such a trivial example, because you've never allocated or deallocated anything and so there's nothing to cache.
Finally, the Boost pool implementation is thread-safe, whereas you never play with threads so the GC can just fall back to a single-threaded implementation, which will be much faster.
I resorted to a hand-rolled, non-freeing non-thread-safe pool allocator and got down to 0.525ms for C++ to 0.45ms for C# (on my machine). Conclusion: Your original results were very skewed because of the different memory allocation schemes of the two languages, and once that was resolved, then the difference becomes relatively minimal.
A custom hasher (as described in Alexandre's answer) dropped my C++ time to 0.34ms, which is now faster than C#.
static const int MaxMemorySize = 800000;
static int FreedMemory = 0;
static int AllocatorCalls = 0;
static int DeallocatorCalls = 0;
template <typename T>
class LocalAllocator
{
public:
std::vector<char>* memory;
int* CurrentUsed;
typedef T value_type;
typedef value_type * pointer;
typedef const value_type * const_pointer;
typedef value_type & reference;
typedef const value_type & const_reference;
typedef std::size_t size_type;
typedef std::size_t difference_type;
template <typename U> struct rebind { typedef LocalAllocator<U> other; };
template <typename U>
LocalAllocator(const LocalAllocator<U>& other) {
CurrentUsed = other.CurrentUsed;
memory = other.memory;
}
LocalAllocator(std::vector<char>* ptr, int* used) {
CurrentUsed = used;
memory = ptr;
}
template<typename U> LocalAllocator(LocalAllocator<U>&& other) {
CurrentUsed = other.CurrentUsed;
memory = other.memory;
}
pointer address(reference r) { return &r; }
const_pointer address(const_reference s) { return &s; }
size_type max_size() const { return MaxMemorySize; }
void construct(pointer ptr, value_type&& t) { new (ptr) T(std::move(t)); }
void construct(pointer ptr, const value_type & t) { new (ptr) T(t); }
void destroy(pointer ptr) { static_cast<T*>(ptr)->~T(); }
bool operator==(const LocalAllocator& other) const { return memory == other.memory; }
bool operator!=(const LocalAllocator&) const { return false; }
pointer allocate(size_type count) {
AllocatorCalls++;
if (*CurrentUsed + (count * sizeof(T)) > MaxMemorySize)
throw std::bad_alloc();
if (*CurrentUsed % std::alignment_of<T>::value) {
*CurrentUsed += (std::alignment_of<T>::value - *CurrentUsed % std::alignment_of<T>::value);
}
auto val = &((*memory)[*CurrentUsed]);
*CurrentUsed += (count * sizeof(T));
return reinterpret_cast<pointer>(val);
}
void deallocate(pointer ptr, size_type n) {
DeallocatorCalls++;
FreedMemory += (n * sizeof(T));
}
pointer allocate() {
return allocate(1);
}
void deallocate(pointer ptr) {
return deallocate(ptr, 1);
}
};
int main() {
LARGE_INTEGER frequency; // ticks per second
LARGE_INTEGER t1, t2; // ticks
double elapsedTime;
// get ticks per second
QueryPerformanceFrequency(&frequency);
std::vector<char> memory;
int CurrentUsed = 0;
memory.resize(MaxMemorySize);
struct custom_hash {
size_t operator()(int x) const { return x; }
};
boost::unordered_map<int, int, custom_hash, std::equal_to<int>, LocalAllocator<std::pair<const int, int>>> dict(
std::unordered_map<int, int>().bucket_count(),
custom_hash(),
std::equal_to<int>(),
LocalAllocator<std::pair<const int, int>>(&memory, &CurrentUsed)
);
// start timer
std::string str;
QueryPerformanceCounter(&t1);
for (int i=0;i<10000;i++)
dict[i]=i;
// stop timer
QueryPerformanceCounter(&t2);
// compute and print the elapsed time in millisec
elapsedTime = ((t2.QuadPart - t1.QuadPart) * 1000.0) / frequency.QuadPart;
std::cout << elapsedTime << " ms insert time\n";
int input;
std::cin >> input; //don't want console to disappear
}
Storing a consecutive sequence of numeric integral keys added in ascending order is definitely NOT what hash tables are optimized for.
Use an array, or else generate random values.
And do some retrievals. Hash tables are highly optimized for retrieval.
You can try dict.rehash(n) with different (large) values of n before inserting elements, and see how this impacts performance. Memory allocations (they take place when the container fills buckets) are generally more expensive in C++ than in C#, and rehashing is also heavy. For std::vector and std::deque, the analog member function is reserve.
Different rehash policies and load factor threshold (have a look at the max_load_factor member function) will also greatly impact unordered_map's performance.
Next, since you're using VS2010, I suggest you use std::unordered_map from the <unordered_map> header. Don't use boost when you can use the standard library.
The actual hash function used may greatly impact performance. You may try with the following:
struct custom_hash { size_t operator()(int x) const { return x; } };
and use std::unordered_map<int, int, custom_hash>.
Finally, I agree that this is a poor usage of hash tables. Use random values for insertion, you'll get a more precise picture of what is going on. Testing insertion speeds of hash tables isn't stupid at all, but hash tables are not meant to store consecutive integers. Use a vector for this.
Visual Studio's TR1 unordered_map is the same as stdext::hash_map.
There is another thread asking why it performs slowly; see my answer there, with links to others who have discovered the same issue. The conclusion is to use another hash_map implementation when in C++:
Alternative to stdext::hash_map for performance reasons
Btw. remember that in C++ there is a big difference between an optimized Release build and a non-optimized Debug build, compared to C#.
I have a problem using a class made of structures.
Here's the basic definition:
using System;
struct Real
{
public double real;
public Real(double real)
{
this.real = real;
}
}
class Record
{
public Real r;
public Record(double r)
{
this.r = new Real(r);
}
public void Test(double origval, double newval)
{
if (this.r.real == newval)
Console.WriteLine("r = newval-test passed\n");
else if (this.r.real == origval)
Console.WriteLine("r = origval-test failed\n");
else
Console.WriteLine("r = neither-test failed\n");
}
}
When I create a non-dynamic (static?) Record, setting the Real works.
When I create a dynamic Record, setting the real doesn't work.
When I create a dynamic Record, replacing the real works.
And here's the test program
class Program
{
static void Main(string[] args)
{
double origval = 8.0;
double newval = 5.0;
// THIS WORKS - create fixed type Record, print, change value, print
Record record1 = new Record(origval);
record1.r.real = newval; // change value ***
record1.Test(origval, newval);
// THIS DOESN'T WORK. change value is not making any change!
dynamic dynrecord2 = new Record(origval);
dynrecord2.r.real = newval; // change value
dynrecord2.Test(origval, newval);
// THIS WORKS - create dynamic type Record, print, change value, print
dynamic dynrecord3 = new Record(origval);
dynamic r = dynrecord3.r; // copy out value
r.real = newval; // change copy
dynrecord3.r = r; // copy in modified value
dynrecord3.Test(origval, newval);
}
}
And here's the output:
r = newval-test passed
r = origval-test failed
r = newval-test passed
When I change the struct Real to class Real, all three cases work.
So what's going on?
Thanks,
Max
dynamic is really a fancy word for object as far as the core CLI is concerned, so you are mutating a boxed copy. This is prone to craziness. Mutating a struct in the first place is really, really prone to error. I would simply make the struct immutable - otherwise you are going to get this over and over.
I dug a little deeper into this problem. Here's an answer from Mads Torgersen of Microsoft.
From Mads:
This is a little unfortunate but by design. In
dynrecord2.r.real = newval; // change value
The value of dynrecord2.r gets boxed, which means copied into its own heap object. That copy is the one getting modified, not the original that you subsequently test.
This is a consequence of the very “local” way in which C# dynamic works. Think about a statement like the above – there are two fundamental ways that we could attack that:
1) Realize at compile time that something dynamic is going on, and essentially move the whole statement to be bound at runtime
2) Bind individual operations at runtime when their constituents are dynamic, returning something dynamic that may in turn cause things to be bound at runtime
In C# we went with the latter, which is nicely compositional, and makes it easy to describe dynamic in terms of the type system, but has some drawbacks – such as boxing of resulting value types for instance.
So what you are seeing is a result of this design choice.
I took another look at the MSIL. It essentially takes
dynrecord2.r.real = newval;
and turns it into:
Real temp = dynrecord2.r;
temp.real = newval;
If dynrecord2.r is a class, it just copies the handle so the change affects the internal field. If dynrecord2.r is a struct, a copy is made, and the change doesn't affect the original.
I'll leave it up to the reader to decide if this is a bug or a feature.
Max
Make your struct immutable and you won't have problems.
struct Real
{
private double real;
public double Real{get{return real;}}
public Real(double real)
{
this.real = real;
}
}
Mutable structs can be useful in native interop or some high-performance scenarios, but then you'd better know what you're doing.
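A hypothetical usage sketch of the immutable version (it assumes Record.Test is updated to read r.Real instead of r.real): with no way to mutate the struct in place, the only option is to assign a whole new value, which behaves the same whether the variable is typed Record or dynamic.
dynamic dynrecord = new Record(8.0);
dynrecord.r = new Real(5.0); // replaces the entire struct; nothing mutates a hidden boxed copy
dynrecord.Test(8.0, 5.0);    // prints "r = newval-test passed"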