Passing an array pointer for placing in a hashmap in C++

Passing an array pointer for placing in a hashmap in C++ - c#

I'm kinda new to C++ (coming from C#).
I'd like to pass an array of data to a function (as a pointer).
void someFunc(byte *data)
{
// add this data to a hashmap.
Hashtable.put(key, data)
}
This data will be added into a hashmap (some key-value based object).
In C#, i could just add the passed reference to a dictionary and be done with it.
Can the same be done in C++ ? or must i create a COPY of the data, and only add that to the data structure for storing it ?
I have seen this pattern in some code examples, but i am not 100% sure why it is needed, or whether it can be avoided at certain times.

Not sure where your key is coming from... but the std::map and the std::unordered_map are probably what you are looking for.
Now the underlying data structure of the std::map is a binary tree, while the std::unordered_map is a hash.
Furthermore, the std::unordered_map is an addition in the C++ 11 standard.

It all depends how you pass the data and how it is created. If the data is created on the heap(by using new) you can just put the pointer or reference you have to it in your table. On the other hand if the function takes the argument by value you will need to make a copy at some point, because if you store the address of a temp bad things will happen :).
As for what data structure to use, and how they work, I've found one of the best references is cppreference
Heap allocation should be reserved for special cases. stack allocation is faster and easier to manage, you should read up on RAII(very important). As for other reading try finding out stuff on dynamic vs. automatic memory allocation.
Just found this read specifically saying C# to C++ figured it'd be perfect for you, good luck c++ can be one of the more difficult languages to learn so don't assume anything will work the same as it does in C#. MSDN has a nice C# vs. C++ thing yet

Related

Marshal C# struct into a byte[]

Is there a way to serialize a C# structure, annotated with [StructLayout], into a managed byte array, i. e. a byte[], either premade or with allocation?
I can see marshaling to unmanaged memory then copying, but that's ugly.

Checkout MemoryMarshal.Cast<TFrom, TTo>(). It will easily allow you to change from byte/short/int/long arrays to Structures and back.
https://learn.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.memorymarshal?view=netcore-3.1

From my experiences mixing managed and unmanaged data is all about clearly defining the transition from one space to another.
When i have had the requirement of going from native to managed or the other way the first step has always been to copy data to the 'target' space and then forward that.
I assume that you are already familiar with the interop services since you mentioned copying and [StructLayout].
https://learn.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.marshal.ptrtostructure?view=netcore-3.1#System_Runtime_InteropServices_Marshal_PtrToStructure_System_IntPtr_System_Object_
If you find a better way please do tell

How to share a big byte array between C++ and C#

I need to share a huge (many megabytes) byte array between a C++ program residing in a DLL and a C# program.
I need realtime performance, so it is very important I can share it between the two in an efficient way, so making a new copy for it in C# every time after the data is manipulated in C++ is not an option, yet the examples I have found so far seems to depend on this.
Is it possible to share the array in an efficient way? And if so, how?

In current versions of .NET, any multi-megabyte array will end up on the large object heap and never move. However, to be safe, you should pin the array as fejesjoco said. Then the C++ code can save a pointer into the .NET array and update it in-place.

Use memory mapped file. System.IO.MemoryMappedFiles in .NET and CreateFileMapping in C++.

Does the .NET marshaler pass a copy, not a reference? If so, then call GCHandle.Alloc(array, GCHandleType.Pinned), then you can get the address of this pinned object and pass that to the DLL as a pointer.

In vb.net, I've accomplished this by passing a reference to the first byte. There's no type-checking to ensure that what's passed is really an array of the proper size, but as long as the routine is running the array will be pinned.

Definition of C# data structures and algorithms

This may be a silly question (with MSDN and all), but maybe some of you will be able to help me sift through amazing amounts of information.
I need to know the specifics of the implementations of common data structures and algorithms in C#. That is, for example, I need to know, say, how Linked Lists are handled and represented, how they and their methods are defined.
Is there a good centralized source of documentation for this (with code), or should I just reconstruct it? Have you ever had to know the specifics of these things to decide what to use?
Regards, and thanks.

Scott Mitchell has a great 6-part article that covers many .NET data structures:
An Extensive Examination of Data Structures
For an algorithmic overview of data structures, I suggest reading the algorithm textbook: "Introduction to Algorithms" by Cormen, et al..
For details on each .NET data structure the MSDN page on that specific class is good.
When all of them fail to address issues, Reflector is always there. You can use it to dig through the actual source and see things for yourself.

If you really want to learn it, try making your own.
Googling for linked lists will give you a lot of hits and sample code to go off of. Wikipedia will also be a good resource.

Depends on the language. Most languages have the very basics now pre-built with them, but that doesn't mean their implementations are the same. The same named object--LinkedList in C# is completely different than the LinkedList in Java or C++. Even the String library is different. C# for instance is known to create a new String object every time you assign a string a new value...this becomes something you learn quickly when it brings your program to a crashing halt when you're working with substrings in C# for the first time.
So the answer to your question is massively complicated because I don't know quite what you're after. If you're just going to be teaching a class what a generic version of these algorithms and data structures are, you can present them without getting into the problems I mentioned above. You'll just need to select, lookup, read about a particular type of implementation of them. Like for LinkedList you need to be able to instantiate the list, destroy the list, copy the list, add to the list somewhere (usually front/back), remove from the list, etc. You could get fancy and add as many methods as you want.

Pointers in C# and how frequently it is used in the application?

For me , the Pointer was one of the hardest concept in programming languages in C++. When I was learning C++, I spent tremendous amount of time learning it. However, Now I primarily work in projects that are entirely written in languages like C#, and VB.NET etc. As a matter fact, I have NOT touched C++ for almost 4 years. Even though, C# has pointer, but I have not encouter the situation where I must use pointer in C#. So my question is , what kinds of productivity can we obtain in C# by using pointer ? what are the situation where the uses of the pointer is must?

You're already using lots of pointers in C#, except that they don't look like pointers. Every time you do something with an instance of a class, that's a pointer. You're getting almost all the potential benefit already, without the hassle.
It is possible to use pointers more explicitly in C#, which is what most people mean by C# pointers, but I would think the occasions would be very rare. They may be useful to link to C libraries and the like, but other than that I don't see much use for them.

Personally, I've never had a need for using pointers in .NET, but if you're dealing with absolute performance critical code, you'd use pointers. If you look at the System.String class, you'll see that a lot of the methods that handle the string manipulation, use pointers. Also, when dealing with image processing, very often it's useful to use pointers. Now, one can definitely argue whether those sort of applications should be written in .NET in the first place (I think they should), but at least if you need to squeeze out that extra bit of speed, you can.

I use pointers in C# only in rare circumstances that mostly have to do with sending/receiving data, where you have to convert a byte array to a struct and vice-versa. Though even then, you don't have to deal with pointers directly typically.
In some cases, you can use pointers to improve performance, because with the Marshaller, sometimes you have to copy memory to access data, while with pointers, you can access it directly (think Bitmap.Lock()).

Personally, I've never needed to use a pointer in C#. If I need that kind of functionality, I write that code in C++/CLI, and call it from C#. If I need to pass pointers from C# to C++/CLI or vice-versa, I pass them around as an IntPtr and cast it to the type I need in C++/CLI.
In my opinion - if you're using pointers in C#, in 99% of cases, you're using the language wrong.
Edit:
The nice thing about C++/CLI is that you can mark individual classes for native-only compilation. I do a lot of image processing work which needs to happen very quickly; it uses a lot of pointer-based code. I generally have a managed C++/CLI object forward calls to a native C++ object where my processing takes place. I turn on optimizations for that native code and viola, I get a nice performance gain.
Granted, this only matters if the performance gain you get by executing native, optimized code can offset the overhead of managed to unmanaged transitions. In my case, it always does.

Judy array for managed languages

Judy array is fast data structure that may represent a sparse array or a set of values. Is there its implementation for managed languages such as C#? Thanks

It's worth noting that these are often called Judy Trees or Judy Tries if you are googling for them.
I also looked for a .Net implementation but found nothing.
Also worth noting that:
The implementation is heavily designed around efficient cache usage, as such implementation specifics may be highly dependent on the size of certain constructs used within the sub structures. A .Net managed implementation may be somewhat different in this regard.
There are some significant hurdles to it that I can see (and there are probably more that my brief scan missed)
The API has some fairly anti OO aspects (for example a null pointer is viewed as an empty tree) so simplistic, move the state pointer to the LHS and make functions instance methods conversion to C++ wouldn't work.
The implementation of the sub structures I looked at made heavy use of pointers. I cannot see these efficiently being translated to references in managed languages.
The implementation is a distillation of a lot of very complex ideas that belies the simplicity of the public api.
The code base is about 20K lines (most of it complex), this doesn't strike me as an easy port.
You could take the library and wrap the C code in C++/CLI (probably simply holding internally a pointer that is the c api trie and having all the c calls point to this one). This would provide a simplistic implementation but the linked libraries for the native implementation may be problematic (as might memory allocation).
You would also probably need to deal with converting .Net strings to plain old byte* on the transition as well (or just work with bytes directly)

Judy really doesn't fit well with managed languages. I don't think you'll be able to use something like SWIG and get the first layer done automatically.
I wrote PyJudy and I ended up having to make some non-trivial API changes to fit well in Python. For example, I wrote in the documentation:
JudyL arrays map machine words to
machine words. In practice the words
store unsigned integers or pointers.
PyJudy supports all four mappings as
distinct classes.
pyjudy.JudyLIntInt - map unsigned
integer keys to unsigned integer
values
pyjudy.JudyLIntObj - map unsigned
integer keys to Python object values
pyjudy.JudyLObjInt - map Python
object keys to unsigned integer
values
pyjudy.JudyLObjObj - map Python
object keys to Python object values
I haven't looked at the code for a few years so my memories about it are pretty hazy. It was my first Python extension library, and I remember I hacked together a sort of template system for code generation. Nowadays I would use something like genshi.
I can't point to alternatives to Judy - that's one reason why I'm searching Stackoverflow.
Edit: I've been told that my timing numbers in the documentation are off from what Judy's documentation suggests because Judy is developed for 64-bit cache lines and my PowerBook was only 32 bits.
Some other links:
Patricia tries (http://www.csse.monash.edu.au/~lloyd/tildeAlgDS/Tree/PATRICIA/ )
Double-Array tries (http://linux.thai.net/~thep/datrie/datrie.html)
HAT-trie (http://members.optusnet.com.au/~askitisn/index.html)
The last has comparison numbers for different high-performance trie implementations.

This is proving trickier than I thought. PyJudy might be worth a look, as would be Tie::Judy. There's something on Softpedia, and something Ruby-ish. Trouble is, none of these are .NET specifically.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.