Returning a string from a Delphi DLL to C#

I am trying to separate an encryption function from our legacy code to a dll which I can call from C#, but I am having issues getting it to work and I keep getting access violations when calling the dll.
I am not sure where the AV happens because delphi has a hard time hitting my breakpoints when the dll is attached to another process.
I got it to work yesterday using David Heffernan's answer here: Returning a string from delphi dll to C# caller in 64 bit
But my success was short-lived: I changed the string parameters to regular Delphi strings, saw it didn't work, and changed them back to AnsiString (our encryption routine expects Ansi). Since I changed those parameter types, I have not been able to get it to work again.
Here is my Delphi Code:
procedure Encrypt(const Source: AnsiString; const Key: AnsiString; var OutPut: PAnsiChar; const OutputLength: Integer);
var
  EncryptedString, EncodedString: AnsiString;
begin
  EncryptedString := Crypt(Source, Key);
  EncodedString := Encode(EncryptedString);
  if Length(EncodedString) <= OutputLength then
    System.AnsiStrings.StrPCopy(Output, EncodedString);
end;

exports
  Encrypt;
My C# caller:
[DllImport("AsmEncrypt.dll", CharSet = CharSet.Ansi)]
public static extern void Encrypt(string password, string key, StringBuilder output, int outputlength);
// using like this:
Encrypt(credentials.Password, myKey, str, str.Capacity);
My best bet right now is that I've goofed some of the arguments to the DLL, since it seems to crash before it reaches an OutputDebugStr() I had put on the first line of Encrypt().
All help will be greatly appreciated

Change the Delphi function to
procedure Encrypt(Source, Key, OutPut: PAnsiChar; OutputLength: Integer); stdcall;
in order to make this code work.
You should probably also make the length argument IN/OUT so that the caller can resize the string builder object once the call returns. That would also give the callee a way to signal errors to the caller; the lack of error reporting is another flaw in your current design.
I must also say that using AnsiString as a byte array is a recipe for failure. It's high time you started doing encryption right. If you have text, encode it as a byte array with a specific encoding (usually that means UTF-8), then encrypt that byte array to another byte array.
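For reference, a minimal sketch of the matching C# side for the corrected stdcall signature, assuming the DLL name from the question and a fixed buffer size (if the length argument is made var/ref as suggested, the C# parameter would become ref int):

using System.Text;
using System.Runtime.InteropServices;

class NativeCrypto
{
    // Matches the stdcall, PAnsiChar-based export suggested above.
    [DllImport("AsmEncrypt.dll", CallingConvention = CallingConvention.StdCall, CharSet = CharSet.Ansi)]
    public static extern void Encrypt(string source, string key, StringBuilder output, int outputLength);

    public static string EncryptPassword(string password, string key)
    {
        // Pre-size the buffer; the Delphi side only copies if the result fits.
        var output = new StringBuilder(1024);
        Encrypt(password, key, output, output.Capacity);
        return output.ToString();
    }
}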

From this docs page:
The AnsiString structure contains a 32-bit length indicator, a 32-bit reference count, a 16-bit data length indicating the number of bytes per character, and a 16-bit code page.
So an AnsiString isn't simply a pointer to an array of characters -- it's a pointer to a special structure which encodes a bunch of information.
However, .NET's P/Invoke machinery is going to pass a pointer to an array of characters. Delphi is going to try and interpret that as a pointer to its special AnsiString structure, and things aren't going to go well.
I think you're going to have a hard time using AnsiString in interop. You're better off choosing a string type which both .NET and Delphi know about. If you then need to convert that to AnsiString, do that in Delphi.
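A hedged sketch of what that byte-oriented boundary could look like from C#, assuming a hypothetical export (EncryptBytes is made up, not from the question) that takes raw buffers rather than strings, so no AnsiString assumptions cross the interop boundary:

using System.Text;
using System.Runtime.InteropServices;

class Utf8Interop
{
    // Hypothetical export: the native side receives plain byte buffers plus lengths.
    [DllImport("AsmEncrypt.dll", CallingConvention = CallingConvention.StdCall)]
    static extern void EncryptBytes(byte[] source, int sourceLen, byte[] key, int keyLen,
                                    byte[] output, ref int outputLen);

    static byte[] Encrypt(string text, byte[] key)
    {
        byte[] source = Encoding.UTF8.GetBytes(text);   // encode the text explicitly as UTF-8
        byte[] output = new byte[1024];
        int outputLen = output.Length;
        EncryptBytes(source, source.Length, key, key.Length, output, ref outputLen);

        byte[] result = new byte[outputLen];
        System.Array.Copy(output, result, outputLen);
        return result;
    }
}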


_bstr_t vs _T("")

I have a .NET library which is registered as a COM object. When I import the .tlb file in a C++ project, I get the following method declaration:
virtual HRESULT __stdcall GetBid (
/*[in]*/ BSTR symbol,
/*[out,retval]*/ double * pRetVal ) = 0;
for the .NET equivalent
double GetBid(string symbol);
now I'm trying to call it like this
double bid;
ptr->GetBid(_T("AAPL"), &bid);
which doesn't work as expected, because on the .NET side the string parameter is actually an empty string.
If I change to such a call
double bid;
ptr->GetBid(_bstr_t("AAPL"), &bid);
everything works as expected.
Why do both calls compile fine but produce different results? Shouldn't the first call be converted into a correctly marshaled string?
Thanks for any under-the-hood information about BSTR magic :)
BSTR has a 32-bit length preceding the string. Thus, a BSTR can contain embedded nulls.
_T("AAPL") creates a wchar_t * with a terminating null, but with no length prefix.
Under the hood, though, both are wchar_t *, so the call compiles and no conversion is necessary. You were somewhat lucky, because worse things could happen than just getting no string on the other side: the marshaler might take the _T("AAPL") pointer, count back 32 bits, and happen to read a reeeeaally long length value, which would be bad. :-)
You would get an automatic conversion if the parameter were declared as _bstr_t, since that would invoke the _bstr_t(wchar_t *) constructor.
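To see the length prefix concretely from managed code, here is a small C# sketch (not from the original answer); Marshal.StringToBSTR allocates a real BSTR, and the 32-bit byte count sits immediately before the returned pointer:

using System;
using System.Runtime.InteropServices;

class BstrDemo
{
    static void Main()
    {
        IntPtr bstr = Marshal.StringToBSTR("AAPL");
        // The 32-bit byte-count prefix lives 4 bytes before the pointer.
        int byteLength = Marshal.ReadInt32(bstr, -4);
        Console.WriteLine(byteLength);   // 8: four UTF-16 characters, two bytes each
        Marshal.FreeBSTR(bstr);
    }
}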
BSTR is a pointer to a wide character string, but that doesn't mean you can just assign a plain const wchar_t* string to it. To work with a BSTR you need to use a few system functions, like SysAllocString() for creating one. The _bstr_t class encapsulates all of this.

Optimizing several million char* to string conversions

I have an application that needs to take in several million char*'s as an input parameter (typically strings of fewer than 512 characters, in Unicode), and convert and store them as .NET strings.
It is turning out to be a real bottleneck in the performance of my application. I'm wondering if there's some design pattern or idea to make it more efficient.
There is a key part that makes me feel like it can be improved: There are a LOT of duplicates. Say 1 million objects are coming in, there might only be like 50 unique char* patterns.
For the record, here is the algorithm I'm using to convert char* to String (this code is C++/CLI, but the rest of the project is in C#):
String ^StringTools::MbCharToStr ( const char *Source )
{
    String ^str;
    if( (Source == NULL) || (Source[0] == '\0') )
    {
        str = gcnew String("");
    }
    else
    {
        // Find the number of UTF-16 characters needed to hold the
        // converted UTF-8 string, and allocate a buffer for them.
        const size_t max_strsize = 2048;
        int wstr_size = MultiByteToWideChar (CP_UTF8, 0L, Source, -1, NULL, 0);
        if (wstr_size < max_strsize)
        {
            // Save the malloc/free overhead if it's a reasonable size.
            // Plus, KJN was having fits with exceptions within exception logging due
            // to a corrupted heap.
            wchar_t wstr[max_strsize];
            (void) MultiByteToWideChar (CP_UTF8, 0L, Source, -1, wstr, (int) wstr_size);
            str = gcnew String (wstr);
        }
        else
        {
            wchar_t *wstr = (wchar_t *)calloc (wstr_size, sizeof(wchar_t));
            if (wstr == NULL)
                throw gcnew PCSException (__FILE__, __LINE__, PCS_INSUF_MEMORY, MSG_SEVERE);
            // Convert the UTF-8 string into the UTF-16 buffer, construct the
            // result String from the UTF-16 buffer, and then free the buffer.
            (void) MultiByteToWideChar (CP_UTF8, 0L, Source, -1, wstr, (int) wstr_size);
            str = gcnew String ( wstr );
            free (wstr);
        }
    }
    return str;
}
You could use each character from the input string to feed a trie structure. At the leaves, have a single .NET string object. Then, when a char* comes in that you've seen previously, you can quickly find the existing .NET version without allocating any memory.
Pseudo-code:
start with an empty trie,
process a char* by searching the trie until you can go no further
add nodes until your entire char* has been encoded as nodes
at the leaf, attach an actual .NET string
The answer to this other SO question should get you started: How to create a trie in c#
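A rough C# sketch of that idea, assuming the native bytes have already been copied into a byte[] (the class and member names are illustrative, not from the post):

using System.Text;

class ByteTrieCache
{
    sealed class Node
    {
        public Node[] Children = new Node[256]; // one slot per possible byte value
        public string Cached;                   // set once this exact byte sequence has been converted
    }

    readonly Node root = new Node();

    // Returns the cached managed string for this byte pattern, converting it only on first sight.
    public string GetOrAdd(byte[] bytes)
    {
        Node node = root;
        foreach (byte b in bytes)
        {
            if (node.Children[b] == null)
                node.Children[b] = new Node();
            node = node.Children[b];
        }
        return node.Cached ?? (node.Cached = Encoding.UTF8.GetString(bytes));
    }
}

With only around 50 distinct patterns, the 256-wide child arrays are affordable; a dictionary keyed on the byte sequence would work just as well.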
There is a key part that makes me feel like it can be improved: There are a LOT of duplicates. Say 1 million objects are coming in, there might only be like 50 unique char* patterns.
If this is the case, you may want to consider storing the "found" patterns in a map (such as a std::map<const char*, gcroot<String^>>, though you'll need a custom comparer for the const char* keys), and use that to return the previously converted value.
There is an overhead to storing the map, doing the comparison, etc. However, this may be mitigated by the dramatically reduced memory usage (you can reuse the managed string instances), as well as saving the memory allocations (calloc/free). Also, using malloc instead of calloc would likely be a (very small) improvement, as you don't need to zero out the memory prior to calling MultiByteToWideChar.
I think the first optimization you could make here would be to have your first call to MultiByteToWideChar start with a buffer instead of a null pointer. Because you specified CP_UTF8, MultiByteToWideChar must walk the whole string to determine the expected length. If there is some length which is longer than the vast majority of your strings, you might consider optimistically allocating a buffer of that size on the stack, and falling back to dynamic allocation only if that fails. That is, move the first branch of your if/else block outside of the if/else.
You might also save some time by calculating the length of the source string once and passing it in explicitly -- that way MultiByteToWideChar doesn't have to do a strlen every time you call it.
That said, it sounds like if the rest of your project is C#, you should use the .NET BCL class libraries designed to do this rather than having a side by side assembly in C++/CLI for the sole purpose of converting strings. That's what System.Text.Encoding is for.
I doubt any kind of caching data structure you could use here is going to make any significant difference.
Oh, and don't ignore the result of MultiByteToWideChar -- not only should you never cast anything to void, you've got undefined behavior in the event MultiByteToWideChar fails.
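Following the System.Text.Encoding suggestion above, a minimal sketch of a pure-C# conversion, assuming the native pointer addresses a NUL-terminated UTF-8 string (the helper name is made up):

using System;
using System.Runtime.InteropServices;
using System.Text;

static class Utf8Convert
{
    public static string FromPointer(IntPtr nativePtr)
    {
        if (nativePtr == IntPtr.Zero)
            return string.Empty;

        // Find the terminating NUL byte.
        int length = 0;
        while (Marshal.ReadByte(nativePtr, length) != 0)
            length++;

        // Copy into managed memory and let the BCL do the UTF-8 decode.
        byte[] buffer = new byte[length];
        Marshal.Copy(nativePtr, buffer, 0, length);
        return Encoding.UTF8.GetString(buffer);
    }
}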
I would probably use a cache based on a ternary tree structure, or similar, and look up the input string to see if it's already converted before even converting a single character to .NET representation.

Communication between c++ and c# application through network

I have a simple server written in C# listening on some port. I have an application in C++, and I need the application to send some information to the server. This information is a struct containing 5 integers. I was thinking I could also send it as a string, something like: "ID=3, anotherInt=5...". Is that a good idea? If not, how should I do it?
How do I make it work? What is your advice?
Thank you.
I think you have a mistake in your code.
char *ln = "String to send";
connect(client_socket, (struct sockaddr *)&clientService, sizeof(clientService));
send(client_socket,(const char*)&ln, sizeof(ln), 0);
The prototype for send function is:
ssize_t send(int socket, const void *message, size_t length, int flags);
ln is already a pointer to your char buffer. You are passing in &ln, which is the address
of the pointer. Shouldn't it be just "ln"?
You should fix the send() method in the client code. sizeof() is the wrong way to find the length of a string, and the casts applied to "ln" aren't quite right for what you need there. Check <<this link>> for an example and see how it works for you. BTW, the C# code in the server needs some serious re-writing if it is to work predictably. You are using a 4096-byte buffer, and calls to Read() aren't guaranteed to fetch the entire transmission in one go. You will need a loop just for Read to make sure you are reading everything you need - of course, this needs a clear definition of the communication semantics. Happy networking!
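For the server side, a hedged sketch of the kind of read loop meant here (names are assumptions; a real protocol would also frame each message with a delimiter or length prefix):

using System.IO;
using System.Net.Sockets;

static class SocketReader
{
    // Reads until the peer closes its side of the connection.
    public static byte[] ReadToEnd(NetworkStream stream)
    {
        var collected = new MemoryStream();
        var buffer = new byte[4096];
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            collected.Write(buffer, 0, read);   // a single Read is not guaranteed to return everything
        }
        return collected.ToArray();
    }
}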
First of all, (const char*)&ln is not correct. ln is a char *, so when you take the address of it using & you are getting a char **, which you are then casting to a char *. This is undefined behavior. You'll want to get rid of the & operator. Also you'll probably want to read up on what pointers are and how to use them.
As a rule of thumb, don't cast willy-nilly to make the compiler shut up. The errors are trying to tell you something. However, the sockaddr and sockaddr_in thing is correct; the api is designed that way. If you turn on -Wall in your compiler options, it should give you a warning in the correct place.
ALSO: you want strlen(ln) and not sizeof.
Quick n Dirty Rundown on Pointers
When a type includes * before the variable name, the variable holds a pointer to a value of that type. A pointer is much like a reference in C#, it is a value holding the location of some data. These are commonly passed into functions when a function wants to look at a piece of data that the caller owns, sometimes because it wants to modify it. A string is represented as a char *, which is a pointer to the first character in the string. The two operators that are related to pointers are & and *. & takes an lvalue and returns a pointer to that value. * takes a pointer and returns the value it points to. In this case, you have a string, as a char *, and the function you're calling wants a char *, so you can pass it in directly without casting or using * or &. However, for the other function you have a struct sockaddr_in and it wants a struct sockaddr *, so you (correctly) use & to get a struct sockaddr_in *, then cast it to a struct sockaddr *. This is called "type punning," and is an unpleasant reality of the API. Google will give you a better description of type punning than I can.
connect(client_socket, (struct sockaddr *)&clientService, sizeof(clientService));
This is OK, but this line should read:
send(client_socket,(const char*)ln, strlen(ln), 0);
where the conversion (const char*) can be omitted.
In your code the value of the pointer ln is sent (correctly), but you most likely want to send the string it points to, in its entire length.
Concerning the messages to be sent: converting the integers to ASCII is not a bad idea. You may also have a look at JSON or Google's protobuf format. Formatters or parsers can easily be written from scratch.
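As a concrete example of the text route, a small C# sketch that frames the five integers as one newline-terminated key=value line (the field names are made up):

using System.Text;

static class MessageCodec
{
    public static byte[] Encode(int id, int a, int b, int c, int d)
    {
        // One line per message; the newline tells the reader where the message ends.
        string line = string.Format("ID={0};A={1};B={2};C={3};D={4}\n", id, a, b, c, d);
        return Encoding.ASCII.GetBytes(line);
    }
}

The receiver can split the line on ';' and '=' to recover the five values.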

ASP.NET web app calling Delphi DLL on IIS webserver, locks up when returning PChar string

Works fine if I don't return anything, or if I return an integer. But if I try to return a PChar, i.e.
result := PChar('') or result:= PChar('Hello')
The web app just freezes up and I watch its memory count gradually get higher and higher in task manager.
The odd thing is that the DLL works fine on the VStudio debug server, or through a C# app. The only thing I can think of that would make a difference is that the IIS server is running in 64bit Windows.
It doesn't appear to be a compatibility issue though, because I can successfully write to text files and do other things from the DLL... I just can NOT return a PChar string.
Tried using PWideChar, tried returning 'something\0', tried everything I could think of. No luck unfortunately.
[DllImport("TheLib.dll", CallingConvention = CallingConvention.StdCall, CharSet = CharSet.Ansi)]
private static extern string SomeFunction();
string result = SomeFunction();
Delphi:
library TheLib;

function SomeFunction: PChar; export; stdcall;
begin
  Result := PChar('');
end;

exports
  SomeFunction;
Dampsquid's analysis is correct, so I will not repeat that. However, I prefer a different solution that I feel is more elegant. My preferred solution for such a problem is to use a Delphi WideString, which is a BSTR.
On the Delphi side you write it like this:
function SomeFunction: WideString; stdcall;
begin
  Result := 'Hello';
end;
And on the C# side you do it like this:
[DllImport(@"TheLib.dll")]
[return: MarshalAs(UnmanagedType.BStr)]
private static extern string SomeFunction();
And that's it. Because both parties use the same COM allocator for the memory allocation, it all just works.
Update 1
@NoPyGod interestingly points out that this code fails with a runtime error. Having looked into this, I feel it to be a problem at the Delphi end. For example, if we leave the C# code as it is and use the following, then the errors are resolved:
function SomeFunction: PChar; stdcall;
begin
  Result := SysAllocString(WideString('Hello'));
end;
It would seem that Delphi return values of type WideString are not handled as they should be. Out parameters and var parameters are handled as would be expected. I don't know why return values fail in this way.
Update 2
It turns out that the Delphi ABI for WideString return values is not compatible with Microsoft tools. You should not use WideString as a return type, instead return it via an out parameter. For more details see Why can a WideString not be used as a function return value for interop?
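For completeness, a sketch of the out-parameter shape that Update 2 recommends; the routine name here is made up, and the Delphi side would declare it as procedure GetGreeting(out Value: WideString); stdcall;

using System.Runtime.InteropServices;

static class TheLib
{
    // WideString maps to BSTR, so the COM allocator frees it safely on the C# side.
    [DllImport("TheLib.dll", CallingConvention = CallingConvention.StdCall)]
    public static extern void GetGreeting([MarshalAs(UnmanagedType.BStr)] out string value);
}

// Usage: TheLib.GetGreeting(out string greeting);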
You cannot return a string like that. The string is local to the function and will be freed as soon as the function returns, leaving the returned PChar pointing to an invalid location.
You need to pass in a pointer to be filled in by the DLL, dynamically create the string and free it back in the C# code, or create a static buffer in your DLL and return that.
By far the safest way is to pass a pointer into the function ie
function SomeFunction(Buffer: PChar; MaxLength: PInteger): WordBool; stdcall;
begin
  // fill in the buffer and set MaxLength^ to the length of the data
end;
You should set MaxLength to the size of the buffer before calling your DLL so that the DLL can check there is enough space for the data to be returned.
Try enabling 32-bit applications in the application pool's advanced settings.

Passing a C# byte array to LuaInterface

I have a byte array in my C# code that I need to pass into a LuaInterface instance. I can use pack() in Lua, pass the resulting string to C# and convert it with System.Text.Encoding.UTF8.GetBytes(), but going the other way doesn't seem to work.
Is there a simple solution? I'm hoping I can avoid assigning the byte array to a global value.
Edit:
I tried a few new things this morning. I tried using LuaInterface.GetFunction(), and everything works until it hits lua_pushstring() in LuaDLL.cpp. At that point the C# string is converted to a char* via Marshal::StringToHGlobalAnsi().ToPointer(). It looks like this function expects a null-terminated string, and my string's first byte is 0, so I get an empty string in my Lua code.
Finally traced it down to the call to ::lua_pushstring() in lapi.c. It calls strlen() on the char* passed in. Since the first byte of my data was 0, it returned 0. There is an alternative call, lua_pushlstring, that accepts the size of the string as an argument. Changing the wrapper to call this function fixed the issue.
Try encoding your byte array with System.Text.ASCIIEncoding.ASCII.GetString to get a string that can be passed to Lua.
