What is the fastest way to convert int to char?
I need a faster way, because this line:
convertingchar = Convert.ToChar(intvalue);
is the slowest part of my program.
I have many different int values that have to be converted to char.
My project is very big and I can't post my actual function, which is why I'm posting this test function.
My code so far:
char convertingchar = ' ';
...some code...
public void convert(int intvalue)
{
    convertingchar = Convert.ToChar(intvalue);
}
Running a quick performance test comparing your Convert.ToChar approach with the cast mentioned elsewhere, 65535 iterations (char.MaxValue) give:
Convert: 00:00:00.0005447 total
Cast: 00:00:00.0003663 total
At its best, cast was running for me in about half the time of convert.
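For reference, the comparison can be reproduced with a minimal Stopwatch harness along these lines (a sketch, not the exact code I ran):

using System;
using System.Diagnostics;

class CastVsConvert
{
    static void Main()
    {
        char c = ' ';

        var sw = Stopwatch.StartNew();
        for (int i = 0; i <= char.MaxValue; i++)
            c = Convert.ToChar(i);
        sw.Stop();
        Console.WriteLine("Convert: {0}", sw.Elapsed);

        sw.Restart();
        for (int i = 0; i <= char.MaxValue; i++)
            c = (char)i;
        sw.Stop();
        // Print c as well so the JIT can't eliminate the loop body.
        Console.WriteLine("Cast: {0} (last char: {1})", sw.Elapsed, c);
    }
}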
The implementation of Convert.ToChar as found in Reference Source reveals the time consumer:
public static char ToChar(int value) {
    if (value < 0 || value > Char.MaxValue) throw new OverflowException(Environment.GetResourceString("Overflow_Char"));
    Contract.EndContractBlock();
    return (char)value;
}
While these checks do certainly serve a purpose, your particular use-cases, especially in such a performance-critical situation, may not require them. That's up to you.
A nice alternative that keeps these checks would be to use a ushort rather than an int. Obviously that may or may not be attainable, but since ushort has the same maximum value as char, you get compile-time checking for what you were previously depending on ToChar to perform.
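For example, a sketch of that approach (the method name is illustrative) - since ushort.MaxValue equals char.MaxValue, this cast can never overflow:

// ushort.MaxValue == char.MaxValue == 65535, so no range check is needed.
public void SetChar(ushort value)
{
    convertingchar = (char)value;
}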
Convert.ToChar eventually performs an explicit conversion as (char)value, where value is your int value. Before doing so, it checks to ensure value is in the range 0 to 0xffff, and throws an OverflowException if it is not. The extra method call, value/boundary checks, and OverflowException may be useful, but if not, the performance will be better if you just use (char)value.
This makes sure everything is OK while converting, but takes some time doing so:
convertingchar = Convert.ToChar(intvalue);
This converts without those checks, so it takes less time:
convertingchar = (char)intvalue;
For example:
Console.WriteLine("(char)122 is {0}", (char)122);
yields:
(char)122 is z
NOTE
Not directly related to the question, but if conversion feels slow then you might be doing something wrong. Ask why you need to convert that many ints to chars and what you are trying to achieve; there might be a better way.
The fastest way, contrary to what the others have noted, is to not run any conversion code at all. In all the other cases, there is memory allocated for the int and memory allocated for the char, so the best that can be achieved is to simply copy the int to the char's address.
With the struct below, no conversion code is run at all:
[StructLayout(LayoutKind.Explicit)]
public struct Foo
{
    [FieldOffset(0)]
    public int Integer;
    [FieldOffset(0)]
    public char Char;
}
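Using it is just a matter of writing the int and reading the char back; for example (a short sketch, assuming a little-endian platform so the char overlays the low two bytes):

var foo = new Foo { Integer = 122 };
Console.WriteLine(foo.Char); // prints 'z' - the same bytes, reinterpreted rather than converted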
https://msdn.microsoft.com/en-us/library/aa288471%28v=vs.71%29.aspx
Edit 3 describes the narrowed-down problem after debugging.
Edit 4 contains the solution - it's all about the type difference between C and C#.
Today I came across a curious problem. In C I have the following struct:
typedef struct s_z88i2
{
    long Node;
    long DOF;
    long TypeFlag;
    double CValue;
} s_z88i2;
Furthermore I have a function (this is a simplified version):
DLLIMPORT int pass_i2(s_z88i2 inp, int total)
{
    long nkn = 0, ifg = 0, iflag1 = 0;
    double wert = 0;
    int val;
    // Testplace 1
    nkn = inp.Node;
    ifg = inp.DOF;
    iflag1 = inp.TypeFlag;
    wert = inp.CValue;
    // Testplace 2
    return 0;
}
The assigned values are used nowhere - I'm aware of that.
When I reach // Testplace 1 the following statement is executed:
char tmpstr[256];
sprintf(tmpstr,"RB, Node#: %li, DOF#: %li, Type#: %li, Value: %f", inp.Node, inp.DOF, inp.TypeFlag, inp.CValue);
tmpstr is then passed to a message box. It shows - as one would expect - the values of the struct I passed to the function, in a nice and orderly way. Moving on through the function, the values inside the struct get assigned to some variables. On reaching Testplace 2, the following is executed:
sprintf(tmpstr,"RB, Node#: %li, DOF#: %li, Type#: %li, Value: %f",nkn, ifg, iflag1, wert);
Again, tmpstr is passed to a message box. However, this doesn't show what one would expect. The values for Node and Type are still correct. For DOF and Value the displayed values are 0, which leads me to the conclusion that something is going terribly wrong while assigning the values. I somehow sometimes managed to get a way too long number for Value, which was as incorrect as 0, but I have not been able to reproduce that mistake during my last tests.
Possible values for inp are e.g. {2,1,1,-451.387}, so the first 1 and -451.387 are forgotten.
Does anyone know what I'm doing wrong or how to fix this?
Many thanks in advance!
Edit:
Changed %i to %li but the result did not change. Thanks to unwind!
I'm developing this DLL with Dev-Cpp using MinGW (unfortunately) because I wasn't able to convince Visual Studio 2012 Pro to compile this properly, although the documentation of the original source says it is plain ANSI C. This bugs me a bit because I cannot debug the DLL properly with Dev-Cpp. Hence the message boxes.
Edit 2:
As Neil Townsend suggested, I switched to passing a reference. But this also did not cure the problem. When I access the values in my struct directly, everything is fine. When I assign them to variables, some get lost.
A short note on how I'm calling the function: the DLL is to be accessed from C#, so I'm meddling with P/Invoke (as I understand it).
[DllImport("z88rDLL", CallingConvention = CallingConvention.Cdecl)]
public static extern int pass_i2(ref s_z88i2 inp, int total);
is my definition in C#. I have lots of other functions imported and they all work fine; this function is the first where I've encountered these problems. I call the function via:
s_z88i2 tmpi2 = FilesZ88.z88i2F.ConstraintsList[i];
int res = SimulationsCom.pass_i2(ref tmpi2, FilesZ88.z88i2F.ConstraintsList.Count);
First I set the struct, then I call the function.
Why oh why does VS have to be picky when it comes to compiling ANSI C? It certainly would make things easier.
Edit 3:
I can narrow the problem down to sprintf, I think. Having convinced VS to build my DLL, I was able to step through it. It appears that the values are assigned very nicely indeed to the variables they belong in. If, however, I want to print these variables via sprintf, they turn out rather empty (0). Curiously, the value is always 0 and not something else. I'm still interested in why sprintf behaves that way, but I consider my initial problem solved/panic defeated. So thanks everyone!
Edit 4:
As supercat points out below, I had a rethink about type compatibility between C and C#. I was aware that an int in C# evaluates to a long in C. But after double-checking I found that in C my variables are really FR_INT4 (which I kept out of the original question for reasons of clarity => bad idea). Internally, FR_INT4 is defined as #define FR_INT4 long long - so a super-long-long. A quick test showed that passing a long from C# gives the best compatibility. So the sprintf issue can maybe be simplified to the question: "What is the format identifier of a long long?"
It is %lli, which is quite simple, actually. So I can announce (drumroll) that my problem really is solved!
sprintf(tmpstr,"RB, Node#: %lli, DOF#: %lli, Typ#: %lli, Wert: %f\n", inp.Node, inp.DOF, inp.TypeFlag, inp.CValue);
returns every value I want. Thank you very much everyone!
Formatting a value of type long with the format specifier %i is not valid. You should use %li.
In C, it is a better approach to pass a reference or pointer to the struct rather than the struct. So:
DLLIMPORT int pass_i2(s_z88i2 *inp, int total) {
    long nkn = 0, ifg = 0, iflag1 = 0;
    double wert = 0;
    int val;
    // Testplace 1
    nkn = inp->Node;
    ifg = inp->DOF;
    iflag1 = inp->TypeFlag;
    wert = inp->CValue;
    // Testplace 2
    return 0;
}
You will need to correct the sprintf lines accordingly; inp.X becomes inp->X. To use this function, either:
// Option A - create it in a declaration, fill it, and send a pointer to that
struct s_z88i2 thing;
// fill out thing
// eg. thing.Node = 2;
pass_i2(&thing, TOTAL);
or:
// Option B - create a pointer; create the memory for the struct, fill it, and send the pointer
struct s_z88i2 *thing;
thing = malloc(sizeof(struct s_z88i2));
// fill out thing
// eg thing->Node = 2;
pass_i2(thing, TOTAL);
This way pass_i2 will operate on the struct you send it, and any changes it makes will be there on return from pass_i2.
To clarify this as answered:
My struct actually is:
typedef struct s_z88i2
{
    long long Node;
    long long DOF;
    long long TypeFlag;
    double CValue;
} s_z88i2;
which requires a long to be passed from C# (and not an int, as I previously thought). Through debugging I found out that the assignment of values behaves as it should; the problem was within sprintf. Using %lli as the format identifier solves even this problem.
sprintf(tmpstr,"RB, Node#: %lli, DOF#: %lli, Typ#: %lli, Wert: %f\n", inp.Node, inp.DOF, inp.TypeFlag, inp.CValue);
is the statement I need to use. So thanks again to everyone who contributed!
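For reference, the C# side of the struct then might look like this (a sketch, assuming sequential layout; field names follow the C struct):

using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct s_z88i2
{
    public long Node;     // C: long long (FR_INT4)
    public long DOF;      // C: long long (FR_INT4)
    public long TypeFlag; // C: long long (FR_INT4)
    public double CValue; // C: double
}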
I have an application that needs to take in several million char*s as input (typically strings of fewer than 512 characters, in Unicode), and convert and store them as .NET strings.
It's turning out to be a real bottleneck in the performance of my application. I'm wondering if there's some design pattern or idea to make it more efficient.
There is a key part that makes me feel like it can be improved: there are a LOT of duplicates. Say 1 million objects are coming in; there might be only around 50 unique char* patterns.
For the record, here is the algorithm I'm using to convert char* to string (this algorithm is in C++/CLI, but the rest of the project is in C#):
String ^StringTools::MbCharToStr ( const char *Source )
{
    String ^str;

    if( (Source == NULL) || (Source[0] == '\0') )
    {
        str = gcnew String("");
    }
    else
    {
        // Find the number of UTF-16 characters needed to hold the
        // converted UTF-8 string, and allocate a buffer for them.
        const size_t max_strsize = 2048;
        int wstr_size = MultiByteToWideChar (CP_UTF8, 0L, Source, -1, NULL, 0);

        if (wstr_size < max_strsize)
        {
            // Save the malloc/free overhead if it's a reasonable size.
            // Plus, KJN was having fits with exceptions within exception logging due
            // to a corrupted heap.
            wchar_t wstr[max_strsize];
            (void) MultiByteToWideChar (CP_UTF8, 0L, Source, -1, wstr, (int) wstr_size);
            str = gcnew String (wstr);
        }
        else
        {
            wchar_t *wstr = (wchar_t *)calloc (wstr_size, sizeof(wchar_t));
            if (wstr == NULL)
                throw gcnew PCSException (__FILE__, __LINE__, PCS_INSUF_MEMORY, MSG_SEVERE);

            // Convert the UTF-8 string into the UTF-16 buffer, construct the
            // result String from the UTF-16 buffer, and then free the buffer.
            (void) MultiByteToWideChar (CP_UTF8, 0L, Source, -1, wstr, (int) wstr_size);
            str = gcnew String ( wstr );
            free (wstr);
        }
    }
    return str;
}
You could use each character from the input string to feed a trie structure. At the leaves, have a single .NET string object. Then, when a char* comes in that you've seen previously, you can quickly find the existing .NET version without allocating any memory.
Pseudo-code:
start with an empty trie,
process a char* by searching the trie until you can go no further
add nodes until your entire char* has been encoded as nodes
at the leaf, attach an actual .NET string
The answer to this other SO question should get you started: How to create a trie in c#
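For illustration, a minimal sketch of such a trie in C#, keyed on the raw input bytes (illustrative names, not drop-in code, since the real input is a native char*):

using System;

// A node has one slot per possible input byte; the leaf carries the cached .NET string.
class TrieNode
{
    public TrieNode[] Children = new TrieNode[256];
    public string Value;
}

static class StringCache
{
    // Walk (and extend) the trie for the raw bytes; convert only on first sight.
    public static string GetOrAdd(TrieNode root, byte[] key, Func<byte[], string> convert)
    {
        TrieNode node = root;
        foreach (byte b in key)
            node = node.Children[b] ?? (node.Children[b] = new TrieNode());
        return node.Value ?? (node.Value = convert(key));
    }
}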
There is a key part that makes me feel like it can be improved: there are a LOT of duplicates. Say 1 million objects are coming in; there might be only around 50 unique char* patterns.
If this is the case, you may want to consider storing the "found" patterns in a map (such as a std::map<const char*, gcroot<String^>>, though you'll need a comparator for the const char* keys), and using it to return the previously converted value.
There is an overhead to storing the map, doing the comparison, etc. However, this may be mitigated by the dramatically reduced memory usage (you can reuse the managed string instances), as well as saving the memory allocations (calloc/free). Also, using malloc instead of calloc would likely be a (very small) improvement, as you don't need to zero out the memory prior to calling MultiByteToWideChar.
I think the first optimization you could make here is to have the first call to MultiByteToWideChar use a buffer instead of a null pointer. Because you specified CP_UTF8, MultiByteToWideChar must walk the whole string to determine the expected length. If there is some length longer than the vast majority of your strings, you might consider optimistically allocating a buffer of that size on the stack, and falling back to dynamic allocation only if it is too small. That is, move the first branch of your if/else block outside of the if/else.
You might also save some time by calculating the length of the source string once and passing it in explicitly -- that way MultiByteToWideChar doesn't have to do a strlen every time you call it.
That said, it sounds like if the rest of your project is C#, you should use the .NET BCL classes designed to do this rather than keeping a side-by-side C++/CLI assembly for the sole purpose of converting strings. That's what System.Text.Encoding is for.
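For example, if the native side can hand over a pointer and a byte count, something along these lines keeps the conversion entirely in C# (a sketch; the helper name and signature are illustrative):

using System;
using System.Runtime.InteropServices;
using System.Text;

static class StringTools
{
    // Copy the native UTF-8 bytes into a managed buffer,
    // then let the BCL do the UTF-8 -> UTF-16 conversion.
    public static string Utf8PtrToString(IntPtr source, int byteCount)
    {
        var bytes = new byte[byteCount];
        Marshal.Copy(source, bytes, 0, byteCount);
        return Encoding.UTF8.GetString(bytes);
    }
}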
I doubt any kind of caching data structure you could use here is going to make any significant difference.
Oh, and don't ignore the result of MultiByteToWideChar -- not only should you never cast a result to void, you've got undefined behavior in the event MultiByteToWideChar fails.
I would probably use a cache based on a ternary tree structure, or similar, and look up the input string to see if it's already converted before even converting a single character to .NET representation.
I am trying to port an existing application, developed in Delphi, to C#. It has been very tough, but I've managed somehow up to now; at this point I have come across a problem...
Delphi code contains following code:
type
  TFruit = record
    name : string[20];
    case isRound : Boolean of // Choose how to map the next section
      True :
        (diameter : Single);  // Maps to same storage as length
      False :
        (length : Single;     // Maps to same storage as diameter
         width : Single);
  end;
i.e. a variant record (with a case statement inside), whose layout and size are determined accordingly.
On the other hand I am trying to do the same in a C# struct, and haven't succeeded yet. I hope someone can help me here.
So just let me know if there's any way I can implement this in C#.
Thanks in advance....
You could use an explicit struct layout to replicate this Delphi variant record. However, I would not bother, since it seems pretty unlikely that you really want an assignment to diameter to also assign to length, and vice versa. That Delphi record declaration looks like it dates from a mid-1990s style of Delphi coding. Modern Delphi code would seldom be written that way.
I would just do it like this:
struct Fruit
{
    string name;
    bool isRound;
    float diameter; // only valid when isRound is true
    float length;   // only valid when isRound is false
    float width;    // only valid when isRound is false
}
A more elegant option would be a class with a property for each struct field. You would arrange for the property getters and setters of the three floats to raise exceptions if they are accessed for an invalid value of isRound.
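A sketch of that property-based variant (names and exception messages are illustrative):

using System;

class Fruit
{
    public string Name { get; set; }
    public bool IsRound { get; set; }

    float diameter, length;

    public float Diameter
    {
        get { if (!IsRound) throw new InvalidOperationException("Not round"); return diameter; }
        set { if (!IsRound) throw new InvalidOperationException("Not round"); diameter = value; }
    }

    public float Length
    {
        get { if (IsRound) throw new InvalidOperationException("Round fruit has no length"); return length; }
        set { if (IsRound) throw new InvalidOperationException("Round fruit has no length"); length = value; }
    }

    // Width follows the same pattern as Length.
}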
Perhaps this will do the trick?
This is NOT a copy-and-paste solution; note that the offsets and data sizes may need to be changed depending on how the Delphi structure is declared and/or aligned.
[StructLayout(LayoutKind.Explicit)]
unsafe struct Fruit
{
    [FieldOffset(0)]  public fixed char name[20];
    [FieldOffset(20)] public bool IsRound;
    [FieldOffset(21)] public float Diameter;
    [FieldOffset(21)] public float Length;
    [FieldOffset(25)] public float Width;
}
It depends on what you are trying to do.
If you're simply trying to make a corresponding structure, then look at David Heffernan's answer. These days there is little justification for mapping two fields on top of each other unless they truly represent the same thing (say, individual items versus the same items in an array).
If you're actually trying to share files, you need to look along the lines of ananthonline's answer, but there's a problem with it big enough that I couldn't put it in a comment:
Not only is there the Unicode issue, but a Delphi shortstring has no corresponding structure in C#, and thus it's impossible to simply map a field on top of it.
That string[20] actually comprises 21 bytes: a one-byte length code and then 20 characters' worth of data. You have to honor the length code, as there is no guarantee of what lies beyond the specified length -- you're likely to find garbage there. (Hint: if the record is going to be written to disk, always zap the field before putting new data in it. It makes it much easier to examine the file on disk when debugging.)
Thus you need to declare two fields and write code to process them on both ends.
Since you have to do that anyway, I would go further and write code to handle the rest of it as well, eliminating the need for unsafe code entirely.
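For illustration, the decoding side of that shortstring handling might look like this (a sketch, assuming the record was written with a single-byte ANSI encoding):

using System;
using System.Text;

static class ShortStringCodec
{
    // A Delphi string[20] occupies 21 bytes: raw[offset] is the length,
    // raw[offset + 1 .. offset + 20] is character data. Anything past the
    // count in the length byte is garbage and must be ignored.
    public static string Decode(byte[] raw, int offset)
    {
        int len = Math.Min(raw[offset], 20);
        return Encoding.Default.GetString(raw, offset + 1, len);
    }
}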
I have a fast bit-level routine that calculates a value and returns a UInt32. I need to store this value in SQL Server in a 32-bit int field. I don't want to increase the size of the field; I just want to store the "bytes" from this function in the int field.
Hundreds of these records are requested at a time, so I need the fastest way to convert a UInt32 to an int in a loop. If the left-most bit is set in the UInt32, it should set the sign bit of the int (or do anything "repeatable", really, but the sign bit would probably be easiest).
In other words, I just want the 4 bytes of a UInt32 to become the 4 bytes of a 32-bit int. I could use the BitConverter class, but I'm not sure that's the fastest way. Would it be faster to do this with an unchecked block like this:
UInt32 fld = 4292515959;
unchecked {
    return (int)fld;
    // -2451337
}
I see the reverse question has been asked here, and was just wondering if the answer would be the same going the other way:
Fastest way to cast int to UInt32 bitwise?
I'd say the unchecked version (like unchecked((int)x)) is the fastest way, since there's no method call. I don't believe there's a faster way.
By the way, UInt32 is just another name for uint... going one way is the same as going another way in terms of performance, so this is really the same as the link you posted.
Edit: I remember observing first-hand a benchmark where checked was faster than unchecked (and no, it wasn't a debug build; it was a release build with optimizations). I don't know why that happened, but in any case, don't think you'll gain anything measurable by turning off overflow checking.
unchecked((int)x) is required only when casting constants; for non-constant values, the default context already behaves like unchecked, so the plain cast produces the same results (if the code can compile).
For example this code
uint data = 4292515959;
int uncheckedData;
int checkedData;

unchecked {
    uncheckedData = (int)data;
}
checkedData = (int)data;

Console.WriteLine(data);
Console.WriteLine(uncheckedData);
Console.WriteLine(checkedData);
produces this output
4292515959
-2451337
-2451337
To be more concise, this code can be compiled (same result as unchecked((int)data)):
uint data = 4292515959;
checkedData = (int)data;
This code (note the const) can't be compiled (it requires unchecked):
const uint data = 4292515959;
checkedData = (int)data;
This code can't be compiled either (it requires unchecked):
checkedData = (int)4292515959;
I'm working on a networking application in C#, sending a lot of plain numbers across the network. I discovered the IPAddress.HostToNetworkOrder and IPAddress.NetworkToHostOrder methods, which are very useful, but they left me with a few questions:
1. I know I need to encode and decode integers; what about unsigned ones? I think yes, so at the moment I'm doing it by casting a pointer to the unsigned int into a pointer to an int, and then doing a network conversion on the int (since there is no method overload that takes unsigned ints):
public static UInt64 HostToNetworkOrder(UInt64 i)
{
    Int64 a = *((Int64*)&i);
    a = IPAddress.HostToNetworkOrder(a);
    return *((UInt64*)&a);
}

public static UInt64 NetworkToHostOrder(UInt64 a)
{
    Int64 i = *((Int64*)&a);
    i = IPAddress.HostToNetworkOrder(i);
    return *((UInt64*)&i);
}
2. What about floating-point numbers (single and double)? I think not; however, if I do need to, should I use a method similar to the unsigned-int one and cast a single (float) pointer into an int pointer to convert, like so?
EDIT: Jon's answer doesn't address the second half of the question (it doesn't really answer the first either!). I would appreciate someone answering part 2.
I suspect you'd find it easier to use my EndianBinaryReader and EndianBinaryWriter in MiscUtil - then you can decide the endianness yourself. Alternatively, for individual values, you can use EndianBitConverter.
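For doubles specifically (part 2 of the question), one approach that avoids pointer casts is to reinterpret the double as a long and byte-swap that, assuming the protocol expects IEEE 754 doubles in big-endian order (a sketch):

using System;
using System.Net;

static class NetworkDouble
{
    public static long HostToNetworkOrder(double value)
    {
        // Reinterpret the IEEE 754 bits as a long, then byte-swap as usual.
        long bits = BitConverter.DoubleToInt64Bits(value);
        return IPAddress.HostToNetworkOrder(bits);
    }

    public static double NetworkToHostOrder(long wire)
    {
        // The byte swap is its own inverse, so the reverse direction mirrors it.
        return BitConverter.Int64BitsToDouble(IPAddress.NetworkToHostOrder(wire));
    }
}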
You'd better read several RFC documents to see how different TCP/IP protocols (at the application level - for example, HTTP/FTP/SNMP and so on) handle this.
This is, generally speaking, a protocol-specific question (both of your questions are), as your packet must encapsulate the integers or floating-point numbers in a protocol-defined format.
For SNMP, this is a conversion that changes an integer/float number to a few bytes and back; ASN.1 is used.
http://en.wikipedia.org/wiki/Abstract_Syntax_Notation_One