I have a few questions about endianness that are related enough that I think they warrant being asked as one question:
1) Is endian-ness decided by .Net or by the hardware?
2) If it's decided by the hardware, how can I figure out what endian the hardware is in C#?
3) Does endian-ness affect bitwise operations such as ORs, ANDs, XORs, or shifts? I.e., will shifting once to the right always shift off the least significant bit?
4) I doubt it, but is there a difference in endian-ness between different versions of the .NET Framework? I assume they're all the same, but I've learned to stop assuming about some of the lower-level details such as this.
If need be, I can ask these as different questions, but I figure anybody who knows the answer to one of these probably knows the answer to all of them (or can point me in a good direction).
1) The hardware.
2) BitConverter.IsLittleEndian
3) Endianness does not affect bitwise operations. Shifting to the right always shifts toward the least significant bit. UPDATE from Oops' comment: However, endianness does affect binary input and output. When reading or writing values larger than a byte (e.g., reading an int from a BinaryReader or using BitConverter), you do have to account for endianness. Once the values are read in correctly, all bitwise operations behave as normal (see the sketch after this list).
4) Most versions of .NET are little endian. Notable exceptions include the XBox and some platforms supported by Mono or the Compact Framework.
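To illustrate points 2 and 3, here is a minimal sketch; the helper name ReadInt32BigEndian and the example bytes are purely illustrative, but BinaryReader, BitConverter and Array.Reverse are the standard pieces you would use:

using System;
using System.IO;

class EndianExample
{
    // Reads a big-endian Int32 regardless of the machine's byte order.
    static int ReadInt32BigEndian(BinaryReader reader)
    {
        byte[] buffer = reader.ReadBytes(4);   // bytes exactly as stored in the stream
        if (BitConverter.IsLittleEndian)
        {
            Array.Reverse(buffer);             // flip to the machine's byte order
        }
        return BitConverter.ToInt32(buffer, 0);
    }

    static void Main()
    {
        // 0x01020304 stored big-endian.
        var stream = new MemoryStream(new byte[] { 0x01, 0x02, 0x03, 0x04 });
        int value = ReadInt32BigEndian(new BinaryReader(stream));
        int shifted = value >> 1;              // always shifts off the least significant bit
        Console.WriteLine($"{value:X8} -> {shifted:X8}");   // 01020304 -> 00810182
    }
}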
1) Neither, or you could say either: it was decided by the hardware developers. And you must decide how to handle it yourself if you write software that reads/writes certain file formats without using external libraries.
There is no problem with endianness if you read a single byte from a file, but you have to decide how to interpret the bytes, in little- or big-endian format, when you read any larger primitive data type, such as:
* integers,
* strings,
* floats.
The hardware does not help here, and neither does BitConverter. Only the documentation of the file format can help, along with testing your code, of course...
edit: found a good explanation here:
http://betterexplained.com/articles/understanding-big-and-little-endian-byte-order/
I am developing an application in C# with spectrogram drawing functionality.
For my first try, I used MathNet.Numerics, and now I am continuing to develop with alglib. When I changed from one to the other, I noticed that their outputs differ. MathNet applies some kind of correction by default, which alglib seems to omit. I am not really into signal processing, and I am also a newbie to programming, so I could not figure out where the difference exactly comes from.
MathNet default output (raw magnitude) values are ranging from ~0.1 to ~274 in my case.
And with alglib I get values ranging from ~0.2 to ~6220.
I found that MathNet's Fourier.Forward uses a default scaling option. Here it says the FourierOptions.Default is "Universal; Symmetric scaling and common exponent (used in Maple)."
https://numerics.mathdotnet.com/api/MathNet.Numerics.IntegralTransforms/FourierOptions.htm
If I use FourierOptions.NoScaling, the output is the same as what alglib produces.
In MathNet, I used Fourier.Forward function: https://numerics.mathdotnet.com/api/MathNet.Numerics.IntegralTransforms/Fourier.htm#Forward
In case of alglib, I used fftr1d function: https://www.alglib.net/translator/man/manual.csharp.html#sub_fftr1d
What is that difference in their calculation?
What is the function that I could maybe use to convert alglib output magnitude to that of MathNet, or vice versa?
In what cases should I use these different "scalings"? What are they for exactly?
Please share your knowledge. Thanks in advance!
I worked it out by myself, after reading a bunch of posts mentioning different methods of FFT output scaling. I still find this aspect of FFT processing heavily undocumented everywhere. I have not yet found any reliable source that explains what these scalings are used for, or which fields of science or which processing methods use them.
So far I have found three different kinds of scaling applied to the raw FFT output (the complex values' magnitudes), plus the option of applying none at all. This means multiplying the magnitudes by: 1. 1/numSamples 2. 2/numSamples 3. 1/sqrt(numSamples) 4. 1 (no scaling)
The MathNet.IntegralTransforms.Fourier.Forward function (and, according to various posts on the net, possibly also Matlab and Maple) uses the third one by default. In my opinion, that results in the most distinguishable graphical output when using logarithmic colouring.
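For what it's worth, here is a minimal sketch of the conversion between the two outputs, assuming the only difference really is the symmetric 1/sqrt(N) factor described above; the class and method names are purely illustrative:

using System;

static class FftScaling
{
    // Converts raw (unscaled) magnitudes, such as those from alglib's fftr1d,
    // to the symmetric 1/sqrt(numSamples) scaling that MathNet's Fourier.Forward
    // applies by default; divide by the factor instead to go the other way.
    public static double[] ToSymmetricScaling(double[] rawMagnitudes, int numSamples)
    {
        double factor = 1.0 / Math.Sqrt(numSamples);
        var scaled = new double[rawMagnitudes.Length];
        for (int i = 0; i < rawMagnitudes.Length; i++)
        {
            scaled[i] = rawMagnitudes[i] * factor;
        }
        return scaled;
    }
}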
I would still be grateful if you know something more and share your ideas, or if you can reference a good paper explaining these.
I have prototyped a library with some image-processing algorithms in Python/Numpy/Scipy, and now I want to port the code to C# and WPF.
I have realized that, although the input files are images (photographs), what matters conceptually to my domain problem is that they are two-dimensional arrays of floats, and that the operations I perform (grayscale conversion, blur, blob detection, skeletonization), and even persistence, are best done in floating-point "space" rather than in integer space (which usually means bytes, i.e. uint8).
So, I took a look at .NET namespaces, and there are a lot of "Drawing" this, "Imaging" that, "Media" something, and I am utterly confused.
So, the question is: which .NET class is the most obvious and commonly used "image data container" for floating-point image processing?
I know about AForge, but since I am learning C# and my image-processing needs are not so heavy at this point, I'd like to give native .NET a chance (but that could be a bad idea anyway, so please let me know if it is).
Based on what you already have, why not look for the C#/.NET equivalents of the libraries you used in Python? For example, for numeric calculations look at:
Project:
http://numerics.mathdotnet.com/
Examples: https://github.com/mathnet/mathnet-numerics/tree/master/src/Examples
And for examples of image processing, looking at the source code of Paint.NET (its last open-sourced version, the openpdn fork of Paint.NET 3.36.7) may give you an idea of which libraries to use for images:
http://code.google.com/p/openpdn/source/browse/#hg%2Fsrc
Both libraries are in C#.
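If you just want to stay with the base class library for now, here is a minimal sketch of one possible approach: treat the image as a plain float[,] and use System.Drawing.Bitmap only for I/O. The method name and the choice of BT.601 grayscale weights are my own, not a standard API:

using System.Drawing;   // requires a reference to System.Drawing

static class ImageLoading
{
    // Loads an image file into a plain 2-D float array (grayscale, 0..1),
    // which can then be fed into blur / blob-detection / skeletonization routines.
    public static float[,] ToGrayscaleFloats(string path)
    {
        using (var bitmap = new Bitmap(path))
        {
            var pixels = new float[bitmap.Height, bitmap.Width];
            for (int y = 0; y < bitmap.Height; y++)
            {
                for (int x = 0; x < bitmap.Width; x++)
                {
                    Color c = bitmap.GetPixel(x, y);   // slow but simple; LockBits is faster
                    pixels[y, x] = (0.299f * c.R + 0.587f * c.G + 0.114f * c.B) / 255f;
                }
            }
            return pixels;
        }
    }
}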
Since C# supports Int8, Int16, Int32 and Int64, why did the designers of the language choose to define int as an alias for Int32 instead of allowing it to vary depending on what the native architecture considers to be a word?
I have not had any specific need for int to behave differently than the way it does, I am only asking out of pure encyclopedic interest.
I would think that a 64-bit RISC architecture could conceivably exist which would most efficiently support only 64-bit quantities, and in which manipulations of 32-bit quantities would require extra operations. Such an architecture would be at a disadvantage in a world in which programs insist on using 32-bit integers, which is another way of saying that C#, becoming the language of the future and all, essentially prevents hardware designers from ever coming up with such an architecture in the future.
StackOverflow does not encourage speculative answers, so please answer only if your information comes from a dependable source. I have noticed that some members of SO are Microsoft insiders, so I was hoping that they might be able to enlighten us on this subject.
Note 1: I did in fact read all answers and all comments of SO: Is it safe to assume an int will always be 32 bits in C#? but did not find any hint as to the why that I am asking in this question.
Note 2: the viability of this question on SO is (inconclusively) discussed here: Meta: Can I ask a “why did they do it this way” type of question?
I believe that their main reason was portability of programs targeting CLR. If they were to allow a type as basic as int to be platform-dependent, making portable programs for CLR would become a lot more difficult. Proliferation of typedef-ed integral types in platform-neutral C/C++ code to cover the use of built-in int is an indirect hint as to why the designers of CLR decided on making built-in types platform-independent. Discrepancies like that are a big inhibitor to the "write once, run anywhere" goal of execution systems based on VMs.
Edit: More often than not, the size of an int plays into your code implicitly through bit operations rather than through arithmetic (after all, what could possibly go wrong with i++, right?). But the errors are usually more subtle. Consider the example below:
const int MaxItem = 20;
var item = new MyItem[MaxItem];
// Enumerate every non-empty subset of the items by treating each bit of 'mask'
// as a membership flag; this requires an int wide enough to hold (1 << MaxItem).
for (int mask = 1; mask != (1 << MaxItem); mask++) {
    var combination = new HashSet<MyItem>();
    for (int i = 0; i != MaxItem; i++) {
        if ((mask & (1 << i)) != 0) {
            combination.Add(item[i]);
        }
    }
    ProcessCombination(combination);
}
This code computes and processes all combinations of 20 items. As you can tell, the code fails miserably on a system with 16-bit int, but works fine with ints of 32 or 64 bits.
Unsafe code would provide another source of headaches: when int is fixed at some size (say, 32 bits), code that allocates 4 times the number of bytes as the number of ints it needs to marshal will work, even though it is technically incorrect to use 4 in place of sizeof(int). Moreover, this technically incorrect code would remain portable!
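A minimal sketch of the kind of code meant here; the allocation itself is just illustrative:

using System;
using System.Runtime.InteropServices;

class MarshalSizeExample
{
    static void Main()
    {
        int count = 10;

        // Hard-codes the size of int: technically wrong, yet portable in practice
        // because the CLR fixes int at 32 bits.
        IntPtr hardCoded = Marshal.AllocHGlobal(4 * count);

        // The correct version, which would keep working even if int were variable.
        IntPtr correct = Marshal.AllocHGlobal(sizeof(int) * count);

        Marshal.FreeHGlobal(hardCoded);
        Marshal.FreeHGlobal(correct);
    }
}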
Ultimately, small things like that play heavily into the perception of a platform as "good" or "bad". Users of .NET programs do not care whether a program crashes because its programmer made a non-portable mistake or because the CLR is buggy. This is similar to the way early versions of Windows were widely perceived as unstable due to the poor quality of drivers. To most users, a crash is just another .NET program crash, not a programmer's issue. Therefore it is good for the perception of the ".NET ecosystem" to make the standard as forgiving as possible.
Many programmers have a tendency to write code for the platform they use, which includes assumptions about the size of a type. There are many C programs around that will fail if the size of int is changed to 16 or 64 bits, because they were written under the assumption that an int is 32 bits. The choice made for C# avoids that problem by simply defining it that way. If you define int as varying with the platform, you buy right back into that same problem. Although you could argue that it's the programmer's fault for making wrong assumptions, fixing the size makes the language a bit more robust (IMO). And for desktop platforms a 32-bit int is probably the most common occurrence. Besides, it makes porting native C code to C# a bit easier.
Edit: I think you write code that makes (implicit) assumptions about the size of a type more often than you realize. Basically anything that involves serialization (like .NET Remoting, WCF, serializing data to disk, etc.) will get you in trouble if you allow variable sizes for int, unless the programmer takes care of it by using a specifically sized type like Int32. And then you end up with "we'll always use Int32 anyway, just in case" and you have gained nothing.
With .NET things are fairly simple - it is all (including ARM, AFAIK) running little endian.
The question that I have is: what is happening on Mono and (potentially) big-endian systems? Do the bits reverse (compared to x86) in the Int32 / Int64 structures, or does the framework force a little-endian rule-set?
Thanks
Your assertion that all MS .NET implementations are little endian is not correct. It depends on the architecture that you are running on - the CLR spec says so:
From the CLI Annotated Standard (p.161) — Partition I, section 12.6.3: "Byte Ordering":
For data types larger than 1 byte, the byte ordering is dependent on the target CPU. Code that depends on byte ordering may not run on all platforms. [...]
(taken from this SO answer)
See this answer for more information on the internals of BitConverter and how it handles endianness.
A list of behavioral changes I can think of at the moment (unchecked and incomplete):
IPAddress.HostToNetworkOrder and IPAddress.NetworkToHostOrder
Nearly everything in BitConverter
BinaryReader and BinaryWriter (EDIT: From documentation: "BinaryReader reads this data type in little-endian format.")
Binary serialization
Everything that reads and writes Unicode in default encoding from/to streams (UnicodeEncoding) (EDIT: Default is defined as little endian)
and of course every (runtime library) function using these.
Usually Microsoft doesn't mention endianness in their docs - with some strange exceptions. For instance, BinaryReader.ReadUInt16 is defined to read little endian. Nothing mentioned for the other methods. One may assume that binary serialization is always little-endian, even on big-endian machines.
Note that XNA on the Xbox 360 is big-endian, so this is not just a theoretical problem with Mono.
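A minimal sketch showing the first item from the list above in action; what it prints depends on the machine's byte order:

using System;
using System.Net;

class NetworkOrderExample
{
    static void Main()
    {
        int host = 0x01020304;
        int network = IPAddress.HostToNetworkOrder(host);   // a no-op on big-endian machines
        Console.WriteLine($"{host:X8} -> {network:X8}");     // 01020304 -> 04030201 on x86
        Console.WriteLine(BitConverter.IsLittleEndian);      // tells you whether a swap happened
    }
}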
C#/.NET does not make any claims about endianness. Int32/Int64 are atomic types, not structures.
As far as I know, such a conversion would happen outside the scope of your code and would be hidden from you. It's called "managed code" for several reasons, including potential issues like this.
To know if bytes are "reversed", just check BitConverter.IsLittleEndian:
if (BitConverter.IsLittleEndian)
{
    // reverse the bytes, e.g. with Array.Reverse
}
Considering how similar .Net and Mono are by design, I'd say they probably handle endianness the same.
You can always test it by creating a managed int with a known value, then using reflection or marshalling to access the memory and take a look.
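A minimal sketch of such a test, using BitConverter.GetBytes instead of reflection or marshalling for brevity:

using System;

class EndianProbe
{
    static void Main()
    {
        int probe = 0x01020304;
        byte[] raw = BitConverter.GetBytes(probe);   // the bytes as laid out in memory
        // Prints 04-03-02-01 on little-endian machines, 01-02-03-04 on big-endian ones.
        Console.WriteLine(BitConverter.ToString(raw));
        Console.WriteLine(BitConverter.IsLittleEndian);
    }
}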
I'm trying to write a simple reader for AutoCAD's DWG files in .NET. I don't actually need to access all data in the file so the complexity that would otherwise be involved in writing a reader/writer for the whole file format is not an issue.
I've managed to read in the basics, such as the version, all the header data, the section locator records, but am having problems with reading the actual sections.
The problem seems to stem from the fact that the format uses a custom method of storing some data types. I'm going by the specs here:
http://www.opendesign.com/files/guestdownloads/OpenDesign_Specification_for_.dwg_files.pdf
Specifically, the types that depend on reading in individual bits are the ones I'm struggling with. A large part of the problem seems to be that C#'s BinaryReader only lets you read whole bytes at a time, when in fact I believe I need the ability to read individual bits, not just 8 bits or a multiple thereof at a time.
It could be that I'm misunderstanding the spec and how to interpret it, but if anyone could clarify how I might go about reading individual bits from a stream, or even how to read some of the variable types in the above spec that require more complex bit manipulation than simply reading full bytes, that would be excellent.
I do realise there are commercial libraries out there for this, but the price is simply too high on all of them to be justifiable for the task at hand.
Any help much appreciated.
You can always use the BitArray class for bit-wise manipulation: read the bytes from the file, load them into a BitArray, and then access the individual bits.
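A minimal sketch of that idea; note that BitArray(byte[]) puts the least significant bit of each byte first, so if the file format numbers bits from the most significant bit (as bitstream formats like DWG typically do), the within-byte index has to be flipped. The ReadBits helper below is my own, not part of the framework:

using System;
using System.Collections;

class BitArrayExample
{
    // Reads 'count' bits starting at 'bitOffset', most significant bit of each byte first.
    static uint ReadBits(BitArray bits, int bitOffset, int count)
    {
        uint value = 0;
        for (int i = 0; i < count; i++)
        {
            int index = bitOffset + i;
            // Flip the within-byte position because BitArray stores LSB-first.
            int msbFirst = (index / 8) * 8 + (7 - index % 8);
            value = (value << 1) | (bits[msbFirst] ? 1u : 0u);
        }
        return value;
    }

    static void Main()
    {
        byte[] bytes = { 0b1011_0000 };            // first three bits (MSB-first) are 101 = 5
        var bits = new BitArray(bytes);            // in practice: new BitArray(File.ReadAllBytes(path))
        Console.WriteLine(ReadBits(bits, 0, 3));   // prints 5
    }
}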
For the price of any of those libraries, you definitely cannot develop something stable yourself. How much time have you spent on it so far?