BinaryWriter/Reader - Getting unexpected readbacks - c#

Put simply, I'm having trouble writing a binary file, then reading back that same file. This is a settings file, which contains a few strings and an int. I have no problem with writing the strings, but when I get to the int I just can't get a value back when the settings are read.
A little back story: I'm more of a C/C++ programmer. C# isn't totally foreign, but it's not my forte. For instance, I had no idea that when you write a string, it takes care of the length for you and reads it back with just x.ReadString(). Neat.
Here's some rough code to give you an idea of what I'm doing. Note that this is stripped of error checking and most of the superfluous reads/writes; I kept it to the bare minimum:
public static int LDAPport;
public static string Username;
public static char Ver;

public static int LoadSettings()
{
    // The using block disposes (and closes) the reader, so no explicit Close() is needed.
    using (BinaryReader r = new BinaryReader(File.Open("data\\Settings.cfg", FileMode.Open)))
    {
        char cin = r.ReadChar();
        if (cin != Ver)
        {
            // Corrupted or older style settings
            return 1;
        }
        // Reads must mirror the writes in SaveSettings: same order, same types.
        Username = r.ReadString();
        LDAPport = r.ReadInt32();
    }
    return 0;
}

public static bool SaveSettings()
{
    using (BinaryWriter w = new BinaryWriter(File.Open("data\\Settings.cfg", FileMode.Create)))
    {
        w.Write(Ver);       // char
        w.Write(Username);  // length-prefixed string
        w.Write(LDAPport);  // 4-byte int
    }
    return true;
}
So when it writes 389 (I've verified that it has the proper value when it goes to write), LDAPport gets -65145 when read back. That tells me it's either not reading all 4 bytes of the int (it is 4, isn't it?), or it's reading the number as unsigned (while I'm expecting signed). Casting seemed to do nothing for me.
I've also tried other read methods: .Read() does the same thing, and I've tried changing the data type for the number, including UInt32, to no avail. Everything I've read just says to .Write(389) and then .Read(), that simple! Well, no. There doesn't seem to be a method for just int. I assume .Read() has those smarts, yes?
Can someone with a c# brain greater than mine shed some light as to what's going on here? Or has my brain officially turned to jelly at this point? :)
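For reference, here is a minimal round-trip sketch of the same write/read sequence. It is not the asker's actual code: it uses a MemoryStream instead of the settings file, and the names SettingsRoundTrip, Save and Load are made up for illustration. The point it illustrates is that BinaryWriter.Write is overloaded on the argument type, so each read has to use the matching typed method (ReadChar, ReadString, ReadInt32). The parameterless Read() returns the next character from the stream as an int, not a 4-byte integer.

```csharp
using System;
using System.IO;

public static class SettingsRoundTrip
{
    // Writes a char, a length-prefixed string and a 4-byte int, in that order.
    public static byte[] Save(char version, string username, int port)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new BinaryWriter(stream))
            {
                writer.Write(version);   // Write(char)
                writer.Write(username);  // Write(string)
                writer.Write(port);      // Write(int)
            }
            // ToArray still works after the stream has been closed.
            return stream.ToArray();
        }
    }

    // Reads the values back in the same order with the matching typed methods.
    public static (char Version, string Username, int Port) Load(byte[] data)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            char version = reader.ReadChar();
            string username = reader.ReadString();
            int port = reader.ReadInt32();
            return (version, username, port);
        }
    }
}
```

If a round trip like this returns the right port but the real settings file does not, the mismatch is most likely in one of the reads or writes that were stripped from the sample, since every value before the int has to be read with exactly the type it was written with.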

Related

Efficiency of static constant list initialization in C# vs C++ static arrays

I apologize in advance. My domain is mostly C (and C++). I'm trying to write something similar in C#. Let me explain with code.
In C++, I can use large static arrays that are processed during compile-time and stored in a read-only section of the PE file. For instance:
typedef struct _MY_ASSOC{
const char* name;
unsigned int value;
}MY_ASSOC, *LPMY_ASSOC;
bool GetValueForName(const char* pName, unsigned int* pnOutValue = nullptr)
{
bool bResult = false;
unsigned int nValue = 0;
static const MY_ASSOC all_assoc[] = {
{"name1", 123},
{"name2", 213},
{"name3", 1433},
//... more to follow
{"nameN", 12837},
};
for(size_t i = 0; i < _countof(all_assoc); i++)
{
if(strcmp(all_assoc[i].name, pName) == 0)
{
nValue = all_assoc[i].value;
bResult = true;
break;
}
}
if(pnOutValue)
*pnOutValue = nValue;
return bResult;
}
In the example above, the initialization of static const MY_ASSOC all_assoc is never called at run-time. It is entirely processed during the compile-time.
Now if I write something similar in C#:
public struct NameValue
{
public string name;
public uint value;
}
private static readonly NameValue[] g_arrNV_Assoc = new NameValue[] {
new NameValue() { name = "name1", value = 123 },
new NameValue() { name = "name2", value = 213 },
new NameValue() { name = "name3", value = 1433 },
// ... more to follow
new NameValue() { name = "nameN", value = 12837 },
};
public static bool GetValueForName(string name, out uint nOutValue)
{
foreach (NameValue nv in g_arrNV_Assoc)
{
if (name == nv.name)
{
nOutValue = nv.value;
return true;
}
}
nOutValue = 0;
return false;
}
The line private static readonly NameValue[] g_arrNV_Assoc has to be called once during the host class initialization, and it is done for every single element in that array!
So my question -- can I somehow optimize it so that the data stored in g_arrNV_Assoc array is stored in the PE section and not initialized at run-time?
PS. I hope I'm clear for the .NET folks with my terminology.
Indeed, the terminology is clear enough, and a large static array is fine.
There is nothing you can really do to make it more efficient out of the box.
It will be loaded once (exactly when depends on the .NET version and on whether the class has a static constructor), but always before you first use it.
Even if you created it empty with just the predetermined size, the CLR would still initialize each element to its default value, and you would then have to buffer-copy your data over somehow, which in turn would have to be loaded from a file.
The questions, though, are:
How much overhead does loading the static array of structs actually have compared to what you are doing in C?
Does it matter at what point in the application's lifecycle it is loaded?
And if this is way too much overhead (which I assume you have already determined), what other options are available outside the box?
You could pre-allocate a chunk of unmanaged memory, read and copy the bytes in from somewhere, and then access them using pointers.
You could also put this in a native DLL and P/Invoke it like any other unmanaged DLL. However, I'm not sure you would get much of a free lunch here anyway, as there is overhead in loading the DLL and marshaling these sorts of calls.
If your question is only academic, these are really your only options. However, if this is an actual performance problem, you will need to benchmark it and work out which approach suits you.
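As a starting point for that kind of measurement, here is a rough sketch that times the first lookup against a later one. The class names NameTable and InitTiming are invented for this example, and the table is shortened to the entries shown in the question. The explicit static constructor is a deliberate choice: it makes the array initialization run on first use of the class, so the first timed call includes it (along with JIT compilation, so treat the number as an upper bound only).

```csharp
using System;
using System.Diagnostics;

public struct NameValue
{
    public string name;
    public uint value;
}

public static class NameTable
{
    private static readonly NameValue[] g_arrNV_Assoc;

    // Explicit static constructor: runs exactly once, triggered by the
    // first access to a static member of this class.
    static NameTable()
    {
        g_arrNV_Assoc = new NameValue[] {
            new NameValue { name = "name1", value = 123 },
            new NameValue { name = "name2", value = 213 },
            new NameValue { name = "name3", value = 1433 },
            new NameValue { name = "nameN", value = 12837 },
        };
    }

    public static bool GetValueForName(string name, out uint nOutValue)
    {
        foreach (NameValue nv in g_arrNV_Assoc)
        {
            if (name == nv.name)
            {
                nOutValue = nv.value;
                return true;
            }
        }
        nOutValue = 0;
        return false;
    }
}

public static class InitTiming
{
    // Times a single lookup. Called first, it also pays for the one-time
    // array initialization; called again, it measures the lookup alone.
    public static TimeSpan TimeLookup(string name)
    {
        Stopwatch sw = Stopwatch.StartNew();
        uint ignored;
        NameTable.GetValueForName(name, out ignored);
        sw.Stop();
        return sw.Elapsed;
    }
}
```

Comparing the result of the first TimeLookup call with a second one gives a crude idea of how much the one-time initialization costs for a table of a given size.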
Anyway, I don't profess to know everything; maybe someone else has a better idea or more information. Good luck.

Web API translating input into random int

I'm not sure whether the subject is the best description for this problem, but I have a Web API project with a single operation and a single field on the request, and for some odd reason the value gets manipulated. Depending on the input, it gets converted / translated by Web API, or maybe by something else such as JSON.NET?
I should mention that this is a brand new project with no additional references apart from what gets added by default when creating a new Web API project in Visual Studio.
public class TestController : ApiController
{
public void Post(Foo request)
{
}
}
public class Foo
{
public string Blah { get; set; }
}
Using a rest client I hit the operation using the following request:
{
"Blah": 43443333222211111117
}
When debugging, the value of Blah gets converted to "6549845074792007885". I don't understand how and why it's doing this. Any other value it respects. For example:
{
"Blah": 53443333222211111117
}
This is absolutely fine but is a bigger number.
Thanks, DS.
Update
This bug has been fixed and is scheduled to be included in the next release.
Original Answer
This is a bug in JSON.NET as hinted at, but it's not as simple as it first seems.
Versions prior to 5.0.4 work for both of these test cases. Anything after seems to fail, but only for the first test case, which is odd. I've gone through some of the JSON.NET code to try to see where this confusion occurs, but at present I cannot work out why this is the case; I need to do more digging.
2147483647 Int32 max
4444333322221111 Valid credit card number format
9223372036854775807 Int64 max
43443333222211111117 Dodgy number, greater than Int64 max, hence overflow
53443333222211111117 Larger than the above and than Int64 max, but oddly works
1.7976931348623157E+308 Double max
Why 53443333222211111117 works is very odd. JSON.NET seems to have a 1025-character buffer set aside that contains a load of gibberish for my test cases; eventually the number is read incorrectly. I'll check this out further and raise an issue with JSON.NET.
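The arithmetic at least is easy to check. The sketch below (the class and method names are invented for illustration, and it does not use any JSON.NET code) reduces each literal modulo 2^64 with BigInteger. The first number wraps to exactly the value reported in the question, which fits in a signed Int64. The second wraps to a value above Int64 max, so it would be negative as a signed 64-bit integer. That difference may be why only one of the two inputs slips through, but that part is a guess and not something confirmed from the JSON.NET source.

```csharp
using System;
using System.Numerics;

public static class OverflowCheck
{
    // Reduces a decimal literal modulo 2^64, which is what unchecked
    // 64-bit accumulation of the digits would produce.
    public static ulong WrapTo64Bits(string digits)
    {
        BigInteger value = BigInteger.Parse(digits);
        BigInteger modulus = BigInteger.One << 64;
        return (ulong)(value % modulus);
    }

    // Reinterprets the wrapped bits as a signed 64-bit integer.
    public static long AsSigned(ulong wrapped)
    {
        return unchecked((long)wrapped);
    }

    public static void Main()
    {
        foreach (string s in new[] { "43443333222211111117", "53443333222211111117" })
        {
            ulong wrapped = WrapTo64Bits(s);
            Console.WriteLine(s + " -> " + wrapped + " (as Int64: " + AsSigned(wrapped) + ")");
        }
    }
}
```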
If you use a decimal for the property this will work in all cases where the leading number is not zero, but this isn't a solution. For the short term, use version 5.0.3.
Take this example program to demonstrate the problem.
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Sending in: \n43443333222211111117");
var largeBrokenNumber = JsonConvert.DeserializeObject<Foo>("{\"Blah\": 43443333222211111117 }");
Console.WriteLine(largeBrokenNumber.Blah);
Console.WriteLine();
Console.WriteLine("Sending in: \n53443333222211111117");
var largeOddWorkingNumber = JsonConvert.DeserializeObject<Foo>("{\"Blah\": 53443333222211111117 }");
Console.WriteLine(largeOddWorkingNumber.Blah);
}
}
public class Foo
{
public string Blah { get; set; }
}

How to specify a CodeSet for WChar string from a CORBA client

This question is related to another question with which I have been struggling:
How to access CORBA interface without IDL or late-bound invoke remoting methods
I'm really stumped on how to get past this error about the CodeSet not being specified. I have been tracing into the IIOP code trying to figure out how the CodeSet can be specified, and it looks like it could be specified with a tagged component associated with the profile. Being unfamiliar with CORBA, I don't know what a tagged component is or what a profile is or how to control them, but I suspect that it may be influenced by creating a portable object interceptor, at which point I could add a tagged CodeSet component to the profile, if that means anything. I'm just going by what I can learn from the IIOP.NET code and Google.
Could someone please help me understand and hopefully control this? If the server is a black box and I need to write a client to call a method that outputs a string, how do I tell IIOP.NET what WChar CodeSet to use so it doesn't give me an error about it being unspecified? I tried OverrideDefaultCharSets from the client, but that didn't seem to have any effect. The IIOP.NET sample code for that function shows it being used on the server side.
This was a real pain to work out, but I got it:
class MyOrbInitializer : omg.org.PortableInterceptor.ORBInitializer
{
public void post_init(omg.org.PortableInterceptor.ORBInitInfo info)
{
// Nothing to do
}
public void pre_init(omg.org.PortableInterceptor.ORBInitInfo info)
{
omg.org.IOP.Codec codec = info.codec_factory.create_codec(
new omg.org.IOP.Encoding(omg.org.IOP.ENCODING_CDR_ENCAPS.ConstVal, 1, 2));
Program.m_codec = codec;
}
}
class Program
{
public static omg.org.IOP.Codec m_codec;
static void Main(string[] args)
{
IOrbServices orb = OrbServices.GetSingleton();
orb.OverrideDefaultCharSets(CharSet.UTF8, WCharSet.UTF16);
orb.RegisterPortableInterceptorInitalizer(new MyOrbInitializer());
orb.CompleteInterceptorRegistration();
...
MarshalByRefObject objRef = context.resolve(names);
string origObjData = orb.object_to_string(objRef);
Ch.Elca.Iiop.CorbaObjRef.Ior iorObj = new Ch.Elca.Iiop.CorbaObjRef.Ior(origObjData);
CodeSetComponentData cscd = new CodeSetComponentData(
(int)Ch.Elca.Iiop.Services.CharSet.UTF8,
new int[] { (int)Ch.Elca.Iiop.Services.CharSet.UTF8 },
(int)Ch.Elca.Iiop.Services.WCharSet.UTF16,
new int[] { (int)Ch.Elca.Iiop.Services.WCharSet.UTF16 });
omg.org.IOP.TaggedComponent codesetcomp = new omg.org.IOP.TaggedComponent(
omg.org.IOP.TAG_CODE_SETS.ConstVal, m_codec.encode_value(cscd));
iorObj.Profiles[0].TaggedComponents.AddComponent(codesetcomp);
string newObjData = iorObj.ToString();
MarshalByRefObject newObj = (MarshalByRefObject)orb.string_to_object(newObjData);
ILicenseInfo li = (ILicenseInfo)newObj;
...
}
}
Unfortunately in my case the problem remained that the byte ordering was backwards too, so I had to go with an entirely different solution based on just getting bytes back and manually converting them to a string instead of getting string directly.

How do I modify PDF without a library using C# and stream it back to client in ASP.NET?

I'm having an issue where I'm corrupting a PDF and am not sure of a proper solution. I've seen several posts from people trying to do a basic stream or trying to modify the file with a third-party library. This is how my situation differs...
I have all the web pieces in place to get me the PDF streamed back and it works fine until I try to modify it with C#.
I've modified the PDF in a text editor manually to remove the <> entries and tested that the PDF functions properly after that.
I've then programmatically streamed the PDF in as a byte[] from the database, converted it to a string, and used a RegEx to find and remove the same stuff I tried removing manually.
THE PROBLEM! When I try to convert the modified PDF string contents back into a byte[] to stream back, the PDF encoding no longer seems to be correct. What is the correct encoding?
Does anyone know the best way to do something like this? I'm just trying to keep my solution as light as possible because our site is geared towards PDF document access, so heavy or complex APIs are not preferable unless no other options are available. Also, because this situation only arises when our users view the file in an iframe for "preview", I can't permanently modify the PDF.
Thanks for your help in advance!
Try to use the following BinaryEncoding class as encoding. It basically casts all bytes to chars (and back), so that only ASCII data can correctly be processed as string, but the rest of the data is kept unchanged and nothing is lost as long as you don't use any UNICODE characters > 0x00FF. So for your roundtrip it should work just fine.
public class BinaryEncoding: Encoding {
private static readonly BinaryEncoding @default = new BinaryEncoding();
public static new BinaryEncoding Default {
get {
return @default;
}
}
public override int GetByteCount(char[] chars, int index, int count) {
if (chars == null) {
throw new ArgumentNullException("chars");
}
return count;
}
public override int GetBytes(char[] chars, int charIndex, int charCount, byte[] bytes, int byteIndex) {
if (chars == null) {
throw new ArgumentNullException("chars");
}
if (bytes == null) {
throw new ArgumentNullException("bytes");
}
if (charCount < 0) {
throw new ArgumentOutOfRangeException("charCount");
}
unchecked {
for (int i = 0; i < charCount; i++) {
bytes[byteIndex+i] = (byte)chars[charIndex+i];
}
}
return charCount;
}
public override int GetCharCount(byte[] bytes, int index, int count) {
if (bytes == null) {
throw new ArgumentNullException("bytes");
}
return count;
}
public override int GetChars(byte[] bytes, int byteIndex, int byteCount, char[] chars, int charIndex) {
if (bytes == null) {
throw new ArgumentNullException("bytes");
}
if (chars == null) {
throw new ArgumentNullException("chars");
}
if (byteCount < 0) {
throw new ArgumentOutOfRangeException("byteCount");
}
unchecked {
for (int i = 0; i < byteCount; i++) {
chars[charIndex+i] = (char)bytes[byteIndex+i];
}
}
return byteCount;
}
public override int GetMaxByteCount(int charCount) {
return charCount;
}
public override int GetMaxCharCount(int byteCount) {
return byteCount;
}
}
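To show what the round trip looks like in use, here is a small sketch. Note the substitution: instead of the BinaryEncoding class above, it uses the built-in ISO-8859-1 (Latin-1) encoding, which maps each byte 0x00-0xFF to the character with the same code point and so gives the same byte-for-char behavior. The class name PdfBytePatch and the method RemovePattern are made up for the example, and the pattern is whatever you were removing by hand.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class PdfBytePatch
{
    // ISO-8859-1 maps every byte value to the char with the same code point,
    // so bytes -> string -> bytes is lossless as long as the string is only
    // changed using characters in the range 0x00-0xFF.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // Decodes the raw bytes, removes every match of the pattern,
    // and encodes the result back to bytes.
    public static byte[] RemovePattern(byte[] pdfBytes, string pattern)
    {
        string text = Latin1.GetString(pdfBytes);
        string patched = Regex.Replace(text, pattern, string.Empty);
        return Latin1.GetBytes(patched);
    }
}
```

Keep in mind that removing bytes changes the file length, so byte offsets stored inside the PDF (for instance in its cross-reference table) may no longer line up. Replacing the match with padding of the same length is the more conservative variant.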
You seem to be discovering that...
the PDF format is not trivial!
While it may be OK (yet kludgey) to patch a few "text" bytes in situ (i.e. keeping size and structure unchanged), "messing" much more than that with PDF files typically ends up breaking them. Regular expressions certainly seem to be a blunt tool for the job.
The PDF file needs to be parsed and seen as a hierarchical collection of objects (and then some...), and that's why we need the libraries which encapsulate the knowledge about the format.
If you need convincing, you may peruse the specification for the PDF format (version 1.7, now an ISO standard), available for free on the Adobe web site. BTW, these 750 pages cover the latest version; while there's much overlap, previous versions introduce yet another layer of details to contend with...
Edit:
This said, in re-reading the question, and Lucero's remark, the changes indicated do seem small/safe enough that a "snip and tuck" approach may work.
Beware that this type of approach may lead to issues over time (when the format encountered is of a different version, older or newer, or when the file content somehow causes different structures to be exposed, or...) or with some specific uses (for example, it may prevent users from using some features of the PDF documents, such as forms or security). Maybe a compromise is to learn enough about the format(s) at hand and confirm that the changes are indeed harmless.
Also... while the PDF format is a relatively complicated affair, the libraries that deal with it are not necessarily heavy, and they are typically easy to use.
In short, you'll need to weigh the benefits and drawbacks of both approaches and pick accordingly ;-) (how was that for a "non-answer").
Look into iText. There is a reason why things like the Apache Commons libraries exist.

How can one simplify network byte-order conversion from a BinaryReader?

System.IO.BinaryReader reads values in a little-endian format.
I have a C# application connecting to a proprietary networking library on the server side. The server-side sends everything down in network byte order, as one would expect, but I find that dealing with this on the client side is awkward, particularly for unsigned values.
UInt32 length = (UInt32)IPAddress.NetworkToHostOrder(reader.ReadInt32());
is the only way I've come up with to get a correct unsigned value out of the stream, but this seems both awkward and ugly, and I have yet to test if that's just going to clip off high-order values so that I have to do fun BitConverter stuff.
Is there some way I'm missing short of writing a wrapper around the whole thing to avoid these ugly conversions on every read? It seems like there should be an endian-ness option on the reader to make things like this simpler, but I haven't come across anything.
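On the clipping worry: in an unchecked context, casting an int to a uint keeps all 32 bits, so high-order values survive. The sketch below wraps the one-liner from the question in a helper (the names NetworkOrderCheck and ReadNetworkUInt32 are invented here) and makes the unchecked cast explicit. It assumes a little-endian host, since BinaryReader always reads little-endian while NetworkToHostOrder only swaps bytes on little-endian machines.

```csharp
using System;
using System.IO;
using System.Net;

public static class NetworkOrderCheck
{
    // Reads 4 bytes sent in network (big-endian) order and returns them as
    // an unsigned value. Assumes a little-endian host.
    public static uint ReadNetworkUInt32(BinaryReader reader)
    {
        int hostOrder = IPAddress.NetworkToHostOrder(reader.ReadInt32());
        // unchecked: reinterpret the bits, never throw on "negative" values.
        return unchecked((uint)hostOrder);
    }

    public static void Main()
    {
        // 0xFFFFFFFE in network byte order.
        byte[] wire = { 0xFF, 0xFF, 0xFF, 0xFE };
        using (var reader = new BinaryReader(new MemoryStream(wire)))
        {
            Console.WriteLine(ReadNetworkUInt32(reader));
        }
    }
}
```

The explicit unchecked matters only if the project is compiled with overflow checking turned on. With the default compiler settings the plain cast behaves the same way.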
There is no built-in converter. Here's my wrapper (as you can see, I only implemented the functionality I needed but the structure is pretty easy to change to your liking):
/// <summary>
/// Utilities for reading big-endian files
/// </summary>
public class BigEndianReader
{
public BigEndianReader(BinaryReader baseReader)
{
mBaseReader = baseReader;
}
public short ReadInt16()
{
return BitConverter.ToInt16(ReadBigEndianBytes(2), 0);
}
public ushort ReadUInt16()
{
return BitConverter.ToUInt16(ReadBigEndianBytes(2), 0);
}
public uint ReadUInt32()
{
return BitConverter.ToUInt32(ReadBigEndianBytes(4), 0);
}
public byte[] ReadBigEndianBytes(int count)
{
byte[] bytes = new byte[count];
for (int i = count - 1; i >= 0; i--)
bytes[i] = mBaseReader.ReadByte();
return bytes;
}
public byte[] ReadBytes(int count)
{
return mBaseReader.ReadBytes(count);
}
public void Close()
{
mBaseReader.Close();
}
public Stream BaseStream
{
get { return mBaseReader.BaseStream; }
}
private BinaryReader mBaseReader;
}
Basically, ReadBigEndianBytes does the grunt work, and this is passed to a BitConverter. There will be a definite problem if you read a large number of bytes since this will cause a large memory allocation.
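On newer runtimes there is an alternative to reversing the bytes by hand. This is a sketch, not part of the wrapper above, and it assumes a runtime that ships System.Buffers.Binary (.NET Core 2.1 or later, or the System.Memory package). BinaryPrimitives decodes big-endian values directly, independent of the host's byte order. The class name BigEndian is made up for the example.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

public static class BigEndian
{
    // Reads 4 bytes from the reader and decodes them as a big-endian
    // unsigned 32-bit integer, regardless of the host's endianness.
    public static uint ReadUInt32(BinaryReader reader)
    {
        byte[] bytes = reader.ReadBytes(4);
        if (bytes.Length < 4)
        {
            // ReadBytes returns a shorter array at end of stream.
            throw new EndOfStreamException();
        }
        return BinaryPrimitives.ReadUInt32BigEndian(bytes);
    }

    // Same idea for a signed 16-bit value.
    public static short ReadInt16(BinaryReader reader)
    {
        byte[] bytes = reader.ReadBytes(2);
        if (bytes.Length < 2)
        {
            throw new EndOfStreamException();
        }
        return BinaryPrimitives.ReadInt16BigEndian(bytes);
    }
}
```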
I built a custom BinaryReader to handle all of this. It's available as part of my Nextem library. It also has a very easy way of defining binary structs, which I think will help you here -- check out the Examples.
Note: It's only in SVN right now, but very stable. If you have any questions, email me at cody_dot_brocious_at_gmail_dot_com.
