Passing pickle strings between C# and IronPython - c#

I wrote an IronPython package to do some data-crunching, and now I am wrapping it in a C# application. Part of the application's functionality is to save the state of a project, and then later to restore that saved state.
I am using the pickle module in IronPython to save an object of a class from my custom package. Before I did the wrapping in C#, that was no problem: I used the pickle.dump() function to serialize the object to a file. Now I want to use the pickle.dumps() function to serialize the object to a string, then pass that string to a C# object and serialize THAT object with an XmlSerializer.
Serialization seems to work, but deserialization breaks down: C# gets the deserialized string and passes it to IronPython, which should be able to reconstitute the original object with the pickle.loads() function, but instead raises this error:
System.Collections.Generic.KeyNotFoundException {"ô"}
Can you help me solve this problem? I have two theories:
1. Perhaps there is a difference in the string encodings between IronPython and C#, or between C#, IronPython, and what is expected by the pickle module?
2. The entire string is not being serialized in the first place, so I am just passing nonsense to pickle.loads().
My evidence that leads me to these theories:
The missing key in the error message (ô) looks like a bit of unicode-parsed-as-ASCII-text.
If I break execution (in debug mode of Visual Studio 2010) and look at the string before it is passed to IronPython for unpickling, what I see is not long enough to represent the entire object. But it might just reach the limit of what the Visual Studio debugger will display.
Thanks in advance!

Short answer: pickling with protocol=-1 works for me where the default, human-readable protocol does not.
Long answer: It looks like theory #1 is substantially correct. I use a dictionary of large numpy arrays to store my data, and even using the default "human-readable" pickling protocol, they are pickled to a binary-like form. Notice the fourth line from the bottom (this is pulled from a pickled file that can be successfully unpickled, but it gives an idea of what I see in the files that can't):
(g10
(I0
tp25
g12
tp26
Rp27
(I1
(I2
tp28
g19
I00
Vq=\u000a×£°(#R¸\u2026ëQ.#
p29
tp30
bssb.
There is the potential for this to cause an error somewhere in the chain of:
1. pickling to a string in IronPython
2. passing that string to a C# object
3. serializing that object with an XmlSerializer
4. deserializing the XML to get back a C# object
5. passing the string that represents the pickled object back to IronPython
6. unpickling the string in IronPython
Exactly where the error occurs, I don't yet know. I have worked around the problem by pickling the IronPython object with protocol=-1, which turns the object into a string of binary gibberish that can survive the process.
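For illustration, here is a minimal sketch of the workaround on the Python side. It assumes CPython 3, where pickle.dumps() returns bytes (in IronPython 2.7 it returns a str). The base64 step is an extra precaution of my own and not part of what I actually did: it keeps the payload ASCII-only, which should make it safe to carry in a C# string and through XML. The helper names and the sample object are hypothetical.

```python
import base64
import pickle


def pickle_to_text(obj):
    """Pickle obj with the highest binary protocol and return ASCII-only text."""
    # protocol=-1 selects the highest protocol available to this interpreter.
    raw = pickle.dumps(obj, protocol=-1)
    # Assumption: base64 is added here so the payload contains only ASCII characters.
    return base64.b64encode(raw).decode("ascii")


def unpickle_from_text(text):
    """Reverse pickle_to_text: decode the base64 text and unpickle the bytes."""
    raw = base64.b64decode(text.encode("ascii"))
    return pickle.loads(raw)


# Hypothetical stand-in for the real project state.
project_state = {"name": "demo", "samples": [0.5, 1.25, 2.0]}
text = pickle_to_text(project_state)
restored = unpickle_from_text(text)
```

The C# side would then only ever see and serialize the text value, never the raw pickle.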

Related

How to use a 'hard-coded' dictionary/enum

I want to create a 'dictionary' of strings; however, I have only ever learned how to use strings to reference what I want in a dictionary. I want something with more auto-correct (as typos can happen in a large table of strings), which is why I want to know how to hard-code. (The values of the strings will be retrieved from a text file, like JSON.)
I notice that Microsoft uses some type of hard-coding in their String Resource File.
So instead of doing:
string result = strings["Hello"];
I wish to do this:
string result = strings.Hello;
The only thing I can think of is to use some external tool that creates an enum/struct script with the values from the text file. Is there a better option, perhaps one built into .NET?
Edit: I think 'strongly-typed' would be a better description over 'hard-coded'.
Edit 2: Thanks for all the comments and answers. By the looks of it, some code generation is required to fulfil this result. I wonder if there are already any tools out there that do this for you (I tried looking but my terminology may be lacking). It doesn't seem too difficult to create this tool.
There are compile-time constants and runtime constants.
Your wish for autocorrection/IntelliSense support requires compile-time constants. Those are the only ones that IntelliSense, syntax highlighting, and the compiler double-check for you.
But your requirement of having the values generated from a third-party text file indicates either a runtime constant or some automatic code generation. Runtime constants would take away the editor support, while code generation would run into the issue of the editor only having an old copy of the file, with a high risk of breaking tons of code if a string in that one file changes.
So your two requirements are inherently at odds. You want to have your cake and eat it too.
Perhaps my primitive solution to the Enum/ToString() problem might help you?
Enumerations are for the most part groups of constants, integer ones by default, with added type checks on assignment. That makes them a good way around Primitive Obsession. You reference a value from the group like you would any constant, readonly static field, or readonly property. (There are other advantages like Flags, but I doubt they matter here.)
While enums have a string you could use for display and input parsing (the name you use in source code), that one is absolutely not suited for display. By default they are all-caps and you would need to support localisation down the line. My primitive solution was a translation layer: I add a Dictionary<someEnum, String> SomeEnumStringRepresentation. This dictionary can be generated and even changed at runtime:
When I need to display a specific value, it is SomeEnumStringRepresentation[someEnum]. I could add a default behavior that just calls ToString() on the compiler representation of the enum.
When I need to parse user input, I iterate over the values until I find a match, and throw a ParseException if there is none.
I get to use compile-time checks, without having to deal with the very immutable compile-side strings anywhere else, or with my code-side strings changing all the time.
I don't quite understand what output you want, but I'll throw out an idea: how about extending the string class with your own methods, so that strings.Hello returns what you want?
Example:
https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/extension-methods

Seeking guidance reading .yaml files with C#

Two months later:
The YAML file I tried to parse (the Eve Online blueprint.yaml) changed a great deal, which also made it much easier to parse using the deserializer. If someone (for whatever reason) would like to see the code, it's updated on https://github.com/hkraal/ParseYaml
Based on Steve Wellens's comment I've adjusted the code to do fewer things at once. It made no difference to the error itself. I've created another project (Example1) in my solution to test the actual example found on aaubry.net that I referred to earlier.
It gave me the same error when using a "dynamic" key, which led to my current conclusion:
There is a difference between:
items:
  - part_no: A4786
and
items:
  part_no: A4786
The first is being used in the example which I (wrongly) assumed I could apply to my .yaml file which is using the second syntax.
Now it remains to find out how I can get the 'child' elements of my key with the syntax used in my yaml file...
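To make the difference concrete, here is a small sketch in Python rather than C#, so it does not use the YamlDotNet API. The two dictionaries are hand-written stand-ins for what a YAML loader would produce for each syntax: a sequence under items in the first case, and a mapping with part_no nested directly under items in the second. The function name is hypothetical.

```python
# Shape produced by the first syntax: "items" holds a sequence of mappings.
with_sequence = {"items": [{"part_no": "A4786"}]}

# Shape produced by the second syntax: "items" holds a single mapping.
with_mapping = {"items": {"part_no": "A4786"}}


def part_numbers(document):
    """Collect part numbers from either shape by checking the node type first."""
    items = document["items"]
    if isinstance(items, list):
        # Sequence node: iterate over the child mappings.
        return [entry["part_no"] for entry in items]
    if isinstance(items, dict):
        # Mapping node: the key is read directly, there is nothing to iterate.
        return [items["part_no"]]
    raise TypeError("unexpected node type: %s" % type(items).__name__)


from_sequence = part_numbers(with_sequence)
from_mapping = part_numbers(with_mapping)
```

Treating the mapping as if it were a sequence is the same mistake that a cast to the wrong node type makes.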
As C# is used at work, I started thinking about a nice project to learn about various aspects of the language while having a direct goal to work towards. However, I'm hitting my first wall quite early in my project: parsing a YAML file. My goal is to create a List of YamlBlueprint objects as defined in YamlBlueprint.cs, but I don't even get to the end of the YAML file.
I've set up a test case on GitHub which demonstrates the problem:
https://github.com/hkraal/ParseYaml
The example on http://www.aaubry.net/page/YamlDotNet-Documentation-Loading-a-YAML-stream works up until I want to loop through the items. Based on what I see, I should be able to pass myKey as a parameter to YamlScalarNode() to access the items below it.
var items = (YamlSequenceNode)mapping.Children[new YamlScalarNode(myKey)];
I'm getting the following error if I do:
An unhandled exception of type 'System.InvalidCastException' occurred in yamldotnet.exe
Additional information: Unable to cast object of type 'YamlDotNet.RepresentationModel.YamlMappingNode' to type 'YamlDotNet.RepresentationModel.YamlSequenceNode'.
When passing "items" as a parameter to YamlScalarNode() it just complains about the item not being there, which is to be expected. As I'm not sure where my train of thought is going wrong, I would love a bit of assistance on how to troubleshoot this further.
Your question has already been correctly answered, but I would like to point out that your approach is probably not the best one for parsing files. The YamlDotNet.RepresentationModel.* types offer an object model that directly represents the YAML stream and its various parts. This is useful if you are creating an application that processes or generates YAML streams.
When you want to read a YAML document into an object graph, the best approach is to use the Deserializer class. With it you can write your code as follows:
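// Requires: using System.Collections.Generic; using System.IO; using YamlDotNet.Serialization;
using (var reader = File.OpenText("blueprints.yaml"))
{
    var deserializer = new Deserializer();
    // The top-level YAML mapping is keyed by blueprint id.
    var blueprintsById = deserializer.Deserialize<Dictionary<int, YamlBlueprint>>(reader);
    // Use the blueprintsById variable
}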
The only difference is that the Id property of the YamlBlueprint instances won't be set, but that's just a matter of adding this:
foreach (var entry in blueprintsById)
{
    entry.Value.Id = entry.Key;
}
You have too much stuff going on in one line of code. Create a new YamlScalarNode object in one line, access the array in another line, cast the resultant object in another line. That way, you'll narrow down the problem area to a single step.
The message is telling you that you are retrieving a YamlMappingNode from the array but you are casting it to a YamlSequenceNode. Which is not allowed since the two types are obviously not related.
Well, that was kind of stupid... it's kind of hard to create a mapping of something which only contains one element. I've edited the repo linked in the OP with a working example in case somebody runs into the same problem.

Use ASP ScriptingContext Request collections in C#?

I am trying to use request elements from an ASP application in a .NET class library that is used in the application. I came across a head scratcher that I can't wrap my head around:
//Context is an ASPTypeLibrary.ScriptingContext
dynamic req = System.EnterpriseServices.ContextUtil.GetNamedProperty("Request");
Context.Response.Write(req.Form("mykey")); //this writes the value I expected
Context.Response.Write(String.Format("{0}", req.Form("mykey"))); //this writes 'System.__ComObject'
Am I going about this all wrong? I was using info I gleaned from this question.
You should note that Request.Form("someKey") is not a string. The source of your confusion, however, originates not in Request.Form("someKey") but on the other side, in Response.Write(...).
There are some automatic conversion shenanigans going on.
Response.Write(...) doesn't take a string. It takes a Variant. The method will do its darnedest to output whatever you pass to it.
If the Variant holds a BSTR (a COM string), it will output that unchanged. It will also try calling VarChangeTypeEx(...) (kind-of; see note below) to try to see if it can get COM to convert it to a BSTR (that's what happens when you pass it a number). If the Variant contains an object with a default method on it ([propvalue]), and it has no better way to output it, it will call the default method and start over with the result of that. I think it has a few other tricks up its sleeve, which are not entirely clearly documented.
At a high level, it should now be clear what's happening. On the first line, req.Form("myKey") returns a COM object, which then gets passed down to Response.Write(...), which then converts that object to a BSTR string and outputs it. On the other hand, when you try to pass req.Form("myKey") to a C# method, the conversion doesn't occur and you get a generic COM object instead, with predictable consequences.
So what is the return value of Request.Form("someKey") then? It's an IRequestDictionary object. And why a dictionary? Because you can submit an http request that has multiple form elements with the same name. This can be the case, for example, when the input elements are checkboxes intended to be overlapping options.
What happens when the form has multiple entries? The conversion process returns a joined string, analogous to String.Join(", ", someArray) in C#.
It's not clear to me whether Response.Write has intimate knowledge of IRequestDictionary (unlikely), or whether it knows about the COM enumerator pattern (more likely) and enumerates the entries to compose the string.
More interesting to me is who is responsible for the conversion process, because VBScript's CStr() will do the same conversion. I had always assumed that CStr() was a thin wrapper around VarChangeTypeEx(...), but I'm pretty sure that VarChangeTypeEx(...) does not concatenate enumerators like that. Obviously CStr() is a lot fancier than I had assumed. I believe that Response.Write simply calls internally whatever API fully implements CStr() and relies on that for the conversion.
For further exploration of the Classic ASP objects and interface, try http://msdn.microsoft.com/en-us/library/ms524856(v=vs.90).aspx instead of the usual VBScript-based descriptions.

How to properly work with non-primitive ClrInstanceField values using ClrMD?

I've got some really large memory dumps of a managed process that I'm trying to get a lot of statistics from--as well as be able to present an interactive view of--fairly deep object graphs on the heap. Think something comparable to !do <address> with prefer_dml 1 set in WinDbg with SOS, where you can continually click on the properties and see their values, only in a much friendlier UI for comparing many objects.
I've found Microsoft.Diagnostics.Runtime (ClrMD) to be particularly well suited for this task, but I'm having a hard time working with array fields and I'm a little confused about object fields, which I have working a little better.
Array:
If I target an array with an address directly off the heap and use ClrType.GetArrayLength and ClrType.GetArrayElementValue things work fine, but once I'm digging through the fields on another object, I'm not sure what value I'm getting from ClrInstanceField.GetValue when the ClrInstanceField.ElementType is ClrElementType.SZArray (I haven't encountered Array digging around in my object graph yet, but I should like to handle it as well).
Edit: I just decided to use the ClrType for System.UInt64 to dereference the array field (using parent address + offset of the array field to calculate the address where the array pointer is stored), then I can work with it the same as if I got it from EnumerateObjects. I am now having some difficulty with some arrays not supporting the ArrayComponentType property. I have yet to test with arrays of Structs so I am also wondering if that will be a C-style allocation of inline structs, as it is with int[] or if it will be an array of pointers to structs on the heap. Guid[] is one of the types I'm having an issue getting the ArrayComponentType from.
Object: Fixed (logic error)
With a ClrInstanceField that has a Type of ClrElementType.Object I get much better results, but still need a little more. Firstly, after calling GetFieldValue I get back a ulong address(?) which I can use ClrInstanceField.Type.Fields against just fine, so I can see the field names and values of the nested object. That said, I have to account for polymorphism, so I tried using ClrHeap.GetObjectType on the same address and it either returns NULL or something completely incorrect. It seems odd that the address would work in my first use case, but not the second.
String: Fixed (found workaround)
Because my real project already uses DbgEng w/ SOS, I have a different way to easily get the value of strings by address, but it seemed very odd that trying to use ClrInstanceField.GetFieldValue succeeded in returning a string, but with completely inaccurate results (a bunch of strange characters). Maybe I'm doing this wrong?
Edit: I have extracted an abstraction that now runs in LINQPad from my original code. It's a bit long to post here, but it's all here in a gist. It's still a little messy from all the copy/paste/refactor and I'll be cleaning it up further and likely posting the final source on either CodePlex or GitHub after I've got these issues fixed.
The code base is fairly large and specific to a project, but if it's absolutely necessary I may be able to extract out a sample set. That said, all access to the ClrMD objects is fairly simple. I get the initial addresses from SOS commands like !dumpheap -stat (which works fine for the root objects) and then I use ClrHeap.GetTypeByName or ClrHeap.GetObjectType. After that it relies exclusively on ClrType.Fields and the ClrInstanceField members Type, ElementType, and GetFieldValue.
As an added bonus, I did find a browser friendly version of the XML Docs provided with the NuGet package, though it's the same documentation IntelliSense provides.
It's going to be hard to answer very precisely without seeing what your code looks like, but basically, it goes like this:
The first thing you need to know in order to be able to call GetFieldAddress/GetFieldValue is if the object address you have is a regular pointer or an interior pointer. That is, if it directly points to an object on the heap, or to an interior structure within an actual object (think String vs. Struct field within an actual object).
If you're getting the wrong values out of GetFieldAddress/GetFieldValue, it usually means you're not specifying that you have an interior pointer (or you thought you had one when you didn't).
The second part is understanding what the values mean.
If field.IsPrimitive() is true: GetFieldValue() will get you the actual primitive value (i.e. an Int32, Byte, or whatever)
If field.IsValueClass() is true, then GetFieldAddress() will get you an interior pointer to the structure. Thus, for any GetFieldAddress()/GetFieldValue() calls you make on that address, you need to tell it that it is an interior pointer!
If field.ElementType is ClrElementType.String, then I seem to remember that calling GetFieldValue() will get you the actual string contents (I need to check, but this should be it).
Otherwise, you have an object reference, in which case GetFieldValue() will get you a regular pointer to the new reference object.
Does this make sense?

Understanding C# strings reconstructed by Reflector

I have been asked to help with a C# project where the source code is no longer available. Fortunately a non-obfuscated debug build of the project is available, so I ran it through Reflector and the reconstructed source code looks largely fine.
There is one oddity that I have a question about. Some objects that pretty clearly should be a string are coming out like this:
string str7 = new string();
str7.Value = strArray3[k];
Now, string does not have a parameterless constructor nor does it have a Value property. I think I can just remove the instantiation and remove the .Value property and things will probably work as expected, but I would like to understand if there might be something more going on than a Reflector bug.
One other interesting piece is that almost all of the variables were reconstructed with original-sounding names, but this one (and a few others) seem to have been assigned random names.
Any insight is very welcome.
Can you post both the IL and decompiled C# for the same method where this happens?
There isn't by chance a "class string { ... }" in that assembly, is there?
