Security implications of deserializing arbitrary objects from YAML with YamlDotNet

I'm currently evaluating this library from a security perspective. Since YAML can be used to serialize objects, I was wondering whether the parser's defaults prevent the deserialization of arbitrary objects.
Putting it differently:
What steps are needed to allow the creation of any type of object, given a YAML string containing the definition of a specific type?
From what I could tell from the documentation, this behaviour is not easily reproducible, since the deserializer expects the type of the object to populate with the given data.
As a second, related question: are validation checks that run when an object is created (e.g. an "Age" property that cannot be larger than 130 or smaller than 0), or that live in a getter/setter pattern, applied during deserialization, or is it possible to create objects that contain unexpected data that way?
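The second question comes down to how a deserializer writes values into the object. The sketch below does not call YamlDotNet at all; it uses plain reflection to contrast the two paths a reflection-based deserializer could take, under the assumption that the library in question populates objects through public property setters. The `Person` class and the helper names are hypothetical.

```csharp
using System;
using System.Reflection;

public class Person
{
    private int _age;

    public int Age
    {
        get { return _age; }
        set
        {
            // Validation lives in the setter, as in the question.
            if (value < 0 || value > 130)
                throw new ArgumentOutOfRangeException(nameof(value), "Age must be between 0 and 130.");
            _age = value;
        }
    }
}

public static class SetterValidationDemo
{
    // Assigns through the public setter via reflection.
    // Returns true if the setter's validation rejected the value.
    public static bool SetterRejects(int age)
    {
        var person = new Person();
        PropertyInfo property = typeof(Person).GetProperty(nameof(Person.Age));
        try
        {
            property.SetValue(person, age);
            return false;
        }
        catch (TargetInvocationException ex) when (ex.InnerException is ArgumentOutOfRangeException)
        {
            return true;
        }
    }

    // Writes the private backing field directly, so the setter never runs.
    public static int WriteFieldDirectly(int age)
    {
        var person = new Person();
        FieldInfo field = typeof(Person).GetField("_age", BindingFlags.Instance | BindingFlags.NonPublic);
        field.SetValue(person, age);
        return person.Age;
    }

    public static void Main()
    {
        Console.WriteLine(SetterRejects(200));      // setter path: value is rejected
        Console.WriteLine(WriteFieldDirectly(200)); // field path: invalid value is stored
    }
}
```

Checks placed in a setter only help if the deserializer goes through that setter. Checks placed in a constructor, or values written straight to fields, behave differently, so it is worth confirming which path the library actually uses before relying on them.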

Related

How to validate a dynamic (Expando) object?

I'm looking for a way to validate dynamic objects. The dynamic objects are used in an internal API. I know that I can
convert to XML and use XSD for validation
convert to JSON and use Json.NET .IsValid(schema) or the Json.NET Schema package
validate each property in code like HasProperty as shown here
The XML- and JSON-based approaches require serialization and deserialization, which will impact performance when used frequently. The code-based validation is difficult to maintain (in particular for hierarchical objects) and can't easily be used by API consumers.
I was looking to find something similar to XSD validation but natively for dynamic objects but had no luck. I'm fine if it is limited to ExpandoObjects.
My minimum requirements:
identify missing required properties
identify properties which are not specified
allow optional properties
allow nested objects
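
The four requirements above can be sketched directly against `ExpandoObject`, which implements `IDictionary<string, object>` and so can be inspected without a serialization round trip. This is only a minimal illustration: `PropertyRule`, `ExpandoValidator` and the error messages are hypothetical names invented for this example, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

// Hypothetical schema node for this sketch; not part of any library.
public sealed class PropertyRule
{
    public bool Required { get; set; }

    // When set, the value must be an ExpandoObject matching this nested schema.
    public Dictionary<string, PropertyRule> Nested { get; set; }
}

public static class ExpandoValidator
{
    public static List<string> Validate(
        ExpandoObject obj, Dictionary<string, PropertyRule> schema, string path = "")
    {
        var errors = new List<string>();
        var values = (IDictionary<string, object>)obj;

        foreach (var entry in schema)
        {
            string fullName = path + entry.Key;
            object value;
            if (!values.TryGetValue(entry.Key, out value))
            {
                // Optional properties may be absent; required ones may not.
                if (entry.Value.Required) errors.Add("missing required property: " + fullName);
                continue;
            }
            if (entry.Value.Nested != null)
            {
                var child = value as ExpandoObject;
                if (child != null)
                    errors.AddRange(Validate(child, entry.Value.Nested, fullName + "."));
                else
                    errors.Add("expected nested object: " + fullName);
            }
        }

        // Anything present on the object but absent from the schema is reported.
        foreach (string name in values.Keys)
            if (!schema.ContainsKey(name)) errors.Add("unexpected property: " + path + name);

        return errors;
    }

    public static List<string> Demo()
    {
        var schema = new Dictionary<string, PropertyRule>
        {
            ["Id"] = new PropertyRule { Required = true },
            ["Customer"] = new PropertyRule
            {
                Required = true,
                Nested = new Dictionary<string, PropertyRule>
                {
                    ["Name"] = new PropertyRule { Required = true },
                    ["Nickname"] = new PropertyRule { Required = false }
                }
            }
        };

        dynamic order = new ExpandoObject();
        order.Id = 42;
        order.Customer = new ExpandoObject();
        order.Customer.Nickname = "sam"; // Customer.Name is deliberately missing
        order.Extra = true;              // not described by the schema

        return Validate((ExpandoObject)order, schema);
    }

    public static void Main()
    {
        foreach (string error in Demo()) Console.WriteLine(error);
    }
}
```

The sketch checks only presence and nesting. Type or range checks per property would need an extra field on the rule, for example a predicate.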

How to deserialize JSON objects (of a schema given at runtime) to serializable C# objects?

Given a JSON schema (supplied at runtime), I need a function in C# that takes a string as an argument, validates that the string is a valid JSON object of the given schema, and returns a C# serializable object.
By "C# serializable object" I mean an object that can be added to a ValueSet.
Problem 1:
I know, with Json.NET I can deserialize JSON objects, but this works only with classes that are defined at compilation time, whereas my JSON schema is given at runtime, so the classes are not already defined at compilation time!
Problem 2:
Json.NET does not return serializable objects, i.e. the objects returned by Json.NET cannot be added to a ValueSet!

While it doesn't exactly solve your problems, there may be a workaround: composition.
Create a wrapper class that implements ISerializable and has a dynamic property. Deserialize your object and store it in this property. You'll have to supply the de/serialization code for your object, but that should be straightforward: just have the JSON serializer do the work.
The trouble you're experiencing is that you're asking a library that's written as generically as possible to do something very specific. You'd have a hard time finding a library that does this. (Though I think it's a good idea, and I may add it to my type generation feature.)
(Shameless plug) Also, have you looked at Manatee.Json?

Why does C# add `xsi:type` to serialized objects? And where is the corresponding schema?

When I serialize a C# object (using XmlSerializer) that contains properties pointing to other objects that are statically typed with an abstract class type, the resulting XML contains elements like these:
<AbstractBaseClass xsi:type="ConcreteClass" />
So, when I don't specify [XmlArrayItem(typeof(ConcreteClass))] or the likes, this is intended behavior.
But since the resulting element refers to a schema type via the xsi:type attribute, there should be a means to actually produce such a schema type automatically (XSD file), right? Otherwise, any schema validator will complain that the elements refer to a non-existent type, like so (Altova XMLSpy 2016 integrated validator):
So, either of the following should be possible:
Is there a way to instruct the XmlSerializer to use a different attribute to store the concrete type?
Is there a way to automatically create the XSD file that actually contains all the type information?
I am interested in both solutions, if they exist.
UPDATE
For the second question, there seems to be a possibility: using the xsd.exe tool from Microsoft. So that narrows the question down to the first point: use a different attribute altogether in order to avoid the validation completely (the preferred way for me).
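
The behaviour described in the question can be reproduced with a small sketch. It does not show a way to rename the `xsi:type` attribute; it only contrasts a property with no declared element type against one where the concrete type is declared through `[XmlElement]`, the single-element counterpart of the `[XmlArrayItem]` mentioned above. `Shape`, `Circle` and `Drawing` are hypothetical types made up for the example.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlInclude(typeof(Circle))] // lets the serializer accept Circle where Shape is declared
public abstract class Shape { }

public class Circle : Shape
{
    public double Radius { get; set; }
}

public class Drawing
{
    // No element type declared: the concrete type is recorded as xsi:type.
    public Shape Implicit { get; set; }

    // Element name and concrete type declared: the element name identifies the type.
    [XmlElement("Circle", typeof(Circle))]
    public Shape Declared { get; set; }
}

public static class XsiTypeDemo
{
    public static string Serialize()
    {
        var drawing = new Drawing
        {
            Implicit = new Circle { Radius = 1 },
            Declared = new Circle { Radius = 2 }
        };
        var serializer = new XmlSerializer(typeof(Drawing));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, drawing);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Serialize());
    }
}
```

In the output, the `Implicit` property should appear as an `<Implicit xsi:type="Circle">` element, while the `Declared` property should appear as a plain `<Circle>` element with no `xsi:type` attribute.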

JSON Serialization Output Contains Count Property

I'm trying to serialize an array of objects into JSON in C#. By array I mean something like Object[] (not Array<Object>). I'm using a JsonMediaTypeFormatter as part of MVC (the serialization happens automatically as part of the framework, but I can override it). The output contains {"count":2,"value":[{...},{...},...]}, where each ... is the JSON representation of an object. I've looked around and haven't found much information about suppressing this behavior. I want the output to be just [{...},{...},...] rather than an object with count and value properties. Does anyone know how to achieve this without manually writing the serialization code?

You could consider an alternative framework such as Json.NET. I don't know how much you can customize if you are using a built-in .NET object, since it has public properties that are not being ignored. I haven't used the JsonMediaTypeFormatter much, but if it allows you to ignore properties, consider overriding List or ArrayList to hide certain attributes.
I would recommend not returning an array directly, as there is a security flaw that could be exploited in a client browser (if that is the consumer). See this reference to find out more.

Deserialize an object graph with private members in C#

I want to deserialize an object graph in C#. The objects in the graph will have object and collection properties, and some of the properties may be private, but I do not need to worry about cyclic object references. My intent is to use the deserialized object graph as test data while an application is being built, so the objects need to be deserializable from XML before any serialization has taken place. I would like it to be as easy as possible to freely edit the XML to vary the objects that are constructed. I do not want the deserialization process to require nested loops or nested LINQ to SQL statements for each tier in the object graph.
I found the DataContractSerializer lacking. It can indeed deserialize to private fields and to properties with a private setter, but it appears to be incredibly brittle with regard to the XML input. All it takes is for an element in the XML to be slightly out of order and it fails. What's more, the order in which it expects the data does not necessarily match the order in which the members are declared in the class, making it impossible to determine what XML will work without already having the data in the objects, so that you can serialize it and check what it expects.
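On the ordering point: by default the DataContractSerializer expects data members that have no explicit order in alphabetical order by name, which is why the expected sequence can differ from the declaration order. A minimal sketch of pinning the sequence with `DataMember.Order`, so that hand-written XML can be authored without serializing first, might look like the following. The `Employee` type is hypothetical, and the empty contract namespace is a choice made here to keep the hand-written XML free of `xmlns` attributes.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

// Namespace = "" keeps the hand-written XML free of xmlns declarations.
[DataContract(Name = "Employee", Namespace = "")]
public class Employee
{
    // Private fields are populated directly; no public setter is needed.
    // Order makes the expected element sequence explicit: Name, then Age.
    [DataMember(Name = "Name", Order = 1)]
    private string _name;

    [DataMember(Name = "Age", Order = 2)]
    private int _age;

    public string Name { get { return _name; } }
    public int Age { get { return _age; } }
}

public static class DataContractDemo
{
    public static Employee Load(string xml)
    {
        var serializer = new DataContractSerializer(typeof(Employee));
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            return (Employee)serializer.ReadObject(reader);
        }
    }

    public static void Main()
    {
        Employee employee = Load("<Employee><Name>Ada</Name><Age>36</Age></Employee>");
        Console.WriteLine(employee.Name + " " + employee.Age);
    }
}
```

This makes the required order predictable, but it does not make the serializer order-insensitive: elements still have to appear in the declared sequence.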
The XmlSerializer does not appear to be able to deserialize into non-public members of any kind.
Since the purpose is to generate test input data for what might be quite simple applications during development, I'd rather not have to resort to heavyweight ORM technologies like Entity Framework or NHibernate.
Is there a simple solution?
[Update]
@Chuck Savage
Thanks very much for your reply. I'm responding in this edit due to the comment character limit.
In the technique you suggested, the logic to deserialize each tier of the object hierarchy is maintained in each class, so in a sense you do have nested LINQ to SQL, just spread out across the various classes involved. This technique also keeps, in each class, a reference to the XElement from which each object gets its values, so in that sense it isn't so much deserializing as creating a wrapper around the XML. In the scenario I have in mind, I'd ideally like to deserialize the actual business objects the application will use, so an XML wrapper object like this wouldn't work very well, since it would require a distinctly different implementation for test usage compared to production usage.
What I'm really after is something akin to what the XmlSerializer can do, but which can also deserialize private fields (or at least properties with no setter). The reason is that the XmlSerializer does what it does with minimal impact on the 'normal' production use of the classes involved (and hence no impact on their implementation).

How about something like this: https://stackoverflow.com/a/10158569/353147
You will have to create your own boilerplate code to go back and forth to xml, but with the included extensions that can be minimized.
Here is another example: https://stackoverflow.com/a/9035905/353147
You can also search my answers on the topic with: user:353147 XElement in the StackOverflow search.
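
As a rough idea of what such boilerplate could look like, here is a minimal LINQ to XML sketch. It is not the code from the linked answers and does not use their extension methods; the `Order` type and its `FromXml` factory are hypothetical. Because the factory is a member of the class, it can assign properties with private setters, and because it looks elements up by name, the element order in the XML does not matter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Order
{
    public string Customer { get; private set; }
    public IReadOnlyList<string> Items { get; private set; }

    // Hand-written mapping from XML to the object.
    public static Order FromXml(XElement element)
    {
        return new Order
        {
            Customer = (string)element.Element("Customer"),
            Items = element.Elements("Item").Select(item => (string)item).ToList()
        };
    }
}

public static class XElementDemo
{
    public static Order Load(string xml)
    {
        return Order.FromXml(XElement.Parse(xml));
    }

    public static void Main()
    {
        // Customer appears between the items; lookup by name still finds it.
        Order order = Load("<Order><Item>a</Item><Customer>Bob</Customer><Item>b</Item></Order>");
        Console.WriteLine(order.Customer + ": " + string.Join(", ", order.Items));
    }
}
```

The cost is that every class needs its own mapping method, which is the trade-off raised in the update to the question.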
