Serializing xml to object while keeping the original xml? - c#

I'm using an external API to receive xml and serializing this to an object but I want a way to be able to keep the original xml used to serialize for debugging and auditing.
Here's a sample of how I'm serializing:
XmlReader reader = this.Execute(url);
return Read<Property>(reader, "property");
Extract of Execute() routine:
StringBuilder sb = new StringBuilder();
Stream s = response.GetResponseStream();
XmlReader reader = XmlReader.Create(s);
return reader;
Read() simply wraps up the native xml serialization:
private T Read<T>(XmlReader reader, string rootElement)
{
XmlRootAttribute root = new XmlRootAttribute();
root.ElementName = rootElement;
root.IsNullable = true;
XmlSerializer xmlSerializer = new XmlSerializer(typeof(T), root);
object result = xmlSerializer.Deserialize(reader);
return (T)result;
}
I've had a look around at it appears once you've used the reader, you can't use it again (forward only reading stream?). Without trying to change to much, how can I extract the contents of the reader as xml while still benefiting from the built in serialization with the reader?
What would be nice is to adjust Read with an out param:
private T Read<T>(XmlReader reader, string rootElement, out string sourceXml);

You did not share the code for this.Execute(url), but presumably you build a reader from a stream. First write that stream to a string, then use it somewhere. If the stream is not seekable, dispose it and create a new stream from it.
Also, note that XmlSerializer can take a stream instead of a reader, so you could never bother with the reader and just pass streams among your methods.

Use fiddler.
Fiddler is a Web Debugging Proxy which logs all HTTP(S) traffic between your computer and the Internet. Fiddler allows you to inspect traffic, set breakpoints, and "fiddle" with incoming or outgoing data. Fiddler includes a powerful event-based scripting subsystem, and can be extended using any .NET language.

Related

ReadAllTextFile vs. StreamReader for JSON Parsing

This is more of a theoretical question, but I am curious what the difference between these two methods of reading a file is and why I would want to choose one over the other.
I am parsing a JSON configuration file (from local disk). Here is one method of doing it:
// Uses JSON.NET Serializer + StreamReader
using(var s = new StreamReader(file))
{
var jtr = new JsonTextReader(sr);
var jsonSerializer = new JsonSerializer();
return jsonSerializer.Deserialize<Configuration>(jtr);
}
...and, the second option...
// Reads the entire file and deserializes.
var json = File.ReadAllText(file);
return JsonConvert.DeserializeObject<JsonVmrConfigurationProvider>(json);
Is one any better than the other? Is there a case where one or the other should be used?
Again, this is more theoretical, but, I realized I don't really know the answer to it, and a search online didn't produce results that satisfied me. I could see the second being bad if the file was large (it isn't) since it's being read into memory in one shot. Any other reasons?
by reading the code you found that deserializaton from a string finally reach:
public static object DeserializeObject(string value, Type type, JsonSerializerSettings settings)
{
ValidationUtils.ArgumentNotNull(value, "value");
JsonSerializer jsonSerializer = JsonSerializer.CreateDefault(settings);
// by default DeserializeObject should check for additional content
if (!jsonSerializer.IsCheckAdditionalContentSet())
jsonSerializer.CheckAdditionalContent = true;
using (var reader = new JsonTextReader(new StringReader(value)))
{
return jsonSerializer.Deserialize(reader, type);
}
}
That is the creation of a JsonTextReader.
So the difference seems to effectively be the handling of huge files.
-- previous comment temper:
JsonTextReader overrides JsonReader.Close() and handles the stream (if CloseInput is true), but not only.
CloseInput should be true by default as the StringReader is not explicitly disposed in previous fragment of code
With File.ReadAllText(), the entire JSON needs to be loaded into memory first before deserializing it. Using a StreamReader, the file is read and deserialized incrementally. So if your file is huge, you might want to use the StreamReader to avoid loading the whole file into memory. If your JSON file is small (most cases), it doesn't really make a difference.

Big Data Xml file (file size over 20GB) convert to Json File

I have an XML file. I want to convert it to JSON with C#. However, the XML file is over 20 GB.
I have tried to read XML with XmlReader, then append every node to a JSON file. I wrote the following code:
var path = #"c:\result.json";
TextWriter tw = new StreamWriter(path, true, Encoding.UTF8);
tw.Write("{\"A\":");
using (XmlTextReader xmlTextReader = new XmlTextReader("c:\\muslum.xml"))
{
while (xmlTextReader.Read())
{
if (xmlTextReader.Name == "A")
{
var xmlDoc = new XmlDocument();
var v = xmlTextReader.ReadInnerXml();
string json = Newtonsoft.Json.JsonConvert.SerializeXmlNode(xmlDoc, Newtonsoft.Json.Formatting.None, true);
tw.Write(json);
}
}
}
tw.Write("}");
tw.Close();
This code not working. I am getting error while converting json. Is there any best way to perform the conversion?
I would do it the following way
generate classes out of xsd schema using xsd.exe
open file and read top level ( i e your document level ) tags one by one (with XmlTextReader or XmlReader)
serialize each tag into object using generated classes
deserialize resulting object to json and save to whatever
consider saving in batches of 1000-2000 tags
you are right about serialize/deserialize being slow. still doing work in several threads, preferably using TPL will give you good speed. Also consider using json.net serializer, it is really a lot faster than standard ones ( it is standard for web.api though)
I can put some code snippets in the morning if you need them.
We are processing big ( 1-10gigs) files this manner in order to save data to sql server database.

Getting JSON data from a response stream and reading it as a string?

I am trying to read a response from a server that I receive when I send a POST request. Viewing fiddler, it says it is a JSON response. How do I decode it to a normal string using C# Winforms with preferably no outside APIs. I can provide additional code/fiddler results if you need them.
The fiddler and gibberish images:
The gibberish came from my attempts to read the stream in the code below:
Stream sw = requirejs.GetRequestStream();
sw.Write(logBytes, 0, logBytes.Length);
sw.Close();
response = (HttpWebResponse)requirejs.GetResponse();
Stream stream = response.GetResponseStream();
StreamReader sr = new StreamReader(stream);
MessageBox.Show(sr.ReadToEnd());
As mentioned in the comments, Newtonsoft.Json is really a good library and worth using -- very lightweight.
If you really want to only use Microsoft's .NET libraries, also consider System.Web.Script.Serialization.JavaScriptSerializer.
var serializer = new System.Web.Script.Serialization.JavaScriptSerializer();
var jsonObject = serializer.DeserializeObject(sr.ReadToEnd());
Going to assume (you haven't clarified yet) that you need to actually decode the stream, since A) retrieving a remote stream of text is well documented, and B) you can't do anything much with a non-decoded JSON stream.
Your best course of action is to implement System.Web.Helpers.Json:
using System.Web.Helpers.Json
...
var jsonObj = Json.Decode(jsonStream);

Receiving post data with different encodings

I am trying to integrate with a third-party system and in the documentation is mentions that when they send xml data via HttpPost, they sometimes use "text/xml charset=\"UTF-8**"" for the "Content-Type", and in other cases they use "**application/x-www.form-urlencoded" as the Content-Type.
Would there be any differences in parsing the request? Right now I just pull the post data using the folllowing code:
StreamReader reader = new StreamReader(Request.InputStream);
String xmlData = reader.ReadToEnd();
When you open the stream reader, you should pass the encoding specified on the HttpRequest object.
StreamReader reader = new StreamReader(request.InputStream, request.ContentEncoding);
string xmlData = reader.ReadToEnd();
This should allow you to get the original contents of the request into a proper .NET string regardless of whatever encoding is used.
Always give preference to use Encoding.UTF8. This will ensure that, in most cases, the reading is always done in a correct coding standard.
StreamReader sr = new StreamReader(Request.InputStream, Encoding.UTF8);
Hope it helps.
You can pass an encoding to your StreamReader at construction like so:
StreamReader s = new StreamReader(new FileStream(FILE), Encoding.UTF8);
application/x-www.form-urlencoded is HTTP Form Data, not XML.
Your code would most likely fail if you expect that Request.InputStream will be a parsable XML string when the Content-Type is application/x-www.form-urlencoded

Issue using XMLSerializer to Serialize and Deserialize

I am working on C# sockets and using XMLSerializer to send and receive data.
The XML data are sent from a server to a client over a network connection using TCP/IP protocol. The XML.Serializer.Serialize(stream) serializes the XML data and send them over the socket connection but when I want to use the XMLSerializer.Deserialize(stream) to read. The sent data returns a xml parse error.
Here is how I'm serializing:
Memory Stream ms = new MemoryStream();
FrameClass frame= new FrameClass ();
frame.string1 = "hello";
frame.string2 = "world";
XmlSerializer xmlSerializer = new XmlSerializer(frame.GetType());
xmlSerializer.Serialize(ms, frame);
socket.Send(ms.GetBuffer(), (int)ms.Length, SocketFlags.None);
Deserializing:
FrameClass frame;
XmlSerializer xml = new XmlSerializer(typeof(FrameClass));
frame= (FrameClass)xml.Deserialize(new MemoryStream(sockCom.SocketBuffer));
listbox1.Items.Add(frame.string1);
listbox2.Items.Add(frame.string2);
I think it has something to do with sending the data one right after another.
Can anyone teach me how to do this properly?
Have you received all of the data before attempting to deserialize (it's not clear from your code). I'd be inclined to receive all of the data into a local string and the deserialize from that rather than attempting to directly deserialize from the socket. It would also allow you to actually look at the data in the debugger before deserializing it.
Try this:
using (MemoryStream ms = new MemoryStream())
{
FrameClass frame= new FrameClass ();
frame.string1 = "hello";
frame.string2 = "world";
XmlSerializer xmlSerializer = new XmlSerializer(frame.GetType());
xmlSerializer.Serialize(ms, frame);
ms.Flush();
socket.Send(ms.GetBuffer(), (int)ms.Length, SocketFlags.None);
}
If you're sending the Frame XML one right after the other, then you're not sending an XML document. The XML Serializer will attempt to deserialize your entire document!
I don't have time to research this now, but look into the XmlReaderSettings property for reading XML fragments. You would then create an XmlReader over the memorystream with those settings, and call it in a loop.
The important thing is to flush the stream. It's also useful to put the stream in a using block to ensure it's cleaned up quickly.
Besides what #John said about the Flush call, your code looks alright.
You say you're sending multiple FrameClass data pieces, then the code should work sending just a single piece of data.
If you need to send multiple data objects, then you cannot send them all in one go, otherwise the deserialization process will stumble over the data.
You could setup some communication between the server & the client so the server knows what it's getting.
client: I have some data
Server: ok I'm ready, send it
client: sends
Server: done processing
repeat process...

Categories