Receiving post data with different encodings - c#

I am trying to integrate with a third-party system and in the documentation is mentions that when they send xml data via HttpPost, they sometimes use "text/xml charset=\"UTF-8**"" for the "Content-Type", and in other cases they use "**application/x-www.form-urlencoded" as the Content-Type.
Would there be any differences in parsing the request? Right now I just pull the post data using the folllowing code:
StreamReader reader = new StreamReader(Request.InputStream);
String xmlData = reader.ReadToEnd();

When you open the stream reader, you should pass the encoding specified on the HttpRequest object.
StreamReader reader = new StreamReader(request.InputStream, request.ContentEncoding);
string xmlData = reader.ReadToEnd();
This should allow you to get the original contents of the request into a proper .NET string regardless of whatever encoding is used.

Always give preference to use Encoding.UTF8. This will ensure that, in most cases, the reading is always done in a correct coding standard.
StreamReader sr = new StreamReader(Request.InputStream, Encoding.UTF8);
Hope it helps.

You can pass an encoding to your StreamReader at construction like so:
StreamReader s = new StreamReader(new FileStream(FILE), Encoding.UTF8);

application/x-www.form-urlencoded is HTTP Form Data, not XML.
Your code would most likely fail if you expect that Request.InputStream will be a parsable XML string when the Content-Type is application/x-www.form-urlencoded

Related

StreamReader fails to detect BOM

I have the following piece of code:
using (StreamReader sr = new StreamReader(path, Encoding.GetEncoding("shift-jis"), true)) {
mCertainFileIsUTFFormat = !sr.CurrentEncoding.Equals(Encoding.GetEncoding("shift-jis"));
mCodingFromBOM = sr.CurrentEncoding;
String line = sr.ReadToEnd();
return line.Split('\n');
}
Basically reading a file and assuming Shift-Jis if there is no BOM. Alas, this method is always, no matter what, returning Shift-JIS encoding, even if the file in question has a BOM within it. Am I doing something wrong here or perhaps there is a known issue? I could always open the file binary and do the work myself, but this is supposed to do what I want :)
You need to call Read of any kind - StreamReader will not detect encoding before reading. I.e. get encoding after your ReadToEnd call:
String line = sr.ReadToEnd();
mCodingFromBOM = sr.CurrentEncoding;
Info: StreamReader.CurrentEncoding
The value can be different after the first call to any Read` method of StreamReader, since encoding autodetection is not done until the first call to a Read method.

Getting JSON data from a response stream and reading it as a string?

I am trying to read a response from a server that I receive when I send a POST request. Viewing fiddler, it says it is a JSON response. How do I decode it to a normal string using C# Winforms with preferably no outside APIs. I can provide additional code/fiddler results if you need them.
The fiddler and gibberish images:
The gibberish came from my attempts to read the stream in the code below:
Stream sw = requirejs.GetRequestStream();
sw.Write(logBytes, 0, logBytes.Length);
sw.Close();
response = (HttpWebResponse)requirejs.GetResponse();
Stream stream = response.GetResponseStream();
StreamReader sr = new StreamReader(stream);
MessageBox.Show(sr.ReadToEnd());
As mentioned in the comments, Newtonsoft.Json is really a good library and worth using -- very lightweight.
If you really want to only use Microsoft's .NET libraries, also consider System.Web.Script.Serialization.JavaScriptSerializer.
var serializer = new System.Web.Script.Serialization.JavaScriptSerializer();
var jsonObject = serializer.DeserializeObject(sr.ReadToEnd());
Going to assume (you haven't clarified yet) that you need to actually decode the stream, since A) retrieving a remote stream of text is well documented, and B) you can't do anything much with a non-decoded JSON stream.
Your best course of action is to implement System.Web.Helpers.Json:
using System.Web.Helpers.Json
...
var jsonObj = Json.Decode(jsonStream);

Serializing xml to object while keeping the original xml?

I'm using an external API to receive xml and serializing this to an object but I want a way to be able to keep the original xml used to serialize for debugging and auditing.
Here's a sample of how I'm serializing:
XmlReader reader = this.Execute(url);
return Read<Property>(reader, "property");
Extract of Execute() routine:
StringBuilder sb = new StringBuilder();
Stream s = response.GetResponseStream();
XmlReader reader = XmlReader.Create(s);
return reader;
Read() simply wraps up the native xml serialization:
private T Read<T>(XmlReader reader, string rootElement)
{
XmlRootAttribute root = new XmlRootAttribute();
root.ElementName = rootElement;
root.IsNullable = true;
XmlSerializer xmlSerializer = new XmlSerializer(typeof(T), root);
object result = xmlSerializer.Deserialize(reader);
return (T)result;
}
I've had a look around at it appears once you've used the reader, you can't use it again (forward only reading stream?). Without trying to change to much, how can I extract the contents of the reader as xml while still benefiting from the built in serialization with the reader?
What would be nice is to adjust Read with an out param:
private T Read<T>(XmlReader reader, string rootElement, out string sourceXml);
You did not share the code for this.Execute(url), but presumably you build a reader from a stream. First write that stream to a string, then use it somewhere. If the stream is not seekable, dispose it and create a new stream from it.
Also, note that XmlSerializer can take a stream instead of a reader, so you could never bother with the reader and just pass streams among your methods.
Use fiddler.
Fiddler is a Web Debugging Proxy which logs all HTTP(S) traffic between your computer and the Internet. Fiddler allows you to inspect traffic, set breakpoints, and "fiddle" with incoming or outgoing data. Fiddler includes a powerful event-based scripting subsystem, and can be extended using any .NET language.

HttpWebResponse - Encoding problem

I have a problem with encoding. When I get site's source code I have:
I set encoding to UTF8 like this:
StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8);
string sourceCode = reader.ReadToEnd();
Thanks for your help!
Try to use the encoding specified:
Encoding encoding;
try
{
encoding = Encoding.GetEncoding(response.CharacterSet);
}
catch (ArgumentException)
{
// Cannot determine encoding, use dafault
encoding = Encoding.UTF8;
}
StreamReader reader = new StreamReader(response.GetResponseStream(), encoding);
string sourceCode = reader.ReadToEnd();
If you are accepting gzip somehow, this may help: (Haven't tried it myself and admittedly it doesn't make much sense since your encoding is not gzip?!)
request.Headers.Add(HttpRequestHeader.AcceptEncoding, "gzip,deflate");
request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
I had the same issue, I tried changing encoding, from the source to the result, and I got nothing. in the end, I come across a thread that leads me to the following...
Take look here...
.NET: Is it possible to get HttpWebRequest to automatically decompress gzip'd responses?
you need to use the following code, before retrieving the response from the request.
rqst.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
since once we use accept-encoding 'gzip' or 'deflate', the data get compressed, and turn into data unreadable by us. so we need to decompress them.
But the response might not be UTF-8. Have you checked the CharacterSet and the ContentType properties of the response object to make sure you're using the right encoding?
In any event, those two characters look like the code page 437 characters for values 03 and 08. It looks like there's some binary data in your data stream.
I would suggest that for debugging, you use Stream.Read to read the first few bytes from the response into a byte array and then examine the values to see what you're getting.
Change this line in your code:
using (StreamReader streamReader = new StreamReader(stream, Encoding.GetEncoding(1251)))
it may help you..

How can I import a raw RSS feed in C#?

Does anyone know an easy way to import a raw, XML RSS feed into C#? Am looking for an easy way to get the XML as a string so I can parse it with a Regex.
Thanks,
-Greg
This should be enough to get you going...
using System.Net
WebClient wc = new WebClient();
Stream st = wc.OpenRead(“http://example.com/feed.rss”);
using (StreamReader sr = new StreamReader(st)) {
string rss = sr.ReadToEnd();
}
If you're on .NET 3.5 you now got built-in support for syndication feeds (RSS and ATOM). Check out this MSDN Magazine Article for a good introduction.
If you really want to parse the string using regex (and parsing XML is not what regex was intended for), the easiest way to get the content is to use the WebClient class.It got a download string which is straight forward to use. Just give it the URL of your feed. Check this link for an example of how to use it.
I would load the feed into an XmlDocument and use XPATH instead of regex, like so:
XmlDocument doc = new XmlDocument();
HttpWebRequest request = WebRequest.Create(feedUrl) as HttpWebRequest;
using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
{
StreamReader reader = new StreamReader(response.GetResponseStream());
doc.Load(reader);
<parse with XPATH>
}
What are you trying to accomplish?
I found the System.ServiceModel.Syndication classes very helpful when working with feeds.
You might want to have a look at this: http://www.codeproject.com/KB/cs/rssframework.aspx
XmlDocument (located in System.Xml, you will need to add a reference to the dll if it isn't added for you) is what you would use for getting the xml into C#. At that point, just call the InnerXml property which gives the inner Xml in string format then parse with the Regex.
The best way to grab an RSS feed as the requested string would be to use the System.Net.HttpWebRequest class. Once you've set up the HttpWebRequest's parameters (URL, etc.), call the HttpWebRequest.GetResponse() method. From there, you can get a Stream with WebResponse.GetResponseStream(). Then, you can wrap that stream in a System.IO.StreamReader, and call the StreamReader.ReadToEnd(). Voila.
The RSS is just xml and can be streamed to disk easily. Go with Darrel's example - it's all you'll need.

Categories