The Facebook Graph API returns the user's email address to me as
foo\u0040bar.com
in a JSON object. I need to convert it to
foo@bar.com
There must be a built-in method in .NET that converts a Unicode escape sequence (\u1234) to the actual Unicode character.
Do you know what it is?
Note: I prefer not to use JSON.NET or JavaScriptSerializer for performance reasons.
I think the problem is in my StreamReader:
requestUrl = "https://graph.facebook.com/me?access_token=" + accessToken;
request = WebRequest.Create(requestUrl) as HttpWebRequest;
try
{
using (HttpWebResponse response2 = request.GetResponse() as HttpWebResponse)
{
// Get the response stream
reader = new StreamReader(response2.GetResponseStream(),System.Text.Encoding.UTF8);
string json = reader.ReadToEnd();
I tried different encodings for the StreamReader, UTF8, UTF7, Unicode, ... none worked.
Many thanks!
Thanks to L.B for correcting me. The problem was not in the StreamReader.
Yes, there is a built-in way to do that, but it would involve something like using a compiler to parse the string as code...
Use a simple replace:
s = s.Replace(@"\u0040", "@");
For a more flexible solution, you can use a regular expression that handles any \uXXXX escape:
s = Regex.Replace(s, @"\\u([0-9A-Fa-f]{4})", v => ((char)Convert.ToInt32(v.Groups[1].Value, 16)).ToString());
JSON responses are not binary data that you turn into a string by picking some encoding. They are strings that have already been decoded correctly by your browser or, as in your example, by HttpWebResponse. You need a second processing step on them (a regex, a deserializer, etc.) to get the final data.
See what you get with
webClient.DownloadString("https://graph.facebook.com/HavelVaclav?access_token=????") without any encoding
{"id":"100000042150992",
"name":"Havel V\u00e1clav",
"first_name":"Havel",
"last_name":"V\u00e1clav",
"link":"http:\/\/www.facebook.com\/havel.vaclav",
"username":"havel.vaclav",
"gender":"male",
"locale":"cs_CZ"
}
Would your encoding change \/ to /?
So, the problem is not in your StreamReader.
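As a rough illustration of that second processing step, here is a minimal sketch that undoes the two escapes visible in the sample response (\uXXXX and \/). The class and method names are hypothetical, and it is not a full JSON unescaper: it assumes the input contains no escaped backslashes (\\) and ignores escapes such as \n or \".

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class JsonEscapes
{
    // Hypothetical helper: decodes \uXXXX sequences and \/ only.
    public static string Decode(string s)
    {
        // Replace each \uXXXX with the UTF-16 code unit it names.
        s = Regex.Replace(s, @"\\u([0-9A-Fa-f]{4})",
            m => ((char)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber)).ToString());
        // JSON allows "/" to be written as "\/".
        return s.Replace(@"\/", "/");
    }
}
```

For example, JsonEscapes.Decode(@"foo\u0040bar.com") gives foo@bar.com. If the values can contain arbitrary JSON escapes, a real JSON parser is the safer choice.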
I've got a problem here. If I put the XML file on the server, I can read it with a StreamReader, convert it to a variable, and get everything working in the MSSQL database.
However, I am required to send it through an HTML POST, and the code below doesn't work for that:
page.Response.ContentType = "text/xml";
StreamReader reader = new StreamReader(page.Request.InputStream);
inputString = reader.ReadToEnd();
deleteShip(inputString);
It seems to me that the code above doesn't receive the XML that my program POSTs, because the same code in deleteShip works fine when I use an XML file on the server.
Is there a way to solve this problem? As long as I can pass any string to deleteShip(string s), I'm happy. The string will be in XML format, though.
Thanks for the help!
It would be useful to see how the XML is POSTed to your program. Typically, data is sent from an HTML form as name-value pairs in the HTTP request body when using the POST method. It's not clear from your question whether you're using an HTML form to POST the XML to your program and it's hard to tell what might be going wrong without more information.
From your code, it looks like you're reading the entire HTTP request body, where you'd usually read the value of a request parameter, for example:
Request["XmlParameterName"]
Where XmlParameterName is the name of an HTML form input field.
Have you inspected the value of the inputString variable? Is it valid XML? Is it encoded correctly? Are any invalid characters like ampersands (&) escaped correctly?
Update your question with a bit more information if none of the things I mentioned are the problem.
OK, I got it fixed.
Here is the code.
System.IO.Stream stream;
string inputString;
Int32 stringLength;
stream = Request.InputStream;
stringLength = Convert.ToInt32(stream.Length);
byte[] stringArray = new byte[stringLength];
// Copy the request body into the buffer; without this call the array stays all zeros
stream.Read(stringArray, 0, stringLength);
inputString = System.Text.Encoding.ASCII.GetString(stringArray, 0, stringLength);
deleteShip(inputString);
This reads the POST body of my HTML request (which in this case is XML).
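A slightly more defensive variant, as a sketch only: Stream.Read may return fewer bytes than requested, and ASCII decoding turns any non-ASCII character into "?". The helper below (a hypothetical name) lets StreamReader do the reading and takes the encoding as a parameter. Rewinding the stream first is an assumption about what may have gone wrong in the original attempt, namely that something had already read Request.InputStream.

```csharp
using System.IO;
using System.Text;

static class RequestBody
{
    // Hypothetical helper: reads an entire stream as text in the given encoding.
    public static string ReadAll(Stream input, Encoding encoding)
    {
        // If the stream was already consumed, start again from the beginning.
        if (input.CanSeek)
        {
            input.Position = 0;
        }
        // Note: disposing the reader also closes the underlying stream.
        using (StreamReader reader = new StreamReader(input, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

In the page this would be called as deleteShip(RequestBody.ReadAll(Request.InputStream, Encoding.UTF8)), assuming the client sends UTF-8.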
I am trying to use System.Windows.Forms.WebBrowser to display content in languages other than English, but the resulting encoding is incorrect. What should I do to display, for example, Russian?
I am downloading and displaying a string as follows:
System.Net.WebClient wc = new System.Net.WebClient();
webBrsr.DocumentText = wc.DownloadString(url);
The problem is with the WebClient and how it is interpreting the string encoding. One solution is to download the data as raw bytes and parse it out manually:
byte[] bytes = wc.DownloadData("http://news.google.com/news?edchanged=1&ned=ru_ru");
//You should really inspect the headers from the response to determine the exact encoding to use,
// this example just assumes UTF-8 which might work in most scenarios
String t = System.Text.Encoding.UTF8.GetString(bytes);
webBrsr.DocumentText = t;
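To act on the comment about inspecting the headers, one option is to pull the charset out of the Content-Type header and fall back to UTF-8 when it is missing or unknown. This is only a sketch; CharsetHelper and FromContentType are hypothetical names, and the parsing is deliberately simple.

```csharp
using System;
using System.Text;

static class CharsetHelper
{
    // Hypothetical helper: picks an Encoding from a Content-Type header value
    // such as "text/html; charset=windows-1251".
    public static Encoding FromContentType(string contentType, Encoding fallback)
    {
        if (string.IsNullOrEmpty(contentType))
        {
            return fallback;
        }
        foreach (string part in contentType.Split(';'))
        {
            string p = part.Trim();
            if (p.StartsWith("charset=", StringComparison.OrdinalIgnoreCase))
            {
                string name = p.Substring("charset=".Length).Trim('"', ' ');
                try
                {
                    return Encoding.GetEncoding(name);
                }
                catch (ArgumentException)
                {
                    // Unknown or unsupported charset name.
                    return fallback;
                }
            }
        }
        return fallback;
    }
}
```

After wc.DownloadData(url), the header is available as wc.ResponseHeaders["Content-Type"], so the decoding step becomes CharsetHelper.FromContentType(wc.ResponseHeaders["Content-Type"], Encoding.UTF8).GetString(bytes). A page can also declare its charset only in a meta tag, which this sketch does not look at.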
I have a problem with encoding. When I get site's source code I have:
I set encoding to UTF8 like this:
StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8);
string sourceCode = reader.ReadToEnd();
Thanks for your help!
Try to use the encoding specified:
Encoding encoding;
try
{
encoding = Encoding.GetEncoding(response.CharacterSet);
}
catch (ArgumentException)
{
// Cannot determine encoding, use default
encoding = Encoding.UTF8;
}
StreamReader reader = new StreamReader(response.GetResponseStream(), encoding);
string sourceCode = reader.ReadToEnd();
If you are accepting gzip somehow, this may help: (Haven't tried it myself and admittedly it doesn't make much sense since your encoding is not gzip?!)
request.Headers.Add(HttpRequestHeader.AcceptEncoding, "gzip,deflate");
request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
I had the same issue. I tried changing the encoding, from the source to the result, and got nothing. In the end, I came across a thread that led me to the following:
.NET: Is it possible to get HttpWebRequest to automatically decompress gzip'd responses?
You need to use the following code before retrieving the response from the request:
rqst.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
Once we send an Accept-Encoding of 'gzip' or 'deflate', the data comes back compressed and is unreadable as text, so we need to decompress it.
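If the compressed bytes have already been downloaded, they can also be decompressed by hand with GZipStream. The sketch below is a minimal example with hypothetical names; the Compress method exists only so the round trip can be tried without a network call.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

static class GzipText
{
    // Decompresses gzip data and decodes it as text in the given encoding.
    public static string Decompress(byte[] gzipped, Encoding encoding)
    {
        using (MemoryStream input = new MemoryStream(gzipped))
        using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
        using (StreamReader reader = new StreamReader(gzip, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    // Only for trying the example locally: produces gzip data from a string.
    public static byte[] Compress(string text, Encoding encoding)
    {
        using (MemoryStream output = new MemoryStream())
        {
            using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress))
            {
                byte[] raw = encoding.GetBytes(text);
                gzip.Write(raw, 0, raw.Length);
            }
            // ToArray still works after the GZipStream has closed the MemoryStream.
            return output.ToArray();
        }
    }
}
```

Setting AutomaticDecompression on the request remains the simpler route when you control the request.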
But the response might not be UTF-8. Have you checked the CharacterSet and the ContentType properties of the response object to make sure you're using the right encoding?
In any event, those two characters look like the code page 437 characters for values 03 and 08. It looks like there's some binary data in your data stream.
I would suggest that for debugging, you use Stream.Read to read the first few bytes from the response into a byte array and then examine the values to see what you're getting.
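A small sketch of that debugging step, with hypothetical helper names: read the first few bytes and check them against the gzip signature (1F 8B), which is one common reason for binary-looking data in a response. Whether gzip is the cause in this particular case is an assumption to verify.

```csharp
using System;
using System.IO;

static class StreamPeek
{
    // Reads up to 'count' bytes from the start of the stream.
    public static byte[] ReadHead(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            // Read may return fewer bytes than requested, so loop until done.
            int n = stream.Read(buffer, total, count - total);
            if (n == 0)
            {
                break; // end of stream
            }
            total += n;
        }
        Array.Resize(ref buffer, total);
        return buffer;
    }

    // gzip data always starts with the two bytes 0x1F 0x8B.
    public static bool LooksLikeGzip(byte[] head)
    {
        return head.Length >= 2 && head[0] == 0x1F && head[1] == 0x8B;
    }
}
```

BitConverter.ToString(StreamPeek.ReadHead(response.GetResponseStream(), 8)) prints the bytes as hex pairs for inspection. Reading them consumes that part of the stream, so this is meant for a separate debugging request.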
Change this line in your code:
using (StreamReader streamReader = new StreamReader(stream, Encoding.GetEncoding(1251)))
It may help you.
I wrote a small program that iterates through a lot of files and applies some changes wherever a certain string match is found. The problem is that different files have different encodings, so I would like to check each file's encoding and then overwrite the file in its original encoding.
What would be the prettiest way of doing that in C# .NET 2.0?
My code is very simple as of now:
String f1 = File.ReadAllText(fileList[i]).ToLower();
if (f1.Contains(oPath))
{
f1 = f1.Replace(oPath, nPath);
File.WriteAllText(fileList[i], f1, Encoding.Unicode);
}
I took a look at Auto encoding detect in C# which made me realize how I could detect encoding, but I am not sure how I could use that information to write in the same encoding.
Would greatly appreciate any help here.
Unfortunately encoding is one of those subjects where there is not always a definitive answer. In many cases it's much closer to guessing the encoding as opposed to detecting it. Raymond Chen did an excellent blog post on this subject that is worth the read
http://blogs.msdn.com/b/oldnewthing/archive/2007/04/17/2158334.aspx
The gist of the article is
If the BOM (byte order marker) exists then you're golden
Else it's guess work and heuristics
However, I still think the best approach is the one Darin mentioned in the question you linked: let StreamReader guess for you instead of reinventing the wheel. It only requires a very slight modification to your sample.
String f1;
Encoding encoding;
using (StreamReader reader = new StreamReader(fileList[i])) {
f1 = reader.ReadToEnd().ToLower();
encoding = reader.CurrentEncoding;
}
if (f1.Contains(oPath))
{
f1 = f1.Replace(oPath, nPath);
File.WriteAllText(fileList[i], f1, encoding);
}
By default, .NET uses UTF-8. It is hard to detect the character encoding because most of the time .NET will read the file as UTF-8. I always have problems with ANSI.
My trick is to read the file as a stream, force it to be read as UTF-8, and look for the usual characters that should appear in the text. If they are found, the file is UTF-8; otherwise it is ANSI. I tell the user that only two encodings are supported, ANSI or UTF-8. Auto-detection does not quite work for my language. :p
I am afraid you will have to know the encoding. For UTF-based encodings, though, you can use StreamReader's built-in functionality.
Taken from here.
With regard to encodings - you will need to have identified the encoding in order to use the StreamReader. However, the StreamReader itself can help if you create it with one of the constructor overloads that allows you to supply the flag detectEncodingFromByteOrderMarks as true (or you can use Encoding.GetPreamble and look at the byte preamble yourself).
Both these methods will only help auto-detect UTF based encodings though - so any ANSI encodings with a specified codepage will probably not be parsed correctly.
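The "look at the byte preamble yourself" option from the quote could be sketched like this. BomSniffer and DetectFromBom are hypothetical names; the method returns null when no BOM is present, which is exactly the case where the encoding still has to be known or guessed.

```csharp
using System.Text;

static class BomSniffer
{
    // Returns the encoding whose preamble (BOM) matches the start of the data,
    // or null if there is no recognizable BOM.
    public static Encoding DetectFromBom(byte[] data)
    {
        // UTF-32 must be checked before UTF-16: the UTF-32 LE BOM (FF FE 00 00)
        // starts with the UTF-16 LE BOM (FF FE).
        Encoding[] candidates = new Encoding[]
        {
            new UTF32Encoding(false, true), // UTF-32 little endian
            new UTF32Encoding(true, true),  // UTF-32 big endian
            Encoding.UTF8,
            Encoding.Unicode,               // UTF-16 little endian
            Encoding.BigEndianUnicode       // UTF-16 big endian
        };
        foreach (Encoding candidate in candidates)
        {
            byte[] bom = candidate.GetPreamble();
            if (bom.Length > 0 && StartsWith(data, bom))
            {
                return candidate;
            }
        }
        return null;
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data.Length < prefix.Length)
        {
            return false;
        }
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i])
            {
                return false;
            }
        }
        return true;
    }
}
```

A UTF-16 LE file whose first character is U+0000 would be misread as UTF-32 LE by this check; that edge case is ignored here.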
Probably a bit late, but I encountered the same problem myself. Using the previous answers, I found a solution that works for me. It reads in the text using StreamReader's default encoding, extracts the encoding used on that file, and uses StreamWriter to write it back with the changes using the detected encoding. It also removes and re-adds the ReadOnly flag.
string file = "File to open";
string text;
Encoding encoding;
string oldValue = "string to be replaced";
string replacementValue = "New string";
var attributes = File.GetAttributes(file);
File.SetAttributes(file, attributes & ~FileAttributes.ReadOnly);
using (StreamReader reader = new StreamReader(file, Encoding.Default))
{
text = reader.ReadToEnd();
encoding = reader.CurrentEncoding;
reader.Close();
}
bool changedValue = false;
if (text.Contains(oldValue))
{
text = text.Replace(oldValue, replacementValue);
changedValue = true;
}
if (changedValue)
{
using (StreamWriter write = new StreamWriter(file, false, encoding))
{
write.Write(text.ToString());
write.Close();
}
}
// Restore the original attributes (including ReadOnly, if it was set), whether or not the file changed
File.SetAttributes(file, attributes);
The solution for all Germans => ÄÖÜäöüß
This function opens the file and determines the encoding from the BOM.
If the BOM is missing, the file is interpreted as ANSI, unless it contains UTF-8 encoded German umlauts, in which case it is detected as UTF-8.
https://stackoverflow.com/a/69312696/9134997
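The linked answer's code is not reproduced here. As a rough sketch of the same idea for files without a BOM, one option is to decode the bytes with a strict UTF-8 decoder that throws on invalid sequences; umlauts stored in an ANSI code page almost never form valid UTF-8. The names below are hypothetical.

```csharp
using System.Text;

static class Utf8Check
{
    // True if the bytes decode as UTF-8 without any invalid sequence.
    public static bool IsValidUtf8(byte[] data)
    {
        try
        {
            // encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true
            new UTF8Encoding(false, true).GetString(data);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }
}
```

Pure ASCII text also passes this check, which is harmless because ASCII bytes mean the same thing in UTF-8 and in the usual ANSI code pages.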
Does anyone know an easy way to import a raw XML RSS feed into C#? I'm looking for an easy way to get the XML as a string so I can parse it with a Regex.
Thanks,
-Greg
This should be enough to get you going...
using System.IO;
using System.Net;
WebClient wc = new WebClient();
Stream st = wc.OpenRead("http://example.com/feed.rss");
using (StreamReader sr = new StreamReader(st)) {
string rss = sr.ReadToEnd();
}
If you're on .NET 3.5, you now have built-in support for syndication feeds (RSS and Atom). Check out this MSDN Magazine article for a good introduction.
If you really want to parse the string using regex (and parsing XML is not what regex was intended for), the easiest way to get the content is to use the WebClient class. It has a DownloadString method that is straightforward to use: just give it the URL of your feed. Check this link for an example of how to use it.
I would load the feed into an XmlDocument and use XPATH instead of regex, like so:
XmlDocument doc = new XmlDocument();
HttpWebRequest request = WebRequest.Create(feedUrl) as HttpWebRequest;
using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
{
StreamReader reader = new StreamReader(response.GetResponseStream());
doc.Load(reader);
// parse with XPath here
}
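The XPath step might look like the sketch below, which assumes a plain RSS 2.0 layout (rss/channel/item/title) and uses hypothetical names. It works on a string so it can be tried without a network call; with the snippet above you would run the same SelectNodes call on doc after doc.Load(reader).

```csharp
using System.Collections.Generic;
using System.Xml;

static class RssTitles
{
    // Returns the title of every item in an RSS 2.0 document.
    public static List<string> Extract(string rssXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(rssXml);
        List<string> titles = new List<string>();
        // RSS 2.0 elements are not namespaced, so a plain XPath is enough.
        foreach (XmlNode node in doc.SelectNodes("/rss/channel/item/title"))
        {
            titles.Add(node.InnerText);
        }
        return titles;
    }
}
```

Atom feeds put their elements in a namespace, so they would need an XmlNamespaceManager and a different path.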
What are you trying to accomplish?
I found the System.ServiceModel.Syndication classes very helpful when working with feeds.
You might want to have a look at this: http://www.codeproject.com/KB/cs/rssframework.aspx
XmlDocument (located in System.Xml, you will need to add a reference to the dll if it isn't added for you) is what you would use for getting the xml into C#. At that point, just call the InnerXml property which gives the inner Xml in string format then parse with the Regex.
The best way to grab an RSS feed as the requested string would be to use the System.Net.HttpWebRequest class. Once you've set up the HttpWebRequest's parameters (URL, etc.), call the HttpWebRequest.GetResponse() method. From there, you can get a Stream with WebResponse.GetResponseStream(). Then, you can wrap that stream in a System.IO.StreamReader, and call the StreamReader.ReadToEnd(). Voila.
The RSS is just xml and can be streamed to disk easily. Go with Darrel's example - it's all you'll need.