Decoding Http Response Content from Russian (Cyrillic) - c#

I send a request to an API and the server sends some portion of its response in Russian. I URL-decode the response using code page 1251 encoding but still don't get the result I want.
How can I convert the response back to plain English? What encoding should I use?

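If you just need to convert Russian (Cyrillic) letters to Latin ones, you can use a Dictionary that maps each Cyrillic character to its Latin equivalent.
var map = new Dictionary<char, string>
{
    { 'Ж', "Zh" },
    { 'е', "e" },
    { 'ф', "f" },
    { 'Й', "Y" },
    ...
};
// Characters missing from the map are passed through unchanged instead of throwing KeyNotFoundException.
var result = string.Concat("Россия".Select(c => map.TryGetValue(c, out var latin) ? latin : c.ToString()));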

Not sure if I understood your intention correctly, but in case of HttpClient you can work with Windows-1251 (or another encoding) like this:
using (var httpClient = new HttpClient())
{
    var httpResponse = await httpClient.GetAsync("requestUri");
    var httpContent = await httpResponse.Content.ReadAsByteArrayAsync();
    string responseString = Encoding.GetEncoding(1251).GetString(httpContent, 0, httpContent.Length);
    // - check status code
    // (int)httpResponse.StatusCode
    // - and here's your response
    // responseString
}
If responseString still contains gibberish, the server is probably using some encoding other than Windows-1251, so you first need to establish which one (the charset parameter of the response's Content-Type header is a good place to start).
P.S. For Encoding.GetEncoding(1251) to work, you might need to install the System.Text.Encoding.CodePages NuGet package and register the encoding provider:
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
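One way to avoid hard-coding the code page is to honor the charset the server declares and only fall back to Windows-1251 when none is given. The following is a minimal sketch under that assumption; the ResponseDecoder class and its method names are hypothetical, and it does not help if the server declares the wrong charset.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public static class ResponseDecoder
{
    // Returns the encoding named by the charset value, or the fallback
    // when the charset is missing or not recognized by the runtime.
    public static Encoding ResolveEncoding(string charset, Encoding fallback)
    {
        if (string.IsNullOrWhiteSpace(charset)) return fallback;
        try
        {
            // Some servers quote the value, e.g. charset="windows-1251".
            return Encoding.GetEncoding(charset.Trim('"'));
        }
        catch (ArgumentException)
        {
            return fallback;
        }
    }

    public static async Task<string> ReadAsync(HttpClient client, string requestUri)
    {
        // Makes code pages such as 1251 available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        using var response = await client.GetAsync(requestUri);
        byte[] bytes = await response.Content.ReadAsByteArrayAsync();

        // Charset from the Content-Type header, if the server sent one.
        string charset = response.Content.Headers.ContentType?.CharSet;
        Encoding encoding = ResolveEncoding(charset, Encoding.GetEncoding(1251));
        return encoding.GetString(bytes);
    }
}
```

Calling await ResponseDecoder.ReadAsync(httpClient, "requestUri") would then replace the manual GetString call above.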

Related

Parse HTTP request body to JSON string in .net core 3.0

I have implemented the following method to format the request body:
private async Task<string> FormatRequest(HttpRequest request)
{
    request.EnableBuffering();
    //Create a new byte[] with the same length as the request stream
    var buffer = new byte[Convert.ToInt32(request.ContentLength)];
    //Copy the entire request stream into the new buffer
    await request.Body.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false);
    //Convert the byte[] into a string using UTF8 encoding
    var bodyAsText = Encoding.UTF8.GetString(buffer);
    request.Body.Position = 0;
    return bodyAsText;
}
I got the following result
------WebKitFormBoundaryY8OPXY2MlrKMjBRe
Content-Disposition: form-data; name="RoleId"
2
------WebKitFormBoundaryY8OPXY2MlrKMjBRe
Content-Disposition: form-data; name="AuthenticationSettingsId"
3
.....
Expected result
"{\"fields\":[\"RoleId\",\"2\",\"AuthenticationSettingsId\",\"1\",\"recommendation\",\"reviewerId\"],\"size\":100,\"filter\":[{\"id\":\"ApplicationId\",\"operator\":\"and\",\"parent\":\"\",\"nested\":false,\"type\":\"integer\",\"value\":[360]}],\"aggregate\":[],\"sort\":[]}"
Note: Previously we used request.EnableRewind(), which returned the above result; we have since upgraded to .NET Core 3.0.
Here is a high-level view of how I handle JSON queries. If you really want to get fancy, you can implement all of this in an abstract class and have your data model inherit from it.
There are plenty of different ways to get where you want to be; hopefully this helps you get there.
I've put comments in the code, but feel free to ask away if something doesn't make sense.
using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Text.Json;
using System.Text.Json.Serialization;
using System.Threading.Tasks;

class SomeHttpJsonUtility
{
    //If you want to parse your return data
    //directly into a data model
    class DataModel
    {
        public class ReturnData
        {
            [JsonPropertyName("fields")]
            public Field[] Fields { get; set; }
        }
        public class Field
        {
            [JsonPropertyName("RoleId")]
            public int RoleId { get; set; }
            //...you get the idea
        }
    }
    //Some data if you're sending a POST request
    private Dictionary<string, string> postParameters = new Dictionary<string, string>();
    //Creates an HttpClient with the specified request headers.
    //You can do this any number of ways depending on the
    //source you are querying
    private HttpClient GetClient()
    {
        HttpClient _client = new HttpClient();
        _client.DefaultRequestHeaders.Clear();
        //Header names must use the hyphenated wire form
        //("User-Agent", not "UserAgent")
        _client.DefaultRequestHeaders.Add(
            "User-Agent",
            new string[] { "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:103.0) Gecko/20100101 Firefox/103.0" });
        _client.DefaultRequestHeaders.Add(
            "Accept-Language",
            new string[] { "en-US" });
        _client.DefaultRequestHeaders.Add(
            "Accept-Encoding",
            new string[] { "gzip", "deflate", "br" });
        _client.DefaultRequestHeaders.Add(
            "Accept",
            new string[] { "*/*" });
        _client.DefaultRequestHeaders.Add(
            "Connection",
            new string[] { "keep-alive" });
        return _client;
    }
    private void GetJson(Uri from_uri)
    {
        //Get the HttpClient with the proper request headers
        HttpClient _client =
            GetClient();
        Task.Run(async () =>
        {
            //If your data comes from a GET request
            HttpResponseMessage _httpResponse =
                await _client.GetAsync(
                    requestUri: from_uri);
            //Or if your response comes from a POST
            //(both calls are shown for illustration; keep only the one you need)
            _httpResponse =
                await _client.PostAsync(
                    requestUri: from_uri,
                    content: new FormUrlEncodedContent(postParameters)
                );
            //Since your initial post used a stream, we can
            //keep going in that direction.
            //Initialize a memory stream to process the data
            using (MemoryStream _ms = new MemoryStream())
            {
                //Send the HTTP response content
                //into the memory stream
                await _httpResponse.Content.CopyToAsync(
                    stream: _ms);
                //Go to the start of the memory stream
                _ms.Seek(
                    offset: 0,
                    loc: SeekOrigin.Begin);
                //Option 1:
                //Send direct to data model
                // This uses the Microsoft library System.Text.Json
                // (the synchronous Stream overload needs .NET 6 or later;
                // on .NET Core 3.x use JsonSerializer.DeserializeAsync instead)
                DataModel.ReturnData dataModel =
                    JsonSerializer.Deserialize<DataModel.ReturnData>(
                        utf8Json: _ms);
                //Option 1 consumed the stream, so rewind before reading it again
                _ms.Seek(
                    offset: 0,
                    loc: SeekOrigin.Begin);
                //Option 2:
                //Send to a string
                using (StreamReader _sr = new StreamReader(_ms))
                {
                    string dataAsString = _sr.ReadToEnd();
                }
            }
        }).Wait();
    }
}
If your query is only a GET request, it's pretty easy to get the exact headers you need.
In Firefox, press F12 and go to the web address.
Click the Network tab, then Headers, and view the request data.
You really only need a few of these:
Accept
Accept-Encoding
Accept-Language
Connection
User-Agent
Mozilla has some nice resources regarding the different header objects.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept
Host should be taken care of by the HttpClient.
Cookies should be handled by the HttpClient (if you need them)
If you are actually getting the data back as Gzip you'll have to implement a reader, unless the HttpClient you are using will automatically decode it.
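If you do ask for compressed responses, HttpClient can decode them for you when its handler is configured for automatic decompression. A minimal sketch, assuming the standard HttpClientHandler is in use; the DecompressingClientFactory name is made up for illustration:

```csharp
using System.Net;
using System.Net.Http;

public static class DecompressingClientFactory
{
    // Builds a handler that transparently decompresses gzip and deflate bodies.
    public static HttpClientHandler CreateHandler() => new HttpClientHandler
    {
        AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
    };

    // The returned client sends a matching Accept-Encoding header on its own,
    // so there is no need to add that header manually.
    public static HttpClient Create() => new HttpClient(CreateHandler());
}
```

With this in place, the response content can be read or deserialized as plain JSON without a separate gzip reader.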
And at the end, victory :-)
I think you need to set the Content-Type of the request to application/json when you send it.
Could you try to read it this way?
var reader = new System.IO.StreamReader(request.Body);
// Await the read instead of blocking on .Result
var body = await reader.ReadToEndAsync();
Then you can use Newtonsoft or a similar library on body.
You need to tokenize / encode your string with some JSON encoder.
Here you have two choices:
the internal (Microsoft) JsonConverter
the Newtonsoft.Json JsonConverter
Karthik, it appears you are sending a multipart request from a WebKit browser.
If you can change the client to send application/json instead of multipart, your problem is fixed.
If that is not possible, you can use:
private async Task<string> FormatRequest(HttpRequest request)
{
    // ReadFormAsync parses the multipart (or URL-encoded) body asynchronously
    var formCollection = await request.ReadFormAsync();
    // Flatten each StringValues entry to a plain string before serializing
    var form = formCollection.ToDictionary(x => x.Key, x => x.Value.ToString());
    return JsonSerializer.Serialize(form);
}
This parses your form into a dictionary and returns it as JSON.
(This code was written against .NET 6 and uses System.Text.Json, which also ships with .NET Core 3.0 and later. On older frameworks you would need a serializer such as Newtonsoft.Json.)

UTF-8 URL Encode

I am having trouble encoding my query parameters with HttpUtility.UrlEncode(); the value is not being encoded as UTF-8.
query["agent"] = HttpUtility.UrlEncode("{\"mbox\":\"mailto: UserName@company.com\"}");
I tried using the overload method and passed utf encoding but still it is not working.
expected result:
?agent=%7B%22mbox%22%3A%22mailto%3AUserName%40company.com%22%7D
Actual Result:
?agent=%257b%2522mbox%2522%253a%2522mailto%253aUserName%2540company.com%2522%257d
public StatementService(HttpClient client, IConfiguration conf)
{
    configuration = conf;
    var BaseAddress = "https://someurl.com/statements?";
    client.BaseAddress = new Uri(BaseAddress);
    client.DefaultRequestHeaders.Add("Custom-Header",
        "customheadervalue");
    Client = client;
}
public async Task<Object> GetStatements(){
    var query = HttpUtility.ParseQueryString(Client.BaseAddress.Query);
    query["agent"] = HttpUtility.UrlEncode( "{\"mbox\":\"mailto:UserName@company.com\"}");
    var longuri = new Uri(Client.BaseAddress + query.ToString());
    var response = await Client.GetAsync(longuri);
    response.EnsureSuccessStatusCode();
    using var responseStream = await response.Content.ReadAsStreamAsync();
    dynamic statement = JsonSerializer.DeserializeAsync<object>(responseStream);
    //Convert stream reader to string
    StreamReader JsonStream = new StreamReader(statement);
    string JsonString = JsonStream.ReadToEnd();
    //convert Json String to Object.
    JObject JsonLinq = JObject.Parse(JsonString);
    // Linq to Json
    dynamic res = JsonLinq["statements"][0].Select(res => res).FirstOrDefault();
    return await res;
}
The method HttpUtility.ParseQueryString returns an HttpValueCollection internally. HttpValueCollection.ToString() already performs URL encoding, so you don't need to do that yourself. If you do, the value is encoded twice and you get the wrong result that you see.
I don't see the relation to UTF-8. The value you use ({"mbox":"mailto: UserName@company.com"}) doesn't contain any characters that would look different in UTF-8.
References:
HttpValueCollection and NameValueCollection
ParseQueryString source
HttpValueCollection source
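To make the fix concrete, here is a minimal sketch that assigns the raw value and lets the collection's ToString() do the encoding once. The QueryBuilder class is a hypothetical helper, not part of the question's code.

```csharp
using System.Web;

public static class QueryBuilder
{
    // Builds "agent=<encoded value>" from the raw, unencoded JSON.
    public static string Build(string agentJson)
    {
        // The collection returned here URL-encodes keys and values in ToString().
        var query = HttpUtility.ParseQueryString(string.Empty);

        // Assign the raw value; calling UrlEncode here as well would encode it twice
        // and turn every '%' into "%25".
        query["agent"] = agentJson;

        return query.ToString();
    }
}
```

For example, QueryBuilder.Build("{\"mbox\":\"mailto:UserName@company.com\"}") should yield a single-encoded value with no %25 sequences in it.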
I strongly suggest a different approach: the Uri.EscapeDataString method. It lives in the System namespace, so you avoid referencing System.Web, which is a heavy DLL. In addition, HttpUtility.UrlEncode emits lowercase hex digits (%7b), whereas Uri.EscapeDataString emits uppercase ones (%7B); the difference can be an issue in certain cases when implementing HTTP protocols.
Uri.EscapeDataString("{\"mbox\":\"mailto: UserName@company.com\"}")
"%7B%22mbox%22%3A%22mailto%3A%20UserName%40company.com%22%7D"

Uploading .mp4 via HTTPS

I'm trying to upload an .mp4 file to Giphy.com's API. It says to send the file over as 'Binary', and I think I'm confused as to what exactly they mean by that. Here are the docs; scroll to the bottom, to "Upload Endpoint". https://developers.giphy.com/docs/
Here's what I have right now.
I've tried multiple versions of this (using StringContent, MultipartFormDataContent, ByteArrayContent, HttpMessages... etc.) and always get a '400 - Bad Request - No Source Url' (which the docs say isn't required if you upload your own), which makes me believe the content isn't being recognized.
public async Task<HttpResponseMessage> UploadVideoAsync(StorageFile file)
{
    using (var stream = await file.OpenStreamForReadAsync())
    {
        byte[] bytes = new byte[stream.Length];
        await stream.ReadAsync(bytes, 0, (int)stream.Length);
        Dictionary<string, string> dic = new Dictionary<string, string>
        {
            { "file", Encoding.ASCII.GetString(bytes) },
            { "api_key", api_key }
        };
        MultipartFormDataContent multipartContent = new MultipartFormDataContent();
        multipartContent.Add(new ByteArrayContent(bytes));
        var response = await httpClient.PostAsync($"v1/gifs?api_key={api_key}", multipartContent);
        var stringResponse = await response.Content.ReadAsStringAsync();
        return response;
    }
}
It seems that your code doesn't match {api_key} properly, and you don't use the "dic" variable anywhere. You can try v1/gifs?api_key=YOUR_API_KEY&file= instead, where YOUR_API_KEY is replaced by the API key you obtained from Giphy.
"always get a '400 - Bad Request - No Source Url' (which the docs say isn't required if you upload your own) which makes me believe the content isn't being recognized."
You need to give the ByteArrayContent a name. The documentation lists the request parameter as 'file: string (binary) required if no source_image_url supplied', so the part must be named "file".
The code should look like the following:
MultipartFormDataContent multipartContent = new MultipartFormDataContent();
multipartContent.Add(new ByteArrayContent(bytes),"file");
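A slightly fuller sketch of building the multipart body is shown below. The UploadContentBuilder class, the "upload.mp4" file name, and the video/mp4 content type are illustrative assumptions; whether the API requires a file name, a content type, or the key as a form field rather than in the query string is not confirmed by the docs quoted above.

```csharp
using System.Net.Http;
using System.Net.Http.Headers;

public static class UploadContentBuilder
{
    // Wraps the raw video bytes in a multipart/form-data body.
    public static MultipartFormDataContent Build(byte[] videoBytes, string apiKey)
    {
        var fileContent = new ByteArrayContent(videoBytes);
        // Assumption: labeling the part as MP4 video is acceptable to the server.
        fileContent.Headers.ContentType = new MediaTypeHeaderValue("video/mp4");

        var form = new MultipartFormDataContent();
        // "file" is the parameter name from the docs; the file name is a placeholder.
        form.Add(fileContent, "file", "upload.mp4");
        // Assumption: the key may also be sent as a form field.
        form.Add(new StringContent(apiKey), "api_key");
        return form;
    }
}
```

The result can be passed straight to httpClient.PostAsync in place of multipartContent.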

Restsharp Not Encoding &s

I'm using RestSharp to call an API, and it is not encoding ampersands (&) in parameter values (that's the only character needing URL encoding that I've tried so far). I've used it before, and I looked at the source to double-check that it does URL-encode both the key and the value of parameters. Maybe I'm doing something wrong.
...
private static readonly RestClient _client = new RestClient();
public Guid Create(Dto myDto)
{
    var request = new RestRequest(Method.GET)
    {
        Resource = "GetGuid"
    };
    request.AddParameter("name", myDto.Name);
    // Execute needs the request that was just built
    var response = _client.Execute(request);
    if (response.StatusCode != HttpStatusCode.OK)
    {
        Log.Error(string.Format("Could not register user with email {0} in crm", user.Email), this);
        throw new Exception("Response from crm was not OK");
    }
    return Guid.Parse(response.Content);
}
...
The version I was using was 105.0.0, which seems to have some encoding issues: https://github.com/restsharp/RestSharp/blob/master/releasenotes.markdown
I haven't looked at the source for that but bumping my version to 105.0.1 seemed to fix the issue.
Commit with the revert that fixed the encoding issue I encountered.

ISO-8859 encode post request content C#

I am trying to send a POST request in C# with a parameter encoded to ISO-8859. I am using this code:
using (var wb = new WebClient())
{
    var encoding = System.Text.Encoding.GetEncoding("ISO-8859-1");
    var encodedText = System.Web.HttpUtility.UrlEncode("åæ ÆÆ øØ ø", encoding);
    wb.Encoding = encoding;
    wb.Headers.Add("Content-Type", "application/x-www-form-urlencoded");
    var data = new NameValueCollection();
    data["TXT"] = encodedText;
    var response = wb.UploadValues(_url, "POST", data);
}
I have figured out that the correctly encoded string for "åæ ÆÆ øØ ø" is %E5%E6+%C6%C6++%F8%D8+%F8, and I can see when debugging that encodedText actually is this string. However when inspecting the raw request in fiddler, I can see that the string looks like this: TXT=%25e5%25e6%2B%25c6%25c6%2B%25f8%25d8%2B%25f8. I am guessing some kind of extra encoding is being done to the string after or during the call to UploadValues().
Thank you so much in advance.
I checked Google for this. According to another question here on SO, "UTF32 for WebClient.UploadValues?" (second answer), WebClient.UploadValues() does indeed perform its own encoding, and it encodes as ASCII. That is why your already-encoded value gets encoded a second time. You'll have to use another method to upload this, such as HttpWebRequest.
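As an illustration of the "build the body yourself" idea, the sketch below encodes the form body once with ISO-8859-1 and hands the finished bytes to the HTTP layer so nothing re-encodes them. It uses HttpClient content types rather than the HttpWebRequest suggested above, and the Latin1Form class is a hypothetical helper.

```csharp
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Web;

public static class Latin1Form
{
    // Percent-encodes one key/value pair using ISO-8859-1 byte values.
    public static string BuildBody(string key, string value)
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        return HttpUtility.UrlEncode(key, latin1) + "=" + HttpUtility.UrlEncode(value, latin1);
    }

    // Wraps the already-encoded body as raw bytes so it is sent as-is.
    public static HttpContent BuildContent(string key, string value)
    {
        // After percent-encoding, the body contains only ASCII characters.
        byte[] bytes = Encoding.ASCII.GetBytes(BuildBody(key, value));
        var content = new ByteArrayContent(bytes);
        content.Headers.ContentType = new MediaTypeHeaderValue("application/x-www-form-urlencoded");
        return content;
    }
}
```

Posting it would look like await httpClient.PostAsync(_url, Latin1Form.BuildContent("TXT", "åæ ÆÆ øØ ø")), where httpClient is an HttpClient instance you already have.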
