SendGrid inbound parse nordic chars - c#

Completely stuck on a problem related to the inbound parse webhook functionality offered by SendGrid: https://sendgrid.com/docs/for-developers/parsing-email/setting-up-the-inbound-parse-webhook/
First off everything is working just fine with retrieving the mail sent to my application endpoint. Using Request.Form I'm able to retrieve the data and work with it.
The problem is that we started noticing question mark symbols instead of letters when recieving some mails (written in swedish using Å Ä and Ö). This occured both when sending plaintext mails, and mails with an HTML-body.
However, this only happens every now and then. After a lot of searching I found out that if the mail is sent from e.g. Postbox or Outlook (or the like), and the application has the charset set to iso-8859-1 that's when Å Ä Ö is replaced by question marks.
To replicate the error and be able to debug it I set up a HTML page with a form using the iso-8859-1 encoding, sending a similar payload as the one seen in the link above (the default one). And after that been through testing a multitude of things trying to get it to work.
As of now I'm trying to recode the input, without success. Code I'm testing:
Encoding wind1252 = Encoding.GetEncoding(1252);
Encoding utf8 = Encoding.UTF8;
byte[] wind1252Bytes = wind1252.GetBytes(Request.Form.["html"]);
byte[] utf8Bytes = Encoding.Convert(wind1252, utf8,wind1252Bytes);
string utf8String = Encoding.UTF8.GetString(utf8Bytes);
This only results in the utf8String producing the same result with "???" where Å Ä Ö should be. My guess here is that perhaps it's due to the Request.Form["html"] returning a UTF-16 string, of the content that is encoded already in the wrong encoding iso-8859-1.
The method for fetching the POST is as follows
public async Task<InboundParseModel> FetchMail(IFormCollection form)
{
InboundParseModel _em = new InboundParseModel
{
To = form["to"].SingleOrDefault(),
From = form["from"].SingleOrDefault(),
Subject = form["subject"].SingleOrDefault(),
Html = form["html"].SingleOrDefault(),
Text = System.Net.WebUtility.HtmlEncode(form["text"].SingleOrDefault()),
Envelope = form["envelope"].SingleOrDefault()
};
}
Called from another method that the POST is done to by FetchMail(Request.Form);
Project info: ASP.NET Core 2.2, C#
So as stated earlier, I am completely stuck and don't really have any ideas on how to solve this. Any help would be much appreciated!

Related

Send UTF-8 string from Android to C#

I've been trying to accomplish a simple text transmission from my Android app to my C# server (asmx server), sending the simplest string - and for some reason it never works. My Android code is as following (assume that the variable 'message' holds the string as received from an EditText, which is UTF-16 as far as I'm concerned):
httpClient = new DefaultHttpClient();
HttpPost post = new HttpPost(POST_MESSAGE_ADDRESS);
byte[] messageBytes = message.getBytes("utf-8");
builder.addPart("message", new StringBody(messageBytes.toString()));
HttpEntity entity = builder.build();
post.setEntity(entity);
HttpResponse response = httpClient.execute(post);
So I get something simple for my message, say a 10 bytes array. In my server, I have a function set to that specific address; its code is:
string message = HttpContext.Current.Request.Form["message"];
byte[] test = System.Text.Encoding.UTF8.GetBytes(message);
Now after that line the byte array ('test') has the exact same value as the result of the ToString() function I called in the app. Question is, how do I convert it to normal UTF-8 text to display?
Note: I have tried sending the string normally as a string content, but as far as I understood the default coding is ASCII so I got a lot of question marks.
Edit: Now I'm looking for some conversions solutions and trying them, but my question is also if there's a simpler way to do that (perhaps BinaryBody in the android, or different coding?)
Problem is in following lines:
byte[] messageBytes = message.getBytes("utf-8");
builder.addPart("message", new StringBody(messageBytes.toString()));
First you are transforming your UTF-16 string message into UTF-8 encoded messageBytes only to convert them back to UTF-16 string in next line. And there you are using StringBody constructor that will use ASCII encoding as default.
You should replace those lines with:
builder.addPart("message", new StringBody(message, Charset.forName("UTF-8")));

C# : Change String Encoding?

I'm struggling with the encoding of one of my string.
On a Mail Sending WS, I'm receiving a bad string containing "�" instead of "é" (that's what I'm seeing in the Debug Mode of Visual Studio at least).
The character comes from some JSON that is deserialized when entering the WS into my DTO.
Changing the Content-Type of the JSON is not solving the thing.
So I thought I'll change the encoding of my string by myself, because the JSON encoding thing seems like a VS deserialization issue (I started a thread here if one of you guys want to take a look at it).
I tried :
Encoding iso = Encoding.GetEncoding("ISO-8859-1");
Encoding defaultEncoding = Encoding.Default;
byte[] bytes = defaultEncoding.GetBytes(messedUpString);
byte[] isoBytes = Encoding.Convert(defaultEncoding, iso, bytes);
cleanString = iso.GetString(isoBytes);
Or :
byte[] bytes = Encoding.Default.GetBytes(messedUpString);
cleanString = Encoding.UTF8.GetString(bytes);
And it's not really effective... I get rid of the "�" char, which is the nice part, but I'm receiving in the cleanString "?" instead of the expected "é", and this in not really nice, or at least, the expected behavior.
In fact, every thing was fine in my application.
I used SOAPUI to test, and this was my error.
I downloaded some rest plugin for my browser, try from there, and everything worked.
Thanks for the help though #MattiVirkkunen

Sending quotation marks in a GCM Payload (and other special characters that break syntax)

I'm struggling finding a feasible solution to this. I've tried looking around but can't find any documentation regarding this issue. If a customer sends out a message with quote(s), it break the payload syntax and android spits me back a 400 Bad Request error.
The only solution I can think of is by doing my own translations and validations. Allow only the basics, and for the restricted do my own "parsing" Ie take a quote, replace them with "/q" and then replace "/q" on the App when received. I don't like this solution because it involves logic on the App that if, I forget something. I want to be able to change it immediately rather then update everyones phone, app, etc.
I'm looking for an existing encoding I could apply that is processed correctly by the GCM servers. Allowing them to be accepted then broadcasted. Received by the phone with the characters intact.
Base64 encoding should get rid of the special characters. Just encode it before sending and decode it again on receiving:
Edit: sorry, just got a java/android sample here, I don't know how exactly xamarin works and what functions it provides:
// before sending
byte[] data = message.getBytes("UTF-8");
String base64Message = Base64.encodeToString(data, Base64.DEFAULT);
// on receiving
byte[] data = Base64.decode(base64Message , Base64.DEFAULT);
String message= new String(data, "UTF-8");
.Net translation of #tknell solution
Decode:
Byte[] data = System.Convert.FromBase64String(encodedString);
String decoded = System.Text.Encoding.UTF8.GetString(data);
Encode:
Byte[] data = System.Text.Encoding.UTF8.GetBytes(decodedString);
String encoded = System.Convert.ToBase64String(data);

C# - Korean Encoding

This might be different with other Korean encoding questions.
There is this site I have to scrape and it's Korean.
An example sentence in their site is this
"개인정보보호를 위해 뒤로가기 버튼 대신 검색결과 화면 상단과 하단의 이전 버튼을 사용하시기 바랍니다."
I am using HttpWebRequest and HttpWebResponse to scrape the site.
this is how I retreive the html
-- partial code --
using (Stream data = resp.GetResponseStream())
{
response.Append(new StreamReader(data, Encoding.GetEncoding(code), true).ReadToEnd());
}
now my problem is, am not getting the correct Korean characters. In my "code" variable, I'm basing the code page here in MSDN http://msdn.microsoft.com/en-us/library/system.text.encoding.aspx (let me narrow it down).
here are the Korean code pages:
51949, 50225, 20949, 20833, 10003, 949
but am still not getting the correct Korean characters? What you think is the problem?
It is very likely that the page is not in a specific Korean encoding, but one of the Unicode encodings.
Try Encoding.UTF8, Encoding.Default (UTF-16) instead of the specific code pages. There are also Encoding.UTF7 and Encoding.UTF32, but they are not as common.
To be certain, examine the meta tags and headers for the content-type returned by the server.
Update (gleaned from commments):
Since the content-type header is EUC-KR, the corresponding codepage is 51949 and this is what you need to use to retrieve the page.
It was not clear that you are writing this out to a file - you need to use the same encoding when writing the file out, or convert the byte[] from the original to the output file encoding (using Encoding.Convert).
While having exact same issue I've finished it with code below:
Encoding.UTF8.GetString(DownloadData(URL));
This directly transform output for the WebClient GET request to UTF8 encoding.

c# with SOAP - problem with utf-8 encoding

I'm using automatic conversion from wsdl to c#, everything works apart from encoding, whenever
I have native characters (like 'ł' or 'ó') I get '??' insted of them in string fields ('G????wny' instead of 'Główny'). How to deal with it? Server sends document with correct encoding, with header .
EDIT: I noticed in Wireshark, that packets send FROM me have BOM, but packets sends TO me, don't have it - maybe it's a root of problem?
So maybe the following will help:
What I am sure I did is:
In the webservice PHP file, after connecting to the Mysql Database I call:
mysql_query("SET CHARSET utf8");
mysql_query("SET NAMES utf8 COLLATE utf8_polish_ci");
The second I did:
In the same PHP file,
I added utf8_encode to the service on the $POST_DATA variable:
$server->service(utf8_encode($POST_DATA));
in the class.nusoap_base.php I changed:
`//var $soap_defencoding = 'ISO-8859-1';
var $soap_defencoding = 'UTF-8';`
and olso in the nusoap.php the same as above:
//var $soap_defencoding = 'ISO-8859-1';
var $soap_defencoding = 'UTF-8';
and in the nusoap.php file again:
var $decode_utf8 = true;
Now I can send and receive properly encoded data.
Hope this helps.
Regards,
The problem was on the server side with sent Content-Type parameter in header (it was set to "text/xml"). It occurs that for utf-8 it HAVE TO be "text/xml; charset=utf-8", other methods such as placing BOM aren't correct (according to RFC 3023). More info here: http://annevankesteren.nl/2005/03/text-xml

Categories