Parsing MIME mail type - c#

After a lot of effort I created my own mail parser, and I am now able to parse and display emails successfully. However, a few mails, especially ones sent from Apple Mail or an iPhone, look like this after parsing. I have no idea why this is happening. Please help.
=D8=AA=D9=88=D8= =A7=D8=AC=D9=87=D9=86=D9=8A =D9=85=D8=B4=D9=83=D9=84=D8=A9 =D8=A5=D8=B4=D8= =A7=D8=B1=D8=A9 =D9=84=D9=84=D9=83=D8=B1=D8=AA =D8=B1=D9=82=D9=85 410814189= 68 =D8=B9=D9=84=D9=85=D8=A7=D9=8B =D8=A8=D8=A3=D9=86 =D8=A5=D8=B4=D8=

It would appear that your mail parser does not handle decoding of Quoted-Printable content.
I imagine that if you looked at the headers, you'd find a header like this:
Content-Transfer-Encoding: quoted-printable
I've written several email clients and multiple mime parsers and am currently working on writing a new mime parser in C# (the others were in C) called MimeKit here: http://github.com/jstedfast/MimeKit. This may be of interest to you...
I've got a filterable stream class that you can add a QuotedPrintableDecoder to (which I've also implemented), then pass your data through that to decode it. Or you could just pass it through the QuotedPrintableDecoder directly, depending on whatever is easiest for you.
Example usage:
var decoder = new QuotedPrintableDecoder ();
var output = new byte[decoder.EstimateOutputLength (input.Length)];
var outputLength = decoder.Decode (input, 0, input.Length, output);
// convert the output into a string displayable to the user...
var text = System.Text.Encoding.UTF8.GetString (output, 0, outputLength);
Obviously you'd use the proper System.Text.Encoding for the content (by looking at the "charset" parameter in the Content-Type header) instead of blindly using System.Text.Encoding.UTF8.
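For readers who want to see what a quoted-printable decoder does before adopting a library, here is a minimal hand-rolled sketch. It is not MimeKit's implementation; the class name QuotedPrintableSketch and its Decode method are hypothetical, and it assumes the whole encoded body is already in memory as an ASCII string. It handles the two core rules: "=XX" becomes the byte with hex value XX, and "=" at the end of a line is a soft line break that is dropped.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only; not part of MimeKit.
public static class QuotedPrintableSketch
{
    // Decodes quoted-printable text into bytes, then into a string using the
    // charset taken from the Content-Type header (passed in by the caller).
    public static string Decode(string input, Encoding charset)
    {
        using (var buffer = new MemoryStream())
        {
            int i = 0;
            while (i < input.Length)
            {
                char c = input[i];
                if (c != '=')
                {
                    // Literal character (quoted-printable input should be ASCII).
                    buffer.WriteByte((byte)c);
                    i++;
                    continue;
                }

                // "=" followed by a line ending is a soft line break: skip it.
                if (i + 1 < input.Length && (input[i + 1] == '\r' || input[i + 1] == '\n'))
                {
                    i++;
                    if (i < input.Length && input[i] == '\r') i++;
                    if (i < input.Length && input[i] == '\n') i++;
                    continue;
                }

                // "=XX" is one byte written as two hex digits.
                if (i + 2 < input.Length && Uri.IsHexDigit(input[i + 1]) && Uri.IsHexDigit(input[i + 2]))
                {
                    buffer.WriteByte(Convert.ToByte(input.Substring(i + 1, 2), 16));
                    i += 3;
                    continue;
                }

                // Malformed escape: keep the "=" as it is.
                buffer.WriteByte((byte)'=');
                i++;
            }
            return charset.GetString(buffer.ToArray());
        }
    }
}
```

Calling QuotedPrintableSketch.Decode("=D8=AA=D9=88", Encoding.UTF8) turns the first four escaped bytes of the sample above back into two Arabic letters. A production parser should decode the byte stream incrementally, which is what a streaming decoder such as the one described above is for.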

Related

SendGrid inbound parse nordic chars

Completely stuck on a problem related to the inbound parse webhook functionality offered by SendGrid: https://sendgrid.com/docs/for-developers/parsing-email/setting-up-the-inbound-parse-webhook/
First off everything is working just fine with retrieving the mail sent to my application endpoint. Using Request.Form I'm able to retrieve the data and work with it.
The problem is that we started noticing question mark symbols instead of letters when receiving some mails (written in Swedish, using Å, Ä and Ö). This occurred both with plaintext mails and with mails that have an HTML body.
However, this only happens every now and then. After a lot of searching I found out that Å, Ä and Ö are replaced by question marks when the mail is sent from e.g. Postbox or Outlook (or the like) and the application has the charset set to iso-8859-1.
To replicate the error and be able to debug it, I set up an HTML page with a form using the iso-8859-1 encoding, sending a payload similar to the one seen in the link above (the default one). Since then I have tested a multitude of things trying to get it to work.
As of now I'm trying to recode the input, without success. Code I'm testing:
Encoding wind1252 = Encoding.GetEncoding(1252);
Encoding utf8 = Encoding.UTF8;
byte[] wind1252Bytes = wind1252.GetBytes(Request.Form["html"]);
byte[] utf8Bytes = Encoding.Convert(wind1252, utf8, wind1252Bytes);
string utf8String = Encoding.UTF8.GetString(utf8Bytes);
This only results in utf8String containing the same "???" where Å, Ä and Ö should be. My guess is that Request.Form["html"] already returns a decoded (UTF-16) string, built from iso-8859-1 content that was decoded with the wrong encoding.
The method for fetching the POST is as follows
public async Task<InboundParseModel> FetchMail(IFormCollection form)
{
    InboundParseModel _em = new InboundParseModel
    {
        To = form["to"].SingleOrDefault(),
        From = form["from"].SingleOrDefault(),
        Subject = form["subject"].SingleOrDefault(),
        Html = form["html"].SingleOrDefault(),
        Text = System.Net.WebUtility.HtmlEncode(form["text"].SingleOrDefault()),
        Envelope = form["envelope"].SingleOrDefault()
    };
    // Return the populated model (the original snippet omitted the return)
    return _em;
}
It is called from the method that receives the POST, as FetchMail(Request.Form);
Project info: ASP.NET Core 2.2, C#
So as stated earlier, I am completely stuck and don't really have any ideas on how to solve this. Any help would be much appreciated!
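The guess above, that the damage happens before the re-encoding code ever runs, can be checked in isolation. The sketch below assumes (the question does not confirm this) that the framework decoded the iso-8859-1 bytes as UTF-8. The class and method names are hypothetical. It shows that once invalid bytes have been turned into replacement characters, converting between encodings cannot bring Å, Ä and Ö back, whereas decoding the original bytes with the right charset works.

```csharp
using System.Text;

// Hypothetical helper for illustration; not part of ASP.NET Core or SendGrid.
public static class CharsetRoundTrip
{
    // Decodes the raw bytes with the charset the sender actually used.
    public static string DecodeAs(byte[] raw, string charsetName)
        => Encoding.GetEncoding(charsetName).GetString(raw);

    // Mimics the attempt in the question: start from a string that has
    // already been decoded, then convert it between encodings.
    public static string ReencodeLikeQuestion(string alreadyDecoded)
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        byte[] bytes = latin1.GetBytes(alreadyDecoded);
        byte[] utf8Bytes = Encoding.Convert(latin1, Encoding.UTF8, bytes);
        return Encoding.UTF8.GetString(utf8Bytes);
    }
}
```

With byte[] raw = { 0xC5, 0xC4, 0xD6 } (the iso-8859-1 bytes for "ÅÄÖ"), DecodeAs(raw, "iso-8859-1") returns the expected text, while ReencodeLikeQuestion(Encoding.UTF8.GetString(raw)) does not, because the original byte values were lost at the first, wrong decode. Any fix therefore has to be applied where the raw request bytes are still available.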

Sending quotation marks in a GCM Payload (and other special characters that break syntax)

I'm struggling to find a feasible solution to this. I've looked around but can't find any documentation regarding this issue. If a customer sends out a message containing quotes, it breaks the payload syntax and Android sends me back a 400 Bad Request error.
The only solution I can think of is doing my own translation and validation: allow only the basics, and do my own "parsing" for the restricted characters, i.e. take a quote, replace it with "/q", and then replace "/q" again in the app when the message is received. I don't like this solution because it puts logic in the app, and if I forget something I want to be able to change it immediately rather than update everyone's phone, app, etc.
I'm looking for an existing encoding I could apply that is processed correctly by the GCM servers, so that messages are accepted, broadcast, and received by the phone with the characters intact.
Base64 encoding should get rid of the special characters. Just encode it before sending and decode it again on receiving:
Edit: sorry, just got a java/android sample here, I don't know how exactly xamarin works and what functions it provides:
// before sending
byte[] data = message.getBytes("UTF-8");
String base64Message = Base64.encodeToString(data, Base64.DEFAULT);
// on receiving
byte[] data = Base64.decode(base64Message , Base64.DEFAULT);
String message= new String(data, "UTF-8");
.NET translation of @tknell's solution
Decode:
Byte[] data = System.Convert.FromBase64String(encodedString);
String decoded = System.Text.Encoding.UTF8.GetString(data);
Encode:
Byte[] data = System.Text.Encoding.UTF8.GetBytes(decodedString);
String encoded = System.Convert.ToBase64String(data);

C# display XML from html POST

Got a problem here... If I put the XML file on the server, I can read it through a StreamReader, convert it to a variable and get everything working in the MSSQL database.
However, I am required to send it through an HTML POST, and that doesn't work with the code below:
page.Response.ContentType = "text/xml";
StreamReader reader = new StreamReader(page.Request.InputStream);
inputString = reader.ReadToEnd();
deleteShip(inputString);
It seems to me that the code above doesn't get the XML that is POSTed from my program, because the same code in deleteShip works fine when I use an XML file on the server.
Is there a way to solve this problem? As long as I can send any string to deleteShip(string s), I'm happy. The string will be in XML format, though.
Thanks for the help!
It would be useful to see how the XML is POSTed to your program. Typically, data is sent from an HTML form as name-value pairs in the HTTP request body when using the POST method. It's not clear from your question whether you're using an HTML form to POST the XML to your program and it's hard to tell what might be going wrong without more information.
From your code it looks like you're reading the entire HTTP request where you'd usually read the value of a request parameter for example:
Request["XmlParameterName"]
Where XmlParameterName is the name of an HTML form input field.
Have you inspected the value of the inputString variable? Is it valid XML? Is it encoded correctly? Are any invalid characters like ampersands (&) escaped correctly?
Update your question with a bit more information if none of the things I mentioned are the problem.
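OK, I got it fixed.
Here is the code.
System.IO.Stream stream;
string inputString;
Int32 stringLength;
stream = Request.InputStream;
stringLength = Convert.ToInt32(stream.Length);
byte[] stringArray = new byte[stringLength];
// Copy the POST body into the buffer; without this call the buffer stays all zeros
stream.Read(stringArray, 0, stringLength);
// ASCII turns every non-ASCII byte into '?'; use the request's real encoding if the XML is not plain ASCII
inputString = System.Text.Encoding.ASCII.GetString(stringArray, 0, stringLength);
deleteShip(inputString);
This way it accesses the POST body of my HTML request (which in this case is XML).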
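A single Stream.Read call is allowed to return fewer bytes than requested, so a slightly more defensive variant copies the whole stream before decoding. The following is a minimal sketch under stated assumptions: RequestBodySketch and ReadAll are hypothetical names, and the caller is assumed to know the encoding of the body (UTF-8 in the usage note).

```csharp
using System.IO;
using System.Text;

// Hypothetical helper for illustration only.
public static class RequestBodySketch
{
    // Reads every remaining byte of the stream and decodes it with the
    // given encoding. CopyTo loops internally until the stream is exhausted.
    public static string ReadAll(Stream body, Encoding encoding)
    {
        using (var buffer = new MemoryStream())
        {
            body.CopyTo(buffer);
            return encoding.GetString(buffer.ToArray());
        }
    }
}
```

In the code above this would be used as deleteShip(RequestBodySketch.ReadAll(Request.InputStream, Encoding.UTF8)); the choice of UTF-8 here is an assumption about the posted XML, not something the thread states.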

c# with SOAP - problem with utf-8 encoding

I'm using automatic conversion from WSDL to C#. Everything works apart from encoding: whenever I have native characters (like 'ł' or 'ó') I get '??' instead of them in string fields ('G????wny' instead of 'Główny'). How do I deal with it? The server sends the document with the correct encoding, with header .
EDIT: I noticed in Wireshark that packets sent FROM me have a BOM, but packets sent TO me don't. Maybe that is the root of the problem?
Maybe the following will help.
The first thing I did:
In the web service PHP file, after connecting to the MySQL database, I call:
mysql_query("SET CHARSET utf8");
mysql_query("SET NAMES utf8 COLLATE utf8_polish_ci");
The second thing I did:
In the same PHP file,
I added utf8_encode to the service call on the $POST_DATA variable:
$server->service(utf8_encode($POST_DATA));
In class.nusoap_base.php I changed:
//var $soap_defencoding = 'ISO-8859-1';
var $soap_defencoding = 'UTF-8';
and also in nusoap.php, the same as above:
//var $soap_defencoding = 'ISO-8859-1';
var $soap_defencoding = 'UTF-8';
and in the nusoap.php file again:
var $decode_utf8 = true;
Now I can send and receive properly encoded data.
Hope this helps.
Regards,
The problem was on the server side, in the Content-Type header it sent (it was set to "text/xml"). It turns out that for UTF-8 it has to be "text/xml; charset=utf-8"; other methods, such as adding a BOM, aren't correct (according to RFC 3023). More info here: http://annevankesteren.nl/2005/03/text-xml
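To illustrate why the missing charset parameter matters, here is a minimal C# sketch of how a client might choose the decoding from the Content-Type header. ContentTypeCharset and FromContentType are hypothetical names, not what the generated SOAP proxy actually does internally. The us-ascii fallback for text/xml without a charset follows RFC 3023, which is the behaviour that turns 'ł' and 'ó' into question marks.

```csharp
using System.Net.Http.Headers;
using System.Text;

// Hypothetical helper for illustration only.
public static class ContentTypeCharset
{
    // Picks the text encoding announced by a Content-Type header value.
    public static Encoding FromContentType(string contentType)
    {
        MediaTypeHeaderValue parsed = MediaTypeHeaderValue.Parse(contentType);
        string charset = parsed.CharSet;

        // RFC 3023: text/xml without a charset parameter defaults to us-ascii.
        if (string.IsNullOrEmpty(charset))
            return Encoding.ASCII;

        // The parameter value may be quoted, e.g. charset="utf-8".
        return Encoding.GetEncoding(charset.Trim('"'));
    }
}
```

FromContentType("text/xml; charset=utf-8") yields a UTF-8 encoding, while FromContentType("text/xml") yields ASCII, under which every non-ASCII byte decodes to '?'.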

Parsing mail subject with inline specified encoding

I'm trying to parse an email subject that has its encoding specified in the subject text itself. I understand the format and can imagine how this could be done, but maybe there is already a free .NET solution available, so that I don't waste time on it?
Here is an example of subject I want to parse:
=?ISO-8859-13?Q?Fwd=3A_Dvira=E8iai_vasar=E0_vagiami_da=FEniau=2C_bet_draust?=
I found a great library for parsing such strings, and whole mail messages in general: SharpMimeTools.
It can't get mail from a POP3 server on its own (I use OpenPop.Net for that), but it parses it nicely. Way, way better than the OpenPop.Net parser.
var popClient = new POPClient();
popClient.Connect("pop.test.lt", 110, false);
popClient.Authenticate("test@test.lt", "test");
// Get OpenPop.Net message
var messageInfo = popClient.GetMessage(1, false);
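// Convert the raw message string into a stream and create a SharpMessage instance from the SharpMimeTools library
// (rawMessage is the raw message text obtained from OpenPop.Net; how it is extracted is not shown here)
var messageBytes = Encoding.ASCII.GetBytes(rawMessage);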
var messageStream = new MemoryStream(messageBytes);
var message = new SharpMessage(messageStream);
// Get decoded message and subject
var messageText = message.Body;
var messageSubject = message.Subject;
I am one of the developers of OpenPop.NET, and a new release has now been made. You should not see any problems parsing emails in OpenPop.NET anymore. If you find any, please let us know on our mailing list.
We even implemented a test case for your specific subject - just to make sure.
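For reference, the subject format in the question is an RFC 2047 "encoded-word": =?charset?encoding?text?=, where the encoding is Q (quoted-printable-like, with "_" meaning space) or B (Base64). The sketch below shows the decoding steps in plain C#. It is not how SharpMimeTools or OpenPop.NET implement it; EncodedWordSketch is a hypothetical name, and it does not handle refinements such as joining adjacent encoded-words. On .NET Core, a charset like ISO-8859-13 is only available after registering the code pages provider with Encoding.RegisterProvider(CodePagesEncodingProvider.Instance); the .NET Framework supports it out of the box.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class EncodedWordSketch
{
    // Matches one encoded-word: =?charset?Q-or-B?encoded text?=
    private static readonly Regex Word =
        new Regex(@"=\?([^?]+)\?([QqBb])\?([^?]*)\?=");

    // Replaces every encoded-word in a header value with its decoded text.
    public static string Decode(string header)
    {
        return Word.Replace(header, m =>
        {
            Encoding charset = Encoding.GetEncoding(m.Groups[1].Value);
            string text = m.Groups[3].Value;
            byte[] bytes = m.Groups[2].Value.ToUpperInvariant() == "B"
                ? Convert.FromBase64String(text)
                : DecodeQ(text);
            return charset.GetString(bytes);
        });
    }

    // Q encoding: "_" is a space, "=XX" is a byte in hex, the rest is literal.
    private static byte[] DecodeQ(string text)
    {
        using (var buffer = new MemoryStream())
        {
            for (int i = 0; i < text.Length; i++)
            {
                char c = text[i];
                if (c == '_')
                {
                    buffer.WriteByte((byte)' ');
                }
                else if (c == '=' && i + 2 < text.Length
                         && Uri.IsHexDigit(text[i + 1]) && Uri.IsHexDigit(text[i + 2]))
                {
                    buffer.WriteByte(Convert.ToByte(text.Substring(i + 1, 2), 16));
                    i += 2; // skip the two hex digits just consumed
                }
                else
                {
                    buffer.WriteByte((byte)c);
                }
            }
            return buffer.ToArray();
        }
    }
}
```

With the provider registered where needed, EncodedWordSketch.Decode can be applied to the raw Subject header value; text outside encoded-words is passed through unchanged.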
