Accessing headers of body part in multipart mail in Outlook using C# - c#

I am trying to extract the headers of the body part of a multipart mail message in Outlook. The raw message (which I have not been able to get from my code) looks something like this:
Return-Path: ...
Received: ...
From: ...
Content-Type: multipart/signed; boundary="Apple-Mail=_06FDFEBB-366E-4B1E-AA7F-F5DDEC13FD03"; protocol="application/pgp-signature"; micalg=pgp-sha512
Subject: ...
Message-Id: ...
Date: ...
To: ...
Mime-Version: ...
X-Mailer: ...
--Apple-Mail=_06FDFEBB-366E-4B1E-AA7F-F5DDEC13FD03
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
charset=us-ascii
...
--Apple-Mail=_06FDFEBB-366E-4B1E-AA7F-F5DDEC13FD03
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
filename=signature.asc
Content-Type: application/pgp-signature;
name=signature.asc
Content-Description: Message signed with OpenPGP using GPGMail
...
--Apple-Mail=_06FDFEBB-366E-4B1E-AA7F-F5DDEC13FD03--
I have replaced some of the irrelevant parts with dots. The headers I am trying to get are the ones under the first boundary. So this is the part I am looking for:
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
charset=us-ascii
However, if I could get the entire part between the boundaries, that would also be fine as I could parse it myself.
So far, I have only been able to get the headers at the top of the message (so from Return-Path until X-Mailer).
I was able to do that using a PropertyAccessor in the following way:
mailItem.PropertyAccessor.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x007D001E")
In this case mailItem is my Microsoft.Office.Interop.Outlook.MailItem instance.
So, what my question comes down to: How can I get the headers under the first boundary, or any bigger part of the message that contains them?

For signed messages, Outlook preserves the signed message body (with full MIME data) in an attachment called smime.p7m (it's always called smime, even if it's actually PGP/MIME). Unfortunately, Outlook hides this from you, transparently unpacking the signed message and displaying it instead. There's no way, using the Outlook Object Model, to get the actual message body.
However, if you're willing to call MAPI directly (easiest from native code, but can be done from .NET if you're not afraid of some nasty COM interop), you can get the multipart/signed body - both the signature and the complete signed part - as follows:
1. Starting from the Outlook MailItem, get the MAPIOBJECT property. This is actually a MAPI IMessage.
2. Cast it to an IMAPISecureMessage (.NET will handle this as a QueryInterface behind the scenes).
3. Call GetBaseMessage() on this IMAPISecureMessage (the only documented function), which returns another IMessage. This is the "real" message, the one with the smime.p7m attachment. Unfortunately, there's no way to put this back into OOM, so you have to continue using MAPI.
4. By calling the functions on IMessage, you can get the attachment, then get its data.
5. You'll need to parse the MIME parts, at least enough to get the signed part without its headers, outer boundaries, or of course the signature part.
6. Verify that signed part (without decoding its internal parts, if any, or decoding quoted-printable or anything like that) against the signature blob.
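The MIME-parsing step can be sketched in plain C#. This is a minimal illustration, not a full MIME parser: it assumes you already have the raw multipart text as a string (for example, from the smime.p7m attachment data) and that you know the boundary value. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class MimePartSketch
{
    // Returns the header lines of the first body part delimited by the given boundary.
    // Folded (continuation) lines are joined onto the header they belong to.
    public static List<string> FirstPartHeaders(string rawMime, string boundary)
    {
        var headers = new List<string>();
        string delimiter = "--" + boundary;
        bool inPart = false;

        foreach (string rawLine in rawMime.Replace("\r\n", "\n").Split('\n'))
        {
            string line = rawLine.TrimEnd();
            if (!inPart)
            {
                // Skip everything up to and including the first delimiter line.
                inPart = line == delimiter;
                continue;
            }
            // A blank line ends the part headers; a delimiter ends the part itself.
            if (line.Length == 0 || line.StartsWith(delimiter, StringComparison.Ordinal))
                break;

            // A line starting with whitespace continues the previous header.
            if (char.IsWhiteSpace(rawLine[0]) && headers.Count > 0)
                headers[headers.Count - 1] += " " + line.Trim();
            else
                headers.Add(line);
        }
        return headers;
    }
}
```

For the message in the question, calling FirstPartHeaders with the Apple-Mail boundary would be expected to yield the Content-Transfer-Encoding line and the unfolded Content-Type line of the text/plain part.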

The PR_TRANSPORT_MESSAGE_HEADERS property is the only thing you can get. Outlook does not store the full MIME source of the original message.
You can see what is available in OutlookSpy (I am its author) - click IMessage button.

Related

Sending/Uploading Files through APIs - .Net Core

I would like to send a file through an API. The server will receive the file and save it on the server drive.
I have two options for doing this:
Read the file as a string and send the whole string in the request body
Use multipart
Could someone explain the pros and cons of the two options?
This is not really a matter of pros and cons; first you need to understand what multipart means:
Multipart requests combine one or more sets of data into a single
body, separated by boundaries. You typically use these requests for
file uploads and for transferring data of several types in a single
request (for example, a file along with a JSON object).
This means that a multipart body contains different sections of data separated by a boundary (an arbitrary string that does not occur in the content).
For example:
POST /upload HTTP/1.1
Content-Length: 428
Content-Type: multipart/form-data; boundary=abcde12345
--abcde12345
Content-Disposition: form-data; name="id"
Content-Type: text/plain
{...Additional plain content goes here...}
--abcde12345
Content-Disposition: form-data; name="address"
Content-Type: application/json
{
"street": "3, Garden St",
"city": "Hillsbery, UT"
}
--abcde12345
Content-Disposition: form-data; name="profileImage "; filename="image1.png"
Content-Type: application/octet-stream
{...file content...}
--abcde12345--
In this example you can see different kinds of content (plain text, a JSON object and file content) sent in one request, separated by the boundary (boundary=abcde12345).
The file content section carries the file data as application/octet-stream (binary, or a Base64 string), so you are still sending the file data as a string, as in your first option, but you can include additional information alongside it, such as the new file name to save it under or the user who uploaded the file.
Hope that makes it clear.
Ref:
https://swagger.io/docs/specification/describing-request-body/file-upload/
https://swagger.io/docs/specification/describing-request-body/multipart-requests/
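As a rough sketch of how such a body could be produced in .NET, the following uses MultipartFormDataContent from System.Net.Http. The boundary and field names are taken from the example above; the id value "42", the JSON string and the file bytes are placeholders, and the class name is hypothetical.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;

public static class MultipartSketch
{
    // Builds a multipart/form-data body with a text field, a JSON field and a file.
    public static MultipartFormDataContent Build(byte[] fileBytes)
    {
        var form = new MultipartFormDataContent("abcde12345"); // explicit boundary

        form.Add(new StringContent("42", Encoding.UTF8, "text/plain"), "id");
        form.Add(new StringContent("{\"city\":\"Hillsbery, UT\"}", Encoding.UTF8, "application/json"), "address");

        var file = new ByteArrayContent(fileBytes);
        file.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
        form.Add(file, "profileImage", "image1.png"); // field name and file name

        return form;
    }

    public static void Main()
    {
        using (var form = Build(new byte[] { 1, 2, 3 }))
        {
            Console.WriteLine(form.Headers.ContentType);
            Console.WriteLine(form.ReadAsStringAsync().GetAwaiter().GetResult());
        }
    }
}
```

The same content object can be passed to HttpClient.PostAsync; printing it here only makes the boundaries and per-part headers visible.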
The only difference at the protocol level is that multipart/form-data requests must adhere to RFC 2388 (since obsoleted by RFC 7578), while a custom typed request body can be arbitrary.
Using a Base64 string has an advantage for uploading very small individual images: it's easy to handle and avoids a dependency on Http.IFormFile. On the other hand, Base64-encoded files are about a third larger than the original, and you need to decode them on the server side.
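The size overhead is easy to check: Base64 emits 4 characters for every 3 input bytes, rounded up. A small sketch (the class and method names are only for illustration):

```csharp
using System;

public static class Base64Overhead
{
    // Length of the padded Base64 text for a payload of byteCount bytes.
    public static int EncodedLength(int byteCount)
    {
        return 4 * ((byteCount + 2) / 3);
    }

    public static void Main()
    {
        var payload = new byte[3000];
        string encoded = Convert.ToBase64String(payload);
        Console.WriteLine(payload.Length + " bytes -> " + encoded.Length + " Base64 characters");
    }
}
```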

Discrepancy in text/plain content encoding returned by Gmail API

I am experimenting with reading multipart/mixed emails with GMail API.
The goal is to correctly decode each text/plain part of the multipart/mixed email (there can be many, in different encodings) to a C# string (i.e. UTF-16):
public static string DecodeTextPart(Google.Apis.Gmail.v1.Data.MessagePart part)
{
    var content_type_header = part.Headers.FirstOrDefault(h => string.Equals(h.Name, "content-type", StringComparison.OrdinalIgnoreCase));
    if (content_type_header == null)
        throw new ArgumentException("No content-type header found in the email part");
    var content_type = new System.Net.Mime.ContentType(content_type_header.Value);
    if (!string.Equals(content_type.MediaType, "text/plain", StringComparison.OrdinalIgnoreCase))
        throw new ArgumentException("The part is not text/plain");
    return Encoding.GetEncoding(content_type.CharSet).GetString(GetAttachmentBytes(part.Body));
}
GetAttachmentBytes returns raw attachment bytes, without conversion, decoded from the base64url encoding that GMail uses.
What I find is that in many cases this produces invalid strings, because the raw bytes that I get for the attachment content appear to always be in UTF-8, even though content-type of that same part declares otherwise.
E.g. given the email:
Date: ...
From: ...
Reply-To: ...
Message-ID: ...
To: ...
Subject: Test 1 text file
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="----------0E50FC0802A2FCCAA"
------------0E50FC0802A2FCCAA
Content-Type: text/plain; charset=windows-1251
Content-Transfer-Encoding: 8bit
Content test: Cyrillic, Windows-1251 (à, ÿ, æ)
------------0E50FC0802A2FCCAA
Content-Type: TEXT/PLAIN;
name="Irrelevant.txt"
Content-transfer-encoding: base64
Content-Disposition: attachment;
filename="Irrelevant.txt"
VGhpcyBmaWxlIGRvZXMgbm90IGNvbnRhaW4gdXNlZnVsIGluZm9ybWF0aW9u
------------0E50FC0802A2FCCAA--
With this email, I successfully find the first part, the code above determines that it's charset=windows-1251 with the help of System.Net.Mime.ContentType, and then .GetString() returns garbage because the actual raw bytes returned by GetAttachmentBytes correspond to UTF-8 encoding, not Windows-1251.
Exactly the same happens with
Subject: Test 2 text file
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="----------0B716C1D8123D8710"
------------0B716C1D8123D8710
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 8bit
Content test: Cyrillic, koi-8 (Б, С, Ц)
------------0B716C1D8123D8710
Content-Type: TEXT/PLAIN;
name="Irrelevant.txt"
Content-transfer-encoding: base64
Content-Disposition: attachment;
filename="Irrelevant.txt"
VGhpcyBmaWxlIGRvZXMgbm90IGNvbnRhaW4gdXNlZnVsIGluZm9ybWF0aW9u
------------0B716C1D8123D8710--
Note that the three test letters in the parentheses after the encoding name are the same in both emails, and in Unicode look like (а, я, ж), but (correctly) look wrong in the email body representation quoted above due to the different encodings.
If I "fix" the function to always use Encoding.UTF8 instead of GetEncoding(content_type.CharSet), then it appears to work in the tests that I've done so far.
At the same time, the GMail interface displays the letters correctly in both cases, so it must have correctly parsed the incoming emails using the correct declared encodings.
Is it the case that the GMail API re-encodes all text chunks into UTF-8 (wrapped in base64url), but reports the original charset for them?
Am I therefore supposed to always use UTF-8 with GMail API and disregard content-type's charset=?
Or is there a problem with my code?
According to these two resources:
Stack Overflow: Gmail API decoding messages in Javascript
GitHub: Google API Python Client: Invalid message body size
The Value is indeed a base-64 encoded representation of the part converted to UTF-8.
This is however not documented by Google, as far as I can find.
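If that behaviour holds, decoding a part reduces to base64url decoding followed by UTF-8, regardless of the declared charset. A minimal sketch under that assumption (which, as noted, is not documented by Google); the class and method names are hypothetical and the input is the body's data string:

```csharp
using System;
using System.Text;

public static class GmailBodySketch
{
    // Converts base64url (RFC 4648, section 5) to bytes: restore '+' and '/', then re-add padding.
    public static byte[] FromBase64Url(string data)
    {
        string s = data.Replace('-', '+').Replace('_', '/');
        switch (s.Length % 4)
        {
            case 2: s += "=="; break;
            case 3: s += "="; break;
        }
        return Convert.FromBase64String(s);
    }

    // Assumes the API has already converted the text to UTF-8,
    // so the charset from the Content-Type header is ignored.
    public static string DecodeBodyAsUtf8(string base64UrlData)
    {
        return Encoding.UTF8.GetString(FromBase64Url(base64UrlData));
    }
}
```

Since this relies on observed rather than documented behaviour, it may be worth keeping the declared charset around for logging in case a message ever decodes to replacement characters.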

Not able to get date or header from "From" line with MimeKit

When I try to parse an MBOX file with MimeKit, it fails at grabbing the first line which starts with "From". When I try to import the same data in Thunderbird (ImportExportTools NG), it parses the first line.
Here is a sample section that is not working
From 1234#test Sat Feb 3 05:50:57 2018
From: Test Name <test.name#test.com>
MIME-Version: 1.0
Content-Type: text/plain
Hello
MimeKit seems to be only recognizing this
From: Test Name <test.name#test.com>
MIME-Version: 1.0
Content-Type: text/plain
Maybe I am going about it incorrectly; I would love some guidance on how to get the date from the first line.
Mbox From-lines are not part of a MIME message, they are just markers in an mbox file.
You can get access to the mbox marker (aka From-line) from the MimeParser via the MboxMarker property.
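Once you have the marker text, the date still has to be parsed out of it. A minimal sketch, assuming the marker is a string shaped like the sample in the question ("From", a sender token, then an asctime-style timestamp with no time zone); the class and method names are hypothetical:

```csharp
using System;
using System.Globalization;

public static class MboxMarkerSketch
{
    // Parses the timestamp at the end of an mbox From-line,
    // e.g. "From 1234#test Sat Feb 3 05:50:57 2018".
    public static DateTime ParseDate(string marker)
    {
        // Splitting on spaces and dropping empty entries also copes with "Feb  3" (padded day).
        string[] tokens = marker.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (tokens.Length < 7 || tokens[0] != "From")
            throw new FormatException("Not an mbox From-line: " + marker);

        // The last five tokens are: day name, month name, day, time, year.
        string stamp = string.Join(" ", tokens, tokens.Length - 5, 5);
        return DateTime.ParseExact(stamp, "ddd MMM d HH:mm:ss yyyy", CultureInfo.InvariantCulture);
    }
}
```

Some mbox writers append a time zone or use other timestamp layouts, which this sketch does not handle.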

How to call methods after HttpResponse?

I am a bit puzzled at the moment. I have a web application that manipulates a file and then returns the file to a user's browser for download when it's done.
The download part is going well, as I'm using Response.AddHeader and Response.BinaryWrite to push the file back to the browser, but I am unable to call any further methods after using the Response methods.
I suppose I have not worked with HttpResponse enough to know the trick to this. Perhaps I would be better off using another class or generic handler to handle the download?
My code goes something like...
// Methods to be called first
Response.Clear();
Response.ClearContent();
Response.ClearHeaders();
Response.AddHeader("Content-Disposition", string.Format("attachment;filename={0}.pdf", "New_Merged_PDF_" + DateTime.Now.ToString("MMMM-dd-yyyy")));
Response.ContentType = "application/pdf";
Response.BinaryWrite(output.ToArray());
Response.End();
// Methods to be called last (these won't work)
Probably something simple that I'm overlooking but I'm still trying to figure it out.
To add a little color to Servy's explanation: there is an order of operations within the HTTP protocol specification. One rule is that the headers need to be sent to the client before the body. This allows the receiver of the response to properly deal with the body based on the headers.
The presence of a message-body in a request is signaled by the
inclusion of a Content-Length or Transfer-Encoding header field in
the request's message-headers.
IETF RFC 2616 - Hypertext Transfer Protocol -- HTTP/1.1
One of the few exceptions (possibly the only one, I'm not sure) is Content-Type: multipart/mixed:
IETF RFC 1867 - Form-based File Upload in HTML (although Obsolete, examples are still relevant)
Content-type: multipart/form-data, boundary=AaB03x
--AaB03x
content-disposition: form-data; name="field1"
Joe Blow
--AaB03x
content-disposition: form-data; name="pics"; filename="file1.txt"
Content-Type: text/plain
... contents of file1.txt ...
--AaB03x--

Error in reading image from NetworkStream

I have received an image with some text while reading from a NetworkStream. The stream includes something like this:
HTTP/1.0 200 OK
Expires: -1
Cache-Control: no-cache
Content-length: 29160
Content-type: image/jpeg
...followed by the image.
How can I read just the image from the NetworkStream?
You would have to parse the HTTP header first, to know where to stop discarding data. Alternatively, save the whole thing and then examine it afterwards, which may be simpler. Basically you'd be looking for two ASCII carriage-return/line-feed ("\r\n") pairs in a row.
However, there's a much better alternative: use an HTTP library. Parsing this yourself is like using text manipulation to handle XML; you're better off working at a higher level of abstraction with code which has been well tested for that abstraction.
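If you do parse it by hand, the "save the whole thing and examine it afterwards" approach might look like the sketch below. It assumes the sender closes the connection after the response (typical for HTTP/1.0) and that the body is not chunked or compressed; the class and method names are hypothetical.

```csharp
using System;
using System.IO;

public static class HttpBodySketch
{
    // Returns the index of the first byte after the "\r\n\r\n" header terminator, or -1.
    public static int FindBodyStart(byte[] buffer, int length)
    {
        for (int i = 0; i + 3 < length; i++)
        {
            if (buffer[i] == 13 && buffer[i + 1] == 10 && buffer[i + 2] == 13 && buffer[i + 3] == 10)
                return i + 4;
        }
        return -1;
    }

    // Reads the whole response and returns only the bytes after the header block.
    public static byte[] ReadBody(Stream stream)
    {
        using (var all = new MemoryStream())
        {
            stream.CopyTo(all); // reads until the sender closes the stream
            byte[] data = all.ToArray();

            int start = FindBodyStart(data, data.Length);
            if (start < 0)
                throw new InvalidDataException("No end of HTTP header found");

            byte[] body = new byte[data.Length - start];
            Array.Copy(data, start, body, 0, body.Length);
            return body;
        }
    }
}
```

On a keep-alive connection CopyTo would block waiting for more data, so there you would stop after reading Content-length bytes of body instead, which is one more reason to prefer an HTTP library.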
