Sending/Uploading Files through APIs - .Net Core - c#

I would like to send a file through an API. The server will receive the file and save it on the server drive.
I have two options for doing this:
Read the file as a string and send the whole string in the request body
Use multipart
Could someone explain the pros and cons of each option?

This is not really a question of pros and cons; it helps to first understand what multipart means:
Multipart requests combine one or more sets of data into a single
body, separated by boundaries. You typically use these requests for
file uploads and for transferring data of several types in a single
request (for example, a file along with a JSON object).
This means that a multipart body contains several sections of data, separated by a boundary (an arbitrary string that must not occur inside the data itself).
For example:
POST /upload HTTP/1.1
Content-Length: 428
Content-Type: multipart/form-data; boundary=abcde12345

--abcde12345
Content-Disposition: form-data; name="id"
Content-Type: text/plain

{...Additional plain content goes here...}
--abcde12345
Content-Disposition: form-data; name="address"
Content-Type: application/json

{
"street": "3, Garden St",
"city": "Hillsbery, UT"
}
--abcde12345
Content-Disposition: form-data; name="profileImage"; filename="image1.png"
Content-Type: application/octet-stream

{...file content...}
--abcde12345--
In this example you can see different kinds of content (plain text, a JSON object and file content) sent in one request, separated by the boundary (boundary=abcde12345).
The file section carries the file data as application/octet-stream, that is, the raw bytes of the file (or, less commonly, a Base64 string). So you are still sending the file data in the body, as in your first option :) but you can include additional information alongside it, such as the name the file should be saved under, the user who uploaded it, or whatever else you need.
Hope you get the point.
Ref:
https://swagger.io/docs/specification/describing-request-body/file-upload/
https://swagger.io/docs/specification/describing-request-body/multipart-requests/

The only difference at the protocol level is that multipart/form-data requests must adhere to RFC 7578 (which obsoletes RFC 2388), while a request body with a custom type can be arbitrary.
Using a Base64 string has an advantage when uploading very small individual images: it is easy to handle and avoids a dependency on Http.IFormFile. On the other hand, Base64-encoded files are about a third larger than the original, and you need to decode them on the server side.
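To make the size overhead concrete, here is a minimal sketch. The 3000-byte random payload is a hypothetical stand-in for a small file; any byte[] behaves the same way.

```csharp
using System;

// Hypothetical payload standing in for a small file.
byte[] original = new byte[3000];
new Random(42).NextBytes(original);

// Base64 maps every 3 input bytes to 4 output characters.
string encoded = Convert.ToBase64String(original);

// The server has to reverse the step to recover the bytes.
byte[] decoded = Convert.FromBase64String(encoded);

Console.WriteLine($"original: {original.Length} bytes");
Console.WriteLine($"base64:   {encoded.Length} characters");
Console.WriteLine($"decoded:  {decoded.Length} bytes");
```

With 3000 input bytes the encoded string is 4000 characters long, which is where the "about a third larger" figure comes from.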

Related

Upload file in server with a multipart/form-data - C#

I want to upload a file to Windchill (a PLM product from PTC). They give us a REST API with the services to do this. The upload of a file is split into 3 stages.
Stage 1 - We call a service and give it the number of files to upload. In this case only one.
Stage 2 - A multipart/form-data request in which we send the file to upload.
Stage 3 - The last stage, in which we give the file name, the file size, etc.
I think my problem is in stage 2.
All the stages run successfully, but when I try to open the uploaded file, in this case a PDF, the file is blank, although it has the same number of pages as the original. I compared the content of the uploaded file with the original and the content is the same, with one big difference: the original has ANSI encoding while the uploaded one has UTF-8 encoding. So I think my problem is in stage 2.
I have some doubts about this stage. In C# I get the byte[] of the file, but in the end I need to convert these bytes to a string to send them in a multipart form. Which encoding should I use to get the string? I tested the default, UTF-8, Unicode and ASCII encodings, but none of them worked.
Here is an example of the POST request body. In C# I use HttpWebRequest to make the request.
------boundary
Content-Disposition: form-data; name="Master_URL"
https://MyUrl/Windchill/servlet/WindchillGW
------boundary
Content-Disposition: form-data; name="CacheDescriptor_array"
844032:844032:844032;
------boundary
Content-Disposition: form-data; name="844032"; filename="newDoc.pdf"
Content-Type: application/pdf
%PDF-1.7 //// The content of the file starts here
%µµµµ
1 0 obj
........
------boundary--
Before this approach I tried converting the byte[] with ToBase64String and sending a body like this:
------boundary
Content-Disposition: form-data; name="Master_URL"
https://MyUrl/Windchill/servlet/WindchillGW
------boundary
Content-Disposition: form-data; name="CacheDescriptor_array"
844033:844033:844033;
------boundary
Content-Disposition: form-data; name="844033"; filename="newDoc.pdf"
Content-Type: application/pdf
Content-Transfer-Encoding: base64
JVBERi0xLjcNCiW1tbW1DQox ........ //// The content of the file starts here
------boundary--
In this case, when I try to open the file I get the error "Failed to load PDF document". The file is corrupt.
I think the problem is in stage 2, but I will share the body that I send in the last stage for your understanding.
{"ContentInfo":[{"StreamId":844034,"EncodedInfo": "844034%3A40384%3A9276564%3A844034","FileName": "newDoc.pdf","PrimaryContent": true,"MimeType" : "application/pdf","FileSize" : 40384}]}
The StreamId and the EncodedInfo are returned by stage 2, and I need to provide them in stage 3.
Can anyone see what I'm doing wrong? Does anyone have tips to help me solve this issue?
Many thanks.
I have a big tip for solving issues like this.
Use Postman. Do the whole job in Postman. Once it works, Postman can generate the code for multiple languages. Many thanks.
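A likely cause of the blank PDF is the byte[] to string conversion itself: binary data does not survive a round trip through a text encoding. The sketch below is only an illustration under assumptions. It uses HttpClient's MultipartFormDataContent rather than the HttpWebRequest from the question, a few hard-coded bytes stand in for the real file, and nothing here has been checked against Windchill. The field names are copied from the request body in the question.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;

// Stand-in for File.ReadAllBytes("newDoc.pdf"): "%PDF-" followed by non-ASCII bytes.
byte[] fileBytes = { 0x25, 0x50, 0x44, 0x46, 0x2D, 0xB5, 0xB5, 0x0D, 0x0A };

// Round-tripping binary data through a text encoding does not preserve it.
byte[] viaString = Encoding.UTF8.GetBytes(Encoding.UTF8.GetString(fileBytes));
bool corrupted = !viaString.SequenceEqual(fileBytes);
Console.WriteLine($"Corrupted by string round trip: {corrupted}");

// Build the same three parts as in the question, keeping the file as bytes.
using var form = new MultipartFormDataContent("----boundary");
form.Add(new StringContent("https://MyUrl/Windchill/servlet/WindchillGW"), "Master_URL");
form.Add(new StringContent("844032:844032:844032;"), "CacheDescriptor_array");

var filePart = new ByteArrayContent(fileBytes);
filePart.Headers.ContentType = new MediaTypeHeaderValue("application/pdf");
form.Add(filePart, "844032", "newDoc.pdf");

// Serialize locally to inspect the body. A real upload would instead pass
// the form to HttpClient.PostAsync together with the stage 2 URL.
byte[] body = await form.ReadAsByteArrayAsync();
bool intact = body.AsSpan().IndexOf(fileBytes) >= 0;
Console.WriteLine($"File bytes intact in body: {intact}");
```

If you stay with HttpWebRequest, the same idea applies: write the text parts and the boundaries as encoded bytes, and write the file bytes directly to the request stream without ever turning them into a string.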

Discrepancy in text/plain content encoding returned by Gmail API

I am experimenting with reading multipart/mixed emails with GMail API.
The goal is to correctly decode each text/plain part of the multipart/mixed email (there can be many, in different encodings) to a C# string (i.e. UTF-16):
public static string DecodeTextPart(Google.Apis.Gmail.v1.Data.MessagePart part)
{
    // Header names are case-insensitive, so compare accordingly.
    var content_type_header = part.Headers.FirstOrDefault(h => string.Equals(h.Name, "content-type", StringComparison.OrdinalIgnoreCase));
    if (content_type_header == null)
        throw new ArgumentException("No content-type header found in the email part");
    var content_type = new System.Net.Mime.ContentType(content_type_header.Value);
    if (!string.Equals(content_type.MediaType, "text/plain", StringComparison.OrdinalIgnoreCase))
        throw new ArgumentException("The part is not text/plain");
    // Decode the raw bytes using the charset declared in the header.
    return Encoding.GetEncoding(content_type.CharSet).GetString(GetAttachmentBytes(part.Body));
}
GetAttachmentBytes returns raw attachment bytes, without conversion, decoded from the base64url encoding that GMail uses.
What I find is that in many cases this produces invalid strings, because the raw bytes that I get for the attachment content appear to always be in UTF-8, even though content-type of that same part declares otherwise.
E.g. given the email:
Date: ...
From: ...
Reply-To: ...
Message-ID: ...
To: ...
Subject: Test 1 text file
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="----------0E50FC0802A2FCCAA"
------------0E50FC0802A2FCCAA
Content-Type: text/plain; charset=windows-1251
Content-Transfer-Encoding: 8bit
Content test: Cyrillic, Windows-1251 (à, ÿ, æ)
------------0E50FC0802A2FCCAA
Content-Type: TEXT/PLAIN;
name="Irrelevant.txt"
Content-transfer-encoding: base64
Content-Disposition: attachment;
filename="Irrelevant.txt"
VGhpcyBmaWxlIGRvZXMgbm90IGNvbnRhaW4gdXNlZnVsIGluZm9ybWF0aW9u
------------0E50FC0802A2FCCAA--
, I successfully find the first part, the code above figures that it's charset=windows-1251 with the help of System.Net.Mime.ContentType, and then .GetString() returns garbage because the actual raw bytes returned by GetAttachmentBytes correspond to UTF-8 encoding, not Windows-1251.
Exactly the same happens with
Subject: Test 2 text file
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="----------0B716C1D8123D8710"
------------0B716C1D8123D8710
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 8bit
Content test: Cyrillic, koi-8 (Б, С, Ц)
------------0B716C1D8123D8710
Content-Type: TEXT/PLAIN;
name="Irrelevant.txt"
Content-transfer-encoding: base64
Content-Disposition: attachment;
filename="Irrelevant.txt"
VGhpcyBmaWxlIGRvZXMgbm90IGNvbnRhaW4gdXNlZnVsIGluZm9ybWF0aW9u
------------0B716C1D8123D8710--
Note that the three test letters in the parentheses after the encoding name are the same in both emails, and in Unicode look like (а, я, ж), but (correctly) look wrong in the email body representation quoted above due to different encodings.
If I "fix" the function to always use Encoding.UTF8 instead of GetEncoding(content_type.CharSet), then it appears to work in the tests that I've done so far.
At the same time, the GMail interface displays the letters correctly in both cases, so it must have correctly parsed the incoming emails using the correct declared encodings.
Is it the case that the GMail API re-encodes all text chunks into UTF-8 (wrapped in base64url), but reports the original charset for them?
Am I therefore supposed to always use UTF-8 with GMail API and disregard content-type's charset=?
Or is there a problem with my code?
According to these two resources:
Stack Overflow: Gmail API decoding messages in Javascript
GitHub: Google API Python Client: Invalid message body size
The Value is indeed a base-64 encoded representation of the part converted to UTF-8.
This is however not documented by Google, as far as I can find.
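Under that assumption (UTF-8 regardless of the declared charset, which is based on the two resources above and not on official documentation), decoding a part could be sketched as follows. DecodeGmailBody is a hypothetical helper name, and the sample string is built locally instead of coming from the API.

```csharp
using System;
using System.Text;

// Decodes the base64url text of a part body, assuming the bytes are UTF-8.
static string DecodeGmailBody(string base64Url)
{
    // base64url uses '-' and '_' in place of '+' and '/', and may omit padding.
    string base64 = base64Url.Replace('-', '+').Replace('_', '/');
    base64 = base64.PadRight(base64.Length + (4 - base64.Length % 4) % 4, '=');
    return Encoding.UTF8.GetString(Convert.FromBase64String(base64));
}

// Simulate what the API is assumed to return for the Cyrillic test letters.
string sample = Convert.ToBase64String(Encoding.UTF8.GetBytes("а, я, ж"))
    .TrimEnd('=').Replace('+', '-').Replace('/', '_');

Console.WriteLine(DecodeGmailBody(sample));
```

The charset from the content-type header is deliberately ignored here, which matches the "always use Encoding.UTF8" fix described in the question.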

Where do strange MIME types come from?

I have a web service for uploading files written in C#.
Front-end application is written in Javascript / HTML5 (using https://github.com/blueimp/jQuery-File-Upload)
Recently, I was reviewing server logs and found some strange MIME types for PDF files that were sent by the client browser, for example:
application/unknown
application/force-download
application/force-download/n
application/force-download\n
[application/pdf]
Some of them cause the .NET Framework to throw an exception:
MultipartMemoryStreamProvider streamProvider = new MultipartMemoryStreamProvider();
await Request.Content.ReadAsMultipartAsync(streamProvider);
"Message Error parsing MIME multipart body part header byte 156 of data segment System.Byte[]."
I don't have a clue what to do with that.
Try checking the Content-Type in the request.
Content-Disposition:
form-data;
name="imagefile";
filename="C:\Users\Pictures\sid.png"
Content-Type:
(notice the blank Content-Type, it should be Content-Type: image/png )
// within WebAPI you can use code below to log the request body
string requestBody = await Request.Content.ReadAsStringAsync();
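Once the raw values are logged, a quick way to see which of them are well-formed media types at all is to run them through MediaTypeHeaderValue.TryParse. This is only a diagnostic sketch: IsValidMediaType is a hypothetical helper, and it is not claimed to be the same check that ReadAsMultipartAsync performs internally.

```csharp
using System;
using System.Net.Http.Headers;

// True when the value parses as a single, well-formed media type.
static bool IsValidMediaType(string raw) =>
    !string.IsNullOrWhiteSpace(raw) &&
    MediaTypeHeaderValue.TryParse(raw, out _);

// Values copied from the server log in the question,
// plus one known-good value for comparison.
string[] logged =
{
    "application/pdf",
    "application/unknown",
    "application/force-download",
    "application/force-download/n",
    "application/force-download\n",
    "[application/pdf]",
};

foreach (string value in logged)
{
    // Escape the newline so that each result stays on one row.
    string shown = value.Replace("\n", "\\n");
    Console.WriteLine($"{shown,-32} valid: {IsValidMediaType(value)}");
}
```

Values that fail this check are good candidates for being rejected early or replaced with a default such as application/octet-stream.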

How to call methods after HttpResponse?

I am a bit puzzled at the moment. I have a web application that manipulates a file and then returns the file to a user's browser for download when it's done.
The download part is going well, as I'm using Response.AddHeader and Response.BinaryWrite to push the file back to the browser, but I am unable to call any further methods after using the Response methods.
I suppose I have not worked with HttpResponse enough to know the trick to this. Perhaps I would be better off using another class or a generic handler to handle the download?
My code goes something like...
// Methods to be called first
Response.Clear();
Response.ClearContent();
Response.ClearHeaders();
Response.AddHeader("Content-Disposition", string.Format("attachment;filename={0}.pdf", "New_Merged_PDF_" + DateTime.Now.ToString("MMMM-dd-yyyy")));
Response.ContentType = "application/pdf";
Response.BinaryWrite(output.ToArray());
Response.End();
// Methods to be called last (these won't work)
Probably something simple that I'm overlooking but I'm still trying to figure it out.
To add a little color to Servy's explanation: there is an order of operations within the HTTP protocol specification. One rule is that the headers need to be sent to the client before the body. This allows the receiver of the response to properly deal with the body based on the headers.
The presence of a message-body in a request is signaled by the
inclusion of a Content-Length or Transfer-Encoding header field in
the request's message-headers.
IETF RFC 2616 - Hypertext Transfer Protocol -- HTTP/1.1
One of the few exceptions (if not the only one, I'm not sure) is Content-Type: multipart/mixed;
IETF RFC 1867 - Form-based File Upload in HTML (although Obsolete, examples are still relevant)
Content-type: multipart/form-data, boundary=AaB03x
--AaB03x
content-disposition: form-data; name="field1"
Joe Blow
--AaB03x
content-disposition: form-data; name="pics"; filename="file1.txt"
Content-Type: text/plain
... contents of file1.txt ...
--AaB03x--

Error in reading image from NetworkStream

I have received an image with some text while reading from a NetworkStream. The stream includes something like this:
HTTP/1.0 200 OK
Expires: -1
Cache-Control: no-cache
Content-length: 29160
Content-type: image/jpeg
...followed by the image.
How can I read just the image from the NetworkStream?
You would have to parse the HTTP header first, to know where to stop discarding data. Alternatively, save the whole thing and then examine it afterwards, which may be simpler. Basically you'd be looking for two ASCII carriage-return/line-feed ("\r\n") pairs in a row.
However, there's a much better alternative: use an HTTP library. Parsing this yourself is like using text manipulation to handle XML; you're better off working at a higher level of abstraction with code which has been well tested for that abstraction.
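If you do want to split the data by hand, the search for the blank line could be sketched as below. FindBodyStart is a hypothetical helper, and the response is built in memory with four bytes standing in for the JPEG. A real NetworkStream may deliver the header in several reads, so the buffer has to be accumulated before searching.

```csharp
using System;
using System.Text;

// Returns the index of the first body byte, or -1 if the header is incomplete.
static int FindBodyStart(byte[] data, int length)
{
    for (int i = 0; i + 3 < length; i++)
    {
        if (data[i] == '\r' && data[i + 1] == '\n' &&
            data[i + 2] == '\r' && data[i + 3] == '\n')
        {
            return i + 4;
        }
    }
    return -1;
}

// Hypothetical response: headers, blank line, then 4 bytes standing in for a JPEG.
byte[] header = Encoding.ASCII.GetBytes(
    "HTTP/1.0 200 OK\r\nContent-length: 4\r\nContent-type: image/jpeg\r\n\r\n");
byte[] payload = { 0xFF, 0xD8, 0xFF, 0xD9 };
byte[] response = new byte[header.Length + payload.Length];
header.CopyTo(response, 0);
payload.CopyTo(response, header.Length);

int bodyStart = FindBodyStart(response, response.Length);
byte[] image = response[bodyStart..];
Console.WriteLine($"Body starts at {bodyStart}, image is {image.Length} bytes");
```

In real code the Content-length header tells you how many bytes to read after the blank line, and an HTTP library such as HttpClient handles all of this for you.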
