I'm using C#'s WebClient to download an image from a CardDAV server. When I inspect the response in Fiddler it contains a JPEG file; I can even preview the response as an image in Fiddler and it looks fine.
I've tried all the conventional C# methods of converting byte arrays to an Image/Bitmap and none of them work; they all throw an "invalid argument" exception.
Fiddler response preview:
Content-Type: image/jpeg
Cache-Control: max-age=32000000, private
Content-Disposition: attachment
Content-Length: 46341
ÿØÿà JFIF … (raw JPEG bytes; the remainder of the 46,341-byte body is omitted here)
Format: JPEG
46,341 bytes
178w x 156h
1.67 bytes/px
96 dpi
Baseline
Subsample#4:4:4 (non-opt)
APP0 Data (14 bytes)
[JFIF1.1]
Aspect: 1:1
HuffmanTables: 4
SOLUTION
It seems that in my WebClient routine I was sending some unnecessary headers that caused the image to be returned in a strange encoding. Now I send only "User-Agent" and "Authorization", and the response decodes into an image perfectly.
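A minimal sketch of the working download; the helper name and header values are placeholders, not from the original post:

using System.Drawing;
using System.IO;
using System.Net;

static Image DownloadContactPhoto(string url, string authHeader)   // hypothetical helper
{
    using (var client = new WebClient())
    {
        // Send only the two headers that proved necessary; extra headers
        // (e.g. Accept-Encoding) can make the server return re-encoded
        // bytes that the image decoders reject.
        client.Headers[HttpRequestHeader.UserAgent] = "MyClient/1.0";   // placeholder value
        client.Headers[HttpRequestHeader.Authorization] = authHeader;

        byte[] data = client.DownloadData(url);
        var ms = new MemoryStream(data);   // keep the stream alive for the Image's lifetime
        return Image.FromStream(ms);
    }
}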
Related
I'll get straight to the point: how do I upload PDF files from a C# backend to an HTTP web service inside a multipart/form-data request without the contents being mangled to the point of the file becoming unreadable? The web service documentation only states that text files should be text/plain and image files should be binary; PDF files are only mentioned as "also supported", with no mention of what format or encoding they should be in.
The code I'm using to create the request:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);   // url = service endpoint
request.Method = "POST";                                           // (set explicitly; omitted in the original snippet)
string boundary = "---------------------------" + DateTime.Now.Ticks.ToString("x");
request.ContentType = "multipart/form-data; boundary=" + boundary;
using (StreamWriter sw = new StreamWriter(request.GetRequestStream()))
{
    sw.WriteLine("--" + boundary);
    sw.WriteLine("Content-Disposition: form-data; name=\"files\"; filename=\"" + Path.GetFileName(filePath) + "\"");
    sw.WriteLine(filePath.EndsWith(".pdf") ? "Content-Type: application/pdf" : "Content-Type: text/plain");
    sw.WriteLine();
    if (filePath.EndsWith(".pdf"))
    {
        // write PDF content into the request stream
    }
    else
    {
        sw.WriteLine(File.ReadAllText(filePath));
    }
    sw.Write("--" + boundary);
    sw.Write("--");
    sw.Flush();
}
For simple text files, this code works just fine. However, I have trouble uploading a PDF file.
Writing the file into the request body using StreamWriter.WriteLine with either File.ReadAllText or Encoding.UTF8.GetString(File.ReadAllBytes) results in the uploaded file being unreadable, because .NET replaces all the non-UTF-8 bytes with squares (which somehow also increases the file size by over 100 kB). Same result with UTF-7 and ANSI, though UTF-8 gives the closest match to the original file's contents.
Writing the file into the request body as binary data using either BinaryWriter or Stream.Write results in the web service rejecting it outright as invalid POST data. Content-Transfer-Encoding: binary (indicated by the documentation as necessary for application/http, which is why I tried it) also causes rejection.
What alternative options are available? How can I send the PDF without .NET silently replacing the invalid bytes with placeholder characters? Note that I have no control over what kind of content the web service accepts; if I did, I'd already have moved on to base64.
Problem solved, my bad. The multipart form header and the binary data were both correct, but they ended up in the wrong order because I didn't Flush() the StreamWriter before writing the binary data into the request stream with Stream.CopyTo().
Moral of the story: if you're writing into the same Stream with more than one writer, always Flush() before doing anything with the next writer.
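A sketch of the corrected write order, reusing request, boundary and filePath from the snippet above: the StreamWriter is flushed before raw bytes are copied into the same underlying stream.

using (Stream rs = request.GetRequestStream())
using (StreamWriter sw = new StreamWriter(rs))
{
    sw.WriteLine("--" + boundary);
    sw.WriteLine("Content-Disposition: form-data; name=\"files\"; filename=\"" + Path.GetFileName(filePath) + "\"");
    sw.WriteLine("Content-Type: application/pdf");
    sw.WriteLine();
    sw.Flush();                          // push the buffered header text out first

    using (FileStream fs = File.OpenRead(filePath))
        fs.CopyTo(rs);                   // raw PDF bytes, no text re-encoding

    sw.WriteLine();
    sw.Write("--" + boundary + "--");
    sw.Flush();
}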
I'm trying to extract one or more PDF files from a signed mail. I simply tried to load the smime.p7m with:
mimeMessage = MimeMessage.Load(mem);
// mem is a MemoryStream over the file created with File.WriteAllBytes(file, fileAttachment.Content); (EWS FileAttachment)
This doesn't work, because the file begins with:
0€ *†H†÷
€0€10 + 0€ *†H†÷
€$€‚
&Content-Type: multipart/mixed;
boundary="----=_NextPart_000_0024_01D432F9.7988F010"
So I removed the junk (not all of it is visible here) before Content-Type (with IndexOf/Substring); now I can load it into a MimeMessage. Next I tried to decode the Base64 string, but if I use the DecodeTo method the file size is nearly the same,
yet the file is damaged: comparing the raw data of the original PDF file (decoded by Outlook) with my decoded one, they are nearly the same, but the last 10% differ (the original has more line breaks).
So I tried
Convert.FromBase64String()
but I always get an "invalid base64" exception.
The PDF part, with headers, begins with:
Content-Type: application/pdf;
name="DE_Windows 7_WebDAV.pdf"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
filename="WebDAV.pdf"
‚ JVBERi0xLjUNCiW1tbW1DQoxIDAgb...
(Before and after the ‚ there are characters that aren't visible here; I removed those too.) If I copy and paste the Base64 as text (with the Windows editor) into an online decoder, it decodes fine; if I upload the file containing the Base64, it fails...
AND inside the Base64 there are some non-Base64 characters: "unknown", ",", an up-arrow symbol. I think these kill the decoding. The Base64 code is too long to post here =( (see picture)
But this is 1:1 what File.WriteAllBytes(file, fileAttachment.Content) and/or fileAttachment.Load(file) saves.
Can you help me please? And where do these unknown characters come from?
OK, I got it... two days of my life wasted on this ***.
Before saving a signed attachment you must run this code to "unsign" it, and all the characters you don't want are gone =)
// Requires: using System.Security.Cryptography.Pkcs;
byte[] content = fileAttachment.Content;
var signed = new SignedCms();
signed.Decode(content);                        // parse the PKCS#7/CMS envelope
byte[] unsigned = signed.ContentInfo.Content;  // inner MIME message, signature stripped
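From there, a sketch of the rest of the pipeline, assuming MimeKit and an outputDir variable of your choosing (both are assumptions, not from the original post):

// Requires: using System.IO; using System.Linq; using MimeKit;
using (var mem = new MemoryStream(unsigned))
{
    var mimeMessage = MimeMessage.Load(mem);
    foreach (var part in mimeMessage.Attachments.OfType<MimePart>())
    {
        if (part.ContentType.IsMimeType("application", "pdf"))
        {
            using (var fs = File.Create(Path.Combine(outputDir, part.FileName)))
                part.Content.DecodeTo(fs);   // handles the base64 transfer encoding
        }
    }
}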
For some experimentation I was working with the Simple HTTP Server code here.
In one case I wanted it to serve some ANSI-encoded text configuration files. I'm aware there are more issues with this code, but the only one I'm currently concerned with is that the Content-Length is wrong, but only for certain text files.
Example code:
Output stream initialisation:
outputStream = new StreamWriter(new BufferedStream(socket.GetStream()));
The handling of HTTP get:
public override void handleGETRequest(HttpProcessor p)
{
    if (p.http_url.EndsWith(".pac"))
    {
        string filename = Path.Combine(Path.GetDirectoryName(System.Reflection.Assembly.GetExecutingAssembly().Location), p.http_url.Substring(1));
        Console.WriteLine(string.Format("HTTP request for : {0}", filename));
        if (File.Exists(filename))
        {
            FileInfo fi = new FileInfo(filename);
            DateTime lastWrite = fi.LastWriteTime;
            Stream fs = File.Open(filename, FileMode.Open, FileAccess.Read, FileShare.Read);
            StreamReader sr = new StreamReader(fs);
            string result = sr.ReadToEnd().Trim();
            Console.WriteLine(fi.Length);
            Console.WriteLine(result.Length);
            p.writeSuccess("application/x-javascript-config", result.Length, lastWrite);
            p.outputStream.Write(result);
            // fs.CopyTo(p.outputStream.BaseStream);
            p.outputStream.BaseStream.Flush();
            fs.Close();
        }
        else
        {
            Console.WriteLine("404 - FILE not found!");
            p.writeFailure();
        }
    }
}

public void writeSuccess(string content_type, long length, DateTime lastModified)
{
    outputStream.Write("HTTP/1.0 200 OK\r\n");
    outputStream.Write("Content-Type: " + content_type + "\r\n");
    outputStream.Write("Last-Modified: {0}\r\n", lastModified.ToUniversalTime().ToString("r"));
    outputStream.Write("Accept-Range: bytes\r\n");
    outputStream.Write("Server: FlakyHTTPServer/1.3\r\n");
    outputStream.Write("Date: {0}\r\n", DateTime.Now.ToUniversalTime().ToString("r"));
    outputStream.Write(string.Format("Content-Length: {0}\r\n\r\n", length));
}
For most files I've tested with, the Content-Length is correct. However, when testing with the HTTP debugging tool Fiddler, a protocol violation on Content-Length is sometimes reported.
For example fiddler says:
Request Count: 1
Bytes Sent: 303 (headers:303; body:0)
Bytes Received: 29,847 (headers:224; body:29,623)
So Content-Length should be 29623. But the HTTP header generated is
Content-Length: 29617
I saved the body of the HTTP content from Fiddler and compared the files visually; I couldn't see any difference. Then I loaded them into Beyond Compare's hex compare, which shows several problems like this:
Original File: 2D 2D 96 20 2A 2F
HTTP Content : 2D 2D EF BF BD 20 2A 2F
Original File: 27 3B 0D 0A 09 7D 0D 0A 0D 0A 09
HTTP Content : 27 3B 0A 09 7D 0A 0A 09
I suspect the problem is related to encoding, but I'm not sure exactly how. I'm only serving ANSI-encoded files, no Unicode.
I did manage to make the file serve with the correct Content-Length by patching a few byte sequences in the file. I made this change in 3 places:
2D 2D 96 (--–) to 2D 2D 2D (---)
Based on the bytes you pasted, it looks like a couple of things are going wrong here. First, CRLF in your input file (0D 0A) is being converted to just LF (0A). Second, the character encoding is changing, either when reading the file into a string or when writing the string to the HTTP client.
The HTTP Content-Length represents the number of bytes in the stream, whereas string.Length gives you the number of characters in the string. Unless your file is exclusively using the first 128 ASCII characters (which precludes non-English characters as well as special windows-1252 characters like the euro sign), it's unlikely that string.Length will exactly equal the length of the string encoded in either UTF-8 or ISO-8859-1.
If you convert the string to a byte[] before sending it to the client, you'll be able to get the "true" Content-Length. However, you'll still end up with mangled text if you didn't read the file using the proper encoding. (Whether you specify the encoding or not, a conversion is happening when reading the file into a string of Unicode characters.)
I highly recommend specifying the charset in the Content-Type header (e.g. application/x-javascript-config;charset=utf-8). It doesn't matter whether your charset is utf-8, utf-16, iso-8859-1, windows-1251, etc., as long as it's the same character encoding you use when converting your string into a byte[].
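A minimal sketch of that fix, reusing the question's own names (p, filename, lastWrite); the windows-1252 charset is an assumption based on "ANSI":

byte[] body = File.ReadAllBytes(filename);   // raw bytes; no newline or charset conversion
p.writeSuccess("application/x-javascript-config;charset=windows-1252", body.Length, lastWrite);
p.outputStream.Flush();                      // flush the buffered header text first
p.outputStream.BaseStream.Write(body, 0, body.Length);
p.outputStream.BaseStream.Flush();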
I have an image's URL. Is it possible to find out its size (in bytes) and dimensions without downloading the complete image?
EDIT
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
req.Method = "HEAD";
HttpWebResponse resp = (HttpWebResponse)(req.GetResponse());
System.Console.WriteLine(resp.ContentLength);
I have written this code. It works fine the first two times, but the third time it throws a WebException ("the operation has timed out"), irrespective of the image URL. Is there something I'm missing here?
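(A likely cause, for what it's worth: the response is never disposed, and .NET defaults to two concurrent connections per host, so the third request waits for a free connection until it times out. A minimal sketch of the fix:)

HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
req.Method = "HEAD";
// Dispose the response so its connection returns to the pool.
using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
{
    System.Console.WriteLine(resp.ContentLength);
}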
No, this is not possible.
You can get its size in bytes by issuing an HTTP HEAD command (instead of a GET); this will return the HTTP headers only, omitting the contents.
The HTTP header of an image will return its size in bytes:
Content-length: 6372
Content-type: image/jpeg
but not its dimensions.
So you'll have to do an HTTP GET...
Basically, yes. If you know what kind of image it is, e.g. PNG or JPG (you can also get this information from the MIME type of your HTTP connection), you can download the header only and read the image extents from there. Also see this related question:
What is the header size of png, jpg/jpeg, bmp, gif and other common graphics format?
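For example, a sketch for PNG only, assuming the server honors Range requests: the dimensions sit at fixed offsets in the IHDR chunk, so the first 33 bytes are enough.

HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
req.AddRange(0, 32);                        // 8-byte signature + IHDR fits in the first 33 bytes
using (WebResponse resp = req.GetResponse())
using (BinaryReader br = new BinaryReader(resp.GetResponseStream()))
{
    byte[] header = br.ReadBytes(33);
    // Width and height are big-endian 32-bit integers at offsets 16 and 20.
    int width  = (header[16] << 24) | (header[17] << 16) | (header[18] << 8) | header[19];
    int height = (header[20] << 24) | (header[21] << 16) | (header[22] << 8) | header[23];
    Console.WriteLine("{0} x {1}", width, height);
}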
I have written a mail-processing program, which basically slaps a template on incoming mail and forwards it on. Incoming mail goes to a Gmail account, which I download using POP; then I read the mail (both HTML and plain-text multipart MIME), make whatever changes I need to the template, then create a new mail with the appropriate plain+HTML text and send it on to another address.
The trouble is that when the mail gets to the other side, some of the mails have been mangled, with weird characters like à and Â magically inserted. They weren't in the original mails, they're not in my template, and I can't find any predictable pattern as to when these characters appear. I'm sure it has something to do with the encoding properties of the mails, but I am making sure to set both the charset and the transfer encoding of the outgoing mail to be the same as the incoming mail. So what else do I need to do?
EDIT: Here's a snipped sample of an incoming mail:
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
=0A=0ASafari Special:=0A=0A=A0=0A=0ASafari in Thornybush Priv=
ate Game Reserve 9-12=0AJanuary 2012 (3nights)
After processing, this comes out as:
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
=0D=0A=0D=0ASafari Special:=0D=0A=0D=0A=C2=A0=0D=0A=0D=0A=
Safari in Thornybush Private Game Reserve 9-12=0D=0AJanuary=
2012 (3nights)
Notice the insertion of the =0D and =C2 characters (aside from a few =0A's that weren't in the original).
So what do you think is happening here?
ANOTHER CLUE: Here's my code that creates the alternate view:
var htmlView = AlternateView.CreateAlternateViewFromString(htmlBody, null, "text/html");
htmlView.ContentType.CharSet = charSet;
htmlView.TransferEncoding = transferEncoding;
m.AlternateViews.Add(htmlView);
Along the lines of what @mjwills suggested, perhaps the CreateAlternateViewFromString() method already assumes UTF-8, and changing the charset afterwards to iso-8859-1 doesn't make a difference?
So every =0A is becoming =0D=0A.
And every =A0 is becoming =C2=A0.
The former looks like it might be related to Carriage Return / Line Feeds.
The latter looks like it might be related to What is "=C2=A0" in MIME encoded, quoted-printable text?.
My guess is that even though you have specified the charset, something along the line is treating it as UTF-8.
You may want to try using this form of CreateAlternateViewFromString, where the ContentType.CharSet is set appropriately.
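A minimal sketch of that overload, reusing htmlBody, charSet and m from the question (the TransferEncoding value shown is an assumption):

// Requires: using System.Net.Mail; using System.Net.Mime;
var contentType = new ContentType("text/html") { CharSet = charSet };  // e.g. "iso-8859-1"
var htmlView = AlternateView.CreateAlternateViewFromString(htmlBody, contentType);
htmlView.TransferEncoding = TransferEncoding.QuotedPrintable;
m.AlternateViews.Add(htmlView);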