Unable to Extract MTOM/XOP Attachment in C# - c#

I'm still unable to extract the MIME attachment. Please check the MIME message below, which we received from the service.
--MIMEBoundary_199ca6b7114b9acca5deb2047d25d5841d4afb7f68281379
Content-Type: application/xop+xml; charset=utf-8; type="text/xml"
Content-Transfer-Encoding: binary
Content-ID: <0.099ca6b7114b9acca5deb2047d25d5841d4afb7f68281379#apache.org>
<?xml version="1.0" encoding="utf-8"?><soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"><soapenv:Header><StateHeader xmlns="http://www.statemef.com/StateGatewayService"><MessageID>12345201704200009962</MessageID><RelatesTo>12345201704200009962</RelatesTo><Action>GetNewAcks</Action><Timestamp>2017-02-11T01:54:51.676-05:00</Timestamp><TestIndicator>T</TestIndicator></StateHeader></soapenv:Header><soapenv:Body><GetNewAcksResponse xmlns="http://www.statemef.com/StateGatewayService"><MoreAvailable>true</MoreAvailable><AcknowledgementListAttachmentMTOM><xop:Include xmlns:xop="http://www.w3.org/2004/08/xop/include" href="cid:299ca6b7114b9acca5deb2047d25d5841d4afb7f68281379#apache.org"></xop:Include></AcknowledgementListAttachmentMTOM></GetNewAcksResponse></soapenv:Body></soapenv:Envelope>
--MIMEBoundary_199ca6b7114b9acca5deb2047d25d5841d4afb7f68281379
Content-Type: application/octet-stream
Content-Transfer-Encoding: binary
Content-ID: <299ca6b7114b9acca5deb2047d25d5841d4afb7f68281379#apache.org>

Step 1: Get the complete MIME stream, including the Content-Type header that defines the boundary parameter (here, MIMEBoundary_199ca6b7114b9acca5deb2047d25d5841d4afb7f68281379). Without that header, you are out of luck.
If you are using something like HttpWebRequest, proceed to Step 2.
Step 2: Follow the instructions in the MimeKit FAQ:
How would I parse multipart/form-data from an HTTP web request?
Since classes like HttpWebResponse take care of parsing the HTTP headers (which includes the Content-Type header) and only offer a content stream to consume, MimeKit provides a way to deal with this using the following two static methods on MimeEntity:
public static MimeEntity Load (ParserOptions options, ContentType contentType, Stream content, CancellationToken cancellationToken = default (CancellationToken));
public static MimeEntity Load (ContentType contentType, Stream content, CancellationToken cancellationToken = default (CancellationToken));
Here's how you might use these methods:
MimeEntity ParseMultipartFormData (HttpWebResponse response)
{
var contentType = ContentType.Parse (response.ContentType);
return MimeEntity.Load (contentType, response.GetResponseStream ());
}
Once you have the MimeEntity, you can cast it to a Multipart and enumerate the attachments within, saving the content to a stream like this:
int i = 1;
foreach (var attachment in multipart.OfType<MimePart> ()) {
string fileName = string.Format ("attachment.{0}.dat", i++);
using (var stream = File.Create (fileName))
attachment.ContentObject.DecodeTo (stream);
}
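Before handing the stream to a MIME parser, it can help to confirm that the Content-Type header you captured really carries a boundary parameter, as Step 1 requires. The following is a minimal sketch that uses only System.Net.Http (not MimeKit); the class and method names are hypothetical and chosen for illustration.

```csharp
using System;
using System.Linq;
using System.Net.Http.Headers;

static class BoundaryProbe
{
    // Hypothetical helper: returns the boundary parameter of a Content-Type
    // header value, or null when the header carries none.
    public static string GetBoundary(string contentTypeHeader)
    {
        var mediaType = MediaTypeHeaderValue.Parse(contentTypeHeader);
        var boundary = mediaType.Parameters.FirstOrDefault(
            p => string.Equals(p.Name, "boundary", StringComparison.OrdinalIgnoreCase));

        // Parameter values keep their surrounding quotes, so trim them.
        return boundary?.Value?.Trim('"');
    }
}
```

If GetBoundary returns null for the header you received, the body alone is probably not enough to parse the message.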

The question shows only the response Body. To parse it, you should prepend the response Header.
For example, it should look like:
MIME-Version: 1.0
content-type: multipart/related; type="application/xop+xml";start="<http://tempuri.org/0>";boundary="MIMEBoundary_someuniqueID";start-info="text/xml"
Server: Microsoft-IIS/10.0
X-Powered-By: ASP.NET
Content-Length:24371900
--MIMEBoundary_someuniqueID
Content-Type: application/xop+xml; charset=utf-8; type="text/xml"
Content-Transfer-Encoding: binary
Content-ID: <http://tempuri.org/0>
<soap:Envelope>
<someWrapperElt>
<xop:Include href="cid:uri_of_content"></xop:Include>
</someWrapperElt>
</soap:Envelope>
--MIMEBoundary_someuniqueID
Content-Type: application/octet-stream
Content-Transfer-Encoding: binary
Content-ID: <uri_of_content>
...start.b1n#ry-content-here-etc.fckZ8990832d...
--MIMEBoundary_someuniqueID--
Then convert the whole response into a MemoryStream object, and use XmlDictionaryReader to parse it.
XmlDictionaryReader mtomReader = XmlDictionaryReader.CreateMtomReader(ms, Encoding.UTF8, XmlDictionaryReaderQuotas.Max);
You can now extract the desired values from the mtomReader object, including the attachment.

You can use the parser project on GitHub.
Its WebResponseDerializer component can parse a multipart SOAP message.
1) Copy the XML between the SOAP body tags into the xml2csharp site and take the generated class for deserialization.
2) Take the response stream and call it as shown below.
Byte[] file = File.ReadAllBytes("..\\..\\Data\\ccc.xxx");
Stream stream = new MemoryStream(file);
WebResponseDerializer<SIGetImageResponse> deserilizer = new WebResponseDerializer<SIGetImageResponse>(stream);
SIGetImageResponse ddd = deserilizer.GetData();
foreach (var item in ddd.ResponseData.AttachmentDescriptor.Attachment)
{
String contentId = "<<" + item.ImageData.Include.Href + ">>";
contentId = contentId.Replace("%40", "#").Replace("cid:", "");
item.ImageData.Include.XopData = deserilizer.GetAttachment(contentId);
}

Related

How to log file contents in request body of a multipart/form-data request

I am trying to log all requests in my ASP.NET Web API project to a text file. I am using a DelegatingHandler to implement the logging mechanism in my application; below is the code snippet for that:
public class MyAPILogHandler : DelegatingHandler
{
protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
{
// Captures all properties from the request.
var apiLogEntry = CreateApiLogEntryWithRequestData(request);
if (request.Content != null)
{
await request.Content.ReadAsStringAsync()
.ContinueWith(task =>
{
apiLogEntry.RequestContentBody = task.Result;
}, cancellationToken);
}
return await base.SendAsync(request, cancellationToken)
.ContinueWith(task =>
{
var response = task.Result;
// Update the API log entry with response info
apiLogEntry.ResponseStatusCode = (int)response.StatusCode;
apiLogEntry.ResponseTimestamp = DateTime.Now;
if (response.Content != null)
{
apiLogEntry.ResponseContentBody = response.Content.ReadAsStringAsync().Result;
apiLogEntry.ResponseContentType = response.Content.Headers.ContentType.MediaType;
apiLogEntry.ResponseHeaders = SerializeHeaders(response.Content.Headers);
}
var logger = new LogManager();
logger.Log(new LogMessage()
{
Message = PrepareLogMessage(apiLogEntry),
LogTo = LogSource.File
});
return response;
}, cancellationToken);
}
}
The above implementation works as expected and logs all the required request/response information to the file.
But when we make a multipart/form-data POST API call with some images attached, the log file becomes huge, because all the image/binary content is converted into a string and written to the text file. Please find the log file content below:
Body:
----------------------------079603462429865781513947
Content-Disposition: form-data; name="batchid"
22649EEE-3994-4225-AF73-D9A6B659CAE3
----------------------------079603462429865781513947
Content-Disposition: form-data; name="files"; filename="d.png"
Content-Type: image/png
PNG
IHDR í %v ¸ sRGB ®Îé gAMA ±üa pHYs à ÃÇo¨d ÿ¥IDATx^ìýX]K¶(
·îsß»ß÷þï{O÷iÛ Á2âîîîÁe¹âîî,<# Á$÷w_ÈZó5$Dwvv×}
----------------------------4334344396037865656556781513947
Content-Disposition: form-data; name="files"; filename="m.png"
Content-Type: image/png
PNG
IHDR í %v ¸ sRGB ®Îé gAMA ±üa pHYs à ÃÇo¨d ÿ¥IDATx^ìýX]K¶(
·îsß»ß÷þï{O÷iÛ Á2âîîîÁe¹âîî,<# Á$÷w_ÈZó5$Dwvv×}
I don't want to log the binary content of the request body; it would be sufficient to log only the headers of each part, like this:
----------------------------079603462429865781513947
Content-Disposition: form-data; name="batchid"
22649EEE-3994-4225-AF73-D9A6B659CAE3
----------------------------079603462429865781513947
Content-Disposition: form-data; name="files"; filename="d.png"
Content-Type: image/png
----------------------------4334344396037865656556781513947
Content-Disposition: form-data; name="files"; filename="m.png"
Content-Type: image/png
Can you please suggest how to prevent logging the binary content of the request body and log only the headers of each file part?
From what I gather, you are implementing something similar to this approach.
When uploading a file (i.e., a request type "multipart/form-data") its actual content always begins with the "Content-Type: {ContentTypeValue}\r\n\r\n" sequence and the next header begins with "\r\n--" sequence (as it is illustrated in your logs). You can explore more info about raw request data parsing at ReferenceSource. So, you can strip everything about file (if exists), for example, via RegEx:
Content-Type: {ContentTypeOrOctetStream}\r\n\r\n{FileContentBytesToRemove}\r\n
using System.Text.RegularExpressions;
...
string StripRawFileContentIfExists(string input) {
if(input.IndexOf("Content-Type") == -1)
return input;
// RegexOptions.Singleline lets '.' match '\n', which raw file bytes usually contain.
string regExPattern = "(?<ContentTypeGroup>Content-Type: .*?\\r\\n\\r\\n)(?<FileRawContentGroup>.*?)(?<NextHeaderBeginGroup>\\r\\n--)";
return Regex.Replace(input, regExPattern, me => me.Groups["ContentTypeGroup"].Value + me.Groups["NextHeaderBeginGroup"].Value, RegexOptions.Singleline);
}
...
//apiLogEntry.RequestContentBody = task.Result;
apiLogEntry.RequestContentBody = StripRawFileContentIfExists(task.Result);
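The stripping idea can be tried in isolation on a small multipart string. The sketch below is a self-contained variant with hypothetical names (MultipartLogScrubber, Scrub); it passes RegexOptions.Singleline so that '.' also matches newlines inside the file bytes. It assumes the body has already been read as a string, and it will cut a file short if the bytes themselves happen to contain "\r\n--".

```csharp
using System;
using System.Text.RegularExpressions;

static class MultipartLogScrubber
{
    // head: the Content-Type header line plus the blank line after it.
    // body: the raw file content to drop from the log.
    // next: the start of the following boundary line.
    private static readonly Regex FileBody = new Regex(
        "(?<head>Content-Type: .*?\\r\\n\\r\\n)(?<body>.*?)(?<next>\\r\\n--)",
        RegexOptions.Singleline);

    // Keeps every part header and plain form field, removes file bytes.
    public static string Scrub(string rawBody)
    {
        return FileBody.Replace(
            rawBody,
            m => m.Groups["head"].Value + m.Groups["next"].Value);
    }
}
```

Parts without a Content-Type header, such as the batchid field in the log above, are left untouched because the pattern never matches them.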
I'd suggest separating huge content from the log. As you encountered, it clutters the log and, to some extent, defeats its purpose.
I'd suggest you store that content in the file system, like this:
---request-a/
|--request-a-body-multi-part1.txt
|--request-a-body-multi-part2.txt
and just keep a link in your log that references this file system path.
Hope it helps.

C# OAuth batch multipart content response, how to get all the contents not as string objects

I'm receiving a multipart content response that belongs to an OAuth batch request:
// batchRequest is a HttpRequestMessage, http is an HttpClient
HttpResponseMessage response = await http.SendAsync(batchRequest);
If I read its content as full text:
string fullResponse = await response.Content.ReadAsStringAsync();
This is what it contains:
--batchresponse_e42a30ca-0f3a-4c17-8672-22abc469cd16
Content-Type: application/http
Content-Transfer-Encoding: binary
HTTP/1.1 200 OK
DataServiceVersion: 3.0;
Content-Type: application/json;odata=minimalmetadata;streaming=true;charset=utf-8
{\"odata.metadata\":\"https://graph.windows.net/XXX.onmicrosoft.com/$metadata#directoryObjects/#Element\",\"odata.type\":\"Microsoft.DirectoryServices.User\",\"objectType\":\"User\",\"objectId\":\"5f6851c3-99cc-4a89-936d-4bb44fa78a34\",\"deletionTimestamp\":null,\"accountEnabled\":true,\"signInNames\":[],\"assignedLicenses\":[],\"assignedPlans\":[],\"city\":null,\"companyName\":null,\"country\":null,\"creationType\":null,\"department\":\"NRF\",\"dirSyncEnabled\":null,\"displayName\":\"dummy1 Test\",\"facsimileTelephoneNumber\":null,\"givenName\":\"dummy1\",\"immutableId\":null,\"isCompromised\":null,\"jobTitle\":\"test\",\"lastDirSyncTime\":null,\"mail\":null,\"mailNickname\":\"dummy1test\",\"mobile\":null,\"onPremisesSecurityIdentifier\":null,\"otherMails\":[],\"passwordPolicies\":null,\"passwordProfile\":{\"password\":null,\"forceChangePasswordNextLogin\":true,\"enforceChangePasswordPolicy\":false},\"physicalDeliveryOfficeName\":null,\"postalCode\":null,\"preferredLanguage\":null,\"provisionedPlans\":[],\"provisioningErrors\":[],\"proxyAddresses\":[],\"refreshTokensValidFromDateTime\":\"2016-12-02T08:37:24Z\",\"showInAddressList\":null,\"sipProxyAddress\":null,\"state\":\"California\",\"streetAddress\":null,\"surname\":\"Test\",\"telephoneNumber\":\"666\",\"thumbnailPhoto#odata.mediaEditLink\":\"directoryObjects/5f6851c3-99cc-4a89-936d-4bb44fa78a34/Microsoft.DirectoryServices.User/thumbnailPhoto\",\"usageLocation\":null,\"userPrincipalName\":\"dummy1test#XXX.onmicrosoft.com\",\"userType\":\"Member\"}
--batchresponse_e42a30ca-0f3a-4c17-8672-22abc469cd16
Content-Type: application/http
Content-Transfer-Encoding: binary
HTTP/1.1 200 OK
DataServiceVersion: 3.0;
Content-Type: application/json;odata=minimalmetadata;streaming=true;charset=utf-8
{\"odata.metadata\":\"https://graph.windows.net/XXX.onmicrosoft.com/$metadata#directoryObjects/#Element\",\"odata.type\":\"Microsoft.DirectoryServices.User\",\"objectType\":\"User\",\"objectId\":\"dd35d761-e6ed-44e7-919f-f3b1e54eb7be\",\"deletionTimestamp\":null,\"accountEnabled\":true,\"signInNames\":[],\"assignedLicenses\":[],\"assignedPlans\":[],\"city\":null,\"companyName\":null,\"country\":null,\"creationType\":null,\"department\":null,\"dirSyncEnabled\":null,\"displayName\":\"Max Admin\",\"facsimileTelephoneNumber\":null,\"givenName\":null,\"immutableId\":null,\"isCompromised\":null,\"jobTitle\":null,\"lastDirSyncTime\":null,\"mail\":null,\"mailNickname\":\"maxadmin\",\"mobile\":null,\"onPremisesSecurityIdentifier\":null,\"otherMails\":[],\"passwordPolicies\":null,\"passwordProfile\":null,\"physicalDeliveryOfficeName\":null,\"postalCode\":null,\"preferredLanguage\":null,\"provisionedPlans\":[],\"provisioningErrors\":[],\"proxyAddresses\":[],\"refreshTokensValidFromDateTime\":\"2016-12-05T15:11:51Z\",\"showInAddressList\":null,\"sipProxyAddress\":null,\"state\":null,\"streetAddress\":null,\"surname\":null,\"telephoneNumber\":null,\"thumbnailPhoto#odata.mediaEditLink\":\"directoryObjects/dd35d761-e6ed-44e7-919f-f3b1e54eb7be/Microsoft.DirectoryServices.User/thumbnailPhoto\",\"usageLocation\":null,\"userPrincipalName\":\"maxadmin#XXX.onmicrosoft.com\",\"userType\":\"Member\"}
--batchresponse_e42a30ca-0f3a-4c17-8672-22abc469cd16--
I need to get all these contents as objects (like regular HttpResponseMessage instances, not simple strings), so that I can access the HTTP status code, the JSON content, etc. as properties and process them.
I know how to read each of these contents separately, but I can't figure out how to get them as objects; I've only succeeded in getting string content:
var multipartContent = await response.Content.ReadAsMultipartAsync();
foreach (HttpContent currentContent in multipartContent.Contents) {
var testString = currentContent.ReadAsStringAsync();
// How to get this content as an exploitable object?
}
In my example, testString contains:
HTTP/1.1 200 OK
DataServiceVersion: 3.0;
Content-Type: application/json;odata=minimalmetadata;streaming=true;charset=utf-8
{\"odata.metadata\":\"https://graph.windows.net/XXX.onmicrosoft.com/$metadata#directoryObjects/#Element\",\"odata.type\":\"Microsoft.DirectoryServices.User\",\"objectType\":\"User\",\"objectId\":\"5f6851c3-99cc-4a89-936d-4bb44fa78a34\",\"deletionTimestamp\":null,\"accountEnabled\":true,\"signInNames\":[],\"assignedLicenses\":[],\"assignedPlans\":[],\"city\":null,\"companyName\":null,\"country\":null,\"creationType\":null,\"department\":\"NRF\",\"dirSyncEnabled\":null,\"displayName\":\"dummy1 Test\",\"facsimileTelephoneNumber\":null,\"givenName\":\"dummy1\",\"immutableId\":null,\"isCompromised\":null,\"jobTitle\":\"test\",\"lastDirSyncTime\":null,\"mail\":null,\"mailNickname\":\"dummy1test\",\"mobile\":null,\"onPremisesSecurityIdentifier\":null,\"otherMails\":[],\"passwordPolicies\":null,\"passwordProfile\":{\"password\":null,\"forceChangePasswordNextLogin\":true,\"enforceChangePasswordPolicy\":false},\"physicalDeliveryOfficeName\":null,\"postalCode\":null,\"preferredLanguage\":null,\"provisionedPlans\":[],\"provisioningErrors\":[],\"proxyAddresses\":[],\"refreshTokensValidFromDateTime\":\"2016-12-02T08:37:24Z\",\"showInAddressList\":null,\"sipProxyAddress\":null,\"state\":\"California\",\"streetAddress\":null,\"surname\":\"Test\",\"telephoneNumber\":\"666\",\"thumbnailPhoto#odata.mediaEditLink\":\"directoryObjects/5f6851c3-99cc-4a89-936d-4bb44fa78a34/Microsoft.DirectoryServices.User/thumbnailPhoto\",\"usageLocation\":null,\"userPrincipalName\":\"dummy1test#XXX.onmicrosoft.com\",\"userType\":\"Member\"}
I can't imagine parsing this string manually... So if someone has a clue or can explain the right way to read the content, that would be great.
Thanks,
Max
Here is a way it can be done. The key is to add a "msgtype" parameter with the value "response" to the Content-Type header of each part:
var multipartContent = await response.Content.ReadAsMultipartAsync();
var multipartRespMsgs = new List<HttpResponseMessage>();
foreach (HttpContent currentContent in multipartContent.Contents) {
// Two cases:
// 1. a "single" response
if (currentContent.Headers.ContentType.MediaType.Equals("application/http", StringComparison.OrdinalIgnoreCase)) {
if (!currentContent.Headers.ContentType.Parameters.Any(parameter => parameter.Name.Equals("msgtype", StringComparison.OrdinalIgnoreCase) && parameter.Value.Equals("response", StringComparison.OrdinalIgnoreCase))) {
currentContent.Headers.ContentType.Parameters.Add(new NameValueHeaderValue("msgtype", "response"));
}
multipartRespMsgs.Add(await currentContent.ReadAsHttpResponseMessageAsync());
// The single object in multipartRespMsgs contains a classic exploitable HttpResponseMessage (with IsSuccessStatusCode, Content.ReadAsStringAsync().Result, etc.)
}
// 2. a changeset response, which is an embedded multipart content
else {
var subMultipartContent = await currentContent.ReadAsMultipartAsync();
foreach (HttpContent currentSubContent in subMultipartContent.Contents) {
currentSubContent.Headers.ContentType.Parameters.Add(new NameValueHeaderValue("msgtype", "response"));
multipartRespMsgs.Add(await currentSubContent.ReadAsHttpResponseMessageAsync());
// Same here, the objects in multipartRespMsgs contain classic exploitable HttpResponseMessages
}
}
}
Thanks to darl0026
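Once the inner responses are available as HttpResponseMessage objects, reading the status code and JSON body of each one is ordinary HttpClient code. A minimal sketch, assuming a list like multipartRespMsgs from the answer above; BatchResultReader and ReadAllAsync are hypothetical names used only for illustration:

```csharp
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

static class BatchResultReader
{
    // Pairs each inner response's status code with its body text.
    public static async Task<List<KeyValuePair<HttpStatusCode, string>>> ReadAllAsync(
        IEnumerable<HttpResponseMessage> responses)
    {
        var results = new List<KeyValuePair<HttpStatusCode, string>>();
        foreach (var response in responses)
        {
            // Content can be null on older frameworks when there is no body.
            string body = response.Content == null
                ? string.Empty
                : await response.Content.ReadAsStringAsync();
            results.Add(new KeyValuePair<HttpStatusCode, string>(response.StatusCode, body));
        }
        return results;
    }
}
```

The body string can then be handed to whichever JSON deserializer the project already uses.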

mimekit outlook show text as attachment

I have a Word document and use Aspose.Words to perform a mail merge and save the result to a memory stream as MHTML (part of my code):
Aspose.Words.Document doc = new Aspose.Words.Document(documentDirectory + countryLetterName);
doc.MailMerge.Execute(tempTable2);
MemoryStream outStream = new MemoryStream();
doc.Save(outStream, Aspose.Words.SaveFormat.Mhtml);
Then I use MimeKit (latest version from NuGet) to send my message:
outStream.Position = 0;
MimeMessage messageMimeKit = MimeMessage.Load(outStream);
messageMimeKit.From.Add(new MailboxAddress("<sender name>", "<sender email>"));
messageMimeKit.To.Add(new MailboxAddress("<recipient name>", "<recipient email>"));
messageMimeKit.Subject = "my subject";
using (var client = new MailKit.Net.Smtp.SmtpClient())
{
client.Connect(<smtp server>, <smtp port>, true);
client.Authenticate("xxxx", "pwd");
client.Send(messageMimeKit);
client.Disconnect(true);
}
When opening the received email in my mail web client, I see the text (with image) and the image as attachment.
When opening the received email in Outlook (2016), the mail body is empty and I have two attachments, 1 with the text and 1 with the image.
Looking at the mht contents itself, it looks like:
MIME-Version: 1.0
Content-Type: multipart/related;
type="text/html";
boundary="=boundary.Aspose.Words=--"
This is a multi-part message in MIME format.
--=boundary.Aspose.Words=--
Content-Disposition: inline;
filename="document.html"
Content-Type: text/html;
charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Location: document.html
<html><head><meta http-equiv=3D"Content-Type" content=3D"text/html; charset=
=3Dutf-8" /><meta http-equiv=3D"Content-Style-Type" content=3D"text/css" />=
<meta name=3D"generator" content=3D"Aspose.Words for .NET 14.1.0.0" /><titl=
e></title></head><body>
*****body removed *****
</body></html>
--=boundary.Aspose.Words=--
Content-Disposition: inline;
filename="image.001.jpeg"
Content-Type: image/jpeg
Content-Transfer-Encoding: base64
Content-Location: image.001.jpeg
****image content remove****
--=boundary.Aspose.Words=----
Is there some formatting I have to do to get this shown correctly in Outlook? Or is it caused by the "3D" sequences, like content=3D"xxxx", style=3D"xxxx"?
Thanks in advance.
Edward
The =3D bits are the quoted-printable encoding of the = character. Since the headers properly declare the Content-Transfer-Encoding to be quoted-printable, that's not the problem.
Here are some suggestions on trying to massage the content into something that will work in Outlook (Outlook is very finicky):
MimeMessage messageMimeKit = MimeMessage.Load(outStream);
messageMimeKit.From.Add(new MailboxAddress("<sender name>", "<sender email>"));
messageMimeKit.To.Add(new MailboxAddress("<recipient name>", "<recipient email>"));
messageMimeKit.Subject = "my subject";
var related = (MultipartRelated) messageMimeKit.Body;
var body = (MimePart) related[0];
// It's possible that the filename on the HTML body is confusing Outlook.
body.FileName = null;
// It's also possible that the Content-Location is confusing Outlook
body.ContentLocation = null;

C# WebClient Strange Characters

I am trying to download this webpage using the C# WebClient.
It works perfectly with Python's urllib2, but with the C# WebClient it gives these strange characters in the output file.
I have tried setting the Encoding on the WebClient class as well, but it doesn't help at all.
public static string GetWebURL()
{
string url = "http://bet.hkjc.com";
WebClient webClient = new WebClient();
webClient.Encoding = Encoding.UTF8;
string html = webClient.DownloadString(url);
File.WriteAllText("page.html", html);
return html;
}
this is the output with those strange characters
‹âå²Qtñw‰pUðñõQuòñtVPÒÕ×7vÖ×w qÂH˜è*„%æg–dæç%æèë»ú)ÙñrÂ(N.Ê,(Q(©,HµU*I­(ÑÃJ,K„ˆ*Ùq)((â€U*TÆ’e‰E ©y‰I9©ŽÉÉ©ÅÅÎùy%Eù9 ¶i‰9Å©Ö %â„¢i Xâ€h"(É-P°U(ÃÃŒKÉ/×ËÉON¹H/£(5M¯¸4©¸¤HÃ\SlHu°kPËœkP¼Ÿ£¯+PP/L‘ÂËœ4&µÂ?MCI_IS®+%?713Ã/17¨ ɘfd!¸ zJšÚ†P«Sò“KsSóJô &MA V¨ŸKòô’RK‚s2ÜŠ€ªô2‹}òÓóó445¡ÊÃ=­Wâ€Z“˜œ t|zj^jQbN<Ø1z䁚9‰y鶩yJ_ÂP-ˆÔšœchˆe¦‚ µ\H&[×rÙèC’€0ÂJ%à „ ÷‚üüP9Ud¦MÃÃÔÌØÈÖM×ÃÈ25² ÷ô³V·†(ÃŽM-JOM
What should I do to see the HTML that is being sent?
You're looking at a compressed byte stream. You can tell by inspecting the headers of the http response, for example with curl:
curl -I http://bet.hkjc.com/
but the Developer Console of your browser will reveal the same:
HTTP/1.1 200 OK
Cache-Control: public, max-age=120, must-revalidate
Content-Length: 3615
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Expires: Wed, 29 Jun 2016 08:01:06 GMT
Vary: Accept-Encoding
Server: Microsoft-IIS/7.0
X-AspNet-Version: 2.0.50727
X-Powered-By: ASP.NET
Date: Wed, 29 Jun 2016 08:00:14 GMT
Via: 1.1 stjbwbwa52
Accept-Ranges: bytes
Notice that Content-Encoding says gzip. This means the result you just got is compressed with the gzip algorithm. The standard WebClient doesn't decompress it by default, but with a simple subclass the WebClient can learn new tricks:
public class DecompressWebClient:WebClient
{
// moved common logic here
public DecompressWebClient()
{
this.Encoding = Encoding.UTF8;
}
// This is the factory to create the webrequest
protected override WebRequest GetWebRequest(Uri address)
{
// get the default one
var request = base.GetWebRequest(address);
// see if it is a HttpWebRequest
var httpReq = request as HttpWebRequest;
if (httpReq != null)
{
// add extra capabilities, like decompression
httpReq.AutomaticDecompression = DecompressionMethods.GZip;
}
return request;
}
}
HttpWebRequest has an AutomaticDecompression property that, when set to a DecompressionMethods value such as GZip, takes care of the decompression for us.
When you put the subclassed WebClient to use, your code will look like:
string url = "http://bet.hkjc.com";
using(WebClient webClient = new DecompressWebClient())
{
string html = webClient.DownloadString(url);
File.WriteAllText("page.html", html);
}
The encoding UTF8 is correct, as you can also see in the header for the Content-Type setting.
The top of the html file will look like this:
<html>
<head>
<meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7; IE=EmulateIE10"/>
<meta name="application-name" content="香港賽馬會"/>
<title>香港賽馬會</title>
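To check by hand that a downloaded payload really is gzip, you can look at its first two bytes and decompress it yourself. This is a minimal sketch with hypothetical names (GzipProbe, LooksLikeGzip, DecompressToString); it assumes you fetched the raw bytes, for example with WebClient.DownloadData, rather than a decoded string. The leading "‹" in the garbled output above is consistent with the second gzip magic byte, 0x8B, being rendered as text.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

static class GzipProbe
{
    // gzip streams start with the magic bytes 0x1F 0x8B.
    public static bool LooksLikeGzip(byte[] data)
    {
        return data != null && data.Length >= 2 && data[0] == 0x1F && data[1] == 0x8B;
    }

    // Decompresses a gzip payload and decodes it with the given encoding.
    public static string DecompressToString(byte[] gzipData, Encoding encoding)
    {
        using (var input = new MemoryStream(gzipData))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

For everyday use the AutomaticDecompression approach is less work; the manual route is mainly useful for diagnosing what the server actually sent.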

Web Service always return empty plain/text

I have been using a Service Reference without any success (see: Web service return only XML).
Now I am using the raw SOAP message instead:
XmlDocument doc = new XmlDocument();
doc.Load("Service.xml");
// create the request to your URL
Uri wsHost = new Uri("http://www.rrr.net/services/Connect");
HttpWebRequest request = (HttpWebRequest) WebRequest.Create(wsHost);
// add the headers
// the SOAPAction determines what action the web service should use
request.Headers.Add("SOAPAction", "act");
// set the request type
request.ContentType = "text/xml;charset=\"utf-8\"";
request.Accept = "text/xml";
request.Method = "POST";
// add our body to the request
Stream stream = request.GetRequestStream();
doc.Save(stream);
stream.Close();
// get the response back
using( HttpWebResponse response = (HttpWebResponse)request.GetResponse() )
{
Stream dataStream = response.GetResponseStream();
StreamReader dataReader = new StreamReader(dataStream);
// Use Linq to read the xml response
using (XmlReader reader = XmlReader.Create(dataStream))
{
The POST succeeds, but the response always comes back as an empty text/plain result. The response headers:
Headers = {Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/plain
Date: Thu, 06 Sep 2012 15:59:28 GMT
}
The SOAP message is below; act is the function being called:
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:web="http://webService">
<soapenv:Header/>
<soapenv:Body>
<web:act>
<web:d1>1</web:d1>
<web:d2>14</web:d2>
</web:act>
</soapenv:Body>
</soapenv:Envelope>
I also tried SoapUI. Below is the raw request from SoapUI, which returns an XML result:
POST http://www.rrr.net/services/Connect HTTP/1.1
Accept-Encoding: gzip,deflate
Content-Type: text/xml;charset=UTF-8
SOAPAction: ""
Content-Length: 516
Host: www.rrr.net
Connection: Keep-Alive
User-Agent: Apache-HttpClient/4.1.1 (java 1.5)
Thank you.
SOAP webservices require action/method to be specified (and NOT empty). If you don't know which action you want you can look at webservice WSDL by invoking the webservice with queryString "?WSDL". I.e. www.yourSite.com/your/Web/Service/URL?WSDL
You must specify an action in both the request and also the service interface. You can set the action value on the interface member using the attributes shown below and then in the request using the method you have used but by specifying the action name you used in the contract:
The attributes on the interface member
[OperationContract(Name = "YourActionName")]
[WebInvoke (Method = "POST", UriTemplate = "YourActionName")]
Message YourServiceFunction();
One method of specifying the action on the message
Message inputMessage = Message.CreateMessage (MessageVersion.Soap11, "YourActionName", reader);
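On the client side of a raw SOAP 1.1 call, the action travels in the SOAPAction HTTP header as a quoted string. The sketch below builds such a request with HttpRequestMessage; SoapRequestFactory and Create are hypothetical names, and the action value and URL are whatever the service's WSDL specifies (the "act" and URL used in the usage line come from the question and are not verified).

```csharp
using System.Net.Http;
using System.Text;

static class SoapRequestFactory
{
    // Builds a SOAP 1.1 POST with a non-empty, quoted SOAPAction header.
    public static HttpRequestMessage Create(string url, string action, string envelopeXml)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, url)
        {
            // Sets Content-Type: text/xml; charset=utf-8
            Content = new StringContent(envelopeXml, Encoding.UTF8, "text/xml")
        };

        // SOAP 1.1 sends the action as a quoted string.
        request.Headers.Add("SOAPAction", "\"" + action + "\"");
        return request;
    }
}
```

A call such as SoapRequestFactory.Create("http://www.rrr.net/services/Connect", "act", envelopeXml) can then be passed to HttpClient.SendAsync.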
