Set encoding between PHP SOAP server and C# SOAP client

I have a PHP SOAP server (using NuSOAP with WSDL) that sends the content of an HTML page. Of course, the HTML can be encoded with different encodings, and this is where the problems appear. If I use a PHP SOAP client, I can send the encoding like this:
$clienteSOAP = new SoapClient ("http://test.mine.com/wsdl/filemanager?wsdl",
array ('encoding' => 'ISO-8859-15'));
$clienteSOAP->__soapCall ("Test.uploadHTML",
array (file_get_contents ('/home/КОЛЛЕКЦИЯ_РОДНИК_ПРЕМИУМ.html')));
And if I set the correct encoding, it has never failed so far. But when I use a C# client, how can I set the encoding in the web service request? In C# the code is:
System.IO.StreamReader html = new System.IO.StreamReader (
"C:\\Documents and Settings\\КОЛЛЕКЦИЯ_РОДНИК_ПРЕМИУМ.html"
,System.Text.Encoding.GetEncoding("iso-8859-15"));
string contenido = html.ReadToEnd();
html.Close();
Test.FileManager upload = new Test.FileManager();
string resultado = upload.TestUploadHTML (contenido);
Test.FileManager is a web reference generated from the WSDL, and when I look at the uploaded HTML, some characters aren't correct.
Thanks in advance.

NuSOAP internally uses the PHP function xml_parser_create, which only supports ISO-8859-1, UTF-8 and US-ASCII. For this reason, the library doesn't work well with other encodings. Great PacHecoPe...
UPDATE: The best option, in my case, is to read the file in its original encoding and transform it to UTF-8:
System.IO.StreamReader html = new System.IO.StreamReader (
"C:\\Documents and Settings\\КОЛЛЕКЦИЯ_РОДНИК_ПРЕМИУМ.html"
,System.Text.Encoding.GetEncoding("iso-8859-15"));
string contenido = html.ReadToEnd();
html.Close();
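// contenido is already a .NET string, so this round trip through UTF-8 bytes normally leaves the text unchanged.
System.Text.UTF8Encoding encoder = new System.Text.UTF8Encoding();
byte[] bytes = System.Text.Encoding.UTF8.GetBytes (contenido);
string contenidoUTF8 = encoder.GetString(bytes);
// The proxy must be created before its RequestEncoding can be set.
Test.FileManager upload = new Test.FileManager();
upload.RequestEncoding = System.Text.Encoding.GetEncoding("UTF-8");
string resultado = upload.TestUploadHTML (contenidoUTF8);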
UPDATE2: With encodings that the UTF-8 conversion doesn't handle well, like Big5, the code above doesn't work very well. For this reason, it's better not to convert to UTF-8 and instead declare the parameter that carries the HTML content as base64Binary in the WSDL.
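As a rough illustration of the base64Binary idea (written in Python rather than C# or PHP, and independent of NuSOAP), the sketch below shows why it sidesteps the charset problem: the file's raw bytes are wrapped in ASCII-only base64 text, so nothing in between has to know the original encoding. The function names and the sample text are made up for the example.

```python
import base64


def pack_html(raw_bytes: bytes) -> str:
    """Wrap raw file bytes as ASCII-only base64 text (what base64Binary carries)."""
    return base64.b64encode(raw_bytes).decode("ascii")


def unpack_html(payload: str) -> bytes:
    """Recover the exact original bytes on the receiving side."""
    return base64.b64decode(payload)


# Hypothetical sample: a small HTML fragment stored as Big5.
original = "<p>中文</p>".encode("big5")
payload = pack_html(original)
restored = unpack_html(payload)
```

The receiver gets back byte-for-byte what was read from disk and can decode it (or just store it) with whatever encoding the HTML itself declares.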

Related

Send UTF-8 string from Android to C#

I've been trying to accomplish a simple text transmission from my Android app to my C# server (asmx server), sending the simplest string, and for some reason it never works. My Android code is as follows (assume that the variable 'message' holds the string as received from an EditText, which is UTF-16 as far as I know):
httpClient = new DefaultHttpClient();
HttpPost post = new HttpPost(POST_MESSAGE_ADDRESS);
byte[] messageBytes = message.getBytes("utf-8");
builder.addPart("message", new StringBody(messageBytes.toString()));
HttpEntity entity = builder.build();
post.setEntity(entity);
HttpResponse response = httpClient.execute(post);
So I get something simple for my message, say a 10-byte array. On my server, I have a function mapped to that specific address; its code is:
string message = HttpContext.Current.Request.Form["message"];
byte[] test = System.Text.Encoding.UTF8.GetBytes(message);
Now after that line the byte array ('test') has the exact same value as the result of the ToString() function I called in the app. Question is, how do I convert it to normal UTF-8 text to display?
Note: I have tried sending the string normally as string content, but as far as I understand the default encoding is ASCII, so I got a lot of question marks.
Edit: Now I'm looking for some conversion solutions and trying them, but my question is also whether there's a simpler way to do that (perhaps BinaryBody on the Android side, or a different encoding?)
The problem is in the following lines:
byte[] messageBytes = message.getBytes("utf-8");
builder.addPart("message", new StringBody(messageBytes.toString()));
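First you are transforming your UTF-16 string message into UTF-8 encoded messageBytes, only to turn them back into a string on the next line. Note that calling toString() on a byte array does not decode it; it returns the array object's default string representation, so the original text is lost at that point. On top of that, you are using a StringBody constructor that uses ASCII encoding by default.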
You should replace those lines with:
builder.addPart("message", new StringBody(message, Charset.forName("UTF-8")));

String encoding with a JSON flow got by web request C#

I have a little problem with a string in C#. I fetch a JSON feed from a URL.
WebClient webC = new WebClient();
string jsonStr = webC.DownloadString("http://www.express-board.fr/api/jobs");
But when I write the string to the console, I have an encoding problem.
[...]"contract":"Freelance/Indépendant"[...]
I have tried a lot of tricks seen on Stack Overflow using the Encoding class, but none of them solved the problem. Of course, if I open the link directly in my web browser and then open the result in Notepad++, there is no problem.
Sometimes, with some combination of encodings (ASCII -> UTF-8, I think), I get this:
[...]"contract":"Freelance/Indépendant"[...] to
[...]"contract":"Freelance/Ind??pendant"[...]
This actually returns the string as intended:
WebClient webC = new WebClient();
webC.Encoding = Encoding.UTF8;
string jsonStr = webC.DownloadString("http://www.express-board.fr/api/jobs");
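To make the mechanism concrete, here is a small Python sketch (not the C# code above) of what likely happens when UTF-8 bytes are decoded with a single-byte code page, which is what a client does if its default encoding is not UTF-8. The choice of cp1252 is an assumption standing in for a Western-European system default.

```python
text = "Freelance/Indépendant"
wire_bytes = text.encode("utf-8")  # what a UTF-8 server sends

# Decoding UTF-8 bytes with a single-byte code page (cp1252 is an assumed
# stand-in for the client's default) turns the two bytes of "é" into two
# separate characters.
garbled = wire_bytes.decode("cp1252")

# Decoding with the encoding the server actually used gives the right text.
correct = wire_bytes.decode("utf-8")

# Forcing the garbled text through ASCII is one way to end up with "??".
lossy = garbled.encode("ascii", errors="replace").decode("ascii")
```

Setting the decoder to UTF-8 before reading, as the answer does with webC.Encoding, corresponds to the `correct` line.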

Sending UTF-8 Text Messages on C# using TIBCO EMS

I'm using the TIBCO EMS library TIBCO.EMS.dll to send xml messages to a queue on a TIBCO EMS server. The application receiving those messages requires the XML to be UTF-8 encoded. Generating the UTF-8 xml in itself isn't a problem, however I can see no method of sending a TextMessage to the queue while keeping the data in UTF-8 format.
To serialize the objects to UTF-8 XML I use the following (simplified here):
XmlSerializer serializer = new XmlSerializer(data.GetType());
MemoryStream ms = new MemoryStream();
StreamWriter sw = new StreamWriter(ms, System.Text.Encoding.UTF8);
serializer.Serialize(sw, data);
byte[] result = ms.ToArray();
This leaves me with a byte array containing the UTF-8 encoded XML. I can write this to a BytesMessage to send to the EMS queue:
BytesMessage message = _queueSession.CreateBytesMessage();
message.WriteBytes(result);
_queueSender.Send(message);
_queueSession.Commit();
But that results in a BytesMessage on the queue. The only way I can see to get a TextMessage is to use the TextMessage class, but the text property of that class is a standard Unicode string, which would result in the XML losing its UTF-8 encoding.
Anyone know of a way to send a UTF-8 encoded text message?
You may want to try calling the Tibems.setEncoding("UTF-8") method before you send the message.
Please note that this method will impact the message encoding globally.
It seems that, by default, the TIBCO API converts C# Unicode strings to UTF-8 when a message is submitted to a queue. That is fine for plain text, but if the string contains XML and includes an encoding declaration, you must manually change that declaration to utf-8.

Unable to display languages other than English in System.Windows.Forms.WebBrowser

I am trying to use System.Windows.Forms.WebBrowser to display content in languages other than English, but the resulting encoding is incorrect. What should I do to display, for example, Russian?
I am downloading and displaying a string as follows:
System.Net.WebClient wc = new System.Net.WebClient();
webBrsr.DocumentText = wc.DownloadString(url);
The problem is with the WebClient and how it is interpreting the string encoding. One solution is to download the data as raw bytes and parse it out manually:
byte[] bytes = wc.DownloadData("http://news.google.com/news?edchanged=1&ned=ru_ru");
//You should really inspect the headers from the response to determine the exact encoding to use,
// this example just assumes UTF-8 which might work in most scenarios
String t = System.Text.Encoding.UTF8.GetString(bytes);
webBrsr.DocumentText = t;
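The comment about inspecting the response headers can be sketched as follows. This is a Python illustration, not WebClient code: it pulls the charset parameter out of a Content-Type header value and falls back to UTF-8 when none is declared. The helper names, the fallback choice, and the sample response are assumptions for the example.

```python
from email.message import Message


def charset_from_content_type(content_type: str, fallback: str = "utf-8") -> str:
    """Return the charset parameter of a Content-Type value, or the fallback."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset() or fallback


def decode_body(body: bytes, content_type: str) -> str:
    """Decode raw response bytes using the charset the server declared."""
    return body.decode(charset_from_content_type(content_type))


# Hypothetical response: Russian text served as windows-1251.
body = "Новости".encode("windows-1251")
page = decode_body(body, "text/html; charset=windows-1251")
```

A page can also declare its encoding in a meta tag inside the HTML, which this sketch does not look at.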

c# with SOAP - problem with utf-8 encoding

I'm using automatic conversion from WSDL to C#. Everything works apart from encoding: whenever I have native characters (like 'ł' or 'ó'), I get '??' instead of them in string fields ('G????wny' instead of 'Główny'). How can I deal with this? The server sends the document with the correct encoding, with header .
EDIT: I noticed in Wireshark that packets sent FROM me have a BOM, but packets sent TO me don't have it. Maybe that's the root of the problem?
So maybe the following will help:
What I am sure I did is:
In the webservice PHP file, after connecting to the Mysql Database I call:
mysql_query("SET CHARSET utf8");
mysql_query("SET NAMES utf8 COLLATE utf8_polish_ci");
The second thing I did:
In the same PHP file,
I added utf8_encode to the service on the $POST_DATA variable:
$server->service(utf8_encode($POST_DATA));
In class.nusoap_base.php I changed:
//var $soap_defencoding = 'ISO-8859-1';
var $soap_defencoding = 'UTF-8';
and also in nusoap.php, the same as above:
//var $soap_defencoding = 'ISO-8859-1';
var $soap_defencoding = 'UTF-8';
and in the nusoap.php file again:
var $decode_utf8 = true;
Now I can send and receive properly encoded data.
Hope this helps.
Regards,
The problem was on the server side, with the Content-Type header it sent (it was set to "text/xml"). It turns out that for UTF-8 it HAS TO be "text/xml; charset=utf-8"; other methods, such as adding a BOM, aren't correct (according to RFC 3023). More info here: http://annevankesteren.nl/2005/03/text-xml
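Under RFC 3023, text/xml without a charset parameter is to be treated as US-ASCII, which would explain the exact symptom in the question. The Python sketch below only illustrates that mechanism and is not what the generated C# proxy literally executes: each of the four UTF-8 bytes behind 'ł' and 'ó' is invalid ASCII and gets replaced individually.

```python
body = "Główny".encode("utf-8")  # 'ł' and 'ó' take two bytes each in UTF-8

# Treating the body as US-ASCII: every non-ASCII byte becomes a
# replacement character, so two letters turn into four placeholders.
as_ascii = body.decode("ascii", errors="replace")

# With charset=utf-8 declared, the same bytes decode correctly.
as_utf8 = body.decode("utf-8")
```

Four replaced bytes line up with the four question marks in 'G????wny'.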
