Encoding problem between C# TCP server and Java TCP client

I'm facing an encoding issue for which I haven't been able to find the correct solution.
I have a C# TCP server, running as a Windows service, which receives and responds with XML. The problem shows up when the response contains special characters, such as Spanish accented characters (á, é, í and others).
The server response is encoded as UTF-8, and the Java client reads using UTF-8, but when I print the output the character is completely different.
This problem only happens in the Java client (a C# TCP client works as expected).
The following snippet of the server code shows the encoding issue:
C# Server
byte[] destBytes = System.Text.Encoding.UTF8.GetBytes("á");
try
{
    clientStream.Write(destBytes, 0, destBytes.Length);
    clientStream.Flush();
}
catch (Exception ex)
{
    LogErrorMessage("Error en SendResponseToClient: Detalle::", ex);
}
Java Client:
socket.connect(new InetSocketAddress(param.getServerIp(), param.getPort()), 20000);
InputStream sockInp = socket.getInputStream();
InputStreamReader streamReader = new InputStreamReader(sockInp, Charset.forName("UTF-8"));
sockReader = new BufferedReader(streamReader);
String tmp = null;
while ((tmp = sockReader.readLine()) != null) {
    System.out.println(tmp);
}
For this simple test, the output shown is:
ß
I did some testing, printing out the byte[] in each language. In C#, á prints as:
195, 161
In Java, the byte[] read prints as:
-61,-95
Could this have to do with the byte type being signed in Java and unsigned in C#?
Any feedback is greatly appreciated.
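Note that the two byte dumps above describe the same data: Java's byte type is signed, so the unsigned values 195 and 161 print as -61 and -95. This suggests the bytes arriving in Java are already correct, and the difference appears only when the text is printed. A minimal C# sketch of the signed reinterpretation (illustrative only, not from the original post):
using System;

class SignedByteDemo
{
    static void Main()
    {
        byte[] utf8 = { 195, 161 }; // the UTF-8 bytes for 'á'
        foreach (byte b in utf8)
        {
            // View each unsigned byte the way Java would print it.
            sbyte signed = unchecked((sbyte)b);
            Console.WriteLine("{0} -> {1}", b, signed); // 195 -> -61, 161 -> -95
        }
    }
}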

To me this seems like an endianness problem... you can check that by reversing the bytes in Java before printing the string...
which would usually be solved by including a BOM... see http://de.wikipedia.org/wiki/Byte_Order_Mark

Are you sure that's not a Unicode character you are attempting to encode to bytes as UTF-8 data?
I found the link below has a useful way of testing whether the data in that string is correct UTF-8 before you send it.
How to test an application for correct encoding (e.g. UTF-8)
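One hedged way to run such a check in C# is to decode with a strict UTF8Encoding, which throws on invalid byte sequences (a sketch under that assumption, not taken from the linked post):
using System.Text;

static bool IsValidUtf8(byte[] data)
{
    // throwOnInvalidBytes: true makes the decoder reject malformed input
    // instead of silently substituting replacement characters.
    var strict = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false,
                                  throwOnInvalidBytes: true);
    try
    {
        strict.GetString(data);
        return true;
    }
    catch (DecoderFallbackException)
    {
        return false;
    }
}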

Related

SendGrid inbound parse nordic chars

Completely stuck on a problem related to the inbound parse webhook functionality offered by SendGrid: https://sendgrid.com/docs/for-developers/parsing-email/setting-up-the-inbound-parse-webhook/
First off, everything is working just fine with retrieving the mail sent to my application endpoint. Using Request.Form I'm able to retrieve the data and work with it.
The problem is that we started noticing question mark symbols instead of letters when receiving some mails (written in Swedish, using Å, Ä and Ö). This occurred both when sending plain-text mails and mails with an HTML body.
However, this only happens every now and then. After a lot of searching I found out that if the mail is sent from e.g. Postbox or Outlook (or the like), and the application has the charset set to iso-8859-1, that's when Å, Ä and Ö are replaced by question marks.
To replicate the error and be able to debug it, I set up an HTML page with a form using the iso-8859-1 encoding, sending a similar payload to the one seen in the link above (the default one). After that I've been through testing a multitude of things trying to get it to work.
As of now I'm trying to re-encode the input, without success. The code I'm testing:
Encoding wind1252 = Encoding.GetEncoding(1252);
Encoding utf8 = Encoding.UTF8;
byte[] wind1252Bytes = wind1252.GetBytes(Request.Form["html"]);
byte[] utf8Bytes = Encoding.Convert(wind1252, utf8, wind1252Bytes);
string utf8String = Encoding.UTF8.GetString(utf8Bytes);
This only results in utf8String producing the same result, with "???" where Å, Ä and Ö should be. My guess is that this happens because Request.Form["html"] returns a UTF-16 string whose content was already decoded with the wrong encoding, iso-8859-1.
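Worth noting: the round trip above (string to Windows-1252 bytes, converted to UTF-8 bytes, decoded back to a string) is effectively an identity apart from replacing unmappable characters, so it cannot repair anything. If the mis-decoded string still contained the original code units (no "?" yet), the usual trick is the reverse re-interpretation; a hedged sketch, which is a no-op once characters have already been replaced by "?":
using System.Text;

static string ReinterpretLatin1AsUtf8(string s)
{
    // Map each char back to a byte via ISO-8859-1 (one char == one byte),
    // then decode those bytes as UTF-8. This only helps while no character
    // has yet been lost to '?' or U+FFFD.
    byte[] raw = Encoding.GetEncoding("ISO-8859-1").GetBytes(s);
    return Encoding.UTF8.GetString(raw);
}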
The method for fetching the POST is as follows:
public async Task<InboundParseModel> FetchMail(IFormCollection form)
{
    InboundParseModel _em = new InboundParseModel
    {
        To = form["to"].SingleOrDefault(),
        From = form["from"].SingleOrDefault(),
        Subject = form["subject"].SingleOrDefault(),
        Html = form["html"].SingleOrDefault(),
        Text = System.Net.WebUtility.HtmlEncode(form["text"].SingleOrDefault()),
        Envelope = form["envelope"].SingleOrDefault()
    };
    return _em;
}
It is called from another method, to which the POST is made, via FetchMail(Request.Form);
Project info: ASP.NET Core 2.2, C#
So as stated earlier, I am completely stuck and don't really have any ideas on how to solve this. Any help would be much appreciated!
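One avenue worth checking, sketched below under stated assumptions: SendGrid's inbound parse POST also carries a charsets field, a JSON object naming the charset each other field was sent in, so the iso-8859-1 case can be detected instead of guessed. The helper name is illustrative, and on ASP.NET Core 2.2 you would parse the JSON with Newtonsoft rather than System.Text.Json:
using System.Linq;
using System.Text.Json;
using Microsoft.AspNetCore.Http;

static string HtmlCharset(IFormCollection form)
{
    // SendGrid posts e.g. {"subject":"UTF-8","html":"iso-8859-1",...}
    string json = form["charsets"].SingleOrDefault() ?? "{}";
    using (JsonDocument doc = JsonDocument.Parse(json))
    {
        return doc.RootElement.TryGetProperty("html", out var charset)
            ? charset.GetString()
            : "UTF-8";
    }
}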

PHP utf8 variable encoding (HMAC Key -> C# Server)

I'm trying to create a PHP client wrapper to talk to a .NET API. What I have is working, but I am new to PHP development and what I have now looks like it may not work 100% of the time.
C# code I am trying to replicate:
private static void HMAC_Debug()
{
    Console.WriteLine("Secret Key (Base64): 'qCJ6KNCd/ASFOt1cL5uq2TUYcRjplpYUy7QdUmvaCTs='");
    var secret = Convert.FromBase64String("qCJ6KNCd/ASFOt1cL5uq2TUYcRjplpYUy7QdUmvaCTs=");
    Console.WriteLine("Value To Hash (UTF8): 'MyHashingValue©'");
    var value = Encoding.UTF8.GetBytes("MyHashingValue©");
    using (HMACSHA256 hmac = new HMACSHA256(secret))
    {
        byte[] signatureBytes = hmac.ComputeHash(value);
        string requestSignatureBase64String = Convert.ToBase64String(signatureBytes);
        Console.WriteLine("Resulting Hash (Base64): '{0}'", requestSignatureBase64String);
    }
    Console.ReadLine();
}
My PHP equivalent:
$rawKey = base64_decode("qCJ6KNCd/ASFOt1cL5uq2TUYcRjplpYUy7QdUmvaCTs=");
// $hashValArr = unpack("C*", utf8_encode("MyHashingValue©"));
//
// $hashVal = call_user_func_array("pack", array_merge(array("C*"), $hashValArr));
$hashVal = "MyHashingValue©";
$raw = hash_hmac("sha256", $hashVal, $rawKey, TRUE);
$rawEnc = base64_encode($raw);
echo $rawEnc;
These two snippets produce the same Base64 output, but I am relying on the string variables in PHP being default encoded to UTF8 - is this a correct assumption or is there something more stable I can do?
You can see from the commented-out PHP lines that I attempted to manually encode the value to UTF-8 and then extract the raw bytes for the PHP HMAC function, but it didn't produce the same output as the C# code.
Thanks
Marlon
Which version of PHP are you using?
In general you cannot rely on the encoding being UTF-8. In fact, it might be that you just stored the file as UTF-8 (I guess without a BOM), but older PHP versions (as far as I know, before PHP 7) are not capable of working natively with Unicode; they just read it as ASCII / extended ASCII.
That said, if you do not manipulate the string, it is possible that your example works because you are just processing the bytes that are stored in the variable. And if this byte sequence happened to be a UTF-8 encoded string at the time you inserted it into your source code, it stays that way.
If you get the string from an arbitrary source, you should make sure which encoding is used and consider the multibyte string processing functions of PHP, which can work with different encodings [1].
[1] http://us2.php.net/manual/en/ref.mbstring.php

Sending quotation marks in a GCM Payload (and other special characters that break syntax)

I'm struggling to find a feasible solution to this. I've tried looking around but can't find any documentation regarding this issue. If a customer sends out a message with a quote (or quotes), it breaks the payload syntax and Android spits back a 400 Bad Request error.
The only solution I can think of is doing my own translations and validations: allow only the basics, and for the restricted characters do my own "parsing", i.e. take a quote, replace it with "/q", and then replace "/q" back in the app when received. I don't like this solution because it puts logic in the app, and if I forget something I want to be able to change it immediately rather than update everyone's phone, app, etc.
I'm looking for an existing encoding I could apply that is processed correctly by the GCM servers, allowing the messages to be accepted and broadcast, and received by the phone with the characters intact.
Base64 encoding should get rid of the special characters. Just encode it before sending and decode it again on receiving:
Edit: sorry, I only have a Java/Android sample here; I don't know exactly how Xamarin works and what functions it provides:
// before sending
byte[] data = message.getBytes("UTF-8");
String base64Message = Base64.encodeToString(data, Base64.DEFAULT);
// on receiving
byte[] data = Base64.decode(base64Message , Base64.DEFAULT);
String message= new String(data, "UTF-8");
.NET translation of @tknell's solution
Decode:
Byte[] data = System.Convert.FromBase64String(encodedString);
String decoded = System.Text.Encoding.UTF8.GetString(data);
Encode:
Byte[] data = System.Text.Encoding.UTF8.GetBytes(decodedString);
String encoded = System.Convert.ToBase64String(data);

C# Sending Hex string to tcp socket

I'm trying to send a hex string to a TCP socket. I have some problems with the format or conversion of this string, because I'm not very sure what format it's using.
I've written a Windows Phone app, based on the Socket class, which is working fine.
This app emulates requests that are normally sent from a desktop program to a device which hosts a web service.
Via Wireshark, I found out that the web service accepts an input stream (I think it's in hex) and returns a second hex stream which contains the data I need.
So the desktop app sends a stream, and Wireshark shows:
Data (8 bytes)
Data: 62ff03fff00574600
Length: 8
Now I've tried a lot to reproduce this stream. I thought it was a UTF-8 string and converted the stream to that format. But every time I send it, I see the following output in Wireshark: 62c3bf03c3bf00574600
As far as I've investigated, 62 = b, but ff is always sent as c3bf.
Does somebody know how to send this stream in the right format?
Cheers,
Jo
The socket transport shouldn't care; the content of a TCP packet is binary representing "whatever".
From the code you pointed to in the comments:
byte[] payload = Encoding.UTF8.GetBytes(data);
socketEventArg.SetBuffer(payload, 0, payload.Length);
...
response = Encoding.UTF8.GetString(e.Buffer, e.Offset, e.BytesTransferred);
response = response.Trim('\0');
At the end of the socket send/receive, (data == response) should hold. If that isn't the case, you need to figure out where the problem is. The first step is to write some very simple code like so:
string source = "your problem text string";
byte[] encode = Encoding.UTF8.GetBytes(source);
string target = Encoding.UTF8.GetString(encode, 0, encode.Length);
Debug.Assert(source == target);
If that works, then output the 'encode' array and check that it is contained in the packet data where it is being sent, then verify that that is what is being received. If you are sending the right data but receiving it corrupted, you have serious problems... I doubt you'll find that, but if so, write a very simple test program that sends and receives on the same machine (localhost) to see if it is repeatable, as sketched below.
If I had to guess, I would say that the characters being encoded are not Unicode, or that Windows Phone doesn't properly support it (Proper unicode support).
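A minimal sketch of such a localhost round trip (the port and test string are arbitrary):
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

class LocalEchoTest
{
    static void Main()
    {
        var listener = new TcpListener(IPAddress.Loopback, 9000);
        listener.Start();

        using (var client = new TcpClient())
        {
            client.Connect(IPAddress.Loopback, 9000);

            using (TcpClient serverSide = listener.AcceptTcpClient())
            using (NetworkStream serverStream = serverSide.GetStream())
            using (NetworkStream clientStream = client.GetStream())
            {
                byte[] sent = Encoding.UTF8.GetBytes("your problem text string");
                clientStream.Write(sent, 0, sent.Length);

                // A single Read is normally enough for a short loopback
                // message; real code would loop until all bytes arrive.
                byte[] buffer = new byte[sent.Length];
                int read = serverStream.Read(buffer, 0, buffer.Length);
                Console.WriteLine(Encoding.UTF8.GetString(buffer, 0, read));
            }
        }
        listener.Stop();
    }
}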
As long as you don't know the protocol / the encoding the server expects, you can only replay the known messages, like the bytes you provided in your question.
Therefore you just define the byte array directly, like this:
byte[] payload = new byte[] {0x62, 0xff, 0x03, 0xff, 0xf0, 0x05, 0x74, 0x60};
and send it over the socket like you did with the encoded string before. The server should now accept the message as if it were sent by the client you sniffed.
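If the payload only exists as a hex string at runtime, a small helper can build that byte array first (a hedged sketch; the helper name is illustrative):
using System;

static byte[] HexToBytes(string hex)
{
    // Two hex digits per byte, e.g. "62ff..." -> { 0x62, 0xff, ... }.
    byte[] result = new byte[hex.Length / 2];
    for (int i = 0; i < result.Length; i++)
    {
        result[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
    }
    return result;
}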

Why is this encrypted message damaged?

I asked this question over at the Security site, and people there suggested I post it here.
Some background: we have proprietary devices which run C on a proprietary OS, and other devices which run a C# DLL on a Windows OS.
Both contact our server via a TCP connection; to our server, both types of requests look the same.
The TCP server transfers part of the request to a self-hosted WCF service, through HTTP binding.
The requests are encrypted as shown in the link (the way the C# DLL encrypts them).
I am in the process of trying to cut out the TCP server and send requests straight to the WCF service.
My problem is that the WCF service seems to receive the request string wrong, and it can't decrypt it.
It seems like there are additional \t and \n characters in the string the server side receives; other than that it looks the same.
This is the decryption code on the server side:
byte[] byteChiperText = Encoding.Default.GetBytes(input);
if (k.Length != 16)
{
    throw new Exception("Wrong key size exception");
}
TripleDESCryptoServiceProvider des = new TripleDESCryptoServiceProvider();
des.Mode = CipherMode.ECB;
des.Padding = PaddingMode.Zeros;
des.Key = k;
ICryptoTransform ic = des.CreateDecryptor();
MemoryStream ms = new MemoryStream(byteChiperText);
CryptoStream cStream = new CryptoStream(ms, ic, CryptoStreamMode.Read);
StreamReader sReader = new StreamReader(cStream);
byte[] data = new byte[byteChiperText.Length];
int len = sReader.BaseStream.Read(data, 0, data.Length);
output = Encoding.Default.GetString(data, 0, len);
cStream.Close();
Well this looks broken to start with:
byte[] byteChiperText = Encoding.Default.GetBytes(input);
You're treating encrypted data as if it's text encoded with the platform default encoding. That's a great way to lose data. Encrypted data isn't text. It's arbitrary binary data, and should be treated as such.
Instead, you should use base64 to encode the encrypted data as text (Convert.ToBase64String) and then reverse that (Convert.FromBase64String) later on to get back to the original cypher-text. That's assuming you need it in text form to start with, of course. If you can pass it as a byte[] in the first place, that would be even better.
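A minimal sketch of that round trip (the stand-in bytes are illustrative; in practice they would come from the encryptor):
using System;

class Base64RoundTrip
{
    static void Main()
    {
        byte[] cipherBytes = { 0x12, 0xAB, 0xFF }; // stand-in for real ciphertext
        string wire = Convert.ToBase64String(cipherBytes);  // safe to pass around as text
        byte[] recovered = Convert.FromBase64String(wire);  // byte-for-byte identical
        Console.WriteLine(wire);
    }
}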
Also note that your approach to getting the text out is somewhat odd - you're creating a StreamReader, then only using the base stream. It would be better to use:
// You should be using "using" statements for all your streams, by the way...
using (TextReader reader = new StreamReader(cStream))
{
    output = reader.ReadToEnd();
}
Note that this will use UTF-8 rather than the platform default encoding - but that's a good thing, so long as you make the corresponding change in the encryption code. Using the platform default encoding is almost always a mistake - it may well not support all of Unicode, and it varies from machine to machine.
The problem could be Encoding.Default, since different computers can use different encodings as the default. You should use a specific standard encoding (UTF-8, UTF-16, ...).
