I'm trying to replicate this C# code in PHP and get the same output (I cannot change the C# code, only the PHP), and I'm stuck:
public static string HashData(string textToBeEncripted)
{
//Convert the string to a byte array
Byte[] byteDataToHash = System.Text.Encoding.Unicode.GetBytes(textToBeEncripted);
//Compute the MD5 hash algorithm
Byte[] byteHashValue = new System.Security.Cryptography.MD5CryptoServiceProvider().ComputeHash(byteDataToHash);
return System.Text.Encoding.Unicode.GetString(byteHashValue);
}
The PHP code I have so far looks like this:
$a = "test";
$a = mb_convert_encoding($a, "UTF-16LE");
$a = md5($a,true);
$a = unpack('C*', $a);
var_dump($a);
//with the output
array(16) { [1]=> int(200) [2]=> int(5) [3]=> int(158) [4]=> int(46) [5]=> int(199) [6]=> int(65) [7]=> int(159) [8]=> int(89) [9]=> int(14) [10]=> int(121) [11]=> int(215) [12]=> int(241) [13]=> int(183) [14]=> int(116) [15]=> int(191) [16]=> int(230) }
As you can see, the bytes are the same as in the C# code.
But I'm stuck at System.Text.Encoding.Unicode.GetString(). How can I replicate it in PHP? Or is there an easier way to get the same output? (Sorry, I cannot change the C# code.)
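For comparison, here is a minimal Python sketch of what the C# method appears to do: encode the text as UTF-16LE, take the MD5 digest, and reinterpret the 16 digest bytes as UTF-16LE text. The function names are made up for illustration, and the sketch assumes that Python's `errors="replace"` handling matches .NET's default replacement of unpaired surrogates with U+FFFD, which has not been checked for every input.

```python
import hashlib


def decode_like_getstring(raw: bytes) -> str:
    # Assumption: like .NET's default decoder fallback, invalid code units
    # (unpaired surrogates) are replaced with U+FFFD.
    return raw.decode("utf-16-le", errors="replace")


def hash_data(text: str) -> str:
    # Encoding.Unicode in .NET is UTF-16 little-endian without a BOM.
    digest = hashlib.md5(text.encode("utf-16-le")).digest()
    # Reinterpret the digest bytes as UTF-16 code units.
    return decode_like_getstring(digest)


# Show the code units of the resulting string as hex.
print([f"{ord(ch):04x}" for ch in hash_data("test")])
```

Because the digest is arbitrary binary data, the resulting string can contain replacement characters, so it cannot always be turned back into the original 16 bytes.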
Edit: Based on Vasiliy Zverev's answer, and since the PHP hash differs slightly, I ended up approximating: I compare the PHP hash value with the C# hash by similarity.
function validare_parola($parola,$dbHash){
$parola = mb_convert_encoding($parola, "UTF-16LE");
$parola = md5($parola, true);
$parola = mb_convert_encoding($parola, "UCS-2BE", "UCS-2LE");
$parola = bin2hex($parola);
$procent = 0; // filled in by reference by similar_text()
similar_text($dbHash,$parola,$procent);
if($procent>=90){
return true;
}else{
return false;
}
}
$parola = "testa";
$dbHash = "10095018710be2bcbbf9bba3f9d91ce8";
if(validare_parola($parola,$dbHash)){
echo 'PASSWORD CORRECT.You can log in.';
}else{
echo 'INCORRECT PASSWORD.Try again.';
}
As a side note: don't use MD5 for passwords; use the PHP password hashing API.
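To illustrate the side note, here is a small Python sketch of salted password hashing with an exact, constant-time comparison instead of a similarity threshold. It is only an illustration of the idea, not a port of PHP's password API; the function names and the iteration count are assumptions chosen for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, tune for your hardware


def hash_password(password: str):
    # A fresh random salt per password, stored next to the derived hash.
    salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # Exact comparison in constant time; a "90% similar" hash must not pass.
    return hmac.compare_digest(derived, expected)


salt, stored = hash_password("testa")
print(verify_password("testa", salt, stored))
print(verify_password("testb", salt, stored))
```

A percentage match on hashes accepts many wrong passwords, which is why the comparison here is all or nothing.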
Edit2: I ended up using Vasiliy Zverev's solution.
Edit3: For the value "111111" the output in PHP is different...
Edit4: Vasiliy Zverev updated his solution and it now works as expected.
The solution, updated:
$a = "SF0D9G9SGGF0gdsfg976590";
$a = mb_convert_encoding($a, "UTF-16LE");
$a = md5($a, true);
$res = '';
for($i=0; $i<16; $i+=2) {
// System.Text.Encoding.Unicode.GetString(byteHashValue) replaces invalid characters with 0xfffd ('�'),
// while PHP replaces them with 0x003f ('?') or an empty string. So replace them the way C# does
$a2 = mb_convert_encoding($a[$i].$a[$i+1], "UTF-16LE","UTF-16LE"); // check if this is invalid UTF-16 character
if(strlen($a2)==0 || $a[$i]!=$a2[0]) {
// replace invalid UTF-16 character with C# like
$v = 0xfffd;
}
else {
// prepare a word (UTF-16 character)
$v = ord($a[$i+1])<<8 | ord($a[$i]);
}
// print each word without leading zeros as C# code does
$res .= sprintf("%x",$v);
}
echo($res);
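The loop above can be sketched in Python as follows. This is an illustration under the assumption that the only invalid UTF-16 code units are surrogates (0xD800 to 0xDFFF) and that, like the PHP loop, each 16-bit word is checked on its own; the function names are hypothetical.

```python
import hashlib


def words_hex(digest: bytes) -> str:
    parts = []
    for i in range(0, len(digest), 2):
        # Build the little-endian 16-bit word (one UTF-16 code unit).
        word = int.from_bytes(digest[i:i + 2], "little")
        # Assumption: every surrogate code unit counts as invalid and is
        # replaced with U+FFFD, mirroring the per-word check in the PHP code.
        if 0xD800 <= word <= 0xDFFF:
            word = 0xFFFD
        # Hex without leading zeros, as the PHP sprintf("%x") does.
        parts.append(format(word, "x"))
    return "".join(parts)


def hash_hex(text: str) -> str:
    return words_hex(hashlib.md5(text.encode("utf-16-le")).digest())


print(hash_hex("SF0D9G9SGGF0gdsfg976590"))
```

Dropping leading zeros makes the result variable in length, so two different digests can in principle produce the same string.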
Removed this variant because it was wrong. See correct code above.
I am trying to parse a message that possibly contains emojis in it. An example message that could be received looks like:
{"type":"chat","msg":"UserName:\u00a0\ud83d\ude0b \n"}
What should match is \u00a0 as a single character, and \ud83d\ude0b as a pair.
I have regex that can pull individual codes, but not pairs to match the full emoji:
\\u[a-z0-9]{4}
Is there a clean way to account for any/multiple emojis in a sentence so I can replace the surrogate pair with the function I have? Thanks!
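One way to express the matching rule described above (a pair of surrogate escapes first, any other escape second) is sketched below in Python. The pattern and function name are illustrative, and the sketch assumes the message is still in its raw, escaped JSON form; it does not handle an escaped backslash directly before a `u`.

```python
import re

ESCAPE_RE = re.compile(
    r"\\u(d[89ab][0-9a-f]{2})\\u(d[c-f][0-9a-f]{2})"  # high + low surrogate pair
    r"|\\u([0-9a-f]{4})",                              # any other \uXXXX escape
    re.IGNORECASE,
)


def escapes_to_codepoints(raw: str) -> list:
    found = []
    for match in ESCAPE_RE.finditer(raw):
        if match.group(1):
            high = int(match.group(1), 16)
            low = int(match.group(2), 16)
            # Standard UTF-16 surrogate pair arithmetic.
            codepoint = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
        else:
            codepoint = int(match.group(3), 16)
        found.append(f"U+{codepoint:04X}")
    return found


raw = r'{"type":"chat","msg":"UserName:\u00a0\ud83d\ude0b \n"}'
print(escapes_to_codepoints(raw))  # ['U+00A0', 'U+1F60B']
```

Putting the pair alternative first matters: otherwise the single-escape alternative would consume the high surrogate on its own.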
Edit: Here is the function I will be using alongside the regex
string ConvertToUnicode(string SurrogatePair)
{
string returnValue = "";
for (var i = 0; i < SurrogatePair.Length; i += char.IsSurrogatePair(SurrogatePair, i) ? 2 : 1)
{
var codepoint = char.ConvertToUtf32(SurrogatePair, i);
returnValue = String.Format("U+{0:X4}", codepoint);
}
return returnValue;
}
I have a dictionary where the key is unicode and the value is a reference to the emoji image so I can display it in a Unity UI element.
Edit 2:
As this question is a bit too broad, I will get specific as to what I am trying to accomplish. I am receiving a json message from a server through a websocket connection. This message is being displayed in a Unity Panel where each message is a text mesh pro text object. When an emoji is sent, the message displays similarly to the example message above, with the only change being that the surrogate pair changes based on the emoji sent. In order to insert the corresponding emoji image correctly into the text mesh pro object, I need to get the sprite atlas id that points to the correct emoji. As I did not insert the sprites by hand into the atlas but read them in from a spliced sprite sheet, the only way of accessing each image is by their indexed id. To correctly identify the emoji in the atlas by id, I created a dictionary that inserted the Unicode in order as they appeared in the sprite sheet as the key, where the value is the index in the atlas. What I am trying to do now is parse the message received for an emoji using regex, send this parsed data into the function I posted above to convert it to the Unicode value, then retrieve the correct id from the dictionary to finally insert the emoji that was originally entered on the front-end. If there is a better way of going about this please let me know, but from the research I did, the only efficient way to insert images into a Unity text object is how I am going about it. Due to this, I need to get the surrogate pairs from the message.
Edit 3:
If someone else happens to stumble upon this question, I'll leave the solution I came up with for getting emojis from an HTML website, through a websocket server, into Unity TextMesh Pro.
Here is a google spreadsheet containing the hex values and the sprite sheet I used to create the dictionary / sprite atlas in Unity: https://docs.google.com/spreadsheets/d/1XQY1n9cA1hx_PnsXoisxjanZRiQG0gd25VYEmk1W7mE/edit?usp=sharing.
I then used a library that can be found here: https://github.com/aaronpk/emoji-detector-php
It can parse a string and find emojis. I replaced the regex it used with the regex provided by sln below, then tweaked the main script to return the message text with each emoji replaced by its hex code wrapped in a delimiter that I could find with a regex on the Unity side.
<?php
namespace Emoji;
define('LONGEST_EMOJI', 8);
function detect_emoji($string) {
// Find all the emoji in the input string
$prevencoding = mb_internal_encoding();
mb_internal_encoding('UTF-8');
$data = array();
$test = $string;
static $map;
if(!isset($map))
$map = _load_map();
static $regexp;
if(!isset($regexp))
$regexp = _load_regexp();
if(preg_match_all($regexp, $string, $matches)) {
foreach($matches[0] as $ch) {
$points = array();
for($i=0; $i<mb_strlen($ch); $i++) {
$points[] = strtoupper(dechex(uniord(mb_substr($ch, $i, 1))));
}
$hexstr = implode('-', $points);
// Replace the first remaining occurrence of this emoji with its delimited hex code
$test = substr_replace($test, "[[[[".$hexstr."]]]]", strpos($test, $ch), strlen($ch));
}
}
if($prevencoding)
mb_internal_encoding($prevencoding);
return $test;
}
function _load_map() {
return json_decode(file_get_contents(dirname(__FILE__).'/map.json'), true);
}
function _load_regexp() {
return '/(?:' . json_decode(file_get_contents(dirname(__FILE__).'/regexp.json')) . ')/u';
}
function uniord($c) {
$ord0 = ord($c[0]); if ($ord0>=0 && $ord0<=127) return $ord0;
$ord1 = ord($c[1]); if ($ord0>=192 && $ord0<=223) return ($ord0-192)*64 + ($ord1-128);
$ord2 = ord($c[2]); if ($ord0>=224 && $ord0<=239) return ($ord0-224)*4096 + ($ord1-128)*64 + ($ord2-128);
$ord3 = ord($c[3]); if ($ord0>=240 && $ord0<=247) return ($ord0-240)*262144 + ($ord1-128)*4096 + ($ord2-128)*64 + ($ord3-128);
return false;
}
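As a rough illustration of what the script above produces for each match, the Python sketch below turns an emoji sequence into its upper-case hex code points joined by '-', wrapped in the same `[[[[` and `]]]]` delimiters. The function names are hypothetical, and the emoji detection itself (the regex) is left out.

```python
def emoji_to_hex(cluster: str) -> str:
    # One upper-case hex value per code point, joined with '-',
    # in the same shape the PHP code builds with dechex() and implode().
    return "-".join(f"{ord(ch):X}" for ch in cluster)


def wrap_emoji(cluster: str) -> str:
    return "[[[[" + emoji_to_hex(cluster) + "]]]]"


print(wrap_emoji("\U0001F60B"))            # [[[[1F60B]]]]
print(wrap_emoji("\U0001F1FA\U0001F1F8"))  # [[[[1F1FA-1F1F8]]]]
```

In Python a string is already a sequence of code points, so no manual UTF-8 decoding like `uniord()` is needed.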
To call this in my server script, I just had to add:
include("src/Emoji.php");
at the top of my script, and call the function as follows:
$message = Emoji\detect_emoji($message);
I made sure this was only sent to my Unity client as I was storing their resourceId in the server code.
On the Unity side, to find the emojis that need to be replaced, I used:
string emojiPattern = @"(?<=\[\[\[\[).[^\]\]\]\]]*";
MatchCollection emojiMatch = Regex.Matches(messageString, emojiPattern);
for(int x = 0; x < emojiMatch.Count; x++)
{
messageString = messageString.Replace("[[[[" + emojiMatch[x].Value + "]]]]", "<sprite=" + emojiDictionary[emojiMatch[x].Value.ToLower()].ToString() + ">");
}
Since the text element I was instantiating was a TextMesh Pro text GUI element, it is able to convert the <sprite=...> tag to an image in the sprite atlas I made from the sprite sheet. Hope this helps someone in the future!
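The client-side replacement step can be sketched like this in Python. The pattern captures the hex code between the delimiters explicitly rather than relying on a lookbehind; the dictionary contents and names are made up for the example, and unknown codes are left untouched rather than raising an error.

```python
import re

# Captures the hex code points (and '-' separators) between the delimiters.
TOKEN_RE = re.compile(r"\[\[\[\[([0-9A-Fa-f-]+)\]\]\]\]")


def insert_sprites(message: str, sprite_ids: dict) -> str:
    def replace(match):
        key = match.group(1).lower()
        index = sprite_ids.get(key)
        # Leave the token as-is when the emoji is not in the atlas.
        return match.group(0) if index is None else f"<sprite={index}>"

    return TOKEN_RE.sub(replace, message)


# Hypothetical atlas: lower-case hex code -> sprite index.
atlas = {"1f60b": 5}
print(insert_sprites("UserName: [[[[1F60B]]]] hi", atlas))  # UserName: <sprite=5> hi
```

Doing the substitution in one pass also avoids replacing the same token repeatedly when an emoji occurs more than once.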
A complete C# regex to find any/all V12 Emoji
[#*0-9]\uFE0F\u20E3|[\u00A9\u00AE\u203C\u2049\u2122\u2139\u2194-\u2199\u21A9\u21AA\u231A\u231B\u2328\u23CF\u23E9-\u23F3\u23F8-\u23FA\u24C2\u25AA\u25AB\u25B6\u25C0\u25FB-\u25FE\u2600-\u2604\u260E\u2611\u2614\u2615\u2618]|\u261D(?:\uD83C[\uDFFB-\uDFFF])?|[\u2620\u2622\u2623\u2626\u262A\u262E\u262F\u2638-\u263A\u2640\u2642\u2648-\u2653\u265F\u2660\u2663\u2665\u2666\u2668\u267B\u267E\u267F\u2692-\u2697\u2699\u269B\u269C\u26A0\u26A1\u26AA\u26AB\u26B0\u26B1\u26BD\u26BE\u26C4\u26C5\u26C8\u26CE\u26CF\u26D1\u26D3\u26D4\u26E9\u26EA\u26F0-\u26F5\u26F7\u26F8]|\u26F9(?:\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?|\uFE0F\u200D[\u2640\u2642]\uFE0F)?|[\u26FA\u26FD\u2702\u2705\u2708\u2709]|[\u270A-\u270D](?:\uD83C[\uDFFB-\uDFFF])?|[\u270F\u2712\u2714\u2716\u271D\u2721\u2728\u2733\u2734\u2744\u2747\u274C\u274E\u2753-\u2755\u2757\u2763\u2764\u2795-\u2797\u27A1\u27B0\u27BF\u2934\u2935\u2B05-\u2B07\u2B1B\u2B1C\u2B50\u2B55\u3030\u303D\u3297\u3299]|\uD83C(?:[\uDC04\uDCCF\uDD70\uDD71\uDD7E\uDD7F\uDD8E\uDD91-\uDD9A]|\uDDE6\uD83C[\uDDE8-\uDDEC\uDDEE\uDDF1\uDDF2\uDDF4\uDDF6-\uDDFA\uDDFC\uDDFD\uDDFF]|\uDDE7\uD83C[\uDDE6\uDDE7\uDDE9-\uDDEF\uDDF1-\uDDF4\uDDF6-\uDDF9\uDDFB\uDDFC\uDDFE\uDDFF]|\uDDE8\uD83C[\uDDE6\uDDE8\uDDE9\uDDEB-\uDDEE\uDDF0-\uDDF5\uDDF7\uDDFA-\uDDFF]|\uDDE9\uD83C[\uDDEA\uDDEC\uDDEF\uDDF0\uDDF2\uDDF4\uDDFF]|\uDDEA\uD83C[\uDDE6\uDDE8\uDDEA\uDDEC\uDDED\uDDF7-\uDDFA]|\uDDEB\uD83C[\uDDEE-\uDDF0\uDDF2\uDDF4\uDDF7]|\uDDEC\uD83C[\uDDE6\uDDE7\uDDE9-\uDDEE\uDDF1-\uDDF3\uDDF5-\uDDFA\uDDFC\uDDFE]|\uDDED\uD83C[\uDDF0\uDDF2\uDDF3\uDDF7\uDDF9\uDDFA]|\uDDEE\uD83C[\uDDE8-\uDDEA\uDDF1-\uDDF4\uDDF6-\uDDF9]|\uDDEF\uD83C[\uDDEA\uDDF2\uDDF4\uDDF5]|\uDDF0\uD83C[\uDDEA\uDDEC-\uDDEE\uDDF2\uDDF3\uDDF5\uDDF7\uDDFC\uDDFE\uDDFF]|\uDDF1\uD83C[\uDDE6-\uDDE8\uDDEE\uDDF0\uDDF7-\uDDFB\uDDFE]|\uDDF2\uD83C[\uDDE6\uDDE8-\uDDED\uDDF0-\uDDFF]|\uDDF3\uD83C[\uDDE6\uDDE8\uDDEA-\uDDEC\uDDEE\uDDF1\uDDF4\uDDF5\uDDF7\uDDFA\uDDFF]|\uDDF4\uD83C\uDDF2|\uDDF5\uD83C[\uDDE6\uDDEA-\uDDED\uDDF0-\uDDF3\uDDF7-\uDDF9\uDDFC\uDD
FE]|\uDDF6\uD83C\uDDE6|\uDDF7\uD83C[\uDDEA\uDDF4\uDDF8\uDDFA\uDDFC]|\uDDF8\uD83C[\uDDE6-\uDDEA\uDDEC-\uDDF4\uDDF7-\uDDF9\uDDFB\uDDFD-\uDDFF]|\uDDF9\uD83C[\uDDE6\uDDE8\uDDE9\uDDEB-\uDDED\uDDEF-\uDDF4\uDDF7\uDDF9\uDDFB\uDDFC\uDDFF]|\uDDFA\uD83C[\uDDE6\uDDEC\uDDF2\uDDF3\uDDF8\uDDFE\uDDFF]|\uDDFB\uD83C[\uDDE6\uDDE8\uDDEA\uDDEC\uDDEE\uDDF3\uDDFA]|\uDDFC\uD83C[\uDDEB\uDDF8]|\uDDFD\uD83C\uDDF0|\uDDFE\uD83C[\uDDEA\uDDF9]|\uDDFF\uD83C[\uDDE6\uDDF2\uDDFC]|[\uDE01\uDE02\uDE1A\uDE2F\uDE32-\uDE3A\uDE50\uDE51\uDF00-\uDF21\uDF24-\uDF84]|\uDF85(?:\uD83C[\uDFFB-\uDFFF])?|[\uDF86-\uDF93\uDF96\uDF97\uDF99-\uDF9B\uDF9E-\uDFC1]|\uDFC2(?:\uD83C[\uDFFB-\uDFFF])?|[\uDFC3\uDFC4](?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|[\uDFC5\uDFC6]|\uDFC7(?:\uD83C[\uDFFB-\uDFFF])?|[\uDFC8\uDFC9]|\uDFCA(?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|[\uDFCB\uDFCC](?:\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?|\uFE0F\u200D[\u2640\u2642]\uFE0F)?|[\uDFCD-\uDFF0]|\uDFF3(?:\uFE0F\u200D\uD83C\uDF08)?|\uDFF4(?:\u200D\u2620\uFE0F|\uDB40\uDC67\uDB40\uDC62\uDB40(?:\uDC65\uDB40\uDC6E\uDB40\uDC67|\uDC73\uDB40\uDC63\uDB40\uDC74|\uDC77\uDB40\uDC6C\uDB40\uDC73)\uDB40\uDC7F)?|[\uDFF5\uDFF7-\uDFFF])|\uD83D(?:[\uDC00-\uDC14]|\uDC15(?:\u200D\uD83E\uDDBA)?|[\uDC16-\uDC40]|\uDC41(?:\uFE0F\u200D\uD83D\uDDE8\uFE0F)?|[\uDC42\uDC43](?:\uD83C[\uDFFB-\uDFFF])?|[\uDC44\uDC45]|[\uDC46-\uDC50](?:\uD83C[\uDFFB-\uDFFF])?|[\uDC51-\uDC65]|[\uDC66\uDC67](?:\uD83C[\uDFFB-\uDFFF])?|\uDC68(?:\u200D(?:[\u2695\u2696\u2708]\uFE0F|\u2764\uFE0F\u200D\uD83D(?:\uDC8B\u200D\uD83D)?\uDC68|\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D(?:\uDC66(?:\u200D\uD83D\uDC66)?|\uDC67(?:\u200D\uD83D[\uDC66\uDC67])?|[\uDC68\uDC69]\u200D\uD83D(?:\uDC66(?:\u200D\uD83D\uDC66)?|\uDC67(?:\u200D\uD83D[\uDC66\uDC67])?)|[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92])|\uD83E[\uDDAF-\uDDB3\uDDBC\uDDBD])|\uD83C(?:\uDFFB(?:\u200D(?:[\u2695\u2696\u2708]\uFE0F|\uD83C[\uDF3E\uDF73\uDF93\uD
FA4\uDFA8\uDFEB\uDFED]|\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E[\uDDAF-\uDDB3\uDDBC\uDDBD]))?|\uDFFC(?:\u200D(?:[\u2695\u2696\u2708]\uFE0F|\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E(?:\uDD1D\u200D\uD83D\uDC68\uD83C\uDFFB|[\uDDAF-\uDDB3\uDDBC\uDDBD])))?|\uDFFD(?:\u200D(?:[\u2695\u2696\u2708]\uFE0F|\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E(?:\uDD1D\u200D\uD83D\uDC68\uD83C[\uDFFB\uDFFC]|[\uDDAF-\uDDB3\uDDBC\uDDBD])))?|\uDFFE(?:\u200D(?:[\u2695\u2696\u2708]\uFE0F|\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E(?:\uDD1D\u200D\uD83D\uDC68\uD83C[\uDFFB-\uDFFD]|[\uDDAF-\uDDB3\uDDBC\uDDBD])))?|\uDFFF(?:\u200D(?:[\u2695\u2696\u2708]\uFE0F|\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E(?:\uDD1D\u200D\uD83D\uDC68\uD83C[\uDFFB-\uDFFE]|[\uDDAF-\uDDB3\uDDBC\uDDBD])))?))?|\uDC69(?:\u200D(?:[\u2695\u2696\u2708]\uFE0F|\u2764\uFE0F\u200D\uD83D(?:\uDC8B\u200D\uD83D)?[\uDC68\uDC69]|\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D(?:\uDC66(?:\u200D\uD83D\uDC66)?|\uDC67(?:\u200D\uD83D[\uDC66\uDC67])?|\uDC69\u200D\uD83D(?:\uDC66(?:\u200D\uD83D\uDC66)?|\uDC67(?:\u200D\uD83D[\uDC66\uDC67])?)|[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92])|\uD83E[\uDDAF-\uDDB3\uDDBC\uDDBD])|\uD83C(?:\uDFFB(?:\u200D(?:[\u2695\u2696\u2708]\uFE0F|\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E(?:\uDD1D\u200D\uD83D\uDC68\uD83C[\uDFFC-\uDFFF]|[\uDDAF-\uDDB3\uDDBC\uDDBD])))?|\uDFFC(?:\u200D(?:[\u2695\u2696\u2708]\uFE0F|\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E(?:\uDD1D\u200D\uD83D(?:\uDC68\uD83C[\uDFFB\uDFFD-\uDFFF]|\uDC69\uD83C\uDFFB)|[\uDDAF-\uDDB3\uDDBC\uDDBD])))?|\uDFFD(?:\u200D(?:[\u2695\u2696\u2708]\uFE0F|\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|
\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E(?:\uDD1D\u200D\uD83D(?:\uDC68\uD83C[\uDFFB\uDFFC\uDFFE\uDFFF]|\uDC69\uD83C[\uDFFB\uDFFC])|[\uDDAF-\uDDB3\uDDBC\uDDBD])))?|\uDFFE(?:\u200D(?:[\u2695\u2696\u2708]\uFE0F|\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E(?:\uDD1D\u200D\uD83D(?:\uDC68\uD83C[\uDFFB-\uDFFD\uDFFF]|\uDC69\uD83C[\uDFFB-\uDFFD])|[\uDDAF-\uDDB3\uDDBC\uDDBD])))?|\uDFFF(?:\u200D(?:[\u2695\u2696\u2708]\uFE0F|\uD83C[\uDF3E\uDF73\uDF93\uDFA4\uDFA8\uDFEB\uDFED]|\uD83D[\uDCBB\uDCBC\uDD27\uDD2C\uDE80\uDE92]|\uD83E(?:\uDD1D\u200D\uD83D[\uDC68\uDC69]\uD83C[\uDFFB-\uDFFE]|[\uDDAF-\uDDB3\uDDBC\uDDBD])))?))?|\uDC6A|[\uDC6B-\uDC6D](?:\uD83C[\uDFFB-\uDFFF])?|\uDC6E(?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|\uDC6F(?:\u200D[\u2640\u2642]\uFE0F)?|\uDC70(?:\uD83C[\uDFFB-\uDFFF])?|\uDC71(?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|\uDC72(?:\uD83C[\uDFFB-\uDFFF])?|\uDC73(?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|[\uDC74-\uDC76](?:\uD83C[\uDFFB-\uDFFF])?|\uDC77(?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|\uDC78(?:\uD83C[\uDFFB-\uDFFF])?|[\uDC79-\uDC7B]|\uDC7C(?:\uD83C[\uDFFB-\uDFFF])?|[\uDC7D-\uDC80]|[\uDC81\uDC82](?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|\uDC83(?:\uD83C[\uDFFB-\uDFFF])?|\uDC84|\uDC85(?:\uD83C[\uDFFB-\uDFFF])?|[\uDC86\uDC87](?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|[\uDC88-\uDCA9]|\uDCAA(?:\uD83C[\uDFFB-\uDFFF])?|[\uDCAB-\uDCFD\uDCFF-\uDD3D\uDD49-\uDD4E\uDD50-\uDD67\uDD6F\uDD70\uDD73]|\uDD74(?:\uD83C[\uDFFB-\uDFFF])?|\uDD75(?:\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?|\uFE0F\u200D[\u2640\u2642]\uFE0F)?|[\uDD76-\uDD79]|\uDD7A(?:\uD83C[\uDFFB-\uDFFF])?|[\uDD87\uDD8A-\uDD8D]|[\uDD90\uDD95\uDD96](?:\uD83C[\uDFFB-\uDFFF])?|[\uDDA4\uDDA5\uDDA8\uDD
B1\uDDB2\uDDBC\uDDC2-\uDDC4\uDDD1-\uDDD3\uDDDC-\uDDDE\uDDE1\uDDE3\uDDE8\uDDEF\uDDF3\uDDFA-\uDE44]|[\uDE45-\uDE47](?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|[\uDE48-\uDE4A]|\uDE4B(?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|\uDE4C(?:\uD83C[\uDFFB-\uDFFF])?|[\uDE4D\uDE4E](?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|\uDE4F(?:\uD83C[\uDFFB-\uDFFF])?|[\uDE80-\uDEA2]|\uDEA3(?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|[\uDEA4-\uDEB3]|[\uDEB4-\uDEB6](?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|[\uDEB7-\uDEBF]|\uDEC0(?:\uD83C[\uDFFB-\uDFFF])?|[\uDEC1-\uDEC5\uDECB]|\uDECC(?:\uD83C[\uDFFB-\uDFFF])?|[\uDECD-\uDED2\uDED5\uDEE0-\uDEE5\uDEE9\uDEEB\uDEEC\uDEF0\uDEF3-\uDEFA\uDFE0-\uDFEB])|\uD83E(?:[\uDD0D\uDD0E]|\uDD0F(?:\uD83C[\uDFFB-\uDFFF])?|[\uDD10-\uDD17]|[\uDD18-\uDD1C](?:\uD83C[\uDFFB-\uDFFF])?|\uDD1D|[\uDD1E\uDD1F](?:\uD83C[\uDFFB-\uDFFF])?|[\uDD20-\uDD25]|\uDD26(?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|[\uDD27-\uDD2F]|[\uDD30-\uDD36](?:\uD83C[\uDFFB-\uDFFF])?|\uDD37(?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|[\uDD38\uDD39](?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|\uDD3A|\uDD3C(?:\u200D[\u2640\u2642]\uFE0F)?|[\uDD3D\uDD3E](?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|[\uDD3F-\uDD45\uDD47-\uDD71\uDD73-\uDD76\uDD7A-\uDDA2\uDDA5-\uDDAA\uDDAE-\uDDB4]|[\uDDB5\uDDB6](?:\uD83C[\uDFFB-\uDFFF])?|\uDDB7|[\uDDB8\uDDB9](?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|\uDDBA|\uDDBB(?:\uD83C[\uDFFB-\uDFFF])?|[\uDDBC-\uDDCA]|[\uDDCD-\uDDCF](?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|\uDDD0|\uDDD1(?:\u200D\uD83E\uDD1D\u200D\uD83E\uDDD1|\uD83C(?:\uDFFB(?:
\u200D\uD83E\uDD1D\u200D\uD83E\uDDD1\uD83C\uDFFB)?|\uDFFC(?:\u200D\uD83E\uDD1D\u200D\uD83E\uDDD1\uD83C[\uDFFB\uDFFC])?|\uDFFD(?:\u200D\uD83E\uDD1D\u200D\uD83E\uDDD1\uD83C[\uDFFB-\uDFFD])?|\uDFFE(?:\u200D\uD83E\uDD1D\u200D\uD83E\uDDD1\uD83C[\uDFFB-\uDFFE])?|\uDFFF(?:\u200D\uD83E\uDD1D\u200D\uD83E\uDDD1\uD83C[\uDFFB-\uDFFF])?))?|[\uDDD2-\uDDD5](?:\uD83C[\uDFFB-\uDFFF])?|\uDDD6(?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|[\uDDD7-\uDDDD](?:\u200D[\u2640\u2642]\uFE0F|\uD83C[\uDFFB-\uDFFF](?:\u200D[\u2640\u2642]\uFE0F)?)?|[\uDDDE\uDDDF](?:\u200D[\u2640\u2642]\uFE0F)?|[\uDDE0-\uDDFF\uDE70-\uDE73\uDE78-\uDE7A\uDE80-\uDE82\uDE90-\uDE95])
I'm working on an SSO implementation in PHP that authenticates to a system written in C#. Here's some pseudo code to demonstrate:
$token = "MqsXexqpYRUNAHR_lHkPRic1g1BYhH6bFNVPagEkuaL8Mf80l_tOirhThQYIbfWYErgu4bDwl-7brVhXTWnJNQ2";
$id = "bob@company.com";
$ssokey = "7MpszrQpO95p7H";
$idAndKey = $id . $ssokey;
$salt = base64_decode(substr($token, 0, -1));
$hashed = hash_pbkdf2("sha256", $idAndKey, mb_convert_encoding($salt, 'UTF-16LE'), 1000, 24, false);
$data = base64_encode($hashed);
This outputs: NWZiMTBhZmNhNTlmYzMxMTEzMThhZmVl
Here's the C# version from the system with which I'm integrating:
var token = "MqsXexqpYRUNAHR_lHkPRic1g1BYhH6bFNVPagEkuaL8Mf80l_tOirhThQYIbfWYErgu4bDwl-7brVhXTWnJNQ2";
var id = "bob@company.com";
var ssokey = "7MpszrQpO95p7H";
string idAndKey = id + ssokey;
var salt = HttpServerUtility.UrlTokenDecode(token);
var pbkdf2 = new Rfc2898DeriveBytes(idAndKey, salt) {IterationCount = 1000};
var key = HttpServerUtility.UrlTokenEncode(pbkdf2.GetBytes(24));
Console.WriteLine(key.ToString());
This outputs: aE1k9-djZ66WbUATqdHbWyJzskMI5ABS0
I cannot figure out how to get my PHP code to do the same thing. I have a feeling it is in the salt generation.
I've tried to translate the C# HttpServerUtility.UrlTokenDecode function to PHP like so:
function UrlTokenDecode($token) {
$numPadChars = substr($token, -1);
// add the padded count to the end
$salt = substr($token, 0, -1) . $numPadChars;
// Transform the "-" to "+", and "*" to "/"
$salt = str_replace('-', '+', str_replace('*', '/', $salt));
// base64_decode
$salt = base64_decode($salt);
return $salt;
}
That didn't get me to where I needed to go. Halp!
This is for Absorb LMS. Documentation of their methods are here: https://support.absorblms.com/hc/en-us/articles/222446647-Incoming-Absorb-Single-Sign-On#Methods
Thanks!
I don't know PHP at all, but I think I can still help. First, as stated in my comment, Rfc2898DeriveBytes as called here in C# uses SHA1 as its hash function, not SHA256, no matter what your documentation says.
Next, UrlTokenDecode (and UrlTokenEncode) is a rather strange thing that I have rarely seen in practice. It converts regular base64 to a "URL-safe" version as follows:
replaces '+' with '-'
replaces '/' with '_'
removes the padding ('=' characters at the end) and appends the number of removed padding characters as the last character (if there was no padding, it still appends "0"). This step doesn't make any sense to me, but that's how it works.
So to replicate it you need to base64_encode, replace, remove the padding, and then append the padding length as a character. If your base64 string ended with ==, you remove that and add "2" at the end. If there was no padding, you add "0".
To decode such a string, you reverse the replacements, then remove the last character and append as many '=' as that character indicates.
So string
MqsXexqpYRUNAHR_lHkPRic1g1BYhH6bFNVPagEkuaL8Mf80l_tOirhThQYIbfWYErgu4bDwl-7brVhXTWnJNQ2
In normal base64 is
MqsXexqpYRUNAHR/lHkPRic1g1BYhH6bFNVPagEkuaL8Mf80l/tOirhThQYIbfWYErgu4bDwl+7brVhXTWnJNQ==
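The encoding and decoding rules described above can be sketched in Python as follows. The function names are made up for the example, and this is only an approximation of the .NET methods based on the description in this answer (for instance, empty input is simply mapped to empty output).

```python
import base64


def url_token_decode(token: str) -> bytes:
    if not token:
        return b""
    # The last character is the number of '=' characters that were removed.
    pad_count = int(token[-1])
    body = token[:-1].replace("-", "+").replace("_", "/")
    return base64.b64decode(body + "=" * pad_count)


def url_token_encode(data: bytes) -> str:
    if not data:
        return ""
    b64 = base64.b64encode(data).decode("ascii")
    stripped = b64.rstrip("=")
    # Append the count of removed padding characters ("0" if there were none).
    return stripped.replace("+", "-").replace("/", "_") + str(len(b64) - len(stripped))


token = "MqsXexqpYRUNAHR_lHkPRic1g1BYhH6bFNVPagEkuaL8Mf80l_tOirhThQYIbfWYErgu4bDwl-7brVhXTWnJNQ2"
salt = url_token_decode(token)
print(len(salt), url_token_encode(salt) == token)
```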
Then, I have no idea why you do this:
mb_convert_encoding($salt, 'UTF-16LE')
Just remove it (though, as I don't know PHP, there might be a reason for it that I cannot imagine, so take care).
Then, as the other answer states, the last argument to hash_pbkdf2() should be true.
After making these changes your code will work (I used the token already converted to a normal base64 string):
$token = "MqsXexqpYRUNAHR/lHkPRic1g1BYhH6bFNVPagEkuaL8Mf80l/tOirhThQYIbfWYErgu4bDwl+7brVhXTWnJNQ==";
$id = "bob@company.com";
$ssokey = "7MpszrQpO95p7H";
$idAndKey = $id . $ssokey;
$salt = base64_decode($token);
$hashed = hash_pbkdf2("sha1", $idAndKey, $salt, 1000, 24, true);
$data = base64_encode($hashed);
echo $data;
produces the expected answer (in normal base64; you need to "URL-token encode" it to get an exact match).
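Putting the pieces together, the whole derivation might look like this in Python. This is a sketch under the assumptions stated in this answer: PBKDF2 with HMAC-SHA1, 1000 iterations, 24 output bytes, the password encoded as UTF-8, and the token-style base64 on both ends. The helper names are hypothetical, and the id and key in the usage line are placeholders rather than the values from the question.

```python
import base64
import hashlib


def url_token_decode(token: str) -> bytes:
    # Last character = number of stripped '=' padding characters.
    body = token[:-1].replace("-", "+").replace("_", "/")
    return base64.b64decode(body + "=" * int(token[-1]))


def url_token_encode(data: bytes) -> str:
    b64 = base64.b64encode(data).decode("ascii")
    stripped = b64.rstrip("=")
    return stripped.replace("+", "-").replace("/", "_") + str(len(b64) - len(stripped))


def derive_sso_key(id_and_key: str, token: str) -> str:
    salt = url_token_decode(token)
    # Assumption: the C# constructor used in the question means HMAC-SHA1
    # with the password string encoded as UTF-8.
    derived = hashlib.pbkdf2_hmac("sha1", id_and_key.encode("utf-8"), salt, 1000, 24)
    return url_token_encode(derived)


token = "MqsXexqpYRUNAHR_lHkPRic1g1BYhH6bFNVPagEkuaL8Mf80l_tOirhThQYIbfWYErgu4bDwl-7brVhXTWnJNQ2"
# Placeholder id and key, purely for illustration.
print(derive_sso_key("user@example.com" + "example-sso-key", token))
```

Since 24 bytes encode to 32 base64 characters without padding, the result always ends in "0".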
I've already burned through more time than I should have on this, but while it's not a complete answer, the few major problems that I found were:
The hashing algorithm is SHA1, not SHA256. [as @Evk already noted]
HttpServerUtility.UrlToken(De|En)code() use a url-safe variant of base64 that needs to be replicated.
function base64url_encode($bin) {
return str_replace(['+', '/', '='], ['-', '_', ''], base64_encode($bin));
}
function base64url_decode($str) {
return base64_decode(str_replace(['-', '_'], ['+', '/'], $str));
}
When you decode the token the result is a binary string, and trying to run that through mb_convert_encoding to change the endian-ness [I found that awful blog post too] won't do what you think. You can try the following, but the token has an odd number of bytes which is problematic no matter which way you look at it. [edit: is there just a bare \x0d carriage return at the end?]
function swapEndian16($in) {
$out = '';
foreach(str_split($in, 2) as $chunk) {
$out .= $chunk[1] . $chunk[0];
}
return $out;
}
The last argument to hash_pbkdf2() should be true, otherwise you're getting a hex-encoded hash rather than the raw bytes.
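The effect of that last point can be seen by decoding the value the PHP code in the question printed: it is the base64 of hex text, not of raw key bytes. The Python sketch below shows this, plus the difference between encoding raw bytes and encoding their hex representation; the password and salt in the second half are arbitrary placeholders.

```python
import base64
import hashlib

# Output of the original PHP code from the question.
php_output = "NWZiMTBhZmNhNTlmYzMxMTEzMThhZmVl"
print(base64.b64decode(php_output))  # b'5fb10afca59fc3111318afee'

# Placeholder inputs, only to contrast the two encodings.
derived = hashlib.pbkdf2_hmac("sha1", b"password", b"saltsalt", 1000, 24)
from_raw = base64.b64encode(derived).decode("ascii")
# With hex output and length 24, PHP returns 24 hex digits (12 key bytes).
from_hex_text = base64.b64encode(derived[:12].hex().encode("ascii")).decode("ascii")
print(from_raw)
print(from_hex_text)
```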
Really what I'd suggest is asking your vendor if they have any insight on accomplishing this. Chances are that someone's already had and solved this problem with their integrations.
Edit: With the new info from @Evk's answer, here are some sassily-named functions for compatibility with C#'s brilliant base64 URL encoding:
function dumb_base64url_encode($bin) {
return preg_replace_callback(
'/(=*)$/',
function($matches){
return strlen($matches[0]);
},
str_replace(
['+', '/'],
['-', '_'],
base64_encode($bin)
),
1
);
}
function dumb_base64url_decode($str) {
return base64_decode(
str_replace(
['-', '_'],
['+', '/'],
substr($str, 0, -1)
)
);
}
So now, with the un-"corrected" token:
$token = "MqsXexqpYRUNAHR_lHkPRic1g1BYhH6bFNVPagEkuaL8Mf80l_tOirhThQYIbfWYErgu4bDwl-7brVhXTWnJNQ2";
$id = "bob@company.com";
$ssokey = "7MpszrQpO95p7H";
$idAndKey = $id . $ssokey;
$salt = dumb_base64url_decode($token);
$hashed = hash_pbkdf2("sha1", $idAndKey, $salt, 1000, 24, true);
$data = dumb_base64url_encode($hashed);
echo $data; // output: aE1k9-djZ66WbUATqdHbWyJzskMI5ABS0
And don't sweat whose answer to mark correct, I think @Evk's got the most important bits sorted.
I'm trying to consume an API and for that purpose I have to create a signature using SHA384. The docs describe doing:
signature = hex(HMAC_SHA384(base64(payload), key=api_secret))
They give an example:
~$ base64 << EOF
> {
> "request": "/v1/order/status",
> "nonce": 123456,
>
> "order_id": 18834
> }
> EOF
ewogICAgInJlcXVlc3QiOiAiL3YxL29yZGVyL3N0YXR1cyIsCiAgICAibm9uY2UiOiAxMjM0NTYs
CgogICAgIm9yZGVyX2lkIjogMTg4MzQKfQo=
In this example, the api_secret is 1234abcd
echo -n 'ewogICAgInJlcXVlc3QiOiAiL3YxL29yZGVyL3N0YXR1cyIsCiAgICAibm9uY2UiOiAxMjM0NTYsCgogICAgIm9yZGVyX2lkIjogMTg4MzQKfQo=' | openssl sha384 -hmac "1234abcd"
(stdin)= 337cc8b4ea692cfe65b4a85fcc9f042b2e3f702ac956fd098d600ab15705775017beae402be773ceee10719ff70d710f
It took a little while, but I realized that in order to replicate the base64 of the original string I had to replace "\r\n" with "\n".
Here's what I've got (ignoring the formatting that I wasted 20 minutes trying to make good):
var raw = @"{
""request"": ""/v1/order/status"",
""nonce"": 123456,
""order_id"": 18834
}
";
var data = raw.Replace("\r\n", "\n");
Console.WriteLine(data);
var data64 = Convert.ToBase64String(Encoding.UTF8.GetBytes(data.ToCharArray()));
if (data64 != "ewogICAgInJlcXVlc3QiOiAiL3YxL29yZGVyL3N0YXR1cyIsCiAgICAibm9uY2UiOiAxMjM0NTYsCgogICAgIm9yZGVyX2lkIjogMTg4MzQKfQo=")
{
Console.WriteLine("base64's don't match");
}
Console.WriteLine("ewogICAgInJlcXVlc3QiOiAiL3YxL29yZGVyL3N0YXR1cyIsCiAgICAibm9uY2UiOiAxMjM0NTYsCgogICAgIm9yZGVyX2lkIjogMTg4MzQKfQo=");
Console.WriteLine(data64);
var key = Encoding.UTF8.GetBytes("1234abcd");
using (var hash = new HMACSHA384(key))
{
var hash64 = Convert.ToBase64String(hash.ComputeHash(Encoding.UTF8.GetBytes(data64)));
StringBuilder sb = new StringBuilder();
foreach (char c in hash64)
{
sb.Append(Convert.ToInt32(c).ToString("x"));
}
Console.WriteLine(sb.ToString());
// yields:
// 4d337a49744f70704c50356c744b68667a4a38454b79342f6343724a5676304a6a57414b7356634664314158767135414b2b647a7a753451635a2f3344584550
// should be:
// 337cc8b4ea692cfe65b4a85fcc9f042b2e3f702ac956fd098d600ab15705775017beae402be773ceee10719ff70d710f
}
My code's output doesn't match the documentation's expected output. Can someone see what I'm doing wrong?
For some reason you are converting the hash to a base64 string, then converting each character of that string to an int, and then converting that to hex. None of that is needed, and none of it is described in the documentation. Instead, do this:
var hashBin = hash.ComputeHash(Encoding.UTF8.GetBytes(data64));
var hashHex = BitConverter.ToString(hashBin).Replace("-", "").ToLowerInvariant();
Console.WriteLine(hashHex);
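For cross-checking, the same signature formula can be sketched in Python: HMAC-SHA384 over the base64 payload text, with the hex digest taken directly from the raw MAC bytes. The payload and secret are the ones quoted in the question; the comparison at the end simply prints whether the result equals the documented value.

```python
import hashlib
import hmac

payload_b64 = (
    "ewogICAgInJlcXVlc3QiOiAiL3YxL29yZGVyL3N0YXR1cyIsCiAgICAibm9uY2UiOiAxMjM0NTYs"
    "CgogICAgIm9yZGVyX2lkIjogMTg4MzQKfQo="
)
api_secret = b"1234abcd"

# hex(HMAC_SHA384(base64(payload), key=api_secret))
signature = hmac.new(api_secret, payload_b64.encode("ascii"), hashlib.sha384).hexdigest()
print(signature)

documented = (
    "337cc8b4ea692cfe65b4a85fcc9f042b2e3f702ac956fd098d600ab15705775017beae402be773ceee10719ff70d710f"
)
print(signature == documented)
```

A SHA-384 digest is 48 bytes, so the hex form is always 96 characters; the 128-character value in the question is a sign of the extra base64 step.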
I have been trying to write a C# equivalent of a PHP API call. I kept getting an "invalid hash" error message, so I decided to break the code into parts and compare the output of each part in PHP and C#.
Below is what I got.
The PHP code for ref and its output:
$ref = time().mt_rand(0,9999999);
----Output at the time it was tested----
14909496966594256
My C# code for ref is as follows:
string refl = (DateTime.UtcNow .Subtract( new DateTime(1970, 1, 1, 0, 0, 0, 0))).TotalSeconds + rnd.Next(0, 9999999).ToString();
----Output at the time it was tested----
1490602845.686821282389
The PHP hash output with the following variables is:
$task = 'pay';
$merchant_email_on_voguepay = 'merchant@example.com';
$ref = '14909496966594256';
$command_api_token = '9ufkS6FJffGplu9t7uq6XPPVQXBpHbaN';
$hash = hash('sha512',$command_api_token.$task.$merchant_email_on_voguepay.$ref);
----Output----
1cee97da4c0b742b6d5cdc463914fe07c04c6aff8d999fb7ce7aaa05076ea17577752ecf8947e5b98e7305ef09e0de2fed73e4817d906d6b123e54c1f9b15e74
Then the C# output using the same variables and the same ref value from PHP:
const string task = "Pay";
const string command_api_token = "9ufkS6FJffGplu9t7uq6XPPVQXBpHbaN";
const string merchant_email_on_voguepay = "merchant@example.com";
Random rnd = new Random();
string refl = "14909496966594256";
string hash_target = (command_api_token + task + merchant_email_on_voguepay + refl);
SHA512 sha512 = new System.Security.Cryptography.SHA512Managed();
var bytes = UTF8Encoding.UTF8.GetBytes(hash_target);
string cryString = BitConverter.ToString(sha512.ComputeHash(bytes));
string hashD = (cryString).Replace("-", string.Empty).ToLower();
----Output----
551b057b64f49fc6bd7d428a8e3c36ddaab5e468fd5f9042ad5d4a4fa50349e5312ad2097b4e46d1e74a5a3f4e843848352edb0ea7073dd1cd53b1c4c14ab286
Here I discovered that the output of my C# code is different from that of the PHP code. What could be the problem with my code? I would like to get the same output as PHP using the same variables.
Any good idea to resolve this is welcome.
In your C# code, the value of task is "Pay".
In the PHP code it's "pay".
Different input values will, naturally, not hash the same.
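A small Python sketch makes the point concrete: the hash is taken over the exact concatenated bytes, so a single changed letter gives an unrelated digest. It also shows one way to build a digits-only ref like the PHP `time().mt_rand(0,9999999)` expression, by truncating the timestamp to whole seconds. The function names and input values are placeholders for illustration.

```python
import hashlib
import random
import time


def request_hash(token: str, task: str, email: str, ref: str) -> str:
    # Same shape as the PHP call: sha512 over the plain concatenation.
    return hashlib.sha512((token + task + email + ref).encode("utf-8")).hexdigest()


def make_ref() -> str:
    # Whole seconds since the Unix epoch followed by a random number,
    # so the result contains digits only (no fractional part).
    return str(int(time.time())) + str(random.randint(0, 9999999))


lower = request_hash("TOKEN", "pay", "merchant@example.com", "123")
upper = request_hash("TOKEN", "Pay", "merchant@example.com", "123")
print(lower == upper)  # False
print(make_ref())
```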
I'm having a hard time trying to figure this out. I'm writing a Unit test that verifies that the MD5 that a site displays matches the actual MD5 of the file. I do this by simply grabbing what the page displays and then calculating my own MD5 of the file. I get the text on the page by using Selenium WebDriver.
As expected, the strings show up as the same... or at least they appear to.
When I try to test the two strings using Assert.AreEqual or Assert.IsTrue, it fails no matter how I try to compare them.
I've tried the following ways:
Assert.AreEqual(md5, md5Text); //Fails
Assert.IsTrue(md5 == md5Text); //Fails
Assert.IsTrue(String.Equals(md5, md5Text)); //Fails
Assert.IsTrue(md5.Normalize() == md5Text.Normalize()); //Fails
Assert.AreEqual(md5.Normalize(), md5Text.Normalize()); //Fails
At first, I thought the strings were actually different, but looking at them in the debugger shows that both strings are exactly the same.
So I tried looking at their lengths, and that's when I saw why.
The strings have different lengths, so I tried to substring the md5 variable to match the size of the md5Text variable. My thinking was that maybe md5 had a bunch of zero-width characters at the end. However, doing this cut off the last half of md5.
So this must mean they're in different encodings, correct? But wouldn't Normalize() fix that?
This is how the variable md5 is created
string md5;
using (var stream = file.Open()) //file is a custom class with an Open() method that returns a Stream
{
using (var generator = MD5.Create())
{
md5 = BitConverter.ToString(generator.ComputeHash(stream)).Replace("-", "").ToLower().Trim();
}
}
and this is how the md5Text variable is created
//I'm using Selenium WebDriver to grab the text from the page
var md5Element = row.FindElements(By.XPath("//span[@data-bind='text: MD5Hash']")).Where(e => e.Visible()).First();
var md5Text = md5Element.Text;
How can I make this test pass? It should be passing, since the strings are the same.
UPDATE:
The comments suggested I turn the strings into a char[] and iterate over it. Here are the results of that (http://pastebin.com/DX335wU8) and the code I added to do it
char[] md5Characters = md5.ToCharArray();
char[] md5TextCharacters = md5Text.ToCharArray();
//Use md5 length since it's bigger
for (int i = 0; i < md5Characters.Length; i++)
{
System.Diagnostics.Debug.Write("md5: " + md5Characters[i]);
if (i >= md5TextCharacters.Length)
{
System.Diagnostics.Debug.Write(" | Exhausted md5Text characters..");
}
else
{
System.Diagnostics.Debug.Write(" | md5Text: " + md5TextCharacters[i]);
}
System.Diagnostics.Debug.WriteLine("");
}
One thing I found interesting is that the md5 char array has a bunch of random characters inside of it every 2 letters
.Replace("-", "")
Your "" is not empty: between the quotes there is actually a Unicode zero-width non-joiner followed by a zero-width space, so instead of replacing "-" with an empty string you are inserting additional characters.
Delete and retype "" or use String.Empty.
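A quick way to spot this kind of problem is to print the code points of both strings. The Python sketch below reproduces the situation with a made-up hash fragment: the polluted string looks identical when displayed, Unicode normalization leaves the invisible characters in place, and filtering out format characters (category Cf) removes them. The helper names are hypothetical.

```python
import unicodedata


def show_code_points(s: str) -> list:
    return [f"U+{ord(ch):04X}" for ch in s]


def strip_format_chars(s: str) -> str:
    # Zero-width non-joiner (U+200C) and zero-width space (U+200B)
    # both have the Unicode general category "Cf".
    return "".join(ch for ch in s if unicodedata.category(ch) != "Cf")


clean = "d41d8cd9"
polluted = "d4\u200c\u200b1d\u200c\u200b8c\u200c\u200bd9"

print(len(clean), len(polluted))
print(show_code_points(polluted))
print(unicodedata.normalize("NFC", polluted) == clean)
print(strip_format_chars(polluted) == clean)
```

Stripping the characters is only a diagnostic aid here; the actual fix is still to correct the replacement string in the source.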