Decoding Decimal64 data - c#

I'm looking for a simple way to decode data stored in the Decimal64 format (described here: http://en.wikipedia.org/wiki/Decimal64_floating-point_format) using C#.
Any thoughts?

I looked everywhere for it; in the end we implemented it ourselves.
Update
Some of you have asked for the code, so here it is. We call it Float Decimal. I think it matches what Decimal64 does, but no guarantees, so please check for yourself.
Also note that the value of _size should be 8.
if (bytes[0] == 0) return 0;
var s = "";
for (var i = 1; i < bytes.Length; i++)
s += bytes[i].ToString("X").PadLeft(2, '0');
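// Parse with the invariant culture so "." is always read as the decimal separator.
return decimal.Parse("." + s.TrimEnd('0'), CultureInfo.InvariantCulture) * (decimal)Math.Pow(10 , ((bytes[0] & ~128) - 64)) * ((bytes[0] & 128) > 0 ? -1 : 1);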
To save:
if (value != 0)
{
var negative = value < 0;
var s = value.ToDecimal().ToString(CultureInfo.InvariantCulture).TrimStart('-', '0');
var i = s.IndexOf('.');
if (i >= 0)
{
s = s.Remove(i, 1);
if (i == 0)
{
i = s.Length;
s = s.TrimStart('0');
i = s.Length-i;
}
}
else i = s.Length;
bytes[0] = (byte)(64 + i + (negative ? 128 : 0));
s = s.PadRight((_size - 1) * 2, '0');
for (var j = 1; j < _size && (j - 1) * 2 < s.Length; j++)
bytes[j] = byte.Parse(s.Substring((j - 1) * 2, 2), System.Globalization.NumberStyles.HexNumber);
}
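As a sanity check on the format used above, here is a minimal, self-contained sketch of the decoder wrapped in a method. The names FloatDecimalSketch and Decode are made up for illustration. It assumes the layout that the snippets above imply: byte 0 carries the sign in its top bit and an excess-64 power-of-ten exponent in the low seven bits, and each remaining byte holds two packed decimal digits. That layout is inferred from the code, not taken from the Decimal64 specification, so treat it as a sketch only.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class FloatDecimalSketch
{
    // Decodes the "Float Decimal" layout assumed above (hypothetical helper).
    public static decimal Decode(byte[] bytes)
    {
        if (bytes[0] == 0) return 0m;

        // Each mantissa byte contributes two decimal digits, written as hex text.
        var digits = new StringBuilder();
        for (var i = 1; i < bytes.Length; i++)
            digits.Append(bytes[i].ToString("X2"));

        var mantissaText = digits.ToString().TrimEnd('0');
        if (mantissaText.Length == 0) return 0m;

        // The digits form a fraction 0.d1d2d3...; parse it culture-independently.
        var mantissa = decimal.Parse("0." + mantissaText, CultureInfo.InvariantCulture);

        var exponent = (bytes[0] & 0x7F) - 64;   // excess-64 exponent
        var negative = (bytes[0] & 0x80) != 0;   // top bit is the sign

        // Scale by a whole power of ten; divide for negative exponents so the
        // double-to-decimal conversion only ever sees integral values.
        var scale = (decimal)Math.Pow(10, Math.Abs(exponent));
        var value = exponent >= 0 ? mantissa * scale : mantissa / scale;
        return negative ? -value : value;
    }
}
```

For example, the bytes 43 12 34 50 00 00 00 00 (hex) decode to 123.45: the exponent is 0x43 - 64 = 3 and the digits form 0.12345. This matches what the save code above would write for 123.45.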

Read a bit too fast there.
I think you'd want to have a look at the BitConverter.ToSingle method in C# but reverse the order of the bytes to get a correct result. :)
B.R
Jaggernauten

Related

inaccurate results with function to add an array of digits together

So I have this function:
static int[] AddArrays(int[] a, int[] b)
{
int length1 = a.Length;
int length2 = b.Length;
int carry = 0;
int max_length = Math.Max(length1, length2) + 1;
int[] minimum_arr = new int[max_length - length1].Concat(a).ToArray();
int[] maximum_arr = new int[max_length - length2].Concat(b).ToArray();
int[] new_arr = new int[max_length];
for (int i = max_length - 1; i >= 0; i--)
{
int first_digit = maximum_arr[i];
int second_digit = i - (max_length - minimum_arr.Length) >= 0 ? minimum_arr[i - (max_length - minimum_arr.Length)] : 0;
if (second_digit + first_digit + carry > 9)
{
new_arr[i] = (second_digit + first_digit + carry) % 10;
carry = 1;
}
else
{
new_arr[i] = second_digit + first_digit + carry;
carry = 0;
}
}
if (carry == 1)
{
int[] result = new int[max_length + 1];
result[0] = 1;
Array.Copy(new_arr, 0, result, 1, max_length);
return result;
}
else
{
return new_arr;
}
}
It basically takes two arrays of digits and adds them together. The point is that each array of digits represents a number that is bigger than the integer limits. The function is close to working, but the results get inaccurate at certain places and I honestly have no idea why. For example, if the function is given these inputs:
"1481298410984109284109481491284901249018490849081048914820948019" and
"3475893498573573849739857349873498739487598" (both of these are being turned into a array of integers before being sent to the function)
the expected output is:
1,481,298,410,984,109,284,112,957,384,783,474,822,868,230,706,430,922,413,560,435,617
and what i get is:
1,481,298,410,984,109,284,457,070,841,142,258,634,158,894,233,092,241,356,043,561,7
I would very much appreciate some help with this. I've been trying to figure it out for hours and I can't seem to get it to work perfectly.

I suggest reversing arrays a and b and using the good old school algorithm:
static int[] AddArrays(int[] a, int[] b) {
Array.Reverse(a);
Array.Reverse(b);
int[] result = new int[Math.Max(a.Length, b.Length) + 1];
int carry = 0;
int value = 0;
for (int i = 0; i < Math.Max(a.Length, b.Length); ++i) {
value = (i < a.Length ? a[i] : 0) + (i < b.Length ? b[i] : 0) + carry;
result[i] = value % 10;
carry = value / 10;
}
if (carry > 0)
result[result.Length - 1] = carry;
else
Array.Resize(ref result, result.Length - 1);
// Let's restore a and b
Array.Reverse(a);
Array.Reverse(b);
Array.Reverse(result);
return result;
}
Demo:
string a = "1481298410984109284109481491284901249018490849081048914820948019";
string b = "3475893498573573849739857349873498739487598";
string c = string.Concat(AddArrays(
a.Select(d => d - '0').ToArray(),
b.Select(d => d - '0').ToArray()));
Console.Write(c);
Output:
1481298410984109284112957384783474822868230706430922413560435617
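If it helps to double-check a digit-array routine like this, one option is to compare it against System.Numerics.BigInteger. The sketch below is only an illustration under that assumption: DigitAdditionCheck, AddDigits and AddWithBigInteger are hypothetical names, and AddDigits is a variant of the same schoolbook addition that walks both arrays from the right instead of reversing them, so the inputs are never modified.

```csharp
using System;
using System.Linq;
using System.Numerics;

public static class DigitAdditionCheck
{
    // Schoolbook addition of two digit arrays (most significant digit first).
    // The inputs are read from the right and left untouched.
    public static int[] AddDigits(int[] a, int[] b)
    {
        var result = new int[Math.Max(a.Length, b.Length) + 1];
        int carry = 0;
        for (int k = 0; k < result.Length; k++)
        {
            int ia = a.Length - 1 - k;   // index into a, counted from the right
            int ib = b.Length - 1 - k;   // index into b, counted from the right
            int sum = (ia >= 0 ? a[ia] : 0) + (ib >= 0 ? b[ib] : 0) + carry;
            result[result.Length - 1 - k] = sum % 10;
            carry = sum / 10;
        }
        // Drop the spare leading slot when the final carry was zero.
        return result[0] == 0 && result.Length > 1 ? result.Skip(1).ToArray() : result;
    }

    // Reference result computed with BigInteger, used only for comparison.
    public static string AddWithBigInteger(string a, string b) =>
        (BigInteger.Parse(a) + BigInteger.Parse(b)).ToString();
}
```

Converting the digit array back to text with string.Concat(digits.Select(d => d.ToString())) and comparing it with AddWithBigInteger on the same inputs gives a quick regression check.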

How to print a string in diamond format in c#

If I have the word "start", I want to print it like this using a for loop:
  a
 tar
start
 tar
  a
How can I print this in C# when the input is a string of odd length taken from the user, e.g. "START" or "QUESTIONS"?
Here is my code:
string input;
for (int i = 1; i <= input.Length; i++)
{
for (int j = 0; j < (input.Length - 2); j++)
Console.Write(" ");
for (int j = number; j < (number - 1); j--)
{
Console.Write(input[j]);
}
for (int k = number; k < i && k > 0; k++)
Console.Write(input[k]);
Console.WriteLine();
}

I doubt if this Linq routine will be accepted as a homework solution, however it could be useful for you for testing your own code:
String source = "start";
String result = String.Join(Environment.NewLine, Enumerable
.Range(0, source.Length)
.Select(index => source.Length - Math.Abs(index - source.Length / 2) * 2)
.Where(length => length > 0) // for even size words, e.g. "star"
.Select(length => source
.Substring((source.Length - length) / 2, length)
.PadLeft((source.Length - length) / 2 + length, ' ')));
// Test
//   a
//  tar
// start
//  tar
//   a
Console.Write(result);

If you really need a for-loop solution, you can do this
string input = "questions"; //for example
if (input.Length % 2 == 0)
return; //as per given condition, only ODD length strings
var isReducing = false;
for (int i = 0, len = 1, startIndex = (input.Length - 1) / 2; i < input.Length; i++)
{
var str = input.Substring(startIndex, len);
Console.WriteLine(str.PadLeft(len + startIndex, ' '));
if (len == input.Length)
isReducing = true;
startIndex = isReducing ? startIndex + 1 : startIndex - 1;
len = isReducing ? len - 2 : len + 2;
}

How about 2 x for-loop :)
string s = "0123456";
int l = s.Length;
int c = l / 2 + 1; //center
for (int i = 0; i < c; i++)
Console.WriteLine(s.Substring(c - i - 1, i * 2 + 1).PadLeft(c + i, ' '));
for (int i = c - 2; i >= 0; i--)
Console.WriteLine(s.Substring(c - i - 1, i * 2 + 1).PadLeft(c + i, ' '));

fuzzy matching word on OCR page

I have a static phrase that I am searching an OCR'd image for.
string KeywordToFind = "Account Number";
string OcrPageText = "
GEORGIA
POWER
A SOUTHERN COMPANY
AecountNumber
122- 493
Pagel of2
Please Pay By
Jan 29,2014
Total Due
39.11
"
How can I find the word "AecountNumber" using my keyword "Account Number"?
I have tried using variations of the Levenshtein Distance Algorithm HERE with varied success. I've also tried regexes, but the OCR often converts the text differently, thus rendering the regex useless.
Suggestions? I can provide more code if the link doesn't give enough information. Also, Thanks!

Why not try something mostly arbitrary, like this -- while it would certainly match a lot more than just account number, the chances of the start and end characters existing elsewhere in that order is pretty slim.
A.?c.?.?nt ?N.?[mn]b.?r
http://regex101.com/r/zV1yM2
It'll match things like:
Account Number
AccntNumbr
Aecnt Nunber
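To try that pattern from C#, something like the following sketch could be used. OcrKeywordSketch and FindKeyword are hypothetical names introduced here; only the pattern itself comes from the answer above. Because `.` does not match a newline by default, a match cannot span two OCR lines.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OcrKeywordSketch
{
    // Loose pattern from the answer above; optional characters absorb OCR noise.
    private static readonly Regex AccountNumber =
        new Regex(@"A.?c.?.?nt ?N.?[mn]b.?r");

    // Returns the first matching text, or null when nothing matches.
    public static string FindKeyword(string ocrText)
    {
        Match match = AccountNumber.Match(ocrText);
        return match.Success ? match.Value : null;
    }
}
```

The trade-off is the one already mentioned: the looser the pattern, the more unrelated text it can match, so it is worth testing against real OCR output.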

I answered my own question using substrings. Posting in case others run into the same type of problem. A little unorthodox, but it works great for me.
int LowestLevenshteinNumber = 999999; //initialize insanely high maximum
decimal PossibleStringLength = (PossibleString.Length); //Length of string to search
decimal StaticTextLength = (StaticText.Length); //Length of text to search for
int TextLengthBuffer = (int)StaticTextLength - 1; //start looking for correct result with one less character than it should have (declared after StaticTextLength, which it depends on).
decimal NumberOfErrorsAllowed = Math.Round((StaticTextLength * (ErrorAllowance / 100)), MidpointRounding.AwayFromZero); //Find number of errors allowed with given ErrorAllowance percentage
//Look for best match with 1 less character than it should have, then the correct amount of characters.
//And last, with 1 more character. (This is because one letter can be recognized as
//two (W -> VV) and vice versa)
for (int i = 0; i < 3; i++)
{
for (int e = TextLengthBuffer; e <= (int)PossibleStringLength; e++)
{
string possibleResult = (PossibleString.Substring((e - TextLengthBuffer), TextLengthBuffer));
int lAllowance = (int)(Math.Round((possibleResult.Length - StaticTextLength) + (NumberOfErrorsAllowed), MidpointRounding.AwayFromZero));
int lNumber = LevenshteinAlgorithm(StaticText, possibleResult);
if (lNumber <= lAllowance && ((lNumber < LowestLevenshteinNumber) || (TextLengthBuffer == StaticText.Length && lNumber <= LowestLevenshteinNumber)))
{
PossibleResult = (new StaticTextResult { text = possibleResult, errors = lNumber });
LowestLevenshteinNumber = lNumber;
}
}
TextLengthBuffer++;
}
public static int LevenshteinAlgorithm(string s, string t) // Levenshtein Algorithm
{
int n = s.Length;
int m = t.Length;
int[,] d = new int[n + 1, m + 1];
if (n == 0)
{
return m;
}
if (m == 0)
{
return n;
}
for (int i = 0; i <= n; d[i, 0] = i++)
{
}
for (int j = 0; j <= m; d[0, j] = j++)
{
}
for (int i = 1; i <= n; i++)
{
for (int j = 1; j <= m; j++)
{
int cost = (t[j - 1] == s[i - 1]) ? 0 : 1;
d[i, j] = Math.Min(
Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
d[i - 1, j - 1] + cost);
}
}
return d[n, m];
}

Damerau–Levenshtein distance algorithm, disable counting of delete

How can I disable counting of deletions in this implementation of the Damerau-Levenshtein distance algorithm? If another algorithm already does this, please point me to it.
Example (deletion counting disabled):
string1: how are you?
string2: how oyu?
distance: 1 (for the transposition; the 4 deletions don't count)
And here is the algorithm:
public static int DamerauLevenshteinDistance(string string1, string string2, int threshold)
{
// Return trivial case - where they are equal
if (string1.Equals(string2))
return 0;
// Return trivial case - where one is empty
if (String.IsNullOrEmpty(string1) || String.IsNullOrEmpty(string2))
return (string1 ?? "").Length + (string2 ?? "").Length;
// Ensure string2 (inner cycle) is longer_transpositionRow
if (string1.Length > string2.Length)
{
var tmp = string1;
string1 = string2;
string2 = tmp;
}
// Return trivial case - where string1 is contained within string2
if (string2.Contains(string1))
return string2.Length - string1.Length;
var length1 = string1.Length;
var length2 = string2.Length;
var d = new int[length1 + 1, length2 + 1];
for (var i = 0; i <= d.GetUpperBound(0); i++)
d[i, 0] = i;
for (var i = 0; i <= d.GetUpperBound(1); i++)
d[0, i] = i;
for (var i = 1; i <= d.GetUpperBound(0); i++)
{
var im1 = i - 1;
var im2 = i - 2;
var minDistance = threshold;
for (var j = 1; j <= d.GetUpperBound(1); j++)
{
var jm1 = j - 1;
var jm2 = j - 2;
var cost = string1[im1] == string2[jm1] ? 0 : 1;
var del = d[im1, j] + 1;
var ins = d[i, jm1] + 1;
var sub = d[im1, jm1] + cost;
//Math.Min is slower than native code
//d[i, j] = Math.Min(del, Math.Min(ins, sub));
d[i, j] = del <= ins && del <= sub ? del : ins <= sub ? ins : sub;
if (i > 1 && j > 1 && string1[im1] == string2[jm2] && string1[im2] == string2[jm1])
d[i, j] = Math.Min(d[i, j], d[im2, jm2] + cost);
if (d[i, j] < minDistance)
minDistance = d[i, j];
}
if (minDistance > threshold)
return int.MaxValue;
}
return d[d.GetUpperBound(0), d.GetUpperBound(1)] > threshold
? int.MaxValue
: d[d.GetUpperBound(0), d.GetUpperBound(1)];
}

public static int DamerauLevenshteinDistance( string string1
, string string2
, int threshold)
{
// Return trivial case - where they are equal
if (string1.Equals(string2))
return 0;
// Return trivial case - where one is empty
// WRONG FOR YOUR NEEDS:
// if (String.IsNullOrEmpty(string1) || String.IsNullOrEmpty(string2))
// return (string1 ?? "").Length + (string2 ?? "").Length;
//DO IT THIS WAY:
if (String.IsNullOrEmpty(string1))
// First string is empty, so every character of
// String2 has been inserted:
return (string2 ?? "").Length;
if (String.IsNullOrEmpty(string2))
// Second string is empty, so every character of string1
// has been deleted, but you don't count deletions:
return 0;
// DO NOT SWAP THE STRINGS IF YOU WANT TO DEAL WITH INSERTIONS
// IN A DIFFERENT MANNER THAN WITH DELETIONS:
// THE FOLLOWING IS WRONG FOR YOUR NEEDS:
// // Ensure string2 (inner cycle) is longer_transpositionRow
// if (string1.Length > string2.Length)
// {
// var tmp = string1;
// string1 = string2;
// string2 = tmp;
// }
// Return trivial case - where string1 is contained within string2
if (string2.Contains(string1))
//all changes are insertions
return string2.Length - string1.Length;
// REVERSE CASE: STRING2 IS CONTAINED WITHIN STRING1
if (string1.Contains(string2))
//all changes are deletions which you don't count:
return 0;
var length1 = string1.Length;
var length2 = string2.Length;
// PAY ATTENTION TO THIS CHANGE!
// length1+1 rows is way too much! You need only 3 rows (0, 1 and 2)
// read my explanation below the code!
// TOO MANY ROWS: var d = new int[length1 + 1, length2 + 1];
var d = new int[3, length2 + 1];
// THIS INITIALIZATION COUNTS DELETIONS. YOU DON'T WANT IT:
// for (var i = 0; i <= d.GetUpperBound(0); i++)
// d[i, 0] = i;
// But you must initiate the first element of each row with 0:
for (var i = 0; i <= 2; i++)
d[i, 0] = 0;
// This initialization counts insertions. You need it, but for
// better consistency of code I call the variable j (not i):
for (var j = 0; j <= d.GetUpperBound(1); j++)
d[0, j] = j;
// Now do the job:
// for (var i = 1; i <= d.GetUpperBound(0); i++)
for (var i = 1; i <= length1; i++)
{
// Here in this for-loop: add "% 3" to every term
// that is used as first index of d!
var im1 = i - 1;
var im2 = i - 2;
var minDistance = threshold;
for (var j = 1; j <= d.GetUpperBound(1); j++)
{
var jm1 = j - 1;
var jm2 = j - 2;
var cost = string1[im1] == string2[jm1] ? 0 : 1;
// DON'T COUNT DELETIONS: a deletion is still a possible step, it just costs 0
// (original: var del = d[im1, j] + 1;)
var del = d[im1 % 3, j];
var ins = d[i % 3, jm1] + 1;
var sub = d[im1 % 3, jm1] + cost;
// Math.Min is slower than native code
// d[i, j] = Math.Min(del, Math.Min(ins, sub));
d[i % 3, j] = del <= ins && del <= sub ? del : ins <= sub ? ins : sub;
if (i > 1 && j > 1 && string1[im1] == string2[jm2] && string1[im2] == string2[jm1])
d[i % 3, j] = Math.Min(d[i % 3, j], d[im2 % 3, jm2] + cost);
if (d[i % 3, j] < minDistance)
minDistance = d[i % 3, j];
}
if (minDistance > threshold)
return int.MaxValue;
}
return d[length1 % 3, d.GetUpperBound(1)] > threshold
? int.MaxValue
: d[length1 % 3, d.GetUpperBound(1)];
}
Here is my explanation of why you need only 3 rows.
Look at this line:
var d = new int[length1 + 1, length2 + 1];
If one string has length n and the other has length m, then your code needs space for (n+1)*(m+1) integers. Each integer needs 4 bytes. This is a waste of memory if your strings are long. If both strings are 35,000 characters long, you will need more than 4 GB of memory!
In this code you calculate and write a new value for d[i,j]. To do this, you read values from its left neighbor (d[i,jm1]), from its upper neighbor (d[im1,j]), from its upper-left neighbor (d[im1,jm1]) and finally from its double-upper-double-left neighbor (d[im2,jm2]). So you only need values from the current row and the 2 rows before it.
You never need values from any other row, so why store them? Three rows are enough, and my changes make sure that you can work with these 3 rows without reading a wrong value at any time.
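The memory arithmetic can be checked with a small sketch. MatrixMemorySketch and its two methods are hypothetical helpers written for this illustration, assuming 4-byte ints and ignoring array object overhead.

```csharp
using System;

public static class MatrixMemorySketch
{
    // Bytes needed for the full (n+1) x (m+1) int matrix.
    public static long FullMatrixBytes(int n, int m) =>
        (long)(n + 1) * (m + 1) * sizeof(int);

    // Bytes needed when only three rows of length m+1 are kept.
    public static long RollingRowsBytes(int m) =>
        3L * (m + 1) * sizeof(int);
}
```

For n = m = 35,000 the full matrix needs 4,900,280,004 bytes, while three rows need 420,012 bytes.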

I would advise not rewriting this specific algorithm to handle specific cases of "free" edits. Many of them radically simplify the concept of the problem to the point where the metric will not convey any useful information.
For example, when substitution is free the distance between all strings is the difference between their lengths. Simply transmute the smaller string into the prefix of the larger string and add the needed letters. (You can guarantee that there is no smaller distance because one insertion is required for each character of edit distance.)
When transposition is free the question reduces to determining the sum of differences of letter counts. (Since the distance between all anagrams is 0, sorting the letters in each string and exchanging out or removing the non-common elements of the larger string is the best strategy. The mathematical argument is similar to that of the previous example.)
In the case when insertion and deletion are both free, the edit distance between any two strings is zero. If only insertion OR deletion is free, this breaks the symmetry of the distance metric: with free deletions, the distance from a to aa is 1, while the distance from aa to a is 0. Depending on the application this could be desirable, but I'm not sure it's something you're interested in. You will need to greatly alter the presented algorithm, because it swaps the strings so that one is always the longer one, which is only valid for a symmetric metric.
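The asymmetry is easy to see with a small sketch. The code below is not the Damerau-Levenshtein routine from the question: it is a plain Levenshtein-style table without transpositions, written under the assumption that insertions and substitutions cost 1 and deletions from the source cost 0. FreeDeletionSketch and Distance are hypothetical names.

```csharp
using System;

public static class FreeDeletionSketch
{
    // Cost of turning source into target when insertions and substitutions
    // cost 1 and deletions from source are free (no transpositions).
    public static int Distance(string source, string target)
    {
        var d = new int[source.Length + 1, target.Length + 1];

        // Building a prefix of target from nothing takes j insertions.
        for (int j = 0; j <= target.Length; j++)
            d[0, j] = j;
        // Column 0 stays 0: deleting every source character costs nothing.

        for (int i = 1; i <= source.Length; i++)
        {
            for (int j = 1; j <= target.Length; j++)
            {
                int cost = source[i - 1] == target[j - 1] ? 0 : 1;
                int del = d[i - 1, j];            // free deletion
                int ins = d[i, j - 1] + 1;        // insertion
                int sub = d[i - 1, j - 1] + cost; // match or substitution
                d[i, j] = Math.Min(del, Math.Min(ins, sub));
            }
        }
        return d[source.Length, target.Length];
    }
}
```

With this sketch, Distance("a", "aa") is 1 (one insertion) and Distance("aa", "a") is 0 (one free deletion), so the order of the arguments matters.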

Try to change var del = d[im1, j] + 1; to var del = d[im1, j];, I think that solves your problem.

Modulus (%) in for-loop

I have some code here, and I would like it to display the first 10 values and then, when I press Enter, display the next batch. I tried this first with my earlier for-loop code and it worked; now that I'm working with arrays, it doesn't seem to be accepted.
The part I commented out doesn't work. Is it wrong?
Thanks
long [] potenzen = new long[32];
potenzen[0] = 1;
for (int i = 1; i < potenzen.Length; ++i)
{
potenzen[i] = potenzen[i-1] * 2;
//if (potenzen % 10 == 0)
// Console.ReadLine();
}
foreach (long elem in potenzen)
{
Console.WriteLine(" " + elem);
}

long [] potenzen = new long[32];
potenzen[0] = 1;
for (int i = 1; i < potenzen.Length; ++i)
{
potenzen[i]=potenzen[i-1]*2;
Console.WriteLine(potenzen[i-1]);
if (i % 10 == 0)
Console.ReadLine();
}
is more in line with what you want. An improvement would be to separate your data-manipulation logic from your data display logic.
long [] potenzen = new long[32];
potenzen[0] = 1;
for (int i = 1; i < potenzen.Length; ++i)
potenzen[i]=potenzen[i-1]*2;
for (int i = 0; i < potenzen.Length; ++i)
{
Console.WriteLine(potenzen[i]);
if (i % 10 == 0)
Console.ReadLine();
}
Of course, you could do this without an array
long potenzen = 1;
for (int i = 1; i < 32; ++i)
{
Console.WriteLine(potenzen);
potenzen = potenzen * 2;
if (i % 10 == 0)
Console.ReadLine();
}

You need:
if (i % 10 == 0)
and not:
if (potenzen % 10 == 0)

Applying the modulus operator to an array of longs is dubious.

potenzen is an array, so maybe try
if (i % 10 == 0)
or maybe
if (potenzen[i] % 10 == 0)

You're taking an array mod 10 -- at best, in an unsafe language, you'd be doing the modulo operation on a memory address.
This should work fine if you just change the line to:
// if you don't want to pause the first time you run it, replace with:
// if (i > 0 && i % 10 == 0) {
if (i % 10 == 0) {
Console.ReadLine();
}

Try changing it to:
long [] potenzen = new long[32];
potenzen[0] = 1;
Console.WriteLine(potenzen[0]);
for (int i = 1; i < potenzen.Length; ++i)
{
potenzen[i]=potenzen[i-1]*2;
Console.WriteLine(potenzen[i]);
if (i % 10 == 0)
{
var s = Console.ReadLine();
// break if s == some escape condition???
}
}
Right now, you never print anything until your first for loop has completely finished. My guess is that you're not letting all 32 elements complete, so you never see your results.
This will print them as they go.
