I am working with text readable files which are exported from a client's systems that use a custom XML-like structure. I need to be able to parse and extract data from large numbers of these files with no documentation on how they are structured.
I have mostly worked out the file structure, however I am struggling with how values have been encoded. I can manually look up in the system the correct values as a comparison. Some examples:
Export Data = System Value
D411E848 = 500000
D40F86A = 100000
D41086A = 200000
I'm fairly sure the leading "D" is a token to say the field is a decimal or double value. The reason is that all numeric fields start with "D" and all text fields start with "S". The following "4" may also be part of the field data type, as all numeric fields seem to start with "D4".
However converting from Hex to Decimal on any combination of the export data value does not yield the correct result.
Any ideas how to do the conversion?
Extra data mappings:
Value Export File
1 D3FF
2 D4
3 D4008
4 D401
5 D4014
6 D4018
7 D401C
8 D402
9 D4022
10 D4024
100 D4059
1000 D408F4
100000 D40F86A
500000 D411E848
500001 D411E8484
500002 D411E8488
500003 D411E848C
500004 D411E849
500005 D411E8494
500006 D411E8498
500007 D411E849C
500008 D411E84A
500009 D411E84A4
500010 D411E84A8
Seems like a normal, but truncated, IEEE 754 64-bit (double precision) number.
0x408F400000000000 = 1000
408F4 (truncated)
D408F4 (prefixed with D)
0x411E848000000000 = 500000
411E848 (truncated)
D411E848 (prefixed with D)
Try converting it with the following website as a reference: http://www.binaryconvert.com/result_double.html?decimal=053048048048048048
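As a rough sketch of that idea in C#, assuming the export always drops only trailing zero hex digits and that the leading "D" is just a type marker (the class and method names here are made up for illustration): strip the "D", pad the hex digits back out to 16, and reinterpret the 64 bits as a double.

```csharp
using System;
using System.Globalization;

public static class ExportDecoder
{
    // Decodes a field such as "D411E848".
    // Assumption: 'D' is a type marker and the rest is an IEEE 754 double
    // in hex with its trailing zero digits removed.
    public static double DecodeDouble(string field)
    {
        if (string.IsNullOrEmpty(field) || field[0] != 'D')
            throw new FormatException("Expected a field starting with 'D'.");

        // Restore the truncated trailing zeros (16 hex digits = 64 bits).
        string hex = field.Substring(1).PadRight(16, '0');
        if (hex.Length > 16)
            throw new FormatException("Too many hex digits for a 64-bit double.");

        long bits = long.Parse(hex, NumberStyles.AllowHexSpecifier, CultureInfo.InvariantCulture);
        return BitConverter.Int64BitsToDouble(bits);
    }

    public static void Main()
    {
        // Sample fields taken from the mapping table in the question.
        foreach (string field in new[] { "D3FF", "D4", "D408F4", "D40F86A", "D411E848", "D411E8484" })
            Console.WriteLine($"{field} -> {DecodeDouble(field)}");
    }
}
```

This also handles the D3FF case (the value 1), because it does not rely on the field starting with D4.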
I can see the pattern for values from 2 upward (the ones prefixed with D4). Here are the steps to get the decimal value from your custom format:
1. Skip D4 at the beginning of the string.
2. If the remaining string is shorter than 3 characters, pad it on the right with 0s until it is 3 characters long.
3. Take the first 2 characters and convert them using a HEX to DEC converter.
4. Add 1 to the number from step 3.
5. Take the rest of the input string, skipping the first 2 characters.
6. Convert the text from step 5 using a HEX to DEC converter.
7. Calculate POW(16, LEN(Y)), where Y is the text from step 5.
8. Calculate X / Y, where X is the number from step 6 and Y is the number from step 7.
9. Calculate the final result: POW(2, X) * (1 + Y), where X comes from step 4 and Y comes from step 8.
For example, D411E848 becomes 11E848: 0x11 + 1 = 18, 0xE848 / 16^4 = 59464 / 65536, and 2^18 * (1 + 59464 / 65536) = 500000.
It may look quite complicated, but it's actually quite simple.
I've created an Excel Web App spreadsheet with the results of all these steps for your sample inputs: http://sdrv.ms/1bO0wnz
I want to format a floating point number in C# so that the entire number has a fixed width (the Python equivalent is the format specifier 6.2f). I do NOT want it padded with 0s on the left; it should be padded with white space:
100.00
 90.45
  7.23
  0.00
What I have tried so far:
string.Format("{0:###.##}", 100);
string.Format("{0:###.##}", 90.45);
string.Format("{0:###.##}", 7.23);
string.Format("{0:###.##}", 0.00);
But the output is incorrect:
100
90.45
7.23
//nothing is printed
I have also gone through this but am unable to find a solution.
I am aware of the string.PadLeft method, but I am wondering if there is a more proper way than
string.Format("{0:0.00}", number).PadLeft(6, ' ')
EDIT
I am specifically asking whether there is a correct built-in method for this, not whether it can be done with some mathematical wizardry.
If you always want 2 digits after the decimal point, you can specify 00 in the format specifier. You also need a right-aligned field width, which is the number after the comma (I used 6 as the field width).
Try this:
void Main()
{
    Console.WriteLine(string.Format("{0,6:##0.00}", 100.0));
    Console.WriteLine(string.Format("{0,6:##0.00}", 90.45));
    Console.WriteLine(string.Format("{0,6:##0.00}", 7.23));
    Console.WriteLine(string.Format("{0,6:##0.00}", 0.00));
}
In LinqPad it outputs:
100.00
 90.45
  7.23
  0.00
In C# 6.0 and later you don't need to call string.Format directly. Instead, prefix your string with $ and put the variables inside {..} (string interpolation).
Example
double myfloat1 = 100.0;
double myfloat2 = 0.2;
Debug.WriteLine($"{myfloat1,8:0.00}"); // 100.00
Debug.WriteLine($"{myfloat2,8:0.00}"); // 0.20
The specifier ,8:0.00 right-aligns the value in a field 8 characters wide (space-padded) with 2 decimal places.
In my application, for Currency Analysis, I have to deal with numbers in 7,6 format: 7 digits before the decimal point and 6 digits after it. Example: 1234567.123456
I am getting the exchange rates from the user and sending them to the backend through the C# code. I use the following DataTable to store the rates and send them to the stored procedure (SP).
DataTable structureTable = new DataTable("CurrencyAnalysis");
structureTable.Columns.Add("CurrentYearRate", typeof(decimal));
structureTable.Columns.Add("PriorYearRate", typeof(decimal));
Now, the issue is that whenever I try to save a number with 6 digits after the decimal point, only the first two digits after the decimal point are saved.
i.e., if I save 1234567.123456, only 1234567.12 is saved.
It keeps only two decimal places. How can I set the decimal precision for that column so that it can take up to 6 digits to the right of the decimal point?
Application background:
C# Web application with HTML5 and AngularJS
SQL Server
Define the database value as decimal(18,6). The first value (18) is the precision and the second (6) is the scale, so (18,6) means 18 digits in total, 6 of them after the decimal point.
Define it in database as decimal(18,6)
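On SQL Server you have these data types to choose from:
real - 32 bit (about 7 digits)
float - 64 bit (about 15 digits)
decimal(p,s) - exact, up to 38 digits of precision
Check whether the column is declared as decimal(18,2); that would explain why only 2 decimal places are kept.
You can store your numbers as float too, but float is an approximate type, so the value read back may not be exactly what was entered.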
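A quick way to narrow this down is to check that the DataTable itself is not the part that truncates. The sketch below (class and method names are made up for illustration) builds the same table as in the question and reads the value back. A typeof(decimal) column stores a full System.Decimal, so if this check passes, the scale of 2 most likely comes from the SQL side: the column, the table type, or the SP parameter declaration. That last part is an assumption, since the SP is not shown.

```csharp
using System;
using System.Data;

public static class RateTableDemo
{
    // Builds the table from the question and adds a single row of rates.
    public static DataTable BuildTable(decimal currentYearRate, decimal priorYearRate)
    {
        var table = new DataTable("CurrencyAnalysis");
        table.Columns.Add("CurrentYearRate", typeof(decimal));
        table.Columns.Add("PriorYearRate", typeof(decimal));
        table.Rows.Add(currentYearRate, priorYearRate);
        return table;
    }

    public static void Main()
    {
        DataTable table = BuildTable(1234567.123456m, 7654321.654321m);
        decimal stored = (decimal)table.Rows[0]["CurrentYearRate"];

        // Compares the stored value with the original, all 6 decimal places included.
        Console.WriteLine(stored == 1234567.123456m);
    }
}
```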
I am reading in an XML file and reading a specific value of 10534360.9
When I parse the string for a decimal ( decimal.Parse(afNodes2[0][column1].InnerText) ) I get 10534360.9, but when I parse it for a float ( float.Parse(afNodes2[0][column1].InnerText) ) I get 1.053436E+07.
Can someone explain why?
You are seeing the value represented in "E" or "exponential" notation.
1.053436E+07 is equivalent to 1.053436 x 10^7, or 10,534,360, which is as close as the default display of a System.Single (float; 32 bits, about 7 significant digits) gets to 10,534,360.9.
You're seeing it represented that way because it is the default format produced by the Single.ToString() method, which the debugger uses to display values on the watch screens, console, etc.
EDIT
It probably goes without saying, but if you want to retain the precision of the original value in the XML, you could choose a data type that can retain more information about your numbers:
System.Double (64 bits, 15-16 digits)
System.Decimal (128 bits, 28-29 significant digits)
1.053436E+07 is 10534360.9 rounded to 7 significant digits.
To that precision the numbers are the same, just displayed differently.
This is because float has a precision of about 7 digits.
Decimal has a precision of 28-29 digits.
When viewing the data in a debugger, or displaying it via .ToString, by default it might be formatted using scientific notation:
Some examples of the return value are "100", "-123,456,789", "123.45e+6", "500", "3.1416", "600", "-0.123", and "-Infinity".
To format it as exact output, use the R (round trip) format string:
myFloat.ToString("R");
http://msdn.microsoft.com/en-us/library/dwhawy9k(v=vs.110).aspx
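To see the difference between the three types side by side, here is a small sketch (the helper names are made up for illustration). It parses the same text as float, double and decimal using the invariant culture, so the "." is always read as the decimal separator. The exact text printed for the float depends on the runtime version, so the comments describe what is checked instead of quoting output.

```csharp
using System;
using System.Globalization;

public static class ParsePrecisionDemo
{
    private static readonly CultureInfo Inv = CultureInfo.InvariantCulture;

    public static float AsFloat(string text) => float.Parse(text, Inv);
    public static double AsDouble(string text) => double.Parse(text, Inv);
    public static decimal AsDecimal(string text) => decimal.Parse(text, Inv);

    public static void Main()
    {
        const string text = "10534360.9";

        float f = AsFloat(text);
        double d = AsDouble(text);
        decimal m = AsDecimal(text);

        Console.WriteLine(f.ToString(Inv));        // default float formatting
        Console.WriteLine(f.ToString("R", Inv));   // round-trip formatting
        Console.WriteLine(d.ToString("R", Inv));   // double keeps the .9
        Console.WriteLine(m.ToString(Inv));        // decimal keeps the value exactly

        // At this magnitude adjacent floats are 1 apart, so the .9 cannot be stored.
        Console.WriteLine(f == 10534361f);
    }
}
```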
We are rewriting some applications previously developed in Visual FoxPro and redeveloping them using .Net ( using C# )
Here is our scenario:
Our application uses smartcards. We read in data from a smartcard which has a name and number. The name comes back ok in readable text but the number, in this case '900' comes back as a 2 byte character representation (131 & 132) and look like this - ƒ„
Those 2 special characters can be seen in the extended Ascii table.. now as you can see the 2 bytes are 131 and 132 and can vary as there is no single standard extended ascii table ( as far as I can tell reading some of the posts on here )
So... the smart card was previously written to using the BINTOC function in VFP and therefore the 900 was written to the card as ƒ„. And within foxpro those 2 special characters can be converted back into integer format using CTOBIN function.. another built in function in FoxPro..
So ( finally getting to the point ) - So far we have been unable to convert those 2 special characters back to an int ( 900 ) and we are wondering if this is possible in .NET to read the character representation of an integer back to an actual integer.
Or is there a way to rewrite the logic of those 2 VFP functions in C#?
UPDATE:
After some fiddling we realise that to get 900 into 2bytes we need to convert 900 into a 16bit Binary Value, then we need to convert that 16 bit binary value into a decimal value.
So as above we are receiving back 131 and 132 and their corresponding binary values as being 10000011 ( decimal value 131 ) and 10000100 ( decimal value 132 ).
When we concatenate these 2 values to '1000001110000100' it gives the decimal value 33668 however if we removed the leading 1 and transform '000001110000100' to decimal it gives the correct value of 900...
Not too sure why this is though...
Any help would be appreciated.
It looks like VFP is storing your value as a signed 16 bit (short) integer. It seems to have a strange changeover point to me for the negative numbers but it adds 128 to 8 bit numbers and adds 32768 to 16 bit numbers.
So converting your 16 bit numbers from the string should be as easy as reading it as a 16 bit integer and then taking 32768 away from it. If you have to do this manually then the first number has to be multiplied by 256 and then add the second number to get the stored value. Then take 32768 away from this number to get your value.
Examples:
131 * 256 = 33536
33536 + 132 = 33668
33668 - 32768 = 900
You could try using the C# conversions as per http://msdn.microsoft.com/en-us/library/ms131059.aspx and http://msdn.microsoft.com/en-us/library/tw38dw27.aspx to do at least some of the work for you but if not it shouldn't be too hard to code the above manually.
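Coded manually, the arithmetic above could look like the sketch below. It assumes the layout described in this answer (two bytes, most significant byte first, with 32768 added before storage); the class and method names are made up for illustration. It works on raw bytes, so if the card reader hands you a string, you would first need to get the bytes back with the same encoding that was used to read them (the characters ƒ and „ are bytes 131 and 132 in Windows-1252, for example).

```csharp
using System;

public static class VfpBinary
{
    // Assumption: 2-byte value, high byte first, stored with an offset of 32768.
    public static short DecodeInt16(byte high, byte low)
    {
        int stored = (high << 8) | low;   // 131 * 256 + 132 = 33668
        return (short)(stored - 32768);   // 33668 - 32768 = 900
    }

    public static void Main()
    {
        Console.WriteLine(DecodeInt16(131, 132));
    }
}
```

Subtracting 32768 is the same as clearing the top bit for values of 0 and above, which is why dropping the leading 1 from 1000001110000100 gave 900 in the question.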
It's a few years late, but here's a working example.
// Requires: using System.Linq;
public ulong CharToBin(byte[] s)
{
    // Only 1 to 8 bytes fit into a ulong.
    if (s == null || s.Length < 1 || s.Length > 8)
        return 0ul;
    var v = s.Select(c => (ulong)c).ToArray();
    var result = 0ul;
    var multiplier = 1ul;
    // The first byte is the least significant one (reversed byte order).
    for (var i = 0; i < v.Length; i++)
    {
        if (i > 0)
            multiplier *= 256ul;
        result += v[i] * multiplier;
    }
    return result;
}
This is a VFP 8 and earlier equivalent for CTOBIN, which covers your scenario. You should be able to write your own BINTOC based on the code above. VFP 9 added support for multiple options like non-reversed binary data, currency and double data types, and signed values. This sample only covers reversed unsigned binary like older VFP supported.
Some notes:
The code supports 1, 2, 4, and 8-byte values, which covers all unsigned numeric values up to System.UInt64.
Before casting the result down to your expected numeric type, you should verify the ceiling. For example, if you need an Int32, then check the result against Int32.MaxValue before you perform the cast.
The sample avoids the complexity of string encoding by accepting a byte array. You would need to understand which encoding was used to read the string, then apply that same encoding to get the byte array before calling this function. In the VFP world, this is frequently Encoding.ASCII, but it depends on the application.
I would like to know how to go about creating a regular expression with the following conditions:
String must start with a decimal or fraction
Decimal should be positive and up to 2 decimal places
Fractions should be a fraction on its own, or one whole number and a fraction separated by a space, e.g. 1 1/2, 3/4. (It would be cool if, when someone wrote 1 and 1/2, it knew that meant 1 1/2, but that is not necessary.)
I would like to validate that a string starts with either a decimal or a fraction and get the extracted values out of it.
Valid Examples
"1 cup" = VALID = Extracted values: (1) (cup)
".5 cup" = VALID = Extracted values: (0.5) (cup)
"1.0 cup" = VALID = Extracted values: (1.0) (cup)
"1.10 cup" = VALID = Extracted values: (1.10) (cup)
"1/2 cup" = VALID = Extracted values: (1/2) (cup)
"1 1/2 cup" = VALID = Extracted values: (1 1/2) (cup)
"1 and 1/2 cup" = VALID = Extracted values: (1 1/2) (cup)
"1 and a 1/2 cup" = VALID = Extracted values: (1 1/2) (cup)
"1 & 1/2 cup" = VALID = Extracted values: (1 1/2) (cup)
Invalid Examples
"1 1/2 1/4 1/4 cup" = INVALID (only allow whole and fraction, or one fraction)
"1.034 cup" = INVALID (2 decimal places only)
"cup 1/2" = INVALID (not the start of the string)
EDIT
What I have so far:
Parsing Fractions:
\d*\s*(and*|and a*|\s*)\d+\/?\d*(.*)$
Parsing Decimals:
^\d{0,2}(\.\d{1,2})?$
My combined version:
(\d*\s*&|and*|and a*|\s*\s*\d+\/?\d*)|(\d{0,2}\.\d{1,2})*(.*)$
I just don't know how to join the two properly, or whether it can be better optimized. Also, the invalid examples still parse.
Any help would be appreciated, thanks
Regards DotnetShadow
The following works for me with your examples above (using Regex Hero to test):
^(?<WholeNumber>\d+){0,1}(?:\s(?<JoinWord>&|and|and\sa)?\s?)?(?<Decimal>\.\d{1,2})?(?<Fraction>(?<Numerator>\d+)\/(?<Denominator>\d+)){0,1}(?:\s(?<Unit>cup))$
You'll notice that I used named capture groups for the various components. I'll leave it to you to parse out the groups and join them meaningfully (for example, add the whole number to the Decimal group parsed as a fractional value such as .5, and add the numerator divided by the denominator).
You can also add additional patterns for other supported "Unit" values and other supported "JoinWord" values.
Edited: Added my suggestions as per my comments.
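One possible way to join the groups is sketched below. It uses the pattern above verbatim, so it only accepts "cup" as the unit. The QuantityParser class and TryParse method are made-up names for illustration, and rejecting input with no numeric part at all is an extra check that goes beyond what the pattern itself enforces.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class QuantityParser
{
    private static readonly Regex Pattern = new Regex(
        @"^(?<WholeNumber>\d+){0,1}(?:\s(?<JoinWord>&|and|and\sa)?\s?)?(?<Decimal>\.\d{1,2})?" +
        @"(?<Fraction>(?<Numerator>\d+)\/(?<Denominator>\d+)){0,1}(?:\s(?<Unit>cup))$");

    // Returns true and the combined amount when the input matches the pattern.
    public static bool TryParse(string input, out decimal amount, out string unit)
    {
        amount = 0m;
        unit = string.Empty;

        Match m = Pattern.Match(input);
        if (!m.Success) return false;

        Group whole = m.Groups["WholeNumber"];
        Group dec = m.Groups["Decimal"];
        Group frac = m.Groups["Fraction"];

        // Extra check: the pattern alone would accept " cup" with no number.
        if (!whole.Success && !dec.Success && !frac.Success) return false;

        var inv = CultureInfo.InvariantCulture;
        if (whole.Success) amount += decimal.Parse(whole.Value, inv);
        if (dec.Success) amount += decimal.Parse("0" + dec.Value, inv);   // ".5" -> 0.5
        if (frac.Success)
        {
            decimal numerator = decimal.Parse(m.Groups["Numerator"].Value, inv);
            decimal denominator = decimal.Parse(m.Groups["Denominator"].Value, inv);
            if (denominator == 0m) return false;   // avoid dividing by zero
            amount += numerator / denominator;
        }

        unit = m.Groups["Unit"].Value;
        return true;
    }

    public static void Main()
    {
        foreach (string s in new[] { "1 cup", ".5 cup", "1 and a 1/2 cup", "1.034 cup", "cup 1/2" })
        {
            bool ok = TryParse(s, out decimal amount, out string unit);
            Console.WriteLine(ok ? $"{s}: {amount} {unit}" : $"{s}: INVALID");
        }
    }
}
```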
I don't think you need regex to solve this problem. Consider the following solution:
Write function IsFraction(string s) that checks if string is a valid fraction (can it be 3/2, for example?).
Split input string, check that last item is not a number of any kind - that would be unit.
Remove all non-number and non-fraction items from list.
Verify that remaining items count is less than 3.
If count of items is 2, verify first is natural number and second is a fraction. -> result
If count is 1, verify that it's a fraction or decimal. -> result
More time (maybe) for initial writing, but easier to maintain and extend.
P.S. Use int.TryParse and decimal.TryParse, both alongside IsFraction and inside it.