ExcelDataReader.AsDataSet() converts single fraction double value into multiple fractions


I'm facing a problem when reading Excel sheet data using ExcelDataReader in C#.
I am reading data from an Excel sheet (.xlsm).
One of the cells has a list of values to choose from.
E.g.:
5.1
5.2
5.1a
When I choose either 5.2 or 5.1a and read the sheet, I get exactly the same value in the dataset.
But when I choose 5.1 and read it, I get 5.0999999999999996 in the dataset.
Here is the code I use to read the data in C#:
IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(fileStream);
DataSet findingsData = excelReader.AsDataSet();
Note:
As a workaround, I put a space after the value 5.1 in the cell. It is then read exactly as expected (5.1 instead of 5.0999999999999996).
But I'm wondering: when 5.2 is read back exactly without adding any space, why doesn't this work for 5.1?
Any suggestions to resolve this issue are welcome.
Thanks,
Karthik

Take a look at this question: Why can't decimal numbers be represented exactly in binary?
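My maths isn't quite up to figuring it out precisely (comments welcome), but I suspect the cause is that 5.1 cannot be stored exactly as a C# double. Strictly speaking, 5.2 cannot be stored exactly either, so which digits you see depends on how the stored value is written out and formatted; in your case 5.2 happens to come back looking clean and 5.1 does not.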
The reason it works when you add the space is that Excel will assume that the field is text, the same way 5.1a is, but when it receives something that looks like a number it assumes it is a number. (You can see this behaviour in a default blank spreadsheet as it will be right aligned if it is a number and left aligned when you add a space or any other text).
I expect that if you explicitly format all the cells as text in your source spreadsheet, the value will be read as you expect.
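The representation point above can be checked with a small sketch. It assumes nothing about ExcelDataReader itself; the class and method names (DoubleDemo, Full, Clean) are made up for illustration. It prints the double nearest to 5.1 and 5.2 with 17 significant digits, and shows that converting to decimal (which rounds to 15 significant digits) gives back the value as typed.

```csharp
using System;
using System.Globalization;

public static class DoubleDemo
{
    // Formats a double with 17 significant digits, enough to expose the stored binary value.
    public static string Full(double d) => d.ToString("G17", CultureInfo.InvariantCulture);

    // Converting double to decimal rounds to 15 significant digits,
    // which recovers the value as it was typed for inputs like 5.1.
    public static decimal Clean(double d) => (decimal)d;

    public static void Main()
    {
        Console.WriteLine(Full(5.1)); // 5.0999999999999996
        Console.WriteLine(Full(5.2)); // 5.2000000000000002
        Console.WriteLine(Clean(5.1).ToString(CultureInfo.InvariantCulture)); // 5.1
    }
}
```

Neither literal is exact in binary, so this sketch does not explain why 5.2 came back unchanged in the question; that depends on how the value was written to the file and formatted by the reader. If the column can stay numeric, rounding or converting to decimal after reading is one possible alternative to the text-format workaround.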

Related

how to convert exponential string to normal string in c#

I am reading a number column from a CSV file where the number is 123456791234567; however, due to formatting it gets converted to
number="1.23457E+14" in the CSV file.
Is there any way to change it back to the original string using C#?
I am trying the code below:
decimal value = Decimal.Parse(number, System.Globalization.NumberStyles.Any);
but the number I am getting is 123457000000000M and the actual number is 123456791234567.
Any ideas?

If I understand correctly, Excel is converting 123456791234567 to 1.23457E+14. In that case, you just need to format the Excel cell (or potentially the whole column) containing the value as text before you set the value.
If the C# program opens the CSV and finds the value 1.23457E+14, then there is no way to convert it back to 123456791234567, since the precision is already lost, unless of course the same value exists (or can be recreated) in other cells (columns).
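A minimal sketch of why the digits cannot be recovered once the file contains 1.23457E+14. The class and method names (ExponentDemo, ParseExponent) are hypothetical. Parsing succeeds, but only the six significant digits present in the text survive.

```csharp
using System;
using System.Globalization;

public static class ExponentDemo
{
    // NumberStyles.Float allows the exponent part ("E+14") when parsing.
    public static decimal ParseExponent(string text) =>
        decimal.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);

    public static void Main()
    {
        decimal parsed = ParseExponent("1.23457E+14");
        decimal original = 123456791234567m;

        // The parsed value carries only the digits that were in the text.
        Console.WriteLine(parsed.ToString(CultureInfo.InvariantCulture)); // 123457000000000
        Console.WriteLine(parsed == original);                            // False
    }
}
```

The fix therefore has to happen where the file is produced (writing the column as text), not in the code that reads it.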

Create a xls or csv file with certain header of rows and columns

I have a list of IDs in a matrix 'UserID'. I want to create an xls or csv file that uses these UserID values as its row headers. The number of rows is 2200000 and the number of columns is 11. The column labels are the years 1996 to 2006. I read this page:
https://www.mathworks.com/matlabcentral/answers/101309-how-do-i-use-xlswrite-to-add-row-and-column-labels-to-my-matlab-matrix-when-i-write-it-to-excel-in-m
but this code gives me an error. Sometimes it works with a smaller number of rows, and sometimes it does not respond. Can anyone suggest a program that will do this (in MATLAB or even C#)?
I wrote this code:
data=zeros(2200000,11);
data_cells=num2cell(data);
col_header={'1996','1997','1998','1999','2000','2001','2002','2003','2004','2005','2006'};
row_header(1:2200000,1)=UserID;
output_matrix=[{' '} col_header; row_header data_cells];
xlswrite('My_file.xls',output_matrix);
and I get this error:
The specified data range is invalid or too large to write to the specified file format. Try writing to an XLSX file and use Excel A1 notation for the range argument, for example, ‘A1:D4’.

When you use xlswrite you are limited to the number of rows that your Excel version permits:
The maximum size of array A depends on the associated Excel version.
In Excel 2013 the maximum is 1048576, which is 1151424 fewer rows than your 2200000x11 matrix.
You would be better off using csvwrite to export your data; see also the tip in its documentation:
csvwrite does not accept cell arrays for the input matrix M. To export a cell array that contains only numeric data, use cell2mat to convert the cell array to a numeric matrix before calling csvwrite. To export cell arrays with mixed alphabetic and numeric data... you must use low-level export functions to write your data.
EDIT:
In your case, you should at least change these parts of the code:
col_header = 1996:2006;
output_matrix=[0, col_header; row_header data];
and you don't need to define output_matrix as a cell array (and don't need data_cells). However, you may also have to convert UserID to a numeric variable.
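Since the question also allows C#, here is a rough sketch of the same CSV layout (year labels as the first line, one user ID at the start of each row) written with a plain StreamWriter. CSV has no Excel row limit, so 2200000 rows are not a problem for the file itself. The names CsvHeaderDemo and Write, and the "UserID" label in the corner cell, are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

public static class CsvHeaderDemo
{
    // Writes one header line of years, then one line per user ID followed by that row's values.
    public static void Write(string path, IReadOnlyList<long> userIds,
                             int firstYear, int lastYear, double[,] data)
    {
        int[] years = Enumerable.Range(firstYear, lastYear - firstYear + 1).ToArray();
        using (var writer = new StreamWriter(path))
        {
            writer.WriteLine("UserID," + string.Join(",", years));
            for (int r = 0; r < userIds.Count; r++)
            {
                writer.Write(userIds[r].ToString(CultureInfo.InvariantCulture));
                for (int c = 0; c < years.Length; c++)
                {
                    writer.Write(',');
                    // Invariant culture keeps "." as the decimal separator in the file.
                    writer.Write(data[r, c].ToString(CultureInfo.InvariantCulture));
                }
                writer.WriteLine();
            }
        }
    }

    public static void Main()
    {
        // Tiny demo; the real case would pass 2200000 IDs and a 2200000 x 11 array.
        var ids = new long[] { 101, 102, 103 };
        Write("My_file.csv", ids, 1996, 2006, new double[ids.Length, 11]);
    }
}
```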

Data truncated after 255 bytes while using Microsoft.Ace.Oledb.12.0 provider

I am reading an Excel sheet using the ACE provider, and certain cells contain data longer than 255 bytes. I tried changing TypeGuessRows in the registry settings as well as setting it in the connection string. I still get the truncated value in the code. I am not in a position to restructure the Excel sheet or use another provider. I run 64-bit Windows and my Office edition is 2013 (I have a small doubt whether that is the cause).
This is my connection string; it is working fine for those cells having data < 255 bytes.
var connectionString = string.Format("provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileName + ";Extended Properties=\"Excel 12.0;IMEX=1;HDR=YES;TypeGuessRows=0;ImportMixedTypes=Text\"");
Any solutions? Thanks in advance.

I am also using Microsoft.ACE.OLEDB.12.0 on 64-bit Windows 7.
I found that the TypeGuessRows in the connection string has no effect.
But increasing the TypeGuessRows in the following registry location works:
HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Office\12.0\Access Connectivity Engine\Engines\Excel
More info on a similar bug (although you may already know this as you're already trying to change TypeGuessRows)

The solution to this was extremely simple.
Just change the format of the column containing this huge data to "Text" from "General" in the excel sheet.
Now I feel like a n00b.

Refer to this link. I think this is the problem (try with Memo fields):
http://allenbrowne.com/ser-63.html
In Access tables, Text fields are limited to 255 characters, but Memo fields can handle 64,000 characters (about 8 pages of single-spaced text).
Nice workaround: have a look at this stack answer

The problem is that the ACE driver is inferring a TEXT data type for the column you're populating the data set from. Text columns are limited to 255 characters. You need to force it to use the MEMO data type.
Your best bet for that is to guarantee that the majority of the first eight rows in that column exceed 255 characters in length.
Source
This behavior is determined by the predictive nature of the Excel
driver/provider. Since it doesn't know what the data types are, it has
to make a guess based upon the data in the first several rows. If the
contents of a field exceeds 255 characters, and it's in the first
several rows, then the data type will be Memo, otherwise it will
probably be Text (which will result in the truncation).

Excel has some limits.
Excel specifications and limits - 2013
As you can see in the link posted:
Feature: Column width
Maximum limit: 255 characters

String gets mysteriously cut off

In my application I use WpfLocalization to provide translations while the application is running. The library will basically maintain a list of properties and their assigned localization keywords and use DependencyObject.SetValue() to update their values when the active language is changed.
The scenario in which I noticed my problem is this: I have a simple TextBlock and have assigned a localization keyword for its Text property. Now when my application starts, it will write the initial value into it and it will display just fine on screen. Now I switch the language and the new value is set as the Text property but only half the text will actually display on screen. Switching the languages back and forth does not have any effect. The first language is always displayed fine, the second is cut off (in the middle of words, but always full characters).
The relative length of both languages to each other does not seem to have anything to do with it. (In my test case the working language string is 498 bytes, and the one that gets cut off is 439 bytes and gets cut off after 257 bytes.)
When I inspect the current value of the Text property of said TextBlock right before I change its value through the localization code, it will always have the expected value (not cut off) in either language.
When inspecting the TextBlock at runtime through WPF Inspector it will display the cut off text as the Text property in the second language.
This makes no sense to me at all thus far. But now it gets better.
The original WpfLocalization library reads the localized strings from standard resource files, but we use a modified version that can also read those strings from an Excel file. It does that by opening an OleDbConnection using the Microsoft OLE DB driver and reading the strings through that. In the debugger I can see that all the values are read just fine.
Now I was really surprised when a colleague found the fix for the "cut off text" issue. He re-ordered the rows in the Excel sheet. I don't see how that could be relevant, but switching between the two versions of that file has an impact on the issue.

That does actually make sense. The OLE DB driver for Excel has to sample the data in a column to assign it a type and, in the case of strings, also a length. If it only samples values below the 255-character threshold, you get a string(255) type and truncated text; if it has sampled a longer string, it treats the column as a memo column and allows longer strings to be retrieved and stored. By re-ordering the rows, you change which rows are sampled.
If you read the documentation on SQL Server and Excel via OLE DB, you will find this is a known issue: http://msdn.microsoft.com/en-us/library/ms141683.aspx. Since you are using the same OLE DB driver, I would expect the situation to apply to you as well.
From the docs:
Truncated text.
When the driver determines that an Excel column
contains text data, the driver selects the data type (string or memo)
based on the longest value that it samples. If the driver does not
discover any values longer than 255 characters in the rows that it
samples, it treats the column as a 255-character string column instead
of a memo column. Therefore, values longer than 255 characters may be
truncated. To import data from a memo column without truncation, you
must make sure that the memo column in at least one of the sampled
rows contains a value longer than 255 characters, or you must increase
the number of rows sampled by the driver to include such a row. You
can increase the number of rows sampled by increasing the value of
TypeGuessRows under the
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel registry
key. For more information, see PRB: Transfer of Data from Jet 4.0
OLEDB Source Fails w/ Error.
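The sampling rule quoted above can be modelled in a few lines, which also shows why re-ordering rows changes the outcome. This is a hypothetical model of the documented behaviour, not the driver's actual code; the names TypeGuessModel and GuessTextType are invented, and sampledRows stands in for the TypeGuessRows setting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TypeGuessModel
{
    // Hypothetical model: only the first sampledRows values are inspected.
    // If none is longer than 255 characters, the column is treated as a
    // 255-character string column and longer values later on get truncated.
    public static string GuessTextType(IEnumerable<string> column, int sampledRows)
    {
        bool sawLong = column.Take(sampledRows).Any(v => v != null && v.Length > 255);
        return sawLong ? "memo" : "string(255)";
    }

    public static void Main()
    {
        string longValue = new string('x', 400);
        var shortFirst = new List<string> { "a", "b", "c", longValue };
        var longFirst = new List<string> { longValue, "a", "b", "c" };

        Console.WriteLine(GuessTextType(shortFirst, 3)); // string(255)
        Console.WriteLine(GuessTextType(longFirst, 3));  // memo
    }
}
```

The same data gives a different column type depending only on row order, which matches what the colleague observed after re-ordering the sheet.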

sql type float, real, decimal?

In my database I had a column for the price of a product.
I had it as float. My problem is that if I save it from my C# application
as 10.50, a query returns 10,50, and if I update I get an error along the lines of
"10,50 can't be converted to float".
If I save it as decimal, queries inside SQL management are OK,
but in my C# application I get the same error.
10.50 returns as 10,50. I don't know why or how to solve it. My only solution is to save it
as varchar.

That's a localisation problem of some sort. 10,50 is the "European" way of writing ten and a half. If you're getting that from your select statements then your database is probably configured incorrectly.

Generally speaking you should use the same type throughout your layers. So if the underlying types in the database are x, you should pass around those data with identical types in c#, too.
What type you choose depends on what you are storing--you shouldn't be switching around types just to get something to "work". To that end, storing numeric data in a non-numeric type (e.g. varchar) will come back to bite you very soon. It's good you've opened this question to fix that!
As others have miraculously inferred, you are likely running into a localization issue. This is a great example of why storing numbers as strings is a problem. If you properly accept user input in whatever culture/localization they want (or you want), and get it into a numeric-type variable, then the rest (talking to the DB) should be easy. More so, you should not do number formatting in the database if you can help it--that stuff is much better placed at the front end, closer to the users.

I think the decimal symbol in your Windows Region and Language settings is wrong. Please set it to a dot and test again.

This may help out for temporary use but I wouldn't recommend it for permanent use:
Just before you save, convert the number to a string, replace the commas with periods (from , to .), and then save it into the database as that string. Hopefully the database will see that it is in the correct format and turn it into what it treats as "Decimal" or "Floating".
Hope this helps.

Yep, localization.
That said, I think your price is being stored in a "money" field in SQLServer (I'm assuming it's SQLServer you're using). If that was a float in the DB, it would return it with a normal decimal point, and not the European money separator ",".
To fix:
First, DO NOT USE FLOAT in your C# code unless you absolutely require a floating-point number. Use the decimal type instead. That's not just in this case, but in all cases. Floating-point numbers are binary (base-2), not decimal (base-10), so what you see in the interface is only a decimal approximation of the actual number. The result is that comparisons you would expect to hold, such as (0.1 + 0.2 == 0.3), evaluate as false!
I've run into that problem myself, and it's maddening if you don't know that can happen. Always use decimal instead of float in C#.
Ok, after you've fixed that, then do this to get the right localization:
using System.Globalization;
...
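NumberFormatInfo ni = new NumberFormatInfo();
// decimal.Parse without a NumberStyles argument uses the number separators, not the currency ones
ni.NumberDecimalSeparator = ",";
ni.NumberGroupSeparator = ".";
decimal price = decimal.Parse(dbPriceDataField, ni);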
Note that "dbPriceDataField" must be a string, so you may have to do a ".ToString()" on that db resultset's field.
If you end up having to handle other "money" aspects of that money field, like currency symbols, check out: http://msdn.microsoft.com/en-us/library/system.globalization.numberformatinfo.aspx
If you need more robust error handling, either put that decimal.Parse in a try/catch, or use decimal.TryParse.
EDIT --
If you know what culture (really, country), the db is set to, you can do this instead:
using System.Globalization;
...
CultureInfo ci = new CultureInfo("fr-FR"); // fr-FR being "french France"
decimal price = decimal.Parse(dbPriceDataField, ci.NumberFormat);
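A small self-contained sketch of the two points in this answer: binary floating-point comparisons can fail where decimal ones hold, and a comma-formatted price parses cleanly once the matching culture is supplied. The names PriceDemo, ParsePrice and ToInvariant are hypothetical, and "fr-FR" is only an assumed example of a culture that uses "," as its decimal separator.

```csharp
using System;
using System.Globalization;

public static class PriceDemo
{
    // Parses text using the separators of the named culture.
    public static decimal ParsePrice(string text, string cultureName) =>
        decimal.Parse(text, NumberStyles.Number, CultureInfo.GetCultureInfo(cultureName));

    // Formats with "." as the decimal separator regardless of the machine's regional settings.
    public static string ToInvariant(decimal price) =>
        price.ToString("0.00", CultureInfo.InvariantCulture);

    public static void Main()
    {
        Console.WriteLine(0.1 + 0.2 == 0.3);    // False (binary double)
        Console.WriteLine(0.1m + 0.2m == 0.3m); // True (decimal)

        decimal price = ParsePrice("10,50", "fr-FR");
        Console.WriteLine(ToInvariant(price));  // 10.50
    }
}
```

If the value is passed to the database as a typed parameter rather than as formatted text, the separator question should not arise at all; the formatting above is only needed where the value really has to travel as a string.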

I faced the same problem in my web apps. In my case I was showing the price value in a TextBox bound to the database. Right-click the TextBox, click Edit DataBindings, and in the format for the Bind property enter {0:N2}.
This works only for web apps or websites, not for desktop applications.
