C# writing text serial numbers as dates in Excel file

I wrote a parser that takes some information from Excel sheets using the Spire.xls library and then writes the information to another Excel file.
I'm running into a weird problem. For some reason the program is taking serial numbers such as
03-02281
03-02282
03-01975
And writing them into the Excel sheet as
3/1/2281
3/1/2282
3/1/1975
This only happens with some values.
Others such as
30-04761
03-00613
03-00614
are transcribed unchanged.
I checked the Excel file and the fields are set to text format. So either the values were stored that way originally, or Excel is interpreting the serial numbers as dates. Another possibility is that the conversion doesn't happen in the original file, and that the text is not automatically corrected/changed when I type in the correct values manually.
Does anyone know why this is happening and how I can tell Excel to just treat these as text and nothing else?
I thought about prepending a ' to each value, but these have to be read by other parsers afterwards, so it's not the most convenient option.
Edit:
Here's some of the code I use for this; hopefully it can give you guys an idea of where I'm going wrong.
This is the code that adds all the values:
Workbook workbook = new Workbook();
workbook.LoadFromFile(templateExcelFileUri);
Worksheet sheet = workbook.Worksheets[0];
int ColumnIndex = 0; //for the datatable columns iteration
int columnCounter = 1; //for the excel sheet columns iteration
int ColumnsToAdd = 6; //(Seccion, seccion desc, marca, marca desc, **IdArticulo**, articulo desc)
//get the data of the new column
DataColumn DescriptionsDataColumn;
//First, add the suggestions.
for (; ColumnIndex < ColumnsToAdd; ColumnIndex++, columnCounter++)
{
    sheet.InsertColumn(columnCounter);
    if (columnCounter == 5)
        sheet.Columns[5].NumberFormat = "#"; // the column with the serial numbers
    DescriptionsDataColumn = AutomatController.DescriptionsTable.Columns[ColumnIndex];
    // insert the data into the new column
    sheet.InsertDataColumn(DescriptionsDataColumn, true, 2, columnCounter);
}
And for reference, the table whose values I add:
public static void SetDescriptionsTable()
{
DescriptionsTable.Columns.Add("Seccion", typeof(string));
DescriptionsTable.Columns.Add("SeccionDescripcion", typeof(string));
DescriptionsTable.Columns.Add("Marca", typeof(string));
DescriptionsTable.Columns.Add("MarcaDescripcion", typeof(string));
DescriptionsTable.Columns.Add("IdArticulo", typeof(string)); //Serial numbers
DescriptionsTable.Columns.Add("ArticuloDescripcion", typeof(string));
}
Thanks for the edits to the format of my question and the title. I'm still a little new here and I'm learning how to do that better.

The reason some values are not converted to dates is that they cannot be read as a month-year date. For example, there is no month 30 (30-04761), and the year in 03-00613 (0613) is earlier than 1900, the first year Excel supports.
I think the only thing you need to do is set the format of the target column and cell before setting its value through the API. Sometimes cloning a column or a cell resets the formatting to "Auto", and Excel tries to be too smart.
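Judging only from the values in the question, the converted strings look like they are being read as month-year. The sketch below is a hypothetical model of that reading, not Excel's actual parser; the helper name LooksLikeMonthYear and the 1900 to 9999 year range are assumptions made for illustration.

```csharp
using System;

// Hypothetical model: "MM-YYYYY" is date-like when the first part is a valid
// month and the second part is a year Excel can represent (assumed 1900-9999).
static bool LooksLikeMonthYear(string text)
{
    string[] parts = text.Split('-');
    if (parts.Length != 2) return false;
    if (!int.TryParse(parts[0], out int month)) return false;
    if (!int.TryParse(parts[1], out int year)) return false;
    return month >= 1 && month <= 12 && year >= 1900 && year <= 9999;
}

// Values taken from the question.
foreach (string id in new[] { "03-02281", "03-01975", "30-04761", "03-00613" })
{
    string verdict = LooksLikeMonthYear(id) ? "date-like" : "left as text";
    Console.WriteLine($"{id}: {verdict}");
}
```

Under this model the first two values are date-like and the last two are left alone, which matches what the question reports.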
If you can share a bit of your code the community may be able to more accurately diagnose the problem.

You should set the column's format to General before setting the value.

Related

C# EPPlus multiple cell formatting not applying

I am using EPPlus to generate an Excel file from a data table. I have only two rows. I apply % formatting to the first row and $ formatting to the second row, but both rows end up with the same % formatting, which is wrong. I cannot work out why the second format ($) is not applied to the second row.
See these lines, where I use a range to apply the formatting:
ws.Cells["C0:P0"].Style.Numberformat.Format = "#,###,##0.0%;(#,###,##0.0%)";
ws.Cells["C1:P1"].Style.Numberformat.Format = "$##,##0.0;($##,##0.0)";
In the code above I specify a cell range for each format, but both rows get only the first format and the second one is ignored. Why is this happening?
Sample Code
using (OfficeOpenXml.ExcelPackage obj = new OfficeOpenXml.ExcelPackage(FileLoc))
{
// creating work sheet object
OfficeOpenXml.ExcelWorksheet ws = obj.Workbook.Worksheets.Add("Vertical");
// freezing work sheet columns and rows
ws.View.FreezePanes(2, 3);
// exporting data to excel
ws.Cells["A1"].LoadFromDataTable(selected, true);
// setting columns to autofit
ws.Cells[ws.Dimension.Address].AutoFitColumns();
// fixing the height of the header row
ws.Row(1).Height = 16;
ws.Row(1).Style.Fill.PatternType = ExcelFillStyle.Solid;
ws.Row(1).Style.Fill.BackgroundColor.SetColor(Color.LightGray);
obj.Save();
ws.Cells["C0:P0"].Style.Numberformat.Format = "#,###,##0.0%;(#,###,##0.0%)";
ws.Cells["C1:P1"].Style.Numberformat.Format = "$##,##0.0;($##,##0.0)";
}
Screenshot of the Excel data: the first two rows both show the format #,###,##0.0%;(#,###,##0.0%), but my code gives the second row a different format.
Please help me find what is wrong in my code. Thanks.
Well, there are a couple of errors. First, you're saving before setting the formatting, so it's not being applied.
Second, Excel addresses are 1-based, so "C0" and "P0" do not exist. Also note that the first row holds the column titles, so you probably want rows 2 and 3. Try the following:
ws.Cells["C2:P2"].Style.Numberformat.Format = "#,###,##0.0%;(#,###,##0.0%)";
ws.Cells["C3:P3"].Style.Numberformat.Format = "$##,##0.0;($##,##0.0)";
obj.Save();

Why is the Date in Row being transformed automatically?

I'm using Microsoft Office Interop Excel to sort and manage a ".csv" file and create an Excel file.
When I copy one cell that contains a date for example :
"04/05/2018 18:55"
I substring that into
"04/05/2018"
and paste it into another cell, it transforms it into this weird number
43195
Why is this happening? How can I prevent or modify this? I'm currently passing the info. like this:
String date = worksheet.Cells[1, 1].Value2.ToString();
worksheet.Cells[1, 2].Value2 = date; //this shows up as 43195
It’s not being transformed really. Excel actually stores all dates as a number representing the number of days since 1/0/1900 (really!). What you are seeing in your cell is the raw numeric value representing the date. If you open your result spreadsheet, highlight your column, right click and select formatting, and select “Date”, you’ll see it display the date you expect.
So you are going to want to do this in your code. Assuming your date is in column “B” and your worksheet is “ws”:
ws.Range["B:B"].NumberFormat = "mm/dd/yyyy";
This assumes a US date format.
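To see the relationship between the serial number and the date outside Excel, here is a minimal sketch using DateTime.FromOADate from the .NET base library. It assumes the value read back from Value2 is the serial 43195 from the question. OLE Automation dates line up with Excel's serial numbers for dates from March 1900 onward.

```csharp
using System;
using System.Globalization;

// The raw serial number from the question (assumed to come from Value2).
double serial = 43195;

// Convert the serial number back into a DateTime.
DateTime date = DateTime.FromOADate(serial);

// Format explicitly so the output does not depend on the machine's culture.
Console.WriteLine(date.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture)); // 04/05/2018
```

If the goal is to write text rather than a date, the formatted string can be written to a cell that has already been given a text format.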

Adding string to int32 column in datatable

I am pretty sure this is not possible but I will ask for clarification.
I am using ExcelLibrary in C# to convert a dataset into an Excel document. I recently had the requirement to add a string to the end of the excel document in every file. I simply added two new rows to the datatable (one empty and the second row displayed the string in the first cell).
I got a bug report today because in one particular excel document the first column displays the ID (something unusual in the system) and I get the error:
Input string was not in a correct format.Couldn't store <**mystring**> in id Column. Expected type is Int32.
I am pretty sure I cannot add my string to this column and I need to add it to a different column which is a nvarchar, but does anyone have any suggestions to resolve this problem?
edit: the code for anyone who may require it
DataRow dr = ds.Tables[0].NewRow();
ds.Tables[0].Rows.Add(dr);
DataRow dr2 = ds.Tables[0].NewRow();
dr2[0] = System.Configuration.ConfigurationManager.AppSettings["excelString"];
ds.Tables[0].Rows.Add(dr2);
The dataset I am working with is directly from a query in the database and the first column is an INT
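The asker's own idea, putting the text into a string column instead of the Int32 one, could look roughly like the sketch below. The table layout, the helper FirstStringColumn and the footer text are all made up for illustration; the real table comes from the database query.

```csharp
using System;
using System.Data;

// Hypothetical table standing in for ds.Tables[0]: an Int32 id plus a text column.
var table = new DataTable();
table.Columns.Add("id", typeof(int));
table.Columns.Add("name", typeof(string));
table.Rows.Add(1, "first");

// Returns the index of the first string-typed column, or -1 if there is none.
static int FirstStringColumn(DataTable t)
{
    foreach (DataColumn column in t.Columns)
    {
        if (column.DataType == typeof(string)) return column.Ordinal;
    }
    return -1;
}

// Empty spacer row, then the footer row with the text in a column that accepts it.
table.Rows.Add(table.NewRow());
DataRow footer = table.NewRow();
int target = FirstStringColumn(table);
if (target >= 0) footer[target] = "footer text";
table.Rows.Add(footer);

Console.WriteLine($"Footer stored in column '{table.Columns[target].ColumnName}'");
```

The Int32 column stays null in both added rows, so no type conversion is attempted on it.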

best method to choose to save huge data from excel to Sql

I have an Excel file from which I have to extract the required data and save it to a database. I know that Range lets me read a particular range of data, but the data I need to extract is fairly large. Can anyone suggest the best and simplest method to retrieve the data and store it in a database?
I would like to read the data from A10 to an unknown range. My data looks as follows:
The data marked in red should go into the database column by column. I can handle that part if someone can suggest the best method to read the remaining columns too.
You could use SQL Server Integration Services to import the excel data to a table. A SSIS package can run at scheduled times or be invoked. It uses the spreadsheet as a data source and allows you to map columns.
Well, you can use NPOI to read in the Excel file and parse it any way you want. We use it to import large Excel files into a SQL database as well. Using NPOI you have complete freedom on how to interpret the data.
Most important thing is that, if you want to do this more often, either the format of the Excel file should not change, or you should have some generic description of the Excel file stored somewhere else which tells your code how to interpret the file. The latter is of course more difficult to do. It depends on your particular use case which is better.
In our case the Excel file has a fixed layout, so our implementation is based on that layout.
If you still need to do it from code, one way is the following. You said that your data starts at A10, so first get the UsedRange of the worksheet:
Microsoft.Office.Interop.Excel.Range xlRange = worksheet.UsedRange;
As there are only 2 columns, get the row count and column count of the used range as follows:
int iRows = xlRange.Rows.Count;
int iCols = xlRange.Columns.Count;
Then loop over the cells, starting at row 10:
List<string> lstItems = new List<string>(); // declared once, outside the loops
for (int iRow = 10; iRow <= iRows; iRow++)
{
    for (int iCol = 1; iCol <= iCols; iCol++)
    {
        Microsoft.Office.Interop.Excel.Range cell = (Microsoft.Office.Interop.Excel.Range)worksheet.Cells[iRow, iCol];
        string text = cell.Text.ToString();
        Console.WriteLine(text);
        lstItems.Add(text);
        if (lstItems.Count == 10)
        {
            if (text.Contains("www") || lstItems[9] == string.Empty)
            {
                // Process the collected values as required and insert them into the database.
            }
        }
    }
}

Scientific notation when importing from Excel in .Net

I have a C#/.Net job that imports data from Excel and then processes it. Our client drops off the files and we process them. I don't have any control over the original file.
I use the OleDb library to fill up a dataset. The file contains some numbers like 30829300, 30071500, etc... The data type for those columns is "Text".
Those numbers are converted to scientific notation when I import the data. Is there any way to prevent this from happening?
One workaround to this issue is to change your select statement, instead of SELECT * do this:
"SELECT Format([F1], 'General Number') From [Sheet1$]"
-or-
"SELECT Format([F1], \"#####\") From [Sheet1$]"
However, doing so will blow up if your cells contain more than 255 characters with the following error:
"Multiple-step OLE DB operation generated errors. Check each OLE DB status value, if available. No work was done."
Fortunately my customer didn't care about erroring out in this scenario.
This page has a bunch of good things to try as well:
http://www.dicks-blog.com/archives/2004/06/03/external-data-mixed-data-types/
The OleDb library will, more often than not, mess up your data in an Excel spreadsheet. This is largely because it forces everything into a fixed-type column layout, guessing at the type of each column from the values in the first 8 cells in each column. If it guesses wrong, you end up with digit strings converted to scientific-notation. Blech!
To avoid this you're better off skipping the OleDb and reading the sheet directly yourself. You can do this using the COM interface of Excel (also blech!), or a third-party .NET Excel-compatible reader. SpreadsheetGear is one such library that works reasonably well, and has an interface that's very similar to Excel's COM interface.
Using this connection string:
Provider=Microsoft.ACE.OLEDB.12.0; data source={0}; Extended Properties=\"Excel 12.0;HDR=NO;IMEX=1\"
with Excel 2010 I have noticed the following. If the Excel file is open when you run the OLEDB SELECT then you get the current version of the cells, not the saved file values. Furthermore the string values returned for a long number, decimal value and date look like this:
5.0130370071e+012
4.08
36808
If the file is not open then the returned values are:
5013037007084
£4.08
Monday, October 09, 2000
If you look at the actual .xlsx file using the Open XML SDK 2.0 Productivity Tool (or simply unzip the file and view the XML in Notepad), you will see that Excel 2007 actually stores the raw data in scientific format.
For example 0.00001 is stored as 1.0000000000000001E-5
<x:c r="C18" s="11" xmlns:x="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<x:v>1.0000000000000001E-5</x:v>
</x:c>
Looking at the cell in Excel, it's displayed as 0.00001 in both the cell and the formula bar. So it is not always true that OleDb is causing the issue.
I have found that the easiest way is to choose Zip format, rather than text format for columns with large 'numbers'.
Have you tried casting the value of the field to (int) or perhaps (Int64) as you are reading it?
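A plain cast will not work on a string, but parsing can. As a rough sketch of that idea, the helper below (ToPlainDigits is a made-up name) parses a value that arrived in scientific notation with decimal.TryParse and writes it back as plain digits. It assumes the imported value is a string and can only restore digits the provider actually returned.

```csharp
using System;
using System.Globalization;

// Turns "3.08293E+07" back into "30829300"; returns the input unchanged
// when it is not a number.
static string ToPlainDigits(string raw)
{
    // NumberStyles.Float allows a decimal point and an exponent such as "E+07".
    if (decimal.TryParse(raw, NumberStyles.Float, CultureInfo.InvariantCulture, out decimal value))
    {
        // "0.##########" prints no exponent and keeps up to 10 decimal places.
        return value.ToString("0.##########", CultureInfo.InvariantCulture);
    }
    return raw;
}

Console.WriteLine(ToPlainDigits("3.08293E+07"));       // 30829300
Console.WriteLine(ToPlainDigits("5.0130370071e+012")); // 5013037007100
```

The second line shows the limit of this approach: the original cell held 5013037007084, and the digits dropped during import cannot be recovered by parsing.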
Look up the IMEX=1 connection string option and TypeGuessRows registry setting on google.
In truth, there is no easy way round this because the reader infers column data types by looking at the first few rows (8 by default). If the rows contain all numbers then you're out of luck.
An unfortunate workaround which I've used in the past is to use the HDR=NO connection string option and set the TypeGuessRows registry setting value to 1, which forces it to read the first row as valid data to make its datatype determination, rather than a header.
It's a hack, but it works. The code reads the first row (containing the header) as text, and then sets the datatype accordingly.
Changing the registry is a pain (and not always possible) but I'd recommend restoring the original value afterwards.
If your import data doesn't have a header row, then an alternative option is to pre-process the file and insert a ' character before each of the numbers in the offending column. This causes the column data to be treated as text.
So all in all, there are a bunch of hacks to work around this, but nothing really foolproof.
I had this same problem, but was able to work around it without resorting to the Excel COM interface or 3rd party software. It involves a little processing overhead, but appears to be working for me.
First read in the data to get the column names.
Then create a new DataSet with each of these columns, setting each of their DataTypes to string.
Read the data in again into this new dataset. Voila - the scientific notation is now gone and everything is read in as a string.
Here's some code that illustrates this, and as an added bonus, it's even StyleCopped!
public void ImportSpreadsheet(string path)
{
    string extendedProperties = "Excel 12.0;HDR=YES;IMEX=1";
    string connectionString = string.Format(
        CultureInfo.CurrentCulture,
        "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"{1}\"",
        path,
        extendedProperties);
    using (OleDbConnection connection = new OleDbConnection(connectionString))
    {
        using (OleDbCommand command = connection.CreateCommand())
        {
            command.CommandText = "SELECT * FROM [Worksheet1$]";
            connection.Open();
            using (OleDbDataAdapter adapter = new OleDbDataAdapter(command))
            using (DataSet columnDataSet = new DataSet())
            using (DataSet dataSet = new DataSet())
            {
                columnDataSet.Locale = CultureInfo.CurrentCulture;
                adapter.Fill(columnDataSet);
                if (columnDataSet.Tables.Count == 1)
                {
                    var worksheet = columnDataSet.Tables[0];
                    // Now that we have a valid worksheet read in, with column names, we can create a
                    // new DataSet with a table that has preset columns that are all of type string.
                    // This fixes a problem where the OLEDB provider is trying to guess the data types
                    // of the cells and strange data appears, such as scientific notation on some cells.
                    dataSet.Tables.Add("WorksheetData");
                    DataTable tempTable = dataSet.Tables[0];
                    foreach (DataColumn column in worksheet.Columns)
                    {
                        tempTable.Columns.Add(column.ColumnName, typeof(string));
                    }
                    adapter.Fill(dataSet, "WorksheetData");
                    if (dataSet.Tables.Count == 1)
                    {
                        worksheet = dataSet.Tables[0];
                        foreach (var row in worksheet.Rows)
                        {
                            // TODO: Consume some data.
                        }
                    }
                }
            }
        }
    }
}
I got this solution from somewhere else, but it worked perfectly for me.
No code change is needed: just format the Excel column cells as "General" instead of any other format such as "Number" or "Text". Then even Select * from [Sheet1$] or Select Column_name from [Sheet1$] will read the values correctly, even large numeric values of more than 9 digits.
I googled around this issue.
Here are my solution steps.
For the template Excel file:
1- Format the Excel column as Text
2- Write a macro to disable the error warnings for the number -> text conversion
Private Sub Workbook_BeforeClose(Cancel As Boolean)
    Application.ErrorCheckingOptions.BackgroundChecking = True
End Sub
Private Sub Workbook_Open()
    Application.ErrorCheckingOptions.BackgroundChecking = False
End Sub
In the code-behind:
3- While reading the data to import, try to parse the incoming values to Int64 or Int32.
