EPPlus csv parsing ampersand("&") issue - c#

I'm using EPPlus LoadFromText to parse a CSV into an Excel file.
var format = new ExcelTextFormat();
format.Delimiter = ';';
format.Encoding = System.Text.Encoding.UTF8;
format.Culture = System.Globalization.CultureInfo.GetCultureInfo("pt-pt");
using (ExcelPackage package = new ExcelPackage(new FileInfo(excelFilePath)))
{
ExcelWorksheet worksheet = package.Workbook.Worksheets.Add(worksheetsName);
worksheet.Cells["A1"].LoadFromText(new FileInfo(fileName), format, OfficeOpenXml.Table.TableStyles.None, false);
package.Save();
}
When a row has more than one column with an ampersand("&"):
001;David & Goliath;10;20;David & Goliath
The call throws an exception:
"An item with the same key has already been added".
Is there a way to avoid this problem without changing the input csv data?

The two-parameter call works for me, so you may be able to use it as a workaround:
worksheet.Cells["A1"].LoadFromText(
new FileInfo(path), format
);
Note that the four-parameter call in your example does exactly the same thing and then adds a table over the loaded range.
So this looks like a bug. Taken directly from the source code, the four-parameter overload:
public ExcelRangeBase LoadFromText(string Text, ExcelTextFormat Format, TableStyles TableStyle, bool FirstRowIsHeader)
{
ExcelRangeBase excelRangeBase = this.LoadFromText(Text, Format);
ExcelTable excelTable = this._worksheet.Tables.Add(excelRangeBase, "");
excelTable.ShowHeader = FirstRowIsHeader;
excelTable.TableStyle = TableStyle;
return excelRangeBase;
}
It blows up when it tries to add the ExcelTable and there are ampersands in the source data.

The ampersand will be escaped as &amp; in XML, and you are using ';' as a delimiter.
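The point about XML escaping can be illustrated in isolation. The sketch below is only a demonstration of that reading, not a description of what EPPlus does internally: it escapes a field with the framework's SecurityElement.Escape and then splits on ';', showing that the escaped entity itself contains the delimiter character. The class and method names (AmpersandDemo, EscapeAndSplit) are hypothetical.

```csharp
using System;
using System.Security;

static class AmpersandDemo
{
    // Escapes XML special characters in a field, then splits the result
    // on the given delimiter, as a naive delimiter-based parser would.
    public static string[] EscapeAndSplit(string field, char delimiter)
    {
        // "&" becomes "&amp;", which ends in a semicolon
        string escaped = SecurityElement.Escape(field);
        return escaped.Split(delimiter);
    }
}
```

With "David & Goliath" and ';' as the delimiter, the single field comes back as two pieces, because the semicolon that terminates the entity is treated as a separator.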

Related

Writing Japanese string to excel using OpenXml

I am trying to create an Excel file by reading data from the database. One of the columns contains Japanese text. Writing that column to an Excel cell and saving the workbook gives the following error (which makes sense, as the characters are not valid XML characters): '', hexadecimal value 0x0B, is an invalid character.
I am writing the string to the Excel cell as follows, using the DocumentFormat.OpenXml package.
var excelCell = new Cell();
var cellValue = dtRow[col.Name].ToString();
var inlineStr = new InlineString(new Text(cellValue));
excelCell.DataType = CellValues.InlineString;
excelCell.InlineString = inlineStr;
What needs to be done to write Japanese characters to Excel using OpenXml in C#?
OK, I found the right way. I'm posting it as an answer so that it can be helpful to others.
To add text to Excel that is not allowed as valid XML, add the text as a SharedString to the SharedStringTable:
var index = InsertSharedStringItem(text, shareStringPart);
excelCell.CellValue = new CellValue(index.ToString());
excelCell.DataType = new EnumValue<CellValues>(CellValues.SharedString);
private static int InsertSharedStringItem(string text, SharedStringTablePart shareStringPart)
{
// If the part does not contain a SharedStringTable, create one.
if (shareStringPart.SharedStringTable == null)
{
shareStringPart.SharedStringTable = new SharedStringTable();
}
int i = 0;
// Iterate through all the items in the SharedStringTable. If the text already exists, return its index.
foreach (SharedStringItem item in shareStringPart.SharedStringTable.Elements<SharedStringItem>())
{
if (item.InnerText == text)
{
return i;
}
i++;
}
// The text does not exist in the part. Create the SharedStringItem and return its index.
shareStringPart.SharedStringTable.AppendChild(new SharedStringItem(new DocumentFormat.OpenXml.Spreadsheet.Text(text)));
shareStringPart.SharedStringTable.Save();
return i;
}
Full documentation for adding text as shared string to excel using OpenXml
https://msdn.microsoft.com/en-us/library/office/cc861607.aspx
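Separately from the shared-string approach in the answer above, the error message names a specific character (0x0B, a vertical tab) that XML 1.0 does not allow. A minimal sketch of a different technique, filtering such characters out before writing the cell, might look like the following. It assumes that dropping the offending characters is acceptable for the data and that the input is well-formed UTF-16; the class and method names (XmlText, IsAllowed, StripInvalidXmlChars) are hypothetical and not part of the Open XML SDK.

```csharp
using System;
using System.Linq;

static class XmlText
{
    // True for UTF-16 code units permitted by the XML 1.0 Char production.
    // Surrogates are kept on the assumption that they occur in valid pairs.
    public static bool IsAllowed(char c) =>
        c == '\t' || c == '\n' || c == '\r' ||
        (c >= 0x20 && c <= 0xD7FF) ||
        char.IsSurrogate(c) ||
        (c >= 0xE000 && c <= 0xFFFD);

    // Returns the text with every disallowed character removed.
    public static string StripInvalidXmlChars(string text) =>
        new string(text.Where(IsAllowed).ToArray());
}
```

Japanese characters pass through unchanged; only control characters such as 0x0B are removed.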

How do I apply Excel data validation for text length to a column with EPPlus?

I am using EPPlus with C# to create an excel file.
I want to put a data validation on a column so that its cells do not accept strings longer than a certain number of characters. See the attached picture to better understand what I mean.
I can't find how to set the limit. If someone has a solution or a link to valid documentation, please post it.
This works for me:
var minLength = 1;
var maxLength = 4;
var textValidation = worksheet
.DataValidations.AddTextLengthValidation("D:D");
textValidation.ShowErrorMessage = true;
textValidation.ErrorStyle = ExcelDataValidationWarningStyle.warning;
textValidation.ErrorTitle = "The value you entered is not valid";
textValidation.Error = string.Format(
"This cell must be between {0} and {1} characters in length.",
minLength, maxLength
);
textValidation.Formula.Value = minLength;
textValidation.Formula2.Value = maxLength;

use the conditional format with epplus

I'm trying to apply a conditional format to an Excel sheet using EPPlus. In this case I want to apply a pattern to all odd rows, so I tried to use the MOD function, but it is not working. I don't know how to write the formula.
ExcelAddress _formatRangeAddress = new ExcelAddress("A2:Q" + (listSize+ 1));
string _statement = "MOD(ROW();2)=0";
var _cond1 = hoja.ConditionalFormatting.AddExpression(_formatRangeAddress);
_cond1.Style.Fill.PatternType = OfficeOpenXml.Style.ExcelFillStyle.Solid;
_cond1.Style.Fill.BackgroundColor.Color = System.Drawing.Color.Gray;
_cond1.Formula = _statement;
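Check the formula string. I think you want a comma (,) instead of that semicolon (;): formulas are stored in the file with commas as argument separators, whatever your local Excel displays. So change this: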
string _statement = "MOD(ROW();2)=0";
to this:
string _statement = "MOD(ROW(),2)=0";

Keep excel cell format as text with "date like" data

This seems silly, but I haven't been able to get my values in the format of #/#### to write as the literal string rather than becoming formatted as a date within excel.
I'm using ClosedXML to write to excel, and using the following:
// snip
IXLRangeRow tableRow = tableRowRange.Row(1);
tableRow.Cell(1).DataType = XLCellValues.Text;
tableRow.Cell(1).Value = "2/1997";
// snip
Looking at the output excel sheet I get in the cell 2/1/1997 - even though I'm setting the format as text in code, I'm getting it as a "Date" in the excel sheet - I checked this by right clicking the cell, format cell, seeing "date" as the format.
If I change things up to:
// snip
IXLRangeRow tableRow = tableRowRange.Row(1);
tableRow.Cell(1).Value = "2/1997";
tableRow.Cell(1).DataType = XLCellValues.Text;
// snip
I instead get 35462 as my output.
I just want my literal value of 2/1997 to be displayed on the worksheet. Please advise on how to correct.
Try this:
ws.Cell(rowCounter, colCounter).SetValue<string>(Convert.ToString(fieldValue));
I'm not sure how to do it from ClosedXML, but maybe try Range.NumberFormat (MSDN link).
For example...
Range("A1").NumberFormat = "#"
Or
Selection.NumberFormat = "#/####"
Consider:
tableRow.Cell(1).Value = "'2/1997";
Note the single quote.
ws.Cell(rowCounter, colCounter).Value = "'" + Convert.ToString(fieldValue);
Formatting has to be done before you write values to the cells.
I had the following mechanism, run after I build the worksheet, right before I save it:
private void SetColumnFormatToText(IXLWorksheet worksheet)
{
var wholeSheet = worksheet.Range(FirstDataRowIndexInExcel, StartCellIndex, RowCount, HeaderCount);
wholeSheet.Style.NumberFormat.Format = "#";
}
which didn't do squat.
Doing it before I write values to the cells in a row did it.
worksheet.Range(RowIndex, StartCellIndex, RowIndex, EndCellIndex).Style.NumberFormat.Format = "#";
with cell value assignments following immediately after.

Excel Date column returning INT using EPPlus

So I'm using EPPlus to read and write Excel documents.
Workflow:
1. The user generates a populated Excel document
2. The user opens the document and adds a row
3. The document is uploaded and read
The dates that are generated when I create the document using EPPlus show correctly when I read the values back, but any row where the user changes the date, or adds a new one, shows an INT value rather than something I can use as a real date.
When I enter the date 1/01/2014 and write it, the output when I open the file up shows 41640
I'm reading it as follows
sheet.Cells[i, "AE".ConvertExcelColumnIndex()].Value != null
? sheet.Cells[i, "AE".ConvertExcelColumnIndex()].Value.ToString().Trim()
: string.Empty
Update
When exporting the file I have added the following
DateTime testDate;
if (DateTime.TryParse(split[i], out testDate))
{
sheet.Cells[row, i + 1].Style.Numberformat.Format = "MM/dd/yyyy";
sheet.Cells[row, i + 1].Value = testDate.ToString("MM/dd/yyyy");
}
Also when reading the value back I have tried
sheet.Cells[i, "AE".ConvertExcelColumnIndex()].Style.Numberformat.Format = "MM/dd/yyy";
I still get an INT back
...when I need to read that excel file, the only dates that are
incorrect are the ones the user has changed
So when you read the modified excel-sheet, the modified dates are numbers whereas the unchanged values are strings in your date-format?
You could get the DateTime via DateTime.FromOADate:
// Use double rather than long so a time-of-day fraction does not break the conversion
double dateNum = Convert.ToDouble(worksheet.Cells[row, column].Value);
DateTime result = DateTime.FromOADate(dateNum);
With your sample-number:
Console.Write(DateTime.FromOADate(41640)); // -> 01/01/2014
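Since the sheet described in the question mixes cells written as date strings with cells that come back as serial numbers, one way to read both is to branch on the runtime type of the cell value. The following is a minimal sketch under that assumption; the helper name CellDates.ToDate is hypothetical, it uses only framework calls (no EPPlus types), and it treats any purely numeric string as an OLE Automation date serial, which may not suit every sheet.

```csharp
using System;
using System.Globalization;

static class CellDates
{
    // Converts a raw cell value (DateTime, number, or string) to a date.
    // Returns null when the value is missing or cannot be interpreted.
    public static DateTime? ToDate(object cellValue)
    {
        switch (cellValue)
        {
            case null:
                return null;
            case DateTime dt:
                return dt;
            case double serial:
                return DateTime.FromOADate(serial);
            case int whole:
                return DateTime.FromOADate(whole);
            // Numeric strings are assumed to be OADate serials
            case string s when double.TryParse(s, NumberStyles.Float,
                    CultureInfo.InvariantCulture, out double parsedSerial):
                return DateTime.FromOADate(parsedSerial);
            // Remaining strings are parsed as invariant-culture dates (MM/dd/yyyy)
            case string s when DateTime.TryParse(s, CultureInfo.InvariantCulture,
                    DateTimeStyles.None, out DateTime parsedDate):
                return parsedDate;
            default:
                return null;
        }
    }
}
```

A call such as CellDates.ToDate(sheet.Cells[row, col].Value) would then give the same DateTime whether the cell holds the string "01/01/2014" or the number 41640.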
I stumbled upon this issue today when trying to generate some Excel documents from some ASP.NET DataTables: I had no problem with strings, but ran into a few issues with numeric types (ints, doubles, decimals) and DateTimes, which were formatted as strings or as numeric representations (OADate).
Here's the solution I eventually managed to pull off:
if (dc.DataType == typeof(DateTime))
{
if (!r.IsNull(dc))
{
ws.SetValue(row, col, (DateTime)r[dc]);
// Change the following line if you need a different DateTime format
var dtFormat = "dd/MM/yyyy";
ws.Cells[row, col].Style.Numberformat.Format = dtFormat;
}
else ws.SetValue(row, col, null);
}
Apparently, the trick was to set the value as DateTime and then configure the proper Style.Numberformat.Format accordingly.
I published the full code sample (DataTable to Excel file with EPPlus) in this post on my blog.
You should try using
string dateFromExcel = workSheet.Cells[row, col].Text.ToString();
DateTime localdt;
if (DateTime.TryParse(dateFromExcel, out localdt))
{
dateFromExcel = localdt.ToString("MM/dd/yyyy");
};
Value reads the cell's value in general formatting, while Text reads the value as it appears in Excel with the applied formatting.
You could check whether the cell is in a date format and, if so, parse it to a date:
var cell = worksheet.Cells[row, col];
var value = cell.Value.ToString();
if (cell.Style.Numberformat.Format == "[$-409]d\\-mmm\\-yy;#")
{
// The cell holds an OADate serial number; convert it and format it as a date string
string inputString = DateTime.FromOADate(long.Parse(value)).ToString("dd-MMM-yyyy");
}
You can also change the 'NumberFormatLocal' property. This worked for me when I formatted the Excel file before importing it with EPPlus.
The following basic example formats column A in a typical Excel file.
Sub ChangeExcelColumnFormat()
Dim ExcelApp As Excel.Application
Dim ExcelWB As Excel.Workbook
Dim ExcelWS As Excel.Worksheet
Dim formatRange As Excel.Range
Dim strFile As String = "C:\Test.xlsx"
Dim strSheetname As String = "Sheet1"
Dim strColSelect As String = "A:A"
Dim strFormat As String = "dd/mm/yyyy"
ExcelApp = New Excel.Application
ExcelWB = ExcelApp.Workbooks.Open(strFile)
' Look up the worksheet to format by name
ExcelWS = CType(ExcelWB.Worksheets(strSheetname), Excel.Worksheet)
' Apply the date format to the whole column
formatRange = ExcelWS.Range(strColSelect)
formatRange.NumberFormatLocal = strFormat
ExcelWB.Save()
ExcelWB.Close()
ExcelApp.Quit()
ExcelWS = Nothing
ExcelWB = Nothing
ExcelApp = Nothing
End Sub
