I am creating a .NET Core project that reads different kinds of .xlsx files and exports the data into a MongoDB collection. The format, number of columns, etc. of the input files are unknown. I have no trouble identifying decimal, boolean, and string data types. But when a cell contains a date, OpenXML returns it as an epoch-style numeric string. Because the Cell's DataType property is null, I have no way to determine whether I am reading a date field or not. Below is the code snippet I am using for reading:
private static object GetCellValue(SpreadsheetDocument doc, Cell cell)
{
    double res;
    string value = cell.CellValue.InnerText;
    if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString) //Checking if it is a string
    {
        return doc.WorkbookPart.SharedStringTablePart.SharedStringTable.ChildElements.GetItem(int.Parse(value)).InnerText;
    }
    if (cell.DataType != null && cell.DataType.Value == CellValues.Boolean) //If contains boolean
    {
        return value == "1" ? true : false;
    }
    else
    {
        Double.TryParse(value, out res);
        if (res != 0) //Determining if it is a Double value
        {
            return res;
        }
        return value; //Otherwise the "InnerText" value will be passed
    }
}
This seems to work fine except for dates: they get converted to a decimal, which messes up the data.
P.S. The cells are formatted as General, and I cannot change the input files. Using the "Microsoft.Office.Interop.Excel" package is also not an option.
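One possible approach, sketched under assumptions: a numeric cell usually carries a StyleIndex that points into the stylesheet's CellFormats, and the NumberFormatId found there says how the number is meant to be displayed. Built-in ids 14 to 22 and 45 to 47 are date or time formats; custom format codes live in the stylesheet's NumberingFormats. The helper names below (DateFormatSniffer, LooksLikeDateFormatCode, ConvertNumericCell) are hypothetical, the format-code check is only a heuristic, and the sketch assumes the id and format code have already been looked up for the cell. It will not help if a date cell truly has no date format applied.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class DateFormatSniffer
{
    // Built-in number format ids that denote dates or times (14-22 and 45-47).
    public static bool IsBuiltInDateFormat(uint numFmtId) =>
        (numFmtId >= 14 && numFmtId <= 22) || (numFmtId >= 45 && numFmtId <= 47);

    // Heuristic for custom format codes: drop literals, escapes, and
    // bracketed sections such as [Red] or [$-409], then look for date letters.
    public static bool LooksLikeDateFormatCode(string formatCode)
    {
        if (string.IsNullOrEmpty(formatCode)) return false;
        string s = Regex.Replace(formatCode, "\"[^\"]*\"", "");
        s = Regex.Replace(s, @"\\.", "");
        // Keep elapsed-time tokens like [h] or [mm]; remove every other bracket.
        s = Regex.Replace(s, @"\[(?![hms]+\])[^\]]*\]", "", RegexOptions.IgnoreCase);
        return Regex.IsMatch(s, "[dmyhs]", RegexOptions.IgnoreCase);
    }

    // Converts the raw cell text to DateTime when the format says "date",
    // otherwise to double. customFormatCode may be null for built-in ids.
    public static object ConvertNumericCell(string rawValue, uint numFmtId, string customFormatCode)
    {
        double number = double.Parse(rawValue, CultureInfo.InvariantCulture);
        bool isDate = IsBuiltInDateFormat(numFmtId) || LooksLikeDateFormatCode(customFormatCode);
        return isDate ? (object)DateTime.FromOADate(number) : number;
    }
}
```

With a check like this, the final else branch of GetCellValue could return a DateTime for date-formatted cells and a double for everything else.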
Related
I have an Excel document that I am trying to import into my system.
(.NET Core 2.2 and EPPlus v4.5.3.1)
The Excel data looks like 09:57:32, with the custom cell format [hh]:mm:ss
private TimeSpan? GetRequiredTimeFromRowOrNull(ExcelWorksheet worksheet, int row, int column, string columnName, StringBuilder exceptionMessage)
{
    worksheet.Cells[row, column].Style.Numberformat.Format = "hh:mm:ss";
    var cellValue = TimeSpan.Parse(worksheet.Cells[row, column].Value.ToString());
    if (cellValue.ToString() != null && !string.IsNullOrWhiteSpace(cellValue.ToString()))
    {
        return cellValue;
    }
    exceptionMessage.Append(GetLocalizedExceptionMessagePart(columnName));
    return null;
}
"worksheet.Cells[row, column].Value"
returns 0.414953703703704,
while "worksheet.Cells[row, column].Text" returns 09:12:32.
How can I get the exact value?
Excel stores dates and times as decimal values: the whole-number part is the day count and the fractional part is the time of day.
So, to get a C# DateTime, use the OLE Automation converter:
try
{
    var val = (double)worksheet.Cells[row, column].Value;
    var dt = DateTime.FromOADate(val);
    var cellValue = dt.TimeOfDay;
}
catch (Exception)
{
    //log conversion error
}
You will want to put a try/catch around the cast, just in case the cell does not contain a number.
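As a worked example of the arithmetic, here is a minimal sketch that turns the fraction from the question into a TimeSpan directly. The names ExcelTime and FromDayFraction are hypothetical. Going through TimeSpan rather than DateTime.TimeOfDay is a design choice for [hh]:mm:ss columns, where a duration may exceed 24 hours and the whole-day part should not be dropped; rounding to whole seconds is an assumption that suits second-resolution data.

```csharp
using System;

public static class ExcelTime
{
    private const double SecondsPerDay = 86400.0;

    // Converts an Excel day fraction (or day count) to a duration.
    // Rounding to whole seconds absorbs floating-point noise.
    public static TimeSpan FromDayFraction(double serial)
    {
        return TimeSpan.FromSeconds(Math.Round(serial * SecondsPerDay));
    }

    public static void Main()
    {
        // 0.414953703703704 * 86400 is about 35852 seconds.
        Console.WriteLine(FromDayFraction(0.414953703703704)); // 09:57:32
        // One and a half days stays a 36-hour duration.
        Console.WriteLine(FromDayFraction(1.5).TotalHours);    // 36
    }
}
```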
I'm storing a mix of numeric and non-numeric values in a single column of a spreadsheet using C# and EPPlus. When I open the spreadsheet in Excel, it shows green triangles in the cells with numeric values, warning 'Number Stored as Text' and offering to ignore it for that particular cell. Can I do this from code, or is it an Excel-specific feature?
You really have 2 options using code:
change the .NumberFormat property of Range to TEXT (I believe the equivalent in EPPlus is Cells[row, column].Style.Numberformat.Format)
prefix any number with ' (a single quote) - Excel then treats the number as TEXT - visually, it displays the number as is but the formula will show the single quote.
Alternatively, which I wouldn't recommend relying on
play with Excel's properties and untick the option to display warnings
From the EPPlus documentation:
My number formats does not work
If you add numeric data as strings (like the original ExcelPackage does), Excel will treat the data as a string and it will not be formatted. Do not use the ToString method when setting numeric values.
string s="1000";
int i=1000;
worksheet.Cells["A1"].Value=s; //Will not be formatted
worksheet.Cells["A2"].Value=i; //Will be formatted
worksheet.Cells["A1:A2"].Style.Numberformat.Format="#,##0";
http://epplus.codeplex.com/wikipage?title=FAQ&referringTitle=Documentation
This is a derivation of TechnoPriest's answer that works for me - I've added handling of decimal values, and changed the name of the method to more accurately document its true raison d'etre:
public static void ConvertValueToAppropriateTypeAndAssign(this ExcelRangeBase range, object value)
{
    // Null-safe: a null value clears the cell instead of throwing.
    string strVal = value?.ToString();
    if (!String.IsNullOrEmpty(strVal))
    {
        int iVal;
        decimal decVal;
        double dVal;
        // Try the narrowest type first: decimal and double also accept integer
        // text, so testing them first would make the int branch unreachable.
        if (Int32.TryParse(strVal, out iVal))
        {
            range.Value = iVal;
        }
        else if (decimal.TryParse(strVal, out decVal))
        {
            range.Value = decVal;
        }
        else if (double.TryParse(strVal, out dVal))
        {
            range.Value = dVal;
        }
        else
        {
            range.Value = strVal;
        }
    }
    else
    {
        range.Value = null;
    }
}
You can check whether your value is an integer, convert it to int, and assign the number to the cell's value. It will then be saved as a number, not a string.
public static void SetValueIntOrStr(this ExcelRangeBase range, object value)
{
    // Null-safe: a null value clears the cell instead of throwing.
    string strVal = value?.ToString();
    if (!String.IsNullOrEmpty(strVal))
    {
        int iVal;
        double dVal;
        // int must be tried before double, because double.TryParse
        // also succeeds on integer text.
        if (Int32.TryParse(strVal, out iVal))
            range.Value = iVal;
        else if (double.TryParse(strVal, out dVal))
            range.Value = dVal;
        else
            range.Value = strVal;
    }
    else
        range.Value = null;
}
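The parse-and-assign idea above can be sketched without any spreadsheet library. The following is a minimal illustration, not part of EPPlus: TypedParser and Parse are hypothetical names, and pinning the parse to the invariant culture is an assumption that fits machine-generated text. The parameterless TryParse overloads use the current culture, so "1.5" and "1,5" can parse differently from machine to machine.

```csharp
using System;
using System.Globalization;

public static class TypedParser
{
    // Returns an int, decimal, double, or the original string,
    // trying the narrowest numeric type first. Empty input yields null.
    public static object Parse(string text)
    {
        if (string.IsNullOrWhiteSpace(text)) return null;

        CultureInfo inv = CultureInfo.InvariantCulture;
        int i;
        decimal m;
        double d;

        if (int.TryParse(text, NumberStyles.Integer, inv, out i)) return i;
        if (decimal.TryParse(text, NumberStyles.Number, inv, out m)) return m;
        // Float adds exponent notation such as "1e3", which decimal rejects here.
        if (double.TryParse(text, NumberStyles.Float | NumberStyles.AllowThousands, inv, out d)) return d;
        return text;
    }

    public static void Main()
    {
        Console.WriteLine(Parse("42").GetType().Name);   // Int32
        Console.WriteLine(Parse("3.14").GetType().Name); // Decimal
        Console.WriteLine(Parse("1e3").GetType().Name);  // Double
        Console.WriteLine(Parse("abc").GetType().Name);  // String
    }
}
```

The result of Parse could then be assigned to range.Value in place of the inline TryParse chain.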
I'm writing an application that's supposed to export data from a map application.
This application is using Silverlight, and to facilitate export to Excel I am using this library.
All of the data is represented in strings by default. When I write to the spreadsheet, I try to parse each string to see which type it is:
string str = kvp.Value;
int i = 0;
long l = 0;
decimal dec = 0;
DateTime dt;
if (int.TryParse(str, out i))
    doc.Workbook.Sheets[0].Sheet.Rows[r].Cells[c].SetValue(i);
else if (decimal.TryParse(str, out dec))
    doc.Workbook.Sheets[0].Sheet.Rows[r].Cells[c].SetValue(dec);
else if (long.TryParse(str, out l))
    doc.Workbook.Sheets[0].Sheet.Rows[r].Cells[c].SetValue(l);
else if (DateTime.TryParse(str, out dt))
    doc.Workbook.Sheets[0].Sheet.Rows[r].Cells[c].SetValue(dt);
else
    doc.Workbook.Sheets[0].Sheet.Rows[r].Cells[c].SetValue(str);
This works great, except for DateTime values and for social security numbers, which in my case are 12 characters long.
The social security number is parsed as a decimal number, and is displayed in scientific form in Excel. From what I've gathered through reading it seems like a limitation in Excel. If I mark the cell however, I see the correct number in the top bar where you can write formulas. I've solved this problem so far by simply putting this number as a string and letting the end user handle it for now, but I'd really like for it to be a number in the finished document. Is this possible?
What really boggles my mind though, is the DateTime. The format of the date comes like this from the application: 10/15/2013 1:10:00 AM.
It looks like this in the Excel file: 2455075.
I checked the source code for the date formatting, but I am not adept enough to tell whether anything is wrong in it. For anyone interested, you can check it out here.
The SetValue-function is supposed to identify the following types automatically:
bool
DateTime
decimal
Exception
SharedStringDefinition
string
I apologize for the long post. It boils down to these questions:
Can I make Excel handle long numbers without scientific notation programmatically?
Why is the DateTime being output in such a weird format?
To set a cell value in date format, you have to convert the DateTime to an OLE Automation date. You can also create a clearer method for writing cell values. Something like this:
public bool UpdateValue(WorkbookPart wbPart, string sheetName, string addressName, string value,
    UInt32Value styleIndex, CellValues dataType)
{
    // Assume failure.
    bool updated = false;
    Sheet sheet = wbPart.Workbook.Descendants<Sheet>().Where(
        (s) => s.Name == sheetName).FirstOrDefault();
    if (sheet != null)
    {
        Worksheet ws = ((WorksheetPart)(wbPart.GetPartById(sheet.Id))).Worksheet;
        Cell cell = InsertCellInWorksheet(ws, addressName);
        if (dataType == CellValues.SharedString)
        {
            // Either retrieve the index of an existing string,
            // or insert the string into the shared string table
            // and get the index of the new item.
            int stringIndex = InsertSharedStringItem(wbPart, value);
            cell.CellValue = new CellValue(stringIndex.ToString());
        }
        else
        {
            cell.CellValue = new CellValue(value);
        }
        cell.DataType = new EnumValue<CellValues>(dataType);
        if (styleIndex > 0)
            cell.StyleIndex = styleIndex;
        // Save the worksheet.
        ws.Save();
        updated = true;
    }
    return updated;
}
Then call this method like this (first call is for String second is for DateTime):
UpdateValue(workbookPart, wsName, "D14", "Some text", 0, CellValues.String);
UpdateValue(workbookPart, wsName, "H13", DateTime.Parse("2013-11-01").ToOADate().ToString(CultureInfo.InvariantCulture), 0, CellValues.Date);
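To make the conversion step concrete, here is a small stand-alone sketch of the DateTime to OLE Automation date round trip, using the date string from the question. OADateDemo and ToCellText are hypothetical names. One caveat, stated as an assumption to verify rather than a fact about the answer's code: as far as I know, the file format expects ISO 8601 text in cells typed as dates, so a commonly used alternative is to store the serial as a plain number and apply a date style through styleIndex.

```csharp
using System;
using System.Globalization;

public static class OADateDemo
{
    // Produces the culture-independent text to place in the cell value.
    public static string ToCellText(DateTime value)
    {
        return value.ToOADate().ToString(CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        DateTime dt = DateTime.Parse("10/15/2013 1:10:00 AM", CultureInfo.InvariantCulture);
        string text = ToCellText(dt);

        // The whole part, 41562, is the day; the fraction is 01:10 as part of a day.
        Console.WriteLine(text);

        // Reading the serial back yields the original moment.
        DateTime back = DateTime.FromOADate(double.Parse(text, CultureInfo.InvariantCulture));
        Console.WriteLine(back == dt); // True
    }
}
```

Using the invariant culture for ToString matters here: on a machine with a comma decimal separator, a plain ToString() would write a value that the file format does not accept as a number.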
I was wondering if there's a syntax for formatting NULL values in string.Format, similar to what Excel uses.
For example, in Excel I could specify a format of {0:#,000.00;(#,000.00);NULL}, which means: display the value in number format if positive, in number format in parentheses if negative, or NULL if the value is null.
string.Format("${0:#,000.00;(#,000.00);NULL}", someNumericValue);
Edit
I'm looking for formatting NULL/Nothing values for all data types, not just numeric ones.
My example is actually incorrect because I mistakenly thought Excel used the 3rd parameter if the value was NULL, but it's actually used when the value is 0. I'm leaving it in there because it's the closest thing I can think of to what I was hoping to do.
I am hoping to avoid the null-coalescing operator because I am writing log records, and the data is not usually a string.
It would be much easier to write something like
Log(string.Format("Value1 changes from {0:NULL} to {1:NULL}",
    new object[] { oldObject.SomeValue, newObject.SomeValue }));
than to write
var oldText = (oldObject.SomeValue == null ? "null" : oldObject.SomeValue.ToString());
var newText = (newObject.SomeValue == null ? "null" : newObject.SomeValue.ToString());
Log(string.Format("Value1 changes from {0} to {1}",
    new object[] { oldText, newText }));
You can define a custom formatter that returns "NULL" if the value is null and otherwise the default formatted string, e.g.:
foreach (var value in new[] { 123456.78m, -123456.78m, 0m, (decimal?)null })
{
    string result = string.Format(
        new NullFormat(), "${0:#,000.00;(#,000.00);ZERO}", value);
    Console.WriteLine(result);
}
Output:
$123.456,78
$(123.456,78)
$ZERO
$NULL
Custom Formatter:
public class NullFormat : IFormatProvider, ICustomFormatter
{
    public object GetFormat(Type service)
    {
        if (service == typeof(ICustomFormatter))
        {
            return this;
        }
        else
        {
            return null;
        }
    }
    public string Format(string format, object arg, IFormatProvider provider)
    {
        if (arg == null)
        {
            return "NULL";
        }
        IFormattable formattable = arg as IFormattable;
        if (formattable != null)
        {
            return formattable.ToString(format, provider);
        }
        return arg.ToString();
    }
}
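Applied to the logging case from the question, the same idea might look like the sketch below. NullAwareFormat is a hypothetical compact variant of the formatter above, and LogDemo and Describe are illustrative names. Pinning output to the invariant culture is my assumption, made so that the printed text does not depend on the machine's regional settings.

```csharp
using System;
using System.Globalization;

// Hypothetical compact variant of the custom formatter, fixed to the invariant culture.
public sealed class NullAwareFormat : IFormatProvider, ICustomFormatter
{
    public object GetFormat(Type formatType)
    {
        return formatType == typeof(ICustomFormatter) ? this : null;
    }

    public string Format(string format, object arg, IFormatProvider provider)
    {
        if (arg == null) return "NULL";
        IFormattable formattable = arg as IFormattable;
        return formattable != null
            ? formattable.ToString(format, CultureInfo.InvariantCulture)
            : arg.ToString();
    }
}

public static class LogDemo
{
    // Builds the log line; either value may be null and may be of any type.
    public static string Describe(object oldValue, object newValue)
    {
        return string.Format(new NullAwareFormat(),
            "Value1 changes from {0} to {1}", oldValue, newValue);
    }

    public static void Main()
    {
        int? before = null;
        DateTime? after = null;
        Console.WriteLine(Describe(before, 42));   // Value1 changes from NULL to 42
        Console.WriteLine(Describe("abc", after)); // Value1 changes from abc to NULL
    }
}
```

Because the formatter is consulted for every argument, no per-value null check is needed at the call site.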
I don't think there's anything in String.Format that will let you specify a particular format for null strings. A workaround is to use the null-coalescing operator, like this:
const string DefaultValue = "(null)";
string s = null;
string formatted = String.Format("{0}", s ?? DefaultValue);
Is this what you want?
string test = null;
string result = test ?? "NULL";
It looks like String.Format for .NET acts the same way as Excel, i.e., you can use ; separator for positive, negative, and 0 values, but not NULL: http://msdn.microsoft.com/en-us/library/0c899ak8.aspx#SectionSeparator.
You will probably just have to handle the null value manually:
if (myval == null)
    // handle
else
    return String.Format(...);
You could use an extension method:
public static string ToDataString(this string prm)
{
    if (prm == null)
    {
        return "NULL";
    }
    else
    {
        return "'" + prm.Replace("'", "''") + "'";
    }
}
Then in your code you can do:
string Field1="Val";
string Field2=null;
string s = string.Format("Set Value:{0}, NullValue={1}",Field1.ToDataString(), Field2.ToDataString());
I am trying to read the Location of a calendar item in Lotus Notes.
When I check the Document Properties manually, I can see the value,
but when I read it using Domino.dll, I get an empty string ("").
I am using:
String Location = ((object[])CalendarDoc.GetItemValue("Location"))[0] as String;
I also tried:
String tmpLocation = ((object[])CalendarDoc.GetItemValue("tmpLocation"))[0] as String;
Is there any other way to get the 'Location' value using Domino.dll in C#?
Thanks
Here's a wild guess... I'm wondering if it's the as string that's causing your issues. I think it depends on the object type being returned by GetItemValue. I'm guessing at runtime it will try to cast your object to a string which might not be what you want. You might just want the text that the object represents (assuming that the ToString gives that).
string location = GetLocationFromDocument();
private string GetLocationFromDocument()
{
    object[] values = CalendarDoc.GetItemValue("Location");
    if( values != null && values.Length > 0 && values[0] != null )
    {
        return values[0].ToString();
    }
    return string.Empty;
}
Sorry, I don't have the required assemblies to test this. If this doesn't work I can delete my answer, because I don't want bad information floating around.
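The difference between the two approaches can be shown without the Domino assemblies. In this minimal sketch, the object array merely stands in for whatever GetItemValue returns (an assumption), and CastDemo and FirstAsText are hypothetical names. Note that `as string` on a non-string yields null rather than "", so if the result really is an empty string, the item itself may be empty.

```csharp
using System;
using System.Globalization;

public static class CastDemo
{
    // Returns the first element as text, or an empty string when there is none.
    public static string FirstAsText(object[] values)
    {
        if (values == null || values.Length == 0 || values[0] == null)
        {
            return string.Empty;
        }
        return Convert.ToString(values[0], CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        object[] values = { 42 }; // stand-in for a non-string item value

        // 'as' never converts: it returns null when the runtime type is not string.
        Console.WriteLine((values[0] as string) == null); // True

        // Converting to text works for any runtime type.
        Console.WriteLine(FirstAsText(values)); // 42
    }
}
```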