While Reading data from a .xlsx file - c#

string Code = "";
if (fileUp.HasFile)
{
string Path = fileUp.PostedFile.FileName;
// initialize the Excel Application class
ApplicationClass app = new ApplicationClass();
// create the workbook object by opening the excel file.
Workbook workBook = app.Workbooks.Open(Path, 0, true, 5, "", "", true,
XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
// Get The Active Worksheet Using Sheet Name Or Active Sheet
Worksheet workSheet = (Worksheet)workBook.ActiveSheet;
int index = 0;
// This row,column index should be changed as per your need.
// that is which cell in the excel you are interesting to read.
object rowIndex = 2;
object colIndex1 = 1;
object colIndex2 = 2;
object colIndex3 = 3;
object colIndex4 = 4;
object colIndex5 = 5;
object colIndex6 = 6;
object colIndex7 = 7;
try
{
while (((Range)workSheet.Cells[rowIndex, colIndex1]).Value2 != null)
{
rowIndex = 2 + index;
//string QuestionCode = (index + 1).ToString();
string QuestionCode = ((Range)workSheet.Cells[rowIndex, colIndex1]).Value2.ToString();
string QuestionText = ((Range)workSheet.Cells[rowIndex, colIndex2]).Value2.ToString();
string CorrectAnswer = ((Range)workSheet.Cells[rowIndex, colIndex3]).Value2.ToString();
string ChoiceA = ((Range)workSheet.Cells[rowIndex, colIndex4]).Value2.ToString();
string ChoiceB = ((Range)workSheet.Cells[rowIndex, colIndex5]).Value2.ToString();
string ChoiceC = ((Range)workSheet.Cells[rowIndex, colIndex6]).Value2.ToString();
string ChoiceD = ((Range)workSheet.Cells[rowIndex, colIndex7]).Value2.ToString();
// string ChoiceE = ((Excel.Range)workSheet.Cells[rowIndex, colIndex7]).Value2.ToString();
newQuestionElement = new XElement("Question");
XElement optionElement = new XElement(QuestionElement.Option);
questionType = ddlQusType.SelectedValue.ToByte();
if (!string.IsNullOrEmpty(QuestionText))
newQuestionElement.Add(new XElement(QuestionElement.QuestionText, QuestionText));
else
{
//lblMessage.Text = "Missing question in Qus No.: " + i;
break;
}
newQuestionElement.Add(new XElement(QuestionElement.QuestionType, questionType));
//newQuestionElement.Add(new XElement(QuestionElement.Randomize, chbRandomizeChoice.Checked));
newQuestionElement.Add(new XElement(QuestionElement.Answer, CorrectAnswer));
if (ChoiceA.Trim() != string.Empty)
optionElement.Add(new XElement("A", ChoiceA));
if (ChoiceB.Trim() != string.Empty)
optionElement.Add(new XElement("B", ChoiceB));
if (ChoiceC.Trim() != string.Empty)
optionElement.Add(new XElement("C", ChoiceC));
if (ChoiceD.Trim() != string.Empty)
optionElement.Add(new XElement("D", ChoiceD));
newQuestionElement.Add(optionElement);
index++;
saveData(QuestionCode.ToString());
I am using this code to retrieve the data from a .xlsx file.
But if the file contains any special characters, they come out as different characters. For example, this text:
The set S = {1,2,33……….12} is to be partitioned into three sets
A,B,C of equal size. Thus, `A U B U C = S,`
is read as:
The set S = {1,2,33……….12} is to be partitioned into three sets
A,B,C of equal size. Thus, `A È B È C = S,`

Looks like an encoding issue.
I used to have this issue after reading Excel into a data table and then serializing the data table to a file.
Every time I read the data back in from the serialized file, some symbols would be replaced with funny A's and E's.
I discovered the problem was with the encoding I was using. I then started to store Excel data using Unicode encoding and have never encountered another symbol problem with Excel data since.
I hope this helps.
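The effect described in this answer can be reproduced without Excel. The sketch below is only an illustration, assuming the text passes through a byte encoding on its way to a file and back. The class and method names are made up for this example and are not part of the original code.

```csharp
using System;
using System.Text;

public static class EncodingRoundTrip
{
    // Encodes the text to bytes and decodes it again with the same encoding,
    // which is what writing a file and reading it back amounts to.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }

    // True when the text survives the round trip unchanged.
    public static bool Survives(string text, Encoding encoding)
    {
        return RoundTrip(text, encoding) == text;
    }
}
```

With a string containing the union sign (U+222A), Survives returns true for Encoding.UTF8 and Encoding.Unicode and false for Encoding.ASCII, which replaces characters it cannot represent with a question mark.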

Related

Excel file (.xlsx) created by using DocumentFormat.OpenXML needs to be repaired when opening in Excel

I have a method which creates an Excel file (.xlsx) from a list of strings using DocumentFormat.OpenXml. The created file needs to be repaired when I try to open it with Excel 2016. When I click "Yes", Excel shows my file correctly.
Does anyone have any suggestions? Thanks in advance.
Here's my code:
private byte[] ExportDataXlsx(System.Data.Common.DbDataReader reader, string[] fields, string[] headers, string Culture) {
System.IO.MemoryStream sw = new System.IO.MemoryStream();
using (var workbook = Packaging.SpreadsheetDocument.Create(sw, SpreadsheetDocumentType.Workbook)) {
var sheetData = CreateSheet(workbook);
while (reader.Read()) {
Spreadsheet.Row newRow = new Spreadsheet.Row();
foreach (string column in fields) {
Spreadsheet.Cell cell = new Spreadsheet.Cell();
cell.DataType = Spreadsheet.CellValues.String;
object value = null;
try {
int index = reader.GetOrdinal(column);
cell.DataType = DbKymosDomainService.ToXlsType(reader.GetFieldType(index));
value = DbKymosDomainService.ToStringFromCulture(reader.GetValue(index), reader.GetFieldType(index), Culture);
if (cell.DataType == Spreadsheet.CellValues.Number){
value = value == null ? "" : value.ToString().Replace(",", ".");
}
}
catch { }
cell.CellValue = new Spreadsheet.CellValue(value == null ? null : value.ToString()); //
newRow.AppendChild(cell);
try { var x = newRow.InnerXml; } catch { newRow.RemoveChild(cell); }
}
sheetData.AppendChild(newRow);
}
workbook.Close();
}
byte[] data = sw.ToArray();
sw.Close();
sw.Dispose();
return data;
}
The function which creates the sheet:
private Spreadsheet.SheetData CreateSheet(Packaging.SpreadsheetDocument workbook)
{
var workbookPart = workbook.AddWorkbookPart();
workbook.WorkbookPart.Workbook = new Spreadsheet.Workbook();
workbook.WorkbookPart.Workbook.Sheets = new Spreadsheet.Sheets();
var sheetPart = workbook.WorkbookPart.AddNewPart<Packaging.WorksheetPart>();
var sheetData = new Spreadsheet.SheetData();
sheetPart.Worksheet = new Spreadsheet.Worksheet(sheetData);
Spreadsheet.Sheets sheets = workbook.WorkbookPart.Workbook.GetFirstChild<Spreadsheet.Sheets>();
string relationshipId = workbook.WorkbookPart.GetIdOfPart(sheetPart);
uint sheetId = 1;
if (sheets.Elements<Spreadsheet.Sheet>().Count() > 0) {
sheetId =
sheets.Elements<Spreadsheet.Sheet>().Select(s => s.SheetId.Value).Max() + 1;
}
Spreadsheet.Sheet sheet = new Spreadsheet.Sheet() { Id = relationshipId, SheetId = sheetId, Name = "Export" };
sheets.Append(sheet);
return sheetData;
}
In my experience, when a file needs to be repaired after being created with OpenXML, it means that it is missing a crucial element or that a crucial element is in the wrong place. I'm having difficulty following your code, which in itself points to something being in the wrong place. Code should be sequential and self-explanatory. A few pointers, however, to help with getting to the root cause of your issue:
I would suggest first using ClosedXML, as it takes so much strain out of the coding: https://github.com/closedxml/closedxml
Debug your code and step through each step to see what's going on.
Open the created file in the OpenXML Productivity Tool https://github.com/OfficeDev/Open-XML-SDK/releases/tag/v2.5 and have a look around.
Another tool that I couldn't be without is OpenXML FileViewer: https://github.com/davecra/OpenXmlFileViewer
Lastly, I always run this subroutine to validate documents I create using OpenXML:
// Needs: System, System.Collections.Generic, System.IO, System.Linq,
// DocumentFormat.OpenXml.Packaging, DocumentFormat.OpenXml.Validation
public static List<string> ValidateWordDocument(FileInfo filepath, int maxerrors = 100)
{
    try
    {
        using (WordprocessingDocument wDoc = WordprocessingDocument.Open(filepath.FullName, false))
        {
            OpenXmlValidator validator = new OpenXmlValidator();
            string nl = Environment.NewLine;
            List<string> er = new List<string>()
            {
                $"Assessment of {filepath.Name} on {DateTime.Now} yielded the following result: {nl}"
            };
            // set at zero so that we can determine the total quantity of errors
            validator.MaxNumberOfErrors = 0;
            // Validate once and reuse the result for both the details and the counts.
            List<ValidationErrorInfo> errors = validator.Validate(wDoc).ToList();
            int count = 0;
            foreach (ValidationErrorInfo error in errors)
            {
                if (count >= maxerrors)
                    break;
                count += 1;
                er.Add($"Error {count}{nl}" + $"Description {error.Description}{nl}" + $"ErrorType: {error.ErrorType}{nl}" + $"Node {error.Node}{nl}" + $"Name {error.Node.LocalName}{nl}" + $"Path {error.Path.XPath}{nl}" + $"Part: {error.Part.Uri}{nl}" + $"-------------------------------------------{nl}" + $"Outer XML: {error.Node.OuterXml}{nl}" + $"-------------------------------------------{nl}");
            }
            if (errors.Count > maxerrors)
            {
                er.Add($"Returned {count} as this is the Maximum Number set by the system. The actual number of errors in {filepath.Name} is {errors.Count}");
                er.Add("A summary list of all error types encountered is given below");
                er.AddRange(errors.Select(e => e.Description).Distinct());
            }
            else if (errors.Count > 0)
            {
                er.Add($"Returned all {errors.Count} errors in {filepath.Name}");
            }
            else
            {
                er.Add($"No Errors found in document {filepath.Name}");
            }
            return er;
        }
    }
    catch (Exception)
    {
        // The original called a project-specific error helper here; log the exception as appropriate.
        return null;
    }
}
It helps greatly with problem solving any potential issues.

Exporting to excel export converts special characters to HTML codes

I need to export Date, Title and Description to an Excel file, and right now I am facing two issues with the export.
The first is that special characters such as ' and , turn into other characters such as ‘ and &.
All these issues are with the Description column, which stores text in HTML format. Below is an example of the text in its various forms.
Actual Text
The ‘ Golf Season Opening’ marked the official opening of the at Golf Club, Season to start on March 10, 2018.
Text Stored in Database MS SQL SERVER
The ‘Golf Season Opening ‘ marked the official opening of the at Golf Club& Season to start on March 10& 2018.
Text exported to Excel
The ‘Golf Season Opening ‘ marked the official opening of the at Golf Club& Season to start on March 10& 2018.
I am using the code below to create the Excel file, but I am facing the issue described above.
How can I store the text in Excel as plain text without it being encoded, so that all special characters show properly?
var wb = new XLWorkbook();
var ws = wb.Worksheets.Add("Calendar");
DataTable dt = ds.Tables[0];
var rowIndex = 2; // 1 = header row
foreach (DataRow row in dt.Rows)
{
ws.Cell("A" + rowIndex).Value = row["Year"];
ws.Cell("B" + rowIndex).Value = row["Title"];
string noHTML = Regex.Replace(row["Description"].ToString(), @"<[^>]+>|&nbsp;", "").Trim();
string noHTMLNormalised = Regex.Replace(noHTML, @"\s{2,}", " ");
ws.Cell("C" + rowIndex).Value = noHTMLNormalised;
rowIndex++;
}
//// From worksheet
var rngTable = ws.Range("A1:C" + rowIndex);
var rngHeader = ws.Range("A1:C1");
var rngYear = ws.Range("A2:A" + rowIndex);
//var rngDate = ws.Range("B2:B" + rowIndex);
var rngTitle = ws.Range("B2:D" + rowIndex);
var rngDesc = ws.Range("C2:C" + rowIndex);
rngHeader.Style.Fill.SetBackgroundColor(XLColor.CoolGrey);
rngHeader.Style.Alignment.Horizontal = XLAlignmentHorizontalValues.Center;
rngHeader.Style.Font.Bold = true;
rngHeader.Style.Font.FontColor = XLColor.White;
// rngYear.Style.Fill.SetBackgroundColor(XLColor.CoolGrey);
rngYear.Style.Font.Bold = true;
rngYear.Style.Font.FontColor = XLColor.Black;
rngYear.Style.Alignment.Indent = 1;
//rngDate.Style.DateFormat.Format = "MM/DD/YYYY";
//rngDate.Style.Alignment.Indent = 10;
rngDesc.Style.Alignment.SetWrapText();
ws.RangeUsed().Style.Border.OutsideBorder = XLBorderStyleValues.Thick;
var col3 = ws.Column("C");
//col3.Style.Fill.BackgroundColor = XLColor.Red;
col3.Width = 100;
ws.Columns().AdjustToContents();
string fileName;
fileName = "Golf_Calendat.xlsx";
wb.SaveAs(HttpContext.Current.Server.MapPath("../excel/" + fileName));
Any help fixing the above issue would be appreciated. The second issue is wrapping the text in the Description column so that each row takes its height automatically from the wrapped text.
Just to mention, I have `using Excel = Microsoft.Office.Interop.Excel;` for the Excel export.
You can replace the entities in the string:
str.replace(/&amp;/g, "&").replace(/&lt;/g, "<").replace(/&gt;/g, ">");
Solved both issues with the following code.
First, by decoding the HTML with HttpUtility.HtmlDecode:
string htmlDec = HttpUtility.HtmlDecode(row["Description"].ToString());
and the text wrap issue with ws.Column(2).AdjustToContents(5, 7);
The full code for the Description column:
string htmlEnc = HttpUtility.HtmlEncode(row["Description"].ToString());
string htmlDec = HttpUtility.HtmlDecode(row["Description"].ToString());
string noHTML = Regex.Replace(htmlDec, @"<[^>]+>|&nbsp;", "").Trim();
string noHTMLNormalised = Regex.Replace(noHTML, @"\s{2,}", " ");
ws.Cell("C" + rowIndex).Value = noHTMLNormalised;
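The decode, strip and normalise steps can be wrapped in one helper. This is a minimal sketch rather than the code used in the thread. It swaps in System.Net.WebUtility.HtmlDecode so that no reference to System.Web is needed, and the class and method names are made up for the example.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlText
{
    // Turns stored HTML into plain text suitable for a worksheet cell.
    public static string ToPlainText(string html)
    {
        if (string.IsNullOrEmpty(html))
            return string.Empty;

        // 1. Decode entities such as &lsquo; or &#44; into real characters.
        string decoded = WebUtility.HtmlDecode(html);

        // 2. Remove anything that looks like an HTML tag.
        string withoutTags = Regex.Replace(decoded, @"<[^>]+>", "");

        // 3. Collapse runs of whitespace (including decoded &nbsp;) into one space.
        return Regex.Replace(withoutTags, @"\s+", " ").Trim();
    }
}
```

The returned string can then be assigned to the cell value in place of noHTMLNormalised.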

Reading From Excel File - Cells with Values Show Null

I have written some code that reads every row in an excel file (for two specific columns) which I will be using later to execute an update SQL Query for each of the rows with a value.
I have displayed these values in a listbox, and I am getting far more nulls than expected when comparing with the stock codes in the excel file.
I have tried changing the formatting of the excel file, but this did not make any difference. There are rows where there definitely are stock codes at that position, but when the program does the cell comparison the program identifies them as nulls when they actually have values.
Does anyone know what the problem is with my code?
private void btnStockCodes_Click(object sender, RoutedEventArgs e)
{
string file = @"\\amn-fs-01\users$\Shanel\Desktop\Stock Codes.xlsx";
Microsoft.Office.Interop.Excel.Application ExcelApp = new Microsoft.Office.Interop.Excel.Application();
Workbook ExcelWorkbook = ExcelApp.Workbooks.Open(file);
Worksheet ews = ExcelApp.ActiveWorkbook.Sheets[1];
Microsoft.Office.Interop.Excel.Range usedRange = ews.UsedRange;
int TotalCounter = 0;
string StockCode = "";
string ReserveID = "";
int nullcounter = 0;
int foundcounter = 0;
foreach (Microsoft.Office.Interop.Excel.Range row in usedRange.Rows)
{
StockCode = "";
ReserveID = "";
TotalCounter = TotalCounter + 1;
if (row.Cells[TotalCounter,7].Value == null)
{
Listbox1.Items.Add(TotalCounter + " null");
nullcounter = nullcounter + 1;
}
else
{
StockCode = row.Cells[TotalCounter,7].Value.ToString();
ReserveID = row.Cells[TotalCounter, 3].Value.ToString();
Listbox1.Items.Add(TotalCounter + " " + StockCode + " " + ReserveID);
foundcounter = foundcounter + 1;
}
}
txtTotal1.Text = foundcounter.ToString() + " Found";
txtTotal2.Text = nullcounter.ToString() + " Null Values";
txtTotal3.Text = TotalCounter.ToString() + " Total Records";
}
I would not trust Worksheet.UsedRange to always work correctly; sometimes it contains more cells than it should, or fewer. My suggestion is to read the worksheet row by row for as long as there are values, and stop once there are no more.
And if you have too many rows, you can read all values into an array at once and work with the array.
Thanks for your contributions, I have resolved the error!
It occurs in row.Cells[TotalCounter,7].Value.ToString()
It should have been row.Cells[7].Value.ToString()
There was no need for me to specify a row index, as that is taken care of by the foreach loop. I will look into alternative ways of writing the code, as Worksheet.UsedRange might not work in all cases, as Alex suggested.
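The array approach mentioned in the first answer could look roughly like the sketch below. It assumes the values have already been read in one call, for example with (object[,])usedRange.Value2, which for a multi-cell range yields a two-dimensional array whose indices start at 1 and are relative to the range. The helper and its names are hypothetical, and it contains no Interop calls, so it can be tried without Excel.

```csharp
using System;
using System.Collections.Generic;

public static class SheetValues
{
    // Returns (row index, stock code, reserve id) for every row whose
    // stock-code cell is not empty. Column indices use the array's own
    // indexing, which is 1-based for arrays returned by Excel.
    public static List<Tuple<int, string, string>> ReadStockRows(
        object[,] values, int stockCodeColumn, int reserveIdColumn)
    {
        var result = new List<Tuple<int, string, string>>();
        int firstRow = values.GetLowerBound(0);
        int lastRow = values.GetUpperBound(0);

        for (int row = firstRow; row <= lastRow; row++)
        {
            object stockCode = values[row, stockCodeColumn];
            if (stockCode == null)
                continue; // empty cell: nothing to update for this row

            object reserveId = values[row, reserveIdColumn];
            result.Add(Tuple.Create(
                row,
                stockCode.ToString(),
                reserveId == null ? "" : reserveId.ToString()));
        }
        return result;
    }
}
```

Each row is addressed exactly once by the loop variable, so the row offset cannot be applied twice as it was in row.Cells[TotalCounter,7].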

OpenXml DataValidation set predefined List for columns

I am using OpenXml to create an Excel file and export table data. One of the scenarios is that I want a column to have a dropdown of predefined values, such as true and false. I followed this question and wrote the code below.
DataValidation dataValidation = new DataValidation
{
Type = DataValidationValues.List,
AllowBlank = true,
SequenceOfReferences = new ListValue<StringValue>() { InnerText = "B1" },
//Formula1 = new Formula1("'SheetName'!$A$1:$A$3") // this was used in mentioned question
Formula1 = new Formula1("True,False") // I need predefined values.
};
DataValidations dvs = worksheet.GetFirstChild<DataValidations>(); //worksheet type => Worksheet
if (dvs != null)
{
dvs.Count = dvs.Count + 1;
dvs.Append(dataValidation);
}
else
{
DataValidations newDVs = new DataValidations();
newDVs.Append(dataValidation);
newDVs.Count = 1;
worksheet.Append(newDVs);
}
If I use it with a sheet name and a cell range, it works fine, but if I pass a string of values, Excel reports "Unreadable content found" and removes the data validation node.
How can I put the values for the list dropdown validation in the formula itself? The XML that Excel creates for manually added list values (entered in the Excel application) is <formula1>"One,Two"</formula1> (observed in the XML of the Excel file).
Okay, I got this solved. I added escaped double quotes to the formula and that was it.
DataValidation dataValidation = new DataValidation
{
Type = DataValidationValues.List,
AllowBlank = true,
SequenceOfReferences = new ListValue<StringValue>() { InnerText = "B1" },
Formula1 = new Formula1("\"True,False\"") // escape double quotes, this is what I was missing
};
DataValidations dvs = worksheet.GetFirstChild<DataValidations>(); //worksheet type => Worksheet
if (dvs != null)
{
dvs.Count = dvs.Count + 1;
dvs.Append(dataValidation);
}
else
{
DataValidations newDVs = new DataValidations();
newDVs.Append(dataValidation);
newDVs.Count = 1;
worksheet.Append(newDVs);
}
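When the list comes from data rather than a literal, the quoting can be done by a small helper. This is a sketch under stated assumptions: the helper name is made up, and rejecting items that contain commas or double quotes is a simplification chosen for this example, since the comma is the separator in an inline list.

```csharp
using System;
using System.Linq;

public static class ValidationFormula
{
    // Builds the text for an inline list formula, e.g. "True,False"
    // including the surrounding double quotes.
    public static string BuildInlineList(params string[] items)
    {
        if (items == null || items.Length == 0)
            throw new ArgumentException("At least one item is required.", nameof(items));

        if (items.Any(i => string.IsNullOrEmpty(i) || i.Contains(",") || i.Contains("\"")))
            throw new ArgumentException(
                "Items must be non-empty and must not contain commas or double quotes.",
                nameof(items));

        return "\"" + string.Join(",", items) + "\"";
    }
}
```

The result can be passed straight to the constructor, as in new Formula1(ValidationFormula.BuildInlineList("True", "False")).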

Dynamically add range values to excel chart with C#

I am trying to generate a chart for a PowerPoint slide using C#/.NET. The chart works perfectly when I hard-code the data, so my goal here is to populate the Excel backend from my application's DataTable. What I need help with is defining the data ranges (see below).
var areaworkbook = (EXCEL.Workbook)areachart.ChartData.Workbook;
areaworkbook.Windows.Application.Visible = false;
var dataSheet2 = (EXCEL.Worksheet)areaworkbook.Worksheets[1];
var sc2 = areachart.SeriesCollection();
dataSheet1.Cells.Range["A2"].Value2 = "Name 1";
dataSheet1.Cells.Range["A3"].Value2 = "Name 2";
dataSheet1.Cells.Range["A4"].Value2 = "Name 3";
dataSheet1.Cells.Range["A5"].Value2 = "Name 4";
dataSheet1.Cells.Range["B2"].Value2 = Value 1;
dataSheet1.Cells.Range["B3"].Value2 = value 2;
dataSheet1.Cells.Range["B4"].Value2 = value 3;
dataSheet1.Cells.Range["B5"].Value2 = value 4 ;
var series2 = sc2.NewSeries();
series2.Name = "Series 2";
series2.XValues = "'Sheet1'!$A$2:$A$5";
series2.Values = "'Sheet1'!$C$2:$C$5";
series2.ChartType = Office.XlChartType.xlAreaStacked;
areachart.HasTitle = true;
areachart.ChartTitle.Font.Bold = true;
areachart.ChartTitle.Font.Italic = true;
areachart.ApplyLayout(4);
areachart.Refresh();
How can I dynamically add A6, A7, A8... until my DataTable is exhausted?
Just use a loop and calculate the cell address. For the sake of argument, I'm going to assume the data is coming from a LINQ query, though you could get it any other way.
int row = 2; // You expect to start here
foreach (var data in db.MyData().Where(... whatever you need here ...))
{
dataSheet1.Cells.Range["A" + row].Value2 = data.Name;
dataSheet1.Cells.Range["B" + row].Value2 = data.Value;
row++;
}
// After the loop, row is one past the last row written.
int lastRow = row - 1;
series2.XValues = "'Sheet1'!$A$2:$A$" + lastRow;
series2.Values = "'Sheet1'!$C$2:$C$" + lastRow;
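The end of such a range is easy to get wrong by one row, because the last data row is the first row plus the row count minus one. A small helper, sketched here with hypothetical names and assuming the sheet name contains no apostrophes, keeps that arithmetic in one place.

```csharp
using System;

public static class ChartRange
{
    // Builds an absolute single-column reference such as 'Sheet1'!$A$2:$A$5
    // for rowCount rows starting at firstRow.
    public static string ColumnRange(string sheetName, string column, int firstRow, int rowCount)
    {
        if (rowCount < 1)
            throw new ArgumentOutOfRangeException(nameof(rowCount));

        int lastRow = firstRow + rowCount - 1;
        return "'" + sheetName + "'!$" + column + "$" + firstRow
             + ":$" + column + "$" + lastRow;
    }
}
```

For four data rows starting at row 2, ColumnRange("Sheet1", "A", 2, 4) gives the same reference as the hard-coded XValues in the question.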
