Converting an XLSX file to a CSV file - C#

I need to convert an XLSX file to another CSV file.
I've done a lot of research on how to do this process, but I did not find anything that suited me.
The only thing I found is this GitHub Gist, Convert an Epplus ExcelPackage to a CSV file,
which returns a byte array. But apparently it does not work any more.
I'm trying to load the array using LoadFromCollection:
FileInfo novoArquivoCSV = new FileInfo(fbd.SelectedPath);
var fileInfoCSV = new FileInfo(novoArquivo + "\\" + nameFile.ToString() + ".csv");
using (var csv = new ExcelPackage(fileInfoCSV))
{
csv.Workbook.Worksheets.Add(nameFile.ToString());
var worksheetCSV = csv.Workbook.Worksheets[1];
worksheetCSV.Cells.LoadFromCollection(xlsx.ConvertToCsv());
}

The code you linked to reads an XLSX sheet and returns the CSV data as a byte buffer through a memory stream.
You can write directly to a file instead if you remove the memory stream and pass the path of the target file to ConvertToCsv:
public static void ConvertToCsv(this ExcelPackage package, string targetFile)
{
var worksheet = package.Workbook.Worksheets[1];
var maxColumnNumber = worksheet.Dimension.End.Column;
var currentRow = new List<string>(maxColumnNumber);
var totalRowCount = worksheet.Dimension.End.Row;
var currentRowNum = 1;
//No need for a memory buffer, writing directly to a file
//var memory = new MemoryStream();
using (var writer = new StreamWriter(targetFile,false, Encoding.UTF8))
{
//the rest of the code remains the same
}
// No buffer returned
//return memory.ToArray();
}
Encoding.UTF8 ensures the file will be written as UTF-8 with a Byte Order Mark, which lets programs recognize the file as UTF-8 instead of ASCII. Otherwise, a program could read the file as ASCII and choke on the first non-ASCII character it encounters.
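As a quick check of the Byte Order Mark behaviour described above, here is a minimal sketch. The class and method names (BomDemo, WriteAndReadPrefix) are made up for illustration and are not part of the linked Gist. It writes a short text with Encoding.UTF8 and returns the first three bytes of the file, which should be the UTF-8 BOM EF BB BF.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomDemo
{
    // Hypothetical helper: writes 'text' to 'path' using Encoding.UTF8
    // (which emits a BOM) and returns up to the first three bytes of the file.
    public static byte[] WriteAndReadPrefix(string path, string text)
    {
        using (var writer = new StreamWriter(path, false, Encoding.UTF8))
        {
            writer.Write(text);
        }

        byte[] all = File.ReadAllBytes(path);
        byte[] prefix = new byte[Math.Min(3, all.Length)];
        Array.Copy(all, prefix, prefix.Length);
        return prefix;
    }
}
```

If a consumer of the CSV cannot handle a BOM, passing new UTF8Encoding(false) instead of Encoding.UTF8 writes UTF-8 without it.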

Check out the SaveAs() method of the Excel Workbook object.
wbWorkbook.SaveAs(@"c:\yourdesiredFilename.csv", Microsoft.Office.Interop.Excel.XlFileFormat.xlCSV);
Or the following:
public static void SaveAs()
{
Microsoft.Office.Interop.Excel.Application app = new Microsoft.Office.Interop.Excel.ApplicationClass();
Microsoft.Office.Interop.Excel.Workbook wbWorkbook = app.Workbooks.Add(Type.Missing);
Microsoft.Office.Interop.Excel.Sheets wsSheet = wbWorkbook.Worksheets;
Microsoft.Office.Interop.Excel.Worksheet CurSheet = (Microsoft.Office.Interop.Excel.Worksheet)wsSheet[1];
Microsoft.Office.Interop.Excel.Range thisCell = (Microsoft.Office.Interop.Excel.Range)CurSheet.Cells[1, 1];
thisCell.Value2 = "This is a test.";
wbWorkbook.SaveAs(@"c:\one.xls", Microsoft.Office.Interop.Excel.XlFileFormat.xlWorkbookNormal, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlShared, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);
wbWorkbook.SaveAs(@"c:\two.csv", Microsoft.Office.Interop.Excel.XlFileFormat.xlCSVWindows, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlShared, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);
wbWorkbook.Close(false, "", true);
}
Is there any simple way to convert .xls file to .csv file? (Excel)
There are several other resources online that can help with this kind of thing. Actually, for something generic like this, you should always Google for a solution and try to figure it out yourself. That's the best way to learn how to do technical things. If you get stuck, or if you have a very specific question, this site is a great place to post your question(s). It seems to me you probably started here and didn't do any preliminary work yourself.

Related

C# Programmatically change cell format of all Excel cells to General

As part of an ETL process I am importing data from a variety of different Excel files into a database. Before this happens I need to change the cell format of all cells in an Excel worksheet to the "General" format.
I have made a start, but I'm afraid I don't know how to progress after this:
using Excel = Microsoft.Office.Interop.Excel;
.
.
.
String FilePath = "Code to get file location from database";
String SheetName = "Code to get SheetName from database";
Excel.Application MyApp = new Excel.Application();
MyApp.Visible = false;
Excel.Workbook myWorkbook = MyApp.Workbooks.Open(FilePath,Type.Missing, Type.Missing, Type.Missing, Type.Missing,Type.Missing, Type.Missing, Type.Missing, Type.Missing,Type.Missing, Type.Missing, Type.Missing, Type.Missing,Type.Missing, Type.Missing);
//Code here to convert all rows to data type of general and then save
MyApp.Workbooks.Close();
Any help on this would be greatly appreciated.
You can use the Range.NumberFormat property:
var myWorksheet = (Excel.Worksheet)myWorkbook.Worksheets[1];
myWorksheet.Cells.NumberFormat = "General";
Please note that this may cause problems if your sheet contains date values: with the General format, Excel displays a date cell as its underlying serial number.
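To illustrate the date caveat, here is a small sketch that needs no Excel installation. It relies on the fact that, for dates from 1 March 1900 onwards, Excel's date serial numbers coincide with .NET OLE Automation dates. The class name SerialDateDemo is hypothetical.

```csharp
using System;

public static class SerialDateDemo
{
    // The number a date cell would show once its format is "General"
    // (valid for dates on or after 1 March 1900).
    public static double ToSerial(DateTime date)
    {
        return date.ToOADate();
    }

    // Converts a serial number read from a "General" cell back to a date.
    public static DateTime FromSerial(double serial)
    {
        return DateTime.FromOADate(serial);
    }
}
```

For example, SerialDateDemo.ToSerial(new DateTime(2010, 2, 6)) returns 40215, so an import that expects dates has to convert such numbers back, for instance with FromSerial.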

Excel Interop File Accessed As Read Only Despite Specifically Being Directed Not To

I am having an issue while trying to use Excel Interop with C#.NET.
I'm trying to iterate through a large number of workbooks/worksheets and deposit an array of data in a range on each worksheet. As I iterate through the sheets, I open the workbook, grab my range, set the range number formatting, drop my values into the range, then save, close out, and clean up all my objects.
This all works fine for a while. However, seemingly at random during the process the file will open in read only mode, which causes the SaveAs to halt and bring up a dialog (when this should be running in background for the entire time).
Some factoids:
In my Workbooks.Open() statement I am setting ReadOnly to false, IgnoreReadOnlyRecommended to true, and Notify to false. This should not only prevent the sheet from opening in read-only mode, it should also throw an error if the file can't be opened in read/write mode. This does not seem to work for some reason, as the random read-only access still happens.
I am using a sleep command in HoldWhileFileIsOpen(string) to halt the program and wait for the excel file to be completely available both before and after the excel file is used (it halts until the file is not open using FileStream theFile = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.None); as a test, this should prevent my program from opening this file until it is 100% unused/available due to the FileShare.None, but I am still getting these read only errors when saving the excel file).
I am using SaveAs to overwrite the original file with the changed version.
I am closing all COM objects and releasing them, the excel application fully exits each iteration and the task manager no longer shows excel as running each time.
I am not opening these spreadsheets or Excel itself anywhere, nor am I touching these files from anywhere besides the code snippet included below.
I want to stress that these are seemingly random, I can iterate through and save the same spreadsheet 50+ times, move on to the next spreadsheet, iterate 30 times, then hit this error. Then the next run I will only be able to iterate through the first spreadsheet 10 times before it happens again. It is not associated with a specific workbook or worksheet.
My code:
if (newRows.Count > 0)
{
//Iterate through our array list and build a 2D string array from that list (Range.Value doesn't like ArrayList objects, so we must use a raw 2D array)...
string[] rowSample = (string[])newRows[0];
string[,] newRowsArray = new string[newRows.Count, rowSample.Length];
int currentRowIndex = 0;
foreach (string[] row in newRows)
{
int columnIndex = 0;
foreach (string columnText in row)
{
newRowsArray[currentRowIndex, columnIndex] = columnText;
columnIndex++;
}
currentRowIndex++;
}
//Hold processing while the current excel file is open, once it closes we can continue...
HoldWhileFileIsOpen(currentExcelFile);
try
{
//Open excel and open current workbook...
Microsoft.Office.Interop.Excel.Application excelApplication = new Microsoft.Office.Interop.Excel.Application();
excelApplication.DisplayAlerts = false;
excelApplication.Visible = false;
Microsoft.Office.Interop.Excel.Workbooks excelWorkbooks = excelApplication.Workbooks;
Microsoft.Office.Interop.Excel.Workbook currentWorkbook = excelWorkbooks.Open(currentExcelFile, Type.Missing, false, Type.Missing, Type.Missing, Type.Missing, true, Type.Missing, Type.Missing, Type.Missing, false, Type.Missing, false, Type.Missing, Type.Missing);
//Open up/Select our current worksheet
Microsoft.Office.Interop.Excel.Sheets workSheets = currentWorkbook.Sheets;
Microsoft.Office.Interop.Excel.Worksheet currentWorkSheet = workSheets[currentWorkSheetName];
currentWorkSheet.Select();
//Create new range, set range number formatting = "@" (text) to retain leading zeroes and other tricky bits in our output, then drop values into range...
Microsoft.Office.Interop.Excel.Range newExcelRows = (Microsoft.Office.Interop.Excel.Range)currentWorkSheet.get_Range(GetColumnAddress(startingColInt) + startingRow, GetColumnAddress(startingColInt + rowSample.Length - 1) + (startingRowIncrementing - 1));
newExcelRows.NumberFormat = "@";
//newExcelRow.Interior.Color = System.Drawing.ColorTranslator.ToOle(System.Drawing.Color.Red);
newExcelRows.Value = newRowsArray;
//Save workbook changes...
currentWorkbook.SaveAs(currentExcelFile, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlNoChange, Microsoft.Office.Interop.Excel.XlSaveConflictResolution.xlLocalSessionChanges, Type.Missing, Type.Missing, Type.Missing, Type.Missing);
//Close things up, all COM object references above must be released and scrubbed thoroughly...
System.Runtime.InteropServices.Marshal.ReleaseComObject(newExcelRows);
newExcelRows = null;
excelApplication.ActiveWorkbook.Close(Type.Missing, Type.Missing, Type.Missing);
System.Runtime.InteropServices.Marshal.ReleaseComObject(currentWorkSheet);
currentWorkSheet = null;
System.Runtime.InteropServices.Marshal.ReleaseComObject(workSheets);
workSheets = null;
System.Runtime.InteropServices.Marshal.ReleaseComObject(currentWorkbook);
currentWorkbook = null;
//Quit excel...
excelWorkbooks.Close();
System.Runtime.InteropServices.Marshal.ReleaseComObject(excelWorkbooks);
excelWorkbooks = null;
excelApplication.Quit();
System.Runtime.InteropServices.Marshal.ReleaseComObject(excelApplication);
excelApplication = null;
}
catch (Exception ex)
{
if (ex.Message == "Cannot save as that name. Document was opened as read-only.")
{
}
}
//Hold processing while the current excel file is open, once it closes we can continue...
HoldWhileFileIsOpen(currentExcelFile);
LblStatus.AppendText(" - Done modifying and saving worksheet " + currentWorkSheetName + " (" + currentCountry + ")." + Environment.NewLine);
}
else
{
LblStatus.AppendText(" - No data found for current worksheet " + currentWorkSheetName + " (" + currentCountry + ")." + Environment.NewLine);
}
Any assistance would be a great relief, I am not sure what is going on here or why Interop is randomly choosing to screw me during processing of these files.
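For reference, the kind of availability check described in the question can be sketched as follows. This is not the asker's HoldWhileFileIsOpen, whose body is not shown. The names FileProbe, CanOpenExclusively and WaitUntilAvailable are hypothetical. One deliberate difference from the description: the probe opens the file with FileAccess.ReadWrite rather than FileAccess.Read, so it also fails when the file cannot be written, for example because its read-only attribute is set. Whether that is the cause of the random read-only opens is an open question. Note too that any such probe is only a snapshot, since another process can grab the file between the check and the Workbooks.Open call.

```csharp
using System;
using System.IO;
using System.Threading;

public static class FileProbe
{
    // Returns true only if the file can be opened for reading and writing
    // with no sharing, i.e. nobody else has it open and it is writable.
    public static bool CanOpenExclusively(string path)
    {
        try
        {
            using (new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
            {
                return true;
            }
        }
        catch (IOException)
        {
            // In use by another handle, or missing (FileNotFoundException).
            return false;
        }
        catch (UnauthorizedAccessException)
        {
            // Read-only attribute set, or insufficient permissions.
            return false;
        }
    }

    // Polls until the file becomes available or the timeout elapses.
    public static bool WaitUntilAvailable(string path, TimeSpan timeout)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        while (DateTime.UtcNow < deadline)
        {
            if (CanOpenExclusively(path)) return true;
            Thread.Sleep(100);
        }
        return CanOpenExclusively(path);
    }
}
```

A bounded wait such as FileProbe.WaitUntilAvailable(currentExcelFile, TimeSpan.FromSeconds(30)) lets the caller log and skip a file instead of blocking forever.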

Excel Instance won't close after Interop Operations

I'm building a new Excel workbook in C# by combining the first sheet of a series of different Excel workbooks; subsequently I export the new workbook to PDF. I made this work, but there is always one Excel instance still running at the end of the method. I had the same issue discussed here with a simpler setup and fewer Excel objects, and could solve it there with the GC.Collect command. Now, none of this is working.
public void CombineWorkBooks()
{
Microsoft.Office.Interop.Excel.Application xlApp = new Microsoft.Office.Interop.Excel.Application();
xlApp.DisplayAlerts = false;
xlApp.Visible = false;
Workbooks newBooks = null;
Workbook newBook = null;
Sheets newBookWorksheets = null;
Worksheet defaultWorksheet = null;
// Create a new workbook, which comes with an empty default worksheet
newBooks = xlApp.Workbooks;
newBook = newBooks.Add(XlWBATemplate.xlWBATWorksheet);
newBookWorksheets = newBook.Worksheets;
// get the reference for the empty default worksheet
if (newBookWorksheets.Count > 0)
{
defaultWorksheet = newBookWorksheets[1] as Worksheet;
}
// loop through every row in the GridView and get the path to each Workbook
foreach (GridViewRow row in CertificadosPresion.Rows)
{
string path = row.Cells[0].Text;
string CertName = CertificadosPresion.DataKeys[row.RowIndex].Value.ToString();
Workbook childBook = null;
Sheets childSheets = null;
// Excel of each line in Gridview
childBook = newBooks.Open(path,Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);
childSheets = childBook.Worksheets;
if (childSheets != null)
{
// Build a new Worksheet
Worksheet sheetToCopy = null;
// Only first Worksheet of the Workbook belonging to that line
sheetToCopy = childSheets[1] as Worksheet;
if (sheetToCopy != null)
{
// Assign the Certificate Name to the new Worksheet
sheetToCopy.Name = CertName;
// set PageSetup for the new Worksheet to be copied
sheetToCopy.PageSetup.Zoom = false;
sheetToCopy.PageSetup.FitToPagesWide = 1;
sheetToCopy.PageSetup.FitToPagesTall = 1;
sheetToCopy.PageSetup.PaperSize = Microsoft.Office.Interop.Excel.XlPaperSize.xlPaperA4;
// Copy that new Worksheet to the defaultWorksheet
sheetToCopy.Copy(defaultWorksheet, Type.Missing);
}
System.Runtime.InteropServices.Marshal.ReleaseComObject(sheetToCopy);
childBook.Close(false, Type.Missing, Type.Missing);
}
System.Runtime.InteropServices.Marshal.ReleaseComObject(childSheets);
System.Runtime.InteropServices.Marshal.ReleaseComObject(childBook);
}
//Delete the empty default worksheet
if (defaultWorksheet != null) defaultWorksheet.Delete();
//Export to PDF
newBook.ExportAsFixedFormat(Microsoft.Office.Interop.Excel.XlFixedFormatType.xlTypePDF, @"C:\pdf\" + SALESID.Text + "_CertPres.pdf", 0, false, true);
newBook.Close();
newBooks.Close();
xlApp.DisplayAlerts = true;
DownloadFile(SALESID.Text);
System.Runtime.InteropServices.Marshal.ReleaseComObject(defaultWorksheet);
System.Runtime.InteropServices.Marshal.ReleaseComObject(newBookWorksheets);
System.Runtime.InteropServices.Marshal.ReleaseComObject(newBook);
System.Runtime.InteropServices.Marshal.ReleaseComObject(newBooks);
xlApp.Quit();
System.Runtime.InteropServices.Marshal.ReleaseComObject(xlApp);
GC.Collect();
GC.WaitForPendingFinalizers();
}
protected void DownloadFile(string Salesid)
{
string path = @"c:\pdf\" + Salesid + "_CertPres.pdf";
byte[] bts = System.IO.File.ReadAllBytes(path);
Response.Clear();
Response.ClearHeaders();
Response.AddHeader("Content-Type", "Application/octet-stream");
Response.AddHeader("Content-Length", bts.Length.ToString());
Response.AddHeader("Content-Disposition", "attachment; filename=" + Salesid + "_CertPres.pdf");
Response.BinaryWrite(bts);
Response.Flush();
Response.End();
}
The problem must have been related to the call of the DownloadFile method. I eliminated that call, and the Excel process was properly closed. Some of these operations must have kept a reference to one of the COM objects open, so that they could not be closed. By calling DownloadFile at the very end, after the GC.Collect calls, the problem is solved. (I'm not quite sure why.)
In your method DownloadFile, you call
Response.End()
HttpResponse.End throws an exception (emphasis mine):
To mimic the behavior of the End method in ASP, this method tries to raise a ThreadAbortException exception. If this attempt is successful, the calling thread will be aborted, [...]
This exception aborts your thread. Thus, all your ReleaseComObject, Excel.Quit, GC.Collect stuff is never executed.
The solution: Don't call Response.End. You probably don't need it. If you need it, you might want to consider the alternative mentioned in the documentation instead:
This method is provided only for compatibility with ASP—that is, for compatibility with COM-based Web-programming technology that preceded ASP.NET. If you want to jump ahead to the EndRequest event and send a response to the client, it is usually preferable to call CompleteRequest instead.
[...]
The CompleteRequest method does not raise an exception, and code after the call to the CompleteRequest method might be executed
PS: Using Excel automation from a web application is not officially supported by Microsoft. For future development, you might want to consider using a third-party Excel library instead.
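The effect described in this answer can be reproduced without ASP.NET. The sketch below uses a plain InvalidOperationException as a stand-in for the exception raised by Response.End (an assumption made only so the example runs anywhere; a real ThreadAbortException has additional special behaviour). It shows that cleanup placed after a throwing call is skipped, while cleanup placed in a finally block still runs. All names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class CleanupDemo
{
    // Stand-in for a call that never returns normally, like Response.End().
    private static void ThrowingCall()
    {
        throw new InvalidOperationException("simulated abort");
    }

    // Cleanup written after the throwing call: it is never reached.
    public static List<string> WithoutFinally()
    {
        var log = new List<string>();
        try
        {
            log.Add("work");
            ThrowingCall();
            log.Add("cleanup");
        }
        catch (InvalidOperationException)
        {
            log.Add("aborted");
        }
        return log;
    }

    // Cleanup written in a finally block: it runs before the exception propagates.
    public static List<string> WithFinally()
    {
        var log = new List<string>();
        try
        {
            try
            {
                log.Add("work");
                ThrowingCall();
            }
            finally
            {
                log.Add("cleanup");
            }
        }
        catch (InvalidOperationException)
        {
            log.Add("aborted");
        }
        return log;
    }
}
```

Applied to the question, this suggests either calling DownloadFile only after all the ReleaseComObject, Quit and GC.Collect calls, or moving that cleanup into a finally block.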
I found that sometimes the only thing that helps is the "sledgehammer method". Killing all running excel instances:
foreach (Process p in Process.GetProcessesByName("EXCEL"))
{
try
{
p.Kill();
p.WaitForExit();
}
catch
{
//Handle exception here
}
}
It looks to me like you have a reference that was not cleaned up. Probably something like the 'two dot rule' problem, which in my opinion is a silly rule: you can't code anything decent with it because it's too difficult to keep track of.
You could try calling Marshal.ReleaseComObject on your COM references, but you are still asking for trouble...
My suggestion would be to try using VSTO to automate Excel. This will clear your references correctly on your behalf.
https://social.msdn.microsoft.com/Forums/vstudio/en-US/a12add6b-99ea-4677-8245-cd667101683e/vsto-and-office-objects-disposing

Writing To Excel Files in C#

I have an Excel file. I am able to read a single row from the Excel file cell by cell and store it in an ArrayList.
ExcelRange reads one row at a time and stores it in an ArrayList (arrForValues).
ExcelRange = ExcelWorkSheet.get_Range("A"+rowNumber,columnMaxName+rowNumber );
items = (object[,])ExcelRange.Value2;
for (int i = 1; i <= nColumn; i++)
{
arrForValues.Add(items[1, i]);
}
I want to write the row to another Excel file. There is a condition that needs to be satisfied for a particular row to be selected for writing.
Is there any way I can write the complete ArrayList (a single row) to the Excel file instead of writing cell by cell?
Thanks in advance.
You can write a whole array of objects by using the range's set_Value method.
Here is an example:
class Program
{
static void Main(string[] args)
{
string file = AppDomain.CurrentDomain.BaseDirectory + "SetArrayToExcel.xlsx";
Excel.Application excelApp = new Excel.Application();
excelApp.Visible = true;
Excel.Workbook wb = excelApp.Workbooks.Open(file, Type.Missing, Type.Missing
, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing
, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing
, Type.Missing, Type.Missing);
object[,] testTable = new object[2, 2]{{"6-Feb-10", 0.1}, {"26-Mar-10", 1.2}};
Excel.Worksheet ws = wb.ActiveSheet as Excel.Worksheet;
Excel.Range rng = ws.get_Range("rngSetValue", Type.Missing);
//rng.Value2 = testTable;
rng.set_Value(Type.Missing, testTable);
}
}
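To connect this with the question, the single row held in the ArrayList first has to become a two-dimensional array with one row, because that is the shape set_Value is given in the example above. A minimal sketch of that conversion follows. The helper name RowArray.FromList is hypothetical, and the code needs no Excel reference.

```csharp
using System;
using System.Collections;

public static class RowArray
{
    // Copies a flat list of cell values into a 1 x N object array,
    // the shape used for a single-row range.
    public static object[,] FromList(IList values)
    {
        if (values == null) throw new ArgumentNullException(nameof(values));

        var row = new object[1, values.Count];
        for (int i = 0; i < values.Count; i++)
        {
            row[0, i] = values[i];
        }
        return row;
    }
}
```

Assuming the target range spans one row and as many columns as the list has items, the call would then look like rng.set_Value(Type.Missing, RowArray.FromList(arrForValues)).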
I highly recommend just getting FlexCel. It's fairly cheap and it has methods for copying rows and columns.
EDIT: I see that you mean to copy between workbooks. It's still easier with FlexCel than COM or the Interop stuff.
I do some writing to Excel in some code I have. The only way I have found to do it is with a foreach to iterate through my list of values and using an indexer to keep track of what cell it is going into. You might investigate the Range class and the Cells property:
http://msdn.microsoft.com/en-US/library/microsoft.office.tools.excel.namedrange.cells(v=vs.80).aspx
It didn't work for what I am doing but it might for you.
Another alternative would be to merge the cells in a range, build a string of all the values in your array and set the merged range value equal to that string but that might not be what you want.

Open a Read Only file as non read only and save/overwrite

I have a number of Word documents I am converting. Everything goes well until I get a file that is read-only. In this case I get a Save As prompt.
Is there any way to open the file in read/write format? I should have admin privileges so access isn't an issue.
I'm using VB.NET to open the files. More specifically:
doc = word.Documents.Open(path, Type.Missing, False, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing)
To open a read-only file you need to set that attribute to false:
string path = "C:\\test.txt";
FileInfo info = new FileInfo(path);
info.IsReadOnly = false;
StreamWriter writer = new StreamWriter(path);
writer.WriteLine("This is an example.");
writer.Close();
info.IsReadOnly=true;
This is only an example with a text file, but I'm sure the same approach will work with Word files.
EDIT:
VB.NET equivalent:
Dim path As String = "C:\test.txt"
Dim info As FileInfo = New FileInfo(path)
info.IsReadOnly = False
Dim writer As StreamWriter = New StreamWriter(path)
writer.WriteLine("This is an example.")
writer.Close()
info.IsReadOnly = True
Before you open the file, check its Attributes with a FileInfo class.
If the Attributes property contains FileAttributes.ReadOnly, change it and the file will no longer be read-only.
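The attribute check described above can be sketched like this. The helper name ReadOnlyHelper.ClearReadOnly is hypothetical. It uses File.GetAttributes and File.SetAttributes, which operate on the same attributes that FileInfo.Attributes exposes.

```csharp
using System.IO;

public static class ReadOnlyHelper
{
    // Clears the read-only flag if it is set.
    // Returns true if the flag was set and has been removed.
    public static bool ClearReadOnly(string path)
    {
        FileAttributes attributes = File.GetAttributes(path);
        if ((attributes & FileAttributes.ReadOnly) == 0)
        {
            return false; // already writable, nothing to do
        }

        // Remove only the ReadOnly bit and keep all other attributes.
        File.SetAttributes(path, attributes & ~FileAttributes.ReadOnly);
        return true;
    }
}
```

Calling ReadOnlyHelper.ClearReadOnly(path) before Documents.Open, and remembering the return value, makes it possible to restore the flag after saving, as the earlier answer does with info.IsReadOnly = true.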
