Reading shapes and textboxes in EPPlus 4.5.3 - c#

I am reading an Excel file with EPPlus version 4.5.3, which works using the code below:
FileInfo existingFile = new FileInfo(FilePath);
using (ExcelPackage package = new ExcelPackage(existingFile))
{
//get the first worksheet in the workbook
ExcelWorksheet worksheet = package.Workbook.Worksheets[1];
int colCount = worksheet.Dimension.End.Column; //get Column Count
int rowCount = worksheet.Dimension.End.Row; //get row count
for (int row = 1; row <= rowCount; row++)
{
for (int col = 1; col <= colCount; col++)
{
Console.WriteLine(" Row:" + row + " column:" + col + " Value:" + worksheet.Cells[row, col].Value?.ToString().Trim());
}
}
}
Where I am stuck is with shapes. The Excel file I need to read contains shapes, and these shapes have text inside them that I am trying to read. I have searched for this problem but can't find anything on it.
How can I read this text? The code I have tried so far:
foreach (var drawing in sheet.Drawings)
{
var type = drawing.GetType();
var data = drawing.ToString();
Console.WriteLine("Drawing Type:" + type + " Data: " + data);
}

I had this issue today and figured it out. You have to iterate the worksheet.Drawings collection first to determine whether any of the drawings are shapes. From what I know of Excel VBA, you cannot put text on an image/picture; it has to be a shape. Someone can correct me if I am wrong.
using (ExcelPackage excelPackage = new ExcelPackage(stream))
{
ExcelWorksheet worksheet = excelPackage.Workbook.Worksheets[1];
foreach (ExcelDrawing dw in worksheet.Drawings)
{
// Pattern matching (C# 7+) checks the type and casts in one step,
// instead of comparing type names as strings
if (dw is ExcelShape shape)
{
if (shape.RichText != null && shape.RichText.Count > 0)
{
foreach (ExcelParagraph item in shape.RichText)
{
Console.WriteLine("{0} - Rich Text Line: {1}", dw.Name, item.Text);
}
}
}
else
{ Console.WriteLine("{0} is not a shape, its a {1}", dw.Name, dw.GetType().ToString()); }
}
}
From there it is a short step to modifying the text in the shape (inside the shape branch above):
shape.RichText[0].Text = "Updated Text"; // RichText is indexed from 0; [0] is the first paragraph
Output:
Picture 1 is not a shape, its a OfficeOpenXml.Drawing.ExcelPicture
TextBox 1 - Rich Text Line: Inventory List
TextBox 1 - Rich Text Line: Some Company

Related

how to select data from excel automatically using Microsoft.Office.Interop

I am using
using Microsoft.Office.Core;
using Microsoft.Office.Interop.Excel;
using Microsoft.Office.Interop.PowerPoint;
I have charts in PowerPoint that are backed by data in Excel.
The chart's data selection is fixed at 7 rows. I need the selection to be dynamic: when the Excel data has more or fewer than 7 rows, the chart should select all of them automatically.
At the moment, when I add more than 7 rows, the data exists in Excel but the chart still selects only 7 rows.
foreach (Microsoft.Office.Interop.PowerPoint.Shape textShape in slide.Shapes)
{
if (textShape.HasChart == MsoTriState.msoTrue || textShape.HasSmartArt == MsoTriState.msoTrue)
{
ChartData chartData = textShape.Chart.ChartData;
//textShape.Chart.Legend.te = JsonData.twitter_account_analysis.content.content_type[0].name;
chartData.Activate();
Workbook workbook = chartData.Workbook;
Microsoft.Office.Interop.PowerPoint.ChartArea chartArea = textShape.Chart.ChartArea;
workbook.Application.Visible = false;
//workbook.Application.Calculate();
Worksheet dataSheet = workbook.Worksheets[1];
//dataSheet.TableUpdate();
//System.Threading.Thread.Sleep(50);
int firstcolNumber = 2;
int rowNumber = 2;
// Clearing previous data
//dataSheet.UsedRange.Columns[1, Type.Missing].Clear();
//dataSheet.UsedRange.Columns[2, Type.Missing].Clear();
//Donut chart
for (int i = 0; i <99; i++)
{
if (i < JsonData.twitter_metrics.tweets_over_time.Length)
{
dataSheet.Cells[firstcolNumber + i, rowNumber].Clear();
dataSheet.Cells[firstcolNumber + i, 1].Clear();
dataSheet.Cells[firstcolNumber + i, rowNumber] = JsonData.twitter_metrics.tweets_over_time[i].stats_count.ToString();
dataSheet.Cells[firstcolNumber + i, 1] = JsonData.twitter_metrics.tweets_over_time[i].id;
}
else
{
dataSheet.Cells[firstcolNumber + i, rowNumber].Clear();
dataSheet.Cells[firstcolNumber + i, 1].Clear();
}
}
//dataSheet.get_Range("A1", $"B{JsonData.twitter_metrics.tweets_over_time.Length}").Select();
//Marshal.FinalReleaseComObject(dataSheet);
var yAxis = (Microsoft.Office.Interop.PowerPoint.Axis)textShape.Chart.Axes(Microsoft.Office.Interop.PowerPoint.XlAxisType.xlValue, Microsoft.Office.Interop.PowerPoint.XlAxisGroup.xlPrimary);
var xAxis = (Microsoft.Office.Interop.PowerPoint.Axis)textShape.Chart.Axes(Microsoft.Office.Interop.PowerPoint.XlAxisType.xlCategory, Microsoft.Office.Interop.PowerPoint.XlAxisGroup.xlPrimary);
}
Marshal.ReleaseComObject(textShape);
}
Marshal.ReleaseComObject(slide);
//System.Threading.Thread.Sleep(50);
break;

Unmerge and clear cells in epplus 4.1

I had no luck deleting rows in excel so now I try to clear their content and still get the error:
"Can't delete/overwrite merged cells. A range is partly merged with the another merged range. A57788:J57788".
Columns 1-10 are really merged, but how do I unmerge them?
Here's my code:
cntr = 0;
while (ws.Cells[RowNum + cntr, 1].Value == null || !ws.Cells[RowNum + cntr, 1].Value.ToString().StartsWith("Report generation date"))
{
ws.Cells[RowNum + cntr, 1, RowNum + cntr, 18].Value = "";
ws.Cells[RowNum + cntr, 1, RowNum + cntr, 10].Merge = false;
for (int i = 1; i < 17; i++)
{
ws.Cells[RowNum + cntr, i].Style.Border.BorderAround(OfficeOpenXml.Style.ExcelBorderStyle.None);
ws.Cells[RowNum + cntr, i].Clear();
}
cntr++;
}
//ws.DeleteRow(RowNum, cntr);
The problem is that you cannot unmerge a single cell within a merged range; you have to unmerge the whole range.
To do that, first find the merged range that a cell belongs to, using my solution here:
public string GetMergedRange(ExcelWorksheet worksheet, string cellAddress)
{
ExcelWorksheet.MergeCellsCollection mergedCells = worksheet.MergedCells;
foreach (var merged in mergedCells)
{
ExcelRange range = worksheet.Cells[merged];
ExcelCellAddress cell = new ExcelCellAddress(cellAddress);
if (range.Start.Row<=cell.Row && range.Start.Column <= cell.Column)
{
if (range.End.Row >= cell.Row && range.End.Column >= cell.Column)
{
return merged.ToString();
}
}
}
return "";
}
The second step is unmerging the whole range using:
public void DeleteCell(ExcelWorksheet worksheet, string cellAddress)
{
if (worksheet.Cells[cellAddress].Merge == true)
{
string range = GetMergedRange(worksheet, cellAddress); //get range of merged cells
worksheet.Cells[range].Merge = false; //unmerge range
worksheet.Cells[cellAddress].Clear(); //clear value
}
}
The cost of this approach is that the other cells in the range lose both their merging and their value. To work around that, save the value before unmerging and clearing, then merge the cells you want and write the value back, something like:
public void DeleteCell(ExcelWorksheet worksheet, string cellAddress)
{
if (worksheet.Cells[cellAddress].Merge == true)
{
var value = worksheet.Cells[cellAddress].Value;
string range = GetMergedRange(worksheet, cellAddress); //get range of merged cells
worksheet.Cells[range].Merge = false; //unmerge range
worksheet.Cells[cellAddress].Clear(); //clear value
//merge the cells you want again.
//fill the value in cells again
}
}

How can I write a string to format row data to save as a CSV file?

I currently have a program which uses StreamReader to access a CSV file and store the values in a data grid, however when saving this data it is printing a new line for each column value of the data row.
The program currently prints the csv file as:
headerText, headerText, headerText, headerText
Column 1, Column 2, Column 1, Column 2, Column 3, Column 1, Column 2, Column 3, Column 4
What I need it to print is:
headerText, headerText, headerText, headerText
Column 1, Column 2, Column 3, Column 4
string CsvFpath = "C:/StockFile/stockfiletest.csv";
try
{
StreamWriter csvFileWriter = new StreamWriter(CsvFpath, false);
string columnHeaderText = "";
int countColumn = stockGridView.ColumnCount - 1;
if (countColumn >= 0)
{
columnHeaderText = stockGridView.Columns[0].HeaderText;
}
for (int i = 1; i <= countColumn; i++)
{
columnHeaderText = columnHeaderText + ',' + stockGridView.Columns[i].HeaderText;
}
csvFileWriter.WriteLine(columnHeaderText);
foreach (DataGridViewRow dataRowObject in stockGridView.Rows)
{
if (!dataRowObject.IsNewRow)
{
string dataFromGrid = "{0} += {1} += {2} += {3}";
dataFromGrid = dataRowObject.Cells[0].Value.ToString();
for (int i = 1; i <= countColumn; i++)
{
dataFromGrid = dataFromGrid + ',' + dataRowObject.Cells[i].Value.ToString();
csvFileWriter.Write(dataFromGrid);
}
csvFileWriter.WriteLine();
}
}
csvFileWriter.Dispose();
MessageBox.Show("Saved stockfile.csv");
}
catch (Exception exceptionObject)
{
MessageBox.Show(exceptionObject.ToString());
}
Can anyone tell me what I'm doing wrong with my String formation and how to achieve the required file output?
As mentioned in another answer, the issue is that you are writing to the file inside the loop as you process each column, instead of after the loop, when you have collected all the column information for the row.
Another way you could do this is to use string.Join and System.Linq to more concisely concatenate the column values for each row.
Also note that we can wrap the csvFileWriter in a using block, so that it automatically gets closed and disposed when the block execution completes:
using (var csvFileWriter = new StreamWriter(CsvFpath, false))
{
// Write all the column headers, joined with a ','
csvFileWriter.WriteLine(string.Join(",",
stockGridView.Columns.Cast<DataGridViewColumn>().Select(col => col.HeaderText)));
// Grab all the rows that aren't new and, for each one, join the cells with a ','
foreach (var row in stockGridView.Rows.Cast<DataGridViewRow>()
.Where(row => !row.IsNewRow))
{
csvFileWriter.WriteLine(string.Join(",",
row.Cells.Cast<DataGridViewCell>().Select(cell => cell.Value.ToString())));
}
}
Another thing: instead of writing your own CSV writer, you can use an existing tool such as CsvHelper, which handles edge cases that would otherwise cause problems, such as values that contain commas.
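If pulling in a library is not an option, the comma edge case mentioned above can be handled with a small quoting helper. The following is only a minimal sketch, assuming RFC 4180-style quoting; CsvSketch, EscapeField and ToCsvLine are hypothetical names that do not come from the question or from CsvHelper:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvSketch
{
    // Quote a field only when it contains a comma, a double quote, or a line break.
    // Double quotes inside a quoted field are doubled.
    public static string EscapeField(object value)
    {
        string text = value?.ToString() ?? string.Empty; // a null cell becomes an empty field
        bool needsQuotes = text.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + text.Replace("\"", "\"\"") + "\"" : text;
    }

    // Join the escaped fields of one row with commas.
    public static string ToCsvLine(IEnumerable<object> values)
    {
        return string.Join(",", values.Select(EscapeField));
    }
}
```

With this in place, the row loop above could call csvFileWriter.WriteLine(CsvSketch.ToCsvLine(row.Cells.Cast<DataGridViewCell>().Select(cell => cell.Value))), which also avoids a NullReferenceException on empty cells.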
Your problem is here:
for (int i = 1; i <= countColumn; i++)
{
dataFromGrid = dataFromGrid + ',' + dataRowObject.Cells[i].Value.ToString();
csvFileWriter.Write(dataFromGrid);
}
You append to the string on every iteration without ever clearing it, and you also write the accumulated string to the file on every iteration. So after the second column the file holds "col1,col2", the third iteration then appends "col1,col2,col3" right after it, and so on.
There are existing CSV writer classes, but you can keep your approach: either move the Write call outside the loop so each row is written once, or write each value as you go:
for (int i = 0; i <= countColumn; i++)
{
if (i > 0) csvFileWriter.Write(",");
csvFileWriter.Write(dataRowObject.Cells[i].Value.ToString());
}
This writes each value exactly once, puts a comma only between values, and makes the dataFromGrid string unnecessary.

Try...catch returning nothing but code is still breaking

UPDATE: Before this method runs, the code collects the result of a SQL query into a DataSet. The DataSet is then written to Excel, into the corresponding tab at a specific cell address (which is loaded from the form). The code below is the export-to-Excel method. I am getting the following error:
An unhandled exception of type 'System.AccessViolationException' occurred in SQUiRE (Sql QUery REtriever) v1.exe
Additional information: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
I have been tracking this for a while and thought I had fixed it, but my solution was a false positive. I am using a try...catch block, yet the code still breaks and the catch returns nothing. Let me know if you see anything I am missing. It usually breaks on this line (templateSheet = templateBook.Sheets[tabName];) and on the same tabName. The tab is not locked or restricted, so it can be written to, and the code works more than half of the time.
public void ExportToExcel(DataSet dataSet, Excel.Workbook templateBook, int i, int h, Excel.Application excelApp) //string filePath,
{
try
{
lock (this.GetType())
{
Excel.Worksheet templateSheet;
//check to see if the template is already open, if its not then open it,
//if it is then bind it to work with it
//if (!fileOpenTest)
//{ templateBook = excelApp.Workbooks.Open(filePath); }
//else
//{ templateBook = (Excel.Workbook)System.Runtime.InteropServices.Marshal.BindToMoniker(filePath); }
//Grabs the name of the tab to dump the data into from the "Query Dumps" Tab
string tabName = lstQueryDumpSheet.Items[i].ToString();
templateSheet = templateBook.Sheets[tabName];
// Copy DataTable
foreach (System.Data.DataTable dt in dataSet.Tables)
{
// Copy the DataTable to an object array
object[,] rawData = new object[dt.Rows.Count + 1, dt.Columns.Count];
// Copy the values to the object array
for (int col = 0; col < dt.Columns.Count; col++)
{
for (int row = 0; row < dt.Rows.Count; row++)
{ rawData[row, col] = dt.Rows[row].ItemArray[col]; }
}
// Calculate the final column letter
string finalColLetter = string.Empty;
string colCharset = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
int colCharsetLen = 26;
if (dt.Columns.Count > colCharsetLen)
{ finalColLetter = colCharset.Substring((dt.Columns.Count - 1) / colCharsetLen - 1, 1); }
finalColLetter += colCharset.Substring((dt.Columns.Count - 1) % colCharsetLen, 1);
/*Grabs the full cell address from the "Query Dump" sheet, splits on the '=' and
*pulls out only the cell address (i.e., "address=a3" becomes "a3")*/
string dumpCellString = lstQueryDumpText.Items[i].ToString();
string dumpCell = dumpCellString.Split('=').Last();
/*Refers to the range in which we are dumping the DataSet. The upper left cell is
*defined by 'dumpCell' and the bottom right cell is defined by the final column letter
*and the count of rows.*/
string firstRef = "";
string baseRow = "";
//Determines if the column is one letter or two and handles them accordingly
if (char.IsLetter(dumpCell, 1))
{
char[] createCellRef = dumpCell.ToCharArray();
firstRef = createCellRef[0].ToString() + createCellRef[1].ToString();
for (int z = 2; z < createCellRef.Count(); z++)
{ baseRow = baseRow + createCellRef[z].ToString(); }
}
else
{
char[] createCellRef = dumpCell.ToCharArray();
firstRef = createCellRef[0].ToString();
for (int z = 1; z < createCellRef.Count(); z++)
{ baseRow = baseRow + createCellRef[z].ToString(); }
}
int baseRowInt = Convert.ToInt32(baseRow);
int startingCol = ColumnLetterToColumnIndex(firstRef);
int endingCol = ColumnLetterToColumnIndex(finalColLetter);
int finalCol = startingCol + endingCol;
string endCol = ColumnIndexToColumnLetter(finalCol - 1);
int endRow = (baseRowInt + (dt.Rows.Count - 1));
string cellCheck = endCol + endRow;
string excelRange;
if (dumpCell.ToUpper() == cellCheck.ToUpper())
{ excelRange = string.Format(dumpCell + ":" + dumpCell); }
else
{ excelRange = string.Format(dumpCell + ":{0}{1}", endCol, endRow); }
//Dumps the cells into the range on Excel as defined above
templateSheet.get_Range(excelRange, Type.Missing).Value2 = rawData;
//Check to see if all the SQL queries have been run
if (i == lstSqlAddress.Items.Count - 1)
{
//Turn Auto Calc back on
excelApp.Calculation = Excel.XlCalculation.xlCalculationAutomatic;
/*Run through the value save sheet array then grab the address from the corresponding list
*place in the address array. If the address reads "whole sheet" then save the whole page,
*else set the addresses range and value save that.
for (int y = 0; y < lstSaveSheet.Items.Count; y++)
{
MessageBox.Show("Save Sheet: " + lstSaveSheet.Items[y] + "\n" + "Save Address: " + lstSaveRange.Items[y]);
}*/
//run the macro to hide the unused columns
excelApp.Run("ReportMakerExecute");
//save excel file as hospital name and move onto the next
SaveTemplateAs(templateBook, h);
}
}
}
}
catch (Exception e)
{
MessageBox.Show(e.ToString());
}
}
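The method above calls ColumnLetterToColumnIndex and ColumnIndexToColumnLetter, but the question does not show them. For readers following the range arithmetic, here is a minimal sketch of what such helpers might look like, assuming 1-based column indices (A = 1, Z = 26, AA = 27). The implementations and the ColumnRefSketch class name are hypothetical, not the asker's code:

```csharp
using System;

public static class ColumnRefSketch
{
    // "A" -> 1, "Z" -> 26, "AA" -> 27 (bijective base 26)
    public static int ColumnLetterToColumnIndex(string letters)
    {
        if (string.IsNullOrEmpty(letters)) throw new ArgumentException("Column letters required.", nameof(letters));
        int index = 0;
        foreach (char c in letters.ToUpperInvariant())
        {
            if (c < 'A' || c > 'Z') throw new ArgumentException("Only A-Z allowed.", nameof(letters));
            index = index * 26 + (c - 'A' + 1);
        }
        return index;
    }

    // 1 -> "A", 26 -> "Z", 27 -> "AA"
    public static string ColumnIndexToColumnLetter(int index)
    {
        if (index < 1) throw new ArgumentOutOfRangeException(nameof(index));
        string letters = string.Empty;
        while (index > 0)
        {
            int remainder = (index - 1) % 26;          // 0..25 for the current last letter
            letters = (char)('A' + remainder) + letters;
            index = (index - 1) / 26;                  // move on to the next letter to the left
        }
        return letters;
    }
}
```

Under that 1-based assumption, the arithmetic in the method is consistent: a dump starting at column B (2) with three columns gives finalCol = 2 + 3 = 5, and ColumnIndexToColumnLetter(finalCol - 1) yields "D", so the range spans B to D.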

How to get text from slide in C# using Aspose

I am getting all the shapes in the slides of a PPT file, and now I want to get the text from those shapes. How can I do this?
Here is my method, where I get the shapes of all slides in the PPT file:
public void Main(string[] args)
{
// The path to the documents directory.
string dataDir = Path.GetFullPath(@"C:\Users\Vipin\Desktop\");
//Load the desired presentation once; the using block disposes it
using (Presentation prestg = new Presentation(dataDir + "Android.ppt"))
{
//Accessing each slide using its slide index
int slideCount = prestg.Slides.Count();
for (int i = 0; i <= slideCount - 1; i++)
{
ISlide slide = prestg.Slides[i];
foreach (IShape shap in slide.Shapes)
{
int slideCountNumber = i + 1;
float shapeHeight = shap.Frame.Height;
float shapeWidth = shap.Frame.Width;
Debug.Write("slide Number: " + slideCountNumber + " shape width = " + shapeWidth + " shapeHeight = " + shapeHeight);
}
}
}
}
Now, how can I get the text from these shapes?
Aspose will give you truncated text if you don't have a license for it, so you may be better off using Microsoft.Office.Interop.PowerPoint.
Use it as below:
public void ReadSlide(){
string filePath = @"C:\Users\UserName\Slide.pptx";
Microsoft.Office.Interop.PowerPoint.Application PowerPoint_App = new Microsoft.Office.Interop.PowerPoint.Application();
Microsoft.Office.Interop.PowerPoint.Presentations multi_presentations = PowerPoint_App.Presentations;
Microsoft.Office.Interop.PowerPoint.Presentation presentation = multi_presentations.Open(filePath, MsoTriState.msoFalse, MsoTriState.msoFalse, MsoTriState.msoFalse);
string presentation_textforParent = "";
foreach (var item in presentation.Slides[1].Shapes)
{
var shape = (Microsoft.Office.Interop.PowerPoint.Shape)item;
if (shape.HasTextFrame == MsoTriState.msoTrue)
{
if (shape.TextFrame.HasText == MsoTriState.msoTrue)
{
var textRange = shape.TextFrame.TextRange;
var text = textRange.Text;
presentation_textforParent += text + " ";
}
}
}
}
You may want to extract text from text frames rather than from all shapes. To do this, use the static GetAllTextFrames method exposed by the SlideUtil class:
using (Presentation prestg = new Presentation(dataDir + "Android.ppt"))
{
//Get an Array of ITextFrame objects from all slides in the PPTX
ITextFrame[] textFramesPPTX = Aspose.Slides.Util.SlideUtil.GetAllTextFrames(prestg, true);
//Loop through the Array of TextFrames
for (int i = 0; i < textFramesPPTX.Length; i++)
//Loop through paragraphs in current ITextFrame
foreach (IParagraph para in textFramesPPTX[i].Paragraphs)
//Loop through portions in the current IParagraph
foreach (IPortion port in para.Portions)
{
//Display text in the current portion
Console.WriteLine(port.Text);
//Display font height of the text
Console.WriteLine(port.PortionFormat.FontHeight);
//Display font name of the text
if (port.PortionFormat.LatinFont != null)
Console.WriteLine(port.PortionFormat.LatinFont.FontName);
}
}
See documentation
