I'm getting a System.OutOfMemoryException while creating a pivot table with NReco ExcelPivotTableWriter.
public void Write(PivotTable pvtTbl)
{
var tbl = getPivotDataAsTable(pvtTbl.PivotData);
var rangePivotTable = wsData.Cells["A1"].LoadFromDataTable(tbl, false);
var pivotTable = ws.PivotTables.Add(
ws.Cells[1, 1],
rangePivotTable, "pvtTable");
foreach (var rowDim in pvtTbl.Rows)
pivotTable.RowFields.Add(pivotTable.Fields[rowDim]);
foreach (var colDim in pvtTbl.Columns)
pivotTable.ColumnFields.Add(pivotTable.Fields[colDim]);
pivotTable.ColumGrandTotals = false;
pivotTable.DataOnRows = false;
pivotTable.ColumGrandTotals = false;
pivotTable.RowGrandTotals = false;
if (pvtTbl.PivotData.AggregatorFactory is CompositeAggregatorFactory)
{
var aggrFactories = ((CompositeAggregatorFactory)pvtTbl.PivotData.AggregatorFactory).Factories;
for (int i = 0; i < aggrFactories.Length; i++)
{
var dt = pivotTable.DataFields.Add(pivotTable.Fields[String.Format("value_{0}", i)]);
dt.Function = SuggestFunction(aggrFactories[i]);
string columnName = "";
if (dt.Function == OfficeOpenXml.Table.PivotTable.DataFieldFunctions.Sum)
columnName = ((NReco.PivotData.SumAggregatorFactory)aggrFactories[i]).Field;
else if(dt.Function == OfficeOpenXml.Table.PivotTable.DataFieldFunctions.Average)
columnName = ((NReco.PivotData.AverageAggregatorFactory)aggrFactories[i]).Field;
if (columnNames.ContainsKey(columnName))
dt.Name = columnNames[columnName].ToString();
else
dt.Name = aggrFactories[i].ToString();
}
}
else
{
pivotTable.DataFields.Add(pivotTable.Fields["value"]).Function = SuggestFunction(pvtTbl.PivotData.AggregatorFactory);
}
}
The error occurs while creating rangePivotTable:
var rangePivotTable = wsData.Cells["A1"].LoadFromDataTable(tbl, false);
Lazy totals mode is enabled:
var ordersPvtData = new PivotData(dimentionsArray, composite, true);
The dataset has 200k rows, which I don't think is too many. I have 8 GB of RAM on Windows 10.
I am using the free version of NReco.
Is there a solution?
8 GB may not be enough physical memory, depending on how large each of the 200K rows is and how much memory the other applications running on your system consume.
Before you run this program, start the Windows Task Manager and click on the Performance tab.
Note the Available and Free Memory values. Then run your program and watch how the memory is consumed. If your program does consume all of your available memory, then your options are:
Free up more memory by removing other applications that consume memory.
Add more physical memory to your system.
Modify your program to make it more memory efficient. (this includes removal of memory leaks)
Some combination of the prior three options.
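To see from inside the program how much memory a step consumes, rather than only watching Task Manager, a small probe around the suspect call can help. This is a minimal sketch using only the .NET base class library. MemoryProbe and Snapshot are hypothetical names, and the 80 MB allocation is only a stand-in for the real work (for example, building the DataTable).

```csharp
using System;
using System.Diagnostics;

public static class MemoryProbe
{
    // Returns the managed heap size and the process working set, both in megabytes.
    public static (double ManagedMb, double WorkingSetMb) Snapshot()
    {
        const double BytesPerMb = 1024.0 * 1024.0;
        double managed = GC.GetTotalMemory(forceFullCollection: false) / BytesPerMb;
        using (Process current = Process.GetCurrentProcess())
        {
            double workingSet = current.WorkingSet64 / BytesPerMb;
            return (managed, workingSet);
        }
    }

    public static void Main()
    {
        var before = Snapshot();

        // Stand-in for the real work: allocate roughly 80 MB in 1 MB blocks.
        var buffers = new byte[80][];
        for (int i = 0; i < buffers.Length; i++)
            buffers[i] = new byte[1024 * 1024];

        var after = Snapshot();
        Console.WriteLine($"Managed heap: {before.ManagedMb:F1} MB -> {after.ManagedMb:F1} MB");
        Console.WriteLine($"Working set:  {before.WorkingSetMb:F1} MB -> {after.WorkingSetMb:F1} MB");

        // Keep the buffers reachable until after the second snapshot.
        GC.KeepAlive(buffers);
    }
}
```

Calling Snapshot() before and after LoadFromDataTable would show whether that single call accounts for most of the growth.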
You should be able to slice through 200k rows pretty easily. Try it like this . . .
Workbook workbook = new Workbook();
workbook.LoadFromFile(@"C:\your_path_here\SampleFile.xlsx");
Worksheet sheet = workbook.Worksheets[0];
sheet.Name = "Data Source";
Worksheet sheet2 = workbook.CreateEmptySheet();
sheet2.Name = "Pivot Table";
CellRange dataRange = sheet.Range["A1:G200000"];
PivotCache cache = workbook.PivotCaches.Add(dataRange);
PivotTable pt = sheet2.PivotTables.Add("Pivot Table", sheet.Range["A1"], cache);
var r1 = pt.PivotFields["Vendor No"];
r1.Axis = AxisTypes.Row;
pt.Options.RowHeaderCaption = "Vendor No";
var r2 = pt.PivotFields["Description"];
r2.Axis = AxisTypes.Row;
pt.DataFields.Add(pt.PivotFields["OnHand"], "SUM of OnHand", SubtotalTypes.Sum);
pt.DataFields.Add(pt.PivotFields["OnOrder"], "SUM of OnOrder", SubtotalTypes.Sum);
pt.DataFields.Add(pt.PivotFields["ListPrice"], "Average of ListPrice", SubtotalTypes.Average);
pt.BuiltInStyle = PivotBuiltInStyles.PivotStyleMedium12;
workbook.SaveToFile("PivotTable.xlsx", ExcelVersion.Version2010);
System.Diagnostics.Process.Start("PivotTable.xlsx");
I am using GemBox.Document to insert an HTML file into a Word or PDF document.
Unfortunately, in the resulting Word (or PDF) file, the table rows (cells) are taller than in the original HTML file, and I cannot change this with CSS or HTML.
Can you please suggest a solution to the problem?
string fileName="zzzz";
var destinationDocument = new DocumentModel();
var section = new Section(destinationDocument);
destinationDocument.Sections.Add(section);
var srcDocument = DocumentModel.Load(TempPath + fileName + ".html");
var pageSetup = srcDocument.Sections[0].PageSetup;
var destpagesPageSetup = destinationDocument.Sections[0].PageSetup;
destpagesPageSetup.Orientation = Orientation.Landscape;
destpagesPageSetup.PageWidth = 1000;
destpagesPageSetup.PageHeight = 1000;
destpagesPageSetup.RightToLeft = true;
destpagesPageSetup.PageMargins.Left = 20;
destpagesPageSetup.PageMargins.Right = 0;
destpagesPageSetup.PageMargins.Bottom = 0;
destpagesPageSetup.PageMargins.Top = 0;
destpagesPageSetup.PageMargins.Gutter = 0;
destpagesPageSetup.PageMargins.Footer = 0;
var mapping = new ImportMapping(srcDocument, destinationDocument, false);
var blocks = srcDocument.Sections[0].Blocks;
foreach (Block b in blocks)
{
//b.ParentCollection.TableFormat.DefaultCellSpacing = 1;
Block b1 = destinationDocument.Import(b, true, mapping);
section.Blocks.Add(b1);
}
var pageSetup1 = section.PageSetup;
destinationDocument.Save(TempPath + fileName + ".pdf");
thanks
This issue occurred because of the cell margins coming from the HTML content.
After investigating that HTML, we resolved the issue. The fix is available in the current latest bugfix version:
https://www.gemboxsoftware.com/document/downloads/bugfixes.html
Or in the current latest NuGet package:
https://www.nuget.org/packages/GemBox.Document/
So I'm trying to read an Excel file with C# and the document is 181 MB. I have tried using Microsoft.Office.Interop.Excel, OpenXML, ClosedXML, and ExcelDataReader.
I wasn't able to get OpenXML to work, and ClosedXML seems to have issues with large Excel files (it also takes at least 6 minutes to read the file). I like ExcelDataReader the most since I can read the data table like an array. It takes 4-5 minutes to read the file, which is much faster than Interop, but it's still a long wait. I'm considering converting the Excel document into a CSV file, but when I did that the size went from 181 MB to 248 MB, so I'm unsure whether it will be more efficient. It also forces the users to take an extra step to convert their files into CSV, but if the performance is worth it I will attempt this route.
Unfortunately, I am not able to pre-determine how many columns and rows the excel document will have as the users will be using openFileDialog to select a file.
Is ExcelDataReader the best way to go or is there a better solution?
Here's my current code in case there's some improvements I can make:
OpenFileDialog openFileDialog = new OpenFileDialog();
openFileDialog.Filter = "Excel Files|*.xls;*.xlsx;*.xlsm";
if (openFileDialog.ShowDialog() == true)
{
using (var stream = File.Open(openFileDialog.FileName, FileMode.Open, FileAccess.Read))
{
using (var reader = ExcelReaderFactory.CreateReader(stream))
{
//results will be in dataSet.Tables
var dataSet = reader.AsDataSet();
var dataTable = dataSet.Tables[0];
int r = 0;
for(int c = 0; c < dataTable.Columns.Count; c += 3)
{
TagListData.Add(new TagClass { IsTagSelected = false, TagName = dataTable.Rows[r][c].ToString(), rIndex = r, cIndex = c });
}
}
}
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
}
Idea 1: There is some overhead with ExcelDataReader's AsDataSet, so it's a good idea to use the reader interface directly when working with large sheets. It implements the IDataReader interface and provides per-row access to the data:
using (var reader = ExcelReaderFactory.CreateReader(stream)) {
int r = 0; // index of the row being read (the first row)
if (reader.Read()) { // advance to the first row; false means the sheet is empty
for(int c = 0; c < reader.FieldCount; c += 3) {
TagListData.Add(new TagClass { IsTagSelected = false, TagName = Convert.ToString(reader.GetValue(c)), rIndex = r, cIndex = c });
}
}
}
Idea 2: Try passing an ExcelDataSetConfiguration with UseColumnDataType = false to AsDataSet. This eliminates an internal pass and reduces memory pressure, so it should improve performance noticeably with large sheets.
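Idea 2 could look roughly like the following. This is a sketch that assumes the ExcelDataReader and ExcelDataReader.DataSet NuGet packages are referenced. SheetLoader and LoadFirstSheet are hypothetical names used only for illustration.

```csharp
using System.Data;
using System.IO;
using ExcelDataReader; // assumed: NuGet packages ExcelDataReader and ExcelDataReader.DataSet

public static class SheetLoader
{
    // Loads the first worksheet into a DataTable without per-column type detection.
    public static DataTable LoadFirstSheet(string path)
    {
        using (var stream = File.Open(path, FileMode.Open, FileAccess.Read))
        using (var reader = ExcelReaderFactory.CreateReader(stream))
        {
            var config = new ExcelDataSetConfiguration
            {
                // Skip the extra pass that infers a data type for each column.
                UseColumnDataType = false
            };
            DataSet dataSet = reader.AsDataSet(config);
            return dataSet.Tables[0];
        }
    }
}
```

With column type detection off, cell values still come back as objects, so calling ToString() on them as in the question's code should keep working.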
I am trying to achieve the following: I have a C# application which does some data processing and then outputs to an .xlsx file using EPPlus. I want to add some conditional formatting to the Excel file, so I tried the following method: first I made a blank template workbook with all the conditional formatting rules set up, and then I tried dumping the data into it. The snippet below is my approach; p is an ExcelPackage. Currently this does not work: the data is written correctly, but the formatting rules that I set up are lost. I'm guessing that is because it basically clears everything before writing. Any help will be appreciated!
Byte[] bin = p.GetAsByteArray();
File.Copy("C:\\template.xlsx", "C:\\result.xlsx");
using (FileStream fs = File.OpenWrite("C:\\result.xlsx")) {
fs.Write(bin, 0, bin.Length);
}
Note: I also tried the following to avoid the external template altogether (see the snippet below). The problem is that when I open the generated .xlsx, Excel says the file has unreadable or undisplayable content and needs to be repaired. After the repair everything is fine, and the conditional formatting works. I have no clue why it does that or how to get rid of the error when the file is opened.
string _statement = "$E1=\"3\"";
var _cond = ws.ConditionalFormatting.AddExpression(_formatRangeAddress);
_cond.Style.Fill.PatternType = OfficeOpenXml.Style.ExcelFillStyle.Solid;
_cond.Style.Fill.BackgroundColor.Color = Color.LightCyan;
_cond.Formula = _statement;
Any help will be appreciated!!
Using fs.Write will simply overwrite the copied file with the EPPlus-generated file, since you are working at the byte/stream level. So that will not get you what you want. (@MatthewD was showing you this in his post.)
As for applying the format itself, what you have should work, but if you are getting that kind of error I suspect you are mixing EPPlus and non-EPPlus manipulation of the Excel file. This is roughly how you should be doing it:
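The byte-level overwrite can be seen with plain text files, independent of EPPlus. In this minimal sketch, OverwriteDemo and CopyThenWrite are hypothetical names and the file contents are made up for illustration. It repeats the File.Copy followed by File.OpenWrite sequence from the question and returns what ends up in the file.

```csharp
using System;
using System.IO;
using System.Text;

public static class OverwriteDemo
{
    // Copies a "template" file, writes new bytes over the copy the same way the
    // question does, and returns the resulting file content.
    public static string CopyThenWrite(string templateText, string newText)
    {
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        string template = Path.Combine(dir, "template.txt");
        string result = Path.Combine(dir, "result.txt");
        try
        {
            File.WriteAllText(template, templateText, Encoding.ASCII);
            File.Copy(template, result);

            byte[] bin = Encoding.ASCII.GetBytes(newText);
            // OpenWrite positions the stream at byte 0 and does not truncate the file.
            using (FileStream fs = File.OpenWrite(result))
            {
                fs.Write(bin, 0, bin.Length);
            }
            return File.ReadAllText(result, Encoding.ASCII);
        }
        finally
        {
            Directory.Delete(dir, recursive: true);
        }
    }

    public static void Main()
    {
        // Prints "DATALATE-WITH-RULES": the new bytes replace the start of the
        // copy, and nothing from the two files is merged.
        Console.WriteLine(CopyThenWrite("TEMPLATE-WITH-RULES", "DATA"));
    }
}
```

The written bytes replace whatever was at the start of the copied file, so the template's rules cannot survive as a working part of the workbook this way. To keep them, the template would have to be opened through EPPlus itself and the data written into it there.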
[TestMethod]
public void Conditional_Format_Test()
{
//http://stackoverflow.com/questions/31296039/conditional-formatting-using-epplus
var existingFile = new FileInfo(@"c:\temp\temp.xlsx");
if (existingFile.Exists)
existingFile.Delete();
//Throw in some data
var datatable = new DataTable("tblData");
datatable.Columns.Add(new DataColumn("Col1", typeof(int)));
datatable.Columns.Add(new DataColumn("Col2", typeof(int)));
datatable.Columns.Add(new DataColumn("Col3", typeof(int)));
for (var i = 0; i < 20; i++)
{
var row = datatable.NewRow();
row["Col1"] = i;
row["Col2"] = i * 10;
row["Col3"] = i * 100;
datatable.Rows.Add(row);
}
using (var pack = new ExcelPackage(existingFile))
{
var ws = pack.Workbook.Worksheets.Add("Content");
ws.Cells["E1"].LoadFromDataTable(datatable, true);
//Override E1
ws.Cells["E1"].Value = "3";
string _statement = "$E1=\"3\"";
var _cond = ws.ConditionalFormatting.AddExpression(new ExcelAddress(ws.Dimension.Address));
_cond.Style.Fill.PatternType = ExcelFillStyle.Solid;
_cond.Style.Fill.BackgroundColor.Color = Color.LightCyan;
_cond.Formula = _statement;
pack.SaveAs(existingFile);
}
}
To expand on @Ernie's code sample, here's a working example that colors a range according to each cell's value. Each cell of the range can have any of three colors depending on the cell's value (<.01, <.05, <.1).
ExcelRange rng = ws.Cells[statsTableRowStart, 10, statsTableRowStart + gud.levels.level.Count() - 1, 10];
OfficeOpenXml.ConditionalFormatting.Contracts.IExcelConditionalFormattingExpression _condp01 = ws.ConditionalFormatting.AddExpression(rng);
_condp01.Style.Fill.PatternType = OfficeOpenXml.Style.ExcelFillStyle.Solid;
_condp01.Style.Fill.BackgroundColor.Color = System.Drawing.Color.OrangeRed;
_condp01.Formula = new ExcelFormulaAddress(rng.Address) + "<.01";
OfficeOpenXml.ConditionalFormatting.Contracts.IExcelConditionalFormattingExpression _condp05 = ws.ConditionalFormatting.AddExpression(rng);
_condp05.Style.Fill.PatternType = OfficeOpenXml.Style.ExcelFillStyle.Solid;
_condp05.Style.Fill.BackgroundColor.Color = System.Drawing.Color.OliveDrab;
_condp05.Formula = new ExcelFormulaAddress(rng.Address) + "<.05";
OfficeOpenXml.ConditionalFormatting.Contracts.IExcelConditionalFormattingExpression _condp1 = ws.ConditionalFormatting.AddExpression(rng);
_condp1.Style.Fill.PatternType = OfficeOpenXml.Style.ExcelFillStyle.Solid;
_condp1.Style.Fill.BackgroundColor.Color = System.Drawing.Color.LightCyan;
_condp1.Formula = new ExcelFormulaAddress(rng.Address) + "<.1";
I found some code to add an image to an Excel sheet with the SDK 2.0, and this part works fine. Now I want a text box under the image, but I don't know how to create a text box in general.
Which classes do I need, and what gets appended to what, or which property do I set?
Furthermore, it would be nice if the two were grouped, so that when you drag one the other follows.
The code looks like this (I know it's a bit much, but I couldn't cut it down further):
private void addImage(Offset offset, Extents extents, string sImagePath, string description)
{
WorksheetPart worksheetPart = this.arbeitsBlatt.WorksheetPart;
DrawingsPart drawingsPart;
ImagePart imagePart;
XDrSp.WorksheetDrawing worksheetDrawing;
ImagePartType imagePartType = getImageType(sImagePath);
{
// --- use the existing DrawingPart
drawingsPart = worksheetPart.DrawingsPart;
imagePart = drawingsPart.AddImagePart(imagePartType);
drawingsPart.CreateRelationshipToPart(imagePart);
worksheetDrawing = drawingsPart.WorksheetDrawing;
}
using (FileStream fileStream = new FileStream(sImagePath, FileMode.Open))
{
imagePart.FeedData(fileStream);
}
int imageNumber = drawingsPart.ImageParts.Count<ImagePart>();
if (imageNumber == 1)
{
Drawing drawing = new Drawing();
drawing.Id = drawingsPart.GetIdOfPart(imagePart);
this.arbeitsBlatt.Append(drawing);
}
XDrSp.NonVisualDrawingProperties noVisualDrawingProps = new XDrSp.NonVisualDrawingProperties();
XDrSp.NonVisualPictureDrawingProperties noVisualPictureDrawingProps = new XDrSp.NonVisualPictureDrawingProperties();
noVisualDrawingProps.Id = new UInt32Value((uint)(1024 + imageNumber));
noVisualDrawingProps.Name = "Picture " + imageNumber.ToString();
noVisualDrawingProps.Description = description;
PictureLocks picLocks = new PictureLocks();
picLocks.NoChangeAspect = true;
picLocks.NoChangeArrowheads = true;
noVisualPictureDrawingProps.PictureLocks = picLocks;
XDrSp.NonVisualPictureProperties noVisualPictureProps = new XDrSp.NonVisualPictureProperties();
noVisualPictureProps.NonVisualDrawingProperties = noVisualDrawingProps;
noVisualPictureProps.NonVisualPictureDrawingProperties = noVisualPictureDrawingProps;
Stretch stretch = new Stretch();
stretch.FillRectangle = new FillRectangle();
XDrSp.BlipFill blipFill = new XDrSp.BlipFill();
Blip blip = new Blip();
blip.Embed = drawingsPart.GetIdOfPart(imagePart);
blip.CompressionState = BlipCompressionValues.Print;
blipFill.Blip = blip;
blipFill.SourceRectangle = new SourceRectangle();
blipFill.Append(stretch);
Transform2D t2d = new Transform2D();
t2d.Offset = offset;
t2d.Extents = extents;
XDrSp.ShapeProperties sp = new XDrSp.ShapeProperties();
sp.BlackWhiteMode = BlackWhiteModeValues.Auto;
sp.Transform2D = t2d;
PresetGeometry prstGeom = new PresetGeometry();
prstGeom.Preset = ShapeTypeValues.Rectangle;
prstGeom.AdjustValueList = new AdjustValueList();
sp.Append(prstGeom);
sp.Append(new NoFill());
XDrSp.Picture picture = new XDrSp.Picture();
picture.NonVisualPictureProperties = noVisualPictureProps;
picture.BlipFill = blipFill;
picture.ShapeProperties = sp;
XDrSp.OneCellAnchor anchor = this.getCellAnchor();
XDrSp.Extent extent = new XDrSp.Extent();
extent.Cx = extents.Cx;
extent.Cy = extents.Cy;
anchor.Extent = extent;
anchor.Append(picture);
anchor.Append(new XDrSp.ClientData());
worksheetDrawing.Append(anchor);
worksheetDrawing.Save(drawingsPart);
}
I think you are new to the Open XML SDK.
First of all, you need to use the newest version of the Open XML SDK, version 2.5 [Download - http://www.microsoft.com/en-us/download/details.aspx?id=30425]
Download both OpenXMLSDKV25.msi and OpenXMLSDKToolV25.msi from there and install both.
Now here is the trick: the Open XML SDK Productivity Tool is the one you need here. It lets you browse an existing Excel file and break it down into code [watch here - https://www.youtube.com/watch?v=KSSMLR19JWA]
Now create an Excel sheet manually with what you want (in your case, add a text box under the image). Then open this Excel file with the Productivity Tool and study the generated code. Note that you will need to understand the spreadsheet file structure to understand this code [refer to this - https://www.google.com/#q=open+xml+sdk]. Then write your own code to meet your requirement, using the code from the Productivity Tool.
NOTE - Once you analyse the dummy spreadsheet with the Productivity Tool, you will understand why giving or guiding with code examples as an answer is not practical.
- Happy coding -
In another question I asked how to export an Excel worksheet as an image. The logic behind the answer is OK, but I'm getting an exception when calling CopyPicture (System.Runtime.InteropServices.COMException).
var a = new Microsoft.Office.Interop.Excel.Application();
Workbook w = a.Workbooks.Open(@"C:\scratch\blueyellow.xlsx");
Worksheet ws = w.Sheets["StatusR"];
ws.Protect(Contents: false);
Thread.Sleep(3000); // Fix (sometimes)
Range r = ws.Range["B4:P24"];
r.CopyPicture(XlPictureAppearance.xlScreen, XlCopyPictureFormat.xlBitmap); // <--- Exception
var data = Clipboard.GetDataObject();
data.GetDataPresent(DataFormats.Bitmap);
Image image = (Image)data.GetData(DataFormats.Bitmap, true);
image.Save(@"C:\scratch\informe_by.png", System.Drawing.Imaging.ImageFormat.Png);
w.Close(SaveChanges: false);
a.Quit();
I put a Thread.Sleep() line before it to work around this issue. It works most of the time, but I would like it to work every time, without strange behavior.
I'm using Windows 8 Professional 64-bit, Office 2013 64-bit and .NET 4.
What can be wrong?
After unsuccessful research, I ended up with this inelegant solution:
var errorCounter = 0;
var copyDone = false;
do {
try {
r.CopyPicture(XlPictureAppearance.xlScreen, XlCopyPictureFormat.xlBitmap);
copyDone = true;
} catch {
++errorCounter;
}
} while (!copyDone && errorCounter < 100);
if (!copyDone) throw new ApplicationException("Unable to copy the selected range.");
I hope this helps others.
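The same retry idea can be pulled into a small reusable helper with a short pause between attempts. This is a sketch using only the base class library. Retry and Run are hypothetical names, and the failing action in Main is a stand-in for the CopyPicture call.

```csharp
using System;
using System.Threading;

public static class Retry
{
    // Runs the action until it succeeds or maxAttempts is reached.
    // Returns the number of attempts used; the last exception propagates on failure.
    public static int Run(Action action, int maxAttempts, TimeSpan delay)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                action();
                return attempt;
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Swallow the failure and wait briefly before the next attempt.
                Thread.Sleep(delay);
            }
        }
    }

    public static void Main()
    {
        int calls = 0;
        // Stand-in for r.CopyPicture(...): fails twice, then succeeds.
        int used = Run(() =>
        {
            calls++;
            if (calls < 3) throw new InvalidOperationException("busy");
        }, maxAttempts: 5, delay: TimeSpan.FromMilliseconds(10));
        Console.WriteLine($"Succeeded after {used} attempts");
    }
}
```

In the Interop case the filter could be narrowed to COMException so that unrelated errors are not retried. Letting the final exception propagate also keeps the original error details instead of replacing them with a generic message.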