Looping FindText method from GemBox.Spreadsheet - c#

I took an example from the GemBox website and am trying to build a loop that tags certain cells based on their content, which I identify with the FindText method from the GemBox.Spreadsheet component.
My goal is to:
find a cell that partially matches the keyword
go to the last column of that row
change the color of that row to a specific color
keep going down the document, repeating the previous steps
stop once the document has ended
The search works in the sense that it finds the query and does what I instructed, but it stops after the first search result.
Is there a way to loop the search using this method, or can I combine it with another method that tests whether a cell contains part of what I'm searching for?
This is the link I'm basing my knowledge on:
https://www.gemboxsoftware.com/spreadsheet/examples/excel-search/109
Thanks again guys.
Below is my attempt, which works for a single query. I'd like to do this for the whole document.
using System;
using System.Drawing;
using System.Text;
using System.IO;
using GemBox.Spreadsheet;
using System.Data;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
namespace autoexcel2
{
class Program
{
[STAThread]
static void Main(string[] args)
{
//IF USING PRO PUT YOUR SERIAL BELOW
SpreadsheetInfo.SetLicense("FREE-LIMITED-KEY");
ExcelFile ef = ExcelFile.Load("sample.xlsx");
string searchText = "pharma";
var ws = ef.Worksheets[0];
StringBuilder sb = new StringBuilder();
int row;
int col;
ws.Cells.FindText(searchText, false, false, out row, out col);
if (row == -1 || col == -1)
{
sb.AppendLine("cant find nada");
Console.WriteLine(sb.ToString());
}
else
{
ws.Cells[row,5].Style.FillPattern.SetSolid(Color.Aqua);
}
ef.Save("done.xlsx");
}
}
}

Try the following:
var workbook = ExcelFile.Load("sample.xlsx");
var worksheet = workbook.Worksheets[0];
var searchText = "pharma";
foreach (var currentRow in worksheet.Rows)
{
int row, col;
// FindText returns true when the row contains the text; Last() needs "using System.Linq;"
if (currentRow.Cells.FindText(searchText, false, false, out row, out col))
currentRow.AllocatedCells.Last().Style.FillPattern.SetSolid(Color.Aqua);
}
workbook.Save("done.xlsx");
With this, you can find the first occurrence of the searched text in each row and then format that row's last cell as needed.
But if you need to format the found cells themselves, the above might not work for you, because a single row could have multiple cells containing the searched text.
In that case, you could use something like the following:
var workbook = ExcelFile.Load("sample.xlsx");
var worksheet = workbook.Worksheets[0];
var searchText = "pharma";
foreach (var row in worksheet.Rows)
{
var range = row.Cells.GetSubrangeAbsolute(row.Index, 0, row.Index, row.AllocatedCells.Count);
while (range.FindText(searchText, out int r, out int c))
{
worksheet.Cells[r, c].Style.FillPattern.SetSolid(Color.Aqua);
range = range.GetSubrangeAbsolute(r, c + 1, r, range.LastColumnIndex);
}
}
workbook.Save("done.xlsx");

Related

How to get a merged cell from Excel using DocumentFormat.OpenXML or ClosedXML C#

I need to get the area of each merged cell, specifically the row number on which the area ends in Excel, using only DocumentFormat.OpenXml or ClosedXML. How can I do this for each cell?
Using ClosedXML, this could be done with:
var ws = workbook.Worksheet("Sheet 1");
var cell = ws.Cell("A2");
var mergedRange = cell.MergedRange();
var lastCell = mergedRange.LastCell();
// or
var lastCellAddress = mergedRange.RangeAddress.LastAddress;
I found this to be a little clunky but I believe this to be the correct approach.
private static void GetMergedCells()
{
var fileName = $"c:\\temp\\Data.xlsm";
// Open the document.
using (SpreadsheetDocument document = SpreadsheetDocument.Open(fileName, false))
{
// Get the WorkbookPart object.
var workbookPart = document.WorkbookPart;
// Get the first worksheet in the document. You can change this as need be.
var worksheet = workbookPart.Workbook.Descendants<Sheet>().FirstOrDefault();
// Retrieve the WorksheetPart using the Part ID from the previous "Sheet" object.
var worksheetPart = (WorksheetPart)workbookPart.GetPartById(worksheet.Id);
// Retrieve the MergeCells element, this will contain all MergeCell elements.
var mergeCellsList = worksheetPart.Worksheet.Elements<MergeCells>();
// Now loop through and spit out each range reference for the merged cells.
// You'll need to process the range either as a string or turn it into another
// object that gives you the end row.
foreach (var mergeCells in mergeCellsList)
{
foreach (MergeCell mergeCell in mergeCells)
{
Console.WriteLine(mergeCell.Reference);
}
}
}
}
If you couldn't already tell, this is using DocumentFormat.OpenXml.Spreadsheet
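The code comments above mention processing the range reference as a string to get the end row. Here is a minimal sketch of that step, assuming references of the form "A2:C5" (or a single cell such as "B7"). The class and method names are hypothetical and not part of the Open XML SDK.

```csharp
using System;
using System.Linq;

public static class MergeRangeSketch
{
    // Hypothetical helper: returns the 1-based row number on which a
    // reference such as "A2:C5" ends.
    public static int GetLastRow(string reference)
    {
        // A single-cell reference ("B7") has no colon, so the last part is the whole string.
        string lastCell = reference.Split(':').Last();

        // Keep only the digits, dropping column letters and any '$' markers.
        string digits = new string(lastCell.Where(char.IsDigit).ToArray());
        return int.Parse(digits);
    }

    public static void Main()
    {
        Console.WriteLine(GetLastRow("A2:C5")); // prints 5
        Console.WriteLine(GetLastRow("B7"));    // prints 7
    }
}
```

Inside the loop above you would pass mergeCell.Reference.Value instead of a literal string.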

C# DataTable Show Single Row in Console

I have searched high and low for a way to show an entire row of a C# DataTable, both by referencing the row number and by writing the row contents to a string variable and showing that string in the console. I can specify the exact row and field and display that value, but not the whole row. This is not a list in C#, this is a DataTable.
For the simple code below, the output of the first WriteLine is "Horse", but for the other two WriteLine calls I get the console output "System.Data.DataRow" instead of the whole row of data.
What am I doing wrong? Any help would be appreciated.
using System;
using System.Data;
using System.Threading;
namespace DataTablePractice
{
class Program
{
static void Main(string[] args)
{
// Create a DataTable.
using (DataTable table = new DataTable())
{
// Two columns.
table.TableName = "table";
table.Columns.Add("Number", typeof(string));
table.Columns.Add("Pet", typeof(string));
// ... Add two rows.
table.Rows.Add("4", "Horse");
table.Rows.Add("10", "Moose");
// ... Display the second field (index 1) of the first row in the console
Console.WriteLine(table.Rows[0].Field<string>(1));
//...Display the first row of the table in the console
Console.WriteLine(table.Rows[0]);
//...Create a new row variable to add a third pet
var newrow = table.Rows.Add("15", "Snake");
string NewRowString = newrow.ToString();
//...Display the new row of data in the console
Console.WriteLine(NewRowString);
//...Sleep for a few seconds to examine output
Thread.Sleep(4000);
}
}
}
}
When you run this:
Console.WriteLine(table.Rows[0]);
It's in effect calling this:
Console.WriteLine(table.Rows[0].ToString()); // prints object type, in this case a DataRow
If it were your own class, you could override ToString to return whatever you need, but you don't have that option with the DataRow class. And so it uses the default behavior as described here:
Default implementations of the Object.ToString method return the fully qualified name of the object's type.
You could iterate through the columns, like this for example:
var row = table.Rows[0];
for (var i = 0; i < row.ItemArray.Length; i++) // DataRow has no Count property
Console.Write(row[i] + " : ");
Or, a shorter way to print them all out:
Console.WriteLine(String.Join(" : ", table.Rows[0].ItemArray));
Given your data, maybe you just want to reference the two fields?
foreach (DataRow row in table.Rows)
Console.WriteLine($"You have {row[0]} {row[1]}(s).");
// You have 4 Horse(s).
// You have 10 Moose(s).
While the answer here is excellent, I highly recommend using Spectre.Console.
It is an open source library that helps you generate highly formatted console output.
With this, the code to write the output simply becomes:
public static void Print(this DataTable dataTable)
{
var table = new Table();
table.AddColumn("#");
for (int i=0;i<dataTable.Columns.Count;i++)
{
table.AddColumn(dataTable.Columns[i].ColumnName);
}
for(int i=0;i<dataTable.Rows.Count;i++)
{
var values = new List<string>
{
i.ToString()
};
for (int j = 0; j < dataTable.Columns.Count;j++)
{
values.Add(dataTable.Rows[i][j]?.ToString()??"null");
}
table.AddRow(values.ToArray());
}
AnsiConsole.Write(table);
}

How to remove unnecessary white space and sort in a datagrid from csv file, using C#?

Let me explain what I need.
When finished, the program should take a CSV file as input, run a calculation, and output the result. For now I'm doing it step by step:
Import the CSV into a DataGridView (done)
Remove unnecessary white space and sort by name (in progress)
Calculation
To make my question clear and easy to understand, here is a sample of the CSV file.
As you can see, the 'lotID' header is repeated in every section, and there are two types of lotID.
Here is what I have done so far. Let's call this pic A: I successfully filtered out the repeated 'lotID' header of the first type.
This is pic B. As you can see, the 'LotID' header of the second type (MSA) still appears in each section.
In pic A, the lotID header is no longer repeated in each section, but blank rows appear in each section. This is the first thing I want to fix.
Secondly, I want to filter out the 'LotID' header of the second type.
Here is the code.
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
namespace test2
{
public partial class Form1 : Form
{
OpenFileDialog openFile = new OpenFileDialog();
public Form1()
{
InitializeComponent();
}
private void Button1_Click(object sender, EventArgs e)
{
if (openFile.ShowDialog() == DialogResult.OK)
{
List<string[]> rows = File.ReadLines(openFile.FileName).Select(x => x.Split(',')).ToList();
DataTable dt = new DataTable();
List<string> headerNames = rows[0].ToList();
foreach (var headers in rows[0])
{
dt.Columns.Add(headers);
}
foreach (var x in rows.Skip(1))
{
if (x.SequenceEqual(headerNames)) // LINQ check that both sequences hold the same elements in the same order
continue; //skip the row with repeated headers
dt.Rows.Add(x);
}
dataGridView1.DataSource = dt;
}
}
private void Form1_Load_1(object sender, EventArgs e)
{
openFile.Filter = "CSV|*.csv";
}
}
}
To sort by the first column and remove the blank rows, try this piece of code (it requires you to know that "Lot ID" will be the first column):
private void Button1_Click(object sender, EventArgs e)
{
if (openFile.ShowDialog() == DialogResult.OK)
{
List<string[]> rows = File.ReadLines(openFile.FileName).Select(x => x.Split(',')).ToList();
DataTable dt = new DataTable();
List<string> headerNames = rows[0].ToList();
foreach (var headers in rows[0])
{
dt.Columns.Add(headers);
}
foreach (var x in rows.Skip(1).OrderBy(r => r.First())) //sort based on first column of each row
{
if (x.SequenceEqual(headerNames)) // LINQ check that both sequences hold the same elements in the same order
continue; //skip the row with repeated headers
if (x.All(val => string.IsNullOrWhiteSpace(val))) // skip the row if every column is empty or whitespace
continue;
dt.Rows.Add(x);
}
dataGridView1.DataSource = dt;
}
}
As a kind of hackish way to remove a duplicated header line, you could try this:
if (x[0] == "Lot ID")
continue;
instead of
if (x.SequenceEqual(headerNames))
continue;
It's not very elegant, but it will work.
I'll add some explanation of the LINQ methods used:
File.ReadLines(openFile.FileName).Select(x => x.Split(',')).ToList();
This reads all the lines in the file; .Select goes through each line and splits it on commas (since it is CSV). Split returns an array of the split values, and finally ToList() means this line returns a list of string arrays. Each array contains the individual cell values of one row, while the list contains the rows.
List<string> headerNames = rows[0].ToList();
This saves the first row, which contains all the header names into a separate List which we can use later.
foreach (var x in rows.Skip(1).OrderBy(r => r.First()))
Skip(1) ignores the first element in the list (and takes all the others), and OrderBy() sorts in ascending order; r => r.First() means each row "r" is sorted by its first column. "x" represents each row.
if (x[0] == "Lot ID")
This is not LINQ anymore; it just checks whether the first column of this row is "Lot ID" and, if it is, "continue" skips to the next row in the foreach.
Hope my explanations helped you learn! A link to some basic LINQ is in the comments.
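The LINQ steps explained above can also be tried without a file or a form. This is only a sketch under assumptions: the sample lines are made up, and the class and method names are hypothetical. It splits each line, skips the first header, drops repeated headers and blank rows, and sorts by the first column.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvFilterSketch
{
    // Hypothetical helper: returns the data rows without repeated headers
    // or blank rows, sorted by the first column.
    public static List<string[]> CleanRows(IEnumerable<string> lines)
    {
        List<string[]> rows = lines.Select(l => l.Split(',')).ToList();
        string[] header = rows[0];

        return rows.Skip(1)                                 // ignore the first header line
            .Where(r => !r.SequenceEqual(header))           // drop repeated header lines
            .Where(r => !r.All(string.IsNullOrWhiteSpace))  // drop rows with no content
            .OrderBy(r => r[0])                             // sort by the first column
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample data standing in for File.ReadLines(...).
        var lines = new[] { "Lot ID,Value", "B2,10", ",", "Lot ID,Value", "A1,20" };

        foreach (string[] row in CleanRows(lines))
            Console.WriteLine(string.Join(" | ", row));
    }
}
```

Filtering before sorting gives the same rows as the loop above; it just keeps the skip conditions out of the foreach body.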

openXML overwrites formatted cell in template

I have a formatted template stored in the database.
After building and opening the Excel file, the cell has the format, but the value is not displayed the way it should be.
Example: in the template the field looks like 1234.56$, but now it looks like 1234.56, so the $ is missing.
Second example: it should look like 12%, but now it looks like 11.9999999997%.
The values I insert are exact values, like 1234.56 and 11.9999999997%. If I enter them manually in the generated Excel file, the formatting works, but not when the file is created.
Does anyone have any ideas?
My insert statement:
public static void InsertRows(List<ExcelRow> rowDefinitions, Stream template, string sheetName)
{
using (SpreadsheetDocument doc = SpreadsheetDocument.Open(template, true))
{
// tell Excel to recalculate formulas next time it opens the doc
doc.WorkbookPart.Workbook.CalculationProperties.ForceFullCalculation = true;
doc.WorkbookPart.Workbook.CalculationProperties.FullCalculationOnLoad = true;
foreach (var rd in rowDefinitions)
{
// first get the context (WS + SheetData)
var ws = GetWorksheetPart(doc.WorkbookPart, sheetName);
var sheetData = ws.Worksheet.Descendants<SheetData>().First();
var nr = CreateRow((uint)rd.RowIndex, sheetData);
foreach (var cd in rd.Cells)
{
var c = EnsureCell(nr, cd.ColumnName);
SetCellValue(cd.CellText, c, doc.WorkbookPart.SharedStringTablePart);
}
}
doc.WorkbookPart.Workbook.Save();
}
}

DataRow constructor inaccessible when writing DataSet extension?

I am trying to write a couple of extensions to convert UniDataSets and UniRecords to DataSet and DataRow but I get the following error when I try to compile.
'System.Data.DataRow.DataRow(System.Data.DataRowBuilder)' is inaccessible due to its protection level
Is there any way to fix this or should I abandon this approach and come at it a different way?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Data;
using IBMU2.UODOTNET;
namespace Extentions
{
public static class UniDataExtentions
{
public static System.Data.DataSet ImportUniDataSet(this System.Data.DataSet dataSet, IBMU2.UODOTNET.UniDataSet uniDataSet)
{
foreach (UniRecord uniRecord in uniDataSet)
{
DataRow dataRow = new DataRow();
dataRow.ImportUniRecord(uniRecord);
dataSet.Tables[0].ImportRow(dataRow);
}
return dataSet;
}
public static void ImportUniRecord(this System.Data.DataRow dataRow, IBMU2.UODOTNET.UniRecord uniRecord)
{
int fieldCount = uniRecord.Record.Dcount();
// ADD COLUMNS
dataRow.Table.Columns.AddRange(new DataColumn[fieldCount]);
// ADD ROW
for (int x = 1; x < fieldCount; x++)
{
string stringValue = uniRecord.Record.Extract(x).StringValue;
dataRow[x] = stringValue;
}
}
}
}
It doesn't matter whether it's in an extension method or any other method. The DataRow constructor is not publicly accessible. You need to use the DataTable.NewRow() method to create a new DataRow.
It will use the schema information from the data table to create a row that matches it. If you tried to use the constructor on its own, the object would have no idea what schema should be used.
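As a minimal sketch of the NewRow() approach: the column names and the string arrays below are assumptions that stand in for the UniRecord fields, and the class and method names are hypothetical.

```csharp
using System;
using System.Data;

public static class NewRowSketch
{
    // Hypothetical helper: builds a two-column table from plain string records.
    public static DataTable BuildTable(string[][] records)
    {
        var table = new DataTable("records");
        table.Columns.Add("Field1", typeof(string)); // assumed column names
        table.Columns.Add("Field2", typeof(string));

        foreach (string[] record in records)
        {
            // NewRow() creates a detached row that matches the table's schema.
            DataRow row = table.NewRow();
            row[0] = record[0];
            row[1] = record[1];

            // Rows.Add attaches the row to the table.
            table.Rows.Add(row);
        }

        return table;
    }

    public static void Main()
    {
        DataTable table = BuildTable(new[]
        {
            new[] { "4", "Horse" },
            new[] { "10", "Moose" }
        });

        Console.WriteLine(table.Rows.Count); // prints 2
    }
}
```

The columns are defined once on the table before any row is created, so every row returned by NewRow() already has the right shape.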
I tried a simpler approach, however it is for multiple rows and can be applied to a single row as well:
//Declare a variable for multiple rows
DataRow[] rows = null;
//get some data in a DataTable named table
//Select specific data from DataTable named table
rows = table.Select("column = 'ColumnValue'");
//Read the value in a variable from the row
string ColumnValue = rows[0]["column"].ToString();
hope this helps...
