Creating groups/blocks from table based on value - c#

I have an input table in a DataGridView (the desired output is shown in green), and I need to produce this output:
'Start of block' 'Size' 'TypKar'
1.2.2017 0:00:02 14 6280
1.2.2017 0:03:33 2 3147
1.2.2017 0:04:17 2 4147
1.2.2017 0:04:28 2 6280
1.2.2017 0:04:59 10 3147
Right now I use a for loop: I write the first entry, then count until the value in the TypKar column changes. When it changes, I write the date and type on a new line and start counting again from 1.
for(int i = 0; i < dviewExport.RowCount; i++)
{
//first line in excel
if(totalCount == 0)
{
totalCount = 32;
signCount = 1;
excelWsExport.Cells[totalCount, 2] = (DateTime)dviewExport[0, i].Value;
excelWsExport.Cells[totalCount, 3] = 1;
excelWsExport.Cells[totalCount, 4] = dviewExport["TypKar", i].Value;
continue;
}
//value is same = just increment
if((excelWsExport.Cells[totalCount, 4] as Excel.Range).Value.ToString() == dviewExport["TypKar", i].Value.ToString())
{
excelWsExport.Cells[totalCount, 3] = (excelWsExport.Cells[totalCount, 3] as Excel.Range).Value + 1;
signCount++;
if(maxCount < signCount)
maxCount = signCount;
}
//value changed = write new line and restart incrementing
else
{
totalCount++;
signCount = 1;
excelWsExport.Cells[totalCount, 2] = (DateTime)dviewExport[0, i].Value;
excelWsExport.Cells[totalCount, 3] = 1;
excelWsExport.Cells[totalCount, 4] = dviewExport["TypKar", i].Value;
}
}
The problem is that I write each value to Excel directly, and when the data has several thousand rows this takes a long time.
Is it possible to speed this up with Excel interop (build an array and then paste the array into Excel), or with SQL, LINQ, or anything else?
I tried to find a similar question with answers, but I don't know how to describe my problem.

In one of the applications I'm working on right now I use something similar to:
string connectionString = "my connection string";
for (int i = 0; i < dataGridView1.RowCount - 1; i++)
{
    DataGridViewRow row = dataGridView1.Rows[i];
    // using closes the connection even if the insert throws
    using (SqlConnection conn = new SqlConnection(connectionString))
    {
        conn.Open();
        try
        {
            var queryString = "INSERT INTO [SQLdb] " +
                              "(columnNamesInDB) " +
                              "VALUES (#dataBeingRead)";
            // SqlCommand has no Close() method; dispose it instead
            using (SqlCommand comm = new SqlCommand(queryString, conn))
            {
                comm.ExecuteNonQuery();
            }
        }
        catch (Exception e)
        {
            //catch behavior
        }
    }
}
This loops through every row in the grid view and inserts it into SQL Server. It works quickly enough for our purposes (currently around 1,000 rows).

Based on Export a C# List of Lists to Excel, I managed to speed things up by filling generic lists, copying them into two-dimensional object arrays, and then assigning those arrays to Excel ranges. This is much faster than writing to Excel one cell at a time.
The catch is that Excel accepts neither List<T> nor a one-dimensional array. You have to pass it an object[,] (two-dimensional), and since my data had only one dimension, I gave the second dimension a size of 1.
//create generic lists
List<DateTime> listDate = new List<DateTime>();
List<int> listSize = new List<int>();
List<string> listSign = new List<string>();
//fill lists with data from wherever
for(int i = 0; i < dviewExport.RowCount; i++)
{
if(listSign.Count == 0)
{
signCount = 1;
listDate.Add((DateTime)dviewExport[0, i].Value);
listSize.Add(1);
listSign.Add((string)dviewExport[$"{Sign}", i].Value);
continue;
}
if(listSign[listSign.Count - 1] == dviewExport[$"{Sign}", i].Value.ToString())
{
listSize[listSize.Count - 1] += 1;
signCount++;
if(maxCount < signCount)
maxCount = signCount;
}
else
{
signCount = 1;
listDate.Add((DateTime)dviewExport[0, i].Value);
listSize.Add(1);
listSign.Add((string)dviewExport[$"{Sign}", i].Value);
}
}
//create two dimensional object lists with size of generic lists
object[,] outDate = new object[listDate.Count, 1];
object[,] outSize = new object[listSize.Count, 1];
object[,] outSign = new object[listSign.Count, 1];
//fill two dimensional object lists with data from generic lists
for(int row = 0; row < listDate.Count; row++)
{
outDate[row, 0] = listDate[row];
outSize[row, 0] = listSize[row];
outSign[row, 0] = listSign[row];
}
//set Excel ranges and paste the arrays
//the last row is 31 + Count so each range has exactly as many rows as its array
range = excelWsExport.get_Range($"B32:B{31 + listDate.Count}", Type.Missing);
range.NumberFormat = "d.MM.yyyy H:mm:ss";
range.Value = outDate;
range = excelWsExport.get_Range($"C32:C{31 + listSize.Count}", Type.Missing);
range.Value = outSize;
range = excelWsExport.get_Range($"D32:D{31 + listSign.Count}", Type.Missing);
range.Value = outSign;
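The same grouping can also be kept free of Excel calls entirely. Below is a minimal sketch, assuming the rows are already available as (time, type) pairs; BlockGrouper and ToBlocks are hypothetical names, not part of the original code. It collapses consecutive rows of the same type into blocks and returns a single three-column object[,], so one range assignment could replace the three above.

```csharp
using System;
using System.Collections.Generic;

public static class BlockGrouper
{
    // Collapses consecutive rows with the same type into (start, size, type)
    // blocks and returns them as a 2D array with one block per row.
    public static object[,] ToBlocks(IReadOnlyList<(DateTime Time, string Type)> rows)
    {
        var blocks = new List<(DateTime Start, int Size, string Type)>();
        foreach (var row in rows)
        {
            int last = blocks.Count - 1;
            if (last >= 0 && blocks[last].Type == row.Type)
            {
                // Same type as the current block: just grow it.
                var b = blocks[last];
                blocks[last] = (b.Start, b.Size + 1, b.Type);
            }
            else
            {
                // Type changed (or first row): start a new block of size 1.
                blocks.Add((row.Time, 1, row.Type));
            }
        }

        // Excel expects object[,]; columns are start, size, type.
        var result = new object[blocks.Count, 3];
        for (int i = 0; i < blocks.Count; i++)
        {
            result[i, 0] = blocks[i].Start;
            result[i, 1] = blocks[i].Size;
            result[i, 2] = blocks[i].Type;
        }
        return result;
    }
}
```

With the layout used above, the result would go to a range such as B32:D{31 + number of blocks} in one assignment. The date column would still need its NumberFormat set separately.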

Related

Microsoft.Office.Interop IndexOutOfRangeException

The IndexOutOfRangeException below stops my code from running (it compiles). I understand this kind of exception (array indexes and so on), but all I am trying to do is update the String subsection2 with the value in cell B[excelrow]. For some reason there is an index out of bounds exception, which does not make sense to me. Neither subsection2 nor excelrow is part of an array. The only array I can think of is the Excel sheet itself, but excelrow is an integer with a value of 3, so it should read row B3, and so on. (I've even tried reading B3 directly and I get the same error.)
For context, this method, called createsource, takes the Excel worksheet and the total number of rows in that sheet as input. It outputs a 2D array whose first column holds the Excel row index of each new order (each different customer) and whose second column holds the number of items ordered per customer.
The code for the method is below:
private int[,] createsource(Microsoft.Office.Interop.Excel.Worksheet xlWorksheet, int totalRows)
{
String subsection = "";
object subsection2 = "";
int orders = 0;
//figures out how many different pages there are going to be
for (int n = 3; n < totalRows + 1; n++)
{
if (!(xlWorksheet.get_Range("B" + n.ToString()).Text == subsection))
{
subsection = xlWorksheet.get_Range("B" + n.ToString()).Text;
orders++;
}
}
MessageBox.Show(orders.ToString());
int[,] source = new int[orders, 2];
int excelrow = 3;
subsection2 = xlWorksheet.get_Range("B" + excelrow.ToString()).Text;
int i;
for (i = 0; i < orders + 1; i++)
{
int j = 1;
if (excelrow == totalRows + 1)
{
break;
}
//Out of bounds exception is found in the below if statement updating subsection2:
if (!(xlWorksheet.get_Range("B" + excelrow.ToString()).Text == subsection2))
{
source[i, 0] = excelrow;
//MessageBox.Show(xlWorksheet.get_Range("B" + excelrow.ToString()).Text.ToString());
subsection2 = xlWorksheet.get_Range("B" + excelrow.ToString()).Text;
excelrow++;
}
for (int iter = 0; iter < 1;)
{
if (excelrow == totalRows + 1)
{
break;
}
if (xlWorksheet.get_Range("B" + excelrow.ToString()).Text == subsection2)
{
excelrow++;
j++;
}
if (!(xlWorksheet.get_Range("C" + excelrow.ToString()).Text == subsection2))
{
subsection2 = xlWorksheet.get_Range("C" + excelrow.ToString()).Text;
iter = 1;
}
}
source[i, 1] = j;
}
MessageBox.Show(source[2, 0].ToString());
return source;
}
I see the problem. You're declaring source as:
int[,] source = new int[orders, 2];
... okay, but look at your loop:
for (i = 0; i < orders + 1; i++)
... which later feeds into:
source[i, 0] = excelrow;
So if orders = 100, you've declared an array of length 100, with indexes 0-99. Your loop then runs from 0 to "less than 100 + 1", i.e. 0-100. On the last iteration i = 100, and you're trying to write to an array slot that doesn't exist.
You need to either shorten your loop by one iteration or increase your array size by 1.
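As a minimal sketch of the first option: tying the loop bound to the array's own length keeps the two from drifting apart. BoundsDemo and Fill are hypothetical names, and the values written are placeholders rather than the real order data.

```csharp
using System;

public static class BoundsDemo
{
    // Allocates one row per order and fills it without running past the end.
    public static int[,] Fill(int orders)
    {
        var source = new int[orders, 2];

        // GetLength(0) is the size of the first dimension, so i stays in 0..orders-1.
        for (int i = 0; i < source.GetLength(0); i++)
        {
            source[i, 0] = i + 3; // placeholder for the Excel row index
            source[i, 1] = 1;     // placeholder for the item count
        }
        return source;
    }
}
```

Note that the final MessageBox.Show(source[2, 0].ToString()) in the question has the same kind of assumption built in: it only works when there are at least three orders.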

SSIS Export to Excel using Script Task

I'm trying to use a Script Task to export data to Excel because some of the reports I generate simply have too many columns to keep using a template file.
The most annoying part about using a template is that if something as simple as a column header changes, the metadata breaks, forcing me to recreate my DataFlow. Because I use an OLE DB source, I need a Data Transformation task to convert between Unicode and non-Unicode character sets, and then I have to remap my Excel Destination to the "Copy of field x" columns for the Excel document to be created properly.
This takes far too long and I need a new approach.
I have the following method in a script task using Excel = Microsoft.Office.Interop.Excel:
private void ExportToExcel(DataTable dataTable, string excelFilePath = null)
{
Excel.Application excelApp = new Excel.Application();
Excel.Worksheet workSheet = null;
try
{
if (dataTable == null || dataTable.Columns.Count == 0)
throw new System.Exception("Null or empty input table!" + Environment.NewLine);
excelApp.Workbooks.Add();
workSheet = excelApp.ActiveSheet;
for (int i = 0; i < dataTable.Columns.Count; i++)
{
workSheet.Cells[1, (i + 1)] = dataTable.Columns[i].ColumnName;
}
foreach (DataTable dt in dataSet.Tables)
{
// Copy the DataTable to an object array
object[,] rawData = new object[dt.Rows.Count + 1, dt.Columns.Count];
// Copy the column names to the first row of the object array
for (int col = 0; col < dt.Columns.Count; col++)
{
rawData[0, col] = dt.Columns[col].ColumnName;
}
// Copy the values to the object array
for (int col = 0; col < dt.Columns.Count; col++)
{
for (int row = 0; row < dt.Rows.Count; row++)
{
rawData[row + 1, col] = dt.Rows[row].ItemArray[col];
}
}
// Calculate the final column letter
string finalColLetter = string.Empty;
string colCharset = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
int colCharsetLen = colCharset.Length;
if (dt.Columns.Count > colCharsetLen)
{
finalColLetter = colCharset.Substring((dt.Columns.Count - 1) / colCharsetLen - 1, 1);
}
finalColLetter += colCharset.Substring((dt.Columns.Count - 1) % colCharsetLen, 1);
workSheet.Name = dt.TableName;
// Fast data export to Excel
string excelRange = string.Format("A1:{0}{1}", finalColLetter, dt.Rows.Count + 1);
//The code crashes here (ONLY in SSIS):
workSheet.get_Range(excelRange, Type.Missing).Value2 = rawData;
// Mark the first row as BOLD
((Excel.Range)workSheet.Rows[1, Type.Missing]).Font.Bold = true;
}
List<int> lstColumnsToSum = new List<int>() { 9 };
Dictionary<int, string> dictColSumName = new Dictionary<int, string>() { { 9, "" } };
Dictionary<int, decimal> dictColumnSummation = new Dictionary<int, decimal>() { { 9, 0 } };
// rows
for (int i = 0; i < dataTable.Rows.Count; i++)
{
for (int j = 1; j <= dataTable.Columns.Count; j++)
{
workSheet.Cells[(i + 2), (j)] = dataTable.Rows[i][j - 1];
if (lstColumnsToSum.Exists(x => (x == j)))
{
decimal val = 0;
if (decimal.TryParse(dataTable.Rows[i][j - 1].ToString(), out val))
{
dictColumnSummation[j] += val;
}
}
}
}
//Footer
int footerRowIdx = 2 + dataTable.Rows.Count;
foreach (var summablecolumn in dictColSumName)
{
workSheet.Cells[footerRowIdx, summablecolumn.Key] = String.Format("{0}", dictColumnSummation[summablecolumn.Key]);
}
// check filepath
if (excelFilePath != null && excelFilePath != "")
{
try
{
if (File.Exists(excelFilePath))
File.Delete(excelFilePath);
workSheet.Activate();
workSheet.Application.ActiveWindow.SplitRow = 1;
workSheet.Application.ActiveWindow.FreezePanes = true;
int row = 1;
int column = 1;
foreach (var item in dataTable.Columns)
{
Excel.Range range = workSheet.Cells[row, column] as Excel.Range;
range.NumberFormat = "#";
range.EntireColumn.AutoFit();
range.Interior.Color = System.Drawing.ColorTranslator.ToOle(System.Drawing.Color.LightGray);
column++;
}
Excel.Range InternalCalculatedAmount = workSheet.Cells[1, 9] as Excel.Range;
InternalCalculatedAmount.EntireColumn.NumberFormat = "#0.00";
InternalCalculatedAmount.Columns.AutoFit();
workSheet.SaveAs(excelFilePath);
}
catch (System.Exception ex)
{
throw new System.Exception("Excel file could not be saved! Check filepath." + Environment.NewLine + ex.Message);
}
}
else // no filepath is given
{
excelApp.Visible = true;
}
}
catch (System.Exception ex)
{
throw new System.Exception(ex.Message + Environment.NewLine, ex.InnerException);
}
}
The exception thrown is a System.OutOfMemoryException when trying to execute the following piece of code:
workSheet.get_Range(excelRange, Type.Missing).Value2 = rawData;
My biggest frustration is that this method works 100% in a regular C# application.
The DataTable contains about 435000 rows. I know it's quite a bit of data but I use this very method, modified of course, to split data across multiple Excel worksheets in one of my other applications, and that DataSet contains about 1.1m rows. So less than half of my largest DataSet should be a walk-in-the-park...
Any light shed on this matter would be amazing!

Populating 3 columns in an excel sheet from C#, without having repeated values across rows

Sorry if the question title is a bit weird. I want to populate 500 Excel rows with a composite primary key made up of 3 columns. Two columns hold a randomly generated int between 1 and 50, and the third holds a date between 01.01.2006 and 31.12.2013. So I want 500 rows, each with a different combination of the three. Here's my code:
Type excelType = Type.GetTypeFromProgID("Excel.Application");
dynamic excel = Activator.CreateInstance(excelType);
excel.visible = true;
excel.Workbooks.Add();
Random rnd = new Random();
dynamic sheet = excel.ActiveSheet;
for (int i = 1; i <= 500; i++)
{
sheet.Cells[i, "A"] = rnd.Next(1,50);
sheet.Cells[i, "B"] = rnd.Next(1,50);
sheet.Cells[i, "C"] = RandomDay();
// this is where I'd check if a combination exists and if it does assign a new one
for (int j = 0; j <= i + 1; j++)
{
if ( sheet.Cells[j + 1, "A"] == sheet.Cells[i, "A"] &&
sheet.Cells[j + 1, "B"] == sheet.Cells[i, "B"] &&
sheet.Cells[j + 1, "C"] == sheet.Cells[i, "C"])
{
sheet.Cells[i, "A"] = rnd.Next(1,50);
sheet.Cells[i, "B"] = rnd.Next(1,50);
sheet.Cells[i, "C"] = RandomDay();
}
}
}
}
// random date method
public static DateTime RandomDay()
{
DateTime start = new DateTime(2006, 1, 1);
DateTime end = new DateTime(2013, 12, 31);
Random gen = new Random();
int range = (end - start).Days;
return start.AddDays(gen.Next(range));
}
I'm really not sure this would work, and it runs slowly because it has to iterate over and over to check whether the combination already exists. Does anyone have a better and faster solution?
Thank you all!
If your constraints allow it, I would recommend generating the unique values outside of Excel and inserting them afterwards, so you can collect them in a Dictionary of tuples first.
That way you can check for preexisting values by building a String from your values and using it as the key in your Dictionary. Then iterate over the Dictionary values and insert them into Excel.
A Dictionary is a hash table, so lookups take constant time on average, and you'll save a lot of time guaranteeing uniqueness.
Dictionary<String, Tuple<int, int, DateTime>> store = new Dictionary<String, Tuple<int, int, DateTime>>();
for (int i = 0; i < 500; i++)
{
    int n1 = rnd.Next(1,50);
    int n2 = rnd.Next(1,50);
    DateTime dt = RandomDay();
    // the separator keeps e.g. (1, 12) and (11, 2) from producing the same key
    String key = n1.ToString() + "|" + n2.ToString() + "|" + dt.ToString();
    while (store.ContainsKey(key)) {
        n1 = rnd.Next(1,50);
        n2 = rnd.Next(1,50);
        dt = RandomDay();
        key = n1.ToString() + "|" + n2.ToString() + "|" + dt.ToString();
    }
    // Tuple has no non-generic constructor; Tuple.Create infers the types
    store.Add(key, Tuple.Create(n1, n2, dt));
}
To add the rows to Excel, just iterate over store.Values.
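A variation on the same idea, as a minimal sketch: a HashSet of value tuples can serve as the uniqueness check directly, which avoids building string keys. UniqueKeys and Generate are hypothetical names. The sketch also makes two assumptions that differ from the question's code: it uses Next(1, 51) because the upper bound of Random.Next is exclusive, and it reuses one Random instance for the dates instead of creating a new one per call.

```csharp
using System;
using System.Collections.Generic;

public static class UniqueKeys
{
    // Returns count rows of (int, int, date) with no repeated combination,
    // laid out as object[,] so they can be written to Excel in one assignment.
    public static object[,] Generate(int count, Random rnd)
    {
        var start = new DateTime(2006, 1, 1);
        var end = new DateTime(2013, 12, 31);
        int days = (end - start).Days + 1; // +1 so the end date itself can be drawn

        var seen = new HashSet<(int, int, DateTime)>();
        var rows = new object[count, 3];
        int filled = 0;
        while (filled < count)
        {
            // Next(1, 51) yields 1..50 because the upper bound is exclusive.
            var key = (rnd.Next(1, 51), rnd.Next(1, 51), start.AddDays(rnd.Next(days)));

            // HashSet.Add returns false when the combination already exists.
            if (!seen.Add(key)) continue;

            rows[filled, 0] = key.Item1;
            rows[filled, 1] = key.Item2;
            rows[filled, 2] = key.Item3;
            filled++;
        }
        return rows;
    }
}
```

The returned array has 500 rows and 3 columns for count = 500, so it could be assigned to a range such as A1:C500 in a single call rather than cell by cell.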

How do I save values from a List<string> to an Excel worksheet using EPPLus

I'm trying to save values from a List<string> to a Excel Worksheet using EPPlus so I wrote this code:
private void button3_Click(object sender, EventArgs e)
{
int value = bdCleanList.Count() / Int32.Parse(textBox7.Text);
string bases_generadas = System.IO.Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "bases_generadas");
var package = new ExcelPackage();
package.Workbook.Worksheets.Add("L1");
ExcelWorksheet worksheet = package.Workbook.Worksheets[1];
worksheet.Name = "L1";
int j = 2;
int col = 1;
for (int i = 1; i < bdCleanList.Count(); i++)
{
if (i%Int32.Parse(textBox7.Text) == 0)
{
package.Workbook.Worksheets.Add("L" + j);
worksheet = package.Workbook.Worksheets[j];
worksheet.Name = "L" + j;
j += 1;
worksheet.Cells[i, col].Value = bdCleanList[i];
}
else
{
worksheet.Cells[i, col].Value = bdCleanList[i];
}
}
Byte[] bin = package.GetAsByteArray();
File.WriteAllBytes(System.IO.Path.Combine(bases_generadas, "bases_generadas_" + DateTime.Now.Ticks.ToString() + DateTime.Now.ToString("dd-MM-yyyy-hh-mm-ss") + ".xlsx"), bin);
MessageBox.Show("Se generaron un total de " + value + " bases y puede encontrarlas en la siguiente ruta: " + System.IO.Path.Combine(bases_generadas, "bases_generadas_" + DateTime.Now.Ticks.ToString() + DateTime.Now.ToString("dd-MM-yyyy-hh-mm-ss") + ".xlsx"), "Información", MessageBoxButtons.OK, MessageBoxIcon.Information);
}
In the sample I'm running, bdCleanList.Count() is 2056 and Int32.Parse(textBox7.Text) is 500, so value gets 5 in this case. The problem is that the values for L2, L3 ... L5 aren't saved and I don't know why. The values for the first worksheet are saved fine but the rest are not. What's wrong in my code? How do I set the active worksheet so that values are saved on it? How do I move between worksheets?
OK, after a few days of headaches, many hours rereading my code, and searching the Internet, I found the solution. My code was "good", but I didn't see the mistake until I debugged it line by line several times. The line worksheet.Cells[i, col].Value = bdCleanList[i]; is where I set the cell values, and it does set them, but only L1 comes out as expected: because i starts at 0 and I wrote to row (i+1), the cells run from 1 to 499. For L2 the values start in cell 500 and end in cell 1000, because i was already at 500 when L2 was created. The values were there, but I hadn't scrolled down the column far enough to see them. That was the problem, so I changed my code to this:
int pos = 1;
for (int i = 0; i < bdCleanList.Count(); i++)
{
if ((i + 1) % Int32.Parse(textBox7.Text) == 0)
{
package.Workbook.Worksheets.Add("B" + j);
worksheet = package.Workbook.Worksheets[j];
worksheet.Name = "B" + j;
j += 1;
pos = 1;
}
worksheet.Cells[pos, 1].Value = bdCleanList[i];
pos++;
}
And that does the job as I want. Thanks to everyone here for trying to help me.
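The row reset can also be expressed as plain arithmetic on the list index. The following is a minimal sketch, not the code above: SheetLayout and Locate are hypothetical names, and unlike the loop above it puts exactly perSheet items on every sheet except possibly the last.

```csharp
using System;

public static class SheetLayout
{
    // Maps a zero-based list index to a one-based sheet number and a
    // one-based row on that sheet, with perSheet items per sheet.
    public static (int Sheet, int Row) Locate(int index, int perSheet)
    {
        if (perSheet <= 0)
            throw new ArgumentOutOfRangeException(nameof(perSheet));

        // Integer division picks the sheet; the remainder restarts the row at 1.
        return (index / perSheet + 1, index % perSheet + 1);
    }
}
```

For 2056 items at 500 per sheet, index 0 maps to sheet 1, row 1, index 500 maps to sheet 2, row 1, and the last index (2055) maps to sheet 5, row 56.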

Fastest way to drop a DataSet into a worksheet

A rather huge dataset with 16000 x 12 entries needs to be dumped into a worksheet.
I use the following loop now:
for (int r = 0; r < dt.Rows.Count; ++r)
{
for (int c = 0; c < dt.Columns.Count; ++c)
{
worksheet.Cells[c + 1][r + 1] = dt.Rows[r][c].ToString();
}
}
I reduced the example to its central piece.
Here is what I implemented after reading the suggestion from Dave Zych.
This works great.
private static void AppendWorkSheet(Excel.Workbook workbook, DataSet data, String tableName)
{
Excel.Worksheet worksheet;
if (UsedSheets == 0) worksheet = workbook.Worksheets[1];
else worksheet = workbook.Worksheets.Add();
UsedSheets++;
DataTable dt = data.Tables[0];
var valuesArray = new object[dt.Rows.Count, dt.Columns.Count];
for (int r = 0; r < dt.Rows.Count; ++r)
{
for (int c = 0; c < dt.Columns.Count; ++c)
{
valuesArray[r, c] = dt.Rows[r][c].ToString();
}
}
Excel.Range c1 = (Excel.Range)worksheet.Cells[1, 1];
Excel.Range c2 = (Excel.Range)worksheet.Cells[dt.Rows.Count, dt.Columns.Count];
Excel.Range range = worksheet.get_Range(c1, c2);
range.Cells.Value2 = valuesArray;
worksheet.Name = tableName;
}
Build a 2D array of your values from your DataSet, and then you can set a range of values in Excel to the values of the array.
var valuesArray = new object[dt.Rows.Count, dt.Columns.Count];
for(int i = 0; i < dt.Rows.Count; i++)
{
//If you know the number of columns you have, you can specify them this way
//Otherwise use an inner for loop on columns
valuesArray[i, 0] = dt.Rows[i]["ColumnName"].ToString();
valuesArray[i, 1] = dt.Rows[i]["ColumnName2"].ToString();
...
}
//Calculate the second column value by the number of columns in your dataset
//"O" is just an example in this case
//Also note: Excel is 1 based index
var sheetRange = worksheet.get_Range("A2:O2",
string.Format("A{0}:O{0}", dt.Rows.Count + 1));
sheetRange.Cells.Value2 = valuesArray;
This is much, much faster than looping and setting each cell individually. If you're setting each cell individually, you have to talk to Excel through COM (for lack of a better phrase) for each cell (which in your case is ~192,000 times), which is incredibly slow. Looping, building your array and only talking to Excel once removes much of that overhead.
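The array-building step and the range address can be computed without touching Excel at all. Here is a minimal sketch, assuming the data starts in column A; RangeHelper, ToArray, ColumnLetters and Address are hypothetical names, not part of the interop API.

```csharp
using System;
using System.Data;

public static class RangeHelper
{
    // Copies every cell of the table into a 2D array (rows x columns).
    public static object[,] ToArray(DataTable dt)
    {
        var values = new object[dt.Rows.Count, dt.Columns.Count];
        for (int r = 0; r < dt.Rows.Count; r++)
            for (int c = 0; c < dt.Columns.Count; c++)
                values[r, c] = dt.Rows[r][c];
        return values;
    }

    // Converts a one-based column number to Excel letters (1 -> A, 27 -> AA).
    public static string ColumnLetters(int column)
    {
        string letters = "";
        while (column > 0)
        {
            int remainder = (column - 1) % 26;
            letters = (char)('A' + remainder) + letters;
            column = (column - 1) / 26;
        }
        return letters;
    }

    // Builds an A1-style address that starts in column A at firstRow and
    // spans exactly rows x columns cells.
    public static string Address(int firstRow, int rows, int columns)
    {
        return $"A{firstRow}:{ColumnLetters(columns)}{firstRow + rows - 1}";
    }
}
```

For the 16000 x 12 dataset with a header in row 1, Address(2, 16000, 12) gives "A2:L16001", which has the same shape as the array returned by ToArray, so the whole table can be assigned to that range in one call.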
