Check values in DataTable - c#

I have a DataTable dtOne, having records as below:
ColumnA ColumnB ColumnC
1001 W101 ARCH
1001 W102 ARCH
1002 W103 CUSS
1003 W104 ARCH
And another DataTable dtTwo, having values as:
ColumnA
ARCH
CUSS
I need to check whether values of dtTwo exist in dtOne or not, if not write it on the webpage.
I wrote the below code, but it doesn't work properly. I need to check like, if ARCH from dtTwo table is present in dtOne, don't check further, just write it to the webpage.
for (int counter = 0; counter < dtTwo.Rows.Count; counter++)
{
var contains=dtOne.Select("ColumnC= '" + dtTwo.Rows[counter][0].ToString() + "'");
if (contains.Length == 0)
{
Response.Write("CostCode "+dtTwo.Rows[counter][0].ToString()+" not present in the Excel");
}
}
Experts please help.
EDIT:
My functionality is achieved when I write the below code, but get a warning that unreachable code detected at counter variable.
I don't think its correct.
for (int counter = 0; counter < dtTwo.Rows.Count; counter++)
{
var contains=dtOne.Select("ColumnC= '" + dtTwo.Rows[counter][0].ToString() + "'");
if (contains.Length == 0)
{
Response.Write("CostCode "+dtTwo.Rows[counter][0].ToString()+" not present in the Excel");
}
break;
}
Regards

According to your clarification, you want to stop if you find a match from the second table in the first table.
Then you need to add the break statement when your Select finds one or more rows that matches the condition
for (int counter = 0; counter < dtTwo.Rows.Count; counter++)
{
var contains=dtOne.Select("ColumnC= '" + dtTwo.Rows[counter][0].ToString() + "'");
if (contains.Length != 0)
{
Response.Write("CostCode "+dtTwo.Rows[counter][0].ToString()+" not present in the Excel");
break;
}
}
From the C# reference
The break statement terminates the closest enclosing loop or switch
statement in which it appears. Control is passed to the statement that
follows the terminated statement, if any.

Related

Grouping results from for-loop C#

First time asking a question on Stackoverflow, fingers crossed! I'm new to programming and I'm struggeling to solve an issue. I got this foor loop:
for (int i = 0; i < antal.Count(); i++)
{
tbxResultat.AppendText(namnLista.ElementAt(i) + " \t");
tbxResultat.AppendText(personnummerLista.ElementAt(i).ToString() + " \t");
tbxResultat.AppendText(distriktLista.ElementAt(i) + " \t");
tbxResultat.AppendText(antal.ElementAt(i).ToString() + newLine);
}
I want to group the results from the loop into 4 sections: first values 0-49, second 50-99, third 100-199 and fourth 199. I'm interested in seeing how many is in each section and having it printed right each section, like:
23
37
---------------------------------
Total count of 2 in first section.
I've tried putting the for-loop in if statments but with no success. The sortment of the list is done with bubble sort which i modified to take a List instead of array. Any tips in the right direction would be much appriciated!
/Drone
This code should be what you asked for. Note the while in there. It's not if because there may be groups with zero elements, so you have to handle those as well.
int groupLimit = 50;
int elementsInGroup = 0;
for (int i = 0; i < antal.Count(); i++, elementsInGroup++)
{
while (antal[i] >= groupLimit)
{
// Write summary here. Use elementsInGroup..
groupLimit += 50;
elementsInGroup = 0;
}
// Write one row here. I suggest building the string and calling only one AppendText..
}

Instead of deleting the rows one by one in excel, the whole data is getting deleted in one execution of loop

Instead of deleting every row one by one,its deleting all the data of the excel sheet in one execution. The problem is that i dont want to delete the columns and hence i am iterating the loop till 2. But while debugging i found that in one execution of the loop,that is, i=47 that is the last row , its deleting all the records.
//last row=47
for (int i = lastRow; i >= 2; i--)
{
Array MyValues = (Array)MySheet.get_Range("A" + i.ToString(), "L" + i.ToString()).Cells.Value;//storing the range of values in an array
if (Convert.ToString(MyValues.GetValue(1,1))=="" && Convert.ToString(MyValues.GetValue(1,2))==""&&Convert.ToString(MyValues.GetValue(1,3))=="")
{
break;
}
else
{
Excel.Range cells = MySheet.get_Range("A" + i.ToString(), "L" + i.ToString());
cells.Delete();
}
}

Accessing and Querying Event Log in Windows

I was wondering how I could reach event Log entries. I have a client server application and it executes without problems. What i am looking for is all instances of log with the id of 1149. This log is of the remote connection entries. I have taken a piece of code, here it is.
string logType = "System";
string str = "";
EventLog ev = new EventLog(logType, System.Environment.MachineName);
int LastLogToShow = ev.Entries.Count;
if (LastLogToShow <= 0)
Console.WriteLine("No Event Logs in the Log :" + logType);
// Read the last 2 records in the specified log.
int i;
for (i = ev.Entries.Count; i >= LastLogToShow - 1000 ; i--)
{
EventLogEntry CurrentEntry = ev.Entries[i];
if (CurrentEntry.InstanceId == 1149)
{
str += "Event type: " + CurrentEntry.EntryType.ToString() + "\n" +
"Event Message: " + CurrentEntry.Message + CurrentEntry + "\n" +
"Event Time: " + CurrentEntry.TimeGenerated.ToShortTimeString() + "\n" +
"Event : " + CurrentEntry.UserName +"\n" +"\n";
}
}
ev.Close();
return str;
The thing is I get the 42567 index is out of bounds exception everytime. I also dont know if it will work after that, so questions may follow.
EDIT:
Indeed, the problem was me reaching out of the eventlog with my index like you guys said. Using this line for the loop solved my problem here and I am able to reach the eventlog now, if anyone is looking around, this solution worked for me, so thank you all so much.
for (i = ev.Entries.Count - 1; i >= 0; i--)
This for (i = ev.Entries.Count; i >= LastLogToShow - 1000 ; i--) is causing your error. I don't really get what you're trying to do here. For one if you have less than 1000 entries, your i can be negative. When you use a negative value as the index of an array you will get "index is out of bounds exception". When you are trying to process only the last 2 records (as your commentary above the for-loop suggests) you should just use this:
for (i = ev.Entries.Count - 1; i >= ev.Entries.Count - 2; i--)
Of course you will still have to check if there is more than 2 entries because if there are 0 entries, the code will still go into the for-loop and try to access the array with negative indexes:
if(ev.Entries.Count < 2)
return str;
for (i = ev.Entries.Count - 1; i >= ev.Entries.Count - 2; i--)
Edit: Also just noticed even if there are more than 1000 records, when you go into the for-loop for the first time you will have ev.Entries[ev.Entries.Count]. Since array-indexes are zero-based you have to substract 1 from the count to get the last element of an array.
I highly suggest you use C# Linq for this.
Add this namespace
using System.Linq;
Linq is very similar to SQL in how it works with the data. In your case:
List<string> stringLogs =
ev.Entries
.Where(t => t.InstanceId == 1149)
.Select(t => GenerateLogString(t))
.ToList();
public string GenerateLogString(EventLogEntry CurrentEntry)
{
return
string.Format("Event type: {0}\nEvent Message: {1}\nEvent Time: {2}\nEvent: {3}\n",
CurrentEntry.EntryType.ToString(),
CurrentEntry.Message + CurrentEntry,
CurrentEntry.TimeGenerated.ToShortTimeString(),
CurrentEntry.UserName)
}
You can then convert the string logs into a single string, like you have there.
string str = string.Join("/n", stringLogs);
If you want to select the top 2 logs (as your commentary suggests), add a .Take(2) to the query, like below.
List<string> stringLogs =
ev.Entries
.Where(t => t.InstanceId == 1149)
.Take(2)
.Select(t => GenerateLogString(t))
.ToList();
You just need to equal i to ev.Entries.Count -1.
i = (ev.Entries.Count -1)

Index out of range exception in using ParallelFor loop

This is a very weird situation, first the code...
The code
private List<DispatchInvoiceCTNDataModel> WorksheetToDataTableForInvoiceCTN(ExcelWorksheet excelWorksheet, int month, int year)
{
int totalRows = excelWorksheet.Dimension.End.Row;
int totalCols = excelWorksheet.Dimension.End.Column;
DataTable dt = new DataTable(excelWorksheet.Name);
// for (int i = 1; i <= totalRows; i++)
Parallel.For(1, totalRows + 1, (i) =>
{
DataRow dr = null;
if (i > 1)
{
dr = dt.Rows.Add();
}
for (int j = 1; j <= totalCols; j++)
{
if (i == 1)
{
var colName = excelWorksheet.Cells[i, j].Value.ToString().Replace(" ", String.Empty);
lock (lockObject)
{
if (!dt.Columns.Contains(colName))
dt.Columns.Add(colName);
}
}
else
{
dr[j - 1] = excelWorksheet.Cells[i, j].Value != null ? excelWorksheet.Cells[i, j].Value.ToString() : null;
}
}
});
var excelDataModel = dt.ToList<DispatchInvoiceCTNDataModel>();
// now we have mapped everything expect for the IDs
excelDataModel = MapInvoiceCTNIDs(excelDataModel, month, year, excelWorksheet);
return excelDataModel;
}
The problem
When I am running the code on random occasion it would throw IndexOutOfRangeException on the line
dr[j - 1] = excelWorksheet.Cells[i, j].Value != null ? excelWorksheet.Cells[i, j].Value.ToString() : null;
For some random value of i and j. When I step over the code (F10), since it is running in a ParallelLoop, some other thread kicks and and other exception is throw, that other exception is something like (I could not reproduce it, it just came once, but I think it is also related to this threading issue) Column 31 not found in excelWorksheet. I don't understand how could any of these exception occur?
case 1
The IndexOutOfRangeException should not even occur, as the only code/shared variable dt I have locked around accessing it, rest all is either local or parameter so there should not have any thread related issue. Also, if I check the value of i or j in debug window, or even evaluate this whole expression dr[j - 1] = excelWorksheet.Cells[i, j].Value != null ? excelWorksheet.Cells[i, j].Value.ToString() : null; or a part of it in Debug window, then it works just fine, no errors of any sort or nothing.
case 2
For the second error, (which unfortunately is not reproducing now, but still) it should not have occurred as there are 33 columns in the excel.
More Code
In case someone might need how this method was called
using (var xlPackage = new ExcelPackage(viewModel.postedFile.InputStream))
{
ExcelWorksheets worksheets = xlPackage.Workbook.Worksheets;
// other stuff
var entities = this.WorksheetToDataTableForInvoiceCTN(worksheets[1], viewModel.Month, viewModel.Year);
// other stuff
}
Other
If someone needs more code/details let me know.
Update
Okay, to answer some comments. It is working fine when using for loop, I have tested that many times. Also, there is no particular value of i or j for which the exception is thrown. Sometimes it is 8, 6 at other time it could be anything, say 19,2or anything. Also, in the Parallel loop the +1 is not doing any damage as the msdn documentation says it is exclusive not inclusive. Also, if that were the issue I would only be getting exception at the last index (the last value of i) but that's not the case.
UPDATE 2
The given answer to lock around the code
dr = dt.Rows.Add();
I have changed it to
lock(lockObject) {
dr = dt.Rows.Add();
}
It is not working. Now I am getting ArgumentOutOfRangeException, still if I run this in debug window, it just works fine.
Update 3
Here is the full exception detail, after update 2 (I am getting this on the line that I mentioned in update 2)
System.ArgumentOutOfRangeException was unhandled by user code
HResult=-2146233086
Message=Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index
Source=mscorlib
ParamName=index
StackTrace:
at System.ThrowHelper.ThrowArgumentOutOfRangeException()
at System.Collections.Generic.List`1.get_Item(Int32 index)
at System.Data.RecordManager.NewRecordBase()
at System.Data.DataTable.NewRecordFromArray(Object[] value)
at System.Data.DataRowCollection.Add(Object[] values)
at AdminEntity.BAL.Service.ExcelImportServices.<>c__DisplayClass2e.<WorksheetToDataTableForInvoiceCTN>b__2d(Int32 i) in C:\Projects\Manager\Admin\AdminEntity\AdminEntity.BAL\Service\ExcelImportServices.cs:line 578
at System.Threading.Tasks.Parallel.<>c__DisplayClassf`1.<ForWorker>b__c()
InnerException:
Okay. So there are a few problems with your existing code, most of which have been touched on by others:
Parallel threads are at the mercy of the OS scheduler; therefore, although threads are queued in-order, they may (and often do) complete execution out-of-order. For example, given Parallel.For(0, 10, (i) => { Console.WriteLine(i); });, the first four threads (on a quad-core system) will be queued with i values 0-3. But any one of those threads may start or finish executing before any other. So you may see 2 printed first, whereupon thread 4 will be queued. Then thread 1 might complete, and thread 5 will be queued. Then thread 4 might complete, even before threads 0 or 3 do. Etc., etc. TL;DR: You CANNOT assume an ordered output in parallel.
Given that, as #ScottChamberlain noted, it's a very bad idea to do column generation within your parallel loop - because you have no guarantee that the thread doing column generation will create all your columns before another thread starts assigning data in rows to those column indices. E.g. you could be assigning data to cell [0,4] before your table actually has a fifth column.
It's worth noting that this should really be broken out of the loop anyway, purely from a code cleanliness perspective. At the moment, you have two nested loops, each with special behavior on a single iteration; better to separate that setup logic into its own loop and leave the main loop to assign data and nothing else.
For the same reason, you should not be creating new rows in the table within your parallel loop - because you have no guarantee that the rows will be added to the table in their source order. Break that out too, and access rows within the loop based on their index.
Some have mentioned using DataRow.NewRow() before Rows.Add(). Technically, NewRow() is the right way to go about things, but the actual recommended access pattern is a bit different than is probably appropriate for a cell-by-cell function, particularly when parallelism is intended (see MSDN: DataTable.NewRow Method). The fact remains that adding a new, blank row to a DataTable with Rows.Add() and populating it afterwards functions properly.
You can clean up your string formatting with the null-coalescing operator ??, which evaluates whether the preceding value is null, and if so, assigns the subsequent value. For example, foo = bar ?? "" is the equivalent of if (bar == null) { foo = ""; } else { foo = bar; }.
So right off the bat, your code should look more like this:
private void ReadIntoTable(ExcelWorksheet sheet)
{
DataTable dt = new DataTable(sheet.Name);
int height = sheet.Dimension.Rows;
int width = sheet.Dimension.Columns;
for (int j = 1; j <= width; j++)
{
string colText = (sheet.Cells[1, j].Value ?? "").ToString();
dt.Columns.Add(colText);
}
for (int i = 2; i <= height; i++)
{
dt.Rows.Add();
}
Parallel.For(1, height, (i) =>
{
var row = dt.Rows[i - 1];
for (int j = 0; j < width; j++)
{
string str = (sheet.Cells[i + 1, j + 1].Value ?? "").ToString();
row[j] = str;
}
});
// convert to your special Excel data model
// ...
}
Much better!
...but it still doesn't work!
Yep, it still fails with an IndexOutOfRange exception. However, since we took your original line dr[j - 1] = excelWorksheet.Cells[i, j].Value != null ? excelWorksheet.Cells[i, j].Value.ToString() : null; and split it into a couple pieces, we can see exactly which part it fails on. And it fails on row[j] = str;, where we actually write the text into the row.
Uh-oh.
MSDN: DataRow Class
Thread Safety
This type is safe for multithreaded read operations. You must synchronize any write operations.
*sigh*. Yeah. Who knows why DataRow uses static anything when assigning values, but there you have it; writing to DataRow isn't thread-safe. And sure enough, doing this...
private static object s_lockObject = "";
private void ReadIntoTable(ExcelWorksheet sheet)
{
// ...
lock (s_lockObject)
{
row[j] = str;
}
// ...
}
...magically makes it work. Granted, it completely destroys the parallelism, but it works.
Well, it almost completely destroys the parallelism. Anecdotal experimentation on an Excel file with 18 columns and 46319 rows shows that the Parallel.For() loop creates its DataTable in about 3.2s on average, whereas replacing Parallel.For() with for (int i = 1; i < height; i++) takes about 3.5s. My guess is that, since the lock is only there for writing data, there is a very small benefit realized by writing data on one thread and processing text on the other(s).
Of course, if you can create your own DataTable replacement class, you can see a much larger speed boost. For example:
string[,] rows = new string[height, width];
Parallel.For(1, height, (i) =>
{
for (int j = 0; j < width; j++)
{
rows[i - 1, j] = (sheet.Cells[i + 1, j + 1].Value ?? "").ToString();
}
});
This executes in about 1.8s on average for the same Excel table mentioned above - about half the time of our barely-parallel DataTable. Replacing the Parallel.For() with the standard for() in this snippet makes it run in about 2.5s.
So you can see a significant performance boost from parallelism, but also from a custom data structure - although the viability of the latter will depend on your ability to easily convert the returned values to that Excel data model thing, whatever it is.
The line dr = dt.Rows.Add(); is not thread safe, you are corrupting the internal state of the array in the DataTable that hold the rows for the table.
At first glance changing it to
if (i > 1)
{
lock (lockObject)
{
dr = dt.Rows.Add();
}
}
should fix it, but that does not mean other thread safety problems are not there from excelWorksheet.Cells being accessed from multiple threads. (If excelWorksheet is this class and you are running a STA main thread (WinForms or WPF) COM should marshal the cross thread calls for you)
EDIT: New thory, the problem comes from the fact that you are setting up your schema inside the parallel loop and attempting to write to it at the same time. Pull out all of the i == 1 logic to before the loop and then start at i == 2
private List<DispatchInvoiceCTNDataModel> WorksheetToDataTableForInvoiceCTN(ExcelWorksheet excelWorksheet, int month, int year)
{
int totalRows = excelWorksheet.Dimension.End.Row;
int totalCols = excelWorksheet.Dimension.End.Column;
DataTable dt = new DataTable(excelWorksheet.Name);
//Build the schema before we loop in parallel.
for (int j = 1; j <= totalCols; j++)
{
var colName = excelWorksheet.Cells[1, j].Value.ToString().Replace(" ", String.Empty);
if (!dt.Columns.Contains(colName))
dt.Columns.Add(colName);
}
Parallel.For(2, totalRows + 1, (i) =>
{
DataRow dr = null;
lock(lockObject) {
dr = dt.Rows.Add();
}
for (int j = 1; j <= totalCols; j++)
{
dr[j - 1] = excelWorksheet.Cells[i, j].Value != null ? excelWorksheet.Cells[i, j].Value.ToString() : null;
}
});
var excelDataModel = dt.ToList<DispatchInvoiceCTNDataModel>();
// now we have mapped everything expect for the IDs
excelDataModel = MapInvoiceCTNIDs(excelDataModel, month, year, excelWorksheet);
return excelDataModel;
}
You code is incorrect:
1) Parallel.For has its own batching mechanism (can be customized with ForEach with partitioners though) and does not guarantee that operation with (for) i==n will be executed after operation with i==m where n>m.
So line
dr[j - 1] = excelWorksheet.Cells[i, j].Value != null ? excelWorksheet.Cells[i, j].Value.ToString() : null;
throw exception when required column is not added yet (in {i==1} operation}
2) And it's recommended to use NewRow method:
dr=tbl.NewRow->Populate dr->tbl.Rows.Add(dr)
or Rows.Add(object[] values):
values=[KnownColumnCount]->Populate values->tbl.Rows.Add(values)
3) It's really better to populate columns first in this case, because it's sequential access to excel file (seek) and It would not harm perfomance
Have you tried using NewRow when creating the new datarow and moving the creation of the columns outside the parallel loop like Scott Chamberlain suggested above? By using newrow you're creating a row with the same schema as the parent datatable. I got the same error as you when I tried your code with a random excel file, but got it to work like this:
for (int x = 1; x <= totalCols; x++)
{
var colName = excelWorksheet.Cells[1, x].Value.ToString().Replace(" ", String.Empty);
if (!dt.Columns.Contains(colName))
dt.Columns.Add(colName);
}
Parallel.For(2, totalRows + 1, (i) =>
{
DataRow dr = null;
for (int j = 1; j <= totalCols; j++)
{
dr = dt.NewRow();
dr[j - 1] = excelWorksheet.Cells[i, j].Value != null
? excelWorksheet.Cells[i, j].Value.ToString()
: null;
lock (lockObject)
{
dt.Rows.Add(dr);
}
}
});

Try block being called twice

So I have a function that retrieves data from a database, and outputs it. It's encased in a try block, so that the error from running out of rows to process is dealt with.
Here's the function that is called on load (to make the initial output), and called later on to update the log box.
Problem is, whenever I call the function again, it outputs the data twice. Since only the data is in the try block, and not the headers, this points to the try / catch being the issue.
Apologies for the messy / hacky code:
private void NavigateRecords()
{
da.Fill(ds1, "LogOutput");
textlog4.Text = "Date \t\t Time \t\t Floor\r\n";
try
{
for (int i = 0; i <= 20; i++)
{
DataRow dRow = ds1.Tables["LogOutput"].Rows[i];
for (int i2 = 1; i2 <= 3; i2++)
{
textlog4.Text += dRow.ItemArray.GetValue(i2).ToString() + "\t\t";
}
textlog4.Text += "\r\n";
}
}
catch { textlog4.Text += "----------------------"; }
}
This is the code that does part of the connecting, and may be of use:
string sql = "SELECT * From tblLog";
da = new System.Data.SqlClient.SqlDataAdapter(sql, con);
NavigateRecords();
Initial output:
Date Time Floor
6/12 18:18:22 1
----------------------
Output next time it is called:
Date Time Floor
6/12 18:18:22 1
6/12 18:46:19 2
6/12 18:18:22 1
6/12 18:46:19 2
----------------------
its not an issue with your try catch , its an issue with your variables. Clear out the variable before you add the new data to it. I do not see your variable declared in the scope of the method so it retains the data you original put in it.

Categories