I have a C# parsing program. One section of the code pulls in a list of rows from an Excel file that do not have a particular value in a particular column. I was then going to use a foreach loop to delete each of those rows, but it takes quite a long time to cycle through them, and there are multiple tabs I need to run this on.
So my thought was to turn the list of Excel rows into a range and then delete that range in one operation. Is it possible to convert that list of rows into an Excel range? Below is the code snippet:
XLWorkbook wb = new XLWorkbook(Path.Combine(Destination, fName) + ".xlsx");
IXLWorksheet ws = wb.Worksheet(SheetName);
var range = ws.RangeUsed();
var table = range.AsTable();
var cell = table.HeadersRow().CellsUsed(c => c.Value.ToString() == ColName).FirstOrDefault();
//Gets the column letter for use in next section
string colLetter = cell.WorksheetColumn().ColumnLetter();
//Create list of rows that DO NOT contain the inv number being searched
//This is the list I would like to convert to a range to speed up the delete
List<IXLRow> deleterows = ws
.Column(colLetter)
.CellsUsed(c => c.Value.ToString() != i)
.Select(c => c.WorksheetRow()).ToList();
//Deletes the header row so that isn't removed
deleterows.RemoveAt(0);
foreach (IXLRow x in deleterows)
{
x.Delete();
}
Right now each iteration checks the value of the cell by accessing that cell in the file, and that takes a lot of time.
Read all the data on one sheet into an array and do all the iteration in the array. That will be an order of magnitude faster.
Since you want to parse data, I suggest you use that array for any other operation you want to do and only write it back to a file when you are done. (If your result is stored in Excel again, make sure to write the whole array at once as a range.)
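A minimal sketch of that in-memory approach with ClosedXML, reusing `ws`, `colLetter` and the invoice-number variable `i` from the question (note this rewrites values only, so any cell formatting is lost):

```csharp
// 1. Pull every used cell value into memory in one pass.
var data = ws.RangeUsed()
             .Rows()
             .Select(r => r.Cells().Select(c => c.GetString()).ToArray())
             .ToList();

int colIndex = XLHelper.GetColumnNumberFromLetter(colLetter) - 1;

// 2. Keep the header plus the rows whose column DOES match the inv number
//    (everything else is what the question wants deleted).
var keep = new List<string[]> { data[0] };
keep.AddRange(data.Skip(1).Where(row => row[colIndex] == i));

// 3. Rewrite the sheet in one pass instead of deleting row-by-row.
ws.Clear();
for (int r = 0; r < keep.Count; r++)
    for (int c = 0; c < keep[r].Length; c++)
        ws.Cell(r + 1, c + 1).Value = keep[r][c];
```

All the comparisons now happen against an in-memory list rather than live worksheet cells, which is where the original loop was losing its time.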
I am following a tutorial of an inventory stock management system in C# language.
The original csv file is a stock list, which contains four categories:
Item Code, Item Description, Item Count, OnOrder
The original csv file:
In the tutorial, the code is generating a DataTable object, which will be used in the GridView demo in the application.
Here is the code:
DataTable dataTable = new DataTable();
dataTable.Columns.Add("Item Code");
dataTable.Columns.Add("Item Description");
dataTable.Columns.Add("Current Count");
dataTable.Columns.Add("On Order");
string CSV_FilePath = "C:/Users/xxxxx/Desktop/stocklist.csv";
StreamReader streamReader = new StreamReader(CSV_FilePath);
string[] rawData = new string[File.ReadAllLines(CSV_FilePath).Length];
rawData = streamReader.ReadLine().Split(',');
while(!streamReader.EndOfStream)
{
rawData = streamReader.ReadLine().Split(',');
dataTable.Rows.Add(rawData[0], rawData[1], rawData[2], rawData[3]);
}
dataGridView1.DataSource = dataTable;
I am assuming that rawData = streamReader.ReadLine().Split(','); splits each line of the file into an array like this:
["A0001", "Horse on Wheels","5","No"]
["A0002","Elephant on Wheels","2","No"]
In the while loop, it iterates through each line (each array) and assigns each rawData[x] to the corresponding column.
Is this the right way to understand this code snippet? Thanks in advance.
Another question is, why do I need to run
rawData = streamReader.ReadLine().Split(',');
in a while loop?
Your code should actually look like this:
DataTable dataTable = new DataTable();
dataTable.Columns.Add("Item Code");
dataTable.Columns.Add("Item Description");
dataTable.Columns.Add("Current Count");
dataTable.Columns.Add("On Order");
string CSV_FilePath = "C:/Users/xxxxx/Desktop/stocklist.csv";
using(StreamReader streamReader = new StreamReader(CSV_FilePath))
{
// Skip the header row
streamReader.ReadLine();
while(!streamReader.EndOfStream)
{
string[] rawData = streamReader.ReadLine().Split(','); // read a row and split it into cells
dataTable.Rows.Add(rawData[0], rawData[1], rawData[2], rawData[3]); // add the elements from each cell as a row in the datatable
}
}
dataGridView1.DataSource = dataTable;
Changes I've made:
We've added a using block around StreamReader to ensure that the file handle is only open for as long as we need to read the file.
We now only read the file once, not twice.
Since we only need the rawData in the scope of the while loop, I've moved it into the loop.
Explaining what's wrong:
The following line reads the entire file, and then counts how many rows are in it. With this information, we initialize an array with as many positions as there are rows in the file. This means for a 500 row file, you can access positions rawData[0], rawData[1], ... rawData[499].
string[] rawData = new string[File.ReadAllLines(CSV_FilePath).Length];
With the next row you discard that array, and instead take the cells from the top of the file (the headers):
rawData = streamReader.ReadLine().Split(',');
This line states "read a single line from the file, and split it by comma". You then assign that result to rawData, replacing its old value. So the reason you need this again in the loop is because you're interested in more than the first row of the file.
The loop then repeats this for each remaining row in the file, replacing rawData with the cells from that row. Finally, you add each row to the DataTable:
rawData = streamReader.ReadLine().Split(',');
dataTable.Rows.Add(rawData[0], rawData[1], rawData[2], rawData[3]);
Note that File.ReadAllLines(...) reads the entire file into memory as an array of strings. You're also using StreamReader to read through the file line-by-line, meaning that you are reading the entire file twice. This is not very efficient and you should avoid this where possible. In this case, we didn't need to do that at all.
Also note that your approach to reading a CSV file is fairly naïve. Depending on the software used to create them, some CSV files have cells that span more than one line in the file, some include quoted sections for text, and sometimes those quoted sections include commas which would throw off your split code. Your code also doesn't deal with the possibility of a file being badly formatted such that a row may have fewer cells than expected, or that there may be a trailing empty row at the end of the file. Generally it's better to use a dedicated CSV parser such as CsvHelper rather than trying to roll your own.
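As a sketch of that last suggestion, reading the same file with the CsvHelper NuGet package could look something like this (the StockItem class is my own illustration, not part of the original code):

```csharp
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;
using CsvHelper.Configuration.Attributes;

public class StockItem
{
    [Name("Item Code")]        public string ItemCode { get; set; }
    [Name("Item Description")] public string Description { get; set; }
    [Name("Item Count")]       public int Count { get; set; }
    [Name("OnOrder")]          public string OnOrder { get; set; }
}

// CsvHelper handles quoted fields, embedded commas and multi-line cells,
// and reads the header row for you.
using (var reader = new StreamReader(@"C:\Users\xxxxx\Desktop\stocklist.csv"))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
    var items = csv.GetRecords<StockItem>().ToList();
}
```

The [Name] attributes map the file's header names onto properties, so the column order in the file no longer matters.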
I have the table below, already sorted, and I want to verify that the table is sorted in order using C# Selenium:
numeric values are sorted first, then alphabetic values.
Display name
1
2
5
7
Abbot
Edfdsf
Fdsf
I need to verify this in C# Selenium.
My thoughts: would an easy way be to convert each row value to an ASCII number and compare it with the next row?
Please provide your suggestions.
I would suggest that you store the display names in a List, copy the List and sort it, and then compare it to the original list.
List<String> displayNames = new List<string>();
// grab the cells that contain the display names you want to verify are sorted
IReadOnlyList<IWebElement> cells = Driver.FindElements(locator);
// loop through the cells and add the display names to the List
foreach (IWebElement cell in cells)
{
displayNames.Add(cell.Text);
}
// make a copy of the displayNames list and sort it
List<String> displayNamesSorted = new List<string>(displayNames);
displayNamesSorted.Sort();
Console.WriteLine(displayNames.SequenceEqual(displayNamesSorted));
Most elegant way:
var cells = WebDriver.FindElements(locator);
Assert.IsTrue(cells.OrderBy(c => c.Text).SequenceEqual(cells));
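One caveat with both versions: Sort() and OrderBy use plain string comparison, so "10" would sort before "2". The sample data here is single-digit so it works, but if the column can contain multi-digit numbers, a numeric-aware comparison is needed. A sketch (my own suggestion, not part of the answers above):

```csharp
// Compare numerically when both values parse as numbers; numbers sort
// before text, and text falls back to ordinal string comparison.
int CompareMixed(string a, string b)
{
    bool aNum = double.TryParse(a, out double da);
    bool bNum = double.TryParse(b, out double db);
    if (aNum && bNum) return da.CompareTo(db);
    if (aNum) return -1;   // numbers come first
    if (bNum) return 1;
    return string.CompareOrdinal(a, b);
}

var sorted = new List<string>(displayNames);
sorted.Sort(CompareMixed);
bool isSorted = displayNames.SequenceEqual(sorted);
```

This mirrors the "sort a copy and compare" technique, just with an explicit Comparison&lt;string&gt; delegate.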
I have two DataGridViews in the main form: the first displays data from SAP and the other displays data from a Vertica DB. The FM I'm using is RFC_READ_TABLE, but there is an exception when calling this FM: if there are too many columns in the target table, the SAP connector returns a DATA_BUFFER_EXCEEDED exception. Are there any other FMs or ways of retrieving data from SAP without this exception?
I figured out one solution: split the fields into several arrays, store each part's data in a DataTable, then merge the DataTables. But I'm afraid it will cost a lot of time if the row count is too large.
screenshot of the program
here comes my codes:
RfcDestination destination = RfcDestinationManager.GetDestination(cmbAsset.Text);
readTable = destination.Repository.CreateFunction("RFC_READ_TABLE");
/*
* RFC_READ_TABLE will only extract data up to 512 chars per row.
* If you load more data, you will get an DATA_BUFFER_EXCEEDED exception.
*/
readTable.SetValue("query_table", table);
readTable.SetValue("delimiter", "~");//Assigns the given string value to the element specified by the given name after converting it appropriately.
if (tbRowCount.Text.Trim() != string.Empty) readTable.SetValue("rowcount", tbRowCount.Text);
t = readTable.GetTable("DATA");
t.Clear();//Removes all rows from this table.
t = readTable.GetTable("FIELDS");
t.Clear();
if (selectedCols.Trim() != "" )
{
string[] field_names = selectedCols.Split(",".ToCharArray());
if (field_names.Length > 0)
{
t.Append(field_names.Length);
int i = 0;
foreach (string n in field_names)
{
t.CurrentIndex = i++;
t.SetValue(0, n);
}
}
}
t = readTable.GetTable("OPTIONS");
t.Clear();
t.Append(1);//Adds the specified number of rows to this table.
t.CurrentIndex = 0;
t.SetValue(0, filter);//Assigns the given string value to the element specified by the given index after converting it appropriately.
try
{
readTable.Invoke(destination);
}
catch (Exception e)
{
    // Don't swallow the exception silently; at least surface it.
    MessageBox.Show(e.Message);
}
First of all, you should use BBP_READ_TABLE if it is available in your system. It is better for many reasons, but that is not the point of your question. RFC_READ_TABLE has two import parameters, ROWCOUNT and ROWSKIPS, and you have to use them.
I would recommend a ROWCOUNT between 30,000 and 60,000. You then execute the RFC several times, incrementing ROWSKIPS each time: first loop ROWCOUNT=30000 and ROWSKIPS=0, second loop ROWCOUNT=30000 and ROWSKIPS=30000, and so on.
Also be careful with float fields when using the old RFC_READ_TABLE; there is one in table LIPS, and this RFC has problems with them.
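A sketch of that paging loop with the SAP .NET Connector, reusing readTable and destination from the question (ROWCOUNT and ROWSKIPS are the standard RFC_READ_TABLE import parameters; the DATA table returns each row in its WA field):

```csharp
const int pageSize = 30000;
int skipped = 0;
var allRows = new List<string>();

while (true)
{
    readTable.SetValue("ROWCOUNT", pageSize);
    readTable.SetValue("ROWSKIPS", skipped);
    readTable.Invoke(destination);

    IRfcTable data = readTable.GetTable("DATA");
    if (data.RowCount == 0) break;           // no more rows to fetch

    for (int r = 0; r < data.RowCount; r++)
        allRows.Add(data[r].GetString("WA"));

    if (data.RowCount < pageSize) break;     // last (partial) page
    skipped += pageSize;
}
```

Each page stays under the buffer limit, and the pages are concatenated in memory instead of merging DataTables afterwards.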
Use transaction BAPI, press Filter and set it to All.
Under Logistics Execution you will find deliveries.
The detail screen shows the function name.
Test them directly to find one that suits, then call that function instead of RFC_READ_TABLE.
Example:
BAPI_LIKP_GET_LIST_MSG
Another possibility is to have a custom ABAP RFC function developed to get your data (with the advantage that you can get a structured / multi-table response in one call, and the disadvantage that it is not a standard function / BAPI).
I am trying to retrieve data from an Excel spreadsheet using C#. The data in the spreadsheet has the following characteristics:
no column names are assigned
the rows can have varying column lengths
some rows are metadata, and these rows label the content of the columns in the next row
Therefore, the objects I need to construct will always have their name in the very first column, and its parameters are contained in the next columns. It is important that the parameter names are retrieved from the row above. An example:
row1|---------|FirstName|Surname|
row2|---Person|Bob------|Bloggs-|
row3|---------|---------|-------|
row4|---------|Make-----|Model--|
row5|------Car|Toyota---|Prius--|
So unfortunately the data is heterogeneous, and the only way to determine what rows "belong together" is to check whether the first column in the row is empty. If it is, then read all data in the row, and check which parameter names apply by checking the row above.
At first I thought the straightforward approach would be to simply loop through
1) the dataset containing all sheets, then
2) the datatables (i.e. sheets) and
3) the row.
However, I found that trying to extract this data with nested loops and if statements results in horrible, unreadable and inflexible code.
Is there a way to do this in LINQ? I had a look at this article to start by filtering the empty rows between data, but didn't really get anywhere. Could someone point me in the right direction with a few code snippets please?
Thanks in advance!
hiro
I see that you've already accepted the answer, but I think a more generic solution is possible using reflection.
Let's say you get your data as a List&lt;string[]&gt; where each element in the list is an array of strings with all the cells from the corresponding row.
List<string[]> data = LoadData();
var results = new List<object>();
string[] headerRow = null; // must be initialized before the loop
foreach (var row in data)
{
    if (string.IsNullOrEmpty(row[0]))
    {
        // Metadata row: remember the parameter names for the rows below.
        headerRow = row.Skip(1).ToArray();
    }
    else
    {
        // Data row: the first cell names the type, the rest are values.
        Type objType = Type.GetType(row[0]);
        object newItem = Activator.CreateInstance(objType);
        for (int i = 0; i < headerRow.Length; i++)
        {
            objType.GetProperty(headerRow[i]).SetValue(newItem, row[i + 1]);
        }
        results.Add(newItem);
    }
}
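For illustration, with a hypothetical Person class matching row1/row2 from the question (the type name in the first cell must be resolvable by Type.GetType, so in practice it usually needs to be namespace-qualified):

```csharp
public class Person
{
    public string FirstName { get; set; }
    public string Surname { get; set; }
}

// Sample input shaped like rows 1-2 of the question's spreadsheet.
var data = new List<string[]>
{
    new[] { "",       "FirstName", "Surname" },
    new[] { "Person", "Bob",       "Bloggs"  },
};
// After running the loop above on this data, results[0] would be a
// Person with FirstName == "Bob" and Surname == "Bloggs".
```

This is what makes the approach generic: adding a Car type with Make and Model properties requires no change to the parsing loop at all.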
I have a csv file I am going to read from disk. I do not know up front how many columns there are or the names of the columns.
Any thoughts on how I should represent the fields? Ideally I want to say something like:
string Val = DataStructure.GetValue(i, ColumnName);
where i is the ith row.
Oh just as an aside I will be parsing using the TextFieldParser class
http://msdn.microsoft.com/en-us/library/cakac7e6(v=vs.90).aspx
That sounds as if you would need a DataTable which has a Rows and Columns property.
So you can say:
string Val = table.Rows[i].Field<string>(ColumnName);
A DataTable is a table of in-memory data. It can be used strongly typed (as suggested with the Field method) but internally it actually stores its data as objects.
You could use this parser to convert the csv to a DataTable.
Edit: I've only just seen that you want to use the TextFieldParser. Here's a possible simple approach to convert a csv to a DataTable:
var table = new DataTable();
using (var parser = new TextFieldParser(File.OpenRead(path)))
{
    parser.Delimiters = new[] { "," };
    parser.HasFieldsEnclosedInQuotes = true;
    // load DataColumns from first line
    string[] headers = parser.ReadFields();
    foreach (var h in headers)
        table.Columns.Add(h);
    // load all other lines as data
    while (!parser.EndOfData)
    {
        table.Rows.Add().ItemArray = parser.ReadFields();
    }
}
If the column names are in the first row read that and store in a Dictionary<string, int> that maps the column name to the column index.
You could then store the remaining rows in a simple structure like List<string[]>.
To get a column value for a row you'd do csv[rowIndex][nameToIndex[ColumnName]]:
nameToIndex[ColumnName] gets the column index from the name, and csv[rowIndex] gets the row (string array) we want.
This could of course be wrapped in a class.
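A minimal sketch of such a wrapper (the CsvData name and shape are my own, not from the answer):

```csharp
public class CsvData
{
    private readonly Dictionary<string, int> nameToIndex;
    private readonly List<string[]> rows;

    public CsvData(string[] header, List<string[]> rows)
    {
        // Map each column name from the header row to its index.
        nameToIndex = header
            .Select((name, index) => (name, index))
            .ToDictionary(p => p.name, p => p.index);
        this.rows = rows;
    }

    // Matches the DataStructure.GetValue(i, ColumnName) shape the
    // question asked for.
    public string GetValue(int rowIndex, string columnName)
        => rows[rowIndex][nameToIndex[columnName]];
}
```

The dictionary lookup keeps per-cell access O(1) while the underlying storage stays a plain list of string arrays.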
Use the CSV parser if you want, but a text parser is something very easy to write yourself if you need customization.
For your need, I would use one (or more) Dictionary: at least one mapping the property string to the column index, and maybe the reverse (column index to property string) if needed.
When I parse a CSV file, I usually put the result in a list while parsing, and then in an array once complete, for speed reasons (List.ToArray()).