Retrieving table data from a doc file using C#

I am working on a project which involves getting data from a .doc or .docx file. The input requirements are in a tabular format. Is it possible to retrieve data from a table row by row, or as a DataSet? I am using Microsoft.Office.Interop.Word to get the data from the doc file.

You can use the property Tables of the Document interface to get a collection with all the tables in your document. For each Table in this collection you can get the rows and for each row the cells.
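For example, if app is your Application object, you can write something like this to get the text contained in each cell (supposing that there is exactly one table in your document). Word collections are 1-based, so the first table is Tables[1], and the text of every cell ends with an end-of-cell marker, which is trimmed here:
string aCellText;
foreach (Row aRow in app.ActiveDocument.Tables[1].Rows)
    foreach (Cell aCell in aRow.Cells)
        // Strip the trailing "\r\a" end-of-cell marker from the cell text.
        aCellText = aCell.Range.Text.TrimEnd('\r', '\a');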
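Since the question asks for the data as a dataset, the loop above can be extended to fill a DataTable. This is only a sketch under some assumptions: the class and method names (WordTableReader, ReadTable, CleanCellText) are made up for illustration, the table is a plain grid without merged cells, and Word plus the Microsoft.Office.Interop.Word reference are available on the machine.

```csharp
using System;
using System.Data;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper, not part of the interop library.
public static class WordTableReader
{
    // Word ends every cell's text with an end-of-cell marker ("\r\a"); strip it.
    public static string CleanCellText(string raw)
    {
        return raw.TrimEnd('\r', '\a');
    }

    // Reads table number tableIndex (1-based, as in Word) into a DataTable.
    // Assumes a simple grid: no merged cells.
    public static DataTable ReadTable(string path, int tableIndex)
    {
        var app = new Word.Application();
        Word.Document doc = null;
        try
        {
            doc = app.Documents.Open(path, ReadOnly: true);
            Word.Table table = doc.Tables[tableIndex];

            // One string column per table column; names are placeholders.
            var result = new DataTable();
            for (int c = 1; c <= table.Columns.Count; c++)
                result.Columns.Add("Column" + c, typeof(string));

            foreach (Word.Row row in table.Rows)
            {
                DataRow dataRow = result.NewRow();
                foreach (Word.Cell cell in row.Cells)
                    dataRow[cell.ColumnIndex - 1] = CleanCellText(cell.Range.Text);
                result.Rows.Add(dataRow);
            }
            return result;
        }
        finally
        {
            // Always close the document and shut Word down, even on failure.
            if (doc != null) doc.Close(SaveChanges: false);
            app.Quit();
            Marshal.ReleaseComObject(app);
        }
    }
}
```

A call such as WordTableReader.ReadTable(@"C:\docs\requirements.docx", 1) (the path is a placeholder) would return the first table; add it to a DataSet with dataSet.Tables.Add(...) if you need one.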

Word cannot hand you a table as a DataSet directly. If you want something like that, put the tabular data in an Excel file; then you can easily read it into a DataSet object.

There is no built-in way to load data from a .doc or .docx file straight into a DataSet object. But if your data is tabular and stored in an Excel sheet, you can retrieve it into a DataSet object. MS Word is meant for documents, and Excel is meant for maintaining data sheets.

Related

ExcelDataReader in C# - How to reference an individual cell using row and column coordinates

I'm reading an .xlsx spreadsheet into a C# console app with a view to outputting the content as a formatted xml file (to be picked up by another part of the system further down the line).
The problem with the .xlsx file is that it's a pro-forma input document based on, and replacing, an old paper-based order form we used to provide to customers, and the input fields aren't organised as a series of similar rows (except in the lower part of the document, which consists of up to 99 rows of order detail lines). Some of the rows in the header part of the form/sheet are a mixture of label text AND data; same with the columns.
Effectively, what I need to do is to be able to cherry pick data from the initial dozen or so rows in order to poke data into the xml structure; the latter part of the document I can process by iterating over the rows for the order detail lines.
I can't use Interop as this will end up as an Azure function - so I've used ExcelDataReader to convert the spreadsheet to a dataset, then convert that dataset to a new dataset entirely composed of string values. But I haven't been able to successfully point to individual cells as I had expected to be using syntax something like
var cellValue = MyDataSet.Cell[10, 2];
I'd be grateful for any advice as to how I might get the result I need.
A DataSet has Tables, and those have Rows, which hold the column values.
A worksheet becomes a table (with Columns), and its cells become rows and column values.
To find the cell value at [10,2] on the first worksheet, use:
var cellValue = MyDataSet.Tables[0].Rows[10][2];
Both indexes are zero-based, so this is the eleventh row and the third column. Remember that cellValue will be of type object. Cast accordingly.
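To illustrate the lookup without needing a spreadsheet, here is a small sketch that builds a stand-in DataSet and reads one cell from it. The CellLookup class, its GetCell and BuildSample methods, and the sample values are all invented for this example; in the real application the DataSet would come from ExcelDataReader instead.

```csharp
using System;
using System.Data;

// Hypothetical helper for illustration only.
public static class CellLookup
{
    // Returns the cell at zero-based (row, column) in the given table as a string,
    // or null when the cell is empty (DBNull).
    public static string GetCell(DataSet data, int tableIndex, int row, int column)
    {
        object value = data.Tables[tableIndex].Rows[row][column];
        return value == DBNull.Value ? null : Convert.ToString(value);
    }

    // Builds a small stand-in for a worksheet that was loaded into a DataSet:
    // 12 rows, 3 columns, with the third column empty except on row index 10.
    public static DataSet BuildSample()
    {
        var table = new DataTable("Sheet1");
        for (int c = 0; c < 3; c++)
            table.Columns.Add("Column" + c, typeof(object));

        for (int r = 0; r < 12; r++)
            table.Rows.Add("R" + r + "C0", "R" + r + "C1",
                           r == 10 ? (object)"Order 42" : DBNull.Value);

        var data = new DataSet();
        data.Tables.Add(table);
        return data;
    }
}
```

Checking for DBNull before converting matters for a form-style sheet like the one described, because many of the header cells will be empty.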

Import Selected CSV file into New Excel file in user defined order

I would like to ask for your help!
I'm trying to import a CSV file into a new Excel file. The user selects which CSV file to use, and the program should import particular data into the new Excel file, arranged in a user-defined order.
Please guide me in choosing the right platform or language to do this.
my sample csv
and my expected output is
The Time strings will be the row headers,
VarName will be the column headers,
and VarValue will be the main data.
There are many ways to do this. One flexible way is:
Create a model class for your data. (Each record of your CSV file will be an instance of your model.)
Create an RDLC report that represents your data in your desired layout and format. There you can arrange your data in rows, columns, pivots, ...
Read the data from your CSV file into a List of model objects and pass it as the data source to your report.
Export your report to Excel (or to other supported file formats such as .doc or .pdf).
Additional resources:
Using a Business Object Data Source with the ReportViewer
RDLC-Export directly to Excel or PDF
Read csv to list of objects
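The model class and the CSV-reading step can be sketched as follows, together with a plain pivot into a DataTable so the layout (Time as rows, VarName as columns) is visible without a report. Everything here is an assumption for illustration: the Reading and CsvPivot names, the comma delimiter, the header line, the three-column order Time, VarName, VarValue, and the absence of quoted fields. It does not cover the RDLC or Excel export steps.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical model: one instance per CSV record.
public class Reading
{
    public string Time { get; set; }
    public string VarName { get; set; }
    public string VarValue { get; set; }
}

public static class CsvPivot
{
    // Parses lines of the assumed form "Time,VarName,VarValue"; the first line is a header.
    public static List<Reading> Parse(IEnumerable<string> lines)
    {
        return lines.Skip(1)
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => line.Split(','))
            .Select(p => new Reading
            {
                Time = p[0].Trim(),
                VarName = p[1].Trim(),
                VarValue = p[2].Trim()
            })
            .ToList();
    }

    // One row per Time, one column per VarName, in the order the user supplies.
    // Variables not listed in columnOrder are ignored.
    public static DataTable Pivot(IEnumerable<Reading> readings, IList<string> columnOrder)
    {
        var table = new DataTable();
        table.Columns.Add("Time", typeof(string));
        foreach (string name in columnOrder)
            table.Columns.Add(name, typeof(string));

        foreach (var group in readings.GroupBy(r => r.Time))
        {
            DataRow row = table.NewRow();
            row["Time"] = group.Key;
            foreach (Reading r in group)
                if (table.Columns.Contains(r.VarName))
                    row[r.VarName] = r.VarValue;
            table.Rows.Add(row);
        }
        return table;
    }
}
```

With a real file, the lines could come from System.IO.File.ReadLines on the path the user picked, and the list returned by Parse is what would be handed to the report as its data source.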

Reading in Excel file into array or list

So I have a HUGE Excel file with headers, names, and values stored.
I want to read everything from rows 5-95 and columns A, B, C, D. I don't want to use a database; I want to use a list or an array.
What would be a way to read in the Excel file and get the info I want?
You can use EasyXLS Excel library.
For reading XLS files use this code:
ExcelDocument xls = new ExcelDocument();
List listRows = xls.easy_ReadXLSSheet_AsList("file.xls", "SheetName", "A5:D95");
For reading XLSX files use this code:
ExcelDocument xls = new ExcelDocument();
List listRows = xls.easy_ReadXLSXSheet_AsList("file.xlsx", "SheetName", "A5:D95");
listRows will contain the rows, and each of these rows is another list that contains the cell values.

Parsing data from xls and adding it to a data grid view using C#

I am trying to read a folder that contains a number of xls files (27), and I need to read only 3 specific columns after row 21 (e.g. A21:, B21:, ...). In a new column I would like to have the sum of the previous columns. I am thinking of adding a data grid view and inserting the parsed data there. My problem is that I have never tried to read anything from an xls file. Do you have any ideas? Thanks in advance! (All the data is in the same worksheet in all workbooks.)

Importing an Excel WorkSheet into a Datatable

I have been asked to create import functionality in my application. I am getting an Excel worksheet as input. The worksheet has column headers followed by data. The users want to simply select an xls file from their system and click upload; the tool then deletes the table in the database and adds this new data.
I thought the best way would be to bring the data into a DataTable object and then do a foreach over the rows in the DataTable, inserting row by row into the db.
My question is: can anyone give me code to open an Excel file, find out which line the data starts on in the file, and import the data into a DataTable object?
Take a look at Koogra.
You instantiate a WorkBook object from a path to an XLS file.
You access a WorkSheet object from the workbook's Sheets property.
You can enumerate over the rows in the worksheet by accessing the sheet's Rows property from index MinRow to MaxRow.
You can enumerate over the cells in a given row by accessing the row's Cells property from index MinColumn to MaxColumn.
Each cell has a Value property (object) as well as a FormattedValue method (string).
Give it a try -- I've found it to be extremely intuitive and easy to use.
You can make use of an OleDbConnection to connect to the Excel file and then query it using SQL queries.
If it is an ASP.NET application, you can make use of the FileUpload control and get the bytes from the file. Then you will have to manually convert them to a DataTable.
Try out these links:
OleDbConnection to excel file
Byte array to datatable
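A minimal sketch of the OleDbConnection approach might look like this. It assumes Windows, an installed Microsoft ACE OLE DB provider, and (on .NET Core and later) the System.Data.OleDb package; the ExcelOleDbImport class and its method names are invented for this example, and the sheet name must match the workbook.

```csharp
using System;
using System.Data;
using System.Data.OleDb;

// Hypothetical helper for illustration only.
public static class ExcelOleDbImport
{
    // Builds an ACE connection string; HDR=YES treats the first row as column headers.
    public static string BuildConnectionString(string path, bool hasHeaderRow)
    {
        bool isLegacyXls = path.EndsWith(".xls", StringComparison.OrdinalIgnoreCase);
        string format = isLegacyXls ? "Excel 8.0" : "Excel 12.0 Xml";
        string hdr = hasHeaderRow ? "YES" : "NO";
        return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path +
               ";Extended Properties=\"" + format + ";HDR=" + hdr + "\";";
    }

    // Loads a whole worksheet into a DataTable. The sheet is addressed as [Name$].
    public static DataTable ReadSheet(string path, string sheetName)
    {
        var table = new DataTable(sheetName);
        using (var connection = new OleDbConnection(BuildConnectionString(path, true)))
        using (var adapter = new OleDbDataAdapter("SELECT * FROM [" + sheetName + "$]", connection))
        {
            // Fill opens and closes the connection itself.
            adapter.Fill(table);
        }
        return table;
    }
}
```

Because the worksheet in the question has a header row followed by data, HDR=YES lets the provider use that row for the column names, so the data rows start at index 0 of the returned table.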
What you're looking for is the concept described here,
provided you don't want to use a third-party library; otherwise Dan's solution will suit you.
First you have to download the DLL file named NExcel.dll.
Using this DLL you can create various objects that are very useful for importing Excel data in .NET, from both VB and C#.
Good luck.
