Parallel write to Excel file in C#

For some time I have been struggling with exporting data from a database (stored procedures) to Excel files. As I am working with large amounts of data (around 500k rows per sheet, a business requirement unfortunately), I was wondering whether it is possible to execute parallel writes to a single worksheet. I am using EPPlus and OpenXML for writing the files. Reading around the web, I found that User Defined Functions can be executed in parallel, but I didn't find anything about parallel writing. Any help/advice/hint/etc. will be much appreciated!

To my knowledge, no. You could program it yourself, but it would be painful.
If you want to export to Excel quickly, I think you have two options:
CSV (fields separated by semicolons).
Use Excel with ADO. You can insert records with SQL statements.
More info:
MSDN: How To Transfer Data from ADO Data Source to Excel with ADO
StackOverflow: ado in excel, inserting records into access database

Unfortunately, you won't be able to write to a single file in parallel. A better way is to create separate files (which you can write to in parallel) and finally merge the sheets to get a single document. You can do that using OpenXML. Here is an example - http://blogs.technet.com/b/ptsblog/archive/2011/01/24/open-xml-sdk-merging-documents.aspx
Well, "parallel" would be ambiguous; I'm thinking of multi-threading, which is concurrent.
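Building on the separate-files suggestion above, here is a minimal sketch of writing independent chunk files in parallel with EPPlus. The chunk size, output file names, and row shape are assumptions for illustration; the final merge into one workbook is left to the OpenXML approach linked above.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using OfficeOpenXml; // EPPlus

public static class ParallelExport
{
    // Splits the rows into chunks and writes each chunk to its own .xlsx
    // file concurrently. A single workbook cannot be written in parallel,
    // but independent files can be, and merged afterwards.
    public static void WriteChunks(IReadOnlyList<object[]> rows, int chunkSize, string outputDir)
    {
        var chunks = rows
            .Select((row, i) => new { row, Chunk = i / chunkSize })
            .GroupBy(x => x.Chunk, x => x.row)
            .ToList();

        Parallel.ForEach(chunks, chunk =>
        {
            using (var package = new ExcelPackage())
            {
                var ws = package.Workbook.Worksheets.Add("Data");
                int r = 1;
                foreach (var row in chunk)
                {
                    for (int c = 0; c < row.Length; c++)
                        ws.Cells[r, c + 1].Value = row[c];
                    r++;
                }
                package.SaveAs(new FileInfo(Path.Combine(outputDir, $"part_{chunk.Key}.xlsx")));
            }
        });
    }
}
```

Note that newer EPPlus versions (5+) also require setting `ExcelPackage.LicenseContext` before use.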

Related

Shall I process excel in database or process it using aspose.cells?

I am trying to upload a very large Excel file, potentially containing millions of records, to run a name-cleaning process on one of its columns. I match the column's values against a particular column in a table in a SQL database and then let the user download the processed Excel file.
I have multiple ways of doing it:
1) Bulk copy the Excel data into the database, run the name-cleaning process there, then extract the results from the table, write them to an Excel file, and let the user download it.
2) Upload the file, read it using the Aspose library, do the processing in memory, and once the operation is done notify the user to download the file.
I am confused right now about which option would be better, and if there is a better approach please feel free to share it.
Any leads would be really appreciated.
Thanks
As you are talking about processing millions of records in memory, this may affect performance and memory utilization with Aspose.Cells. I think you should try both methods, and if you face an issue using Aspose.Cells, let us know. I suggest you use the LightCells API in Aspose.Cells, which is best suited for reading and writing large amounts of data in Excel files.
https://docs.aspose.com/display/cellsnet/Using+LightCells+API
https://docs.aspose.com/display/cellsjava/Using+LightCells+API
Similarly, Excel itself may also cause issues while processing large files, as it takes a lot of time. It is a matter of testing both scenarios and coming up with a comparison.
One option is that, if the entire column's data is to be matched against a single column, it may be better to save a single-column Excel file as a blob in the database and return the ready-to-use Excel file as-is.
You may try these scenarios and provide your feedback.
Note: I am working as a Support developer/Evangelist at Aspose.

C# Query an excel fastest method

I have an Excel file and I am querying it in my C# program with SQL using OleDb.
But I have run into a problem. My file has about 300K rows and querying takes far too long. I have googled this issue and tried some libraries such as SpreadsheetLight and EPPlus, but they don't have a query feature.
Can anyone advise me on the fastest way to query my file?
Thanks in advance.
I have worked with 400-800K-row Excel files. The task was to read all rows and insert them into a SQL Server DB. From my experience, OleDb was not able to process such big files in a timely manner, so we had to fall back to importing the Excel file directly into the DB using SQL Server means, e.g. OPENROWSET.
Even smaller files of around 260K rows took approximately an hour with OleDb to import row-by-row into a DB table on Core 2 Duo-generation hardware.
So, in your case you can consider the following:
1. Try reading the Excel file in chunks using a ranged SELECT:
OleDbCommand cmd = new OleDbCommand(
    "SELECT [" + date + "] FROM [Sheet1$A1:Z10000] WHERE [" + key + "] = " + array[i].ToString(),
    connection);
Note, [Sheet1$A1:Z10000] tells OleDb to process only the first 10K rows of columns A to Z of the sheet instead of the whole sheet. You can use this approach if, for example, your Excel file is sorted and you know you don't need to check ALL rows but only those for this year. Or you can change Z10000 dynamically to read the next chunk of the file and combine the result with the previous one.
2. Get all your Excel file contents directly into a DB using direct import, such as the mentioned OPENROWSET of MS SQL Server, and then run your search queries against the RDBMS instead of the Excel file.
I would personally suggest option #2. Comment if you can use DB at all and what RDBMS product/version is available to you, if any.
Hope this helps!
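The chunked ranged-SELECT idea from option 1 can be sketched as follows; the sheet name, column range, and chunk size are assumptions to adapt to your file.

```csharp
using System.Data;
using System.Data.OleDb;

public static class ChunkedExcelReader
{
    // Reads a sheet in 10K-row chunks via ranged SELECTs instead of one
    // query over the whole sheet, accumulating results in a DataTable.
    public static DataTable ReadInChunks(string connectionString, int totalRows)
    {
        const int chunkSize = 10000;
        var result = new DataTable();
        using (var connection = new OleDbConnection(connectionString))
        {
            connection.Open();
            for (int start = 1; start <= totalRows; start += chunkSize)
            {
                int end = start + chunkSize - 1;
                string sql = $"SELECT * FROM [Sheet1$A{start}:Z{end}]";
                using (var adapter = new OleDbDataAdapter(sql, connection))
                    adapter.Fill(result); // appends this chunk's rows
            }
        }
        return result;
    }
}
```

Watch the HDR setting in the connection string: with headers enabled, the first row of each range is treated as column names.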

how to select rows from excel sheet in c#

Hi, I have written code to read from Excel sheets and query them according to a filter that is set.
But I am stuck at:
Select * from [sheetname] where [col] not like '%something%'
How can I write the NOT part?
All the other queries work fine.
The one above ignores the NOT and executes anyway.
If you don't have to use ADO and OLE to read your spreadsheet, I would recommend using EPPlus. It's a project that allows you to work with spreadsheets in a much better OOP paradigm. It also abstracts away all of the gotchas that come from the different internal formats of .xlsx files versus the older .xls files.
Is it just that you don't have quotes around %something%?
Check out this if you want to melt your brain with Excel possibilities (search the comments for 'not like'), and perhaps solve your problem at the same time.
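For what it's worth, NOT LIKE does work against the Jet/ACE provider when the pattern is a properly quoted string literal. A minimal sketch (the sheet and column names are assumptions):

```csharp
using System.Data.OleDb;

public static class NotLikeQuery
{
    public static OleDbDataReader RunNotLike(OleDbConnection connection)
    {
        // The pattern must be a quoted string literal; without the quotes
        // the provider can silently misinterpret the condition.
        var cmd = new OleDbCommand(
            "SELECT * FROM [Sheet1$] WHERE [col] NOT LIKE '%something%'",
            connection);
        return cmd.ExecuteReader();
    }
}
```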

SQL Server export to Excel

How can I increase performance when exporting a database with tables in one-to-many (and in some cases many-to-many) relationships into a single Excel file?
Right now, I get all the data from the database and process it into a table using a few for loops, then I change the header of the HTML file to download it as an Excel file. But it takes a while for the number of records I have (about 300).
I was just wondering if there is a way to improve performance.
Thanks
It sounds like you're loading each table into memory with your C# code, and then building a flat table by looping through the data. A vastly simpler and faster way to do that would be to use a SQL query with a few JOINs in it:
http://www.w3schools.com/sql/sql_join.asp
http://en.wikipedia.org/wiki/Join_(SQL)
Also, I get the impression that you're rendering the resulting flat table to html, and then saving that as an excel file. There are several ways that you can create that excel (or csv) file directly, without having to turn it into an html table first.
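As an illustration of skipping the HTML step, here is a minimal sketch that streams one JOINed query straight to a CSV file; the table and column names are made up for the example.

```csharp
using System.Data.SqlClient;
using System.IO;

public static class CsvExport
{
    // Runs a single JOINed query and writes rows directly to CSV,
    // avoiding both the in-memory for-loops and the HTML table.
    public static void Export(string connectionString, string path)
    {
        const string sql =
            "SELECT o.OrderId, c.Name, o.Total " +
            "FROM Orders o JOIN Customers c ON c.CustomerId = o.CustomerId";

        using (var connection = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, connection))
        using (var writer = new StreamWriter(path))
        {
            connection.Open();
            using (var reader = cmd.ExecuteReader())
            {
                writer.WriteLine("OrderId;Name;Total");
                while (reader.Read())
                    writer.WriteLine($"{reader[0]};{reader[1]};{reader[2]}");
            }
        }
    }
}
```

Excel opens semicolon- or comma-separated CSV directly, so no Excel library is needed at all for a flat export like this.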

Best practice for Uploading Excel data in SQL Server using ASP.NET

I am looking for best practices for uploading Excel data into a SQL Server 2000 database through an ASP.NET web application. The Excel data will be in a predefined format with 42 columns; 10 of those fields are mandatory and the rest are conditionally mandatory, i.e. if data exists it should be in the defined format. I also need to validate for special characters, length, the specified format, and so on.
After validating, I need to store the valid data in a SQL Server table and provide export-to-Excel functionality for the invalid data, exporting it in the same Excel format with an indicator to identify the invalid cells.
Can anyone suggest an optimized way to do this?
Thank you...
You can use ADO.NET to read the data in from the spreadsheet, as outlined here.
Read it in to memory and parse all the data as necessary. Store the parsed data into a DataTable, and then you can persist that data in bulk to the database using a couple of possible methods.
The quickest, most efficient way to bulkload data into SQL Server is using SqlBulkCopy. The alternative method is to use an SqlDataAdapter. I recently outlined both approaches, with examples and performance comparisons here.
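A minimal SqlBulkCopy sketch for the bulk-load step described above; the destination table name, and the assumption that the DataTable's columns line up with it, are hypothetical.

```csharp
using System.Data;
using System.Data.SqlClient;

public static class BulkLoader
{
    // Persists an in-memory DataTable of validated rows to SQL Server in
    // one bulk operation, far faster than row-by-row INSERT statements.
    public static void Load(DataTable validRows, string connectionString)
    {
        using (var bulkCopy = new SqlBulkCopy(connectionString))
        {
            bulkCopy.DestinationTableName = "dbo.ImportedData";
            bulkCopy.BatchSize = 5000; // commit in batches
            bulkCopy.WriteToServer(validRows);
        }
    }
}
```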
You can, but you are, AFAIK, not allowed to use Excel COM Interop on a web server. And it is definitely not recommended or supported (source).
So you are left with 2 options:
Try to switch to a different format (XML, CSV), or use an Excel XML format that you can read and write using System.Xml or System.Xml.Linq.
Find a component that can read and write Excel binary files. There are commercial and open-source components available.
FileHelpers for .NET is a decent library that will do a lot of the processing for you if you are looking for something quick and efficient without having to build a ton of it yourself. They have an example of loading Excel files into a SQL database like you describe.