Logging Data in an Excel File using C#

I have a requirement to log a range of data from multiple interfaces to an Excel file.
I need to open an Excel workbook and write the data from multiple interfaces into different worksheets of that workbook, with iteration periods of roughly 40 ms to 100 ms depending on the interface.
I have tried the EPPlus library and am able to push data, but only by collating it all first and then pushing it into the Excel sheet in one go. I am not finding any way to keep writing data to multiple worksheets in parallel.
Another approach I am trying is Interop, but I am not sure it will work well when very fast data is coming from multiple interfaces and needs to be written to one or more worksheets.
Can anyone advise on the best approach?

You actually describe two kinds of functionality here - logging and reporting.
An Excel file is by no means real-time data storage. It's suitable for reporting, but not for logging.
I would suggest accumulating the data somewhere else first, for example in a relational database or just in CSV files, depending on your reliability and scalability needs, and then generating the Excel files for closed time periods, for example daily, hourly, or every minute.
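For illustration, a minimal sketch of that split, assuming one CSV file per interface (the folder, file names, and two-column layout here are made up): each interface's loop appends to its own CSV, and a timer or scheduled job later folds the closed files into one workbook with a worksheet per interface using EPPlus.
using System;
using System.IO;
using OfficeOpenXml;

class CsvToExcelLogger
{
    // Called from each interface's loop; appending one line every 40-100 ms is cheap.
    public static void AppendSample(string csvPath, DateTime timestamp, double value)
    {
        File.AppendAllText(csvPath, timestamp.ToString("O") + "," + value + Environment.NewLine);
    }

    // Called by a timer/scheduled job once a period is closed, e.g. hourly.
    // Note: EPPlus 5+ also requires setting ExcelPackage.LicenseContext first.
    public static void ExportToExcel(string[] csvPaths, string xlsxPath)
    {
        using (var package = new ExcelPackage())
        {
            foreach (var csv in csvPaths)
            {
                var ws = package.Workbook.Worksheets.Add(Path.GetFileNameWithoutExtension(csv));
                int row = 1;
                foreach (var line in File.ReadLines(csv))
                {
                    var parts = line.Split(',');
                    ws.Cells[row, 1].Value = parts[0]; // timestamp
                    ws.Cells[row, 2].Value = parts[1]; // value
                    row++;
                }
            }
            package.SaveAs(new FileInfo(xlsxPath));
        }
    }
}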
If you absolutely need to add many rows to different Excel sheets at runtime, you can use the Microsoft OleDb driver:
const string connectionString =
    @"Provider=Microsoft.ACE.OLEDB.12.0;Extended Properties=Excel 12.0 XML;Data Source=C:\source\MyExcel.xlsx;";
using (var conn = new OleDbConnection(connectionString))
{
    conn.Open();
    foreach (var sheet in new[] { "sheet1", "sheet2", "sheet3" })
    {
        using (var cmd = new OleDbCommand())
        {
            cmd.Connection = conn;
            try
            {
                cmd.CommandText = "CREATE TABLE [" + sheet + "] (id INT, datecol DATE);";
                cmd.ExecuteNonQuery();
            }
            catch (Exception) // TODO: find better way to determine existing sheet
            {
                Console.WriteLine("Can't create {0}", sheet);
            }
        }
        for (var i = 0; i < 1000; i++)
        {
            using (var cmd = new OleDbCommand())
            {
                cmd.Connection = conn;
                var datecol = DateTime.Now;
                var id = i;
                cmd.CommandText = "INSERT INTO [" + sheet + "](id, datecol) VALUES(@id, @datecol);";
                cmd.Parameters.Add("@id", OleDbType.Integer).Value = id;
                cmd.Parameters.Add("@datecol", OleDbType.Date).Value = datecol;
                cmd.ExecuteNonQuery();
            }
        }
    }
    conn.Close();
}

Related

How to Search Data From Huge CSV Files (20Gb) C# ASP.NET

I want to create a program using .NET to read or search data in a 20 GB CSV file. Is there any way to do it?
My Code For Search
string search = txtBoxSearch.Text;
string pathOnly = Path.GetDirectoryName(csvPath);
string fileName = Path.GetFileName(csvPath);
string sql = @"SELECT F1 AS StringID, F2 AS StringContent FROM [" + fileName + "] WHERE F2 LIKE '%" + search + "%'";
using (OleDbConnection connection = new OleDbConnection(
    @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + pathOnly +
    ";Extended Properties=\"Text;HDR=No\""))
using (OleDbCommand command = new OleDbCommand(sql, connection))
using (OleDbDataAdapter adapter = new OleDbDataAdapter(command))
{
    DataTable dataTable = new DataTable();
    adapter.Fill(dataTable);
    dataTable.Columns.Add("MatchTimes", typeof(System.Int32));
    foreach (DataRow row in dataTable.Rows)
    {
        row["MatchTimes"] = Regex.Matches(row["StringContent"].ToString(), search).Count;
    }
    GridViewResult.DataSource = dataTable;
    GridViewResult.DataBind();
}
My code for generating the CSV file:
int records = 100000;
File.AppendAllLines(csvPath,
    (from r in Enumerable.Range(0, records)
     let guid = Guid.NewGuid()
     let stringContent = GenerateRandomString(256000)
     select $"{guid},{stringContent}"));
This really depends on exactly how you're searching. If you're just doing a single search, you could simply read the file one line at a time and do a string comparison or something similar. If you do this, do not load the whole thing into memory - process it one line at a time.
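For example, a minimal sketch of the streaming approach (the path and search term are made up; it assumes the two-column id,content layout from the generator code above). File.ReadLines enumerates lazily, so only one line is held in memory at a time:
using System;
using System.IO;

foreach (var line in File.ReadLines(@"C:\data\huge.csv")) // hypothetical path
{
    int comma = line.IndexOf(',');
    if (comma < 0) continue; // skip malformed lines
    // search only the content column (everything after the first comma)
    if (line.IndexOf("needle", comma + 1, StringComparison.Ordinal) >= 0)
    {
        Console.WriteLine(line.Substring(0, comma)); // print the matching id
    }
}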
If you have access to the "full" edition of SQL Server, you could do a BULK INSERT. If you don't, though (e.g. you're using one of the Express editions), you might run into the maximum database size. In this case, I've never tried this, but you could try SQLite. In theory at least, the database can handle multiple terabytes. Be sure to insert a large number of records in each transaction, though; if you commit after each insert, your performance will be absolutely wretched. Also, be sure that you're not creating an in-memory database, or you'll just run out of memory again.
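To illustrate the transaction batching, a hedged sketch using the Microsoft.Data.Sqlite package (the database file, table layout, and CSV path are made up; in practice you would commit every few thousand rows rather than once at the end):
using System.IO;
using Microsoft.Data.Sqlite;

string csvPath = @"C:\data\huge.csv"; // hypothetical path
using (var conn = new SqliteConnection("Data Source=csvdata.db"))
{
    conn.Open();
    new SqliteCommand("CREATE TABLE IF NOT EXISTS rows(id TEXT, content TEXT)", conn).ExecuteNonQuery();
    using (var tx = conn.BeginTransaction())
    {
        var cmd = conn.CreateCommand();
        cmd.Transaction = tx;
        cmd.CommandText = "INSERT INTO rows(id, content) VALUES($id, $content)";
        var pId = cmd.Parameters.Add("$id", SqliteType.Text);
        var pContent = cmd.Parameters.Add("$content", SqliteType.Text);
        foreach (var line in File.ReadLines(csvPath))
        {
            int comma = line.IndexOf(',');
            pId.Value = line.Substring(0, comma);
            pContent.Value = line.Substring(comma + 1);
            cmd.ExecuteNonQuery(); // one statement per row, one commit per batch
        }
        tx.Commit(); // committing in bulk is what keeps this fast
    }
}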

C# - postgresql - memory leak over time

Problem: memory usage accumulates over time and eventually reaches 99% capacity.
I have the following C# code that constantly pushes data into a PostgreSQL DB using a while loop. I'm really struggling because I'm not a C# programmer - my main language is Python. I've been trying to look up C# references to solve my issue, but failed because I simply don't understand a lot of the syntax. The C# code was written by someone else in my company, but he's not available now.
Here is the code:
var connString = "Host=x.x.x.x;Port=5432;Username=postgres;Password=password;Database=database";
using (var conn = new Npgsql.NpgsqlConnection(connString))
{
    conn.Open();
    int ctr = 0;
    // Insert some data
    while (@tag.TerminateTimeScaleLoop == 100)
    {
        @Info.Trace("Pushed Data: PostGre A " + ctr.ToString());
        using (var cmd = new Npgsql.NpgsqlCommand())
        {
            cmd.Connection = conn;
            cmd.CommandText = "INSERT INTO TORQX VALUES (@r,@p)";
            cmd.Parameters.AddWithValue("r", System.DateTime.Now.ToUniversalTime());
            cmd.Parameters.AddWithValue("p", @Tag.RigData.Time.TORQX);
            cmd.ExecuteNonQuery();
            cmd.Parameters.Clear();
            cmd.CommandText = "INSERT INTO BLKPOS VALUES (@s,@t)";
            cmd.Parameters.AddWithValue("s", System.DateTime.Now.ToUniversalTime());
            cmd.Parameters.AddWithValue("t", @Tag.RigData.Time.BLKPOS);
            cmd.ExecuteNonQuery();
            cmd.Parameters.Clear();
            // @Info.Trace("Pushed Data: PostGre " + ctr.ToString());
        }
        ctr = ctr + 1;
    }
    @Info.Trace("Pushed Data: PostGre A Terminated");
}
The code successfully establishes a connection at the beginning and uses only that one connection the entire time. It correctly inserts data into the DB, but after memory usage reaches 99%, it stops inserting reliably. The source of the issue, I think, is that this code constantly creates new objects but never releases them once an iteration is done. Can anyone tell me where the problem is and suggest a possible solution?
++ Please understand that I'm not a C# programmer... I'm not too familiar with the concept of memory handling, but I will try my best to understand.
Here is something you can try. Notice that the instantiation of the command and parameters happens outside of the loop, not on every iteration.
I am recycling the parameters. As a result I am using Add(), not AddWithValue(), and you must fill in the database type for the second argument; consider using precision and scale arguments too as appropriate.
This will only work if the two commands use the same parameter types. Otherwise, you might consider creating two commands, one for each query, as sketched after the code below.
Know that variable names beginning with @ make me cringe as a C# developer....
var connString = "Host=x.x.x.x;Port=5432;Username=postgres;Password=password;Database=database";
using (var conn = new Npgsql.NpgsqlConnection(connString))
{
    conn.Open();
    int ctr = 0;
    @Info.Trace("Pushed Data: PostGre A " + ctr.ToString());
    using (var cmd = new Npgsql.NpgsqlCommand())
    {
        cmd.Connection = conn;
        var par_1 = cmd.Parameters.Add("@p1", /*< appropriate datatype here >*/);
        var par_2 = cmd.Parameters.Add("@p2", /*< appropriate datatype here >*/);
        while (@tag.TerminateTimeScaleLoop == 100)
        {
            cmd.CommandText = "INSERT INTO TORQX VALUES (@p1,@p2)";
            par_1.Value = System.DateTime.Now.ToUniversalTime();
            par_2.Value = @Tag.RigData.Time.TORQX;
            cmd.ExecuteNonQuery();
            cmd.CommandText = "INSERT INTO BLKPOS VALUES (@p1,@p2)";
            par_1.Value = System.DateTime.Now.ToUniversalTime();
            par_2.Value = @Tag.RigData.Time.BLKPOS;
            cmd.ExecuteNonQuery();
            ctr = ctr + 1;
        }
    }
}
@Info.Trace("Pushed Data: PostGre A Terminated");
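And here is a hedged sketch of the two-command variant mentioned above (the NpgsqlDbType values are assumptions - pick whatever matches your actual column types). Each command keeps its own prepared parameters, so nothing is reallocated inside the loop:
using (var cmdTorqx = new Npgsql.NpgsqlCommand("INSERT INTO TORQX VALUES (@p1,@p2)", conn))
using (var cmdBlkpos = new Npgsql.NpgsqlCommand("INSERT INTO BLKPOS VALUES (@p1,@p2)", conn))
{
    var t1 = cmdTorqx.Parameters.Add("@p1", NpgsqlTypes.NpgsqlDbType.Timestamp);
    var t2 = cmdTorqx.Parameters.Add("@p2", NpgsqlTypes.NpgsqlDbType.Double);
    var b1 = cmdBlkpos.Parameters.Add("@p1", NpgsqlTypes.NpgsqlDbType.Timestamp);
    var b2 = cmdBlkpos.Parameters.Add("@p2", NpgsqlTypes.NpgsqlDbType.Double);
    cmdTorqx.Prepare();
    cmdBlkpos.Prepare();
    while (@tag.TerminateTimeScaleLoop == 100)
    {
        t1.Value = System.DateTime.Now.ToUniversalTime();
        t2.Value = @Tag.RigData.Time.TORQX;
        cmdTorqx.ExecuteNonQuery();
        b1.Value = System.DateTime.Now.ToUniversalTime();
        b2.Value = @Tag.RigData.Time.BLKPOS;
        cmdBlkpos.ExecuteNonQuery();
    }
}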

SQL Bulk Insert in C# not inserting values

I'm completely new to C#, so I'm sure I'm going to get a lot of comments about how my code is formatted - I welcome them. Please feel free to throw any advice or constructive criticisms you might have along the way.
I'm building a very simple Windows Form App that is eventually supposed to take data from an Excel file of varying size, potentially several times per day, and insert it into a table in SQL Server 2005. Thereafter, a stored procedure within the database takes over to perform various update and insert tasks depending on the values inserted into this table.
For this reason, I've decided to use the SQL Bulk Insert method, since I can't know if the user will only insert 10 rows - or 10,000 - at any given execution.
The function I'm using looks like this:
public void BulkImportFromExcel(string excelFilePath)
{
    excelApp = new Excel.Application();
    excelBook = excelApp.Workbooks.Open(excelFilePath);
    excelSheet = excelBook.Worksheets.get_Item(sheetName);
    excelRange = excelSheet.UsedRange;
    excelBook.Close(0);
    try
    {
        using (SqlConnection sqlConn = new SqlConnection())
        {
            sqlConn.ConnectionString =
                "Data Source=" + serverName + ";" +
                "Initial Catalog=" + dbName + ";" +
                "User id=" + dbUserName + ";" +
                "Password=" + dbPassword + ";";
            using (OleDbConnection excelConn = new OleDbConnection())
            {
                excelQuery = "SELECT InvLakNo FROM [" + sheetName + "$]";
                excelConn.ConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + excelFilePath + ";Extended Properties='Excel 8.0;HDR=Yes'";
                excelConn.Open();
                using (OleDbCommand oleDBCmd = new OleDbCommand(excelQuery, excelConn))
                {
                    OleDbDataReader dataReader = oleDBCmd.ExecuteReader();
                    using (SqlBulkCopy bulkImport = new SqlBulkCopy(sqlConn.ConnectionString))
                    {
                        bulkImport.DestinationTableName = sqlTable;
                        SqlBulkCopyColumnMapping InvLakNo = new SqlBulkCopyColumnMapping("InvLakNo", "InvLakNo");
                        bulkImport.ColumnMappings.Add(InvLakNo);
                        sqlQuery = "IF OBJECT_ID('ImportFromExcel') IS NOT NULL BEGIN SELECT * INTO [" + DateTime.Now.ToString().Replace(" ", "_") + "_ImportFromExcel] FROM ImportFromExcel; DROP TABLE ImportFromExcel; END CREATE TABLE ImportFromExcel (InvLakNo INT);";
                        using (SqlCommand sqlCmd = new SqlCommand(sqlQuery, sqlConn))
                        {
                            sqlConn.Open();
                            sqlCmd.ExecuteNonQuery();
                            while (dataReader.Read())
                            {
                                bulkImport.WriteToServer(dataReader);
                            }
                        }
                    }
                }
            }
        }
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.ToString());
    }
    finally
    {
        excelApp.Quit();
    }
}
The function runs without errors or warnings, and if I replace the WriteToServer with manual SQL commands, the rows are inserted; but the bulkImport isn't inserting anything.
NOTE: There is only one field in this example, and in the actual function I'm currently running to test; but in the end there will be dozens and dozens of fields being inserted, and I'll be doing a ColumnMapping for all of them.
Also, as stated, I am aware that my code is probably horrible - please feel free to give me any pointers you deem helpful. I'm ready and willing to learn.
Thanks!
I think it would be a very long and messy answer if I commented on your code and also gave sample code in the same message, so I decided to divide them into two messages. Comments first:
You are using Excel automation to get... what, exactly? You already have the sheet name as I see it, and worse, you are doing app.Quit() at the end. Completely remove that automation code.
If you needed some information from Excel (like sheet names or column names), then you could use OleDbConnection's GetOleDbSchemaTable method.
You might do the mapping basically in 2 ways:
Excel column ordinal to SQL table column name
Excel column name to SQL table column name
Both would do. In generic code, assuming the column names are the same in both sources but their ordinals and count may differ, you could get the column names from the OleDbConnection schema table and do the mapping in a loop, as in the sketch below.
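For instance, a hedged sketch of that loop (the sheet name is made up, and it assumes the bulkImport and excelConn objects from the question's code):
// Field<T>() requires a reference to System.Data.DataSetExtensions.
var schemaColumns = excelConn.GetOleDbSchemaTable(OleDbSchemaGuid.Columns,
    new object[] { null, null, "Sheet1$", null });
foreach (DataRow col in schemaColumns.Rows)
{
    string name = col.Field<string>("COLUMN_NAME");
    bulkImport.ColumnMappings.Add(name, name); // Excel column -> same-named SQL column
}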
You are dropping and creating a table named "ImportFromExcel" for temporary data insertion - so why not simply create a temp SQL Server table by using a # prefix in the table name? On the other hand, that code piece is a little weird: it would archive "ImportFromExcel" if it exists, then drop it, create a new one, and attempt to bulk import into that new one. On the first run SqlBulkCopy (SBC) would fill ImportFromExcel, and on the next run it would be copied to a table named (DateTime.Now ...) and then emptied via drop and create again. BTW, the naming:
DateTime.Now.ToString().Replace(" ", "_") + "_ImportFromExcel"
doesn't feel right. While it looks tempting, it is not sortable; you would probably want something like this instead:
DateTime.Now.ToString("yyyyMMddHHmmss") + "_ImportFromExcel"
Or better yet:
"ImportFromExcel_" +DateTime.Now.ToString("yyyyMMddHHmmss")
so you would have something that sorts properly and can be selected across all the imports with a wildcard, or looped over if needed.
Then you are calling WriteToServer inside a reader.Read() loop. That is not how WriteToServer works - you wouldn't call reader.Read() yourself, but simply:
sbc.WriteToServer(reader);
In my next message I will give a simple schema-reading sample and a simple SBC sample from Excel into a temp table, as well as a suggestion for what you should do instead.
Here is the sample for reading schema information from Excel (here we read the table names - sheet names with tables in them):
private IEnumerable<string> GetTablesFromExcel(string dataSource)
{
    IEnumerable<string> tables;
    using (OleDbConnection con = new OleDbConnection("Provider=Microsoft.ACE.OLEDB.12.0;" +
        string.Format("Data Source={0};", dataSource) +
        "Extended Properties=\"Excel 12.0;HDR=Yes\""))
    {
        con.Open();
        var schemaTable = con.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
        tables = schemaTable.AsEnumerable().Select(t => t.Field<string>("TABLE_NAME"));
        con.Close();
    }
    return tables;
}
And here is a sample that does SBC from excel into a temp table:
void Main()
{
    string sqlConnectionString = @"server=.\SQLExpress;Trusted_Connection=yes;Database=Test";
    string path = @"C:\Users\Cetin\Documents\ExcelFill.xlsx"; // sample excel sheet
    string sheetName = "Sheet1$";
    using (OleDbConnection cn = new OleDbConnection(
        "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path +
        ";Extended Properties=\"Excel 8.0;HDR=Yes\""))
    using (SqlConnection scn = new SqlConnection(sqlConnectionString))
    {
        scn.Open();
        // create temp SQL server table
        new SqlCommand(@"create table #ExcelData
        (
            [Id] int,
            [Barkod] varchar(20)
        )", scn).ExecuteNonQuery();
        // get data from Excel and write to server via SBC
        OleDbCommand cmd = new OleDbCommand(String.Format("select * from [{0}]", sheetName), cn);
        SqlBulkCopy sbc = new SqlBulkCopy(scn);
        // Mapping sample using column ordinals
        sbc.ColumnMappings.Add(0, "[Id]");
        sbc.ColumnMappings.Add(1, "[Barkod]");
        cn.Open();
        OleDbDataReader rdr = cmd.ExecuteReader();
        // SqlBulkCopy properties
        sbc.DestinationTableName = "#ExcelData";
        // write to server via reader
        sbc.WriteToServer(rdr);
        if (!rdr.IsClosed) { rdr.Close(); }
        cn.Close();
        // Excel data is now in SQL server temp table
        // It might be used to do any internal insert/update
        // i.e.: Select into myTable+DateTime.Now
        new SqlCommand(string.Format(@"select * into [{0}]
            from [#ExcelData]",
            "ImportFromExcel_" + DateTime.Now.ToString("yyyyMMddHHmmss")), scn)
            .ExecuteNonQuery();
        scn.Close();
    }
}
While this would work, think of the long run: you need column names, and their types may differ; it might be overkill to do all of this via SBC, and you might instead do it directly from MS SQL Server using OPENQUERY:
SELECT * into ... from OpenQuery(...)
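OPENQUERY itself requires a linked server, so purely for illustration, here is a hedged sketch using the ad hoc OPENROWSET variant instead (it assumes 'Ad Hoc Distributed Queries' is enabled on the server, and the file path and target table name are made up). The whole import then runs inside SQL Server, with no SqlBulkCopy at all:
string sql = @"SELECT * INTO ImportFromExcel_Staging
               FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
                               'Excel 12.0;Database=C:\source\MyExcel.xlsx;HDR=Yes',
                               'SELECT * FROM [Sheet1$]')";
using (SqlConnection scn = new SqlConnection(sqlConnectionString))
{
    scn.Open();
    new SqlCommand(sql, scn).ExecuteNonQuery(); // server reads the file itself
}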
WriteToServer(IDataReader) is intended to perform the IDataReader.Read() operation internally, so your loop should collapse to a single call:
using (SqlCommand sqlCmd = new SqlCommand(sqlQuery, sqlConn))
{
    sqlConn.Open();
    sqlCmd.ExecuteNonQuery();
    bulkImport.WriteToServer(dataReader);
}
You can check the MSDN doc on that function, has a working example: https://msdn.microsoft.com/en-us/library/434atets(v=vs.110).aspx

Reading excel file in c#.net cell by cell

I'm new to C#.NET.
I have an Excel sheet and I want to import it into a database, reading it cell by cell and inserting the values.
this.openFileDialog1.FileName = "*.xls";
DialogResult dr = this.openFileDialog1.ShowDialog();
if (dr == System.Windows.Forms.DialogResult.OK)
{
    string path = openFileDialog1.FileName;
    string connectionString = String.Format(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=""Excel 8.0;HDR=no;IMEX=1;""", openFileDialog1.FileName);
    string query = String.Format("select * from [{0}$]", "Sheet3");
    OleDbDataAdapter dataAdapter = new OleDbDataAdapter(query, connectionString);
    DataSet dataSet = new DataSet();
    dataAdapter.Fill(dataSet);
    dataGridView1.DataSource = dataSet.Tables[0];
}
I assume that after you execute the code in your question, you can see the values within dataGridView1.
The actual reading from the excel sheet is done when calling dataAdapter.Fill. So, in your case, reading the cells comes down to indexing columns and rows in dataSet.Tables[0].
For example:
for (int row = 0; row < dataSet.Tables[0].Rows.Count; row++)
{
    DataRow r = dataSet.Tables[0].Rows[row];
    object firstCell = r[0]; // value of the cell in the first column of this row
}
Accessing the other cells in row r works the same way - just index the column you need, e.g. r[1] or r["ColumnName"].
EDIT
I forgot to describe the "insert the values into a database" part. I presume that the database is SQL Server (it may be an Express edition, too).
First, create a database connection. Instead of manually composing the connection string, use the SqlConnectionStringBuilder:
SqlConnectionStringBuilder csb = new SqlConnectionStringBuilder();
csb.DataSource = <your server instance, e.g. "localhost\sqlexpress">;
csb.InitialCatalog = <name of your database>;
csb.IntegratedSecurity = <true if you use integrated security, false otherwise>;
if (!csb.IntegratedSecurity)
{
    csb.UserID = <user name>;
    csb.Password = <password>;
}
Then, create and open a new SqlConnection with the connection string:
using (SqlConnection conn = new SqlConnection(csb.ConnectionString))
{
    conn.Open();
Iterate over all the values you want to insert and execute a respective insert command:
    for (...)
    {
        SqlCommand cmd = new SqlCommand("INSERT INTO ... VALUES (@param1, ..., @paramn)", conn);
        cmd.Parameters.AddWithValue("@param1", value1);
        ...
        cmd.Parameters.AddWithValue("@paramn", valuen);
        cmd.ExecuteNonQuery();
    }
This closes the connection, as the using block ends:
}
And there you go. Alternatively, you could use a data adapter with a special insert command. Then, inserting the values comes down to a one-liner; however, your database table must have the same structure as the Excel sheet (respectively, as the data table you obtained in the code you posted). A sketch of that alternative follows.
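A minimal sketch of the data-adapter route (the table and column names are made up; dataSet is the one filled from the Excel sheet earlier):
SqlDataAdapter adapter = new SqlDataAdapter();
adapter.InsertCommand = new SqlCommand(
    "INSERT INTO MyTable (Col1, Col2) VALUES (@p1, @p2)", conn);
adapter.InsertCommand.Parameters.Add("@p1", SqlDbType.NVarChar, 255, "Col1");
adapter.InsertCommand.Parameters.Add("@p2", SqlDbType.NVarChar, 255, "Col2");
// rows loaded by Fill() are Unchanged; mark them Added so Update() inserts them
foreach (DataRow row in dataSet.Tables[0].Rows)
{
    row.SetAdded();
}
adapter.Update(dataSet.Tables[0]); // runs InsertCommand once per Added row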
Check out NPOI
http://npoi.codeplex.com/
It's the .NET version of Apache's POI Excel implementation. It'll easily do what you need, and will help you avoid some of the problems (i.e. a local copy of Excel, or worse, a copy of Excel on the server) that you'll face when using the Jet provider. For example:
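Here is a minimal NPOI sketch reading every cell of an .xls workbook, with no Excel installation required (the file path is made up; for .xlsx files use XSSFWorkbook from NPOI.XSSF.UserModel instead):
using System;
using System.IO;
using NPOI.HSSF.UserModel;
using NPOI.SS.UserModel;

using (var fs = new FileStream(@"C:\data\input.xls", FileMode.Open, FileAccess.Read))
{
    IWorkbook workbook = new HSSFWorkbook(fs);
    ISheet sheet = workbook.GetSheetAt(0);
    for (int r = 0; r <= sheet.LastRowNum; r++)
    {
        IRow row = sheet.GetRow(r);
        if (row == null) continue; // skip entirely blank rows
        for (int c = 0; c < row.LastCellNum; c++)
        {
            ICell cell = row.GetCell(c);
            if (cell != null)
                Console.WriteLine(cell.ToString()); // or insert into the database here
        }
    }
}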

Writing to a blank excel sheet with ADO.NET

I am trying to use ADO.NET to connect to and write to an Excel file. I have created a blank file with the default Excel sheets (I have also tried with a custom sheet).
For some reason I am unable to write a full row of data to the sheet. If I create a new sheet it works fine; however, then I have too many sheets, and I am unable to delete any of them.
Is there something special you need to do to write a row of data to a blank sheet?
I try to do:
// path = the path including my file
connString = String.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=\"Excel 8.0;HDR=NO;\"", Server.MapPath(path));
dbCmd.CommandText = "Update [Sheet1$] Set F1 = 'Col1', F2 = 'Col2', F3 = 'Col3', F4 = 'Col4'";
dbCmd.ExecuteNonQuery();
Here's an example of creating a brand new spreadsheet, creating a sheet (Sheet1) and then inserting a row into that. Most of this example was based on a blog entry from David Hayden (great blog entry for this task, btw!!).
Also, you should check out this Microsoft KB article for reading/writing to Excel from ADO.NET -- it really goes into a lot of detail.
//Most of this code was from David Hayden's blog:
// http://www.davidhayden.com/blog/dave/archive/2006/05/26/2973.aspx
static void Main(string[] args)
{
    string connectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Temp\TestSO1.xls;Extended Properties=""Excel 8.0;HDR=NO;""";
    DbProviderFactory factory =
        DbProviderFactories.GetFactory("System.Data.OleDb");
    using (DbConnection connection = factory.CreateConnection())
    {
        connection.ConnectionString = connectionString;
        using (DbCommand command = connection.CreateCommand())
        {
            connection.Open(); //open the connection
            //use the '$' notation after the sheet name to indicate that this is
            // an existing sheet and not to actually create it. This basically defines
            // the metadata for the insert statements that will follow.
            // If the '$' notation is removed, then a new sheet is created named 'Sheet1'.
            command.CommandText = "CREATE TABLE [Sheet1$] (F1 number, F2 char(255), F3 char(128))";
            command.ExecuteNonQuery();
            //now we insert the values into the existing sheet...no new sheet is added.
            command.CommandText = "INSERT INTO [Sheet1$] (F1, F2, F3) VALUES(4,\"Tampa\",\"Florida\")";
            command.ExecuteNonQuery();
            //insert another row into the sheet...
            command.CommandText = "INSERT INTO [Sheet1$] (F1, F2, F3) VALUES(5,\"Pittsburgh\",\"Pennsylvania\")";
            command.ExecuteNonQuery();
        }
    }
}
The only problem I found is that even though the connection string says not to use headers, you still have to define column names for your sheet, and ADO.NET inserts a row with those header names when you create the sheet. I can't seem to find a way around that besides going in after I insert everything and removing the first row. Not very elegant.
Hope this helps!! Let me know if you have other questions.
