I am looking for best practices for storing OleDbDataReader's reads. Essentially I want to retain the same dictionary-like access, e.g. reader["Column"]. I am writing an API that returns a data structure made up of "rows." I feel like there must be a better solution than creating an ArrayList of dictionaries, but I cannot seem to find a "best practice" for this.
The code below is taken from my current project:
using (var commandToQueryDB = new OleDbCommand(query))
{
    commandToQueryDB.Connection = Connection;
    Connection.Open();
    var reader = commandToQueryDB.ExecuteReader();
    while (reader.Read())
    {
        // Insert the reader's current row into some sort of data structure
    }
}
I would like to be able to iterate through the results and access each row as a dictionary (e.g. row["DistrictName"] if I had a table with DistrictName as a column).
A reader is a "pipe", not a "bucket" - the reader API is not suitable for disconnected data. For that, it depends on whether you know the schema of the data.
If you do know the schema at compile-time, then populate a typed class model - just a List<Foo> will do nicely. There are tools that can make this even simpler, handling the member population for you, etc; "dapper" leaps to mind (although I am biased).
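For example, a minimal sketch of the typed approach with Dapper (the District class, table and column names are invented for illustration):

using System.Collections.Generic;
using System.Data.OleDb;
using System.Linq;
using Dapper; // provides the Query<T> extension method on IDbConnection

// Hypothetical row type; property names match the column names in the query
public class District
{
    public int DistrictId { get; set; }
    public string DistrictName { get; set; }
}

public static class DistrictLoader
{
    public static List<District> Load(string connectionString)
    {
        using (var connection = new OleDbConnection(connectionString))
        {
            // Dapper opens the connection if necessary and maps each row to a District by column name
            return connection
                .Query<District>("SELECT DistrictId, DistrictName FROM Districts")
                .ToList();
        }
    }
}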
If you do not know the schema in advance, then DataTable may be suitable. I don't usually recommend it, but it does the job here. Just:
table.Load(reader);
is enough to populate a DataTable including schema (columns) and values (rows / cells).
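A short sketch of that disconnected usage, reusing commandToQueryDB from the question (DistrictName is the example column mentioned there):

using System.Data;
using System.Data.OleDb;

public static class ReaderToTable
{
    // Assumes an already-configured OleDbCommand, as in the question's snippet
    public static DataTable LoadTable(OleDbCommand commandToQueryDB)
    {
        var table = new DataTable();
        using (var reader = commandToQueryDB.ExecuteReader())
        {
            table.Load(reader);   // copies schema (columns) and values (rows) from the reader
        }
        return table;
    }
}

Each row then supports dictionary-style access by column name, e.g. table.Rows[0]["DistrictName"].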
Related
What is a data structure (like list, array, etc...) that could replace a database like SQL?
I would like it to have as many database-like features as possible, such as select queries and so on. If there is none, suggest what such a structure should look like.
Edit: DataTable is good enough I think, thanks for the answers.
The simplest such data structure would be a record in F# or a class in C#. This would represent your table. A collection of tables would represent a database. You can query this with query expressions (aka Linq), and serialize it as pointed out above. You can also use a DataTable. If you are just looking for an in memory representation of a database you could have that with SQLite.
If you just want to access a database you can do it with the SQLProvider in F#, or Dapper in both F# and C#.
Here is an example with a list of records and a query expression:
open System
type Row = {
    Id: bigint
    Name: string
    Address: string
}

let table = [
    { Id = 100I; Name = "Joe"; Address = "NYC" }
    { Id = 101I; Name = "Jane"; Address = "KC" }
    { Id = 102I; Name = "Jim"; Address = "LA" }
]

let notInNYC =
    query {
        for user in table do
        where (user.Address <> "NYC")
        select user.Name
    }
    |> Seq.toList
// val notInNYC : string list = ["Jane"; "Jim"]
If you are looking to use an actual SQL database, then
(per MSDN):
private static void CreateCommand(string queryString, string connectionString)
{
    using (SqlConnection connection = new SqlConnection(connectionString))
    {
        SqlCommand command = new SqlCommand(queryString, connection);
        command.Connection.Open();
        command.ExecuteNonQuery();
    }
}
If you are looking to not use an actual SQL database and try to save data, relations, etc. directly in your code (not sure why you'd want to do that), you could create your own custom classes for it. You'd want to include some form of table, as well as a search method that could look through the instances of table, etc. There are so many functionalities that you'd have to implement though, so this would be difficult to do if you are trying to replicate all of the functionality of a real SQL db.
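As a very rough illustration of what such a hand-rolled structure might look like (all names here are made up, and this covers only a tiny slice of what a real database provides):

using System;
using System.Collections.Generic;
using System.Linq;

// A toy in-memory "table": each row is a column-name -> value map
public class MemoryTable
{
    private readonly List<Dictionary<string, object>> _rows = new List<Dictionary<string, object>>();

    public void Insert(Dictionary<string, object> row) => _rows.Add(row);

    // A minimal "search method": filter rows with an arbitrary predicate
    public IEnumerable<Dictionary<string, object>> Where(Func<Dictionary<string, object>, bool> predicate)
        => _rows.Where(predicate);
}

// A toy "database" is then just a collection of named tables
public class MemoryDatabase
{
    public Dictionary<string, MemoryTable> Tables { get; } = new Dictionary<string, MemoryTable>();
}

Indexes, joins, constraints and transactions are exactly the functionality you would then have to build yourself, which is why a DataTable or an embedded database is usually the more practical choice.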
Assuming you already know about Entity Framework as an ORM and a gateway to access DBs, here are some alternatives you'd want to have in mind.
One straightforward and quick solution for small amounts of data is serialization.
You can choose from:
Json
XML
Binary
Some others.
Serialization allows you to store and retrieve an object graph with no fuss of setting up DBs and connections, but it doesn't give you sophisticated search and update capabilities.
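For instance, a minimal JSON sketch using System.Text.Json (the Row type and file name are placeholders):

using System.Collections.Generic;
using System.IO;
using System.Text.Json;

public class Row                       // placeholder type for whatever you need to persist
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class SerializationDemo
{
    public static void SaveAndLoad()
    {
        var table = new List<Row> { new Row { Id = 100, Name = "Joe" }, new Row { Id = 101, Name = "Jane" } };

        // Store the whole object graph to disk...
        File.WriteAllText("table.json", JsonSerializer.Serialize(table));

        // ...and load it back later
        List<Row> restored = JsonSerializer.Deserialize<List<Row>>(File.ReadAllText("table.json"));
    }
}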
Another thing you might want to explore is NoSQL databases.
Check out LiteDB to get you started with the concept.
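A small sketch of the idea with LiteDB (the Customer type and file name are placeholders; this assumes LiteDB's document API roughly as documented, so check the docs for the exact calls):

using LiteDB;

public class Customer                  // placeholder document type
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class LiteDbDemo
{
    public static void Run()
    {
        using (var db = new LiteDatabase("MyData.db"))
        {
            var customers = db.GetCollection<Customer>("customers");
            customers.Insert(new Customer { Name = "Jane" });

            // Query with a predicate, no server setup required
            var janes = customers.Find(c => c.Name == "Jane");
        }
    }
}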
I need to cache some lookup tables in memory from a SQL database. I have hundreds of them. The tables are pretty simple, with the following structure:
Table name: "l_lookupobjectname"
Column 1 name: ID
Column 2 name: Code
Code is mostly a string but can also be an integer in a few cases.
I use Entity Framework and would like a generic way to load those tables into my web application's memory. I do not want to load each table individually by specifying its name.
I'm thinking along the lines of having a list of Dictionary<int, dynamic> (ID to Code).
My problem is:
How do I generate the data access code that will pull all the data into my list of dictionaries without having to write repetitive code for all my hundreds of tables?
Essentially "select ID, Code" from all the tables, instead of issuing that statement for each table separately.
I'm not concerned about the code for caching the data. This is quite trivial.
Your issue might be types, unless you declare everything as a string or an object (and cast as needed).
Other than that, going with some nested dictionaries seems like your best bet. You can build SQL queries ("select * from {0}") and just provide a list of tables, then read each one into a dictionary.
You could use a DataSet, but that is quite cumbersome. SqlDataReader is probably a better bet.
You can get column names from it by:
var reader = cmd.ExecuteReader();
var columns = new List<string>();
for (int i = 0; i < reader.FieldCount; i++)
{
    columns.Add(reader.GetName(i));
}
and then just read it all as strings or objects.
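Putting that together, a rough sketch of the generic loader (connection string and table list are supplied by you; it assumes every lookup table has exactly the ID and Code columns described above):

using System.Collections.Generic;
using System.Data.SqlClient;

public static class LookupCache
{
    // tableName -> (ID -> Code); Code stays object since it can be a string or an int
    public static Dictionary<string, Dictionary<int, object>> LoadAll(
        string connectionString, IEnumerable<string> tableNames)
    {
        var cache = new Dictionary<string, Dictionary<int, object>>();

        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            foreach (var tableName in tableNames)
            {
                // NOTE: only safe because the table list comes from your own code, not user input
                using (var cmd = new SqlCommand($"SELECT ID, Code FROM [{tableName}]", connection))
                using (var reader = cmd.ExecuteReader())
                {
                    var rows = new Dictionary<int, object>();
                    while (reader.Read())
                    {
                        rows[reader.GetInt32(0)] = reader.GetValue(1);
                    }
                    cache[tableName] = rows;
                }
            }
        }
        return cache;
    }
}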
What is the best approach to storing information gathered locally in .csv files into a C#/.NET SQL database? My reasons for asking are:
1: The data I am to handle is massive (millions of rows in each CSV). 2: The data is extremely precise, since it describes measurements on a nanoscopic scale, and is therefore delicate.
My first thought was to store each row of the CSV in a corresponding row in the database. I did this using the DataTable class. When done, I felt that if something went wrong while parsing the .csv file, I would never notice.
My second thought is to upload the .csv files to the database in their .csv format and later parse the file from the database into the local environment when the user asks for it. If this is even possible in C#/.NET with Visual Studio 2013, how could it be done in an efficient and secure manner?
I used the .NET DataStreams library's CSV reader in my project. It uses the SqlBulkCopy class, though it is not free.
Example:
using (CsvDataReader csvData = new CsvDataReader(path, ',', Encoding.UTF8))
{
    // will read in the first record as a header row and
    // name columns based on the values in the header row
    csvData.Settings.HasHeaders = true;

    csvData.Columns.Add("nvarchar");
    csvData.Columns.Add("float"); // etc.

    using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connection))
    {
        bulkCopy.DestinationTableName = "DestinationTable";
        bulkCopy.BulkCopyTimeout = 3600;

        // Optionally, you can declare column mappings using the bulkCopy.ColumnMappings property
        bulkCopy.WriteToServer(csvData);
    }
}
It sounds like you are simply asking whether you should store a copy of the source CSV in the database, so if there was an import error you can check to see what happened after the fact.
In my opinion, this is probably not a great idea. It immediately makes me ask, how would you know that an error had occurred? You certainly shouldn't rely on humans noticing the mistake so you must develop a way to programmatically check for errors. If you have an automated error checking method you should apply that method when the import occurs and avoid the error in the first place. Do you see the circular logic here?
Maybe I'm missing something but I don't see the benefit of storing the CSV.
You should probably use BULK INSERT, with your CSV file as the source. But this will only work if the file is accessible from the machine that is running your SQL Server.
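A hedged sketch of what that can look like when issued from C# (table name, file path and format options are invented; the path must be visible to the SQL Server machine):

using System.Data.SqlClient;

// Assumes 'connection' is an open SqlConnection and the CSV path is readable by the server
var bulkInsertSql = @"
    BULK INSERT dbo.Measurements
    FROM 'C:\data\measurements.csv'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2)";  // FIRSTROW = 2 skips the header line

using (var cmd = new SqlCommand(bulkInsertSql, connection))
{
    cmd.ExecuteNonQuery();
}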
Here you can find a nice solution as well. In short, it looks like this:
// The CsvReader implements IDataReader, so it can be streamed straight into SqlBulkCopy
using (StreamReader file = new StreamReader(bulk_data_filename))
using (CsvReader csv = new CsvReader(file, true, ','))
using (SqlBulkCopy copy = new SqlBulkCopy(conn))
{
    copy.DestinationTableName = tablename;
    copy.WriteToServer(csv);
}
I have a query that always returns only one row, and I want to convert this row to a class object (let's say obj).
I have a feeling that using a DataTable for this kind of query is too much,
but I don't really know which other data object to use.
A DataReader?
Is there a way to execute a SQL command into a DataRow?
DataReader is the best choice here - DataAdapters and DataSets may be overkill for a single row, although, that said, if performance is not critical then keeping-it-simple isn't a bad thing. You don't need to go from DataReader -> DataRow -> your object, just read the values off of the DataReader and you're done.
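For example, a minimal sketch of that (the Obj type, query and column names are placeholders):

using System.Data.SqlClient;

public class Obj                      // placeholder for "obj" in the question
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class SingleRowLoader
{
    public static Obj LoadSingle(SqlConnection connection)
    {
        using (var cmd = new SqlCommand("SELECT Id, Name FROM SomeTable WHERE Id = 1", connection))
        using (var reader = cmd.ExecuteReader())
        {
            if (!reader.Read())
                return null;                       // no row came back

            // Read the values straight off the reader into your object
            return new Obj
            {
                Id = reader.GetInt32(reader.GetOrdinal("Id")),
                Name = reader.GetString(reader.GetOrdinal("Name"))
            };
        }
    }
}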
A datareader lets you query individual fields. If you want the row as a single object, I believe the DataTable/DataRowView family of objects is in fact the way to go.
You might seriously consider taking a look at Linq-to-Sql or Linq-to-Entities.
The appeal of these frameworks is they provide automatic serialization of your database data into objects, abstract away many of the mundane details of connection management, and have better compile-time support by providing strongly-typed properties which you can use without string keys or column ordinals.
When using Linq, the difference between retrieving a single row vs. retrieving multiple rows often only involves appending .Single() or .First() to your query.
At any rate, if you already use or are willing to learn one of these frameworks, you may see the bulk and difficulty of data access code reduce substantially.
With respect to DataReader vs. DataSet/DataTable, it is correct that it takes more cycles to allocate and populate a data table; however, I highly doubt you will notice the difference unless creating an extremely high volume of database calls.
In case it is helpful, here are documentation examples of data access using data readers and data sets.
DataReader
DataSet
I'm building an offline C# application that will import data off spreadsheets and store it in a SQL database that I have created (inside the project). Through some research I have been able to use some code that can import a static table into a database that has exactly the same layout as the columns in the worksheet.
What I'm looking to do is have specific columns go to their correct tables based on name. This way I have the database designed correctly and not just one giant table storing everything.
Below is the code I'm using to import a few static fields into one table; I want to be able to split the imported data into more than one.
What is the best way to do this?
public partial class Form1 : Form
{
    string strConnection = ConfigurationManager.ConnectionStrings
        ["Test3.Properties.Settings.Test3ConnectionString"].ConnectionString;

    public Form1()
    {
        InitializeComponent();
    }

    private void button1_Click(object sender, EventArgs e)
    {
        // Create connection string to Excel work book
        string excelConnectionString =
            @"Provider=Microsoft.Jet.OLEDB.4.0;
            Data Source=C:\Test.xls;
            Extended Properties=""Excel 8.0;HDR=YES;""";

        // Create connection to Excel work book
        OleDbConnection excelConnection = new OleDbConnection(excelConnectionString);

        // Create OleDbCommand to fetch data from Excel
        OleDbCommand cmd = new OleDbCommand
            ("Select [Failure_ID], [Failure_Name], [Failure_Date], [File_Name], [Report_Name], [Report_Description], [Error] from [Failures$]", excelConnection);

        excelConnection.Open();
        OleDbDataReader dReader = cmd.ExecuteReader();

        SqlBulkCopy sqlBulk = new SqlBulkCopy(strConnection);
        sqlBulk.DestinationTableName = "Failures";
        sqlBulk.WriteToServer(dReader);
    }
}
You can try an ETL (extract-transform-load) architecture:
Extract: One class will open the file and get all the data in chunks you know how to work with (usually you take a single row from the file and parse its data into a POCO object containing fields that hold pertinent data), and put those into a Queue that other work processes can take from. In this case, maybe the first thing you do is have Excel open the file and re-save it as a CSV, so you can reopen it as basic text in your process and chop it up efficiently. You can also read the column names and build a "mapping dictionary"; this column is named that, so it goes to this property of the data object. This process should happen as fast as possible, and the only reason it should fail is because the format of a row doesn't match what you're looking for given the structure of the file.
Transform: Once the file's contents have been extracted into an instance of a basic row, perform any validation, calculations or other business rules necessary to turn a row from the file into a set of domain objects that conform to your domain model. This process can be as complex as you need it to be, but again it should be as straightforward as you can make it while obeying all the business rules given in your requirements.
Load: Now you've got an object graph in your own domain objects, you can use the same persistence framework you'd call to handle domain objects created any other way. This could be basic ADO, an ORM like NHibernate or MSEF, or an Active Record pattern where objects know how to persist themselves. It's no bulk load, but it saves you having to implement a completely different persistence model just to get file-based data into the DB.
An ETL workflow can help you separate the repetitive tasks into simple units of work, and from there you can identify the tasks that take a lot of time and consider parallel processes.
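As a very rough C# skeleton of that pipeline (every type and method name here is invented for illustration, and the CSV handling is deliberately naive):

using System.Collections.Concurrent;
using System.Collections.Generic;

// One parsed line from the file, before any business rules are applied
public class RawRow
{
    public Dictionary<string, string> Fields { get; } = new Dictionary<string, string>();
}

// Placeholder for whatever your domain model actually contains
public class DomainObject { }

public static class EtlPipeline
{
    // Extract: chop each CSV line into named fields and queue them for downstream workers
    public static BlockingCollection<RawRow> Extract(IEnumerable<string> csvLines, string[] header)
    {
        var queue = new BlockingCollection<RawRow>();
        foreach (var line in csvLines)
        {
            var values = line.Split(',');           // naive split; a real CSV parser handles quoting
            var row = new RawRow();
            for (int i = 0; i < header.Length && i < values.Length; i++)
                row.Fields[header[i]] = values[i];
            queue.Add(row);
        }
        queue.CompleteAdding();
        return queue;
    }

    // Transform: validate and map raw rows into domain objects
    public static IEnumerable<DomainObject> Transform(BlockingCollection<RawRow> rows)
    {
        foreach (var row in rows.GetConsumingEnumerable())
            yield return new DomainObject();        // validation, calculations and mapping go here
    }

    // Load: hand the domain objects to whatever persistence layer you already use (ADO, ORM, ...)
    public static void Load(IEnumerable<DomainObject> objects)
    {
    }
}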
Alternately, you can take the file and massage its format by detecting columns you want to work with, and arranging them into a format that matches your bulk input spec, before calling a bulk insert routine to process the data. This file processor routine can do anything you want it to, including separating data into several files. However, it's one big process that works on a whole file at a time and has limited opportunities for optimization or parallel processing. However, if your loading mechanism is slow, or you've got a LOT of data that is simple to digest, it may end up faster than even a well-designed ETL.
In any case, I would get away from an Office format and into a plain-text (or XML) format as soon as I possibly could, and I would DEFINITELY avoid having to install Office on a server. If there is ANY way you can require the files be in some easily-parseable format like CSV BEFORE they're loaded, so much the better. Having an Office installation on a server is a Really Bad Thing in general, and OLE operations in a server app is not much better. The app will be very brittle, and anything Office wants to tell you will cause the app to hang until you log onto the server and clear the dialog box.
If you were looking for a more code-related answer, you could use the following modification of your code to work with different column names / different tables:
private void button1_Click(object sender, EventArgs e)
{
    // Create connection string to Excel work book
    string excelConnectionString =
        @"Provider=Microsoft.Jet.OLEDB.4.0;
        Data Source=C:\Test.xls;
        Extended Properties=""Excel 8.0;HDR=YES;""";

    // Create connection to Excel work book
    OleDbConnection excelConnection = new OleDbConnection(excelConnectionString);

    // Create OleDbCommand to fetch data from Excel
    OleDbCommand cmd = new OleDbCommand
        ("Select [Failure_ID], [Failure_Name], [Failure_Date], [File_Name], [Report_Name], [Report_Description], [Error] from [Failures$]", excelConnection);

    excelConnection.Open();

    // Build a DataTable whose columns match the destination table you want to fill
    DataTable dataTable = new DataTable();
    dataTable.Columns.Add("Id", typeof(System.Int32));
    dataTable.Columns.Add("Name", typeof(System.String));
    // TODO: Complete other table columns

    using (OleDbDataReader dReader = cmd.ExecuteReader())
    {
        while (dReader.Read())
        {
            DataRow dataRow = dataTable.NewRow();
            dataRow["Id"] = dReader.GetInt32(0);
            dataRow["Name"] = dReader.GetString(1);
            // TODO: Complete other table columns
            dataTable.Rows.Add(dataRow);
        }
    }

    SqlBulkCopy sqlBulk = new SqlBulkCopy(strConnection);
    sqlBulk.DestinationTableName = "Failures";
    sqlBulk.WriteToServer(dataTable);
}
Now you can control the names of the columns and which tables the data gets imported into. SqlBulkCopy is good for inserting large amounts of data. If you only have a small number of rows, you might be better off creating a standard data access layer to insert your records.
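For the small-row-count case, a plain parameterized insert per row is often enough; a sketch, reusing the Failures table from the question but with invented parameter sizes:

using System.Data;
using System.Data.SqlClient;

public static class FailureRepository
{
    public static void Insert(SqlConnection connection, int failureId, string failureName)
    {
        const string sql =
            "INSERT INTO Failures (Failure_ID, Failure_Name) VALUES (@FailureId, @FailureName)";

        using (var cmd = new SqlCommand(sql, connection))
        {
            // Parameters avoid SQL injection and handle quoting/types for you
            cmd.Parameters.Add("@FailureId", SqlDbType.Int).Value = failureId;
            cmd.Parameters.Add("@FailureName", SqlDbType.NVarChar, 200).Value = failureName;
            cmd.ExecuteNonQuery();
        }
    }
}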
If you are only interested in the text (not the formatting, etc.), you can alternatively save the Excel file as a CSV file and parse the CSV file instead; it's simple.
Depending on the lifetime of the program, I would recommend one of two options.
If the program is to be short lived in use, or generally a "throw away" project, I would recommend a series of routines which parse and input data into another set of tables using standard SQL with some string processing as needed.
If the program will stick around longer and/or find more use on a day-to-day basis, I would recommend implementing a solution similar to the one recommended by @KeithS. With a set of well-defined steps for working with the data, much flexibility is gained. More specifically, the .NET Entity Framework would probably be a great fit.
As a bonus, if you're not already well versed in this area, you might find you learn a great deal about working with data between boundaries (xls -> sql -> etc.) during your first stint with an ORM such as EF.