I have a DBF file and an index file.
I want to read the index file and search for records that satisfy some condition.
(For example: find the records whose StudentName begins with "A" by using Student.DBF and StudentName.idx.)
How do I do this programmatically?
It would be easiest to query via an OleDb connection:
using System.Data;
using System.Data.OleDb;

OleDbConnection oConn = new OleDbConnection("Provider=VFPOLEDB.1;Data Source=C:\\PathToYourDataDirectory");
OleDbCommand oCmd = new OleDbCommand();
oCmd.Connection = oConn;
oCmd.Connection.Open();
oCmd.CommandText = "select * from SomeTable where LEFT(StudentName,1) = 'A'";

// Create an OleDbDataAdapter to pull data down
// based on the pre-built SQL command and parameters
OleDbDataAdapter oDA = new OleDbDataAdapter(oCmd);
DataTable YourResults = new DataTable();
oDA.Fill(YourResults);
oConn.Close();

// Then you can scan through the records to get whatever you need
string EachField = "";
foreach (DataRow oRec in YourResults.Rows)
{
    EachField = oRec["StudentName"].ToString();
    // But now, you have ALL fields in the table record available to you
}
I don't have the code off the top of my head, but if you do not want to use ODBC, then you should look into reading ESRI shape files. They consist of three parts (or more): a .DBF file (what you are looking for), a .PRJ file, and a .SHP file. It could take some work, but you should be able to dig out the code. You should take a look at SharpMap on CodePlex. It's not a simple task to read a DBF without ODBC, but it can be done, and there is a lot of code out there for doing it. You have to deal with big-endian vs. little-endian values, and with a range of file versions as well.
If you go here you will find code to read a DBF file; specifically, you would be interested in the public void ReadAttributes( Stream stream ) method.
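To give a sense of what the manual route involves, here is a minimal sketch of reading the fixed DBF header and the field descriptors. The layout (record count at offset 4, 32-byte field descriptors terminated by 0x0D) follows the standard dBASE format; the file path is a placeholder. These multi-byte header fields are stored little-endian, which is what BinaryReader reads, so no byte swapping is needed for them.

using System;
using System.IO;
using System.Text;

class DbfHeaderReader
{
    static void Main()
    {
        using (FileStream fs = File.OpenRead(@"C:\PathToYourDataDirectory\Student.DBF"))
        using (BinaryReader r = new BinaryReader(fs))
        {
            byte version = r.ReadByte();       // file type / version marker
            r.ReadBytes(3);                    // last-update date (YY MM DD)
            int recordCount = r.ReadInt32();   // number of records
            short headerSize = r.ReadInt16();  // offset where record data starts
            short recordSize = r.ReadInt16();  // bytes per record
            r.ReadBytes(20);                   // reserved bytes
            Console.WriteLine("DBF version byte: 0x{0:X2}", version);

            // Field descriptors are 32 bytes each, terminated by a 0x0D byte.
            for (byte first = r.ReadByte(); first != 0x0D; first = r.ReadByte())
            {
                byte[] rest = r.ReadBytes(31); // remainder of the 32-byte descriptor
                string name = ((char)first + Encoding.ASCII.GetString(rest, 0, 10)).TrimEnd('\0', ' ');
                char type = (char)rest[10];    // C=character, N=numeric, D=date, L=logical
                byte length = rest[15];        // field length in bytes
                Console.WriteLine("{0} ({1}, {2} bytes)", name, type, length);
            }

            Console.WriteLine("{0} records of {1} bytes each; data begins at offset {2}",
                recordCount, recordSize, headerSize);
        }
    }
}

The record data itself starts at headerSize and is fixed-width, one recordSize-byte chunk per record, so once you know the field layout you can seek straight to any record.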
I am trying to import data from an XML file into a SQL Server CE database. I use the ErikEJ SQL Server Compact Bulk Insert Library (from NuGet; this library on CodePlex). I create the database and table, then read the XML into a DataTable and import that DataTable into the DB table.
DataSet ds = new DataSet();
ds.ReadXml("myxml.xml");
DataTable table = ds.Tables[0];
string connString = @"Data Source = test.sdf";
SqlCeBulkCopy bulkInsert = new SqlCeBulkCopy(connString);
bulkInsert.DestinationTableName = "testtable";
bulkInsert.WriteToServer(table);
It works on a small XML file, but when I use a large one (more than 1 GB) I get this error on ReadXml:
"System.OutOfMemoryException" in mscorlib.dll
How to fix this?
Update: I know that this error occurs because I use a large XML file. The question is how to optimize this algorithm, maybe by using a buffer or reading the XML part by part. Any ideas?
There is no simple library that will solve this for you.
You need to read the XML file in a streaming fashion (see Reading Xml with XmlReader in C#) to avoid loading the entire file into memory. For each element read, add it to a List or DataTable; after, say, 100,000 entries, bulk insert those, dispose/clear all unused objects, and continue until the entire file has been read (a combined sketch follows the snippet below).
In addition, calls to SqlCeBulkCopy should be wrapped in using blocks to dispose of unmanaged resources:
using (SqlCeBulkCopy bulkInsert = new SqlCeBulkCopy(connString))
{
    bulkInsert.DestinationTableName = "testtable";
    bulkInsert.WriteToServer(table);
}
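Putting the two points together, here is a minimal sketch of the batched streaming import. It assumes the records appear as repeated <row> elements carrying Name and Value attributes; the element name, column names, and batch size are placeholders to adapt to your actual schema.

using System.Data;
using System.Xml;
using ErikEJ.SqlCe;

class StreamingXmlImport
{
    const int BatchSize = 100000;

    static void Main()
    {
        string connString = @"Data Source = test.sdf";

        // Buffer rows in a DataTable whose schema matches the target table.
        DataTable table = new DataTable();
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Value", typeof(string));

        using (XmlReader reader = XmlReader.Create("myxml.xml"))
        {
            while (reader.Read())
            {
                // Each <row> element becomes one DataRow.
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "row")
                {
                    table.Rows.Add(reader.GetAttribute("Name"), reader.GetAttribute("Value"));
                    if (table.Rows.Count >= BatchSize)
                        Flush(table, connString); // bulk insert and clear before reading on
                }
            }
        }

        if (table.Rows.Count > 0)
            Flush(table, connString); // final partial batch
    }

    static void Flush(DataTable table, string connString)
    {
        using (SqlCeBulkCopy bulkInsert = new SqlCeBulkCopy(connString))
        {
            bulkInsert.DestinationTableName = "testtable";
            bulkInsert.WriteToServer(table);
        }
        table.Clear(); // release the buffered rows so memory use stays flat
    }
}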
How do I upload an Excel sheet using ASP.NET and determine the structure of the columns in the sheet, so that I can use SqlBulkCopy to upload it into a table with a similar structure?
Any answers would be appreciated.
Thanks in advance.
I assume you know how to do the uploading part, so I concentrate on the Excel part.
There are a bunch of third-party tools for reading Excel files in .NET, which in my experience are way more flexible than the capabilities .NET has out of the box. However, here's one way you can do it:
DbProviderFactory factory = DbProviderFactories.GetFactory("System.Data.OleDb");
using (DbConnection connection = factory.CreateConnection())
{
    connection.ConnectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\MyExcel.xls;Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1""";
    connection.Open();
    using (DbCommand command = connection.CreateCommand())
    {
        command.CommandText = "SELECT * FROM [Sheet1$]";
        using (DbDataReader dr = command.ExecuteReader())
        {
            while (dr.Read())
            {
                /* read data here */
            }
        }
    }
}
Keep in mind:
The Jet OLE DB provider reads a registry key to determine how many rows to scan when guessing the type of each source column. The registry setting is HKLM\Software\Microsoft\Jet\4.0\Engines\Excel\TypeGuessRows. By default, the value of this key is 8, so the provider scans the first 8 rows of the source data to determine the data types of the columns (see http://support.microsoft.com/kb/281517). The valid range of values for the TypeGuessRows key is 0 to 16; however, if the value is 0, the provider scans 16384 source rows.
On 64-bit systems the Microsoft.Jet.OLEDB.4.0 driver is currently not supported.
For more info on the parameters used in the connection string see here: http://www.connectionstrings.com/excel
"HDR=Yes" in the connection string indicates that the provider will not include the first row of the cell range (which may be a header row) in the RecordSet. So if the header row gives you information that you need to build the sqlbulkcopy commands you should set it to "HDR=No".
So I'm trying to get the results from a stored proc (200k+ rows) into an Excel file from ASP.NET, but I'm having a few difficulties. I don't think CSV is an option, as the client wants the numbers formatted correctly. I've tried three third-party Excel libraries, but all have fallen over with so much data and are using gigabytes of memory.
I've written some code to generate an Excel XML file, and it runs very quickly, but the file is over 300 MB. If I open it and save it as a native Excel file, it gets down to 30 MB. At the moment my best solution is to zip the XML file on the server, which gets it down to 7 MB, but the user is still going to end up with a huge file once unzipped. Ideally I'd like to find a third-party Excel library that can write a native Excel file with 200,000+ rows without killing the server. Any ideas?
Here's a really quick POC I made that writes 3 columns of 255 characters 200,000 times (600,000 cells). The final file comes in at 4.85MB on my machine.
string ExportFile = System.IO.Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "Test.xlsx");
string DSN = string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"Excel 12.0 Xml;HDR=YES\";", ExportFile);
using (System.Data.OleDb.OleDbConnection Con = new System.Data.OleDb.OleDbConnection(DSN))
{
    Con.Open();
    using (System.Data.OleDb.OleDbCommand Com = new System.Data.OleDb.OleDbCommand())
    {
        Com.Connection = Con;
        Com.CommandText = "CREATE TABLE [TestSheet] (A1 varChar(255), B1 varChar(255), C1 varChar(255))";
        Com.ExecuteNonQuery();
        string A1 = new string('A', 255);
        string B1 = new string('B', 255);
        string C1 = new string('C', 255);
        Com.CommandText = string.Format("INSERT INTO [TestSheet] (A1, B1, C1) VALUES ('{0}', '{1}', '{2}')", A1, B1, C1);
        for (var i = 1; i <= 200000; i++)
        {
            Com.ExecuteNonQuery();
        }
    }
    Con.Close();
}
On a server I'm not entirely sure what's needed, but you might have to install this:
http://www.microsoft.com/downloads/en/details.aspx?familyid=7554f536-8c28-4598-9b72-ef94e038c891&displaylang=en
Ultimately Excel is a flat file format and SQL isn't, which makes compatibility interesting. I'm hoping Chris Haas' code will work for you. It does seem odd, though: "I don't think CSV is an option as the client wants the numbers formatted correctly." If they have a SQL database, why don't they refer to/query that instead of keeping an Excel version of the database?
I ended up generating an .xlsx file myself; it turns out the format is pretty simple.
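For anyone curious what "pretty simple" means in practice, here is a sketch of hand-writing a minimal .xlsx, assuming .NET 4.5's ZipArchive (reference System.IO.Compression). The part names and XML follow the Open XML spreadsheet format; a real exporter would stream the stored-proc rows into sheet1.xml as <row> elements instead of the single hard-coded cell shown here.

using System.IO;
using System.IO.Compression;

class MinimalXlsxWriter
{
    static void Main()
    {
        using (FileStream fs = File.Create("Test.xlsx"))
        using (ZipArchive zip = new ZipArchive(fs, ZipArchiveMode.Create))
        {
            // Declares the content type of each part in the package.
            AddPart(zip, "[Content_Types].xml",
                "<Types xmlns=\"http://schemas.openxmlformats.org/package/2006/content-types\">" +
                "<Default Extension=\"rels\" ContentType=\"application/vnd.openxmlformats-package.relationships+xml\"/>" +
                "<Override PartName=\"/xl/workbook.xml\" ContentType=\"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml\"/>" +
                "<Override PartName=\"/xl/worksheets/sheet1.xml\" ContentType=\"application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml\"/></Types>");

            // Package-level relationship pointing at the workbook part.
            AddPart(zip, "_rels/.rels",
                "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\">" +
                "<Relationship Id=\"rId1\" Type=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument\" Target=\"xl/workbook.xml\"/></Relationships>");

            // The workbook lists its sheets by relationship id.
            AddPart(zip, "xl/workbook.xml",
                "<workbook xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\" " +
                "xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\">" +
                "<sheets><sheet name=\"Sheet1\" sheetId=\"1\" r:id=\"rId1\"/></sheets></workbook>");

            AddPart(zip, "xl/_rels/workbook.xml.rels",
                "<Relationships xmlns=\"http://schemas.openxmlformats.org/package/2006/relationships\">" +
                "<Relationship Id=\"rId1\" Type=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet\" Target=\"worksheets/sheet1.xml\"/></Relationships>");

            // One row with one inline-string cell; a real exporter writes many rows here.
            AddPart(zip, "xl/worksheets/sheet1.xml",
                "<worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\"><sheetData>" +
                "<row r=\"1\"><c t=\"inlineStr\"><is><t>Hello xlsx</t></is></c></row>" +
                "</sheetData></worksheet>");
        }
    }

    static void AddPart(ZipArchive zip, string name, string xml)
    {
        using (StreamWriter w = new StreamWriter(zip.CreateEntry(name).Open()))
            w.Write("<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>" + xml);
    }
}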
I'm pretty new to C# and Visual Studio. I'm writing a small program that will read a .csv file and then write the records read to a SQL Server database table.
I can manually parse the .csv file, but I was wondering if it is possible to somehow "describe" the .csv file to Visual Studio so that I can use it as a data source? I should mention that the first two lines in the .csv file contain header information and the following lines are the actual comma-delimited data.
Also, I should mention that this program is a stand-alone console program with no user interface.
This is a great example of using the power of LINQ. Here's a quick reference with an example of how to do it.
The rundown is this: you can read your CSV into a string array, then use LINQ to query against that collection. As Reed points out, though, you'll have to code around your header lines, as they will throw off your query.
You can also use the TextFieldParser to handle escaped commas. Here's an example on ThinqLinq that uses the TextFieldParser to parse the file and a LINQ query to get the results. It even has a unit test to make sure escaped commas are handled.
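As a quick illustration of the LINQ approach, here is a minimal sketch; the file name, the two skipped header lines, and the filter are placeholders, and the naive Split does not handle quoted commas (that's what the TextFieldParser is for).

using System;
using System.IO;
using System.Linq;

class CsvLinqExample
{
    static void Main()
    {
        // Skip the two header lines, split each record on commas.
        var records = File.ReadLines("data.csv")
                          .Skip(2)
                          .Select(line => line.Split(','))
                          .Where(fields => fields[0].StartsWith("A"));

        foreach (var fields in records)
            Console.WriteLine(string.Join(" | ", fields));
    }
}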
If you have a 2 line header, it's not a standard CSV file.
In this case, the automatic tools won't work, and you'll have to revert to parsing the file manually.
If you want to remove one of the header lines, you might be able to use this technique of parsing CSV files into an ADO.NET DataTable.
If not, however, the TextFieldParser in the Microsoft.VisualBasic.dll assembly (usable from C# too) makes parsing CSV files very simple.
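For example, here is a small sketch using the TextFieldParser; the file name and the two skipped header lines are assumptions based on the question.

using Microsoft.VisualBasic.FileIO; // reference Microsoft.VisualBasic.dll

class CsvTextFieldParserExample
{
    static void Main()
    {
        using (TextFieldParser parser = new TextFieldParser("data.csv"))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.ReadLine(); // skip the first header line
            parser.ReadLine(); // skip the second header line
            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields(); // handles quoted/escaped commas
                // ... hand the fields to your SQL Server insert here
            }
        }
    }
}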
To parse it manually is very simple, and you could have a program that parses it, strips out the first two unnecessary lines and then feeds it directly to SSIS.
Here is a link for using LINQ to read it in:
http://blogs.msdn.com/wriju/archive/2009/05/24/linq-to-csv-getting-data-the-way-you-want.aspx
Using the built-in OLE DB CSV parser via C# in order to parse a CSV file.
You can find a sample here.
It basically lets you treat the CSV file like a database table.
The link in Development 4.0's post has disappeared. The code in that link was the following:
using System.Data;
using System.Data.OleDb;
using System.IO;

class CSVParser
{
    public static DataTable ParseCSV(string path)
    {
        if (!File.Exists(path))
            return null;

        string full = Path.GetFullPath(path);
        string file = Path.GetFileName(full);
        string dir = Path.GetDirectoryName(full);

        //create the "database" connection string
        string connString = "Provider=Microsoft.Jet.OLEDB.4.0;"
            + "Data Source=\"" + dir + "\\\";"
            + "Extended Properties=\"text;HDR=No;FMT=Delimited\"";

        //create the database query
        string query = "SELECT * FROM " + file;

        //create a DataTable to hold the query results
        DataTable dTable = new DataTable();

        //create an OleDbDataAdapter to execute the query
        OleDbDataAdapter dAdapter = new OleDbDataAdapter(query, connString);
        try
        {
            //fill the DataTable
            dAdapter.Fill(dTable);
        }
        catch (InvalidOperationException /*e*/)
        {
            // swallowing the exception means an empty DataTable is returned on failure
        }
        dAdapter.Dispose();
        return dTable;
    }
}
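Usage is then a one-liner; the path below is just an example:

DataTable dt = CSVParser.ParseCSV(@"C:\data\records.csv");
// with HDR=No the columns come back named F1, F2, ...
Console.WriteLine(dt.Rows.Count);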
I have to automate something for the finance dept. I've got an Excel file which I want to read using OleDb:
string connectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=A_File.xls;Extended Properties=""HTML Import;IMEX=1;""";
using (OleDbConnection connection = new OleDbConnection())
{
    using (DbCommand command = connection.CreateCommand())
    {
        connection.ConnectionString = connectionString;
        connection.Open();
        DataTable dtSchema = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
        if ((null == dtSchema) || (dtSchema.Rows.Count <= 0))
        {
            //raise exception if needed
        }
        command.CommandText = "SELECT * FROM [NameOfTheWorksheet$]";
        using (DbDataReader dr = command.ExecuteReader())
        {
            while (dr.Read())
            {
                //do something with the data
            }
        }
    }
}
Normally the connection string would have the extended property "Excel 8.0", but the file can't be read that way, because it seems to be an HTML file renamed to .xls.
When I copy the data from the .xls to a new .xls, I can read the new file with the extended property set to "Excel 8.0".
Yes, I can read the file by creating an instance of Excel, but I'd rather not.
Any idea how I can read the .xls using OleDb, without making manual changes to the file or playing with ranges in an instantiated Excel?
Regards,
Michel
I asked this same question on another forum and got the answer, so I figured I'd share it here. As per this article: http://ewbi.blogs.com/develops/2006/12/reading_html_ta.html
Instead of using the sheet name, you must use the page title in the SELECT statement, without the $: SELECT * FROM [HTMLPageTitle]
I've been searching through so many solutions, and I ended up finding something really simple and easy.
To import an XML file into an Excel file, I tried converting the XML to HTML first, using
http://www.csharpfriends.com/Articles/getArticle.aspx?articleID=63
Then I found I could simply name my output file .xls instead of .html:
//create the output stream
XmlTextWriter myWriter = new XmlTextWriter("result.xls", null);
Then the output is a perfect Excel file from my XML data file.
Hope this will save you some work.
I have run into the same problem. As previously mentioned, the file seems to be an HTML file renamed to .xls, and when I copy the data from that .xls to a new .xls, I can read the new one with the extended property set to "Excel 8.0".
In this scenario the file wasn't saved in the correct format, so we have to convert it. To do this, use MS Office Excel 2007: click File -> Convert, and the file will be converted to the right format automatically.