I have a String to Date conversion problem using SQL Bulkcopy in asp.net 3.5 with C#
I read a large CSV file (with CSV reader). One of the strings read should be loaded into a SQL server 2008 Date column.
If the textfile contains for example the string '2010-12-31', SQL Bulkcopy loads it without any problems into the Date column.
However, if the string is '20101231', I get an error:
The given value of type String from the data source cannot be converted to type date of the specified target column
The file contains 80 million records so I cannot create a datatable....
SqlBulkcopy Columnmappings etc. are all ok. Also changing to DateTime does not help.
I tried
SET DATEFORMAT ymd;
But that does not help.
Any ideas how to tell SQL Server to accept this format? Otherwise I will create a custom fix in CSV reader but I would prefer something in SQL.
update
Following up on the two answers, I am using SQL bulkcopy like this (as proposed on Stackoverflow in another question):
The CSV reader (see the link above on codeproject) returns string values (not strong typed). The CSVreader implements System.Data.IDataReader so I can do something like this:
using (CsvReader reader = new CsvReader(path))
using (SqlBulkCopy bcp = new SqlBulkCopy(CONNECTION_STRING))
{ bcp.DestinationTableName = "SomeTable";
// columnmappings
bcp.WriteToServer(reader); }
All the fields coming from the IDataReader are strings, so I cannot use the C# approach unless I change quite a bit in the CsvReader.
My question is therefore not about how to fix it in C#. I can do that, but I want to avoid it.
It is strange, because if you do something like this in SQL
update SomeTable set [somedatefield] = '20101231'
it also works, just not with bulkcopy.
Any idea why?
Thanks for any advice,
Pleun
Older issue, but wanted to add an alternative approach.
I had the same issue with SQLBulkLoader not allowing DataType/culture specifications for columns when streaming from IDataReader.
In order to reduce the speed overhead of constructing datarows locally and instead have the parsing occur on the target, a simple method I used was to temporarily set the thread culture to the culture which defines the format in use - in this case for US format dates.
For my problem - en-US dates in the input (in Powershell):
[System.Threading.Thread]::CurrentThread.CurrentCulture = 'en-US'
<call SQLBulkCopy>
For your problem, you could do the same but since the date format is not culture specific, create a default culture object (untested):
CultureInfo newCulture = (CultureInfo) System.Threading.Thread.CurrentThread.CurrentCulture.Clone();
newCulture.DateTimeFormat.ShortDatePattern = "yyyyMMdd";
Thread.CurrentThread.CurrentCulture = newCulture;
I found allowing the database server to perform the type conversions once they've gotten through the SQLBulkCopy interface to be considerably faster than performing parsing locally, particularly in a scripting language.
If you can handle it in C# itself, then this code will parse the date string into a DateTime object which you can pass directly
//datestring is the string read from CSV
DateTime thedate = DateTime.ParseExact(dateString, "yyyyMMdd", null);
If you want it to be formatted as string then:
string thedate = DateTime.ParseExact(dateString, "yyyyMMdd", null).ToString("yyyy-MM-dd");
Good luck.
Update
In your scenario I don't know why the date is not converted automatically, but from C# you need to step in and intervene in how the data is passed to the WriteToServer() method. The best you can do, I think (keeping performance in mind), is to keep a buffer of DataRow items and pass them to WriteToServer() in batches. I will just write the sample code in a minute...
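// Sample code; polish it before use.
// Needs: using System.Collections.Generic; using System.Data;
// Assumes "table" is a DataTable whose columns match the destination table.
const int BatchSize = 50;                  // rows buffered per WriteToServer call
var buffer = new List<DataRow>(BatchSize);
long recordsRead = 0;                      // counter to track the number of records read
while (reader.Read())
{
    DataRow row = table.NewRow();
    // Code to initialize the row with the data in the reader
    // .....
    // Fill the date column with a properly parsed value, e.g.
    // DateTime.ParseExact(text, "yyyyMMdd", null)
    buffer.Add(row);
    recordsRead++;
    if (buffer.Count == BatchSize)
    {
        bcp.WriteToServer(buffer.ToArray());
        buffer.Clear();
    }
}
// Flush the last, partially filled batch
if (buffer.Count > 0)
    bcp.WriteToServer(buffer.ToArray());
It's not the full code, but I think you can work it out...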
It's not entirely clear how you're using SqlBulkCopy, but ideally you shouldn't be uploading the data to SQL Server in string format at all: parse it to a DateTime or DateTimeOffset in your CSV reader (or on the output of your CSV reader), and upload it that way. Then you don't need to worry about string formats.
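One way to do the parsing "on the output of your CSV reader" without building a DataTable is to wrap the reader. The following is only a sketch: DateConvertingReader is a hypothetical class (not part of the CSV reader or of ADO.NET), it assumes the date sits in a single known column, that the text is always yyyyMMdd, and that SqlBulkCopy pulls values through GetValue. It uses expression-bodied members, so it needs a reasonably recent C# compiler.

```csharp
using System;
using System.Data;
using System.Globalization;

// Hypothetical wrapper: forwards everything to an inner IDataReader, except that
// one string column holding "yyyyMMdd" text is exposed as a DateTime.
public sealed class DateConvertingReader : IDataReader
{
    private readonly IDataReader _inner;
    private readonly int _dateOrdinal;

    public DateConvertingReader(IDataReader inner, int dateOrdinal)
    {
        _inner = inner;
        _dateOrdinal = dateOrdinal;
    }

    // The only member with real logic: parse the date column, pass the rest through.
    public object GetValue(int i)
    {
        object raw = _inner.GetValue(i);
        if (i != _dateOrdinal || raw == null || raw is DBNull) return raw;
        return DateTime.ParseExact(Convert.ToString(raw), "yyyyMMdd", CultureInfo.InvariantCulture);
    }

    public DateTime GetDateTime(int i) => (DateTime)GetValue(i);
    public Type GetFieldType(int i) => i == _dateOrdinal ? typeof(DateTime) : _inner.GetFieldType(i);
    public object this[int i] => GetValue(i);
    public object this[string name] => GetValue(GetOrdinal(name));
    public int GetValues(object[] values)
    {
        int n = Math.Min(values.Length, FieldCount);
        for (int i = 0; i < n; i++) values[i] = GetValue(i);
        return n;
    }

    // Plain pass-through members.
    public bool Read() => _inner.Read();
    public int FieldCount => _inner.FieldCount;
    public int GetOrdinal(string name) => _inner.GetOrdinal(name);
    public string GetName(int i) => _inner.GetName(i);
    public bool IsDBNull(int i) => _inner.IsDBNull(i);
    public string GetString(int i) => _inner.GetString(i);
    public bool NextResult() => _inner.NextResult();
    public void Close() => _inner.Close();
    public void Dispose() => _inner.Dispose();
    public int Depth => _inner.Depth;
    public bool IsClosed => _inner.IsClosed;
    public int RecordsAffected => _inner.RecordsAffected;
    public DataTable GetSchemaTable() => _inner.GetSchemaTable();
    public string GetDataTypeName(int i) => _inner.GetDataTypeName(i);
    public IDataReader GetData(int i) => _inner.GetData(i);
    public bool GetBoolean(int i) => _inner.GetBoolean(i);
    public byte GetByte(int i) => _inner.GetByte(i);
    public long GetBytes(int i, long o, byte[] b, int bo, int len) => _inner.GetBytes(i, o, b, bo, len);
    public char GetChar(int i) => _inner.GetChar(i);
    public long GetChars(int i, long o, char[] b, int bo, int len) => _inner.GetChars(i, o, b, bo, len);
    public decimal GetDecimal(int i) => _inner.GetDecimal(i);
    public double GetDouble(int i) => _inner.GetDouble(i);
    public float GetFloat(int i) => _inner.GetFloat(i);
    public Guid GetGuid(int i) => _inner.GetGuid(i);
    public short GetInt16(int i) => _inner.GetInt16(i);
    public int GetInt32(int i) => _inner.GetInt32(i);
    public long GetInt64(int i) => _inner.GetInt64(i);
}
```

With something like this, the bulk copy call would become bcp.WriteToServer(new DateConvertingReader(reader, dateOrdinal)), where dateOrdinal is the zero-based position of the date column in the file. The rows still stream one at a time, so the 80 million records never have to sit in memory.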
Related
I'm currently building a data export/import tool for pulling data into a Visual Fox Pro database from an excel or CSV document.
I believe the code to be functional, however upon execution I receive a data type mismatch error.
After some investigation I've noticed a difference between the format of the dates I'm pulling and the field I'm pushing to.
The FoxPro database is set up to take Date records, however the data I'm trying to push is in DateTime format (the original record is a date), and as far as I'm aware C# can only natively do DateTime conversion.
The code getting the date from excel is as such:
importCommand.Parameters["TENSDATE"].Value = exportReader.IsDBNull(0)
? (object) DBNull.Value
: DateTime.Parse(exportReader.GetValue(0).ToString());
Now, I've seen a lot of people use something like:
exportReader.GetValue(0).ToString("dd/MM/yyyy")
However I can't seem to get this functioning. Can someone advise me on the best way to achieve my goal.
You need to supply the type of the field when adding it to parameters. In this specific case, OdbcType.DateTime for a date field.
importCommand.Parameters.Add("@TENSDATE", OdbcType.DateTime).Value = exportReader.IsDBNull(0)
? (object) DBNull.Value
: DateTime.Parse(exportReader.GetValue(0).ToString());
If you want to parse dates which are in specific format you should use DateTime.TryParseExact method. You'll be able to pass specific format as an argument. Please refer to: https://msdn.microsoft.com/en-us/library/ms131044(v=vs.110).aspx
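A minimal sketch of the TryParseExact approach, assuming the incoming text uses the dd/MM/yyyy layout mentioned in the question. The class and method names are made up for illustration.

```csharp
using System;
using System.Globalization;

public static class DateParsing
{
    // Returns null instead of throwing when the text is not in the expected format.
    public static DateTime? ParseDayFirst(string text)
    {
        DateTime parsed;
        bool ok = DateTime.TryParseExact(
            text, "dd/MM/yyyy", CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed);
        return ok ? parsed : (DateTime?)null;
    }
}
```

The caller can then map a null result to DBNull.Value before assigning the parameter value, rather than letting a bad cell throw halfway through the import.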
(Joshua Cameron-Macintosh, please close your open threads)
Despite my prior warnings, you are trying to do this the hard way, so be it. VFP is a good data-centric language and is clever enough to put a DateTime value into a Date or DateTime field. It is also clever enough to parse text values that denote a Date(time). In the case of text, just like any other database or non-database parser, it does the parsing with given rules (such as using the common canonical ODBC format of yyyyMMdd HH:mm:ss with no problem, or, if instructed to use a format of say DMY, it knows 1/2/2000 means Feb 1st, 2000 etc.). In summary, the problem here is not on the VFP side at all. If you use CSV, then be sure you are using the ODBC canonical format for dates (the same goes for SQL Server, for example). In the case of an Excel file, provided you have the correct data types, you can transfer directly with no additional work. In particular, that DBNull trial was totally unnecessary; VFP knows DBNull.Value already.
Anyway code always talks better.
For this sample assume you have an excel file (d:\temp\ExcelImportData.xlsx) with SampleSheet sheet where you have the data columns as:
Customer ID: string
Order ID: integer
Ordered On: DateTime && where time parts were insignificant for demo purposes
Shipped On: DateTime && Has NULL values
(You can build such a sample sheet using Northwind sample database's Orders table)
There is a VFP table (d:\temp\SampleImport.dbf) as the receiver where column information is:
CustomerId: Char(10) NOT NULL
OrderID: Int NOT NULL
OrderDate: Date NOT NULL
ShippedOn: DateTime NULL
Here is the simple read/write using a reader:
void Main()
{
var vfpConnection = @"Provider=VFPOLEDB;Data Source=D:\temp";
var xlsFileName = @"D:\temp\ExcelImportData.xlsx";
var xlsConnection = $@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={xlsFileName};" +
"Extended Properties=\"Excel 12.0;HDR=Yes\"";
var xlsTableName = "SampleSheet$";
using (var xlsCon = new OleDbConnection(xlsConnection))
using (var vfpCon = new OleDbConnection(vfpConnection))
{
var cmdInsert = new OleDbCommand(@"insert into SampleImport
(CustomerId, OrderId, OrderDate, ShippedOn)
values
(?,?,?,?)", vfpCon);
cmdInsert.Parameters.Add("customerId", OleDbType.WChar);
cmdInsert.Parameters.Add("orderId", OleDbType.Integer);
cmdInsert.Parameters.Add("orderDate", OleDbType.Date);
cmdInsert.Parameters.Add("shippedOn", OleDbType.Date);
var readXl = new OleDbCommand($"select * from [{xlsTableName}]", xlsCon);
xlsCon.Open();
vfpCon.Open();
var xlReader = readXl.ExecuteReader();
while (xlReader.Read())
{
cmdInsert.Parameters["customerId"].Value = xlReader["Customer ID"];
cmdInsert.Parameters["orderId" ].Value = xlReader["Order ID"];
cmdInsert.Parameters["orderDate" ].Value = xlReader["Ordered On"];
cmdInsert.Parameters["shippedOn" ].Value = xlReader["Shipped On"];
cmdInsert.ExecuteNonQuery();
}
xlsCon.Close();
vfpCon.Close();
}
}
I'm trying to find a way to correctly format a Date in a SELECT statement that loads data from an old Visual Fox Pro Database. I need to do this so that I can load it into a CSV, then load it into MySQL, which takes the date format 'yyyy-MM-dd'.
This is my query at the moment;
var dbfCmd = new OleDbCommand(@"SELECT ct_id, ct_cmpid, ct_empid, ct_pplid, ct_cntid, ct_pplnm, ct_date, ct_time, ct_type, ct_doneby, ct_desc FROM contacts", dbfCon);
I need to find a way to format the ct_date column. I've tried a number of ways so far, including using FDATE and FORMAT, but nothing has worked so far. I've looked through the supported commands for OleDB but still haven't come across anything.
Can anyone inform me of the correct way to format the Date query so that I can import to MySQL?
Do not format the date on the server.
Receive it as a date column, then read it as a DateTime value when enumerating the query results. Finally, format it on your client in the specific way you need when writing the file.
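As a rough sketch of that last step, assuming the value comes straight out of the reader (for example reader["ct_date"]) and that an empty CSV field is an acceptable stand-in for a null date. The helper name is invented, and the null handling is an assumption you may need to change for your MySQL import.

```csharp
using System;
using System.Globalization;

public static class CsvDates
{
    // Formats a value read from a date column as yyyy-MM-dd for the CSV file.
    // Null or DBNull becomes an empty field (an assumption, adjust as needed).
    public static string ToMySqlDate(object value)
    {
        if (value == null || value is DBNull) return "";
        return ((DateTime)value).ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
    }
}
```

Passing CultureInfo.InvariantCulture keeps the output the same regardless of the regional settings of the machine that runs the export.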
Although it has been suggested to convert the data in C# to date format, if you REALLY want to pull the date formatted from VFP OleDb query, you could do this for your date column
STUFF( STUFF( DTOS( ct_date ), 5, 0, "-"), 8, 0, "-" )
The VFP Function DTOS will convert a date (or datetime) to a string and will always be in the format of YYYYMMDD. The STUFF command will do the inserting of the hyphen character to properly set to YYYY-MM-DD for you.
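If the two STUFF positions look arbitrary, the same string surgery can be mirrored in C# purely as an illustration. VFP's STUFF counts from 1 while string.Insert counts from 0, so positions 5 and 8 become 4 and 7. The helper below is hypothetical and only shows where the hyphens land; it assumes an 8-character YYYYMMDD input.

```csharp
using System;

public static class DtosDemo
{
    // Mirrors STUFF(STUFF(s, 5, 0, "-"), 8, 0, "-") on a YYYYMMDD string.
    public static string InsertHyphens(string yyyymmdd)
    {
        return yyyymmdd.Insert(4, "-").Insert(7, "-");
    }
}
```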
You could simply use ToString("yyyy-MM-dd"), but it is better not to do even that. You could connect directly to MySQL and insert data into it using VFP commands, and call that via an ExecScript. Or you could import directly from VFP within a MySQL connection (I don't use MySQL, but you can do that with other databases like MS SQL Server, PostgreSQL etc., so I assume MySQL is also capable of doing that).
Yet another alternative would be to use an XML format to import/export which is much more reliable than plain text.
Yet another way, you could connect to MySQL and VFP, fill MySQL DataTable with your data and submit changes.
IOW using a text file in between would be my last suggestion.
PS: Have a look at FileHelpers.
so i have a string "09/15/2014" and in c# it converts it to date:
DateTime from = Convert.ToDateTime(fromdate);
this outputs "9/15/2014" and when I send it over to sql I get this:
select convert(varchar, '9/1/2014 12:00:00 AM', 101)
which doesn't work for me because I need to keep any leading zeros.
help?
If you're worried about the string formats for dates with Sql Server, you're doing it wrong. As a comment to another answer indicates, SQL Server internally stores all dates in a machine-optimized numeric format that is not easily human-readable. It only converts them to a human-understandable format for output in your developer tools.
When sending dates to Sql Server, always use query parameters. In fact, when sending any data, of any type, to Sql Server in an SQL statement, always use query parameters. Anything else will not only result in formatting issues like your problem here, but will also leave you crazy-vulnerable to sql injection attacks. If you find yourself using string manipulation to include data of any type into an SQL string from client code, step away from the keyboard and go ask a real programmer how to do it right. If that sounds insulting, it's because it's so hard to overstate the importance of this issue and the need to take it seriously.
When retrieving dates from Sql Server, most of the time you should just select the datetime field. Let client code worry about how to format it. Do you want leading zeros? Great! The Sql Datetime column will at some point be available in C# as a .Net DateTime value, and you can use the DateTime's .ToString() method or other formatting option to convert the value to whatever you want, at the client.
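For the leading zeros specifically, a small sketch of formatting at the client. It assumes the MM/dd/yyyy layout from the question, and the class and method names are invented for the example.

```csharp
using System;
using System.Globalization;

public static class LeadingZeros
{
    // Parses MM/dd/yyyy text and formats it back with the same explicit pattern.
    // "MM" and "dd" always produce two digits; with the invariant culture the
    // "/" in the pattern is rendered as a literal slash.
    public static string RoundTrip(string fromdate)
    {
        DateTime from = DateTime.ParseExact(fromdate, "MM/dd/yyyy", CultureInfo.InvariantCulture);
        return from.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture);
    }
}
```

The zeros disappear in the question only because the default display pattern of the current culture was used. The DateTime value itself has no format at all.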
SQL queries use a date and time format which goes like this:
2014-09-15
That's year-month-day. As per the comments below, how this string is interpreted may differ depending on the language and date format settings in effect for your connection (see Scott's comment for a more accurate way to describe this and get dates into this format).
DateTime's ToString method has an overload which takes a formatting string. So you can pass the format you want the string to be output to. Try it like this:
string queryDate = from.ToString("yyyy-MM-dd");
And see what you get. Use that on your query.
But if you really want this done right, use parameters. Like:
SqlCommand command = new SqlCommand("SELECT * FROM foo WHERE someDate = @date", connection);
command.Parameters.AddWithValue("@date", from);
// where "from" is your DateTime variable from the code you've shown.
This will save you the trouble of DateTime to String conversions.
I use sql server 2008 R2 as a data store.
Until now on the test machine I had the English version of the software and used to make queries formatting the datetime field as
fromDate.ToString("MM/dd/yyyy");
Now I have deployed the database on another server which is in the Italian language, so I would have to change the format in my code to
fromDate.ToString("dd/MM/yyyy");
Is there a way to make the query in a neutral format?
thanks!
EDIT:
I forgot to mention that I am using NetTiers with CodeSmith. Here's a complete sample
AppointmentQuery aq = new AppointmentQuery(true, true);
aq.AppendGreaterThan(AppointmentColumn.AppointmentDate, fromDate.ToString("MM/dd/yyyy"));
aq.AppendLessThan(AppointmentColumn.AppointmentDate, toDate.ToString("MM/dd/yyyy"));
AppointmentService aSvc = new AppointmentService();
TList<Appointment> appointmentsList = aSvc.Find(aq);
You should share the code you are using to execute the query, but I guess you are building a SQL query dynamically, using string concatenation to build the query and the arguments. You should rather use a parameterised query; then you can pass the data as a date object, with no need to convert it to a string.
For example if your query could be something like this
DateTime fromDate = DateTime.Now;
SqlCommand cmd = new SqlCommand(
"select * from Orders where fromDT = @fromDate", con);
cmd.Parameters.AddWithValue("@fromDate", fromDate);
...
As a good side effect, this will reduce your risk of SQL injection.
Update: Your edit changes the context of the question significantly, and I have to admit that I have zero knowledge of the .netTiers project. But just out of curiosity, have you tried passing the date instances directly, as in the following?
AppointmentQuery aq = new AppointmentQuery(true, true);
aq.AppendGreaterThan(AppointmentColumn.AppointmentDate, fromDate);
aq.AppendLessThan(AppointmentColumn.AppointmentDate, toDate);
AppointmentService aSvc = new AppointmentService();
TList<Appointment> appointmentsList = aSvc.Find(aq);
ISO 8601 ("Data elements and interchange formats, Information interchange, Representation of dates and times") allows both the YYYY-MM-DD and YYYYMMDD formats. SQL Server recognises the ISO specifications.
Although the standard allows both the
YYYY-MM-DD and YYYYMMDD formats for
complete calendar date
representations, if the day [DD] is
omitted then only the YYYY-MM format
is allowed. By disallowing dates of
the form YYYYMM, the standard avoids
confusion with the truncated
representation YYMMDD (still often
used).
I prefer the YYYYMMDD format, but I think that's because I only knew about that to start with, and to me it seems more universal, having done away with characters that might be considered locale specific.
Personally, I always use yyyy-MM-dd. This also makes it sortable as a string.
However, a date is a date is a date. There's no need to change the date to a string. In .NET, use DateTime.
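If text in either ISO form does have to be accepted somewhere, a sketch like the following turns both into the same DateTime. IsoDates is just an illustrative name, and only the two complete calendar date forms quoted above are covered.

```csharp
using System;
using System.Globalization;

public static class IsoDates
{
    // The two complete calendar date forms that ISO 8601 allows.
    private static readonly string[] Formats = { "yyyy-MM-dd", "yyyyMMdd" };

    // Throws FormatException for anything that matches neither form.
    public static DateTime Parse(string text)
    {
        return DateTime.ParseExact(text, Formats, CultureInfo.InvariantCulture, DateTimeStyles.None);
    }
}
```

Because both forms put the year first and zero-pad every part, comparing such strings ordinally gives the same order as comparing the dates, which is the "sortable as a string" property mentioned above.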
I am using an OleDbConnection to query an Excel 2007 spreadsheet. I want to force the OleDbDataReader to use only string as the column datatype.
The system is looking at the first 8 rows of data and inferring the data type to be Double. The problem is that on row 9 I have a string in that column and the OleDbDataReader is returning a Null value since it could not be cast to a Double.
I have used these connection strings:
Provider=Microsoft.ACE.OLEDB.12.0;Data Source="ExcelFile.xlsx";Persist Security Info=False;Extended Properties="Excel 12.0;IMEX=1;HDR=No"
Provider=Microsoft.Jet.OLEDB.4.0;Data Source="ExcelFile.xlsx";Persist Security Info=False;Extended Properties="Excel 8.0;HDR=No;IMEX=1"
Looking at reader.GetSchemaTable().Rows[7].ItemArray[5], its DataType is Double.
Row 7 in this schema correlates with the specific column in Excel I am having issues with. ItemArray[5] is its DataType column
Is it possible to create a custom TableSchema for the reader so when accessing the ExcelFiles, I can treat all cells as text instead of letting the system attempt to infer the datatype?
I found some good info at this page: Tips for reading Excel spreadsheets using ADO.NET
The main quirk about the ADO.NET interface is how datatypes are handled. (You'll notice I've been carefully avoiding the question of which datatypes are returned when reading the spreadsheet.) Are you ready for this? ADO.NET scans the first 8 rows of data, and based on that guesses the datatype for each column. Then it attempts to coerce all data from that column to that datatype, returning NULL whenever the coercion fails!
Thank you,
Keith
Here is a reduced version of my code:
using (OleDbConnection connection = new OleDbConnection(BuildConnectionString(dataMapper).ToString()))
{
connection.Open();
using (OleDbCommand cmd = new OleDbCommand())
{
cmd.Connection = connection;
cmd.CommandText = "SELECT * FROM [Sheet1$]";
using (OleDbDataReader reader = cmd.ExecuteReader())
{
using (DataTable dataTable = new DataTable("TestTable"))
{
dataTable.Load(reader);
base.SourceDataSet.Tables.Add(dataTable);
}
}
}
}
As you have discovered, OLEDB uses Jet which is limited in the manner in which it can be tweaked. If you are set on using an OleDbConnection to read from an Excel file, then you need to set the HKLM\...\Microsoft\Jet\4.0\Engines\Excel\TypeGuessRows value to zero so that the system will scan the entire resultset.
That said, if you are open to using an alternative engine to read from an Excel file, you might consider trying the ExcelDataReader. It reads all columns as strings but will let you use dataReader.Getxxx methods to get typed values. Here's a sample that fills a DataSet:
DataSet result;
const string path = @"....\Test.xlsx";
using ( var fileStream = new FileStream( path, FileMode.Open, FileAccess.Read ) )
{
using ( var excelReader = ExcelReaderFactory.CreateOpenXmlReader( fileStream ) )
{
excelReader.IsFirstRowAsColumnNames = true;
result = excelReader.AsDataSet();
}
}
Note for 64bit OS it is here:
My Computer\HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Excel
Check out the final answer on this page.
Just noticed the page you refer to says the same thing ...
Update:
The problem seems to be with the JET engine itself and not ADO. Once JET decides on the type, it sticks to it. Anything done after that has no effect; like casting the values to string in the SQL (e.g. Cstr([Column])) just results in an empty string being returned.
At this point (if there are no other answers) I'd opt for other methods: modifying the spreadsheet; modifying the registry (not ideal since you will be messing with the settings for every other app that uses JET); Excel automation; or a third party component that does not use JET.
If the Automation option is too slow, then maybe just use it to save the spreadsheet in a different format which is easier to handle.
I have faced the same issue and determined that this is something that many people commonly experience. Here are a number of solutions that have been suggested, many of which I have attempted to implement:
Add the following to your connection string(Source):
TypeGuessRows=0;ImportMixedTypes=Text
Add the following to your connection string(Source, More Discussion, Even More):
IMEX=1;HDR=NO;
Edit the following registry settings, disable "TypeGuessRows", and "ImportMixedTypes" set to "Text"(Source, Not Recommended, More Documentation):
Hkey_Local_Machine/Software/Microsoft/Jet/4.0/Engines/Excel/TypeGuessRows
Hkey_Local_Machine/Software/Microsoft/Jet/4.0/Engines/Excel/ImportMixedTypes
Consider using an alternative library for reading the excel file:
EPPlus
ExcelDataReader (also suggested by @Thomas)
OpenXml
Format all data in the source file as Text(at least the first 8 rows), though I understand that's typically impractical(Source, though this is relation to SSIS, but it's the same concepts)
Use a Schema.ini file to define the data type before importing the file, I found this in relation to using "Jet.OleDb" directly, maybe requiring you to modifying your connection string. This may only be applicable to CSV's I have not tried this approach.(Source, Related Post)
None of these have worked for me (though I believe they have worked for others). I share the opinion expressed by @Asher that there is really no good solution to this problem. In my software I simply display an error message to the user (if any required column contains empty values) instructing them to format all columns as "Text".
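The check I describe is nothing fancy. A sketch of the idea, assuming the sheet has already been loaded into a DataTable and that you know which column names are required (the class and method names here are made up):

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class ImportChecks
{
    // Returns the names of required columns that contain at least one null or empty cell,
    // which is how a failed type coercion shows up after the load.
    public static List<string> ColumnsWithMissingValues(DataTable table, IEnumerable<string> requiredColumns)
    {
        return requiredColumns
            .Where(name => table.Rows.Cast<DataRow>()
                .Any(row => row.IsNull(name) || string.IsNullOrEmpty(Convert.ToString(row[name]))))
            .ToList();
    }
}
```

If the returned list is not empty, the user gets the message about formatting those columns as "Text". It cannot tell a genuinely blank cell from one that was nulled out by the type guess, so it only makes sense for columns that must always be filled.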
Honestly, I think this book is more applicable to the situation. The issue, already stated multiple times, is:
"The data type at the destination is varchar but the assumed data
type of "double" nullifies any data that doesn't fit."(Source)
"But the problem is actually with the OLEDBDataReader. The problem
is that if it sees mostly numbers in a column, it assumes everything
is a number - if a row item being read is not a number, it simply
sets it to null! Ouch!"(Source)
"The problem seems to be with the JET engine itself and not ADO. Once
JET decides on the type, it sticks to it."(@Asher)
While I haven't found any of this documented in an official capacity I think that it's very clear that this is an intentional design decision and simply how the Jet Database Library works. I hesitate to call this library entirely useless because I think for many people some of these solutions do work, but so far for my project, I have come to the conclusion that this library cannot read multiple data types in a single column and is ill suited for general data retrieval.