How to use FETCH in OleDb query? - c#

I have an .xlsx table like this:
Name          SubDatasetCount  Parameter1  Parameter2  ParameterX ...
Dataset1
    SubDataset1
    SubDataset2
    SubDatasetX
Dataset2
    SubDataset1
    SubDataset2
    SubDatasetX
...
My goal is to load any Dataset's Parameters and all of its SubDatasets.
The xlsx format and the reading method are given. At the moment I read Dataset1's SubDatasetCount and then try to run the following SQL query with an OleDbDataReader:
SELECT *
FROM ["SheetName"$]
WHERE Name LIKE '%DatasetName%'
FETCH NEXT [SubDatasetCount] ROWS ONLY
This causes an OleDbException: 'IErrorInfo.GetDescription failed with E_FAIL(0x80004005).'. Before I added FETCH, the query worked fine. I have no SQL knowledge; I copied it from here: How to select next rows from database in C#?
The linked answer states that ORDER BY is a MUST, but obviously I cannot reorder the rows, since the SubDatasets have to stay grouped under their Dataset.
Even when I tested the following query, the error was the same:
SELECT *
FROM ["SheetName"$]
WHERE Name LIKE '%DatasetName%'
ORDER BY Name
FETCH NEXT 10 ROWS ONLY
It works when I remove FETCH and leave ORDER BY. A quick study of that specific error always yields the same result: a reserved keyword was used in the query. But I don't see anything like that in the FETCH part of the query.
How do I make FETCH work?
And in case FETCH can be fixed somehow, how do I satisfy the ORDER BY requirement? ORDER BY (SELECT NULL) causes an exception.
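If it helps: FETCH NEXT ... ROWS ONLY is T-SQL (SQL Server 2012+) syntax, and as far as I know the Jet/ACE SQL dialect behind the OLE DB Excel provider does not support it (its only row limiter is TOP, which cannot take a parameter). A sketch of a client-side workaround instead; the connection string and the variables are assumptions based on the question:

// using System.Data.OleDb;
using (var conn = new OleDbConnection(connectionString))
using (var cmd = new OleDbCommand(
    "SELECT * FROM [SheetName$] WHERE Name LIKE ?", conn))
{
    // OleDb uses positional parameters; % wildcards work with the ACE provider.
    cmd.Parameters.AddWithValue("?", "%" + datasetName + "%");
    conn.Open();
    using (var reader = cmd.ExecuteReader())
    {
        int taken = 0;
        // Stop after SubDatasetCount rows instead of using FETCH in SQL.
        while (taken < subDatasetCount && reader.Read())
        {
            // process reader[...] here
            taken++;
        }
    }
}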

Related

System.IndexOutOfRangeException on SqlDataReader.get_Item() only when using alias

I have a SQL query that I am creating and running through C# code using SqlDataReader. The query is quite simple; it amounts to:
SELECT colName1 AS altColName1, colName2 AS altColName2 FROM table.
When I run the query in SQL Server Management Studio it works and gives the expected results.
Additionally, when I run the simpler query:
SELECT colName1, colName2 FROM table
using the SqlDataReader it works fine, except obviously I don't get the aliases.
The problem is, when using SqlDataReader and giving it the first query, I get:
System.IndexOutOfRangeException -- colName1
I'm perplexed since obviously that index works fine without the alias attached. Am I doing something wrong? Or is there some workaround I can use to get the query to work with the alias?
edit: I got it to work correctly by changing the query to:
SELECT DISTINCT colName1 AS altColName1, colName2 AS altColName2 FROM table
though I don't understand exactly why this works and the original did not.
You get the System.IndexOutOfRangeException for colName1 because, when the SELECT uses aliases, the correct name to use is altColName1.
So you either have to stick with the original column name or change the name your code passes to SqlDataReader.get_Item().
The SqlDataReader.Item property gets the value of the specified column in its native format, given the column name. So somewhere in your code there is probably a call like yourSqlDataReader["colName1"], which doesn't work once you rename the column to altColName1.
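A minimal sketch of the fix; the command and the column names are the placeholders from the question:

// using System.Data.SqlClient;
using (var reader = cmd.ExecuteReader())
{
    while (reader.Read())
    {
        // The result set only knows the aliases, not the original column names.
        string value1 = (string)reader["altColName1"];
        string value2 = (string)reader["altColName2"];
    }
}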

SQL Server - Best practice to circumvent large IN (...) clause (>40000 items)

I'm developing an ASP.NET app that analyzes Excel files uploaded by users. The files contain various data about customers (one row = one customer); the key field is CustomerCode. Basically the data comes in the form of a DataTable object.
At some point I need to get information about the specified customers from SQL and compare it to what user uploaded. I'm doing it the following way:
Make a comma-separated list of customers from CustomerCode column: 'Customer1','Customer2',...'CustomerN'.
Pass this string to SQL query IN (...) clause and execute it.
This was working okay until I ran into the 'The query processor ran out of internal resources and could not produce a query plan' exception when trying to pass ~40000 items inside the IN (...) clause.
The trivial way seems to be:
Replace IN (...) with = 'SomeCustomerCode' in the query template.
Execute this query 40000 times, once per CustomerCode.
Do DataTable.Merge 40000 times.
Is there any better way to work this problem around?
Note: I can't do IN (SELECT CustomerCode FROM ... WHERE SomeConditions) because the data comes from Excel files and thus cannot be queried from DB.
"Table valued parameters" would be worth investigating, which let you pass in (usually via a DataTable on the C# side) multiple rows - the downside is that you need to formally declare and name the data shape on the SQL server first.
Alternatively, though: you could use SqlBulkCopy to throw the rows into a staging table, and then just JOIN to that table. If you have parallel callers, you will need some kind of session identifier on the row to distinguish between concurrent uses (and: don't forget to remove your session's data afterwards).
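A sketch of the SqlBulkCopy route; the staging table, its columns, and the stagingRows DataTable are assumptions:

// using System; using System.Data; using System.Data.SqlClient;
// Assumed staging table: dbo.CustomerCodeStaging(SessionId uniqueidentifier, CustomerCode nvarchar(50))
var sessionId = Guid.NewGuid();
using (var conn = new SqlConnection(connectionString))
{
    conn.Open();
    using (var bulk = new SqlBulkCopy(conn))
    {
        bulk.DestinationTableName = "dbo.CustomerCodeStaging";
        bulk.WriteToServer(stagingRows); // DataTable with SessionId and CustomerCode columns
    }
    // JOIN dbo.CustomerCodeStaging (filtered by SessionId) against the customer table,
    // then DELETE this session's staging rows afterwards.
}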
You shouldn't process too many records at once, both because of errors like the one you mentioned and because such a big batch takes too long to run and nothing can happen in parallel. You shouldn't process only one record at a time either, because then the overhead of the SQL Server round trips becomes too big. Choose something in the middle and process, e.g., 10000 records at a time. You can even parallelize the processing: start running the SQL for the next 10000 in the background while you are still processing the previous batch.
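A sketch of that batching; the batch size and the inline quote escaping are assumptions to tune and harden for real use:

// using System.Linq; allCodes is the list of CustomerCodes from the upload
const int batchSize = 10000; // tuning assumption
for (int i = 0; i < allCodes.Count; i += batchSize)
{
    var batch = allCodes.Skip(i).Take(batchSize)
                        .Select(c => "'" + c.Replace("'", "''") + "'");
    string sql = "SELECT * FROM Customers WHERE CustomerCode IN ("
               + string.Join(",", batch) + ")";
    // run sql, then DataTable.Merge the result into the accumulated table
}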

I want to display the missing (non-matching) records

Is there a way to program the following SQL query
SELECT dbo.Assets_Master.Serial_Number, dbo.Assets_Master.Account_Ident, dbo.Assets_Master.Disposition_Ident
FROM dbo.Assets_Master LEFT OUTER JOIN
dbo.Assets ON dbo.Assets_Master.Serial_Number = dbo.Assets.Serial_Number
WHERE (dbo.Assets.Serial_Number IS NULL)
in c# .net code using dataviews or data relation or something else?
I have a spreadsheet of about 4k rows and a data table that should have the same records but if not I want to display the missing (non-matching) records from the table.
Thanks,
Eric
If you've already got that query, you can just pass that text as a SQL command and pull back the results as a dataset. Better might be setting up your query as a stored procedure and then following the same steps (calling a stored proc is cleaner than writing the SQL by hand).
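A minimal sketch of that first option, with the connection string, query text, and table name as placeholders:

// using System.Data; using System.Data.SqlClient;
using (var conn = new SqlConnection(connectionString))
using (var adapter = new SqlDataAdapter(missingRecordsQuery, conn))
{
    var results = new DataSet();
    adapter.Fill(results, "MissingAssets"); // Fill opens and closes the connection itself
    // bind results.Tables["MissingAssets"] to a grid, etc.
}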
If you want a way to do it without SQL at all, you could use LINQ to grab an IEnumerable of your ASSETS_MASTER serial numbers and another IEnumerable of your ASSETS records. Then something like:
foreach (Asset asset in assets)
{
    if (!assetsMasterSerialNumbers.Contains(asset.SerialNumber))
    {
        // do whatever with the non-matching record
    }
}
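Note that the original query goes the other way (Assets_Master rows with no matching Assets row); LINQ's Except expresses that directly, assuming both sides are projected to IEnumerable<string> of serial numbers:

// using System.Collections.Generic; using System.Linq;
// Serial numbers present in Assets_Master but absent from Assets,
// mirroring LEFT OUTER JOIN ... WHERE Assets.Serial_Number IS NULL.
IEnumerable<string> missing = masterSerialNumbers.Except(assetSerialNumbers);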

how to implement oracle -> oracle conversion/refresher program in C# / ADO.NET 2.0

When the program runs the first time it just gets some fields from a source database table, say:
SELECT NUMBER, COLOR, USETYPE, ROOFMATERIALCODE FROM HOUSE; -- NUMBER is the unique key
Then it does some in-memory processing, say converting USETYPE and ROOFMATERIALCODE to the destination database format (by using a cross-reference table).
Then the program inserts all the rows into the destination database:
INSERT INTO BUILDING (BUILDINGID, BUILDINGNUMBER, COLOR, BUILDINGTYPE, ROOFMAT)
VALUES (PROGRAM_GENERATED_ID, NUMBER_FROM_HOUSE, COLOR_FROM_HOUSE,
CONVERTED_USETYPE_FROM_HOUSE, CONVERTED_ROOFMATERIALCODE_FROM_HOUSE);
The above is naturally not real SQL, but you get the idea (the values with underscores just describe the data being inserted).
On subsequent runs the program should do the same, except:
insert only the rows not found in the target database.
update only the rows whose color, usetype, or roofmaterialcode has changed.
My question is: how do I implement this in an efficient way?
-Do I first populate a DataSet and convert the fields to the destination format?
-If I use only one DataSet, how do I give the destination DB its BUILDING_IDs (can I add columns to an already-populated DataSet)?
-How do I efficiently check whether destination rows need a refresh (if I select them one at a time by BUILDING_NUMBER and check all fields, it's going to be slow)?
Thanks for your answers!
-matti
If you are using Oracle, have you looked at the MERGE statement? You give the MERGE statement a matching condition: if a record matches, it performs an UPDATE; if it doesn't match (it isn't already in the table), it performs an INSERT. That might be helpful for what you are trying to do.
Here is the spec/example of merge.
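A sketch of what that could look like for the tables above, assuming HOUSE is reachable from the destination connection (e.g. over a database link) and that a sequence named BUILDING_SEQ generates the IDs; the column names mirror the question's pseudo-SQL, and the converted values would really come from the cross-reference lookup rather than straight from HOUSE:

MERGE INTO BUILDING b
USING (SELECT NUMBER, COLOR, USETYPE, ROOFMATERIALCODE FROM HOUSE) h
   ON (b.BUILDINGNUMBER = h.NUMBER)
WHEN MATCHED THEN UPDATE SET
     b.COLOR        = h.COLOR,
     b.BUILDINGTYPE = h.USETYPE,          -- converted value in practice
     b.ROOFMAT      = h.ROOFMATERIALCODE  -- converted value in practice
WHEN NOT MATCHED THEN INSERT
     (BUILDINGID, BUILDINGNUMBER, COLOR, BUILDINGTYPE, ROOFMAT)
     VALUES (BUILDING_SEQ.NEXTVAL, h.NUMBER, h.COLOR, h.USETYPE, h.ROOFMATERIALCODE);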

knowing if a string will be truncated when updating database

I'm working on software that takes a CSV file and puts the data into SQL Server. I'm testing it with bad data now, and when I make a data string too long (in a line) to be imported into the database I get the error: 'String or binary data would be truncated. The statement has been terminated.' That's normal and what I should expect. Now I want to detect those errors before the update to the database. Is there any clever way to detect this?
The way my software works is that I import every line into a DataSet and then show the user the data that will be imported. Then he can click a button to do the actual update; I then call dataAdapter.Update(dataSet, "something") to push the changes to the database.
The problem is that the bad row terminates the whole update and reports the error. So I want to detect the error before I run the update against the server, so that the other rows still get inserted.
thanks
You will have to check the columns of each row: see if one exceeds the maximum length specified in the database, and if so, exclude that row from being inserted.
A different solution would be to explicitly truncate the data and insert the truncated content, which could be done by using SubString.
The only way that I know of is to pre-check the information schema for the character limit:
SELECT Column_Name, Character_Maximum_Length
FROM Information_Schema.Columns
WHERE Table_Name = 'YourTableName'
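A sketch of that pre-check in C#; the connection string, the table name, and the assumption that the DataSet's column names match the database's are all placeholders:

// using System.Collections.Generic; using System.Data; using System.Data.SqlClient;
var maxLengths = new Dictionary<string, int>();
using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(
    "SELECT Column_Name, Character_Maximum_Length " +
    "FROM Information_Schema.Columns WHERE Table_Name = 'YourTableName'", conn))
{
    conn.Open();
    using (var reader = cmd.ExecuteReader())
        while (reader.Read())
            if (!reader.IsDBNull(1)) // non-character columns report NULL here
                maxLengths[reader.GetString(0)] = reader.GetInt32(1);
}

// Flag offending rows before calling dataAdapter.Update; -1 means varchar(max).
foreach (DataRow row in dataSet.Tables["something"].Rows)
    foreach (var pair in maxLengths)
    {
        if (!row.Table.Columns.Contains(pair.Key)) continue;
        var text = row[pair.Key] as string;
        if (pair.Value > 0 && text != null && text.Length > pair.Value)
            row.RowError = pair.Key + " would be truncated to " + pair.Value + " characters";
    }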
What you need is the column metadata.
MSDN: SqlConnection.GetSchema Method
Or, if you have opened a recordset on your database, another solution would be to browse the Field object and use its length to truncate the string. For example, with ADO recordset/VB code, you could have something like this:
myRecordset.Fields(myField) = Left(myString, myRecordset.Fields(myField).DefinedSize)
