SQL query treating an int as a string - issues? - c#

If I do a query like this
SELECT * from Foo where Bar = '42'
and Bar is an int column, will that string value be optimized to 42 in the DB engine? Will it have some kind of impact if I leave it as it is instead of changing it to:
Select * from Foo where Bar = 42
This is done on a SQL Compact database, if that makes a difference.
I know it's not the correct way to do it, but it's a big pain going through all the code, looking at every query and the DB schema, to see whether each column is an int type or not.

SQL Server automatically converts it to INT, because INT has a higher data type precedence than VARCHAR.
You should also be aware of the impact that implicit conversions can
have on a query’s performance. To demonstrate what I mean, I’ve created and populated the following table in the AdventureWorks2008 database:
USE AdventureWorks2008;
IF OBJECT_ID ('ProductInfo', 'U') IS NOT NULL
DROP TABLE ProductInfo;
CREATE TABLE ProductInfo
(
ProductID NVARCHAR(10) NOT NULL PRIMARY KEY,
ProductName NVARCHAR(50) NOT NULL
);
INSERT INTO ProductInfo
SELECT ProductID, Name
FROM Production.Product;
As you can see, the table includes a primary key configured with the
NVARCHAR data type. Because the ProductID column is the primary key,
it will automatically be configured with a clustered index. Next, I
set the statistics IO to on so I can view information about disk
activity:
SET STATISTICS IO ON;
Then I run the following SELECT statement to retrieve product
information for product 350:
SELECT ProductID, ProductName
FROM ProductInfo
WHERE ProductID = 350;
Because statistics IO is turned on, my results include the following
information:
Table 'ProductInfo'. Scan count 1, logical reads 6, physical reads 0,
read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob
read-ahead reads 0.
Two important items to notice are that the query performed a scan and
that it took six logical reads to retrieve the data. Because my WHERE
clause specified a value in the primary key column as part of the
search condition, I would have expected an index seek to be performed,
rather than a scan. The execution plan confirms that the database engine performed a scan rather than a seek; hovering the mouse over the scan icon shows the details of that scan.
Notice that in the Predicate section, the CONVERT_IMPLICIT function is
being used to convert the values in the ProductID column in order to
compare them to the value of 350 (represented by @1) I passed into the
WHERE clause. The data is being implicitly converted
because I passed the 350 in as an integer value, not a string
value, so SQL Server is converting all the ProductID values to
integers in order to perform the comparisons.
Because there are relatively few rows in the ProductInfo table,
performance is not much of a consideration in this instance. But if
your table contains millions of rows, you’re talking about a serious
hit on performance. The way to get around this, of course, is to pass
in the 350 argument as a string, as I’ve done in the following
example:
SELECT ProductID, ProductName
FROM ProductInfo
WHERE ProductID = '350';
Once again, the statement returns the product information along with the statistics IO data. This time the index is being properly used to locate the record, and the execution plan shows that the values in the ProductID
column are no longer being implicitly converted before being compared
to the 350 specified in the search condition.
As this example demonstrates, you need to be aware of how performance
can be affected by implicit conversions, just as you need to be
aware of any other implicit conversions being conducted by the
database engine. For that reason, you'll often want to explicitly
convert your data so you can control the impact of that conversion.
You can read more about this in Data Conversion in SQL Server.
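One client-side way to control the conversion, sticking with the ProductInfo example above: convert the value in C# and send a parameter typed to match the NVARCHAR column, so the indexed column itself is never wrapped in CONVERT_IMPLICIT. A minimal sketch (the connection string is a placeholder):
// using System; using System.Data; using System.Data.SqlClient;
int productId = 350;
using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(
    "SELECT ProductID, ProductName FROM ProductInfo WHERE ProductID = @ProductID", conn))
{
    // Convert explicitly on the client and match the column's type (NVARCHAR(10)),
    // so SQL Server can seek the index instead of converting every ProductID value.
    cmd.Parameters.Add("@ProductID", SqlDbType.NVarChar, 10).Value = productId.ToString();
    conn.Open();
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // process the row
        }
    }
}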

If you look at the MSDN chart covering implicit conversions, you will find that a string is implicitly converted to int.

Both should work in your case, but the norm is to use quotes anyway, because while this works:
Select * from Foo where Bar = 42
this does not:
Select * from Foo where Bar = %42%
and this will:
SELECT * from Foo where Bar = '%42%'
P.S.: you should look at Entity Framework and LINQ queries anyway; they make this kind of thing simpler...
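For instance, with LINQ (either LINQ to SQL or Entity Framework) the comparison is written against a typed property, so the int-versus-string question never arises. A minimal sketch with hypothetical context and entity names:
// The generated SQL uses a typed int parameter; no quoting decision needed.
using (var db = new MyDataContext())
{
    var matches = db.Foos.Where(f => f.Bar == 42).ToList();
}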

If I am not mistaken, SQL Server will read it as INT if the string contains only numbers (is numeric) and you're comparing it to an INTEGER column, but if the string is alphanumeric, then you will encounter an error or get an unexpected result.
My suggestion is: in the WHERE clause, if you are comparing against an integer column, do not put single quotes. That is the best practice to avoid errors and unexpected results.

You should always use parameters when executing SQL from code, to avoid security flaws (e.g. SQL injection).
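A minimal ADO.NET sketch (the connection string is a placeholder; for SQL Compact, use SqlCeConnection/SqlCeCommand from System.Data.SqlServerCe the same way):
// using System.Data.SqlClient;
using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("SELECT * FROM Foo WHERE Bar = @bar", conn))
{
    // The parameter carries its .NET type (int), so the literal-vs-string
    // question never arises and the value cannot inject SQL.
    cmd.Parameters.AddWithValue("@bar", 42);
    conn.Open();
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // process the row
        }
    }
}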

Related

T-SQL enum number or string

I have a table Drivers with columns Id, Name, Status. In C# I have an enum for driver status
public enum DriverStatus
{
Online = 0,
Offline = 1,
Busy = 2,
SoonFree = 3
}
Currently in the database, I use a varchar data type for the Status column and this means I have records like:
1 John Online
2 Elsy Offline
This seems bad, and I think the Status column type needs to be changed to tinyint, because:
T-SQL tinyint is only one byte in size, with a range of 0-255.
Currently it is not possible to sort properly by the Status column: because it is varchar, it sorts in alphabetical order, not by enum priority.
If I rename a DriverStatus enum value, I also need to update the database to keep it consistent.
When I asked others why we use varchar for enum columns, the only reason given was that it is easier to debug, since you see text rather than a number like 0 or 3. Are there any really good reasons to store enums as strings in the database?
It is absolutely better to use a Lookup Table for enum values.
Advantages:
It usually takes less room in the database.
Renaming the display value is very easy.
Globalization is possible.
It is easy to retire values that are no longer used.
My lookup tables always contain three fields: [the ID/Primary Key], Name, and Enabled
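A minimal sketch of the C# side under that scheme (the entity shape and the Status key property are hypothetical; the Drivers.Status column holds the lookup table's tinyint key):
// Mirrors a lookup table with Id/Name/Enabled columns.
public class DriverStatusLookup
{
    public byte Id { get; set; }      // primary key; matches the enum's numeric value
    public string Name { get; set; }  // display text; renaming it touches one row
    public bool Enabled { get; set; } // lets you retire values without deleting them
}

// Reading and writing the Drivers.Status key:
DriverStatus status = (DriverStatus)driver.Status; // tinyint -> enum
driver.Status = (byte)DriverStatus.Busy;           // enum -> tinyint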

How can I make SQL Server 2012 truncate insertions if they are too big?

So I have a table with a column of type VARCHAR(100), and I'm wondering if there's a way to configure SQL Server 2012 (T-SQL) so that if a transaction tries to submit a string of 101+ characters, it takes the first 100.
Is this possible, or should I be doing the truncation on the C# side of things?
Normally, SQL Server will present an error on any attempt to insert more data into a field than it can hold:
String or binary data would be truncated. The statement has been terminated.
SQL Server will not permit a silent truncation of data just because the column is too small to accept the data. But there are other ways that SQL Server can truncate data that is about to be inserted into a table that will not generate any form of error or warning.
By default, ANSI_WARNINGS are turned on, and certain activities such as creating indexes on computed columns or indexed views require that they be turned on. But if they are turned off, SQL Server will truncate the data as needed to make it fit into the column. The ANSI_WARNINGS setting for a session can be controlled by
SET ANSI_WARNINGS { ON|OFF }
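For instance, a minimal C# sketch of the difference (the connection string is a placeholder; a temp table keeps the demo self-contained):
// using System; using System.Data.SqlClient;
using (var conn = new SqlConnection(connectionString))
using (var cmd = conn.CreateCommand())
{
    conn.Open();
    cmd.CommandText = "CREATE TABLE #T (c varchar(5));";
    cmd.ExecuteNonQuery();

    // With ANSI_WARNINGS OFF, the oversized value is silently cut to 'This '.
    cmd.CommandText = "SET ANSI_WARNINGS OFF; INSERT INTO #T VALUES ('This is a long string');";
    cmd.ExecuteNonQuery();

    // With ANSI_WARNINGS ON (the default), the same insert raises the
    // "String or binary data would be truncated" error.
    cmd.CommandText = "SET ANSI_WARNINGS ON; INSERT INTO #T VALUES ('This is a long string');";
    try { cmd.ExecuteNonQuery(); }
    catch (SqlException ex) { Console.WriteLine(ex.Message); }
}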
Unlike with an insert into a table, SQL Server will quietly cut off data that is being assigned to a variable, regardless of the status of ANSI_WARNINGS. For instance:
declare @smallString varchar(5)
declare @testint int
set @smallString = 'This is a long string'
set @testint = 123.456
print @smallString
print @testint
The results are:
This
123
This can occasionally show itself in subtle ways since passing a value into a stored procedure or function assigns it to the parameter variables and will quietly do a conversion. One method that can help guard against this situation is to give any parameter that will be directly inserted into a table a larger datatype than the target column so that SQL Server will raise the error, or perhaps to then check the length of the parameter and have custom code to handle it when it is too long.
For instance, if a stored procedure will use a parameter to insert data into a table with a column that is varchar(10), make the parameter varchar(15). Then, if the data that is passed in is too long for the column, it will roll back and raise a truncation error instead of silently truncating and inserting. Of course, that runs the risk of misleading anyone who looks at the stored procedure's header information without understanding what was done.
Source: Silent Truncation of SQL Server Data Inserts
Do this at the code level. When you are inserting, check the field length and Substring it:
string a = "string with more than 100 symbols";
if (a.Length > 100)
    a = a.Substring(0, 100);
After that you add a as a SQL parameter to the insert query.
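That last step might look like this (a sketch; the connection, table, and column names are placeholders):
// using System.Data; using System.Data.SqlClient;
using (var cmd = new SqlCommand(
    "INSERT INTO Table1 (YourColumn) VALUES (@value)", conn))
{
    // 'a' was already cut to 100 characters above, matching VARCHAR(100).
    cmd.Parameters.Add("@value", SqlDbType.VarChar, 100).Value = a;
    cmd.ExecuteNonQuery();
}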
The other way is to do it in the query, but again I don't advise that:
INSERT INTO Table1 (YourColumn) VALUES (LEFT(RTRIM(@stringMoreThan100symbols), 100))
LEFT cuts the string to 100 characters, and RTRIM trims trailing whitespace first.
My suggestion would be to make the application side responsible for validating the input before calling any DB operation.
SQL Server silently truncates any varchar you pass as a stored procedure parameter to the declared length of that parameter. So you could consider stored procedures for your requirements; the truncation will then be handled automatically.
If you have entity classes (not necessarily from EF), you can use the StringLength(your field length) attribute to handle this.
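For example (the entity name is hypothetical; note that by default the attribute validates and rejects over-long values rather than silently truncating them):
// using System.ComponentModel.DataAnnotations;
public class MyEntity
{
    [StringLength(100)] // validation fails if the value exceeds 100 characters
    public string MyField { get; set; }
}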

Size limit of varchar(MAX) in SQL Server

I have a row that contains a field defined as varchar(MAX). I'm confused about the limit of the field: in some places, I read that varchar(MAX) has a size limit of 8K and in other places it seems that the limit is 2GB.
I have a string that I want to save to a database; it's about 220K. I'm using linq-to-sql and when the write query submits to the database, the row gets written without any exceptions generated. However, when I open the database table in SSMS, the cell that should contain the long string is empty. Why is that and how do I take advantage of the 2GB limit that I read about?
(The question included a screenshot of the property in the linq-to-sql model; notably, the column is marked NOT NULL.)
All MAX data types--VARCHAR(MAX), NVARCHAR(MAX), and VARBINARY(MAX)--have a limit of 2 GB. There is nothing special you need to do. Without specifying MAX, the limit for VARCHAR and VARBINARY is 8,000 and the limit for NVARCHAR is 4,000 (due to NVARCHAR being double-byte). If you are not seeing any data come in at all, then something else is going on.
Are you sure that the column is even in the INSERT statement? If you submit test data of only 20 characters, does that get written? If you want to see what SQL is actually submitted by Linq, try running SQL Profiler and look at the SQL Statement: Statement Ended event, I believe.
Also, when you say that the "long string is empty", do you mean an actual empty string, or do you mean NULL? If it is not NULL, you can also wrap the field in a LEN() function to see if there are blanks or returns at the beginning that push any non-whitespace characters out of view. Meaning: SELECT LEN(stringField), * FROM Table. Another thing to try is to use "Results to Text" instead of "Results to Grid" (this is a Query option).
EDIT:
Seeing that the field is marked as NOT NULL, are you sure that you are setting the ClientFileJS property of your object correctly? Is it possible that the empty string is due to that property being initialized as string ClientFileJS = ""; and is never updated?
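A quick way to rule that out (a sketch with hypothetical context/entity names around the ClientFileJS property from the question):
string longString = new string('x', 220000);
using (var db = new MyDataContext())
{
    var row = new ClientFile { ClientFileJS = longString }; // set before SubmitChanges
    db.ClientFiles.InsertOnSubmit(row);
    db.SubmitChanges();
}
// Then verify in SSMS: SELECT LEN(ClientFileJS) FROM ClientFile
// -- should report 220000, not 0 or NULL.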

Informix: How to get the rowid of the last insert statement

This is an extension of a question I asked before: C#: How do I get the ID number of the last row inserted using Informix
I am writing some code in C# to insert records into the Informix DB using the .NET Informix driver. I was able to get the id of the last insert, but in some of my tables the 'serial' attribute is not used. I was looking for a command similar to the following, but one that returns the rowid instead of the id.
SELECT DBINFO ('sqlca.sqlerrd1') FROM systables WHERE tabid = 1;
And yes, I do realize that working with the rowid is dangerous because it is not constant. However, I plan to make my application force the client apps to reset their data if the table is altered in a way that rearranges the rowids.
One problem with ROWID is that it is a 4-byte quantity, but the value used on a fragmented table is an 8-byte quantity (nominally FRAGID and ROWID), and Informix has never exposed the FRAGID.
In theory, the SQLCA data structure reports the ROWID in the sqlca.sqlerrd[5] element (assuming C-style indexing from 0; it is sqlca.sqlerrd[6] in Informix 4GL which indexes from 1). If anything was going to work with DBINFO, it would be DBINFO('sqlca.sqlerrd5'), but I get:
SQL -728: Unknown first argument of dbinfo(sqlca.sqlerrd5).
So, the indirect approach using DBINFO is not on. In ESQL/C, where sqlca is readily available, the information is available too:
SQL[739]: begin;
BEGIN WORK: Rows processed = 0
SQL[740]: create table p(q integer);
CREATE TABLE: Rows processed = 0
SQL[741]: insert into p values(1);
INSERT: Rows processed = 1, Last ROWID = 257
SQL[742]: select dbinfo('sqlca.sqlerrd5') from dual;
SQL -728: Unknown first argument of dbinfo(sqlca.sqlerrd5).
SQLSTATE: IX000 at /dev/stdin:4
SQL[743]:
I am not a user of C# or the .NET driver, so I have no knowledge of whether there is a back-door mechanism to get at the information. Even in ODBC, there might not be a front-door mechanism to get at it, but you could drop into C code to read the global data structure easily enough:
#include <sqlca.h>
#include <ifxtypes.h>
int4 get_sqlca_sqlerrd5(void)
{
return sqlca.sqlerrd[5];
}
Or, even:
int4 get_sqlca_sqlerrdN(int N)
{
if (N >= 0 && N <= 5)
return sqlca.sqlerrd[N];
else
return -22; /* errno 22 (EINVAL): Invalid argument */
}
If C# can access DLLs written in C, you could package that up.
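It can, via P/Invoke. A sketch, assuming the C function above is compiled into a native DLL (the DLL name is a placeholder, and this only helps if your data access actually goes through the same ESQL/C library instance that populates sqlca):
// using System.Runtime.InteropServices;
static class InformixSqlca
{
    // Wraps the C helper shown above; "ifxhelper.dll" is a hypothetical name.
    [DllImport("ifxhelper.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern int get_sqlca_sqlerrd5();
}

// Usage, immediately after the INSERT on the same connection/thread:
// int rowid = InformixSqlca.get_sqlca_sqlerrd5();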
Otherwise, the approved way of identifying rows of data is via the primary key (or any other unique identifier, sometimes known as an alternative key or candidate key) for the row. If you don't have a primary key or other unique identifier for the row, you are making life difficult for yourself. If it is a compound key, that 'works' but could be inconvenient. Maybe you need to consider adding a SERIAL column (or BIGSERIAL column) to the table.
You can use:
SELECT ROWID
FROM TargetTable
WHERE PK_Column1 = <value1> AND PK_Column2 = <value2>
or something similar to obtain the ROWID, assuming you can identify the row accurately.
In dire straits, there is a mechanism to add a physical ROWID column to a fragmented table (normally, it is a virtual column). You'd then use the query above. This is not recommended, but the option is there.

C# code and SQL Server performance

I have a SQL Server database designed like this :
TableParameter
Id (int, PRIMARY KEY, IDENTITY)
Name1 (string)
Name2 (string, can be null)
Name3 (string, can be null)
Name4 (string, can be null)
TableValue
Iteration (int)
IdTableParameter (int, FOREIGN KEY)
Type (string)
Value (decimal)
So, as you've just understood, TableValue is linked to TableParameter.
TableParameter is like a multidimensional dictionary.
TableParameter is expected to have a lot of rows (more than 300,000).
From my C# client program, I have to fill this database after each Compute() call:
for (int iteration = 0; iteration < 5000; iteration++)
{
Compute();
FillResultsInDatabase();
}
In the FillResultsInDatabase() method, I have to:
Check whether the label of my parameter already exists in TableParameter. If it doesn't exist, insert a new one.
Insert the value into TableValue.
Step 1 takes a long time! I load the whole TableParameter table into an IEnumerable property and then, for each parameter, I run a
.FirstOrDefault( x => x.Name1 == item.Name1 &&
x.Name2 == item.Name2 &&
x.Name3 == item.Name3 &&
x.Name4 == item.Name4 );
in order to detect whether it already exists (and, if so, to get the id).
Performance is very bad this way!
I've tried filtering with a WHERE clause instead, to avoid loading every row of TableParameter, but performance is even worse!
How can I improve the performance of step 1 ?
For step 2, performance is still bad with classic INSERTs. I am going to try SqlBulkCopy.
How can I improve the performance of step 2 ?
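A minimal SqlBulkCopy sketch for step 2 (the results collection and connection string are placeholders; column names follow the TableValue schema above):
// using System.Data; using System.Data.SqlClient;
var table = new DataTable();
table.Columns.Add("Iteration", typeof(int));
table.Columns.Add("IdTableParameter", typeof(int));
table.Columns.Add("Type", typeof(string));
table.Columns.Add("Value", typeof(decimal));

// Buffer a whole batch of rows, then send them in one round trip.
foreach (var r in results)
    table.Rows.Add(r.Iteration, r.IdTableParameter, r.Type, r.Value);

using (var bulk = new SqlBulkCopy(connectionString))
{
    bulk.DestinationTableName = "TableValue";
    bulk.WriteToServer(table);
}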
EDIT
I've tried with stored procedures:
CREATE PROCEDURE GetIdParameter
@Id int OUTPUT,
@Name1 nvarchar(50) = null,
@Name2 nvarchar(50) = null,
@Name3 nvarchar(50) = null
AS
SELECT TOP 1 @Id = Id FROM TableParameter
WHERE
TableParameter.Name1 = @Name1
AND
(@Name2 IS NULL OR TableParameter.Name2 = @Name2)
AND
(@Name3 IS NULL OR TableParameter.Name3 = @Name3)
GO
CREATE PROCEDURE CreateValue
@Iteration int,
@Type nvarchar(50),
@Value decimal(32, 18),
@Name1 nvarchar(50) = null,
@Name2 nvarchar(50) = null,
@Name3 nvarchar(50) = null
AS
DECLARE @IdParameter int
EXEC GetIdParameter @IdParameter OUTPUT,
@Name1, @Name2, @Name3
IF @IdParameter IS NULL
BEGIN
INSERT TableParameter (Name1, Name2, Name3)
VALUES
(@Name1, @Name2, @Name3)
SELECT @IdParameter = SCOPE_IDENTITY()
END
INSERT TableValue (Iteration, IdTableParameter, Type, Value)
VALUES
(@Iteration, @IdParameter, @Type, @Value)
GO
I still have the same performance... :-( (not acceptable)
If I understand what's happening, you're querying the database to see if the data is there in step 1. I'd use a single DB call to a stored procedure that inserts the data if it is not there. So just compute the results and pass them to the SP.
Can you compute the results first, and then insert in batches?
Does the Compute() function take data from the database? If so, can you turn the operation into a set-based operation and perform it on the server itself? Or at least part of it?
Remember that SQL Server is designed for large dataset operations.
Edit: reflecting comments
Since the code is slow on the data inserts, and you suspect that it's because the insert has to search back before it can be done, I'd suggest that you may need to place SQL indexes on the columns that you search on, in order to improve search speed.
However I have another idea.
Why don't you just insert the data without the check, and later, when you read the data, remove the duplicates in that query?
Given the fact that Name2-Name4 can be null, would it be possible to restructure the parameter table:
TableParameter
Id (int, PRIMARY KEY, IDENTITY)
Name (string)
Dimension (int)
Now you can index it and simplify the query. (WHERE Name = 'TheNameIWant' AND Dimension = 2)
(And speaking of indexes, do you have indexes on the name columns in the parameter table?)
Where do you do your commits on the inserts? If you commit per statement, group multiple inserts into one transaction, as in the sketch below.
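A sketch of that grouping (placeholder names; one commit covers the whole batch instead of one per row):
// using System.Data; using System.Data.SqlClient;
using (var conn = new SqlConnection(connectionString))
{
    conn.Open();
    using (var tx = conn.BeginTransaction())
    using (var cmd = new SqlCommand(
        "INSERT INTO TableValue (Iteration, IdTableParameter, Type, Value) " +
        "VALUES (@i, @id, @t, @v)", conn, tx))
    {
        cmd.Parameters.Add("@i", SqlDbType.Int);
        cmd.Parameters.Add("@id", SqlDbType.Int);
        cmd.Parameters.Add("@t", SqlDbType.NVarChar, 50);
        cmd.Parameters.Add("@v", SqlDbType.Decimal);
        foreach (var r in results) // 'results' is a placeholder collection
        {
            cmd.Parameters["@i"].Value = r.Iteration;
            cmd.Parameters["@id"].Value = r.IdTableParameter;
            cmd.Parameters["@t"].Value = r.Type;
            cmd.Parameters["@v"].Value = r.Value;
            cmd.ExecuteNonQuery();
        }
        tx.Commit(); // a single commit for the whole batch
    }
}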
If you are the only one inserting values, and speed is really of the essence, load all values from the database into memory and check there (see the dictionary sketch after this answer).
just some ideas
hth
Mario
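A sketch of that in-memory check (assuming the four name columns uniquely identify a parameter; 'db' and 'item' are placeholder names):
// Load the lookup once; Tuple<> gives value-based equality, including nulls.
var cache = new Dictionary<Tuple<string, string, string, string>, int>();
foreach (var p in db.TableParameters)
    cache[Tuple.Create(p.Name1, p.Name2, p.Name3, p.Name4)] = p.Id;

// O(1) lookup per item instead of a linear FirstOrDefault scan:
int id;
if (!cache.TryGetValue(Tuple.Create(item.Name1, item.Name2, item.Name3, item.Name4), out id))
{
    // Not found: insert a new TableParameter row, read SCOPE_IDENTITY(),
    // and add the new id to the cache for subsequent iterations.
}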
I must admit that I'm struggling to grasp the business process that you are trying to achieve here.
On initial review, it appears as if you are performing a data comparison within your application tier. I would advise against this and suggest that you let the database engine do what it is designed to do: manage and implement your data access.
As another poster has mentioned, I concur that you should look to create a Stored Procedure to handle your record insertion logic. The procedure can perform a simple check to see if your records already exist.
You should also consider:
Enforcing the insertion logic/rule by creating a Unique Constraint across the four name columns.
Creating a covering non-clustered index incorporating the four name columns.
With regard to the performance of your inserts, perhaps you can provide some metrics to qualify what it is that you are seeing and how you are measuring it?
To give you a yardstick, the current ETL insertion record for SQL Server is approximately 16 million rows per second. What sort of numbers are you expecting and wanting to see?
The fastest way (that I know of so far) is a bulk insert, but not just lines of INSERTs. Try INSERT + SELECT + UNION; it works pretty fast:
insert into myTable
select a1, b1, c1, ...
union select a2, b2, c2, ...
union select a3, b3, c3, ...
