Insert a row with a minimum amount of data in it using SQL - C#

I'm looking for a generic SQL script that can add rows to SQL tables where the added rows contain the minimum possible data: if a column is nullable the script should insert NULL; if not (NVARCHAR, for instance), the minimum piece of data, which for strings is an empty string ''.
Example:
DECLARE @Test TABLE (
ColumnA INT NULL,
ColumnB BIT NOT NULL,
ColumnC NVARCHAR(MAX) NOT NULL,
ColumnD BIT NULL
)
INSERT INTO @Test (ColumnA, ColumnB, ColumnC, ColumnD)
VALUES (NULL, 0, '', NULL)
This ^, but as a generic solution that would work for any table, if that is even possible.
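For what it's worth, here is a minimal sketch of one possible generic approach, built from sys.columns metadata and dynamic SQL (an assumption-laden sketch: it needs SQL Server 2017+ for STRING_AGG, the table name is a placeholder, and the CASE only covers a few type families):

DECLARE @table sysname = N'dbo.SomeTable';  -- hypothetical target table
DECLARE @cols nvarchar(max), @vals nvarchar(max), @sql nvarchar(max);

SELECT
    @cols = STRING_AGG(QUOTENAME(c.name), ', '),
    @vals = STRING_AGG(
        CASE
            WHEN c.is_nullable = 1 THEN N'NULL'
            WHEN t.name IN (N'nvarchar', N'varchar', N'nchar', N'char') THEN N''''''
            ELSE N'0'  -- fallback for numeric/bit columns
        END, N', ')
FROM sys.columns c
JOIN sys.types t ON t.user_type_id = c.user_type_id
WHERE c.object_id = OBJECT_ID(@table)
  AND c.is_identity = 0
  AND c.is_computed = 0;

SET @sql = N'INSERT INTO ' + @table + N' (' + @cols + N') VALUES (' + @vals + N');';
EXEC sp_executesql @sql;

Any type not covered by the CASE (dates, GUIDs, user-defined types) would need its own branch.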
Use case / reason:
Found an inconsistency between a model and its table in a DB. The table has a BIT NULL column while the respective property is of type bool, which cannot take nulls. Who knows how many of these are in the DB. I'd like to find such inconsistencies all at once by inserting such "minimum data" rows, trying to load them into the program's memory, and catching NullReferenceException or InvalidCastException.

Related

How to Auto-increment non-integer primary key in sql-server? [duplicate]

Can I make a primary key like 'c0001, c0002' and for supplier 's0001, s0002' in one table?
The idea in database design is to keep each data element separate, and each element has its own datatype, constraints, and rules. That c0002 is not one field, but two. Same with XXXnnn or whatever. It is incorrect, and it will severely limit your ability to use the data and to use database features and facilities.
Break it up into two discrete data items:
column_1 CHAR(1)
column_2 INTEGER
Then set AUTOINCREMENT on column_2
And yes, your Primary Key can be (column_1, column_2), so you have not lost whatever meaning c0002 has for you.
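A minimal T-SQL sketch of this (names are illustrative; AUTOINCREMENT corresponds to IDENTITY in SQL Server, and note that IDENTITY numbers run across the whole table, not per column_1 value):

CREATE TABLE dbo.Example (
    column_1 CHAR(1) NOT NULL,
    column_2 INT IDENTITY(1,1) NOT NULL,
    CONSTRAINT PK_Example PRIMARY KEY (column_1, column_2)
);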
Never place suppliers and customers (whatever "c" and "s" means) in the same table. If you do that, you will not have a database table, you will have a flat file. And various problems and limitations consequent to that.
That means, Normalise the data. You will end up with:
one table for Person or Organisation containing the common data (Name, Address...)
one table for Customer containing customer-specific data (CreditLimit...)
one table for Supplier containing supplier-specific data (PaymentTerms...)
no ambiguous or optional columns, therefore no Nulls
no limitations on use or SQL functions
And when you need to add columns, you do it only where it is required, without affecting all the other uses of the flat file. The scope of effect is limited to the scope of change.
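A rough T-SQL sketch of that structure (the column choices and the surrogate keys are illustrative assumptions only):

CREATE TABLE dbo.Person (
    PersonId int IDENTITY(1,1) PRIMARY KEY,
    Name     nvarchar(100) NOT NULL,
    Address  nvarchar(200) NOT NULL
);

CREATE TABLE dbo.Customer (
    PersonId    int NOT NULL PRIMARY KEY
                REFERENCES dbo.Person (PersonId),
    CreditLimit decimal(18, 2) NOT NULL
);

CREATE TABLE dbo.Supplier (
    PersonId     int NOT NULL PRIMARY KEY
                 REFERENCES dbo.Person (PersonId),
    PaymentTerms nvarchar(50) NOT NULL
);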
My approach would be:
create an ID INT IDENTITY column and use that as your primary key (it's unique, narrow, static - perfect)
if you really need an ID with a letter or something, create a computed column based on that ID INT IDENTITY
Try something like this:
CREATE TABLE dbo.Demo(ID INT IDENTITY PRIMARY KEY,
IDwithChar AS 'C' + RIGHT('000000' + CAST(ID AS VARCHAR(10)), 6) PERSISTED
)
This table would contain ID values 1, 2, 3, 4, ... and IDwithChar would be something like C000001, C000002, ..., C000042 and so forth.
With this, you have the best of both worlds:
a proper, perfectly suited primary key (and clustering key) on your table, ideally suited to be referenced from other tables
your character-based ID, properly defined, computed, always up to date.....
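For instance (hypothetical usage; the table has only the identity and the computed column, so DEFAULT VALUES suffices):

INSERT INTO dbo.Demo DEFAULT VALUES;   -- ID and IDwithChar are both generated
INSERT INTO dbo.Demo DEFAULT VALUES;

SELECT ID, IDwithChar FROM dbo.Demo;
-- 1    C000001
-- 2    C000002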
Actually, these are two different questions:
1. Can we use a varchar column as an auto-increment column with unique values, like roll numbers in a class?
ANS: Yes. You can get it with the piece of code below, without specifying values for ID and P_ID:
CREATE TABLE dbo.TestDemo
(ID INT IDENTITY(786,1) NOT NULL PRIMARY KEY CLUSTERED,
P_ID AS 'LFQ' + RIGHT('00000' + CAST(ID AS VARCHAR(5)), 5) PERSISTED,
Name varchar(50),
PhoneNumber varchar(50)
)
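For example (hypothetical values; the seed of 786 means the first row gets ID 786 and P_ID LFQ00786):

INSERT INTO dbo.TestDemo (Name, PhoneNumber)
VALUES ('John', '555-0100');

SELECT ID, P_ID, Name FROM dbo.TestDemo;
-- 786    LFQ00786    John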
2. Can we have two different increments in the same column?
ANS: No, you can't do that in one table.
I prefer artificial primary keys. Your requirements can also be implemented as a unique index on a computed column:
CREATE TABLE [dbo].[AutoInc](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Range] [varchar](50) NOT NULL,
[Descriptor] AS ([range]+CONVERT([varchar],[id],(0))) PERSISTED,
CONSTRAINT [PK_AutoInc] PRIMARY KEY ([ID] ASC)
)
GO
CREATE UNIQUE INDEX [UK_AutoInc] ON [dbo].[AutoInc]
(
[Descriptor] ASC
)
GO
Assigning domain meaning to the primary key is a practice that goes way, way back to the time when Cobol programmers and dinosaurs walked the earth together. The practice survives to this day most often in legacy inventory systems. It is mainly a way of eliminating one or more columns of data and embedding the data from the eliminated column(s) in the PK value.
If you want to store customer and supplier in the same table, just do it, and use an autoincrementing integer PK and add a column called ContactType or something similar, which can contain the values 'S' and 'C' or whatever. You do not need a composite primary key.
You can always concatenate these columns (PK and ContactType) on reports, e.g. C12345 or S20000 (casting the integer to a string), if you want to eliminate the column in order to save space (i.e. on the printed or displayed page), provided everyone in your organization understands the convention that the first character of the entity id stands for the ContactType code.
This approach will leverage autoincrementing capabilities that are built into the database engine, simplify your PK and related code in the data layer, and make your program and database more robust.
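A sketch of that design (table and column names here are just an assumption):

CREATE TABLE dbo.Contact (
    ContactId   int IDENTITY(1,1) PRIMARY KEY,
    ContactType char(1) NOT NULL
                CHECK (ContactType IN ('C', 'S')),  -- C = customer, S = supplier
    Name        nvarchar(100) NOT NULL
);

-- Report-side concatenation, e.g. C12345 or S20000:
SELECT ContactType + CAST(ContactId AS varchar(10)) AS DisplayId, Name
FROM dbo.Contact;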
First let us state that you can't do it directly. If you try
create table dbo.t1 (
id varchar(10) identity
);
the error message tells you which data types are supported directly.
Msg 2749, Level 16, State 2, Line 1
Identity column 'id' must be of data type int, bigint, smallint,
tinyint, or decimal or numeric with a scale of 0,
and constrained to be nonnullable.
BTW: I tried to find this information in BOL or on MSDN and failed.
Now knowing that you can't do it the direct way, it is a good choice to follow @marc_s's proposal using computed columns.
Instead of doing 'c0001, c0002' for customers and 's0001, s0002' for suppliers in one table, proceed in the following way:
Create one Auto-Increment field "id" of Data Type "int (10) unsigned".
Create another field "type" of Data Type "enum ('c', 's')" (where c=Customer, s=Supplier).
As "#PerformanceDBA" pointed out, you can then make the Primary Key Index for two fields "id" & "type", so that your requirement gets fulfilled with the correct methodology.
INSERT INTO [Yourtable] (yourvarcharID)
VALUES ('yourvarcharPrefix' + CAST(
    CAST(SUBSTRING((SELECT MAX(yourvarcharID) FROM [Yourtable]), 3, 6) AS int) + 1
    AS VARCHAR(20)))
Here the varchar column is prefixed with 'RX' followed by 001, so I take the substring after that prefix and increment just that number.
We can add a default constraint based on a user-defined function to the table definition to achieve this.
First, create the table:
create table temp_so (prikey varchar(100) primary key, name varchar(100))
go
Second, create a new user-defined function:
create function dbo.fn_AutoIncrementPriKey_so ()
returns varchar(100)
as
begin
declare @prikey varchar(100)
set @prikey = (select top (1) left(prikey,2) + cast(cast(stuff(prikey,1,2,'') as int)+1 as varchar(100)) from temp_so order by prikey desc)
return isnull(@prikey, 'SB3000')
end
go
Third, alter the table definition to add the default constraint:
alter table temp_so
add constraint df_temp_prikey
default dbo.[fn_AutoIncrementPriKey_so]() for prikey
go
Fourth, insert new rows into the table without specifying a value for the primary key column (GO 4 executes the batch four times):
insert into temp_so (name) values ('Rohit')
go 4
Check the data in the table now:
select * from temp_so
OUTPUT -
prikey name
SB3000 Rohit
SB3001 Rohit
SB3002 Rohit
SB3003 Rohit
You may try the code below:
SET @variable1 = SUBSTR((SELECT id FROM user WHERE id = (SELECT MAX(id) FROM user)), 5, 7) + 1;
SET @variable2 = CONCAT("LHPL", @variable1);
INSERT INTO `user`(`id`, `name`) VALUES (@variable2, "Jeet");
The first line gets the last inserted id, strips the four-character prefix, adds one to the remaining number, and assigns the result to @variable1.
The second line builds the complete id with the four-character prefix and assigns it to @variable2.
The insert then uses the generated primary key @variable2.
You need at least one existing row in the table for the SQL above to work.
No. If you really need this, you will have to generate the ID manually.

How to get the Identity values in SQL BULK COPY?

I have to get the IDENTITY values from a table after a SqlBulkCopy into that same table. The volume of data could be thousands of records.
Can someone help me out with this?
Disclaimer: I'm the owner of the project Bulk Operations
In short, this project overcomes SqlBulkCopy limitations by adding MUST-HAVE features like outputting inserted identity values.
Under the hood, it uses SqlBulkCopy and a method similar to @Mr Moose's answer.
var bulk = new BulkOperation(connection);
// Output Identity Value
bulk.ColumnMappings.Add("CustomerID", ColumnMappingDirectionType.Output);
// Map Column
bulk.ColumnMappings.Add("Code");
bulk.ColumnMappings.Add("Name");
bulk.ColumnMappings.Add("Email");
bulk.BulkInsert(dt);
EDIT: Answer comment
can I simply get an IList back? I see it's saved back in the customers table, but there is no variable where I can get hold of it. Can you please help with that, so I can insert into Orders.CustomerID?
It depends; you can keep a reference to the Customer DataRow, named CustomerRef, in the Order DataTable.
So once you have merged your customers, you can easily populate a CustomerID column from the CustomerRef column in your Order DataTable.
Here is an example of what I'm trying to say: https://dotnetfiddle.net/Hw5rf3
I've used a solution similar to the one from Marc Gravell below, in that it is useful to first import into a temp table.
I've also used MERGE and OUTPUT, as described by Jamie Thomson in this post, to match the data I inserted into my temp table with the id generated by the IDENTITY column of the table I want to insert into.
This is particularly useful when you need to use that ID as a foreign key reference to other tables you are populating.
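For reference, a sketch of that MERGE ... OUTPUT pattern (all table and column names here are assumptions; the trick is that MERGE, unlike a plain INSERT, can reference source columns in its OUTPUT clause):

CREATE TABLE #Staging (SourceKey varchar(50) NOT NULL, Data varchar(max));
-- ... SqlBulkCopy writes into #Staging here ...

DECLARE @IdMap TABLE (SourceKey varchar(50) NOT NULL, NewId int NOT NULL);

MERGE dbo.Target AS t               -- Target: (id int IDENTITY, Data varchar(max))
USING #Staging AS s ON 1 = 0        -- never matches, so every source row inserts
WHEN NOT MATCHED THEN
    INSERT (Data) VALUES (s.Data)
OUTPUT s.SourceKey, inserted.id INTO @IdMap (SourceKey, NewId);

-- @IdMap now pairs each staged row with its generated identity,
-- ready to be used as a foreign key when populating child tables.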
Try this
CREATE TABLE #temp
(
DataRow varchar(max)
)
BULK INSERT #Temp FROM 'C:\tt.txt'
ALTER TABLE #temp
ADD id INT IDENTITY(1,1) NOT NULL
SELECT * FROM #temp
-- dummy schema
CREATE TABLE TMP (data varchar(max))
CREATE TABLE [Table1] (id int not null identity(1,1), data varchar(max))
CREATE TABLE [Table2] (id int not null identity(1,1), id1 int not null, data varchar(max))
-- imagine this is the SqlBulkCopy
INSERT TMP VALUES('abc')
INSERT TMP VALUES('def')
INSERT TMP VALUES('ghi')
-- now push into the real tables
INSERT [Table1]
OUTPUT INSERTED.id, INSERTED.data INTO [Table2](id1,data)
SELECT data FROM TMP

How to insert long type data into SQL Server bigint type

Hi, recently I have been working on a project. It's easy for INT, but due to an overflow situation I want to use BIGINT for the SQL column, and I'm sending it long data, like Long l = <some value>,
and inserting it like INSERT INTO Table VALUES (Given Value).
But why is there an error saying "The given value is not matching with the table definition"?
Give me a sample code if possible.
Make sure you are inserting the correct number of values into the table. Suppose you have three columns and you are inserting four values; an error will occur.
Suppose you have the following columns,
i.e. column1, column2, column3, etc., and your insert statement is:
insert into tablename (column1, column2, column3) values ('somevalue', 'somevalue', 'somevalue')
Here each value must match the datatype you declared when creating the table.
Mention the column names in the insert statement, whichever columns you are going to insert into.
Since the field in the table is of type bigint, it is not possible to store a long value there directly. You could change your application to use an integer for the value, or you can cast the long to an Int64 (equivalent to a bigint in SQL); this will drop any decimal values of the long.
Specify the column names when inserting:
INSERT INTO [Table] (Give_Value_Column) VALUES (@GivenValue)
and add the parameter with a matching datatype:
command.CommandText = "INSERT INTO [Table] (Give_Value_Column) VALUES (@GivenValue)";
command.Parameters.Add("@GivenValue", SqlDbType.BigInt).Value = givenValue;

SQL Server - passing a table valued parameter with computed columns

How do you pass a table valued parameter to a stored procedure with a computed column?
I have a table type which I'm using to pass a DataTable to a stored proc. I've got all my columns matching up in order and data type, except for a computed column.
If I leave that column out of my DataTable I get this error:
System.Data.SqlClient.SqlException (0x80131904): Trying to pass a table-valued parameter with 3 column(s) where the corresponding user-defined table type requires 4 column(s).
If I include the column with null values I get this: System.Data.SqlClient.SqlException (0x80131904): The column "TransformationSum" cannot be modified because it is either a computed column or is the result of a UNION operator.
Haven't tried fiddling with any of the various properties of the particular DataColumn yet.
Table type definition:
CREATE TYPE [dbo].[DataPoint] AS TABLE
(
[Id] [int] NOT NULL,
[RawDataPoint] [decimal](18, 9) NULL,
[TransformedDataPoint] [decimal](18, 9) NULL,
[TransformationSum] AS ([RawDataPoint]-[TransformedDataPoint]),
PRIMARY KEY ([Id])
)
And the stored proc definition:
CREATE PROCEDURE [report].[spRptBubblePlot]
@pData [dbo].[DataPoint] READONLY,
....
AS
BEGIN
....
END
And my code which sets the parameter:
var parm = cmd.CreateParameter();
parm.ParameterName = "@pData";
parm.SqlDbType = SqlDbType.Structured;
parm.TypeName = "[dbo].[DataPoint]";
parm.Value = myDataTable;
cmd.Parameters.Add(parm);
(Note: everything works fine without the computed column.)
AFAIK, you can't modify table-valued parameters inside a stored procedure. If you need to perform a calculation and return it from within the proc, you need to pass the table-valued parameter without the computed column, create a copy with an identical structure plus the extra column(s) you need, and return the new table with those extra columns. A sketch follows below.
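A sketch of that approach, assuming the type is redefined without the computed column:

CREATE TYPE [dbo].[DataPointInput] AS TABLE
(
    [Id]                   [int] NOT NULL PRIMARY KEY,
    [RawDataPoint]         [decimal](18, 9) NULL,
    [TransformedDataPoint] [decimal](18, 9) NULL
);
GO
CREATE PROCEDURE [report].[spRptBubblePlot]
    @pData [dbo].[DataPointInput] READONLY
AS
BEGIN
    -- Copy into a local table variable that carries the computed column.
    DECLARE @data TABLE
    (
        [Id]                   [int] NOT NULL PRIMARY KEY,
        [RawDataPoint]         [decimal](18, 9) NULL,
        [TransformedDataPoint] [decimal](18, 9) NULL,
        [TransformationSum]    AS ([RawDataPoint] - [TransformedDataPoint])
    );

    INSERT INTO @data ([Id], [RawDataPoint], [TransformedDataPoint])
    SELECT [Id], [RawDataPoint], [TransformedDataPoint]
    FROM @pData;

    -- ... work with @data, which now exposes TransformationSum ...
END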

C# code and SQL Server performance

I have a SQL Server database designed like this :
TableParameter
Id (int, PRIMARY KEY, IDENTITY)
Name1 (string)
Name2 (string, can be null)
Name3 (string, can be null)
Name4 (string, can be null)
TableValue
Iteration (int)
IdTableParameter (int, FOREIGN KEY)
Type (string)
Value (decimal)
So, as you've just understood, TableValue is linked to TableParameter.
TableParameter is like a multidimensional dictionary.
TableParameter is supposed to have a lot of rows (more than 300,000 rows)
From my C# client program, I have to fill this database after each Compute() call:
for (int iteration = 0; iteration < 5000; iteration++)
{
Compute();
FillResultsInDatabase();
}
In FillResultsInDatabase() method, I have to :
Check if the label of my parameter already exists in TableParameter. If it doesn't exist, I have to insert a new one.
I have to insert the value into TableValue.
Step 1 takes a long time! I load the whole TableParameter table into an IEnumerable property and then, for each parameter, I run a
.FirstOrDefault(x => x.Name1 == item.Name1 &&
x.Name2 == item.Name2 &&
x.Name3 == item.Name3 &&
x.Name4 == item.Name4 );
in order to detect if it already exists (and afterwards to get the id).
Performance is very bad like this!
I've tried making the selection with a WHERE clause to avoid loading every row of TableParameter, but performance is worse!
How can I improve the performance of step 1?
For step 2, performance is still bad with classic INSERTs. I am going to try SqlBulkCopy.
How can I improve the performance of step 2?
EDITED
I've tried with stored procedures:
CREATE PROCEDURE GetIdParameter
@Id int OUTPUT,
@Name1 nvarchar(50) = null,
@Name2 nvarchar(50) = null,
@Name3 nvarchar(50) = null
AS
SELECT TOP 1 @Id = Id FROM TableParameter
WHERE
TableParameter.Name1 = @Name1
AND
(@Name2 IS NULL OR TableParameter.Name2 = @Name2)
AND
(@Name3 IS NULL OR TableParameter.Name3 = @Name3)
GO
CREATE PROCEDURE CreateValue
@Iteration int,
@Type nvarchar(50),
@Value decimal(32, 18),
@Name1 nvarchar(50) = null,
@Name2 nvarchar(50) = null,
@Name3 nvarchar(50) = null
AS
DECLARE @IdParameter int
EXEC GetIdParameter @IdParameter OUTPUT,
@Name1, @Name2, @Name3
IF @IdParameter IS NULL
BEGIN
INSERT TableParameter (Name1, Name2, Name3)
VALUES
(@Name1, @Name2, @Name3)
SELECT @IdParameter = SCOPE_IDENTITY()
END
INSERT TableValue (Iteration, IdParameter, Type, Value)
VALUES
(@Iteration, @IdParameter, @Type, @Value)
GO
I still have the same performance... :-( (not acceptable)
If I understand what's happening, you're querying the database to see if the data is there in step 1. I'd use a DB call to a stored procedure that inserts the data if it is not there. So just compute the results and pass them to the SP.
Can you compute the results first, and then insert in batches?
Does the compute function take data from the database? If so, can you turn the operation into a set-based operation and perform it on the server itself? Or maybe part of it?
Remember that SQL Server is designed for large dataset operations.
Edit: reflecting comments
Since the code is slow on the data inserts, and you suspect that it's because the insert has to search back before it can be done, I'd suggest that you may need to place SQL indexes on the columns that you search on in order to improve search speed.
However I have another idea.
Why don't you just insert the data without the check and then, later, when you read the data, remove the duplicates in that query?
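For example, a sketch of deduplicating at read time with ROW_NUMBER (schema taken from the question):

WITH Ranked AS (
    SELECT Id, Name1, Name2, Name3, Name4,
           ROW_NUMBER() OVER (PARTITION BY Name1, Name2, Name3, Name4
                              ORDER BY Id) AS rn
    FROM TableParameter
)
SELECT Id, Name1, Name2, Name3, Name4
FROM Ranked
WHERE rn = 1;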
Given the fact that Name2 through Name4 can be null, would it be possible to restructure the parameter table:
TableParameter
Id (int, PRIMARY KEY, IDENTITY)
Name (string)
Dimension int
Now you can index it and simplify the query (WHERE Name = 'TheNameIWant' AND Dimension = 2).
(And speaking of indexes, you do have indexes on the name columns in the parameter table, right?)
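For the restructured table, a supporting index might look like this (a sketch, assuming Id stays the clustered primary key):

CREATE NONCLUSTERED INDEX IX_TableParameter_Name_Dimension
    ON TableParameter (Name, Dimension);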
Where do you commit your inserts? If you commit per statement, group multiple inserts into one transaction.
If you are the only one inserting values and speed is really of the essence, load all values from the database into memory and check there.
just some ideas
hth
Mario
I must admit that I'm struggling to grasp the business process that you are trying to achieve here.
On initial review, it appears as if you are performing a data comparison within your application tier. I would advise against this and suggest that you let the database engine do what it is designed to do: manage and implement your data access.
As another poster has mentioned, I concur that you should look to create a Stored Procedure to handle your record insertion logic. The procedure can perform a simple check to see if your records already exist.
You should also consider:
Enforcing the insertion logic/rule by creating a Unique Constraint across the four name columns.
Creating a covering non-clustered index incorporating the four name columns (a sketch of both follows below).
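A sketch of both suggestions (column types assumed from the stored procedures in the question; note that the unique constraint is itself backed by an index, so one of the two may be enough):

ALTER TABLE TableParameter
    ADD CONSTRAINT UQ_TableParameter_Names UNIQUE (Name1, Name2, Name3, Name4);

CREATE NONCLUSTERED INDEX IX_TableParameter_Names
    ON TableParameter (Name1, Name2, Name3, Name4) INCLUDE (Id);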
With regard to performance of your inserts, perhaps you can provide some metrics to qualify what it is that you are seeing and how you are measuring it?
To give you a yardstick, the current ETL insertion record for SQL Server is approx. 16 million rows per second. What sort of numbers are you expecting and wanting to see?
The fastest way (that I know of so far) is a bulk insert, but not just lines of INSERT statements; try INSERT + SELECT + UNION. It works pretty fast.
insert into myTable
select a1, b1, c1, ...
union select a2, b2, c2, ...
union select a3, b3, c3, ...
