I have a trigger which needs to fill a table with hundreds of rows, and I need to type every single insert manually (it is a kind of pre-config table).
This table has an Int FK to an Enum Table. The Enum Table uses an int as a PK and a varchar (which is UNIQUE).
While typing the insert statements I need to be very careful that the integer FK is the correct one.
I would rather insert the data by the varchar of the enum.
So I do something like this now:
INSERT INTO MyTable(ColorId)
VALUES(1)
And I would like to do something like this:
INSERT INTO MyTable(ColorStr)
VALUES('Red')
The reason the Enum table has an int PK is performance (fast queries), but now I'm not sure it was a good idea. What do you think?
Is there a safe way to do it? Is it possible to insert data into a Table View?
Sure. Do not insert.
No joke.
First, you do not need to use one INSERT statement PER ROW - look at the syntax: one INSERT statement can insert a lot of rows.
Second, nothing in the world says you cannot do processing (like a SELECT with a JOIN) on the data being inserted.
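For example, something like this resolves the varchar to the int at insert time (ColorEnum, ColorId and ColorStr are assumed names based on your example):

INSERT INTO MyTable (ColorId)
SELECT e.ColorId
FROM (VALUES ('Red'), ('Green'), ('Blue')) AS v(ColorStr)
JOIN ColorEnum AS e ON e.ColorStr = v.ColorStr;

One statement, many rows, and the FK is looked up by name rather than typed as a number.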
I generally use a table setup like this (maintained with a MERGE statement) for all my static lookup data (like country lists). Full automatic maintenance on every change, with inserts and updates happening on demand.
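As a rough sketch of that pattern (the table and column names here are just placeholders):

MERGE INTO ColorEnum AS t
USING (VALUES (1, 'Red'), (2, 'Green'), (3, 'Blue')) AS s (ColorId, ColorStr)
    ON t.ColorId = s.ColorId
WHEN MATCHED AND t.ColorStr <> s.ColorStr THEN
    UPDATE SET ColorStr = s.ColorStr
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ColorId, ColorStr) VALUES (s.ColorId, s.ColorStr);

Re-running the script is harmless: existing rows are only updated if they changed, and new rows are added.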
I have a problem with an Access query. My code is as follows:
string query = "ALTER TABLE Student ADD COLUMN Surname MEMO AFTER 'Name'";
Why does it always insert the column at the end of the table? Is there any method to insert a new column at a specific position?
First of all, I don't see any reason to add your column at a specific position. You can always order the columns however you want in a SELECT statement, for example.
Why does it always insert the column at the end of the table?
Because it is designed like that?
Is there a method to insert a new column at a specific position?
As far as I know, there is no way to do it without rebuilding your table.
From ALTER TABLE syntax for changing column order
Today when you use ALTER TABLE ADD to add a column, a new column is always placed as the last column. This is often far from desirable. Often developers and database designers want to keep some logic in a column order, so that related columns are close to each other. A standard rule we keep in the system I work with is to always have auditing columns at the end. Furthermore, many graphical design tools encourage this kind of design, both bottom-end tools like the Table Designer in SSMS as well as high-end data-modelling tools such as Power Designer.
Today, if you want to maintain column order you have no choice but to go the long way: create a new version of the table and copy over. It takes time, and if not implemented correctly, things can go very wrong.
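If you do go that long way, the sketch is roughly this (illustrative names and types only; in Access you would rename the new table via the designer or DAO rather than sp_rename):

CREATE TABLE Student_New (
    ID      INT,
    Name    VARCHAR(100),
    Surname VARCHAR(100)  -- new column placed exactly where you want it
);

INSERT INTO Student_New (ID, Name)
SELECT ID, Name FROM Student;

DROP TABLE Student;
-- then rename Student_New to Student (EXEC sp_rename 'Student_New', 'Student'; on SQL Server)

Remember to recreate any indexes, constraints, and relationships on the new table as well.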
I have defined various text values by int. I store the int value in the data table for better and faster searching. I have three options to display the text value:
1) I declare an Enum in my code and display the text value according to the int value. It is static, and I have to change code if a new value is to be added.
2) To make it dynamic, I can store the int and text values in a table which is in another database, owned by the admin. New values can be added by the admin in this table. I use an inner join to display the text value whenever a record is fetched.
3) I store the actual text in the respective data table. This will make searches slow.
My question is: which option is best to use under the following conditions?
The data table has between 1 and 10 million records.
There are more than 5,000 users doing fetch, search, and update operations on the table.
There are at most 12 text values, each at most 50 characters long.
There are 30 data tables with the above conditions and usage.
I like a combination of option #2 and option #1 - use ints, but have the dictionary table in another database.
Let me explain:
store the int and text in a table which is in another database;
in the origin table, store the int only;
do not join to the table in the other database to get the text, but cache the dictionary on the client and resolve the text from that dictionary.
I would not go for option 1, for the reason given: Enums are not there to act as lookups. You could replace option 1 with creating a dictionary in code, but again it would need to be recompiled each time a change is made, which is bad.
Storing the text in the data table (i.e. option 3) is bad if it is guaranteed to be duplicated a lot, as it is here. This is exactly where you should use a lookup table, as you suggest in option 2.
So yes, store them in a database table and administer them through that.
The join shouldn't take long at all if it is just to a small table. If you are worried, though, an alternative is to load the lookup table into a dictionary in code the first time you need it and look the values up from there. I doubt you'll have any problems just doing it with the join, though.
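For what it's worth, the join here is about as simple as joins get (table and column names below are just placeholders):

SELECT d.RecordId, d.SomeData, l.StatusText
FROM dbo.DataTable AS d
JOIN LookupDb.dbo.StatusLookup AS l ON l.StatusId = d.StatusId;

With only a dozen or so lookup rows, that small table effectively lives in memory anyway.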
And I'd take this approach no matter what the conditions are (i.e. number of records, etc.). The conditions do make it even more sensible, though. :)
If you have literally millions of records, there's almost certainly no point in trying to spin up such a structure in server code or on the client in any form. It needs to be kept in a database, IMHO.
The query that creates the list needs to be smart enough to constrain the count of returned records to a manageable number. Perhaps partitioned views or stored procedures might help in this regard.
If this is primarily a read-only list, with updates only done in the context of management activities, it should be possible to make queries against the table very rapid with proper indexes and queries on the client side.
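For example, assuming the int code column in the data table is called StatusId (a placeholder name), a plain nonclustered index on it keeps those lookups and joins cheap even at millions of rows:

CREATE NONCLUSTERED INDEX IX_DataTable_StatusId
    ON dbo.DataTable (StatusId);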
Relatively simple problem.
Table A has an ID int PK, a unique Name varchar(500), and cola, colb, etc.
Table B has a foreign key to Table A.
So, in the application, we are generating records for both table A and table B into DataTables in memory.
We would be generating thousands of these records on a very large number of "clients".
Eventually we make the call to store these records. However, records from table A may already exist in the database, so we need to get the primary keys for the records that already exist, and insert the missing ones. Then insert all records for table B with the correct foreign key.
Proposed solution:
I was considering sending an XML document to SQL Server to open as a rowset into TableVarA, updating TableVarA with the primary keys for the records that already exist, then inserting the missing records and outputting those to TableVarNew. I then select the Name and primary key from TableVarA union all TableVarNew.
Then in code populate the correct FKs into TableB in memory, and insert all of these records using SqlBulkCopy.
Does this sound like a good solution? And if so, what is the best way to populate the FKs in memory for TableB to match the primary keys from the returned DataSet?
Sounds like a plan - but I think the handling of Table A can be simpler (a single in-memory table/table variable should be sufficient):
have a TableVarA that contains all rows for Table A
update the ID for all rows that already exist in Table A (should be doable in a single SQL statement)
insert all non-existing rows (that still have an empty ID) into Table A and make a note of their ID
This could all happen in a single table variable - I don't see why you need to copy stuff around....
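In T-SQL that could look roughly like this (a sketch only; column names, types, and the bulk-load step are assumptions):

DECLARE @TableVarA TABLE (ID INT NULL, Name VARCHAR(500), ColA INT NULL, ColB INT NULL);

-- ... the client's rows get loaded into @TableVarA here, with ID left NULL ...

-- step 2: pick up the IDs of the rows that already exist in Table A
UPDATE v
SET    v.ID = a.ID
FROM   @TableVarA AS v
JOIN   dbo.TableA AS a ON a.Name = v.Name;

-- step 3: insert the rows that are still missing
INSERT INTO dbo.TableA (Name, ColA, ColB)
SELECT Name, ColA, ColB
FROM   @TableVarA
WHERE  ID IS NULL;

-- and pick up the identity values that were just assigned
UPDATE v
SET    v.ID = a.ID
FROM   @TableVarA AS v
JOIN   dbo.TableA AS a ON a.Name = v.Name
WHERE  v.ID IS NULL;

At the end, @TableVarA holds every Name with its ID, which is exactly what the client needs back to fix up Table B.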
Once you've handled your Table A, as you say, update Table B's foreign keys and bulk insert those rows in one go.
What I'm not quite clear on is how Table B references Table A - you just said it has an FK, but you didn't specify which column it is on (I'm assuming ID). And how do your rows in Table B reference Table A for new rows that aren't inserted yet and thus don't have an ID in Table A yet?
This is more of a comment than a complete answer but I was running out of room so please don't vote it down for not being up to answer criteria.
My concern would be that by evaluating a set for missing keys and then inserting in bulk, you take the risk that a key got added elsewhere in the meantime. You stated this could come from a large number of clients, so this is going to happen. Yes, you could wrap it in a big transaction, but big transactions are hogs and would lock out other clients.
My thought is to deal with the rows that already have keys in bulk separately, assuming there is no risk the PK would be deleted. A TVP is efficient, but you need explicit knowledge of which rows got processed. I think you need to first search on Name to get a list of the PKs that exist, then process those via a TVP.
For data integrity, process the rest one at a time via a stored procedure that creates the PK as necessary.
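Such a procedure could be as simple as this (a sketch with made-up names; the UNIQUE constraint on Name is what ultimately protects you from a concurrent duplicate insert):

CREATE PROCEDURE dbo.GetOrCreateTableAId
    @Name VARCHAR(500),
    @ID   INT OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    SELECT @ID = ID FROM dbo.TableA WHERE Name = @Name;

    IF @ID IS NULL
    BEGIN
        INSERT INTO dbo.TableA (Name) VALUES (@Name);
        SET @ID = SCOPE_IDENTITY();
    END
END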
Thousands of records is not scary (millions is). The large number of "clients" - that is the scary part.
How would I get the primary key ID number from a Table without making a second trip to the database in LINQ To SQL?
Right now, I submit the data to a table and make another trip to figure out what ID was assigned to the new row (an auto-increment ID field). I want to do this in LINQ to SQL and not in raw SQL (I no longer use raw SQL).
Also, the second part of my question: I am always careful to know the ID of a user that's online, because I'd rather look up their information in various tables using their ID as opposed to using a GUID or a username, which are long strings. I do this because I think that SQL Server doing a numeric compare is much (?) more efficient than doing a username (string) or even a GUID (very long string) compare. My question is, am I more concerned than I should be? Is the difference worth always keeping the userid (int32) in, say, session state?
#RedFilter provided some interesting/promising leads for the first question. Since I am at this stage unable to try them, can anyone confirm the changes he recommended in the comments section of his answer?
If you have a reference to the object, you can just use that reference and read the primary key after you call db.SubmitChanges(). The LINQ object will automatically update its (identity) primary key field to reflect the new one assigned to it by SQL Server.
Example (vb.net):
Dim db As New NorthwindDataContext
Dim prod As New Product
prod.ProductName = "cheese!"
db.Products.InsertOnSubmit(prod)
db.SubmitChanges() ' after this call, prod.ProductID holds the identity value assigned by SQL Server
MessageBox.Show(prod.ProductID.ToString())
You could probably include the above code in a function and return the ProductID (or equivalent primary key) and use it somewhere else.
EDIT: If you are not doing atomic updates, you could add each new product to a separate Collection and iterate through it after you call SubmitChanges. I wish LINQ provided a 'database sneak peek' like a dataset would.
Unless you are doing something out of the ordinary, you should not need to do anything extra to retrieve the primary key that is generated.
When you call SubmitChanges on your Linq-to-SQL datacontext, it automatically updates the primary key values for your objects.
Regarding your second question - there may be a small performance improvement by doing a scan on a numeric field as opposed to something like a varchar(), but you will see much better performance either way by ensuring that the correct columns in your database are indexed. Also, with SQL Server, if you create a primary key using an identity column, it will by default have a clustered index over it.
Linq to SQL automatically sets the identity value of your class with the ID generated when you insert a new record. Just access the property. I don't know if it uses a separate query for this or not, having never used it, but it is not unusual for ORMs to require another query to get back the last inserted ID.
Two ways you can do this independent of Linq To SQL (that may work with it):
1) If you are using SQL Server 2005 or higher, you can use the OUTPUT clause:
Returns information from, or expressions based on, each row affected by an INSERT, UPDATE, or DELETE statement. These results can be returned to the processing application for use in such things as confirmation messages, archiving, and other such application requirements. Alternatively, results can be inserted into a table or table variable.
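For example, assuming MyTable has an identity column named ID:

INSERT INTO MyTable (field1)
OUTPUT inserted.ID
VALUES ('xxx');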
2) Alternately, you can construct a batch INSERT statement like this:
insert into MyTable
(field1)
values
('xxx');
select scope_identity();
which works at least as far back as SQL Server 2000.
In T-SQL, you could use the OUTPUT clause, saying:
INSERT table (columns...)
OUTPUT inserted.ID
SELECT columns...
So if you can configure LINQ to use that construct for doing inserts, then you can probably get it back easily. But whether LINQ can get a value back from an insert, I'll let someone else answer that.
Calling a stored procedure from LINQ that returns the ID as an output parameter is probably the easiest approach.
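The procedure itself can be a trivial wrapper (a sketch with made-up names):

CREATE PROCEDURE dbo.InsertProduct
    @ProductName NVARCHAR(40),
    @ProductID   INT OUTPUT
AS
BEGIN
    INSERT INTO dbo.Products (ProductName) VALUES (@ProductName);
    SET @ProductID = SCOPE_IDENTITY();
END

Drag the procedure onto the LINQ to SQL designer and the output parameter should come back as a ref/ByRef parameter on the generated method.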
I have several tables within my database that contain nothing but "metadata".
For example, we have different grouptypes, contentItemTypes, languages, etc.
The problem is that if you use automatic numbering, it is possible to create gaps.
The IDs are used within our code, so the numbers are very important.
Now I wonder whether it isn't better to avoid autonumbering in these tables.
Right now we have to create the row in the database first, before we can write our code, and in my opinion this should not be the case.
What do you guys think?
I would use an identity column, as you suggest, to be your primary key (surrogate key), and then make your candidate key (the identifier from your system) a standard column with a unique constraint applied to it. This way you can ensure you do not insert duplicate records.
Make sense?
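For example (illustrative names only):

CREATE TABLE dbo.Languages (
    LanguageId   INT IDENTITY(1,1) PRIMARY KEY,  -- surrogate key
    LanguageCode INT NOT NULL UNIQUE,             -- the number your code relies on
    Description  VARCHAR(100) NOT NULL
);

The IDENTITY can drift all it wants; your code only ever references LanguageCode, and the UNIQUE constraint stops duplicates.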
If these are FK tables used just to expand codes into a description or to contain other attributes, then I would NOT use an IDENTITY. Identities are good for ever-growing user data; metadata tables are usually static. When you deploy an update to your code, you don't want to be surprised and have an IDENTITY value different from what you expect.
For example, you add a new value to the "Languages" table and expect the ID to be 6, but for some reason (development is out of sync, another person has not implemented their next language type, etc.) the next identity you get is different, say 7. You then insert or convert a bunch of rows using Language ID=6, which all fail because it does not exist (it is 7 in the metadata table). Worse yet, they all actually insert or update because the value 6 you thought was yours was already in the metadata table, and you now have a mix of two items sharing the same 6 value, while your new 7 value is left unused.
I would pick the proper data type based on how many codes you need, how often you will need to look at it (CHARs are nice to look at for a few values, helps with memory).
for example, if you only have a few groups, and you'll often look at the raw data, then a char(1) may be good:
GroupTypes table
-----------------
GroupType char(1) --'M'=manufacturing, 'P'=purchasing, 'S'=sales
GroupTypeDescription varchar(100)
however, if there are many different values, then some form of an int (tinyint, smallint, int, bigint) may do it:
EmailTypes table
----------------
EmailType smallint --2 bytes, up to 32k different positive values
EmailTypeDescription varchar(100)
If the numbers are hardcoded in your code, don't use identity fields. Hardcode them in the database as well, as they'll then be less prone to changing because someone scripted the database badly.
I would also use an identity column as the primary key, just for the sake of simplicity when inserting the records into the database, but then use a column for the type of metadata (I call mine LookUpType (int)), as well as columns for LookUpId (the int value used in code or in select lists) and LookUpName (string); and if those values require additional settings, so to speak, use extra columns. I personally use two extras: LookUpKey for hierarchical relations, and LookUpValue for abbreviations or alternate values of LookUpName.
Well, if those numbers are important to you because they'll be in code, I would probably not use an IDENTITY.
Instead, just make sure you use an INT column and make it the primary key - in that case, you will have to provide the IDs yourself, and they'll have to be unique.
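Something along these lines (names are only an example):

CREATE TABLE dbo.GroupTypes (
    GroupTypeId INT NOT NULL PRIMARY KEY,  -- no IDENTITY: you supply the value
    Description VARCHAR(100) NOT NULL
);

INSERT INTO dbo.GroupTypes (GroupTypeId, Description)
VALUES (1, 'Manufacturing'),
       (2, 'Purchasing'),
       (3, 'Sales');

If someone forgets to supply an ID, the insert fails instead of silently handing out a number your code doesn't know about.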