I have a POS-like system in C#, and for a long time it didn't present any problem (it was just one POS). But these days there are 4 POS terminals using the system, all connected to the same database, and the sales from every POS go into the same Audit table.
So this is the procedure in this system (a rough SQL sketch follows the steps):
A function gets the last Ticket number (with a simple SELECT).
Add 1 to that number (the next ticket number).
Generate the ID Code by injecting this Ticket number (along with the terminal, date, and employee code) into the algorithm.
Insert the sale record into the database with all the necessary information (Date, Client, Employee, IDCode, etc.) (with a simple INSERT INTO).
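Roughly, in SQL terms (table and column names are illustrative):
-- Steps 1-2: read the last ticket number and add 1
DECLARE @NextTicket int;
SELECT @NextTicket = ISNULL(MAX(TicketNumber), 0) + 1
FROM Audit;
-- Step 3 happens in C#: the ID Code is built from the ticket number, terminal, date and employee code
-- Step 4: insert the sale; two terminals can reach this point holding the same @NextTicket
INSERT INTO Audit (TicketNumber /* , Date, Client, Employee, IDCode, ... */)
VALUES (@NextTicket /* , ... */);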
But with 4 POS terminals I noticed that some sales were getting the same Ticket number. Fortunately the Ticket ID Codes are not the same, because the terminal and the employee are different, but how can I avoid this?
Edit 1:
Every POS has a dual function: in one mode the POS sales are centralized and every POS generates consecutive tickets (as if they were all one POS); in the other mode every POS has its own ticket numbering. For that reason I can't use an identity.
Just use a sequence to generate the next ticket number.
CREATE SEQUENCE Tickets
START WITH 1
INCREMENT BY 1;
Then each POS just does:
SELECT NEXT VALUE FOR Tickets;
The sequence is guaranteed to never return the same number twice.
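A minimal sketch of how the sequence could replace steps 1-2 (the Audit table and column names are taken from the question; NEXT VALUE FOR requires SQL Server 2012 or later):
DECLARE @TicketNumber int;
-- Each call hands out a distinct number, regardless of how many POS terminals ask at once
SELECT @TicketNumber = NEXT VALUE FOR Tickets;
INSERT INTO Audit (TicketNumber /* , Date, Client, Employee, IDCode, ... */)
VALUES (@TicketNumber /* , ... */);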
As has been mentioned, if the TicketNumber is sequential and unique, it sounds like an IDENTITY field would be the way to go. BUT, if for some reason there is something preventing that, or if that requires too many changes at this time, you could constrain the process itself to be single-threaded by creating a lock on the ID Code generation process itself through the use of Application Locks (see sp_getapplock and sp_releaseapplock). Application Locks let you create locks around arbitrary concepts. Meaning, you can define the @Resource as "generate_id_code", which will force each caller to wait their turn. It would follow this structure:
BEGIN TRANSACTION;
EXEC sp_getapplock @Resource = 'generate_id_code', @LockMode = 'Exclusive';
...current 4 steps to generate the ID Code...
EXEC sp_releaseapplock @Resource = 'generate_id_code';
COMMIT TRANSACTION;
You need to manage errors / ROLLBACK yourself (as stated in the linked MSDN documentation), so put in the usual TRY / CATCH. But this does allow you to manage the situation.
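A sketch of the same structure wrapped in TRY / CATCH (THROW assumes SQL Server 2012 or later; adapt to your own error-handling conventions):
BEGIN TRY
    BEGIN TRANSACTION;
    EXEC sp_getapplock @Resource = 'generate_id_code', @LockMode = 'Exclusive';
    -- ...current 4 steps to generate the ID Code...
    EXEC sp_releaseapplock @Resource = 'generate_id_code';
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;   -- the rollback also releases the transaction-owned applock
    THROW;
END CATCH;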
Please note: sp_getapplock / sp_releaseapplock should be used sparingly; Application Locks can definitely be very handy (such as in cases like this one) but they should only be used when absolutely necessary.
You need to do this as an atomic operation, so you can wrap everything in a transaction and lock the table. See here for a good discussion on locking, etc.
Locking will slow everything else down, since every other operation will have to wait for the table to free up before it can complete, and that may not be something you can live with.
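A rough sketch of the transaction-plus-table-lock approach against the Audit table from the question (names are assumptions):
BEGIN TRANSACTION;
DECLARE @NextTicket int;
-- TABLOCKX + HOLDLOCK keeps an exclusive lock on the whole table until COMMIT,
-- so only one POS at a time can read and insert
SELECT @NextTicket = ISNULL(MAX(TicketNumber), 0) + 1
FROM Audit WITH (TABLOCKX, HOLDLOCK);
INSERT INTO Audit (TicketNumber /* , Date, Client, Employee, IDCode, ... */)
VALUES (@NextTicket /* , ... */);
COMMIT TRANSACTION;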
Or you could use an identity on the column, which will be managed by the database and maintain unique, incrementing numbers.
You could also make your primary key (I hope you have one) a combination of a few columns, and then keep a running number for each POS endpoint to see more data about how they are performing. But that gets more into analytics, which isn't in scope here.
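For illustration, the composite-key idea might look like this (the column choice here is an assumption, not a recommendation for this exact schema):
-- Each terminal keeps its own running ticket number; the pair stays unique overall
ALTER TABLE Audit
    ADD CONSTRAINT PK_Audit PRIMARY KEY (Terminal, TicketNumber);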
I would strongly suggest moving away from the current approach if you can and changing to a GUID PK.
However, I realize that in some cases a redesign is not possible (we have the exact same scenario that you describe in a legacy database).
In this case, you can get the maximum value safely using the UPDLOCK table hint in combination with the insert command and use the OUTPUT INSERTED functionality to retrieve the new primary key value into a local variable if needed:
DECLARE @PK TABLE (PK INT NOT NULL)

INSERT INTO Audit (
    TicketNumber,
    Terminal,
    Date,
    EmployeeCode,
    Client,
    IDCode,
    ... other fields )
/* Record the new PK in the table variable */
OUTPUT INSERTED.TicketNumber INTO @PK
SELECT IsNull(MAX(TicketNumber), 0) + 1,
    @Terminal,
    @Date,
    @EmployeeCode,
    @Client,
    @IDCode,
    ... other values
FROM Audit WITH (UPDLOCK)

DECLARE @TicketNumber INT

/* Move the new PK from the table variable into a local variable for subsequent use */
SELECT @TicketNumber = PK
FROM @PK
Related
I am working on a C# .NET (& TSQL) based project which requires generation of configurable serial tracking numbers which depend on the data it contains.
Background
To elaborate, the software is used to track vehicles entering a factory. Each vehicle is assigned an RFID card and a tracking number. The tracking number is based on the first of the warehouses to which the truck is headed. The admin can configure the format of the numbers for each warehouse. This is typically as follows: #YYYYMMDD-XX002211, where XX is a prefix unique to each warehouse. This number restarts from 1 whenever the month changes.
At the time of master (vehicle + RFID card details) creation, the tracking number is null. Once the first line (warehouse details) is added, the system generates a tracking number and assigns it to the master. So, until a line is added, the tracking number field can have duplicates, i.e. many records with null.
The problem
The code to generate the tracking number is written in C# and uses SQL transactions; however, there have been infrequent collisions in the tracking numbers, which make the system unstable. I did more research into T-SQL transactions and found the isolation level settings, but I have not been successful with them.
Solution?
I have tried the following approaches:
Check in the master table for a possible duplicate before updating
Use Repeatable Read isolation level to ensure data isn't updated during the transaction execution
Should I?
Move the whole thing to a Stored Procedure?
Start a daemon which is responsible for the number generation, ensuring sequence by a single threaded execution
Thanks for your help!
For simplicity, I would suggest using a single stored procedure to update and return the sequence number, then do the rest of the formatting in C#:
CREATE PROCEDURE GetSequenceOfWareHouse
    @WareHouse char(2)
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @Result table (Sequence int)

    BEGIN TRAN
        UPDATE SequenceTable SET Sequence = Sequence + 1
        OUTPUT inserted.Sequence INTO @Result
        WHERE WareHouse = @WareHouse AND Month = YEAR(GETDATE()) * 100 + MONTH(GETDATE())

        IF NOT EXISTS (SELECT * FROM @Result)
            INSERT SequenceTable (Sequence, WareHouse, Month)
            OUTPUT inserted.Sequence INTO @Result
            VALUES (1, @WareHouse, YEAR(GETDATE()) * 100 + MONTH(GETDATE()))
    COMMIT TRAN

    SELECT Sequence FROM @Result
END
GO
CREATE Table SequenceTable
(
Sequence int,
WareHouse char(2),
Month int,
unique (Sequence, WareHouse, Month)
)
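A quick sketch of how the procedure might be called (the 'AB' prefix is just an illustration); the returned sequence is then formatted into the #YYYYMMDD-XXnnnnnn string in C#:
-- Returns a one-row result set with the next sequence for warehouse 'AB' in the current month
EXEC GetSequenceOfWareHouse @WareHouse = 'AB';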
I'm not well versed in SQL operations, and would like some help with a task I need to complete in code. I have written a cloud-based app that accesses a SQL table containing test results: device IDs, serial numbers, test results, etc.
There is a use-case where someone in the field activates a menu item that updates this table. When the device test result table is updated, I want to store the OLD information in a device test history table. This way, we can go back and see what was changed over time.
So I need to pull all the columns from the TestedDevice table, insert them into the TestedDeviceHistory table, and include some additional information: the current date and the operator's ID (these are two new columns found only in TestedDeviceHistory).
At first, I'm using a SELECT INTO command, as follows:
SELECT *
INTO dbo.TestedDevicesHistory
FROM dbo.TestedDevices
WHERE CertificateID = @cert
Then I'm attempting this (obviously broken) SQL command:
UPDATE dbo.TestedDeviceHistory
SET Caller = @caller,
    RecordDate = @date
WHERE DeviceHistoryID = MAX(DeviceHistoryID)
Notes:
DeviceHistoryID is an IDENTITY integer column, so it's unique for each entry made in the history table.
CertificateID is unique in the TestedDevices table. It is expected NOT to be unique in the history table.
The code is written in C# 4.5
Maybe this is a case for a stored procedure, which I have never attempted to create or use. Or perhaps the use of a cursor? I don't know! This is why I'm humbly asking those more experienced with SQL to help :)
It's not clear whether you only want to assign the Caller and RecordDate to the most recent record, or whether they could be assigned to all the history records.
For all records, I believe you can do something like
SELECT *, @caller AS Caller, @date AS RecordDate INTO dbo.TestedDevicesHistory
FROM dbo.TestedDevices WHERE CertificateID = @cert
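Note that SELECT ... INTO creates dbo.TestedDevicesHistory from scratch and fails if the table already exists. If the history table is already in place, a hedged variant (column names other than CertificateID, Caller and RecordDate are assumptions) is a plain INSERT ... SELECT:
-- Appends to an existing history table instead of creating a new one
INSERT INTO dbo.TestedDevicesHistory (DeviceID, SerialNumber, CertificateID, /* ... */ Caller, RecordDate)
SELECT DeviceID, SerialNumber, CertificateID, /* ... */ @caller, @date
FROM dbo.TestedDevices
WHERE CertificateID = @cert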
We are working on a solution which fires many search requests towards three different public databases located in three different countries. For example, a search fetches data from one db and passes it as a parameter to another db. The parameter is a list in which each item needs to be logically connected with an OR operator. Therefore we end up with a SQL SELECT statement containing up to 1000 OR operators in the WHERE clause.
Now my question is: do 1000 or 500 or even 5000 logical AND or OR operators inside a SELECT statement make the db slower, and should I instead fetch all the data to my PC and do the matching there?
The amount of data is between 5000 and 10000 records; we are talking about a public db, so the amount keeps growing.
For example such a sql statement:
select * from some_table
where .. and .. or .. or.. or..
or.. or.. or.. or.. or.. or.. (1000 times)
If I fetched all the data to my PC, I could have a LINQ statement that does the filtering.
What do you suggest I do? Any experience with this, guys?
Sorry if this is a duplicate just let me know in comments and I'll delete this question.
EDIT:
It should be considered that many users may access the databases at the same time.
I always learned that running a query with hundreds of OR conditions is bad for performance. However, even when running a sample here on 12c, querying a table with OR or IN on a primary key index doesn't seem to change the execution plan.
Therefore I say: it doesn't matter. The only things you might consider are readability, query length, etc.
Still, I personally prefer the where in.
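For illustration, the two equivalent forms (table and values are just examples):
-- Chained ORs on the primary key
SELECT * FROM some_table WHERE id = 1 OR id = 2 OR id = 3;

-- The same filter written with IN
SELECT * FROM some_table WHERE id IN (1, 2, 3);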
See this other useful question with sample data.
Process this all in the database with a single query. Batching similar operations is usually the best thing you can do for database performance.
The most expensive part of the query is reading the data from disk. Once the data is in memory, filtering out a few thousand conditions is a small amount of work. Your local processor probably is faster than the database server. But it doesn't matter because your machine would spend too much time on unnecessary IO if you returned all the records.
Also, 5000 conditions in a SQL query is only a problem if you run that query a hundred times a second.
I think you should just try.
Create an example that is as simple as possible, yet complex enough to be realistic, and then run it with some form of benchmarking.
Whatever works best for you is what you should choose to do.
Edit:
That said, such a large number of ANDs and ORs in a single SQL statement does sound complicated and messy. Unless there is a real benefit from doing it this way(?), I would probably try to find a cleaner approach, for instance by splitting the operation into several steps and applying LINQ or something similar, as you suggest, even if it is just to make the solution more manageable.
The answer is: it depends.
How big is the data in the public db? If you are querying Google, then fetching all the data is not an option.
It is reasonable to assume that those public dbs have much stronger hardware and db tuning than your home PC.
Is there a chance that you will get blacklisted by those public dbs?
Does order matter? If you query db 1 and then db 2, will it be faster than querying db 2 and then db 1?
Mostly it's trial & error, and whatever works best for you and is possible.
SQL Queries on ORACLE Using Multiple Boolean Operators
Comments: I have worked many years with CRYSTAL REPORTS, a database report designer. It was one of the first drag-and-drop, GUI-based tools that made it easier for developers without much database background to construct queries with multiple tables and filter conditions. The trade-off was that the tool was writing SQL under the hood; many times it was a serious performance hog because the workstation running the report file had to suck down the entire contents of the database tables being queried, only to run the filtering process locally on the client system. That was more than a decade ago, but I see other next-gen tools that also auto-generate really awful SQL code.
No amount of software can compensate for lousy database design. You won't get everything right the first time (as others have noted), but a little planning can give you some breathing room when real-world use reveals the product's demands for PERFORMANCE and SCALABILITY.
Demonstration Schema and Test Data
The following solution was designed on an ORACLE 11g Release 2 RDBMS system. The first table can be represented by a database VIEW, INLINE QUERY, SUB QUERY, MATERIALIZED VIEW or even a CURSOR output, so the "attributes" discussed in this example could be coming from multiple table sources and joining criteria.
CREATE TABLE "ZZ_DATA_ATTRIBUTES"
( "DATA_ID" NUMBER(10,0) NOT NULL ENABLE,
"NAME" VARCHAR2(50),
"AGE" NUMBER(5,0),
"HH_SIZE" NUMBER(5,0),
"SURVEY_SCORE" NUMBER(5,0),
"DMA_REGION" VARCHAR2(100),
"LAST_CONTACT" DATE,
CONSTRAINT "ZZ_DATA_ATTRIBUTES_PK" PRIMARY KEY ("DATA_ID") ENABLE
)
/
CREATE SEQUENCE "ZZ_DATA_ATTRIBUTES_SEQ" MINVALUE 1 MAXVALUE
9999999999999999999999999999 INCREMENT BY 1 START WITH 41 CACHE 20 NOORDER NOCYCLE
/
CREATE OR REPLACE TRIGGER "BI_ZZ_DATA_ATTRIBUTES"
before insert on "ZZ_DATA_ATTRIBUTES"
for each row
begin
if :NEW."DATA_ID" is null then
select "ZZ_DATA_ATTRIBUTES_SEQ".nextval into :NEW."DATA_ID" from sys.dual;
end if;
end;
/
ALTER TRIGGER "BI_ZZ_DATA_ATTRIBUTES" ENABLE
/
The SEQUENCE and TRIGGER objects are just for unique, auto-incremented values for the primary key on each table.
CREATE TABLE "ZZ_CONDITION_RESULTS"
( "RESULT_ID" NUMBER(10,0) NOT NULL ENABLE,
"DATA_ID" NUMBER(10,0) NOT NULL ENABLE,
"COND_ONE" NUMBER(10,0),
"COND_TWO" NUMBER(10,0),
"COND_THREE" NUMBER(10,0),
"COND_FOUR" NUMBER(10,0),
"COND_FIVE" NUMBER(10,0),
CONSTRAINT "ZZ_CONDITION_RESULTS_PK" PRIMARY KEY ("RESULT_ID") ENABLE
)
/
ALTER TABLE "ZZ_CONDITION_RESULTS" ADD CONSTRAINT "ZZ_CONDITION_RESULTS_FK"
FOREIGN KEY ("DATA_ID") REFERENCES "ZZ_DATA_ATTRIBUTES" ("DATA_ID") ENABLE
/
CREATE SEQUENCE "ZZ_CONDITION_RESULTS_SEQ" MINVALUE 1 MAXVALUE
9999999999999999999999999999 INCREMENT BY 1 START WITH 1 CACHE 20 NOORDER NOCYCLE
/
CREATE OR REPLACE TRIGGER "BI_ZZ_CONDITION_RESULTS"
before insert on "ZZ_CONDITION_RESULTS"
for each row
begin
if :NEW."RESULT_ID" is null then
select "ZZ_CONDITION_RESULTS_SEQ".nextval into :NEW."RESULT_ID" from sys.dual;
end if;
end;
/
ALTER TRIGGER "BI_ZZ_CONDITION_RESULTS" ENABLE
/
The table ZZ_CONDITION_RESULTS should be a TABLE type. It will contain the results of each individual boolean OR criteria. While 1000's of columns may not be practically feasible, the initial approach will show how you can line up lots of boolean outputs and be able to quickly identify and isolate the combinations and patterns of interest.
Sample Data
You can pick your own data values, but these were created to make the examples work. I chose the theme of MARKETING, where the data pulled together are different attributes our fictional company has gathered about its customers: customer name, age, hh_size (household size), the scoring results of some benchmarked survey, DMA (Demographic Marketing Area) region, and the date the customer was last contacted.
Defined Boolean Arguments Using an Oracle Package Structure
The initial design is to calculate the business logic through an Oracle PL/SQL Package Object. For example, in the OP:
select * from some_table
where .. and .. or .. or.. or..
or.. or.. or.. or.. or.. or.. (1000 times)
Each blank is a separate Oracle function call from within the package(s). The result is represented as a column value for each record of attributes that are evaluated.
create or replace package ZZ_PKG_MARKETING_DEMO as
c_result_true constant pls_integer:= 1;
c_result_false constant pls_integer:= 0;
cursor attrib_cur is
select data_id, name, age, hh_size, survey_score, dma_region,
last_contact
from zz_data_attributes;
TYPE attrib_record_type IS RECORD (
data_id zz_data_attributes.data_id%TYPE,
name zz_data_attributes.name%TYPE,
age zz_data_attributes.age%TYPE,
hh_size zz_data_attributes.hh_size%TYPE,
survey_score zz_data_attributes.survey_score%TYPE,
dma_region zz_data_attributes.dma_region%TYPE,
last_contact zz_data_attributes.last_contact%TYPE
);
function evaluate_cond_one (
p_attrib_rec attrib_record_type) return pls_integer;
function evaluate_cond_two (
p_attrib_rec attrib_record_type) return pls_integer;
function evaluate_cond_three (
p_attrib_rec attrib_record_type) return pls_integer;
function evaluate_cond_four (
p_attrib_rec attrib_record_type) return pls_integer;
function evaluate_cond_five (
p_attrib_rec attrib_record_type) return pls_integer;
procedure main_driver;
end;
create or replace package body "ZZ_PKG_MARKETING_DEMO" is
function evaluate_cond_one (
p_attrib_rec attrib_record_type) return pls_integer
as
begin
-- Checks if person is from a DMA Region in California.
IF p_attrib_rec.dma_region like 'CA%'
THEN return c_result_true;
ELSE return c_result_false;
END IF;
end EVALUATE_COND_ONE;
function evaluate_cond_two (
p_attrib_rec attrib_record_type) return pls_integer
as
c_begin_age_range constant zz_data_attributes.age%TYPE:= 20;
c_end_age_range constant zz_data_attributes.age%TYPE:= 35;
begin
-- Part 1 of 2 Checks if person belongs to the 20 to 35 years age bracket
IF p_attrib_rec.age between c_begin_age_range and c_end_age_range
THEN return c_result_true;
ELSE return c_result_false;
END IF;
end EVALUATE_COND_TWO;
function evaluate_cond_three (
p_attrib_rec attrib_record_type) return pls_integer
as
c_lowest_age constant zz_data_attributes.age%TYPE:= 45;
begin
-- Part 2 of 2 Checks if person is from age 45 and up demographic.
IF p_attrib_rec.age >= c_lowest_age
THEN return c_result_true;
ELSE return c_result_false;
END IF;
end EVALUATE_COND_THREE;
function evaluate_cond_four (
p_attrib_rec attrib_record_type) return pls_integer
as
c_cutoff_score CONSTANT zz_data_attributes.survey_score%TYPE:= 1200;
begin
-- Checks if person's survey score is higher than c_cutoff_score
IF p_attrib_rec.survey_score >= c_cutoff_score
THEN return c_result_true;
ELSE return c_result_false;
END IF;
end EVALUATE_COND_FOUR;
function evaluate_cond_five (
p_attrib_rec attrib_record_type) return pls_integer
as
c_last_contact_period CONSTANT pls_integer:= -750;
-- Note current date is anchored to a static value so the data output
-- in this example will still work regardless of how old this post
-- may get.
c_current_date CONSTANT zz_data_attributes.last_contact%TYPE:=
to_date('03/25/2014','MM/DD/YYYY');
begin
-- Checks if person's last contact date has been in the last 750
-- days.
IF p_attrib_rec.last_contact >=
(c_current_date + c_last_contact_period)
THEN return c_result_true;
ELSE return c_result_false;
END IF;
end EVALUATE_COND_FIVE;
procedure MAIN_DRIVER
as
v_rec_attr attrib_record_type;
v_rec_cond zz_condition_results%ROWTYPE;
begin
for i in attrib_cur
loop
-- Set the input record variable with the attribute values queried by the
-- current cursor.
v_rec_attr.data_id := i.data_id;
v_rec_attr.name := i.name;
v_rec_attr.age := i.age;
v_rec_attr.hh_size := i.hh_size;
v_rec_attr.survey_score := i.survey_score;
v_rec_attr.dma_region := i.dma_region;
v_rec_attr.last_contact := i.last_contact;
-- Set each condition column value equal to their matching package function.
v_rec_cond.cond_one := evaluate_cond_one(p_attrib_rec => v_rec_attr);
v_rec_cond.cond_two := evaluate_cond_two(p_attrib_rec => v_rec_attr);
v_rec_cond.cond_three:= evaluate_cond_three(p_attrib_rec => v_rec_attr);
v_rec_cond.cond_four := evaluate_cond_four(p_attrib_rec => v_rec_attr);
v_rec_cond.cond_five := evaluate_cond_five(p_attrib_rec => v_rec_attr);
INSERT INTO zz_condition_results (data_id, cond_one, cond_two,
cond_three, cond_four, cond_five)
VALUES
( v_rec_attr.data_id,
v_rec_cond.cond_one,
v_rec_cond.cond_two,
v_rec_cond.cond_three,
v_rec_cond.cond_four,
v_rec_cond.cond_five );
end loop;
COMMIT;
end MAIN_DRIVER;
end "ZZ_PKG_MARKETING_DEMO";
PL/SQL Notes: Some may not be familiar with CUSTOM DATA TYPES such as the RECORD VARIABLE TYPE defined within the package and used in procedure MAIN_DRIVER. They provide easier handling of, and clearer references to, the data being processed.
Boolean Arithmetic in Plain English (well, sort of)
The CURSOR Named ATTRIB_CUR can be modified to operate on a single record or a smaller input data set. For now, invoke the MAIN_DRIVER procedure to process all the records in the attributes data source (again, this doesn't have to be a single table).
BEGIN
ZZ_PKG_MARKETING_DEMO.MAIN_DRIVER;
END;
Now that each example condition has been evaluated for all the sample records, there are several simpler pathways to evaluating the boolean values, currently captured as values of "1" (for TRUE) and "0" (for FALSE).
If only one of this series of conditions needs to be met (as in a long chain of OR operators), then the WHERE clause would look something like this:
WHERE COND_ONE = 1 OR COND_TWO = 1 OR COND_THREE = 1 OR COND_FOUR = 1 OR COND_FIVE = 1
A shorthand approach could be:
WHERE (COND_ONE + COND_TWO + COND_THREE + COND_FOUR + COND_FIVE) > 0
What does this buy? There are performance gains from processing an otherwise static evaluation (the custom conditions) at the time the data record is populated. One good reason is that each subsequent query that asks about these criteria will not need to crunch through the business logic again. We also leverage the advantage of a decision value with a very, very, very low cardinality (TWO!).
The second "shorthand" example of the WHERE filter criteria is a clue about how the final approach will manage "thousands" of Boolean evaluations.
Scalability: How to Do This Several Thousand More Times in a Row
It would be impractical to assume this approach could scale up to the magnitude presented in the OP. The final question: how can this solution apply to an N-thousand chain of boolean values?
Hint: PIVOT your results.
Expandable Table Design for Lots of Boolean Conditions
Here is also a mock-up of the table with the way the sample data would fit into it:
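One possible shape for the ZZ_CONDITION_PIVOT table used in the aggregation query below (the COND_NAME column is an assumption; each boolean condition becomes a row rather than a column, so adding conditions needs no schema change):
CREATE TABLE "ZZ_CONDITION_PIVOT"
   (    "DATA_ID"   NUMBER(10,0) NOT NULL ENABLE,
        "COND_NAME" VARCHAR2(50) NOT NULL ENABLE,
        "RESULT"    NUMBER(1,0)  NOT NULL ENABLE
   )
/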
The SQL needed to fetch a multiple OR relation between the five sample conditions can be accomplished through an aggregation query:
-- For multiple OR relations:
SELECT DATA_ID
FROM ZZ_CONDITION_PIVOT
GROUP BY DATA_ID
HAVING SUM(RESULT) > 0
Veterans will probably note this syntax can be further simplified with the use of database supported ANALYTICAL FUNCTIONS.
This design should be low maintenance with any number of boolean conditions introduced during or after the implementation. The table designs should remain the same throughout.
Let me know your thoughts, it looks like the discussion has moved on to other issues and contributors so this is probably long enough to get you started. Onward!
We have an ASP.NET/MSSQL based web app which generates orders with sequential order numbers.
When a user saves a form, a new order is created as follows:
SELECT MAX(order_number) FROM order_table, call this max_order_number
set new_order_number = max_order_number + 1
INSERT a new order record, with this new_order_number (it's just a field in the order record, not a database key)
If I enclose the above 3 steps in a single transaction, will it avoid duplicate order numbers being created if two customers save a new order at the same time? (And let's say the system is eventually on a web farm with multiple IIS servers and one MSSQL server.)
I want to avoid two customers selecting the same MAX(order_number) due to concurrency somewhere in the system.
What isolation level should be used? Thank you.
Why not just use an Identity as the order number?
Edit:
As far as I know, you can make the current order_number column an Identity (you may have to reset the seed, it's been a while since I've done this). You might want to do some tests.
Here's a good read about what actually goes on when you change a column to an Identity in SSMS. The author mentions how this may take a while if the table already has millions of rows.
Using an identity is by far the best idea. I create all my tables like this:
CREATE TABLE mytable (
mytable_id int identity(1, 1) not null primary key,
name varchar(50)
)
The "identity" flag means, "Let SQL Server assign this number for me". The (1, 1) means that identity numbers should start at 1 and be incremented by 1 each time someone inserts a record into the table. Not Null means that nobody should be allowed to insert a null into this column, and "primary key" means that we should create a clustered index on this column. With this kind of a table, you can then insert your record like this:
-- We don't need to insert into mytable_id column; SQL Server does it for us!
INSERT INTO mytable (name) VALUES ('Bob Roberts')
But to answer your literal question, I can give a lesson about how transactions work. It's certainly possible, although not optimal, to do this:
-- Begin a transaction - this means everything within this region will be
-- executed atomically, meaning that nothing else can interfere.
BEGIN TRANSACTION
DECLARE @id bigint
-- Retrieves the maximum order number from the table
SELECT @id = MAX(order_number) FROM order_table
-- While you are in this transaction, no other queries can change the order table,
-- so this insert statement is guaranteed to succeed
INSERT INTO order_table (order_number) VALUES (@id + 1)
-- Committing the transaction releases your lock and allows other programs
-- to work on the order table
COMMIT TRANSACTION
Just keep in mind that declaring your table with an identity primary key column does this all for you automatically.
The risk is two processes selecting the MAX(order_number) before one of them inserts the new order. A safer way is to do it in one step:
INSERT INTO order_table
    (order_number, /* other fields */)
SELECT MAX(order_number) + 1,
    /* other values */
FROM order_table
I agree with G_M; use an Identity field. When you add your record, just
INSERT INTO order_table (/* other fields */)
VALUES (/* other fields */) ; SELECT SCOPE_IDENTITY()
The return value from Scope Identity will be your order number.
I'm building an ASP.NET MVC 2 site that uses LINQ to SQL. In one of the places where my site accesses the DB, I think a race condition is possible.
DB Architecture
Here are some of the columns of the relevant DB table, named Revisions:
RevisionID - bigint, IDENTITY, PK
PostID - bigint, FK to PK of Posts table
EditNumber - int
RevisionText - nvarchar(max)
On my site, users can submit a Post and edit a Post later on. Users other than the original poster are able to edit a Post - so there is scope for multiple edits on a single Post simultaneously.
When submitting a Post, a record in the Posts table is created, as well as a record in the Revisions table with PostID set to the ID of the Posts record, RevisionText set to the Post text, and EditNumber set to 1.
When editing a Post, only a Revisions record is created, with EditNumber being set to 1 higher than the latest edit number.
Thus, the EditNumber column refers to how many times a Post has been edited.
Incrementing EditNumber
The challenge that I see in implementing those functions is incrementing the EditNumber column. As that column can't be an IDENTITY, I have to manipulate its value manually.
Here's my LINQ query for determining what EditNumber a new Revision should have:
using(var db = new DBDataContext())
{
var rev = new Revision();
rev.EditNumber = db.Revisions.Where(r => r.PostID == postID).Max(r => r.EditNumber) + 1;
// ... (fill other properties)
db.Revisions.InsertOnSubmit(rev);
db.SubmitChanges();
}
Calculating a maximum and incrementing it can lead to a race condition.
Is there a better way to implement that function?
Update directly in the database and return the new revision:
update Revisions
set EditNumber += 1
output INSERTED.EditNumber
where PostID = @postId;
Unfortunately, this is not possible in LINQ. In fact, it is not possible from the client at all, no matter the technology used, short of doing pessimistic locking, which has too many drawbacks to be worth considering.
Updated:
Here is how I would insert a new revision (including first revision):
create procedure usp_insertPostRevision
    @postId int,
    @text nvarchar(max),
    @revisionId bigint output
as
begin
    set nocount on;
    declare @nextEditNumber table (EditNumber int not null);
    declare @rc int = 0;

    begin transaction;
    begin try
        update Posts
            set LastRevision += 1
        output INSERTED.LastRevision
            into @nextEditNumber (EditNumber)
        where PostId = @postId;

        set @rc = @@rowcount;
        if (@rc <> 1)
            raiserror (N'Expected exactly one post with Id:%i. Found:%i',
                16, 1, @postId, @rc);

        insert into Revisions
            (PostId, Text, EditNumber)
        select @postId, @text, EditNumber
        from @nextEditNumber;

        set @revisionId = scope_identity();
        commit;
    end try
    begin catch
        ... // Error handling omitted
    end catch
end
I omitted the error handling; see Exception handling and nested transactions for a template procedure that handles errors and nested transactions properly.
You'll notice the Posts table has a LastRevision field that is used as the increment for the post revisions. This is much better than computing the MAX each time you add a revision, as it avoids a (range) scan of Revisions. It also acts as concurrency protection: only one transaction at a time will be able to update it, and only that transaction will proceed with inserting a new revision. Concurrent transactions will block and wait until the first one commits, and the next unblocked transaction will then correctly bump the revision number by 1.
Can multiple users edit the same post at the same time? If not, then you do not have a race condition, unless somehow a single user can submit multiple edits simultaneously.
If revisions are only permitted by the user who submitted the comment, then you're OK with the above; if multiple users can revise a single comment, then there's scope for problems.
Since there is only one record in the Posts table per Post, use a lock.
Read the record in the Posts table and use a table hint [WITH (ROWLOCK, XLOCK)] to get an exclusive lock. Set the lock timeout to wait a few milliseconds.
If the process gets the lock, then it can add the revision record. If the process cannot get the lock, then have the process try again. After a few retries if the process cannot get a lock, return an error.
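A sketch of that sequence in T-SQL, assuming the table and column names from the question (@postID and @revisionText stand in for values passed from the application; retry handling would live in the calling C# code):
SET LOCK_TIMEOUT 50;   -- give up after ~50 ms instead of waiting indefinitely

BEGIN TRANSACTION;

-- Exclusively lock the single Posts row for this post until COMMIT
SELECT PostID
FROM Posts WITH (ROWLOCK, XLOCK, HOLDLOCK)
WHERE PostID = @postID;

-- Safe to compute the next EditNumber while the lock is held
INSERT INTO Revisions (PostID, EditNumber, RevisionText)
SELECT @postID,
       ISNULL(MAX(EditNumber), 0) + 1,
       @revisionText
FROM Revisions
WHERE PostID = @postID;

COMMIT TRANSACTION;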
Since EditNumber is a property determined by membership in a collection, have the collection provide it.
Make EditNumber a computed column: the COUNT of records for the same post with a lesser RevisionID.
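A computed column in SQL Server can't contain a subquery directly, so one hedged way to realize this idea is a view that derives EditNumber from the revision's position within its post (names other than those in the question are assumptions):
CREATE VIEW dbo.RevisionsNumbered
AS
SELECT RevisionID,
       PostID,
       RevisionText,
       -- 1 for the original revision, 2 for the first edit, and so on
       EditNumber = ROW_NUMBER() OVER (PARTITION BY PostID ORDER BY RevisionID)
FROM dbo.Revisions;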