Configurable Serial Tracking Number Generation - C#

I am working on a C# .NET (& T-SQL) based project which requires generating configurable serial tracking numbers that depend on the data they describe.
Background
To elaborate, the software is used to track vehicles entering a factory. Each vehicle is assigned an RFID card and a tracking number. The tracking number is based on the first of the (possibly many) warehouses to which the truck is headed. The admin can configure the number format for each warehouse, typically something like #YYYYMMDD-XX002211, where XX is a prefix unique to each warehouse. The sequence restarts from 1 whenever the month changes.
At the time of master (vehicle + RFID card details) creation, the tracking number is null. Once the first line (warehouse details) is added, the system generates a tracking number and assigns it to the master. So, until a line is added, the tracking number field can hold duplicates, i.e. many records with null.
The problem
The code that generates the tracking number is written in C# and uses SQL transactions; however, there have been infrequent collisions in the tracking numbers, which makes the system unstable. I researched T-SQL transactions and their isolation level settings, but I have not been successful with them.
Solution?
I have tried the following approaches:
Check in the master table for a possible duplicate before updating
Use Repeatable Read isolation level to ensure data isn't updated during the transaction execution
Should I?
Move the whole thing to a Stored Procedure?
Start a daemon which is responsible for the number generation, ensuring sequence by a single threaded execution
Thanks for your help!

For simplicity, I would suggest using a single stored procedure to update and return the sequence number, then do the rest of the formatting in C#:
CREATE PROCEDURE GetSequenceOfWareHouse
    @WareHouse char(2)
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @Result table (Sequence int)
    BEGIN TRAN
        UPDATE SequenceTable SET Sequence = Sequence + 1
        OUTPUT inserted.Sequence INTO @Result
        WHERE WareHouse = @WareHouse AND Month = YEAR(GETDATE()) * 100 + MONTH(GETDATE())
        IF NOT EXISTS (SELECT * FROM @Result)
            INSERT SequenceTable (Sequence, WareHouse, Month)
            OUTPUT inserted.Sequence INTO @Result
            VALUES (1, @WareHouse, YEAR(GETDATE()) * 100 + MONTH(GETDATE()))
    COMMIT TRAN
    SELECT Sequence FROM @Result
END
GO
CREATE TABLE SequenceTable
(
    Sequence int,
    WareHouse char(2),
    Month int,
    UNIQUE (Sequence, WareHouse, Month)
)
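On the C# side, a minimal calling sketch might look like the following; the connection string, the zero-padding width, and the exact tracking-number layout are assumptions based on the #YYYYMMDD-XX002211 example above, not part of the procedure:

using System;
using System.Data;
using System.Data.SqlClient;

public static class TrackingNumbers
{
    // Sketch only: connection handling, prefix length and padding width are assumptions.
    public static string NextTrackingNumber(string connectionString, string warehousePrefix)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("GetSequenceOfWareHouse", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.Add("@WareHouse", SqlDbType.Char, 2).Value = warehousePrefix;

            conn.Open();
            // The procedure ends with SELECT Sequence FROM @Result, so ExecuteScalar returns the new value.
            int sequence = (int)cmd.ExecuteScalar();

            // e.g. #20240301-XX000042 -- layout taken from the question, padding width assumed.
            return string.Format("#{0:yyyyMMdd}-{1}{2:D6}", DateTime.Now, warehousePrefix, sequence);
        }
    }
}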

Related

Avoid duplicates in SQL Server due to latency

I have a POS-like system in C#, and for a long time it did not present any problem (it was just one POS). But these days there are 4 POS terminals using the system, all connected to the same database, and the sales from every POS go into the same Audit table.
So in this system the procedure is:
Get the last Ticket number (with a simple SELECT).
Add 1 to that number (next ticket number).
Generate an ID Code by injecting this Ticket number (along with the terminal, date, and employee code) into the algorithm.
Insert the sale record into the database with all the necessary information (Date, Client, Employee, IDCode, etc.) (with a simple INSERT INTO).
But with 4 POS terminals I realized that some sales were getting the same Ticket number; fortunately the Ticket ID codes are not the same because the terminal and the employee differ, but how can I avoid this?
Edit 1:
Every POS has a dual mode: in one mode the POS sales are centralized and every POS generates consecutive tickets (as if they were all one POS); in the other mode every POS has its own ticket numbering. For that reason I can't use an identity.
Just use a sequence to generate the next ticket number.
CREATE SEQUENCE Tickets
START WITH 1
INCREMENT BY 1;
Then each POS just does:
SELECT NEXT VALUE FOR Tickets;
The sequence is guaranteed to never return the same number twice.
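From C# each POS could read the next value with a plain ExecuteScalar; a minimal sketch (the connection string is assumed, and CREATE SEQUENCE without an AS clause defaults to bigint):

using System.Data.SqlClient;

static long GetNextTicketNumber(string connectionString)
{
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("SELECT NEXT VALUE FOR Tickets;", conn))
    {
        conn.Open();
        // NEXT VALUE FOR returns the sequence's type; bigint is the default for CREATE SEQUENCE.
        return (long)cmd.ExecuteScalar();
    }
}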
As has been mentioned, if the TicketNumber is sequential and unique, it sounds like an IDENTITY field would be the way to go. BUT, if for some reason there is something preventing that, or if that requires too many changes at this time, you could constrain the process itself to be single-threaded by creating a lock on the ID Code generation process itself through the use of Application Locks (see sp_getapplock and sp_releaseapplock). Application Locks let you create locks around arbitrary concepts. Meaning, you can define the @Resource as "generate_id_code", which will force each caller to wait their turn. It would follow this structure:
BEGIN TRANSACTION;
EXEC sp_getapplock @Resource = 'generate_id_code', @LockMode = 'Exclusive';
...current 4 steps to generate the ID Code...
EXEC sp_releaseapplock @Resource = 'generate_id_code';
COMMIT TRANSACTION;
You need to manage errors / ROLLBACK yourself (as stated in the linked MSDN documentation) so put in the usual TRY / CATCH. But, this does allow you to manage the situation.
Please note: sp_getapplock / sp_releaseapplock should be used sparingly; Application Locks can definitely be very handy (such as in cases like this one) but they should only be used when absolutely necessary.
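Driven from C#, the same structure could look roughly like this; it is only a sketch (connection string and the body of the four generation steps are placeholders), with errors handled by a bare rollback as noted above:

using System;
using System.Data;
using System.Data.SqlClient;

static void GenerateIdCodeSingleThreaded(string connectionString)
{
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        using (var tran = conn.BeginTransaction())
        {
            try
            {
                var getLock = new SqlCommand("sp_getapplock", conn, tran) { CommandType = CommandType.StoredProcedure };
                getLock.Parameters.AddWithValue("@Resource", "generate_id_code");
                getLock.Parameters.AddWithValue("@LockMode", "Exclusive");
                getLock.ExecuteNonQuery();

                // ...current 4 steps to generate the ID Code, all using the same conn/tran...

                var releaseLock = new SqlCommand("sp_releaseapplock", conn, tran) { CommandType = CommandType.StoredProcedure };
                releaseLock.Parameters.AddWithValue("@Resource", "generate_id_code");
                releaseLock.ExecuteNonQuery();

                tran.Commit();
            }
            catch
            {
                tran.Rollback(); // a transaction-owned app lock is released with the rollback
                throw;
            }
        }
    }
}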
You need to do this in an atomic action. So you can wrap everything in a transaction and lock the table. See here for a good discussion on locking etc.
Locking will slow down everything else since everything will start waiting for the table to free up for it to complete and that may not be something you can live with.
Or you could use an identity on the column, which will be managed by the database and maintains unique incrementing numbers.
You could also create your primary key (hope you have one) to be a combination of a few things. And then you could keep a running number for each POS endpoint to see more data about how they are performing. But that gets more into analytics, which isn't in scope here.
I would strongly suggest moving away from the current approach if you can and changing to a GUID PK.
However, I realize that in some cases a redesign is not possible (we have the exact same scenario that you describe in a legacy database).
In this case, you can get the maximum value safely using the UPDLOCK table hint in combination with the insert command and use the OUTPUT INSERTED functionality to retrieve the new primary key value into a local variable if needed:
DECLARE @PK Table (PK INT NOT NULL)
INSERT
INTO Audit (
    TicketNumber,
    Terminal,
    Date,
    EmployeeCode,
    Client,
    IDCode,
    ... other fields )
/* Record the new PK in the table variable */
OUTPUT INSERTED.TicketNumber INTO @PK
SELECT IsNull(MAX(TicketNumber), 0) + 1,
    @Terminal,
    @Date,
    @EmployeeCode,
    @Client,
    @IDCode,
    ... other values
FROM Audit WITH (UPDLOCK)
DECLARE @TicketNumber INT
/* Move the new PK from the table variable into a local variable for subsequent use */
SELECT @TicketNumber = PK
FROM @PK
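If the same batch is sent from C#, the new number can be read back in one round trip; a trimmed sketch with the column list shortened and the table/column names taken from the example above:

using System.Data.SqlClient;

static int InsertSaleAndGetTicketNumber(string connectionString, string terminal, string employeeCode)
{
    const string sql = @"
        DECLARE @PK table (PK INT NOT NULL);

        INSERT INTO Audit (TicketNumber, Terminal, EmployeeCode /* , other fields */)
        OUTPUT INSERTED.TicketNumber INTO @PK
        SELECT ISNULL(MAX(TicketNumber), 0) + 1, @Terminal, @EmployeeCode /* , other values */
        FROM Audit WITH (UPDLOCK);

        SELECT PK FROM @PK;";

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.Parameters.AddWithValue("@Terminal", terminal);
        cmd.Parameters.AddWithValue("@EmployeeCode", employeeCode);
        conn.Open();
        return (int)cmd.ExecuteScalar();
    }
}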

Where to process data? Db or locally?

We are working on a solution which fires many search requests towards three different public databases placed in three different countries. For example, a search fetches data from one db and passes it as a parameter to another db. The parameter is a list in which each item needs to be logically connected with an OR operator. Therefore we end up having an SQL SELECT statement with up to 1000 OR operators linked inside the WHERE clause.
Now my question: do 1000, 500, or even 5000 logical AND or OR operators inside a SELECT statement make the db slower, and should I instead request all the data to my PC and do the matching there?
The amount of data is between 5000 and 10000 records; we are talking about a public db, so the amount keeps growing.
For example such a sql statement:
select * from some_table
where .. and .. or .. or.. or..
or.. or.. or.. or.. or.. or.. (1000 times)
If I fetch all data to my pc I could have a LINQ Statement that does the filtering.
What do you suggest I do? Any experience with this one, guys?
Sorry if this is a duplicate just let me know in comments and I'll delete this question.
EDIT:
It should be considered that many users may access the databases at the same time.
I always learned that running a query with hundreds of OR conditions is bad for performance. However, even when running a sample here on 12g, querying a table with OR or IN using a primary key index doesn't seem to change the execution plan.
Therefore I say: it doesn't matter. The only things you could consider are readability, query length, etc.
Still, I personally prefer the WHERE ... IN.
See this other useful question with sample data.
Process this all in the database with a single query. Batching similar operations is usually the best thing you can do for database performance.
The most expensive part of the query is reading the data from disk. Once the data is in memory, filtering out a few thousand conditions is a small amount of work. Your local processor probably is faster than the database server. But it doesn't matter because your machine would spend too much time on unnecessary IO if you returned all the records.
Also, 5000 conditions in a SQL query is only a problem if you run that query a hundred times a second.
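As a hedged illustration of keeping the filter server-side rather than pulling everything to the client, the OR chain can be expressed as a parameterized IN list; the table and column names below are invented for the sketch, SqlClient stands in for whatever provider the public databases actually use, and with thousands of keys the driver's parameter limit may force batching or a table-valued parameter:

using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;

static List<int> FindMatchingIds(string connectionString, IReadOnlyList<int> keys)
{
    // Build "@k0, @k1, ..." so the database does the filtering instead of the client.
    var names = keys.Select((_, i) => "@k" + i).ToArray();
    var sql = "SELECT id FROM some_table WHERE id IN (" + string.Join(", ", names) + ")";

    var result = new List<int>();
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(sql, conn))
    {
        for (int i = 0; i < keys.Count; i++)
            cmd.Parameters.AddWithValue(names[i], keys[i]);

        conn.Open();
        using (var reader = cmd.ExecuteReader())
            while (reader.Read())
                result.Add(reader.GetInt32(0));
    }
    return result;
}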
I think you should just try.
Create an example that is as simple as possible, yet complex enough to be realistic, and then run it with some form of benchmarking.
Whatever works best for you is what you should choose to do.
Edit:
That said - such a large number of ANDs and ORs in a single SQL statement does sound complicated and messy. Unless there is a real benefit from doing it this way(?), I would probably try to find a cleaner way to do this, for instance by splitting the operation into several steps and applying LINQ or something similar, as you suggest, even if it is just to make the solution more manageable.
The answer is: it depends.
How big is the data on the public db? If you are querying something Google-sized, then fetching all the data is not an option.
It would be reasonable to assume that those public dbs have much stronger hardware and db tuning than your home PC.
Is there a chance that you will get blacklisted from those public dbs?
Does order matter? Is querying db 1 and then db 2 faster than querying db 2 and then db 1?
Mostly it's trial & error and whatever works best for you and is possible.
SQL Queries on ORACLE Using Multiple Boolean Operators
Comments: I have worked many years with CRYSTAL REPORTS, a database report designer. It was one of the first drag-and-drop, GUI-based tools which made it easier for developers without much database background to construct queries with multiple tables and filter conditions. The trade-off was that the tool was writing SQL under the hood; many times it was a serious performance hog because the workstation running the report file had to suck down the entire contents of the database tables being queried, only to run the filtering process locally on the client system. That was more than a decade ago, but I see other next-gen tools that also auto-generate really awful SQL code.
No amount of software can compensate for lousy database design. You won't get everything right the first time (as others have noticed), but a little planning can give some breathing room when the product reveals under real-world use the demands of PERFORMANCE and SCALABILITY.
Demonstration Schema and Test Data
The following solution was designed on an ORACLE 11g Release 2 RDBMS system. The first table can be represented by a database VIEW, INLINE QUERY, SUB QUERY, MATERIALIZED VIEW or even a CURSOR output, so the "attributes" discussed in this example could be coming from multiple table sources and joining criteria.
CREATE TABLE "ZZ_DATA_ATTRIBUTES"
( "DATA_ID" NUMBER(10,0) NOT NULL ENABLE,
"NAME" VARCHAR2(50),
"AGE" NUMBER(5,0),
"HH_SIZE" NUMBER(5,0),
"SURVEY_SCORE" NUMBER(5,0),
"DMA_REGION" VARCHAR2(100),
"LAST_CONTACT" DATE,
CONSTRAINT "ZZ_DATA_ATTRIBUTES_PK" PRIMARY KEY ("DATA_ID") ENABLE
)
/
CREATE SEQUENCE "ZZ_DATA_ATTRIBUTES_SEQ" MINVALUE 1 MAXVALUE
9999999999999999999999999999 INCREMENT BY 1 START WITH 41 CACHE 20 NOORDER NOCYCLE
/
CREATE OR REPLACE TRIGGER "BI_ZZ_DATA_ATTRIBUTES"
before insert on "ZZ_DATA_ATTRIBUTES"
for each row
begin
if :NEW."DATA_ID" is null then
select "ZZ_DATA_ATTRIBUTES_SEQ".nextval into :NEW."DATA_ID" from sys.dual;
end if;
end;
/
ALTER TRIGGER "BI_ZZ_DATA_ATTRIBUTES" ENABLE
/
The SEQUENCE and TRIGGER objects are just for unique, auto-incremented values for the primary key on each table.
CREATE TABLE "ZZ_CONDITION_RESULTS"
( "RESULT_ID" NUMBER(10,0) NOT NULL ENABLE,
"DATA_ID" NUMBER(10,0) NOT NULL ENABLE,
"COND_ONE" NUMBER(10,0),
"COND_TWO" NUMBER(10,0),
"COND_THREE" NUMBER(10,0),
"COND_FOUR" NUMBER(10,0),
"COND_FIVE" NUMBER(10,0),
CONSTRAINT "ZZ_CONDITION_RESULTS_PK" PRIMARY KEY ("RESULT_ID") ENABLE
)
/
ALTER TABLE "ZZ_CONDITION_RESULTS" ADD CONSTRAINT "ZZ_CONDITION_RESULTS_FK"
FOREIGN KEY ("DATA_ID") REFERENCES "ZZ_DATA_ATTRIBUTES" ("DATA_ID") ENABLE
/
CREATE SEQUENCE "ZZ_CONDITION_RESULTS_SEQ" MINVALUE 1 MAXVALUE
9999999999999999999999999999 INCREMENT BY 1 START WITH 1 CACHE 20 NOORDER NOCYCLE
/
CREATE OR REPLACE TRIGGER "BI_ZZ_CONDITION_RESULTS"
before insert on "ZZ_CONDITION_RESULTS"
for each row
begin
if :NEW."RESULT_ID" is null then
select "ZZ_CONDITION_RESULTS_SEQ".nextval into :NEW."RESULT_ID" from sys.dual;
end if;
end;
/
ALTER TRIGGER "BI_ZZ_CONDITION_RESULTS" ENABLE
/
The table ZZ_CONDITION_RESULTS should be a TABLE type. It will contain the results of each individual boolean OR criteria. While 1000's of columns may not be practically feasible, the initial approach will show how you can line up lots of boolean outputs and be able to quickly identify and isolate the combinations and patterns of interest.
Sample Data
You can pick your own data values, but these were created to make the examples work. I chose the theme of MARKETING, where the data pulled together are different attributes our fictional company has gathered about their customers: customer name, age, hh_size (Household Size), the scoring results of some benchmarked survey, DMA (Demographic Marketing Area) Region, and the date the customer was last contacted.
Defined Boolean Arguments Using an Oracle Package Structure
The initial design is to calculate the business logic through an Oracle PL/SQL Package Object. For example, in the OP:
select * from some_table
where .. and .. or .. or.. or..
or.. or.. or.. or.. or.. or.. (1000 times)
Each blank is a separate Oracle function call from within the package(s). The result is represented as a column value for each record of attributes that are evaluated.
create or replace package ZZ_PKG_MARKETING_DEMO as
c_result_true constant pls_integer:= 1;
c_result_false constant pls_integer:= 0;
cursor attrib_cur is
select data_id, name, age, hh_size, survey_score, dma_region,
last_contact
from zz_data_attributes;
TYPE attrib_record_type IS RECORD (
data_id zz_data_attributes.data_id%TYPE,
name zz_data_attributes.name%TYPE,
age zz_data_attributes.age%TYPE,
hh_size zz_data_attributes.hh_size%TYPE,
survey_score zz_data_attributes.survey_score%TYPE,
dma_region zz_data_attributes.dma_region%TYPE,
last_contact zz_data_attributes.last_contact%TYPE
);
function evaluate_cond_one (
p_attrib_rec attrib_record_type) return pls_integer;
function evaluate_cond_two (
p_attrib_rec attrib_record_type) return pls_integer;
function evaluate_cond_three (
p_attrib_rec attrib_record_type) return pls_integer;
function evaluate_cond_four (
p_attrib_rec attrib_record_type) return pls_integer;
function evaluate_cond_five (
p_attrib_rec attrib_record_type) return pls_integer;
procedure main_driver;
end;
create or replace package body "ZZ_PKG_MARKETING_DEMO" is
function evaluate_cond_one (
p_attrib_rec attrib_record_type) return pls_integer
as
begin
-- Checks if person is from a DMA Region in California.
IF p_attrib_rec.dma_region like 'CA%'
THEN return c_result_true;
ELSE return c_result_false;
END IF;
end EVALUATE_COND_ONE;
function evaluate_cond_two (
p_attrib_rec attrib_record_type) return pls_integer
as
c_begin_age_range constant zz_data_attributes.age%TYPE:= 20;
c_end_age_range constant zz_data_attributes.age%TYPE:= 35;
begin
-- Part 1 of 2 Checks if person belongs to the 20 to 35 years age bracket
IF p_attrib_rec.age between c_begin_age_range and c_end_age_range
THEN return c_result_true;
ELSE return c_result_false;
END IF;
end EVALUATE_COND_TWO;
function evaluate_cond_three (
p_attrib_rec attrib_record_type) return pls_integer
as
c_lowest_age constant zz_data_attributes.age%TYPE:= 45;
begin
-- Part 2 of 2 Checks if person is from age 45 and up demographic.
IF p_attrib_rec.age >= c_lowest_age
THEN return c_result_true;
ELSE return c_result_false;
END IF;
end EVALUATE_COND_THREE;
function evaluate_cond_four (
p_attrib_rec attrib_record_type) return pls_integer
as
c_cutoff_score CONSTANT zz_data_attributes.survey_score%TYPE:= 1200;
begin
-- Checks if person's survey score is higher than c_cutoff_score
IF p_attrib_rec.survey_score >= c_cutoff_score
THEN return c_result_true;
ELSE return c_result_false;
END IF;
end EVALUATE_COND_FOUR;
function evaluate_cond_five (
p_attrib_rec attrib_record_type) return pls_integer
as
c_last_contact_period CONSTANT pls_integer:= -750;
-- Note current date is anchored to a static value so the data output
-- in this example will still work regardless of how old this post
-- may get.
c_current_date CONSTANT zz_data_attributes.last_contact%TYPE:=
to_date('03/25/2014','MM/DD/YYYY');
begin
-- Checks if person's last contact date has been in the last 750
-- days.
IF p_attrib_rec.last_contact >=
(c_current_date + c_last_contact_period)
THEN return c_result_true;
ELSE return c_result_false;
END IF;
end EVALUATE_COND_FIVE;
procedure MAIN_DRIVER
as
v_rec_attr attrib_record_type;
v_rec_cond zz_condition_results%ROWTYPE;
begin
for i in attrib_cur
loop
-- Set the input record variable with the attribute values queried by the
-- current cursor.
v_rec_attr.data_id := i.data_id;
v_rec_attr.name := i.name;
v_rec_attr.age := i.age;
v_rec_attr.hh_size := i.hh_size;
v_rec_attr.survey_score := i.survey_score;
v_rec_attr.dma_region := i.dma_region;
v_rec_attr.last_contact := i.last_contact;
-- Set each condition column value equal to their matching package function.
v_rec_cond.cond_one := evaluate_cond_one(p_attrib_rec => v_rec_attr);
v_rec_cond.cond_two := evaluate_cond_two(p_attrib_rec => v_rec_attr);
v_rec_cond.cond_three:= evaluate_cond_three(p_attrib_rec => v_rec_attr);
v_rec_cond.cond_four := evaluate_cond_four(p_attrib_rec => v_rec_attr);
v_rec_cond.cond_five := evaluate_cond_five(p_attrib_rec => v_rec_attr);
INSERT INTO zz_condition_results (data_id, cond_one, cond_two,
cond_three, cond_four, cond_five)
VALUES
( v_rec_attr.data_id,
v_rec_cond.cond_one,
v_rec_cond.cond_two,
v_rec_cond.cond_three,
v_rec_cond.cond_four,
v_rec_cond.cond_five );
end loop;
COMMIT;
end MAIN_DRIVER;
end "ZZ_PKG_MARKETING_DEMO";
PL/SQL Notes: Some may not be familiar the CUSTOM DATA TYPES such as the RECORD VARIABLE TYPE defined within the package in procedure MAIN_DRIVER. They provide easier to handle and reference identification of the data being processed.
Boolean Arithmetic in Plain English (well, sort of)
The CURSOR Named ATTRIB_CUR can be modified to operate on a single record or a smaller input data set. For now, invoke the MAIN_DRIVER procedure to process all the records in the attributes data source (again, this doesn't have to be a single table).
BEGIN
ZZ_PKG_MARKETING_DEMO.MAIN_DRIVER;
END;
Now that each example condition has been evaluated for all the sample records, there are several simpler pathways to evaluating the boolean values, currently captured as values of "1" (for TRUE) and "0" (for FALSE).
If only one of this series of conditions need to be met (as in a long chain of OR operators), then the WHERE clause should look something like this:
WHERE COND_ONE = 1 OR COND_TWO = 1 OR COND_THREE = 1 OR COND_FOUR = 1 OR COND_FIVE = 1
A shorthand approach could be:
WHERE (COND_ONE + COND_TWO + COND_THREE + COND_FOUR + COND_FIVE) > 0
What does this buy? There are performance gains by processing an otherwise static evaluation (the custom conditions) at the time that the data record is populated. One good reason is that each subsequent query that asks about this criteria will not need to crunch through the business logic again. We also leverage an advantage through a decision value with a very, very, very low cardinality (TWO!)
The second "shorthand" example of the WHERE filter criteria is a clue about how the final approach will manage "thousands" of Boolean evaluations.
Scalability: How to Do This Several Thousand More Times in a Row
It would be impractical to assume this approach could scale up to the magnitude presented in the OP. The final question: How can this solution apply for an N thousand chain of boolean values?
Hint: PIVOT your results.
Expandable Table Design for Lots of Boolean Conditions
The pivoted table (ZZ_CONDITION_PIVOT below) holds one row per DATA_ID and condition, with a RESULT column of 1 or 0, so the sample data above maps straight into it.
The SQL needed to fetch a multiple OR relation between the five sample conditions can be accomplished through an aggregation query:
-- For multiple OR relations:
SELECT DATA_ID
FROM ZZ_CONDITION_PIVOT
GROUP BY DATA_ID
HAVING SUM(RESULT) > 0
Veterans will probably note this syntax can be further simplified with the use of database supported ANALYTICAL FUNCTIONS.
This design should be low maintenance with any number of boolean conditions introduced during or after the implementation. The table designs should remain the same throughout.
Let me know your thoughts, it looks like the discussion has moved on to other issues and contributors so this is probably long enough to get you started. Onward!

How do I structure this transaction?

We have an ASP.NET/MSSQL based web app which generates orders with sequential order numbers.
When a user saves a form, a new order is created as follows:
SELECT MAX(order_number) FROM order_table, call this max_order_number
set new_order_number = max_order_number + 1
INSERT a new order record, with this new_order_number (it's just a field in the order record, not a database key)
If I enclose the above 3 steps in single transaction, will it avoid duplicate order numbers from being created, if two customers save a new order at the same time? (And let's say the system is eventually on a web farm with multiple IIS servers and one MSSQL server).
I want to avoid two customers selecting the same MAX(order_number) due to concurrency somewhere in the system.
What isolation level should be used? Thank you.
Why not just use an Identity as the order number?
Edit:
As far as I know, you can make the current order_number column an Identity (you may have to reset the seed, it's been a while since I've done this). You might want to do some tests.
Here's a good read about what actually goes on when you change a column to an Identity in SSMS. The author mentions how this may take a while if the table already has millions of rows.
Using an identity is by far the best idea. I create all my tables like this:
CREATE TABLE mytable (
mytable_id int identity(1, 1) not null primary key,
name varchar(50)
)
The "identity" flag means, "Let SQL Server assign this number for me". The (1, 1) means that identity numbers should start at 1 and be incremented by 1 each time someone inserts a record into the table. Not Null means that nobody should be allowed to insert a null into this column, and "primary key" means that we should create a clustered index on this column. With this kind of a table, you can then insert your record like this:
-- We don't need to insert into mytable_id column; SQL Server does it for us!
INSERT INTO mytable (name) VALUES ('Bob Roberts')
But to answer your literal question, I can give a lesson about how transactions work. It's certainly possible, although not optimal, to do this:
-- Begin a transaction - this means everything within this region will be
-- executed atomically, meaning that nothing else can interfere.
BEGIN TRANSACTION
DECLARE @id bigint
-- Retrieves the maximum order number from the table
SELECT @id = MAX(order_number) FROM order_table
-- While you are in this transaction (with a strict enough isolation level,
-- e.g. SERIALIZABLE), no other queries can change the order table, so this
-- insert statement will not produce a duplicate
INSERT INTO order_table (order_number) VALUES (@id + 1)
-- Committing the transaction releases your lock and allows other programs
-- to work on the order table
COMMIT TRANSACTION
Just keep in mind that declaring your table with an identity primary key column does this all for you automatically.
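For completeness, here is a C# sketch of the literal transaction approach; the isolation level is raised to Serializable on the assumption that the MAX() read must stay locked until the insert commits (one of two concurrent callers may then deadlock and need a retry), which is exactly why the identity column above is the simpler answer:

using System.Data;
using System.Data.SqlClient;

static int CreateOrder(string connectionString)
{
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        // Serializable keeps the range read by MAX() locked until COMMIT, so two
        // concurrent callers cannot both read the same maximum.
        using (var tran = conn.BeginTransaction(IsolationLevel.Serializable))
        {
            var cmd = new SqlCommand(
                @"DECLARE @id int;
                  SELECT @id = ISNULL(MAX(order_number), 0) + 1 FROM order_table;
                  INSERT INTO order_table (order_number) VALUES (@id);
                  SELECT @id;", conn, tran);

            int newOrderNumber = (int)cmd.ExecuteScalar();
            tran.Commit();
            return newOrderNumber;
        }
    }
}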
The risk is two processes selecting the MAX(order_number) before one of them inserts the new order. A safer way is to do it in one step:
INSERT INTO order_table
    (order_number /* , other fields */)
SELECT MAX(order_number) + 1
    /* , other values */
FROM order_table
I agree with G_M; use an Identity field. When you add your record, just
INSERT INTO order_table (/* other fields */)
VALUES (/* other fields */) ; SELECT SCOPE_IDENTITY()
The return value from Scope Identity will be your order number.
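From C# that is a single round trip; a small sketch where the customer column is a placeholder for your real fields:

using System.Data.SqlClient;

static int InsertOrder(string connectionString, string customer)
{
    const string sql =
        "INSERT INTO order_table (customer) VALUES (@customer); " +
        "SELECT CAST(SCOPE_IDENTITY() AS int);";

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.Parameters.AddWithValue("@customer", customer);
        conn.Open();
        return (int)cmd.ExecuteScalar(); // the identity value is the new order number
    }
}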

Concurrent access to database - preventing two users from obtaining the same value

I have a table with sequential numbers (think invoice numbers or student IDs).
At some point, the user needs to request the previous number (in order to calculate the next number). Once the user knows the current number, they need to generate the next number and add it to the table.
My worry is that two users will be able to erroneously generate two identical numbers due to concurrent access.
I've heard of stored procedures, and I know that that might be one solution. Is there a best-practice here, to avoid concurrency issues?
Edit: Here's what I have so far:
USE [master]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[sp_GetNextOrderNumber]
AS
BEGIN
BEGIN TRAN
DECLARE @recentYear INT
DECLARE @recentMonth INT
DECLARE @recentSequenceNum INT
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- get the most recent numbers
SELECT @recentYear = Year, @recentMonth = Month, @recentSequenceNum = OrderSequenceNumber
FROM dbo.OrderNumbers
WITH (XLOCK)
WHERE Id = (SELECT MAX(Id) FROM dbo.OrderNumbers)
-- increment the numbers
IF (YEAR(getDate()) > IsNull(@recentYear, 0))
BEGIN
    SET @recentYear = YEAR(getDate());
    SET @recentMonth = MONTH(getDate());
    SET @recentSequenceNum = 0;
END
ELSE
BEGIN
    IF (MONTH(getDate()) > IsNull(@recentMonth, 0))
    BEGIN
        SET @recentMonth = MONTH(getDate());
        SET @recentSequenceNum = 0;
    END
    ELSE
        SET @recentSequenceNum = @recentSequenceNum + 1;
END
-- insert the new numbers as a new record
INSERT INTO dbo.OrderNumbers(Year, Month, OrderSequenceNumber)
VALUES (@recentYear, @recentMonth, @recentSequenceNum)
COMMIT TRAN
END
This seems to work, and gives me the values I want. So far, I have not yet added any locking to prevent concurrent access.
Edit 2: Added WITH(XLOCK) to lock the table until the transaction completes. I'm not going for performance here. As long as I don't get duplicate entries added, and deadlocks don't happen, this should work.
You know that SQL Server does that for you, right? You can use an identity column if you need a sequential number, or a computed column if you need to calculate the new value based on another one.
But if that doesn't solve your problem, or if you need to do a complicated calculation to generate your new number that can't be done in a simple insert, I suggest writing a stored procedure that locks the table, gets the last value, generates the new one, inserts it, and then unlocks the table.
Read this link to learn about transaction isolation level
just make sure to keep the "locking" period as small as possible
Here is a sample Counter implementation. The basic idea is to use an insert trigger to update the numbers of, let's say, invoices. The first step is to create a table to hold the value of the last assigned number:
create table [Counter]
(
LastNumber int
)
and initialize it with single row:
insert into [Counter] values(0)
Sample invoice table:
create table invoices
(
InvoiceID int identity primary key,
Number varchar(8),
InvoiceDate datetime
)
The stored procedure LastNumber first updates the Counter row and then retrieves the value. As the value is an int, it is simply returned as the procedure's return value; otherwise an output parameter would be required. The procedure takes the number of next numbers to fetch as a parameter; the output is the last number.
create proc LastNumber (@NumberOfNextNumbers int = 1)
as
begin
    declare @LastNumber int
    update [Counter]
    set LastNumber = LastNumber + @NumberOfNextNumbers -- Holds update lock
    select @LastNumber = LastNumber
    from [Counter]
    return @LastNumber
end
The trigger on the Invoices table gets the number of simultaneously inserted invoices, asks the stored procedure for that many next numbers, and updates the invoices with those numbers.
create trigger InvoiceNumberTrigger on Invoices
after insert
as
    set NoCount ON
    declare @InvoiceID int
    declare @LastNumber int
    declare @RowsAffected int
    select @RowsAffected = count(*)
    from Inserted
    exec @LastNumber = dbo.LastNumber @RowsAffected
    update Invoices
    -- Year/month parts of number are missing
    set Number = right ('000' + ltrim(str(@LastNumber - rowNumber)), 3)
    from Invoices
    inner join
    ( select InvoiceID,
             row_number () over (order by InvoiceID desc) - 1 rowNumber
      from Inserted
    ) insertedRows
    on Invoices.InvoiceID = InsertedRows.InvoiceID
In case of a rollback there will be no gaps left. Counter table could be easily expanded with keys for different sequences; in this case, a date valid-until might be nice because you might prepare this table beforehand and let LastNumber worry about selecting the counter for current year/month.
Example of usage:
insert into invoices (invoiceDate) values(GETDATE())
As the Number column's value is auto-generated, one should re-read it after the insert. I believe EF has provisions for that.
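If the insert is done with plain ADO.NET rather than EF, re-reading the trigger-generated Number could look roughly like this (table and column names as in the sample above; connection string assumed):

using System;
using System.Data.SqlClient;

static string InsertInvoice(string connectionString, DateTime invoiceDate)
{
    const string sql = @"
        INSERT INTO invoices (InvoiceDate) VALUES (@InvoiceDate);
        -- The after-insert trigger has already filled Number; read it back via the new identity value.
        SELECT Number FROM invoices WHERE InvoiceID = SCOPE_IDENTITY();";

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.Parameters.AddWithValue("@InvoiceDate", invoiceDate);
        conn.Open();
        return (string)cmd.ExecuteScalar();
    }
}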
The way that we handle this in SQL Server is by using the UPDLOCK table hint within a single transaction.
For example:
INSERT
INTO MyTable (
MyNumber ,
MyField1 )
SELECT IsNull(MAX(MyNumber), 0) + 1,
'Test'
FROM MyTable WITH (UPDLOCK)
It's not pretty, but since we were provided the database design and cannot change it due to legacy applications accessing the database, this was the best solution that we could come up with.

MySql Batching Stored Procedure Calls with .Net / Connector?

Is there a way to batch stored procedure calls in MySql with the .Net / Connector to increase performance?
Here's the scenario... I'm using a stored procedure that accepts a few parameters as input. This procedure basically checks to see whether an existing record should be updated or a new one inserted (I'm not using INSERT INTO .. ON DUPLICATE KEY UPDATE because the check involves date ranges, so I can't really make a primary key out of the criteria).
I want to call this procedure a lot of times (let's say batches of 1000 or so). I can of course, use one MySqlConnection and one MySqlCommand instance and keep changing the parameter values, and calling .ExecuteNonQuery().
I'm wondering if there's a better way to batch these calls?
The only thought that comes to mind is to manually construct a string like 'call sp_myprocedure(@parama_1, @paramb_1); call sp_myprocedure(@parama_2, @paramb_2); ...', and then create all the appropriate parameters. I'm not convinced this will be any better than calling .ExecuteNonQuery() a bunch of times.
Any advice? Thanks!
EDIT: More info
I'm actually trying to store data from an external data source on a regular basis. Basically I'm taking RSS feeds of domain auctions (from various sources like GoDaddy, Pool, etc.), and updating a table with the auction info using this stored procedure (let's call it sp_storeSale). Now, in the table where the sale info gets stored, I want to keep historical records of sales for a given domain, so I have a domain table and a sale table. The sale table has a many-to-one relationship with the domain table.
Here's the stored procedure:
-- --------------------------------------------------------------------------------
-- Routine DDL
-- Note: comments before and after the routine body will not be stored by the server
-- --------------------------------------------------------------------------------
DELIMITER $$
CREATE PROCEDURE `DomainFace`.`sp_storeSale`
(
middle VARCHAR(63),
extension VARCHAR(10),
brokerId INT,
endDate DATETIME,
url VARCHAR(500),
category INT,
saleType INT,
priceOrBid DECIMAL(10, 2),
currency VARCHAR(3)
)
BEGIN
DECLARE existingId BIGINT DEFAULT NULL;
DECLARE domainId BIGINT DEFAULT 0;
SET @domainId = fn_getDomainId(@middle, @extension);
SET @existingId = (
    SELECT id FROM sale
    WHERE
        domainId = @domainId
        AND brokerId = @brokerId
        AND UTC_TIMESTAMP() BETWEEN startDate AND endDate
);
IF @existingId IS NOT NULL THEN
    UPDATE sale SET
        endDate = @endDate,
        url = @url,
        category = @category,
        saleType = @saleType,
        priceOrBid = @priceOrBid,
        currency = @currency
    WHERE
        id = @existingId;
ELSE
    INSERT INTO sale (domainId, brokerId, startDate, endDate, url,
        category, saleType, priceOrBid, currency)
    VALUES (@domainId, @brokerId, UTC_TIMESTAMP(), @endDate, @url,
        @category, @saleType, @priceOrBid, @currency);
END IF;
END
As you can see, I'm basically looking for an existing record that is not 'expired', but has the same domain, and broker, in which case I assume the auction is not over yet, and the data is an update to the existing auction. Otherwise, I assume the auction is over, it is a historical record, and the data I've got is for a new auction, so I create a new record.
Hope that clears up what I'm trying to achieve :)
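For reference, here is a rough sketch of the multi-CALL batching idea mentioned above, using MySql.Data.MySqlClient and numbered parameter names; only two of sp_storeSale's parameters are shown, and I'm not sure it actually beats repeated ExecuteNonQuery calls without measuring:

using System.Collections.Generic;
using System.Text;
using MySql.Data.MySqlClient;

static void StoreSalesBatch(string connectionString, IReadOnlyList<(string Middle, string Extension)> sales)
{
    var sql = new StringBuilder();
    using (var conn = new MySqlConnection(connectionString))
    using (var cmd = new MySqlCommand())
    {
        for (int i = 0; i < sales.Count; i++)
        {
            // One CALL per row, each with its own parameter suffix.
            sql.AppendFormat("CALL sp_storeSale(@middle_{0}, @extension_{0} /* , ... */);", i);
            cmd.Parameters.AddWithValue("@middle_" + i, sales[i].Middle);
            cmd.Parameters.AddWithValue("@extension_" + i, sales[i].Extension);
        }

        cmd.Connection = conn;
        cmd.CommandText = sql.ToString();
        conn.Open();
        cmd.ExecuteNonQuery(); // batch must stay under max_allowed_packet; split into chunks if needed
    }
}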
I'm not entirely sure what you're trying to do, but it sounds kinda housekeeping or maintenance related, so I won't be too ashamed of posting the following suggestion.
Why don't you move all of your logic into the database and process it all server-side?
The following example uses a cursor (shock/horror) but it's perfectly acceptable to use them in such circumstances.
If you can avoid using cursors at all - great, but the main point of my suggestion is about moving the logic from your application tier back into the data tier to save on the round trips. You'd call the following sproc once and it would process the entire range of data in a single call.
call house_keeping(curdate() - interval 1 month, curdate());
Also, if you can provide just a bit more information about what you're trying to do we might be able to suggest other approaches.
Example stored procedure
drop procedure if exists house_keeping;
delimiter #
create procedure house_keeping
(
in p_start_date date,
in p_end_date date
)
begin
declare v_done tinyint default 0;
declare v_id int unsigned;
declare v_expired_date date;
declare v_cur cursor for
select id, expired_date from foo where
expired_date between p_start_date and p_end_date;
declare continue handler for not found set v_done = 1;
open v_cur;
repeat
fetch v_cur into v_id, v_expired_date;
/*
if <some condition> then
insert ...
else
update ...
end if;
*/
until v_done end repeat;
close v_cur;
end #
delimiter ;
Just in case you think I'm completely mad for suggesting cursors, you might want to read this:
Optimal MySQL settings for queries that deliver large amounts of data?
Hope this helps :)
