Entity Framework, random query paging - c#

This is what I want to achieve:
1. Query my db to return a list of entities.
2. Randomize the list.
3. Store the IDs of the items received for future queries.
4. Run a new query on the same table where the IDs are in the list that I have stored.
5. Order the results by the list that I have stored.
I have managed steps 1 through 4 already, but step 5 is difficult. Can anyone help me with a query like this:
SELECT *
FROM table_name
WHERE id IN (1,2,3,4....)
ORDER BY (1,2,3,4....)
Thanks in advance

Try
SELECT table_name.*
FROM crazy_sorted_table
LEFT JOIN
table_name ON crazy_sorted_table.ID=table_name.ID

A normal join (equi join) should do the trick. Here is a sample approach I tested:
/**crazyOrder filled 100 rows with random value from 1-250 in Id**/
CREATE TABLE [dbo].[crazyOrder] (
[Id] INT NOT NULL,
[Area] VARCHAR (50) NULL,
PRIMARY KEY CLUSTERED ([Id] ASC)
);
/**Normal order is filled with value from 1-100 sequentially in id**/
CREATE TABLE [dbo].[normalOrder] (
[Id] INT NOT NULL,
[Name] VARCHAR (50) NULL,
PRIMARY KEY CLUSTERED ([Id] ASC)
);
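create table #tempOrder
(pos int identity(1,1), id int)
insert into #tempOrder (id)
Select top 10 Id
from crazyOrder
order by NewID()
go
Select n.*
from normalOrder n
join #tempOrder t
on t.id = n.id
order by t.pos
I was able to retrieve the rows in the same order as in the temp table (I used a data generator for the values). Note that SQL Server only guarantees the order of a result set when the query has an ORDER BY, so the identity column pos records each id's position in the random list and the final query sorts on it.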
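If the rows are fetched with a plain WHERE id IN (...) query, the stored order can also be restored on the client. The following is a minimal Python sketch of that idea, not the Entity Framework API; the function name order_by_id_list and the sample rows are made up for illustration.

```python
def order_by_id_list(rows, ordered_ids):
    """Return the rows whose "id" appears in ordered_ids, in that list's order."""
    # Map each stored id to its position in the randomized list.
    position = {id_: index for index, id_ in enumerate(ordered_ids)}
    # Keep only rows that were asked for, then sort by stored position.
    matching = [row for row in rows if row["id"] in position]
    return sorted(matching, key=lambda row: position[row["id"]])


# Hypothetical data: the ids saved after the first randomized query ...
stored_ids = [42, 7, 19, 3]
# ... and the rows a later "WHERE id IN (...)" query returned, in arbitrary order.
fetched = [
    {"id": 3, "name": "c"},
    {"id": 7, "name": "a"},
    {"id": 19, "name": "b"},
    {"id": 42, "name": "d"},
]

ordered = order_by_id_list(fetched, stored_ids)
print([row["id"] for row in ordered])  # [42, 7, 19, 3]
```

The dictionary lookup keeps the sort key O(1) per row, which matters more as the stored list grows.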

Related

C# / SQL Determine matching score based on properties

On a project where we have SQL tables called Products and Conditions, we want to determine which condition matches each product best, because a product can belong to multiple conditions.
Is there a way to do this in C# or SQL?
Below you can find a shortened version of the tables with the properties that we want to match on:
CREATE TABLE Products
(
[Id] INT NOT NULL PRIMARY KEY IDENTITY(1,1),
[Property1] SMALLINT NULL,
[Property2] SMALLINT NULL,
[Property3] NVARCHAR(20) NULL,
[Property4] NVARCHAR(20) NULL
)
CREATE TABLE Conditions
(
[Id] INT NOT NULL PRIMARY KEY IDENTITY(1,1),
[Property1] SMALLINT NULL,
[Property2] SMALLINT NULL,
[Property3] NVARCHAR(20) NULL,
[Property4] NVARCHAR(20) NULL
)
As a result, we want for each product its conditions, sorted by matching score based on the 4 properties.
Because we have 4 properties, the resulting score could be 0 / 25 / 50 / 75 / 100.
In SQL you can join the two tables on matching properties, use IIF to compute the total score, and order the results by that score, as below:
Select * from (
-- p.* and c.* share column names, so the derived table needs explicit aliases
Select p.[Id] as ProductId, c.[Id] as ConditionId,
iif(c.[Property1] = p.[Property1],25,0) +
iif(c.[Property2] = p.[Property2],25,0) +
iif(c.[Property3] = p.[Property3],25,0) +
iif(c.[Property4] = p.[Property4],25,0) [TotalScore]
from Products p inner join Conditions c
on c.[Property1] = p.[Property1] or
c.[Property2] = p.[Property2] or
c.[Property3] = p.[Property3] or
c.[Property4] = p.[Property4]) q
order by TotalScore desc
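The same scoring can be sketched outside the database. The Python below is only an illustration of the 25-points-per-property idea; the names match_score and ranked_conditions and the sample rows are assumptions. It treats None like SQL NULL, so a missing value never counts as a match.

```python
PROPERTIES = ("Property1", "Property2", "Property3", "Property4")


def match_score(product, condition):
    """Score 25 points for every property that is set and equal on both rows."""
    return sum(
        25
        for name in PROPERTIES
        # Like SQL, a NULL (None) value never equals anything, not even NULL.
        if product[name] is not None and product[name] == condition[name]
    )


def ranked_conditions(product, conditions):
    """Return (score, condition id) pairs with score > 0, best match first."""
    scored = [(match_score(product, c), c["Id"]) for c in conditions]
    return sorted((s for s in scored if s[0] > 0), key=lambda pair: -pair[0])


# Hypothetical sample rows.
product = {"Id": 1, "Property1": 1, "Property2": 2, "Property3": "x", "Property4": None}
conditions = [
    {"Id": 1, "Property1": 1, "Property2": 2, "Property3": "x", "Property4": None},
    {"Id": 2, "Property1": 1, "Property2": 9, "Property3": "y", "Property4": None},
    {"Id": 3, "Property1": 5, "Property2": 6, "Property3": "z", "Property4": "q"},
]

print(ranked_conditions(product, conditions))  # [(75, 1), (25, 2)]
```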

Fast Way to Replace Names with Ids in Datatable?

I have a very large CSV file I have to load on a regular basis that contains time series data. Examples of the headers are below:
| SiteName | Company | Date | ResponseTime | Clicks |
This data comes from a service external to the uploader. SiteName and Company are both string fields. In the database these are normalized. There is a Site table and a Company table:
CREATE TABLE [dbo].[Site] (
[Id] INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
[Name] NVARCHAR(MAX) NOT NULL
)
CREATE TABLE [dbo].[Company] (
[Id] INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
[Name] NVARCHAR(MAX) NOT NULL
)
As well as the data table.
CREATE TABLE [dbo].[SiteStatistics] (
[Id] INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
[CompanyId] INT NOT NULL,
[SiteId] INT NOT NULL,
[DataTime] DATETIME NOT NULL,
CONSTRAINT [SiteStatisticsToSite_FK] FOREIGN KEY ([SiteId]) REFERENCES [Site]([Id]),
CONSTRAINT [SiteStatisticsToCompany_FK] FOREIGN KEY ([CompanyId]) REFERENCES [Company]([Id])
)
At around 2 million rows in the CSV file any sort of IO-bound iteration isn't going to work. I need this done in minutes, not days.
My initial thought is that I could pre-load Site and Company into DataTables. I already have the CSV loaded into a datatable in the format that matches the CSV columns. I need to now replace every SiteName with the Id field of Site and every Company with the Id field of Company. What is the quickest, most efficient way to handle this?
If you go with pre-loading the sites and companies, you can get the distinct values in code:
DataView view = new DataView(table);
DataTable distinctCompanyValues = view.ToTable(true, "Company");
DataTable distinctSiteValues = view.ToTable(true, "SiteName");
Then load those two DataTables into their SQL tables using SqlBulkCopy.
Next, dump all the data into a staging version of SiteStatistics that still carries the name columns:
CREATE TABLE [dbo].[SiteStatistics] (
[Id] INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
[CompanyId] INT DEFAULT 0,
[SiteId] INT DEFAULT 0,
[Company] NVARCHAR(MAX) NOT NULL,
[Site] NVARCHAR(MAX) NOT NULL,
[DataTime] DATETIME NOT NULL
)
Then do an UPDATE to set the foreign key columns:
UPDATE ss SET
[CompanyId] = (SELECT Id FROM [Company] c WHERE ss.[Company] = c.Name),
[SiteId] = (SELECT Id FROM [Site] s WHERE ss.[Site] = s.Name)
FROM [SiteStatistics] ss
Add the Foreign Key constraints:
ALTER TABLE [SiteStatistics] ADD CONSTRAINT [SiteStatisticsToSite_FK] FOREIGN KEY ([SiteId]) REFERENCES [Site]([Id])
ALTER TABLE [SiteStatistics] ADD CONSTRAINT [SiteStatisticsToCompany_FK] FOREIGN KEY ([CompanyId]) REFERENCES [Company]([Id])
Finally delete the Site & Company name fields from SiteStatistics:
ALTER TABLE [SiteStatistics] DROP COLUMN [Company];
ALTER TABLE [SiteStatistics] DROP COLUMN [Site];
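For the in-memory route the question describes (pre-loading Site and Company), the usual trick is a hash lookup from name to Id, so each of the 2 million rows costs two dictionary reads and no database round trip. A rough Python sketch of that idea follows; the function names and sample values are assumptions, and in C# the equivalent structure would be a Dictionary<string, int>.

```python
def build_lookup(reference_rows):
    """Map each Name to its Id, e.g. for rows read from the Site or Company table."""
    return {row["Name"]: row["Id"] for row in reference_rows}


def replace_names_with_ids(csv_rows, site_ids, company_ids):
    """Return new rows with SiteName/Company swapped for SiteId/CompanyId.

    A name missing from a lookup raises KeyError, which surfaces unmatched
    names instead of silently writing a bad foreign key.
    """
    converted = []
    for row in csv_rows:
        # Copy every column except the two name columns being replaced.
        new_row = {k: v for k, v in row.items() if k not in ("SiteName", "Company")}
        new_row["SiteId"] = site_ids[row["SiteName"]]
        new_row["CompanyId"] = company_ids[row["Company"]]
        converted.append(new_row)
    return converted


# Hypothetical reference data and CSV rows.
sites = build_lookup([{"Id": 1, "Name": "example.com"}, {"Id": 2, "Name": "example.org"}])
companies = build_lookup([{"Id": 10, "Name": "Acme"}])
csv_rows = [
    {"SiteName": "example.org", "Company": "Acme", "Date": "2015-01-01",
     "ResponseTime": 120, "Clicks": 5},
    {"SiteName": "example.com", "Company": "Acme", "Date": "2015-01-02",
     "ResponseTime": 95, "Clicks": 8},
]

converted = replace_names_with_ids(csv_rows, sites, companies)
print(converted[0]["SiteId"], converted[0]["CompanyId"])  # 2 10
```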

How to grab last row in database table with specific requirements?

Okay so I am accepting payments on my site (via Authorize.Net). The payment form redirects to a receipt page.
I will have a column in the database for an invoice code (column InvoiceCode), which is RRC0A in this instance. Then I will have another column for an 8 digit number (column InvoiceNumber). Then I will have InvoiceCode + InvoiceNumber = InvoiceId. For example, the InvoiceId will be RRC0A + 8 numbers. It will increment as such: 00000000, 00000001, 00000002, etc. Therefore the InvoiceId will be RRC0A00000001. I cannot simply increment the column in my database because there will be other InvoiceCodes that also start at 00000000.
I need to increment the InvoiceNumber by one when I add a new row. How can I grab the last InvoiceNumber that was entered into the database? It must be associated with the InvoiceCode RRC0A. This could occur when more than 1 person is making a payment, so I am not sure of the best way.
How can I pad the incrementing InvoiceNumber with 0's in front so that it is always 8 digits?
Using an identity and a computed column, you can create your invoice numbers with the correct formatting at the time of insert.
CREATE TABLE [dbo].[Invoices](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Code] [nchar](5) NOT NULL,
[InvoiceNumber] AS ([Code]+right('00000000'+CONVERT([nvarchar](10),[ID]),(8))) PERSISTED,
[Cost] [decimal](18, 2) NOT NULL,
CONSTRAINT [PK_Invoices] PRIMARY KEY CLUSTERED
(
[ID] ASC
)
)
Sample bulk insert:
INSERT INTO [dbo].[Invoices] ([Code], [Cost])
OUTPUT INSERTED.*
SELECT 'ABC01', 500 UNION ALL
SELECT 'ABC01', 501 UNION ALL
SELECT 'EFG23', 502 UNION ALL
SELECT 'RRAc1', 503 UNION ALL
SELECT 'ABC01', 504
Output:
ID  Code   InvoiceNumber  Cost
1   ABC01  ABC0100000001  500.00
2   ABC01  ABC0100000002  501.00
3   EFG23  EFG2300000003  502.00
4   RRAc1  RRAc100000004  503.00
5   ABC01  ABC0100000005  504.00
When you insert your records you can get the ID and InvoiceNumber back at the same time.
The values are also persisted so they may be indexed as you would other columns.
SELECT InvoiceCode, MAX(InvoiceID)
FROM yourTable t
GROUP BY InvoiceCode
This should return the latest InvoiceID for each InvoiceCode, but you can add your own WHERE clause to filter it down
As for how to pad-left in sql, check out this answer.
Storing the code and the number together in one column is just bad design.
Have a composite PK:
InvCode (varchar), InvInt (int)
declare @InvCode varchar(20) = 'RRC0A'
insert into invoice (InvCode, InvInt)
OUTPUT INSERTED.InvInt, INSERTED.InvCode
select @InvCode, isnull(max(InvInt),-1) + 1
from invoice
where InvCode = @InvCode;
The isnull deals with the first invoice for a code.
A single statement is a transaction, so I don't think two simultaneous inserts could clobber each other.
Even if they did, the PK would be violated, so the insert would fail.
Use a view or a computed column for the formatted invoice number:
CREATE TABLE [dbo].[Invoice](
[InvCode] [varchar](10) NOT NULL,
[InvInt] [int] NOT NULL,
[Formatted] AS ([InvCode]+right('00000000'+CONVERT([nvarchar](10),[InvInt]),(8))),
CONSTRAINT [PK_Invoice] PRIMARY KEY CLUSTERED
(
[InvCode] ASC,
[InvInt] ASC
)
)
You can grab the last InvoiceNumber with a SELECT query.
You can pad the invoice number with the + sign to concatenate two strings, and then use RIGHT() to get the right-most 8 characters.
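The per-code increment and the zero padding can be sketched in a few lines of Python. This only illustrates the arithmetic, mirroring isnull(max(InvInt), -1) + 1 and the RIGHT('00000000' + ..., 8) padding; it does not address concurrency, which the database has to handle. The function names and sample data are assumptions.

```python
def next_invoice_number(existing, code):
    """Return max number for this code plus one, or 0 if the code is new."""
    numbers = [number for (existing_code, number) in existing if existing_code == code]
    # default=-1 plays the role of isnull(max(InvInt), -1) for the first invoice.
    return max(numbers, default=-1) + 1


def format_invoice_id(code, number):
    """Concatenate the code with the number left-padded with zeros to 8 digits."""
    return f"{code}{number:08d}"


# Hypothetical existing (InvCode, InvInt) pairs.
existing = [("RRC0A", 0), ("RRC0A", 1), ("ABC01", 0)]

number = next_invoice_number(existing, "RRC0A")
print(format_invoice_id("RRC0A", number))  # RRC0A00000002
print(format_invoice_id("XYZ99", next_invoice_number(existing, "XYZ99")))  # XYZ9900000000
```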

LINQ Expression for CROSS APPLY two levels deep

Fairly new to LINQ and am trying to figure out how to write a particular query. I have a database where each CHAIN consists of one or more ORDERS and each ORDER consists of one or more PARTIALS. The database looks like this:
CREATE TABLE Chain
(
ID int NOT NULL PRIMARY KEY CLUSTERED IDENTITY(1,1),
Ticker nvarchar(6) NOT NULL,
Company nvarchar(128) NOT NULL
)
GO
CREATE TABLE [Order]
(
ID int NOT NULL PRIMARY KEY CLUSTERED IDENTITY(1,1),
Chart varbinary(max) NULL,
-- Relationships
Chain int NOT NULL
)
GO
ALTER TABLE dbo.[Order] ADD CONSTRAINT FK_Order_Chain
FOREIGN KEY (Chain) REFERENCES dbo.Chain ON DELETE CASCADE
GO
CREATE TABLE Partial
(
ID int NOT NULL PRIMARY KEY CLUSTERED IDENTITY(1,1),
Date date NOT NULL,
Quantity int NOT NULL,
Price money NOT NULL,
Commission money NOT NULL,
-- Relationships
[Order] int NOT NULL
)
GO
ALTER TABLE dbo.Partial ADD CONSTRAINT FK_Partial_Order
FOREIGN KEY ([Order]) REFERENCES dbo.[Order] ON DELETE CASCADE
I want to retrieve the chains, ordered by the earliest date among all the partials of all the orders for each particular chain. In T-SQL I would write the query as this:
SELECT p.DATE, c.*
FROM CHAIN c
CROSS APPLY
(
SELECT DATE = MIN(p.Date)
FROM PARTIAL p
JOIN [ORDER] o
ON p.[ORDER] = o.ID
WHERE o.CHAIN = c.ID
) AS p
ORDER BY p.DATE ASC
I have an Entity Framework context that contains a DbSet<Chain>, a DbSet<Order>, and a DbSet<Partial>. How do I finish this statement to get the result I want?:
IEnumerable<Chain> chains = db.Chains
.Include(c => c.Orders.Select(o => o.Partials))
.[WHAT NOW?]
Thank you!
Replace .[WHAT NOW?] with:
.OrderBy(c => c.Orders.SelectMany(o => o.Partials).Min(p => p.Date))
Here c.Orders joins Chain to Order, while c.Orders.SelectMany(o => o.Partials) joins Order to Partial. Once you have access to the Partial records, you can use any aggregate function, such as Min(p => p.Date) in your case.
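The shape of that expression (flatten two levels, take the minimum, sort by it) can be seen in a small in-memory sketch. The Python below is an analogy for the SelectMany/Min/OrderBy chain, not EF code; the dictionary layout and sample dates are invented, and it assumes every chain has at least one partial, as the schema description states.

```python
from datetime import date


def earliest_partial_date(chain):
    """Flatten orders -> partials (like SelectMany) and take the minimum date."""
    return min(
        partial["date"]
        for order in chain["orders"]
        for partial in order["partials"]
    )


# Hypothetical chains, each with one or more orders and partials.
chains = [
    {"ticker": "AAA", "orders": [
        {"partials": [{"date": date(2015, 3, 1)}, {"date": date(2015, 1, 15)}]},
    ]},
    {"ticker": "BBB", "orders": [
        {"partials": [{"date": date(2015, 2, 1)}]},
        {"partials": [{"date": date(2014, 12, 31)}]},
    ]},
    {"ticker": "CCC", "orders": [
        {"partials": [{"date": date(2015, 1, 1)}]},
    ]},
]

# Equivalent in spirit to OrderBy(c => ...Min(p => p.Date)).
ordered = sorted(chains, key=earliest_partial_date)
print([chain["ticker"] for chain in ordered])  # ['BBB', 'CCC', 'AAA']
```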

Suggestion for a tag cloud algorithm

I have a MSSQL 2005 table:
[Companies](
[CompanyID] [int] IDENTITY(1,1) NOT NULL,
[Title] [nvarchar](128),
[Description] [nvarchar](256),
[Keywords] [nvarchar](256)
)
I want to generate a tag cloud for these companies, but I've saved all keywords in one column, separated by commas. Any suggestions for how to generate a tag cloud of the most used keywords? There could be millions of companies, with approximately ten keywords per company.
Thank you.
Step 1: separate the keywords into a proper relation (table).
CREATE TABLE Keywords (KeywordID int IDENTITY(1,1) NOT NULL
, Keyword NVARCHAR(256)
, constraint KeywordsPK primary key (KeywordID)
, constraint KeywordsUnique unique (Keyword));
Step 2: Map the many-to-many relation between companies and tags into a separate table, like all many-to-many relations:
CREATE TABLE CompanyKeywords (
CompanyID int not null
, KeywordID int not null
, constraint CompanyKeywordsPK primary key (KeywordID, CompanyID)
, constraint CompanyKeyword_FK_Companies
foreign key (CompanyID)
references Companies(CompanyID)
, constraint CompanyKeyword_FK_Keywords
foreign key (KeywordID)
references Keywords (KeywordID));
Step 3: Use a simple GROUP BY query to generate the 'cloud' (for example, taking the 'cloud' to mean the 100 most common tags):
with cte as (
SELECT TOP 100 KeywordID, count(*) as Count
FROM CompanyKeywords
group by KeywordID
order by count(*) desc)
select k.Keyword, c.Count
from cte c
join Keywords k on c.KeywordID = k.KeywordID;
Step 4: cache the result as it changes seldom and it computes expensively.
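To see steps 1 through 3 run end to end, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. It is an assumption-laden stand-in for SQL Server: SQLite uses LIMIT where T-SQL uses TOP, the Companies table is omitted, and the sample rows and the UseCount alias are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Steps 1 and 2: a keyword table plus a many-to-many mapping table.
conn.executescript("""
CREATE TABLE Keywords (
    KeywordID INTEGER PRIMARY KEY,
    Keyword   TEXT NOT NULL UNIQUE
);
CREATE TABLE CompanyKeywords (
    CompanyID INTEGER NOT NULL,
    KeywordID INTEGER NOT NULL REFERENCES Keywords (KeywordID),
    PRIMARY KEY (KeywordID, CompanyID)
);
""")

conn.executemany(
    "INSERT INTO Keywords (KeywordID, Keyword) VALUES (?, ?)",
    [(1, "sql server"), (2, "oracle"), (3, "db2")],
)
conn.executemany(
    "INSERT INTO CompanyKeywords (CompanyID, KeywordID) VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (3, 1)],
)

# Step 3: count usages per keyword and keep the 100 most common.
top_tags = conn.execute("""
SELECT k.Keyword, COUNT(*) AS UseCount
FROM CompanyKeywords ck
JOIN Keywords k ON k.KeywordID = ck.KeywordID
GROUP BY k.KeywordID, k.Keyword
ORDER BY UseCount DESC, k.Keyword
LIMIT 100
""").fetchall()

print(top_tags)  # [('sql server', 3), ('oracle', 2), ('db2', 1)]
```

The counts are what a tag cloud typically maps to font sizes; the result list is also the natural thing to cache in step 4.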
I'd much rather see your design normalized as suggested by Remus, but if you're at a point where you can't change your design...
You can use a parsing function (the example I'll use is taken from here), to parse your keywords and count them.
CREATE FUNCTION [dbo].[fnParseStringTSQL] (@string NVARCHAR(MAX),@separator NCHAR(1))
RETURNS @parsedString TABLE (string NVARCHAR(MAX))
AS
BEGIN
DECLARE @position int
SET @position = 1
SET @string = @string + @separator
WHILE charindex(@separator,@string,@position) <> 0
BEGIN
INSERT into @parsedString
SELECT substring(@string, @position, charindex(@separator,@string,@position) - @position)
SET @position = charindex(@separator,@string,@position) + 1
END
RETURN
END
go
create table MyTest (
id int identity,
keywords nvarchar(256)
)
insert into MyTest
(keywords)
select 'sql server,oracle,db2'
union
select 'sql server,oracle'
union
select 'sql server'
select k.string, COUNT(*) as count
from MyTest mt
cross apply dbo.fnParseStringTSQL(mt.keywords,',') k
group by k.string
order by count desc
drop function dbo.fnParseStringTSQL
drop table MyTest
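The split-and-count logic of that test can be checked quickly outside the database. The following Python sketch uses the same three sample rows; the function name keyword_counts is an assumption, and the whitespace trimming is an addition the T-SQL function above does not perform.

```python
from collections import Counter


def keyword_counts(keyword_columns):
    """Split each comma-separated value and count how often each keyword occurs."""
    counts = Counter()
    for column in keyword_columns:
        # Trim whitespace and skip empty pieces such as a trailing comma.
        counts.update(k.strip() for k in column.split(",") if k.strip())
    return counts


# Same sample rows as the MyTest table above.
rows = ["sql server,oracle,db2", "sql server,oracle", "sql server"]

print(keyword_counts(rows).most_common())
# [('sql server', 3), ('oracle', 2), ('db2', 1)]
```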
Both Remus and Joe are correct, but as Joe said, if you don't have a choice then you have to live with it. I think I can offer you an easy solution using the XML data type. You can already easily view the parsed column with this query:
WITH myCommonTblExp AS (
SELECT CompanyID,
CAST('<I>' + REPLACE(Keywords, ',', '</I><I>') + '</I>' AS XML) AS Keywords
FROM Companies
)
SELECT CompanyID, RTRIM(LTRIM(ExtractedCompanyCode.X.value('.', 'VARCHAR(256)'))) AS Keywords
FROM myCommonTblExp
CROSS APPLY Keywords.nodes('//I') ExtractedCompanyCode(X)
Now that you know you can do that, all you have to do is group the keywords and count them. However, you cannot group by XML methods, so my suggestion is to create a view of the query above:
CREATE VIEW [dbo].[DissectedKeywords]
AS
WITH myCommonTblExp AS (
SELECT
CAST('<I>' + REPLACE(Keywords, ',', '</I><I>') + '</I>' AS XML) AS Keywords
FROM Companies
)
SELECT RTRIM(LTRIM(ExtractedCompanyCode.X.value('.', 'VARCHAR(256)'))) AS Keywords
FROM myCommonTblExp
CROSS APPLY Keywords.nodes('//I') ExtractedCompanyCode(X)
GO
and perform your count on that view:
SELECT Keywords, COUNT(*) AS KeyWordCount FROM DissectedKeywords
GROUP BY Keywords
ORDER BY Keywords
Anyway, here is the full article: http://anyrest.wordpress.com/2010/08/13/converting-parsing-delimited-string-column-in-sql-to-rows/
