I have a problem getting a random selection of records from a MongoDB collection.
For example, in SQL Server:
select top 10 * from Employee order by NEWID()
What is the equivalent of order by NEWID() in MongoDB with the C# driver?
Thank you.
What I meant was the $sample stage: it randomly selects the specified number of documents from the input documents.
collection.AsQueryable<T>().Where().Select().Sample(count).ToList()
If I understand what you need correctly, you should use the aggregate $sample stage. There is no typed version of this stage in the .NET driver; see here how you can configure it in its raw form.
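For reference, a minimal sketch of what that raw form could look like, appending a $sample stage as a BsonDocument; the collection name "Employee" and the size of 10 are assumptions taken from the SQL example above, not from the answer itself.

using MongoDB.Bson;
using MongoDB.Driver;

var client = new MongoClient("mongodb://localhost:27017");
var collection = client.GetDatabase("test").GetCollection<BsonDocument>("Employee");

// Equivalent of: select top 10 * from Employee order by NEWID()
var randomTen = collection
    .Aggregate()
    .AppendStage<BsonDocument>(new BsonDocument("$sample", new BsonDocument("size", 10)))
    .ToList();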
Related
I want to create a function that returns two random ids (two questions) from a database table (one table, two ids) and then never shows those two ids together to the same user again.
I have all the login functionality...
I need it to have good performance.
MSSQL
SELECT TOP 5 * FROM Question ORDER BY NEWID()
MySQL
SELECT * FROM Question ORDER BY RAND() LIMIT 2
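Not part of the question, but a small ADO.NET sketch of running the SQL Server variant for exactly two ids (connection string and schema are placeholders); the returned pair would then be stored per user so the same combination is not shown to that user again:

using System;
using System.Collections.Generic;
using System.Data.SqlClient;

const string connString = "Server=.;Database=Quiz;Integrated Security=true";

var ids = new List<int>();
using var conn = new SqlConnection(connString);
conn.Open();
using var cmd = new SqlCommand("SELECT TOP 2 Id FROM Question ORDER BY NEWID()", conn);
using var reader = cmd.ExecuteReader();
while (reader.Read())
    ids.Add(reader.GetInt32(0));   // the two random question ids
// Persist the (user, id1, id2) combination so it is never served to this user again.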
I have a .NET App connected to a Postgres DB using Npgsql and I am trying to import data into two tables, say Users and Todos. A user has many todos. The User table has an id column that is automatically set by the DB, and the Todos table has a foreign key to the Users table called user_id.
Now, I know how to insert Users, and I know how to insert Todos, but I do not know how to set the user_id for those Todos since the id column from User is only known after the users are inserted into the DB. Any idea?
This depends on how you are importing and which tool you are using. If you are using raw INSERT statements, PostgreSQL has a RETURNING clause which will send you back the IDs of the inserted rows (see the docs).
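A minimal sketch of that approach with Npgsql (not the answer's own code; the table and column names users.name, todos.user_id and todos.title are assumptions):

using System;
using Npgsql;

const string connString = "Host=localhost;Database=mydb;Username=me;Password=secret";

using var conn = new NpgsqlConnection(connString);
conn.Open();

long userId;
using (var cmd = new NpgsqlCommand(
    "INSERT INTO users (name) VALUES (@name) RETURNING id", conn))
{
    cmd.Parameters.AddWithValue("name", "Alice");
    userId = Convert.ToInt64(cmd.ExecuteScalar());   // the id the database just generated
}

using (var cmd = new NpgsqlCommand(
    "INSERT INTO todos (user_id, title) VALUES (@uid, @title)", conn))
{
    cmd.Parameters.AddWithValue("uid", userId);
    cmd.Parameters.AddWithValue("title", "Buy milk");
    cmd.ExecuteNonQuery();
}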
If you are using binary COPY (which is the most efficient way to bulk-import data), there's no such option. In this case, one good way is to "allocate" all the ids in one go, by incrementing the sequence backing the ID column, and then to send the IDs yourself when you import. This means the database is no longer generating those IDs; you're sending them explicitly like any other field.
In practical terms, say you have 100 users (and any number of todos). You can make one call to setval to advance the sequence by 100, and then import your users, explicitly setting their IDs to those 100 values. This also lets you set the user IDs on the todos. However, if you do this, be mindful of concurrency issues if someone else modifies the sequence at the same time.
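A sketch of the allocate-ids-up-front idea, assuming the sequence backing users.id is called users_id_seq and that id is a bigint; setval advances the sequence by the number of rows about to be imported so those ids can be written explicitly during the binary COPY:

using Npgsql;
using NpgsqlTypes;

const string connString = "Host=localhost;Database=mydb;Username=me;Password=secret";
var names = new[] { "Alice", "Bob", "Carol" };   // the users to import (placeholder data)

using var conn = new NpgsqlConnection(connString);
conn.Open();

long lastId;
using (var cmd = new NpgsqlCommand(
    $"SELECT setval('users_id_seq', nextval('users_id_seq') + {names.Length - 1})", conn))
{
    lastId = (long)cmd.ExecuteScalar();           // last id of the block just reserved
}
long firstId = lastId - (names.Length - 1);

using (var importer = conn.BeginBinaryImport(
    "COPY users (id, name) FROM STDIN (FORMAT BINARY)"))
{
    for (var i = 0; i < names.Length; i++)
    {
        importer.StartRow();
        importer.Write(firstId + i, NpgsqlDbType.Bigint);   // explicit, pre-allocated id
        importer.Write(names[i], NpgsqlDbType.Text);
        // the same firstId + i value can now be used as user_id when importing todos
    }
    importer.Complete();
}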
I have two databases in my SQL Server instance, each containing a single table as of now.
I also have 2 source databases like below:
1) Db1 (MySQL)
2) Db2 (Oracle)
Now what I want to do is fill my SQL Server Db1 table with the data from the MySQL Db1, like below:
Insert into Table1 select * from Table1
Select * from Table1 (MySQL Db1) - the data comes from the MySQL database
Insert into Table1 (SQL Server Db1) - insert the data coming from the MySQL database, considering the same schema
I don't want to use SqlBulkCopy because I don't want to insert the data chunk by chunk. I want to insert all the data in one go, considering there are millions of rows, because my operation is not limited to just inserting records into the database. Otherwise the user has to sit and wait a long time: first while millions of rows are inserted chunk by chunk, and then again for my further operation, which is also long-running.
So if I can speed this process up, my second operation speeds up as well, since all the records will be in my one local SQL Server instance.
Is this possible to achieve in a C# application?
Update: I researched linked servers, as @Gordon Linoff suggested a linked server could be used for this scenario, but based on my research it seems I cannot create a linked server through code.
I want to do this with the help of ADO.NET.
This is what I am trying to do exactly:
Consider that I have 2 different client RDBMSs with 2 databases and some tables on the client's premises.
So the databases look like this:
Sql Server :
Db1
Order
Id Amount
1 100
2 200
3 300
4 400
Mysql or Oracle :
Db1:
Order
Id Amount
1 1000
2 2000
3 3000
4 400
Now I want to compare the Amount column from the source (SQL Server) to the destination database (MySQL or Oracle).
I want to join the tables of these 2 different RDBMS databases to compare the Amount columns.
In C#, what I can do is fetch the records chunk by chunk into a DataTable (in memory) and then compare them in code, but that will take a very long time with millions of records.
So I want to do something better than this.
Hence I was thinking of bringing the records of these 2 RDBMSs into 2 databases in my local SQL Server instance, creating a join query over the 2 tables on Id, and then taking advantage of the DBMS's processing capability, which can compare these millions of records efficiently.
A query like this compares millions of records efficiently:
select SqlServer.Id,Mysql.Id,SqlServer.Amount,Mysql.Amount from SqlServerDb.dbo.Order as SqlServer
Left join MysqlDb.dbo.Order as Mysql on SqlServer.Id=Mysql.Id
where SqlServer.Amount != Mysql.Amount
The above query works when I have the data of these 2 different RDBMSs in my local server instance, in the databases SqlServerDb and MysqlDb, and it will fetch the records below, whose amounts do not match.
So I am trying to get those records, comparing the source (SQL Server Db) to MySQL, whose Amount column values do not match.
Expected Output :
Id Amount
1 1000
2 2000
3 3000
So, is there any way to achieve this scenario?
On the SELECT side, create a .csv file (tab-delimited) using SELECT ... INTO OUTFILE ...
On the INSERT side, use LOAD DATA INFILE ... (or whatever the target machine syntax is).
Doing it all at once may be easier to code than chunking, and may (or may not) be faster running.
SqlBulkCopy can accept either a DataTable or a System.Data.IDataReader as its input.
Using your query to read the source DB, set up an ADO.NET DataReader on the source MySQL or Oracle DB and pass the reader to the WriteToServer() method of the SqlBulkCopy.
This can copy almost any number of rows without limit. I have copied hundreds of millions of rows using the data reader approach.
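A sketch of that reader-to-SqlBulkCopy flow (connection strings and table names are placeholders; MySql.Data, or MySqlConnector, is assumed as the MySQL ADO.NET provider):

using MySql.Data.MySqlClient;
using System.Data.SqlClient;

const string mySqlConnString = "Server=mysql-host;Database=Db1;Uid=user;Pwd=secret";
const string sqlServerConnString = "Server=.;Database=MysqlDb;Integrated Security=true";

using var source = new MySqlConnection(mySqlConnString);
source.Open();
using var cmd = new MySqlCommand("SELECT Id, Amount FROM `Order`", source);
using var reader = cmd.ExecuteReader();

using var bulk = new SqlBulkCopy(sqlServerConnString)
{
    DestinationTableName = "dbo.[Order]",
    BulkCopyTimeout = 0,     // no timeout; the copy may run for a long time
    BatchSize = 10_000       // commit granularity only; rows are still streamed
};
bulk.WriteToServer(reader);  // streams every row without loading them all into memory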
What about adding a changed date in the remote database?
Then you could get all rows that have changed since the last sync and just compare those?
First of all, do not use a linked server. It is tempting, but it will bring more trouble to the table than value: updates and inserts will fetch all of the target DB to the source DB, do the insert/update, and post all the data back to the target.
As far as I understand, you are trying to copy changed data to the target system for some further processing.
I recommend adding a timestamp column to the source table. When anything changes in the source table, the timestamp column is updated by SQL Server.
On the target, get the max ID and the max timestamp: two queries at most.
On the source, the rows where source.ID <= target.MaxID && source.timestamp >= target.MaxTimestamp are the rows that changed after the last sync (they need an update), and the rows where source.ID > target.MaxID are the rows inserted after the last sync.
Now you do not have to compare two worlds, and you have just got all the updates and inserts.
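A sketch of the delta query described above, assuming SQL Server on the source side, a rowversion column named RowVer, and that MaxId/MaxRowVer were just read from the target (all names are placeholders):

using System.Data.SqlClient;

const string sourceConnString = "Server=source;Database=Db1;Integrated Security=true";

// Values read from the target beforehand (placeholders here).
long targetMaxId = 4;
byte[] targetMaxRowVer = new byte[8];

const string deltaSql = @"
    SELECT Id, Amount
    FROM dbo.[Order]
    WHERE (Id <= @maxId AND RowVer >= @maxRowVer)  -- changed since the last sync (updates)
       OR (Id > @maxId);                           -- inserted since the last sync";

using var conn = new SqlConnection(sourceConnString);
conn.Open();
using var cmd = new SqlCommand(deltaSql, conn);
cmd.Parameters.AddWithValue("@maxId", targetMaxId);
cmd.Parameters.AddWithValue("@maxRowVer", targetMaxRowVer);
using var reader = cmd.ExecuteReader();
// The reader can be fed straight into SqlBulkCopy, as in the earlier answer.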
You need to create a linked server connection using ODBC and the proper driver; after that you can execute the queries using OPENQUERY.
Take a look at OPENQUERY:
https://msdn.microsoft.com/en-us/library/ms188427(v=sql.120).aspx
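As a sketch only (it assumes an ODBC DSN named MySqlDsn already exists on the SQL Server machine, and the linked-server name MYSQL_LINK is arbitrary), both steps can be driven from C# through ADO.NET: sp_addlinkedserver to create the linked server once, then OPENQUERY inside the comparison join:

using System;
using System.Data.SqlClient;

const string connString = "Server=.;Database=SqlServerDb;Integrated Security=true";

using var conn = new SqlConnection(connString);
conn.Open();

// One-time setup: register the MySQL ODBC DSN as a linked server.
using (var create = new SqlCommand(@"
    EXEC sp_addlinkedserver
        @server     = N'MYSQL_LINK',
        @srvproduct = N'MySQL',
        @provider   = N'MSDASQL',
        @datasrc    = N'MySqlDsn';", conn))
{
    create.ExecuteNonQuery();
}

// Compare the local table against the remote one via OPENQUERY.
using var query = new SqlCommand(@"
    SELECT s.Id, s.Amount, m.Amount AS MySqlAmount
    FROM dbo.[Order] AS s
    LEFT JOIN OPENQUERY(MYSQL_LINK, 'SELECT Id, Amount FROM `Order`') AS m
        ON s.Id = m.Id
    WHERE s.Amount <> m.Amount;", conn);

using var reader = query.ExecuteReader();
while (reader.Read())
    Console.WriteLine($"{reader["Id"]}: {reader["Amount"]} vs {reader["MySqlAmount"]}");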
Yes, SQL Server is very efficient when it's working with sets so let's keep that in play.
In a nutshell, what I'm pitching is
Load data from the source into a staging table on the target database (staging table = a table that temporarily holds the raw data from the source table, same structure as the source table... add tracking columns to taste). This will be done by your C# code: select from source_table into a DataTable, then SqlBulkCopy into the staging table.
Have a stored proc on the target database to reconcile the data between your target table and the staging table. Your C# code calls the stored proc.
Given that you're talking about millions of rows, another thing that can make things faster is dropping the indexes on the staging table before inserting into it and recreating them after the inserts and before any select is performed.
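A sketch of that flow under assumed names: a staging table dbo.Order_Staging with the same columns as the source, a stored procedure dbo.ReconcileOrders that merges staging into the target table, and FetchSourceRows() standing in for the code that reads the source into a DataTable.

using System.Data;
using System.Data.SqlClient;

const string targetConnString = "Server=.;Database=SqlServerDb;Integrated Security=true";
DataTable sourceRows = FetchSourceRows();   // placeholder: rows read from MySQL/Oracle

using var conn = new SqlConnection(targetConnString);
conn.Open();

// 1. Bulk-load the raw source rows into the staging table.
using (var bulk = new SqlBulkCopy(conn) { DestinationTableName = "dbo.Order_Staging" })
    bulk.WriteToServer(sourceRows);

// 2. Let the database reconcile staging vs. target in one set-based operation.
using var proc = new SqlCommand("dbo.ReconcileOrders", conn)
{
    CommandType = CommandType.StoredProcedure,
    CommandTimeout = 0        // reconciling millions of rows can take a while
};
proc.ExecuteNonQuery();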
I'm not well versed in SQL operations and would like some help with a task I need to complete in code. I have written a cloud-based app that accesses a SQL table containing test results: device IDs, serial numbers, test results, etc.
There is a use case where someone in the field activates a menu that updates this table. When the device test result table is updated, I want to store the OLD information in a device test history table. This way, we can go back and see what was changed over time.
So I need to pull all the columns from the TestedDevice table, insert them into the TestedDeviceHistory table, and include some additional information: the current date and the operator's ID (these are two new columns found only in TestedDeviceHistory).
At first, I'm using a SELECT INTO command, as follows:
SELECT *
INTO dbo.TestedDevicesHistory
FROM dbo.TestedDevices
WHERE CertificateID = #cert
Then I'm attempting this (obviously broken) SQL command:
UPDATE dbo.TestedDeviceHistory
SET Caller = #caller,
RecordDate = #date
WHERE DeviceHistoryID = MAX(DeviceHistoryID)
Notes:
DeviceHistoryID is an IDENTITY integer column, so it's unique for each entry made in the history table.
CertificateID is unique in the TestedDevices table. It is expected NOT to be unique in the history table.
The code is written in C# 4.5
Maybe this is a case for a stored procedure, which I have never attempted to create or use. Or perhaps the use of a cursor? I don't know! This is why I'm humbly asking those more experienced with SQL for help :)
It's not clear whether you only want to assign the Caller and RecordDate to the most recent record, or whether they could be assigned to all the history records.
For all records, I believe you can do something like
SELECT *, #caller AS Caller, #date AS RecordDate INTO dbo.TestedDevicesHistory
FROM dbo.TestedDevices WHERE CertificateID=#cert
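A sketch (not from the answer) of running that statement from C# with parameters; the connection string and placeholder values are assumptions. Note that SELECT ... INTO creates dbo.TestedDevicesHistory, so it only succeeds while that table does not exist yet; once the history table is in place, the same column list would go into an INSERT INTO ... SELECT instead.

using System;
using System.Data.SqlClient;

const string connString = "Server=.;Database=Devices;Integrated Security=true";
var operatorId = "op-42";      // placeholder: the caller's id
var certificateId = 12345;     // placeholder: the certificate id (#cert)

using var conn = new SqlConnection(connString);
conn.Open();
using var cmd = new SqlCommand(@"
    SELECT *, @caller AS Caller, @date AS RecordDate
    INTO dbo.TestedDevicesHistory
    FROM dbo.TestedDevices
    WHERE CertificateID = @cert;", conn);
cmd.Parameters.AddWithValue("@caller", operatorId);
cmd.Parameters.AddWithValue("@date", DateTime.UtcNow);
cmd.Parameters.AddWithValue("@cert", certificateId);
cmd.ExecuteNonQuery();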
I am creating an application that takes data from a text file containing sales data from the Amazon marketplace. The marketplace uses different item names than the data in our main database. The application accepts the text file as input and needs to check whether each item exists in our database. If an item is not present, I should offer the option to save it to a master table, or to a sub-item table and map it to a master item. My question is: if the text file has 100+ items, should I hit the database for each one to check whether the data exists? Is there any better way of doing this so that we can minimize the database hits?
I have two options that I have used earlier:
Hit the database and check if the item exists in the table.
Fill the data into a DataTable and use DataTable.Select to check if it exists.
Can someone tell me the best way to do this? I have to check two tables (a master table and a sub-item table), maybe one at a time. Thanks.
Update:
@Downvoters, add a comment.
I am not asking what the way is to check whether an item exists in the database. I just want to know the best way of doing it: should I be hitting the database 1000 times if a file has 1000 items? That's my question.
The current query I use:
if exists (select * from [table] where itemname= [itemname] )
select 'True'
else
select 'False'
return
(From Chat)
I would create a stored procedure which takes a table-valued parameter containing all the items that you want to check. You can then use a join (a couple of options here)* to return a result set of items and whether each one exists or not. You can use TVPs from ADO.NET like this.
It will certainly handle the 100-to-1000-row range mentioned in your post. To be honest, I haven't used it in the 1M+ range.
In newer versions of SQL Server, I would prefer TVPs over an XML input parameter, as it is really quite cumbersome to pack the XML in your .NET code and then unpack it again in your SPROC.
(*) Re joins: with the result set, you can either just inner join the TVP to your items/product table and check in .NET whether a row doesn't exist, or you can do a left outer join with the TVP as the left table and, e.g., ISNULL() missing items to 0 / 'false', etc.
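A sketch of the TVP call from C#, with assumed names: a user-defined table type dbo.ItemNameList (a single ItemName column) and a stored procedure dbo.CheckItemsExist that left-joins the TVP to the master table and returns each item name with an exists flag.

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

const string connString = "Server=.;Database=Sales;Integrated Security=true";
var itemsFromFile = new List<string> { "Item A", "Item B" };   // parsed from the text file

// Build the TVP: one row per item name.
var tvp = new DataTable();
tvp.Columns.Add("ItemName", typeof(string));
foreach (var name in itemsFromFile)
    tvp.Rows.Add(name);

using var conn = new SqlConnection(connString);
conn.Open();
using var cmd = new SqlCommand("dbo.CheckItemsExist", conn)
{
    CommandType = CommandType.StoredProcedure
};
var p = cmd.Parameters.AddWithValue("@Items", tvp);
p.SqlDbType = SqlDbType.Structured;     // marks the parameter as a table-valued parameter
p.TypeName = "dbo.ItemNameList";        // the user-defined table type on the server

using var reader = cmd.ExecuteReader();
while (reader.Read())
    Console.WriteLine($"{reader.GetString(0)} exists: {reader.GetBoolean(1)}");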
Send the items to the database in batches of 100. A stored procedure would probably help, since repetitive queries have to be fired. If the data does not change frequently, you can consider caching. I assume you will be making service calls from your .NET application, so ingest an XML payload on the back end, in batches. Consider increasing the batch size based on the file size.
If your entire application is local, the batch size can be very high, as there is no network overhead, but still don't make 100 calls to the DB.
Try like this
SELECT EXISTS(SELECT * FROM table1 WHERE itemname= [itemname])
SELECT EXISTS(SELECT 1 FROM table1 WHERE itemname= [itemname])