Entity Framework Writing Mass Data to Database - c#

I've got a database which, amongst others, has two tables of data:
Table 1
- ProductID
- ProductName
- ProductDescription
- IsVisible
- IsDeleted
Table 2
- ProductPriceID
- ProductID
- LocationID
- Price
Table 2 can hold many prices at different locations for each product in Table 1. I'm reading from a CSV file where the product details are listed in the first columns followed by 15 columns of price values for 15 locations.
I have found that, with nearly 10,000 products being imported each time, writing this file to the database by first writing a product and then writing a list of its 15 prices to Table 2, 10,000 times over, slows the import down hugely. It is up to 2.5x slower than attempting to write a list of 10,000 products first, followed by the 132,000 or so product prices. Having 2 writes to the database massively speeds up the whole process, because the lag is incurred at the database, so writing 2 times instead of 20,000 times is much faster.
I've created two lists of the database types for each object and added the data to each, and this is fine. The problem is the ProductID in Table 2. Entity Framework doesn't return this until I call
context.Products.AddRange(productList);
context.SaveChanges();
But at the point this is saved, the list of product prices has already been created without the relevant ProductID values. When it saves, it crashes because of the foreign key constraint.
Is there any way with Entity Framework to get the ProductID that will be assigned to each product without writing each product to the database first? A minimum number of database calls is crucial here.
I have the option of re-parsing all the data from the file, but I'm not keen on this either, as it's extra processing time. The structure of the file cannot be changed.

I weighed up all the options, and it turned out the best way for us to do it was to write all products to one list and all product prices, with the known product code, to another list.
We then saved the product list to the database and iterated through the saved products to bring back the ProductID for each Product Code that matched the local list, before saving the product prices list to the database.
That is 2 saves to the database and one call from the database, and we've cut a 47 minute data import down to 3 minutes.
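As a rough sketch of that mapping step, the following uses in-memory lists in place of the database, so the ID assignment is simulated. The names ProductCode, savedProducts, productIdByCode and pricesToSave, and all sample values, are assumptions made for illustration. With Entity Framework the first save would be the call that populates the real ProductID values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Rows parsed from the CSV. ProductCode is the business key already present
// in the file; the sample values are made up.
var products = new List<(string ProductCode, string ProductName)>
{
    ("A100", "Widget"),
    ("B200", "Gadget"),
};
var parsedPrices = new List<(string ProductCode, int LocationId, decimal Price)>
{
    ("A100", 1, 9.99m),
    ("A100", 2, 10.49m),
    ("B200", 1, 24.00m),
};

// Save 1 (simulated): the database assigns a ProductID to every product.
// The IDs 1000, 1001, ... are placeholders for database-generated keys.
var savedProducts = products
    .Select((p, index) => (ProductId: 1000 + index, p.ProductCode))
    .ToList();

// One read back: map each ProductCode to its generated ProductID.
var productIdByCode = savedProducts.ToDictionary(p => p.ProductCode, p => p.ProductId);

// Fill in the foreign key on every price row before the second save.
var pricesToSave = parsedPrices
    .Select(p => (ProductId: productIdByCode[p.ProductCode], p.LocationId, p.Price))
    .ToList();

// Save 2 would write pricesToSave in a single call.
foreach (var price in pricesToSave)
{
    Console.WriteLine($"{price.ProductId} {price.LocationId} {price.Price}");
}
```

The dictionary is built once, so resolving the key for each of the 132,000 or so price rows is a hash lookup, not a search through the product list.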
Thanks for everyone's help!

Related

Bulk insert in PGSQL with master-child relation

I am migrating data from an old schema to an updated schema (with PostgreSQL as the new database).
I am using C# to automate this process.
Here is a screenshot that shows original data (sample to understand).
Based on the updated schema, this source data needs to be separated into two tables, Vehicles and PartPricing.
Each unique combination of Make, Model and Year will be inserted into Vehicles and given a unique Id.
The Part and PartPrice will then be inserted into the PartPricing table and need to be linked with VehicleId. (VehicleId refers to the Id of the Vehicles table.)
The screen below shows the expected output.
The approach I followed is:
1) Get the unique list of Make, Model and Year, generate a bulk insert query and execute it.
2) Fetch all the inserted vehicles and cache them in a collection.
3) Loop through each line item in the source:
- Look up the VehicleId based on Make, Model and Year (from the cached collection, not from the database).
- Prepare the insert statement for PartPricing.
4) After the loop completes, execute the bulk insert query for PartPricing.
Although the Vehicles data is inserted pretty quickly, preparing the bulk insert for PartPricing takes a considerable amount of time because of the lookup.
Is there a better alternative to this problem? Please suggest.
Just FYI, when I say bulk insert, I mean a statement of this form:
Insert into Vehicles(Make, Model, Year) values
('Honda', 'City', 2010),
('Honda', 'City', 2011),
('Hyundai', 'Accent', 2011),
....
('Toyota', 'Corolla', 2015);
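One possible way to speed up the lookup step described above is to key the cached vehicles by (Make, Model, Year) in a dictionary. This assumes the cached vehicles are currently searched linearly for every source row, which the question does not state. The sketch below uses in-memory sample data only. The Id values, part names and prices are made up, and vehicleIdByKey and partPricingRows are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for step 2: vehicles read back after the first bulk insert.
// The Id values are made up for the sketch.
var vehicles = new List<(int Id, string Make, string Model, int Year)>
{
    (1, "Honda", "City", 2010),
    (2, "Honda", "City", 2011),
    (3, "Hyundai", "Accent", 2011),
};

// Source line items; the part names and prices are hypothetical sample data.
var source = new List<(string Make, string Model, int Year, string Part, decimal PartPrice)>
{
    ("Honda", "City", 2010, "Brake Pad", 45m),
    ("Honda", "City", 2011, "Brake Pad", 47m),
    ("Hyundai", "Accent", 2011, "Oil Filter", 12m),
    ("Honda", "City", 2010, "Oil Filter", 11m),
};

// Build the lookup once. Value tuples compare element by element, so the
// (Make, Model, Year) tuple can serve as the dictionary key directly.
var vehicleIdByKey = vehicles.ToDictionary(v => (v.Make, v.Model, v.Year), v => v.Id);

// Step 3: resolve VehicleId per line item with a hash lookup, not a scan.
var partPricingRows = source
    .Select(s => (VehicleId: vehicleIdByKey[(s.Make, s.Model, s.Year)], s.Part, s.PartPrice))
    .ToList();

foreach (var row in partPricingRows)
{
    Console.WriteLine($"{row.VehicleId} {row.Part} {row.PartPrice}");
}
```

The string comparison in the key is case-sensitive here, so the source values must match the stored Make and Model exactly. The resulting rows can then feed the same multi-row INSERT shown above.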

Hold and commit the specific value in the row after specific interval?

I am working on hold-and-commit functionality for hotel booking inventory. I have a room table in SQL Server where I store the number of available rooms for each room type. Whenever a user books a room, I need to hold the specific number of rooms, and once the payment is done, I need to commit the rooms I held. I am a bit confused about how to achieve this. For the backend I am using .NET Core, and the database is SQL Server.
I created a holdandcommit table as follows:
HoldAndCommit Table
RoomId  HoldCount  CommitCount  HoldTime  CommitTime
1       2          0            11:00PM   11:00PM
Step 1:
Room Table
RoomId  PropertyId  AvailableRooms
1       1           10
Step 2:
HoldAndCommit Table
RoomId  HoldCount  CommitCount  HoldTime  CommitTime
1       2          0            11:00PM   11:00PM
Room Table
RoomId  PropertyId  AvailableRooms
1       1           8
Step 3:
HoldAndCommit Table
RoomId  HoldCount  CommitCount  HoldTime  CommitTime
1       0          2            11:00PM   11:03PM
Room Table
RoomId  PropertyId  AvailableRooms
1       1           8
Step 4:
HoldAndCommit Table
RoomId  HoldCount  CommitCount  HoldTime  CommitTime
1       2          0            11:00PM   11:00PM
Room Table
RoomId  PropertyId  AvailableRooms
1       1           10
1) Initially, the available rooms will be 10 for room number 1.
2) Let's say the user wants to book 2 rooms. I will then create an entry in the holdandcommit table with the roomid, the number of rooms to hold in the holdcount column, and the hold time set to the system time. The availablerooms value in the room table will also be reduced, so that it becomes 8.
3) When the user completes a successful payment, I will set the commit count to 2 and the hold count to 0.
4) If the payment fails, the holdcount will be 2, the commit count will be 0, and the available rooms should be reverted back to 10. I was thinking of running a background task in SQL Server Agent, but I am worried about the performance.
Looking at your schema design, it appears that you had a tendency to engineer it with human-readable tables rather than for the process / booking operations.
I'm going to propose a simple approach and reduce the schema down to an "expiring token repository" -- not exactly sure if there is a better way to describe it, but more on that later...
Using a Guid type
Firstly, I would highly suggest using a Guid type as the primary key, as this will allow for an easy WebAPI transition in the future (which will allow for greater compatibility with online booking services).
Using a Guid as a primary key will also provide a layer of overall security (obfuscation), as it will not be a running index value.
For example, instead of RoomId(int), we will use RoomId(guid). So now instead of RoomId=1 we will have something like RoomId=00000000-0000-0000-0000-000000000001 (note this is just an example Guid value; a real one will have randomized values).
Simple expiring token repository
Sorry to use complicated words to describe this solution. To break it down, it is basically a very simple table that keeps track of which inventory items (rooms) have "holds".
Holds Table
Id(Guid)   RoomId(Guid)***  Expiration(DateTime)
a8...e7ef  00...0001        2018-12-31 12:59  // expired
ff...e96a  00...0001        2019-01-08 12:00  // not expired
b0...ff84  00...0001        2019-01-08 12:01  // not expired
***Note that I would change RoomId to something generic like ProductId, AssetId or ItemId. This will allow you to use the repository to put holds on anything that has a Guid (such as a promotion deal for long-stay rooms, a Valentine's Day room, etc.).
Putting it together
The Holds table is basically a long ledger or journal entry of every single hold ever taken, no items need to get deleted (or even modified) and can remain on record for any audits or reports you wish to generate about lost opportunities, etc.
How it works...
Customer makes a hold on a room.
The system creates a record in the Holds table, notes the expiration time.
Customer continues on to make a reservation.
The system creates a record in the Reservations table.
When you are searching for room availability, you run a simple query:
Check how many records the RoomId has in the Holds table that have not yet expired.
Then add that count to the inventory balance from the Reservations table.
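A minimal in-memory sketch of that availability check follows. It reads the last step as "live holds plus reserved rooms are the rooms that cannot be offered", which is an interpretation and not something the answer spells out. The totalRooms and reservedRooms figures, the CountActiveHolds helper and the fixed now value are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var roomId = new Guid("00000000-0000-0000-0000-000000000001");

// Holds ledger mirroring the example table: one expired hold, two live ones.
var holds = new List<(Guid Id, Guid RoomId, DateTime Expiration)>
{
    (Guid.NewGuid(), roomId, new DateTime(2018, 12, 31, 12, 59, 0)),
    (Guid.NewGuid(), roomId, new DateTime(2019, 1, 8, 12, 0, 0)),
    (Guid.NewGuid(), roomId, new DateTime(2019, 1, 8, 12, 1, 0)),
};

// Hypothetical figures: total inventory and rooms already reserved.
const int totalRooms = 10;
const int reservedRooms = 3;

// Count holds on a room that have not expired at the given moment.
int CountActiveHolds(Guid room, DateTime asOf) =>
    holds.Count(h => h.RoomId == room && h.Expiration > asOf);

// A fixed "now" keeps the sketch deterministic; real code would use the current time.
var now = new DateTime(2019, 1, 8, 11, 30, 0);
int activeHolds = CountActiveHolds(roomId, now);
int unavailable = reservedRooms + activeHolds;
int available = totalRooms - unavailable;

Console.WriteLine($"Active holds: {activeHolds}, available rooms: {available}");
```

With these sample figures, two holds are live at 11:30, which leaves 5 of the 10 rooms available. An expired hold simply stops being counted, so no background job has to revert anything when a payment fails or is abandoned.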

How do you put multiple data items with own unique key on another unique key?

I'm making an ordering system with product, supplier and order tables. What I'm trying to do is allow an order to have multiple products and one supplier. For example, OrderID 001 can have 3 products from the product table and 1 supplier from the supplier table. How can I do this?
Sorry for asking too much, but I don't have any code yet for this part of the system as I don't know where to begin. Thank you.
Create an Orders table.
Add all ordered products to the Orders table.
In the table, the three rows would all have the same order_id, but each would have a different product (or the same product, in case someone bought two of the same). You will also need to track the amount they purchase with each row, in case the amount changes.
Select Sum(purchase_amount) from orders where order_id = 'YOUR_ORDER_ID'
Select * from orders where order_id = 'YOUR_ORDER_ID'
...3 rows show up
You may want to have an order_summary table as well that contains the total amount, etc.
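To make the row layout concrete, here is a small in-memory sketch of the same idea: several rows share one order id, and the two queries above become a filter and a sum. The tuple fields mirror the column names used in the queries, and the sample values are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One row per ordered product; rows of the same order share OrderId.
// The values are made up for the sketch.
var orders = new List<(string OrderId, int ProductId, decimal PurchaseAmount)>
{
    ("001", 10, 19.99m),
    ("001", 11, 5.00m),
    ("001", 12, 42.50m),
    ("002", 10, 59.97m),
};

// Equivalent of: Select * from orders where order_id = '001'
var linesForOrder = orders.Where(o => o.OrderId == "001").ToList();

// Equivalent of: Select Sum(purchase_amount) from orders where order_id = '001'
decimal orderTotal = linesForOrder.Sum(o => o.PurchaseAmount);

Console.WriteLine($"Order 001 has {linesForOrder.Count} rows, total {orderTotal}");
```

The total computed here is the kind of value the suggested order_summary table would store.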

How to create table of tables c# my sql

I have a site for selling products. In my database I have 3 tables: Users, Countries, Products.
Each user can sell his products in many countries.
When clients visit my site, they can search for a product by country and by product price (the same product sold by the same user can have different prices in each country).
I thought about two implementations for this:
Create a linked table with user_id, country_id, product_id. In this case, each time I add a new product I will need to update two tables: the linked table and the products table.
Or create a new table for each country that will hold its products. Then, when I add a new product, I will only need to update one table.
I like my second solution more because I think it will perform faster for both reading and inserting. However, it is hard to manage: I will have lots of tables, and if I want to use cities instead of countries, I will end up with thousands of tables.
Is there a way in MySQL to create a table of tables?
What do you think about my design? Will it really perform faster?
Do NOT go for the second solution. Relational databases are meant to have a fixed number of tables, and you will run into a lot of problems if you try to have a variable amount of tables in the manner you describe.
If I understood your requirements correctly, you should probably use two linked tables: one that contains user_id and country_id (thus telling where each user may sell products), and one that contains country_id, product_id, and price (thus telling the price of each product in each country). (This assumes that a product costs the same within a country no matter who sells it.)
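The two linked tables can be sketched in memory as follows, together with the search by country and price that the question describes. The table contents, the id values and the variable names are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Linked table 1: where each user may sell products.
var userCountries = new List<(int UserId, int CountryId)>
{
    (1, 44),
    (1, 33),
    (2, 44),
};

// Linked table 2: the price of each product in each country.
var countryProductPrices = new List<(int CountryId, int ProductId, decimal Price)>
{
    (44, 100, 12.00m),
    (33, 100, 14.50m),
    (44, 101, 99.00m),
};

// Search by country and maximum price, cheapest first.
int countryId = 44;
decimal maxPrice = 50m;
var matches = countryProductPrices
    .Where(p => p.CountryId == countryId && p.Price <= maxPrice)
    .OrderBy(p => p.Price)
    .Select(p => (p.ProductId, p.Price))
    .ToList();

// Users allowed to sell in that country.
var sellers = userCountries
    .Where(uc => uc.CountryId == countryId)
    .Select(uc => uc.UserId)
    .ToList();

Console.WriteLine($"{matches.Count} product(s) found, {sellers.Count} seller(s) in country {countryId}");
```

Adding a new country, or switching from countries to cities, only adds rows with a new id. The number of tables stays fixed.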

Need advice for Database structure (MS SQL / MySQL)

Preferably MS SQL, as I want to make the switch over from MySQL.
So I have this awesome customer who has 4 Excel Files. Each Excel file represents a Product Range.
Within each Excel File, there are between 3 to 8 Sheets. Each Sheet representing a Type of Product within that Product Range.
Each Sheet, contains the following Columns:
PartNo, Description, QTY, Price1, Price2, Price3, Price4...
(There's never been, and won't be any more than 8 Price Columns.)
Each Sheet may contain from about 5 to 5000 rows.
Now, the problem I am facing is not knowing the best way to go about setting up my new database.
The way I currently have our existing MySQL database set up is that each Sheet represents a Table. That's it! (It had to be put "out there" quickly, hence the lack of time invested into setting up a proper format/structure for the DB.)
I've recently found that I am much more competent using MSSQL databases, so I want to make the switch. The second and main reason is that I want to restructure the database so that it is easier to manage and easier to set up database searches from my site.
I'm not worried at all about how to insert everything into the database, as in my spare time over the last year I've written an app that parses these Excel files, extracts the sheets, and inserts them into the DB, with optional settings. I am worried about how I should actually set up this new DB.
What would be the best way to go about this given the above details?
Any help at all is greatly appreciated. Thank you!
Update:
About the pricing columns (example), a little info on why there is more than one price column in each sheet:
Price column 1 may be Galvanised Unit Price, Price Column 2 may be Galvanised Box Price, Price Column 3 may be Stainless Steel GR304 Box Price, while column 4 may be Stainless Steel GR316 Unit Price. These price columns are different for each Product Range, however, some products in a Product Range may also contain some of the same Price Columns. This is why it was so easy to have each product as a separate table.
I assume that you're seeking advice on the design of the database, right?
I'd never design the database with each sheet representing a table. I'd have one table per conceptual entity, and in your case Product is the obvious one.
So I'd have a table called Products, with the columns you mentioned above. To accommodate the Product Type, I'd simply have a Type column that indicates which Product Type any particular Product belongs to. I'd eventually use an enum type in .NET to specify the different types that are supported.
Well, that was my five cents. Hope it helped.
A simple solution would be to split it out into 3 tables. ProductRange and ProductType are linked to the Product table through the foreign key relationships.
ProductRange
- Id
- RangeName
// Plus any other columns you need, e.g. description, startdate etc
ProductType
- Id
- ProductTypeName
Product
- Id
- ProductRangeId
- ProductTypeId
- PartNo
- Description
- Qty
- Price1
// etc
If you wanted more flexibility around the price, you would create another Price table with a many-to-many joining table between the Product and Price tables.
Price
- Id
- Description
- Price
ProductPrice
- PriceId
- ProductId
Your Product table wouldn't contain any price columns in this case. But you are now able to add as many Price Types as you need and each Product could have any number of Prices.
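A small in-memory sketch of the Price and ProductPrice tables shows how the joining table lets one product carry any number of prices. The part numbers and amounts are invented, and the price descriptions reuse the column meanings given in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Product table without any price columns (sample part numbers are made up).
var products = new List<(int Id, string PartNo)>
{
    (1, "BOLT-M8"),
    (2, "NUT-M8"),
};

// Price table: each row is one price entry with a description of its type.
var prices = new List<(int Id, string Description, decimal Price)>
{
    (1, "Galvanised Unit Price", 0.40m),
    (2, "Galvanised Box Price", 35.00m),
    (3, "Stainless Steel GR304 Box Price", 80.00m),
};

// Joining table: which price rows apply to which product.
var productPrices = new List<(int PriceId, int ProductId)>
{
    (1, 1),
    (2, 1),
    (3, 1),
    (1, 2),
};

// All prices for product 1, resolved through the joining table.
var pricesForProduct = (from pp in productPrices
                        join pr in prices on pp.PriceId equals pr.Id
                        where pp.ProductId == 1
                        select (pr.Description, pr.Price)).ToList();

foreach (var entry in pricesForProduct)
{
    Console.WriteLine($"{entry.Description}: {entry.Price}");
}
```

A ninth price type would be one more row in Price plus rows in ProductPrice, with no change to the Product table.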
Having multiple columns for price seems a little odd to me unless Price1 means "Business Rate" and Price2 means "Consumer Rate" or something similar. If the different prices are for different features of the same product, you might want to consider separating them out into a different table. For example:
Products table: KeyTable, ProductDescription...
Price table: KeyTable, KeyProduct, PriceType, Price
Alternatively, depending on how the prices are calculated, you could work the various other prices from one base price. For example, if a business rate price would be to add 5% to the base price and a consumer price would be to add 20% to the base price, these probably should not be stored in the database as they can easily be calculated when they are required.
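The arithmetic in that example can be sketched as below, using the 5% and 20% figures from the paragraph. The base price of 100 and the ApplyMarkup helper are hypothetical.

```csharp
using System;

// Only the base price would be stored; 100 is a made-up sample value.
decimal basePrice = 100.00m;

// Markups taken from the example: 5% for business, 20% for consumer.
const decimal businessMarkup = 0.05m;
const decimal consumerMarkup = 0.20m;

// Derive a price on demand instead of storing it.
decimal ApplyMarkup(decimal price, decimal markup) => price * (1 + markup);

decimal businessPrice = ApplyMarkup(basePrice, businessMarkup); // 100 * 1.05 = 105
decimal consumerPrice = ApplyMarkup(basePrice, consumerMarkup); // 100 * 1.20 = 120

Console.WriteLine($"Business: {businessPrice}, consumer: {consumerPrice}");
```

Using decimal keeps the percentages exact, which binary floating point would not guarantee for money values.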
