Rounding up of data in SQL Server 2014 - C#

I have developed an application in C# (ASP.NET 4.0) using SQL Server 2014 as the database.
I have question about rounding up and summing data. I have data that comes in via CSV through FTP and I import the raw data into a table. The data is logged every minute by the customer. The data is identified by customer Id.
I have now been asked to take that data and sum the time series data into 15-minute chunks, aligned to the hour.
They then want that data rolled up into days (midnight to midnight), then that data rolled up into weeks (Monday to Sunday). They also want the day data rolled up into calendar months (midnight to midnight) and the month data rolled up into a year.
The idea is that the raw time series data is grouped into its constituent periods, such as day, week and month, so they can see the total for each time period.
I have looked at cursors and loops in SQL and I have been told the overhead will be too great as I already have 300 million rows and counting to process.
I don't know whether I should develop a service in C# that does it all on the server or use the database. The research I have done contradicts itself slightly in each case.
Any hints as to where to look and what to try would be great.

I believe you are looking more for a design than a solution here.
I would suggest you create a table (table1) which will hold the data of the FTP load along with a batch ID (a unique identifier).
Create another table (table2) where you record each batch ID with a status column; always insert a row here once you are done with the FTP load into table1, with the status set to N.
Now, create a polling script in C# (or, if you are experienced with Service Broker in SQL Server, use that) to poll table2 for batch IDs with status N. This polling script should call the stored procedure below.
Now, create another stored procedure which will sum up all the records for this batch ID only, and add the values to the daily counts appropriately.
The same will be done for the weekly counts and so on.
Once all this is done, remove the rows from table1 for the batch ID we processed; if you need this information for future purposes you can store it in a different table.
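A minimal sketch of the polling piece, to make the design concrete. The connection string, table name (LoadBatch) and procedure name (usp_RollupBatch) are my assumptions, not from the original post:

    using System;
    using System.Collections.Generic;
    using System.Data;
    using System.Data.SqlClient;

    class BatchPoller
    {
        // Assumed connection string; point it at your own database.
        const string ConnStr = "Server=.;Database=Telemetry;Integrated Security=true";

        static void Main()
        {
            using (var conn = new SqlConnection(ConnStr))
            {
                conn.Open();

                // Find batches still marked 'N' (loaded but not processed).
                var pending = new List<Guid>();
                using (var find = new SqlCommand(
                    "SELECT BatchId FROM LoadBatch WHERE Status = 'N'", conn))
                using (var rdr = find.ExecuteReader())
                    while (rdr.Read())
                        pending.Add(rdr.GetGuid(0));

                foreach (var batchId in pending)
                {
                    // The proc does the set-based rollup for this batch only,
                    // then flips the batch status so it is not processed twice.
                    using (var roll = new SqlCommand("usp_RollupBatch", conn))
                    {
                        roll.CommandType = CommandType.StoredProcedure;
                        roll.Parameters.AddWithValue("@BatchId", batchId);
                        roll.ExecuteNonQuery();
                    }
                }
            }
        }
    }

Inside the procedure, the 15-minute buckets can be produced set-based, e.g. GROUP BY DATEADD(minute, (DATEDIFF(minute, 0, LogTime) / 15) * 15, 0), which avoids cursors entirely even at 300 million rows.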

To have the power of managing the data, and to be ready for any change in business rules in the future, you need to add some control columns to the table.
The controls manage period/hour/day/month/year or whatever comes up in the future.
When you add a row, fill the corresponding control fields, one at a time, with the corresponding value:
period 1..4
hour 1..24
day 1..366
week 1..55
month 1..12
year (if needed)
You can define a set of SQL functions to fill these columns at once (during data loading from the file).
Create indexes for these columns.
Once you do this, you can, via C# code or SQL code, do the summing up dynamically for any period/hour/day and so on.
You can benefit from Analysis Services / window functions / pivots to do your magic :) on the data for any interval.
This approach gives you the power of keeping data (no deletion, except for archiving purposes) and of managing changes in the future.
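A minimal sketch of filling those control fields at load time in C# (the class and property names are my own, not from the answer); the week calculation approximates ISO-8601 numbering with weeks running Monday to Sunday, as the question asks:

    using System;
    using System.Globalization;

    // Computes the control-column values for one raw reading at load time.
    class ControlValues
    {
        public int Period, Hour, Day, Week, Month, Year;

        public static ControlValues From(DateTime logTime)
        {
            var cal = CultureInfo.InvariantCulture.Calendar;
            return new ControlValues
            {
                Period = logTime.Minute / 15 + 1,  // quarter of the hour, 1..4
                Hour   = logTime.Hour + 1,         // 1..24, as listed above
                Day    = logTime.DayOfYear,        // 1..366
                Week   = cal.GetWeekOfYear(logTime,
                             CalendarWeekRule.FirstFourDayWeek, DayOfWeek.Monday),
                Month  = logTime.Month,            // 1..12
                Year   = logTime.Year
            };
        }
    }

With these columns indexed, a daily total becomes a plain GROUP BY Year, Day and a weekly total a plain GROUP BY Year, Week; no cursors or loops are needed.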


Persisting data between sessions in C#

All,
I have a test program that will serialize test subjects for some research sessions. I'll be running the program at different times, so I need this data to persist. It will be a simple ID number in 001 format (two leading zeros until 10, one leading zero until 100) and will max out at 999. How would I accomplish this in C#? Ideally, it starts up, reads the persistent data, then starts registering new test subjects at the latest number. This number will then be used as a primary key to recognize the test subjects in a database. I've never done anything remotely like this before, so I'm clueless as to what I should do.
EDIT:
I probably should have clarified... there are multiple databases. One is a local SQLite file that holds the test subject's trial data (the specific data from each test). The other is a much larger MySQL database that holds more general information (things about the test subject relevant to the study). The MySQL database is remote and data from the application is not directly submitted to it... that's handled by another application that takes the SQLite file and submits that data to the MySQL database. The test environment is variable and may not have a connection to the MySQL database. As such, it's not a viable candidate for holding such data as I need the ID numbers each time I start the program, regardless of the connection state to the MySQL database. The SQLite files are written after program execution from a text file (csv) and need to contain the ID number to be used as a primary key, so the SQLite database might not be the best candidate for storing the persistent data. Sorry I didn't explain this earlier... it's still early in the day :P
If these numbers are used as the index in a database, why not check the database for the next number? If 5 subjects have been registered already, next time just query the database, get the max of the index and add 1 for the next subject. Once you insert that subject, you can add 1 again.
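A minimal sketch of that approach against the local SQLite file, since it is available at startup. The Subjects table, Id column, and the System.Data.SQLite provider are my assumptions; the result is formatted in the 001 style the question asks for:

    using System;
    using System.Data.SQLite;  // System.Data.SQLite package

    class SubjectIds
    {
        // Returns the next subject ID, formatted "001".."999".
        public static string Next(string dbPath)
        {
            using (var conn = new SQLiteConnection("Data Source=" + dbPath))
            using (var cmd = new SQLiteCommand(
                "SELECT COALESCE(MAX(Id), 0) FROM Subjects", conn))
            {
                conn.Open();
                int next = Convert.ToInt32(cmd.ExecuteScalar()) + 1;
                if (next > 999)
                    throw new InvalidOperationException("ID range exhausted.");
                return next.ToString("D3");  // 1 -> "001", 42 -> "042"
            }
        }
    }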

Drop the table or delete all data, and how to run time-based events on the server

In SQL Server I need to build an archive system: at the end of every month I want to move all rows posted in the last month into a table named after that month, but also delete all existing data from that target table first (as you see, that old data was posted a year ago, so I don't need it anymore).
My questions:
1- What is the better way to do this: drop the table itself and then create a new one with a SELECT INTO statement, or delete all rows using a DELETE statement and then insert the new rows from the original table? (With the huge data I think this may make a difference.)
2- How can I do that operation automatically? Are there any time-based triggers, or something in ASP.NET or on the server that can do this?
Thanks for the help.
Use the TRUNCATE TABLE statement; this deletes everything from the table but keeps the structure.
You can set up a job in SQL Server using SQL Server Agent to run it periodically.
You can also create a scheduled task in Windows which runs at the end of the month and executes a procedure that does your job.
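A minimal sketch of the monthly move as a small console program that Task Scheduler (or a SQL Agent job running the same statements) could fire at month end; the table names, date column, and connection string are assumptions:

    using System.Data.SqlClient;

    class MonthlyArchive
    {
        static void Main()
        {
            const string connStr =
                "Server=.;Database=MyDb;Integrated Security=true";
            // The archive table would normally be picked by month name;
            // it is hard-coded here for brevity.
            const string sql = @"
                BEGIN TRANSACTION;
                TRUNCATE TABLE Archive_January;  -- keeps structure, drops rows
                INSERT INTO Archive_January
                    SELECT * FROM Posts
                    WHERE PostedOn >= DATEADD(month, -1, GETDATE());
                DELETE FROM Posts
                    WHERE PostedOn >= DATEADD(month, -1, GETDATE());
                COMMIT;";

            using (var conn = new SqlConnection(connStr))
            using (var cmd = new SqlCommand(sql, conn))
            {
                conn.Open();
                cmd.ExecuteNonQuery();
            }
        }
    }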

Parsing excel sheet in C#, inserting new values into database

I am currently working on a project to parse an excel sheet and insert any values into a database which were not inserted previously. The sheet contains roughly 80 date-value pairs for different names, with an average of about 1500 rows per pair.
Each name has 5 date-value pairs entered manually at the end of the week. Over the weekend, my process will parse the excel file and insert any values that are not currently in the database.
My question is: given the large amount of total data and the small amount added each week, how would you easily determine which values need to be inserted? I have considered adding another table to store the last date inserted for each name and taking any rows after that.
Simplest solution: I would bring it all into a staging table and do the comparison on the server. Alternatively, SSIS with an appropriate sort and lookup could determine the differences and insert them.
120,000 rows is not significant to compare in the database using SQL, but 120,000 individual calls to the database to verify whether each row already exists might take a while on the client side.
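A minimal sketch of the staging-table approach (table and column names are assumptions): bulk-load the parsed sheet in one round trip, then let a single set-based statement insert only the unseen rows.

    using System.Data;
    using System.Data.SqlClient;

    class StagingImport
    {
        // 'rows' holds the parsed sheet as Name / ReadingDate / Value.
        public static void Import(DataTable rows, string connStr)
        {
            using (var conn = new SqlConnection(connStr))
            {
                conn.Open();

                // One round trip to load everything into the staging table.
                using (var bulk = new SqlBulkCopy(conn))
                {
                    bulk.DestinationTableName = "Staging";
                    bulk.WriteToServer(rows);
                }

                // One set-based statement inserts only rows not seen before.
                const string insertNew = @"
                    INSERT INTO Readings (Name, ReadingDate, Value)
                    SELECT s.Name, s.ReadingDate, s.Value
                    FROM Staging s
                    WHERE NOT EXISTS (SELECT 1 FROM Readings r
                                      WHERE r.Name = s.Name
                                        AND r.ReadingDate = s.ReadingDate);
                    TRUNCATE TABLE Staging;";
                using (var cmd = new SqlCommand(insertNew, conn))
                    cmd.ExecuteNonQuery();
            }
        }
    }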
Option 1 would be to create a "lastdate" table that is automatically stamped at the end of your weekend import. The next week your program could query the last record in that table, then only read from the Excel file after that date. Probably your best bet.
Option 2 would be to find a unique field in the data and, row by row, check whether that key exists in the database. If it doesn't exist, you add it; if it does, you skip it. This would be my second choice if Option 1 didn't work how you expect it to.
It all depends on how bulletproof your solution needs to be. If you trust the users that the spreadsheet will not be tweaked in any way that would make it inconsistent, then your solution would be fine.
If you want to be on the safe side (e.g. if some old values could potentially change), you would need to compare the whole thing with the database. To be honest, the amount of data you are talking about here doesn't seem very big, especially when your process will run over a weekend. And you can still optimize by writing "batch"-style stored procedures for the database.
Thanks for the answers, all.
I have decided that, rather than creating a new table that stores the last date, I will just select the max date for each name, then insert values after that date into the table.
This assumes that the data prior to the last date remains consistent, which should be fine for this problem.
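A minimal sketch of that max-date lookup, reusing the hypothetical Readings table from the staging example above:

    using System;
    using System.Data.SqlClient;

    class LastDates
    {
        // Latest stored date for one name; the Excel parse can then skip
        // anything on or before it.
        public static DateTime LastDateFor(string name, string connStr)
        {
            using (var conn = new SqlConnection(connStr))
            using (var cmd = new SqlCommand(
                "SELECT COALESCE(MAX(ReadingDate), '19000101') " +
                "FROM Readings WHERE Name = @name", conn))
            {
                cmd.Parameters.AddWithValue("@name", name);
                conn.Open();
                return (DateTime)cmd.ExecuteScalar();
            }
        }
    }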

Import Text Specification in Access Database

We are using C#.NET and use an import specification to load a text file into an Access table.
Is there any Access database limit for this action? We may have more than 5 lac (500,000) records; will this process work for that many records?
If not, how can we handle inserting that many records into the Access database?
Thanks
The import process doesn't have any specific limit on the number of records that you can import or store in a table; however, it does limit you to a single table size of 1 gigabyte for Access 2000, or 2 gigabytes for later versions.
A huge number of small records is OK, and a small number of huge records is OK. But, a huge number of huge records will probably hit the limit.
P.S. You shouldn't use lac (lakh) on international forums because it is only understood in India and nearby countries. 1 lac = 100,000
Would you consider the following? (A sketch follows below.)
Load your data in C# (StreamReader, etc.),
start an OleDbTransaction,
run an INSERT query 500k times using an OleDbCommand,
commit your transaction.
This also takes away your dependency on the Access import specification, so it might port more easily to other database types in the future.
The speed should be comparable to the Access import, but it requires you to code up the equivalent of your import specification (i.e., 'CREATE TABLE' SQL and 'INSERT INTO' SQL).
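A minimal sketch of that transaction-wrapped loop. The file path, delimiter, connection string, and table/column layout are assumptions; a real import would reproduce the full import specification here:

    using System.Data.OleDb;
    using System.IO;

    class AccessBulkInsert
    {
        static void Main()
        {
            const string connStr =
                @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\data\import.accdb";

            using (var conn = new OleDbConnection(connStr))
            {
                conn.Open();
                using (var tx = conn.BeginTransaction())
                using (var cmd = new OleDbCommand(
                    "INSERT INTO Records (Field1, Field2) VALUES (?, ?)", conn, tx))
                {
                    var p1 = cmd.Parameters.Add("p1", OleDbType.VarWChar);
                    var p2 = cmd.Parameters.Add("p2", OleDbType.VarWChar);

                    using (var reader = new StreamReader(@"C:\data\records.txt"))
                    {
                        string line;
                        while ((line = reader.ReadLine()) != null)
                        {
                            var parts = line.Split('\t');  // assumes tab-delimited
                            p1.Value = parts[0];
                            p2.Value = parts[1];
                            cmd.ExecuteNonQuery();  // one insert per row
                        }
                    }
                    tx.Commit();  // a single commit keeps 500k inserts fast
                }
            }
        }
    }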

Best way to incorporate legacy data

I am working on a price list management program for my business in C# (the prototype is in WinForms, but I am thinking of using WPF for the final app as an MVVM learning exercise).
Our EMS system is based on a COBOL back end and will remain that way for at least 3 years, so I cannot really access its data directly. I want to pull data from the EMS system periodically to ensure that pricing remains in sync (and to provide some other information to users in a non-editable manner, such as bin locations). What I am looking at doing is...
Use WinBatch to automatically run a report nightly, then use Monarch to convert the text report to a flat file (.xls?)
Drop the file into a folder and write a small app to read it in and add it to the database
How should I add this to the database? (SQL Express) I could have a table that is just replaced completely each time, but I am a beginner at most of this and I am concerned about what would happen if an entire table was replaced while the database was being used by the price list app.
Mike
If you truncate and refill a whole table, you should do it in one single transaction and take a full table lock. This is safer and faster.
You could also update all changed rows, then insert new (missing) rows, and then delete all rows which weren't updated in this run (insert some kind of version number in each row to determine this).
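A minimal sketch of the truncate-and-refill variant (table name, columns, and connection string are assumptions); the explicit transaction means readers see either the old table or the new one, never a half-filled one:

    using System.Data;
    using System.Data.SqlClient;

    class PriceRefresh
    {
        public static void Refill(DataTable newPrices, string connStr)
        {
            using (var conn = new SqlConnection(connStr))
            {
                conn.Open();
                using (var tx = conn.BeginTransaction())
                {
                    // TRUNCATE takes an exclusive schema lock, so readers
                    // block until the transaction commits.
                    using (var clear = new SqlCommand(
                        "TRUNCATE TABLE Prices", conn, tx))
                        clear.ExecuteNonQuery();

                    using (var bulk = new SqlBulkCopy(
                        conn, SqlBulkCopyOptions.TableLock, tx))
                    {
                        bulk.DestinationTableName = "Prices";
                        bulk.WriteToServer(newPrices);
                    }

                    tx.Commit();
                }
            }
        }
    }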
First create a .txt file from the legacy application. Then use a bulk insert to pull it into a work table for whatever clean-up you need to make. Do the clean-up using T-SQL. Then run T-SQL to insert new data into the proper tables and/or to update rows where data has changed. If there are too many records, do the inserting and updating in batches. Schedule all this as a job to run during hours when the database is not busy.
You could of course do all of this best in SSIS, but I don't know if that is available with Express.
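A minimal sketch of the update-then-insert step once the work table is clean (all object names are assumptions):

    using System.Data.SqlClient;

    class NightlySync
    {
        // Update changed rows first, then insert the missing ones.
        const string UpsertSql = @"
            UPDATE p SET p.Price = w.Price
            FROM Prices p JOIN WorkTable w ON w.Sku = p.Sku
            WHERE p.Price <> w.Price;

            INSERT INTO Prices (Sku, Price)
            SELECT w.Sku, w.Price
            FROM WorkTable w
            WHERE NOT EXISTS (SELECT 1 FROM Prices p WHERE p.Sku = w.Sku);";

        public static void Run(string connStr)
        {
            using (var conn = new SqlConnection(connStr))
            using (var cmd = new SqlCommand(UpsertSql, conn))
            {
                conn.Open();
                cmd.ExecuteNonQuery();
            }
        }
    }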
Are there any fields/tables available to tell you when a price was last updated? If so, you can just pull the recently updated rows and update them in your database... assuming you have a readily available unique primary key in your COBOL app's datastore.
This wouldn't be fully up to date, though, because you're running it as a nightly script to update the database used by the new app. You could perhaps create a .NET routine that queries the COBOL datastore specifically for whatever price the user is looking for and, if the COBOL datastore's update time is more recent than what you have logged, updates the SQL Server record(s).
(I'm not familiar with COBOL at all, just throwing ideas out there.)
