Copy data from one SQL Server to another over LAN programmatically - C#

I have one SQL Server, A, located at a remote site, which contains a huge number of records (raw data). A continuously running process (C# .NET) there notifies me via a web service (WCF) when there are records that need to be processed, and I need to move those records to my SQL Server B for processing. What is an elegant and efficient way to do that?
I have a couple of thoughts on that:
1) Send the records in batches from one to the other via WCF.
2) Save the records in a file and upload it to an FTP server. Then I can download it from there and load the records into my DB.
Is there a better way to do that?

This really depends on how real-time the data needs to be. In our organization we use a lot of message queues (MQs) to keep data synchronized, because it needs to be updated in real time between different applications.
REAL-TIME
If the data needs to be real-time and you can set up an MQ, that's what I'd recommend. They are fast, lightweight, and durable. They do take some work to set up, but here is a link that can get you started.
BATCH
If the data can be updated in batch, you're going to be better off. Real-time data, and the issues that come along with triage, are a lot more complex and cumbersome in practice. With a batch file you can validate and sanitize the data up front to ensure the CRUD operations will succeed. With batch, use a text file, delimited or fixed-width, and import it using an SSIS job. SSIS can pull it down from the FTP server and import it, all in one fell swoop.
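The "validate and sanitize up front" step can be sketched in a few lines. This is only an illustration, written in Python for brevity: the pipe delimiter, the three-column layout (id, name, amount) and the function name validate_batch are assumptions, not part of the original setup. The same checks could live in an SSIS data flow or a small C# loader.

```python
import csv
import io

# Hypothetical layout: id (integer), name (non-empty text), amount (decimal).
EXPECTED_HEADER = ["id", "name", "amount"]

def validate_batch(text, delimiter="|"):
    """Return (rows, errors); rows are only safe to import when errors is empty."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader, None)
    if header != EXPECTED_HEADER:
        return [], [f"line 1: unexpected header {header!r}"]
    rows, errors = [], []
    for line_no, fields in enumerate(reader, start=2):
        if len(fields) != len(EXPECTED_HEADER):
            errors.append(f"line {line_no}: wrong number of fields ({len(fields)})")
            continue
        raw_id, name, raw_amount = (f.strip() for f in fields)
        try:
            row = (int(raw_id), name, float(raw_amount))
        except ValueError:
            errors.append(f"line {line_no}: id or amount is not numeric")
            continue
        if not name:
            errors.append(f"line {line_no}: name is empty")
            continue
        rows.append(row)
    return rows, errors
```

Rejecting the whole file when any line fails keeps the import all-or-nothing, which is usually easier to reason about than a partially loaded batch.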

Related

C# upload CSV file to Netezza

So my team is looking into connecting to Netezza with C#, and we plan on loading data into Netezza, pulling data from Netezza, and writing update queries, all in C#.
From my research, I see that it's possible to connect to netezza using C# and I'm wondering if you can do all that is bolded above using C# so that we can decide on whether or not we can do just about anything with Netezza using C#. We'd like to know before we commit to anything. The types of data we would be loading are CSV files.
Are there any good resources on this? I haven't been able to find any.
We also have Aginity client tools so maybe it's possible to incorporate Aginity to this (Not that I would want to but if it's easier I'd like to know about it)?
Retrieving data is straightforward and can be done through the usual channels (loop over a cursor to get results), but loading can take a bit longer.
Netezza is not a fan of multiple INSERT queries: it doesn't support multi-row inserts, so loading a large number of records with individual INSERT statements will take a long time.
When loading many records, most people write their data out to a ".csv" file and use the external table syntax to perform the insert.
In an application, we prefer to load and unload our data via a named pipe so that we don't have to write the data to disk first.

system architecture for real-time data

The company I work for runs a C# project that crawls data from around 100 websites, saves it to the DB, and runs some procedures and calculations on that data.
Each of those 100 websites has around 10,000 events, and each event is saved to the DB.
After that, the saved data is aggregated into one big XML file per event, so each of those 10,000 saved events is now represented as an XML file in the DB.
The design looks like this:
1) Crawl 100 websites to collect the data and save it to the DB.
2) Collect the data that was saved to the DB and generate an XML file for each event.
3) Save the XML files to the DB.
The main issue for this post is the retrieval of the saved XML files.
Each XML is about 1MB, and considering the fact that there are around 10,000 events, I am not sure SQL Server 2008 R2 is the right option.
I tried Redis, and saving works very well (and fast!), but the query to get those XMLs is very slow (even locally, so network traffic isn't the issue).
What are your thoughts? Please take into consideration that this is a real-time system, so caching is not an option here.
Any idea will be welcomed.
Thanks.
Instead of using a DB, you could try a cloud-based storage system (Azure blobs or Amazon S3); it seems to be a perfect fit. See this post: azure blob storage effectiveness. It is the same situation, except you have XML files instead of images. You can use a DB for storing the metadata (i.e. the source and event type of the XML, and its path in the cloud), but not the data itself.
You may also zip the files. I don't know the exact method, but it can surely be handled on client-side. Static data is often sent in zipped format to the client by default.
Your question is missing some details such as how long does your data need to remain in the database and such…
I’d avoid storing XML in database if you already have the raw data. Why not have an application that will query the database and generate XML reports on demand? This will save you a lot of space.
10GBs of data per day is something SQL Server 2008 R2 can handle with the right hardware and good structure optimization. You’ll need to investigate if standard edition will be enough or you’ll have to use enterprise or data center licenses.
In any case, the answer is yes: SQL Server is capable of handling this amount of data, but I’d check other solutions as well to see if it’s possible to reduce the costs in any way.
Your basic architecture doesn't seem to be at fault; it's the way you've approached Redis. If you design your key => value mapping right, there is no way retrieval from Redis should be slow.
For example, let's say I have to store 1 million objects in Redis, and there is an ID against which I store each object; this key is nothing but a GUID. The save will be really quick, but when it comes to retrieval, do I know the key? If I know the key, it'll be fast, but if I don't know it, or I am trying to retrieve my data on the basis of some value in my objects rather than the key, then of course it'll be slow.
The point is: when it comes to retrieval, you should work against the key and nothing else. Design your key as a value you can compute in advance, so that when you need to get some data from Redis/memcache, you can construct the key and do a single hit to get the data.
If you could add more details, we'll be able to help you better.
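A minimal sketch of that key design, in Python for brevity. A plain dictionary stands in for the key-value store so the example runs on its own; the key scheme "event-xml:{site}:{event}" and all function names are hypothetical. With a real Redis client the same keys would simply be passed to SET and GET.

```python
# A plain dict stands in for the key-value store in this sketch.
store = {}

def make_key(site_id, event_id):
    """Build the key from values the reader already knows (hypothetical scheme)."""
    return f"event-xml:{site_id}:{event_id}"

def save_event(site_id, event_id, xml):
    store[make_key(site_id, event_id)] = xml

def get_event(site_id, event_id):
    # One direct lookup: the cost does not grow with the number of stored events.
    return store.get(make_key(site_id, event_id))

def find_by_content(fragment):
    # Anti-pattern: searching on values forces a scan over every stored item.
    return [key for key, xml in store.items() if fragment in xml]
```

If the reader knows the site and event it wants, get_event is a single hit. find_by_content is the slow pattern described above, because nothing in the key helps narrow the search.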

Can 1000 users read a single text file at the same time?

On my website I have one CSV file, which contains millions of records.
Based on some search key, I need to select one record.
This part I have completed.
My doubt is: if multiple users (1000 users) access my website (only one CSV file will be available), can the same file be read by 100 users at once?
1M records is not a lot. Frankly, I'd just load it all into structured data and reference that. Any number of users can access it once it is in memory (especially for read-only use).
But ultimately the ideal answer here is: use a database. SQL Server Express is free and will cope with that effortlessly.
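For the in-memory route, a rough sketch (Python for brevity; the column names and sample rows are made up for illustration). The file is parsed once at start-up and indexed by the search key, so each request becomes a dictionary lookup instead of a file scan.

```python
import csv
import io

def build_index(csv_text, key_column):
    """Parse the CSV once and index every row by the search key."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_column]: row for row in reader}

# Built once at application start-up; hypothetical columns and data.
SAMPLE = "sku,name,price\nA100,bolt,0.10\nB200,nut,0.05\n"
INDEX = build_index(SAMPLE, "sku")

def lookup(sku):
    # Read-only dictionary access; concurrent requests never reopen the file.
    return INDEX.get(sku)
```

In a real site the text would come from the file on disk rather than a string literal, and the index would be rebuilt whenever the file changes.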
As long as the application only has to read, you will not have a problem. However, it would be more efficient to use a database for this task. You can create indexes and use SQL for easy access. There is no need to parse the file on each request, and you can even add or change data while your site is running.

Data-Driven Websites for Very Small Businesses

I have a client who has a product-based website with hundreds of static product pages that are generated by Microsoft Access reports and pushed up to the ISP via FTP (it is an old design). We are thinking about getting a little more sophisticated and creating a data-driven website, probably using ASP.NET MVC.
Here's my question. Since this is a very small business (a handful of employees), I'd like to avoid enterprise patterns like web services if I can. How does one push updated product information to the website, batch-style? In a SQL Server environment, you can't just push up a new copy of the database, can you?
Clarification: The client already has a system at his facility where he keeps all of his product information and specifications. I would like to refresh the database at the ISP with this information.
You don't mention what exactly the data source is, but the implication is that it's not already in SQL Server. If that's the case, have a look at SSIS.
If the source data is in SQL Server, then I think you'd want to be looking at either transactional replication or log shipping to sync the two databases.
If you are modernizing, and it is a handful of employees, why would you push the product info out batch style?
I don't know exactly what you mean by "data driven", but why not allow the ASP.NET app to query the SQL Server product catalog database directly? Why generate static pages at all?
UPDATE: ok, I see, the real question is, how to update the SQL database running at the ISP.
You create an admin panel so the client can edit the data directly on the server. It is perfectly reasonable to have the client keep all their records on the server as long as the server is backed up nightly. Many cloud and virtual services offer easy ways to do replicated backups.
The additional benefit of this model is that more than one user can be adding or updating records at a time, making the workforce a lot more scalable. Likewise, the users can log in from anywhere they have a web browser to add new records, fix mistakes made in old records, etc.
EDIT: This approach assumes you can convince the client to abandon their current data entry system in favor of a centralized web-based management panel. Even if this isn't the case, the SQL database can be hosted on the server and the client's application could be made to talk to that so you're only ever using one database. From the sounds of it, it's a set of Access forms and macros which you should have source access to.
Assuming that there is no way to sync the data directly between your legacy system DB (is it in Access, or is Access just running the reports) and the SQL Server DB on the website (I'm not aware of any):
The problem with "pushing" the data directly into the SQL server will be that "old" (already in the DB) records won't be updated, but instead removed and then recreated. This is a big problem with foreign keys. Plus, I really don't like the idea of giving the client any access to the db at all.
So considering that, I find that the best is to write a relatively simple page that takes an uploaded file and updates the database. The file will likely be CSV, possibly XML. After a few iterations of writing these pages over the years, here's what I've come up with:
1. Show file upload box.
2. On next page load, save file to temp location.
3. Loop through each line (element in XML) and validate all the data. Foreign keys, especially, but also business validations. You can also validate that the header row exists, etc. Don't update the database.
3a. If invalid data exists, save an error message to an array.
4. At the end of the looping, show the view.
4a. If there were errors, show the list of error messages and tell them to re-upload the file.
4b. If there were no errors, create a link that has the file location from #2 and a confirmation flag.
5. After the file location and confirm flag have been submitted, run the loop in #3 again, but there's an if (confirmed) {} statement that actually makes the updates to the db.
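The validate-first, write-on-confirm loop above can be sketched roughly as follows. It is written in Python with an in-process SQLite database purely so the example is self-contained; the table names (products, categories), the CSV columns and the function name process_upload are all assumptions. An ASP.NET action against SQL Server would follow the same shape.

```python
import csv
import io
import sqlite3

def process_upload(conn, csv_text, confirmed=False):
    """Validate every row; write to the database only when confirmed is True."""
    known_categories = {r[0] for r in conn.execute("SELECT id FROM categories")}
    errors, valid = [], []
    for line_no, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=2):
        try:
            item = (int(row["id"]), row["name"], int(row["category_id"]))
        except (KeyError, TypeError, ValueError):
            errors.append(f"line {line_no}: missing or non-numeric field")
            continue
        if item[2] not in known_categories:  # foreign key check
            errors.append(f"line {line_no}: unknown category {item[2]}")
            continue
        valid.append(item)
    if errors or not confirmed:
        return errors  # first pass, or a file that must be re-uploaded
    with conn:  # one transaction: all rows are applied or none
        for item_id, name, category_id in valid:
            cur = conn.execute(
                "UPDATE products SET name = ?, category_id = ? WHERE id = ?",
                (name, category_id, item_id),
            )
            if cur.rowcount == 0:  # no existing row, so insert a new one
                conn.execute(
                    "INSERT INTO products (id, name, category_id) VALUES (?, ?, ?)",
                    (item_id, name, category_id),
                )
    return errors
```

Existing rows are updated in place rather than deleted and recreated, which sidesteps the foreign key problem mentioned earlier. Calling the function without confirmed=True is the dry run that feeds the error list in the view.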
EDIT: I saw your other post. One of the assumptions I made is that the databases won't be the same. ie, the legacy app will have a table or two. Maybe just products. But the new app will have orders, products, categories, etc, etc. This will complicate "just uploading the file".
Why do you need to push anything?
You just need to create a product management portion of the website and, secondly, a public-facing portion. Both portions would touch the same SQL Server database.
.NET has the ability to monitor a database and check for updates. Then you can run a query to [push] the data elsewhere.
Or use SQL to push the data with a trigger on the table(s) in question.
Is this what you were looking for?
You can try Dynamic Data Web Application.
You should have a service that regularly updates the data in the target DB. It will probably run on your source data machine (where the Access-DB is)
The service can use SSIS or ADO.NET to write the data. You can do this over the web, because you have access via TCP/IP to the server I assume.
Please check when the updates are done and how long it takes. If you can do the updates during the night you are fine. If not you should check, if you can still access the web during the import. That is sometimes not the case.
Use wget to push the new data file to the MVC app. Once the data is received by the action, the MVC app invokes the processing/importing of the data (maybe in a worker process if you don't want long requests).

.Net Data Handling Suggestions

I am just beginning to write an application. Part of what it needs to do is to run queries on a database of nutritional information. What I have is the USDA's SR21 Datasets in the form of flat delimited ASCII files.
What I need is advice. I am looking for the best way to import this data into the app and have it easily and quickly queryable at run time. I'll be using it for all the standard things. Populating controls dynamically, Datagrids, calculations, etc. I will also need to do user specific persistent data storage as well. This will not be a commercial app, so hopefully that opens up the possibilities. I am fine with .Net Framework 3.5 so Linq is a possibility when accessing the data (just don't know if it would be the best solution or not). So, what are some suggestions for persistent storage in this scenario? What sort of gotchas should I be watching for? Links to examples are always appreciated of course.
It looks pretty small, so I'd work out an appropriate object model, load the whole lot into memory, and then use LINQ to Objects.
I'm not quite sure what you're asking about in terms of "persistent storage" - aren't you just reading the data? Don't you already have that in the text files? I'm not sure why you'd want to introduce anything else.
I would import the flat files into SQL Server and access via standard ADO.NET functionality. Not only is DB access always better (more robust and powerful) than file I/O as far as data querying and manipulation goes, but you can also take advantage of SQL Server's caching capabilities, especially since this nutritional data won't be changing too often.
If you need to download updated flat files periodically, then look into developing a service that polls for these files and imports into SQL Server automatically.
EDIT: I refer to SQL Server, but feel free to use any DBMS.
My temptation would be to import the data into SQL Server (Express if you aren't looking to deploy the app) as it's a familiar source for me. Alternatively you can probably create an ODBC data source using the text file handler to get you a database-like connection.
I agree that you would benefit from a database, especially for rapid querying, and even more so if you are saving user changes to the data. In order to load the flat file data into a SQL Server (including Express), you can use SSIS.
Use LINQ, or the text-data-to-list method:
1. Create a list.
2. Read the text file line by line (or all lines at once).
3. Process each line: extract the required data and add it to the list.
4. Process the list for any further use.
The persistent storage will be the files; the list is volatile.
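The read-into-a-list steps above might look like this (Python for brevity; in C# the equivalent would be a List of small objects queried with LINQ). The caret delimiter and tilde text qualifier are assumptions about the flat file format, and the two-field Food record is a hypothetical layout; check both against the documentation that ships with the data files.

```python
from dataclasses import dataclass

@dataclass
class Food:
    ndb_no: str
    description: str

def parse_line(line, delimiter="^", quote="~"):
    """Split one delimited line and strip the text qualifier from each field."""
    return [field.strip(quote) for field in line.rstrip("\r\n").split(delimiter)]

def load_foods(lines):
    """Build the in-memory list; blank lines are skipped."""
    foods = []
    for line in lines:
        if not line.strip():
            continue
        fields = parse_line(line)
        # Assumes at least two fields per line (hypothetical layout).
        foods.append(Food(ndb_no=fields[0], description=fields[1]))
    return foods
```

Because load_foods accepts any iterable of lines, it can be fed an open file object directly, and the resulting list can then be filtered or sorted in memory.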
