Import data into SQL Server from files with different formats

I have a C# program that watches a folder on the server. When new flat files come in, the program reads the data and bulk inserts it into a table. It works fine.
Now we are extending the system. The data files could be in different formats (flat file, CSV, TXT, Excel, ...) or have different columns (which we need to map to the columns in the table).
My question is: is C# the best choice for this, or is SSIS a better choice?
Thanks

I wouldn’t necessarily choose one or the other, but choose depending on the file type and the amount of processing. For some file types it’s probably easier to go with C#, and for others SSIS works better.
Do you have someone on your team who is good with SSIS? It’s much easier to find a C# dev to do the job for you than to find someone who knows SSIS.
How likely is it that requirements or formats are going to change in the future? That’s also an important thing to keep in mind.
I do agree with what others said, that SSIS is more powerful and offers support for more complex transformations, but the question is: do you really need this?

It depends on your context. Having different formats should not, by itself, push the decision toward SSIS. With the C# program: you can continue with it because it has been running stably so far. It is easy to deploy, specific to your domain, and easy to configure as well.
With SSIS: the configuration is more complicated and requires a developer with deep knowledge of SSIS. The administration cost is higher than for the C# program. However, it is easy to visualize (it has a diagram that makes the integration flow easier to see).
From my viewpoint, if the integration process does not involve complicated business rules, you should go with the C# program. Otherwise, SSIS is more powerful if the integration process requires complicated rules. Hope this helps.
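To illustrate the "different columns" part staying in C#: a minimal sketch, assuming each incoming format can be described by a delimiter and a header-to-column map. The class name `FlatFileLoader`, the method `ToDataTable` and the column names are made up for illustration, and the naive `Split` does not handle quoted fields.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper: maps differently named source columns onto one target schema.
public static class FlatFileLoader
{
    // columnMap: source header name -> destination column name (illustrative names only).
    public static DataTable ToDataTable(
        IEnumerable<string> lines, char delimiter, IDictionary<string, string> columnMap)
    {
        var table = new DataTable();
        foreach (var destination in columnMap.Values)
            table.Columns.Add(destination, typeof(string));

        string[] headers = null;
        foreach (var line in lines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;   // skip blank lines
            var fields = line.Split(delimiter);              // naive: no quoted-field support
            if (headers == null) { headers = fields; continue; } // first non-blank line is the header

            var row = table.NewRow();
            for (int i = 0; i < headers.Length && i < fields.Length; i++)
            {
                string destination;
                // Source columns without a mapping are ignored.
                if (columnMap.TryGetValue(headers[i].Trim(), out destination))
                    row[destination] = fields[i].Trim();
            }
            table.Rows.Add(row);
        }
        return table;
    }
}
```

The resulting `DataTable` could then be handed to `SqlBulkCopy.WriteToServer`, so supporting a new file layout would mostly mean adding another map rather than new loading code.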

In the C# application I guess you are using the SqlBulkCopy class, and compared to SSIS it's not that powerful. So if your data size becomes huge, the C# application will become slower.
If you are familiar with SSIS, my suggestion is to go with SSIS. In SSIS, you can implement the same end-to-end solution you have developed in C#, right from checking the files in a specific folder to loading the data into the database.

Related

Automated deployments with Kentico

Does anybody have experience automating deployments with Kentico? E.g. the difficulty of synchronizing document types, bizforms etc to another server?

I've used the built-in content staging module to do this sort of thing. Unfortunately it's not all unicorns and rainbows. There were definitely some bugs in the module, which essentially serializes the data from one server and deserializes it on the target server.
That was back in version 5.5 or 5.5R2 though, and they released version 6 a few months ago. I would take some time and look at the documentation for its limitations, and then maybe give it a test before committing to it. It can definitely work for some, but it may not be content-editor friendly.
Kentico Developer Documentation on Content Staging Module

Another possibility would be to utilize a tool that does database comparisons and syncing. I've used the SQL Examiner Suite before, but I've heard that Red Gate makes good tools too.
SQL Examiner
SQL Data Examiner
Red Gate Tools SQL Compare
While this probably isn't the best method, it can work. If you're not making significant changes on a regular basis, this can be good for one-off syncs between your local/dev server and production. This probably wouldn't be a good solution for "content staging", but more for changes that occurred due to development-oriented tasks.

Another option is to use the Export/Import feature in Kentico: http://devnet.kentico.com/docs/6_0/devguide/index.html?export_and_import_overview.htm.
I haven't automated this process, but you can have a look at the ExportManager class in Kentico's API Reference: http://devnet.kentico.com/Documentation.aspx.
Hope this helps

With Kentico 10 you could use the Continuous Integration feature. It now works much better than in Kentico 9.
With the Continuous Integration feature, database objects can be deployed together with the code files and are serialized automatically into the target database.
If you do not want to use this module, you need to use the object export feature in Kentico (Site => Export site or objects).
In both scenarios you have to know that content (pages) is difficult to stage between different servers. Content staging is only useful if you have a "real" staging server, where content editors prepare the content that should be staged to the live server on time.
In case you want to stage from a DEV server to the LIVE server, the pages will be overwritten by the DEV version if the GUID of the page matches.
If you use Continuous Integration, all pages which are not in the DEV server instance will be deleted!
All other objects (development objects like templates, web parts, page types, etc.) can be imported without any issues.

SQL based storage vs SVN

My team is developing a new application (C#, .Net 4) that involves a repository for shared users content. We need to decide where to store it. The requirements are as follows:
Share files among users.
Support versions.
Enable search by tags and support further queries such as "all the files created by people from group X"
Different views for different people (team X sees its own content and nobody else can see theirs).
I'm not sure what's best, so:
can I search over SVN using tags (not SVN tags of course, more like stackoverflow's tags)?
Does it make any sense to duplicate the content in both SVN and SQL?
Any other suggestions?
Edit
The application enables users to write validation tests that they later execute. Those tests are shared among many groups on different sites. We need versioning for the regular reasons - undo changes, sudden deletions etc. This calls for SVN.
The thing is, we also want to add the option to find all the tests that are tagged "urgent" and were executed by now, for tracking purposes.
I hope I made myself more clear now :)
Edit II
I ran into SvnQuery and it looks good, but does it have an API I can use? I'd rather use their mechanism with my own GUI.
EDIT III
My colleague strongly supports using only a database and forgetting about file-based storage. He claims it is better for persistence (which is needed, since a test is more than the list of commands to execute). I'd appreciate input on this issue, as I think it should be possible to do it either way.
Thanks!

Firstly, consider using Git rather than SVN. It's much faster, and I suspect it's more appropriate in your use case: it's designed to be distributed, meaning your users will be able to use it without internet access, and you won't have any overhead related to communicating with the server when saving documents.
Other than that, I'm not making full sense of your question but it seems like the gist of it might be better rephrased like so: "Can I do tag-based searches/access restriction onto my version control system, or do I need to create a layer on top to do so?"
If so, the answer is that you need a layer on top. Some exist already, both web-based (e.g., Trac) and desktop-based (e.g. GitX). They won't necessarily implement exactly what you need but they can be a good starting point to do what you're seeking.

You could use SVN.
Shared files: obvious and easy. It also supports the centralised locking that you might need for binary files.
Versions. Obviously.
Search... Now we're getting into difficult territory. There is a Lucene addon that allows web searching of your repo - opengrok, svnquery or svn-search. These would be your best starting points for that.
There is no way to stop people seeing what's present in a svn repo, but you can stop them from accessing it. I don't know if the access control could be extended easily to provide hidden folders, you could ask the svn developers.
There are some great APIs for working with SVN. Probably the most accessible is SharpSvn, which gives you a .NET assembly, but there are Python and C bindings and all sorts of others available.
As mentioned, there are web tools which sit on top of SVN to provide a view into it, there's Trac, and Redmine and several repo-viewers like webSVN, so there's plenty of sample code to use to cook up your own.
Would you use a DVCS like Git or Mercurial? I wouldn't. Though these have good mechanisms in themselves, it doesn't sound like they're what you're after. They allow people to work on their own and share with others on a peer-to-peer basis (though you can set up a 'central' repo and work with that as everyone's peer). They do not work in a centralised, shared way. For example, if you and I both edit a test case locally and then push to the central repo, we might have issues merging. We will have issues merging if the file is a binary or otherwise non-mergeable file. In this case you have a problem with losing one person's changes. That's one main reason for not using a DVCS in your case.
If you're trying to get shared tests together, have you looked at some apps that already do this? I noticed TestRail recently, which sounds like what you're trying to do. It's not free (alas) but it's cheap.

C++ Database Access With No Required Installation

I am looking for a database that I can run SQL statements on without having to have a database server installed. I.e. I need the ability to select/insert/update a database given only the database file and any external libraries.
Here is my situation:
I am using C++ to parse through a number of oddly-formatted binary files, and I would like to store them into some type of database to offer more convenient access to the data.
Once the files have been inserted into the database, I will use C# to write an interface/GUI by which a user can interact with the database.
I'm using C++ for the speed of reading the files and because I've already written that part.
I'm using C# because it is much easier to do GUI programming.
Here are my requirements:
Database must provide a way to run commands in C++ using only external libraries (no installation)
I should be able to move the database to any (similar [Windows]) computer and run my application
I believe this is possible with MS Access *.mdb files using ADO or JET or something like that, however, I would like to hear some alternatives. Please provide the database and the C++ engine/libraries in your answer.
My priorities are:
"Lite"-ness
Performance (speed of insert/select)
Client code simplicity (i.e. how easy it is to set up)
Thank you all.

You need to look into SQLite. It is perfect for this scenario and very easy to use. It is vastly popular (large community), compact, cross-platform, and simple to use.
There are SQLite implementations for other languages too. For instance, you can also access SQLite databases using C#. There is even a Linq-to-SQLite.

You cannot go wrong with SQLite here.
It is small enough to be embedded in many apps (see e.g. here for a list of famous apps ranging from Photoshop to Apple Mail + Safari, Dropbox, Firefox, Chrome, Skype and more), yet complete enough to cover most SQL aspects you may need. Great support too, and wide coverage in terms of APIs and languages.
It can have issues with locks and multiple write accesses. But for a single client it should work perfectly fine.

SSIS and re-using C#

I'm a newbie to SSIS / C# (I'm generally a Java developer) so apologies if this is a really stupid question.
Essentially the problem is this: I have two Data Flow tasks which load data up and export them to a legacy flat file format. The formatting is done by a Script Task (C#).
What I'd like to do is share some common code between the two. e.g. I could create a common base class and then extend it for my two different script tasks.
However it seems that SSIS doesn't really make provision for this.
Does anyone know if there is a way of accomplishing what I want to do?

You're correct that there is not a straightforward way to do this directly from SSIS.
In a recent project, we took two different approaches, which both worked fairly well depending on what you need to do:
1. Create a utility class (as a simple class library) and reference it from your script tasks. This is done pretty much the same as any other sort of reference. If you use .NET 3.5, remember that you'll have to update the version manually in the script tasks since SSIS defaults to 2.0. We also found that if we wanted some manner of reusability in the utility assembly (not relying on hardcoded variable names, etc.) then the package still had to have a fairly large amount of "setup" boilerplate to use the utility scripts.
2. Create a custom data flow component. This is a much more involved process, but ultimately will do the best in terms of avoiding code duplication. Generally, coding the actual data flow is fairly simple and not that much different than a script component, but the various setup code you'll need can tend to make things complicated. There's also not a lot of support in SSIS for when something goes wrong, which led to a lot of detective work on our project.
If you plan on using something a whole lot, and are committed to getting rid of boilerplate code as much as possible, 2 is the preferred option. If it's being used a few places here and there, consider the simple approach of 1.
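As a rough illustration of the class library approach: a minimal sketch of a shared base class that both script tasks could reference. All names here (`LegacyRecordFormatter`, `CustomerFormatter`, `OrderFormatter`) and the fixed-width layout are assumptions for the example, not anything SSIS provides.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical shared library code; class names and layouts are invented for illustration.
public abstract class LegacyRecordFormatter
{
    // Each export defines its own fixed-width layout.
    protected abstract IList<int> FieldWidths { get; }

    // Common logic: pad or truncate every value to its field width.
    public string FormatLine(IList<string> values)
    {
        var widths = FieldWidths;
        if (values.Count != widths.Count)
            throw new ArgumentException("Expected " + widths.Count + " values.");

        var line = new StringBuilder();
        for (int i = 0; i < widths.Count; i++)
        {
            string value = values[i] ?? string.Empty;
            if (value.Length > widths[i]) value = value.Substring(0, widths[i]);
            line.Append(value.PadRight(widths[i]));
        }
        return line.ToString();
    }
}

// One subclass per export; only the layout differs.
public sealed class CustomerFormatter : LegacyRecordFormatter
{
    protected override IList<int> FieldWidths { get { return new[] { 10, 5 }; } }
}

public sealed class OrderFormatter : LegacyRecordFormatter
{
    protected override IList<int> FieldWidths { get { return new[] { 8, 12, 4 }; } }
}
```

Each script task would then only instantiate its own formatter and call `FormatLine`, while the padding and truncation rules live in one place.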

I am pretty sure it's possible to access .NET assemblies in SSIS scripts. So you could do it this way. See the article "Accessing .NET assemblies with SSIS" on SQL Server Central.

I believe you will have to create an assembly or webservice for this to work.

This does not completely solve your issue but it does help in not having to recreate all the classes every time you need them (I also do not want to deploy referenced assemblies for my current project ). Firstly you need a master copy of your classes, you can copy them from an existing Script Task using the same process below but in reverse.
Open the Editor for the Script Task and on the Property Explorer click on the Project File (the st_[Guid] ), in the Properties window you’ll see the Project Folder location. (This location gets recreated every time you edit the script task)
In explorer, copy your classes to this folder
On the Project Explorer, click on the “Show All Files” icon
Right click on your files and add to Project

Probably way too late to answer this, but you can click on the solution and add a class there. Then when you go into your scripts you can say add existing object and search for that class you created earlier. For me it was located by the solution for the project. Haven't gone through the deployment or anything for this, but at least you can access the class through the individual scripts.

Windows Forms application "design"

I'm planning on writing a "medium-size" WinForms application that I'll write in C#, .NET 3.5. I have some "generic design questions" in mind that I was hoping to get addressed here.
Exception handling in general. What is the best way to handle exceptions? Use try/catch blocks everywhere? this?
Localization. If I'd want to have multiple language support in my application, what should I use? I find the "satellite assemblies" to be a very... well, "bulky"-seeming solution - I don't want a resource file "hell", and I don't want to input translations inside the VS UI.
Storing data locally. Previously, I've used System.Data.SQLite on a project, but I found myself wondering if there's something else I should consider.
Anything else I should keep in mind?
Thanks(?)

1) Don't catch any exceptions. The vast majority of them tell you about a bug in your code, you'll want to know about them right away. If during testing and deployment, you find error conditions that you think you can handle (there aren't many of them), you can always add the try/catch block. If you plan on handling exceptions, be sure to liberally sprinkle try/finally blocks in your code so the state of your classes is preserved even if there's an exception that prevents cleanup code from running. There is no notable cost to using try without catch.
2) Satellite assemblies are not bulky. Just a small DLL in a subdirectory of your deployment folder. No special code is required, everything is automatic. Most of all, it is a standard solution. You can send your .resx files to a localization service and they'll use standard tools (like Winres.exe) to provide you with the translations. Asking them to deal with something custom is going to be expensive and potentially troublesome.
3) Alternatives are SQL Server CE (same approach as SQLite) and SQL Server Express. The latter gives you the most bang for the buck, but must be installed. That isn't hard.
4) It depends on your target audience, but if look-and-feel is a factor in a buying decision at all, hire a UI designer. S/he'll catch UI bloopers and make it look spiffy.
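The try/finally advice in point 1 can be sketched as follows. This is only an illustration; the `ReportGenerator` class and its `IsBusy` flag are hypothetical.

```csharp
using System;

// Hypothetical class: the flag must be reset even when the work throws.
public class ReportGenerator
{
    public bool IsBusy { get; private set; }

    public void Generate(Action work)
    {
        IsBusy = true;
        try
        {
            work();          // may throw; the exception is deliberately not caught here
        }
        finally
        {
            IsBusy = false;  // runs on success and on failure, so the object stays usable
        }
    }
}
```

The exception still propagates to whoever called `Generate`, so a bug is reported right away, but the object is not left stuck in a busy state.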

#1 - If you care about the performance of your application, avoid try/catch. Use a profiler, for example the one from RedGate (ANTS; sadly it's not freeware), to see for yourself that try/catch blocks consume a lot of CPU time, especially when execution has to jump into the "catch". Try to find another way around; .NET has a lot of other methods you can use to make sure that no exception will occur. I know it's easier to use try/catch, but decide what your aim is.
#2 - I guess you can use resource files that are compiled into your application, so you won't have any separate files, if that is what you're asking?
#3 - I really have to answer that one :). Personally I think there is no better or more comfortable way of storing data locally than XML. As was mentioned before, you can use LINQ to XML to query the file, which is extremely simple. It's small, fast, easy to create and maintain, and, more importantly, you can send it through the Internet without the problems that may occur with other ways of storing data; for example, a firewall or any ISD won't be a problem because it's basically a text file. I simply love XML.
Was that helpful?
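One example of the "other methods" idea from #1 is the TryParse pattern, which reports bad input through a return value instead of an exception. A minimal sketch; the `InputParsing` class and `ParseQuantity` method are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class InputParsing
{
    // Returns null for invalid input instead of throwing FormatException.
    public static int? ParseQuantity(string text)
    {
        int value;
        return int.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out value)
            ? value
            : (int?)null;
    }
}
```

This only helps for failures you can anticipate, such as user input; it is not a replacement for handling genuinely exceptional conditions.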

Regarding the first question: have a look at the Enterprise Library Exception Handling Block. Microsoft did a great job of providing documentation and code to solve this problem.
Regarding the other questions (especially #4), it is hard to recommend something without knowing the details of your application.

I pretty much have the same answer as aku for the first question, you might want to take a look at the Enterprise Library in general since there are several useful blocks such as the Logging and Validation blocks.
Can't help with the second since I haven't worked on any projects that needed to support localization of any kind.
Without a better idea of what kind of data/application you are developing it is kind of difficult to recommend local data storage. A couple of thoughts that I have (no particular order) are:
An Xml file is portable and can be manipulated with LINQ->XML.
Are your objects stable? You could always serialize them.... (Although I don't recommend this)
Although local now, would the data (some or all) be better shared on a server with other users in the future?
You mentioned SQLite, have you considered SQL Server CE?
What kind of query performance does your data layer need to support?
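The LINQ to XML option from the list above could look roughly like this. It is a sketch under assumptions: the `SettingsStore` class and the `<settings><item key="..." value="..."/></settings>` document shape are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SettingsStore
{
    // Assumed document shape: <settings><item key="..." value="..."/></settings>
    public static IDictionary<string, string> Load(string xml)
    {
        // Throws if an item has no "key" attribute or if a key is repeated.
        return XDocument.Parse(xml)
            .Descendants("item")
            .ToDictionary(
                e => (string)e.Attribute("key"),
                e => (string)e.Attribute("value"));
    }

    public static string Save(IDictionary<string, string> settings)
    {
        var root = new XElement("settings",
            settings.Select(kv => new XElement("item",
                new XAttribute("key", kv.Key),
                new XAttribute("value", kv.Value))));
        return root.ToString();
    }
}
```

For a handful of settings this is very little code; for larger or frequently queried data sets, an embedded database such as SQLite or SQL Server CE is likely the better fit.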

As mentioned before, the Exception Handling Block.
I've not done much localization, but the resource handling in VS 2008 is much better than in VS 2003.
SQL Server Compact Edition, VistaDB and CodeGear Blackfish are products that you might investigate. SQL CE is free but the others cost money.

Why not download Visual C# 2005 or 2008 Express Edition?
Designing is easy
Open Visual C#
Open Create
Open Create Project Dialog
Design it
Code it
Publish it
Download 2005 ->
Download 2008 ->
