System that needs to upload documents into a MOSS document library - C#

Hi, I need your help if you are an expert in MOSS.
I have a system that needs to upload documents into a MOSS document library.
I decided that the easiest approach for a phase 1 system would simply be to map a network path to a MOSS Document library.
The whole thing seems too easy. After that, it's a straight copy using System.IO.
What I would like to know, is this method reliable enough to be used in a production system?
Speculation would be great, but if you have real experience with working with MOSS in this way, your answer would mean a lot.
Thanks.

So long as you do the proper error checking around the copy, it's fine, provided you bear in mind the standard caveats with SharePoint document libraries and file naming conventions.
SharePoint does not allow some characters in file names which NTFS and FAT do. These will cause an error when you try to copy them to the document library, regardless of how you do that, so you will need to sanitise your filenames beforehand.
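For example, a rough sanitisation sketch (the character list below matches MOSS 2007's documented restrictions, but verify it against your SharePoint version; the replacement character is an arbitrary choice):

```csharp
using System.Text.RegularExpressions;

static class SharePointNames
{
    // Characters MOSS 2007 rejects in file names even though NTFS allows them.
    // Illustrative list only; check the restrictions for your SharePoint version.
    static readonly Regex InvalidChars = new Regex("[~#%&*:<>?/\\\\{|}\"]");

    public static string Sanitise(string fileName)
    {
        string clean = InvalidChars.Replace(fileName, "_");
        // SharePoint also rejects consecutive periods and a trailing period.
        clean = Regex.Replace(clean, @"\.{2,}", ".").TrimEnd('.');
        return clean;
    }
}
```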
The only downside to using a network path to the WebDAV interface of SharePoint is that if you stress it too much (a large copy of a lot of files), you can easily overwhelm it, causing the network share to become unavailable for a period of time. If you are only talking about a few files every now and then, several an hour for example, it should be fine.

You are better off reading the files off the network path and then using the Object Model API or the web services to upload them.
You can use timer jobs, which can be scheduled to run at a convenient time. The timer job can read its configuration settings from an XML file.
This system would be easier to maintain and troubleshoot than a straight copy using System.IO.
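For illustration, a minimal Object Model upload sketch (this assumes the code runs on a server in the MOSS farm and references Microsoft.SharePoint.dll; the site URL, library name, and file paths are placeholders):

```csharp
using System.IO;
using Microsoft.SharePoint;

// Placeholder URLs and paths; must run on a server in the farm.
using (SPSite site = new SPSite("http://moss-server/sites/docs"))
using (SPWeb web = site.OpenWeb())
{
    SPFolder library = web.GetFolder("Shared Documents");
    byte[] contents = File.ReadAllBytes(@"\\fileserver\drop\invoice.pdf");
    library.Files.Add(library.Url + "/invoice.pdf", contents, true); // true = overwrite
}
```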

Related

The best approach for file backup using .NET

There are a number of possible approaches to building a file backup application. I need to know which method would be a rock-solid, professional way to copy data files even when a file is in use or very large.
There is a known method called Volume Shadow Copy (VSS); however, I've read that it is overkill for a simple copy operation and that the P/Invoke BackupRead can be used instead.
The .NET Framework provides its own methods:
File.Copy was (and possibly still is) problematic with large files and with files that are in use.
FileStream seems suitable for backup purposes, but I haven't found a comprehensive description of it, so I'm not sure I'm correct.
Could you please enlighten me as to which method should be used (maybe I have overlooked some options) and why? If the VSS or P/Invoke methods are preferred, could you please also provide an example of how to use them, or a reference to comprehensive documentation? In particular, I'm interested in the correct settings for creating a file handle that allows sharing while the file is in use.
Thanks in advance.
Everything you're going to try on a live volume (i.e. one with a currently running OS) will suffer from not being able to open some files. The reason is that applications and the OS itself open files exclusively; that is, they open the files with ShareMode=0. You won't be able to read those files.
VSS negotiates with VSS-aware applications to release their open files for the duration, but relatively few applications outside Microsoft are VSS aware.
An alternative approach is to boot into another OS (on a USB stick or another on-disk volume) and do your work from there. For example, you could use the Microsoft Preinstallation Environment (WinPE). You can, with some effort, run a .NET 4.x application from there. From such an environment, you can get to pretty much any file on the target volume without sharing violations.
WinPE runs as local administrator. As such, you need to assert privileges such as SE_BACKUP_NAME, SE_RESTORE_NAME, SE_SECURITY_NAME, and SE_TAKE_OWNERSHIP_NAME, and you need to open the files with the FILE_FLAG_BACKUP_SEMANTICS flag.
The BackupRead/BackupWrite APIs are effective, if awkward. You can't use asynchronous file handles with these APIs...or at least MS claims you're in for "subtle errors" if you do. If those APIs are overkill, you can just use FileStreams.
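If you do fall back to FileStreams, a minimal sketch of a share-tolerant copy might look like this (the permissive FileShare flags are the key point; the paths are placeholders, and note that copying a file that is actively being written can still yield an inconsistent snapshot):

```csharp
using System.IO;

// Open the source as permissively as possible so other processes can keep
// reading, writing, or even deleting the file while we copy it.
using (var source = new FileStream(@"C:\data\big.db", FileMode.Open,
           FileAccess.Read, FileShare.ReadWrite | FileShare.Delete))
using (var target = new FileStream(@"D:\backup\big.db", FileMode.Create,
           FileAccess.Write, FileShare.None))
{
    source.CopyTo(target); // Stream.CopyTo is .NET 4+; use a manual buffer loop on 3.5
}
```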
There are bunches of little gotchas either way. For example, you should know when there are hardlinks in play or you'll be backing up redundant data...and when you restore, you don't want to break those links. There are APIs for getting all the hard links for a given file...NtQueryInformationFile, for example.
ReparsePoints (Junctions and SymLinks) require special handling, too...as they are low-level redirects to other file locations. You can run in circles following these reparse points if you're not careful, and even find yourself inadvertently backing up off-volume data.
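For example, a quick way to detect a reparse point from C# before recursing into it (a sketch; telling junctions apart from symlinks requires P/Invoke, e.g. DeviceIoControl with FSCTL_GET_REPARSE_POINT):

```csharp
using System.IO;

static bool IsReparsePoint(string path)
{
    // True for junctions, symbolic links and volume mount points alike.
    return (File.GetAttributes(path) & FileAttributes.ReparsePoint) != 0;
}
```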
It's not easy to deal with all of this, but if thoroughness matters, you'll encounter every one of these issues before you're done.

SkyDrive as patcher storage server

We recently started a pretty large project (a C# XNA game).
The obvious solution seemed to be to store all the files on a remote server, use a database for file "versions", and have the patcher download newer versions and delete anything obsolete.
Now, this is all nice in theory; we even found a service with the space for it (SkyDrive, with the 25 GB offer).
The problem came up when it got to file manipulation.
We're looking for something that:
Can programmatically download/upload files (for the patch maker) to/from SkyDrive.
Has a secure way of storing the username/password.
Allow me to explain both.
Thing is, we had to create the SkyDrive under my personal account (the 25 GB offer was only available to existing users). I'm not happy with the idea of someone getting my password; even though I'll obviously change it to something obscure, they would still get access to most of my other Hotmail/MSN-related stuff (I guess that's a reason to recreate it all, then?). So, if possible, I would like to secure the actual username/password inside the program. Since it's .NET, compiled on demand and easily decompiled, I doubt real security is achievable in this case (if it is possible, please do tell me how).
On top of that, there's no efficient, official SkyDrive API. This means there's an even bigger security hole (see the previous paragraph), and the communication won't necessarily work as expected. It also means communication may be slow, which is bad if you have 1,000 users downloading the same file.
So to formulate all of this:
What is the proper way (read: API) to use SkyDrive as a storage server for a patcher, considering it's linked to my personal account?
(Small side note: if I must, I can be evil and get our slow artist to host the server.)
Edit 1:
The idea is that anyone can download the client, but initiating anything requires an active account in our database. As such, the files themselves don't have a problem being readable by everyone. So I'll add the following: how do I programmatically get direct download links from SkyDrive if the files are public? The current links lead to their web UI. And I mean programmatically (perhaps at upload time), so as to avoid doing it all by hand.
This is a bad idea.
Given Edit 1:
Use a public folder to store your assets and grant everyone access to it.
Use HttpClient to download the files from the public folder anonymously in your patcher client (see the sketch after this list).
Use the SkyDrive desktop client to synchronize the public folder from a 'build' machine.
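A minimal sketch of the anonymous download step (the URL is a placeholder for whatever direct link your public folder exposes; on pre-4.5 frameworks, WebClient.DownloadFile would do the same job):

```csharp
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

static async Task DownloadPatchFileAsync(string url, string localPath)
{
    using (var client = new HttpClient())
    using (var response = await client.GetAsync(url))
    {
        response.EnsureSuccessStatusCode();
        using (var file = File.Create(localPath))
        {
            // Stream the payload straight to disk instead of buffering it in memory.
            await response.Content.CopyToAsync(file);
        }
    }
}
```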

How to provide a huge number of files as part of your application

So, my application depends on a huge number of small files; the actual number is somewhere around 90,000. I use a component that needs access to these files, but the only way it accepts them is via a URI.
So far, I have simply added a directory containing all the files to my debug folder while developing the application. Now, however, I have to consider deployment. What are my options for including all these files in my deployment?
So far I have come up with a couple of different solutions, none of which I've managed to make work completely. The first was to simply add all the files to the installer, which would then copy them into place. This would, in theory at least, work, but it would make maintaining the installer (a standard MSI installer generated with VS) an absolute hell.
The next option I came up with was to zip them into a single file, add that to the installer, and unzip it using a custom action. The standard libraries, however, do not seem to support complex zip files, making this a rather hard option.
Finally, I realized that I could create a separate project and add all the files as resources in that project. What I don't know is how URIs pointing to resources stored in other assemblies work. That is, is it "standard" for everything to support the "pack://application:,,,/Assembly;component/..." format?
So, are these the only options I have, or are there some other ones as well? And what would be the best option to go about this?
I would use a single zip-like archive file, and not unzip it on the hard disk but leave it as is. This is also the approach used by several well-known applications that depend on lots of smaller files.
Windows supports using zip files as virtual folders (as of XP); users can see and edit their content with standard tools like Windows Explorer.
C# also has excellent support for zip files. If you're not happy with the built-in tools, I recommend one of the major zip libraries out there; they're very easy to use.
In case you're worried about performance, caching files in memory is a simple exercise. If your use case actually requires the files to exist on disk, that's not an issue either; just unzip them on first use. It's only a few lines of code.
In short, just use a zip archive and a good library and you won't run into any trouble.
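For illustration, reading one file straight out of the archive without extracting it (this uses .NET 4.5's System.IO.Compression, which needs a reference to System.IO.Compression.FileSystem; on earlier frameworks a library such as DotNetZip offers a near-identical API; the archive and entry names are placeholders):

```csharp
using System.IO;
using System.IO.Compression;

using (ZipArchive archive = ZipFile.OpenRead(@"data\resources.zip"))
{
    ZipArchiveEntry entry = archive.GetEntry("templates/invoice.xml");
    using (var reader = new StreamReader(entry.Open()))
    {
        // Or hand entry.Open()'s stream directly to whatever consumes the file.
        string content = reader.ReadToEnd();
    }
}
```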
In any case, I would not embed this huge number of files directly in your application. Data files should be kept separate.
You could include the files in a zip archive, and have the application itself unzip them on first launch as part of a final configuration, if it's not practical to do that from the installer. This isn't entirely atypical (e.g. it seems like most Microsoft apps do a post-install config on first run).
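A first-run extraction could be as simple as this sketch (the directory layout and archive name are assumptions for illustration; same System.IO.Compression caveat as above):

```csharp
using System;
using System.IO;
using System.IO.Compression;

string appDir = AppDomain.CurrentDomain.BaseDirectory;
string dataDir = Path.Combine(appDir, "data");
if (!Directory.Exists(dataDir))
{
    // One-time post-install step: unpack the archive shipped by the installer.
    ZipFile.ExtractToDirectory(Path.Combine(appDir, "resources.zip"), dataDir);
}
```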
Depending on how the resources are used, you could have a service that provides them on demand from a store of some kind and caches them, rather than dumping them somewhere. This may or may not make sense depending on what the resources are for; e.g., if they're UI elements, a delay on first access might not be acceptable.
You could even serve them over HTTP from a local or remote server, or from a SQL server if the application already uses one, again with caching. That would be great for maintenance but may not work in every environment.
I wouldn't do anything that involves an embedded resource for each file individually, that would be hell to maintain.
Another option would be to create a self-extracting zip/RAR archive and have the installer extract it.
One option is to keep them in compound storage and access them directly within the storage. The article on our site describes various types of storage and their advantages and specifics.

App.config vs. .ini files

I'm reviewing a .NET project, and I came across some pretty heavy usage of .ini files for configuration. I would much prefer to use app.config files instead, but before I jump in and make an issue out of this with the devs, I wonder if there are any valid reasons to favor .ini files over app.config?
Well, on average, .INI files are probably more compact and, in a way, more readable to humans. XML is a bit of a pain to read, and it's quite verbose.
However, app.config is of course the standard .NET configuration mechanism, supported throughout .NET, with lots of hooks and ways to do things. If you go with .INI files, you're basically rolling your own all the way; a classic case of "reinventing the wheel".
Then again: is there any chance this is a project that started its life before .NET? Or a port of an existing pre-.NET Windows app where .INI files were the way to go?
There's nothing inherently wrong with .INI files, I think; they're just not really supported in .NET anymore, and you're on your own for extending and handling them. And it's certainly a "stumper" if you ever need to bring outside help on board: hardly any .NET developer will have been exposed to .INI files, while the .NET config system is fairly widely known and understood.
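For reference, reading a value through the standard mechanism is a one-liner (assumes a reference to System.Configuration and a hypothetical "timeout" key in appSettings):

```csharp
using System.Configuration;

// App.config (hypothetical key for illustration):
// <configuration>
//   <appSettings>
//     <add key="timeout" value="30" />
//   </appSettings>
// </configuration>
string timeout = ConfigurationManager.AppSettings["timeout"];
```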
INI files are quite okay in my book. The problem is GetPrivateProfileString() and its cousins; appcompat has turned it into one ugly mutt of an API function. Retrieving a single INI value takes about 50 milliseconds, which is a mountain of time on a modern PC.
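For context, this is the legacy Win32 API in question; from .NET you have to P/Invoke it (a declaration sketch; the section, key, and path are placeholders):

```csharp
using System.Runtime.InteropServices;
using System.Text;

[DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
static extern uint GetPrivateProfileString(
    string section, string key, string defaultValue,
    StringBuilder result, uint size, string filePath);

// Read Host= from the [Server] section of a hypothetical settings.ini.
// Note: a relative path here resolves against the Windows directory, so pass a full path.
var sb = new StringBuilder(256);
GetPrivateProfileString("Server", "Host", "", sb, (uint)sb.Capacity, @"C:\app\settings.ini");
```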
But the biggest problem is that you can't control the encoding of the INI file. Windows will always use the system code page to interpret strings, which is only okay as long as your program doesn't travel far from your desk. If it does, you run a serious risk of producing gibberish unless you restrict the character set in your INI file to ASCII.
XML doesn't have this problem, and it is well supported by the .NET Framework, whether you use the settings designer or manage your config yourself.
Personally, I never use .ini/XML config files for anything more than loading all the values into a singleton (or similar) at startup and then using them at runtime from there.
That being said, I firmly believe you should look at the kind of data and how it is used. If the data relates to the application itself, in terms of settings and configuration, then I believe the app.config file is the right place to hold those settings.
If, on the other hand, the data concerns loading projects, images, or other resources that make up the content of the application, then I believe a .ini file is the place for it (does anyone use .ini files anymore? I'd use an .xml file for storing this information). In short: segment the data being stored according to its domain and context.
INI files are preferable for multi-platform applications (e.g., Linux and Windows), where the customer may occasionally edit the configuration parameters directly, and where you want a more user-friendly, recognizable file name without extra effort.

Excel macro run from the web?

I have been handed a critical macro that processes an old-school file full of invoices, which thankfully is quite consistent. The macro reads this file, moves the data around to make it consistent, and then generates a three-tab spreadsheet which is pretty much three CSVs. From these three CSVs it then generates another spreadsheet with a tab for each invoice. The number of invoices can really vary.
It works, and everyone is happy. We would like to put this on the web with some security. For now, the user should be able to:
1) Log in, upload the old-school file, and press Process, which will then spit out the same spreadsheet with each tab being an invoice.
2) Have the data stored in a database for future growth and reuse, as well as reporting.
I'm teaching myself ASP.NET and C# and think this would be a great learning project. Before I jump into it: can this realistically be done, and what would others recommend in this case? Should I simply rewrite based on the logic in the macro, or is there a way to port the existing VBA code?
You can do it with the Excel COM API, but this tends to lead to memory leaks, so I would not recommend it.
Microsoft has Excel Services, which allows you to run Excel spreadsheets on the server, but it is very expensive and may not support macros.
SpreadsheetGear may be able to do it, but I have not tested it myself.
I would recommend that you rewrite the application in C#; you would get a better solution, and it may not take you any longer than getting the spreadsheet running on the server.
Using the Excel COM API from a web application is difficult; there are security issues which are non-trivial to address. If you want to retain the Excel processing, you could build some sort of out-of-band process that monitors an upload directory and, when it detects a new file, kicks off a process to transform the Excel file the way the old macro used to.
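A sketch of that watcher approach (FileSystemWatcher is the standard .NET way to monitor a drop directory; the path, filter, and handler method are placeholders):

```csharp
using System.IO;

var watcher = new FileSystemWatcher(@"C:\uploads", "*.xls");
watcher.Created += (sender, e) =>
{
    // Hand off to an out-of-band worker so the web process never touches Excel.
    ProcessInvoiceFile(e.FullPath); // hypothetical wrapper around the old macro logic
};
watcher.EnableRaisingEvents = true;
```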
There is no easy transition from VBA to C#, since all the VBA code assumes the existence of Excel, which may not be the case. However, you can call macros in workbooks using the COM API.
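For completeness, calling a workbook macro through COM interop looks roughly like this (a sketch only; it assumes a reference to Microsoft.Office.Interop.Excel and C# 4's optional parameters, and the macro name and paths are placeholders; note the explicit releases, which is what the memory-leak warnings above are about):

```csharp
using System.Runtime.InteropServices;
using Excel = Microsoft.Office.Interop.Excel;

var app = new Excel.Application();
try
{
    Excel.Workbook book = app.Workbooks.Open(@"C:\jobs\invoices.xls");
    app.Run("ProcessInvoices");    // hypothetical macro name
    book.SaveAs(@"C:\jobs\output.xls");
    book.Close(false);             // false = don't save again on close
    Marshal.ReleaseComObject(book);
}
finally
{
    app.Quit();
    Marshal.ReleaseComObject(app); // skip this and you leak orphaned EXCEL.EXE processes
}
```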
Driving Excel from C# is surprisingly hard to get 100% right. Conversely, driving Excel from a VB6 application is surprisingly easy. But calling this from a web application makes it harder, since you need to deal with both security and concurrency (two users at once will trip over each other).
Microsoft doesn't support the use of Excel on the server (apart from Excel Services), so don't expect any help there. SpreadsheetGear is suited to this, but you'd have to pay for it.
You say this would make a good learning project; I'd disagree. It's likely to put you off programming altogether. This particular mix doesn't have a "nice" solution; it's a case of finding the least-unpleasant hack. If you want to learn ASP.NET and C#, I'd say find another pet project.
