How to use SharpSVN to modify a file's XML and commit the modified file - C#

I am a SharpSVN newbie.
We are currently busy with a rebranding, which entails updating all our reports with new colours etc. There are too many reports to do manually, so I am trying to find a way to find and replace the colours, fonts, etc. in one go.
Our reports are serialized and stored in a database, where they are easy to replace, but we also want to apply the changes to the .rdl reports in our source control, which is Subversion.
My question is the following:
I know you can write files to a stream with SharpSVN, which I have done; now I would like to push the updated XML back into Subversion as the latest version.
Is this at all possible? And if so, how would I go about doing it? I have googled a lot, but haven't been able to find a definitive answer.
My code so far (keep in mind this is a once-off thing, so I'm not too concerned about clean code etc.):
private void ReplaceFiles()
{
    SvnCommitArgs args = new SvnCommitArgs();
    SvnCommitResult result;
    args.LogMessage = "Rebranding - Replace fonts, colours, and font sizes";
    using (SvnClient client = new SvnClient())
    {
        client.Authentication.DefaultCredentials = new NetworkCredential("mkassa", "Welcome1");
        client.CheckOut(SvnUriTarget.FromString(txtSubversionDirectory.Text), txtCheckoutDirectory.Text);
        client.Update(txtCheckoutDirectory.Text);
        SvnUpdateResult upResult;
        client.Update(txtCheckoutDirectory.Text, out upResult);
        ProcessDirectory(txtCheckoutDirectory.Text, args, client);
    }
    MessageBox.Show("Done");
}
// Process all files in the directory passed in, recurse on any directories
// that are found, and process the files they contain.
public void ProcessDirectory(string targetDirectory, SvnCommitArgs args, SvnClient client)
{
    var ext = new List<string> { ".rdl" };
    // Process the list of files found in the directory.
    IEnumerable<string> fileEntries = Directory.EnumerateFiles(targetDirectory, "*.*", SearchOption.AllDirectories)
                                               .Where(s => ext.Any(e => s.EndsWith(e)));
    foreach (string fileName in fileEntries)
        ProcessFile(fileName, args, client);
}
private void ProcessFile(string fileName, SvnCommitArgs args, SvnClient client)
{
    using (MemoryStream stream = new MemoryStream())
    {
        SvnCommitResult result;
        if (client.Write(SvnTarget.FromString(fileName), stream))
        {
            stream.Position = 0;
            using (var reader = new StreamReader(stream))
            {
                string contents = reader.ReadToEnd();
                DoReplacement(contents);
                client.Commit(txtCheckoutDirectory.Text, args, out result);
                //if (result != null)
                //    MessageBox.Show(result.PostCommitError);
            }
        }
    }
}
Thank you to anyone who can provide some insight on this!

You don't want to perform a merge on the file, as you would only use that to merge the changes from one location into another location.
If you can't just check out your entire tree and replace+commit on that, you might be able to use something based on:
string tmpDir = "C:\tmp\mytmp";
using(SvnClient svn = new SvnClient())
{
List<Uri> toProcess = new List<Uri>();
svn.List(new Uri("http://my-repos/trunk"), new SvnListArgs() { Depth=Infinity }),
delegate(object sender, SvnListEventArgs e)
{
if (e.Path.EndsWith(".rdl", StringComparison.OrdinalIgnoreCase))
toProcess.Add(e.Uri);
});
foreach(Uri i in toProcess)
{
Console.WriteLine("Processing {0}", i);
Directory.Delete(tmpDir, true);
// Create a sparse checkout with just one file (see svnbook.org)
string name = SvnTools.GetFileName(i);
string fileName = Path.Join(tmpDir, name)
svn.CheckOut(new Uri(toProcess, "./"), new SvnCheckOutArgs { Depth=Empty });
svn.Update(fileName);
ProcessFile(fileName); // Read file and save in same location
// Note that the following commit is a no-op if the file wasn't
// changed, so you don't have to check for that yourself
svn.Commit(fileName, new SvnCommitArgs { LogMessage="Processed" });
}
}
Once you have updated trunk, I would recommend merging that change to your maintenance branches, and only fixing them up afterwards if necessary. Otherwise further merges will be harder to perform than necessary.

I managed to get this done. Posting the answer for future reference.
Basically all I had to do was create a new .rdl file with the modified XML, and replace the checked out file with the new one before committing.
string contents = reader.ReadToEnd();
contents = DoReplacement(contents);
// Create an XML document
XmlDocument doc = new XmlDocument();
string xmlData = contents;
doc.Load(new StringReader(xmlData));
//Save XML document to file
doc.Save(fileName);
client.Commit(txtCheckoutDirectory.Text, args, out result);
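For completeness, here is roughly how that snippet slots into the earlier ProcessFile method (a sketch only, assuming DoReplacement returns the modified XML string):
private void ProcessFile(string fileName, SvnCommitArgs args, SvnClient client)
{
    using (MemoryStream stream = new MemoryStream())
    {
        SvnCommitResult result;
        if (client.Write(SvnTarget.FromString(fileName), stream))
        {
            stream.Position = 0;
            string contents;
            using (var reader = new StreamReader(stream))
                contents = reader.ReadToEnd();

            contents = DoReplacement(contents);

            // Overwrite the checked-out working-copy file with the modified XML
            XmlDocument doc = new XmlDocument();
            doc.Load(new StringReader(contents));
            doc.Save(fileName);

            // Commit the working copy; unchanged files are simply skipped
            client.Commit(txtCheckoutDirectory.Text, args, out result);
        }
    }
}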
Hopefully this will help anyone needing to do the same.

Related

XML file from ZIP Archive is incomplete in C#

I work with large XML files (~1,000,000 lines, 34 MB) that are stored in a ZIP archive. The XML file is used at runtime to store and load app settings and measurements. It gets loaded with this function:
public static void LoadFile(string path, string name)
{
    using (var file = File.OpenRead(path))
    {
        using (var zip = new ZipArchive(file, ZipArchiveMode.Read))
        {
            var foundConfigurationFile = zip.Entries.First(x => x.FullName == ConfigurationFileName);
            using (var stream = new StreamReader(foundConfigurationFile.Open()))
            {
                var xmlSerializer = new XmlSerializer(typeof(ProjectConfiguration));
                var newObject = xmlSerializer.Deserialize(stream);
                CurrentConfiguration = null;
                CurrentConfiguration = newObject as ProjectConfiguration;
                AddRecentFiles(name, path);
            }
        }
    }
}
This works most of the time.
However, some files don't get read to the end, and I get an error that the file contains invalid XML. I used
foundConfigurationFile.ExtractToFile();
and found that the extracted file stops at around line 800,000. But this only happens inside this code; when I open the file in an editor, everything is there.
It looks like the ZIP doesn't get loaded correctly, or for that matter, completely.
Am I running into some limitation? Or is there an error in my code that I can't find?
The file is saved via:
using (var file = File.OpenWrite(Path.Combine(dirInfo.ToString(), fileName.ToString()) + ".pwe"))
{
    var zip = new ZipArchive(file, ZipArchiveMode.Create);
    var configurationEntry = zip.CreateEntry(ConfigurationFileName, CompressionLevel.Optimal);
    var stream = configurationEntry.Open();
    var xmlSerializer = new XmlSerializer(typeof(ProjectConfiguration));
    xmlSerializer.Serialize(stream, CurrentConfiguration);
    stream.Close();
    zip.Dispose();
}
Update:
The problem was the File.OpenWrite() method.
If you overwrite a file with this method and the new content is shorter than the old content, you end up with a mix of the old file and the new file: File.OpenWrite() doesn't truncate the old file first, as stated in the docs.
To do it correctly it was necessary to use the File.Create() method instead, because that method truncates the old file first.
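A minimal sketch of the corrected save, assuming the same fields as above and with the ZipArchive and the entry stream wrapped in using blocks:
// File.Create truncates any existing file before writing
using (var file = File.Create(Path.Combine(dirInfo.ToString(), fileName.ToString()) + ".pwe"))
using (var zip = new ZipArchive(file, ZipArchiveMode.Create))
{
    var configurationEntry = zip.CreateEntry(ConfigurationFileName, CompressionLevel.Optimal);
    using (var stream = configurationEntry.Open())
    {
        var xmlSerializer = new XmlSerializer(typeof(ProjectConfiguration));
        xmlSerializer.Serialize(stream, CurrentConfiguration);
    }
}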

How to convert many files from doc to docx with multithreading

I have millions of doc files which need to be converted to docx. I am currently using the below method to convert each file in the specified directory. How can I effectively multithread this process?
static void ConvertDocToDocx(string path)
{
    Application word = new Application();
    var sourceFile = new FileInfo(path);
    var document = word.Documents.Open(sourceFile.FullName);
    string newFileName = sourceFile.FullName.Replace(".doc", ".docx");
    document.SaveAs2(newFileName, WdSaveFormat.wdFormatXMLDocument,
        CompatibilityMode: WdCompatibilityMode.wdWord2010);
    word.ActiveDocument.Close();
    word.Quit();
    //File.Delete(path);
}
My current approach is to use Directory.GetFiles to create a list of files which are in my path, then use Parallel.ForEach to convert the files. Here's my code:
string[] filesList = Directory.GetFiles(path);
Parallel.ForEach(filesList, new ParallelOptions { MaxDegreeOfParallelism = 20 }, file =>
{
    if (file.Contains(".doc"))
    {
        ConvertDocToDocx(file);
    }
});
However, this doesn't seem to increase performance. Am I misunderstanding the use of Parallel.ForEach?
You are using Word via automation, which is the equivalent of opening the files manually one by one and saving them. This approach has one performance-increasing possibility: there is no need to create a new Word instance for each file; just reuse the first instance.
...
var wordInstance = new Application();
try
{
    var fileNameList = Directory.GetFiles(path);
    foreach (var fileName in fileNameList)
    {
        if (fileName.Contains(".doc"))
        {
            ConvertDocToDocx(wordInstance, fileName);
        }
    }
}
finally
{
    wordInstance.Quit();
}
...
static void ConvertDocToDocx(Application wordInstance, string path)
{
    var sourceFile = new FileInfo(path);
    var newFileName = sourceFile.FullName.Replace(".doc", ".docx");
    var document = wordInstance.Documents.Open(sourceFile.FullName);
    document.SaveAs2(
        newFileName,
        WdSaveFormat.wdFormatXMLDocument,
        CompatibilityMode: WdCompatibilityMode.wdWord2010);
    wordInstance.ActiveDocument.Close();
    //File.Delete(path);
}
But, as others have already mentioned, that is the limit of this approach.
You should have a look at solutions based on knowledge of the file format, e.g. NPOI. It is a C# rewrite of the popular Apache POI package, so if you search for "POI convert doc to docx" and find Java code, don't be afraid: almost the same code will compile under C# with the NPOI package too, and in most cases only minor syntax changes are required.

Storing and getting back files from MongoDB

I am working in C# on .NET 4.5.
I have to upload some files to MongoDB, and in another module I have to get them back based on metadata.
For that I am doing the following:
static void uploadFileToMongoDB(GridFSBucket gridFsBucket)
{
    if (Directory.Exists(_sourceFilePath))
    {
        if (!Directory.Exists(_uploadedFilePath))
            Directory.CreateDirectory(_uploadedFilePath);
        FileInfo[] sourceFileInfo = new DirectoryInfo(_sourceFilePath).GetFiles();
        foreach (FileInfo fileInfo in sourceFileInfo)
        {
            string filePath = fileInfo.FullName;
            string remoteFileName = fileInfo.Name;
            string extension = Path.GetExtension(filePath);
            double fileCreationDate = fileInfo.CreationTime.ToOADate();
            GridFSUploadOptions gridUploadOption = new GridFSUploadOptions
            {
                Metadata = new BsonDocument
                {
                    { "creationDate", fileCreationDate },
                    { "extension", extension }
                }
            };
            using (Stream fileStream = File.OpenRead(filePath))
                gridFsBucket.UploadFromStream(remoteFileName, fileStream, gridUploadOption);
        }
    }
}
and for downloading:
static void getFileInfoFromMongoDB(GridFSBucket bucket, DateTime startDate, DateTime endDate)
{
    double startDateDouble = startDate.ToOADate();
    double endDateDouble = endDate.ToOADate();
    var filter = Builders<GridFSFileInfo>.Filter.And(
        Builders<GridFSFileInfo>.Filter.Gt(x => x.Metadata["creationDate"], startDateDouble),
        Builders<GridFSFileInfo>.Filter.Lt(x => x.Metadata["creationDate"], endDateDouble));
    IAsyncCursor<GridFSFileInfo> fileInfoList = bucket.Find(filter); //****
    if (!Directory.Exists(_destFilePath))
        Directory.CreateDirectory(_destFilePath);
    foreach (GridFSFileInfo fileInfo in fileInfoList.ToList())
    {
        string destFile = _destFilePath + "\\" + fileInfo.Filename;
        var fileContent = bucket.DownloadAsBytes(fileInfo.Id); //****
        File.WriteAllBytes(destFile, fileContent);
    }
}
In this code (which works) I have two problems that I am not sure how to fix.
1. If I have uploaded a file and I upload it again, it actually gets uploaded again. How do I prevent that?
Of course both uploaded files have different ObjectIds, but while uploading a file I will not know which files have already been uploaded. So I want a mechanism that throws an exception if I upload an already-uploaded file. Is that possible? (I could use a combination of filename, creation date, etc. - see the sketch after this question.)
2. If you have noticed in the code, I am actually making two requests to the database server to get one file written to disk. How do I do it in one shot?
Note the lines of code I have marked with a "//****" comment. First I query the database to get the file info (GridFSFileInfo). I was expecting to be able to get the actual content of the file from that object alone, but I did not find any related property or method on it, so I had to call var fileContent = bucket.DownloadAsBytes(fileInfo.Id); to get the content. Am I missing something basic here?
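For the first problem, one possible approach (not from the original post, just a sketch) is to query GridFS for a matching filename and creation date before uploading, and throw if a match already exists. UploadIfNotPresent is a hypothetical helper name:
static void UploadIfNotPresent(GridFSBucket bucket, string filePath, GridFSUploadOptions options)
{
    var fileInfo = new FileInfo(filePath);
    string remoteFileName = fileInfo.Name;
    double fileCreationDate = fileInfo.CreationTime.ToOADate();

    // Treat "same filename + same creation date in metadata" as a duplicate
    var duplicateFilter = Builders<GridFSFileInfo>.Filter.And(
        Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, remoteFileName),
        Builders<GridFSFileInfo>.Filter.Eq(x => x.Metadata["creationDate"], fileCreationDate));

    if (bucket.Find(duplicateFilter).FirstOrDefault() != null)
        throw new InvalidOperationException("File '" + remoteFileName + "' has already been uploaded.");

    using (Stream fileStream = File.OpenRead(filePath))
        bucket.UploadFromStream(remoteFileName, fileStream, options);
}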

Copy permissions from file to file

I need to replace the content of a file without altering its permissions. I'm doing it by reading the file, deleting it, and writing a new one with the updated content.
I have the following:
static void Main()
{
    var file = new FileInfo(@"C:\temp\test.txt");
    var file1Security = file.GetAccessControl(AccessControlSections.All);
    string s;
    using (var stream = file.OpenText())
    {
        s = stream.ReadToEnd();
    }
    s += "\n" + DateTime.Now;
    file.Delete();
    using (var stream = file.OpenWrite())
    {
        using (var writer = new StreamWriter(stream))
        {
            writer.Write(s);
        }
    }
    file.SetAccessControl(file1Security);
}
However, this doesn't copy the users' permissions over to the new file.
How do I replace the content of a file and preserve the users' permissions on it?
According to this documentation, you can't copy a FileSecurity from one file and apply it to another. (Apparently that came up enough that they documented it. I would have tried it too.)
You have to create a new FileSecurity object, copy the access control list from the old one to the new one, and then apply the new one to the file.
void ApplySecurityFromOneFileToAnother(FileInfo source, FileInfo destination)
{
    var sourceSecurityDescriptor = source.GetAccessControl().GetSecurityDescriptorBinaryForm();
    var targetSecurity = new FileSecurity();
    targetSecurity.SetSecurityDescriptorBinaryForm(sourceSecurityDescriptor);
    destination.SetAccessControl(targetSecurity);
}
Since you're replacing a file you'd have to break that up, of course - first get the security from the old file, then apply it to the same file after it's rewritten.
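Applied to the Main method from the question, the whole flow would look roughly like this (a sketch only, reusing the question's path and variable names):
static void Main()
{
    var file = new FileInfo(@"C:\temp\test.txt");

    // Capture the security descriptor of the original file before it is deleted
    var sourceSecurityDescriptor = file.GetAccessControl().GetSecurityDescriptorBinaryForm();

    string s;
    using (var stream = file.OpenText())
    {
        s = stream.ReadToEnd();
    }
    s += "\n" + DateTime.Now;

    file.Delete();
    using (var writer = new StreamWriter(file.OpenWrite()))
    {
        writer.Write(s);
    }

    // Apply a new FileSecurity built from the captured descriptor
    var newSecurity = new FileSecurity();
    newSecurity.SetSecurityDescriptorBinaryForm(sourceSecurityDescriptor);
    file.SetAccessControl(newSecurity);
}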
Try replacing the "var" declaration with the first line of code below, and add the second line underneath it; that final call applies the access rules:
FileSecurity file1Security = file.GetAccessControl(AccessControlSections.All);
file1Security.SetAccessRuleProtection(true, true);
Try to use DirectoryInfo at the beginning:
DirectoryInfo dInfo = new DirectoryInfo(@"C:\temp\test.txt");
DirectorySecurity dSecurity = dInfo.GetAccessControl();
and change the last line with:
dInfo.SetAccessControl(dSecurity);

Write to different tabs or separate .csv/Excel files when using StreamWriter in C#

I have a console application that outputs its results into a .csv file. The application calls several web services and outputs their data. The output contains URLs which differentiate one site from another. The code works fine, but everything ends up in one big CSV file, and the only separation is that the header row is repeated each time the data for a new site starts.
Here you have how the output is now from the CSV file:
ProjectTitle,PublishStatus,Type,NumberOfUsers, URL
Project one,published,Open,1,http://localhost/test1
Project two,expired,Closed,14,http://localhost/test1
ProjectTitle,PublishStatus,Type,NumberOfUsers,URL
Project one V2,expired,Closed,2,http://localhost/test2
Project two V2,Published,Open,3,http://localhost/test2
What I am trying to do is either output each set of data (one per URL) in a different tab in Excel, or just create a new file for each new set of data. My code:
public static XmlDocument xml = new XmlDocument();
static void Main(string[] args)
{
    xml.Load("config.xml");
    test();
}
private static void test()
{
    List<string> url = new List<string>();
    int count = xml.GetElementsByTagName("url").Count;
    for (int i = 0; i < count; ++i)
    {
        url.Add(xml.GetElementsByTagName("url")[i].InnerText);
        Url = url[i];
    }
    string listFile = "ListOfSites";
    string outCsvFile = string.Format(@"C:\testFile\{0}.csv", testFile + DateTime.Now.ToString("_yyyyMMdd HHmms"));
    using (FileStream fs = new FileStream(outCsvFile, FileMode.Append, FileAccess.Write))
    using (StreamWriter file = new StreamWriter(fs))
        file.WriteLine("ProjectTitle,PublishStatus,Type,NumberOfSUsers,URL");
    foreach (WS.ProjectData proj in pr.Distinct(new ProjectEqualityComparer()))
    {
        file.WriteLine("{0},\"{1}\",{2},{3},{4}",
            proj.ProjectTitle,
            proj.PublishStatus,
            proj.type,
            proj.userIDs.Length.ToString(NumberFormatInfo.InvariantInfo),
            url[i].ToString());
    }
}
If you want to write an Excel file with different worksheets, you can try the EPPlus library; it is open source and free.
Here is the NuGet page.
To generate more files instead, you have to decide how to name them. If the records for each file are scattered in the source list, you can build a separate list of output strings for each output file and use File.AppendAllLines(filename, collection) to save the data to the different files. I don't recommend opening and closing a file for every row you write, because that is a time- and resource-consuming operation.
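As a rough illustration of the EPPlus route (a sketch only; WriteAllSites is a hypothetical helper, WS.ProjectData is the question's web-service type, and it assumes the rows have already been grouped per URL):
// Requires the EPPlus NuGet package (namespace OfficeOpenXml)
static void WriteAllSites(string outXlsxFile, Dictionary<string, List<WS.ProjectData>> projectsByUrl)
{
    using (var package = new ExcelPackage())
    {
        int siteIndex = 1;
        foreach (var site in projectsByUrl)
        {
            // Worksheet names must be unique and at most 31 characters
            var ws = package.Workbook.Worksheets.Add("Site" + siteIndex++);
            ws.Cells[1, 1].Value = "ProjectTitle";
            ws.Cells[1, 2].Value = "PublishStatus";
            ws.Cells[1, 3].Value = "Type";
            ws.Cells[1, 4].Value = "NumberOfUsers";
            ws.Cells[1, 5].Value = "URL";

            int row = 2;
            foreach (var proj in site.Value)
            {
                ws.Cells[row, 1].Value = proj.ProjectTitle;
                ws.Cells[row, 2].Value = proj.PublishStatus;
                ws.Cells[row, 3].Value = proj.type;
                ws.Cells[row, 4].Value = proj.userIDs.Length;
                ws.Cells[row, 5].Value = site.Key;
                row++;
            }
        }
        package.SaveAs(new FileInfo(outXlsxFile));
    }
}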
This is a quick fix. There may be a more elegant way to accomplish this as well:
public static XmlDocument xml = new XmlDocument();
static void Main(string[] args)
{
    xml.Load("config.xml");
    // relocated from `test` method to `Main`
    List<string> url = new List<string>();
    int count = xml.GetElementsByTagName("url").Count;
    for (int i = 0; i < count; ++i)
    {
        url.Add(xml.GetElementsByTagName("url")[i].InnerText);
        Url = url[i];
        test(Url);
    }
}
private static void test(string url)
{
    string listFile = "ListOfSites";
    string outCsvFile = string.Format(@"C:\testFile\{0}.csv", testFile + DateTime.Now.ToString("_yyyyMMdd HHmms"));
    using (FileStream fs = new FileStream(outCsvFile, FileMode.Append, FileAccess.Write))
    {
        using (StreamWriter file = new StreamWriter(fs))
        {
            file.WriteLine("ProjectTitle,PublishStatus,Type,NumberOfSUsers,URL");
            foreach (WS.ProjectData proj in pr.Distinct(new ProjectEqualityComparer()))
            {
                file.WriteLine("{0},\"{1}\",{2},{3},{4}",
                    proj.ProjectTitle,
                    proj.PublishStatus,
                    proj.type,
                    proj.userIDs.Length.ToString(NumberFormatInfo.InvariantInfo),
                    url);
            }
        }
    }
}
NOTE: Please use braces in C# even if you don't NEED them. It makes your code more readable.
