Get currently active document file in MS Office Add-In - c#

I am creating a VSTO Add-In that will send documents to a server via a REST API.
I need to send the currently opened document (e.g. docx) as a plain file.
The first problem was getting the full name of the active document.
I found only one way:
Path.Combine(Globals.ThisAddIn.Application.ActiveDocument.Path,
Globals.ThisAddIn.Application.ActiveDocument.Name)
This code can return good path on local drive: D:\Docs\Doc1.docx
But also it can return HTTP path to document in a cloud (e.g. OneDrive): https://d.docs.live.net/xxxxx/Docs\Doc1.docx
Even if only local documents were involved, I can't read the document's file. I tried this code:
using (var stream = new StreamReader(docFullPath)) { }
With a locally stored document I got System.IO.IOException: The process cannot access the file because it is being used by another process. Not surprising.
With a cloud-stored document I got System.NotSupportedException: The given path's format is not supported. Of course!
I believe I'm doing it all wrong and that my goal is reachable.
My question is: how can I read the file of the document currently open in an MS Office app from an Add-In, without closing the app?

Even if you could access the file that ActiveDocument.FullName points to, there is no guarantee that the user has already saved all changes to disk. Worse, the document may still be newly created and never have been saved at all.
There is another little-known and sparsely documented way to retrieve the file of an open document: the IPersistFile COM interface, which is implemented by the Document object. The following sample saves a document to the specified location. This happens without modifying the save state of the document, i.e. you get the exact version of the document as it is open (not as it was previously saved to disk), and the user may later still decide whether or not to save any modifications.
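using Microsoft.Office.Interop.Word;
public static class DocumentExtensions // class name is illustrative
{
    public static void SaveCopyAs(this Document doc, string path)
    {
        var persistFile = (System.Runtime.InteropServices.ComTypes.IPersistFile)doc;
        // false (fRemember): the document keeps its own file name and save state
        persistFile.Save(path, false);
    }
}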

You can copy an open document on the filesystem using File.Copy(FullName, <some_temp_filename>), and then send the copy to your REST service. This works even though it's still open for exclusive reading/writing in Word.
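The copy-then-read approach above could be sketched roughly as follows. This is a minimal sketch, not a tested add-in component: the class and method names (OpenDocumentReader, ReadViaTempCopy) are hypothetical, it assumes the path is a local or UNC path (not the https:// form returned for cloud documents), and the copy reflects the last saved state on disk, not unsaved edits.

```csharp
using System;
using System.IO;

public static class OpenDocumentReader // hypothetical helper, not part of VSTO
{
    // Copies the file at sourcePath to a uniquely named temp file, reads the
    // copy into memory, and deletes the copy afterwards.
    public static byte[] ReadViaTempCopy(string sourcePath)
    {
        string tempPath = Path.Combine(
            Path.GetTempPath(),
            Guid.NewGuid().ToString("N") + Path.GetExtension(sourcePath));

        File.Copy(sourcePath, tempPath);
        try
        {
            return File.ReadAllBytes(tempPath);
        }
        finally
        {
            // Remove the temp copy even if reading it failed.
            File.Delete(tempPath);
        }
    }
}
```

The returned byte array could then be posted to the REST service, for example with `OpenDocumentReader.ReadViaTempCopy(Globals.ThisAddIn.Application.ActiveDocument.FullName)`.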


Azure Storage File Share loses metadata when files updated with MS Word

We are using a File Share through an Azure Storage account. As part of our application we assign an ID to every file and store this ID in the file's metadata.
We set this ID via this block of code:
public static void SetId(this CloudFile cloudFile, Guid id)
{
cloudFile.Metadata[DocumentDbId] = id.ToString();
cloudFile.SetMetadata();
}
However, when this file is edited in Microsoft Word 2013 (all the files are .docx), this metadata is wiped clean and we lose the references.
If I create a text file, assign it an ID in metadata and then edit it with Notepad, the metadata stays where it should be and is not wiped.
Why does editing with MS Word wipe the metadata, and how can I prevent this from happening? Is there an alternative way to set an arbitrary ID that is not wiped by edits?
UPD: Just to clarify, this is my scenario:
I mount a file share as my local drive via net use K: \\myaccount.file.core.windows.net\tests /u:AZURE\myaccount uNrI0yyRxyMx and put a .docx file on the drive. In MS Azure Storage Explorer I right-click the file, add metadata (any metadata) and save it (I tried this with C# as above, but the result is just the same). I check it again to verify that the metadata was saved. Then I open this file from the mounted drive in MS Word, make a change and save it. When I check the metadata on the file again, there is nothing there.
But if I create a txt file, add metadata, then edit the file with Notepad++ and save it, the metadata is not wiped out. So it is something MS Word does that wipes the metadata.
I received confirmation from Microsoft engineer Json Shay that MS Word does funky stuff when it writes to files:
The reason is that MS Word (and many applications) use the Win32 ReplaceFile() API when saving a file, which is effectively a set of move+move+delete operations. Specifically, MS Word:
1) Writes the new version of the file into a new temporary file, which contains no properties: ~newfile.docx
2) Renames existingfile.docx --> existingfile_backup.docx
3) Renames ~newfile.docx --> existingfile.docx
4) Deletes existingfile_backup.docx
The properties were written on the original existingfile.docx, which then gets renamed away and then deleted.
This is different from Notepad, which modifies the existing file in place.
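To make the sequence above concrete, here is a minimal sketch of the same write-temp, swap, delete-backup pattern using .NET's File.Replace (which, as far as I know, wraps ReplaceFile on Windows). The class name, the "~" prefix and the ".bak" suffix are illustrative assumptions, not what Word actually uses. The point is that the file that ends up under the original name started life as the temp file, so anything attached only to the old file on the share does not follow it.

```csharp
using System.IO;

public static class SafeSaveDemo // hypothetical name, for illustration only
{
    // Writes newContent to a temp file next to existingPath, swaps it into
    // place under the original name, then deletes the backup of the old file.
    public static void SaveByReplacing(string existingPath, byte[] newContent)
    {
        string directory = Path.GetDirectoryName(Path.GetFullPath(existingPath));
        string tempPath = Path.Combine(directory, "~" + Path.GetFileName(existingPath));
        string backupPath = existingPath + ".bak";

        File.WriteAllBytes(tempPath, newContent);          // step 1: new temp file
        File.Replace(tempPath, existingPath, backupPath);  // steps 2-3: rename old away, rename temp in
        File.Delete(backupPath);                           // step 4: delete the old file
    }
}
```

An editor that instead opens the existing file and overwrites its bytes in place never creates a new file, which matches the Notepad behaviour described above.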

Downloading file from database where file is stored in binary format

I am stuck on a problem with downloading documents from a database.
I'm currently working on an ASP.NET project, and this is the first project of my career.
We have some documents which we store in the database. The documents (.pdf, .doc, .png, .docx, .xls, .xlsx) are stored in binary format with their type specified.
I can download one document using Response.Write. But now I have to combine several documents and then let the user download them on a button click.
I have googled a lot. Developers have said that this is impossible, but I still feel that it can be done.
If it really is impossible, I thought I would first save the individual documents to some server location, then zip them and let the user download the zip. But how would I save each individual document in its original format at the server location?
Please help me out. I'm in big trouble.
It is not impossible to read more than one document from a database and zip them up before presenting the zip file to the response stream. You can do this all in process, there is no need to save the documents to the server.
This code uses Ionic.Zip to zip up several files and write them to a MemoryStream:
var zipMs = new MemoryStream();
using (var zipFile = new ZipFile()) // Ionic.Zip.ZipFile
{
    foreach (var file in files)
    {
        zipFile.AddEntry(file.FileName, file.ContentBytes); // these are the file bytes
    }
    zipFile.Save(zipMs);
}
zipMs.Seek(0, SeekOrigin.Begin); // rewind so the zip can be read from the start
The file.FileName includes the extensions (.docx) of the documents and when downloading through the browser, everything is displayed and saved correctly.
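If adding the Ionic.Zip dependency is not an option, the same in-process idea can be sketched with the framework's own System.IO.Compression.ZipArchive (a swapped-in technique, not what the answer above uses). The class name InMemoryZip and the key/value input shape are assumptions for illustration; the key plays the role of file.FileName and the value the role of file.ContentBytes.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class InMemoryZip // hypothetical helper name
{
    // Builds a zip archive in memory from (file name, content) pairs and
    // returns the stream rewound to the start.
    public static MemoryStream Build(IEnumerable<KeyValuePair<string, byte[]>> files)
    {
        var zipMs = new MemoryStream();

        // leaveOpen: true keeps zipMs usable after the archive is disposed;
        // disposing the archive is what writes the zip's central directory.
        using (var archive = new ZipArchive(zipMs, ZipArchiveMode.Create, leaveOpen: true))
        {
            foreach (var file in files)
            {
                ZipArchiveEntry entry = archive.CreateEntry(file.Key);
                using (Stream entryStream = entry.Open())
                {
                    entryStream.Write(file.Value, 0, file.Value.Length);
                }
            }
        }

        zipMs.Seek(0, SeekOrigin.Begin);
        return zipMs;
    }
}
```

In a classic ASP.NET page the resulting stream could then be copied to Response.OutputStream with a ContentType of application/zip and a Content-Disposition attachment header; those response details are left out of the sketch.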

Opening a file which is being used by someone else in another machine

My task is to generate Word documents. By selecting the appropriate options, a user can generate a document. I have the base documents on the server; clicking Generate produces the document, which can then be saved to a local drive.
If I am using the portal and someone else is using it on their machine at the same time, the document is not generated. The page just posts back.
I want to show a progress bar or something similar, so that the person waits until the document is completely generated.
Is this possible using threads?
When you read your base document on the server, use File.Open with FileAccess.Read or File.OpenRead.
By doing this, you can read the same files with several threads at the same time.
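A minimal sketch of such a shared read is shown below, assuming the base document is a plain file on the server's disk. The class and method names are hypothetical. FileShare.Read is spelled out to make the intent visible; File.OpenRead uses the same sharing mode.

```csharp
using System.IO;

public static class TemplateReader // hypothetical helper name
{
    // Reads the whole file while allowing other readers to have it open too.
    public static byte[] ReadShared(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        using (var buffer = new MemoryStream())
        {
            stream.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}
```

Each request then works on its own in-memory copy of the base document, so concurrent users do not contend for the file on disk.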

Edit Saved Microsoft Word Document in C#/Asp.net

I am not sure if this is possible, and everywhere I have searched I cannot find a clear answer. I am saving a Microsoft Word document to a SQL Server 2008 table, basically just converting the file to a Byte[] and writing that to the table. This Word document is a "template" file: a form that the user needs to fill out. What I am wondering is whether, after reading that file from SQL Server and before opening it for the user, there is a way to autopopulate some fields in the form. For example, if I already know the user's address, can I autopopulate the address field in the template for them?
I know that using Microsoft.Office.Interop.Word, I can search the document for bookmarks and insert data at the bookmark. However, as far as I know, you cannot use Microsoft.Office.Interop.Word to open a Byte[].
Is there any way to accomplish what I am looking for?
If you want to use OpenXML, then you can do it like this,
//Load your byte[] array into memory stream and then
WordprocessingDocument doc = WordprocessingDocument.Open(stream, true);
You can do what you are trying to achieve using OpenXML without installing Word on the server side. More resources on OpenXML can be found at http://openxmldeveloper.org/. And the Open XML SDK can be downloaded from here.
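Putting the two lines above together with the bookmark idea from the question, a rough sketch with the Open XML SDK might look like this. The class and method names are hypothetical, and it assumes the bookmark sits inside a paragraph so that a Run can legally follow it; real templates may need more careful handling (existing placeholder text, form fields, bookmarks at body level).

```csharp
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class TemplateFiller // hypothetical helper name
{
    // Opens the template from a byte[], inserts text right after the named
    // bookmark (if present), and returns the modified document bytes.
    public static byte[] FillBookmark(byte[] templateBytes, string bookmarkName, string value)
    {
        using (var stream = new MemoryStream())
        {
            // Write into an expandable stream: new MemoryStream(byte[]) has a
            // fixed capacity and cannot grow when the document is edited.
            stream.Write(templateBytes, 0, templateBytes.Length);

            using (WordprocessingDocument doc = WordprocessingDocument.Open(stream, true))
            {
                BookmarkStart bookmark = doc.MainDocumentPart.Document.Body
                    .Descendants<BookmarkStart>()
                    .FirstOrDefault(b => b.Name?.Value == bookmarkName);

                if (bookmark != null)
                {
                    bookmark.InsertAfterSelf(new Run(new Text(value)));
                }
            } // disposing the document writes the changes back to the stream

            return stream.ToArray();
        }
    }
}
```

The returned bytes can be sent to the user as the pre-filled .docx, for example `TemplateFiller.FillBookmark(templateFromDb, "Address", knownAddress)`, where both argument names are placeholders.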
I think the general steps would be to:
1) Save the file to the local hard drive of the user with a file name based on the template but with a .doc extension.
2) Open the file with interop, but keep it invisible.
3) Populate the fields with bookmarks.
4) Show it to the user.

how to find the timestamp of an online pdf file using c#?

I am writing an application that would download and replace a PDF file only if its timestamp is newer than that of the existing one.
I know it's possible to read the timestamp of a file on a local computer with the line of code below:
MessageBox.Show(File.GetCreationTime("C:\\test.pdf").ToString());
Is it possible to read the timestamp of a file that is online without downloading it?
Unless the directory containing the file on the site is configured to show raw file listings there's no way to get a timestamp for a file via HTTP. Even with raw listings you'd need to parse the HTML yourself to get at the timestamp.
If you had FTP access to the files then you could do this. If you use just the basic FTP capabilities built into the .NET Framework, you'd still need to parse the directory listing to get at the date. However, there are third-party FTP libraries that fill in the gaps, such as edtFTPnet, where you get an FTPFile class.
Updated:
Per comment:
"If I were to set up a simple html file with the dates and filenames written manually, I could simply read that to find out which files have actually been updated and download just the required files. Is that a feasible solution?"
That would be one approach, or if you have scripting available (ASP.NET, ASP, PHP, Perl, etc.) then you could automate this and have the script get the timestamp of the file(s) and render them for you. Or you could write a very simple web service that returns a JSON or XML blob containing the timestamps for the files, which would be less hassle to parse than some HTML.
It's only possible if the web server explicitly serves that data to you. The creation date for a file is part of the file system. However, when you're downloading something over HTTP it's not part of a file system at that point.
HTTP doesn't have a concept of "files" in the way people generally think. Instead, what would otherwise be a "file" is transferred as response data with a response header that gives information about the data. The header can specify the type of the data (such as a PDF "file") and even specify a default name to use if the client decides to save the data as a file on the client's local file system.
However, even when saving that, it's a new file on the client's local file system. It has no knowledge of the original file which produced the data that was served by the web server.
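One piece of data that many web servers do choose to serve is the standard Last-Modified response header, which can be read with a HEAD request so the body is never transferred. The sketch below is a hedged illustration, not a guaranteed solution: the class and method names are hypothetical, the header is optional and may be absent or may not match the file's timestamp on the server, and treating a missing value as "download anyway" is my own assumption.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class RemoteTimestamp // hypothetical helper name
{
    // Sends a HEAD request and returns the Last-Modified header value,
    // or null when the server does not send one.
    public static async Task<DateTimeOffset?> GetLastModifiedAsync(HttpClient client, string url)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Head, url))
        using (HttpResponseMessage response = await client.SendAsync(request))
        {
            response.EnsureSuccessStatusCode();
            return response.Content?.Headers.LastModified;
        }
    }

    // Assumed policy: download when there is no local copy, when the server
    // gives no timestamp, or when the remote timestamp is newer.
    public static bool ShouldDownload(DateTimeOffset? remoteLastModified, DateTimeOffset? localLastWrite)
    {
        if (localLastWrite == null) return true;
        if (remoteLastModified == null) return true;
        return remoteLastModified.Value > localLastWrite.Value;
    }
}
```

The local side of the comparison could come from File.GetLastWriteTimeUtc("C:\\test.pdf"), which converts implicitly to DateTimeOffset; whether creation time or last write time is the right local value depends on how the file was originally saved.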
