I am looking for a .NET WinForms component that can compare two formatted documents (.doc, .docx, .html, or .rtf; any one of these formats will do) and visually highlight the changes. Ideally the changes would be shown the way MS Word displays them in its Track Changes mode.
We expect short documents, only a few pages long, with little editing (a few words changed, a paragraph added or deleted, etc.).
Are you aware of such a component, free or commercial, that you can recommend?
Thank you,
Kemal
The following code compares two Word documents and saves the merged changes in a third document.
Add a reference to the Microsoft Word 12.0 Object Library.
using System.Windows.Forms;            // MessageBox
using Microsoft.Office.Interop.Word;   // Word object model

public static void comp()
{
    object missing = System.Reflection.Missing.Value;
    // ReadOnly argument for Documents.Open: false opens the file for editing.
    object readonlyobj = false;
    object filename = "C:\\romil1.docx";
    // Create a Word application object to process the files.
    Microsoft.Office.Interop.Word.Application app = new Microsoft.Office.Interop.Word.Application();
    // Open the original document.
    Microsoft.Office.Interop.Word.Document doc = app.Documents.Open(
        ref filename, ref missing, ref readonlyobj, ref missing, ref missing,
        ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
    string filenm = "C:\\romil2.docx";   // revised document to compare against
    object filenm3 = "C:\\romil3.docx";  // where the comparison result is saved
    doc.TrackRevisions = true;
    doc.ShowRevisions = false;
    doc.PrintRevisions = true;
    doc.Compare(filenm);
    doc.Close(Microsoft.Office.Interop.Word.WdSaveOptions.wdDoNotSaveChanges);
    // Save the document that is now active (the comparison result) under the third name.
    app.ActiveDocument.SaveAs(ref filenm3, ref missing, ref readonlyobj, ref missing, ref missing,
        ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
    app.Quit(Microsoft.Office.Interop.Word.WdSaveOptions.wdSaveChanges);
    MessageBox.Show("Process complete");
}
You can use one of the following libraries to manipulate Word documents and build the document comparison yourself:
Microsoft Interop (Office installation required)
OpenXML SDK
Aspose.Words for .NET
Since this question is old, now there are more solutions available.
Groupdocs compare
Document Comparison by Aspose.Words for .NET
I work with Aspose as a Developer Evangelist.
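If you do end up building the comparison yourself, the core of it is a diff over the extracted text. Below is a minimal sketch in plain C#, assuming the text of both documents has already been pulled out with one of the libraries above. The `WordDiff` class, the `Compare` method and the `[-deleted-]` / `{+inserted+}` markers are hypothetical names chosen for illustration; none of them come from Word or any of the listed libraries. It uses a longest-common-subsequence table, which should be adequate for short documents of a few pages.

```csharp
using System;
using System.Collections.Generic;

public static class WordDiff
{
    // Merges the words of 'before' and 'after' into one sequence.
    // Deleted words are wrapped as [-word-], inserted words as {+word+}.
    public static string Compare(string before, string after)
    {
        string[] a = before.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        string[] b = after.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // lcs[i, j] = length of the longest common subsequence of a[i..] and b[j..].
        int[,] lcs = new int[a.Length + 1, b.Length + 1];
        for (int i = a.Length - 1; i >= 0; i--)
            for (int j = b.Length - 1; j >= 0; j--)
                lcs[i, j] = a[i] == b[j]
                    ? lcs[i + 1, j + 1] + 1
                    : Math.Max(lcs[i + 1, j], lcs[i, j + 1]);

        // Walk both word lists, emitting common words unchanged and
        // marking everything else as a deletion or an insertion.
        var output = new List<string>();
        int x = 0, y = 0;
        while (x < a.Length && y < b.Length)
        {
            if (a[x] == b[y]) { output.Add(a[x]); x++; y++; }
            else if (lcs[x + 1, y] >= lcs[x, y + 1]) { output.Add("[-" + a[x] + "-]"); x++; }
            else { output.Add("{+" + b[y] + "+}"); y++; }
        }
        while (x < a.Length) output.Add("[-" + a[x++] + "-]");
        while (y < b.Length) output.Add("{+" + b[y++] + "+}");
        return string.Join(" ", output);
    }
}
```

For example, `WordDiff.Compare("the quick brown fox", "the slow brown fox jumps")` returns `the [-quick-] {+slow+} brown fox {+jumps+}`. This only diffs plain text; formatting changes are not detected, and rendering the markers as strikethrough/underline in a WinForms control is left to you.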
I want to add a GridView to the Word document generated by the following code.
How do I add a GridView to a Word document?
My Word document creation code:
object missing = System.Reflection.Missing.Value;
Microsoft.Office.Interop.Word.Application wordApp = new Microsoft.Office.Interop.Word.Application();
Microsoft.Office.Interop.Word.Document aDoc = null;
DateTime today = DateTime.Now;
object readOnly = true;
object inVisible = true;
aDoc = wordApp.Documents.Open(ref fileName, ref missing, ref readOnly, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref inVisible, ref missing, ref missing, ref missing, ref missing);
this.FindAndReplace(wordApp, "##formkodu##", TextBox1.Text);
this.FindAndReplace(wordApp, "##sirketadi##", DropDownList11.SelectedItem.Text);
this.FindAndReplace(wordApp, "##il##", ddliller.SelectedItem.Text);
this.FindAndReplace(wordApp, "##isletme##", ddlisletmeler.SelectedItem.Text);
this.FindAndReplace(wordApp, "##yüklenicifirma##", ddlyükleniciler.SelectedItem.Text);
wordApp.Visible = false;
aDoc.Activate();
aDoc.SaveAs(ref saveAs, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
wordApp.Quit(ref missing, ref missing, ref missing);
System.Runtime.InteropServices.Marshal.FinalReleaseComObject(aDoc);
A GridView is not a COM ActiveX control, so it cannot be hosted on Word's document surface, at least not directly.
If you were able to use VSTO, you could use VSTO's built-in tools to wrap the GridView in a COM ActiveX control in order to place it on the document surface. VSTO is not supported in ASP.NET, however.
A possible way around this could be to develop a VSTO add-in that is installed on the machines where the generated document is opened. The add-in could take care of wrapping, inserting and managing the ActiveX control and the GridView.
But you might be better off simply generating a Word table in the document. That works fine using the Interop (or Open XML).
There is some old information about the basics for creating an ActiveX control on MSDN, by Geoff Darst: https://social.msdn.microsoft.com/Forums/vstudio/en-US/71a75dc4-dcea-454a-9e4a-011a2f811994/vsto-activex-and-powerpoint?forum=vsto
https://social.msdn.microsoft.com/Forums/vstudio/en-US/4282a65c-ccd7-4fd4-a56c-75f84615fff6/embedding-active-x-control-in-office-application-using-vsto-2005?forum=vsto
Instead of Interop I would prefer using Open XML (*.docx) if possible. How to create a table is documented here: How to: Insert a table into a word processing document (Open XML SDK). With this you don't need Interop, which can cause a lot of trouble if the wrong Office version is installed or something else goes wrong. Hope this helps.
I am using Microsoft.Office.Interop to open, manipulate and save a Word document file (.doc).
I can get all the text content, but I have had no success loading the controls (e.g. TextBoxes) that were added to the opened Word document.
I get the text using the following code:
Microsoft.Office.Interop.Word.ApplicationClass oWordApp = new Microsoft.Office.Interop.Word.ApplicationClass();
Microsoft.Office.Interop.Word.Document oWordDoc = oWordApp.Documents.Open(ref fileName, ref missing, ref readOnly, ref missing,ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref isVisible, ref missing, ref missing, ref missing);
oWordDoc.Activate();
oWordApp.Selection.TypeParagraph();
string test = oWordDoc.Content.Text;
How can I access all the controls included in the base Word document?
Thanks.
Check this:
Word.Document oDoc=...;
foreach (Word.Shape shape in oDoc.Shapes)
{
    // do something with shape
}
By changing
oWordApp.Selection.TypeParagraph();
To
oWordApp.Selection.WholeStory();
And by digging into oWordDoc.Shapes, I gained access to all controls.
I need to be able to change the name of the default document from Document1 to Report when a Word document is started from my application. The problem is that the Name property of the Document object is read-only. Any idea of a method I can call at startup that changes the name?
You might be interested in this snippet of code:
Microsoft.Office.Interop.Word.Application wordApp = new Microsoft.Office.Interop.Word.Application();
object missing = System.Reflection.Missing.Value;
object fileName = "Report";
object isReadOnly = false;
object isVisible = true;
Microsoft.Office.Interop.Word.Document doc = wordApp.Documents.Add(ref missing, ref missing, ref missing, ref isVisible);
doc.SaveAs2(ref fileName, ref missing, ref missing, ref missing, ref missing, ref missing, ref isReadOnly, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
wordApp.Visible = true;
This will pop open a new Word document named "Report" as you specified. Notice this uses the concept I mentioned in the comment, that is, it saves the file first with a new name then opens it. In this case, the default location is probably your User's "Documents" folder, but you can specify the path as needed.
Don't forget to close and release the COM objects "doc" and "wordApp" as needed. Sometimes the GC doesn't mop it all up appropriately, especially if the application closes unexpectedly or if you forget to close any of them when you're done.
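As a rough illustration of the release step, here is a small helper sketch. `ComCleanup` and `ReleaseAll` are hypothetical names, not part of the Interop library; the only framework calls used are `Marshal.IsComObject` and `Marshal.FinalReleaseComObject`. The helper skips nulls and ordinary .NET objects, so it should be safe to call from a `finally` block even if opening the document failed partway.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ComCleanup
{
    // Releases every COM wrapper passed in and returns how many were released.
    // Nulls and objects that are not COM wrappers are ignored.
    public static int ReleaseAll(params object[] comObjects)
    {
        int released = 0;
        foreach (object o in comObjects)
        {
            if (o != null && Marshal.IsComObject(o))
            {
                // Drops the runtime callable wrapper's reference count to zero.
                Marshal.FinalReleaseComObject(o);
                released++;
            }
        }
        return released;
    }
}
```

In the snippet above you would close "doc", quit "wordApp", and then call `ComCleanup.ReleaseAll(doc, wordApp)` in a `finally` block, releasing the document before the application.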
We have some content in MS Word .docx format, prepared by our customers.
These documents may contain equations, images, etc.
We want to transfer this content to our web environment.
At first, I planned to use the TinyMCE "paste from word" plugin and the fmath editor plugin. That did not work.
Then I decided to add an upload button that transfers the MS Word content and shows the resulting web content in the TinyMCE editor; in effect, writing a new plugin.
I am using the "SaveAs" method of the Microsoft.Office.Interop.Word.Document class.
But I have the following problems:
1) I cannot change the folder path for the document's resources. Word generates a "..._files" folder in the same location as the generated HTML file. I want to move all resources to appropriate places on the server.
2) I cannot change the image source paths to absolute paths.
3) The generated HTML file contains too many garbage styles and too much garbage markup.
I may be going about this in entirely the wrong way, so I decided to ask for your advice before continuing in this direction. I am open to any suggestion.
Regards,
I am adding a draft version of this code:
var fileName = Request["docfilename"];
var file = Request.Files[0];
var buffer = new byte[file.ContentLength];
file.InputStream.Read(buffer, 0, file.ContentLength);
var root = HttpContext.Current.Server.MapPath(@"~/saveddata/_temp/");
var path = Path.Combine(root, fileName);
// Write the uploaded bytes to a temporary file that Word can open.
using (var fs = new FileStream(path, FileMode.Create))
{
    using (var br = new BinaryWriter(fs))
    {
        br.Write(buffer);
    }
}
Microsoft.Office.Interop.Word.ApplicationClass oWord = new ApplicationClass();
object missing = System.Reflection.Missing.Value;
object isVisible = false;
word.Document oDoc;
object filename = path;
object saveFile;
oDoc = oWord.Documents.Open(ref filename, ref missing, ref missing, ref missing,
ref missing, ref missing, ref missing, ref missing,
ref missing,ref missing, ref missing, ref missing, ref missing, ref missing,
ref missing, ref missing);
oDoc.Activate();
object path2 = Path.Combine(root, "test.html");
object fileFormat = word.WdSaveFormat.wdFormatFilteredHTML;
oDoc.SaveAs(ref path2, ref fileFormat, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing,
ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
oDoc.Close(ref missing, ref missing, ref missing);
oWord.Application.Quit(ref missing, ref missing, ref missing);
This is a delicate matter. I faced the same problem, because HTML generated from a Word document carries a lot of style tags. You can see the same thing if you share a URL with Word-generated content on Facebook: the unwanted tags used to show up in the description/summary of the URL :) So I guess the issue persists there too. I would suggest going through the basics of Information Retrieval and trying to strip the style tags intelligently. You will need to write most of your stripping code with regular expressions.
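To make the regular-expression approach concrete, here is a minimal sketch of such a cleanup pass, which also touches on problem 2 from the question (absolute image paths). `WordHtmlCleaner`, `Clean` and the `imageBaseUrl` parameter are hypothetical names for illustration, and the list of attributes to strip (`class`, `style`, `lang`) is an assumption about what counts as garbage in your output. Regular expressions are not a real HTML parser: this assumes `src` values are quoted and that attribute values contain no `>` character, so treat it as a starting point rather than a complete solution.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordHtmlCleaner
{
    public static string Clean(string html, string imageBaseUrl)
    {
        // Remove <style> blocks and HTML comments (Word's conditional comments included).
        html = Regex.Replace(html, @"<style\b[^>]*>.*?</style>", "",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        html = Regex.Replace(html, @"<!--.*?-->", "", RegexOptions.Singleline);

        // Inside each opening tag, drop class, style and lang attributes.
        html = Regex.Replace(html, @"<[a-z][^>]*>", tag =>
            Regex.Replace(tag.Value,
                @"\s+(class|style|lang)\s*=\s*(""[^""]*""|'[^']*'|[^\s>]+)", "",
                RegexOptions.IgnoreCase),
            RegexOptions.IgnoreCase);

        // Prefix relative image paths with the base URL; absolute URLs are left alone.
        string prefix = imageBaseUrl.TrimEnd('/') + "/";
        html = Regex.Replace(html,
            @"(<img\b[^>]*?\bsrc\s*=\s*[""'])(?!https?:|/|data:)([^""']+)",
            m => m.Groups[1].Value + prefix + m.Groups[2].Value,
            RegexOptions.IgnoreCase);

        return html;
    }
}
```

You would read the generated test.html, pass its content through `Clean` with the URL of the folder where you moved the "..._files" resources, and hand the result to TinyMCE.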
For my VSTO Word solution, I need to programmatically "compare" two documents side-by-side. In other words I need to, from code, perform the equivalent of clicking the View > Show Side by Side button.
I tried using the CompareSideBySideWith method after loading two documents. An exception is thrown: "The requested member of the collection does not exist". I am not the first to encounter this; see Microsoft's (boilerplate, not particularly helpful) replies in this thread. The MS rep ended up scratching her head and giving up.
I even tried opening two blank documents and comparing them. This time no exception, but the compare didn't happen and CompareSideBySideWith() returned false.
Document doc1 = this.word.Documents.Add(ref missing, ref missing, ref missing, ref missing);
object doc2 = this.word.Documents.Add(ref missing, ref missing, ref missing, ref missing);
doc1.Windows.CompareSideBySideWith(ref doc2);
Has anyone discovered a workaround for this? It seems a pretty basic piece of functionality to have in a custom solution.
Note: We need to call the actual "Side by Side" compare, not just arrange the windows via Windows.Arrange(). This is partly because our ribbon contains an alias for the View Side by Side button, which won't be turned on (pressed in) unless the actual Side by Side command is called successfully.
Update: The exception was still thrown in the above example involving two new documents; Word swallowed the exception because I tried it outside of my try-catch block.
Per Otaku below I tried calling doc2.Windows.CompareSideBySideWith(ref doc1) instead, and this worked for blank documents as well as test documents saved as .docx and .rtf from Word 2007.
However, we need to compare documents saved as RTF from another RTF editor. When I load one of our documents, it fails. To reproduce my error, try loading RTF documents saved from WordPad--these fail as well. I've tried tinkering with the Encoding and Format parameters of Documents.Open() to no avail. It would be nice to avoid having to convert and save the temp file as .docx, particularly for larger documents! Also note that I can click View Side by Side after opening the WordPad-saved RTF files manually, and it works.
Also, it only seems to matter what format the compare document (the document being passed as a parameter to Windows.CompareSideBySideWith()) is in. For example, if we are doing doc2.Windows.CompareSideBySideWith(ref doc1) as in Otaku's example, it works when doc1 is a regular docx but not when it's an RTF saved from WordPad (regardless of where doc2 came from).
Update 2:
As usual, one line of code resolves several days of chasing one's tail:
doc1.Convert(); // Updates the document to the newest object model (i.e. DOCX)
Can now compare side-by-side without a problem.
Reverse the compares of your documents and it should be fine:
For new documents
Document doc1 = this.word.Documents.Add(ref missing, ref missing, ref missing, ref missing);
Document doc2 = this.word.Documents.Add(ref missing, ref missing, ref missing, ref missing);
object o = doc1;
doc2.Windows.CompareSideBySideWith(ref o);
For existing documents
object missing = System.Reflection.Missing.Value;
object newFilename1 = "C:\\Test\\Test1.docx";
Document doc1 = this.word.Documents.Open(ref newFilename1, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
object newFilename2 = "C:\\Test\\Test2.docx";
Document doc2 = this.word.Documents.Open(ref newFilename2, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
object o = doc1;
doc2.Windows.CompareSideBySideWith(ref o);
If your app isn't visible or you are launching a new instance of Word, you should set this.word.Visible = true; before opening the documents, as CompareSideBySideWith is a UI routine.