Word Document SaveAs2 in Format wdFormatDocument97 - c#

I'm using Microsoft Word Interop to create a new Word document, insert some text into it, and save it.
When I save it using the following command:
document.SaveAs2(wordFilePath);
the document is saved in DOCX format.
But when I save it using the following command:
document.SaveAs2(wordFilePath, Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatDocument97);
the document is seemingly saved as a Word 97 DOC (Windows Explorer displays it with the Word 97 DOC icon and type), but internally it is really saved as DOCX. I can see this in two ways: it has the same size as the corresponding DOCX, and when I open it with Word 2016 and select Save As, the default save format is DOCX!
How can I save a document in the real Word 97 format?
Here's the function used to create a new Word document, whose type depends on the extension (DOC vs. DOCX) of the given file path:
public static void TextToMsWordDocument(string body, string wordFilePath)
{
    Microsoft.Office.Interop.Word.Application winword = new Microsoft.Office.Interop.Word.Application();
    winword.Visible = false;
    object missing = System.Reflection.Missing.Value;
    Microsoft.Office.Interop.Word.Document document = winword.Documents.Add(ref missing, ref missing, ref missing, ref missing);
    if (body != null)
    {
        document.Content.SetRange(0, 0);
        document.Content.Text = (body + System.Environment.NewLine);
    }
    if (System.IO.Path.GetExtension(wordFilePath).ToLower() == "doc")
    {
        document.SaveAs2(wordFilePath, Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatDocument97);
    }
    else // Assuming a "docx" extension:
    {
        document.SaveAs2(wordFilePath);
    }
    document.Close(ref missing, ref missing, ref missing);
    document = null;
    winword.Quit(ref missing, ref missing, ref missing);
    winword = null;
}
And here's the code used to call this function:
TextToMsWordDocument("abcdefghijklmnopqrstuvwxyz", "text.doc");
TextToMsWordDocument("abcdefghijklmnopqrstuvwxyz", "text.docx");

It turned out to be a rather stupid error: the comparison should be == ".doc" rather than == "doc", because Path.GetExtension returns the extension including the leading period.
I didn't notice it because when SaveAs2 receives a file path with a ".doc" extension and no WdSaveFormat, it (strangely enough) creates a Word document file with exactly the problem described here: DOCX content under a .doc name.
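As a small illustration of why the original comparison never matched, here is a sketch of the extension check on its own. The class and method names (ExtensionCheck, IsLegacyDoc) are made up for this example and are not part of the original code.

```csharp
using System;
using System.IO;

public static class ExtensionCheck
{
    // Hypothetical helper: true when the path ends in ".doc" (case-insensitive).
    public static bool IsLegacyDoc(string path)
    {
        // Path.GetExtension returns the extension WITH its leading period, e.g. ".doc".
        string extension = Path.GetExtension(path);
        return string.Equals(extension, ".doc", StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(Path.GetExtension("text.doc")); // .doc
        Console.WriteLine(IsLegacyDoc("text.doc"));       // True
        Console.WriteLine(IsLegacyDoc("text.docx"));      // False
    }
}
```

Comparing with StringComparison.OrdinalIgnoreCase also removes the need for the ToLower() call.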


How to compare the images (Shapes) on each page of a Word document through Microsoft.Office.Interop.Word using C#?

I am using the following code to replace an image (a Shape in Microsoft.Office.Interop.Word) in a Word document with a new image. However, the client's requirement is that I take the first image on the first page of the document and compare it with the images in the rest of the document; an image should be replaced with the new one only if it matches. I need help with how to compare two shapes (images).
public void ReplaceWordImage(string FilePath)
{
    Word.Document d = new Word.Document();
    Word.Application WordApp;
    WordApp = new Microsoft.Office.Interop.Word.Application();
    bool headerImage = false;
    object missing = System.Reflection.Missing.Value;
    object yes = true;
    object no = false;
    object filename = @"D:/ImageToReplace/5.docx";
    d = WordApp.Documents.Open(ref filename, ref missing, ref no, ref missing,
        ref missing, ref missing, ref missing, ref missing, ref missing,
        ref missing, ref missing, ref yes, ref missing, ref missing, ref missing, ref missing);
    List<Word.ShapeRange> ranges = new List<Microsoft.Office.Interop.Word.ShapeRange>();
    List<Word.ShapeRange> headerRanges = new List<Microsoft.Office.Interop.Word.ShapeRange>();
    foreach (Word.Shape shape in d.Shapes)
    {
        if (shape.Type == Microsoft.Office.Core.MsoShapeType.msoPicture)
        {
            // ...
        }
    }
    foreach (Word.Range r in ranges)
    {
        r.InlineShapes.AddPicture(@"D:\Untitled.jpg", ref missing, ref missing);
    }
}
The Word object model doesn't provide anything for comparing two images. The best you could do with it is save both images to disk and compare their byte representations. However, there is a better way to get the job done: the Open XML SDK, which lets you get the byte representation of images on the fly without saving them to disk first. The Open XML SDK contains a WordprocessingDocument class that can work on a memory stream containing the document's content, and a MemoryStream can be converted to a byte[] using ToArray(). See "Convert Word of interop object to byte [] without saving physically" for more information.
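Once both images are available as byte arrays, the comparison itself could be sketched as follows. This is only an illustration under the assumption that "matching" means byte-for-byte identical; the class and method names are made up for the example, and images that were re-encoded or resized will not match this way.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

public static class ImageBytesComparer
{
    // Hypothetical helper: true only when both arrays hold exactly the same bytes.
    public static bool SameBytes(byte[] first, byte[] second)
    {
        if (first == null || second == null) return false;
        if (first.Length != second.Length) return false;
        return first.SequenceEqual(second);
    }

    // Hypothetical helper: a SHA-256 hex digest, handy when one reference image
    // is compared against many others (hash once, then compare strings).
    public static string Sha256Hex(byte[] data)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return BitConverter.ToString(sha.ComputeHash(data)).Replace("-", "").ToLowerInvariant();
        }
    }

    public static void Main()
    {
        byte[] a = { 1, 2, 3 };
        byte[] b = { 1, 2, 3 };
        byte[] c = { 1, 2, 4 };
        Console.WriteLine(SameBytes(a, b)); // True
        Console.WriteLine(SameBytes(a, c)); // False
        Console.WriteLine(Sha256Hex(a) == Sha256Hex(b)); // True
    }
}
```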

How to programmatically print to PDF file without prompting for filename in C# using the Microsoft Print To PDF printer that comes with Windows 10

Microsoft Windows 10 comes with a Microsoft Print To PDF printer which can print something to a PDF file. When used, it prompts for the output filename.
How can I programmatically control this from C# to not prompt for the PDF filename but save to a specific filename in some folder that I provide?
This is for batch processing of printing a lot of documents or other types of files to a PDF programmatically.
To print a PrintDocument object using the Microsoft Print to PDF printer without prompting for a filename, here is the pure code way to do this:
// generate a file name as the current date/time in unix timestamp format
string file = (DateTime.UtcNow.Subtract(new DateTime(1970, 1, 1))).TotalSeconds.ToString();
// the directory to store the output.
string directory = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
// initialize PrintDocument object
PrintDocument doc = new PrintDocument() {
    PrinterSettings = new PrinterSettings() {
        // set the printer to 'Microsoft Print to PDF'
        PrinterName = "Microsoft Print to PDF",
        // tell the object this document will print to file
        PrintToFile = true,
        // set the filename to whatever you like (full path)
        PrintFileName = Path.Combine(directory, file + ".pdf"),
    }
};
// print without showing a file name prompt
doc.Print();
You can also use this method for other "save as file" printers, such as Microsoft XPS Document Writer.
You can print to the Windows 10 PDF printer by using the PrintOut method and specifying the output file name as its fourth parameter, as in the following example:
/// <summary>
/// Convert a file to PDF using office _Document object
/// </summary>
/// <param name="InputFile">Full path and filename with extension of the file you want to convert from</param>
/// <returns></returns>
public void PrintFile(string InputFile)
{
    // convert input filename to new pdf name
    object OutputFileName = Path.Combine(
        Path.GetDirectoryName(InputFile),
        Path.GetFileNameWithoutExtension(InputFile) + ".pdf"
    );
    // Set an object so there is less typing for values not needed
    object missing = System.Reflection.Missing.Value;
    // `doc` is of type `_Document`
    doc.PrintOut(
        ref missing, // Background
        ref missing, // Append
        ref missing, // Range
        ref OutputFileName, // OutputFileName
        ref missing, // From
        ref missing, // To
        ref missing, // Item
        ref missing, // Copies
        ref missing, // Pages
        ref missing, // PageType
        ref missing, // PrintToFile
        ref missing, // Collate
        ref missing, // ActivePrinterMacGX
        ref missing, // ManualDuplexPrint
        ref missing, // PrintZoomColumn
        ref missing, // PrintZoomRow
        ref missing, // PrintZoomPaperWidth
        ref missing  // PrintZoomPaperHeight
    );
}
InputFile is the full path of the document you would like to convert, and doc is a regular document object. For more information about doc, see the following MSDN links for _Document.PrintOut():
Office 2003
Office 2013 and later
The PrintOut call in the example performs a silent print of the specified InputFile to OutputFileName, which is placed in the same folder as the original document but is in PDF format with the .pdf extension.
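Deriving the output name (same folder, same base name, .pdf extension) can also be done in a single call. The following is a sketch using Path.ChangeExtension as an alternative to combining the directory and file name by hand; PdfPathHelper and ToPdfPath are names invented for this example.

```csharp
using System;
using System.IO;

public static class PdfPathHelper
{
    // Hypothetical helper: keeps the folder and base name, swaps the extension for ".pdf".
    public static string ToPdfPath(string inputFile)
    {
        return Path.ChangeExtension(inputFile, ".pdf");
    }

    public static void Main()
    {
        Console.WriteLine(ToPdfPath("report.docx"));   // report.pdf
        Console.WriteLine(ToPdfPath("archive.v2.rtf")); // archive.v2.pdf
    }
}
```

Only the last extension is replaced, so a name such as archive.v2.rtf keeps its ".v2" part.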

Embed Outlook mails into a Word document

Is there a way to embed Outlook mail items into a Word document programmatically from a list of Outlook MailItems?
I am trying to achieve something like this
Word.Application wdApp = new Word.Application();
Word.Document wdDoc = wdApp.Documents.Add(ref missing, ref missing, ref missing,
    ref missing);
foreach (Outlook.MailItem olMail in mailAttachments)
{
    // Paste/embed this olMail into the Word document
}
Finally, I found an effective solution.
I used the InlineShapes.AddOLEObject method.
My solution:
static void creatDocument(List<Outlook.MailItem> mailAttachments)
{
    string userprofile = Environment.GetFolderPath(Environment.SpecialFolder.UserProfile);
    object missing = System.Reflection.Missing.Value;
    object start = 0;
    object end = 0;
    // class type identifying the embedded object as an Outlook message
    object classType = "{00020D0B-0000-0000-C000-000000000046}";
    object fileName;
    object linkToFile = false;
    object displayAsIcon = true;
    object iconFileName = Path.Combine(userprofile, @"Pictures\MailIcon.ico");
    object iconIndex = 0;
    object iconLabel;
    object range;
    Word.Application wdApp = new Word.Application();
    Word.Document wdDoc = wdApp.Documents.Add(ref missing, ref missing, ref missing, ref missing);
    Word.Range rng = wdDoc.Range(ref start, ref missing);
    foreach (Outlook.MailItem olMail in mailAttachments)
    {
        // save the mail as a temporary .msg file, then embed that file as an OLE object
        string msgPath = Path.Combine(userprofile, @"Documents\TemperoraySave") + CleanFileName(olMail.Subject) + ".msg";
        olMail.SaveAs(msgPath, Outlook.OlSaveAsType.olMsg);
        fileName = msgPath;
        iconLabel = CleanFileName(olMail.Subject) + ".msg";
        rng = wdDoc.Content;
        range = rng;
        wdDoc.InlineShapes.AddOLEObject(ref classType, ref fileName, ref linkToFile, ref displayAsIcon, ref iconFileName, ref iconIndex, ref iconLabel, ref range);
        var mailRanger = wdDoc.Paragraphs.Add();
        mailRanger.Format.SpaceAfter = 10f;
    }
}

// strips characters that are not allowed in file names (requires System.Linq)
private static string CleanFileName(string fileName)
{
    return Path.GetInvalidFileNameChars().Aggregate(fileName, (current, c) => current.Replace(c.ToString(), string.Empty));
}
Nope. The Word object model doesn't provide anything for that. Instead, you may consider using the CustomDocumentProperties collection for storing your custom data. For example, you may save the message as an .msg file and store the path to the file, or the ID of the record in a database, in a custom document property. Later, when you need to open a message, you can use the ID or path to retrieve the email message.
You can't embed the source text of the emails, but you can copy the MailItem.HTMLBody or MailItem.Body (text) values and insert them into the Word document.

Getting a textbox value from a word document using ASP.NET?

I have a very basic web application written in ASP.NET(C#) and a basic Microsoft Word (2007) document that contains a text box and a dropdown list.
In my web application code behind file I would like to call the textbox control and a dropdown control by name and extract the values from them.
Any documentation that I have found online simply reads or writes a word document but I can't seem to find anything on accessing controls and extracting the values from them.
Any help would be greatly appreciated
Thank You
This is the only code that I have at the minute that does anything with the word document. It finds the word doc and opens it:
//File path of the word document that contains the required values
string filePath = @"C:\Users\murphycm\Desktop\PlacesFile.docm";
object fileToOpen = (object)filePath;
Microsoft.Office.Interop.Word.Application oWord = new Microsoft.Office.Interop.Word.Application();
Microsoft.Office.Interop.Word.Document oWordDoc = new Microsoft.Office.Interop.Word.Document();
oWordDoc = oWord.Documents.Open(ref fileToOpen);
Unless you are going to install Microsoft Office on your server, I would recommend using the Open XML SDK 2.5 from Microsoft. With the SDK you can manipulate Microsoft Office documents for Office 2007 and higher.
Here's some code for getting text from a TextBox using both the OpenXML and Office Interop methods:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using Word = Microsoft.Office.Interop.Word;

namespace OpenXMLSDKTest
{
    class Program
    {
        static void Main(string[] args)
        {
            // Open XML Method
            object fileName = @"OpenXmlTest.docx";
            using (WordprocessingDocument myDocument = WordprocessingDocument.Open(fileName.ToString(), true))
            {
                // first text box in the main document part
                var textbox = myDocument.MainDocumentPart.Document.Descendants<TextBoxContent>().First();
                Console.WriteLine(textbox.InnerText);
            }

            // Office Interop Method
            object missing = System.Reflection.Missing.Value;
            object readOnly = false;
            object isVisible = true;
            Word.Application wordApp = new Microsoft.Office.Interop.Word.Application();
            wordApp.Documents.Open(ref fileName, ref missing, ref readOnly, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref isVisible, ref missing, ref missing, ref missing, ref missing);
            // Shapes is 1-based; read the text of the first shape's text frame
            object firstShape = 1;
            string textFrameText = wordApp.ActiveDocument.Shapes.get_Item(ref firstShape).TextFrame.TextRange.Text;
            Console.WriteLine(textFrameText);
            wordApp.Quit(ref missing, ref missing, ref missing);
            Console.WriteLine("Press any key to continue...");
            Console.ReadKey();
        }
    }
}
public List<string> GetTagsFromNewTemplate(string filePath)
{
    var tags = new HashSet<string>();
    using (WordprocessingDocument myDocument = WordprocessingDocument.Open(filePath, false))
    {
        // collect the value of every Tag element (content control tag) in the document
        var tagValues = myDocument.MainDocumentPart.Document.Descendants<DocumentFormat.OpenXml.Wordprocessing.Tag>().Select(x => x.Val);
        foreach (var value in tagValues) tags.Add(value);
    }
    return tags.ToList();
}

How can I convert an RTF file to a pdf file?

How can I convert an RTF file to a PDF one? I have the Adobe PDF printer; should I use it? If so, how can I programmatically access it?
You can use a PDF printer, but then you still have a few problems to solve.
In order to handle text that spans multiple pages, see this article on creating a descendant of RichTextBox that handles the EM_FORMATRANGE message.
There are a lot of (free) PDF printers out there, but I found that only BioPdf will let you control the filename of the output. They also have reasonable rates for licensed versions.
I have used this to create complex reports (combinations of multiple RTF segments and custom graphics) as attachments for emailing.
You could use the virtual print driver doPDF http://www.dopdf.com/ if this is permitted on the production machine. It will convert more or less any file type to PDF, not just RTF. Once installed, it simply appears as another printer within Print Manager.
To use it in, say, WinForms code, I adapted the code found in the MSDN printing example http://msdn.microsoft.com/en-us/library/system.drawing.printing.printdocument.aspx
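private void button1_Click(object sender, EventArgs e)
{
    try
    {
        streamToPrint = new System.IO.StreamReader(filePath); // filePath: placeholder, the path is not given in the post
        printFont = new Font("Arial", 10);
        PrintDocument pd = new PrintDocument();
        pd.PrinterSettings.PrinterName = "doPDF v6";//<-------added
        pd.PrintPage += new PrintPageEventHandler(this.pd_PrintPage);
        pd.Print();
    }
    catch (Exception ex) { MessageBox.Show(ex.Message); }
}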
The only part of the code I needed to add was the line marked above, i.e. pd.PrinterSettings.PrinterName = "doPDF v6";
There may be a printer enumeration method that would be more elegant and robust; with it, one could test whether the print driver exists, perhaps against a config file setting.
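One way such a check could look is sketched below. In a WinForms application the installed printer names are available from System.Drawing.Printing.PrinterSettings.InstalledPrinters; to keep the sketch runnable anywhere, the helper takes the list of names as a parameter and the demo uses a fixed sample list. PrinterLookup and IsInstalled are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrinterLookup
{
    // Hypothetical helper: true when `wanted` is among the installed printer names (case-insensitive).
    public static bool IsInstalled(IEnumerable<string> installedPrinters, string wanted)
    {
        return installedPrinters.Any(name => string.Equals(name, wanted, StringComparison.OrdinalIgnoreCase));
    }

    public static void Main()
    {
        // In real code the list would come from
        // System.Drawing.Printing.PrinterSettings.InstalledPrinters.Cast<string>();
        // a fixed sample list is used here instead (assumption for the demo).
        var sample = new List<string> { "Microsoft Print to PDF", "doPDF v6" };
        Console.WriteLine(IsInstalled(sample, "doPDF v6"));  // True
        Console.WriteLine(IsInstalled(sample, "Adobe PDF")); // False
    }
}
```

The printer name to look for could then be read from a config file setting instead of being hard-coded.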
Handling multiple pages is taken care of in the this.pd_PrintPage method, as per the MSDN sample.
PrintDocument supports from and to page printing.
doPDF will pop up a Save As dialog box automatically so the file can be saved as a PDF document.
What about rtf though?
A Microsoft format that is not supported very well, it would seem. This article http://msdn.microsoft.com/en-us/library/ms996492.aspx with demo code uses the RichTextBox as a starting point and, via P/Invoke, leverages Win32 to print RTF as WYSIWYG. The control defines its own page length method, replacing the one used in the code snippet above, and still uses PrintDocument, so it should be easy to use. You can assign any RTF content through the control's Rtf property.
An RTF document has to be read and interpreted by some app that can understand that format. You would need to programmatically launch that app, load your RTF file, and send it to the PDF printer. Word would be good for that, since it has a nice .NET interface. An overview of the steps would be:
ApplicationClass word = new ApplicationClass();
Document doc = word.Documents.Open(ref filename, ...);
You will need to use the Microsoft.Office.Interop.Word namespace and add a reference to the Microsoft.Office.Interop.Word.dll assembly.
Actually, none of these are terribly reliable or do what I want. The solution is simple: install Adobe Acrobat and just have it open the RTF file using the Process class.
I also found a more reasonable approach. I save the file as an RTF, then open it in Word and save it as PDF (Word's Print As PDF plugin must be installed).
SaveFileDialog sfd = new SaveFileDialog();
sfd.Filter = "Personal Document File (*.pdf)|*.pdf";
if (sfd.ShowDialog() == DialogResult.OK) {
    String filename = Path.GetTempFileName() + ".rtf";
    using (StreamWriter sw = new StreamWriter(filename)) {
        // write the RTF content to the temporary file here
    }
    Object oMissing = System.Reflection.Missing.Value; //null for VB
    Object oTrue = true;
    Object oFalse = false;
    Microsoft.Office.Interop.Word.Application oWord = new Microsoft.Office.Interop.Word.Application();
    Microsoft.Office.Interop.Word.Document oWordDoc = new Microsoft.Office.Interop.Word.Document();
    oWord.Visible = false;
    Object rtfFile = filename;
    Object saveLoc = sfd.FileName;
    Object wdFormatPDF = 17; //WdSaveFormat Enumeration
    // create a new document based on the RTF file, then save it as PDF
    oWordDoc = oWord.Documents.Add(ref rtfFile, ref oMissing, ref oMissing, ref oMissing);
    oWordDoc.SaveAs(ref saveLoc, ref wdFormatPDF, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing,
        ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing, ref oMissing);
    oWordDoc.Close(ref oFalse, ref oMissing, ref oMissing);
    oWord.Quit(ref oFalse, ref oMissing, ref oMissing);
    //Get the MD5 hash and save it with it
    MD5 md5 = new MD5CryptoServiceProvider();
    byte[] retVal;
    using (FileStream file = new FileStream(sfd.FileName, FileMode.Open)) {
        retVal = md5.ComputeHash(file);
    }
    using (StreamWriter sw = new StreamWriter(sfd.FileName + ".md5")) {
        sw.WriteLine(sfd.FileName + " - " + DateTime.Now.ToLongDateString() + " " + DateTime.Now.ToShortTimeString() + " md5: " + BinaryToHexConverter.To64CharChunks(retVal)[0]);
    }
}
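BinaryToHexConverter.To64CharChunks is the poster's own helper and is not shown. As a rough stand-in, the hash bytes could be turned into a hex string with BitConverter, as in the sketch below; Md5Hex and Compute are names invented for this example, and the output format is an assumption rather than what the original helper produces.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class Md5Hex
{
    // Hypothetical helper: MD5 digest of the data as a lowercase hex string.
    public static string Compute(byte[] data)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(data);
            // BitConverter.ToString yields "90-01-50-..."; drop the dashes.
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Compute(Encoding.ASCII.GetBytes("abc"))); // 900150983cd24fb0d6963f7d28e17f72
    }
}
```

For a file, the same idea applies by passing File.ReadAllBytes(path) or by hashing a FileStream as in the code above.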
