Problem writing HTML content to Word document in ASP.NET (C#)

I am trying to export the contents of an HTML page to Word.
My HTML page displays:
What is your favourite color?
NA
List the top three school ?
one National
two Devs
three PS
There is also a button. Its click event opens MS Word and pastes the page contents into the document.
In Word 2003 the document keeps the table structure of the HTML page. In Word 2007 the document contains only the text, without the table. How can I remove the table in Word 2003?
I am not able to attach screenshots, otherwise I would show the difference.
The web page is an .aspx page, and I export its content with the following code:
protected void Button1_Click(object sender, EventArgs e)
{
    Response.ContentEncoding = System.Text.Encoding.UTF7;
    System.Text.StringBuilder SB = new System.Text.StringBuilder();
    System.IO.StringWriter SW = new System.IO.StringWriter();
    System.Web.UI.HtmlTextWriter htmlTW = new System.Web.UI.HtmlTextWriter(SW);
    tbl.RenderControl(htmlTW);
    string strBody = "<html>" +
        "<body>" + "<div><b>" + htmlTW.InnerWriter.ToString() + "</b></div>" +
        "</body>" +
        "</html>";
    Response.AppendHeader("Content-Type", "application/msword");
    Response.AppendHeader("Content-disposition", "attachment; filename=" + fileName);
    Response.ContentEncoding = System.Text.Encoding.UTF7;
    string fileName1 = "C://Temp/Excel" + DateTime.Now.Millisecond.ToString();
    BinaryWriter writer = new BinaryWriter(File.Open(fileName1, FileMode.Create));
    writer.Write(strBody);
    writer.Close();
    FileStream fs = new FileStream(fileName1, FileMode.Open, FileAccess.Read);
    byte[] renderedBytes;
    // Create a byte array of file stream length
    renderedBytes = new byte[fs.Length];
    // Read block of bytes from stream into the byte array
    fs.Read(renderedBytes, 0, System.Convert.ToInt32(fs.Length));
    // Close the file stream
    fs.Close();
    FileInfo TheFile = new FileInfo(fileName1);
    if (TheFile.Exists)
    {
        File.Delete(fileName1);
    }
    Response.BinaryWrite(renderedBytes);
    Response.Flush();
    Response.End();
}

You are writing HTML, claiming it is of content type "application/msword", then hoping for the best.
There are more "correct" ways to achieve your objective.
There are a few projects around for converting (X)HTML to WordML content, of which docx4j-ImportXHTML.NET is one. Disclosure: I maintain that; you can find links to others elsewhere here on StackOverflow.
Alternatively, you can use Word's altChunk mechanism, though note:
you have less control over how the import is performed;
AltChunk isn't supported by Word 2003 (even with the compatibility pack).

Question not clear.
Well, these links may help you in understanding MSWord-C# automation:
http://www.codeproject.com/KB/cs/Simple_Ms_Word_Automation.aspx
http://www.c-sharpcorner.com/UploadFile/amrish_deep/WordAutomation05102007223934PM/WordAutomation.aspx

You can also try creating an Open XML document, a format that MS Office now recognizes.
Here is some more info with code samples:
http://msdn.microsoft.com/en-us/library/bb656295.aspx
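As a rough illustration of the Open XML route, here is a minimal sketch that builds a .docx in memory with the Open XML SDK (the DocumentFormat.OpenXml NuGet package). The class name DocxBuilder and the method BuildDocx are made up for this example, and the sketch assumes the content is plain text lines rather than rendered HTML.

```csharp
using System.Collections.Generic;
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper, not part of the original question or answers.
public static class DocxBuilder
{
    // Returns the bytes of a .docx file with one paragraph per input line.
    public static byte[] BuildDocx(IEnumerable<string> lines)
    {
        using (var ms = new MemoryStream())
        {
            using (var doc = WordprocessingDocument.Create(ms, WordprocessingDocumentType.Document))
            {
                MainDocumentPart main = doc.AddMainDocumentPart();
                var body = new Body();
                foreach (string line in lines)
                {
                    // Each line becomes a paragraph holding a single text run.
                    body.AppendChild(new Paragraph(new Run(new Text(line))));
                }
                main.Document = new Document(body);
                main.Document.Save();
            }
            // The package is fully written once the inner using block has disposed it.
            return ms.ToArray();
        }
    }
}
```

In a page handler the bytes could then be sent with Response.BinaryWrite, using the content type application/vnd.openxmlformats-officedocument.wordprocessingml.document and a file name ending in .docx. Word 2003 can only open this format with the compatibility pack installed.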

From what I understand, you are trying to create an MS Word document on the fly and are having difficulty when the output is viewed in Word 2003 vs. 2007.
In your code above, you are simply emitting HTML and forcing it to be treated as an MS Word document. I'm surprised it even works.
Instead, you might want to use Office Interop (Microsoft.Office.Interop.Word) or install DocX using NuGet. Look at some examples online; search for "C# create word doc".

Related

Download a file and keeping the original name

After I downloaded a file using this code:
using (FileStream fileStream = File.OpenRead(filePath))
{
    MemoryStream memStream = new MemoryStream();
    memStream.SetLength(fileStream.Length);
    fileStream.Read(memStream.GetBuffer(), 0, (int)fileStream.Length);
    Response.Clear();
    Response.ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
    Response.AddHeader("Content-Disposition", "attachment; filename=" + item.filename);
    Response.BinaryWrite(memStream.ToArray());
    Response.TransmitFile(filePath);
    Response.Flush();
    Response.Close();
    Response.End();
}
The code works very well, but when I open the .docx file after the download, it has lost its original name and I get the message "the file is corrupt and cannot be opened". This only happens with .doc and .docx files; I tried .xlsx, .jpg, and .pdf, and they worked fine.
Is this related to my code, or is it something else?
I guess your file name has an extension like dotx or dot.
The "t" stands for "template". The default action for template documents in Microsoft Office is to create a new document from a copy of the template.
Note that Word documents and Word templates look different.
So, if you want the user to download a document and not a template, create a document from the template and send the document to the user with a docx extension.
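Two other things in the question's code may be worth ruling out. It calls both Response.BinaryWrite and Response.TransmitFile, which would send the file content twice, and it puts the file name into the header unquoted, which some browsers truncate at the first space. The sketch below only covers the header; DownloadHeaders and BuildAttachmentHeader are hypothetical names introduced for this example.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the original question or answer.
public static class DownloadHeaders
{
    // Builds a Content-Disposition value with the file name in double quotes.
    public static string BuildAttachmentHeader(string fileName)
    {
        if (string.IsNullOrEmpty(fileName))
        {
            throw new ArgumentException("A file name is required.", "fileName");
        }

        // Keep only the name part and drop characters that would break the quoted string.
        string safeName = Path.GetFileName(fileName)
            .Replace("\"", "")
            .Replace("\r", "")
            .Replace("\n", "");

        return "attachment; filename=\"" + safeName + "\"";
    }
}
```

Assuming the same variables as in the question, the handler would then call Response.AddHeader("Content-Disposition", DownloadHeaders.BuildAttachmentHeader(item.filename)) and write the body once, with either Response.TransmitFile(filePath) or Response.BinaryWrite, but not both.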

iTextSharp - "Do you want to save" prompt when closing pdf

I've been attempting to create a pdf using iTextSharp and have run into an issue. Upon closing the pdf, Acrobat Reader prompts the user "Do you want to save changes..."
This seems to be a common issue, and there are probably a dozen questions on Stack Overflow about it, with just as many different solutions. I've tried as many solutions as I could find, to no avail.
My code is below. I create a simple pdf with one paragraph, using MemoryStream and the PdfWriter. I then return the MemoryStream as an array, and I then use the response.outputstream to download the file to the client.
protected void lnkbtnDownloadPdf_Click(object sender, EventArgs e)
{
    var Pdf = DownloadPdf();
    Response.ContentType = "application/pdf;";
    Response.AddHeader("Content-Disposition", "attachment; filename=" + "test.pdf");
    Response.OutputStream.Write(Pdf, 0, Pdf.Length);
    Response.OutputStream.Close();
}
public static byte[] DownloadPdf()
{
    using (MemoryStream ms = new MemoryStream())
    {
        Document doc = new Document(PageSize.LETTER.Rotate());
        PdfWriter writer = PdfWriter.GetInstance(doc, ms);
        doc.Open();
        doc.Add(new Paragraph("testtesttesttesttesttestesttest"));
        doc.Close();
        writer.Close();
        return ms.ToArray();
    }
}
I've tried this - iTextSharp-generated PDFs now cause Save dialog in Adobe Reader X - and I still get the save dialog.
I've also tried to implement this - Using iTextSharp to write data to PDF works great, but Acrobat Reader asks 'Do you want to save changes' when closing file - but my program doesn't use the stamper. Bruno has an answer on that link as well mentioning the acroform dictionary, but I'm not sure how to remove entries from that dictionary, and the user who asked the question was unable to fix their issue doing that anyways.
I need to use the PdfWriter. I've also looked into using the filestream instead of the outputstream like mentioned here - iTextSharp-generated PDFs cause save dialog when closing - but I need to download the pdf to the client and not save it on disk.
After some time I figured out it wasn't an iText problem at all (to my knowledge).
I added
Response.End();
at the end of the lnkbtnDownloadPdf_Click function and it worked. Acrobat no longer asks the user to save when they close my PDFs.

Export Doc to PDF Asp.net

I am using this code to export a FormView to Word.
It works great, but I want to export to PDF so that the output cannot be edited, or maybe to a Word document that nobody can change.
protected void Button1_Click(object sender, EventArgs e)
{
    Response.Clear();
    Response.Buffer = true;
    Response.AddHeader("content-disposition",
        "attachment;filename=Report.doc");
    Response.Charset = "";
    Response.ContentType = "application/vnd.ms-word";
    StringWriter sw = new StringWriter();
    HtmlTextWriter hw = new HtmlTextWriter(sw);
    FormView1.DataBind();
    FormView1.RenderControl(hw);
    Response.Output.Write(sw.ToString());
    Response.Flush();
    Response.End();
}
The problem is that even when I change the content type and the header in the above code, the resulting PDF is reported as having errors.
I really want to either convert the doc to PDF or generate the PDF using this code.
Please help.
Thanks.
Your best bet for creating PDFs in ASP.NET is a library such as iTextSharp. I have used it in the past; it is simple to use and free.
http://itextpdf.com/
As mentioned above, creating the PDF with one of the existing libraries would be more efficient.
But if you want to use Interop, you can download the Save as PDF add-in for Microsoft Office
and then pass the PDF format to the SaveAs method.
Alternatively, you can apply several properties to your Word document:
1. Mark as final: doc.Final = true;
2. Restrict editing
For newer versions of Word, there is a Protect method that provides a convenient way of restricting editing: http://msdn.microsoft.com/en-us/library/ms178793.aspx
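The Interop route could look roughly like the sketch below. It assumes Word with PDF export support is installed on the machine running the code, a reference to Microsoft.Office.Interop.Word, and C# 4 or later (for optional COM parameters). WordToPdf and Convert are hypothetical names. Keep in mind that Microsoft does not support automating Office from server-side code, so treat this as a desktop-style illustration.

```csharp
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper, not part of the original answers.
public static class WordToPdf
{
    // Opens an existing Word document and saves a PDF copy next to it.
    public static void Convert(string docPath, string pdfPath)
    {
        var app = new Word.Application();
        Word.Document doc = null;
        try
        {
            doc = app.Documents.Open(docPath, ReadOnly: true);
            // Pass the PDF format to SaveAs, as described above.
            doc.SaveAs(pdfPath, Word.WdSaveFormat.wdFormatPDF);
        }
        finally
        {
            if (doc != null)
            {
                // Close without saving changes back to the source document.
                doc.Close(Word.WdSaveOptions.wdDoNotSaveChanges);
            }
            app.Quit();
        }
    }
}
```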

Converting html strings in Excel file to formatted word file with .NET

Input are Excel files - the cells may contain some basic HTML formatting like <b>, <br>, <h2>.
I want to read the strings and insert the text as formatted text into word documents, i.e. <b>Foo</b> would be shown as a bold string in Word.
I don't know which tags are used so I need a "generic solution", a find/replace approach does not work for me.
I found a solution from January 2011 using the WebBrowser component. So the HTML is converted to RTF and the RTF is inserted into Word. I was wondering if there is a better solution today.
Using a commercial component is fine for me.
Update
I came across Matthew Manela's MarkupConverter class. It converts HTML to RTF. Then I use the clipboard to insert the snippet into the word file
// rtf contains the converted html string using MarkupConverter
Clipboard.SetText(rtf, TextDataFormat.Rtf);
// objTable is a table in my word file
objTable.Cell(1, 1).Range.Paste();
This works, but will copy/pasting up to a few thousand strings using the clipboard break anything?
You will need the OpenXML SDK in order to work with OpenXML. It can be quite tricky getting into, but it is very powerful, and a whole lot more stable and reliable than Office Automation or Interop.
The following will open a document, create an AltChunk part, add the HTML to it, and embed it into the document. For a broader overview of AltChunk see Eric White's blog
using (var wordDoc = WordprocessingDocument.Open("DocumentName.docx", true))
{
    var altChunkId = "AltChunkId1";
    var mainPart = wordDoc.MainDocumentPart;
    var chunk = mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.Html, altChunkId);
    using (var textStream = new MemoryStream())
    {
        var html = "<html><body>...</body></html>";
        var data = Encoding.UTF8.GetBytes(html);
        textStream.Write(data, 0, data.Length);
        textStream.Position = 0;
        chunk.FeedData(textStream);
    }
    var altChunk = new AltChunk();
    altChunk.Id = altChunkId;
    mainPart.Document.Body.InsertAt(altChunk, 0);
    mainPart.Document.Save();
}
Obviously, in your case you will want to find (or build) the table you want and insert the AltChunk there instead of at the first position in the body. Note that the HTML you insert into the Word document must be a full HTML document, with an <html> tag. I'm not sure whether <body> is required, but it doesn't hurt. If you only have HTML-formatted text, simply wrap the text in these tags and insert it into the document.
It seems that you will need to use Office Automation/Interop to get the table heights. See this answer which says that the OpenXML SDK does not update the heights, only Word does.
Use this code; it works.
Response.AppendHeader("content-disposition", "attachment;filename=FileEName.xls");
Response.Charset = "";
Response.Cache.SetCacheability(HttpCacheability.NoCache);
Response.ContentType = "application/vnd.ms-excel";
this.EnableViewState = false;
//Response.Write("Your HTML Code");
Response.Write(
    "<table border='1 px solid'><tr><th>sfsd</th><th>sfsdfssd</th></tr><tr>" +
    "<td>ssfsdf</td><td><table border='1 px solid'><tr><th>sdf</th><th>hhsdf</th></tr><tr>" +
    "<td>sdfds</td><td>sdhjhfds</td></tr></table></td></tr></table>");
Response.End();
Why not let Word do its own translation, since it understands HTML?
1. Read your Excel cells.
2. Write the values into an HTML text file, laid out as the Word document should be.
3. Open Word and let it read that HTML file.
4. Instruct Word to save the document as a new Word document (if that is required).
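The step of writing the cell values into an HTML file might look like the sketch below. HtmlForWord, BuildDocument, and WriteFile are hypothetical names, and the sketch assumes the cell strings have already been read from Excel and contain trusted HTML fragments.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the original answer.
public static class HtmlForWord
{
    // Wraps each HTML fragment in its own <div> inside one complete HTML document.
    public static string BuildDocument(IEnumerable<string> fragments)
    {
        var sb = new StringBuilder();
        sb.Append("<html><head><meta charset=\"utf-8\"></head><body>");
        foreach (string fragment in fragments)
        {
            sb.Append("<div>").Append(fragment).Append("</div>");
        }
        sb.Append("</body></html>");
        return sb.ToString();
    }

    // Writes the document as UTF-8 so that it matches the declared charset.
    public static void WriteFile(string path, IEnumerable<string> fragments)
    {
        File.WriteAllText(path, BuildDocument(fragments), Encoding.UTF8);
    }
}
```

For example, HtmlForWord.WriteFile(path, new[] { "<b>Foo</b>", "<h2>Title</h2>" }) produces a file that Word can open directly; opening it and saving it as a Word document would still have to be done by hand or through Interop.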

iTextSharp generated PDF: How to send the pdf to the client and add a prompt?

I have generated a PDF using iTextSharp. When it is created, it is saved automatically at the location given in my code, on the server rather than on the client, and of course without telling the user anything.
I need to send it to the client and show a dialog box asking the user where to save the PDF.
How can I do this?
This is my PDF code:
using (MemoryStream myMemoryStream = new MemoryStream())
{
    Document document = new Document();
    PdfWriter PDFWriter = PdfWriter.GetInstance(document, myMemoryStream);
    document.AddHeader("header1", "HEADER1");
    document.Open();
    //..........
    document.Close();
    byte[] content = myMemoryStream.ToArray();
    // Write out PDF from memory stream.
    using (FileStream fs = File.Create(HttpContext.Current.Server.MapPath("~\\report.pdf")))
    {
        fs.Write(content, 0, (int)content.Length);
    }
EDIT
This is an example of the result I want:
http://examples.extjs.eu/?ex=download
Thanks to your replies, I modified my code to this:
HttpContext.Current.Response.ContentType = "application/pdf";
HttpContext.Current.Response.AppendHeader("Content-Disposition", "attachment; filename=test.pdf");
using (MemoryStream myMemoryStream = new MemoryStream())
{
    Document document = new Document();
    PdfWriter PDFWriter = PdfWriter.GetInstance(document, myMemoryStream);
    document.AddHeader("Content-Disposition", "attachment; filename=wissalReport.pdf");
    document.Open();
    //..........
    document.Close();
    byte[] content = myMemoryStream.ToArray();
    HttpContext.Current.Response.Buffer = false;
    HttpContext.Current.Response.Clear();
    HttpContext.Current.Response.ClearContent();
    HttpContext.Current.Response.ClearHeaders();
    HttpContext.Current.Response.AppendHeader("content-disposition", "attachment;filename=" + "my_report.pdf");
    HttpContext.Current.Response.ContentType = "Application/pdf";
    // Write the file content directly to the HTTP content output stream.
    HttpContext.Current.Response.BinaryWrite(content);
    HttpContext.Current.Response.Flush();
    HttpContext.Current.Response.End();
But I get this error:
Uncaught Ext.Error: You're trying to decode an invalid JSON String:
%PDF-1.4 %���� 3 0 obj <</Type/XObject/Subtype/Image/Width 994/Height 185/Length 13339/ColorSpace/DeviceGray/BitsPerComponent 8/Filter/FlateDecode>>stream x���|E�
...........
I am absolutely sure my iTextSharp code for creating the PDF is correct, because I can save the file on the server. That is not what I need, though: when I try to send it to the client, I get the error above.
Thanks in advance.
In a web application you probably want to stream the PDF to the user as binary; the browser will then either open the PDF or prompt the user to save it.
Remember that PDF generation happens on the server; even if the user provides a path, it is of no use on the server. See the following link:
How To Write Binary Files to the Browser Using ASP.NET and Visual C# .NET
In your case you generate the file yourself, so you already have the bytes in memory rather than a file on disk; you can therefore use Response.BinaryWrite directly instead of Response.WriteFile.
Modified sample:
Response.Buffer = false;
Response.Clear();
Response.ClearContent();
Response.ClearHeaders();
//Set the appropriate ContentType.
Response.ContentType = "Application/pdf";
//Write the file content directly to the HTTP content output stream.
Response.BinaryWrite(content);
Response.Flush();
Response.End();
You need to send a Content-Disposition header to the user's browser. From memory, the code is something like this:
Response.ContentType = "application/pdf";
Response.AppendHeader("Content-Disposition","attachment; filename=nameofthefile.pdf");
Currently you are saving your file on the server, thereby overwriting the same PDF with every request, and probably causing errors if two requests for a PDF arrive at the same time.
Use Response to return the PDF (from the MemoryStream) to the user, and skip writing the PDF to a file on your server.
The browser will ask the user where the file should be saved. Something like:
Response.ContentType = "application/pdf";
// ToArray still works after document.Close() has closed the stream; CopyTo would not
Response.BinaryWrite(myMemoryStream.ToArray());
Also look at the answer from Alun: using Content-Disposition you can propose a file name to the user.
SOLVED
The error comes from the submit operation trying to interpret the response, which it cannot do because the response is not in a format it knows.
I just set window.location to download the file, and this works fine.
{
    xtype: 'button',
    text: 'Generate PDF',
    handler: function () {
        window.location = '/AddData.ashx?action=pdf';
    }
}
Instead of setting the location you can also do window.open().
Whether the file will be downloaded or opened depends on browser settings.
You do not need to use MemoryStream. Use Response.OutputStream instead. That's what it's there for. No need to use Response.BinaryWrite() or any other call to explicitly write the document either; iTextSharp takes care of writing to the stream when you use Response.OutputStream.
Here's a simple working example:
Response.ContentType = "application/pdf";
Response.AppendHeader(
    "Content-Disposition",
    "attachment; filename=test.pdf"
);
using (Document document = new Document()) {
    PdfWriter.GetInstance(document, Response.OutputStream);
    document.Open();
    document.Add(new Paragraph("This is a paragraph"));
}
Here's how to add the proper HTTP headers (to get the prompt to save the file). If your code is in a web form (a button click handler), add Response.End() to the code example above, after the using statement, so that the web form's HTML output is not appended to the PDF document.
