WordprocessingDocument doesn't save as expected in the memory stream - C#

I am trying to modify a Word document. For some reason the table row that I add does not get saved to the memory stream. What am I doing wrong? There are no errors; the changes are simply not saved.
public void Generate()
{
byte[] templateDoc = File.ReadAllBytes(@"C:\Desktop\Test\11.docx");
using (MemoryStream stream = new MemoryStream())
{
stream.Write(templateDoc, 0, (int)templateDoc.Length);
using (var document1 = WordprocessingDocument.Open(stream, isEditable: true))
{
foreach (var item in document1.MainDocumentPart.Document.Body)
{
if (item.InnerText.Contains("<test>"))
{
DocumentFormat.OpenXml.Wordprocessing.Table table = new DocumentFormat.OpenXml.Wordprocessing.Table();
DocumentFormat.OpenXml.Wordprocessing.TableRow tr1 = new DocumentFormat.OpenXml.Wordprocessing.TableRow();
DocumentFormat.OpenXml.Wordprocessing.TableCell tc1 = new DocumentFormat.OpenXml.Wordprocessing.TableCell();
tc1.Append(new TableCellProperties(new TableCellWidth() { Type = TableWidthUnitValues.Pct, Width = "50" }));
tc1.Append(new DocumentFormat.OpenXml.Wordprocessing.Paragraph(new Run(new Text("Input 1:"))));
tr1.Append(tc1);
table.Append(tr1);
document1.MainDocumentPart.Document.Body.Append(table);
}
}
File.WriteAllBytes(@"C:\Desktop\Test\22.docx", stream.ToArray());
}
}
}
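A likely cause is that the bytes are copied out of the stream while the document is still open: the Open XML SDK normally writes pending changes back to the underlying stream when the WordprocessingDocument is saved or disposed. The following is a minimal sketch under that assumption, not a verified fix for this exact template. The class name TemplateFiller and the two path parameters are hypothetical; the table-building calls are the ones from the question.

```csharp
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class TemplateFiller
{
    // Hypothetical helper: paths are passed in instead of being hard-coded.
    public static void Generate(string templatePath, string outputPath)
    {
        byte[] templateDoc = File.ReadAllBytes(templatePath);
        using (var stream = new MemoryStream())
        {
            // Copy into an expandable stream so the package can grow.
            stream.Write(templateDoc, 0, templateDoc.Length);

            using (var document = WordprocessingDocument.Open(stream, true))
            {
                Body body = document.MainDocumentPart.Document.Body;

                // Snapshot the children so appending does not affect the loop.
                foreach (var item in body.Elements().ToList())
                {
                    if (!item.InnerText.Contains("<test>"))
                    {
                        continue;
                    }

                    var cell = new TableCell(
                        new TableCellProperties(
                            new TableCellWidth { Type = TableWidthUnitValues.Pct, Width = "50" }),
                        new Paragraph(new Run(new Text("Input 1:"))));
                    body.Append(new Table(new TableRow(cell)));
                }

                document.MainDocumentPart.Document.Save();
            } // Disposing the document flushes the package into the stream.

            // Only now does the stream contain the modified document.
            File.WriteAllBytes(outputPath, stream.ToArray());
        }
    }
}
```

The only structural change from the question is that File.WriteAllBytes runs after the inner using block has closed, rather than inside it.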

Related

Set programmatically created ReportBook as HTML5 ReportSource

A user can select multiple orders, and download all the reports as one PDF.
We used PdfSmartCopy to merge the reports:
protected void Print(int[] order_ids)
{
byte[] merged_reports;
using (MemoryStream ms = new MemoryStream())
using (Document doc = new Document())
using (PdfSmartCopy copy = new PdfSmartCopy(doc, ms))
{
doc.Open();
foreach (int order_id in order_ids)
{
Telerik.Reporting.InstanceReportSource reportSource = new Telerik.Reporting.InstanceReportSource();
reportSource.ReportDocument = new OrderReport();
reportSource.Parameters.Add(new Telerik.Reporting.Parameter("order_id", order_id));
RenderingResult result = new ReportProcessor().RenderReport("PDF", reportSource, new Hashtable());
using (PdfReader reader = new PdfReader(result.DocumentBytes))
{
copy.AddDocument(reader);
}
}
doc.Close();
merged_reports = ms.ToArray();
}
Response.Clear();
Response.Cache.SetCacheability(HttpCacheability.NoCache);
Response.Expires = -1;
Response.Buffer = false;
Response.ContentType = "application/pdf";
Response.OutputStream.Write(merged_reports, 0, merged_reports.Length);
}
But we have started using the HTML5 ReportViewer elsewhere and want to use it here as well, to be consistent. I thought of creating a ReportBook programmatically and setting it as the ReportSource of the ReportViewer, but the only thing I can set is a string. We have used a ReportBook before, but that was an actual SomeReportBook.cs that we could set through new SomeReportBook().GetType().AssemblyQualifiedName;.
Any clues? Here is what I have at the moment:
protected void Print(int[] order_ids)
{
Telerik.Reporting.ReportBook reportBook = new Telerik.Reporting.ReportBook();
foreach (int order_id in order_ids)
{
Telerik.Reporting.InstanceReportSource reportSource = new Telerik.Reporting.InstanceReportSource();
reportSource.ReportDocument = new OrderReport();
reportSource.Parameters.Add(new Telerik.Reporting.Parameter("order_id", order_id));
reportBook.ReportSources.Add(reportSource);
}
this.ReportViewer.ReportSource = new Telerik.ReportViewer.Html5.WebForms.ReportSource()
{
Identifier = // Can't use reportBook.GetType().AssemblyQualifiedName
};
}
I also struggled with this for quite some time, so I would like to share my approach in case someone else faces the same problem.
1. Create a class that inherits from Telerik.Reporting.ReportBook.
2. Create a method in that class that loads all your reports into the report book, i.e.
this.ReportSources.Add(new TypeReportSource
{
TypeName = typeof(Report1).AssemblyQualifiedName
});
3. Call your method in the class constructor.
4. Use the following code to set the report viewer source:
var reportSource = new Telerik.ReportViewer.Html5.WebForms.ReportSource();
reportSource.IdentifierType = IdentifierType.TypeReportSource;
// Or use "namespace.class, assembly", e.g. "MyReports.Report1, MyReportsLibrary"
reportSource.Identifier = typeof(ReportCatalog).AssemblyQualifiedName;
reportSource.Parameters.Add("Parameter1", "Parameter1");
reportSource.Parameters.Add("Parameter2", "Parameter2");
ReportsViewer1.ReportSource = reportSource;
Report1 = Newly created class that inherits from Telerik.Reporting.ReportBook

Unable to read Shapes from DOCX file using Open XML SDK

I have a requirement to parse a DOCX file and extract all of its text and images. I am using Open XML SDK 2.5 for this. I am able to parse the images and text, but the DOCX also contains a group of Shapes, which I am trying to parse and convert to Drawing images, and that gives me wrong results.
Here is the sample DOCX file which I am trying to parse.
I referred to this Stack Overflow discussion and tried the same approach, but with no luck.
The resulting DOCX, which I create with the following code, does not contain any of the parsed images.
using System.Collections.Generic;
using System.Linq;
using System.IO;
using System.Drawing;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using DocumentFormat.OpenXml.Vml;
using DocumentFormat.OpenXml;
namespace ReadGroupShape
{
class Program
{
static List<Bitmap> images = new List<Bitmap>();
static void Main(string[] args)
{
MainDocumentPart mainPart = null;
Body content = null;
WordprocessingDocument newDoc = WordprocessingDocument.Create("NewDocx.docx", WordprocessingDocumentType.Document);
MainDocumentPart newMainPart = newDoc.AddMainDocumentPart();
newMainPart.Document = new Document();
Body newbody = newMainPart.Document.AppendChild(new Body());
byte[] docBytes = File.ReadAllBytes("SampleDoc.docx");
using (MemoryStream ms = new MemoryStream())
{
ms.Write(docBytes, 0, docBytes.Length);
using (WordprocessingDocument wpDoc = WordprocessingDocument.Open(ms, true))
{
mainPart = wpDoc.MainDocumentPart;
content = mainPart.Document.Body;
foreach (Paragraph par in content.Descendants<Paragraph>())
{
Paragraph npar = newbody.AppendChild(new Paragraph());
foreach (Run run in par.Descendants<Run>())
{
Run nrun = npar.AppendChild(new Run());
DocumentFormat.OpenXml.Drawing.Blip pic = run.Descendants<DocumentFormat.OpenXml.Drawing.Blip>().FirstOrDefault();
ImageData imageData = run.Descendants<ImageData>().FirstOrDefault();
if (pic == null && imageData == null)
{
nrun.InsertAfterSelf(run.CloneNode(true));
}
else
{
if (pic != null)
{
nrun.InsertAfterSelf(CreateImageFromBlip(wpDoc, run, newMainPart, pic));
}
else if (imageData != null)
{
nrun.InsertAfterSelf(CreateImageFromShape(wpDoc, run, newMainPart, imageData));
}
}
}
}
mainPart.Document.Save();
}
}
newMainPart.Document.Save();
newDoc.Close();
}
private static Run CreateImageFromShape(WordprocessingDocument sourceDoc, Run sourceRun, MainDocumentPart mainpart, ImageData imageData)
{
ImagePart p = sourceDoc.MainDocumentPart.GetPartById(imageData.RelationshipId) as ImagePart;
return CreateImageRun(sourceDoc, sourceRun, mainpart, p);
}
private static Run CreateImageFromBlip(WordprocessingDocument sourceDoc, Run sourceRun, MainDocumentPart mainpart, DocumentFormat.OpenXml.Drawing.Blip blip)
{
ImagePart newPart = mainpart.AddImagePart(ImagePartType.Png);
ImagePart p = sourceDoc.MainDocumentPart.GetPartById(blip.Embed.Value) as ImagePart;
Bitmap image = new Bitmap(p.GetStream());
using (Stream s = p.GetStream())
{
s.Position = 0;
newPart.FeedData(s);
}
string partId = mainpart.GetIdOfPart(newPart);
Drawing newImage = CreateImage(partId);
return new Run(newImage);
}
private static Run CreateImageRun(WordprocessingDocument sourceDoc, Run sourceRun, MainDocumentPart mainpart, ImagePart p)
{
ImagePart newPart = mainpart.AddImagePart(ImagePartType.Png);
using (Stream s = p.GetStream())
{
s.Position = 0;
newPart.FeedData(s);
}
string partId = mainpart.GetIdOfPart(newPart);
Drawing newImage = CreateImage(partId);
return new Run(newImage);
}
private static Drawing CreateImage(string relationshipId)
{
// Define the reference of the image.
return new Drawing(
new DocumentFormat.OpenXml.Drawing.Wordprocessing.Inline(
new DocumentFormat.OpenXml.Drawing.Wordprocessing.Extent() { Cx = 990000L, Cy = 792000L },
new DocumentFormat.OpenXml.Drawing.Wordprocessing.EffectExtent()
{
LeftEdge = 0L,
TopEdge = 0L,
RightEdge = 0L,
BottomEdge = 0L
},
new DocumentFormat.OpenXml.Drawing.Wordprocessing.DocProperties()
{
Id = (UInt32Value)1U,
Name = "Picture 1"
},
new DocumentFormat.OpenXml.Drawing.Wordprocessing.NonVisualGraphicFrameDrawingProperties(
new DocumentFormat.OpenXml.Drawing.GraphicFrameLocks() { NoChangeAspect = true }),
new DocumentFormat.OpenXml.Drawing.Graphic(
new DocumentFormat.OpenXml.Drawing.GraphicData(
new DocumentFormat.OpenXml.Drawing.Picture(
new DocumentFormat.OpenXml.Drawing.NonVisualPictureProperties(
new DocumentFormat.OpenXml.Drawing.NonVisualDrawingProperties()
{
Id = (UInt32Value)0U,
Name = "New Bitmap Image.jpg"
},
new DocumentFormat.OpenXml.Drawing.NonVisualPictureDrawingProperties()),
new DocumentFormat.OpenXml.Drawing.BlipFill(
new DocumentFormat.OpenXml.Drawing.Blip(
new DocumentFormat.OpenXml.Drawing.BlipExtensionList(
new DocumentFormat.OpenXml.Drawing.BlipExtension()
{
Uri =
"{28A0092B-C50C-407E-A947-70E740481C1C}"
})
)
{
Embed = relationshipId,
CompressionState =
DocumentFormat.OpenXml.Drawing.BlipCompressionValues.Print
},
new DocumentFormat.OpenXml.Drawing.Stretch(
new DocumentFormat.OpenXml.Drawing.FillRectangle())),
new DocumentFormat.OpenXml.Drawing.ShapeProperties(
new DocumentFormat.OpenXml.Drawing.Transform2D(
new DocumentFormat.OpenXml.Drawing.Offset() { X = 0L, Y = 0L },
new DocumentFormat.OpenXml.Drawing.Extents() { Cx = 990000L, Cy = 792000L }),
new DocumentFormat.OpenXml.Drawing.PresetGeometry(
new DocumentFormat.OpenXml.Drawing.AdjustValueList()
)
{ Preset = DocumentFormat.OpenXml.Drawing.ShapeTypeValues.Rectangle }))
)
{ Uri = "http://schemas.openxmlformats.org/drawingml/2006/picture" })
)
{
DistanceFromTop = (UInt32Value)0U,
DistanceFromBottom = (UInt32Value)0U,
DistanceFromLeft = (UInt32Value)0U,
DistanceFromRight = (UInt32Value)0U,
EditId = "50D07946"
});
}
}
}
What am I missing? Could anyone please help me parse the shapes and images?
Thanks.

Create PDF from existing PDF with Azure Storage

I made a bot application with the Microsoft Bot Builder. Now I want to create a PDF file from the user input. The file should be stored in my Azure Storage account.
I have a PDF template which should be copied and modified (this file is already in Azure Storage). It has some text boxes which should be filled with the user input. I have already written the code for that with iTextSharp.
But this code needs a FileStream. Does anybody know how to get a stream for the file in my Azure Storage? Or is there another way to accomplish this?
Edit:
Here is the code where I need the filestream
string fileNameExisting = Path.Combine(Directory.GetCurrentDirectory(), "Some.pdf");
string fileNameNew = @"Path/Some2.pdf";
var inv = new Invention
{
Inventor = new Inventor { Firstname = "TEST!", Lastname= "TEST!" },
Date = DateTime.Now,
Title = "TEST",
Slogan = "TEST!",
Description = "TEST!",
Advantages = "TEST!s",
TaskPosition = "TEST!",
TaskSolution = "TEST!"
};
using (var existingFileStream = new FileStream(fileNameExisting, FileMode.Open))
using (var newFileStream = new FileStream(fileNameNew, FileMode.Create))
{
// Open existing PDF
var pdfReader = new PdfReader(existingFileStream);
// PdfStamper, which will create
var stamper = new PdfStamper(pdfReader, newFileStream);
var form = stamper.AcroFields;
var fieldKeys = form.Fields.Keys;
foreach (string fieldKey in fieldKeys)
{
var props = fieldKey.Split('.');
string t = GetProp(props, inv);
form.SetField(fieldKey, t);
}
stamper.Close();
pdfReader.Close();
}
}
public static string GetProp(string[] classes, object oldObj)
{
var obj = oldObj.GetType().GetProperty(classes[0]).GetValue(oldObj, null);
if(classes.Length>1)
{
classes = classes.Skip(1).ToArray();
return GetProp(classes, obj);
}
Console.WriteLine(obj.ToString());
return obj.ToString();
}
The PdfReader constructor also takes a byte array. You should be able to create the object using something like:
var pdfTemplateBytes = await new WebClient().DownloadDataTaskAsync("https://myaccount.blob.core.windows.net/templates/mytemplate.pdf");
var pdfReader = new PdfReader(pdfTemplateBytes);
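The output side can avoid a FileStream in the same way, since PdfStamper accepts any Stream. Below is a minimal sketch, assuming the template bytes have already been downloaded as shown above. The class PdfFormFiller and its Fill method are hypothetical names, and the dictionary of field values stands in for the reflection-based GetProp lookup from the question.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text.pdf;

public static class PdfFormFiller
{
    // Hypothetical helper: fills AcroForm fields and returns the new PDF as bytes.
    public static byte[] Fill(byte[] templateBytes, IDictionary<string, string> values)
    {
        using (var output = new MemoryStream())
        {
            var reader = new PdfReader(templateBytes);
            var stamper = new PdfStamper(reader, output);

            foreach (var pair in values)
            {
                // Field names must match the names defined in the template.
                stamper.AcroFields.SetField(pair.Key, pair.Value);
            }

            // Closing the stamper finishes writing the PDF to the stream.
            stamper.Close();
            reader.Close();

            // ToArray still works after the MemoryStream has been closed.
            return output.ToArray();
        }
    }
}
```

The returned byte array can then be uploaded to the storage container with whichever storage client the bot already uses; that step is left out here because it depends on the SDK version.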

How to remove plurals in Lucene.NET?

I'm trying to extract some keywords from a text. It works quite well, but I need to remove plurals.
As I'm already using Lucene for search, I'm trying to use it to extract keywords from the indexed terms.
First, I index the document in a RAMDirectory:
RAMDirectory idx = new RAMDirectory();
using (IndexWriter writer =
new IndexWriter(
idx,
new CustomStandardAnalyzer(StopWords.Get(this.Language),
Lucene.Net.Util.Version.LUCENE_30, this.Language),
IndexWriter.MaxFieldLength.LIMITED))
{
writer.AddDocument(createDocument(this._text));
writer.Optimize();
}
Then, I extract the keywords:
var list = new List<KeyValuePair<int, string>>();
using (var reader = IndexReader.Open(directory, true))
{
var tv = reader.GetTermFreqVector(0, "text");
if (tv != null)
{
string[] terms = tv.GetTerms();
int[] freq = tv.GetTermFrequencies();
for (int i = 0; i < terms.Length; i++)
list.Add(new KeyValuePair<int, string>(freq[i], terms[i]));
}
}
In the list of terms I can get both "president" and "presidents".
How can I remove the plural form?
My CustomStandardAnalyzer uses this:
public override TokenStream TokenStream(string fieldName, System.IO.TextReader reader)
{
//create the tokenizer
TokenStream result = new StandardTokenizer(this.version, reader);
//add in filters
result = new Lucene.Net.Analysis.Snowball.SnowballFilter(result, this.getStemmer());
result = new LowerCaseFilter(result);
result = new ASCIIFoldingFilter(result);
result = new StopFilter(true, result, this.stopWords ?? StopWords.English);
return result;
}
So I already use the SnowballFilter (with the correct language specific stemmer).
How could I remove plurals?
My output from the following program is:
text:and
text:presid
text:some
text:text
text:with
class Program
{
private class CustomStandardAnalyzer : Analyzer
{
public override TokenStream TokenStream(string fieldName, System.IO.TextReader reader)
{
//create the tokenizer
TokenStream result = new StandardTokenizer(Lucene.Net.Util.Version.LUCENE_30, reader);
//add in filters
result = new Lucene.Net.Analysis.Snowball.SnowballFilter(result, new EnglishStemmer());
result = new LowerCaseFilter(result);
result = new ASCIIFoldingFilter(result);
result = new StopFilter(true, result, new HashSet<string>());
return result;
}
}
private static Document createDocument(string text)
{
Document d = new Document();
Field f = new Field("text", "", Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.WITH_POSITIONS_OFFSETS);
f.SetValue(text);
d.Add(f);
return d;
}
static void Main(string[] args)
{
RAMDirectory idx = new RAMDirectory();
using (IndexWriter writer =
new IndexWriter(
idx,
new CustomStandardAnalyzer(),
IndexWriter.MaxFieldLength.LIMITED))
{
writer.AddDocument(createDocument("some text with president and presidents"));
writer.Commit();
}
using (var reader = IndexReader.Open(idx, true))
{
var terms = reader.Terms(new Term("text", ""));
if (terms.Term != null)
do
Console.WriteLine(terms.Term);
while (terms.Next());
}
Console.ReadLine();
}
}

OpenXML insert into content control Missing Word.Text

I have written the following code to insert some text into a content control in the footer of a document.
oItem.File.CheckOut();
byte[] byteArray = oItem.File.OpenBinary();
using (MemoryStream mem = new MemoryStream())
{
mem.Write(byteArray, 0, (int)byteArray.Length);
using (WordprocessingDocument wp = WordprocessingDocument.Open(mem, true))
{
Boolean foundInFooter = false;
MainDocumentPart mainPart = wp.MainDocumentPart;
foreach (FooterPart footerPart in mainPart.FooterParts)
{
Word.Footer footer = footerPart.Footer;
foreach (Word.SdtElement sdt in footer.Descendants<Word.SdtElement>().ToList())
{
Word.SdtAlias alias = sdt.Descendants<Word.SdtAlias>().FirstOrDefault();
if (alias.Val.Value == "Revisionsnummer")
{
foundInFooter = true;
if (sdt.Descendants<Word.Text>().FirstOrDefault() != null)
{
sdt.Descendants<Word.Text>().FirstOrDefault().Text = (string)oItem["Version"];
}
}
}
}
}
}
For some reason, sdt.Descendants<Word.Text>().FirstOrDefault() sometimes returns null, so I can't insert the text. Is there any way to add the Word.Text in these cases?
The point of the First/Single/...OrDefault methods is that you can check the result of your expression before using it, e.g.
var obj = sdt.Descendants<Word.Text>().FirstOrDefault();
if(obj!=null)
{
obj.Text = (string)oItem["Version"];
}
else
{
...
}
If you automatically try to assign a value to the result of an OrDefault call, you are setting yourself up for null reference exceptions.
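For the else branch, one option is to create the missing Word.Text inside whatever container the content control already has. This is only a sketch under the assumption that the control is either an inline (run-level) or block-level content control. ContentControlHelper and SetText are hypothetical names, and the Word alias is assumed to match the one used in the question.

```csharp
using System.Linq;
using Word = DocumentFormat.OpenXml.Wordprocessing;

public static class ContentControlHelper
{
    // Hypothetical helper: sets the text of a content control, creating
    // the Text element (and its parents) when the control is empty.
    public static void SetText(Word.SdtElement sdt, string value)
    {
        var text = sdt.Descendants<Word.Text>().FirstOrDefault();
        if (text != null)
        {
            text.Text = value;
            return;
        }

        // A run exists but has no text yet.
        var run = sdt.Descendants<Word.Run>().FirstOrDefault();
        if (run != null)
        {
            run.AppendChild(new Word.Text(value));
            return;
        }

        // A paragraph exists but has no run yet.
        var paragraph = sdt.Descendants<Word.Paragraph>().FirstOrDefault();
        if (paragraph != null)
        {
            paragraph.AppendChild(new Word.Run(new Word.Text(value)));
            return;
        }

        // Inline control with empty content: add a run.
        var runContent = sdt.GetFirstChild<Word.SdtContentRun>();
        if (runContent != null)
        {
            runContent.AppendChild(new Word.Run(new Word.Text(value)));
            return;
        }

        // Block-level control with empty content: add a whole paragraph.
        var blockContent = sdt.GetFirstChild<Word.SdtContentBlock>();
        if (blockContent != null)
        {
            blockContent.AppendChild(
                new Word.Paragraph(new Word.Run(new Word.Text(value))));
        }
    }
}
```

In the loop from the question this would be called as ContentControlHelper.SetText(sdt, (string)oItem["Version"]) once the alias has matched.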
