Convert an image from bytes in xml to word document - c#

I've created an application in c# where an image is inserted into xml by turning it into bytes. How do i then convert this image from the xml into a word document?

This article might help you:
Inserting images into Word documents using XML
The main idea of the article is to construct a WordML (WordProcessingML) document fragment representing the image to be inserted and then calling Word's InsertXML function to place the image in the document.

Related

pdf extract to docx with unicode characters C#

I'm using the itext7 library to convert pdf file to Docx, the file includes Vietnamese text (type Unicode/utf8 ), some parts of it were converted correctly but some were not. Example: "DÇu th¶o méc (L¹c, võng, c¸m...)" stand for "Dầu thảo mộc( lạc, vừng, cám)". So how can I handle this problem? Do I need include some fonts?

PDF extract coordinates and create Nested XML files

I am trying extract all the words(chunks) / characters with coordinate from a searchable text PDF invoice / statement by iTextSharp using C# program , after getting coordinate, create an XML file, then read the XML file plot the data to DataGridView. I have tried some methods like iTestSharp.
iTextSharp extract each character and getRectangle
anyone could suggest a method to create an XML file with the following format XML :
<PDFExtract>
<PageLayout>Style</PageLayout>
<Page>
<Zone>
<Line>
<LOCX>298</LOCX>
<LOCY>199</LOCY>
<LOCW>1859</LOCW>
<LOCH>138</LOCH>
<WD>
<LOCX>298</LOCX>
<LOCY>199</LOCY>
<LOCW>139</LOCW>
<LOCH>69</LOCH>
<T>Start</T>
</WD>
<WD>
<LOCX>476</LOCX>
<LOCY>216</LOCY>
<LOCW>63</LOCW>
<LOCH>55</LOCH>
<T>Bucks</T>
</WD>
</Zone>
</Page>

Rtf to WordML Convert in C#

I have a windows application to generate report.
It has templates in RTF as "{\\rtf1\\ansi\\ansicpg1252\\deff0\\deflang2057{\\fonttbl{\\f0\\fnil\\fcharset0 Arial;}}\r\n\\viewkind4\\uc1\\pard\\fs20\\tab\\tab\\tab\\tab af\\par\r\n}\r\n", which is written to word doc file. then the word is Saved-As XML and close. Then, tags like (say) are extracted and some new
The problem here is Word, which is used as converter in the process and it consumes valuable time in Loop, where it opens word instance, save, close, delete.
Please correct any mistake if i have made and help me with an alternative to convert to WordML .
Use Aspose .Words
//your rtf string
string rtfStrx = "{\\rtf1\\ansi\\ansicpg1252\\deff0\\deflang2057{\\fonttbl{\\f0\\fnil\\fcharset0 Arial;}}\r\n\\viewkind4\\uc1\\pard\\fs20\\tab\\tab\\tab\\tab af\\par\r\n}\r\n"
//convert string to bytes for memory stream
byte[] rtfBytex = Encoding.UTF8.GetBytes(rtfStrx);
MemoryStream rtfStreamx = new MemoryStream(rtfBytex);
Document rtfDocx = new Document(rtfStreamx);
rtfDocx.Save(#"C:\Temp.xml", SaveFormat.WordML);
This saves your RTF text in new document as WordML. I cannot say about time it will take in loop. But it will surely have much less time then MS Word being physically opened and closed.
Unless I am missing something, I assume that you are trying to create Office XML file from RTF template? I think you can use Open XML SDK for creation of the xml file. Specifically, DocumentReflector that comes with that SDK seems to a good fit for that. See this example. Also, there is a http://www.codeguru.com/cpp/controls/richedit/conversions/article.php/c5377/ which shows how to convert from RTF to HTML that might guide you.
use wpf richtextbox. Rtf => xaml. Since xaml is xml_ use xslt or linq to convert it to your desired xml structure

Insert image into xml file using c#

I've looked everywhere for the answer to this question but cant find anything so hoping you guys can help me on here.
Basically I want to insert an image into an element in xml document that i have using c#
I understand i have to turn it into bytes but im unsure of how to do this and then insert it into the correct element...
please help as i am a newbie
Read all the bytes into memory using
File.ReadAllBytes().
Convert the bytes to a Base64 string
using Convert.ToBase64String().
Write the Base64 Encoded string to
your element content.
Doneski!
Here's an example in C# for writing and reading images to/from XML.
You can use a CDATA part or simply put all the bytes in their hexadecimal form as a string.
Another option is to use a base64 encoding
The element you use is up to you.
http://www.dreamincode.net/code/snippet1335.htm seems to do exactly what you want to do. It might be something you might want to try out. Note that it is in VB.NET which you can easily convert to C#.
XML can only contain characters, it can't contain an image. There are various ways you can represent an image using characters, for example by encoding the image in PNG and then encoding the PNG in base64; or you could generate an element that contains a link to a URI from where the image can be retrieved. All such conventions have to be agreed between sender and recipient. So before you rush into base64 encoding, check that this is what the recipient expects.

Insert an XML string into an openXML document

I'm trying to replace a text element placeholder with an image in an openXML docx.
I've found a tutorial here which seems to do what I need, but I'm not quite following what he does to insert the image.
Basically, I have an XML 'image template' stored in a string. I can store my image to media folder and insert the image ID into the XML string:
string imageNode
= _xml.Replace("##imageId##", documentMainPart.GetIdOfPart(newImage));
so now I have the correct XML as a string which I need to insert into the document.
I can find my placeholder text node which I want to replace with the new image XML
var placeholder = documentMainPart.Document.Body
.Descendants<DocumentFormat.OpenXml.Wordprocessing.Text>()
.Where(t => t.Text.Contains("##imagePlaceholder##")).First();
But this is where I get stuck. I can't see how to do a replace/insert which will take an XML string. I've managed to get my XML output as text in the document, but I beed to somehow convert it into an XML element.
If you are asking how you import the XML that displays the image then it shouldn't be a big problem.
How you store the image I'm not sure though, but I guess you will have to import it with a proper name somewhere inside the .docx but I'm assuming you know this by reading your post.
Replacing the placeholder with the image xml thingy is easy
var parent = placeholder.Parent;
parent.ReplaceChild(imageXML, placeholder);
Here you are actually replacing the image thingy with the text tag but I can't be sure how that would work. I know that a Image could be within a run tag wich I assume is the parent of your text tag.
Now if your XML you get form your command is correct you should be OK. It should be Drawing/Inline/Graphic root I think.
Please comment If I'm misunderstanding your question
To convert your string representation to an xml node belonging to the xml document, use XmlDocument.CreateFragment:
XmlDocumentFragment docFrag = doc.CreateDocumentFragment();
docFrag.InnerXml = imageXML;
placeholder.Parent.ReplaceChild(docFrag,placeholder);

Categories