How to remove special characters from file metadata in C#

I have been trying to find a solution to this for the last 2 hours. I have searched a lot but didn't find anything (maybe I am searching with the wrong keywords). The problem is that I want to remove file properties which contain special characters. Please check the attached image for what I mean.
I am using ASP.NET FileUpload control and C# as programming language. I want to make sure that any file uploaded does not contain any special characters in its properties.
Please help.
Thanks.

Have you tried looping through Image.PropertyItems?
You can modify the image to remove unwanted details via GetPropertyItem() and SetPropertyItem().
References:
https://msdn.microsoft.com/en-us/library/system.drawing.image.propertyitems(v=vs.110).aspx
https://msdn.microsoft.com/en-us/library/system.drawing.image.setpropertyitem(v=vs.110).aspx
https://msdn.microsoft.com/en-us/library/system.drawing.image.getpropertyitem.aspx
Sample here on Stack Overflow:
Value of image property (C#)
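
A minimal sketch of that loop, under a few assumptions: the class and method names are made up for illustration, the set of allowed characters is only an example rule, and only property items of type 2 (null-terminated ASCII strings) are touched. Properties stored in other encodings, such as the UTF-16 byte arrays Windows Explorer uses for some fields, would need separate handling.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Linq;
using System.Text;

static class ImageMetadataCleaner
{
    // GDI+ property type 2 is a null-terminated ASCII string.
    const short AsciiType = 2;

    // Hypothetical rule: keep ASCII letters, digits, spaces and a little punctuation.
    public static string StripSpecialCharacters(string value)
    {
        return new string(value
            .Where(c => c < 128 && (char.IsLetterOrDigit(c) || " .,-_:/".IndexOf(c) >= 0))
            .ToArray());
    }

    // Rewrites every ASCII text property whose value contains disallowed characters.
    public static void CleanTextProperties(Image image)
    {
        foreach (PropertyItem item in image.PropertyItems)
        {
            if (item.Type != AsciiType || item.Value == null) continue;

            string original = Encoding.ASCII.GetString(item.Value).TrimEnd('\0');
            string cleaned = StripSpecialCharacters(original);
            if (cleaned == original) continue;

            // Store the cleaned text back, including the terminating null byte.
            item.Value = Encoding.ASCII.GetBytes(cleaned + "\0");
            item.Len = item.Value.Length;
            image.SetPropertyItem(item);
        }
    }
}
```

With a FileUpload control, the image could be loaded with Image.FromStream from the posted file's input stream, passed to CleanTextProperties, and then saved to the target path. If a property should be dropped entirely rather than cleaned, Image.RemovePropertyItem(item.Id) is the call to look at.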

Related

Word merge field in header loses value in print preview

I have an ASP WebForms app where I use a Word template that contains merge fields, which I replace with data extracted from the database. The app works great and the Word document is exported, but when I try to print the document, one of the merge fields, which is in the header, loses its value and reverts to the initial merge field name. Is this something that has to be fixed in the application's code, or is this a Word settings issue?
Any help is greatly appreciated.
Thank you!

I have managed to solve this problem using the OpenXML Productivity Tool. It turns out that you can't add a merge field in the header, so what I did was put it inside a textbox. I forgot to mention that part in the initial description. As a result, the text element was buried deep in the Open XML. When I managed to find it and log what was inside, I found that I had inserted the MERGEFIELD <> MERGEFORMAT. Every time I tried to insert the value that I wanted in this <>, it got reset when I hit print preview. So what I did, based on a suggestion from someone who had a similar problem, was delete this textbox and create a clean one where I only entered "Test". It needs to have a string inside so that Open XML creates the Text element (instrTxt).
In C# I did this:
foreach (var hPart in firstDoc.MainDocumentPart.HeaderParts)
{
    foreach (var txt in hPart.Header.Descendants<Text>())
    {
        if (txt.Text == "Test")
        {
            txt.Text = "My custom text";
        }
    }
}
So for each header part (because I can't tell for sure which one it is in, and it could even be in several), I get all descendants of type Text.
I got a few more than I wanted, including a few that contained the page number (since I have it in the header as well). So I added an if to check whether it's the text element that I wanted. Once I found it, I set the text I wanted.
So, long story short, instead of using a merge field in the header, I just used plain text. Perhaps it's not the most efficient way of doing this. The question may still remain whether I could have inserted a merge field in the header and made it work without Word resetting the value on print preview, but this worked for me.

Use OpenXML to replace text in DOCX file - strange content

I'm trying to use the OpenXML SDK and the samples on Microsoft's pages to replace placeholders with real content in Word documents.
It used to work as described here, but after editing the template file in Word adding headers and footers it stopped working. I wondered why and some debugging showed me this:
Which is the content of texts in this piece of code:
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(DocumentFile, true))
{
    var texts = wordDoc.MainDocumentPart.Document.Body.Descendants<Text>().ToList();
}
So what I see here is that the body of the document is "fragmented", even though in Word the content looks like this:
Can somebody tell me how I can get around this?
I have been asked what I'm trying to achieve. Basically I want to replace user defined "placeholders" with real content. I want to treat the Word document like a template. The placeholders can be anything. In my above example they look like {var:Template1}, but that's just something I'm playing with. It could basically be any word.
So for example if the document contains the following paragraph:
Do not use the name USER_NAME
The user should be able to replace the USER_NAME placeholder with the word admin for example, keeping the formatting intact. The result should be
Do not use the name admin
The problem I see with working at paragraph level, concatenating the content and then replacing the content of the paragraph, is that I fear I would lose the formatting that should be kept, as in
Do not use the name admin

Various things can fragment text runs. Most frequently it is proofing markup (as is apparently the case here, where there are "squigglies") or rsid attributes (used to compare documents and track who edited what, when), as well as the "Go back" bookmark Word sets in the background. These become readily apparent if you view the underlying WordOpenXML (using the Open XML SDK Productivity Tool, for example) in the document.xml "part".
It usually helps to go an element level "higher". In this case, get the list of Paragraph descendants and from there get all the Text descendants and concatenate their InnerText.
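
A minimal sketch of going one level higher, assuming the Open XML SDK types used in the question. The class and method names are made up for illustration. It only locates paragraphs whose joined text contains a placeholder; it does not replace anything.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Wordprocessing;

static class ParagraphTextReader
{
    // Joins every Text fragment inside one paragraph, in document order.
    public static string GetFullText(Paragraph paragraph)
    {
        return string.Concat(paragraph.Descendants<Text>().Select(t => t.Text));
    }

    // Finds paragraphs whose joined text contains the placeholder,
    // even when Word has split the placeholder across several runs.
    public static List<Paragraph> FindParagraphsContaining(Body body, string placeholder)
    {
        return body.Descendants<Paragraph>()
                   .Where(p => GetFullText(p).Contains(placeholder))
                   .ToList();
    }
}
```

In the question's code this could be called as FindParagraphsContaining(wordDoc.MainDocumentPart.Document.Body, "USER_NAME"). Headers and footers live in separate parts, so they would need the same treatment on their own root elements.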

OpenXML is indeed fragmenting your text.
I created a library that does exactly this: it renders a Word template with the values from a JSON object.
From the documentation of docxtemplater:
Why you should use a library for this
Docx is a zipped format that contains some xml. If you want to build a simple replace {tag} by value system, it can already become complicated, because the {tag} is internally separated into <w:t>{</w:t><w:t>tag</w:t><w:t>}</w:t>. If you want to embed loops to iterate over an array, it becomes a real hassle.
The library basically will do the following to keep formatting :
If the text is :
<w:t>Hello</w:t>
<w:t>{name</w:t>
<w:t>} !</w:t>
<w:t>How are you ?</w:t>
The result would be :
<w:t>Hello</w:t>
<w:t>John !</w:t>
<w:t>How are you ?</w:t>
You also have to replace the <w:t> tag with <w:t xml:space="preserve"> to ensure that spaces are not stripped out if there are any in your variables.
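
The idea can be sketched in C# on a plain list of strings standing in for the contents of consecutive <w:t> elements. This is not docxtemplater's actual code, and the names are hypothetical. Unlike the example above, it keeps the number of fragments unchanged: the replacement value goes into the first fragment the tag touches, and the rest of the tag is cut out of the following fragments, so each run keeps its own formatting.

```csharp
using System;
using System.Collections.Generic;

static class FragmentReplacer
{
    // Replaces the first occurrence of tag, even when it spans several fragments.
    // Returns true if a replacement was made.
    public static bool ReplaceAcrossFragments(IList<string> fragments, string tag, string value)
    {
        if (string.IsNullOrEmpty(tag)) return false;

        string joined = string.Concat(fragments);
        int start = joined.IndexOf(tag, StringComparison.Ordinal);
        if (start < 0) return false;
        int end = start + tag.Length; // exclusive

        int offset = 0;        // position of the current fragment within the joined text
        bool inserted = false; // the value is written only once
        for (int i = 0; i < fragments.Count; i++)
        {
            string fragment = fragments[i];
            int fragStart = offset;
            int fragEnd = offset + fragment.Length;
            offset = fragEnd;

            if (fragEnd <= start || fragStart >= end) continue; // no overlap with the tag

            // Part of this fragment that belongs to the tag.
            int cutFrom = Math.Max(start, fragStart) - fragStart;
            int cutTo = Math.Min(end, fragEnd) - fragStart;

            string replacement = inserted ? "" : value;
            inserted = true;
            fragments[i] = fragment.Substring(0, cutFrom) + replacement + fragment.Substring(cutTo);
        }
        return true;
    }
}
```

With the Open XML SDK, the list would be filled from the Text.Text values of a paragraph's Text descendants and written back in the same order. Setting Space = SpaceProcessingModeValues.Preserve on each modified Text element corresponds to the xml:space="preserve" attribute mentioned above.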

simple spell checking tool in C#

What I'm trying to achieve is an input field where you can type how you think a word is spelled. It will then search my text file named words.txt, find words with a similar spelling, and put the results into a new window.
Thanks in advance.

This is the one I have used, and it sounds like exactly what you want:
Make similar suggestions for input text by remembering old inputs
You can see it in action in the screen capture video here.
P.S. In one instance I pre-populated a dictionary.dic file to suit, and in the example above I added some other rules around LogParser's SQL-like syntax to provide IntelliSense. HTH
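
For the "similar spelling" part of the question, one common approach is Levenshtein edit distance. That technique is an assumption here, not what the linked tool uses, and the names and the default threshold of 2 are arbitrary. A minimal sketch:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SpellSuggester
{
    // Levenshtein distance: the minimum number of single-character
    // insertions, deletions and substitutions needed to turn a into b.
    public static int EditDistance(string a, string b)
    {
        int[] previous = new int[b.Length + 1];
        int[] current = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1),
                    previous[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] swap = previous; previous = current; current = swap;
        }
        return previous[b.Length];
    }

    // Returns the words within maxDistance of the input, closest first.
    public static List<string> Suggest(string input, IEnumerable<string> words, int maxDistance = 2)
    {
        string needle = input.Trim().ToLowerInvariant();
        return words
            .Select(w => w.Trim())
            .Where(w => w.Length > 0)
            .Select(w => new { Word = w, Distance = EditDistance(needle, w.ToLowerInvariant()) })
            .Where(x => x.Distance <= maxDistance)
            .OrderBy(x => x.Distance)
            .ThenBy(x => x.Word, StringComparer.Ordinal)
            .Select(x => x.Word)
            .ToList();
    }
}
```

The word list could come from System.IO.File.ReadLines("words.txt"), and the returned list could be bound to a list control in the results window.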

Fillable doc files

I have samples of some documents in .doc format. I need to create some "fillable" areas in place of certain values in the samples, and then fill these documents automatically using C#. What do you think? Is that possible? Thanks in advance, guys! P.S.: if you need more information from me, please feel free to ask and I will add it to my question.

Besides simply injecting or replacing text in the document itself, you could also use docvariables. You can define them in your document and then set their values in code.
Using docvariables, you separate the design of the Word document (where the text is shown) from setting the values, which might be useful in your case.
You can certainly manipulate them using C#, but a bit more info using a VBA sample can be found at What is a DOCVARIABLE in word
One little warning when using C# to edit them: if you set the value of a docvariable to "" (empty string), the docvariable is deleted from the document. If you want to keep the docvariable around, set its value to " " (a space).

Yes, this is possible. You can create placeholder areas in your document, which you then search for and change when you access the file. Check these results on how to modify a Word document using C#

Store arbitrary application data in System.Windows.Forms.RichTextBox

Does anyone know of a way to store arbitrary data in a RichTextBox without the user being able to see this data? The 2007 RTF specification includes annotations ("\atnid", "\atnauthor", "\annotation", etc.) but whenever I insert these into a RichTextBox's .Rtf, the annotations disappear (presumably because the RichTextBox doesn't support RTF annotations.) I have a related question about whether it is possible to store the information inside a Metafile image. Either of these solutions would be acceptable. TIA.
What I'm trying is something like this:
// "obj" stands for the object being serialized ("object" itself is a reserved word).
string objectXml = MySerialization.ToXml(obj);
string commentRtfFragment = String.Format(@"{{\*\atnid MyApp}}{{\*\atnauthor MyApp}}{{\*\annotation {0}}}", objectXml);
string imageRtf = String.Format(@"{{\rtf1 {{\pict\wmetafile{0}\picw{1}\pich{2}\picwgoal{3}\pichgoal{4} {5}}}{6}}}",
    PixelMappingMode.MM_ANISOTROPIC, picw, pich, picwgoal, pichgoal, imageHex, commentRtfFragment);
richTextBox.SelectedRtf = imageRtf;
Update: The application metadata ("annotations") must correspond to particular locations in the RTF. There will also be multiple annotations per RichTextBox (or RTF document, if you like). I also want the metadata to persist with the RTF. So while it would be possible to keep the metadata in Control.Tag, I would then have to take care of adding the information to the database myself, notice whenever the user edited the RTF, and somehow determine the new location of the metadata after the edit.

I think the response from AtanDB provides the right solution. You can use \v and \v0 to hide the data in between, and access that hidden data as data specific to that particular location.
I tried it in the RichTextBox: the Rtf property supports it and does not modify the RTF contents by skipping the control code. I had the same problem, luckily ended up on this page, and am now able to have an annotation/comment-like feature for any location in the RTF data.
Thank you very much Carl for your question and AtanDB for your answer.

I don't know if there's any special way for doing this for RTF documents, but if you just want to store some data in a control (any kind of Control) without showing it to the user, you could use the Tag property as can be seen here: Control.Tag

I think ho1 has the right idea. Control.Tag is an object so you could use a generic data structure like List, Hash, Dictionary, etc. to store your multiple annotations and store that in the Tag property.

The rich text control supports hidden text: \v turns hidden on and \v0 turns it off. And no, I have not confused them; even though logically \v would stand for "visible", it does the opposite.
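
A minimal sketch of the \v ... \v0 approach using plain string handling. The class and method names are hypothetical, the payload is assumed to be short ASCII text (an ID or Base64 string, say), and the extraction is a simple pattern match rather than a full RTF parser. A RichTextBox may re-serialize the RTF it is given, so the pattern might need loosening for real control output.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class RtfHiddenData
{
    // Escapes the three characters that have special meaning in RTF.
    static string Escape(string text)
    {
        return text.Replace("\\", "\\\\").Replace("{", "\\{").Replace("}", "\\}");
    }

    // Reverses Escape.
    static string Unescape(string text)
    {
        return Regex.Replace(text, @"\\([\\{}])", "$1");
    }

    // Wraps the payload in a group that turns hidden text on (\v) and off again (\v0).
    public static string BuildHiddenFragment(string payload)
    {
        return @"{\v " + Escape(payload) + @"\v0}";
    }

    // Returns every payload found between "\v " and "\v0", in document order.
    public static List<string> ExtractHiddenPayloads(string rtf)
    {
        var results = new List<string>();
        // The content is a run of escaped specials or non-backslash characters.
        var pattern = new Regex(@"\\v ((?:\\[\\{}]|[^\\])*?)\\v0", RegexOptions.Singleline);
        foreach (Match match in pattern.Matches(rtf))
        {
            results.Add(Unescape(match.Groups[1].Value));
        }
        return results;
    }
}
```

The fragment can be appended to the RTF that is assigned to richTextBox.SelectedRtf at the position the annotation belongs to, and ExtractHiddenPayloads can later be run over richTextBox.Rtf. Because the hidden text is part of the document, it moves with the surrounding text when the user edits.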
