How to read and modify a Hindi .rtf file programmatically? - c#

I have a Hindi RTF file with content like:
कोलकाता, 11 दिसंबर पश्चिम बंगाल के बर्दवान जिले में कक्षा नौ की एक छात्रा ने फांसी लगाकर आत्महत्या कर ली।
In my console application I want to read that RTF file and change some of its content programmatically.
I am using a StreamReader to read the file, but when I convert the content to a string it produces the following output:
ÚUæCþUèØ-SßæS‰Ø
×Âý Ñ Sßæ§Ù Üê •¤è ¼ßæ ÂØæü# ×æ˜æ ×ð´ ãUôÙð •¤æ ¼æßæ
ÖæðÂæÜ, vv ç¼â¢ÕÚ (¥æ§ü°°Ù°â)Ð ×ŠØ Âý¼ðàæ ×ð´ Sßæ§Ù Üê ¥æñÚ ÇðU¢»ê âð ¥Õ Ì•¤ •¤§ü Üæð»æð´ •¤è ×æñÌ ãUæð ¿é•¤è ãñU ¥õÚU ¥SÂÌæÜ ×ð´ ç¿ç•¤ˆâ•¤èØ âéçßÏæ¥ô´ •¤è •¤×è •ð¤ âæÍ-âæÍ ¼ßæ¥ô´ •ð¤ ¥Öæß •ð¤ Öè ¥æÚUæð ܻÌð ÚãðU ãñU¢Ð
SßæS‰Ø çßÖæ» Ùð ãUæÜæ¢ç•¤ ØãU ¼æßæ 畤Øæ ãñU 畤 Úæ…Ø ×ð´ ×æñâ×è ÚUæð», Sßæ§Ù Üê •ð¤ ©Â¿æÚ •ð¤ çÜ° Âý¼ðàæ •ð¤ ¥SÂÌæÜæð´ ×ð´ ¥æßàØ•¤ ¼ßæ¥æð´ •¤æ ÂØæü# ÂýÕ¢Ï ç•¤Øæ »Øæ ãñUÐ
I have also tried the Windows Forms RichTextBox to read the RTF file, but it always reports an invalid file format.
So what is the best way to read and modify an RTF file in C#?
StreamReader sr = new StreamReader(fpath, Encoding.Default, true);
string s1 = sr.ReadToEnd();
sr.Close();
I also tried:
using (System.Windows.Forms.RichTextBox rtBox = new System.Windows.Forms.RichTextBox())
{
// Get the contents of the RTF file. Note that when it is
// stored in the string, it is encoded as UTF-16.
string s = System.IO.File.ReadAllText(fpath);
// Convert the RTF to plain text.
rtBox.Rtf = s; // error file format invalid
string plainText = rtBox.Text;
}

The RichTextBox control can load an RTF file directly. Do not use a StreamReader to read the RTF file, because the raw content contains a lot of RTF control codes in addition to the text.
After loading the file into the RichTextBox, use the Text property to get the plain text of the file.
RichTextBox also has a SaveFile method to save the modified content back to a file.

Related

Path from .rtf file to RichTextBox

I have a problem with RichTextBox in C#.
When I try to load text like "C:\Users\adasal\Desktop\raporty_handel\rpt\rtf\bruegman.rtf" from an .rtf file into a RichTextBox, I get something like "C:_handel.rtf".
This code is written in the Active Reports console.
My code:
string resoult = "C:\\Users\\adasal\\Desktop\\raporty_handel\\rpt\\rtf\\bruegman.rtf";
System.IO.FileStream rtfCreate = System.IO.File.Create(resoult);
System.Byte[] info = new System.Text.UTF8Encoding(true).GetBytes(resoult);
rtfCreate.Write(info, 0, info.Length);
rtfCreate.Close();
System.IO.FileStream streamRTF = new System.IO.FileStream(resoult,
System.IO.FileMode.Open, System.IO.FileAccess.Read);
this.RichTextBox1.Load(streamRTF, RichTextType.Rtf);
Can someone help? I want to show the whole path on the report.

You have to escape the '\' characters, which have a special meaning in RTF.
For example:
public void ActiveReport_ReportStart()
{
string resoult = "C:\\Users\\adasal\\Desktop\\raporty_handel\\rpt\\rtf\\bruegman.rtf";
this.RichTextBox1.RTF = resoult.Replace("\\", "\\\\");
}
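The Replace call above handles backslashes only. In RTF, '{' and '}' are also special, and non-ASCII characters (such as the Hindi text in the first question) are normally written as \uN escapes followed by a fallback character. A minimal sketch of a more general escaping helper follows; the class and method names (RtfText, Escape) are hypothetical and not part of any library, and the '?' fallback assumes the default of one fallback character per \uN escape.

```csharp
using System.Globalization;
using System.Text;

// Hypothetical helper (not a library API): escapes plain text so it can be
// embedded as literal text inside an RTF document.
public static class RtfText
{
    public static string Escape(string text)
    {
        var sb = new StringBuilder(text.Length + 16);
        foreach (char c in text)
        {
            if (c == '\\' || c == '{' || c == '}')
            {
                // The three characters with special meaning in RTF.
                sb.Append('\\').Append(c);
            }
            else if (c > 127)
            {
                // Non-ASCII: \uN takes a signed 16-bit decimal value,
                // followed by a fallback character for older readers.
                sb.Append("\\u")
                  .Append(((short)c).ToString(CultureInfo.InvariantCulture))
                  .Append('?');
            }
            else
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

With this sketch, the assignment in the answer could be written as this.RichTextBox1.RTF = RtfText.Escape(resoult); instead of the single Replace call.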

How would I save an RTF file from clipboard in C#?

I'm attempting to dump some RTF from the clipboard to a file.
Essentially, if the application sees that the user has RTF in the clipboard when they paste, it dumps that RTF to a file that was specified earlier.
The code that I was trying to use to do this is as follows:
private void saveTextLocal(bool plainText = true)
{
object clipboardGetData = Clipboard.GetData(DataFormats.Rtf);
string fileName = filename();
using (FileStream fs = File.Create(fileLoc)) { };
File.WriteAllBytes(fileLoc, ObjectToByteArray(clipboardGetData));
}
private byte[] ObjectToByteArray(Object obj)
{
if (obj == null)
{
return null;
}
BinaryFormatter bf = new BinaryFormatter();
MemoryStream ms = new MemoryStream();
bf.Serialize(ms, obj);
return ms.ToArray();
}
This almost works, producing the following as the content of the file:
ÿÿÿÿ ‰{\rtf1\ansi\deff0\deftab480
{\fonttbl
{\f000 Courier New;}
{\f001 Courier New;}
{\f002 Courier New;}
{\f003 Courier New;}
}
{\colortbl
\red128\green128\blue128;
\red255\green255\blue255;
\red000\green000\blue128;
\red255\green255\blue255;
\red000\green000\blue000;
\red255\green255\blue255;
\red000\green000\blue000;
\red255\green255\blue255;
}
\f0\fs20\cb7\cf6 \highlight5\cf4 Console\highlight3\cf2\b .\highlight5\cf4\b0 WriteLine\highlight3\cf2\b (\highlight1\cf0\b0 "pie!"\highlight3\cf2\b )}
Which does appear to be almost right. Opening the file I'm copying in Notepad++ looks like this:
{\rtf1\ansi\deff0\nouicompat{\fonttbl{\f0\fnil Courier New;}}
{\colortbl ;\red0\green0\blue0;\red255\green255\blue255;\red0\green0\blue128;\red128\green128\blue128;}
{\*\generator Riched20 6.2.9200}\viewkind4\uc1
\pard\cf1\highlight2\f0\fs20\lang2057 Console\cf3\b .\cf1\b0 WriteLine\cf3\b (\cf4\b0 "pie!"\cf3\b )\cf1\b0\par
}
Did I do something obviously wrong, and if so, how would I amend my code to fix it?
Thanks in advance!

The issue was, as madamission quite rightly pointed out, that RTF is ASCII text, not binary, so running it through a binary serializer was wholly the wrong direction.
Instead, I cast the clipboard data object to a string and wrote it out as I would for a normal text file. This produced the file I was expecting. The following is the working code for anyone who might find this:
private void saveTextLocal(bool plainText = true)
{
// First, cast the clipboard contents to string. Remember to specify the DataFormat!
string clipboardGetData = (string)Clipboard.GetData(DataFormats.Rtf);
// This is irrelevant to the question; in my method it generates a unique file name
string fileName = filename();
// Start a StreamWriter pointed at the destination file
using (StreamWriter writer = File.CreateText(filePath + ".rtf"))
{
// Write the entirety of the clipboard to that file
writer.Write(clipboardGetData);
}
// The using block closes the StreamWriter
}
I think RTF is ASCII text rather than binary, so you should use a TextWriter instead of the BinaryFormatter.
There are some related solutions here: How to create RTF from plain text (or string) in C#?

Memory leak when using a C# DLL from MFC code

I want to create a parser that reads the text of an RTF file,
so I created a DLL in C# using RichTextBox.
I then generated a type library (.tlb) from the DLL
and call it from C++.
However, it leaks memory when called in a loop, and the application's memory usage keeps increasing.
I am attaching both code snippets.
Please help me.
Thank you
public string Convert(string strRTFTxt)
{
string path = strRTFTxt;
//Create the RichTextBox. (Requires a reference to System.Windows.Forms.)
System.Windows.Forms.RichTextBox rtBox = new System.Windows.Forms.RichTextBox();
// Get the contents of the RTF file. When the contents of the file are
// stored in the string (rtfText), the contents are encoded as UTF-16.
string rtfText = System.IO.File.ReadAllText(path);
// Display the RTF text. This should look like the contents of your file.
//System.Windows.Forms.MessageBox.Show(rtfText);
// Use the RichTextBox to convert the RTF code to plain text.
rtBox.Rtf = rtfText;
string plainText = rtBox.Text;
rtBox.Clear();
rtBox.Dispose();
//System.Windows.Forms.MessageBox.Show(plainText);
// Output the plain text to a file, encoded as UTF-8.
//System.IO.File.WriteAllText(@"output.txt", plainText);
return plainText;
}
The C++ code where I read the text from the DLL:
long lResult =0;
std::string str ;
char* tcFileName = new char[m_strFilePath.GetLength()+1];
USES_CONVERSION;
strcpy (tcFileName, T2A (m_strFilePath));
printf("char * text: %s\n", tcFileName);
BSTR bstrText = _com_util::ConvertStringToBSTR(tcFileName);
BSTR bstrReturnValue;
wprintf(L"BSTR text: %s\n", bstrText);
iRTFConverter->Convert(bstrText,&bstrReturnValue);
//iRTFConverter->
CString strCIFContent(bstrReturnValue);
ParseString(strCIFContent);
delete []tcFileName;
tcFileName = NULL;
iRTFConverter->Release();
SysFreeString(bstrText);
bstrText = NULL;
SysFreeString(bstrReturnValue);

c# read html file and convert to pdf

I convert small html strings to pdf like this:
// set a path to where you want to write the PDF to.
string sPathToWritePdfTo = @"path\new_pdf.pdf";
System.Text.StringBuilder sbHtml = new System.Text.StringBuilder();
sbHtml.Append("<html>");
sbHtml.Append("<body>");
sbHtml.Append("<font size='14'> my first pdf</font>");
sbHtml.Append("<br />");
sbHtml.Append("this is my pdf!!!!");
sbHtml.Append("</body>");
sbHtml.Append("</html>");
// create file stream to PDF file to write to
using (System.IO.Stream stream = new System.IO.FileStream
(sPathToWritePdfTo, System.IO.FileMode.OpenOrCreate))
{
// create new instance of Pdfizer
Pdfizer.HtmlToPdfConverter htmlToPdf = new Pdfizer.HtmlToPdfConverter();
// open stream to write Pdf to to
htmlToPdf.Open(stream);
// write the HTML to the component
htmlToPdf.Run(sbHtml);
// close the write operation and complete the PDF file
htmlToPdf.Close();
}
I wonder whether I can do the same conversion for large HTML strings without using the Append method. I tried this line:
string sbHtml=File.ReadAllText("mypath/pdf.html");
instead of this line:
System.Text.StringBuilder sbHtml = new System.Text.StringBuilder();
but it didn't work. I got an exception on this line:
htmlToPdf.Run(sbHtml);
"XmlException was unhandled by user code"
I should also mention that the path I read the HTML file from is on my own PC.
It's not on a server or anywhere else. I would like answers for both kinds of path.
If the converter has an overload that takes a string, you can simply use:
htmlToPdf.Run(File.ReadAllText(@"mypath/pdf.html"));
If not, and it accepts only a StringBuilder:
System.Text.StringBuilder sbHtml = new System.Text.StringBuilder();
sbHtml.Append(File.ReadAllText(@"mypath/pdf.html"));
Would this help?
System.Text.StringBuilder sbHtml = new System.Text.StringBuilder();
sbHtml.Append(File.ReadAllText("mypath/pdf.html"));
In regards to the exception, make sure the HTML is valid XHTML. PDFizer requires valid XHTML.
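Since the reported error is an XmlException, one way to find the offending markup before handing it to the converter is to parse the file as XML first. The following is a minimal sketch using only the .NET XML classes; the class and method names (XhtmlCheck, IsWellFormedXml) are hypothetical, and it checks well-formedness only, not full XHTML validity.

```csharp
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper (not part of Pdfizer): reports whether markup parses
// as well-formed XML, which is a precondition for valid XHTML.
public static class XhtmlCheck
{
    public static bool IsWellFormedXml(string markup, out string error)
    {
        try
        {
            XDocument.Parse(markup);
            error = string.Empty;
            return true;
        }
        catch (XmlException ex)
        {
            // The message includes the line and position of the problem,
            // e.g. an unclosed <br> or a mismatched end tag.
            error = ex.Message;
            return false;
        }
    }
}
```

Calling XhtmlCheck.IsWellFormedXml(File.ReadAllText(@"mypath/pdf.html"), out string error) before Run would show where the HTML file stops being well-formed, for example an unclosed tag that browsers tolerate but an XML parser rejects.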

Formatting a text file, how to update the file after I finished parsing it?

How would I open a file, perform some regex on the file, and then save the file?
I know I can open a file, read line by line, but how would I update the actual contents of a file and then save the file?
The following approach works regardless of file size, and will not corrupt the original file in any way if the operation fails before it is complete:
string inputFile = Path.Combine(Environment.GetFolderPath(
Environment.SpecialFolder.MyDocuments), "temp.txt");
string outputFile = Path.Combine(Environment.GetFolderPath(
Environment.SpecialFolder.MyDocuments), "temp2.txt");
using (StreamReader input = File.OpenText(inputFile))
using (Stream output = File.OpenWrite(outputFile))
using (StreamWriter writer = new StreamWriter(output))
{
while (!input.EndOfStream)
{
// read line
string line = input.ReadLine();
// process line in some way
// write the line to the temp file
writer.WriteLine(line);
}
}
File.Delete(inputFile); // delete original file
File.Move(outputFile, inputFile); // rename temp file to original file name
string[] lines = File.ReadAllLines(path);
string[] transformedLines = lines.Select(s => Transform(s)).ToArray();
File.WriteAllLines(path, transformedLines);
Here, for example, Transform is
public static string Transform(string s) {
return s.Substring(0, 1) + Char.ToUpper(s[1]) + s.Substring(2);
}
Open the file for read. Read all the contents of the file into memory. Close the file. Open the file for write. Write all contents to the file. Close the file.
Alternatively, if the file is very large:
Open fileA for read. Open a new file (fileB) for write. Process each line of fileA and save to fileB. Close fileA. Close fileB. Delete fileA. Rename fileB to fileA.
Close the file after you finish reading it
Reopen the file for write
Write back the new contents
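
The question asks specifically about applying a regex, so here is a minimal sketch that combines the steps above: read the whole file, apply Regex.Replace, write the result to a temporary file, then swap it in for the original. The class name, method name, and the ".tmp" suffix are assumptions for illustration, and reading the whole file into memory assumes it is reasonably small.

```csharp
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper: applies a regex replacement to a file's contents.
public static class FileRegex
{
    public static void ReplaceInFile(string path, string pattern, string replacement)
    {
        // Read the entire file; the handle is closed when ReadAllText returns.
        string original = File.ReadAllText(path);

        // Apply the regex to the whole text.
        string updated = Regex.Replace(original, pattern, replacement);

        // Write to a temporary file first so that a failure while writing
        // leaves the original untouched. The ".tmp" suffix is an assumption.
        string tempPath = path + ".tmp";
        File.WriteAllText(tempPath, updated);

        // Swap the temporary file in for the original.
        File.Delete(path);
        File.Move(tempPath, path);
    }
}
```

For example, FileRegex.ReplaceInFile(path, @"\d+", "#") would replace every run of digits in the file with a single '#'. Note that File.WriteAllText writes UTF-8 by default, so a file in another encoding would need the overloads that take an Encoding.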
