Memory leakage while using C# dll into MFC code - c#

I want to create a parser that reads the text of an RTF file.
So I created a DLL in C# using RichTextBox.
I then converted the DLL to a TLB
and call it from C++.
But it leaks memory when called in a loop, and the application's memory keeps increasing.
I am attaching both code snippets.
Please help me.
Thank you
public string Convert(string strRTFTxt)
{
string path = strRTFTxt;
//Create the RichTextBox. (Requires a reference to System.Windows.Forms.)
System.Windows.Forms.RichTextBox rtBox = new System.Windows.Forms.RichTextBox();
// Get the contents of the RTF file. When the contents of the file are
// stored in the string (rtfText), the contents are encoded as UTF-16.
string rtfText = System.IO.File.ReadAllText(path);
// Display the RTF text. This should look like the contents of your file.
//System.Windows.Forms.MessageBox.Show(rtfText);
// Use the RichTextBox to convert the RTF code to plain text.
rtBox.Rtf = rtfText;
string plainText = rtBox.Text;
rtBox.Clear();
rtBox.Dispose();
//System.Windows.Forms.MessageBox.Show(plainText);
// Output the plain text to a file, encoded as UTF-8.
//System.IO.File.WriteAllText(@"output.txt", plainText);
return plainText;
}
C++ code where I read the text from the DLL:
long lResult =0;
std::string str ;
char* tcFileName = new char[m_strFilePath.GetLength()+1];
USES_CONVERSION;
strcpy (tcFileName, T2A (m_strFilePath));
printf("char * text: %s\n", tcFileName);
BSTR bstrText = _com_util::ConvertStringToBSTR(tcFileName);
BSTR bstrReturnValue;
wprintf(L"BSTR text: %s\n", bstrText);
iRTFConverter->Convert(bstrText,&bstrReturnValue);
//iRTFConverter->
CString strCIFContent(bstrReturnValue);
ParseString(strCIFContent);
delete []tcFileName;
tcFileName = NULL;
iRTFConverter->Release();
SysFreeString(bstrText);
bstrText = NULL;
SysFreeString(bstrReturnValue);

Related

How to remove BOM from an encoded base64 UTF string?

I have a file encoded in base64 using openssl base64 -in en -out en1 in a command line in MacOS and I am reading this file using the following code:
string fileContent = File.ReadAllText(Path.Combine(AppContext.BaseDirectory, MConst.BASE_DIR, "en1"));
var b1 = Convert.FromBase64String(fileContent);
var str1 = System.Text.Encoding.UTF8.GetString(b1);
The string I am getting has a ? before the actual file content. I am not sure what's causing this, any help will be appreciated.
Example Input:
import pandas
import json
Encoded file example:
77u/DQppbXBvcnQgY29ubmVjdG9yX2FwaQ0KaW1wb3J0IGpzb24NCg0K
Output based on the C# code:
?import pandas
import json
Normally, when you read UTF-8 (with a BOM) from a text file, the decoding is handled for you behind the scenes. For example, both of the following lines will read UTF-8 text correctly regardless of whether or not the text file has a BOM:
File.ReadAllText(path, Encoding.UTF8);
File.ReadAllText(path); // UTF8 is the default.
The problem is that you're dealing with UTF text that has been encoded to a Base64 string. So, ReadAllText() can no longer handle the BOM for you. You can either do it yourself by (checking and) removing the first 3 bytes from the byte array or delegate that job to a StreamReader, which is exactly what ReadAllText() does:
var bytes = Convert.FromBase64String(fileContent);
string finalString = null;
using (var ms = new MemoryStream(bytes))
using (var reader = new StreamReader(ms)) // Or:
// using (var reader = new StreamReader(ms, Encoding.UTF8))
{
finalString = reader.ReadToEnd();
}
// Proceed to using finalString.
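The first option mentioned above, checking for and removing the three BOM bytes yourself, could look roughly like the sketch below. It is written in Java rather than C# purely as an illustration (the byte check is the same in either language). The names BomStripper, decodeBase64Utf8 and hasUtf8Bom are made up for this example, and it assumes the decoded bytes are UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class BomStripper {
    // UTF-8 byte order mark: EF BB BF
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /** Decodes Base64 text and returns it as a string without a leading UTF-8 BOM. */
    static String decodeBase64Utf8(String base64) {
        // The MIME decoder ignores line breaks, which wrapped Base64 files usually contain.
        byte[] bytes = Base64.getMimeDecoder().decode(base64);
        int offset = hasUtf8Bom(bytes) ? UTF8_BOM.length : 0;
        return new String(bytes, offset, bytes.length - offset, StandardCharsets.UTF_8);
    }

    /** True if the array starts with the three UTF-8 BOM bytes. */
    static boolean hasUtf8Bom(byte[] bytes) {
        return bytes.length >= UTF8_BOM.length
                && bytes[0] == UTF8_BOM[0]
                && bytes[1] == UTF8_BOM[1]
                && bytes[2] == UTF8_BOM[2];
    }
}
```

Applied to the encoded example from the question, decodeBase64Utf8 should return text that no longer starts with U+FEFF, while input without a BOM passes through unchanged.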

Removing char hex codes from string

I have a folder named Folderć, which contains smth.jpg. Because the folder has the letter ć in its name, the file path is saved in the database as Folder%C0%01%/smth.jpg.
The letter ć is saved as a hex code. This is not a problem when previewing the image on the website.
The problem occurs when I try to create a subfolder in Folderć via a C# function. The function gets the file path string, finds the folder name and creates a subfolder in it. Because my string contains the hex code instead of the letter ć, the function can't find that path and thus can't create the subfolder.
The string is in UTF-8, so changing the encoding doesn't change anything.
Does anyone know where the problem is and how to solve it?
You can encode the name with Base64:
public string ToBase64String(string text)
{
byte[] data = Encoding.UTF8.GetBytes(text);
return Convert.ToBase64String(data);
}
and use it as string myEncFolderPath = ToBase64String(myFolderPath); before saving the string in the DB.
After you receive the string from the DB, you can decode it back to a normal string with:
public string FromBase64String(string base64)
{
byte[] data = Convert.FromBase64String(base64);
return Encoding.UTF8.GetString(data);
}
by using string myFolderPath = FromBase64String(myEncFolderPath);.
That way, you can save strings freely in the DB.
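A different angle, not part of the answer above: the %-sequences in the question look like URL percent-encoding, so decoding the stored value before touching the file system may also be worth trying. The sketch below is a Java illustration under that assumption (in .NET, Uri.UnescapeDataString plays a similar role). PathDecoder and decodePath are hypothetical names, and the sample value Folder%C4%87/smth.jpg is an assumed well-formed UTF-8 encoding of ć, not the exact string from the question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class PathDecoder {
    /**
     * Turns a percent-encoded path back into its readable form.
     * Assumes the escapes are UTF-8; malformed escapes (such as a bare '%')
     * make URLDecoder throw IllegalArgumentException.
     */
    static String decodePath(String stored) {
        try {
            // URLDecoder treats '+' as a space, so protect literal plus signs first.
            return URLDecoder.decode(stored.replace("+", "%2B"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

The decoded string can then be used as an ordinary path when looking up the folder or creating a subfolder.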

How to Read and modify Hindi .rtf file programmatically?

I have a Hindi RTF file with content like:
कोलकाता, 11 दिसंबर पश्चिम बंगाल के बर्दवान जिले में कक्षा नौ की एक छात्रा ने फांसी लगाकर आत्महत्या कर ली।
In my console application I want to read that RTF file and change some content programmatically.
I am using StreamReader to read the file, but when converting it to a string it produces the following output:
ÚUæCþUèØ-SßæS‰Ø
×Âý Ñ Sßæ§Ù Üê •¤è ¼ßæ ÂØæü# ×æ˜æ ×ð´ ãUôÙð •¤æ ¼æßæ
ÖæðÂæÜ, vv ç¼â¢ÕÚ (¥æ§ü°°Ù°â)Ð ×ŠØ Âý¼ðàæ ×ð´ Sßæ§Ù Üê ¥æñÚ ÇðU¢»ê âð ¥Õ Ì•¤ •¤§ü Üæð»æð´ •¤è ×æñÌ ãUæð ¿é•¤è ãñU ¥õÚU ¥SÂÌæÜ ×ð´ ç¿ç•¤ˆâ•¤èØ âéçßÏæ¥ô´ •¤è •¤×è •ð¤ âæÍ-âæÍ ¼ßæ¥ô´ •ð¤ ¥Öæß •ð¤ Öè ¥æÚUæð ܻÌð ÚãðU ãñU¢Ð
SßæS‰Ø çßÖæ» Ùð ãUæÜæ¢ç•¤ ØãU ¼æßæ 畤Øæ ãñU 畤 Úæ…Ø ×ð´ ×æñâ×è ÚUæð», Sßæ§Ù Üê •ð¤ ©Â¿æÚ •ð¤ çÜ° Âý¼ðàæ •ð¤ ¥SÂÌæÜæð´ ×ð´ ¥æßàØ•¤ ¼ßæ¥æð´ •¤æ ÂØæü# ÂýÕ¢Ï ç•¤Øæ »Øæ ãñUÐ
I have tried the Windows Forms RichTextBox to read the RTF file, but it always shows Invalid File Format.
So what is the best way to read and modify an RTF file in C#?
StreamReader sr = new StreamReader(fpath, Encoding.Default, true);
string s1 = sr.ReadToEnd();
sr.Close();
I also tried:
using (System.Windows.Forms.RichTextBox rtBox = new System.Windows.Forms.RichTextBox())
{
// Get the contents of the RTF file. Note that when it is
// stored in the string, it is encoded as UTF-16.
string s = System.IO.File.ReadAllText(fpath);
// Convert the RTF to plain text.
rtBox.Rtf = s; // error file format invalid
string plainText = rtBox.Text;
}
The RichTextBox control can load an RTF file directly (via its LoadFile method). Do not use StreamReader to read the RTF file, because it can contain many control characters.
After loading the file to the RichTextBox, use the Text property to get the plain text of the file.
RichTextBox also has a SaveFile method to save the modified content to a file.

How would I save an RTF file from clipboard in C#?

I'm attempting to dump some RTF from the clipboard to a file.
Essentially, if the application sees that the user has RTF in the clipboard when they paste, it dumps that RTF to a file that was specified earlier.
The code that I was trying to use to do this is as follows:
private void saveTextLocal(bool plainText = true)
{
object clipboardGetData = Clipboard.GetData(DataFormats.Rtf);
string fileName = filename();
using (FileStream fs = File.Create(fileLoc)) { };
File.WriteAllBytes(fileLoc, ObjectToByteArray(clipboardGetData));
}
private byte[] ObjectToByteArray(Object obj)
{
if (obj == null)
{
return null;
}
BinaryFormatter bf = new BinaryFormatter();
MemoryStream ms = new MemoryStream();
bf.Serialize(ms, obj);
return ms.ToArray();
}
This appears to almost work, producing the following information as the file:
ÿÿÿÿ ‰{\rtf1\ansi\deff0\deftab480
{\fonttbl
{\f000 Courier New;}
{\f001 Courier New;}
{\f002 Courier New;}
{\f003 Courier New;}
}
{\colortbl
\red128\green128\blue128;
\red255\green255\blue255;
\red000\green000\blue128;
\red255\green255\blue255;
\red000\green000\blue000;
\red255\green255\blue255;
\red000\green000\blue000;
\red255\green255\blue255;
}
\f0\fs20\cb7\cf6 \highlight5\cf4 Console\highlight3\cf2\b .\highlight5\cf4\b0 WriteLine\highlight3\cf2\b (\highlight1\cf0\b0 "pie!"\highlight3\cf2\b )}
Which does appear to be almost right. Opening the file I'm copying in Notepad++ looks like this:
{\rtf1\ansi\deff0\nouicompat{\fonttbl{\f0\fnil Courier New;}}
{\colortbl ;\red0\green0\blue0;\red255\green255\blue255;\red0\green0\blue128;\red128\green128\blue128;}
{\*\generator Riched20 6.2.9200}\viewkind4\uc1
\pard\cf1\highlight2\f0\fs20\lang2057 Console\cf3\b .\cf1\b0 WriteLine\cf3\b (\cf4\b0 "pie!"\cf3\b )\cf1\b0\par
}
Did I do something obviously wrong, and if so - how would I amend my code to fix it?
Thanks in advance!
The issue was, as madamission quite rightly pointed out, that RTF is ASCII - not binary, and thus running it through a binary converter was wholly the wrong direction.
Instead, I did a cast of the clipboard data object to get it into a string, and I wrote as you would for a normal text file. This produced the file I was expecting. The following is the working code for anyone who might find this:
private void saveTextLocal(bool plainText = true)
{
//First, cast the clipboard contents to string. Remember to specify DataFormat!
string clipboardGetData = (string)Clipboard.GetData(DataFormats.Rtf);
//This is irrelevant to the question, in my method it generates a unique filename
string fileName = filename();
//Start a StreamWriter pointed at the destination file
using (StreamWriter writer = File.CreateText(filePath + ".rtf"))
{
//Write the entirety of the clipboard to that file
writer.Write(clipboardGetData);
}
//The using block disposes (and closes) the StreamWriter when it exits
}
I think RTF is plain ASCII text, not binary, so you should use a TextWriter rather than the BinaryFormatter.
There are some related solutions here: How to create RTF from plain text (or string) in C#?

Base64 decode in C# or Java

I have a Base64-encoded object with the following header:
application/x-xfdl;content-encoding="asc-gzip"
What is the best way to proceed in decoding the object? Do I need to strip the first line? Also, if I turn it into a byte array (byte[]), how do I un-gzip it?
Thanks!
I think I misspoke initially. By saying the header was
application/x-xfdl;content-encoding="asc-gzip"
I meant this was the first line of the file. So, in order to use the Java or C# libraries to decode the file, does this line need to be stripped?
If so, what would be the simplest way to strip the first line?
To decode the Base64 content in C# you can use the Convert Class static methods.
byte[] bytes = Convert.FromBase64String(base64Data);
You can also use the GZipStream Class to help deal with the GZipped stream.
Another option is SharpZipLib. This will allow you to extract the original data from the compressed data.
I was able to use the following code to convert an .xfdl document into a Java DOM Document.
I used iHarder's Base64 utility to do the Base64 Decode.
private static final String FILE_HEADER_BLOCK =
"application/vnd.xfdl;content-encoding=\"base64-gzip\"";
public static Document OpenXFDL(String inputFile)
throws IOException,
ParserConfigurationException,
SAXException
{
try{
//create file object
File f = new File(inputFile);
if(!f.exists()) {
throw new IOException("Specified File could not be found!");
}
//open file stream from file
FileInputStream fis = new FileInputStream(inputFile);
//Skip past the MIME header
fis.skip(FILE_HEADER_BLOCK.length());
//Decompress from base 64
Base64.InputStream bis = new Base64.InputStream(fis,
Base64.DECODE);
//UnZIP the resulting stream
GZIPInputStream gis = new GZIPInputStream(bis);
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(gis);
gis.close();
bis.close();
fis.close();
return doc;
}
catch (ParserConfigurationException pce) {
throw new ParserConfigurationException("Error parsing XFDL from file.");
}
catch (SAXException saxe) {
throw new SAXException("Error parsing XFDL into XML Document.");
}
}
Still working on successfully modifying and re-encoding the document.
Hope this helps.
In Java, you can use the Apache Commons Base64 class
String decodedString = new String(Base64.decodeBase64(encodedBytes));
It sounds like you're dealing with data that is both gzipped and Base64-encoded. Once you strip off any MIME headers, you should convert the Base64 data to a byte array using something like Apache Commons Codec. You can then wrap the byte[] in a ByteArrayInputStream object and pass that to a GZIPInputStream, which will let you read the uncompressed data.
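The steps described above (drop the header line, Base64-decode, then gunzip) can be sketched with the JDK alone, using java.util.Base64 (Java 8 and later) in place of Commons Codec. This is a minimal sketch under stated assumptions: the header occupies exactly the first line, the rest of the file is plain Base64 of a gzip stream, and the uncompressed content is UTF-8 text. XfdlDecoder and decode are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;

final class XfdlDecoder {
    /** Skips the first line, then Base64-decodes and gunzips the remainder. */
    static String decode(String fileContent) throws IOException {
        // indexOf returns -1 when there is no line break, so the whole input is
        // then treated as the body.
        String body = fileContent.substring(fileContent.indexOf('\n') + 1);

        // The MIME decoder ignores line breaks inside the Base64 body.
        byte[] compressed = Base64.getMimeDecoder().decode(body);

        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n); // write only the bytes actually read
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

Reading the whole file into a string first keeps the header handling simple; for large documents, streaming from the file would avoid holding everything in memory.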
For Java, have you tried Java's built-in java.util.zip package? Alternatively, Apache Commons has the Commons Compress library to work with zip, tar and other compressed file types. As for decoding Base64, there are several open source libraries, or you can use Sun's sun.misc.BASE64Decoder class.
Copied from elsewhere, for Base64 I link to commons-codec-1.6.jar:
public static String decode(String input) throws Exception {
byte[] bytes = Base64.decodeBase64(input);
BufferedReader in = new BufferedReader(new InputStreamReader(
new GZIPInputStream(new ByteArrayInputStream(bytes))));
StringBuffer buffer = new StringBuffer();
char[] charBuffer = new char[1024];
int charsRead;
// Append only the characters actually read; the last chunk is usually shorter than the buffer.
while((charsRead = in.read(charBuffer)) != -1) {
buffer.append(charBuffer, 0, charsRead);
}
return buffer.toString();
}
