Printer is printing Korean characters when Arabic CodePages are specified - c#

Printer: Yujin Thermal Printer
Library: ESC-POS-.NET (C#)
Diagnostics done so far:
Printing Arabic words from Notepad / WordPad / MS Word works; the Arabic characters print perfectly.
Printing Arabic characters with a deliberately wrong code page prints question marks as placeholders for the Arabic characters.
Code:
var e = new EPSON();
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
Encoding enc = Encoding.GetEncoding(1256);
var bytes = ByteSplicer.Combine(
    e.CodePage(CodePage.WPC1256_ARABIC),
    e.CenterAlign(),
    e.PrintLine(" Test Page See the Arabic text Below WPC1256_ARABIC"),
    enc.GetBytes("طباعة صفحة إختبار "),
    e.PrintLine("\x1D\x56\x42\x00"), // GS V B 0: partial paper cut
    e.CodePage(CodePage.PC720_ARABIC),
    e.CenterAlign(),
    e.PrintLine(" Test Page See the Arabic text Below PC720_ARABIC"),
    enc.GetBytes("طباعة صفحة إختبار "), // note: still encoded as Windows-1256
    e.PrintLine("\x1D\x56\x42\x00"),
    e.CodePage(CodePage.PC864_ARABIC),
    e.CenterAlign(),
    e.PrintLine(" Test Page See the Arabic text Below PC864_ARABIC"),
    enc.GetBytes("طباعة صفحة إختبار "), // note: still encoded as Windows-1256
    e.PrintLine("\x1D\x56\x42\x00")
);
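Since the library's CodePage() mapping may not match what the Yujin firmware expects, one extra diagnostic is selecting the code page with the raw ESC/POS command ESC t n. A sketch; the table number 50 is only a placeholder and must be looked up in the printer's manual:
// ESC t n ("select character code table"); n is printer-specific.
// 50 is an assumed placeholder - check which value the Yujin manual
// assigns to Windows-1256 or PC864 on this model.
byte codePageNumber = 50;
byte[] selectCodePage = { 0x1B, 0x74, codePageNumber };
var rawTest = ByteSplicer.Combine(
    selectCodePage,
    enc.GetBytes("طباعة صفحة إختبار "));
// send rawTest through the same SendBytesToPrinter path as above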
This is the method I use to send bytes to the printer. I hope I am not losing any byte data.
public static bool SendBytesToPrinter(string szPrinterName, byte[] data)
{
    var pUnmanagedBytes = Marshal.AllocCoTaskMem(data.Length); // Allocate unmanaged memory
    Marshal.Copy(data, 0, pUnmanagedBytes, data.Length);       // Copy bytes into unmanaged memory
    var retval = SendBytesToPrinter(szPrinterName, pUnmanagedBytes, data.Length);
    Marshal.FreeCoTaskMem(pUnmanagedBytes);                    // Free the allocated unmanaged memory
    return retval;
}
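To rule out byte loss on this path, one cheap diagnostic is dumping the same array to disk and comparing it with what reaches the printer (a sketch; the output path is arbitrary):
// Diagnostic sketch: persist the exact bytes so their length and contents
// can be compared in a hex editor against the spooled print job.
System.IO.File.WriteAllBytes(@"C:\temp\printjob.bin", data);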
Output:
These are the Korean characters that get printed: "핸 턱한"
I have opened a GitHub issue about this as well.
Thank you.

Related

get all Text Encoding in universal windows app

I want to make a Windows application that converts the text encoding of a .txt file,
so I need to get all text encodings supported by Windows.
I looked for the Encoding.GetEncodings() method, but it does not exist in Universal Windows Apps.
So I tried to get each encoding by code page; this is my code:
List<string> list = new List<string>();
List<string> errors = new List<string>();
int[] code_page = { 0, 1200, 1201, 1252, 10003, 10008, 12000, 12001, 20127, 20936, 20949, 28591, 28598, 38598, 50220, 50221,
                    50222, 50225, 50227, 51932, 51936, 51949, 52936, 57002, 57003, 57004, 57005, 57006, 57007, 57008, 57009, 57010, 57011, 65000, 65001 };
for (int i = 0; i < code_page.Length; i++)
{
    try
    {
        list.Add(Encoding.GetEncoding(code_page[i]).EncodingName);
    }
    catch (Exception ex)
    {
        errors.Add(code_page[i] + "\t\t" + ex.Message);
    }
}
And I have this result:
First list (Encoding)
Unicode (UTF-8)
Unicode
Unicode (Big-Endian)
Unicode (UTF-32)
Unicode (UTF-32 Big-Endian)
US-ASCII
Western European (ISO)
Unicode (UTF-7)
Unicode (UTF-8)
Errors list
37    No data is available for encoding 37. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method.
437   No data is available for encoding 437. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method.
500   No data is available for encoding 500. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method.
...etc
My question: is there any way to get all Windows text encodings?
Thank you.
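For what it's worth, on .NET targets that can reference the System.Text.Encoding.CodePages package, registering its provider first makes the legacy code pages resolvable; a sketch under that assumption (on newer runtimes, Encoding.GetEncodings() then also lists the provider's encodings):
using System.Text;
// Assumes the System.Text.Encoding.CodePages NuGet package is referenced.
// Registering the provider makes legacy code pages (37, 437, 500, ...)
// resolvable through Encoding.GetEncoding.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
foreach (EncodingInfo info in Encoding.GetEncodings())
{
    System.Diagnostics.Debug.WriteLine(info.CodePage + "\t" + info.DisplayName);
}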

Memory leakage while using C# dll into MFC code

I want to create a parser for reading the text of an RTF file,
so I created a DLL in C# using RichTextBox.
After that, I converted the DLL to a TLB
and call it from C++.
But it produces a memory leak in a loop, and the application's memory keeps increasing.
I am attaching both code snippets.
Please help me.
Thank you
public string Convert(string strRTFTxt)
{
    string path = strRTFTxt;
    // Create the RichTextBox. (Requires a reference to System.Windows.Forms.)
    System.Windows.Forms.RichTextBox rtBox = new System.Windows.Forms.RichTextBox();
    // Get the contents of the RTF file. When the contents of the file are
    // stored in the string (rtfText), the contents are encoded as UTF-16.
    string rtfText = System.IO.File.ReadAllText(path);
    // Display the RTF text. This should look like the contents of your file.
    //System.Windows.Forms.MessageBox.Show(rtfText);
    // Use the RichTextBox to convert the RTF code to plain text.
    rtBox.Rtf = rtfText;
    string plainText = rtBox.Text;
    rtBox.Clear();
    rtBox.Dispose();
    //System.Windows.Forms.MessageBox.Show(plainText);
    // Output the plain text to a file, encoded as UTF-8.
    //System.IO.File.WriteAllText(@"output.txt", plainText);
    return plainText;
}
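If the leak is on the managed side, one suspect is the RichTextBox not being disposed when assigning Rtf throws; a minimal sketch (the ConvertSafe name is hypothetical) that guarantees disposal with a using block:
public string ConvertSafe(string strRTFTxt)
{
    // using guarantees Dispose() runs even if assigning Rtf throws,
    // so the control's unmanaged handle cannot leak on the error path.
    using (var rtBox = new System.Windows.Forms.RichTextBox())
    {
        rtBox.Rtf = System.IO.File.ReadAllText(strRTFTxt);
        return rtBox.Text;
    }
}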
The C++ code where I read the text from the DLL:
long lResult = 0;
std::string str;
char* tcFileName = new char[m_strFilePath.GetLength() + 1];
USES_CONVERSION;
strcpy(tcFileName, T2A(m_strFilePath));
printf("char * text: %s\n", tcFileName);
BSTR bstrText = _com_util::ConvertStringToBSTR(tcFileName);
BSTR bstrReturnValue;
wprintf(L"BSTR text: %s\n", bstrText);
iRTFConverter->Convert(bstrText, &bstrReturnValue);
CString strCIFContent(bstrReturnValue);
ParseString(strCIFContent);
delete[] tcFileName;
tcFileName = NULL;
iRTFConverter->Release();
SysFreeString(bstrText);
bstrText = NULL;
SysFreeString(bstrReturnValue);

How to encode and decode Broken Chinese/Unicode characters?

I've tried googling around but wasn't able to find what charset the text below belongs to:
具有éœé›»ç”¢ç”Ÿè£ç½®ä¹‹å½±åƒè¼¸å…¥è£ç½®
But by putting <meta http-equiv="Content-Type" Content="text/html; charset=utf-8"> in an HTML file containing that string, I was able to view the Chinese characters properly:
具有靜電產生裝置之影像輸入裝置
So my question is:
What tools can I use to detect the character set of this text?
And how do I convert/encode/decode them properly in C#?
Update:
For completeness' sake, I've added this test.
[TestMethod]
public void TestMethod1()
{
    string encodedText = "具有éœé›»ç”¢ç”Ÿè£ç½®ä¹‹å½±åƒè¼¸å…¥è£ç½®";
    Encoding utf8 = new UTF8Encoding();
    Encoding windows1252 = Encoding.GetEncoding("Windows-1252");
    byte[] postBytes = windows1252.GetBytes(encodedText);
    string decodedText = utf8.GetString(postBytes);
    string actualText = "具有靜電產生裝置之影像輸入裝置";
    Assert.AreEqual(actualText, decodedText);
}
What is happening when you save the "bad" string in a text file with a meta tag declaring the correct encoding is that your text editor saves the file with Windows-1252 encoding, but the browser reads the file and interprets it as UTF-8. Since the "bad" string is UTF-8 bytes incorrectly decoded with the Windows-1252 encoding, you reverse the process by encoding the file as Windows-1252 and decoding it as UTF-8.
Here's an example:
using System.Text;
using System.Windows.Forms;

namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            string s = "具有靜電產生裝置之影像輸入裝置"; // Unicode
            Encoding windows1252 = Encoding.GetEncoding("Windows-1252");
            Encoding utf8 = Encoding.UTF8;

            byte[] utf8Bytes = utf8.GetBytes(s);                  // Unicode -> UTF-8
            string badDecode = windows1252.GetString(utf8Bytes);  // Mis-decode as Windows-1252
            MessageBox.Show(badDecode, "Mis-decoded");            // Shows your garbage string.

            string goodDecode = utf8.GetString(utf8Bytes);        // Correctly decode as UTF-8
            MessageBox.Show(goodDecode, "Correctly decoded");

            // Recovering from a bad decode...
            byte[] originalBytes = windows1252.GetBytes(badDecode);
            goodDecode = utf8.GetString(originalBytes);
            MessageBox.Show(goodDecode, "Re-decoded");
        }
    }
}
Even with correct decoding, you'll still need a font that supports the characters being displayed. If your default font doesn't support Chinese, you still might not see the correct characters.
The correct thing to do is figure out why the string you have was decoded as Windows-1252 in the first place. Sometimes, though, data in a database is stored incorrectly to begin with and you have to resort to these games to fix the problem.
// Incoming mojibake: the ASCII bytes of "mesutpiskin" were mis-decoded
// as UTF-16, producing CJK characters. Re-encoding as UTF-16 recovers
// the original bytes, which are then read back one byte per character.
string test = "敭畳灴獩楫n"; // incoming data; must be "mesutpiskin"
byte[] bytes = Encoding.Unicode.GetBytes(test);
string s = string.Empty;
for (int i = 0; i < bytes.Length; i++)
{
    s += (char)bytes[i];
}
s = s.Trim((char)0); // drop the padding byte from the odd-length input
MessageBox.Show(s);
// s == "mesutpiskin"
I'm not really sure what you mean, but I'm guessing you want to convert between a byte array holding a string in a certain encoding and a .NET string. Let's assume the character encoding is called "FooBar":
This is how you encode and decode:
Encoding myEncoding = Encoding.GetEncoding("FooBar");
string myString = "lala";
byte[] myEncodedBytes = myEncoding.GetBytes(myString);
string myDecodedString = myEncoding.GetString(myEncodedBytes);
You can learn more about the Encoding class over at MSDN.
Answering the question at the end of your post:
If you want to determine the text encoding at runtime, you should look at this: http://code.google.com/p/ude/
For converting character sets you can use http://msdn.microsoft.com/en-us/library/system.text.encoding.convert(v=vs.100).aspx
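A minimal sketch of the detection step, assuming the Ude package and its CharsetDetector API:
using System;
using System.IO;
using Ude; // assumption: the Ude charset-detector package is referenced

byte[] data = File.ReadAllBytes("unknown.txt"); // hypothetical input file
var detector = new CharsetDetector();
detector.Feed(data, 0, data.Length);
detector.DataEnd();
// Charset is null when detection fails; Confidence ranges from 0 to 1.
if (detector.Charset != null)
    Console.WriteLine(detector.Charset + " (" + detector.Confidence + ")");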
It's Windows Latin-1. I pasted the Chinese text as UTF-8 into BBEdit (a text editor for Mac), re-opened the file as Windows Latin-1 and, bang, the exact diacritics appeared.

Decoding base64 file contents between PHP and C#

I need to serve an AES-encrypted, base64-encoded file from PHP to a C# client (Mono, on various platforms). I've successfully got the AES encryption/decryption working, but as soon as I attempt the base64 encoding/decoding I run into trouble. Both examples below have AES disabled, so that shouldn't be a factor.
My simplest test case, a Hello World string, works fine:
PHP serving output-
// Save encoded data to file
$data = base64_encode("Hello encryption world!!");
$file = fopen($targetPath, 'w');
fwrite($file, $data);
fclose($file);
// Later on, serve the file
header("Pragma: public");
header("Expires: 0");
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
header("Cache-Control: private",false);
header("Content-Type: application/octet-stream");
header("Content-Disposition: attachment; filename=".basename($product->PackageFilename($packageId)));
header("Content-Transfer-Encoding: binary");
header("Content-Length: ".filesize($targetPath));
ob_clean();
flush();
$handle = fopen($targetPath, "r");
fpassthru($handle);
fclose($handle);
C# decoding and using-
StreamReader reader = new StreamReader(stream);
char[] buffer = DecodeBuffer;
string decoded = "";
int read = 0;
while (0 < (read = reader.Read(buffer, 0, DecodeBufferSize)))
{
    byte[] decodedBytes = Convert.FromBase64CharArray(buffer, 0, read);
    decoded += System.Text.Encoding.UTF8.GetString(decodedBytes);
}
Log(decoded); // Correctly logs "Hello encryption world!!"
However once I start trying to do the same thing with the contents of a file, a FormatException: Invalid character found is thrown by Convert.FromBase64CharArray:
PHP serving output-
// Save encoded data to file
$data = base64_encode(file_get_contents($targetPath));
$file = fopen($targetPath, 'w');
fwrite($file, $data);
fclose($file);
// Later on, serve the file
// Same as above
C# decoding and using-
using (Stream file = File.Open(zipPath, FileMode.Create))
{
    using (StreamReader reader = new StreamReader(stream))
    {
        char[] buffer = DecodeBuffer;
        byte[] decodedBytes;
        int read = 0;
        while (0 < (read = reader.Read(buffer, 0, DecodeBufferSize)))
        {
            // Throws FormatException: Invalid character found
            decodedBytes = Convert.FromBase64CharArray(buffer, 0, read);
            file.Write(decodedBytes, 0, decodedBytes.Length);
        }
    }
}
Is there some kind of additional processing that should be done on larger data for the Base64 to be valid? Is it perhaps just not appropriate to do this with large binary data, and if so, how else would you prevent potential problems with characters unsafe for transmission?
Your code for reading the Base64 text is not correct:
Base64 is text, so consider using a text reader instead.
Base64 may contain newlines/whitespace; it is customary to split the whole Base64-encoded value into 70-80 character lines.
To verify that the data in the file is correct, read the whole file as a string (StreamReader.ReadToEnd) and convert it to a byte array (Convert.FromBase64String).
If the file contains valid Base64 data and you can't read it as a single string, you should implement your own Base64 decoding, or manually read the correct number of non-whitespace characters (a multiple of 4) and decode such chunks.
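A one-shot version of that verification step might look like this sketch (stream and zipPath are the names from the question's code):
// Read the whole Base64 payload and decode it in one call.
// Convert.FromBase64String skips whitespace, so 70-80 character
// line wrapping does not break it.
using (var reader = new StreamReader(stream))
{
    byte[] data = Convert.FromBase64String(reader.ReadToEnd());
    File.WriteAllBytes(zipPath, data);
}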
Base64 encoding converts 3 octets into 4 encoded characters. Thus, the length of each chunk you hand to the decoder needs to be a multiple of 4.
First, ensure that DecodeBufferSize is such a multiple of 4. Next, since StreamReader.Read does not guarantee that all the requested characters will be read, you should continue reading into the buffer until either it has been filled or the end of the stream has been reached.
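A chunked variant under those constraints might look like the following sketch; it assumes the payload contains no embedded whitespace (otherwise a full buffer could still hold a character count that is not a multiple of 4):
// Sketch: fill the buffer completely before decoding, so every chunk
// handed to Convert.FromBase64CharArray is a multiple of 4 characters.
char[] buffer = new char[4096]; // 4096 is a multiple of 4
int filled;
do
{
    filled = 0;
    int read;
    // StreamReader.Read may return fewer characters than requested,
    // so loop until the buffer is full or the stream is exhausted.
    while (filled < buffer.Length &&
           (read = reader.Read(buffer, filled, buffer.Length - filled)) > 0)
    {
        filled += read;
    }
    if (filled > 0)
    {
        byte[] decodedBytes = Convert.FromBase64CharArray(buffer, 0, filled);
        file.Write(decodedBytes, 0, decodedBytes.Length);
    }
} while (filled == buffer.Length);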

c# encoding issue with?

I have an input like: DisplaygröÃe
And I want output like: Displaygröÿe
With Notepad++ the problem was solved by converting to ANSI, encoding to UTF-8, and converting back to ANSI.
I need to do this programmatically in C#.
I've tried converting to/from ANSI, UTF-8, and Latin-1, and none work properly; the output shows ? characters. I used a function based on Encoding.Default.GetBytes, then
res = Encoding.Convert(src1, dest1, bytes) and
EncodingDest.GetChars(res);
where EncodingDest represents the output encoding.
The code runs in a console application, but the results are the same in WPF.
It doesn't matter which encoding is used for the output as long as it works; the same problems occur for languages of countries like Spain, Italy, or Sweden.
Use System.Text.Encoding. The garbled string is UTF-8 bytes that were mis-decoded as Windows-1252, so reverse that step (using ASCII here would just turn the accented characters into question marks):
var windows1252 = Encoding.GetEncoding(1252);
byte[] rawBytes = windows1252.GetBytes("DisplaygröÃe");
var output = Encoding.UTF8.GetString(rawBytes);
When you output a string somewhere (like a TextWriter, a Stream, or a byte[]), you should always specify the encoding, unless you want the UTF-8 output (the default):
using (StreamWriter sw = new StreamWriter("file.txt", false, Encoding.GetEncoding("windows-1252")))
    sw.WriteLine("Displaygröÿe");
@DanM: You need to know what character set your input is in.
"DisplaygröÃe" is what you will see if you take the string "Displaygröße" (suggested by Vlad) encode it to bytes as UTF-8, and then incorrectly decode it as latin1.
If you do the same with "Displaygröÿe", you would see "DisplaygrÃ¶Ã¿e" (the inverted question mark is literally there; it is not a placeholder for something that can't be displayed). Technically, "DisplaygröÃe" probably has another character between the Ã and the e, but it is a control code, and is thus invisible to you.
If you have a character set foo, this is true: my_string = foo_decode(foo_encode(my_string)). If you have another character set bar, this is true: barf = bar_decode(foo_encode(my_string)), where barf is garbage like you're seeing.
If you don't know what character set your input is in, you will only decode it correctly by chance.
It appears that your input files are in UTF-8, and you will need to decode the bytes from the file as such. (I don't speak enough C# to help you here... I only speak character encodings.)
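In C#, decoding the file's bytes as UTF-8 would look something like this (a sketch; the file name is hypothetical):
// Decode the file as UTF-8 instead of the OS-default ANSI code page.
string text = System.IO.File.ReadAllText("input.txt", System.Text.Encoding.UTF8);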
using (var rdr = new StreamReader(fs, Encoding.GetEncoding(1252)))
{
    result = rdr.ReadToEnd();
}
We had a similar problem when sending data to a text printer, and the only thing I got working is this (written as an extension method; note the unmanaged buffer must be freed, or this itself leaks):
public static byte[] ToAnsiMemBytes(this string input)
{
    int length = input.Length;
    byte[] result = new byte[length];
    IntPtr bytes = IntPtr.Zero;
    try
    {
        // Convert to the system ANSI code page in unmanaged memory.
        bytes = Marshal.StringToCoTaskMemAnsi(input);
        Marshal.Copy(bytes, result, 0, length);
    }
    catch (Exception)
    {
        result = null;
    }
    finally
    {
        // Free the unmanaged buffer to avoid leaking it.
        if (bytes != IntPtr.Zero)
            Marshal.FreeCoTaskMem(bytes);
    }
    return result;
}
