UTF-8 Encoding converts Control Characters >127 to?

I have a binary file containing letters, digits, special characters, and control characters, as follows.
ESC!€STANDARD + UNDERLINE
ESC!COMPRESSED + UNDERLINE
ESC!ˆSTANDARD + EMPHASIZED + UNDERLINE
I am doing the following:
string file = FileOpenDlg.FileName;
System.IO.StreamReader myFile = new System.IO.StreamReader(file);
string data = myFile.ReadToEnd();
byte[] sendCmd = Encoding.UTF8.GetBytes(data);
This replaces characters such as € and ˆ, whose byte values are 0x80 or above, with ?. When I send the result to the printer, it prints the wrong output.
How should I handle these characters? Timely help is appreciated.
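A minimal sketch of one way around the substitution, assuming the printer expects the file's bytes exactly as stored on disk: skip the text decoding step and read the bytes directly with File.ReadAllBytes. The helper name LoadCommandBytes and the sample bytes are made up for illustration; the second half of Main only shows how a UTF-8 round trip changes the data.

```csharp
using System;
using System.IO;
using System.Text;

public static class RawBytesDemo
{
    // Hypothetical helper: returns the file's bytes unchanged, with no text decoding.
    public static byte[] LoadCommandBytes(string path)
    {
        return File.ReadAllBytes(path);
    }

    public static void Main()
    {
        // Sample data (an assumption): ESC, '!', the byte 0x80, 'S'.
        byte[] original = { 0x1B, 0x21, 0x80, 0x53 };
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, original);

        byte[] sendCmd = LoadCommandBytes(path);
        Console.WriteLine(BitConverter.ToString(sendCmd)); // 1B-21-80-53

        // For comparison: decoding as UTF-8 and re-encoding does not round-trip,
        // because a lone 0x80 byte is not valid UTF-8.
        byte[] roundTripped = Encoding.UTF8.GetBytes(File.ReadAllText(path, Encoding.UTF8));
        Console.WriteLine(BitConverter.ToString(roundTripped));

        File.Delete(path);
    }
}
```

If the data has to pass through a string for some reason, a single-byte encoding such as ISO-8859-1 maps every byte value to one character and back, which UTF-8 does not.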

Related

Replace a string in a text read from a csv and save it

I managed to load the csv and now want to change a few strings inside and then save it again.
First problem: it won't change the text to '0. Replacing only "4" with "0" works, but it never works when my search string has more than one character.
Second problem: the last replace, where I delete every " by replacing it with "". When I open the CSV in an editor afterwards, it shows some weird Asian characters instead of nothing.
(䈀攀稀甀最猀瀀爀)
There are no spaces in my csv. The csv looks like
.....";"++49 then more random numbers and so on.
This is just the part where ++49 is to be found.
Relevant code:
Encoding ansi = Encoding.GetEncoding(1252);
foreach (string file in Directory.EnumerateFiles(@"path comes here", "*.csv"))
{
    string text = File.ReadAllText(file, ansi);
    text = text.Replace(@"++49", "'0");
    text = text.Replace("+49", "'0");
    text = text.Replace(@"""", "");
    File.WriteAllText(file, text, ansi);
}
Am I doing something fundamentally wrong?
Edit: What it looks like: ";"++49<morenumbers>";; What it should look like: ;0<morenumbers>;;

As people mentioned in the comments, the problem is with how your file is decoded and encoded. In this case you can try this:
foreach (string file in Directory.EnumerateFiles(@"path comes here", "*.csv"))
{
    Encoding ansi;
    using (var reader = new System.IO.StreamReader(file, true))
    {
        reader.Peek(); // the byte order mark is only examined on the first read
        ansi = reader.CurrentEncoding; // please tell what you have here ! :)
    }
    string text = File.ReadAllText(file, ansi);
    text = text.Replace(@"++49", "'0");
    text = text.Replace(@"+49", "'0");
    text = text.Replace(@"""", "");
    File.WriteAllText(file, text, ansi);
}
For me it works fine with all the formats I was able to set. This way you do not have to hardcode the encoding.
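As a rough illustration of the detection step, the sketch below writes a small UTF-16 file with a byte order mark and asks StreamReader what it found. The helper name DetectEncoding and the sample text are assumptions for the example. StreamReader does not inspect the byte order mark until something is read, so the helper calls Peek before looking at CurrentEncoding.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingDetectDemo
{
    // Hypothetical helper: returns the encoding StreamReader infers from the byte order mark.
    public static Encoding DetectEncoding(string path)
    {
        using (var reader = new StreamReader(path, true))
        {
            reader.Peek(); // forces the first read, which triggers BOM detection
            return reader.CurrentEncoding;
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        // Encoding.Unicode is UTF-16 little endian and writes a byte order mark.
        File.WriteAllText(path, "\";\"++49123\";;", Encoding.Unicode);

        Console.WriteLine(DetectEncoding(path).WebName); // utf-16

        File.Delete(path);
    }
}
```

This detection relies on a byte order mark. A file without one, such as a typical Windows-1252 file, is reported as UTF-8, so the encoding may still need to be specified explicitly in that case.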

Put string into textbox -> not complete

I clicked together a small WinForms app for testing. It has two multiline textboxes and a single button, which on press sends a request to a server and posts response headers and content into the textboxes like this:
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
int len = 0;
foreach (var header in response.Headers)
{
var str = header.ToString();
textBox1.AppendText(str + "=" + response.Headers[str] + "\n");
if (str == "Content-Length") len = Convert.ToInt32(response.Headers[str]);
}
Stream respStream = response.GetResponseStream();
byte[] x = new byte[len];
respStream.Read(x, 0, len);
var s = new string(ascii.GetChars(x, 0, len));
// textBox2.Text = s;
textBox2.Clear();
textBox2.AppendText(s);
MessageBox.Show(textBox2.TextLength.ToString(), s.Length.ToString());
But no matter whether I use AppendText or whether I assign the string, the MessageBox always shows the caption 7653 with message 3964, and the headers textbox contains the line Content-length=7653.
So it seems that the string is not completely appended to the TextBox. Why would that be?
Btw: I am requesting an HTML document; the last two chars shown are ".5", and the first two chars missing are "16", so it does not break at some special characters.

Check out this Post
Your problem is that with Stream.Read you may read less than the total number of characters as they may not be available yet on the network.
So your string already contains only the first part of the text. s.Length indicates the right number of characters as it gets copied over from the byte array x but most of the characters are 0 (Char '\0'). textBox2.TextLength then indicates the right number of characters that have been read. I suppose it trims the '\0' characters.
You should use a while loop instead and check the result of Read as indicated before.
Also check the encoding of your HTML page. In UTF-8 (the default in HTML5), one byte doesn't necessarily correspond to one character.
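The read loop suggested above might look like the following sketch. The helper name ReadFully is made up, and a MemoryStream stands in for the network response stream so the example runs on its own. The loop keeps calling Read until the requested number of bytes has arrived or the stream ends.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReadLoopDemo
{
    // Hypothetical helper: reads up to 'count' bytes, looping because a single
    // Stream.Read call may return fewer bytes than requested.
    public static byte[] ReadFully(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            int read = stream.Read(buffer, total, count - total);
            if (read == 0)
            {
                break; // end of stream reached before 'count' bytes arrived
            }
            total += read;
        }
        if (total < count)
        {
            Array.Resize(ref buffer, total); // keep only the bytes actually read
        }
        return buffer;
    }

    public static void Main()
    {
        // Stand-in for response.GetResponseStream(); the content is sample data.
        byte[] payload = Encoding.ASCII.GetBytes("<html>16.5</html>");
        using (var stream = new MemoryStream(payload))
        {
            byte[] body = ReadFully(stream, payload.Length);
            Console.WriteLine(Encoding.ASCII.GetString(body));
        }
    }
}
```

Decoding only the bytes that were actually read also avoids the trailing '\0' characters described above.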

How do I read chars from other countries such as ß ä?

How do I read chars from other countries such as ß ä?
The following code reads all chars, including chars such as 0x0D.
StreamReader srFile = new StreamReader(gstPathFileName);
char[] acBuf = null;
int iReadLength = 100;
while (srFile.Peek() >= 0) {
acBuf = new char[iReadLength];
srFile.Read(acBuf, 0, iReadLength);
string s = new string(acBuf);
}
But it does not interpret correctly chars such as ß ä.
I don't know what coding the file uses. It is exported from code (into a .txt file) that was written 20 plus years ago from a C-Tree database.
The ß ä display fine with Notepad.

By default, the StreamReader constructor assumes the UTF-8 encoding (which is the de facto universal standard today). Since that's not decoding your file correctly, your characters (ß, ä) suggest that it's probably encoded using Windows-1252 (Western European):
var encoding = Encoding.GetEncoding("Windows-1252");
using (StreamReader srFile = new StreamReader(gstPathFileName, encoding))
{
// ...
}
A closely-related encoding is ISO/IEC 8859-1. If the above gives some unexpected results, use Encoding.GetEncoding("ISO-8859-1") instead.
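To see the difference on a small sample, the sketch below decodes the bytes 0xDF, 0x20, 0xE4 once as ISO-8859-1 and once as UTF-8. The class and helper names are made up for illustration. ISO-8859-1 is used here because it is available on every .NET runtime; on .NET Core and later, requesting "Windows-1252" additionally requires registering CodePagesEncodingProvider first.

```csharp
using System;
using System.Text;

public static class Latin1Demo
{
    // Hypothetical helper: decodes bytes as ISO-8859-1, where each byte is one character.
    public static string Decode(byte[] bytes)
    {
        return Encoding.GetEncoding("ISO-8859-1").GetString(bytes);
    }

    public static void Main()
    {
        // 0xDF is 'ß' and 0xE4 is 'ä' in both ISO-8859-1 and Windows-1252.
        byte[] raw = { 0xDF, 0x20, 0xE4 };

        string latin1 = Decode(raw);
        Console.WriteLine(latin1 == "\u00DF \u00E4"); // True

        // The same bytes are not valid UTF-8, so the decoder substitutes U+FFFD.
        string utf8 = Encoding.UTF8.GetString(raw);
        Console.WriteLine(utf8.IndexOf('\uFFFD') >= 0); // True
    }
}
```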

c# Rich text Format error in code

hoping you can help
I have the following code
List<string> comconfig = populate.resolveconfig(_varibledic, populate.GetVaribles[0].Substring(populate.GetVaribles[0].IndexOf("=") + 1)); //get the array of strings
string config = ""; //create an empty output string
config = @"\rtf1\ansi\deff0\deftab240 {\fonttbl {\f000 Monaco;} {\f001 Monaco;} } {\colortbl \red255\green255\blue255; \red000\green000\blue000; \red255\green255\blue255; \red000\green000\blue000; }";
config = config + @"\f96\fs20\cb3\cf2 \highlight1\cf0 "; // assign the RTF header to the output string
foreach (var strings in comconfig) //loop through the array, appending to the output string
{
config = config + strings + @"\par ";
}
config = config + "}"; //close the RTF code
So I am trying to create an RTF string that I can display later. comconfig is an array of strings with some RTF markup for highlighting and so on.
The trouble is that if I use @ I get double \ characters, which mess up the RTF, and if I don't use it, the escape characters mess up the code.
What is the best way to build up this string from a preformatted RTF header with the array of strings in the middle? It is finally displayed in a RichTextBox, or converted to a plain text string at the user's request. I need to ignore the escape characters without messing up the RTF.
Cheers
Aaron

No, you don't get a double \. You're getting confuzzled by the debugger display of the string. It shows you what the string looks like if you had written it in C# without the @. Click the spy glass icon at the far right and select the Text Visualizer.
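A quick way to confirm this outside the debugger is to compare a verbatim string literal with its escaped equivalent. This is only a small sketch; the class name is made up.

```csharp
using System;

public static class VerbatimStringDemo
{
    public static void Main()
    {
        string verbatim = @"\par ";  // verbatim literal: the backslash is taken as-is
        string escaped = "\\par ";   // regular literal: \\ is an escape for one backslash

        Console.WriteLine(verbatim == escaped); // True
        Console.WriteLine(verbatim.Length);     // 5
        Console.WriteLine(verbatim);            // prints \par followed by a space
    }
}
```

Both literals produce the same five characters, so the RTF built from verbatim literals contains single backslashes.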

ASCII raw symbols to control a printer from a .txt file

A label printer is controlled by sending it a string of raw ASCII characters (which formats a label), like this:
string s = "\x02L\r" + "D11\r" + "ySWR\r" + "421100001100096" + date + "\r" + "421100002150096" + time + "\r" + "421100001200160" + price + "\r" + "E\r";
RawPrinterHelper.SendStringToPrinter(printerName, s);
This hardcoded variant works well.
Now I want to put the control string in a .txt file and read it at runtime, like this:
string printstr;
TextReader tr = new StreamReader("print.txt");
printstr = tr.ReadLine();
tr.Close();
But in this case the printer prints nothing.
It seems that StreamReader adds something else to this string.
(If I pass the string that was read to MessageBox.Show(printstr); everything looks OK, although control characters would not be visible that way.)
What could be a solution to this problem?

Your code calls tr.ReadLine() once, but it looks like you have multiple lines in that string.

Looks like a Zebra label printer, I've had the displeasure. The first thing you need to fix is the way you generate the print.txt file. You'll need to write one line for each section of the command string that's terminated with \r. For example, your command string should be written like this:
printFile.WriteLine("\x02L");
printFile.WriteLine("D11");
printFile.WriteLine("ySWR");
printFile.WriteLine("421100001100096" + date);
printFile.WriteLine("421100002150096" + time);
printFile.WriteLine("421100001200160" + price);
printFile.WriteLine("E");
printFile.WriteLine();
Now you can use ReadLine() when you read the label from print.txt. You'll need to read multiple lines to get the complete label. I added a blank line at the end; you could use that when you read the file to detect that you got all the lines that make up the label. Don't forget to append "\r" again when you send it to the printer.
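The reading side of that approach could be sketched as follows. The helper name BuildCommand is made up, a temporary file stands in for print.txt, and the label content is shortened sample data. Each non-empty line gets its "\r" terminator back before the string would be passed to RawPrinterHelper.SendStringToPrinter.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class LabelFileDemo
{
    // Hypothetical helper: rebuilds the printer command by terminating every
    // non-empty line with "\r"; the blank end-of-label line is skipped.
    public static string BuildCommand(string[] lines)
    {
        return string.Concat(lines.Where(l => l.Length > 0).Select(l => l + "\r"));
    }

    public static void Main()
    {
        // Stand-in for print.txt, written one command section per line.
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "\x02L", "D11", "ySWR", "E", "" }, Encoding.ASCII);

        string command = BuildCommand(File.ReadAllLines(path, Encoding.ASCII));
        Console.WriteLine(command == "\x02L\rD11\rySWR\rE\r"); // True

        File.Delete(path);
    }
}
```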

It could be that the StreamReader is reading the file with a Unicode encoding. Also, you are reading only one line; you need to read all of them. Your best bet would be to do it this way:
string printstr;
TextReader tr = new StreamReader("print.txt",System.Text.Encoding.ASCII);
printstr = tr.ReadToEnd();
tr.Close();
Or read it as a binary file and load the whole file into a byte array (error checking is omitted):
System.IO.BinaryReader br = new System.IO.BinaryReader(System.IO.File.OpenRead("print.txt")); // BinaryReader needs a Stream, not a StreamReader
byte[] data = br.ReadBytes((int)br.BaseStream.Length); // ReadBytes takes an int count
br.Close();
Edit:
After rem's comment I thought it best to include this additional snippet here...this follows on from the previous snippet where the variable data is referenced...
string sData = System.Text.Encoding.ASCII.GetString(data);
Hope this helps,
Best regards,
Tom.
