I just ran into a problem I haven't seen before. In short, I need to send two different strings to a method that validates whether the strings are the same.
One of the strings looks like this:
Sample 1
JVBERi0xLjQNCiW1tbW1DQoxIDAgb2JqDQo8PC9UeXBlL0NhdGFsb2cvUGFnZXMgMiAwIFIvTGFu
ZyhkYS1ESykgL1N0cnVjdFRyZWVSb290IDU3IDAgUi9NYXJrSW5mbzw8L01hcmtlZCB0cnVlPj4v
The second one looks like this:
Sample 2
ZyhkYS1ESykgL1N0cnVjdFRyZWVSb290IDU3IDAgUi9NYXJrSW5mbzw8L01hcmtlZCB0cnVlPj4vZyhkYS1ESykgL1N0cnVjdFRyZWVSb290IDU3IDAgUi9NYXJrSW5mbzw8L01hcmtlZCB0cnVlPj4v
The actual string is a PDF document encoded as Base64 (this is only a part of it).
I opened sample 1 in Notepad++ and told it to show all characters; it shows CRLF at the end of each line.
Now I need to make sample 2 look like sample 1, so I need to read a file into the same format. Is this possible?
To sum up, here is what I want to do.
EDIT/ADD:
What I want is to:
1. Take a PDF.
2. Convert it to Base64 with CR/LF line breaks.
3. Pass it to a validation method in another library, which requires this format.
Well, I didn't find any nice way to create the CR/LF split, so I did it by hand:
byte[] bytes = System.IO.File.ReadAllBytes(@"C:\Testdata\SSVALID.pdf");
string temp_inBase64 = Convert.ToBase64String(bytes);
string returnString = "";
int maxLength = 76; // characters per line
int counts = temp_inBase64.Length / maxLength; // number of full lines
for (int i = 0; i < counts; i++)
{
    returnString += temp_inBase64.Substring(i * maxLength, maxLength);
    returnString += "\r\n";
}
// Append whatever is left after the last full line.
returnString += temp_inBase64.Substring(maxLength * counts);
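For what it's worth, the framework can also do the splitting itself: Convert.ToBase64String has an overload taking Base64FormattingOptions.InsertLineBreaks, which inserts "\r\n" after every 76 output characters. Here is a minimal sketch; the class and method names (Base64Lines, Encode, EncodeFile) are made up for illustration, and whether the other library accepts exactly this layout is an assumption you would need to verify.

```csharp
using System;
using System.IO;

public static class Base64Lines
{
    // Encodes raw bytes as Base64 and lets the framework insert "\r\n"
    // after every 76 output characters.
    public static string Encode(byte[] bytes)
    {
        return Convert.ToBase64String(bytes, Base64FormattingOptions.InsertLineBreaks);
    }

    // Convenience wrapper (hypothetical): encode the contents of a file.
    public static string EncodeFile(string path)
    {
        return Encode(File.ReadAllBytes(path));
    }
}
```

Usage would be something like string returnString = Base64Lines.EncodeFile(path);. One difference from the manual loop: as far as I know the built-in option does not add a trailing line break when the Base64 length is an exact multiple of 76, while the loop does.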
I have some XML files where some control sequences are included in the text: EOT,ETX(anotherchar)
The extra character following the EOT, comma, ETX sequence is not always present and not always the same.
Actual example:
<FatturaElettronicaHeader xmlns="">
</F<EOT>‚<ETX>èatturaElettronicaHeader>
Where <EOT> is the character 0x04 and <ETX> is 0x03. As I have to parse the XML, this is a big issue.
Is this some kind of encoding I have never heard of?
I have tried removing all the control characters from my string, but that leaves the comma, which is still unwanted.
If I use Encoding.ASCII.GetString(file); the unwanted characters are replaced with a '?' that is easy to remove, but some unwanted characters remain and cause parse issues:
<BIC></WBIC> something like this.
string xml = Encoding.ASCII.GetString(file);
xml = new string(xml.Where(cc => !char.IsControl(cc)).ToArray());
I therefore need to remove all control character sequences of this kind to be able to parse these files, and I'm unsure how to check programmatically whether a character is part of a control sequence.
I have found that there are two wrong patterns in my files: the first is the one in the title and the second is EOT<.
To make it work I looked at this thread: Remove substring that starts with SOT and ends EOT, from string
and modified the code a little:
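private static string RemoveInvalidCharacters(string input)
{
    while (true)
    {
        var start = input.IndexOf('\u0004');
        if (start == -1) break;
        // Second pattern: EOT immediately followed by '<'.
        if (start + 1 < input.Length && input[start + 1] == '<')
        {
            input = input.Remove(start, 2);
            continue;
        }
        // First pattern: EOT, one character, ETX, one trailing character.
        if (start + 3 < input.Length && input[start + 2] == '\u0003')
        {
            input = input.Remove(start, 4);
            continue;
        }
        // Unknown pattern: drop only the EOT so the loop always terminates.
        input = input.Remove(start, 1);
    }
    return input;
}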
A further cleanup with this code:
static string StripExtended(string arg)
{
    StringBuilder buffer = new StringBuilder(arg.Length); //Max length
    foreach (char ch in arg)
    {
        UInt16 num = Convert.ToUInt16(ch);//In .NET, chars are UTF-16
        //The basic characters have the same code points as ASCII, and the extended characters are bigger
        if ((num >= 32u) && (num <= 126u)) buffer.Append(ch);
    }
    return buffer.ToString();
}
And now everything looks fine to parse.
Sorry for the delay in responding, but in my opinion the root of the problem might be incorrect decoding of a p7m file.
I think the XML file you are trying to sanitize was originally a .xml.p7m file.
I believe the correct way to sanitize the file is to use a library such as Bouncy Castle (available for Java and .NET) and its CmsSignedData class:
CmsSignedData cmsObj = new CmsSignedData(content);
if (cmsObj.SignedContent != null)
{
    using (var stream = new MemoryStream())
    {
        cmsObj.SignedContent.Write(stream);
        content = stream.ToArray();
    }
}
I'm having a hard time trying to figure this out. I'm writing a Unit test that verifies that the MD5 that a site displays matches the actual MD5 of the file. I do this by simply grabbing what the page displays and then calculating my own MD5 of the file. I get the text on the page by using Selenium WebDriver.
As expected, the strings show up as the same, or at least they appear to.
When I try to compare the two strings using Assert.AreEqual or Assert.IsTrue, the test fails no matter how I compare them.
I've tried the following ways:
Assert.AreEqual(md5, md5Text); //Fails
Assert.IsTrue(md5 == md5Text); //Fails
Assert.IsTrue(String.Equals(md5, md5Text)); //Fails
Assert.IsTrue(md5.Normalize() == md5Text.Normalize()); //Fails
Assert.AreEqual(md5.Normalize(), md5Text.Normalize()); //Fails
At first, I thought the strings were actually different, but looking at them in the debugger shows that both strings are exactly the same.
So I tried looking at their lengths, and that's when I saw why.
The strings have different lengths, so I tried taking a substring of the md5 variable to match the size of the md5Text variable. My thinking was that md5 might contain a bunch of zero-width characters. However, doing this got rid of the last half of md5.
So this must mean they're in different encodings, correct? But wouldn't Normalize() fix that?
This is how the variable md5 is created
string md5;
using (var stream = file.Open()) //file is a custom class with an Open() method that returns a Stream
{
using (var generator = MD5.Create())
{
md5 = BitConverter.ToString(generator.ComputeHash(stream)).Replace("-", "").ToLower().Trim();
}
}
and this is how the md5Text variable is created
//I'm using Selenium WebDriver to grab the text from the page
var md5Element = row.FindElements(By.XPath("//span[@data-bind='text: MD5Hash']")).Where(e => e.Visible()).First();
var md5Text = md5Element.Text;
How can I make this test pass? It should be passing, since the strings are the same.
UPDATE:
The comments suggested I turn the strings into a char[] and iterate over it. Here are the results of that (http://pastebin.com/DX335wU8) and the code I added to do it
char[] md5Characters = md5.ToCharArray();
char[] md5TextCharacters = md5Text.ToCharArray();
//Use md5 length since it's bigger
for (int i = 0; i < md5Characters.Length; i++)
{
System.Diagnostics.Debug.Write("md5: " + md5Characters[i]);
if (i >= md5TextCharacters.Length)
{
System.Diagnostics.Debug.Write(" | Exhausted md5Text characters..");
}
else
{
System.Diagnostics.Debug.Write(" | md5Text: " + md5TextCharacters[i]);
}
System.Diagnostics.Debug.WriteLine("");
}
One thing I found interesting is that the md5 char array has a bunch of random characters inside it after every two letters.
.Replace("-", "")
Your "" is not empty: between the two quotes there is actually a Unicode zero-width non-joiner followed by a zero-width space. So you are not replacing "-" with an empty string; you are inserting additional characters.
Delete and retype "" or use String.Empty.
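If you want to confirm this kind of thing yourself, printing the numeric value of every char makes invisible characters obvious. A small sketch, assuming nothing beyond the base class library; StringInspector and DumpCodeUnits are names I made up:

```csharp
using System;
using System.Text;

public static class StringInspector
{
    // Returns every UTF-16 code unit of the input as U+XXXX, separated by
    // spaces, so zero-width and other invisible characters become visible.
    public static string DumpCodeUnits(string value)
    {
        var sb = new StringBuilder();
        foreach (char c in value)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append("U+").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }
}
```

Calling StringInspector.DumpCodeUnits(md5) should then show entries such as U+200C (zero-width non-joiner) or U+200B (zero-width space) between the hex digits if they are present.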
I am trying to create a CSV file in C#. First I wrote a little piece of code to test. I saw in another post that the "," character is used to separate columns, but it doesn't work for me.
I searched different topics, but every answer is something like String.Format("{0},{1}", first, second);
My code:
String first = "Test name";
String second = "Value";
String newLine = String.Format("{0},{1}", first, second);
csv.AppendLine(newLine);
for (int i = 0; i < 3; i++)
{
first = "Test_" + i;
second = i.ToString();
newLine = String.Format("{0},{1}", first, second);
csv.AppendLine(newLine);
}
System.IO.File.WriteAllText(path, csv.ToString());
This creates the lines correctly, but everything ends up in column 1.
Assuming csv to be a StringBuilder, this should create the file
Test name,Value
Test_0,0
Test_1,1
Test_2,2
which is a valid CSV file with two columns and three rows (and headers).
Are you by any chance using a non-English locale (specifically, one that uses the comma as a decimal separator) and trying to load the CSV in Excel? In that case Excel assumes ; as the field separator instead of ,. You can split the data with custom settings if needed (I believe the feature is called “Text to fields”, or “Text to columns” on the Data tab in the ribbon).
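If the file only ever needs to open cleanly in Excel on the same machine, another option is to write it with the locale's list separator instead of a hard-coded comma. This is only a sketch: CsvSketch, Build and LocaleSeparator are hypothetical names, it does no quoting or escaping of fields, and it assumes Excel follows the regional list separator on that machine.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public static class CsvSketch
{
    // Joins each row with the given separator and ends every row with CRLF.
    // Fields are written as-is (no quoting), which is fine for simple values.
    public static string Build(IEnumerable<string[]> rows, string separator)
    {
        var csv = new StringBuilder();
        foreach (string[] row in rows)
        {
            csv.Append(string.Join(separator, row)).Append("\r\n");
        }
        return csv.ToString();
    }

    // The list separator of the current culture, e.g. ";" in many locales
    // that use the comma as the decimal separator.
    public static string LocaleSeparator()
    {
        return CultureInfo.CurrentCulture.TextInfo.ListSeparator;
    }
}
```

You would then call System.IO.File.WriteAllText(path, CsvSketch.Build(rows, CsvSketch.LocaleSeparator()));. The trade-off is that the file is no longer comma-separated, so other consumers may need to be told which separator it uses.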
I clicked together a small WinForms app for testing. It has two multiline textboxes and a single button, which on press sends a request to a server and posts response headers and content into the textboxes like this:
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
int len = 0;
foreach (var header in response.Headers)
{
var str = header.ToString();
textBox1.AppendText(str + "=" + response.Headers[str] + "\n");
if (str == "Content-Length") len = Convert.ToInt32(response.Headers[str]);
}
Stream respStream = response.GetResponseStream();
byte[] x = new byte[len];
respStream.Read(x, 0, len);
var s = new string(ascii.GetChars(x, 0, len));
// textBox2.Text = s;
textBox2.Clear();
textBox2.AppendText(s);
MessageBox.Show(textBox2.TextLength.ToString(), s.Length.ToString());
But no matter whether I use AppendText or whether I assign the string, the MessageBox always shows the caption 7653 with message 3964, and the headers textbox contains the line Content-length=7653.
So it seems that the string is not completely appended to the TextBox. Why would that be?
Btw: I am requesting an HTML document; the last two chars shown are ".5", and the first two chars missing are "16", so it does not break at some special characters.
Check out this Post
Your problem is that Stream.Read may return fewer bytes than you asked for, because the rest may not have arrived over the network yet.
So your byte array contains only the first part of the text. s.Length reports the full 7653 characters because the string is built from the whole byte array x, but the characters that were never read are 0 (char '\0'). textBox2.TextLength then reports the number of characters that were actually read; I suppose the TextBox cuts the text off at the first '\0' character.
You should use a while loop instead and check the return value of Read on every call.
Also check the encoding of your HTML page. With UTF-8 (the default in HTML5), one byte doesn't necessarily correspond to one character.
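The read loop could look roughly like this. It is a sketch, not a drop-in fix; StreamSketch and ReadFully are names I invented, and it assumes you already know how many bytes to expect (for example from Content-Length).

```csharp
using System;
using System.IO;

public static class StreamSketch
{
    // Keeps calling Read until 'count' bytes have arrived or the stream ends,
    // because a single Read call may return fewer bytes than requested.
    public static byte[] ReadFully(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            int read = stream.Read(buffer, total, count - total);
            if (read == 0) break; // end of stream reached early
            total += read;
        }
        // Shrink the array if the stream held less than expected.
        if (total < count) Array.Resize(ref buffer, total);
        return buffer;
    }
}
```

In the question's code that would replace the single respStream.Read(x, 0, len) call with byte[] x = StreamSketch.ReadFully(respStream, len); followed by decoding x with whatever encoding the page really uses.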
A label printer is controlled by sending a string of raw ASCII characters (which formats a label). Like this:
string s = "\x02L\r" + "D11\r" + "ySWR\r" + "421100001100096" + date + "\r" + "421100002150096" + time + "\r" + "421100001200160" + price + "\r" + "E\r";
RawPrinterHelper.SendStringToPrinter(printerName, s);
This hardcoded variant works well.
Now I want to put the control string into a .txt file and read it at runtime. Like this:
string printstr;
TextReader tr = new StreamReader("print.txt");
printstr = tr.ReadLine();
tr.Close();
But in this case the printer prints nothing.
It seems that StreamReader adds something else to this string.
(If I pass the read string to MessageBox.Show(printstr); everything looks OK, though this way we cannot see any control characters that were added.)
What could be a solution to this problem?
Your code calls tr.ReadLine() once, but it looks like you have multiple lines in that string.
Looks like a Zebra label printer; I've had the displeasure. The first thing you need to fix is the way you generate the print.txt file. You'll need to write one line for each section of the command string that's terminated with \r. For example, your command string should be written like this:
printFile.WriteLine("\x02L");
printFile.WriteLine("D11");
printFile.WriteLine("ySWR");
printFile.WriteLine("421100001100096" + date);
printFile.WriteLine("421100002150096" + time);
printFile.WriteLine("421100001200160" + price);
printFile.WriteLine("E");
printFile.WriteLine();
Now you can use ReadLine() when you read the label from print.txt. You'll need to read multiple lines to get the complete label. I added a blank line at the end; you can use that when you read the file to detect that you have all the lines that make up the label. Don't forget to append "\r" again when you send it to the printer.
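Reading it back could be sketched like this, assuming the file was written one command per line with a blank line at the end as described. LabelFile and ReadLabel are made-up names.

```csharp
using System;
using System.IO;
using System.Text;

public static class LabelFile
{
    // Reads lines until a blank line or the end of the input and re-appends
    // the "\r" terminator that ReadLine strips from each command.
    public static string ReadLabel(TextReader reader)
    {
        var label = new StringBuilder();
        string line;
        while ((line = reader.ReadLine()) != null && line.Length > 0)
        {
            label.Append(line).Append('\r');
        }
        return label.ToString();
    }
}
```

With a StreamReader opened on print.txt, the result of LabelFile.ReadLabel(reader) is what you would pass to RawPrinterHelper.SendStringToPrinter.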
It could be that the StreamReader is reading it in a Unicode format. By the way, you are only reading one line; you need to iterate over the lines instead. Your best bet would be to do it this way:
string printstr;
TextReader tr = new StreamReader("print.txt",System.Text.Encoding.ASCII);
printstr = tr.ReadToEnd();
tr.Close();
Or read it as a binary file and read the whole content into a byte array instead (error checking is omitted):
System.IO.BinaryReader br = new System.IO.BinaryReader(System.IO.File.OpenRead("print.txt"));
byte[] data = br.ReadBytes((int)br.BaseStream.Length);
br.Close();
Edit:
After rem's comment I thought it best to include this additional snippet here. It follows on from the previous snippet, where the variable data is filled:
string sData = System.Text.Encoding.ASCII.GetString(data);
Hope this helps,
Best regards,
Tom.