C# converting binary to text -- question marks?

I'm converting a binary file to text and dumping it into a PDF. I have this working, but I need to produce output that is identical to some samples of another program in a different language (it makes the text, then converts it to binary, so I guess I'm converting back?).
I get identical output except for one thing. I should have a bunch of dashes to set off subject headers, but instead I'm getting question marks (?). If I use Notepad++ to display the binary file, the question marks turn into some random Korean character (컴). I've tried doing result.Replace("?", "-"); and result.Replace("컴", "-"); and I've even tried checking with Contains(), but nothing is triggered.
How can I replace them?
Not sure if it will help, but here's my code:
private void btnConvertBinaryToPDF_Click(object sender, EventArgs e)
{
PdfDocument document = new PdfDocument(); //make new pdf document
PdfPage page = document.AddPage(); //add a page to the document
XGraphics gfx = XGraphics.FromPdfPage(page); //use this to draw/write on the specified page
XFont font = new XFont("Courier New", 10); //need a font to write with
string result = "";
string path = @"C:\Users\file";
byte[] b = new byte[1024];
UTF8Encoding temp = new UTF8Encoding(true);
FileStream fs = File.OpenRead(path);
int i = 1;
while (fs.Read(b, 0, b.Length) > 0)
{
string tmp = temp.GetString(b);
result += tmp;
b = new byte[1024]; //clear the buffer
}
if (result.Contains("?"))
{
Console.WriteLine("contains!");
}
result.Replace("컴", "-");
XTextFormatter tf = new XTextFormatter(gfx);
XRect rect = new XRect(40, 100, 500, 100);
tf.DrawString(result, font, XBrushes.Black, rect, XStringFormats.TopLeft);
string filename = "HelloWorld.pdf"; //make the filename
document.Save(filename); //save the document to the filename
Process.Start(filename); //open the file to show the document
}
EDIT: path contains binary data. I need to get the text representation of its contents. The above works fine, except in the case of ASCII characters numbered higher than 127.

It looks like you're simply making a mess of reading from the file. I'll assume that path contains text data; in which case, you might be better off simply using:
string result = File.ReadAllText(path);
optionally specifying an encoding:
string result = File.ReadAllText(path, Encoding.UTF8);
At the moment, you are:
treating more bytes as data than you read each iteration
not handling partial character reads
(there are also some inefficiencies in how you handle the string, the byte[] and the FileStream, but frankly that is moot if you're also getting the wrong answer)
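If you do want to keep a read loop, here is a minimal sketch that avoids both problems. It decodes only the bytes that Read reported, and a Decoder carries partial characters across chunk boundaries. The ChunkedReader class and its method name are made up for illustration, and the encoding is whatever the file actually uses.

```csharp
using System.IO;
using System.Text;

static class ChunkedReader
{
    // Hypothetical helper: reads a stream in fixed-size chunks and decodes it to a string.
    public static string ReadAllTextInChunks(Stream input, Encoding encoding, int bufferSize = 1024)
    {
        byte[] buffer = new byte[bufferSize];
        // Large enough for any chunk plus a character completed from the previous chunk.
        char[] chars = new char[encoding.GetMaxCharCount(bufferSize)];
        Decoder decoder = encoding.GetDecoder(); // keeps state between calls
        StringBuilder result = new StringBuilder();

        int bytesRead;
        while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Decode only the bytes that were actually read in this iteration.
            int charCount = decoder.GetChars(buffer, 0, bytesRead, chars, 0);
            result.Append(chars, 0, charCount);
        }

        // Flush whatever the decoder is still holding at the end of the stream.
        int tail = decoder.GetChars(buffer, 0, 0, chars, 0, true);
        result.Append(chars, 0, tail);
        return result.ToString();
    }
}
```

For plain text files, File.ReadAllText does all of this for you; the loop only earns its keep when the file is too large to hold in memory at once.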
Finally, your Replace call does nothing: strings are immutable, so Replace returns a new string instead of modifying result. This:
result.Replace("컴", "-");
should be:
result = result.Replace("컴", "-");
(if it is still needed)
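If the file really is binary rather than UTF-8 text, as the edit to the question says, bytes above 127 are generally not valid UTF-8 on their own and will not survive decoding. One possible workaround, sketched here under the assumption that each byte should become exactly one character, is to decode with ISO-8859-1 and replace the rule character afterwards. The byte value 0xC4 is only an assumption about what the header rule bytes are (check the file in a hex viewer), and BinaryText and its methods are hypothetical names.

```csharp
using System.Text;

static class BinaryText
{
    // ISO-8859-1 maps every byte 0x00-0xFF to the char with the same code,
    // so no byte is turned into '?' or a replacement character.
    public static string DecodeBytes(byte[] bytes)
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        return latin1.GetString(bytes);
    }

    // Replace returns a new string, so the result has to be returned (or assigned).
    public static string ReplaceRuleChars(string text, char ruleChar)
    {
        return text.Replace(ruleChar, '-');
    }
}
```

Usage would look something like: string result = BinaryText.ReplaceRuleChars(BinaryText.DecodeBytes(File.ReadAllBytes(path)), '\u00C4');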

Related

Streamed File Contains Strange Characters - An Encoding Issue

I have a WCF service endpoint that generates an Excel file and returns it as a MemoryStream so that the client can download it.
The file generated in the respective directory has no issues. I don't see any strange characters when I open and check it.
But the file I return as a MemoryStream is full of strange, unreadable characters.
My endpoint looks like this:
public Stream GetEngagementFeedFinalizeData(int workspaceId, string startDate, string endDate, Stream data)
{
try
{
string contentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet;";
string extension = "xls";
string fileName = "report-" + DateTime.Now.Ticks.ToString();
string contentDisposition = string.Format(CultureInfo.InvariantCulture, "attachment; filename={0}.{1}", fileName, extension);
WebOperationContext.Current.OutgoingResponse.ContentType = contentType;
WebOperationContext.Current.OutgoingResponse.Headers.Set("Content-Disposition", contentDisposition);
//Here is some business logic and fetching data from db. Not any encoding
//related issue. The data set is assigned to a variable
//named "feedFinalizeDataTable" in the end
feedFinalizeDataTable.TableName = "Summary";
DataSet dataSet = new DataSet();
dataSet.Tables.Add(feedFinalizeDataTable);
using (ExcelPackage excelPackage = new ExcelPackage())
{
foreach (DataTable dt in dataSet.Tables)
{
ExcelWorksheet sheet = excelPackage.Workbook.Worksheets.Add(dt.TableName);
sheet.Cells["A1"].LoadFromDataTable(dt, true);
}
var path = System.IO.Path.Combine(System.AppDomain.CurrentDomain.BaseDirectory);
var filePath = path + "\\" + "New.xls";
excelPackage.SaveAs(new System.IO.FileInfo(filePath)); //This file is flawless
FileStream fs = new FileStream(filePath, FileMode.Open);
int length = (int)fs.Length;
WebOperationContext.Current.OutgoingResponse.ContentLength = length;
byte[] buffer = new byte[length];
int sum = 0;
int count;
while ((count = fs.Read(buffer, sum, length - sum)) > 0)
{
sum += count;
}
fs.Close();
return new MemoryStream(buffer); //This file is full of unreadable chars as per above shared screenshot
}
I'm using OfficeOpenXml to generate Excel files.
Then I checked the encoding of both files by opening them in Notepad. The file in the directory (the flawless one) is shown as ANSI, and the one returned by the endpoint (the broken one) is shown as UTF-8.
After that, I tried to change the encoding of the stream like this:
var byteArray = System.IO.File.ReadAllBytes(filePath);
string fileStr = new StreamReader(new MemoryStream(byteArray), true).ReadToEnd();
var encd = Encoding.GetEncoding(1252); //On the other topics I saw that ANSI represented with 1252
var end = encd.GetBytes(fileStr);
return new MemoryStream(end);
But this doesn't help either. Some of the strange characters are replaced with other strange characters, but the streamed file is still unreadable, and when I open it in Notepad to check its encoding, it is still UTF-8.
So I'm stuck. I have also tried streaming the generated Excel file directly (without writing it to a directory and reading it back) using OfficeOpenXml's built-in GetAsByteArray() method, but the downloaded file looks exactly the same as in the screenshot above.
Thanks in advance.
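A small sketch of why the re-encoding attempt in the question cannot repair the download: a spreadsheet file is binary data, not text, and pushing arbitrary bytes through a text decoder and back is not lossless. This does not identify the cause of the broken download; it only shows that the StreamReader/GetBytes step changes the bytes. RoundTripDemo is a hypothetical name.

```csharp
using System.Linq;
using System.Text;

static class RoundTripDemo
{
    // Returns true only if decoding the bytes as text and encoding them again
    // gives back exactly the same bytes.
    public static bool SurvivesTextRoundTrip(byte[] original, Encoding encoding)
    {
        string asText = encoding.GetString(original); // invalid sequences are replaced here
        byte[] back = encoding.GetBytes(asText);
        return original.SequenceEqual(back);
    }
}
```

For pure ASCII input the round trip is harmless, but a byte such as 0xFF is not valid UTF-8, so it comes back as something else. Binary files are safer handled as byte[] from start to finish.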

Fix this is not a valid bitmap file

string sDir = @"\\Q1875G\Vehicle";
NetworkCredential NCredentials = new NetworkCredential("FOLDER_ACCESS_USER", "Welcome#2020");
using (new NetworkConnection(sDir, NCredentials))
{
string path = $"{sDir}\\483";
if (!Directory.Exists(path))
Directory.CreateDirectory(path);
string fileName = "add_274400.jpg";
path = $"{sDir}\\483\\{fileName}";
byte[] byteArrayIn = imageByteArray;
using (var ms = new MemoryStream(byteArrayIn))
{
using (var fs = new FileStream(path, FileMode.Create))
{
ms.WriteTo(fs);
}
}
}
With this code the image file gets created, but when I try to open it I get an error saying that this is not a valid bitmap file, or that its format is not currently supported.
That's not a JPEG yet; it's the bytes of a JPEG, base64 encoded and prefixed with a header that would make it suitable for plonking inline into an <img src= attribute.
The JPEG data starts at the /9j, so you'll have to do something like:
var b64jpeg = Encoding.ASCII.GetString(imageByteArray, 23, imageByteArray.Length - 23);
var jpegBytes = Convert.FromBase64String(b64jpeg);
Then write jpegBytes to a file. There is no need to put it in a MemoryStream first; just File.WriteAllBytes it.
If this imageByteArray has been delivered to you as a string (outside the code visible in the question), it would be better to keep it as a string and substring it, rather than having this "to array (in the other code), from array (in this code)" step.
Side note: you don't need if (!Directory.Exists(path)) either; Directory.CreateDirectory does nothing if the directory exists, so just call it without the Exists check
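The fixed offset of 23 matches a "data:image/jpeg;base64," prefix exactly. A slightly more forgiving sketch, assuming the header always ends at the first comma, looks for that comma instead of hard-coding the length. DataUri and DecodeDataUri are hypothetical names.

```csharp
using System;
using System.Text;

static class DataUri
{
    // Strips a "data:...;base64," header (if present) and decodes the payload.
    public static byte[] DecodeDataUri(byte[] dataUriBytes)
    {
        // Base64 and the header are plain ASCII, so this decoding is safe.
        string text = Encoding.ASCII.GetString(dataUriBytes);
        int comma = text.IndexOf(',');
        string base64 = comma >= 0 ? text.Substring(comma + 1) : text;
        return Convert.FromBase64String(base64);
    }
}
```

The result can then be saved with File.WriteAllBytes(path, DataUri.DecodeDataUri(imageByteArray));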

How can i read the end of a file in c#?

I searched on Stack Overflow and found one approach, but it only lets me write the output piece by piece to the console. My goal is to get the end of my file as one complete result, not char by char.
This code only shows me the end of my file char by char:
using (var reader = new StreamReader("file.dll"))
{
if (reader.BaseStream.Length > 1024)
{
reader.BaseStream.Seek(-1024, SeekOrigin.End);
}
string line;
while ((line = reader.ReadLine()) != null)
{
Console.WriteLine(line);
Console.ReadKey();
}
}
I was trying to get something like this; it's C++, but I want the same result in C#.
QFile *archivo;
archivo = new QFile();
archivo->setFileName("file.dll");
archivo->open(QFile::ReadOnly);
archivo->seek(archivo->size() - 1024);
trama = archivo->read(1024);
Is it possible to get the complete end of my file in C#?
If the file is a line-delimited text file, you can use ReadAllLines.
string[] lines = System.IO.File.ReadAllLines("file.txt");
If it's a binary file, you can use ReadAllBytes. Shocker, I know.
byte[] data = System.IO.File.ReadAllBytes("file.dll");
If you want to be able to seek first (e.g. if you want only the last 1024 bytes of the file), you can use the stream's Read method. Again, crazy.
reader.BaseStream.Seek(-1024, SeekOrigin.End);
var chars = new char[1024];
reader.Read(chars, 0, 1024);
And before you ask, you can convert the characters to a string by passing them to the constructor:
char[] chars = new char[1024];
string s = new string(chars);
Console.WriteLine(s);
Not sure what it'll look like, since you're reading characters from a binary file, but good luck. My guess is you should be reading bytes instead though:
reader.BaseStream.Seek(-1024, SeekOrigin.End);
var bytes = new byte[1024];
reader.BaseStream.Read(bytes, 0, 1024);
(Notice you don't even need the StreamReader, since the FileStream (your base stream) exposes the Read method you need).
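Putting the byte-based variant together without a StreamReader, a rough sketch might look like this. It also covers two cases the snippets above skip: files shorter than the requested tail, and Read returning fewer bytes than asked for. TailReader and ReadTail are hypothetical names.

```csharp
using System;
using System.IO;

static class TailReader
{
    // Returns the last 'count' bytes of a seekable stream (or the whole stream if it is shorter).
    public static byte[] ReadTail(Stream stream, int count)
    {
        int length = (int)Math.Min(count, stream.Length);
        stream.Seek(-length, SeekOrigin.End);

        byte[] tail = new byte[length];
        int total = 0;
        while (total < length)
        {
            // Read may return fewer bytes than requested, so keep going until done.
            int read = stream.Read(tail, total, length - total);
            if (read == 0) break; // unexpected end of stream
            total += read;
        }
        return tail;
    }
}
```

With a file this would be used as: using (var fs = File.OpenRead("file.dll")) { byte[] trama = TailReader.ReadTail(fs, 1024); }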

ITextSharp/Pdftk: place Base64 Image from Web on PDF as Pseudo-Signature

I am trying to conceptualize a way to get a base64 image onto an already rendered PDF in iText. The goal is to have the PDF saved to disk, then reopened to apply the "signature" in the right spot.
I haven't had any success finding other examples online, so I'm asking Stack.
My app uses .NET and C#.
Any advice on how to get started?
As @mkl mentioned, the question is a bit confusing, especially the title - usually base64 and signature do not go together. Guessing you want to place a base64 image from the web on the PDF as a pseudo-signature?
A quick working example to get you started:
static void Main(string[] args)
{
string currentDir = AppDomain.CurrentDomain.BaseDirectory;
// 'INPUT' => already rendered pdf in iText
PdfReader reader = new PdfReader(INPUT);
string outputFile = Path.Combine(currentDir, OUTPUT);
using (var stream = new FileStream(outputFile, FileMode.Create))
{
using (PdfStamper stamper = new PdfStamper(reader, stream))
{
AcroFields form = stamper.AcroFields;
var fldPosition = form.GetFieldPositions("lname")[0];
Rectangle rectangle = fldPosition.position;
string base64Image = "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==";
Regex regex = new Regex(@"^data:image/(?<mediaType>[^;]+);base64,(?<data>.*)");
Match match = regex.Match(base64Image);
Image image = Image.GetInstance(
Convert.FromBase64String(match.Groups["data"].Value)
);
// best fit if image bigger than form field
if (image.Height > rectangle.Height || image.Width > rectangle.Width)
{
image.ScaleAbsolute(rectangle);
}
// form field top left - change parameters as needed to set different position
image.SetAbsolutePosition(rectangle.Left + 2, rectangle.Top - 2);
stamper.GetOverContent(fldPosition.page).AddImage(image);
}
}
}
If you're not working with a PDF form template (AcroFields in the code snippet), explicitly set the absolute position and scale the image as needed.
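When scaling by hand, one thing to watch is the aspect ratio: scaling to a box's exact width and height will stretch a signature image that has different proportions. Here is a library-independent sketch of computing a size that fits inside a box without distortion; ImageFit and FitInside are hypothetical names, and the result would then be handed to whatever scaling call your iText version offers.

```csharp
using System;

static class ImageFit
{
    // Returns the largest size that fits inside the box while keeping the
    // image's aspect ratio. Images that already fit are left at their size.
    public static (float Width, float Height) FitInside(
        float imageWidth, float imageHeight, float boxWidth, float boxHeight)
    {
        float scale = Math.Min(1f, Math.Min(boxWidth / imageWidth, boxHeight / imageHeight));
        return (imageWidth * scale, imageHeight * scale);
    }
}
```

For example, a 200 x 100 image placed in a 100 x 100 field comes out as 100 x 50 rather than being squashed to 100 x 100.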

How to read .png file and show as text in a TextBox?

I need to open a .png file as a string and put it in a TextBox. I am trying to do that with this code:
private void button1_Click(object sender, EventArgs e)
{
OpenFileDialog dialog = new OpenFileDialog();
dialog.ShowDialog();
text1.Text = dialog.FileName;
string text = System.IO.File.ReadAllText(dialog.FileName);
text2.Text = text;
}
I need to get in my multiline textbox something like this:
‰PNG
IHDR O Ů /ç%O sRGB ®Îé gAMA ±Źüa pHYs Ă ĂÇo¨d
(IDATx^íť˝ŽŢF˛†'T°łUčĐá *ô,°'Zl˛€®b7t8—0ˇB‡ľ'(;7Tb#p$ř«Ş9ŐĹźŻŮě˙®ŻŰ{H6«ş^5ÉŤžţ0ăÉÁŰĆ#,WXš*CĆxŰ0˛aEVv¶yۨ•ť&ˇÉŘ
oFU¬5Ńć$cĽm”ÂrĄIX:čëŢ6ęałĄ)26ŔŰFKŘĽbiÚŚ
đ¶a´yŰ…éJśť}ěí“/F×XŮŇ®čëŢŕŇÎFŘ”Ň}šäL/¶ľ=ń÷ĎƦ,ÎŇ$çuq¶Młan¦Ý4)3«MĂ0®ŇŠ”™Í؆‘ś:¦jŮŰM]Śa$${eŁŻx»y;5~yĆ›˛#§i±5ÂŰŇőĹωMY
·Ň„ľ^ŕmèU` ŇDĆxŰ0Ś®8´.;ŰĽml°Âčž3š?€6gĆ’p‚+’EîłŃ 6[«ŕ
but I get only one word:
�PNG
Please, help me!
Binary data are best read with the BinaryReader. To display them in a TextBox you need to replace the 0x00 character so it won't disrupt the Text in the control.
This will replace the 0x00 character by a '.' :
using (BinaryReader br = new BinaryReader(File.Open(yourFile, FileMode.Open)))
{
var data = br.ReadChars((int)br.BaseStream.Length);
StringBuilder sb = new StringBuilder();
foreach (char c in data)
if ((int)c > 0) sb.Append(c.ToString()); else sb.Append(".");
text2.Text = sb.ToString();
}
Edit:
Your original code will also work if you modify the final assignment like this:
text2.Text = text.Replace((char)0, '.');
Explanation: In C# a string can hold arbitrary bit patterns, but the old WinForms TextBox is still the same as it was way back before C#, probably written in C++, and will not handle the old string termination character 0x0 correctly.
While the original problem is not so much the use of File.ReadAllText, it is well worth having the BinaryReader with its many interesting methods in your toolbox.
And the result is not totally useless - I just found that my test file has an embedded Photoshop ICC profile ;-)
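Note that File.ReadAllText decodes the file as UTF-8 by default, so bytes that are not valid UTF-8 (such as the leading 0x89 of a PNG) show up as the replacement character seen in the question's output. As a rough alternative sketch, assuming you just want one visible character per byte, you could read the raw bytes and map them yourself. BinaryPreview and ToDisplayText are hypothetical names.

```csharp
using System.Text;

static class BinaryPreview
{
    // Maps each byte to the char with the same code, except NUL,
    // which is shown as '.' so the TextBox does not cut the text off.
    public static string ToDisplayText(byte[] data)
    {
        StringBuilder sb = new StringBuilder(data.Length);
        foreach (byte b in data)
        {
            sb.Append(b == 0 ? '.' : (char)b);
        }
        return sb.ToString();
    }
}
```

Usage would be text2.Text = BinaryPreview.ToDisplayText(File.ReadAllBytes(dialog.FileName)); the exact glyphs shown for bytes above 127 will depend on the font and will not necessarily match what a particular editor displays.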
Not sure why you are trying to do this, but if that's what you really want, you can use a base64-encoded string.
Read an image file:
Image imgFile = Image.FromFile(dialog.FileName);
using (MemoryStream ms = new MemoryStream())
{
// Convert Image to byte[] (re-encoded as PNG; ImageFormat is in System.Drawing.Imaging)
imgFile.Save(ms, ImageFormat.Png);
byte[] imageBytes = ms.ToArray();
// Convert byte[] to Base64 String
string base64String = Convert.ToBase64String(imageBytes);
text2.Text = base64String;
}
When you read that string back, you can do the reverse and convert the base64-encoded string into an image again.
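If the goal is only a text form of the file and not a re-encoded image, a simpler sketch is to base64-encode the file's own bytes, which skips System.Drawing entirely and round-trips exactly. Base64File and its method names are hypothetical.

```csharp
using System;
using System.IO;

static class Base64File
{
    // Encodes the raw bytes of a file as a base64 string.
    public static string FileToBase64(string path)
    {
        return Convert.ToBase64String(File.ReadAllBytes(path));
    }

    // Decodes a base64 string and writes the original bytes back to a file.
    public static void Base64ToFile(string base64, string path)
    {
        File.WriteAllBytes(path, Convert.FromBase64String(base64));
    }
}
```

In the button handler this would be text2.Text = Base64File.FileToBase64(dialog.FileName);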
