How to convert the text direction to normal - c#

I am reading an Arabic-localized PDF document in memory using C#, and after reading the text I get it like this:
٩٠/٤٠/٧٣٤١ ٩١/١٠/٦١٠٢
but the correct direction of this text in pdf is ٢٠١٦/٠١/١٩ ١٤٣٧/٠٤/٠٩
Can somebody please guide me on how to change this text to the proper direction, as it appears in the PDF?
Edit
This is the function I am using. I am using DevExpress Document Server, and I am skipping up to line 36 because I do not need the data before it.
private void button1_Click(object sender, EventArgs e)
{
using (var documentStream = new FileStream(@"D:\Data\Projects\DotNet\ElectricBillReader\electricbill.pdf", FileMode.Open, FileAccess.Read))
{
using (PdfDocumentProcessor documentProcessor = new PdfDocumentProcessor())
{
documentProcessor.LoadDocument(documentStream);
using (var sr = new StringReader(documentProcessor.Text))
{
var counter = 0;
string line = string.Empty;
do
{
line = sr.ReadLine();
if (counter > 36)
{
if (line != null)
{
}
}
counter++;
} while (line!=null);
}
}
}
}

You're going to need a library that implements the Unicode bidirectional algorithm. I'm not aware of any library that does this for .NET, but there's an effort to port ICU to .NET here
Also, check this out: https://sourceforge.net/projects/nbidi/

Did you think about just reversing the string?
public static string Reverse( string s )
{
char[] charArray = s.ToCharArray();
Array.Reverse( charArray );
return new string( charArray );
}
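If plain reversal is good enough for your data, a slightly safer variant reverses by text element rather than by char, so that combining marks and surrogate pairs stay attached to their base character. This is only a sketch: the class and method names are made up for illustration, and it still flips any left-to-right runs (such as Latin words) in the line, which a real bidirectional-algorithm implementation would not do.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class TextDirection
{
    // Reverses a string one text element (grapheme) at a time, so a base
    // character plus its combining marks, or a surrogate pair, is kept intact.
    public static string ReverseTextElements(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));

        var elements = new List<string>();
        TextElementEnumerator e = StringInfo.GetTextElementEnumerator(s);
        while (e.MoveNext())
        {
            elements.Add(e.GetTextElement());
        }

        elements.Reverse();
        return string.Concat(elements);
    }
}
```

You would call it on each line returned by sr.ReadLine() inside the loop, for example `line = TextDirection.ReverseTextElements(line);`.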

Related

How can I read the first character of the first line with StreamReader? (Example: the letter H in the txt file)

I need to read the first char from the first line and then others in a streamreader and if statements.
private void buttonEdit_Click(object sender, EventArgs e)
{
NewSalariedEmployee newSalariedEmployee = new NewSalariedEmployee();
HourlyEmployeeDetails hourlyEmployeeDetails = new HourlyEmployeeDetails();
employees = new Employees();
string firstLine;
using (StreamReader reader = new StreamReader("employees.txt"))
{
firstLine = reader.ReadLine();
}
if (firstLine != null && firstLine.StartsWith("H"))
{
hourlyEmployeeDetails.ShowDialog();
}
}
The StreamReader.Read method that "Reads the next character from the input stream and advances the character position by one character." is not what I'm looking for...
I'm guessing a bit at what you are trying to do, but if you want to read each line in a file and process the first character of each line, you just need a loop that keeps reading lines until you have read them all. You can use the Peek method to see whether there is more to read.
private void buttonEdit_Click(object sender, EventArgs e)
{
NewSalariedEmployee newSalariedEmployee = new NewSalariedEmployee();
HourlyEmployeeDetails hourlyEmployeeDetails = new HourlyEmployeeDetails();
employees = new Employees();
using (StreamReader reader = new StreamReader("employees.txt"))
{
// see if there is more to read from the reader
while (reader.Peek() > -1)
{
// read the next line and perform your logic on the first char
string currentLine = reader.ReadLine();
if (currentLine != null && currentLine.StartsWith("H"))
{
hourlyEmployeeDetails.ShowDialog();
}
}
}
}
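If only the first character of the first line matters, the check can be pulled into a small helper. This is a sketch under that assumption; the class and method names are hypothetical, and it takes a TextReader so it works with a StreamReader over employees.txt as well as with an in-memory string.

```csharp
using System;
using System.IO;

public static class FirstCharReader
{
    // Returns the first character of the first line, or null when the
    // reader has no content or the first line is empty.
    public static char? FirstCharOfFirstLine(TextReader reader)
    {
        if (reader == null) throw new ArgumentNullException(nameof(reader));

        string firstLine = reader.ReadLine();
        if (string.IsNullOrEmpty(firstLine))
        {
            return null;
        }
        return firstLine[0];
    }
}
```

In the click handler this could be used as `if (FirstCharReader.FirstCharOfFirstLine(reader) == 'H') { hourlyEmployeeDetails.ShowDialog(); }`.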

How can I read a Lync conversation file containing HTML?

I'm having trouble reading a local file into a string in C#.
Here's what I have come up with so far:
string file = @"C:\script_test\{5461EC8C-89E6-40D1-8525-774340083829}.html";
using (StreamReader reader = new StreamReader(file))
{
string line = "";
while ((line = reader.ReadLine()) != null)
{
textBox1.Text += line.ToString();
}
}
And it's the only solution that seems to work.
I've tried some other suggested methods for reading a file, such as:
string file = @"C:\script_test\{5461EC8C-89E6-40D1-8525-774340083829}.html";
string html = File.ReadAllText(file).ToString();
textBox1.Text += html;
Yet it does not work as expected.
Here are the first few lines of the file I'm trying to read:
As you can see, it has some funky characters; honestly, I don't know if that's the cause of this weird behavior.
But in the first case, the code seems to skip those lines, printing only "Document generated by Office Communicator..."
Your task would be easier if you could use an API or the SDK, or even had a description of the format you are trying to read. However, the binary format does not look that complicated, and with a hex viewer installed I got far enough to get the HTML out of the example you provided.
To parse non-text files you fall back to BinaryReader and use one of its Read methods to read the correct type from the byte stream. I used ReadByte and ReadInt32. Notice that each method's description states how many bytes are read; that comes in handy when you try to decipher your file.
private string ParseHist(string file)
{
using (var f = File.Open(file, FileMode.Open))
{
using (var br = new BinaryReader(f))
{
// read 4 bytes as an int
var first = br.ReadInt32();
// read integer / zero ended byte arrays as string
var lead = br.ReadInt32();
// until we have 4 zero bytes
while (lead != 0)
{
var user = ParseString(br);
Trace.Write(lead);
Trace.Write(":");
Trace.Write(user.Length);
Trace.Write(":");
Trace.WriteLine(user);
lead = br.ReadInt32();
// weird special case
if (lead == 2)
{
lead = br.ReadInt32();
}
}
// at the start of the html block
var htmllen = br.ReadInt32();
Trace.WriteLine(htmllen);
// parse the html
var html = ParseString(br);
Trace.Write(htmllen);
Trace.Write(":");
Trace.Write(html.Length);
Trace.Write(":");
Trace.WriteLine(html);
// other structures follow, left unparsed
return html.ToString();
}
}
}
// a string seems to be ascii encoded and ends with a zero byte.
private static string ParseString(BinaryReader br)
{
var ch = br.ReadByte();
var sb = new StringBuilder();
while (ch != 0)
{
sb.Append((char)ch);
ch = br.ReadByte();
}
return sb.ToString();
}
You could use the simple parsing logic in a winform application as follows:
private void button1_Click(object sender, EventArgs e)
{
webBrowser1.DocumentText = ParseHist(@"5461EC8C-89E6-40D1-8525-774340083829-Copia.html");
}
Keep in mind that this is not bulletproof or the recommended way, but it should get you started. For files that don't parse well, you'll need to go back to the hex viewer and work out which byte structures are new or different from what you already had. That is not something I intend to help you with; it is left as an exercise for you to figure out.
I don't know if it's the right way to answer this, but here's what I've managed to do so far:
string file = @"C:\script_test\{1C0365BC-54C6-4D31-A1C1-586C4575F9EA}.hist";
string outText = "";
//Encoding iso = Encoding.GetEncoding("ISO-8859-1");
Encoding utf8 = Encoding.UTF8;
StreamReader reader = new StreamReader(file, utf8);
char[] text = reader.ReadToEnd().ToCharArray();
//skip first n chars
/*
for (int i = 250; i < text.Length; i++)
{
outText += text[i];
}
*/
for (int i = 0; i < text.Length; i++)
{
//skips non printable characters
if (!Char.IsControl(text[i]))
{
outText += text[i];
}
}
string source = "";
source = WebUtility.HtmlDecode(outText);
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc.LoadHtml(source);
string html = "<html><style>";
foreach (HtmlNode node in htmlDoc.DocumentNode.SelectNodes("//style"))
{
html += node.InnerHtml+ Environment.NewLine;
}
html += "</style><body>";
foreach (HtmlNode node in htmlDoc.DocumentNode.SelectNodes("//body"))
{
html += node.InnerHtml + Environment.NewLine;
}
html += "</body></html>";
richTextBox1.Text += html+Environment.NewLine;
webBrowser1.DocumentText = html;
The conversation displays correctly, both style and encoding.
So it's a start for me.
Thank you all for the support!
EDIT
Char.IsControl(char)
skips non printable characters :)
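The filtering step can also be written with a StringBuilder, which avoids building a new string on every `+=`. This is a sketch of the same Char.IsControl idea, not a change to the approach; the class and method names are invented for the example.

```csharp
using System;
using System.Text;

public static class ControlCharFilter
{
    // Copies every character that is not a control character
    // (for example NUL, tab, CR, LF) into the result.
    public static string StripControlChars(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            if (!char.IsControl(c))
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

Note that this also removes line breaks, just as the loop above does.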

Merging 2 Text Files in C#

Firstly, I'd just like to mention that I only started learning C# a few days ago, so my knowledge of it is limited.
I'm trying to create a program that will parse text files for certain phrases input by the user and then output them into a new text document.
At the moment, the program searches the original input file for the text entered by the user, copies the matching lines out into new text files, merges those files together, and then deletes them.
I'm guessing that this is not the most efficient way of doing this, but I wrote it so that it works in a logical manner that I can understand as a novice.
The code is as follows:
private void TextInput1()
{
using (StreamReader fileOpen = new StreamReader(txtInput.Text))
{
using (StreamWriter fileWrite = new StreamWriter(@"*DIRECTORY*\FIRSTFILE.txt"))
{
string file;
while ((file = fileOpen.ReadLine()) != null)
{
if (file.Contains(txtFind.Text))
{
fileWrite.Write(file + "\r\n");
}
}
}
}
}
private void TextInput2()
{
using (StreamReader fileOpen = new StreamReader(txtInput.Text))
{
using (StreamWriter fileWrite = new StreamWriter(@"*DIRECTORY*\SECONDFILE.txt"))
{
string file;
while ((file = fileOpen.ReadLine()) != null)
{
if (file.Contains(txtFind2.Text))
{
fileWrite.Write("\r\n" + file);
}
}
}
}
}
private static void Combination()
{
ArrayList fileArray = new ArrayList();
using (StreamWriter writer = File.CreateText(@"*DIRECTORY*\FINALOUTPUT.txt"))
{
using (StreamReader reader = File.OpenText(@"*DIRECTORY*\FIRSTFILE.txt"))
{
writer.Write(reader.ReadToEnd());
}
using (StreamReader reader = File.OpenText(@"*DIRECTORY*\SECONDFILE.txt"))
{
writer.Write(reader.ReadToEnd());
}
}
}
private static void Delete()
{
if (File.Exists(@"*DIRECTORY*\FIRSTFILE.txt"))
{
File.Delete(@"*DIRECTORY*\FIRSTFILE.txt");
}
if (File.Exists(@"*DIRECTORY*\SECONDFILE.txt"))
{
File.Delete(@"*DIRECTORY*\SECONDFILE.txt");
}
}
The output file currently contains all of the first input's matches followed by all of the second's. Is it possible to merge them into one file, one line at a time? The data is sequential, so I need a line from input 1 followed by a line from input 2, rather than all of 1 and then all of 2.
Thanks, Neil.
To combine the contents of the two files into one merged file line by line, you could replace your Combination() code with this:
string[] file1 = File.ReadAllLines(@"*DIRECTORY*\FIRSTFILE.txt");
string[] file2 = File.ReadAllLines(@"*DIRECTORY*\SECONDFILE.txt");
using (StreamWriter writer = File.CreateText(@"*DIRECTORY*\FINALOUTPUT.txt"))
{
int lineNum = 0;
while(lineNum < file1.Length || lineNum < file2.Length)
{
if(lineNum < file1.Length)
writer.WriteLine(file1[lineNum]);
if(lineNum < file2.Length)
writer.WriteLine(file2[lineNum]);
lineNum++;
}
}
This also handles the case where the two files don't contain the same number of lines.
Try this method. It takes three paths: file 1, file 2, and the output file.
public void MergeFiles(string pathFile1, string pathFile2, string pathResult)
{
File.WriteAllText(pathResult, File.ReadAllText(pathFile1) + File.ReadAllText(pathFile2));
}
If the pathResult file exists, the WriteAllText method will overwrite it. Remember to include the System.IO namespace.
Important: this is not recommended for large files! Use one of the other options available in this thread.
If your input files are quite large and you run out of memory, you could also try wrapping the two readers like this:
using (StreamWriter writer = File.CreateText(@"*DIRECTORY*\FINALOUTPUT.txt"))
{
using (StreamReader reader1 = File.OpenText(@"*DIRECTORY*\FIRSTFILE.txt"))
{
using (StreamReader reader2 = File.OpenText(@"*DIRECTORY*\SECONDFILE.txt"))
{
string line1 = null;
string line2 = null;
while ((line1 = reader1.ReadLine()) != null)
{
writer.WriteLine(line1);
line2 = reader2.ReadLine();
if(line2 != null)
{
writer.WriteLine(line2);
}
}
}
}
}
Note that this loop stops as soon as the first file runs out, so any extra lines in the second file are dropped; you still need some idea of how many lines each input has. It should give you the general idea of how to proceed.
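One way to avoid losing trailing lines when the files differ in length is to keep looping until both readers are exhausted. The following is a minimal sketch of that idea; the class and method names are hypothetical, and it works on TextReader and TextWriter so the same code accepts File.OpenText and File.CreateText streams.

```csharp
using System;
using System.IO;

public static class LineMerger
{
    // Writes lines alternately from the two readers. When one reader runs
    // out, the remaining lines of the other are still written.
    public static void Interleave(TextReader first, TextReader second, TextWriter output)
    {
        string line1 = first.ReadLine();
        string line2 = second.ReadLine();

        while (line1 != null || line2 != null)
        {
            if (line1 != null)
            {
                output.WriteLine(line1);
                line1 = first.ReadLine();
            }
            if (line2 != null)
            {
                output.WriteLine(line2);
                line2 = second.ReadLine();
            }
        }
    }
}
```

Only one line per file is held in memory at a time, so this should cope with large inputs.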
Using a FileInfo extension you could merge one or more files by doing the following:
public static class FileInfoExtensions
{
public static void MergeFiles(this FileInfo fi, string strOutputPath , params string[] filesToMerge)
{
var fiLines = File.ReadAllLines(fi.FullName).ToList();
fiLines.AddRange(filesToMerge.SelectMany(file => File.ReadAllLines(file)));
File.WriteAllLines(strOutputPath, fiLines.ToArray());
}
}
Usage
FileInfo fi = new FileInfo("input");
fi.MergeFiles("output", "File2", "File3");
I appreciate this question is almost old enough to (up)vote (itself), but for an extensible approach:
const string FileMergeDivider = "\n\n";
public void MergeFiles(string outputPath, params string[] inputPaths)
{
if (!inputPaths.Any())
throw new ArgumentException(nameof(inputPaths) + " required");
if (inputPaths.Any(string.IsNullOrWhiteSpace) || !inputPaths.All(File.Exists))
throw new ArgumentException("contains invalid path(s)", nameof(inputPaths));
File.WriteAllText(outputPath, string.Join(FileMergeDivider, inputPaths.Select(File.ReadAllText)));
}

Delete Lines in a textfile

Hi, I have a text file containing a table schema and data. When the user indicates that no schema is required, I need to delete the schema and keep the data. I am using StreamReader to read the file and check a condition; every line should be deleted until that condition is satisfied.
Say I am checking like this:
using (StreamReader tsr = new StreamReader(targetFilePath))
{
do
{
string textLine = tsr.ReadLine() + "\r\n";
{
if (textLine.StartsWith("INSERT INTO"))
{
// It should leave these lines
// and no need to delete lines
}
else
{
// it should delete the lines
}
}
}
while (tsr.Peek() != -1);
tsr.Close();
}
Please suggest how to delete the lines. Note that once a line starting with "INSERT INTO" is found, nothing from that point on should be deleted.
Write only the required lines to a second file and, at the end of the process, delete the original file and rename the new one to the target name.
using (StreamReader tsr = new StreamReader(targetFilePath))
{
using (StreamWriter tsw = File.CreateText(targetFilePath+"_temp"))
{
string currentLine;
while((currentLine = tsr.ReadLine()) != null)
{
if(currentLine.StartsWith("A long time ago, in a far far away galaxy ..."))
{
tsw.WriteLine(currentLine);
}
}
}
}
File.Delete(targetFilePath);
File.Move(targetFilePath+"_temp",targetFilePath);
You could use Linq:
File.WriteAllLines(targetFilePath, File.ReadAllLines(targetFilePath).Where(x => x.StartsWith("INSERT INTO")));
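The Where filter keeps only lines that start with "INSERT INTO". If the intent is instead to drop everything before the first such line and keep everything after it untouched, as the question describes, SkipWhile may be a closer fit. This is a sketch under that reading of the question; the class and method names are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SchemaStripper
{
    // Skips lines until the first one that starts with "INSERT INTO",
    // then yields that line and every line after it unchanged.
    public static IEnumerable<string> DropLeadingSchema(IEnumerable<string> lines)
    {
        return lines.SkipWhile(
            l => !l.StartsWith("INSERT INTO", StringComparison.Ordinal));
    }
}
```

It could be applied as `File.WriteAllLines(targetFilePath, SchemaStripper.DropLeadingSchema(File.ReadAllLines(targetFilePath)).ToArray());`. File.ReadAllLines loads the whole file before the write begins, so writing back to the same path is safe here.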
You read in the file just the same way you were doing. However, if the line doesn't contain what you are looking for, you simply skip it. In the end, whatever data you are left over with you then write to a new text file.
private void button1_Click(object sender, EventArgs e)
{
StringBuilder newText = new StringBuilder();
using (StreamReader tsr = new StreamReader(targetFilePath))
{
string textLine;
// ReadLine strips the line terminator, so append exactly one per kept line
while ((textLine = tsr.ReadLine()) != null)
{
if (textLine.StartsWith("INSERT INTO"))
{
newText.AppendLine(textLine);
}
}
}
// the using block flushes and closes the writer
using (var w = new System.IO.StreamWriter(@"C:\newFile.txt"))
{
w.Write(newText.ToString());
}
}

Delete specific line from a text file?

I need to delete an exact line from a text file, but I cannot for the life of me work out how to go about doing this.
Any suggestions or examples would be greatly appreciated.
Related Questions
Efficient way to delete a line from a text file (C#)
If the line you want to delete is based on the content of the line:
string line = null;
string line_to_delete = "the line i want to delete";
using (StreamReader reader = new StreamReader("C:\\input")) {
using (StreamWriter writer = new StreamWriter("C:\\output")) {
while ((line = reader.ReadLine()) != null) {
if (String.Compare(line, line_to_delete) == 0)
continue;
writer.WriteLine(line);
}
}
}
Or if it is based on line number:
string line = null;
int line_number = 0;
int line_to_delete = 12;
using (StreamReader reader = new StreamReader("C:\\input")) {
using (StreamWriter writer = new StreamWriter("C:\\output")) {
while ((line = reader.ReadLine()) != null) {
line_number++;
if (line_number == line_to_delete)
continue;
writer.WriteLine(line);
}
}
}
The best way to do this is to open the file in text mode, read each line with ReadLine(), and then write it to a new file with WriteLine(), skipping the one line you want to delete.
There is no generic delete-a-line-from-file function, as far as I know.
One way to do it if the file is not very big is to load all the lines into an array:
string[] lines = File.ReadAllLines("filename.txt");
string[] newLines = RemoveUnnecessaryLine(lines);
File.WriteAllLines("filename.txt", newLines);
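RemoveUnnecessaryLine is not defined in the snippet above. One possible shape for such a helper, assuming the line to drop is identified by its exact content, is sketched below; the class name, method name, and second parameter are illustrative choices, not part of the original answer.

```csharp
using System;
using System.Linq;

public static class LineFilter
{
    // Returns a new array without any line that exactly equals lineToDelete
    // (ordinal, case-sensitive comparison).
    public static string[] RemoveLine(string[] lines, string lineToDelete)
    {
        if (lines == null) throw new ArgumentNullException(nameof(lines));

        return lines
            .Where(l => !string.Equals(l, lineToDelete, StringComparison.Ordinal))
            .ToArray();
    }
}
```

Every matching line is removed, not only the first occurrence.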
Hope this simple and short code will help.
List<string> linesList = File.ReadAllLines("myFile.txt").ToList();
linesList.RemoveAt(0);
File.WriteAllLines("myFile.txt", linesList.ToArray());
OR use this
public void DeleteLinesFromFile(string strLineToDelete)
{
string strFilePath = "Provide the path of the text file";
string strSearchText = strLineToDelete;
string strOldText;
string n = "";
StreamReader sr = File.OpenText(strFilePath);
while ((strOldText = sr.ReadLine()) != null)
{
if (!strOldText.Contains(strSearchText))
{
n += strOldText + Environment.NewLine;
}
}
sr.Close();
File.WriteAllText(strFilePath, n);
}
You can actually use C# generics for this to make it real easy:
var file = new List<string>(System.IO.File.ReadAllLines("C:\\path"));
file.RemoveAt(12);
File.WriteAllLines("C:\\path", file.ToArray());
This can be done in three steps:
// 1. Read the content of the file
string[] readText = File.ReadAllLines(path);
// 2. Empty the file
File.WriteAllText(path, String.Empty);
// 3. Fill up again, but without the deleted line
using (StreamWriter writer = new StreamWriter(path))
{
foreach (string s in readText)
{
if (!s.Equals(lineToBeRemoved))
{
writer.WriteLine(s);
}
}
}
Read and remember each line
Identify the one you want to get rid of
Forget that one
Write the rest back over the top of the file
I cared about the file's original end-of-line characters ("\n" or "\r\n") and wanted to maintain them in the output file, rather than overwrite them with whatever the current environment's characters are, as the other answers appear to do. So I wrote my own method to read a line without removing the end-of-line characters, then used it in my DeleteLines method (I wanted the option to delete multiple lines, hence the collection of line numbers to delete).
DeleteLines was implemented as a FileInfo extension and ReadLineKeepNewLineChars a StreamReader extension (but obviously you don't have to keep it that way).
public static class FileInfoExtensions
{
public static FileInfo DeleteLines(this FileInfo source, ICollection<int> lineNumbers, string targetFilePath)
{
var lineCount = 1;
using (var streamReader = new StreamReader(source.FullName))
{
using (var streamWriter = new StreamWriter(targetFilePath))
{
string line;
while ((line = streamReader.ReadLineKeepNewLineChars()) != null)
{
if (!lineNumbers.Contains(lineCount))
{
streamWriter.Write(line);
}
lineCount++;
}
}
}
return new FileInfo(targetFilePath);
}
}
public static class StreamReaderExtensions
{
private const char EndOfFile = '\uffff';
/// <summary>
/// Reads a line, similar to ReadLine method, but keeps any
/// new line characters (e.g. "\r\n" or "\n").
/// </summary>
public static string ReadLineKeepNewLineChars(this StreamReader source)
{
if (source == null)
throw new ArgumentNullException(nameof(source));
char ch = (char)source.Read();
if (ch == EndOfFile)
return null;
var sb = new StringBuilder();
while (ch != EndOfFile)
{
sb.Append(ch);
if (ch == '\n')
break;
ch = (char)source.Read();
}
return sb.ToString();
}
}
Are you on a Unix operating system?
You can do this with the "sed" stream editor. Read the man page for "sed"
Open the file, seek to the position of the line, then erase the line through the stream by writing null.
It is simple and fast, and because it streams there is no array eating memory.
This works in VB. For example, search for the line culture=id, where culture is the name and id is the value, and change it to culture=en.
Fileopen(1, "text.ini")
dim line as string
dim currentpos as long
while true
line = lineinput(1)
dim namevalue() as string = split(line, "=")
if namevalue(0) = "line name value that i want to edit" then
currentpos = seek(1)
fileclose()
dim fs as filestream("test.ini", filemode.open)
dim sw as streamwriter(fs)
fs.seek(currentpos, seekorigin.begin)
sw.write(null)
sw.write(namevalue + "=" + newvalue)
sw.close()
fs.close()
exit while
end if
msgbox("org ternate jua bisa, no line found")
end while
that's all..use #d
