Special character not reading in .txt file - c#

I am using a StreamReader to read a text file.
This is the contents of the .txt file:
</a> Schools's are a suitable public </a>
When I read that text I got:
<a>Schoolss are a suitable public<a>
As you can see, I didn't receive the quotation mark. How can I read special characters with a StreamReader?
I used following code:
using (StreamReader reader = new StreamReader(CommonGetSet.FileName, System.Text.Encoding.ASCII))
{
string text = reader.ReadToEnd();
docKeyword = XDocument.Parse(text);
}

The problem you are having is that you are trying to load a text file with an xml reader, i.e. this part:
XDocument.Load(reader);
If you look at this question: What characters do I need to escape in XML documents?, you will see other characters that will be stripped/need escaping too.
If you inspect the StreamReader in the debugger you will see that it shows the correct text, which the answer by @JinsPeter also demonstrates. So you need to read the file as plain text. The easiest way is to use either File.ReadAllText or File.ReadAllLines, depending on whether you want the result as a string or a string[] respectively:
string contents = File.ReadAllText(path);
string[] lines = File.ReadAllLines(path);
However, if for some reason you really want to use a StreamReader you can read directly from the stream using ReadToEnd, ReadLine or any other appropriate read method:
using (StreamReader reader = new StreamReader(path))
{
string contents = reader.ReadToEnd();
}
Note that the StreamReader methods read from the current position in the stream, so you may need to set the position yourself.
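A minimal sketch of that last point, using an in-memory stream instead of a file. The class and method names (RewindDemo, ReadTwice) are made up for illustration. A second ReadToEnd returns an empty string until the underlying stream is rewound and the reader's buffer is discarded:

```csharp
using System.IO;
using System.Text;

public static class RewindDemo
{
    // Reads the same content twice from one StreamReader, rewinding in between.
    // Returns: [first read, read at end of stream, read after rewinding].
    public static string[] ReadTwice(string content)
    {
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(content)))
        using (var reader = new StreamReader(stream))
        {
            string first = reader.ReadToEnd();
            string atEnd = reader.ReadToEnd();   // position is at the end: nothing left

            stream.Position = 0;                 // rewind the underlying stream
            reader.DiscardBufferedData();        // drop what the reader had buffered
            string again = reader.ReadToEnd();

            return new[] { first, atEnd, again };
        }
    }
}
```

This only works on seekable streams; a network stream, for example, cannot be rewound this way.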
For a list of other ways to read in a file in C# see this question: How to read an entire file to a string using C#?.

When I printed the same text directly from the StreamReader, I got the ' character.
So the issue is with writing it to XML or HTML. Try to fix that rather than looking for a problem in StreamReader.
using (StreamReader inputStream = new StreamReader(filepath, System.Text.Encoding.UTF8))
{
string line = inputStream.ReadToEnd();
Console.WriteLine(line);
}
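One case where the reader itself does matter: if the apostrophe in the file is a typographic one (U+2019) rather than the plain ASCII ', then Encoding.ASCII cannot represent it. That is an assumption about the file, not something the question confirms. A rough sketch to check it, with a hypothetical helper name (EncodingDemo.RoundTrip) and an in-memory stream standing in for the file:

```csharp
using System.IO;
using System.Text;

public static class EncodingDemo
{
    // Encodes the sample as UTF-8 bytes (no BOM), then decodes those bytes
    // through a StreamReader that uses the given encoding.
    public static string RoundTrip(string sample, Encoding readEncoding)
    {
        byte[] bytes = new UTF8Encoding(false).GetBytes(sample);
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, readEncoding,
                                             detectEncodingFromByteOrderMarks: false))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With the sample "School\u2019s", RoundTrip(sample, Encoding.UTF8) returns the text unchanged, while RoundTrip(sample, Encoding.ASCII) no longer contains the U+2019 character.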

Related

StreamReader adds unwanted backslashes

I am reading a file that contains other file paths and I am putting it into a StringReader:
var inputReader = new StreamReader(input);
var reader = new StringReader(inputReader.ReadToEnd());
When I debug the StreamReader.ReadToEnd() call and look at the whole text, everything seems right. The paths have this form: C:\\User\\ ....
However when i now read each line in the file with:
string line = reader.ReadLine();
The paths have the following form: C:\\\\User\\\\ ....
Any suggestions on how to fix that? Thank you in advance!

C# creating Word file - Error opening file

I am creating some Word files (and replacing some words) from a Word template using this code snippet:
File.Copy(sourceFile, destinationFile, true);
try
{
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(destinationFile, true))
{
string docText = null;
using (StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
{
docText = sr.ReadToEnd();
}
foreach (KeyValuePair<string, string> item in keyValues)
{
Regex regexText = new Regex(item.Key);
docText = regexText.Replace(docText, item.Value);
}
using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
{
sw.Write(docText);
}
}
}
But when I am trying to open the produced file I get this error:
We're sorry. We can't open XXX.docx because we found a problem with its contents
XML parsing error Location:Part:/word/document.xml, Line:2, Column:33686
Here are the contents of document.xml in the specific place:
.....<w:szCs w:val="20"/><w:lang w:val="en-US"/></w:rPr><w:t> </w:t></w:r><w:proofErr w:type="spellEnd"/>....
Where 33686 is the position of the &nbsp; character. How can I fix this problem?
EDIT: In another file that was produced correctly, the same position holds some random characters I used for testing, which also appear in the title of the document.
It looks like you're using regular expressions to directly modify XML, which is typically going to lead to difficult-to-troubleshoot issues like this, especially if any of your regexes match anything that could be interpreted as XML.
As an alternative, you may want to investigate the WordprocessingDocument class more deeply. It looks like it has strongly-typed objects like Paragraph that you can modify more safely.
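To illustrate the idea of editing text nodes instead of raw markup, here is a sketch that uses System.Xml.Linq on the XML of document.xml rather than the Open XML SDK's typed classes (a swapped-in technique, chosen so the example has no package dependency). The helper name RunTextReplacer is hypothetical. It assumes each placeholder sits entirely inside a single w:t element, which Word does not guarantee:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class RunTextReplacer
{
    // Main WordprocessingML namespace, normally bound to the "w" prefix.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Replaces keys with values inside w:t text only; tags and attributes are never touched.
    public static string ReplaceInTextRuns(string documentXml,
                                           IDictionary<string, string> keyValues)
    {
        XDocument doc = XDocument.Parse(documentXml, LoadOptions.PreserveWhitespace);

        // ToList() so the tree is not modified while it is being enumerated.
        foreach (XElement t in doc.Descendants(W + "t").ToList())
        {
            foreach (KeyValuePair<string, string> item in keyValues)
            {
                // Plain string replacement; XElement escapes &, < and > when serialized.
                t.Value = t.Value.Replace(item.Key, item.Value);
            }
        }

        // Note: ToString() does not emit the XML declaration.
        return doc.ToString(SaveOptions.DisableFormatting);
    }
}
```

Because the replacement goes through XElement.Value, a value such as "Tom & Jerry" ends up as "Tom &amp; Jerry" in the output, so the part stays parseable.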

streamReader.ReadToEnd() return just header OpenXML

I want to find a word and replace it with another word in a Word document using OpenXML.
I use this method:
public static void AddTextToWord(string filepath, string txtToFind,string ReplaceTxt)
{
WordprocessingDocument wordDoc = WordprocessingDocument.Open(filepath, true);
string docText = null;
StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream());
docText = sr.ReadToEnd();
System.Diagnostics.Debug.WriteLine(docText);
Regex regexText = new Regex(txtToFind);
docText = regexText.Replace(docText, ReplaceTxt);
StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create));
System.Diagnostics.Debug.WriteLine(docText);
wordDoc.Close();
}
but docText returns just the head of the page, that is, the XML schema of the document:
<?xml version="1.0" encoding=.......
Check Your Strings
If you want to replace a specific word or phrase within your existing content, you may just want to use the String.Replace() method as opposed to performing a Regex.Replace() which may not work as expected (as it expects a regular expression as opposed to a traditional string). This may not matter if you expect to use regular expressions, but it's worth noting.
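A small sketch of the difference, with hypothetical helper names. A search term such as "1.0" contains a dot, which a regular expression treats as "any character", so Regex.Replace can match more than the literal text:

```csharp
using System.Text.RegularExpressions;

public static class ReplaceDemo
{
    // Literal replacement: the search term is taken as-is.
    public static string WithStringReplace(string text, string find, string replacement)
        => text.Replace(find, replacement);

    // Pattern replacement: the search term is interpreted as a regular expression.
    public static string WithRegex(string text, string find, string replacement)
        => Regex.Replace(text, find, replacement);

    // Regex made literal: escape the pattern, and double any '$' in the replacement
    // so it is not read as a substitution such as $1.
    public static string WithEscapedRegex(string text, string find, string replacement)
        => Regex.Replace(text, Regex.Escape(find), replacement.Replace("$", "$$"));
}
```

For the input "v1.0 and v1x0" with find "1.0" and replacement "X", WithStringReplace and WithEscapedRegex give "vX and v1x0", while WithRegex gives "vX and vX".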
Ensure You Are Pulling The Content
Word Documents are obviously not as easy to parse as plain text, so in order to get the actual "content", you may have to use an approach similar to the one mentioned here that targets the Document.Body properties instead of reading using a StreamReader object :
docText = wordDoc.MainDocumentPart.Document.Body.InnerText;
Performing Your Replacement
With that said, you currently appear to be reading the contents of your file and storing it in a string called docText. Since you have that string and know your values to find and replace, just call the Replace() method as seen below :
docText = docText.Replace(txtToFind,ReplaceTxt);
Writing Out Your Content
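After performing the replacement, you'll just need to write your updated text out to a stream. Keep in mind that the main document part has to remain valid XML, so write back the part's XML with the replacement applied rather than the plain Body.InnerText: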
using (var sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
{
sw.Write(docText);
}

csvReader skips characters

I am reading a CSV file in an ASP.NET web application to produce a report. The CsvReader does not read special characters such as ± or Σ.
var avar = FileUploader.PostedFile.FileName;
var myfile = File.OpenText(avar);
CsvReader csv = new CsvReader(myfile);
data = csv.GetRecords<T>().ToList();
The reader skips the special characters mentioned above. Every other character is read, including the characters surrounding the special ones. Can anyone tell me how to fix this? Thanks.
I use GetEncoding from this link Effective way to find any file's Encoding to find the encoding of my file.
then, I set the configurations:
CsvConfiguration config = new CsvConfiguration();
config.Delimiter = ",";
Encoding enc = GetEncoding(FileUploader.PostedFile.FileName);
config.Encoding = enc;
config.HasHeaderRecord = true;
config.QuoteNoFields = true;
Next, I use a FileStream to load file and send it to a StreamReader.
FileStream stream = File.OpenRead(FileUploader.PostedFile.FileName);
StreamReader reader = new StreamReader(stream, Encoding.GetEncoding(enc.HeaderName));
CsvReader csv = new CsvReader(reader, config);
datas = csv.GetRecords<T>().ToList();
All characters are readable when I load a file. is an IEnumerable class
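If the file carries a byte order mark, StreamReader can also pick the encoding up by itself, which may make a separate detection step unnecessary. A sketch under that assumption (the helper name BomAwareReader is made up; files without a BOM fall back to whatever encoding is passed in):

```csharp
using System.IO;
using System.Text;

public static class BomAwareReader
{
    // Opens the file with BOM detection enabled. If a UTF-8/UTF-16/UTF-32 BOM is
    // present it wins; otherwise the fallback encoding is used for decoding.
    public static string ReadAll(string path, Encoding fallback)
    {
        using (var reader = new StreamReader(path, fallback,
                                             detectEncodingFromByteOrderMarks: true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

A StreamReader constructed the same way could be handed to CsvReader in place of the one built from File.OpenText, which always assumes UTF-8 when no BOM is found.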

Replace found strings with new strings?

I have an open file dialog that opens an XML file. The regular expression finds every string between > and < and writes each string on a new line in the rich text box.
private void button1_Click(object sender, EventArgs e)
{
if (openFileDialog1.ShowDialog() == DialogResult.OK)
{
StreamReader sr = new StreamReader(openFileDialog1.FileName);
string s = sr.ReadToEnd();
richTextBox1.Text = s;
}
string txt = richTextBox1.Text;
var foundWords = Regex.Matches(txt, @"(?<=>)([\w ]+?)(?=<)");
richTextBox1.Text = string.Join("\n", foundWords.Cast<Match>().Select(x => x.Value).ToArray());
}
Then I can change those strings. But how can I import those changed strings back to original XML file on its same place?
You could try to replace these strings inside a file, but once you replace something with a different length, it would be simpler to just write the entire file instead.
It looks like the user is able to modify these strings - that's your challenge there: you will have to keep track of which word was where in the original file to replace them back into the data. Furthermore the user is able to remove or add lines to the textbox, what would your application do in that case?
It would be easier to process the xml file using XDocument and store the XElements that contain the original values. XDocument allows you to replace these values and store the file.
Note that since you're not explicitly closing the StreamReader, the file may still be in use when you try to write it. Simply put the StreamReader in a using block to prevent this.
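A rough sketch of the XDocument approach, with hypothetical names (XmlTextEditor, FindTextElements, ApplyEdits). It keeps the XElements in document order so that line i of the text box maps back to element i, and it refuses to apply the edits if the user added or removed lines:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlTextEditor
{
    // Elements that contain only text (no child elements), in document order.
    public static List<XElement> FindTextElements(XDocument doc)
    {
        return doc.Descendants()
                  .Where(e => !e.HasElements && !string.IsNullOrWhiteSpace(e.Value))
                  .ToList();
    }

    // Writes the edited lines back into the stored elements.
    // Returns false (and changes nothing) if the number of lines no longer matches.
    public static bool ApplyEdits(IList<XElement> targets, IList<string> editedLines)
    {
        if (targets.Count != editedLines.Count)
        {
            return false;
        }

        for (int i = 0; i < targets.Count; i++)
        {
            targets[i].Value = editedLines[i];
        }
        return true;
    }
}
```

In the button handler this would mean loading the file with XDocument.Load, showing the Value of each element from FindTextElements in the text box, and calling ApplyEdits followed by doc.Save(fileName) once the user is done.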
