Reading a text file from Unity3d - c#

I have an error in a script which reads from a text file outside the program.
The error is
FormatException: Input string was not in the correct format
It's obvious what's wrong, but I just don't understand why it can't read the file properly.
My code:
using (FileStream fs = new FileStream(#"D:\Program Files (x86)\Steam\SteamApps\common\blabla...to my file.txt))
{
    byte[] b = new byte[1024];
    UTF8Encoding temp = new UTF8Encoding(true);
    while (fs.Read(b, 0, b.Length) > 0)
    {
        //Debug.Log(temp.GetString(b));
        var converToInt = int.Parse(temp.GetString(b));
        externalAmount = converToInt;
    }
    fs.Close();
}
The text file has 4 lines of values.
Each line represents an object in the game. All I am trying to do is read these values. Unfortunately I get the error, which I can't explain.
So how can I read new lines without getting the error?
the text file looks like this
12
5
6
0
4 lines, no more, all values on a separate line.

There's no closing " on your new FileStream(" ...); but I'm going to assume that's an issue from copy-pasting your code to Stack Overflow.
The error you're getting is likely because you're trying to parse whitespace to an int, which won't work; the input string (" " in this case) was not in the correct format (int).
Split your input on the separator (.Split(' ') for spaces, or .Split('\n') for the one-value-per-line file you show) and parse every item in the resulting array.

A couple problems:
Problem 1
fs.Read(b, 0, b.Length) may read one byte, or all of them; there is no guarantee how much a single call returns. The normal way to read a text file like this is to use a StreamReader instead of a raw FileStream. StreamReader has a convenience constructor for opening a file that works the same way, but it can read line by line and is much more convenient. Here's the documentation and an excellent example: https://msdn.microsoft.com/en-us/library/f2ke0fzy(v=vs.110).aspx
If you insist on reading directly from a filestream, you will either need to
Parse your string outside the loop so you can be certain you've read the whole file into your byte buffer (b), or
Parse the new content byte by byte until you find a particular separator (for example a space or a newline) and then parse everything in your buffer and reset the buffer.
Problem 2
Most likely your buffer already contains everything in the file. Your file is so small that the FileStream is probably reading the whole thing in a single call, even though that's not guaranteed.
Since your string buffer contains ALL the characters in the file, you are effectively trying to parse "12\n5\n6\n0" as an integer, and the parser is choking on the newline characters. Since newlines are non-numeric, it has no idea how to interpret them.
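Putting both points together, a StreamReader version of the loop might look like this (a minimal sketch; the path is shortened here, and externalAmount is assumed to be an int field as in the question):

```csharp
using System.IO;

// Read the file line by line; each line holds one integer value.
using (StreamReader reader = new StreamReader(@"D:\...\myfile.txt"))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        line = line.Trim();              // drop stray whitespace
        if (line.Length == 0) continue;  // skip blank lines
        externalAmount = int.Parse(line);
    }
}
```

Because ReadLine already strips the line terminator, each string handed to int.Parse contains only the digits, which avoids the FormatException entirely.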

Write text to file in C# with 513 space characters

Here is code that writes a string to a file:
System.IO.File.WriteAllText("test.txt", "P ");
It's basically the character 'P' followed by a total of 513 space characters.
When I open the file in Notepad++, it appears to be fine. However, when I open it in Windows Notepad, all I see is garbled characters.
If instead of 513 space characters I add 514 or 512, it opens fine in Notepad.
What am I missing?
What you are missing is that Notepad is guessing, and it is not because your length is specifically 513 spaces ... it is because it is an even number of bytes and the file size is >= 100 total bytes. Try 511 or 515 spaces ... or 99 ... you'll see the same misinterpretation of your file contents. With an odd number of bytes, Notepad can assume that your file is not any of the double-byte encodings, because those would all result in 2 bytes per character = even number of total bytes in the file. If you give the file a few more low-order ASCII characters at the beginning (e.g., "PICKLE" + spaces), Notepad does a much better job of understanding that it should treat the content as single-byte chars.
The suggested approach of including Encoding.UTF8 is the easiest fix ... it will write a BOM to the beginning of the file which tells Notepad (and Notepad++) what the format of the data is, so that it doesn't have to resort to this guessing behavior (you can see the difference between your original approach and the BOM approach by opening both in Notepad++, then look in the bottom-right corner of the app. With the BOM, it will tell you the encoding is UTF-8-BOM ... without it, it will just say UTF-8).
I should also say that the contents of your file are not 'wrong', per se... the weird format is purely due to Notepad's "guessing" algorithm. So unless it's a requirement that people use Notepad to read your file with 1 letter and a large, odd number of spaces ... maybe just don't sweat it. If you do change to writing the file with Encoding.UTF8, then you do need to ensure that any other system that reads your file knows how to honor the BOM, because it is a real change to the contents of your file. If you cannot verify that all consumers of your file can/will handle the BOM, then it may be safer to just understand that Notepad happens to make a bad guess for your specific use case, and leave the raw contents exactly how you want them.
You can verify the physical difference in your file with the BOM by doing a binary read and then converting them to a string (you can't "see" the change with ReadAllText, because it honors & strips the BOM):
byte[] contents = System.IO.File.ReadAllBytes("test.txt");
Console.WriteLine(Encoding.ASCII.GetString(contents));
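To make the difference concrete: the UTF-8 BOM that Encoding.UTF8 emits is the three bytes EF BB BF, so the BOM version of the file is exactly three bytes longer. A small sketch (file names are placeholders, and the content is abbreviated here):

```csharp
using System;
using System.IO;
using System.Text;

string content = "P ";  // abbreviated; the real content has 513 spaces
File.WriteAllText("no-bom.txt", content);                  // default overload: no BOM
File.WriteAllText("with-bom.txt", content, Encoding.UTF8); // writes EF BB BF first

byte[] noBom = File.ReadAllBytes("no-bom.txt");
byte[] withBom = File.ReadAllBytes("with-bom.txt");
Console.WriteLine(withBom.Length - noBom.Length); // 3
Console.WriteLine(withBom[0] == 0xEF && withBom[1] == 0xBB && withBom[2] == 0xBF); // True
```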
Try passing in a different encoding:
i. System.IO.File.WriteAllText(filename , stringVariable, Encoding.UTF8);
ii. System.IO.File.WriteAllText(filename , stringVariable, Encoding.UTF32);
iii. etc.
Also, you could use another way to build your string, to make it easier to read, change and count, instead of tapping the space bar 513 times:
i. Use the string constructor (like @Tigran suggested)
var result = "P" + new String(' ', 513);
ii. Use the stringBuilder
var stringBuilder = new StringBuilder();
stringBuilder.Append("P");
for (var i = 1; i <= 513; i++) { stringBuilder.Append(" "); }
iii. Or both
public string AppendSpacesToString(string stringValue, int numberOfSpaces)
{
    var stringBuilder = new StringBuilder();
    stringBuilder.Append(stringValue);
    stringBuilder.Append(new String(' ', numberOfSpaces));
    return stringBuilder.ToString();
}

Replacing a word in a text file

I'm doing a little program where the data saved on some users is stored in a text file. I'm using System.IO with a StreamWriter to write new information to my text file.
The text in the file is formatted like so :
name1, 1000, 387
name2, 2500, 144
... and so on. I'm using infos = line.Split(',') to return the different values into an array that is more useful for searching purposes. I'm using a while loop to search for the correct line (where the name matches), and I return the number of points by using infos[1].
I'd like to modify this infos[1] value and set it to something else. I'm trying to find a way to replace a word in C#, but I can't find a good way to do it. From what I've read, there is no way to replace a single word; you have to rewrite the complete file.
Is there a way to delete a line completely, so that I could rewrite it at the end of the text file and not have to worry about it being duplicated?
I tried using the Replace method, but it didn't work. I'm a bit lost looking at the answers proposed for similar problems, so I would really appreciate it if someone could explain to me what my options are.
If I understand you correctly, you can use the File.ReadLines method and LINQ to accomplish this. First, get the line you want:
var line = File.ReadLines("path")
    .FirstOrDefault(x => x.StartsWith("name1 or whatever"));
if (line != null)
{
    /* change the line */
}
Then write the new line to your file excluding the old line:
var lines = File.ReadAllLines("path") // ReadAllLines: materialize the lines before rewriting the same file
    .Where(x => !x.StartsWith("name1 or whatever"));
var newLines = lines.Concat(new[] { line });
File.WriteAllLines("path", newLines);
The concept you are looking for is called 'random access' for file reading/writing. Most of the easy-to-use I/O methods in C# are sequential access, meaning you read a chunk or a line and move forward to the next.
However, what you want to do is possible, but you need to read some tutorials on file streams. Here is a related SO question: .NET C# - Random access in text files - no easy way?
You are probably either reading the whole file, or reading it line for line as part of your search. If your fields are fixed length, you can read a fixed number of bytes, keep track of Stream.Position as you read, know how many characters you are going to read and need to replace, and then open the file for writing, move to that exact position in the stream, and write the new value.
It's a bit complex if you are new to streams. If your file is not huge, copying it line for line can be done quite efficiently by the System.IO library if coded correctly, so you might just follow your second suggestion: read the file line for line, write it to a new stream (memory, temp file, whatever), replace the line in question when you get to it, and when done, replace the original file.
It is most likely that you are new to C# and don't realize that strings are immutable (a fancy way of saying you can't change them). You can only get new strings by modifying the old:
String MyString = "abc 123 xyz";
MyString.Replace("123", "999"); // does not work
MyString = MyString.Replace("123", "999"); // works
[Edit:]
If I understand your follow-up question, you could do this:
infos[1] = infos[1].Replace("1000", "1500");
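Putting the pieces together, a minimal read-modify-rewrite sketch (the file name "users.txt", the searched name "name1", and the new points value "1500" are all example values, not from the question):

```csharp
using System.IO;

// Rewrite the whole file, replacing the points field (infos[1])
// on the line whose first field matches a given name.
string[] lines = File.ReadAllLines("users.txt");
for (int i = 0; i < lines.Length; i++)
{
    string[] infos = lines[i].Split(',');
    if (infos[0].Trim() == "name1")
    {
        infos[1] = " 1500";                 // the new points value
        lines[i] = string.Join(",", infos); // rebuild the line
    }
}
File.WriteAllLines("users.txt", lines);
```

This keeps the line in its original position, so there is no need to delete it and re-append it at the end of the file.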

Text Parsing Tab Delimited file

I have a method that reads a file. This file has roughly 30000 lines. However, when I read it into an array, I get a random length for my array; I have seen it as low as 6000.
I used both
string[] lines = System.IO.File.ReadAllLines(#"C:\out\qqqqq.txt");
and
System.IO.StreamReader file = new System.IO.StreamReader(#"C:\out\qqqqq.txt");
(and use a counter.)
But I get the same result. I can see in Excel that these counts are far too small.
If the line endings in the file are inconsistent (sometimes \n, sometimes \r\n and sometimes \r) then you could try reading the entire file as a string and splitting it yourself:
string file = System.IO.File.ReadAllText(#"C:\out\qqqqq.txt");
var lines = file.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
For large files, this is inefficient, because it needs to read the entire file - using StreamReader you would be able to read the file line-by-line as you're processing it. If performance is an issue, then you could write simple tool that first corrects the line endings.
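Such a one-off normalization tool could be as small as this sketch (the input and output paths are placeholders):

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Normalize \r\n, \r and \n to the platform newline, then write a fixed copy.
string text = File.ReadAllText(@"C:\out\qqqqq.txt");
string normalized = Regex.Replace(text, @"\r\n|\r|\n", Environment.NewLine);
File.WriteAllText(@"C:\out\qqqqq_fixed.txt", normalized);
```

After that, ReadAllLines and StreamReader.ReadLine will agree on the line count.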

Why does StreamReader.ReadLine() return a value for a one line file with no newline?

I want to append two text files together.
I have one file with a carriage return line feed at the end. Observe file A, which is 28 bytes:
this is a line in the file\n
Then I have another file which is the same thing without the newline. Observe file B, which is 26 bytes:
this is a line in the file
I want to append the same file to itself (file A to A, and file B to B) and compare the byte counts.
However, when using StreamReader.ReadLine() on file A, I get a value returned but MSDN says:
A line is defined as a sequence of characters followed by a line feed ("\n"), a carriage return ("\r") or a carriage return immediately followed by a line feed ("\r\n"). The string that is returned does not contain the terminating carriage return or line feed. The returned value is null if the end of the input stream is reached.
However, there is no crlf in the file.
How can I safely append these files without adding an extra line break at the end? For example, StreamWriter.WriteLine() will put an extra line break on file A when I don't want it to. What would be an ideal approach?
You'll only get null if you call ReadLine at the end of the stream. Otherwise, you'll get all data up until either a CRLF or the end of the stream.
If you're trying to do a byte-for-byte duplication (and comparison), you're better off reading either characters (using the StreamReader/StreamWriter you're using now) or bytes (using just the Stream class) with the normal Read and Write methods rather than ReadLine and WriteLine.
You could also just read the entire contents of the file using ReadToEnd, then write them by calling Write (not WriteLine), though this isn't practical if the file is large.
string data;
using (StreamReader reader = new StreamReader(path))
{
    data = reader.ReadToEnd();
}
using (StreamWriter writer = new StreamWriter(path, true))
{
    writer.Write(data);
}
StreamReader and StreamWriter (which derive from TextReader and TextWriter) are not suitable for situations requiring an exact binary form of the data. They are high-level abstractions over a file, which consists of bytes, not text or lines. In fact, not only could you wind up with a different number of newlines, but depending on the environment you might write out a line terminator other than the expected CR/LF.
You should instead just copy from one stream to another. This is quite easy actually.
var bytes = File.ReadAllBytes(pathIn);
var stream = File.Open(pathOut, FileMode.Append);
stream.Write(bytes, 0, bytes.Length);
stream.Close();
If the size of the file is potentially large, you should open both the input and output file at the same time and use a fixed-sized buffer to copy a block at a time.
using (var streamIn = File.Open(pathIn, FileMode.Open))
using (var streamOut = File.Open(pathOut, FileMode.Append))
{
    var bytes = new byte[BLOCK_SIZE]; // e.g. const int BLOCK_SIZE = 4096;
    int count;
    while ((count = streamIn.Read(bytes, 0, bytes.Length)) > 0)
    {
        streamOut.Write(bytes, 0, count);
    }
}
Also worth noting is that the above code could be replaced by Stream.CopyTo which is new in .NET 4.
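That replacement is a one-liner; a sketch reusing the pathIn/pathOut names from the example above:

```csharp
using System.IO;

// Stream.CopyTo (.NET 4+) handles the buffering loop internally.
using (var streamIn = File.Open(pathIn, FileMode.Open))
using (var streamOut = File.Open(pathOut, FileMode.Append))
{
    streamIn.CopyTo(streamOut);
}
```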
You can use StreamWriter.Write instead of WriteLine to avoid the extra crlf.
As to the ReadLine docs, I believe the problem is a poorly worded explanation. You certainly wouldn't want the last bytes of a file discarded just because there is no final line terminator.
Well, it really depends on the reasons for your implementation (why are you reading it line by line and writing it back line by line?). You could just use StreamWriter.Write(string) and output all the text you have stored; the WriteLine() methods are named as such because they append a newline.
TextWriter.WriteLine Method (String)
Writes a string followed by a line terminator to the text stream.

Problem with indexed XML file

I scanned a 2.8 GB XML file for the positions (indexes) of particular tags. Then I use the Seek method to set a start point in that file. The file is UTF-8 encoded.
So indexing is like that:
using (StreamReader sr = new StreamReader(pathToFile))
{
    long index = 0;
    while (!sr.EndOfStream)
    {
        string line = sr.ReadLine();
        index += (line.Length + 2); // remember the \r\n chars
        if (LineHasTag(line))
        {
            SaveIndex(index - line.Length); // need the beginning of the line
        }
    }
}
So afterwards I have the indexed positions in another file. But when I use Seek it doesn't seem to work, because the position ends up somewhere before where it should be.
I have loaded some content of that file into a char array and manually checked the correct index of a tag I need. It's the same as the index computed by the code above. But the Seek method on StreamReader.BaseStream still places the pointer earlier in the file. Quite strange.
Any suggestions?
Best regards,
ventus
Seek deals in bytes - you're assuming there's one byte per character. In UTF-8, one character in the BMP can take up to three bytes.
My guess is that you've got non-ASCII characters in your file - those will take more than one byte.
I think there may also be a potential problem with the byte order mark, if there is one. I can't remember offhand whether StreamReader will swallow that automatically; if it does, every offset would be off by 3 bytes to start with.
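A byte-accurate version of the indexing loop would advance by the encoded byte length of each line instead of its character count. A sketch reusing the asker's pathToFile, LineHasTag and SaveIndex names, and keeping the original's assumption of \r\n line endings:

```csharp
using System.IO;
using System.Text;

// Track byte offsets rather than character counts, so a Seek on the
// underlying stream lands on the right position in a UTF-8 file.
using (StreamReader sr = new StreamReader(pathToFile, Encoding.UTF8))
{
    long index = 0; // if the file starts with a BOM, add 3 to every saved offset
    while (!sr.EndOfStream)
    {
        string line = sr.ReadLine();
        if (LineHasTag(line))
        {
            SaveIndex(index); // byte offset of the start of this line
        }
        index += Encoding.UTF8.GetByteCount(line) + 2; // assumes \r\n endings
    }
}
```

For ASCII-only lines GetByteCount equals line.Length, which is why the original code appeared correct in a manual check but drifted wherever multi-byte characters occurred earlier in the file.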
