I've made a program and I want to save the data. Saving is working, but "Loading" doesn't work.
public void Save(StreamWriter sw)
{
for (int i = 0; i < buecher.Count; i++)
{
Buch b = (Buch)buecher[i];
if (i == 0)
sw.WriteLine("ISDN ; Autor ; Titel");
sw.WriteLine(b.ISDN + ";" + b.Autor + ";" + b.Titel);
}
}
public void Load(StreamReader sr)
{
int isd;
string aut;
string tit;
while (sr.ReadLine() != "")
{
string[] teile = sr.ReadLine().Split(';');
try
{
isd = Convert.ToInt32(teile[0]);
aut = teile[1];
tit = teile[2];
}
catch
{
throw new Exception("umwandlung fehlgeschlagen");
}
Buch b = new Buch(isd, aut, tit);
buecher.Add(b);
}
}
If I put a break after buecher.Add(b); everything works, but it obviously loads only one book. Without the break I get a NullReferenceException.
Would be awesome if someone could help me
best regards
Ramon
The problem is that you are reading two lines in each iteration of the loop (and throwing away the first one). If there is an odd number of lines in the file, the second call to ReadLine will return null at the end of the file.
Read the line into a variable in the condition, and use that variable in the loop:
public void Load(StreamReader sr) {
int isd;
string aut;
string tit;
// skip header
sr.ReadLine();
string line;
while ((line = sr.ReadLine()) != null) {
if (line.Length > 0) {
string[] teile = line.Split(';');
try {
isd = Convert.ToInt32(teile[0]);
aut = teile[1];
tit = teile[2];
} catch {
throw new Exception("umwandlung fehlgeschlagen");
}
Buch b = new Buch(isd, aut, tit);
buecher.Add(b);
}
}
}
You are calling sr.ReadLine() twice for every line, once in the while() and once right after. You are hitting the end of the file, which returns a null.
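To make the double-read problem described above concrete, here is a minimal sketch that runs both loop shapes over the same in-memory text. The class and method names (DoubleReadDemo, ReadBuggy, ReadFixed) and the sample data are made up for illustration, and a StringReader stands in for the file. Unlike the code in the question, the buggy version here guards against the null so that it can show which lines get lost instead of crashing.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DoubleReadDemo
{
    // Buggy shape: ReadLine() is called in the condition AND in the body,
    // so the line read by the condition is thrown away each time.
    public static List<string> ReadBuggy(TextReader reader)
    {
        var seen = new List<string>();
        while (reader.ReadLine() != null)
        {
            string line = reader.ReadLine(); // second read; null at end of input
            if (line != null)
            {
                seen.Add(line);
            }
        }
        return seen;
    }

    // Fixed shape: read once, keep the result in a variable, test it for null.
    public static List<string> ReadFixed(TextReader reader)
    {
        var seen = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            seen.Add(line);
        }
        return seen;
    }

    public static void Main()
    {
        // Hypothetical sample: a header line followed by two records.
        const string data = "header\n1;A;X\n2;B;Y\n";

        // Prints "1;A;X": the header and the second record were consumed by the condition.
        Console.WriteLine(string.Join(" | ", ReadBuggy(new StringReader(data))));

        // Prints "header | 1;A;X | 2;B;Y".
        Console.WriteLine(string.Join(" | ", ReadFixed(new StringReader(data))));
    }
}
```

Note that ReadLine signals the end of the input with null, not with an empty string, which is why the fixed loop compares against null.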
Here is a different approach, which I suggest because it's simpler:
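public void Load(string filepath)
{
    try
    {
        // Requires using System.Linq. Skip(1) drops the header line written by Save.
        List<Buch> buches = File.ReadAllLines(filepath)
            .Skip(1)
            .Select(x => x.Split(';'))
            .Select(t => new Buch(int.Parse(t[0]), t[1], t[2]))
            .ToList();
        buecher.AddRange(buches);
    }
    catch
    {
        // "umwandlung fehlgeschlagen" means "conversion failed"
        throw new Exception("umwandlung fehlgeschlagen");
    }
}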
You could spread it over more lines if you find that more readable, but I've come to prefer File.ReadAllText and File.ReadAllLines to the StreamReader approach of reading files.
Instead of using the LINQ statement, you could also do:
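public void Load(string filepath)
{
    try
    {
        string[] lines = File.ReadAllLines(filepath);
        foreach (string line in lines)
        {
            // skip the header line written by Save
            if (line.StartsWith("ISDN"))
                continue;
            string[] tokens = line.Split(';');
            if (tokens.Length != 3)
                throw new FormatException("expected 3 fields: " + line);
            int isd;
            if (!int.TryParse(tokens[0], out isd))
                throw new FormatException("ISDN is not an int: " + line);
            buecher.Add(new Buch(isd, tokens[1], tokens[2]));
        }
    }
    catch
    {
        // "umwandlung fehlgeschlagen" means "conversion failed"
        throw new Exception("umwandlung fehlgeschlagen");
    }
}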
I'm relatively new to C# and I'm trying to get my head around a problem that I believe should be pretty simple in concept, but I just can't get it.
I am trying to write a program that is run from the command line with two arguments. For each sequence ID in a query text file, it should print a message to the console if that ID does not exist in a text file full of sequence IDs and DNA sequences. For example, args[0] is a text file that contains 41534 lines of sequences, which means I cannot load the entire file into memory:
NR_118889.1 Amycolatopsis azurea strain NRRL 11412 16S ribosomal RNA, partial sequence
GGTCTNATACCGGATATAACAACTCATGGCATGGTTGGTAGTGGAAAGCTCCGGCGT
NR_118899.1 Actinomyces bovis strain DSM 43014 16S ribosomal RNA, partial sequence
GGGTGAGTAACACGTGAGTAACCTGCCCCNNACTTCTGGATAACCGCTTGAAAGGGTNGCTAATACGGGATATTTTGGCCTGCT
NR_074334.1 Archaeoglobus fulgidus DSM 4304 16S ribosomal RNA, complete sequence >NR_118873.1 Archaeoglobus fulgidus DSM 4304 strain VC-16 16S ribosomal RNA, complete sequence >NR_119237.1 Archaeoglobus fulgidus DSM 4304 strain VC-16 16S ribosomal RNA, complete sequence
ATTCTGGTTGATCCTGCCAGAGGCCGCTGCTATCCGGCTGGGACTAAGCCATGCGAGTCAAGGGGCTT
args[1] is a query text file with some sequence IDs:
NR_118889.1
NR_999999.1
NR_118899.1
NR_888888.1
So when the program is run, all I want displayed are the sequence IDs from args[1] that were not found in args[0].
NR_999999.1 could not be found
NR_888888.1 could not be found
I know this is probably super simple, and I have spent far too long trying to figure this out by myself, to the point where I want to ask for help.
Thank you in advance for any assistance.
You can try this.
It loads the content of each file and compares them with each other.
static void Main(string[] args)
{
if ( args.Length != 2 )
{
Console.WriteLine("Usage: {exename}.exe [filename 1] [filename 2]");
Console.ReadKey();
return;
}
string filename1 = args[0];
string filename2 = args[1];
bool checkFiles = true;
if ( !File.Exists(filename1) )
{
Console.WriteLine($"{filename1} not found.");
checkFiles = false;
}
if ( !File.Exists(filename2) )
{
Console.WriteLine($"{filename2} not found.");
checkFiles = false;
}
if ( !checkFiles )
{
Console.ReadKey();
return;
}
var lines1 = System.IO.File.ReadAllLines(args[0]).Where(l => l != "");
var lines2 = System.IO.File.ReadAllLines(args[1]).Where(l => l != "");
foreach ( var line in lines2 )
if ( !lines1.Any(l => l.StartsWith(line)) )
{
Console.WriteLine($"{line} could not be found");
checkFiles = false;
}
if (checkFiles)
Console.WriteLine("There is no difference.");
Console.ReadKey();
}
This works, but it only processes the first line of the files...
using( System.IO.StreamReader sr1 = new System.IO.StreamReader(args[1]))
{
using( System.IO.StreamReader sr2 = new System.IO.StreamReader(args[2]))
{
string line1,line2;
while ((line1 = sr1.ReadLine()) != null)
{
while ((line2 = sr2.ReadLine()) != null)
{
if(line1.Contains(line2))
{
found = true;
WriteLine("{0} exists!",line2);
}
if(found == false)
{
WriteLine("{0} does not exist!",line2);
}
}
}
}
}
var saved_ids = new List<String>();
foreach (String args1line in File.ReadLines(args[1]))
{
foreach (String args2line in File.ReadLines(args[2]))
{
if (args1line.Contains(args2line))
{
saved_ids.Add(args2line);
}
}
}
using (System.IO.StreamReader sr1 = new System.IO.StreamReader(args[1]))
{
using (System.IO.StreamReader sr2 = new System.IO.StreamReader(args[2]))
{
string line1, line2;
while ((line1 = sr1.ReadLine()) != null)
{
while ((line2 = sr2.ReadLine()) != null)
{
if (line1.Contains(line2))
{
saved_ids.Add(line2);
break;
}
if (!line1.StartsWith(">"))
{
break;
}
if (saved_ids.Contains(line1))
{
break;
}
if (saved_ids.Contains(line2))
{
break;
}
if (!line1.Contains(line2))
{
saved_ids.Add(line2);
WriteLine("The sequence ID {0} does not exist", line2);
}
}
if (line2 == null)
{
sr2.DiscardBufferedData();
sr2.BaseStream.Seek(0, System.IO.SeekOrigin.Begin);
continue;
}
}
}
}
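For the sequence-ID lookup discussed above, one way to avoid both the nested readers and loading the large file into memory is to keep only the (small) query list in memory and stream the sequence file once. The sketch below is an assumption-laden illustration, not the asker's or answerer's code: the names MissingIdFinder and FindMissing are hypothetical, and an ID counts as "found" when it appears anywhere in a line, which mirrors the Contains check in the attempts above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class MissingIdFinder
{
    // Returns the query IDs (in query order) that never appear in sequenceLines.
    // sequenceLines is enumerated once, so it can be a lazy File.ReadLines sequence.
    public static List<string> FindMissing(IEnumerable<string> sequenceLines,
                                           IEnumerable<string> queryIds)
    {
        // Normalise the query list: trim, drop blanks and duplicates.
        var wanted = new List<string>();
        foreach (string id in queryIds)
        {
            string trimmed = id.Trim();
            if (trimmed.Length > 0 && !wanted.Contains(trimmed))
            {
                wanted.Add(trimmed);
            }
        }

        // Start by assuming every ID is missing, then cross off the ones we see.
        var missing = new HashSet<string>(wanted);
        foreach (string line in sequenceLines)
        {
            if (missing.Count == 0)
            {
                break; // everything was found; no need to read further
            }
            missing.RemoveWhere(id => line.Contains(id));
        }

        return wanted.FindAll(id => missing.Contains(id));
    }

    public static int Main(string[] args)
    {
        if (args.Length != 2 || !File.Exists(args[0]) || !File.Exists(args[1]))
        {
            Console.Error.WriteLine("Usage: <sequence file> <query file>");
            return 1;
        }

        // File.ReadLines yields one line at a time instead of loading the whole file.
        foreach (string id in FindMissing(File.ReadLines(args[0]), File.ReadLines(args[1])))
        {
            Console.WriteLine("{0} could not be found", id);
        }
        return 0;
    }
}
```

The cost is one pass over the big file with one Contains check per outstanding ID on each line, which should be acceptable when the query file is short. A plain substring match can produce false positives if one ID is a prefix of another, so a stricter comparison may be needed for real data.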
I am writing a program for an assignment that is meant to read two text files and use their data to write to a third text file. I was instructed to pass the contents of the one file to a list. I have done something similar, passing the contents to an array (see below). But I can't seem to get it to work with a list.
Here is what I have done in the past with arrays:
StreamReader f1 = new StreamReader(args[0]);
StreamReader f2 = new StreamReader(args[1]);
StreamWriter p = new StreamWriter(args[2]);
double[] array1 = new double[20];
double[] array2 = new double[20];
double[] array3 = new double[20];
string line;
int index;
double value;
while ((line = f1.ReadLine()) != null)
{
string[] currentLine = line.Split('|');
index = Convert.ToInt16(currentLine[0]);
value = Convert.ToDouble(currentLine[1]);
array1[index] = value;
}
If it is of any interest, this is my current setup:
static void Main(String[] args)
{
// Create variables to hold the 3 elements of each item that you will read from the file
// Create variables for all 3 files (2 for READ, 1 for WRITE)
int ID;
string InvName;
int Number;
string IDString;
string NumberString;
string line;
List<InventoryNode> Inventory = new List<InventoryNode>();
InventoryNode Item = null;
StreamReader f1 = new StreamReader(args[0]);
StreamReader f2 = new StreamReader(args[1]);
StreamWriter p = new StreamWriter(args[2]);
// Read each item from the Update File and process the data
//Data is separated by pipe |
To fill a List instead of an array, create each item and call Add (or Insert) on the list.
With your code, that means calling Inventory.Add(Item), like this:
while ((line = f1.ReadLine()) != null)
{
    string[] currentLine = line.Split('|');
    // Index and Value are placeholder property names; use the ones InventoryNode actually defines.
    Item = new InventoryNode {
        Index = Convert.ToInt16(currentLine[0]),
        Value = Convert.ToDouble(currentLine[1])
    };
    Inventory.Add(Item);
}
If I understand correctly, all you want to do is read two input files, parse the data in them in a particular format (in this case int|double), and write it to a new file. If that is the requirement, please try the following code. Since it is not clear how you want the data presented in the third file, I have kept the format as it is (i.e. int|double).
static void Main(string[] args)
{
if (args == null || args.Length < 3)
{
Console.WriteLine("Wrong Input");
return;
}
if (!ValidateFilePath(args[0]) || !ValidateFilePath(args[1]))
{
return;
}
Dictionary<int, double> parsedFileData = new Dictionary<int, double>();
//Read the first file
ReadFileData(args[0], parsedFileData);
//Read second file
ReadFileData(args[1], parsedFileData);
//Write to third file
WriteFileData(args[2], parsedFileData);
}
private static bool ValidateFilePath(string filePath)
{
    // File.Exists does not throw; it returns false for a missing or inaccessible path.
    if (File.Exists(filePath))
    {
        return true;
    }
    Console.WriteLine($"Failed to read file : {filePath}");
    return false;
}
private static void ReadFileData(string filePath, Dictionary<int, double> parsedFileData)
{
try
{
using (StreamReader fileStream = new StreamReader(filePath))
{
string line;
while ((line = fileStream.ReadLine()) != null)
{
string[] currentLine = line.Split('|');
int index = Convert.ToInt16(currentLine[0]);
double value = Convert.ToDouble(currentLine[1]);
parsedFileData.Add(index, value);
}
}
}
catch (Exception ex)
{
Console.WriteLine($"Exception : {ex.Message}");
}
}
private static void WriteFileData(string filePath, Dictionary<int, double> parsedFileData)
{
try
{
using (StreamWriter fileStream = new StreamWriter(filePath))
{
foreach (var parsedLine in parsedFileData)
{
var line = parsedLine.Key + "|" + parsedLine.Value;
fileStream.WriteLine(line);
}
}
}
catch (Exception ex)
{
Console.WriteLine($"Exception : {ex.Message}");
}
}
There are a few things you should always remember while writing C# code:
1) Validate command-line inputs before using them.
2) Always look out for classes that have a Dispose method, and instantiate them inside a using block.
3) Put a proper mechanism in place to catch exceptions; otherwise your program will crash at runtime on invalid inputs or inputs that you could not validate!
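The three points above can be sketched together in one small program. This is only an illustration under stated assumptions: the names PipeFileReader and Parse are hypothetical, malformed lines are silently skipped (the original code would throw instead), and numbers are parsed with the invariant culture, which may or may not match the real data files.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

public static class PipeFileReader
{
    // Parses "index|value" lines into a list. Lines that do not have exactly
    // two fields, or whose fields are not numeric, are skipped (an assumption).
    public static List<KeyValuePair<int, double>> Parse(TextReader reader)
    {
        var rows = new List<KeyValuePair<int, double>>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            string[] parts = line.Split('|');
            int index;
            double value;
            if (parts.Length == 2
                && int.TryParse(parts[0], NumberStyles.Integer, CultureInfo.InvariantCulture, out index)
                && double.TryParse(parts[1], NumberStyles.Float, CultureInfo.InvariantCulture, out value))
            {
                rows.Add(new KeyValuePair<int, double>(index, value));
            }
        }
        return rows;
    }

    public static int Main(string[] args)
    {
        // 1) Validate command-line input before using it.
        if (args.Length < 1 || !File.Exists(args[0]))
        {
            Console.Error.WriteLine("Usage: <input file>");
            return 1;
        }

        try
        {
            // 2) StreamReader is disposable, so it lives inside a using block.
            using (var reader = new StreamReader(args[0]))
            {
                foreach (KeyValuePair<int, double> row in Parse(reader))
                {
                    Console.WriteLine(row.Key + "|" + row.Value.ToString(CultureInfo.InvariantCulture));
                }
            }
            return 0;
        }
        catch (IOException ex)
        {
            // 3) Catch what can still go wrong after validation (e.g. the file is locked).
            Console.Error.WriteLine("Could not read file: " + ex.Message);
            return 1;
        }
    }
}
```

Using TryParse keeps one bad line from aborting the whole read, which is a design choice; if a malformed line should be treated as a fatal error, throwing as the original code does is the simpler option.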
I am getting null reference exception when reading data from my txt file.
public class Appointments : List<Appointment>
{
Appointment appointment;
public Appointments()
{
}
public bool Load(string fileName)
{
string appointmentData = string.Empty;
using (StreamReader reader = new StreamReader(fileName))
{
while((appointmentData = reader.ReadLine()) != null)
{
appointmentData = reader.ReadLine();
//**this is where null ref. exception is thrown** (line below)
if(appointmentData[0] == 'R')
{
appointment = new RecurringAppointment(appointmentData);
}
else
{
appointment = new Appointment(appointmentData);
}
this.Add(appointment);
}
return true;
}
}
RecurringAppointment inherits from Appointment. The file exists and the file location is correct. The funny thing is that the program was working 30 minutes ago; I only changed the Load method from the version below to the one you can see above:
public bool Load(string fileName)
{
string appointmentData = string.Empty;
using (StreamReader reader = new StreamReader(fileName))
{
while((appointmentData = reader.ReadLine()) != null)
{
appointmentData = reader.ReadLine();
if(appointmentData[0] == 'R')
{
this.Add(appointment = new RecurringAppointment(appointmentData));
}
else
{
this.Add(appointment = new Appointment(appointmentData));
}
}
return true;
}
}
Now it does not work in either case.
Your code reads twice in each loop iteration. This means that, if your file has an odd number of rows, when you read the last line of the file the check against null in the while statement lets your code enter the loop, but the following ReadLine returns null. Of course, trying to read the char at index zero of a null string will throw a NullReferenceException.
There is also the problem of empty lines in your file. If there is an empty line then, again, reading at index zero will throw an IndexOutOfRangeException.
You could fix your code in this way:
public bool Load(string fileName)
{
string appointmentData = string.Empty;
using (StreamReader reader = new StreamReader(fileName))
{
while((appointmentData = reader.ReadLine()) != null)
{
if(!string.IsNullOrWhiteSpace(appointmentData))
{
if(appointmentData[0] == 'R')
this.Add(appointment = new RecurringAppointment(appointmentData));
else
this.Add(appointment = new Appointment(appointmentData));
}
}
return true;
}
}
I have the PAF raw data in several files (list of all addresses in the UK).
My goal is to create a PostCode lookup in our software.
I have created a new database but there is no need to understand it for the moment.
Let's take one file: its extension is ".c01" and it can be opened with a text editor. The data in this file is in the following format:
0000000123A
According to the developer guide, that is 8 characters for the KEY and 50 characters for the NAME.
This file contains 2,449,652 rows (it's a small one!).
I created a parsing class for this:
private class SerializedBuilding
{
public int Key
{
get; set;
}
public string Name
{
get; set;
}
public bool isValid = false;
public Building ToBuilding()
{
Building b = new Building();
b.BuildingKey = Key;
b.BuildingName = Name;
return b;
}
private readonly int KEYLENGTH = 8;
private readonly int NAMELENGTH = 50;
public SerializedBuilding(String line)
{
string KeyStr = null;
string Name = null;
try
{
KeyStr = line.Substring(0, KEYLENGTH);
}
catch (Exception e)
{
Console.WriteLine("erreur parsing key line " + line);
return;
}
try
{
Name = line.Substring(KEYLENGTH - 1, NAMELENGTH);
}
catch (Exception e)
{
Console.WriteLine("erreur parsing name line " + line);
return;
}
int value;
if (!Int32.TryParse(KeyStr, out value))
return;
if (value == 0 || value == 99999999)
return;
this.Name = Name;
this.Key = value;
this.isValid = true;
}
}
I use this method to read the file
public void start()
{
AddressDataContext d = new AddressDataContext();
Count = 0;
string line;
// Read the file and display it line by line.
System.IO.StreamReader file =
new System.IO.StreamReader(filename);
SerializedBuilding sb = null;
Console.WriteLine("Number of line detected : " + File.ReadLines(filename).Count());
while ((line = file.ReadLine()) != null)
{
sb = new SerializedBuilding(line);
if (sb.isValid)
{
d.Buildings.InsertOnSubmit(sb.ToBuilding());
if (Count % 100 == 0)
d.SubmitChanges();
}
Count++;
}
d.SubmitChanges();
file.Close();
Console.WriteLine("building added");
}
I use Linq to SQL classes to insert data to my database. The connection string is the default one.
This seems to work, I have added 67200 lines. It just crashed but my questions are not about that.
My estimations :
33,647,015 rows to parse
Time needed for execution : 13 hours
It's a one-time job (it just needs to be done on my SQL server and on the client's server later), so I don't really care about performance, but I think it would be interesting to know how it can be improved.
My questions are :
Are ReadLine() and Substring() the most efficient ways to read these huge files?
Can the performance be improved by modifying the connection string ?
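As a point of reference for the fixed-width layout described above (8 characters of key followed by up to 50 characters of name), here is a minimal parsing sketch. It is not taken from the PAF developer guide: the names FixedWidthParser and TryParseBuilding and the sample lines are hypothetical, and it assumes lines may be shorter than 58 characters. Note that the name starts at offset 8 (the key length), whereas an offset of KEYLENGTH - 1 would include the last digit of the key in the name.

```csharp
using System;

public static class FixedWidthParser
{
    private const int KeyLength = 8;
    private const int NameLength = 50;

    // Returns true and fills key/name when the line holds a usable record.
    // The out values are only meaningful when the method returns true.
    public static bool TryParseBuilding(string line, out int key, out string name)
    {
        key = 0;
        name = null;

        if (line == null || line.Length < KeyLength)
        {
            return false; // too short to contain a key
        }
        if (!int.TryParse(line.Substring(0, KeyLength), out key))
        {
            return false; // key is not numeric
        }
        if (key == 0 || key == 99999999)
        {
            return false; // same sentinel keys the question's code rejects
        }

        // Take at most NameLength characters, but never read past the end of the line.
        int available = Math.Min(NameLength, line.Length - KeyLength);
        name = line.Substring(KeyLength, available).TrimEnd();
        return true;
    }

    public static void Main()
    {
        // Hypothetical sample lines, not real PAF data.
        string[] samples =
        {
            "00000042" + "Rose Cottage".PadRight(NameLength),
            "00000000" + "Header row".PadRight(NameLength),
            "123"
        };

        foreach (string sample in samples)
        {
            int key;
            string name;
            if (TryParseBuilding(sample, out key, out name))
            {
                Console.WriteLine("{0}: '{1}'", key, name);
            }
            else
            {
                Console.WriteLine("skipped: '{0}'", sample.TrimEnd());
            }
        }
    }
}
```

Checking the length up front avoids using exceptions for control flow on every short line, which matters more than the choice between ReadLine and other readers when millions of rows are involved.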
Given this log file, how can I read a line with multiple new lines (\n) with a StreamReader?
The ReadLine method literally returns each line, but a message may span more than one line.
Here is what I have so far
using (var sr = new StreamReader(filePath))
using (var store = new DocumentStore {ConnectionStringName = "RavenDB"}.Initialize())
{
IndexCreation.CreateIndexes(typeof(Logs_Search).Assembly, store);
using (var bulkInsert = store.BulkInsert())
{
const char columnDelimeter = '|';
const string quote = @"~";
string line;
while ((line = sr.ReadLine()) != null)
{
batch++;
List<string> columns = null;
try
{
columns = line.Split(columnDelimeter)
.Select(item => item.Replace(quote, string.Empty))
.ToList();
if (columns.Count != 5)
{
batch--;
Log.Error(string.Join(",", columns.ToArray()));
continue;
}
bulkInsert.Store(LogParser.Log.FromStringList(columns));
/* Give some feedback */
if (batch % 100000 == 0)
{
Log.Debug("batch: {0}", batch);
}
/* Use sparingly */
if (ThrottleEnabled && batch % ThrottleBatchSize == 0)
{
Thread.Sleep(ThrottleThreadWait);
}
}
catch (FormatException)
{
if (columns != null) Log.Error(string.Join(",", columns.ToArray()));
}
catch (Exception exception)
{
Log.Error(exception);
}
}
}
}
And the Model
public class Log
{
public string Component { get; set; }
public string DateTime { get; set; }
public string Logger { get; set; }
public string Level { get; set; }
public string ThreadId { get; set; }
public string Message { get; set; }
public string Terms { get; set; }
public static Log FromStringList(List<string> row)
{
Log log = new Log();
/*log.Component = row[0] == string.Empty ? null : row[0];*/
log.DateTime = row[0] == string.Empty ? null : row[0].ToLower();
log.Logger = row[1] == string.Empty ? null : row[1].ToLower();
log.Level = row[2] == string.Empty ? null : row[2].ToLower();
log.ThreadId = row[3] == string.Empty ? null : row[3].ToLower();
log.Message = row[4] == string.Empty ? null : row[4].ToLower();
return log;
}
}
I would use Regex.Split and break the file up on anything that matches the date pattern (ex. 2013-06-19) at the beginning of each error.
If you can read the entire file into memory (i.e. File.ReadAllText), then you can treat it as a single string and use regular expressions to split on the date, or some such.
A more general solution that takes less memory would be to read the file line-by-line. Append lines to a buffer until you get the next line that starts with the desired value (in your case, a date/time stamp). Then process that buffer. For example:
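StringBuilder sb = new StringBuilder();
foreach (var line in File.ReadLines(logfileName))
{
    // A line that starts with the date stamp begins a new message,
    // so process whatever has been buffered so far.
    if (line.StartsWith("2013-06-19"))
    {
        if (sb.Length > 0)
        {
            ProcessMessage(sb.ToString());
            sb.Clear();
        }
    }
    // Every line, including continuation lines, belongs to the current message.
    sb.AppendLine(line);
}
// be sure to process the last message
if (sb.Length > 0)
{
    ProcessMessage(sb.ToString());
}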
It is hard to say without seeing your file, but I would read it line by line and append each line to a variable.
Check for the end of a message. When you see it, do whatever you want to do with the message in that variable (insert it into the DB, etc.) and then keep reading the next message.
Pseudo code
read the line
variable a = a + new line
if end of message
insert into DB
reset the variable
continue reading the message.....
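The pseudo code above could be turned into C# roughly as follows. This sketch assumes that every message starts with a yyyy-MM-dd date stamp at the beginning of a line, so the start of the next message marks the end of the previous one; the log file itself is not shown, so that pattern is a guess. The names LogMessageReader and ReadMessages and the sample text are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

public static class LogMessageReader
{
    // Assumption: a new message begins with a date such as 2013-06-19.
    private static readonly Regex MessageStart = new Regex(@"^\d{4}-\d{2}-\d{2}");

    // Groups physical lines into logical messages, yielding each message as
    // soon as the start of the next one is seen.
    public static IEnumerable<string> ReadMessages(TextReader reader)
    {
        var buffer = new StringBuilder();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (MessageStart.IsMatch(line) && buffer.Length > 0)
            {
                yield return buffer.ToString(); // previous message is complete
                buffer.Clear();
            }
            if (buffer.Length > 0)
            {
                buffer.Append('\n'); // keep the line break inside a message
            }
            buffer.Append(line);
        }
        if (buffer.Length > 0)
        {
            yield return buffer.ToString(); // do not forget the last message
        }
    }

    public static void Main()
    {
        // Hypothetical sample: the first message spans two lines.
        const string sample =
            "2013-06-19 10:00|~logger~|~ERROR~|~7~|~first line\n" +
            "second line of the same message~\n" +
            "2013-06-20 11:00|~logger~|~INFO~|~8~|~single line~\n";

        foreach (string message in ReadMessages(new StringReader(sample)))
        {
            Console.WriteLine("[" + message.Replace("\n", "\\n") + "]");
        }
    }
}
```

Each yielded string could then be split on the column delimiter in place of the single line that ReadLine returned before.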