C#: Loop over Textfile, split it and Print a new Textfile

My input consists of many lines of strings like the one below. Each string comes from
theObjects.RunState;
Each #VAR;****;#ENDVAR; block represents one line and one step in the loop.
#VAR;Variable=Speed;Value=Fast;Op==;#ENDVAR;#VAR;Variable=Fabricator;Value=Freescale;Op==;#ENDVAR;
I split it to remove the unwanted fields, such as #VAR, #ENDVAR and Op==.
The optimal output would be:
Speed = Fast;
Fabricator = Freescale; and so on.
I am able to cut out the #VAR and the #ENDVAR. Cutting out the "Op==" won't be that hard, so that is not the main focus of the question. My biggest concern right now is that I want to write the output to a text file. To print an array I would have to loop over it, but in every iteration, when I get a new line, I overwrite the array with the current split string. I think the last line of the input is an empty string, so the output I get is just an empty text file. It would be nice if someone could help me.
string[] w;
TextWriter tw2;
foreach (EA.Element theObjects in myPackageObject.Elements)
{
theObjects.Type = "Object";
foreach (EA.Element theElements in PackageHW.Elements)
{
if (theObjects.ClassifierID == theElements.ElementID)
{
t = theObjects.RunState;
w = t.Replace("#ENDVAR;", "#VAR;").Replace("#VAR;", ";").Split(new string[] { ";" }, StringSplitOptions.RemoveEmptyEntries);
foreach (string s in w)
{
tw2.WriteLine(s);
}
}
}
}

This LINQ query gives the expected result:
var keyValuePairLines = File.ReadLines(pathInputFile)
.Select(l =>
{
l = l.Replace("#VAR;", "").Replace("#ENDVAR;", "").Replace("Op==;", "");
IEnumerable<string[]> tokens = l.Split(new[]{';'}, StringSplitOptions.RemoveEmptyEntries)
.Select(t => t.Split('='));
return tokens.Select(t => {
return new KeyValuePair<string, string>(t.First(), t.Last());
});
});
foreach(var keyValLine in keyValuePairLines)
foreach(var keyVal in keyValLine)
Console.WriteLine("Key:{0} Value:{1}", keyVal.Key, keyVal.Value);
Output:
Key:Variable Value:Speed
Key:Value Value:Fast
Key:Variable Value:Fabricator
Key:Value Value:Freescale
If you want to output it to another text-file with one key-value pair on each line:
File.WriteAllLines(pathOutputFile, keyValuePairLines.SelectMany(l =>
l.Select(kv => string.Format("{0}:{1}", kv.Key, kv.Value))));
Edit according to your question in the comment:
"What would I have to change/add so that the Output is like this. I
need AttributeValuePairs, for example: Speed = Fast; or Fabricator =
Freescale ?"
Now I understand the logic: you have key-value pairs, but you are only interested in the values. Every two key-value pairs belong together: the first value specifies the attribute and the second value is the value of that attribute (e.g. Speed=Fast).
That makes it a little more complicated:
var keyValuePairLines = File.ReadLines(pathInputFile)
.Select(l =>
{
l = l.Replace("#VAR;", "").Replace("#ENDVAR;", "").Replace("Op==;", "");
string[] tokens = l.Split(new[]{';'}, StringSplitOptions.RemoveEmptyEntries);
var lineValues = new List<KeyValuePair<string, string>>();
for(int i = 0; i < tokens.Length; i += 2)
{
// Value to a variable can be found on the next index, therefore i += 2
string[] pair = tokens[i].Split('=');
string key = pair.Last();
string value = null;
string nextToken = tokens.ElementAtOrDefault(i + 1);
if (nextToken != null)
{
pair = nextToken.Split('=');
value = pair.Last();
}
var keyVal = new KeyValuePair<string, string>(key, value);
lineValues.Add(keyVal);
}
return lineValues;
});
File.WriteAllLines(pathOutputFile, keyValuePairLines.SelectMany(l =>
l.Select(kv=>string.Format("{0} = {1}", kv.Key, kv.Value))));
Output in the file with your single sample-line:
Speed = Fast
Fabricator = Freescale
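If the strings come straight from RunState rather than from an input file, the same idea can be sketched without LINQ. This is only a minimal sketch under assumptions: the class name RunStateParser, the method ToLines and the path "output.txt" are hypothetical, and the pattern assumes every block has the shape #VAR;Variable=...;Value=...;...;#ENDVAR; as in the sample line. The writer is opened once, outside the loop, so lines from earlier iterations are not lost.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

static class RunStateParser
{
    // Matches one #VAR;...;#ENDVAR; block and captures the Variable and Value fields.
    private static readonly Regex VarBlock = new Regex(
        @"#VAR;Variable=(?<name>[^;]*);Value=(?<value>[^;]*);.*?#ENDVAR;");

    // Turns one RunState string into lines such as "Speed = Fast;".
    public static List<string> ToLines(string runState)
    {
        var lines = new List<string>();
        foreach (Match m in VarBlock.Matches(runState ?? string.Empty))
        {
            lines.Add(m.Groups["name"].Value + " = " + m.Groups["value"].Value + ";");
        }
        return lines;
    }

    static void Main()
    {
        string sample = "#VAR;Variable=Speed;Value=Fast;Op==;#ENDVAR;" +
                        "#VAR;Variable=Fabricator;Value=Freescale;Op==;#ENDVAR;";

        // Open the writer once, outside any loop. "output.txt" is a placeholder path.
        using (var writer = new StreamWriter("output.txt"))
        {
            foreach (string line in ToLines(sample))
            {
                writer.WriteLine(line);
            }
        }
    }
}
```

In the original nested loops, ToLines(theObjects.RunState) would be called inside the innermost if block, with the writer created before the outer foreach and disposed after it.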

Related

How to remove space between two words in the same string [closed]

I have a string like this
string input = "\r\n\r\nMaster = \r\nSlave\r\nRed =\r\n Blue";
What I want is Master = Slave and Red = Blue, so that I can create a dictionary.
The methods I have tried are:
1) String temp = Regex.Replace(str, "/r/n/r/n", "");
2) String temp = str.Replace("/r/n/r/n", "");
Neither method gives me the result I want. I even tried removing the white space, but that didn't work either. Please help. Thank you.

It sounds like you have a string like this:
string Str = "Master = Slave\r\nRed = Blue";
And for output are you looking for something like this:
var dict = new Dictionary<string, string> { { "Master", "Slave" }, { "Red", "Blue" } };
If so, one way to do this is to first split the string on the newline characters, then split each of those on the equals character, and then add the resulting pair to a dictionary.
For example:
string input = "Master = Slave\r\nRed = Blue";
string[] keyValuePairs = input.Split( '\r', '\n');
Dictionary<string, string> dict = new Dictionary<string, string>();
foreach (var keyValuePair in keyValuePairs)
{
var parts = keyValuePair.Split('=');
if (parts.Length > 1)
{
dict.Add(parts[0].Trim(), parts[1].Trim());
}
}
// Result:
// dict
// Count = 2
// [0]: {[Master, Slave]}
// [1]: {[Red, Blue]}
The code above can be shortened using some System.Linq extension methods:
Dictionary<string, string> dict = input
.Split(new[] {'\r', '\n'}, StringSplitOptions.RemoveEmptyEntries)
.Select(kvp => kvp.Split('='))
.Where(parts => parts.Length > 1)
.ToDictionary(x => x[0].Trim(), x => x[1].Trim());
Another way to do this, since in the comments you've mentioned the newline characters may appear anywhere, is to examine the results after splitting the input on \r\n and splitting the result of that on the = character.
If the result has two parts, we have a key and a value, so add it to the dictionary. If there's only one part, and we haven't saved a key value yet, then save it as a key. Otherwise, add the saved key and this part as the value.
For example:
var input = "\r\n\r\nMaster = \r\nSlave\r\nRed =\r\n Blue";
var dict = new Dictionary<string, string>();
var currentKey = "";
foreach (var item in input.Split(new[] { '\r', '\n' },
StringSplitOptions.RemoveEmptyEntries))
{
var parts = item.Split(new[] { '=' },
StringSplitOptions.RemoveEmptyEntries);
if (currentKey.Length == 0)
{
if (parts.Length > 1 && !string.IsNullOrWhiteSpace(parts[1]))
{
dict.Add(parts[0].Trim(), parts[1].Trim());
}
else
{
currentKey = parts[0].Trim();
}
}
else
{
dict.Add(currentKey, parts.Length > 1
? parts[1].Trim()
: parts[0].Trim());
currentKey = "";
}
}

Remove duplicate combination of numbers in C# from csv

I'm trying to remove the duplicate combination from a csv file.
I tried using Distinct but it seems to stay the same.
string path;
string newcsvpath = @"C:\Documents and Settings\MrGrimm\Desktop\clean.csv";
OpenFileDialog openfileDial = new OpenFileDialog();
if (openfileDial.ShowDialog() == DialogResult.OK)
{
path = openfileDial.FileName;
var lines = File.ReadLines(path);
var grouped = lines.GroupBy(line => string.Join(", ", line.Split(',').Distinct())).ToArray();
var unique = grouped.Select(g => g.First());
var buffer = new StringBuilder();
foreach (var name in unique)
{
string value = name;
buffer.AppendLine(value);
}
File.WriteAllText(newcsvpath ,buffer.ToString());
label5.Text = "Complete";
}
For example, I have a combination of
{ 1,1,1,1,1,1,1,1 } { 1,1,1,1,1,1,1,2 }
{ 2,1,1,1,1,1,1,1 } { 1,1,1,2,1,1,1,1 }
The output should be
{ 1,1,1,1,1,1,1,1 }
{ 2,1,1,1,1,1,1,1 }

From your example, it seems that you want to treat each line as a sequence of numbers and that you consider two lines equal if one sequence is a permutation of the other.
So from reading your file, you have:
var lines = new[]
{
"1,1,1,1,1,1,1,1",
"1,1,1,1,1,1,1,2",
"2,1,1,1,1,1,1,1",
"1,1,1,2,1,1,1,1"
};
Now let's convert it to an array of number sequences:
var linesAsNumberSequences = lines.Select(line => line.Split(',')
.Select(int.Parse)
.ToArray())
.ToArray();
Or better, since we are not interested in permutations, we can sort the numbers in the sequences immediately:
var linesAsSortedNumberSequences = lines.Select(line => line.Split(',')
.Select(int.Parse)
.OrderBy(number => number)
.ToArray())
.ToArray();
When using Distinct on this, we have to pass a comparer that considers two arrays equal if they contain the same elements. Let's use the one from this SO question:
var result = linesAsSortedNumberSequences.Distinct(new IEnumerableComparer<int>());
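The IEnumerableComparer<int> used above comes from a linked answer that is not reproduced here. As a hedged stand-in, a small comparer for int arrays could look like the sketch below; the names IntSequenceComparer and CombinationFilter are hypothetical and are not the linked implementation. Because each sequence is sorted before comparison, the surviving entries are printed in sorted form rather than as the original lines.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the comparer referenced above:
// two int arrays are equal when they hold the same elements in the same order.
sealed class IntSequenceComparer : IEqualityComparer<int[]>
{
    public bool Equals(int[] x, int[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);
    }

    public int GetHashCode(int[] obj)
    {
        if (obj == null) return 0;
        unchecked
        {
            // Combine the element values so equal sequences get equal hashes.
            int hash = 17;
            foreach (int n in obj) hash = hash * 31 + n;
            return hash;
        }
    }
}

static class CombinationFilter
{
    // Sorts the numbers of each line, then keeps one representative per combination.
    public static List<string> DistinctCombinations(IEnumerable<string> lines)
    {
        return lines
            .Select(line => line.Split(',')
                                .Select(s => int.Parse(s.Trim()))
                                .OrderBy(n => n)
                                .ToArray())
            .Distinct(new IntSequenceComparer())
            .Select(numbers => string.Join(",", numbers))
            .ToList();
    }

    static void Main()
    {
        var lines = new[]
        {
            "1,1,1,1,1,1,1,1",
            "1,1,1,1,1,1,1,2",
            "2,1,1,1,1,1,1,1",
            "1,1,1,2,1,1,1,1"
        };
        foreach (string line in DistinctCombinations(lines))
            Console.WriteLine(line);
    }
}
```

If the original, unsorted line must be kept, grouping the lines by the sorted sequence and taking the first line of each group would be one alternative.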

Try this:
HashSet<string> record = new HashSet<string>();
foreach (DataRow row in dtCSV.Rows)
{
StringBuilder textEditor = new StringBuilder();
foreach (string col in columns)
{
textEditor.AppendFormat("[{0}={1}]", col, row[col].ToString());
}
if (!record.Add(textEditor.ToString()))
{
// Add returned false: this row duplicates one already seen, so skip or remove it here.
}
}

C# Use Regex to split on Words

This is a stripped down version of code I am working on. The purpose of the code is to take a string of information, break it down, and parse it into key value pairs.
Using the info in the example below, a string might look like:
"DIVIDE = KE48 CLACOS = 4556D DIV = 3466 INT = 4567"
One further point about the above example, at least three of the features we have to parse out will occasionally include additional values. Here is an updated fake example string.
"DIVIDE = KE48, KE49, KE50 CLACOS = 4566D DIV = 3466 INT = 4567 & 4568"
The problem with this is that the code refuses to split out DIVIDE and DIV information separately. Instead, it keeps splitting at DIV and then assigning the rest of the information as the value.
Is there a way to tell my code that DIVIDE and DIV need to be parsed out as two separate values, and to not turn DIVIDE into DIV?
public List<string> FeatureFilterStrings
{
// All possible feature types from the EWSD switch.
get
{
return new List<string>() { "DIVIDE", "DIV", "CLACOS", "INT"};
}
}
public void Parse(string input){
Func<string, bool> queryFilter = delegate(string line) { return FeatureFilterStrings.Any(s => line.Contains(s)); };
Regex regex = new Regex(@"(?=\\bDIVIDE|DIV|CLACOS|INT)");
string[] ms = regex.Split(updatedInput);
List<string> queryLines = new List<string>();
// takes the parsed out data and assigns it to the queryLines List<string>
foreach (string m in ms)
{
queryLines.Add(m);
}
var features = queryLines.Where(queryFilter);
foreach (string feature in features)
{
foreach (Match m in Regex.Matches(workLine, valueExpression))
{
string key = m.Groups["key"].Value.Trim();
string value = String.Empty;
value = Regex.Replace(m.Groups["value"].Value.Trim(), @"s", String.Empty);
AddKeyValue(key, value);
}
}
private void AddKeyValue(string key, string value)
{
try
{
// Check if key already exists. If it does, remove the key and add the new key with updated value.
// Value information appends to what is already there so no data is lost.
if (this.ContainsKey(key))
{
this.Remove(key);
this.Add(key, value.Split('&'));
}
else
{
this.Add(key, value.Split('&'));
}
}
catch (ArgumentException)
{
// Already added to the dictionary.
}
}
}
Further information, the string information does not have a set number of spaces between each key/value, each string may not include all of the values, and the features aren't always in the same order. Welcome to parsing old telephone switch information.

I would create a dictionary from your input string
string input = "DIVIDE = KE48 CLACOS = 4556D DIV = 3466 INT = 4567";
var dict = Regex.Matches(input, @"(\w+?) = (.+?)( |$)").Cast<Match>()
.ToDictionary(m => m.Groups[1].Value, m => m.Groups[2].Value);
Test the code:
foreach(var kv in dict)
{
Console.WriteLine(kv.Key + "=" + kv.Value);
}
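The asker's valueExpression is not shown, so the following is only a sketch of one possible way to keep DIVIDE and DIV apart: wrap every keyword in \b word boundaries and let each value run up to the next keyword that is followed by an equals sign. The names FeatureParser, Keys and Pair are hypothetical, and the keyword list is copied from FeatureFilterStrings in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class FeatureParser
{
    // Keyword list taken from the question; extend as needed.
    private const string Keys = "DIVIDE|DIV|CLACOS|INT";

    // \b on both sides keeps DIV from matching inside DIVIDE.
    // The lazy value stops in front of the next "KEY =" or at the end of the input.
    private static readonly Regex Pair = new Regex(
        @"\b(?<key>" + Keys + @")\b\s*=\s*(?<value>.*?)(?=\s*\b(?:" + Keys + @")\b\s*=|$)");

    public static Dictionary<string, string> Parse(string input)
    {
        var result = new Dictionary<string, string>();
        foreach (Match m in Pair.Matches(input))
        {
            // A repeated key overwrites the earlier value in this sketch.
            result[m.Groups["key"].Value] = m.Groups["value"].Value.Trim();
        }
        return result;
    }

    static void Main()
    {
        string input = "DIVIDE = KE48, KE49, KE50 CLACOS = 4566D DIV = 3466 INT = 4567 & 4568";
        foreach (var kv in Parse(input))
            Console.WriteLine(kv.Key + " -> " + kv.Value);
    }
}
```

Multi-valued entries such as "4567 & 4568" stay in one string here; they can be split on '&' or ',' afterwards, as the question's AddKeyValue already does.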

This might be a simple alternative for you.
Try this code:
var input = "DIVIDE = KE48 CLACOS = 4556D DIV = 3466 INT = 4567";
var parts = input.Split(new [] { '=', ' ' }, StringSplitOptions.RemoveEmptyEntries);
var dictionary =
parts.Select((x, n) => new { x, n })
.GroupBy(xn => xn.n / 2, xn => xn.x)
.Select(xs => xs.ToArray())
.ToDictionary(xs => xs[0], xs => xs[1]);
I then get a dictionary that maps DIVIDE to KE48, CLACOS to 4556D, DIV to 3466 and INT to 4567.
Based on your updated input, things get more complicated, but this works:
var input = "DIVIDE = KE48, KE49, KE50 CLACOS = 4566D DIV = 3466 INT = 4567 & 4568";
Func<string, char, string> tighten =
(i, c) => String.Join(c.ToString(), i.Split(c).Select(x => x.Trim()));
var parts =
tighten(tighten(input, '&'), ',')
.Split(new[] { '=', ' ' }, StringSplitOptions.RemoveEmptyEntries);
var dictionary =
parts
.Select((x, n) => new { x, n })
.GroupBy(xn => xn.n / 2, xn => xn.x)
.Select(xs => xs.ToArray())
.ToDictionary(
xs => xs[0],
xs => xs
.Skip(1)
.SelectMany(x => x.Split(','))
.SelectMany(x => x.Split('&'))
.ToArray());
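I get a dictionary that maps DIVIDE to { KE48, KE49, KE50 }, CLACOS to { 4566D }, DIV to { 3466 } and INT to { 4567, 4568 }.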

Cannot replace last element in string List

I have an input file that includes data on an entertainer and their performance score. For example,
1. Bill Monohan from North Town 10.54
2. Mary Greenberg from Ohio 3.87
3. Sean Hollen from Markell 7.22
I want to be able to take the last number from a line (their score), perform some math on it, and then replace the old score with the new score.
Here's a brief piece of code for what I'm trying to do:
string line;
StreamReader reader = new StreamReader(@"file.txt");
//Read each line and split by spaces into a List.
while ((line = reader.ReadLine())!= null){
//Find last item in List and convert to a Double in order to perform calculations.
List<string> l = new List<string>();
l = line.Split(null).ToList();
string lastItem = line.Split(null).Last();
Double newItem = Convert.ToDouble(lastItem);
/*Do some math*/
/*Replace lastItem with newItem*/
System.Console.WriteLine(line); }
When I write the new line, nothing changes but I want lastItem to be switched with newItem at the end of the line now. I've tried using:
l[l.Length - 1] = newItem.ToString();
But I'm getting no luck. I just need the best way to replace the last value of a string List like this. I've been going at this for a few hours now and I'm almost at the end of my rope.
Please help me c# masters!

You can use regular expression MatchEvaluator to get number from each line, do calculations, and replace original number with new one:
string line = "1. Bill Monohan from North Town 10.54";
line = Regex.Replace(line, @"(\d+\.?\d*)$", m => {
decimal value = Decimal.Parse(m.Groups[1].Value);
value = value * 2; // calculation
return value.ToString();
});
This regex captures decimal number at the end of input string. Output:
1. Bill Monohan from North Town 21.08

You're not changing your line object before calling WriteLine.
You will have to rebuild your line, something like this:
var items = line.Split(' ');
items[items.Length - 1] = "10"; // Replace the last item
line = string.Join(" ", items);
Tip: strings are immutable, look it up.

This should work:
//var l = new List<string>(); // you don't need this
var l = line.Split(null).ToList();
var lastItem = l.Last(); // line.Split(null).Last(); don't split twice
var newItem = Convert.ToDouble(lastItem, CultureInfo.InvariantCulture);
/*Do some math*/
/*Replace lastItem with newItem*/
l[l.Count - 1] = newItem.ToString(); // change the last element
//Console.WriteLine(line); // line is the original string don't work
Console.WriteLine(string.Join(" ", l)); // create new string

This would probably do the job for you. A word on reading files, though: if the file fits in memory, read it all at once. That typically means a single disk access (depending on file size), and you do not have to worry about file handles.
// Read the stuff from the file, gets an string[]
var lines = File.ReadAllLines(@"file.txt");
foreach (var line in lines)
{
var splitLine = line.Split(' ');
var score = double.Parse(splitLine.Last(), CultureInfo.InvariantCulture);
// The math wizard is in town!
score = score + 3;
// Put it back
splitLine[splitLine.Count() - 1] = score.ToString();
// newLine is the new line, what should we do with it?
var newLine = string.Join(" ", splitLine);
// Lets print it cause we are out of ideas!
Console.WriteLine(newLine);
}
What do you want to do with the end result? Do you want it written back to file?

Try this
string subjectString = "Sean Hollen from Markell 7.22";
double Substring = double.Parse(subjectString.Substring(subjectString.IndexOf(Regex.Match(subjectString, @"\d+").Value), subjectString.Length - subjectString.IndexOf(Regex.Match(subjectString, @"\d+").Value)).ToString());
double NewVal = Substring * 10; // Or any of your operation
subjectString = subjectString.Replace(Substring.ToString(), NewVal.ToString());
Note: This will not work if the number appears twice on the same line

You are creating and initializing the list inside the loop, so it only ever contains the current line. Do you want to find the highest score of all entertainers or the highest score of each entertainer (in case an entertainer could repeat in the file)?
However, here is an approach that gives you both:
var allWithScore = File.ReadAllLines(path)
.Select(l =>
{
var split = l.Split();
string entertainer = string.Join(" ", split.Skip(1).Take(split.Length - 2));
double score;
bool hasScore = double.TryParse(split.Last(), NumberStyles.Float, CultureInfo.InvariantCulture, out score);
return new { line = l, split, entertainer, hasScore, score };
})
.Where(x => x.hasScore);
// highest score of all:
double highestScore = allWithScore.Max(x => x.score);
// entertainer with highest score
var entertainerWithHighestScore = allWithScore
.OrderByDescending(x => x.score)
.GroupBy(x => x.entertainer)
.First();
foreach (var x in entertainerWithHighestScore)
Console.WriteLine("Entertainer:{0} Score:{1}", x.entertainer, x.score);
// all entertainer's highest scores:
var allEntertainersHighestScore = allWithScore
.GroupBy(x => x.entertainer)
.Select(g => g.OrderByDescending(x => x.score).First());
foreach (var x in allEntertainersHighestScore)
Console.WriteLine("Entertainer:{0} Score:{1}", x.entertainer, x.score);

Occurence of elements in the file with c# and Dictionary

I have a file as
outlook temperature Humidity Windy PlayTennis
sunny hot high false N
sunny hot high true N
overcast hot high false P
rain mild high false P
rain cool normal false P
rain cool normal true N
I want to find the number of occurrences of each element, e.g.
sunny: 2
rain: 3
overcast:1
hot: 3
and so on
My code is:
string file = openFileDialog1.FileName;
var text1 = File.ReadAllLines(file);
StringBuilder str = new StringBuilder();
string[] lines = File.ReadAllLines(file);
string[] nonempty=lines.Where(s => s.Trim(' ')!="")
.Select(s => Regex.Replace(s, @"\s+", " ")).ToArray();
string[] colheader = null;
if (nonempty.Length > 0)
colheader = nonempty[0].Split();
else
return;
var linevalue = nonempty.Skip(1).Select(l => l.Split());
int colcount = colheader.Length;
Dictionary<string, string> colvalue = new Dictionary<string, string>();
for (int i = 0; i < colcount; i++)
{
int k = 0;
foreach (string[] values in linevalue)
{
if(! colvalue.ContainsKey(values[i]))
{
colvalue.Add(values[i],colheader[i]);
}
label2.Text = label2.Text + k.ToString();
}
}
foreach (KeyValuePair<string, string> pair in colvalue)
{
label1.Text += pair.Key+ "\n";
}
Output I get here is
sunny
overcast
rain
hot
mild
cool
N
P
true
false
I also want to find the number of occurrences, which I am unable to get. Can you please help me out here?

This LINQ query will return Dictionary<string, int> which will contain each word in file as key, and word's occurrences as value:
var occurences = File.ReadAllLines(file).Skip(1) // skip titles line
.SelectMany(l => l.Split(new []{' '}, StringSplitOptions.RemoveEmptyEntries))
.GroupBy(w => w)
.ToDictionary(g => g.Key, g => g.Count());
Usage of dictionary:
int sunnyOccurences = occurences["sunny"];
foreach(var pair in occurences)
label1.Text += String.Format("{0}: {1}\n", pair.Key, pair.Value);

It seems to me that you are implementing a simple tag cloud. I have used non-generic collections, but you can replace them with generic ones, for example Hashtable with Dictionary.
Follow this code:
Hashtable tagCloud = new Hashtable();
ArrayList frequency = new ArrayList();
Read from a file and store it as array
string[] lines = File.ReadAllLines("file.txt");
//use the specific delimiter
char[] delimiter = new char[] { ' ' };
StringBuilder buffer = new StringBuilder();
foreach (string line in lines)
{
if (line.ToString().Length != 0)
{
buffer.Append((" " + line.Trim()));
}
}
string[] words = buffer.ToString().Trim().Split(delimiter);
Storing occurrence of each word.
List<string> listOfWords = new List<string>(words);
foreach (string i in listOfWords)
{
int c = 0;
foreach (string j in words)
{
if (i.Equals(j))
c++;
}
frequency.Add(c);
}
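Store as key-value pairs. The key is the word and the value is its occurrence count.
for (int i = 0; i < listOfWords.Count; i++)
{
//use dictionary here
// listOfWords still contains repeated words, so add each key only once
if (!tagCloud.ContainsKey(listOfWords[i]))
tagCloud.Add(listOfWords[i], (int)frequency[i]);
}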
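Following the suggestion to swap the Hashtable for a generic Dictionary, the counting can also be done in a single pass instead of the nested loops. This is a minimal sketch, assuming whitespace-separated words; the names WordCounter and Count are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class WordCounter
{
    // Counts how often each whitespace-separated word occurs across all lines.
    public static Dictionary<string, int> Count(IEnumerable<string> lines)
    {
        var counts = new Dictionary<string, int>();
        char[] delimiters = { ' ', '\t' };
        foreach (string line in lines)
        {
            foreach (string word in line.Split(delimiters, StringSplitOptions.RemoveEmptyEntries))
            {
                int current;
                counts.TryGetValue(word, out current); // current stays 0 for a new word
                counts[word] = current + 1;
            }
        }
        return counts;
    }

    static void Main()
    {
        var lines = new[]
        {
            "sunny hot high false N",
            "sunny hot high true N",
            "overcast hot high false P"
        };
        foreach (var pair in Count(lines))
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
    }
}
```

Each word is looked up once per occurrence, so the work grows linearly with the number of words rather than quadratically.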

If all you want is the keyword and a count of how many times they appear in the file, then lazyberezovsky's solution is about as elegant of a solution as you will find. But if you need to do any other metrics on the file's data, then I would load the file into a collection that keeps your other metadata intact.
Something simple like:
var forecasts = File.ReadAllLines(file).Skip(1) // skip the header row
.Select(line => line.Split(new []{' '}, StringSplitOptions.RemoveEmptyEntries)) // split the line into an array of strings
.Select (f =>
new
{
Outlook = f[0],
Temperature = f[1],
Humidity = f[2],
Windy = f[3],
PlayTennis = f[4]
});
will give you an IEnumerable<> of an anonymous type that has properties that can be queried.
For example if you wanted to see how many times "sunny" occurred in the Outlook then you could just use LINQ to do this:
var count = forecasts.Count( f => f.Outlook == "sunny");
Or if you just wanted the list of all outlooks you could write:
var outlooks = forecasts.Select(f => f.Outlook).Distinct();
Where this is useful is when you want to do more complicated queries like "How many rainy cool days are there?"
var count = forecasts.Count (f => f.Outlook == "rain" && f.Temperature == "cool");
Again if you just want all words and their occurrence count, then this is overkill.
