I need help with text processing. I have code that checks whether a line has an even number of words and, if so, finds every second word in a text file. The problem is that I don't know how to append a string after every second word and print the result.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Collections;
using System.IO;
namespace sd
{
class Program
{
const string CFd = "..\\..\\A.txt";
const string CFr = "..\\..\\Rezults.txt";
static void Main(string[] args)
{
Apdoroti(CFd, CFr);
Console.WriteLine();
}
static void Apdoroti(string fd, string fr)
{
string[] lines = File.ReadAllLines(fd, Encoding.GetEncoding(1257));
using (var far = File.CreateText(fr))
{
StringBuilder news = new StringBuilder();
VD(CFd,news);
far.WriteLine(news);
}
}
static void VD(string fv, StringBuilder news)
{
using (StreamReader reader = new StreamReader(fv,
Encoding.GetEncoding(1257)))
{
string[] lines = File.ReadAllLines(fv, Encoding.GetEncoding(1257));
int nrl;
int prad = 1;
foreach (string line in lines)
{
nrl = line.Trim().Split(' ').Count();
string[] parts = line.Split(' ');
if (nrl % 2 == 0)
{
Console.WriteLine(nrl);
for (int i = 0; i < nrl; i += 2)
{
int ind = line.IndexOf(parts[i]);
news.Append(parts[i]);
Console.WriteLine(" {0} ", news);
}
}
}
}
}
}
}
For example, if I have a text like:
"Monster in the Jungle Once upon a time a wise lion lived in jungle.
He was always respected for his intelligence and kindness."
Then it should print out:
"Monster in abb the Jungle abb Once upon abb a time abb a wise abb lion lived abb in jungle.
He was always respected for his intelligence and kindness."
You can do it with a regex replace, using this regex:
@"\w+\s\w+\s"
It matches a word, a space, a word and a space.
Now replace it with:
"$&abb "
How to use:
using System.Text.RegularExpressions;
string text = "Monster in the Jungle Once upon a time a wise lion lived in jungle. He was always respected for his intelligence and kindness.";
Regex regex = new Regex(@"\w+\s\w+\s");
string output = regex.Replace(text, "$&abb ");
Now you will get the desired output.
Edit:
To work with any number of words, you can use:
@"(\w+\s){3}"
where the quantifier (here 3) can be set to whatever you want.
Edit2:
If you don't want to match numbers:
@"([a-zA-Z]+\s){2}"
You can use LINQ: first split the line on spaces to get an array of words (as you are already doing), then append the required text to every element at an odd index, and finally join the array back into a string.
string test = "Monster in the Jungle Once upon a time a wise lion lived in jungle. He was always respected for his intelligence and kindness.";
var words = test.Split(' ');
var wordArray = words.Select((w, i) =>
(i % 2 != 0) ? (w+ " asd ") : (w + " ")
).ToArray();
var res = string.Join("", wordArray);
This can also easily be changed to insert after every n words by changing the modulus. Do remember that array indexes start at 0, though.
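The "every n words" variation mentioned above could look roughly like this. It is only a minimal sketch: `WordInserter` and `InsertEveryN` are names invented for illustration, and it assumes words are separated by single spaces.

```csharp
using System;
using System.Linq;

public static class WordInserter
{
    // Hypothetical helper: appends `extra` after every n-th word of `text`.
    public static string InsertEveryN(string text, int n, string extra)
    {
        if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n));

        string[] words = text.Split(' ');

        // i is zero-based, so the n-th, 2n-th, ... words are those where (i + 1) % n == 0.
        var pieces = words.Select((w, i) => (i + 1) % n == 0 ? w + " " + extra : w);

        return string.Join(" ", pieces);
    }

    public static void Main()
    {
        Console.WriteLine(InsertEveryN("Monster in the Jungle Once upon", 2, "abb"));
    }
}
```

Joining with a single space at the end avoids the trailing space that the version above leaves after the last word.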
Hello again, dear friends. I do not for the life of me understand what is going on in this code. I'm trying to implement a dictionary that counts how many times each word appears, ignoring case. The output keeps showing "isthis" and I don't know where it's coming from. How do I fix this?
The exercise is as follows:
Write a program that counts how many times each word from a given
text file words.txt appears in it. The result words should be ordered by
their number of occurrences in the text.
Here is the code
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
using System.Text.RegularExpressions;
namespace Chapter_18_Question_3
{
class Program
{
static void Main(string[] args)
{
const string path = "words.txt";
string line;
using (var reader = new StreamReader(path))
{
line = reader.ReadToEnd();
}
string text = line.ToLower();
string tmp = Regex.Replace(text, "[^a-zA-Z0-9 ]", "");
string[] newText = tmp.Split(' ');
var table = new SortedDictionary<string, int>();
foreach(var item in newText)
{
if(!table.ContainsKey(item))
{
table.Add(item, 1);
}
else
{
table[item] += 1;
}
}
foreach (var item in table)
{
Console.WriteLine("The word {0} appeared {1} times",
item.Key, item.Value);
}
}
}
}
My text is this:
"This is the TEXT. Text, text, text – THIS TEXT! Is this the text?"
And the output is this
The word appeared 1 times
The word is appeared 1 times
The word isthis appeared 1 times
The word text appeared 6 times
The word the appeared 2 times
The word this appeared 2 times
If I were to guess, I'd say that your file contains a line break (LF or CRLF) that gets removed by your regex (which only keeps letters, digits and spaces).
For instance, if the file contents were:
This
is the text.
The line break between This and is would be removed, leaving you with the text:
Thisis the text.
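If this is the case, you might want to use "[^a-zA-Z0-9 \r\n]" as the pattern instead, and also split on '\r' and '\n' so that the words on either side of the line break end up as separate entries.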
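One way to put that together is sketched below. It assumes the goal is the exercise quoted in the question (counts ordered by number of occurrences); `WordCounter` and `Count` are made-up names, and the sample text is the one from the question with a line break added after the first word. Whitespace is kept by the regex and then used for splitting, and empty entries are dropped, which also gets rid of the empty "word" in the output above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordCounter
{
    // Hypothetical helper: returns (word, count) pairs, most frequent first.
    public static List<KeyValuePair<string, int>> Count(string text)
    {
        // Remove everything except letters, digits and whitespace (line breaks survive).
        string cleaned = Regex.Replace(text.ToLowerInvariant(), @"[^a-z0-9\s]", "");

        // Split on spaces, tabs and line breaks; skip the empty strings left by runs of separators.
        string[] words = cleaned.Split(new[] { ' ', '\t', '\r', '\n' },
                                       StringSplitOptions.RemoveEmptyEntries);

        var counts = new Dictionary<string, int>();
        foreach (string word in words)
        {
            int current;
            counts.TryGetValue(word, out current); // current stays 0 for a word not seen yet
            counts[word] = current + 1;
        }

        // Order by count (descending), then alphabetically to keep ties stable.
        return counts.OrderByDescending(p => p.Value)
                     .ThenBy(p => p.Key, StringComparer.Ordinal)
                     .ToList();
    }

    public static void Main()
    {
        string text = "This\r\nis the TEXT. Text, text, text \u2013 THIS TEXT! Is this the text?";
        foreach (var pair in Count(text))
            Console.WriteLine("The word {0} appeared {1} times", pair.Key, pair.Value);
    }
}
```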
I am writing a converter program that converts an old version of our data to the new version: you paste the old text into a text box and it converts TXT to XML. I'm trying to get each item between two characters, and below is the string I'm trying to split. I have put just the name of each parameter inside the " " to protect my users' credentials. So I want to get every part of the code between the ","
["Id","Username","Cash","Password"],["Id","Username","Cash","Password"]
And then add each string to a list so it would be like
Item 1
["Id","Username","Cash","Password"]
Item 2
["Id","Username","Cash","Password"]
I would split it on ",", but that would break because there is also a "," between the parameters inside each item, so I tried using "],"
string input = textBox1.Text;
string[] parts1 = input.Split(new string[] { "]," }, StringSplitOptions.None);
foreach (string str in parts1)
{
//Params is a list...
Params.Add(str);
}
MessageBox.Show(string.Join("\n\n", Params));
But it takes the ] off the end of each one, and it messes up in other ways.
This looks like a great opportunity for Regular Expressions.
My approach would be to get the row parts first, then get the column parts. I'm sure there are about 30 ways to do this, but this is my (simplistic) approach.
using System;
using System.Text.RegularExpressions;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
var rowPattern = new Regex(@"(?<row>\[[^]]+\])", RegexOptions.Multiline | RegexOptions.ExplicitCapture);
var columnPattern = new Regex(@"(?<column>\"".+?\"")", RegexOptions.Multiline | RegexOptions.ExplicitCapture);
var data = "[\"Id\",\"Username\",\"Cash\",\"Password\"],[\"Id\",\"Username\",\"Cash\",\"Password\"]";
var rows = rowPattern.Matches(data);
var rowCounter = 0;
foreach (var row in rows)
{
Console.WriteLine("Row #{0}", ++rowCounter);
var columns = columnPattern.Matches(row.ToString());
foreach (var column in columns)
Console.WriteLine("\t{0}", column);
}
Console.ReadLine();
}
}
}
Hope this helps!!
You can use Regex.Split() together with positive lookbehind and lookahead to do this:
var parts = Regex.Split(input, "(?<=]),(?=\\[)");
Basically this says “split on , with ] right before it and [ right after it”.
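As a small self-contained illustration of that split (the `RowSplitter` class and the shortened sample input are only for demonstration):

```csharp
using System;
using System.Text.RegularExpressions;

public static class RowSplitter
{
    // Splits on a comma only when "]" comes right before it and "[" right after it,
    // so the brackets stay attached to their rows.
    public static string[] SplitRows(string input)
    {
        return Regex.Split(input, @"(?<=\]),(?=\[)");
    }

    public static void Main()
    {
        string input = "[\"Id\",\"Username\"],[\"Id\",\"Username\"]";
        foreach (string row in SplitRows(input))
            Console.WriteLine(row);
    }
}
```

Because the lookbehind and lookahead do not consume any characters, only the comma itself is removed, and the commas inside each row are left alone.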
Assuming that the character '|' does not occur in your original data, you can try:
input.Replace("],[", "]|[").Split(new char[]{'|'});
If the pipe character does occur, use another (non-occurring) character.
I am still a beginner in C#, and I know there is a method that can be used to do this, but I can't seem to find it online.
I have a function that permutes a word:
static void Main(string[] args)
{
string[] list = "a b c d".Split();
foreach (string[] permutation in Permutations<string>.AllFor(list))
{
System.Console.WriteLine(string.Join(" ", permutation));
}
}
However, it only works with words that are already broken up (e.g. "a b c d"). Since that is not really a practical way to ask a user for input, I want to find a way to take an unbroken word from the user (like "hello") and break it up so the function can handle it. For example, the input word "happy" should become the spaced word "h a p p y".
I tried this code:
//splits the word into an array
string[] arr = name.Split();
//splits the array with spaces to enter into the program
name = string.Join(" ",arr);
arr = name.Split();
But it just ends up coming out unbroken anyway. Can someone tell me the easiest way to do this?
Just to mention: I am still a beginner in C# and in programming in general, so I might not understand some of the higher-level concepts. I have been through some answers on this website, and some of them I don't understand at all.
You can loop over the string to convert it to an array, and then use Join.
using System.Text.RegularExpressions;
using System;
public class Program{
public static void Main(string[] args) {
string v = "hello";
// Convert into a string array, the old-fashioned way.
string[] name = new string[v.Length];
for (int i = 0; i < v.Length; i++)
name[i] = v[i] + "";
string feedToPermutationFunction = string.Join(" ", name);
// Feed the above string into your permutation code.
}
}
You just need to separate each character and then concatenate them with a space:
This is the simplest way:
var userInput = Console.ReadLine();
var output = string.Join<char>(" ", userInput);
Console.WriteLine(output);
char[] array = input.ToCharArray();
string val = "";
for (int i = 0; i < array.Length; i++)
{
val += array[i] + " ";
}
This gives you a string, val, with a space after each character, as you wanted.
To get an array instead, create one with the length of the input string and copy one character into each element:
string[] strarray = new string[input.Length];
for (int i = 0; i < strarray.Length; i++)
{
strarray[i] = input.Substring(i, 1); // i is the index in the string; 1 is the length of each piece
}
I'm currently fiddling with a little project: a Countdown-type game (after the TV show).
At the moment, the program lets the user pick vowels or consonants up to a limit of 9 letters and then asks them to enter the longest word they can think of using these 9 letters.
I have a large text file acting as a dictionary, which I search using the string the user entered to check whether it is a valid word. My problem is that I then want to search my dictionary for the longest word that can be made from the nine letters, but I just can't find a way to implement it.
So far I've tried putting every word into an array and checking each element to see whether it contains the letters, but this won't cover me if the longest word that can be made out of the 9 letters is an 8-letter word. Any ideas?
Currently I have this (This is under the submit button on the form, sorry for not providing code or mentioning it's a windows form application):
StreamReader textFile = new StreamReader("C:/Eclipse/Personal Projects/Local_Projects/Projects/CountDown/WindowsFormsApplication1/wordlist.txt");
int counter1 = 0;
String letterlist = (txtLetter1.Text + txtLetter2.Text + txtLetter3.Text + txtLetter4.Text + txtLetter5.Text + txtLetter6.Text + txtLetter7.Text + txtLetter8.Text + txtLetter9.Text); // stores the letters into a string
char[] letters = letterlist.ToCharArray(); // reads the letters into a char array
string[] line = File.ReadAllLines("C:/Eclipse/Personal Projects/Local_Projects/Projects/CountDown/WindowsFormsApplication1/wordlist.txt"); // reads every line in the word file into a string array (there is a new word on everyline, and theres 144k words, i assume this will be a big performance hit but i've never done anything like this before so im not sure ?)
line.Any(x => line.Contains(x)); // just playing with linq, i've no idea what im doing though as i've never used before
for (int i = 0; i < line.Length; i++)// a loop that loops for every word in the string array
// if (line.Contains(letters)) //checks if a word contains the letters in the char array(this is where it gets hazy if i went this way, i'd planned on only using words witha letter length > 4, adding any words found to another text file and either finding the longest word then in this text file or keeping a running longest word i.e. while looping i find a word with 7 letters, this is now the longest word, i then go to the next word and it has 8 of our letters, i now set the longest word to this)
counter1++;
if (counter1 > 4)
txtLongest.Text += line + Environment.NewLine;
Mike's code:
using System;
using System.Collections.Generic;
using System.Linq;
class Program
{
static void Main(string[] args) {
var letters = args[0];
var wordList = new List<string> { "abcbca", "bca", "def" }; // dictionary
var results = from string word in wordList // makes every word in dictionary into a separate string
where IsValidAnswer(word, letters) // calls isvalid method
orderby word.Length descending // sorts the word with most letters to top
select word; // selects that word
foreach (var result in results) {
Console.WriteLine(result); // outputs the word
}
}
private static bool IsValidAnswer(string word, string letters) {
foreach (var letter in word) {
if (letters.IndexOf(letter) == -1) { // checks if theres letters in the word
return false;
}
letters = letters.Remove(letters.IndexOf(letter), 1);
}
return true;
}
}
Here's an answer I knocked together in a couple of minutes which should do what you want. As others have said, this problem is complex and so the algorithm is going to be slow. The LINQ query evaluates each string in the dictionary, checking whether the supplied letters can be used to produce said word.
using System;
using System.Collections.Generic;
using System.Linq;
class Program
{
static void Main(string[] args) {
var letters = args[0];
var wordList = new List<string> { "abcbca", "bca", "def" };
var results = from string word in wordList
where IsValidAnswer(word, letters)
orderby word.Length descending
select word;
foreach (var result in results) {
Console.WriteLine(result);
}
}
private static bool IsValidAnswer(string word, string letters) {
foreach (var letter in word) {
if (letters.IndexOf(letter) == -1) {
return false;
}
letters = letters.Remove(letters.IndexOf(letter), 1);
}
return true;
}
}
So where are you getting stuck? Start with the slow brute-force method and just find all the words that contain all the characters. Then order the words by length to get the longest. If you don't want to return a word that is shorter than the number of characters being sought (which I guess is only an issue if there are duplicate characters???), then add a test and eliminate that case.
I've had some more thoughts about this. I think the way to do it efficiently is by preprocessing the dictionary, ordering the letters in each word in alphabetical order and ordering the words in the list alphabetically too (you'll probably have to use some sort of multimap structure to store the original word and the sorted word).
Once you've done that you can much more efficiently find the words that can be generated from your pool of letters. I'll come back and flesh out an algorithm for doing this later, if someone else doesn't beat me to it.
Step 1: Construct a trie in which each word is stored under its letters sorted alphabetically.
Example: EACH is sorted to ACEH and stored as A->C->E->H->(EACH, ACHE, ..) in the trie (ACHE is an anagram of EACH).
Step 2: Sort the input letters and find the longest word corresponding to that set of letters in the trie.
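The two steps above can be sketched as follows. This is a minimal sketch rather than a tuned implementation: `AnagramTrie`, `Add` and `Longest` are names made up for illustration, words are lowercased before sorting, and when several words share the best length the first one added is returned.

```csharp
using System;
using System.Collections.Generic;

public class AnagramTrie
{
    private class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public readonly List<string> Words = new List<string>(); // anagrams that end at this node
    }

    private readonly Node root = new Node();

    private static char[] SortedLetters(string s)
    {
        char[] letters = s.ToLowerInvariant().ToCharArray();
        Array.Sort(letters);
        return letters;
    }

    // Step 1: store the word under the path spelled by its sorted letters.
    public void Add(string word)
    {
        Node node = root;
        foreach (char c in SortedLetters(word))
        {
            Node next;
            if (!node.Children.TryGetValue(c, out next))
            {
                next = new Node();
                node.Children[c] = next;
            }
            node = next;
        }
        node.Words.Add(word);
    }

    // Step 2: sort the available letters and walk every path that uses a subset of them.
    // Returns null when no stored word can be built.
    public string Longest(string letters)
    {
        return Search(root, SortedLetters(letters), 0, null);
    }

    private static string Search(Node node, char[] letters, int start, string best)
    {
        if (node.Words.Count > 0 && (best == null || node.Words[0].Length > best.Length))
            best = node.Words[0];

        for (int j = start; j < letters.Length; j++)
        {
            // A repeated letter at the same level leads to the same child, so try it only once.
            if (j > start && letters[j] == letters[j - 1]) continue;

            Node child;
            if (node.Children.TryGetValue(letters[j], out child))
                best = Search(child, letters, j + 1, best);
        }
        return best;
    }

    public static void Main()
    {
        var trie = new AnagramTrie();
        foreach (string w in new[] { "each", "ache", "cat", "teach" })
            trie.Add(w);
        Console.WriteLine(trie.Longest("hcaetx"));
    }
}
```

Because both the stored paths and the input letters are sorted, every word that can be built from the letters corresponds to a path that picks letters from left to right, which is what the recursive search enumerates.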
Have you tried implementing something like this? It would be great to see the code you have tried.
string[] strArray = {"ABCDEFG", "HIJKLMNOP"};
string findThisString = "JKL";
int strNumber;
int strIndex = 0;
for (strNumber = 0; strNumber < strArray.Length; strNumber++)
{
strIndex = strArray[strNumber].IndexOf(findThisString);
if (strIndex >= 0)
break;
}
System.Console.WriteLine("String number: {0}\nString index: {1}",
strNumber, strIndex);
This should do the job:
private static void Main()
{
char[] picked_char = {'r', 'a', 'j'};
string[] dictionary = new[] {"rajan", "rajm", "rajnujaman", "rahim", "ranjan"};
var words = dictionary.Where(word => picked_char.All(word.Contains)).OrderByDescending(word => word.Length);
foreach (string needed_words in words)
{
Console.WriteLine(needed_words);
}
}
Output :
rajnujaman
ranjan
rajan
rajm
A search procedure that uses full-text search (meaning it is hard to reproduce the match outside the procedure) returns rows with the matched string highlighted inside, like:
"i have been <em>match</em>ed"
"a <em>match</em> will happen in the word <em>match</em>"
"some random words including the word <em>match</em> here"
Now I need to get the first x characters of the string, but I'm having trouble with the HTML tags inside.
Like:
"i have been <em>mat</em>..." -> first 15 characters
"a <em>match</em> will happen in the word <em>m</em>..." -> first 33 characters
"some rando..." -> first 10 characters
I have tried using some if/else logic, but I ended up with a lot of spaghetti code.
Any tips?
This should do what you want based on there only being <em> tags.
using System;
using System.Collections.Generic;
using System.Text;
namespace Test
{
public class Program
{
public static void Main(string[] args)
{
var dbResults = GetMatches();
var firstLine = HtmlSubstring(dbResults[0], 0, 15);
Console.WriteLine(firstLine);
var secondLine = HtmlSubstring(dbResults[1], 0, 33);
Console.WriteLine(secondLine);
var thirdLine = HtmlSubstring(dbResults[2], 0, 10);
Console.WriteLine(thirdLine);
Console.Read();
}
private static List<string> GetMatches()
{
return new List<string>
{
"i have been <em>match</em>ed"
,"a <em>match</em> will happen in the word <em>match</em>"
, "some random words including the word <em>match</em> here"
};
}
private static string HtmlSubstring(string mainString, int start, int length = int.MaxValue)
{
StringBuilder substringResult = new StringBuilder(mainString.Replace("</em>", "").Replace("<em>", "").Substring(start, length));
// Get indexes between start and (start + length) that need highlighting.
int matchIndex = mainString.IndexOf("<em>", start);
while (matchIndex > 0 && matchIndex < (substringResult.Length - start))
{
int matchIndexConverted = matchIndex - start;
int matchEndIndex = mainString.IndexOf("</em>", matchIndex) - start;
substringResult.Insert(matchIndexConverted, "<em>");
substringResult.Insert(Math.Min(substringResult.Length, matchEndIndex), "</em>");
matchIndex = mainString.IndexOf("<em>", matchIndex + 1);
}
return substringResult.ToString();
}
}
}
I suggest writing a simple parser with a few states - InText, InOpeningTag, InClosingTag are a few that come to mind.
Just loop through the characters, figure out if you are InText, only counting those characters... Once you reach your limit, don't add any more text and if you are between opening and closing tags, just add the closing tag.
Take a look at the source code for the HTML Agility Pack if you don't know what I am talking about (look for the Parse methods).
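A rough sketch of such a parser is shown below. It assumes, as in the question, that the only tags present are <em> and </em>; `HtmlTruncator` and `Truncate` are hypothetical names, and appending the trailing "..." is left to the caller.

```csharp
using System;
using System.Text;

public static class HtmlTruncator
{
    // Hypothetical helper: keeps the first maxChars visible characters of html,
    // copies tags through without counting them, and closes an <em> left open.
    public static string Truncate(string html, int maxChars)
    {
        var result = new StringBuilder();
        bool inTag = false;   // true while between '<' and '>'
        bool emOpen = false;  // true after "<em>" has been copied and before "</em>"
        int tagStart = 0;     // index of the '<' of the tag being read
        int visible = 0;      // number of text characters copied so far

        for (int i = 0; i < html.Length; i++)
        {
            char c = html[i];
            if (inTag)
            {
                result.Append(c);
                if (c == '>')
                {
                    inTag = false;
                    string tag = html.Substring(tagStart, i - tagStart + 1);
                    if (tag == "<em>") emOpen = true;
                    else if (tag == "</em>") emOpen = false;
                }
            }
            else
            {
                // Limit reached: stop before copying more text or starting another tag.
                if (visible == maxChars) break;

                if (c == '<')
                {
                    inTag = true;
                    tagStart = i;
                    result.Append(c);
                }
                else
                {
                    result.Append(c);
                    visible++;
                }
            }
        }

        if (emOpen) result.Append("</em>");
        return result.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Truncate("i have been <em>match</em>ed", 15) + "...");
    }
}
```

Counting only the characters outside tags means the limit applies to what the user actually sees, and tracking whether an <em> is open keeps the output well-formed when the cut falls inside a highlighted match.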