read specific websourcecode in c# - c#

When I press a button the following happens:
HttpWebRequest request = (HttpWebRequest)WebRequest
.Create("http://oldschool.runescape.com/slu");
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
StreamReader sr = new StreamReader(response.GetResponseStream());
richTextBox1.Text = sr.ReadToEnd();
sr.Close();
In short the data gets transferred to my textbox (this works perfectly)
Now if I choose world 78 (for example, from a combobox, it will refer to the last digits of that line) I want to get the value 968, if i choose world 14, I want to get the value 973.
This is an example of the printed data
e(378,true,0,"oldschool78",968,"United States","US","Old School 78");
e(314,true,0,"oldschool14",973,"United States","US","Old School 14");
What can I use to read this?

So there are two problems here, the first is selecting the right line, then getting the number out.
First you want a method for getting each of the lines in to a list, eg using something like this:
List<String> lines = new List<String>()
string line = sr.ReadLine();
while(line != null)
{
lines.Add(line);
line = sr.ReadLine(); // read the next line
}
Then you need to find the relevant line and get the token out of it.
Probably the most simple way is, for each line, split the string up by ',', '\"', '(' and ')' (using
String.Split). Ie, we get basically the parameters.
Eg
foreach(string lineInFile in lines)
{
// split the string in to tokens
string[] tokens = lineInFile.Split(',', '\"', '(', ')');
// based on the sample strings and how we've split this,
// we take the 15th entry
string endParameter = tokens[15]; //endParamter = "Old School 14"
...
We now use a regular expression to extract the number. The pattern we will use is d+, ie 1 or more digits.
Regex numberFinder = new Regex("\\d+");
Match numberMatch = numberFinder.Match(endParameter);
// we assume that there is a match, because if there isn't the string isn't
// correct, you should do some error handling here
string matchedNumber = numberMatch.Value;
int value = Int32.Parse(matchedValue); // we convert the string in to the number
if(value == desiredValue)
...
We check if the value matches the value we were looking for (eg 14), we now need to get the number you wanted.
We've already split the parameters, and the number we want is the 8th item (eg index 7 in string[] tokens). Since, at least in your example, this is just a lone number, we can just parse this to get the int.
{
return Int32.Parse(tokens[7]);
}
}
Again here we are assuming that the string is in the formats you showed, and you should do error protection here to.

Related

How to contact whole text from file into the string avoiding empty lines beetwen strings

How to get whole text from document contacted into the string. I'm trying to split text by dot: string[] words = s.Split('.'); I want take this text from text document. But if my text document contains empty lines between strings, for example:
pat said, “i’ll keep this ring.”
she displayed the silver and jade wedding ring which, in another time track,
she and joe had picked out; this
much of the alternate world she had elected to retain. he wondered what - if any - legal basis she had kept in addition. none, he hoped; wisely, however, he said nothing. better not even to ask.
result looks like this:
1. pat said ill keep this ring
2. she displayed the silver and jade wedding ring which in another time track
3. she and joe had picked out this
4. much of the alternate world she had elected to retain
5. he wondered what if any legal basis she had kept in addition
6. none he hoped wisely however he said nothing
7. better not even to ask
but desired correct output should be like this:
1. pat said ill keep this ring
2. she displayed the silver and jade wedding ring which in another time track she and joe had picked out this much of the alternate world she had elected to retain
3. he wondered what if any legal basis she had kept in addition
4. none he hoped wisely however he said nothing
5. better not even to ask
So to do this first I need to process text file content to get whole text as single string, like this:
pat said, “i’ll keep this ring.” she displayed the silver and jade wedding ring which, in another time track, she and joe had picked out; this much of the alternate world she had elected to retain. he wondered what - if any - legal basis she had kept in addition. none, he hoped; wisely, however, he said nothing. better not even to ask.
I can't to do this same way as it would be with list content for example: string concat = String.Join(" ", text.ToArray());,
I'm not sure how to contact text into string from text document
I think this is what you want:
var fileLocation = #"c:\\myfile.txt";
var stringFromFile = File.ReadAllText(fileLocation);
//replace Environment.NewLine with any new line character your file uses
var withoutNewLines = stringFromFile.Replace(Environment.NewLine, "");
//modify to remove any unwanted character
var withoutUglyCharacters = Regex.Replace(withoutNewLines, "[“’”,;-]", "");
var withoutTwoSpaces = withoutUglyCharacters.Replace(" ", " ");
var result = withoutTwoSpaces.Split('.').Where(i => i != "").Select(i => i.TrimStart()).ToList();
So first you read all text from your file, then you remove all unwanted characters and then split by . and return non empty items
Have you tried replacing double new-lines before splitting using a period?
static string[] GetSentences(string filePath) {
if (!File.Exists(filePath))
throw new FileNotFoundException($"Could not find file { filePath }!");
var lines = string.Join("", File.ReadLines(filePath).Where(line => !string.IsNullOrEmpty(line) && !string.IsNullOrWhiteSpace(line)));
var sentences = Regex.Split(lines, #"\.[\s]{1,}?");
return sentences;
}
I haven't tested this, but it should work.
Explanation:
if (!File.Exists(filePath))
throw new FileNotFoundException($"Could not find file { filePath }!");
Throws an exception if the file could not be found. It is advisory you surround the method call with a try/catch.
var lines = string.Join("", File.ReadLines(filePath).Where(line => !string.IsNullOrEmpty(line) && !string.IsNullOrWhiteSpace(line)));
Creates a string, and ignores any lines which are purely whitespace or empty.
var sentences = Regex.Split(lines, #".[\s]{1,}?");
Creates a string array, where the string is split at every period and whitespace following the period.
E.g:
The string "I came. I saw. I conquered" would become
I came
I saw
I conquered
Update:
Here's the method as a one-liner, if that's your style?
static string[] SplitSentences(string filePath) => File.Exists(filePath) ? Regex.Split(string.Join("", File.ReadLines(filePath).Where(line => !string.IsNullOrEmpty(line) && !string.IsNullOrWhiteSpace(line))), #"") : null;
I would suggest you to iterate through all characters and just check if they are in range of 'a' >= char <= 'z' or if char == ' '. If it matches the condition then add it to the newly created string else check if it is '.' character and if it is then end your line and add another one :
List<string> lines = new List<string>();
string line = string.Empty;
foreach(char c in str)
{
if((char.ToLower(c) >= 'a' && char.ToLower(c) <= 'z') || c == 0x20)
line += c;
else if(c == '.')
{
lines.Add(line.Trim());
line = string.Empty;
}
}
Working online example
Or if you prefer "one-liner"s :
IEnumerable<string> lines = new string(str.Select(c => (char)(((char.ToLower(c) >= 'a' && char.ToLower(c) <= 'z') || c == 0x20) ? c : c == '.' ? '\n' : '\0')).ToArray()).Split('\n').Select(s => s.Trim());
I may be wrong about this. I would think that you may not want to alter the string if you are splitting it. Example, there are double/single quote(s) (“) in part of the string. Removing them may not be desired which brings up the possibly of a question, reading a text file that contains single/double quotes (as your example data text shows) like below:
var stringFromFile = File.ReadAllText(fileLocation);
will not display those characters properly in a text box or the console because the default encoding using the ReadAllText method is UTF8. Example the single/double quotes will display (replacement characters) as diamonds in a text box on a form and will be displayed as a question mark (?) when displayed to the console. To keep the single/double quotes and have them display properly you can get the encoding for the OS’s current ANSI encoding by adding a parameter to the ReadAllText method like below:
string stringFromFile = File.ReadAllText(fileLocation, ASCIIEncoding.Default);
Below is code using a simple split method to .split the string on periods (.) Hope this helps.
private void button1_Click(object sender, EventArgs e) {
string fileLocation = #"C:\YourPath\YourFile.txt";
string stringFromFile = File.ReadAllText(fileLocation, ASCIIEncoding.Default);
string bigString = stringFromFile.Replace(Environment.NewLine, "");
string[] result = bigString.Split('.');
int count = 1;
foreach (string s in result) {
if (s != "") {
textBox1.Text += count + ". " + s.Trim() + Environment.NewLine;
Console.WriteLine(count + ". " + s.Trim());
count++;
}
else {
// period at the end of the string
}
}
}

Got wrong data while split file into arrays

I have file contains two lines and each line contains 200 fields and I would like to split it into arrays
using (StreamReader sr = File.OpenText(pathSensorsCalc))
{
string s = String.Empty;
while ((s = sr.ReadLine()) == null) { };
String line1 = sr.ReadToEnd();
String line2 = sr.ReadToEnd();
CalcValue[0] = new String[200];
CalcValue[1] = new String[200];
CalcValue[0] = line1.Split(' ');
CalcValue[1] = line2.Split(' ');
}
After the code above, CalcValue[1] is empty and CalcValue[0] contains data of the second line (instad of the first one). Any ideas?
When using
sr.ReadToEnd()
, you are reading to the end of your input stream. That means, after the first call of
String line1 = sr.ReadToEnd()
your stream is already at the last position. Replace your ReadToEnd() call with ReadLine() calls. That should work.
In the Windows OS, a new line is represented by \r\n. So you should not split the lines by spaces (" ").
Which means you should use another overload of the Split method - Split(char[], StringSplitOptions). The first argument is the characters you want to split by and the second is the options. Why do you need the options? Because if you split by 2 continuous characters you get an empty element.
So now it is easy to understand what this code does and why:
line1.Split (new[] {'\r', '\n'}, StringSplitOptions.RemoveEmptyEntries);

Importing .csv file in to listview

I'm trying to load a .csv file into a listview:
ofDialog.Filter = #"CSV Files|*.csv";
ofDialog.Title = #"Select your backlink file...";
ofDialog.FileName = "backlinks.csv";
// is cancel pressed?
if (ofDialog.ShowDialog() == DialogResult.Cancel)
return;
try
{
string filename = ofDialog.FileName;
var lines = File.ReadAllLines(filename);
foreach (string line in lines)
{
var parts = line.Split(' ');
ListViewItem lvi = new ListViewItem(parts[0]);
lvi.SubItems.Add(parts[1]);
listViewMain.Items.Add(lvi);
}
// update count
Helpers.returnMessage(File.ReadAllLines(ofDialog.FileName).Count() + " rows imported.");
}
catch (Exception ex)
{
Helpers.returnMessage(ex.Message);
}
The csv contents looks like:
URL Rating Domain Rating IP From Referring Page URL Referring Page Title Internal Links Count External Links Count Link URL TextPre Link Anchor TextPost Size Type NoFollow Site-wide Image Encoding Alt First Seen Previous Visited Last Check Original
24 89 91.198.174.192 http://en.wikipedia.org/wiki/Humbug_(sweet) "Humbug (sweet) - Wikipedia, the free encyclopedia" 118 16 http://www.bestbritishsweets.co.uk/user/products/large/everton.jpg http://www.bestbritishsweets.co.uk/user/products/large/everton.jpg 12163 href True False False utf8 2013-09-08T15:14:50Z 2015-03-11T01:48:40Z 2015-03-11T01:48:40Z True
There is no delimeter "," like in regular .csv files, and has different spaces between some fields, i'm stuck on the best way to split each section and add to the listview, i have a mental block lol
any help would be appreciated :)
cheers guys
Graham
For opening the CSV file, I would first check it is not a tab separated file, where you can use \t as the delimiter to read the file in a similar method as you are.
Failing this you could use a (very long and complicated) regex string to match the different "columns" as different parts. The regex string would look something like:
\s+([0-9]*)\s+([0-9]*)\s+([0-9]*.[0-9]*.[0-9]*.[0-9]*)\s+([a-zA-Z:\/._\(\)]*)\s+(\"[a-zA-Z0-9 \-\(\),]*\")\s+([0-9]*)\s+([0-9]*)\s+([a-zA-Z:\/._\(\)]*)\s+([a-zA-Z:\/._\(\)]*)\s+([0-9]*)\s+([a-zA-Z]*)\s+(True|False)\s+(True|False)\s+(True|False)\s+([a-z0-9]*)\s+([0-9\-T:Z]*)\s+([0-9\-T:Z]*)\s+([0-9\-T:Z]*)\s+(True|False)
This would return each column as a different group, which you can access as detailed below:
var regex = new Regex(regexString);
foreach(var line in lines)
{
var match = regex.Match(line);
var urlRating = match.Groups[0].Value;
var domainRating = match.Groups[1].Value;
var ip = match.Groups[2].Value;
// ...
}
You can see more about the regex string I have created (and possibly simplify it/extend it for the additional lines) here: https://regex101.com/r/oN4tW3/1
For more on C# regex look here: https://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex(v=vs.110).aspx
Edit: I would avoid the regex method if it is tab seperated as it is more complex and fragile

split a string from a text file into another list

Hi i know the Title might sound a little confusing but im reading in a text file with many lines of data
Example
12345 Test
34567 Test2
i read in the text 1 line at a time and add to a list
using (StreamReader reader = new StreamReader("Test.txt"))
{
string line;
while ((line = reader.ReadLine()) != null)
{
list.Add(line);
}
}
how do i then separate the 1234 from the test so i can pull only the first column of data if i need like list(1).pars[1] would be 12345 and list(2).pars[2] would be test2
i know this sounds foggy but i hope someone out there understands
Maybe something like this:
string test="12345 Test";
var ls= test.Split(' ');
This will get you a array of string. You can get them with ls[0] and ls[1].
If you just what the 12345 then ls[0] is the one to choose.
If you're ok with having a list of string[]'s you can simply do this:
var list = new List<string[]>();
using (StreamReader reader = new StreamReader("Test.txt"))
{
string line;
while ((line = reader.ReadLine()) != null)
{
list.Add(line.Split(' '));
}
}
string firstWord = list[0][0]; //12345
string secondWord = list[0][1]; //Test
When you have a string of text you can use the Split() method to split it in many parts. If you're sure every word (separated by one or more spaces) is a column you can simply write:
string[] columns = line.Split(' ');
There are several overloads of that function, you can specify if blank fields are skipped (you may have, for example columns[1] empty in a line composed by 2 words but separated by two spaces). If you're sure about the number of columns you can fix that limit too (so if any text after the last column will be treated as a single field).
In your case (add to the list only the first column) you may write:
if (String.IsNullOrWhiteSpace(line))
continue;
string[] columns = line.TrimLeft().Split(new char[] { ' ' }, 2);
list.Add(columns[0]);
First check is to skip empty or lines composed just of spaces. The TrimLeft() is to remove spaces from beginning of the line (if any). The first column can't be empty (because the TrimLeft() so yo do not even need to use StringSplitOptions.RemoveEmptyEntries with an additional if (columns.Length > 1). Finally, if the file is small enough you can read it in memory with a single call to File.ReadAllLines() and simplify everything with a little of LINQ:
list.Add(
File.ReadAllLines("test.txt")
.Where(x => !String.IsNullOrWhiteSpace(x))
.Select(x => x.TrimLeft().Split(new char[] { ' ' }, 2)[0]));
Note that with the first parameter you can specify more than one valid separator.
When you have multiple spaces
Regex r = new Regex(" +");
string [] splitString = r.Split(stringWithMultipleSpaces);
var splitted = System.IO.File.ReadAllLines("Test.txt")
.Select(line => line.Split(' ')).ToArray();
var list1 = splitted.Select(split_line => split_line[0]).ToArray();
var list2 = splitted.Select(split_line => split_line[1]).ToArray();

Get parameters out of text file

I have a C# asp.net page that has to get username/password info from a text file.
Could someone please tell me how.
The text file looks as follows: (it is actually a lot larger, I just got a few lines)
DATASOURCEFILE=D:\folder\folder
var1= etc
var2= more
var3 = misc
var4 = stuff
USERID = user1
PASSWORD = pwd1
all I need is the UserID and password out of that file.
Thank you for your help,
Steve
This would work:
var dic = File.ReadAllLines("test.txt")
.Select(l => l.Split(new[] { '=' }))
.ToDictionary( s => s[0].Trim(), s => s[1].Trim());
dic is a dictionary, so you easily extract your values, i.e.:
string myUser = dic["USERID"];
string myPassword = dic["PASSWORD"];
Open the file, split on the newline, split again on the = for each item and then add it to a dictionary.
string contents = String.Empty;
using (FileStream fs = File.Open("path", FileMode.OpenRead))
using (StreamReader reader = new StreamReader(fs))
{
contents = reader.ReadToEnd();
}
if (contents.Length > 0)
{
string[] lines = contents.Split(new char[] { '\n' });
Dictionary<string, string> mysettings = new Dictionary<string, string>();
foreach (string line in lines)
{
string[] keyAndValue = line.Split(new char[] { '=' });
mysettings.Add(keyAndValue[0].Trim(), keyAndValue[1].Trim());
}
string test = mysettings["USERID"]; // example of getting userid
}
You can use Regular expressions to extract each variable. You can read one line at a time, or the entire file into one string. If the latter, you just look for a newline in the expression.
Regards,
Morten
Dictionary is not needed.
Old-fashioned parsing can do more, with less executable code, the same amount of compiled data, and less processing:
public string MyPath1;
public string MyPath2;
...
public void ReadConfig(string sConfigFile)
{
MyPath1 = MyPath2 = ""; // Clear the external values (in case the file does not set every parameter).
using (StreamReader sr = new StreamReader(sConfigFile)) // Open the file for reading (and auto-close).
{
while (!sr.EndOfStream)
{
string sLine = sr.ReadLine().Trim(); // Read the next line. Trim leading and trailing whitespace.
// Treat lines with NO "=" as comments (ignore; no syntax checking).
// Treat lines with "=" as the first character as comments too.
// Treat lines with "=" as the 2nd character or after as parameter lines.
// Side-benefit: Values containing "=" are processed correctly.
int i = sLine.IndexOf("="); // Find the first "=" in the line.
if (i <= 0) // IF the first "=" in the line is the first character (or not present),
continue; // the line is not a parameter line. Ignore it. (Iterate the while.)
string sParameter = sLine.Remove(i).TrimEnd(); // All before the "=" is the parameter name. Trim whitespace.
string sValue = sLine.Substring(i + 1).TrimStart(); // All after the "=" is the value. Trim whitespace.
// Extra characters before a parameter name are usually intended to comment it out. Here, we keep them (with or without whitespace between). That makes an unrecognized parameter name, which is ignored (acts as a comment, as intended).
// Extra characters after a value are usually intended as comments. Here, we trim them only if whitespace separates. (Parsing contiguous comments is too complex: need delimiter(s) and then a way to escape delimiters (when needed) within values.) Side-drawback: Values cannot contain " ".
i = sValue.IndexOfAny(new char[] {' ', '\t'}); // Find the first " " or tab in the value.
if (i > 1) // IF the first " " or tab is the second character or after,
sValue = sValue.Remove(i); // All before the " " or tab is the parameter. (Discard the rest.)
// IF a desired parameter is specified, collect it:
// (Could detect here if any parameter is set more than once.)
if (sParameter == "MyPathOne")
MyPath1 = sValue;
else if (sParameter == "MyPathTwo")
MyPath2 = sValue;
// (Could detect here if an invalid parameter name is specified.)
// (Could exit the loop here if every parameter has been set.)
} // end while
// (Could detect here if the config file set neither parameter or only one parameter.)
} // end using
}

Categories