I've been coding in C# for a while and I'm trying to create a couple of small tools for myself and my friends, but I've run into a problem that stops me from continuing.
The problem is this: I want to use HtmlAgilityPack to read a changing value and use it for a couple of different actions, but the value stays stuck at the same value until I restart the program.
Here is the code I'm using:
public static void Main(string[] args)
{
Console.WriteLine("Running the program!");
Console.WriteLine("Reading the value!");
int i = 0;
string url = "Website";
while (i < 300)
{
i++;
HtmlWeb web = new HtmlWeb();
HtmlDocument LoadWebsite = web.Load(url);
HtmlNode rateNode = LoadWebsite.DocumentNode.SelectSingleNode("//div[@class='the-value']");
string rate = rateNode.InnerText;
Console.WriteLine(i + ". " + rate);
Thread.Sleep(1000);
}
Console.WriteLine("Done");
Console.ReadLine();
}
So it first loads the website, then gets the value from the div, and then writes the value so I can check it. But it just keeps writing the same value.
My question is: what do I have to change to get the newest value? The value changes every few seconds and I need the most recent value from my website, because it is needed to keep the system running.
Declare your HtmlWeb web = new HtmlAgilityPack.HtmlWeb(); outside the loop; it isn't necessary to create a new instance on every iteration.
You could be running into caching issues with the website that you want to crawl.
Set web.UsingCache = false;.
If that doesn't work, append a random string to the URL so that every call is different.
Code:
HtmlWeb htmlWeb = new HtmlAgilityPack.HtmlWeb();
htmlWeb.UsingCache = false;
int i = 0;
while (i < 300)
{
var uri = new Uri($"yoururl?z={Guid.NewGuid()}");
i++;
HtmlAgilityPack.HtmlDocument LoadWebsite = htmlWeb.Load(uri.AbsoluteUri);
HtmlNode rateNode = LoadWebsite.DocumentNode.SelectSingleNode("//div[@class='the-value']");
string rate = rateNode.InnerText;
Console.WriteLine(i + ". " + rate);
Thread.Sleep(1000);
}
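If the real URL already carries a query string, gluing "?z=..." onto it would produce a malformed address. A minimal sketch of a helper that handles both cases is below; CacheBuster and Append are hypothetical names, and the parameter name z is simply carried over from the answer's example.

```csharp
using System;

public static class CacheBuster
{
    // Hypothetical helper: appends a unique "z" parameter to the URL,
    // keeping any query string that is already present.
    public static string Append(string url)
    {
        var builder = new UriBuilder(url);

        // UriBuilder.Query returns the query with a leading '?', so strip it first.
        string existing = builder.Query.TrimStart('?');
        string extra = "z=" + Guid.NewGuid().ToString("N");

        builder.Query = existing.Length == 0 ? extra : existing + "&" + extra;
        return builder.Uri.AbsoluteUri;
    }
}
```

Inside the loop, htmlWeb.Load(CacheBuster.Append(url)) would then request a different address on every iteration.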
I am creating an application that fetches information about a website. I have been trying several approaches to get the information from the HTML tags. The website is who.is and I am trying to get information about Google (as a test!). The source can be found at view-source:https://who.is/whois/google.com/ (if using the Chrome browser).
Now the problem is that I am trying to get the name of the creator of the website (Mark or something) but I am not receiving the correct result. My code:
//GET name
string getName = source;
string nameBegin = "<div class=\"col-md-4 queryResponseBodyKey\">Name</div><div class=\"col-md-8 queryResponseBodyValue\">";
string nameEnd = "</div>";
int nameStart = getName.IndexOf(nameBegin) + nameBegin.Length;
int nameIntEnd = getName.IndexOf(nameEnd, nameStart);
string creatorName = getName.Substring(nameStart, nameIntEnd - nameStart);
lb_name.Text = creatorName;
(source contains html of page)
This doesn't output the correct result, though... I think it has something to do with the fact that I use backslashes (\) to escape the multiple quotation marks...
What am I doing wrong? :(
Instead of trying to parse the HTML result manually, use a real HTML parser like HtmlAgilityPack:
using (var client = new HttpClient())
{
var html = await client.GetStringAsync("https://who.is/whois/google.com/");
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
var nodes = doc.DocumentNode.SelectNodes("//*[@class='col-md-4 queryResponseBodyKey']");
var results = nodes.ToDictionary(n=>n.InnerText, n=>n.NextSibling.NextSibling.InnerText);
//print
foreach(var kv in results)
{
Console.WriteLine(kv.Key + " => " + kv.Value);
}
}
string getName = "<div class=\"col-md-4 queryResponseBodyKey\">Name</div><div class=\"col-md-8 queryResponseBodyValue\">";
string nameBegin = "<div class=\"col-md-4 queryResponseBodyKey\">";
string nameEnd = "</div>";
int nameStart = getName.IndexOf(nameBegin) + nameBegin.Length;
int nameIntEnd = getName.IndexOf(nameEnd, nameStart);
string creatorName = getName.Substring(nameStart, nameIntEnd - nameStart);
//lb_name.Text = creatorName;
Console.WriteLine(creatorName);
Console.ReadLine();
Is this what you are looking for, to get the name from that div?
So, I have a simple screen that registers the saves made in various text files (this is what I call an "escenario"). I can have several escenarios for the important changes made to the files, and later on, if I want to return to a past escenario, I just load the files I saved. That works great, but my problem comes when I try to start the .exe of the Python code that makes the copies of the files.
protected void AddItem(object sender, EventArgs e)
{
int num = 0;
int totalItems = escenario_list.Items.Count;
if (totalItems > 0){
string esc = "Escenario ";
num = totalItems + 1;
var x = esc + num.ToString();
var date = DateTime.Now.ToString();
var str = x + " - " + date;
escenario_list.Items.Add(new ListItem(str, num.ToString()));
Process.Start(@"C:\inetpub\wwwroot\AdministracionEscenarios\bin\MoveFiles\MoveFiles.exe");
}
else if(totalItems == 0){
string esc = "Escenario ";
num = 1;
var x = esc + num.ToString();
var date = DateTime.Now.ToString();
var str = x + " - " + date;
escenario_list.Items.Add(new ListItem(str, num.ToString()));
Process.Start(@"C:\inetpub\wwwroot\AdministracionEscenarios\bin\MoveFiles\MoveFiles.exe");
}
}
So Process.Start just won't start that .exe file. What can be the reason? I also have using System.Diagnostics; at the beginning, so that's not the issue here. Thanks for the help!
The issue may be how you're referencing the path on the server. Use Path.Combine(HostingEnvironment.MapPath("~/bin/MoveFiles/"), "MoveFiles.exe"); instead. HostingEnvironment.MapPath resolves the path relative to the application's directory on the server, whereas a hard-coded absolute path only works if the application is deployed at exactly that location.
Also, download and launch Process Explorer, then go to Find and Find Handle. Run your program and search for the MoveFiles.exe file. This will confirm whether your application has actually executed it on the server.
Basically I'm trying to save a new password and avatar for my Twitter-type website.
Any help would be appreciated.
My code is:
string newPasswordString = Server.MapPath("~") + "/App_Data/tuitterUsers.txt";
string[] newPasswordArray = File.ReadAllLines(newPasswordString);
string newString = Server.MapPath("~") + "/App_Data/tuitterUsers.txt";
newString = File.ReadAllText(newString);
string[] newArray = newString.Split(' ');
for (int i = 0; i < 3; i++)
{
for (int j = 0; j < 3; i++)
{
newArray[1] = newPasswordTextBox.Text;
newArray[2] = avatarDropDownList.SelectedValue;
newPasswordArray.Replace(" " + Session["Username"].ToString() + " " + Session["UserPassword"].ToString() + " " + Session["UserAvatarID"].ToString() + " ", " " + Session["Username"].ToString() + " " + newPasswordArray[1] + " " + newPasswordArray[2]);
}
}
string newPasswordString = string.Join(Environment.NewLine, newPasswordArray);
File.WriteAllText(Server.MapPath("~") + "/App_Data/tuitterUsers.txt", newPasswordString);
If I understand your problem correctly you need to move the
File.WriteAllText(Server.MapPath("~") + "/App_Data/tuitterUsers.txt", newPasswordArray);
outside the loop; otherwise you rewrite the file on every iteration. That is not enough, though: you also need to rebuild the whole text file
string fileToWrite = string.Join(Environment.NewLine, newPasswordArray);
File.WriteAllText(Server.MapPath("~") + "/App_Data/tuitterUsers.txt", fileToWrite);
EDIT: After the code update and the comment below
The looping is completely wrong, as is the rebuilding of the array:
string userDataFile = Server.MapPath("~") + "/App_Data/tuitterUsers.txt";
string[] userDataArray = File.ReadAllLines(userDataFile);
for(int x = 0; x < userDataArray.Length; x++)
{
string[] info = userDataArray[x].Split(' ');
if(Session["Username"].ToString() == info[0])
{
userDataArray[x] = string.Join(" ", Session["Username"].ToString(),
newPasswordTextBox.Text,
avatarDropDownList.SelectedValue.ToString());
break;
}
}
string fileToWrite = string.Join(Environment.NewLine, userDataArray);
File.WriteAllText(Server.MapPath("~") + "/App_Data/tuitterUsers.txt", fileToWrite);
Keep in mind that this works only for a limited number of users.
If you are lucky and your site becomes the new Twitter, you cannot rely on a solution that reads the names of all your users into memory.
Firstly, what you're doing is A Bad Idea™. Given that a web server can have multiple threads in operation, you can't be certain that two threads aren't going to be writing different data at the same time. The more users you have the larger your user file will be, which means it takes longer to read and write the data, which makes it more likely that two threads will come into conflict.
This is why we use databases for things like this. Instead of operating on the whole file every time you want to read or write, you operate on a single record. There are plenty of other reasons to do it, too.
That said, if you insist on using a text file...
If you treat each line in the file as a record - a single user's details in this case - then it makes sense to build a class to handle the content of those records, and make that class able to read and write the line format.
Something like this:
class UserRecord
{
public string Name;
public string Password;
public string Avatar;
public UserRecord(string name, string password, string avatar)
{
Name = name;
Password = password;
Avatar = avatar;
}
// static factory method
public static UserRecord Parse(string source)
{
if (string.IsNullOrEmpty(source))
return null;
string[] parts = source.Split(',');
if (parts.Length < 3)
return null;
return new UserRecord(parts[0], parts[1], parts[2]);
}
// convert to string (overrides Object.ToString so formatting calls pick it up)
public override string ToString()
{
return string.Join(",", Name, Password, Avatar);
}
}
Adjust the Parse method to handle whatever format you're using for the data in the line, and change the ToString method to produce that format.
Once you have that working, use it to parse the contents of your file like this:
// Somewhere to put the data - a Dictionary is my first choice here
Dictionary<string, UserRecord> users = new Dictionary<string, UserRecord>();
// Don't forget to use 'using' where appropriate
using (TextReader userfile = File.OpenText(userDataFile))
{
string srcline;
while ((srcline = userfile.ReadLine()) != null)
{
UserRecord user = UserRecord.Parse(srcline);
if (user != null)
users[user.Name] = user;
}
}
Then you can access the user's data by username, manipulate it as required, and save it back out whenever you like.
Writing the data back out from a Dictionary of users is as simple as:
StringBuilder sb = new StringBuilder();
foreach (UserRecord user in users.Values)
{
sb.AppendFormat("{0}\n", user);
}
File.WriteAllText(userDataFile, sb.ToString());
Meanwhile, you have a users collection that you can save for future checks and manipulations.
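The load, modify and save cycle described above could be exercised end to end roughly as follows. This is only a sketch under assumptions: UserFile, Load, Update and Save are hypothetical names, each record is kept as a plain string[] rather than a UserRecord so the block stands on its own, and the line format is assumed to be comma-separated name, password, avatar.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original answer.
public static class UserFile
{
    // Parses "name,password,avatar" lines into a dictionary keyed by user name.
    // Lines with fewer than three fields are skipped.
    public static Dictionary<string, string[]> Load(IEnumerable<string> lines)
    {
        var users = new Dictionary<string, string[]>();
        foreach (string line in lines)
        {
            string[] parts = line.Split(',');
            if (parts.Length >= 3)
                users[parts[0]] = parts;
        }
        return users;
    }

    // Changes one user's password and avatar; returns false if the user is unknown.
    public static bool Update(Dictionary<string, string[]> users,
                              string name, string newPassword, string newAvatar)
    {
        string[] record;
        if (!users.TryGetValue(name, out record))
            return false;
        record[1] = newPassword;
        record[2] = newAvatar;
        return true;
    }

    // Turns the records back into lines ready for File.WriteAllLines.
    public static string[] Save(Dictionary<string, string[]> users)
    {
        return users.Values.Select(r => string.Join(",", r)).ToArray();
    }
}
```

A caller would pass File.ReadAllLines(userDataFile) to Load, call Update with the session's user name and the new values, and write the result of Save back with File.WriteAllLines.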
I still think you should use a database though. They're not hard to learn and they are far better for this sort of thing.
I am new to WatiN but have made a few applications with it, and it's quite good. The problem is that my program stops responding while performing some web actions.
Here is my source code of WatiN:
using (var ie = new IE())
{
StreamWriter wr = new StreamWriter(textBox1.Text+".txt");
ie.GoTo("http://twitter.com/"+textBox1.Text+"/followers");
string dm1 = ie.Body.Parent.OuterHtml;
Match match1 = Regex.Match(dm1, "(?<=<strong>).*?(?=</strong> Followers)");
string ck = match1.ToString();
ck = ck.Replace(",", "");
long check = Int64.Parse(ck);
long n= 0;
string pattern = "(?<=data-screen-name=\").*(?=data-name)";
while (n <= check)
{
WatiN.Core.Settings.WaitUntilExistsTimeOut = 1;
var focusme = ie.Div(Find.ByClass("stream-loading"));
var element = focusme.NativeElement as IEElement;
element.AsHtmlElement.scrollIntoView();
string dm = ie.Body.Parent.OuterHtml;
MatchCollection matches1 = Regex.Matches(dm, pattern);
n = matches1.Count;
label4.Text = n.ToString();
}
string dom = ie.Body.Parent.OuterHtml;
MatchCollection matches = Regex.Matches(dom, pattern);
foreach (Match match in matches)
{
string usr0 = match.ToString();
int i = usr0.IndexOf("\"");
string usr = usr0.Substring(0, i - 1);
wr.WriteLine(usr);
}
label4.Text = label4.Text + " done";
wr.Close();
}
This is the source for getting the Twitter followers into a file. It's just a random example, but while performing this action, my program is not responding. I guess I have to create a new process for this action, but I don't know exactly how to proceed.
EDIT: I am using this in button1_Click, so basically in the Form1 class.
There is plenty of material on this kind of issue. It's not a problem with WatiN but with the design: you should move the WatiN code into a BackgroundWorker so the UI won't hang while it executes.
WinForm Application UI Hangs during Long-Running Operation
Windows Forms Application - Slow/Unresponsive UI
And some guides:
http://www.codeproject.com/Articles/58292/Basic-Backgroundworker
http://www.dotnetperls.com/backgroundworker
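As a rough illustration of the BackgroundWorker pattern, the sketch below moves a slow loop off the UI thread and updates a label through progress events. DoLongWork is a hypothetical stand-in for the scraping loop, and the control names button1 and label4 are borrowed from the question. One caveat to verify for your setup: WatiN's IE class is generally reported to need an STA thread, which BackgroundWorker's pool threads are not, so the real WatiN calls may have to run on a dedicated thread with SetApartmentState(ApartmentState.STA) instead.

```csharp
using System;
using System.ComponentModel;
using System.Threading;
using System.Windows.Forms;

public class Form1 : Form
{
    private readonly Button button1 = new Button { Text = "Start", Dock = DockStyle.Top };
    private readonly Label label4 = new Label { Dock = DockStyle.Bottom };
    private readonly BackgroundWorker worker =
        new BackgroundWorker { WorkerReportsProgress = true };

    public Form1()
    {
        Controls.Add(button1);
        Controls.Add(label4);
        button1.Click += button1_Click;

        // DoWork runs on a thread-pool thread: no UI access in here.
        worker.DoWork += (s, e) =>
            e.Result = DoLongWork((int)e.Argument, worker.ReportProgress);

        // ProgressChanged and RunWorkerCompleted are raised on the UI thread.
        // ProgressPercentage is used here as a plain counter, not a percentage.
        worker.ProgressChanged += (s, e) =>
            label4.Text = e.ProgressPercentage.ToString();

        worker.RunWorkerCompleted += (s, e) =>
        {
            label4.Text = e.Error != null
                ? "failed: " + e.Error.Message
                : e.Result + " done";
            button1.Enabled = true;
        };
    }

    private void button1_Click(object sender, EventArgs e)
    {
        if (worker.IsBusy) return;   // ignore clicks while a run is in progress
        button1.Enabled = false;
        worker.RunWorkerAsync(20);   // 20 = number of simulated steps
    }

    // Hypothetical stand-in for the slow scraping loop from the question.
    public static int DoLongWork(int steps, Action<int> report)
    {
        for (int i = 1; i <= steps; i++)
        {
            Thread.Sleep(100);       // simulate slow work
            report(i);
        }
        return steps;
    }

    [STAThread]
    public static void Main()
    {
        Application.EnableVisualStyles();
        Application.Run(new Form1());
    }
}
```

The key design point is that only the DoWork handler runs in the background; everything that touches label4 or button1 stays in the two UI-thread events.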
I am currently testing the Google API. It seems promising, but I am stuck on a "simple" problem: I want to update an existing document with a local copy.
My idea was to download all Google documents to a folder using the doc download. That works. On the next run, I check the dates: if a remote document is newer, I grab it again; if the local document is newer, I upload it and replace the current online version.
I can't find a function to replace a document. There is an Upload(filename, doctitle), but it creates a new document. Does anybody know if this is possible and can point me in the right direction? Do I have to dissect the Atom feed (is the document content somewhere inside it)? The "download / change in Word / upload" workflow looked so nice :-)
Chris
And for anyone who is interested, the API is pretty simple and nice to use. Here is a short WPF example (without credentials, of course):
var settings = new RequestSettings("GoogleDocumentsSample", _credentials);
AllDocuments = new ObservableCollection<Document>();
settings.AutoPaging = true;
settings.PageSize = 10;
service = new DocumentsService("DocListUploader");
((GDataRequestFactory)service.RequestFactory).KeepAlive = false;
service.setUserCredentials(username, password);
//force the service to authenticate
var query = new DocumentsListQuery {NumberToRetrieve = 1};
service.Query(query);
var request = new DocumentsRequest(settings);
Feed<Document> feed = request.GetEverything();
// this takes care of paging the results in
foreach (Document entry in feed.Entries)
{
AllDocuments.Add(entry);
if (entry.Type == Document.DocumentType.Document)
{
var fI = new FileInfo(@"somepath" + entry.DocumentId + ".doc");
if (!fI.Exists || fI.LastWriteTime < entry.Updated)
{
Debug.WriteLine("Download doc " + entry.DocumentId);
var type = Document.DownloadType.doc;
Stream stream = request.Download(entry, type);
if (fI.Exists) fI.Delete();
Stream file = fI.OpenWrite();
int nBytes = 2048;
int count = 0;
Byte[] arr = new Byte[nBytes];
do
{
count = stream.Read(arr, 0, nBytes);
file.Write(arr, 0, count);
} while (count > 0);
file.Flush();
file.Close();
stream.Close();
fI.CreationTimeUtc = entry.Updated;
fI.LastWriteTimeUtc = entry.Updated;
}
else
{
if (entry.Updated == fI.LastWriteTime)
{
Debug.WriteLine("Document up to date " + entry.DocumentId);
}
else
{
Debug.WriteLine(String.Format("Local version newer {0} [LOCAL {1}] [REMOTE {2}]", entry.DocumentId, fI.LastWriteTimeUtc, entry.Updated));
service.UploadDocument(fI.FullName, entry.Title);
}
}
}
}
According to Docs API docs ;) you can replace the content of a document
http://code.google.com/apis/documents/docs/3.0/developers_guide_protocol.html#UpdatingContent