I am new to WatiN but have already built a few applications with it, and it's quite good. The problem is that my program stops responding while it performs web actions.
Here is my WatiN code:
using (var ie = new IE())
{
    StreamWriter wr = new StreamWriter(textBox1.Text + ".txt");
    ie.GoTo("http://twitter.com/" + textBox1.Text + "/followers");
    string dm1 = ie.Body.Parent.OuterHtml;
    Match match1 = Regex.Match(dm1, "(?<=<strong>).*?(?=</strong> Followers)");
    string ck = match1.ToString();
    ck = ck.Replace(",", "");
    long check = Int64.Parse(ck);
    long n = 0;
    string pattern = "(?<=data-screen-name=\").*(?=data-name)";
    while (n <= check)
    {
        WatiN.Core.Settings.WaitUntilExistsTimeOut = 1;
        var focusme = ie.Div(Find.ByClass("stream-loading"));
        var element = focusme.NativeElement as IEElement;
        element.AsHtmlElement.scrollIntoView();
        string dm = ie.Body.Parent.OuterHtml;
        MatchCollection matches1 = Regex.Matches(dm, pattern);
        n = matches1.Count;
        label4.Text = n.ToString();
    }
    string dom = ie.Body.Parent.OuterHtml;
    MatchCollection matches = Regex.Matches(dom, pattern);
    foreach (Match match in matches)
    {
        string usr0 = match.ToString();
        int i = usr0.IndexOf("\"");
        string usr = usr0.Substring(0, i - 1);
        wr.WriteLine(usr);
    }
    label4.Text = label4.Text + " done";
    wr.Close();
}
This code fetches a Twitter user's followers and writes them to a file. It's just a random example, but while it runs, my program stops responding. I guess I have to run this action on a separate thread or process, but I don't know exactly how to proceed.
EDIT: I am calling this from button1_Click, so basically from the Form1 class.
There are tons of materials on issues like this - it's not a problem with WatiN, but with the design. You should move the WatiN work onto a BackgroundWorker (or another background thread) so the UI won't hang while the WatiN code executes.
WinForm Application UI Hangs during Long-Running Operation
Windows Forms Application - Slow/Unresponsive UI
And some guides:
http://www.codeproject.com/Articles/58292/Basic-Backgroundworker
http://www.dotnetperls.com/backgroundworker
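As a rough, untested sketch of the idea (the thread body is a placeholder for the WatiN code from the question): note that WatiN's IE automation requires an STA thread, so a dedicated Thread with an explicit apartment state is often simpler than a BackgroundWorker here.

```csharp
private void button1_Click(object sender, EventArgs e)
{
    var thread = new Thread(() =>
    {
        // All the WatiN code from the question goes here -- it now runs
        // off the UI thread, so the form keeps responding.
        // Update controls only through Invoke, e.g.:
        // label4.Invoke((Action)(() => label4.Text = n.ToString()));
    });
    thread.SetApartmentState(ApartmentState.STA); // WatiN's IE automation needs STA
    thread.IsBackground = true;                   // don't keep the app alive on exit
    thread.Start();
}
```

With a BackgroundWorker the structure is the same - the WatiN code goes in DoWork and control updates in ProgressChanged - but you may still need to force the STA apartment for WatiN.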
Related
I am doing research on how to split a video into four fragments. I have seen a lot of solutions and libraries. I was looking at this library:
https://github.com/AydinAdn/MediaToolkit
And this is the code for splitting the video:
var inputFile = new MediaFile { Filename = @"C:\Path\To_Video.flv" };
var outputFile = new MediaFile { Filename = @"C:\Path\To_Save_ExtractedVideo.flv" };

using (var engine = new Engine())
{
    engine.GetMetadata(inputFile);

    var options = new ConversionOptions();

    // This example will create a 25 second video, starting from the
    // 30th second of the original video.
    //// First parameter requests the starting frame to cut the media from.
    //// Second parameter requests how long to cut the video.
    options.CutMedia(TimeSpan.FromSeconds(30), TimeSpan.FromSeconds(25));

    engine.Convert(inputFile, outputFile, options);
}
This code extracts just one fragment. Is there a way to split the video into four fragments?
Kind regards
PS: The solution must be in C#, and I have already seen the DirectShow solution.
It works well for me, but I still have to fix the algorithm, because the final fragment is missing. The code I have at the moment is this:
static void Main(string[] args)
{
    using (var engine = new Engine())
    {
        string file = @"C:\Users\wilso\Downloads\IZA - Meu Talismã.mp4";
        var inputFile = new MediaFile { Filename = file };
        engine.GetMetadata(inputFile);

        var outputName = @"C:\Users\wilso\Downloads\output";
        var outputExtension = ".mp4";

        double Duration = inputFile.Metadata.Duration.TotalSeconds;
        double currentPosition = 0;
        int contador = 0;
        while (currentPosition < Duration)
        {
            currentPosition = contador * 30;
            contador++;
            var options = new ConversionOptions();
            var outputFile = new MediaFile(outputName + contador.ToString("00") + outputExtension);
            options.CutMedia(TimeSpan.FromSeconds(currentPosition), TimeSpan.FromSeconds(30));
            engine.Convert(inputFile, outputFile, options);
        }
    }
}
I haven't used this library before, but this is how I would go about it:
var inputFile = new MediaFile { Filename = @"C:\Path\To_Video.flv" };
var outputName = @"C:\Path\To_Save_ExtractedVideo";
var outputExtension = ".flv";

using (var engine = new Engine())
{
    engine.GetMetadata(inputFile);

    // Length of each part -- the metadata gives us the total playtime.
    double partLength = inputFile.Metadata.Duration.TotalSeconds / 4;

    for (int i = 0; i < 4; i++)
    {
        var options = new ConversionOptions();
        // First parameter is the position to start cutting the media from;
        // second parameter is how long the cut should be.
        options.CutMedia(TimeSpan.FromSeconds(i * partLength), TimeSpan.FromSeconds(partLength));

        var outputFile = new MediaFile { Filename = $"{outputName}_{i}{outputExtension}" };
        engine.Convert(inputFile, outputFile, options);
    }
}
I've been coding in C# for a while and I'm trying to create a couple of small tools for myself and my friends, but I've run into a problem which stops me from continuing.
The problem is this: I want to use HtmlAgilityPack to read a changing value and use it for a couple of different actions. But the value gets stuck at the same reading until I restart the program.
So here is the code I'm using:
public static void Main(string[] args)
{
    Console.WriteLine("Running the program!");
    Console.WriteLine("Reading the value!");
    int i = 0;
    string url = "Website";
    while (i < 300)
    {
        i++;
        HtmlWeb web = new HtmlWeb();
        HtmlDocument LoadWebsite = web.Load(url);
        HtmlNode rateNode = LoadWebsite.DocumentNode.SelectSingleNode("//div[@class='the-value']");
        string rate = rateNode.InnerText;
        Console.WriteLine(i + ". " + rate);
        Thread.Sleep(1000);
    }
    Console.WriteLine("Done");
    Console.ReadLine();
}
So it first loads the website, then gets the value from the div, and then writes the value so I can check it. But it just keeps writing the same value.
My question is: what do I have to change to get the newest value? The value changes every few seconds and I need the most recent one from my website, since my tool depends on it.
Declare your HtmlWeb web = new HtmlAgilityPack.HtmlWeb(); outside the loop; there is no need to create one on every iteration.
You could also be running into caching on the website that you want to crawl.
Set web.UsingCache = false;.
If that doesn't work, append a random string to the URL so that every call is different.
Code:
HtmlWeb htmlWeb = new HtmlAgilityPack.HtmlWeb();
htmlWeb.UsingCache = false;
int i = 0;
while (i < 300)
{
    var uri = new Uri($"yoururl?z={Guid.NewGuid()}");
    i++;
    HtmlAgilityPack.HtmlDocument LoadWebsite = htmlWeb.Load(uri.AbsoluteUri);
    HtmlNode rateNode = LoadWebsite.DocumentNode.SelectSingleNode("//div[@class='the-value']");
    string rate = rateNode.InnerText;
    Console.WriteLine(i + ". " + rate);
    Thread.Sleep(1000);
}
I am creating an application that fetches information about a website. I have been trying several approaches to getting the information out of the HTML tags. The website is who.is and I am trying to get information about Google (as a test!). The page source can be seen at view-source:https://who.is/whois/google.com/ (if using the Chrome browser).
Now the problem is that I am trying to get the name of the creator of the website (Mark or something), but I am not receiving the correct result. My code:
//GET name
string getName = source;
string nameBegin = "<div class=\"col-md-4 queryResponseBodyKey\">Name</div><div class=\"col-md-8 queryResponseBodyValue\">";
string nameEnd = "</div>";
int nameStart = getName.IndexOf(nameBegin) + nameBegin.Length;
int nameIntEnd = getName.IndexOf(nameEnd, nameStart);
string creatorName = getName.Substring(nameStart, nameIntEnd - nameStart);
lb_name.Text = creatorName;
(source contains the HTML of the page)
This doesn't put out the correct answer, though... I think it has something to do with the backslashes I use to escape the multiple quotation marks.
What am I doing wrong? :(
Instead of trying to parse the HTML result manually, use a real HTML parser like HtmlAgilityPack:
using (var client = new HttpClient())
{
    var html = await client.GetStringAsync("https://who.is/whois/google.com/");
    var doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(html);
    var nodes = doc.DocumentNode.SelectNodes("//*[@class='col-md-4 queryResponseBodyKey']");
    var results = nodes.ToDictionary(n => n.InnerText, n => n.NextSibling.NextSibling.InnerText);

    // print
    foreach (var kv in results)
    {
        Console.WriteLine(kv.Key + " => " + kv.Value);
    }
}
string getName = "<div class=\"col-md-4 queryResponseBodyKey\">Name</div><div class=\"col-md-8 queryResponseBodyValue\">";
string nameBegin = "<div class=\"col-md-4 queryResponseBodyKey\">";
string nameEnd = "</div>";
int nameStart = getName.IndexOf(nameBegin) + nameBegin.Length;
int nameIntEnd = getName.IndexOf(nameEnd, nameStart);
string creatorName = getName.Substring(nameStart, nameIntEnd - nameStart);
//lb_name.Text = creatorName;
Console.WriteLine(creatorName);
Console.ReadLine();
Is this what you are looking for - getting the Name out of that div?
I am trying to build a small application where, when I enter a list of around 100,000 to 200,000 URLs, it goes and downloads the HTML and saves it in a relative folder.
I have two solutions, but each has some problems, and I'm trying to figure out the best approach.
First Solution: Synchronous Method
Below is the code I am using
currentline = 0;
var lines = txtUrls.Lines.Where(line => !String.IsNullOrWhiteSpace(line)).Count();
string urltext = txtUrls.Text;
List<string> list = new List<string>(
    txtUrls.Text.Split(new string[] { "\r\n" },
    StringSplitOptions.RemoveEmptyEntries));

lblStatus.Text = "Working";
btnStart.Enabled = false;

foreach (string url in list)
{
    using (WebClient client = new WebClient())
    {
        client.DownloadFile(url, @".\pages\page" + currentline + ".html");
        currentline++;
    }
}

lblStatus.Text = "Finished";
btnStart.Enabled = true;
The code works fine; however, it's slow, and it also randomly stops after about 5,000 URLs while the process says it completed. (Please note I am running this code in a BackgroundWorker, but to keep things simple I am showing only the relevant code.)
Second Solution: Asynchronous Method
int currentline = 0;
string urltext = txtUrls.Text;
List<string> list = new List<string>(
    txtUrls.Text.Split(new string[] { "\r\n" },
    StringSplitOptions.RemoveEmptyEntries));

foreach (var url in list)
{
    using (WebClient webClient = new WebClient())
    {
        webClient.DownloadFileCompleted += new AsyncCompletedEventHandler(Completed);
        webClient.DownloadProgressChanged += new DownloadProgressChangedEventHandler(ProgressChanged);
        webClient.DownloadFileAsync(new Uri(url), @".\pages\page" + currentline + ".html");
    }
    currentline++;
    label1.Text = "No.of Lines Completed: " + currentline;
}
This code runs super fast, but most of the time I get downloaded files of 0 KB, and I am sure the network is fast since I am testing on an OVH dedicated server.
Can anyone point out what I am doing wrong, give tips on improving it, or suggest an entirely different solution to this problem?
Instead of using DownloadFile(), try the task-based API:
public async Task GetData()
{
    WebClient client = new WebClient();
    var data = await client.DownloadDataTaskAsync("http://xxxxxxxxxxxxxxxxxxxxx");
}
You will get the data as a byte[]. Then just call File.WriteAllBytes() to save it to disk.
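Building on that, here is an untested sketch of how the whole URL list could be downloaded with the task-based API while capping concurrency, so sockets aren't exhausted; the DownloadAllAsync name, the "pages" folder, and the cap of 10 are my own choices, not from the question:

```csharp
// Sketch: download many pages concurrently, but no more than 10 at a time.
private async Task DownloadAllAsync(List<string> urls)
{
    Directory.CreateDirectory("pages");
    var throttle = new SemaphoreSlim(10); // at most 10 downloads in flight

    var tasks = urls.Select(async (url, index) =>
    {
        await throttle.WaitAsync();
        try
        {
            using (var client = new WebClient())
            {
                byte[] data = await client.DownloadDataTaskAsync(url);
                File.WriteAllBytes(Path.Combine("pages", "page" + index + ".html"), data);
            }
        }
        finally
        {
            throttle.Release();
        }
    }).ToList();

    await Task.WhenAll(tasks);
}
```

Unlike the DownloadFileAsync version in the question, each WebClient here stays alive until its own download has finished, which avoids the 0 KB files caused by disposing the client before the transfer completes.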
My code goes like this:
public void button1_Click(object sender, EventArgs e)
{
    StreamReader tx = null;
    if (textBox1.Text != "")
    {
        tx = new StreamReader(textBox1.Text);
    }
    else
    {
        tx = new StreamReader("new.txt");
    }

    string line;
    while ((line = tx.ReadLine()) != null)
    {
        string url = (line);
        string sourceCode = Worker.getSourceCode(url);
        MatchCollection m1 = Regex.Matches(sourceCode, @"title may-blank "" href=""(.+?)""", RegexOptions.Singleline);
        MatchCollection m2 = Regex.Matches(sourceCode, @"(?<=tabindex=\""1\"" \>| tabindex=\""1\"" rel=""nofollow"" \>)(.+?) (?=<\/a>)", RegexOptions.Singleline);

        List<string> adresy = new List<string>();
        List<string> nazwy = new List<string>();

        int counter = 0;
        foreach (Match m in m1)
        {
            string adres = m.Groups[1].Value;
            adresy.Add(adres);
            counter++;
            label1.Text = counter.ToString();
        }

        int counter2 = 0;
        foreach (Match m in m2)
        {
            string nazwa = m.Groups[1].Value;
            nazwy.Add(nazwa);
            counter2++;
            label2.Text = counter2.ToString();
        }

        listBox1.DataSource = adresy;
        listBox2.DataSource = nazwy;
    }
}
I am using regex to scrape text from web pages. The thing is, I want to scrape a single URL if that URL is in textBox1, but if textBox1 is empty, I want to scrape all the URLs from the new.txt file.
So I have to implement an "if", but I don't really know how. It should go like this:
if textbox1 is not empty
then read the single URL from it
if it is empty, read from new.txt
then do stuff like scraping...
But as you can see in my code above, it doesn't work properly. I mean, it works, but only when I read from new.txt. When I add some text to textBox1.Text and try to scrape the URL, my app crashes. I assume it crashes because I shouldn't have used a StreamReader to read from the textbox. I don't know. Do you have any ideas?
If you want to write your code like this, then you can use a StringReader:
TextReader tx = null;
if (textBox1.Text != "")
{
    tx = new StringReader(textBox1.Text);
}
else
{
    tx = new StreamReader("new.txt");
}
Make sure to wrap your code in a try/finally block and call tx.Dispose() in the finally.
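A sketch of the same branch wrapped in a using statement, so disposal happens even if the scraping throws; the loop body is a placeholder for the scraping code from the question:

```csharp
// Both sources end up behind the same TextReader interface,
// so the scraping loop doesn't care where the lines come from.
using (TextReader tx = textBox1.Text != ""
        ? (TextReader)new StringReader(textBox1.Text) // URL typed into the textbox
        : new StreamReader("new.txt"))                // fall back to the file
{
    string line;
    while ((line = tx.ReadLine()) != null)
    {
        // scrape `line` here, exactly as in the original loop
    }
}
```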