Linkchecker doesn't print any broken URLs - C#

I'm having problems creating a link checker. I'd like to have it online, mainly for learning.
I first wrote it as a console application, which worked fairly well (I got the broken URLs to show in the debug console). Now that I'm trying to move it to the web, I'm having trouble.
How do I go about getting the output into the page? I'm stumped at the moment.
public partial class Default2 : System.Web.UI.Page
{
protected void Page_Load(object sender, EventArgs e)
{
}
public bool UrlIsValid(string url)
{
try
{
HttpWebRequest request = HttpWebRequest.Create(url) as HttpWebRequest;
request.Timeout = 5000; //set the timeout to 5 seconds to keep the user from waiting too long for the page to load
request.Method = "HEAD"; //Get only the header information -- no need to download any content
HttpWebResponse response = request.GetResponse() as HttpWebResponse;
int statusCode = (int)response.StatusCode;
if (statusCode >= 100 && statusCode < 400) //Good requests
{
return true;
}
else if (statusCode >= 500 && statusCode <= 510) //Server Errors
{
string cl = (String.Format("The remote server has thrown an internal error. Url is not valid: {0}", url));
// Debug.WriteLine(cl, Convert.ToString(url));
return false;
}
}
catch (WebException ex)
{
if (ex.Status == WebExceptionStatus.ProtocolError) //400 errors
{
return false;
}
else
{
string cl = String.Format("Unhandled status [{0}] returned for url: {1}", ex.Status, url);
/// Debug.WriteLine(cl, Convert.ToString(ex));
}
}
catch (Exception ex)
{
object cl = String.Format("Could not test url {0}.", url);
Debug.WriteLine(cl, Convert.ToString(ex));
}
return false;
}
private void button1_Click(object sender, EventArgs e)
{
WebClient wc = new WebClient();
string checker = wc.DownloadString("http://administration.utbildningssidan.se/linkcheck.aspx");
while (checker.Contains("<a href="))
{
int checkstart = checker.IndexOf("<a href=") + 8;
int checkstop = checker.IndexOf(">", checkstart);
string validator = checker.Substring(checkstart, checkstop - checkstart);
// perform the check
if (!UrlIsValid(validator)) { Debug.WriteLine(validator); }
checker = checker.Substring(checkstop + 1);
}
}
}
I hope it's clear what I want to accomplish; I'm having a hard time making sense right now.

I think you want Response.Write() in place of your Debug.WriteLine() calls. Alternatively, you could create a TextArea object in your markup and append to it with myTextArea.Text += "Some text";
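As a rough illustration of that suggestion, the sketch below collects link targets and turns the broken ones into an HTML list that could be passed to Response.Write() or assigned to a control's Text. The class name LinkReport, both method names, and the regular expression are assumptions made for this example, not part of the original code, and a regex is only an approximation of real HTML parsing.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original post.
public static class LinkReport
{
    // Matches the quoted href value of an <a ...> tag (approximate, not a full HTML parser).
    private static readonly Regex HrefPattern = new Regex(
        "<a\\s[^>]*?href\\s*=\\s*[\"']([^\"']+)[\"']",
        RegexOptions.IgnoreCase);

    // Returns every href value found in the markup, without the surrounding quotes.
    public static List<string> ExtractLinks(string html)
    {
        var links = new List<string>();
        foreach (Match match in HrefPattern.Matches(html))
        {
            links.Add(match.Groups[1].Value);
        }
        return links;
    }

    // Builds an HTML list of the given URLs, encoding them so they are safe to emit.
    public static string BuildReport(IEnumerable<string> brokenUrls)
    {
        var html = new StringBuilder("<ul>");
        foreach (string url in brokenUrls)
        {
            html.Append("<li>").Append(WebUtility.HtmlEncode(url)).Append("</li>");
        }
        return html.Append("</ul>").ToString();
    }
}
```

In the page, the click handler could gather the URLs for which UrlIsValid returns false into a list and then call Response.Write(LinkReport.BuildReport(broken)).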

Related

How to make webclient download file again if failed?

I'm trying to download a list of image links (up to 40) to my server using foreach.
Sometimes a link exists, but for a reason I don't understand the code falls into the catch block and cancels the download of the remaining links. Maybe it needs to wait a little? When I debug the app, I see that the link that was skipped (the one that ended up in catch) is available, but sometimes it only opens after a few seconds in my browser, so the server I'm downloading from sometimes needs more time to respond.
string newPath = "~/data/" + model.PostID + "/" + name + "/";
//test1 is a list of links
foreach (var item1 in test1)
{
HttpWebRequest request = WebRequest.Create(item1) as HttpWebRequest; request.Method = "HEAD";
try
{
using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
{
var webClient = new WebClient();
string path = newPath + i + ".jpg";
webClient.DownloadFileAsync(new Uri(item1), Server.MapPath(path));
string newlinks = "https://example.com/data/" + chapter.PostID + "/" + name + "/" + i + ".jpg";
allimages = allimages + newlinks + ',';
response.Close();
i++;
}
}
catch
{
break;
}
}
My main goal is to fix this issue. From debugging I can see that:
The image links I'm trying to download exist
They sometimes need more time to respond
So how can I fix this? When a download is cancelled even though the link exists, what should I do?
You can use this example:
class WebClientUtility : WebClient
{
public int Timeout { get; set; }
public WebClientUtility() : this(60000) { }
public WebClientUtility(int timeout)
{
this.Timeout = timeout;
}
protected override WebRequest GetWebRequest(Uri address)
{
var request = base.GetWebRequest(address);
if (request != null)
{
request.Timeout = Timeout;
}
return request;
}
}
//
public class DownloadHelper : IDisposable
{
private WebClientUtility _webClient;
private string _downloadUrl;
private string _savePath;
private int _retryCount;
public DownloadHelper(string downloadUrl, string savePath)
{
_savePath = savePath;
_downloadUrl = downloadUrl;
_webClient = new WebClientUtility();
_webClient.DownloadFileCompleted += ClientOnDownloadFileCompleted;
}
public void StartDownload()
{
_webClient.DownloadFileAsync(new Uri(_downloadUrl), _savePath);
}
private void ClientOnDownloadFileCompleted(object sender, AsyncCompletedEventArgs e)
{
if (e.Error != null)
{
_retryCount++;
if (_retryCount < 3)
{
_webClient.DownloadFileAsync(new Uri(_downloadUrl), _savePath);
}
else
{
Console.WriteLine(e.Error.Message);
}
}
else
{
_retryCount = 0;
Console.WriteLine($"successfully download: # {_downloadUrl} to # {_savePath}");
}
}
public void Dispose()
{
_webClient.Dispose();
}
}
//
class Program
{
private static void Main(string[] args)
{
for (int i = 0; i < 100; i++)
{
var downloadUrl = $@"https://example.com/mag-{i}.pdf";
var savePath = $@"D:\DownloadFile\FileName{i}.pdf";
DownloadHelper downloadHelper = new DownloadHelper(downloadUrl, savePath);
downloadHelper.StartDownload();
}
Console.ReadLine();
}
}
To fix the timeout problem, you can create a derived class and set the Timeout property of the base WebRequest class.
For retries, you can use the DownloadFileCompleted event of the WebClient and implement your retry pattern there.
You're calling DownloadFileAsync, which starts the download and returns immediately. It returns void, so it cannot be awaited, and because nothing waits for it to finish, the behaviour is unpredictable.
Make your method async and use the task-based DownloadFileTaskAsync instead:
await webClient.DownloadFileTaskAsync(new Uri(item1), Server.MapPath(path));
This solved my case:
await Task.Run(() =>
{
webClient.DownloadFileAsync(new Uri(item1), Server.MapPath(path));
});
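For the "wait a little and try again" behaviour the question asks about, an awaitable retry loop is one possible shape. The sketch below is an illustration only: the Retry class, the RunAsync method, and its parameters are hypothetical names introduced here, and the choice of three attempts or any particular delay is left to the caller.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not from the original answers.
public static class Retry
{
    // Runs the action until it succeeds or maxAttempts is reached.
    // Returns the number of attempts used; the last failure is rethrown.
    public static async Task<int> RunAsync(Func<Task> action, int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                await action();
                return attempt;
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Give a slow server some time before the next attempt.
                await Task.Delay(delay);
            }
        }
    }
}
```

Inside the foreach loop this could be called as await Retry.RunAsync(() => new WebClient().DownloadFileTaskAsync(new Uri(item1), Server.MapPath(path)), 3, TimeSpan.FromSeconds(2)); so that a failed link is retried instead of ending the loop.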

Handling post requests in HttpListener

private void ListenerCallback(IAsyncResult ar)
{
_busy.WaitOne();
try
{
HttpListenerContext context;
try
{ context = _listener.EndGetContext(ar); }
catch (HttpListenerException)
{ return; }
if (_stop.WaitOne(0, false))
return;
var sr = new StreamReader(context.Request.InputStream);
string x = sr.ReadToEnd();
Console.WriteLine("{0} {1}", context.Request.HttpMethod, x);
//context.Response.SendChunked = true;
using (TextWriter tw = new StreamWriter(context.Response.OutputStream))
{
//for (int i = 0; i < 5; i++)
{
//tw.WriteLine("<p>{0} # {1}</p>", i, DateTime.Now);
tw.WriteLine("<html><head></head><body>");
tw.WriteLine("Server Response");
tw.WriteLine("</body></html>");
tw.Flush(); //Catch http exception if client exists halfway through
//Thread.Sleep(1000);
}
}
}
finally
{
if (_maxThreads == 1 + _busy.Release())
_idle.Set();
}
}
Above is my code. I can go to the URL with Chrome and view the reply, even though the server shows that it receives two GET requests. I want to be able to handle POST requests as well: when I send a POST request, the server reads it properly, but the client doesn't get the reply.
You should add ctx.Response.ContentLength64=....
(you may also need ctx.Response.Close())
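A minimal sketch of what that advice might look like, assuming the whole body is known before it is sent. ListenerReply, Encode, and Send are hypothetical names for this example; the value assigned to ContentLength64 here is the UTF-8 byte count of the body, which is an assumption about what the answer intends.

```csharp
using System.Net;
using System.Text;

// Hypothetical helper, not from the original answer.
public static class ListenerReply
{
    // Encodes the body once so the same bytes are used for the length and the write.
    public static byte[] Encode(string html)
    {
        return Encoding.UTF8.GetBytes(html);
    }

    // Sets the length from the byte count (not the character count),
    // writes the body, and closes the response so the client sees it complete.
    public static void Send(HttpListenerResponse response, string html)
    {
        byte[] body = Encode(html);
        response.ContentType = "text/html; charset=utf-8";
        response.ContentLength64 = body.Length;
        response.OutputStream.Write(body, 0, body.Length);
        response.Close();
    }
}
```

In the callback above, ListenerReply.Send(context.Response, "<html><head></head><body>Server Response</body></html>"); would take the place of the StreamWriter block.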

Simple webclient request returning MS.InternalMemoryStream

I have a simple webclient that connects to a webpage and returns the data. The code is as follows:
try
{
WebClient webClient = new WebClient();
Uri uri = new Uri("https://domain.com/register.php?username=" + txtbUser.Text);
webClient.OpenReadCompleted +=
new OpenReadCompletedEventHandler(webClient_OpenReadCompleted);
webClient.OpenReadAsync(uri);
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
void webClient_OpenReadCompleted(object sender, OpenReadCompletedEventArgs e)
{
if (e.Error == null)
{
//Process web service result here
MessageBox.Show(e.Result.ToString());
}
else
{
//Process web service failure here
MessageBox.Show(e.Error.Message);
}
}
The value shown for e.Result is MS.InternalMemoryStream rather than the data returned by the webpage, which should just be a 0 or 1. Any ideas?
thanks,
Nathan
.ToString() returns the name of the class, in this case InternalMemoryStream. You have to read the stream to get the result, for example with a StreamReader.
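Reading the stream could look like the sketch below. StreamText and ReadAll are hypothetical names, and UTF-8 is an assumed encoding for the response; the real page may use a different one.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper, not from the original answer.
public static class StreamText
{
    // Reads the whole stream as text and disposes the reader (and the stream) afterwards.
    public static string ReadAll(Stream stream)
    {
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

In the completed handler, MessageBox.Show(StreamText.ReadAll(e.Result)); would then show the page content instead of the type name.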

webrequest.begingetresponse is taking too much time when the url is invalid

I am using WebRequest to fetch some image data. The URL may sometimes be invalid. For an invalid URL, BeginGetResponse takes as long as the timeout period, and the control becomes unresponsive during that time. In other words, the async callback is not working asynchronously. Is this expected behaviour?
try
{
// Async requests
WebRequest request = WebRequest.Create(uri);
request.Timeout = RequestTimeOut;
RequestObject requestObject = new RequestObject();
requestObject.Request = request;
request.BeginGetResponse(this.ProcessImage, requestObject);
}
catch (Exception)
{
ShowErrorMessage(uri);
}
private void ProcessImage(IAsyncResult asyncResult)
{
try
{
RequestObject requestObject = (RequestObject)asyncResult.AsyncState;
WebRequest request = requestObject.Request;
WebResponse response = request.EndGetResponse(asyncResult);
Bitmap tile = new Bitmap(response.GetResponseStream());
// do something
}
catch (Exception)
{
ShowErrorMessage();
}
}
It looks like this is an issue with .NET. BeginGetResponse blocks until DNS is resolved. For a wrong URL (like http://somecrap) it keeps trying until it times out. See the following links -
link1 and link2
I just ran into this same situation. While it's not a perfect workaround, I decided to use Ping.SendAsync() to ping the site first. The good part is that the async call returns immediately. The bad part is the extra step, and that not all sites respond to ping requests.
public void Start(WatchArgs args)
{
var p = new System.Net.NetworkInformation.Ping();
args.State = p;
var po = new System.Net.NetworkInformation.PingOptions(10, true);
p.PingCompleted += new PingCompletedEventHandler(PingResponseReceived);
p.SendAsync(args.Machine.Name, 5 * 1000, Encoding.ASCII.GetBytes("watchdog"), po, args);
}
private void PingResponseReceived(object sender, PingCompletedEventArgs e)
{
WatchArgs args = e.UserState as WatchArgs;
var p = args.State as System.Net.NetworkInformation.Ping;
p.PingCompleted -= new System.Net.NetworkInformation.PingCompletedEventHandler(HttpSmokeWatcher.PingResponseReceived);
args.State = null;
if (System.Net.NetworkInformation.IPStatus.Success == e.Reply.Status)
{
// ... BeginGetResponse now
}
else
{
/// ... machine not available
}
}
I've only just coded it and it has been running for a day, but the initial results look promising.
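Since the delay is attributed to DNS resolution, another variation on the same idea is to resolve the host name asynchronously before starting the request. The sketch below is an assumption-laden illustration rather than a verified fix: DnsPrecheck and HostResolvesAsync are hypothetical names, and how quickly the lookup itself returns can depend on the runtime and the resolver.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical helper, not from the original answers.
public static class DnsPrecheck
{
    // Returns true if the host resolves to at least one address within the timeout.
    public static async Task<bool> HostResolvesAsync(string host, TimeSpan timeout)
    {
        Task<IPAddress[]> lookup = Dns.GetHostAddressesAsync(host);
        Task finished = await Task.WhenAny(lookup, Task.Delay(timeout));
        if (finished != lookup)
        {
            return false; // The lookup did not finish in time.
        }
        try
        {
            IPAddress[] addresses = await lookup;
            return addresses.Length > 0;
        }
        catch (SocketException)
        {
            return false; // The name could not be resolved.
        }
    }
}
```

The caller would pass new Uri(uri).Host and only call BeginGetResponse when the result is true, showing the error message otherwise.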

.NET Proxy Support - HTTPWebRequest

Okay, I need help again! For some reason this is not working and I have no idea why. Nothing even shows up from my catch block.
public void load(object sender, DoWorkEventArgs e)
{
int repeat = 1;
int proxyIndex = 1;
if (listBox1.Items.Count == proxyIndex) //If we're at the end of the proxy list
{
proxyIndex = 0; //Make the selected item the first item in the list
}
try
{
int i = 0;
while (i < listBox1.Items.Count)
{
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(textBox1.Text);
string proxy = listBox1.Items[1].ToString();
string[] proxyArray = proxy.Split(':');
WebProxy proxyz = new WebProxy(proxyArray[0], int.Parse(proxyArray[1]));
request.Proxy = proxyz;
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
{
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
string str = reader.ReadToEnd();
}
}
/*HttpWebRequest request = (HttpWebRequest)WebRequest.Create(textBox1.Text);
string proxy = listBox1.Items[i].ToString();
string[] proxyArray = proxy.Split(':');
WebProxy proxyz = new WebProxy(proxyArray[0], int.Parse(proxyArray[1]));
request.Proxy = proxyz;
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
StreamReader reader = new StreamReader(response.GetResponseStream());
string str = reader.ReadToEnd();
Thread.Sleep(100);
{
if (str != null)
{
listBox2.Items.Add("Starting connection.");
Thread.Sleep(1000);
{
listBox2.Items.Add("Waiting..");
Thread.Sleep(500);
{
listBox2.Items.Add("Connection closed.");
repeat++;
continue;
}
}
}
else if (str == null)
{
listBox2.Items.Add("Reply was null, moving on.");
proxyIndex++;
repeat++;
}
}
*/
}
}
catch (Exception ex) //Incase some exception happens
{
MessageBox.Show(ex.Message);
return;
// listBox2.Items.Add("Error:" + ex.Message);
}
}
How can I get it to work?
It looks like you're trying to use a BackgroundWorker to perform this operation, and in the absence of any more detailed information on what isn't working, I'd guess it's because you aren't assigning any result or errors which can be picked up by main thread.
You should assign the results of the request in the case of success:
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
e.Result = reader.ReadToEnd();
}
Since you seem to be making multiple requests you should probably make the result a List<string> or similar.
You should remove the try/catch block and deal with any errors in the RunWorkerCompleted event of the BackgroundWorker:
private void BackgroundWorker_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e)
{
if (e.Error != null)
{
MessageBox.Show("Error in async operation: " + e.Error.Message);
}
else
{
//process results
}
}
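Putting the two suggestions together, a sketch might look like the following. WorkerSketch, Create, and the fetch and onDone delegates are hypothetical names for this example; fetch stands in for the HttpWebRequest code so that the sample stays self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;

// Hypothetical helper, not from the original answer.
public static class WorkerSketch
{
    // Builds a worker that calls fetch once per proxy and reports
    // either the collected results or the error through onDone.
    public static BackgroundWorker Create(
        IList<string> proxies,
        Func<string, string> fetch,
        Action<List<string>, Exception> onDone)
    {
        var worker = new BackgroundWorker();
        worker.DoWork += (sender, e) =>
        {
            var results = new List<string>();
            foreach (string proxy in proxies)
            {
                results.Add(fetch(proxy)); // No try/catch: errors surface in e.Error below.
            }
            e.Result = results;
        };
        worker.RunWorkerCompleted += (sender, e) =>
        {
            // Check Error first; reading e.Result after a failure throws.
            if (e.Error != null)
            {
                onDone(null, e.Error);
            }
            else
            {
                onDone((List<string>)e.Result, null);
            }
        };
        return worker;
    }
}
```

Copying the list box items into a list before calling RunWorkerAsync() keeps the DoWork handler from touching UI controls on the background thread.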
