How can I get a web site's favicon? - c#

Simple enough question: I've created a small app that is basically just a favourites that sits in my system tray so that I can open often-used sites/folders/files from the same place. Getting the default icons from my system for known file types isn't terribly complicated, but I don't know how to get the favicon from a website. (SO has the grey->orange stack icon in the address bar for instance)
Does anyone know how I might go about that?

You'll want to tackle this a few ways:
Look for the favicon.ico at the root of the domain
www.domain.com/favicon.ico
Look for a <link> tag with the rel="shortcut icon" attribute
<link rel="shortcut icon" href="/favicon.ico" />
Look for a <link> tag with the rel="icon" attribute
<link rel="icon" href="/favicon.png" />
The latter two will usually yield a higher quality image.
Just to cover all of the bases, there are device specific icon files that might yield higher quality images since these devices usually have larger icons on the device than a browser would need:
<link rel="apple-touch-icon" href="images/touch.png" />
<link rel="apple-touch-icon-precomposed" href="images/touch.png" />
And to download the icon without caring what the icon is you can use a utility like http://www.google.com/s2/favicons which will do all of the heavy lifting:
var client = new System.Net.WebClient();
client.DownloadFile(
@"http://www.google.com/s2/favicons?domain=stackoverflow.com",
"stackoverflow.com.ico");
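If you would rather fetch the icon yourself, the lookup order above can be sketched roughly as follows. This is only a minimal sketch: `FaviconCandidates` and `Build` are hypothetical names, and it assumes you have already collected the href values of the relevant <link> tags from the page. It resolves each href against the page URL and appends the conventional /favicon.ico as a last resort.

```csharp
using System;
using System.Collections.Generic;

static class FaviconCandidates
{
    // Hypothetical helper: turns the href values found in <link> tags into
    // absolute URLs and appends the conventional /favicon.ico as a fallback.
    public static List<Uri> Build(Uri pageUrl, IEnumerable<string> linkHrefs)
    {
        var candidates = new List<Uri>();
        foreach (var href in linkHrefs)
        {
            // Uri.TryCreate resolves a relative href against the page URL
            // and leaves an already absolute href unchanged.
            if (Uri.TryCreate(pageUrl, href, out Uri absolute) && !candidates.Contains(absolute))
                candidates.Add(absolute);
        }

        // Fallback: favicon.ico at the root of the domain.
        var fallback = new Uri(pageUrl, "/favicon.ico");
        if (!candidates.Contains(fallback))
            candidates.Add(fallback);

        return candidates;
    }
}
```

For example, calling `FaviconCandidates.Build(new Uri("http://www.domain.com/some/page.html"), new[] { "/favicon.png", "images/touch.png" })` should give http://www.domain.com/favicon.png, http://www.domain.com/some/images/touch.png and http://www.domain.com/favicon.ico, in that order. You can then try each candidate with WebClient.DownloadFile until one succeeds.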

Updated 2020
Here are three services you can use in 2020 onwards
<img height="16" width="16" src='https://icons.duckduckgo.com/ip3/www.google.com.ico' />
<img height="16" width="16" src='http://www.google.com/s2/favicons?domain=www.google.com' />
<img height="16" width="16" src='https://api.statvoo.com/favicon/?url=google.com' />

You can use Google S2 Converter.
http://www.google.com/s2/favicons?domain=google.com
Source: http://www.labnol.org/internet/get-favicon-image-of-websites-with-google/4404/

This question was the first Google result I got when searching for a website favicon API, so I think this will still be helpful in the future.
https://icon.horse/icon/[url.hostname] will give you a better site icon.
https://icon.horse/icon/stackoverflow.com

You can do it without programming in 3 steps:
1. Just open the web site, right-click and select "view source" to open the HTML code of that site. Then in the text editor search for "favicon" - it will direct you to something looking like
<link rel="icon" href='/SOMERELATIVEPATH/favicon.ico' type="image/x-icon" />
Take the string in href and append it to the web site's base URL (let's assume it is "http://WEBSITE/"), so it looks like
http://WEBSITE/SOMERELATIVEPATH/favicon.ico
which is the absolute path to the favicon. If you didn't find it this way, it can be as well in the root in which case the URL is http://WEBSITE/favicon.ico.
2. Take the URL you determined and insert it into the href-Parameter of the following code:
<html>
<head>
<title>Capture Favicon</title>
</head>
<body>
<a href='http://WEBSITE/SOMERELATIVEPATH/favicon.ico'>Favicon</a>
</body>
</html>
3. Save this HTML code locally (e.g. on your desktop) as GetFavicon.html and then double-click on it to open it. It will display only a link named Favicon. Right-click on this link and select "Save target as..." to save the Favicon on your local PC - and you're done!

It's a good practice to minimize the number of requests each page needs.
So if you need several icons, yandex can do a sprite of favicons in one query.
Here is an example
http://favicon.yandex.net/favicon/google.com/stackoverflow.com/yandex.net/

The first thing to look for is /favicon.ico in the site root; something like WebClient.DownloadFile() should do fine. However, you can also set the icon in metadata - for SO this is:
<link rel="shortcut icon"
href="http://sstatic.net/stackoverflow/img/favicon.ico">
and note that alternative icons might be available; the "touch" one tends to be bigger and higher res, for example:
<link rel="apple-touch-icon"
href="http://sstatic.net/stackoverflow/img/apple-touch-icon.png">
so you would parse that in either the HTML Agility Pack or XmlDocument (if xhtml) and use WebClient.DownloadFile()
Here's some code I've used to obtain this via the agility pack:
var favicon = "/favicon.ico";
var el = root.SelectSingleNode("/html/head/link[@rel='shortcut icon' and @href]");
if (el != null) favicon = el.Attributes["href"].Value;
Note the icon is theirs, not yours.

In 2020, using duckduckgo.com's service from the CLI
curl -v https://icons.duckduckgo.com/ip2/<website>.ico > favicon.ico
Example
curl -v https://icons.duckduckgo.com/ip2/www.cdc.gov.ico > favicon.ico

You can get the favicon URL from the website's HTML.
Here is the favicon element:
<link rel="icon" type="image/png" href="/someimage.png" />
You should use a regular expression here. If no tag found, look for favicon.ico in the site root directory. If nothing found, the site does not have a favicon.
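A rough sketch of the regular-expression approach might look like this. `FaviconRegex` and `FindIconHref` are hypothetical names, and the patterns are deliberately simple: they assume quoted attribute values and will not cope with every piece of real-world HTML, so a proper HTML parser is the safer choice for anything serious.

```csharp
using System;
using System.Text.RegularExpressions;

static class FaviconRegex
{
    // Matches a whole <link ...> tag.
    static readonly Regex LinkTag =
        new Regex(@"<link\b[^>]*>", RegexOptions.IgnoreCase);

    // Matches rel="icon" or rel="shortcut icon" (single or double quotes).
    static readonly Regex RelIcon =
        new Regex(@"\brel\s*=\s*[""'](?:shortcut\s+)?icon[""']", RegexOptions.IgnoreCase);

    // Captures the quoted value of the href attribute.
    static readonly Regex Href =
        new Regex(@"\bhref\s*=\s*[""']([^""']+)[""']", RegexOptions.IgnoreCase);

    // Returns the href of the first icon <link> tag, or null if none is found.
    public static string FindIconHref(string html)
    {
        foreach (Match tag in LinkTag.Matches(html))
        {
            if (!RelIcon.IsMatch(tag.Value)) continue;

            Match href = Href.Match(tag.Value);
            if (href.Success) return href.Groups[1].Value;
        }
        return null;
    }
}
```

If `FindIconHref` returns null, fall back to favicon.ico in the site root as described above. The returned value may be a relative path, so resolve it against the page URL before downloading.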

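// Requires: using System.IO; using System.Net; and a reference to System.Drawing
HttpWebRequest w = (HttpWebRequest)WebRequest.Create("http://stackoverflow.com/favicon.ico");
w.AllowAutoRedirect = true;
using (HttpWebResponse r = (HttpWebResponse)w.GetResponse())
using (Stream s = r.GetResponseStream())
using (System.Drawing.Image ico = System.Drawing.Image.FromStream(s))
{
// The stream must stay open for the lifetime of the image, so save before disposing it.
ico.Save("favicon.ico");
}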

Sometimes we can't get the favicon image with the proposed solutions, as some websites use .png or other image extensions. Here is a working solution.
Open your website with a firefox browser.
Right-click on the website and click the "View page info" option from the list.
It will open up a dialog and click on the "Media" tab.
In that tab you will see all the images including favicon.
Select the favicon.ico image or click through the images to see which image is used as favicon. Some websites use .png images as well.
Then click on the "Save As" button and you should be good to go.
thanks!

This is a late answer, but for completeness: it is difficult to get even close to fetching 90% of all favicons.
A while ago I wrote a WordPress plugin which attempts to get closer to 100%.
This is how it works:
It starts by searching existing favicon repositories such as Google favicons and GetFavicons for the favicon.
If none of them returns an icon, the plugin attempts to get the icon itself. This involves traversing several pages on the domain.
The plugin then inspects the physical image file, because on some servers files get returned with the incorrect mime types.
The code is still not perfect because in the details you will find many weird situations: people have wrongly coded paths, e.g. img/favicon.ico where img is not in the root, duplicate headers in HTML output, different server responses from the head and body etc.
The core of the fetching part is here so you can reverse-engineer it, but be aware that validating the response should be done (checking image filetype, mime etc.).
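The plugin itself is written for WordPress, so the following is not its code. It is only a minimal C# sketch of the "inspect the physical file" idea: it ignores the MIME type reported by the server and looks at the first bytes of the download instead. `ImageSniffer` and `Detect` are hypothetical names, and only three common formats are checked.

```csharp
using System;

static class ImageSniffer
{
    // Returns "png", "ico" or "gif" based on the leading bytes of the file,
    // or null when the data does not look like one of these formats.
    public static string Detect(byte[] data)
    {
        if (data == null || data.Length < 4) return null;

        // PNG files start with 0x89 'P' 'N' 'G'.
        if (data[0] == 0x89 && data[1] == 0x50 && data[2] == 0x4E && data[3] == 0x47)
            return "png";

        // ICO files start with 00 00 (reserved) followed by 01 00 (image type 1 = icon).
        if (data[0] == 0x00 && data[1] == 0x00 && data[2] == 0x01 && data[3] == 0x00)
            return "ico";

        // GIF files start with the ASCII text "GIF8".
        if (data[0] == 0x47 && data[1] == 0x49 && data[2] == 0x46 && data[3] == 0x38)
            return "gif";

        return null;
    }
}
```

A typical use would be to download the candidate with WebClient.DownloadData and discard it when `Detect` returns null, which catches servers that answer a favicon request with an HTML error page.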

The SHGetFileInfo function (check pinvoke.net for the signature) lets you retrieve a small or large icon, just as if you were dealing with a file/folder/Shell item.

http://realfavicongenerator.net/favicon_checker?site=http://stackoverflow.com gives you a favicon analysis stating which favicons are present and in what sizes. You can process the page information to see which is the best quality favicon, and append its filename to the URL to get it.

You can use Getfv.co :
To retrieve a favicon you can hotlink it at... http://g.etfv.co/[URL]
Example for this page : http://g.etfv.co/https://stackoverflow.com/questions/5119041/how-can-i-get-a-web-sites-favicon
Download content and let's go !
Edit :
Getfv.co and fvicon.com look dead. If you want I found a non free alternative : grabicon.com.

Using jquery
var favicon = $("link[rel='shortcut icon']").attr("href") ||
$("link[rel='icon']").attr("href") || "";

Related

How to load relative source in WebView

I'm using HttpClient to get html string and use WebView's navigateToString method to show this page. I know I can use WebView load this page directly, but I would need to do some processing on that page before it's shown in WebView.
This led to a problem: the web page references some css/js files in its header, but their 'href' values are relative paths, so the page does not show correctly in the WebView.
[Updated]
For example, I'm using HttpClient to request a URI (http://example.com), then I will get the whole html page string. I will do some operations on this html string. After that, I will use WebView.NavigateToString(htmlpage) method to show this page. But if you check its head tag, there will be some <link> tag, its href value is relative path(/style-a/1.css), not absolute path. Then you will find that the html page doesn't show correctly in WebView.
Could someone give me a solution/code sample?
@Pedro Lamas, rene, Barett, moi_meme, Shachaf.Gortler Please do not put my question on hold. I didn't break any SO rules. My question was very clear. I think you do this, it's because you do not know how to answer my question. That's ok. If you don't know, you could choose not to answer it, but please do not put it on hold.
You can prefix the address in your link or script tags with ms-appx-web:// followed by an additional slash (/) and then the path.
for example in this case you can use :
<script src="ms-appx-web:///Assets/FolderName/test.js" type="text/javascript"></script>
and also for link can use :
<link rel="stylesheet" type="text/css" href="ms-appx-web:///style-a/1.css">

How to get favicon from url when website has no access to the internet

How can I get the favicon from ANY webpage (for example, a webpage that has no access to the internet) without using a third-party service like Google S2 Converter or something else in c#/asp.net?
I am a developer of an internal Web application; we generate in one of our page a table with dynamic links to other internal websites. We don’t want to download every possible icon from the internal websites and link them in the table. Is there a way to get the icon from a page dynamically?
example:
<table>
@foreach (var mylink in Model.links)
{
var faviconOfPage = //Get Url of favicon > example: favicon of subpage.mycompany.com
<tr>
<td>@mylink.name</td>
<td><img src="@faviconOfPage"></td>
</tr>
}
</table>
Note: The requested site is internal: no access from outside!
Thank you for helping me!
A favicon can be supplied through two ways: convention (located at /favicon.ico at the root of the site) or through the <link rel="shortcut icon" /> element.
Because the latter can be the case for every page, you need to request and parse every page to see whether that <link> element is present, and if so, download the file specified by the href attribute. If not, you can download the /favicon.ico from the host.
Since you need to do this per page, and since it can differ per request, you're going to need some caching. You can't request and parse N pages for every page of your web application containing N links, because that would make your application horribly slow.
So create some kind of background job that processes your links, downloads their favicons and stores them in a way so that you can associate icons with links.
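One way to store the results of such a background job is a small thread-safe cache keyed by page URL. The sketch below is only an illustration: `FaviconCache`, `Store` and `Lookup` are hypothetical names, the job that actually requests and parses each page is left out, and the cache lives in memory only.

```csharp
using System;
using System.Collections.Concurrent;

class FaviconCache
{
    // Maps the absolute URL of a linked page to the URL of its favicon.
    private readonly ConcurrentDictionary<string, string> _iconUrlByPage =
        new ConcurrentDictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    // Called by the background job after it has parsed a page.
    public void Store(Uri page, string iconUrl)
    {
        _iconUrlByPage[page.AbsoluteUri] = iconUrl;
    }

    // Called while rendering the table; never makes a network request.
    // Falls back to the conventional /favicon.ico when the page has not been processed yet.
    public string Lookup(Uri page)
    {
        return _iconUrlByPage.TryGetValue(page.AbsoluteUri, out string iconUrl)
            ? iconUrl
            : new Uri(page, "/favicon.ico").AbsoluteUri;
    }
}
```

In the view, the placeholder for faviconOfPage could then become a call such as `cache.Lookup(new Uri(mylink.url))`, where `cache` and `mylink.url` are assumed names for a shared FaviconCache instance and the link's address.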

Using site root relative links in Razor

I have a website that is working fine with Razor (C#) all the coding is working properly when I use my local testing (WebMatrix IIS).
When I put it "online" on my server, the website is not at the root of the site itself.
For example:
http:// intranet.mycompany.com/inform
That's basically the "root" of my folder structure so all my folders starts from there (css file default.cshtml... and so on)
My "_PageStart.cshtml" sees it properly cause when I access my site from the link http://intranet.mycompany.com/inform it gives me the Layout I have configured in _PageStart.cshtml (and it really show the layout + the rendered default.cshtml)
BUT nothing else is getting the proper path, for example :
<img src="~/images/logos/hdr.png" />
The img holder is there I can see it but shows that the link is broken... when I Right-Click the img holder and do properties to see where the files should be it shows me :
http:// intranet.mycompany.com/images/logos/hdr.png
So it's going to the "full" root not the relative root...
How can i fix that ?
You have to use relative paths all over your app:
~ won't work within static html code.
You can write
<img src="@Url.Content("~/images/logos/hdr.png")" />
or
<img src="../images/logos/hdr.png" />
The first approach is good for layout files where your relative path might be changing when you have different length routing urls.
EDIT
Regarding to your question about normal links:
When linking to another page in your app you don't specify the view file as the target but the action which renders a view as the result. For that you use the HtmlHelper ActionLink:
@Html.ActionLink("Linktext", "YourAction", "YourController")
That generates the right url for you automatically:
Linktext
EDIT 2
Ok, no MVC - so you have to generate your links yourself.
You have to use relative paths, too. Don't start any link with the / character!
Link
Link
Link
EDIT 3
When using Layout pages you can use the Href extension method to generate a relative url:
<link href="@Href("~/style.css")" ...
Use Url.Content as shown below:
<img src="@Url.Content("~/images/logos/hdr.png")" />
I know that '~' is added by default, but I tend to change it so that all paths are relative to my code file rather than application root, using ".." eg. "../images/logos" etc

How to get a web page's text overview and photo from a link, similar to Facebook when posting a link

When you post a link on Facebook, they retrieve an image from that link's page and an overview of that page's content. Any ideas on how to have this kind of functionality?
Thanks in advance.
It's not something that can be answered briefly, but I can point you in the right direction. You have to read the HTML page and parse it for all image tags. There are different ways of doing this, but as an example:
WebClient webClient = new WebClient();
webClient.Encoding = Encoding.UTF8;
string pageHtml = webClient.DownloadString(your_link_url);
Then you can search the string for <img> tags and read their src attributes. Facebook (and more recently MySpace) uses more complicated logic and rules to determine which images to grab (e.g. only certain size limits), so you can do something similar.
Btw, Facebook and MySpace recommend to use metatagging of content in order to "tell" their "fetchers" which images exactly they should fetch when sharing. So you could parse the page for those first, and if they are not present, continue with other images:
<meta name="title" content="TITLE_GOES_HERE" />
<meta name="description" content="EXCERPT_GOES_HERE" />
<link rel="image_src" href="IMAGE_URL_GOES_HERE" />
http://developerwiki.myspace.com/index.php?title=How_to_Add_Post_To_MySpace_to_Your_Site
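Reading those tags out of the downloaded pageHtml string could be sketched like this. `SharePreview`, `MetaContent` and `ImageSrc` are hypothetical names, and the patterns assume the attributes appear in exactly the order shown above with quoted values, so treat this as a starting point rather than a robust parser.

```csharp
using System;
using System.Text.RegularExpressions;

static class SharePreview
{
    // Returns the content of <meta name="NAME" content="..."> or null if absent.
    // Assumes the name attribute comes directly before content, as in the tags above.
    public static string MetaContent(string html, string name)
    {
        string pattern = @"<meta\s+name\s*=\s*[""']" + Regex.Escape(name) +
                         @"[""']\s+content\s*=\s*[""']([^""']*)[""']";
        Match m = Regex.Match(html, pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }

    // Returns the href of <link rel="image_src" href="..."> or null if absent.
    public static string ImageSrc(string html)
    {
        Match m = Regex.Match(
            html,
            @"<link\s+rel\s*=\s*[""']image_src[""']\s+href\s*=\s*[""']([^""']*)[""']",
            RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

If `ImageSrc(pageHtml)` returns null, continue with the <img> tags as described above. Note that the simple pattern stops at the first quote character, so a description containing an apostrophe would be cut short.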

Style Sheet is not working in one web page

Style sheet in master page is not working for one web page of asp.net application but it works for another web page.
If you are referencing a css file from a master page you should ensure it has an application-root-relative path (one starting with "~"), that way it will work everywhere. For example:
<head runat="server">
<link type="text/css" rel="stylesheet" href="~/_styles/mystylesheet.css" />
</head>
The important thing to note here is that the head tag has the runat="server" attribute and that I am specifying the full virtual path using a tilde ("~").
Are none of its style elements being included? Is it being overridden (they are Cascading Style Sheets)? Does it have the correct CSS include statement?
Are your pages in different levels of folders ?
For example,
..\main.css
..\folder1\MasterPage.master
..\folder1\css_working.aspx
..\folder1\folder2\css_not_working.aspx
in this scenario you should define your css in masterpage as :
<link rel="stylesheet" type="text/css" href="../main.css" />
And take your pages to same level, like that :
..\main.css
..\folder1\MasterPage.master
..\folder1\css_working.aspx
..\folder2\css_not_working.aspx
If you are using update panels there are some cases where the styling may be lost for AJAX toolkit controls. To fix this you need to put the full name of the class items into the stylesheet instead of letting the toolkit handle this.
Also be sure to use a relative url where possible so that if a file moves it won't lose its mapping.
Use Firebug or Debug Bar, these tools will show you all the styles being employed on each element, so you can see what stylesheets it is using and which ones it is not.
Also, when you build check for any warnings about stylesheets that it can't reference etc.
it could be a permission issue on the folder... if you have deny users="?" in your web config.. make sure you have an allow users on the folder where you have your style sheets
