Getting actual content from RSS feed


For example, here is a link to one of the RSS feeds that CNN offers:
http://rss.cnn.com/rss/edition.rss
Using this feed in a Windows 8 Store app, I am able to read it with the built-in SyndicationClient class. However, it gives only the title and a short summary of each news story/article, not the full content. Now I want to get all the content, i.e. text and images. I have seen many news reader apps in the Windows Store that do this easily: when I tap on a story, the actual content is shown right there.
Any idea how to accomplish this? Do I need some sort of HTML parser here?
You can have a look at the News or News Bento apps, for example. I want to achieve something similar.
Here are the images from the app:
This is the extracted text and images from the news article:
This is the view when you click on "View Original Article". I know that the view below uses a WebView control, but I want to know how to extract the data as shown in the image above.

Well, the answer is Readability. More here as well:
https://github.com/scottksmith95/CSharp.Readability
It took me a lot of time to find this, but it is exactly what I wanted.

Related

How to extract the embedded/preview image from a link in C#?

I'm building a C# program that shows the news feed from an RSS XML page, and I want to include a feature that shows a link, the title (which I already have) and the image preview of that link, as happens on Discord or Messenger.
If we send a link to a post, for example, and that post has an image, it will show a preview (more or less like in the pictures). The same happens for YouTube links: it shows the thumbnail of the video.
Here is an example of a link to a post on Discord and Messenger.
It displays the "main picture" of that post.
Discord example: http://prntscr.com/n8j0m6
Messenger example: http://prntscr.com/n8j29f
I want to extract the embedded/preview image from a link in C#, or at least the URL of that image (so I can then load it in the program), or create a similar preview system, which would be even better. With the image URL, I can write a method to do that automatically.
I haven't had any luck finding anything similar so far. Maybe I am not using the correct term.
Thank you in advance.
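Previews like these are commonly driven by the Open Graph `og:image` meta tag in the page's HTML. The following is a minimal sketch of reading that tag, assuming the HTML has already been downloaded as a string. The class name `LinkPreview`, the method `FindPreviewImage` and the sample HTML are hypothetical, and the regular expressions only handle quoted attributes; a real HTML parser would be more robust.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class LinkPreview
{
    // Matches whole <meta ...> tags. Attribute order varies between sites,
    // so each attribute is read separately afterwards.
    private static readonly Regex MetaTag =
        new Regex(@"<meta\b[^>]*>", RegexOptions.IgnoreCase);

    // Reads a single- or double-quoted attribute value from one tag.
    private static string GetAttribute(string tag, string name)
    {
        Match m = Regex.Match(
            tag,
            @"\b" + Regex.Escape(name) + @"\s*=\s*(?:""([^""]*)""|'([^']*)')",
            RegexOptions.IgnoreCase);
        if (!m.Success) return null;
        return m.Groups[1].Success ? m.Groups[1].Value : m.Groups[2].Value;
    }

    // Returns the og:image URL found in the HTML, or null if there is none.
    public static string FindPreviewImage(string html)
    {
        foreach (Match m in MetaTag.Matches(html))
        {
            string property = GetAttribute(m.Value, "property");
            if (string.Equals(property, "og:image", StringComparison.OrdinalIgnoreCase))
            {
                string content = GetAttribute(m.Value, "content");
                if (!string.IsNullOrEmpty(content))
                    return WebUtility.HtmlDecode(content);
            }
        }
        return null;
    }

    public static void Main()
    {
        // Hypothetical page head; a real program would download the HTML first.
        string html = "<html><head>" +
            "<meta property=\"og:title\" content=\"Example post\" />" +
            "<meta content='https://example.com/cover.jpg' property='og:image' />" +
            "</head><body></body></html>";
        Console.WriteLine(FindPreviewImage(html));
    }
}
```

Pages that do not publish an `og:image` tag return null here, so a fallback (for example the first large image on the page) would still be needed.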

Pagination algorithms for HTML

I am building an ebook manager app for the Windows Store using Windows 8.1 and the Visual Studio 2013 preview. I have the new WebView control, which is able to resolve URIs and load the HTML and CSS.
However, there is a lot of data in one HTML file and I would like to paginate it somehow. My questions are:
Is there a way to do this with the stream in C#?
Are there any examples out there of paginating HTML content?
Is there a way to measure programmatically how much screen real estate a particular piece of HTML will use?

It depends on the type of data that is sent back to the browser, and on how you want to present it afterwards.
Perhaps you can show some sample data that you want to paginate.

Determining what is content in an HTML page

I am building a news reader and I have an option for users to share an article from a blog, website, etc. by entering a link to the page. For now I am using two methods to determine the content of the page:
I try to extract the RSS feed link from the page the user entered and then match that URL against the feed to find the right item.
If the site doesn't contain a feed, or the feed is malformed, or the entered address differs from the item link in the RSS (which happens in about 50% of cases, if not more), I try to find og meta tags. That works great, but only bigger sites have them; smaller sites and blogs often have the same meta description for the whole website.
I am wondering how Google, for example, does it. When a website doesn't contain a meta description, Google somehow determines by itself what the content of the page is for its search results.
I am using HtmlAgilityPack to extract things from pages and my own methods to clean HTML into text.
Can someone explain the logic or the best approach to this? If I try to crawl the page directly from the top, I usually end up with content from the sidebar, navigation, etc.

I ended up using Boilerpipe, which is written in Java. I imported it using IKVM and it works well for pages that are formatted correctly, but it still has trouble with some pages where the content is scattered.
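To give a rough idea of the kind of logic involved, here is a minimal sketch of one simple heuristic: keep paragraphs that contain a reasonable amount of text and are not made up mostly of link text. This is not Boilerpipe's algorithm, only a simplified stand-in. The class name `ContentGuess`, the 40-character threshold and the sample HTML are assumptions, and regular expressions are used instead of HtmlAgilityPack only to keep the example dependency-free.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class ContentGuess
{
    private const RegexOptions Opts = RegexOptions.IgnoreCase | RegexOptions.Singleline;

    // Removes tags, decodes entities and collapses whitespace.
    private static string StripTags(string fragment)
    {
        string text = Regex.Replace(fragment, "<[^>]+>", " ");
        text = WebUtility.HtmlDecode(text);
        return Regex.Replace(text, @"\s+", " ").Trim();
    }

    // Keeps <p> blocks that are long enough and not dominated by link text.
    // minLength is an arbitrary threshold chosen for this sketch.
    public static List<string> ExtractParagraphs(string html, int minLength = 40)
    {
        var result = new List<string>();

        // Script and style blocks never hold article text.
        html = Regex.Replace(html, @"<(script|style)\b[^>]*>.*?</\1>", "", Opts);

        foreach (Match p in Regex.Matches(html, @"<p\b[^>]*>(.*?)</p>", Opts))
        {
            string inner = p.Groups[1].Value;
            string text = StripTags(inner);

            // Count how many of the visible characters sit inside links.
            int linkChars = 0;
            foreach (Match a in Regex.Matches(inner, @"<a\b[^>]*>(.*?)</a>", Opts))
                linkChars += StripTags(a.Groups[1].Value).Length;

            bool longEnough = text.Length >= minLength;
            bool mostlyLinks = text.Length > 0 && linkChars * 2 > text.Length;
            if (longEnough && !mostlyLinks)
                result.Add(text);
        }
        return result;
    }

    public static void Main()
    {
        // Hypothetical page: a navigation paragraph, an article paragraph, a short note.
        string html =
            "<div id='nav'><p><a href='/'>Home</a> <a href='/news'>News</a> " +
            "<a href='/sport'>Sport and more headlines from around the region</a></p></div>" +
            "<div><p>The council voted on Tuesday to approve the new bridge, " +
            "ending two years of debate.</p><p>Short note.</p></div>";

        foreach (string paragraph in ExtractParagraphs(html))
            Console.WriteLine(paragraph);
    }
}
```

Pages whose content is scattered across many short blocks will defeat a heuristic this simple.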

How to get XML to TextBlocks and an Image

I have a Windows Phone C# application in which I want to get a Google weather API XML file, located here: http://www.google.com/ig/api?weather=[insert zip code here], and read the current weather info from it. I want to display the image referenced in the XML file and, based on that image, display a background image. How would I do that?
Also, I don't want to use a ListBox for it because it won't let me resize the image to full screen.

You are asking rather a lot in your question, so here are some pointers rather than a complete answer.
Use LINQ to XML to analyse the returned XML, using the XDocument.Parse method.
Locate the images using LINQ to XML: doc.Descendants("icon") will find all the icon elements. You can then iterate over them and extract the data attribute.
Create an ImageBrush for your background, setting its source to the URL of the image you require. There are numerous blog posts / SO questions that show you how to handle images in code-behind. For example:
How do you set Image.Source in Silverlight(Code behind)
This should get you started. If you get stuck on something specific, come back and ask a specific question.
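The first two pointers could be sketched as follows. The class name `WeatherXml` and the sample XML are assumptions: the XML is a hand-written approximation of the response shape, not a captured response, and only the `icon` element with its `data` attribute comes from the answer above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class WeatherXml
{
    // Returns the "data" attribute of every <icon> element in the XML.
    public static string[] GetIconPaths(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants("icon")
                  .Select(icon => (string)icon.Attribute("data"))
                  .Where(path => !string.IsNullOrEmpty(path))
                  .ToArray();
    }

    public static void Main()
    {
        // Assumed response shape for illustration only.
        string xml =
            "<xml_api_reply version=\"1\"><weather>" +
            "<current_conditions>" +
            "<condition data=\"Clear\"/>" +
            "<icon data=\"/ig/images/weather/sunny.gif\"/>" +
            "</current_conditions>" +
            "<forecast_conditions>" +
            "<icon data=\"/ig/images/weather/rain.gif\"/>" +
            "</forecast_conditions>" +
            "</weather></xml_api_reply>";

        foreach (string path in GetIconPaths(xml))
            Console.WriteLine(path);
    }
}
```

In the app, the first path returned (combined with the host name if it is relative) could then be turned into a Uri and used as the source of a BitmapImage for the ImageBrush, and the same path could be compared against known icon names to pick the background.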

Get current page number of PDF document in ASP.NET

I am trying to implement a feature where I open a multi-page PDF file (say, in an iframe), highlight a section of the document and get the page number (the one that is displayed in the PDF toolbar).
E.g. if the toolbar displays 2/7, which means I am currently on page 2, I need to capture that page number. It sounds simple, but I have not been able to find a .dll or function that exposes this property.
Any help would be appreciated. Thanks.

I wouldn't think this would be possible, there's no way to control PDFs with JavaScript in the browser, which is what you'd need to do.
This article suggests the same: http://codingforums.com/showthread.php?t=43436.
Content of link:
in short, no, you can't do that.
really don't think JS can read properties of PDFs, since PDFs are viewed in the browser thru a plugin, ie a viewport for another application (for want of a better explanation).
You may be better trying a different route, such as generating the pages as images and implementing your own paging. Depends on your content and requirements, of course. ABCPDF from http://www.websupergoo.com/ is free (with a link-back), not sure if that's any help for you.
