How to prevent known and unknown bots in C#

I have a simple hosted web application. I have enabled Google search, so Google's bots crawl it, but I also see many unknown bots crawling my application. I need to identify the real users visiting the website (excluding bots).
I have used
HttpRequest.Browser.Crawler
but it doesn't work properly.
Can anyone please help me prevent this fully?

You can use Request.UserAgent to read the User-Agent of each request, and then match it against a list of known crawler User-Agent strings.
I would probably use a filter for this and then register it in your FilterConfig:
public class CrawlerFilter : ActionFilterAttribute
{
    public override void OnActionExecuting(ActionExecutingContext context)
    {
        var userAgent = context.HttpContext.Request.UserAgent;
        // Do something with the userAgent and/or drop the request
    }
}
The problem with this is that not all crawlers identify themselves honestly in the User-Agent header, so you can't really be "super" certain.
Edit
I just learned that Request.Browser.Crawler basically does what I've suggested above, although the list it relies on does not seem to be maintained very well.
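The matching step mentioned in the answer can be sketched as a small helper that the filter would call with Request.UserAgent. This is only an illustration: the CrawlerDetector class, its default token list and the decision to treat a missing User-Agent as a crawler are assumptions, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CrawlerDetector
{
    // Hypothetical starter list; extend it with the crawlers seen in your own logs.
    public static readonly IReadOnlyList<string> DefaultTokens =
        new[] { "Googlebot", "Bingbot", "YandexBot", "Slurp" };

    // Returns true when the User-Agent contains any crawler token (case-insensitive).
    // A missing User-Agent is treated as a crawler here; that is a policy choice.
    public static bool IsCrawler(string userAgent, IEnumerable<string> tokens = null)
    {
        if (string.IsNullOrWhiteSpace(userAgent))
            return true;

        return (tokens ?? DefaultTokens).Any(token =>
            userAgent.IndexOf(token, StringComparison.OrdinalIgnoreCase) >= 0);
    }
}
```

Inside OnActionExecuting the call would be CrawlerDetector.IsCrawler(context.HttpContext.Request.UserAgent). As the answer notes, a crawler that sends a browser-like User-Agent will still pass this check.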

Related

"The page you requested was removed" with 410 and redirect

I have the following requirements:
The user comes to a job page on our customer's website, but the job is already taken, so the page no longer exists.
The user should NOT get a 404 but a 410 (Gone), and then be redirected to a job overview page that tells him this job is no longer available and lists the available jobs.
Instead of a 302 (temporarily moved) or a 404 (current behavior), Google should get a 410 (Gone) status to indicate that this page is permanently unavailable.
The old URL should therefore be removed from the index, and the new one should not be treated as a replacement.
So how can I redirect the user with a 410 status? If I try something like this:
string overviewUrl = _urlContentResolver.GetAbsoluteUrl(overviewPage.ContentLink);
HttpContext context = _httpContextResolver.GetCurrent();
context.Response.Clear();
context.Response.Redirect(overviewUrl, false);
context.Response.StatusCode = 410;
context.Response.TrySkipIisCustomErrors = true;
context.Response.End();
I get a static error page in Chrome with nothing but:
The page you requested was removed
The status code is correct (410) and the Location header is set correctly, but no redirect happens.
If I use Redirect and set the status before it:
context.Response.StatusCode = 410;
context.Response.Redirect(overviewUrl, true); // true => endResponse
the redirect happens, but I get a 302 instead of the desired 410.
Is this possible at all? If yes, how?
I think you're trying to bend the rules of HTTP. The documentation states:
The HyperText Transfer Protocol (HTTP) 410 Gone client error response code indicates that access to the target resource is no longer available at the origin server and that this condition is likely to be permanent.
If you don't know whether this condition is temporary or permanent, a 404 status code should be used instead.
In your situation either 404 or 410 seems to be the right status code, but 410 does not have any statement about redirection as a correct behavior that browsers should implement, so you have to assume a redirect is not going to work.
Now, to the philosophically right way to implement your way out of this...
With your stated requirements, "taken" does not mean the resource is gone. It means it exists for the client that claimed it. So, do you 302 Redirect a different client to something else that might be considered correct? You implemented that, and it seems like the right way to do it.
That said, I don't know if you "own" the behavior across the client and server to change the requirements to this approach. Looking at it from the "not found" angle, a 404 also seems reasonable. It's not found because "someone" already has the resource.
In short if your requirements are set in stone, they may be in opposition to the HTTP spec. If you still must have a 410 then you would need to change the behavior on the client-side somehow. If that's JavaScript, you'd need to expect a 410 from the server that returns a helpful payload that the client interprets to do something else (e.g. like a simulated redirect).
If you don't "own" the client code... well that's a different problem.
There's a short blog post by Tommy Griffith that backs up what I am saying. Take a read. It says in part:
The “Gone” error response code means that the page is truly gone—it’s no longer available on the origin server and no redirect was set up.
Sometimes, webmasters want to be very explicit to Google and other search engines that a page is gone. This is a much more direct signal to Google that a page is truly gone and never coming back. It's slightly more direct than a 404.
So, is it possible? Yes, but you're going to need to "fake" it by changing both client and server code.
I will accept Kit's answer since he's right in general, but maybe I overcomplicated my requirement a bit, so I want to share my solution.
What did I actually want?
Provide crawlers with a 410 so that the taken job page is delisted from search engine indexes.
Provide the user with a better experience than a 404: redirect him to a job overview where he can find similar jobs and gets a message.
These are two separate requirements for two separate kinds of visitor, so I could simply provide one solution for crawlers and one for "normal" users.
In case someone needs something similar, I can provide more details. Here is just a snippet:
if (HttpContext.Current.IsInSearchBotMode())
{
    Deliver410ForSearchBots(HttpContext.Current);
}
else
{
    // redirect(301) to job-overview, omitting details
}
private void Deliver410ForSearchBots(HttpContext context)
{
    context.Response.Clear();
    context.Response.StatusCode = 410;
    context.Response.StatusDescription = "410 job taken";
    context.Response.TrySkipIisCustomErrors = true;
    context.Response.End();
}
public static bool IsInSearchBotMode(this HttpContext context)
{
    ISearchBotConfiguration configuration = ServiceLocator.Current.GetInstance<ISearchBotConfiguration>();
    string userAgent = context.Request?.UserAgent;
    return !(string.IsNullOrEmpty(userAgent) || configuration.UserAgents == null)
        && configuration.UserAgents.Any(bot => userAgent!.IndexOf(bot, StringComparison.InvariantCultureIgnoreCase) >= 0);
}
These are the user agents I used for crawler detection:
<add key="SearchBot.UserAgents" value="Googlebot;Googlebot-Image;Googlebot-News;APIs-Google;AdsBot-Google;AdsBot-Google-Mobile;AdsBot-Google-Mobile-Apps;DuplexWeb-Google;Google-Site-Verification;Googlebot-Video;Google-Read-Aloud;googleweblight;Mediapartners-Google;Storebot-Google;LinkedInBot;bitlybot;SiteAuditBot;FacebookBot;YandexBot;DataForSeoBot;SiteCheck-sitecrawl;MJ12bot;PetalBot;Yeti;SemrushBot;Roboter;Bingbot;AltaVista;Yahoobot;YahooCrawler;Slurp;MSNbot;Lycos;AskJeaves;IBMResearchWebCrawler;BaiduSpider;facebookexternalhit;XING-contenttabreceiver;Twitterbot;TweetmemeBot" />
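The snippet does not show how ISearchBotConfiguration.UserAgents is filled from this setting. One plausible approach, sketched below under that assumption, is to split the semicolon-separated value once at startup. The SearchBotUserAgents class is hypothetical; in a classic ASP.NET app the raw string would typically come from ConfigurationManager.AppSettings["SearchBot.UserAgents"].

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SearchBotUserAgents
{
    // Splits a semicolon-separated setting value into trimmed, non-empty,
    // case-insensitively distinct tokens. Returns an empty list for null/blank input.
    public static IReadOnlyList<string> Parse(string settingValue)
    {
        if (string.IsNullOrWhiteSpace(settingValue))
            return Array.Empty<string>();

        return settingValue
            .Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(token => token.Trim())
            .Where(token => token.Length > 0)
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

Note that the matching in IsInSearchBotMode is a substring test, so a short token such as "Yeti" or "Slurp" also matches any longer User-Agent that happens to contain it.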

ASP.NET Core Angular - send different SignalR messages based on logged in user

I have an Angular SPA ASP.NET Core app with Identity (IdentityServer4). I use SignalR to push real-time messages to clients.
However, I currently have to "broadcast" messages. All clients receive the same messages regardless of what they require, and then they work out in TypeScript whether they need each message or not.
What I want is to decide which SignalR client should receive a message and with what content. That would make messages shorter and cut out the processing time on clients completely.
I see there is a hub.Clients.User(userId) method, which is what I need. However, it appears that the Identity user ID is not known to SignalR.
If I override public override Task OnConnectedAsync(), the context inside doesn't have any useful information: user, principals and claims are all empty.
How can I find out which IdentityServer4 user is connecting to the hub?
EDIT1: the suggested IUserIdProvider implementation doesn't work; x1, x2 and x3 are all null.
https://learn.microsoft.com/en-us/aspnet/core/signalr/authn-and-authz?view=aspnetcore-5.0#use-claims-to-customize-identity-handling
public string GetUserId(HubConnectionContext connection)
{
    var x1 = connection.User?.FindFirst(ClaimTypes.Email)?.Value;
    var x2 = connection.User?.FindFirst(ClaimTypes.NameIdentifier)?.Value;
    var x3 = connection.User?.FindFirst(ClaimTypes.Name)?.Value;
    ...
EDIT2: I implemented "Identity Server JWT authentication" from https://learn.microsoft.com/en-us/aspnet/core/signalr/authn-and-authz?view=aspnetcore-5.0, but it doesn't work either: accessToken is empty in PostConfigure.
You need to implement IUserIdProvider and register it in the services collection.
Check this question - How to user IUserIdProvider in .NET Core?
There is an obvious solution to it. Here is the sample one can use after creating an asp.net core angular app with identity.
Note that in this particular scenario (Angular with ASP.NET Core with Identity) you do NOT need to implement anything else, contrary to multiple suggestions from people misreading the docs: https://learn.microsoft.com/en-us/aspnet/core/signalr/authn-and-authz?view=aspnetcore-5.0
Client side:
import { AuthorizeService } from '../../api-authorization/authorize.service';
. . .
constructor(. . . , authsrv: AuthorizeService) {
    this.hub = new HubConnectionBuilder()
        .withUrl("/newshub", { accessTokenFactory: () => authsrv.getAccessToken().toPromise() })
        .build();
Server side:
[Authorize]
public class NewsHub : Hub
{
    public static readonly SortedDictionary<string, HubAuthItem> Connected = new SortedDictionary<string, HubAuthItem>();
    public override Task OnConnectedAsync()
    {
        NewsHub.Connected.Add(Context.ConnectionId, new HubAuthItem
        {
            ConnectionId = Context.ConnectionId,
            UserId = Context.User?.FindFirst(ClaimTypes.NameIdentifier)?.Value
        });
        return base.OnConnectedAsync();
    }
}
Use it like this:
if (NewsHub.Connected.Count != 0)
    foreach (var cnn in NewsHub.Connected.Values.Where(i => !string.IsNullOrEmpty(i.UserId)))
        if (CanSendMessage(cnn.UserId))
            hub.Clients.Users(cnn.UserId).SendAsync("servermessage", "message text");
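One caveat with the sample above: the static SortedDictionary is written from concurrent connections and entries are never removed on disconnect. A thread-safe alternative might look like the sketch below. ConnectionRegistry is a hypothetical class, not part of SignalR; it assumes you call Add from OnConnectedAsync and Remove from OnDisconnectedAsync.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

public sealed class ConnectionRegistry
{
    // Maps connectionId -> userId; the userId may be null for anonymous connections.
    private readonly ConcurrentDictionary<string, string> _userByConnection =
        new ConcurrentDictionary<string, string>();

    // Registers (or overwrites) the user for a connection.
    public void Add(string connectionId, string userId)
    {
        _userByConnection[connectionId] = userId;
    }

    // Forgets a connection; returns false if it was not registered.
    public bool Remove(string connectionId)
    {
        return _userByConnection.TryRemove(connectionId, out _);
    }

    // Distinct, non-empty user IDs that currently have at least one connection.
    public IReadOnlyCollection<string> UserIds()
    {
        return _userByConnection.Values
            .Where(id => !string.IsNullOrEmpty(id))
            .Distinct()
            .ToList();
    }
}
```

Iterating over distinct user IDs rather than connections also avoids sending the same message twice to a user with two open tabs, since a user-targeted send already reaches all of that user's connections.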
It transpired that the User data is empty in the SignalR server context because authorization doesn't work as I expected it to. Implementing SignalR authorization with Identity Server seems to be a big deal and a security risk, as it impacts the whole app: you essentially need to manually override a large amount of code that Identity Server already provides, just to satisfy the SignalR case.
So I came up with a workaround, see my answer to myself here:
SignalR authorization not working out of the box in asp.net core angular SPA with Identity Server
EDIT: I missed an obvious solution - see the other answer. This is still valid workaround though, so I am going to let it hang here.

Get more facebook comments using Skybrud

I'm developing an app which connects to a specific public page on Facebook and searches the comments on a specific post. We have to look through all the posts to find specific hashtags, but there's one post with 4000+ comments, and Facebook throws this exception:
'StackTrace: Skybrud.Social.Facebook.Exceptions.FacebookException: Please reduce the amount of data you're asking for, then retry your request
at Skybrud.Social.Facebook.Responses.FacebookResponse.ValidateResponse(SocialHttpResponse response, JsonObject obj)
at Skybrud.Social.Facebook.Responses.Comments.FacebookCommentsResponse.ParseResponse(SocialHttpResponse response)
at Skybrud.Social.Facebook.Endpoints.FacebookCommentsEndpoint.GetComments(String id, FacebookCommentsOptions options)'
As you can see, I'm using Skybrud. I limited the comments to 800 and it works, but there are more comments and I don't know how to retrieve the other pages of comments. Any ideas?
Thanks for your time.
Maybe this could help someone else: use the two members of FacebookCommentsOptions, Before and After. The response from Facebook includes these values in its Paging section. Then call the GetComments method again.
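The "call again with the After value" loop can be sketched generically as below. Nothing here is Skybrud's API: Page<T>, CursorPaging and the fetchPage delegate are hypothetical stand-ins. In real code the delegate would set FacebookCommentsOptions.After, call GetComments with a page size small enough to avoid the "reduce the amount of data" error, and read the next cursor from the response's Paging section.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shape of one page of results; not a Skybrud type.
public sealed class Page<T>
{
    public IReadOnlyList<T> Items { get; set; }
    public string AfterCursor { get; set; } // null or empty when there is no next page
}

public static class CursorPaging
{
    // Calls fetchPage repeatedly, feeding it the "after" cursor of the previous page,
    // until a page is empty, no cursor is returned, or maxPages is reached.
    public static List<T> FetchAll<T>(Func<string, Page<T>> fetchPage, int maxPages = 100)
    {
        var all = new List<T>();
        string after = null; // the first request carries no cursor

        for (int i = 0; i < maxPages; i++)
        {
            Page<T> page = fetchPage(after);
            if (page == null || page.Items == null || page.Items.Count == 0)
                break;

            all.AddRange(page.Items);

            if (string.IsNullOrEmpty(page.AfterCursor))
                break;
            after = page.AfterCursor;
        }
        return all;
    }
}
```

The maxPages cap is a safety net so that a cursor which never terminates cannot loop forever.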

asp.net MVC4: how to detect screen width on the filter Action?

Hi, this is my action filter. I need to detect the size of the screen in order to redirect to the appropriate action. How can I do this?
public sealed class DetectViewFilterAttribute : ActionFilterAttribute
{
    private readonly IRegistrationConfiguration _registrationConfiguration;
    public DetectViewFilterAttribute()
    {
        _registrationConfiguration = DependencyResolver.Current.GetService<IRegistrationConfiguration>();
    }
    public override void OnActionExecuting(ActionExecutingContext filterContext)
    {
        bool isMobile = false;
        // UserAgent can be null, so fall back to an empty string before lower-casing.
        string userAgent = (HttpContext.Current.Request.UserAgent ?? string.Empty).ToLower();
        Regex mobileDetectionRegularExpression = new Regex(_registrationConfiguration.DetectMobileRegularExpression);
        isMobile = mobileDetectionRegularExpression.IsMatch(userAgent);
        if (isMobile)
        {
            String url;
            UrlHelper helper = new UrlHelper(filterContext.RequestContext);
            // TODO: set this from the device width (between 300 and 600 px).
            // This is the part I don't know how to do; the value below is a placeholder.
            bool isPhoneWidth = true;
            if (isPhoneWidth)
                url = helper.Action("Mobile", "Inscription");
            else
                url = helper.Action("Tablette", "Inscription");
            HttpContext.Current.Response.Redirect(url);
        }
        base.OnActionExecuting(filterContext);
    }
}
You cannot know the screen size on the server side. But you can read the user agent and tell whether the device is a tablet, a PC or a smartphone, which gets you closer to deciding which view to display.
As you are using MVC4... it is wise to read this article:
http://www.hanselman.com/blog/MakingASwitchableDesktopAndMobileSiteWithASPNETMVC4AndJQueryMobile.aspx
This way, you don't reinvent the wheel, as this behaviour is built into the ASP.NET MVC framework.
Sadly, you can't have server-side code detect the client's screen width on its own. You'll need something on the client side (probably JavaScript) to send the server the exact size of the client's window. This is a little tough on the user experience, since a page has to load first, then send that information to the server, and only then can the server react. That's why user agent sniffing is used a lot of the time: that information is sent to the server on each request.
And as @Murali said, there are also solutions like CSS Media Queries, which let you apply different CSS rules based on client width without involving the server.
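If the client does report its width (for example in a cookie written by JavaScript from window.innerWidth on the first page load), the server-side decision reduces to a small mapping. The sketch below is an assumption-laden illustration: ViewSelector and the idea of passing the width as a string are hypothetical, while the 300 to 600 px range and the "Mobile" and "Tablette" action names come from the question's filter.

```csharp
using System.Globalization;

public static class ViewSelector
{
    // Maps a client-reported width (as received in a cookie or query string)
    // to an action name. Returns null when the value is missing or not a positive integer,
    // so the caller can fall back to user agent sniffing.
    public static string ActionForWidth(string reportedWidth)
    {
        int width;
        bool parsed = int.TryParse(
            reportedWidth, NumberStyles.Integer, CultureInfo.InvariantCulture, out width);
        if (!parsed || width <= 0)
            return null;

        // The 300-600 px range is taken from the TODO in the question;
        // every other width falls through to "Tablette", as in the original else branch.
        return width >= 300 && width <= 600 ? "Mobile" : "Tablette";
    }
}
```

Because the value comes from the client, it should be treated as a hint only; it is absent on the very first request and can be set to anything.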
Unfortunately on the server, there is no direct way to get the size of the screen. You can only do this on the client using JavaScript.
However, there are files that .NET uses to determine certain mobile capabilities. These are located in C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Config\Browsers. Since screen sizes can differ for the same mobile client, though, they are not defined there.
As the others wrote, you can't do that on the server side.
But if you only need this information for mobile devices, WURFL has it (resolution_width and resolution_height).
Check the specs here

Side by side Basic and Forms Authentication with ASP.NET Web API

Disclaimer: let me start by saying that I am new to MVC4 + Web API + web services in general + jQuery. I might be attacking this from the wrong angle.
I am trying to build a Web MVC App + Web API in C# for .NET 4 to deploy in Azure. The web api will be used by mobile clients (iOS, using RestKit).
The Web MVC App will be relatively simple. We would like to use Forms Authentication for it and SimpleMembership - which we achieved and works fine.
We'll use the Web API methods from JQuery (Knockout) scripts to fill pieces of the web pages. Therefore, we expect the JQuery to use the same identity authenticated by Forms Authentication.
However, the idea is that the Web Api can be called directly by mobile clients. No Forms Authentications for those.
We have been looking at the Thinktecture Identity Model (http://nuget.org/packages/Thinktecture.IdentityModel https://github.com/thinktecture/Thinktecture.IdentityModel.40). We added the BasicAuth and AccessKey handlers to the config and it works (see code below).
When you try to access the webapi without being authenticated the browser displays the basic authentication dialog and works as expected.
The "issue" is that when you ARE already logged in via Forms Authentication and try to call a Web Api method you still get the Basic Authentication dialog. In other words, Thinktecture IdentityModel seems to ignore the Forms Authentication altogether.
My questions are:
Are my expectations correct? That is, once I have done the Forms Authentication, I shouldn't need to do anything else to let the jQuery scripts etc. access the Web API from the same browser user session.
How do I fix it?
If my expectations are not correct, how is this supposed to work? I.e. how do I make the jQuery scripts authenticate?
I know there are tons of similar questions on Stack Overflow. I honestly looked a lot of them up, watched videos, etc., but either I am missing something obvious or there is no clear documentation about this for somebody new to these technologies.
I appreciate the help. Thanks.
public static AuthenticationConfiguration CreateConfiguration()
{
    var config = new AuthenticationConfiguration
    {
        DefaultAuthenticationScheme = "Basic",
        EnableSessionToken = true,
        SetNoRedirectMarker = true
    };
    config.AddBasicAuthentication((userName, password) => userName == password, retainPassword: false);
    config.AddAccessKey(token =>
    {
        if (ObfuscatingComparer.IsEqual(token, "accesskey123"))
        {
            return Principal.Create("Custom",
                new Claim("customerid", "123"),
                new Claim("email", "foo@customer.com"));
        }
        return null;
    }, AuthenticationOptions.ForQueryString("key"));
    return config;
}
Here is a solution to this problem that I came up with earlier.
Note: This solution doesn't involve Thinktecture Identity Model.
I have an abstract BasicAuthenticationHandler class, which is a delegating handler. You can get this handler by installing the latest stable WebAPIDoodle NuGet package.
You can tell this base basic authentication handler to suppress the authentication process if the request has already been authenticated (e.g. by forms auth). The custom handler that you need to register would look like this:
public class MyApplicationAuthHandler : BasicAuthenticationHandler {
    public MyApplicationAuthHandler()
        : base(suppressIfAlreadyAuthenticated: true) { }
    protected override IPrincipal AuthenticateUser(
        HttpRequestMessage request,
        string username,
        string password,
        CancellationToken cancellationToken) {
        // This method will be called only if the request
        // is not authenticated.
        // If you are using forms auth, this won't be called,
        // as you will be authenticated by forms auth before you get here
        // and Thread.CurrentPrincipal will be populated.
        // If you aren't authenticated:
        // do your auth here and send back an IPrincipal
        // instance as I do below.
        var membershipService = (IMembershipService)request
            .GetDependencyScope()
            .GetService(typeof(IMembershipService));
        var validUserCtx = membershipService
            .ValidateUser(username, password);
        return validUserCtx.Principal;
    }
    protected override void HandleUnauthenticatedRequest(UnauthenticatedRequestContext context) {
        // Do nothing here. The authorization
        // will be handled by the AuthorizeAttribute.
    }
}
As a final step, you will need to apply System.Web.Http.AuthorizeAttribute (not System.Web.Mvc.AuthorizeAttribute) to your controllers and action methods to give authorization for the specific roles and users.
I hope this helps to solve your problem.
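For readers wondering where the username and password passed to AuthenticateUser come from: with the Basic scheme, the mobile client sends an Authorization header of the form "Basic " followed by the base64 encoding of user:password. The sketch below shows that decoding step in isolation. It is not WebAPIDoodle's code; BasicAuthHeader is a hypothetical helper, and UTF-8 is assumed for the credential encoding.

```csharp
using System;
using System.Text;

public static class BasicAuthHeader
{
    // Parses "Basic <base64(user:password)>" into its two parts.
    // Returns false for other schemes, invalid base64, or a missing colon.
    public static bool TryParse(string headerValue, out string userName, out string password)
    {
        userName = null;
        password = null;

        const string prefix = "Basic ";
        if (headerValue == null ||
            !headerValue.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            return false;

        string decoded;
        try
        {
            byte[] bytes = Convert.FromBase64String(headerValue.Substring(prefix.Length).Trim());
            decoded = Encoding.UTF8.GetString(bytes); // UTF-8 is an assumption
        }
        catch (FormatException)
        {
            return false;
        }

        // Only the first colon separates the parts; the password may contain colons.
        int colon = decoded.IndexOf(':');
        if (colon < 0)
            return false;

        userName = decoded.Substring(0, colon);
        password = decoded.Substring(colon + 1);
        return true;
    }
}
```

Since base64 is an encoding and not encryption, Basic authentication should only be used over HTTPS.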
