I downloaded Privoxy a few weeks ago and, for fun, I was curious to know how a simple version of it could be built.
I understand that I need to configure the browser (client) to send requests to the proxy. The proxy sends the request to the web (let's say it's an HTTP proxy). The proxy will receive the answer... but how can the proxy send the response back to the browser (client)?
I have searched the web for C# and HTTP proxy but haven't found anything that lets me understand how it works behind the scenes. (I believe I do not want a reverse proxy, but I am not sure.)
Does anyone have an explanation or some information that will let me continue this small project?
Update
This is what I understand (see graphic below).
Step 1: I configure the client (browser) so that all requests are sent to 127.0.0.1 at the port the proxy listens on. This way, requests are not sent to the Internet directly but are processed by the proxy.
Step 2: The proxy sees a new connection, reads the HTTP header and sees the request it must execute. It executes the request.
Step 3: The proxy receives an answer to the request. Now it must send the answer from the web to the client, but how?
Useful links
Mentalis Proxy: I have found this project, which is a proxy (but does more than I would like). I might check the source, but I really wanted something basic to understand the concept better.
ASP Proxy: I might be able to get some information here too.
Request reflector: This is a simple example.
Here is a GitHub repository with a simple HTTP proxy.
I wouldn't use HttpListener or anything like it; that way you'll run into a lot of issues.
Most importantly, it will be a huge pain to support:
Proxy Keep-Alives
SSL (it won't work correctly; you'll get popups)
.NET libraries strictly follow RFCs, which causes some requests to fail (even though IE, FF and any other browser in the world will work)
What you need to do is:
Listen on a TCP port
Parse the browser request
Extract the Host header and connect to that host at the TCP level
Forward everything back and forth, unless you want to add custom headers etc.
I wrote 2 different HTTP proxies in .NET with different requirements, and I can tell you that this is the best way to do it.
Mentalis does this, but their code is "delegate spaghetti", worse than GoTo :)
I have recently written a lightweight proxy in C# .NET using TcpListener and TcpClient.
https://github.com/titanium007/Titanium-Web-Proxy
It supports secure HTTP the correct way: the client machine needs to trust the root certificate used by the proxy. It also supports WebSocket relay. All features of HTTP 1.1 are supported except pipelining, which most modern browsers do not use anyway. It also supports Windows authentication (plain, digest).
You can hook up your application by referencing the project and then see and modify all traffic (requests and responses).
As for performance, I have tested it on my machine and it works without any noticeable delay.
You can build one with the HttpListener class to listen for incoming requests and the HttpWebRequest class to relay the requests.
A proxy can work in the following way.
Step 1: Configure the client to use proxyHost:proxyPort.
The proxy is a TCP server listening on proxyHost:proxyPort.
The browser opens a connection to the proxy and sends an HTTP request.
The proxy parses this request and looks for the "Host" header. This header tells the proxy where to open a connection.
Step 2: The proxy opens a connection to the address specified in the "Host" header, sends the HTTP request to that remote server, and reads the response.
Step 3: After the response is read from the remote HTTP server, the proxy sends it to the browser through the TCP connection the browser opened earlier.
Schematically, it looks like this:
Browser                            Proxy                              HTTP server
  Open TCP connection
  Send HTTP request  ----------->
                                   Read HTTP header
                                   detect Host header
                                   Send request to HTTP  ----------->
                                   server
                                                         <-----------
                                   Read response and send
                     <-----------  it back to the browser
  Render content
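The flow above can be sketched with TcpListener and TcpClient. This is only a minimal illustration under several assumptions, not a complete proxy: it handles plain HTTP only (CONNECT requests are dropped), it keeps one upstream connection per browser connection, and the names TinyProxy, RunAsync, HandleAsync and TryGetHost are made up for this example.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

public static class TinyProxy
{
    // Accepts browser connections on 127.0.0.1:port until the process exits.
    public static async Task RunAsync(int port)
    {
        var listener = new TcpListener(IPAddress.Loopback, port);
        listener.Start();
        while (true)
        {
            TcpClient browser = await listener.AcceptTcpClientAsync();
            _ = HandleAsync(browser); // serve each browser connection independently
        }
    }

    // Finds the Host header in the request head; the port defaults to 80.
    public static bool TryGetHost(string head, out string host, out int port)
    {
        host = null;
        port = 80;
        foreach (string line in head.Split(new[] { "\r\n" }, StringSplitOptions.None))
        {
            if (!line.StartsWith("Host:", StringComparison.OrdinalIgnoreCase)) continue;
            string value = line.Substring(5).Trim();
            int colon = value.LastIndexOf(':');
            if (colon > 0 && int.TryParse(value.Substring(colon + 1), out int p))
            {
                host = value.Substring(0, colon);
                port = p;
            }
            else
            {
                host = value;
            }
            return host.Length > 0;
        }
        return false;
    }

    private static async Task HandleAsync(TcpClient browser)
    {
        try
        {
            using (browser)
            {
                NetworkStream browserStream = browser.GetStream();
                var head = new MemoryStream();
                var buffer = new byte[8192];
                string text = "";
                // Read until the blank line that ends the request headers.
                while (!text.Contains("\r\n\r\n"))
                {
                    int n = await browserStream.ReadAsync(buffer, 0, buffer.Length);
                    if (n == 0) return; // browser closed the connection
                    head.Write(buffer, 0, n);
                    text = Encoding.ASCII.GetString(head.ToArray());
                }
                if (text.StartsWith("CONNECT ", StringComparison.Ordinal)) return; // HTTPS is out of scope here
                if (!TryGetHost(text, out string host, out int port)) return;

                using (var server = new TcpClient())
                {
                    await server.ConnectAsync(host, port);
                    NetworkStream serverStream = server.GetStream();
                    byte[] alreadyRead = head.ToArray();
                    await serverStream.WriteAsync(alreadyRead, 0, alreadyRead.Length);
                    // The response goes back over the connection the browser opened.
                    Task up = browserStream.CopyToAsync(serverStream);
                    Task down = serverStream.CopyToAsync(browserStream);
                    await Task.WhenAny(up, down);
                }
            }
        }
        catch (IOException) { /* one side dropped the connection */ }
        catch (SocketException) { /* upstream host unreachable */ }
    }
}
```

Calling TinyProxy.RunAsync(8888).Wait() from a console Main and pointing the browser's HTTP proxy setting at 127.0.0.1:8888 should be enough to try it; later requests on the same kept-alive connection are forwarded to the first host, which a real proxy would have to handle.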
If you are just looking to intercept the traffic, you could use FiddlerCore to create a proxy...
http://fiddler.wikidot.com/fiddlercore
Run Fiddler first with the UI to see what it does; it is a proxy that allows you to debug HTTP/HTTPS traffic. It is written in C# and has a core which you can build into your own applications.
Keep in mind FiddlerCore is not free for commercial applications.
I agree with dr evil.
If you use HttpListener you will have many problems: you have to parse requests and will get entangled with headers and so on.
Use a TCP listener to listen for browser requests
Parse only the first line of the request and get the host domain and port to connect to
Send the exact raw request to the host found on the first line of the browser request
Receive the data from the target site (I have a problem in this section)
Send the exact data received from the host to the browser
You see, you don't even need to know what is in the browser request or parse it; you only get the target site address from the first line.
The first line usually looks like this:
GET http://google.com/ HTTP/1.1
or
CONNECT facebook.com:443 HTTP/1.1 (this is for SSL requests)
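As a rough illustration of the first-line parsing described above, the sketch below pulls the host and port out of either form of request line. The names RequestLine and TryGetTarget are invented for this example, and it assumes the browser sends an absolute http/https URL for non-CONNECT methods, as browsers normally do when talking to a proxy.

```csharp
using System;

public static class RequestLine
{
    // Returns false when the line is not "METHOD target VERSION" or the target cannot be parsed.
    public static bool TryGetTarget(string firstLine, out string host, out int port)
    {
        host = null;
        port = 0;
        string[] parts = firstLine.Split(' ');
        if (parts.Length < 2) return false;

        if (parts[0] == "CONNECT")
        {
            // CONNECT carries "host:port" rather than a URL.
            int colon = parts[1].LastIndexOf(':');
            if (colon <= 0) return false;
            if (!int.TryParse(parts[1].Substring(colon + 1), out port)) return false;
            host = parts[1].Substring(0, colon);
            return true;
        }

        // Other methods carry an absolute URL when the browser talks to a proxy.
        if (!Uri.TryCreate(parts[1], UriKind.Absolute, out Uri url)) return false;
        if (url.Scheme != Uri.UriSchemeHttp && url.Scheme != Uri.UriSchemeHttps) return false;
        host = url.Host;
        port = url.Port; // the scheme's default port when none is given
        return true;
    }
}
```

A request line with only a path (for example GET /index.html HTTP/1.1) is rejected here; in that case the Host header would have to be consulted instead.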
Things have become really easy with OWIN and WebAPI. In my search for a C# Proxy server, I also came across this post http://blog.kloud.com.au/2013/11/24/do-it-yourself-web-api-proxy/ . This will be the road I'm taking.
SOCKS4 is a very simple protocol to implement. You listen for the initial connection, connect to the host/port that was requested by the client, send the success code to the client, then forward the outgoing and incoming streams across the sockets.
If you go with HTTP you'll have to read and possibly set/remove some HTTP headers, so that's a little more work.
If I remember correctly, SSL will work across HTTP and SOCKS proxies. For an HTTP proxy you implement the CONNECT verb, which works much like SOCKS4 as described above; then the client opens the SSL connection across the proxied TCP stream.
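A hedged sketch of the CONNECT handling described above: once the proxy has read the CONNECT request line and worked out the host and port, it connects upstream, answers with a 200 status line, and then blindly copies bytes in both directions. The names ConnectTunnel and RelayAsync are made up for this example, and error handling (for instance sending a 502 when the upstream connection fails) is left out.

```csharp
using System.IO;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

public static class ConnectTunnel
{
    // client: the stream of the connection the browser opened to the proxy.
    public static async Task RelayAsync(Stream client, string host, int port)
    {
        using (var upstream = new TcpClient())
        {
            await upstream.ConnectAsync(host, port);

            // Tell the browser the tunnel is ready; it starts its TLS handshake after this.
            byte[] ok = Encoding.ASCII.GetBytes("HTTP/1.1 200 Connection Established\r\n\r\n");
            await client.WriteAsync(ok, 0, ok.Length);

            NetworkStream server = upstream.GetStream();
            // Forward both directions until either side closes.
            Task up = client.CopyToAsync(server);
            Task down = server.CopyToAsync(client);
            await Task.WhenAny(up, down);
        }
    }
}
```

Because the proxy never decrypts anything in this mode, it cannot inspect or modify the HTTPS traffic; that requires the root-certificate approach mentioned in another answer.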
For what it's worth, here is a sample async C# implementation based on HttpListener and HttpClient (I use it to connect Chrome on Android devices to IIS Express; that's the only way I found...).
And if you need HTTPS support, it shouldn't require more code, just certificate configuration: HttpListener with HTTPS support
// define http://localhost:5000 and http://127.0.0.1:5000/ to be proxies for http://localhost:53068
using (var server = new ProxyServer("http://localhost:53068", "http://localhost:5000/", "http://127.0.0.1:5000/"))
{
server.Start();
Console.WriteLine("Press ESC to stop server.");
while (true)
{
var key = Console.ReadKey(true);
if (key.Key == ConsoleKey.Escape)
break;
}
server.Stop();
}
....
public class ProxyServer : IDisposable
{
private readonly HttpListener _listener;
private readonly int _targetPort;
private readonly string _targetHost;
private static readonly HttpClient _client = new HttpClient();
public ProxyServer(string targetUrl, params string[] prefixes)
: this(new Uri(targetUrl), prefixes)
{
}
public ProxyServer(Uri targetUrl, params string[] prefixes)
{
if (targetUrl == null)
throw new ArgumentNullException(nameof(targetUrl));
if (prefixes == null)
throw new ArgumentNullException(nameof(prefixes));
if (prefixes.Length == 0)
throw new ArgumentException(null, nameof(prefixes));
RewriteTargetInText = true;
RewriteHost = true;
RewriteReferer = true;
TargetUrl = targetUrl;
_targetHost = targetUrl.Host;
_targetPort = targetUrl.Port;
Prefixes = prefixes;
_listener = new HttpListener();
foreach (var prefix in prefixes)
{
_listener.Prefixes.Add(prefix);
}
}
public Uri TargetUrl { get; }
public string[] Prefixes { get; }
public bool RewriteTargetInText { get; set; }
public bool RewriteHost { get; set; }
public bool RewriteReferer { get; set; } // this can have performance impact...
public void Start()
{
_listener.Start();
_listener.BeginGetContext(ProcessRequest, null);
}
private async void ProcessRequest(IAsyncResult result)
{
if (!_listener.IsListening)
return;
var ctx = _listener.EndGetContext(result);
_listener.BeginGetContext(ProcessRequest, null);
await ProcessRequest(ctx).ConfigureAwait(false);
}
protected virtual async Task ProcessRequest(HttpListenerContext context)
{
if (context == null)
throw new ArgumentNullException(nameof(context));
var url = TargetUrl.GetComponents(UriComponents.SchemeAndServer, UriFormat.Unescaped);
using (var msg = new HttpRequestMessage(new HttpMethod(context.Request.HttpMethod), url + context.Request.RawUrl))
{
msg.Version = context.Request.ProtocolVersion;
if (context.Request.HasEntityBody)
{
msg.Content = new StreamContent(context.Request.InputStream); // disposed with msg
}
string host = null;
foreach (string headerName in context.Request.Headers)
{
var headerValue = context.Request.Headers[headerName];
if (headerName == "Content-Length" && headerValue == "0") // useless plus don't send if we have no entity body
continue;
bool contentHeader = false;
switch (headerName)
{
// some headers go to content...
case "Allow":
case "Content-Disposition":
case "Content-Encoding":
case "Content-Language":
case "Content-Length":
case "Content-Location":
case "Content-MD5":
case "Content-Range":
case "Content-Type":
case "Expires":
case "Last-Modified":
contentHeader = true;
break;
case "Referer":
if (RewriteReferer && Uri.TryCreate(headerValue, UriKind.Absolute, out var referer)) // if relative, don't handle
{
var builder = new UriBuilder(referer);
builder.Host = TargetUrl.Host;
builder.Port = TargetUrl.Port;
headerValue = builder.ToString();
}
break;
case "Host":
host = headerValue;
if (RewriteHost)
{
headerValue = TargetUrl.Host + ":" + TargetUrl.Port;
}
break;
}
if (contentHeader)
{
// requests without an entity body have no Content object, so their content headers are skipped
msg.Content?.Headers.Add(headerName, headerValue);
}
else
{
msg.Headers.Add(headerName, headerValue);
}
}
using (var response = await _client.SendAsync(msg).ConfigureAwait(false))
{
using (var os = context.Response.OutputStream)
{
context.Response.ProtocolVersion = response.Version;
context.Response.StatusCode = (int)response.StatusCode;
context.Response.StatusDescription = response.ReasonPhrase;
foreach (var header in response.Headers)
{
context.Response.Headers.Add(header.Key, string.Join(", ", header.Value));
}
foreach (var header in response.Content.Headers)
{
if (header.Key == "Content-Length") // this will be set automatically at dispose time
continue;
context.Response.Headers.Add(header.Key, string.Join(", ", header.Value));
}
var ct = context.Response.ContentType;
if (RewriteTargetInText && host != null && ct != null &&
(ct.IndexOf("text/html", StringComparison.OrdinalIgnoreCase) >= 0 ||
ct.IndexOf("application/json", StringComparison.OrdinalIgnoreCase) >= 0))
{
using (var ms = new MemoryStream())
{
using (var stream = await response.Content.ReadAsStreamAsync().ConfigureAwait(false))
{
await stream.CopyToAsync(ms).ConfigureAwait(false);
var enc = context.Response.ContentEncoding ?? Encoding.UTF8;
var html = enc.GetString(ms.ToArray());
if (TryReplace(html, "//" + _targetHost + ":" + _targetPort + "/", "//" + host + "/", out var replaced))
{
var bytes = enc.GetBytes(replaced);
using (var ms2 = new MemoryStream(bytes))
{
ms2.Position = 0;
await ms2.CopyToAsync(context.Response.OutputStream).ConfigureAwait(false);
}
}
else
{
ms.Position = 0;
await ms.CopyToAsync(context.Response.OutputStream).ConfigureAwait(false);
}
}
}
}
else
{
using (var stream = await response.Content.ReadAsStreamAsync().ConfigureAwait(false))
{
await stream.CopyToAsync(context.Response.OutputStream).ConfigureAwait(false);
}
}
}
}
}
}
public void Stop() => _listener.Stop();
public override string ToString() => string.Join(", ", Prefixes) + " => " + TargetUrl;
public void Dispose() => ((IDisposable)_listener)?.Dispose();
// out-of-the-box replace doesn't tell if something *was* replaced or not
private static bool TryReplace(string input, string oldValue, string newValue, out string result)
{
// look for an ordinal match first, then let string.Replace do the actual work
if (string.IsNullOrEmpty(input) || string.IsNullOrEmpty(oldValue) || input.IndexOf(oldValue, StringComparison.Ordinal) < 0)
{
result = input;
return false;
}
result = input.Replace(oldValue, newValue);
return true;
}
}
The browser is connected to the proxy so the data that the proxy gets from the web server is just sent via the same connection that the browser initiated to the proxy.
Assume we have an application, written in C# with a WinForms GUI, that wants to access the popular Russian social network VK. VK uses an OAuth2-like approach, so we need to open a web browser with the VK OAuth authorization URL. Then we subscribe to the WebBrowser's Navigated event and wait until the URL equals a pre-defined URL with the access token in the query string.
From then on we can call VK methods using the received access token, but something strange happens here: when I invoke a VK method with HttpClient.GetAsync(methodUri), everything goes according to plan, except that the link from the authorization web browser is opened in the system web browser.
VK's client authorization URL looks like https://oauth.vk.com/authorize?client_id={clientId}&scope={scope}&redirect_uri=https://oauth.vk.com/blank.html&display={displayType}&response_type=token, and the URL with the received access token looks like https://oauth.vk.com/blank.html#access_token={accessToken}&expires_in={expiresIn}&user_id={userId}; note the number sign instead of a question mark.
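Because the values arrive after the number sign, they are in the URL fragment rather than the query string, so they have to be read from Uri.Fragment. The following is only a sketch of that step; FragmentParser and Parse are names invented for this example and are not part of the application above.

```csharp
using System;
using System.Collections.Generic;

public static class FragmentParser
{
    // Splits "#a=1&b=2" into key/value pairs; keys without '=' get an empty value.
    public static Dictionary<string, string> Parse(Uri url)
    {
        var values = new Dictionary<string, string>();
        string fragment = url.Fragment.TrimStart('#');
        foreach (string pair in fragment.Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries))
        {
            int eq = pair.IndexOf('=');
            string key = eq < 0 ? pair : pair.Substring(0, eq);
            string value = eq < 0 ? "" : pair.Substring(eq + 1);
            values[Uri.UnescapeDataString(key)] = Uri.UnescapeDataString(value);
        }
        return values;
    }
}
```

In the Navigated handler, something like FragmentParser.Parse(e.Url)["access_token"] would then yield the token, assuming the redirect URL has the shape shown above.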
code in main form:
var authenticationForm = new AuthenticationForm();
authenticationForm.Show();
_authenticatedUser = await application.ClientAuthenticator.Authenticate(authenticationForm.GetToken);
authenticationForm.Close();
var httpClient = new HttpClient();
var request = "https://api.vk.com/method/users.get.xml?user_ids=1&fields=online";
var response = await httpClient.GetAsync(request);
authenticationForm class code:
public partial class AuthenticationForm : Form
{
    private readonly TaskCompletionSource<VkAccessToken> _tokenCompletitionSource = new TaskCompletionSource<VkAccessToken>();
    private Uri _redirectUri;

    public AuthenticationForm()
    {
        InitializeComponent();
    }

    public async Task<IVkAccessToken> GetToken(Uri authUri, Uri redirectUri)
    {
        authenticationBrowser.Navigate(authUri);
        _redirectUri = redirectUri;
        var token = await _tokenCompletitionSource.Task;
        return token;
    }

    private async void authenticationBrowser_Navigated(object sender, WebBrowserNavigatedEventArgs e)
    {
        if (!(_redirectUri.IsBaseOf(e.Url) && _redirectUri.AbsolutePath.Equals(e.Url.AbsolutePath))) return;
        //working with e.Url to achieve token, userId and expiresIn, creating token variable based on them
        _tokenCompletitionSource.SetResult(token);
    }
}
ClientAuthenticator.Authenticate code:
public async Task<IVkAuthenticatedUser> Authenticate(Func<Uri, Uri, Task<IVkAuthenticatedUser>> aunthenticationResultGetter)
{
    var authorizationUri =
        new Uri("https://oauth.vk.com/authorize?client_id={clientId}&scope={scope}&redirect_uri=https://oauth.vk.com/blank.html&display=page&response_type=token");
    var token = await aunthenticationResultGetter(authorizationUri, _application.Settings.RedirectUri);
    //...
    return newUserBasedOnToken;
}
After stepping over (using the debugger) the line var response = await httpClient.GetAsync(request); in the main form, my system browser opens a link like https://oauth.vk.com/blank.html#access_token={accessToken}&expires_in={expiresIn}&user_id={userId} - #access_token={accessToken}&expires_in={expiresIn}&user_id={userId} with the recent accessToken, expiresIn and userId values. Yes, with ... - #access_token=.... in the URL.
I have no idea why this might happen, but I am concerned about the number sign.
Important addition: it only happens if the web browser has no information about a session or the session has expired, that is, when I have to enter a username and password into VK's login form. If the cookies contain the necessary information and it automatically redirects to the page containing the token in its URL (with the # sign again), everything works as expected.
I'm trying to use the Google+ API to access info for the authenticated user. I've copied some code from one of the samples, which works fine (below), however I'm having trouble making it work in a way I can reuse the token across app-launches.
I tried capturing the "RefreshToken" property and using provider.RefreshToken() (amongst other things) and always get a 400 Bad Request response.
Does anyone know how to make this work, or know where I can find some samples? The Google Code site doesn't seem to cover this :-(
class Program
{
    private const string Scope = "https://www.googleapis.com/auth/plus.me";

    static void Main(string[] args)
    {
        var provider = new NativeApplicationClient(GoogleAuthenticationServer.Description);
        provider.ClientIdentifier = "BLAH";
        provider.ClientSecret = "BLAH";
        var auth = new OAuth2Authenticator<NativeApplicationClient>(provider, GetAuthentication);
        var plus = new PlusService(auth);
        plus.Key = "BLAH";
        var me = plus.People.Get("me").Fetch();
        Console.WriteLine(me.DisplayName);
    }

    private static IAuthorizationState GetAuthentication(NativeApplicationClient arg)
    {
        // Get the auth URL:
        IAuthorizationState state = new AuthorizationState(new[] { Scope });
        state.Callback = new Uri(NativeApplicationClient.OutOfBandCallbackUrl);
        Uri authUri = arg.RequestUserAuthorization(state);

        // Request authorization from the user (by opening a browser window):
        Process.Start(authUri.ToString());
        Console.Write(" Authorization Code: ");
        string authCode = Console.ReadLine();
        Console.WriteLine();

        // Retrieve the access token by using the authorization code:
        return arg.ProcessUserAuthorization(authCode, state);
    }
}
Here is an example. Make sure you add a string setting called RefreshToken and reference System.Security or find another way to safely store the refresh token.
// requires: using System.Security.Cryptography; using System.Text;
private static byte[] aditionalEntropy = { 1, 2, 3, 4, 5 };

private static IAuthorizationState GetAuthorization(NativeApplicationClient arg)
{
    // Get the auth URL:
    IAuthorizationState state = new AuthorizationState(new[] { PlusService.Scopes.PlusMe.GetStringValue() });
    state.Callback = new Uri(NativeApplicationClient.OutOfBandCallbackUrl);

    // Try the stored refresh token first, so the user is only asked once.
    string refreshToken = LoadRefreshToken();
    if (!String.IsNullOrWhiteSpace(refreshToken))
    {
        state.RefreshToken = refreshToken;
        if (arg.RefreshToken(state))
            return state;
    }

    Uri authUri = arg.RequestUserAuthorization(state);

    // Request authorization from the user (by opening a browser window):
    Process.Start(authUri.ToString());
    Console.Write(" Authorization Code: ");
    string authCode = Console.ReadLine();
    Console.WriteLine();

    // Retrieve the access token by using the authorization code:
    var result = arg.ProcessUserAuthorization(authCode, state);
    StoreRefreshToken(state);
    return result;
}

private static string LoadRefreshToken()
{
    string stored = Properties.Settings.Default.RefreshToken;
    if (string.IsNullOrEmpty(stored))
        return null; // nothing has been saved yet (first run)
    return Encoding.Unicode.GetString(ProtectedData.Unprotect(Convert.FromBase64String(stored), aditionalEntropy, DataProtectionScope.CurrentUser));
}

private static void StoreRefreshToken(IAuthorizationState state)
{
    Properties.Settings.Default.RefreshToken = Convert.ToBase64String(ProtectedData.Protect(Encoding.Unicode.GetBytes(state.RefreshToken), aditionalEntropy, DataProtectionScope.CurrentUser));
    Properties.Settings.Default.Save();
}
The general idea is as follows:
You redirect the user to Google's Authorization Endpoint.
You obtain a short-lived Authorization Code.
You immediately exchange the Authorization Code for a longer-lived Access Token using Google's Token Endpoint. The Access Token comes with an expiry date and a Refresh Token.
You make requests to Google's API using the Access Token.
You can reuse the Access Token for as many requests as you like until it expires. Then you can use the Refresh Token to request a new Access Token (which comes with a new expiry date and, depending on the server, possibly a new Refresh Token).
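The reuse-until-expiry step can be sketched as a small cache. This is an illustration only: TokenResponse and AccessTokenCache are hypothetical types, not part of DotNetOpenAuth or the Google client library, and the actual call to the token endpoint is passed in as a delegate so that no particular HTTP API is assumed.

```csharp
using System;

// Hypothetical shape of a token endpoint reply; the property names are illustrative.
public sealed class TokenResponse
{
    public string AccessToken { get; set; }
    public TimeSpan Lifetime { get; set; }
    public string RefreshToken { get; set; } // may be null when the server keeps the old one
}

public sealed class AccessTokenCache
{
    private readonly Func<string, TokenResponse> _refresh;
    private string _accessToken;
    private DateTime _expiresUtc;

    // refresh: takes the current refresh token and returns the server's reply.
    public AccessTokenCache(string refreshToken, Func<string, TokenResponse> refresh)
    {
        RefreshToken = refreshToken ?? throw new ArgumentNullException(nameof(refreshToken));
        _refresh = refresh ?? throw new ArgumentNullException(nameof(refresh));
    }

    // The refresh token to persist between application launches.
    public string RefreshToken { get; private set; }

    // Returns the cached access token, refreshing it first when missing or expired.
    public string GetAccessToken(DateTime nowUtc)
    {
        if (_accessToken == null || nowUtc >= _expiresUtc)
        {
            TokenResponse fresh = _refresh(RefreshToken);
            _accessToken = fresh.AccessToken;
            _expiresUtc = nowUtc + fresh.Lifetime;
            if (!string.IsNullOrEmpty(fresh.RefreshToken))
                RefreshToken = fresh.RefreshToken;
        }
        return _accessToken;
    }
}
```

Passing the current time in as a parameter keeps the class easy to test; production code would typically call GetAccessToken(DateTime.UtcNow) and refresh a little before the real expiry to allow for clock skew.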
See also:
The OAuth 2.0 Authorization Protocol
Google's OAuth 2.0 documentation
I also had problems getting "offline" authentication to work (i.e. acquiring authentication with a refresh token), and got an HTTP 400 Bad Request response with code similar to the OP's. However, I got it to work with the line client.ClientCredentialApplicator = ClientCredentialApplicator.PostParameter(this.clientSecret); in the Authenticate method. This is essential to get working code -- I think this line forces the clientSecret to be sent as a POST parameter to the server (instead of as an HTTP Basic Auth parameter).
This solution assumes that you already have a client ID, a client secret and a refresh token. Note that you don't need to enter an access token in the code. (A short-lived access token is acquired "under the hood" from the Google server when the long-lived refresh token is sent with the line client.RefreshAuthorization(state);. This access token is stored as part of the auth variable, from where it is used to authorize the API calls "under the hood".)
A code example that works for me with Google API v3 for accessing my Google Calendar:
class SomeClass
{
    private string clientID = "XXXXXXXXX.apps.googleusercontent.com";
    private string clientSecret = "MY_CLIENT_SECRET";
    private string refreshToken = "MY_REFRESH_TOKEN";
    private string primaryCal = "MY_GMAIL_ADDRESS";

    private void button2_Click_1(object sender, EventArgs e)
    {
        try
        {
            NativeApplicationClient client = new NativeApplicationClient(GoogleAuthenticationServer.Description, this.clientID, this.clientSecret);
            OAuth2Authenticator<NativeApplicationClient> auth = new OAuth2Authenticator<NativeApplicationClient>(client, Authenticate);
            // Authenticated and ready for API calls...

            // EITHER Calendar API calls (tested):
            CalendarService cal = new CalendarService(auth);
            EventsResource.ListRequest listrequest = cal.Events.List(this.primaryCal);
            Google.Apis.Calendar.v3.Data.Events events = listrequest.Fetch();
            // iterate the events and show them here.

            // OR Plus API calls (not tested) - copied from OP's code:
            var plus = new PlusService(auth);
            plus.Key = "BLAH"; // don't know what this line does.
            var me = plus.People.Get("me").Fetch();
            Console.WriteLine(me.DisplayName);

            // OR some other API calls...
        }
        catch (Exception ex)
        {
            // InnerException is null for many exceptions, so guard before reading its message
            string inner = ex.InnerException != null ? ex.InnerException.Message : "(none)";
            Console.WriteLine("Error while communicating with Google servers. Try again(?). The error was:\r\n" + ex.Message + "\r\n\r\nInner exception:\r\n" + inner);
        }
    }

    private IAuthorizationState Authenticate(NativeApplicationClient client)
    {
        IAuthorizationState state = new AuthorizationState(new string[] { }) { RefreshToken = this.refreshToken };

        // IMPORTANT - does not work without:
        client.ClientCredentialApplicator = ClientCredentialApplicator.PostParameter(this.clientSecret);

        client.RefreshAuthorization(state);
        return state;
    }
}
The OAuth 2.0 spec is not yet finished, and there is a smattering of spec implementations out there across the various clients and services that cause these errors to appear. Most likely you're doing everything right, but the DotNetOpenAuth version you're using implements a different draft of OAuth 2.0 than Google is currently implementing. Neither party is "right", since the spec isn't yet finalized, but it makes compatibility something of a nightmare.
You can check that the DotNetOpenAuth version you're using is the latest (in case that helps, which it might), but ultimately you may need to either sit tight until the specs are finalized and everyone implements them correctly, or read the Google docs yourself (which presumably describe their version of OAuth 2.0) and implement one that specifically targets their draft version.
I would recommend looking at the "SampleHelper" project in the Samples solution of the Google .NET Client API:
Samples/SampleHelper/AuthorizationMgr.cs
This file shows both how to use Windows Protected Data to store a Refresh token, and it also shows how to use a Local Loopback Server and different techniques to capture the Access code instead of having the user enter it manually.
One of the samples in the library which use this method of authorization can be found below:
Samples/Tasks.CreateTasks/Program.cs
I am trying to automate logging into Photobucket for API use, for a project that requires automated photo downloading using stored credentials.
The API generates a URL to use for logging in, and using Firebug I can see what requests and responses are being sent and received.
My question is, how can I use HttpWebRequest and HttpWebResponse in C# to mimic what happens in the browser?
Would it be possible to use a web browser component inside a C# app, populate the username and password fields and submit the login?
I've done this kind of thing before, and ended up with a nice toolkit for writing these types of applications. I've used this toolkit to handle non-trivial back-and-forth web requests, so it's entirely possible, and not extremely difficult.
I found out quickly that doing the HttpWebRequest/HttpWebResponse from scratch really was lower-level than I wanted to be dealing with. My tools are based entirely around the HtmlAgilityPack by Simon Mourier. It's an excellent toolset. It does a lot of the heavy lifting for you, and makes parsing of the fetched HTML really easy. If you can rock XPath queries, the HtmlAgilityPack is where you want to start. It handles poorly formed HTML quite well too!
You still need a good tool to help debug. Besides what you have in your debugger, being able to inspect the HTTP/HTTPS traffic as it goes back and forth across the wire is priceless. Since your code is going to be making these requests, not your browser, Firebug isn't going to be of much help debugging your code. There are all sorts of packet sniffer tools, but for HTTP/HTTPS debugging, I don't think you can beat the ease of use and power of Fiddler 2. The newest version even comes with a plugin for Firefox to quickly divert requests through Fiddler and back. Because it can also act as a seamless HTTPS proxy, you can inspect your HTTPS traffic as well.
Give 'em a try, I'm sure they'll be two indispensable tools in your hacking.
Update: Added the code example below. This is pulled from a not-much-larger "Session" class that logs into a website and keeps hold of the related cookies for you. I chose this because it does more than a simple 'please fetch that web page for me' snippet, plus it has a line or two of XPath querying against the final destination page.
public bool Connect() {
    if (string.IsNullOrEmpty(_Username)) { base.ThrowHelper(new SessionException("Username not specified.")); }
    if (string.IsNullOrEmpty(_Password)) { base.ThrowHelper(new SessionException("Password not specified.")); }

    _Cookies = new CookieContainer();
    HtmlWeb webFetcher = new HtmlWeb();
    webFetcher.UsingCache = false;
    webFetcher.UseCookies = true;

    HtmlWeb.PreRequestHandler justSetCookies = delegate(HttpWebRequest webRequest) {
        SetRequestHeaders(webRequest, false);
        return true;
    };

    HtmlWeb.PreRequestHandler postLoginInformation = delegate(HttpWebRequest webRequest) {
        SetRequestHeaders(webRequest, false);

        // before we let webGrabber get the response from the server, we must POST the login form's data
        // This posted form data is *VERY* specific to the web site in question, and it must be exactly right,
        // and exactly what the remote server is expecting, otherwise it will not work!
        //
        // You need to use an HTTP proxy/debugger such as Fiddler in order to adequately inspect the
        // posted form data.
        ASCIIEncoding encoding = new ASCIIEncoding();
        string postDataString = string.Format("edit%5Bname%5D={0}&edit%5Bpass%5D={1}&edit%5Bform_id%5D=user_login&op=Log+in", _Username, _Password);
        byte[] postData = encoding.GetBytes(postDataString);
        webRequest.ContentType = "application/x-www-form-urlencoded";
        webRequest.ContentLength = postData.Length;
        webRequest.Referer = Util.MakeUrlCore("/user"); // builds a proper-for-this-website referer string

        using (Stream postStream = webRequest.GetRequestStream()) {
            postStream.Write(postData, 0, postData.Length);
            postStream.Close();
        }

        return true;
    };

    string loginUrl = Util.GetUrlCore(ProjectUrl.Login);
    bool atEndOfRedirects = false;
    string method = "POST";
    webFetcher.PreRequest = postLoginInformation;

    // this is trimmed...this was trimmed in order to handle one of those 'interesting'
    // login processes...
    webFetcher.PostResponse = delegate(HttpWebRequest webRequest, HttpWebResponse response) {
        if (response.StatusCode == HttpStatusCode.Found) {
            // the login process is forwarding us on...update the URL to move to...
            loginUrl = response.Headers["Location"] as String;
            method = "GET";
            webFetcher.PreRequest = justSetCookies; // we only need to post cookies now, not all the login info
        } else {
            atEndOfRedirects = true;
        }

        foreach (Cookie cookie in response.Cookies) {
            // *snip*
        }
    };

    // Real work starts here:
    HtmlDocument retrievedDocument = null;
    while (!atEndOfRedirects) {
        retrievedDocument = webFetcher.Load(loginUrl, method);
    }

    // ok, we're fully logged in. Check the returned HTML to see if we're sitting at an error page, or
    // if we're successfully logged in.
    if (retrievedDocument != null) {
        HtmlNode errorNode = retrievedDocument.DocumentNode.SelectSingleNode("//div[contains(@class, 'error')]");
        if (errorNode != null) { return false; }
    }

    return true;
}

public void SetRequestHeaders(HttpWebRequest webRequest) { SetRequestHeaders(webRequest, true); }

public void SetRequestHeaders(HttpWebRequest webRequest, bool allowAutoRedirect) {
    try {
        webRequest.AllowAutoRedirect = allowAutoRedirect;
        webRequest.CookieContainer = _Cookies;

        // the rest of this stuff is just to try and make our request *look* like FireFox.
        webRequest.UserAgent = @"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3";
        webRequest.Accept = @"text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
        webRequest.KeepAlive = true;
        webRequest.Headers.Add(@"Accept-Language: en-us,en;q=0.5");
        //webRequest.Headers.Add(@"Accept-Encoding: gzip,deflate");
    }
    catch (Exception ex) { base.ThrowHelper(ex); }
}
Here is how I solved it:
public partial class Form1 : Form {
    private string LoginUrl = "/apilogin/login";
    private string authorizeUrl = "/apilogin/authorize";
    private string doneUrl = "/apilogin/done";

    public Form1() {
        InitializeComponent();
        this.Load += new EventHandler(Form1_Load);
    }

    void Form1_Load(object sender, EventArgs e) {
        PhotobucketNet.Photobucket pb = new Photobucket("pubkey","privatekey");
        string url = pb.GenerateUserLoginUrl();
        webBrowser1.Url = new Uri(url);
        webBrowser1.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser1_DocumentCompleted);
    }

    void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e) {
        if (e.Url.AbsolutePath.StartsWith(LoginUrl))
        {
            webBrowser1.Document.GetElementById("usernameemail").SetAttribute("Value","some username");
            webBrowser1.Document.GetElementById("password").SetAttribute("Value","some password");
            webBrowser1.Document.GetElementById("login").InvokeMember("click");
        }
        if (e.Url.AbsolutePath.StartsWith(authorizeUrl))
        {
            webBrowser1.Document.GetElementById("allow").InvokeMember("click");
        }
        if (e.Url.AbsolutePath.StartsWith(doneUrl))
        {
            string token = webBrowser1.Document.GetElementById("oauth_token").GetAttribute("value");
        }
    }
}
The token captured in the last if block is what is needed to continue using the API. This method works fine for me because the code that needs it will be running on Windows, so I have no problem spawning a process to load this separate app and extract the token.
It is possible to use the native WebBrowser control to log in to websites. But as you can see in the example, you'll have to identify the names of the input elements first.
private void webBrowserLogin_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    if (webBrowserLogin.Url.ToString() == WebSiteUrl)
    {
        foreach (HtmlElement elem in webBrowserLogin.Document.All)
        {
            if (elem.Name == "user_name") // name of the username input
            {
                elem.InnerText = UserName;
            }
            if (elem.Name == "password") // name of the password input
            {
                elem.InnerText = Password;
            }
        }
        foreach (HtmlElement elem in webBrowserLogin.Document.All)
        {
            if (elem.GetAttribute("value") == "Login")
            {
                elem.InvokeMember("Click");
            }
        }
    }
}
Check out Rohit's BrowserSession class, which he's described here (and part 2 here). It is based on HtmlAgilityPack but does some of the boring work of populating POST data from a FORM.
I downloaded Privoxy a few weeks ago and, for fun, I was curious to know how a simple version of it could be built.
I understand that I need to configure the browser (client) to send requests to the proxy. The proxy sends the request to the web (let's say it's an HTTP proxy). The proxy will receive the answer... but how can the proxy send the response back to the browser (client)?
I have searched the web for C# and HTTP proxy but haven't found anything that lets me understand how it works behind the scenes. (I believe I do not want a reverse proxy, but I am not sure.)
Does anyone have an explanation or some information that will let me continue this small project?
Update
This is what I understand (see graphic below).
Step 1: I configure the client (browser) so that all requests are sent to 127.0.0.1 on the port the proxy listens on. This way, requests are not sent to the Internet directly but are processed by the proxy.
Step 2: The proxy sees a new connection, reads the HTTP header and works out the request it must execute. It executes the request.
Step 3: The proxy receives an answer to the request. Now it must send the answer from the web to the client, but how?
Useful links
Mentalis Proxy: I found this project, which is a proxy (but does more than I would like). I might check the source, but I really wanted something basic to better understand the concept.
ASP Proxy: I might be able to get some information here too.
Request reflector: This is a simple example.
Here is a GitHub repository with a simple HTTP proxy.
I wouldn't use HttpListener or something like that; that way you'll come across so many issues.
Most importantly, it'll be a huge pain to support:
Proxy Keep-Alives
SSL won't work (in a correct way; you'll get popups)
.NET libraries strictly follow RFCs, which causes some requests to fail (even though IE, FF and any other browser in the world will work).
What you need to do is:
Listen on a TCP port
Parse the browser request
Extract the Host and connect to that host at the TCP level
Forward everything back and forth unless you want to add custom headers etc.
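To illustrate the "parse the browser request, extract the Host" step, here is a minimal sketch. The class and method names (`HostHeaderParser`, `TryGetHost`) are made up for this example, and it assumes the whole request head has already been read into a string with CRLF line endings. Falling back to port 80 is also an assumption that only makes sense for plain HTTP.

```csharp
using System;

public static class HostHeaderParser
{
    // Hypothetical helper: finds the Host header in a raw request head.
    // The port defaults to 80 when the header does not name one.
    public static bool TryGetHost(string requestHead, out string host, out int port)
    {
        host = null;
        port = 80;
        // Lines are separated by CRLF; line 0 is the request line, so start at 1.
        var lines = requestHead.Split(new[] { "\r\n" }, StringSplitOptions.None);
        for (int i = 1; i < lines.Length; i++)
        {
            var line = lines[i];
            if (line.Length == 0)
                break; // a blank line ends the header section
            int colon = line.IndexOf(':');
            if (colon <= 0)
                continue;
            var name = line.Substring(0, colon).Trim();
            if (!name.Equals("Host", StringComparison.OrdinalIgnoreCase))
                continue;
            var value = line.Substring(colon + 1).Trim();
            int portSep = value.LastIndexOf(':');
            if (portSep > 0 && int.TryParse(value.Substring(portSep + 1), out var parsed))
            {
                host = value.Substring(0, portSep);
                port = parsed;
            }
            else
            {
                host = value;
            }
            return host.Length > 0;
        }
        return false;
    }
}
```

With the host and port in hand, the proxy can open a TcpClient to that endpoint and forward the raw bytes in both directions.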
I wrote 2 different HTTP proxies in .NET with different requirements and I can tell you that this is the best way to do it.
Mentalis does this, but their code is "delegate spaghetti", worse than GoTo :)
I have recently written a lightweight proxy in C# .NET using TcpListener and TcpClient.
https://github.com/titanium007/Titanium-Web-Proxy
It supports HTTPS the correct way: the client machine needs to trust the root certificate used by the proxy. It also supports WebSocket relay. All features of HTTP 1.1 are supported except pipelining, which most modern browsers do not use anyway. It also supports Windows authentication (plain, digest).
You can hook up your application by referencing the project and then see and modify all traffic (requests and responses).
As far as performance goes, I have tested it on my machine and it works without any noticeable delay.
You can build one with the HttpListener class to listen for incoming requests and the HttpWebRequest class to relay the requests.
A proxy can work in the following way.
Step 1: Configure the client to use proxyHost:proxyPort.
The proxy is a TCP server listening on proxyHost:proxyPort.
The browser opens a connection to the proxy and sends an HTTP request.
The proxy parses this request and looks for the "Host" header. This header tells the proxy where to open a connection.
Step 2: The proxy opens a connection to the address specified in the "Host" header, sends the HTTP request to that remote server, and reads the response.
Step 3: After the response is read from the remote HTTP server, the proxy sends it to the browser through the TCP connection the browser opened earlier.
Schematically it will look like this:
Browser                            Proxy                        HTTP server
  Open TCP connection
  Send HTTP request  ----------->
                                   Read HTTP header
                                   detect Host header
                                   Send request to HTTP ----------->
                                   Server
                                                        <-----------
                                   Read response and send
                     <-----------  it back to the browser
  Render content
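Steps 2 and 3 can be sketched roughly as follows. This is only an illustration under some assumptions: the names `Relay`, `PumpAsync` and `ForwardAsync` are invented for the example, the request bytes are assumed to have been read from the browser already, and the remote server is assumed to close its side when the response is complete (for example because the request carries `Connection: close`). A real proxy has to handle keep-alive and message framing instead.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Threading.Tasks;

public static class Relay
{
    // Copies everything readable from 'source' into 'destination'
    // and returns the number of bytes copied.
    public static async Task<long> PumpAsync(Stream source, Stream destination)
    {
        var buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = await source.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false)) > 0)
        {
            await destination.WriteAsync(buffer, 0, read).ConfigureAwait(false);
            total += read;
        }
        await destination.FlushAsync().ConfigureAwait(false);
        return total;
    }

    // Step 2: connect to the remote server and send it the raw request.
    // Step 3: stream the response back over the browser's own connection.
    public static async Task ForwardAsync(Stream browser, byte[] rawRequest, string host, int port)
    {
        using (var remote = new TcpClient())
        {
            await remote.ConnectAsync(host, port).ConfigureAwait(false);
            using (var remoteStream = remote.GetStream())
            {
                await remoteStream.WriteAsync(rawRequest, 0, rawRequest.Length).ConfigureAwait(false);
                await PumpAsync(remoteStream, browser).ConfigureAwait(false);
            }
        }
    }
}
```

Here `browser` would be the NetworkStream of the TcpClient returned by TcpListener.AcceptTcpClient, which is why no second connection back to the browser is needed.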
If you are just looking to intercept the traffic, you could use FiddlerCore to create a proxy...
http://fiddler.wikidot.com/fiddlercore
Run Fiddler first with the UI to see what it does. It is a proxy that allows you to debug HTTP/HTTPS traffic. It is written in C# and has a core which you can build into your own applications.
Keep in mind FiddlerCore is not free for commercial applications.
I agree with dr evil.
If you use HttpListener you will have many problems: you have to parse requests and get tangled up in headers and so on.
Use a TCP listener to listen for browser requests
Parse only the first line of the request to get the host domain and port to connect to
Send the exact raw request to the host found on the first line of the browser request
Receive the data from the target site (I have a problem in this section)
Send the exact data received from the host to the browser
As you can see, you don't even need to know what is in the browser request or parse it; you only get the target site address from the first line.
The first line usually looks like this:
GET http://google.com HTTP/1.1
or
CONNECT facebook.com:443 (this is for SSL requests)
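A small sketch of pulling the target out of that first line might look like the following. `RequestLineParser` and `TryGetTarget` are hypothetical names, and the code only handles the two shapes shown above (an absolute http/https URI, or `host:port` for CONNECT).

```csharp
using System;

public static class RequestLineParser
{
    // Hypothetical helper: extracts the target host and port from the
    // first line of a request sent to a proxy.
    public static bool TryGetTarget(string requestLine, out string host, out int port)
    {
        host = null;
        port = 0;
        var parts = requestLine.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length < 2)
            return false;
        var method = parts[0];
        var target = parts[1];
        if (method == "CONNECT")
        {
            // CONNECT targets are "host:port" with no scheme.
            int sep = target.LastIndexOf(':');
            if (sep <= 0 || !int.TryParse(target.Substring(sep + 1), out port))
                return false;
            host = target.Substring(0, sep);
            return true;
        }
        // Other methods sent to a proxy carry an absolute URI.
        if (Uri.TryCreate(target, UriKind.Absolute, out var uri) &&
            (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps))
        {
            host = uri.Host;
            port = uri.Port; // the scheme's default port when none is given
            return true;
        }
        return false;
    }
}
```

A request line with only a relative path (as a browser sends when it is not configured to use a proxy) is rejected here; in that case the Host header would have to be used instead.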
Things have become really easy with OWIN and WebAPI. In my search for a C# proxy server, I also came across this post: http://blog.kloud.com.au/2013/11/24/do-it-yourself-web-api-proxy/
This will be the road I'm taking.
SOCKS4 is a very simple protocol to implement. You listen for the initial connection, connect to the host/port that was requested by the client, send the success code to the client, then forward the outgoing and incoming streams across the sockets.
If you go with HTTP you'll have to read and possibly set/remove some HTTP headers, so that's a little more work.
If I remember correctly, SSL will work across HTTP and SOCKS proxies. For an HTTP proxy you implement the CONNECT verb, which works much like SOCKS4 as described above; the client then opens the SSL connection across the proxied TCP stream.
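As a rough sketch of the SOCKS4 handshake described above: the client's CONNECT request is, as far as I understand the protocol description, a version byte (4), a command byte (1 for CONNECT), a two-byte big-endian port, a four-byte IPv4 address, and a NUL-terminated user id. The reply is eight bytes whose second byte is 90 for "granted" or 91 for "rejected". The `Socks4` class and its method names are invented for this example.

```csharp
using System;
using System.Net;

public static class Socks4
{
    public const byte Granted = 90;
    public const byte Rejected = 91;

    // Parses a SOCKS4 CONNECT request:
    // VN(1)=4, CD(1)=1, DSTPORT(2, big-endian), DSTIP(4), USERID..., NUL.
    public static bool TryParseConnect(byte[] request, out IPAddress address, out int port)
    {
        address = null;
        port = 0;
        if (request == null || request.Length < 9) // 8 fixed bytes + terminating NUL
            return false;
        if (request[0] != 4 || request[1] != 1)
            return false;
        if (Array.IndexOf(request, (byte)0, 8) < 0) // user id must be NUL-terminated
            return false;
        port = (request[2] << 8) | request[3];
        address = new IPAddress(new[] { request[4], request[5], request[6], request[7] });
        return true;
    }

    // Builds the 8-byte reply; the port and address fields are left as zeros.
    public static byte[] BuildReply(byte code)
    {
        return new byte[] { 0, code, 0, 0, 0, 0, 0, 0 };
    }
}
```

After sending `BuildReply(Socks4.Granted)`, the proxy simply copies bytes in both directions between the client socket and the target socket.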
For what it's worth, here is a sample async C# implementation based on HttpListener and HttpClient (I use it to connect Chrome on Android devices to IIS Express; that's the only way I found...).
And if you need HTTPS support, it shouldn't require more code, just certificate configuration: HttpListener with HTTPS support
// define http://localhost:5000 and http://127.0.0.1:5000/ to be proxies for http://localhost:53068
using (var server = new ProxyServer("http://localhost:53068", "http://localhost:5000/", "http://127.0.0.1:5000/"))
{
    server.Start();
    Console.WriteLine("Press ESC to stop server.");
    while (true)
    {
        var key = Console.ReadKey(true);
        if (key.Key == ConsoleKey.Escape)
            break;
    }
    server.Stop();
}
....
// namespaces used by ProxyServer
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public class ProxyServer : IDisposable
{
    private readonly HttpListener _listener;
    private readonly int _targetPort;
    private readonly string _targetHost;
    private static readonly HttpClient _client = new HttpClient();
    public ProxyServer(string targetUrl, params string[] prefixes)
        : this(new Uri(targetUrl), prefixes)
    {
    }
    public ProxyServer(Uri targetUrl, params string[] prefixes)
    {
        if (targetUrl == null)
            throw new ArgumentNullException(nameof(targetUrl));
        if (prefixes == null)
            throw new ArgumentNullException(nameof(prefixes));
        if (prefixes.Length == 0)
            throw new ArgumentException(null, nameof(prefixes));
        RewriteTargetInText = true;
        RewriteHost = true;
        RewriteReferer = true;
        TargetUrl = targetUrl;
        _targetHost = targetUrl.Host;
        _targetPort = targetUrl.Port;
        Prefixes = prefixes;
        _listener = new HttpListener();
        foreach (var prefix in prefixes)
        {
            _listener.Prefixes.Add(prefix);
        }
    }
    public Uri TargetUrl { get; }
    public string[] Prefixes { get; }
    public bool RewriteTargetInText { get; set; }
    public bool RewriteHost { get; set; }
    public bool RewriteReferer { get; set; } // this can have performance impact...
    public void Start()
    {
        _listener.Start();
        _listener.BeginGetContext(ProcessRequest, null);
    }
    private async void ProcessRequest(IAsyncResult result)
    {
        if (!_listener.IsListening)
            return;
        var ctx = _listener.EndGetContext(result);
        _listener.BeginGetContext(ProcessRequest, null);
        await ProcessRequest(ctx).ConfigureAwait(false);
    }
    protected virtual async Task ProcessRequest(HttpListenerContext context)
    {
        if (context == null)
            throw new ArgumentNullException(nameof(context));
        var url = TargetUrl.GetComponents(UriComponents.SchemeAndServer, UriFormat.Unescaped);
        using (var msg = new HttpRequestMessage(new HttpMethod(context.Request.HttpMethod), url + context.Request.RawUrl))
        {
            msg.Version = context.Request.ProtocolVersion;
            if (context.Request.HasEntityBody)
            {
                msg.Content = new StreamContent(context.Request.InputStream); // disposed with msg
            }
            string host = null;
            foreach (string headerName in context.Request.Headers)
            {
                var headerValue = context.Request.Headers[headerName];
                if (headerName == "Content-Length" && headerValue == "0") // useless plus don't send if we have no entity body
                    continue;
                bool contentHeader = false;
                switch (headerName)
                {
                    // some headers go to content...
                    case "Allow":
                    case "Content-Disposition":
                    case "Content-Encoding":
                    case "Content-Language":
                    case "Content-Length":
                    case "Content-Location":
                    case "Content-MD5":
                    case "Content-Range":
                    case "Content-Type":
                    case "Expires":
                    case "Last-Modified":
                        contentHeader = true;
                        break;
                    case "Referer":
                        if (RewriteReferer && Uri.TryCreate(headerValue, UriKind.Absolute, out var referer)) // if relative, don't handle
                        {
                            var builder = new UriBuilder(referer);
                            builder.Host = TargetUrl.Host;
                            builder.Port = TargetUrl.Port;
                            headerValue = builder.ToString();
                        }
                        break;
                    case "Host":
                        host = headerValue;
                        if (RewriteHost)
                        {
                            headerValue = TargetUrl.Host + ":" + TargetUrl.Port;
                        }
                        break;
                }
                if (contentHeader)
                {
                    // msg.Content is null when the request has no body; skip content headers in that case
                    msg.Content?.Headers.Add(headerName, headerValue);
                }
                else
                {
                    msg.Headers.Add(headerName, headerValue);
                }
            }
            using (var response = await _client.SendAsync(msg).ConfigureAwait(false))
            {
                using (var os = context.Response.OutputStream)
                {
                    context.Response.ProtocolVersion = response.Version;
                    context.Response.StatusCode = (int)response.StatusCode;
                    context.Response.StatusDescription = response.ReasonPhrase;
                    foreach (var header in response.Headers)
                    {
                        context.Response.Headers.Add(header.Key, string.Join(", ", header.Value));
                    }
                    foreach (var header in response.Content.Headers)
                    {
                        if (header.Key == "Content-Length") // this will be set automatically at dispose time
                            continue;
                        context.Response.Headers.Add(header.Key, string.Join(", ", header.Value));
                    }
                    var ct = context.Response.ContentType;
                    if (RewriteTargetInText && host != null && ct != null &&
                        (ct.IndexOf("text/html", StringComparison.OrdinalIgnoreCase) >= 0 ||
                        ct.IndexOf("application/json", StringComparison.OrdinalIgnoreCase) >= 0))
                    {
                        using (var ms = new MemoryStream())
                        {
                            using (var stream = await response.Content.ReadAsStreamAsync().ConfigureAwait(false))
                            {
                                await stream.CopyToAsync(ms).ConfigureAwait(false);
                                var enc = context.Response.ContentEncoding ?? Encoding.UTF8;
                                var html = enc.GetString(ms.ToArray());
                                if (TryReplace(html, "//" + _targetHost + ":" + _targetPort + "/", "//" + host + "/", out var replaced))
                                {
                                    var bytes = enc.GetBytes(replaced);
                                    using (var ms2 = new MemoryStream(bytes))
                                    {
                                        ms2.Position = 0;
                                        await ms2.CopyToAsync(context.Response.OutputStream).ConfigureAwait(false);
                                    }
                                }
                                else
                                {
                                    ms.Position = 0;
                                    await ms.CopyToAsync(context.Response.OutputStream).ConfigureAwait(false);
                                }
                            }
                        }
                    }
                    else
                    {
                        using (var stream = await response.Content.ReadAsStreamAsync().ConfigureAwait(false))
                        {
                            await stream.CopyToAsync(context.Response.OutputStream).ConfigureAwait(false);
                        }
                    }
                }
            }
        }
    }
    public void Stop() => _listener.Stop();
    public override string ToString() => string.Join(", ", Prefixes) + " => " + TargetUrl;
    public void Dispose() => ((IDisposable)_listener)?.Dispose();
    // out-of-the-box replace doesn't tell if something *was* replaced or not
    private static bool TryReplace(string input, string oldValue, string newValue, out string result)
    {
        // IndexOf tells us up front whether Replace will change anything (both compare ordinally here).
        // This avoids a hand-rolled scan, which can drop a partial match at the end of the input
        // and miss a match that starts on the character where a previous partial match failed.
        if (string.IsNullOrEmpty(input) || string.IsNullOrEmpty(oldValue) ||
            input.IndexOf(oldValue, StringComparison.Ordinal) < 0)
        {
            result = input;
            return false;
        }
        result = input.Replace(oldValue, newValue);
        return true;
    }
}
The browser is connected to the proxy so the data that the proxy gets from the web server is just sent via the same connection that the browser initiated to the proxy.