Tool for parsing querystrings, creating URL encoded params etc? - c#

Before writing my own :-)
I was wondering if anyone knows of a tool that parses a URL and extracts all the params into an easier-to-read format, a grid maybe? (These URLs are extremely long :-) )
Ideally it would also let you construct the query string from plain text and handle the URL encoding automatically.
There must be one available, but I have searched high and low and can't find anything.
Thanks in advance.

The ParseQueryString method is pretty handy for those tasks.
"I was wondering if anyone knows a tool to parse a URL and extract all the params into a easier viewable format, a grid maybe? (these urls are extremely long :- )"
using System;
using System.Web; // HttpUtility; on .NET Framework this needs a reference to System.Web
class Program
{
    static void Main()
    {
        var uri = new Uri("http://foo.com/?param1=value1&param2=value2");
        // Returns a NameValueCollection of URL-decoded keys and values
        var values = HttpUtility.ParseQueryString(uri.Query);
        foreach (string key in values.Keys)
        {
            Console.WriteLine("key: {0}, value: {1}", key, values[key]);
        }
    }
}
"And also if it allows you to construct the querystring for standard text and automates the URL ENCODING etc."
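using System;
using System.Web; // HttpUtility; on .NET Framework this needs a reference to System.Web
class Program
{
    static void Main()
    {
        // Parsing an empty string yields an empty, writable collection
        var values = HttpUtility.ParseQueryString(string.Empty);
        values["param1"] = "value1";
        values["param2"] = "value2";
        var builder = new UriBuilder("http://foo.com");
        // ToString() on this collection URL-encodes the keys and values
        builder.Query = values.ToString();
        var url = builder.ToString();
        Console.WriteLine(url);
    }
}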

Related

How can I extract data from this site using HTMLAgilityPack?

I've been following tutorials on how to scrape information using HTMLAgilityPack, here is an example:
using System;
using System.Linq;
using System.Net;
namespace web_scraping_test
{
    class Program
    {
        static void Main(string[] args)
        {
            HtmlAgilityPack.HtmlWeb web = new HtmlAgilityPack.HtmlWeb();
            HtmlAgilityPack.HtmlDocument doc = web.Load("http://www.yellowpages.com/search?search_terms=Software&geo_location_terms=Sydney2C+ND");
            // XPath attribute tests use @, so the class filter is [@class='...']
            var names = doc.DocumentNode.SelectNodes("//a[@class='business-name']").ToList();
            foreach (var item in names)
            {
                Console.WriteLine(item.InnerText);
            }
        }
    }
}
Getting the data here was easy because the links share a common class name that is simple to target.
I'm trying to use the same approach to scrape information from this page: https://osu.ppy.sh/beatmapsets/354163#osu/780200
but I have no idea what the correct markup is to get 'Stitches', 'Shawn Mendes' and the values shown in the diagram.
For 'Shawn Mendes' the markup is '<a class="beatmapset-header__details-text beatmapset-header__details-text--artist" href="https://osu.ppy.sh/beatmapsets?q=Shawn%20Mendes">Shawn Mendes</a>'
but I'm not sure how to work this into the code. I've replaced the URL and changed the class name, but the path to this text seems a lot more complicated on this site. Any advice would be appreciated, thanks!
All of the details you're looking for appear to be in a JSON object in the markup. There is a script block with the ID "json-beatmapset"; if you scrape the content of that and parse the JSON it contains, it should be smooth sailing after that.
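As a rough sketch of that idea (not tested against the live site): the snippet below loads a small stand-in HTML string instead of the real page, pulls the text of the script block by its ID with HtmlAgilityPack, and parses it with Newtonsoft.Json. The JSON property names "title" and "artist" and the helper name ExtractScriptJson are assumptions made for illustration; check the actual JSON on the page for the real field names.

```csharp
using System;
using HtmlAgilityPack;        // NuGet package: HtmlAgilityPack
using Newtonsoft.Json.Linq;   // NuGet package: Newtonsoft.Json

public static class BeatmapsetScraper
{
    // Hypothetical helper: returns the raw text inside <script id="...">,
    // or null when no such script block exists.
    public static string ExtractScriptJson(string html, string scriptId)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//script[@id='" + scriptId + "']");
        return node == null ? null : node.InnerHtml.Trim();
    }

    public static void Main()
    {
        // Stand-in for the downloaded page; the property names are assumed.
        string html =
            "<html><body><script id=\"json-beatmapset\" type=\"application/json\">" +
            "{\"title\":\"Stitches\",\"artist\":\"Shawn Mendes\"}" +
            "</script></body></html>";

        string json = ExtractScriptJson(html, "json-beatmapset");
        JObject data = JObject.Parse(json);
        Console.WriteLine("{0} by {1}", (string)data["title"], (string)data["artist"]);
    }
}
```

For the real page, the html string would come from something like web.Load(url) as in the question, with the rest unchanged.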

Rocketr IPN with C# ASP.NET MVC

So I am fiddling with this website's IPN function to see if I want to incorporate it into a small project my friends and I are working on. To be honest, I don't know C# in much depth; I'm fairly new to the language (a few months of coding practice).
This is the PHP sample they give out on how to use it:
https://github.com/Rocketr/rocketrnet-ipn-php/blob/master/example_rocketr_ipn.php
I am trying to make a receiver like that in MVC 5. I have the model set up with a function that processes the request when the IPN hits the server page, but it seems to fail every time and does not write any of the raw data I am trying to capture to the logs.
// GET: Purchases/Incoming
public void Incoming()
{
    var ipnDebugLog = HttpRuntime.AppDomainAppPath + "/Logs/IPN/debug.txt";
    var testIPNKey = "the hash here";
    byte[] ipnToByes = Encoding.ASCII.GetBytes(testIPNKey); // IPN string to bytes to hash with
    var recvdIPN = Request["HTTP_IPN_HASH"];
    HMACSHA256 testHash = new HMACSHA256(ipnToByes); // Setting testHash to IPN secret string
    string ipnHeader = Request["IPN_HASH"];
    using (StreamWriter sw = new StreamWriter(ipnDebugLog))
    {
        sw.WriteLine(ipnHeader);
        foreach (var reqHead in ipnHeader)
        {
            sw.WriteLine(reqHead.ToString());
            sw.WriteLine("JSON String: " + Request["HTTP_IPN_SECRET"]);
            sw.WriteLine(recvdIPN);
            sw.WriteLine("From: " + GetIPAddress());
        }
    }
}
So this is just me trying to get the data being sent from Rocketr. On the site it states:
To verify the integrity of the payload, we will send an HMAC signature
over a HTTP header called “IPN_HASH”. The HMAC signature will be a
signed json encoded POST object with your IPN secret. You can see how
to verify the signature in the example_rocketr_ipn.php file in this
repository.
Am I just too new to C# to understand how to make it work like this? I feel like I'm on the right track to reading the raw data, but I'm probably wrong.
So, to sum up the question:
Am I reading the raw custom HTTP header called IPN_HASH the wrong way? Going by the PHP example, they used isset to read a server variable labelled HTTP_IPN_HASH, right?
So I have to convert this: $hmac = hash_hmac("sha512", json_encode($_POST), trim($IPN_SECRET));
Try this (make adjustments as needed/necessary):
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Web;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Newtonsoft.Json;
namespace Foo
{
    [TestClass]
    public class UnitTest1
    {
        [TestMethod]
        public void PhpHashTest()
        {
            string IPN_SECRET = "I-am-the-secret";
            //Mocking some HTTP POSTed data
            var someFormUrlEncodedData = "order_id=1234&product_title=Hello%20World&product_id=Sku123";
            //Mocking json_encode($_POST)
            var data = HttpUtility.ParseQueryString(someFormUrlEncodedData);
            var dictionary = data.AllKeys.ToDictionary(key => key, key => data[key]);
            //{"order_id":"1234","product_title":"Hello World","product_id":"Sku123"}
            var json = JsonConvert.SerializeObject(dictionary);
            byte[] bytes;
            using (var hmac512 = new HMACSHA512(Encoding.ASCII.GetBytes(IPN_SECRET)))
            {
                bytes = hmac512.ComputeHash(Encoding.ASCII.GetBytes(json));
            }
            //will contain lower-case hex like Php hash_hmac
            var hash = new StringBuilder();
            Array.ForEach(bytes, b => hash.Append(b.ToString("x2")));
            //Assert that Php output exactly matches implementation
            Assert.IsTrue(string.Equals("340c0049bde54aa0d34ea180f8e015c96edfc4d4a6cbd7f139d80df9669237c3d564f10366f3549a61871779c2a20d2512c364ee56af18a25f70b89bd8b07421", hash.ToString(), StringComparison.Ordinal));
            Console.WriteLine(hash);
        }
    }
}
Not a PHP dev - this is my "Php version":
<?php
$IPN_SECRET = 'I-am-the-secret';
# 'order_id=1234&product_title=Hello%20World&product_id=Sku123';
$json = array('order_id' => '1234', 'product_title' => 'Hello World', 'product_id' =>'Sku123');
echo json_encode($json);
echo "<br />";
$hmac = hash_hmac("sha512", json_encode($json), trim($IPN_SECRET));
echo $hmac;
?>
Hth....
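To connect this back to the receiver, here is a minimal sketch of the verification step, assuming the signature arrives as a hex string (in MVC 5 it would presumably be read with Request.Headers["IPN_HASH"]) and that the JSON you hash is byte-for-byte what Rocketr signed. The class and method names (IpnSignature, ComputeHex, IsValid) are made up for this example, and UTF-8 is used instead of ASCII, which gives the same bytes for ASCII-only input.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class IpnSignature
{
    // Lower-case hex HMAC-SHA512, the same output format as PHP's hash_hmac("sha512", ...).
    public static string ComputeHex(string json, string secret)
    {
        using (var hmac = new HMACSHA512(Encoding.UTF8.GetBytes(secret.Trim())))
        {
            byte[] hash = hmac.ComputeHash(Encoding.UTF8.GetBytes(json));
            var hex = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
            {
                hex.Append(b.ToString("x2"));
            }
            return hex.ToString();
        }
    }

    // Compares the received signature with the expected one without
    // stopping at the first differing character.
    public static bool IsValid(string json, string secret, string receivedHash)
    {
        if (receivedHash == null) return false;
        string expected = ComputeHex(json, secret);
        string received = receivedHash.Trim().ToLowerInvariant();
        int diff = expected.Length ^ received.Length;
        for (int i = 0; i < expected.Length && i < received.Length; i++)
        {
            diff |= expected[i] ^ received[i];
        }
        return diff == 0;
    }

    public static void Main()
    {
        string secret = "I-am-the-secret";
        string json = "{\"order_id\":\"1234\",\"product_title\":\"Hello World\",\"product_id\":\"Sku123\"}";
        string signature = ComputeHex(json, secret); // stand-in for the header value
        Console.WriteLine(IsValid(json, secret, signature));
    }
}
```

One caveat: PHP's json_encode and a .NET serializer can escape characters differently (for example forward slashes or non-ASCII text), so the hashes only match when both sides produce identical JSON text.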

To do searching in an array with some keyword

I have an array like this:
books={'java 350','Photoshop 225','php 210','JavaScript 80','python 180','jquery 250'}
If my search input is "ph2", it should retrieve both 'Photoshop 225' and 'php 210' for a drop-down menu. Which string function does this, or is there any existing code for this task?
I'm using some built-in functions like
if (array.Any(keyword.Contains))
and
if (array.Contains(keyword))
but they don't do what I want. Can anyone please help me solve this? Thanks in advance.
More flexible approach:
using Microsoft.VisualBasic; // add a reference to Microsoft.VisualBasic
using Microsoft.VisualBasic.CompilerServices;
using System;
using System.Linq;
namespace ConsoleApplication
{
    class Program
    {
        static void Main (string[] args)
        {
            string[] books = { "java 350", "Photoshop 225", "php 210", "JavaScript 80", "python 180", "jquery 250" };
            // Put a wildcard around every character of the keyword: "ph2" becomes *p*h*2*
            string input = "*" + string.Join ("*", "ph2".ToArray ()) + "*";
            // CompareMethod.Text makes the Like comparison case-insensitive
            var matches = books.Where (x => LikeOperator.LikeString (x, input, CompareMethod.Text));
            foreach (string match in matches)
            {
                Console.WriteLine (match);
            }
        }
    }
}
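If you would rather not reference Microsoft.VisualBasic, the same "characters in order, not necessarily adjacent" match can be sketched with a plain loop. This is an alternative technique, not the Like operator itself, and the names SubsequenceSearch, IsSubsequence and Search are invented for the example. It also treats characters such as * or # in the keyword literally, whereas Like would read them as wildcards.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SubsequenceSearch
{
    // True when every character of keyword occurs in text in the same order
    // (not necessarily next to each other), ignoring case.
    public static bool IsSubsequence(string keyword, string text)
    {
        int k = 0;
        foreach (char c in text)
        {
            if (k < keyword.Length &&
                char.ToUpperInvariant(c) == char.ToUpperInvariant(keyword[k]))
            {
                k++;
            }
        }
        return k == keyword.Length;
    }

    // Keeps the items that contain the keyword as a subsequence.
    public static List<string> Search(IEnumerable<string> items, string keyword)
    {
        return items.Where(item => IsSubsequence(keyword, item)).ToList();
    }

    public static void Main()
    {
        string[] books = { "java 350", "Photoshop 225", "php 210", "JavaScript 80", "python 180", "jquery 250" };
        foreach (string match in Search(books, "ph2"))
        {
            Console.WriteLine(match); // prints "Photoshop 225" then "php 210"
        }
    }
}
```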

How to replace html tags while writing web scraper in C#?

I am writing a console application for web crawling and scraping in C#, purely for learning purposes. When the result is displayed, some of the values come with HTML tags, in fact <strong> tags. I identified those strong tags and replaced them completely. But what if there were many strong tags with different inline styling values?
How could I solve this problem?
The problem is in the GetData() function:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Web;
using System.Net;
using System.IO;
using System.Text.RegularExpressions;
namespace MyCrawler
{
    public class Program
    {
        public static string GetContent(string url)
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            WebResponse response = request.GetResponse();
            StreamReader reader = new StreamReader(response.GetResponseStream());
            string line = "";
            StringBuilder builder = new StringBuilder();
            while ((line = reader.ReadLine()) != null)
            {
                builder.Append(line.Trim());
            }
            reader.Close();
            return builder.ToString().Replace("\n", "");
        }
        public static void GetData(string content)
        {
            // these tags are to be replaced
            string ToBeReplaced1 = "<strong style=\"color:#F00\">";
            string ToBeReplaced2 = "</strong>";
            string ToBeReplaced3 = "<strong style=\"color:#000099\">";
            // pattern for regular expression
            string pattern3 = "<dt>(.*?)</dt><dd>(.*?)</dd>";
            Regex regex = new Regex(pattern3);
            MatchCollection mc = regex.Matches(content);
            foreach (Match m2 in mc)
            {
                Console.Write(m2.Groups[1].Value);
                Console.WriteLine(((m2.Groups[2].Value.Replace(ToBeReplaced3, "")).Replace(ToBeReplaced1, "")).Replace(ToBeReplaced2, ""));
            }
            Console.WriteLine();
        }
        public static void Main(string[] args)
        {
            string url = "http://www.merojob.com/";
            string content = GetContent(url);
            string pattern = "<div class=\"employername\"><h2>(.*?)</h2><a href=\"(.*?)\"";
            Regex regex = new Regex(pattern);
            MatchCollection mc = regex.Matches(content);
            foreach (Match m in mc)
            {
                foreach (Capture c in m.Groups[2].Captures)
                {
                    //Console.WriteLine(c.Value); // write the value to the console "pattern"
                    content = GetContent(c.Value);
                    GetData(content);
                }
            }
            Console.ReadKey();
        }
    }
}
Well, if I don't use the Replace() function, I end up with:
The best way in your case would be to use a dedicated library such as HtmlAgilityPack to retrieve specific tags and manipulate the structure of your DOM document. Doing it manually is a recipe for pain. Doing it with regular expressions may endanger your mind, so use a library to handle your HTML.
Even if this is for learning purposes only, you are not really using the right tool or exercise to start learning, since this is a really complicated subject.
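As a small illustration of what the library buys you here (a sketch, with an invented helper name ToPlainText and made-up sample text): loading the fragment into HtmlAgilityPack and reading InnerText drops every tag regardless of its attributes, so differently styled strong tags need no special handling.

```csharp
using System;
using HtmlAgilityPack;   // NuGet package: HtmlAgilityPack

public static class TagStripper
{
    // Hypothetical helper: returns the text of an HTML fragment with all
    // tags removed and entities such as &amp; decoded.
    public static string ToPlainText(string htmlFragment)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(htmlFragment);
        return HtmlEntity.DeEntitize(doc.DocumentNode.InnerText);
    }

    public static void Main()
    {
        // Sample <dd> content with two differently styled strong tags.
        string dd = "<strong style=\"color:#F00\">Kathmandu</strong> and " +
                    "<strong style=\"color:#000099\">Lalitpur</strong>";
        Console.WriteLine(ToPlainText(dd));
    }
}
```

In the crawler above, GetData could pass m2.Groups[2].Value through such a helper instead of chaining Replace calls; selecting the dt and dd nodes with the library as well would remove the remaining regular expression.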

Cross-platform Localization

With Xamarin Android, it is possible to create localized strings for multi-language apps, as shown in their Android documentation:
http://docs.xamarin.com/guides/android/application_fundamentals/resources_in_android/part_5_-_application_localization_and_string_resources
However, I have various try/catch blocks in my Model which send error messages back as strings. Ideally I'd like to keep the Model and Controller parts of my solution entirely cross platform but I can't see any way to effectively localize the messages without passing a very platform specific Android Context to the Model.
Does anyone have ideas about how this can be achieved?
I'm using .net resource files instead of the Android ones. They give me access to the strings from code, wherever it is.
The only thing I can't do automatically is reference those strings from layouts. To deal with that I've written a quick utility which parses the resx file and creates an Android resource file with the same values. It gets run before the Android project builds so all the strings are in place when it does.
Disclaimer: I haven't actually tested this with multiple languages yet.
This is the code for the utility:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Xml;
namespace StringThing
{
    class Program
    {
        // Usage: StringThing <source .resx file> <target Android strings file>
        static void Main(string[] args)
        {
            string sourceFile = args[0];
            string targetFile = args[1];
            Dictionary<string, string> strings = LoadDotNetStrings(sourceFile);
            WriteToTarget(targetFile, strings);
        }
        // Reads every <data name="..."> entry of the resx file into a dictionary
        static Dictionary<string, string> LoadDotNetStrings(string file)
        {
            var result = new Dictionary<string, string>();
            XmlDocument doc = new XmlDocument();
            doc.Load(file);
            XmlNodeList nodes = doc.SelectNodes("//data");
            foreach (XmlNode node in nodes)
            {
                string name = node.Attributes["name"].Value;
                // Assumes the <value> element is the second child node of <data>
                string value = node.ChildNodes[1].InnerText;
                result.Add(name, value);
            }
            return result;
        }
        // Writes the strings as an Android <resources> file.
        // Values are written as-is, so text containing & or < is not escaped.
        static void WriteToTarget(string targetFile, Dictionary<string, string> strings)
        {
            StringBuilder bob = new StringBuilder();
            bob.AppendLine("<?xml version=\"1.0\" encoding=\"utf-8\"?>");
            bob.AppendLine("<resources>");
            foreach (string key in strings.Keys)
            {
                bob.Append(" ");
                bob.AppendLine(string.Format("<string name=\"{0}\">{1}</string>", key, strings[key]));
            }
            bob.AppendLine("</resources>");
            System.IO.File.WriteAllText(targetFile, bob.ToString());
        }
    }
}
For Xamarin, you can also look at Vernacular https://github.com/rdio/vernacular
You can write code with minimal effort without worrying about the translation. Feed the generated IL to Vernacular to get translatable strings in iOS, Android, and Windows Phone formats.
I've created a slightly ugly solution at Xamarin iOS localization using .NET which you might find helpful.
