How can I scrape Instagram followers? - c#

There are a lot of websites where you can get a list of all followers of an Instagram profile, for example the profile of Jennifer Lopez. If I click on followers and scroll the whole list down, I only see roughly 1000 users. Is there any way to get a list of all followers, or something in the range between 10 thousand and 100 thousand users? How do the others do it?
Here are a few pages where it seems to work:
crowdbabble
Instagram Scraper
magimetrics
I would be very grateful if you could help me!

I believe most of the pages you are seeing are using the Instagram API (or the method described below). However, access is hard to get unless you have an application that they are happy with. As far as I understand it, you have to build the application before you find out whether you will get access, which is a bit stupid. I guess they are trying to keep new users out while letting the people already using it carry on.
The documentation for their API seems to be missing a lot of what was available earlier, and right now there is no endpoint to get followers (that might be something temporarily wrong with the documentation page: https://www.instagram.com/developer/endpoints/).
You could get the followers the same way the Instagram webpage is doing it. However, it seems only to work if you request up to around 5000-6000 followers at a time, and you might get rate limited.
They are making a GET request to: https://www.instagram.com/graphql/query/ with the query parameters query_hash and variables.
I guess the query_hash is a hash of the variables. However, I might be wrong, since it keeps working even though you change the variables. The same hash might not work forever, so it's possible you would have to obtain it the same way the Instagram page does. You get it even when you are not logged in, so I don't think that would be very hard.
The variables parameter is a URL-encoded JSON object containing your search variables.
The JSON should look like this:
{
"id":"305701719",
"first":20
}
The id is the user's id. The first is the number of followers you want.
The URL would look like this when you encode it. https://www.instagram.com/graphql/query/?query_hash=bfe6fc64e0775b47b311fc0398df88a9&variables=%7B%22id%22%3A%22305701719%22%2C%22first%22%3A20%7D
This will return a JSON object like this:
{
  "data": {
    "user": {
      "edge_followed_by": {
        "count": 73785285,
        "page_info": {
          "has_next_page": true,
          "end_cursor": "AQDJzGlG3jGfM6KGYF7oOhlMqDm9_-db8DW_8gKYTeKO5eIca7cRqL1ODK1SsMA33BYBbAZz3BdC3ImMT79a1YytB1j9z7f-ZaTIkQKEoBGepA"
        },
        "edges": [
          {
            "node": {}
          }
        ]
      }
    }
  }
}
The edges array will contain a list of node elements containing user info about the people who follow the person you were searching for.
To get the next x number of followers, you would have to change the json used in the variables query to something like this:
{
"id":"305701719",
"first":10,
"after":"AQDJzGlG3jGfM6KGYF7oOhlMqDm9_-db8DW_8gKYTeKO5eIca7cRqL1ODK1SsMA33BYBbAZz3BdC3ImMT79a1YytB1j9z7f-ZaTIkQKEoBGepA"
}
after would be what you received as end_cursor in the previous request.
Your new URL would look like this: https://www.instagram.com/graphql/query/?query_hash=bfe6fc64e0775b47b311fc0398df88a9&variables=%7B%22id%22%3A%22305701719%22%2C%22first%22%3A10%2C%22after%22%3A%22AQDJzGlG3jGfM6KGYF7oOhlMqDm9_-db8DW_8gKYTeKO5eIca7cRqL1ODK1SsMA33BYBbAZz3BdC3ImMT79a1YytB1j9z7f-ZaTIkQKEoBGepA%22%7D
This way you can keep looping until has_next_page is false in the response.
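A rough C# sketch of the loop described above, in case it helps. Everything named here (FollowerPaging, BuildUrl, ReadPageInfo, FetchAllPages) is made up for illustration, and the fetch delegate is a stand-in for whatever HTTP GET you use (HttpClient, for example). It assumes the response has the shape shown above and does nothing about rate limiting or login, so treat it as a starting point only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class FollowerPaging
{
    // Builds the request URL from the query hash and the JSON "variables" object.
    public static string BuildUrl(string queryHash, string userId, int first, string after = null)
    {
        // JsonSerializer.Serialize on a string yields a quoted, escaped JSON string literal.
        string json = "{\"id\":" + JsonSerializer.Serialize(userId) + ",\"first\":" + first;
        if (after != null)
            json += ",\"after\":" + JsonSerializer.Serialize(after);
        json += "}";
        return "https://www.instagram.com/graphql/query/?query_hash=" + queryHash
             + "&variables=" + Uri.EscapeDataString(json);
    }

    // Reads has_next_page and end_cursor out of one response body.
    public static (bool HasNext, string EndCursor) ReadPageInfo(string responseJson)
    {
        using JsonDocument doc = JsonDocument.Parse(responseJson);
        JsonElement info = doc.RootElement.GetProperty("data").GetProperty("user")
            .GetProperty("edge_followed_by").GetProperty("page_info");
        return (info.GetProperty("has_next_page").GetBoolean(),
                info.GetProperty("end_cursor").GetString());
    }

    // Keeps requesting pages until has_next_page is false (or no cursor comes back).
    public static List<string> FetchAllPages(Func<string, string> fetch, string queryHash,
                                             string userId, int pageSize)
    {
        var pages = new List<string>();
        string cursor = null;
        bool hasNext = true;
        while (hasNext)
        {
            string body = fetch(BuildUrl(queryHash, userId, pageSize, cursor));
            pages.Add(body);
            (hasNext, cursor) = ReadPageInfo(body);
            if (cursor == null) hasNext = false; // guard against looping on the same URL
        }
        return pages;
    }
}
```

Passing the fetch step in as a delegate keeps the paging logic separate from the network code, so you can try it against canned responses before pointing it at the real endpoint.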

EDIT 23/08/2018
Instagram seems to have blocked any scrolling/query hash request to get followers list/likers list on a post, at least on desktop, even for your own account.
https://developers.facebook.com/blog/post/2018/01/30/instagram-graph-api-updates/
It should still be possible from a phone, though, perhaps with a Selenium-like tool for mobile such as Appium: http://appium.io/
Reverse engineering the app may also be the key, if anyone has ideas from that side:
https://www.blackhatworld.com/seo/journey-instagram-app-reverse-engineer.971468/
EDIT 25/08/2018
It seems to be back... does anyone have any information about it?

Related

Sending multiple arrays to a "server" (C#)

Let's say I have a client and a server. I want to send two arrays to the server:
Username (which is "a", for this example)
Password (which is "b", for this example)
I'm using this code to send to the server:
stream.Write (userd, 0, userd.Length);
stream.Write (passd, 0, passd.Length);
The server side is:
The problem is that the output I get on the server side is "ab" for both user and password, because I can't separate the password bytes from the user bytes, since it is all sent in one stream (did I get that right?).
How can I do it properly? :O
This question is a little broad, but here goes. Basically, though, you have a number of options, and you just have to pick one and run with it. I mean, there are advantages and disadvantages to certain approaches, but you can work those out more easily than I can guess what you're doing this for.
You'll want to worry about security if you're doing something like this, but that's far out of the scope of your question, so I'll just assume you've got it covered.
These are just a few options off the top of my head.
Use a Delimiter
If you went with this, you'd have a single character that you know on the server and client, and can guarantee will never appear in the username (or you could get into escaping, if need be). If you chose a colon, for instance, you'd then send the server:
username:password
And the server could use string.Split(':') or equivalent to work out the arguments.
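A minimal sketch of the delimiter option, assuming UTF-8 text and that the whole message arrives in a single read (a real stream may hand it to you in pieces). DelimitedCredentials and its methods are names invented for this example.

```csharp
using System;
using System.Text;

public static class DelimitedCredentials
{
    // Client side: join the two values with a colon and turn them into bytes.
    public static byte[] Encode(string username, string password)
    {
        if (username.Contains(":"))
            throw new ArgumentException("username must not contain the delimiter");
        return Encoding.UTF8.GetBytes(username + ":" + password);
    }

    // Server side: decode the received bytes and split on the first colon only,
    // so a colon inside the password is preserved.
    public static (string Username, string Password) Decode(byte[] data, int count)
    {
        string text = Encoding.UTF8.GetString(data, 0, count);
        string[] parts = text.Split(new[] { ':' }, 2);
        if (parts.Length != 2)
            throw new FormatException("delimiter not found");
        return (parts[0], parts[1]);
    }
}
```

On the client this would replace the two separate stream.Write calls with one write of the encoded array.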
Use Fixed Width
Again, set up a contract, but here you have a certain number of characters that the username will take up no matter what, and will never exceed.
username password
Then you can grab the string.Substring(...) to find the arguments.
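The fixed-width option could look roughly like this. The width of 16 is an arbitrary value picked for the example, and FixedWidthCredentials is a made-up name. Note that trailing spaces in a username are lost with this padding scheme.

```csharp
using System;

public static class FixedWidthCredentials
{
    // Agreed between client and server; 16 is only an example value.
    public const int UsernameWidth = 16;

    // Pads the username with spaces so the password always starts at the same offset.
    public static string Encode(string username, string password)
    {
        if (username.Length > UsernameWidth)
            throw new ArgumentException("username is longer than the agreed width");
        return username.PadRight(UsernameWidth) + password;
    }

    // Cuts the message at the agreed offset and strips the padding.
    public static (string Username, string Password) Decode(string message)
    {
        if (message.Length < UsernameWidth)
            throw new FormatException("message is shorter than the username field");
        string username = message.Substring(0, UsernameWidth).TrimEnd();
        string password = message.Substring(UsernameWidth);
        return (username, password);
    }
}
```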
HTTP
This is a bit more complicated, but the Authorization header of an HTTP request uses a colon delimiter like the one I mentioned first. If you standardized on HTTP for all requests, it might look something like this, with a bit of pseudocode.
GET /path HTTP/1.1
Authorization: BASIC [base64(username:password)]
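To make the pseudocode above concrete, here is a small sketch that builds and parses that header line. BasicAuth, BuildHeader and ParseHeader are illustrative names, and UTF-8 is assumed for the text encoding. Base64 is an encoding, not encryption, so this only makes sense over a secured connection.

```csharp
using System;
using System.Text;

public static class BasicAuth
{
    private const string Prefix = "Authorization: Basic ";

    // Produces the header line: base64 of "username:password" after the scheme name.
    public static string BuildHeader(string username, string password)
    {
        byte[] raw = Encoding.UTF8.GetBytes(username + ":" + password);
        return Prefix + Convert.ToBase64String(raw);
    }

    // Reverses BuildHeader: strips the prefix, decodes, and splits on the first colon.
    public static (string Username, string Password) ParseHeader(string headerLine)
    {
        if (!headerLine.StartsWith(Prefix, StringComparison.OrdinalIgnoreCase))
            throw new FormatException("not a Basic Authorization header");
        string encoded = headerLine.Substring(Prefix.Length).Trim();
        string decoded = Encoding.UTF8.GetString(Convert.FromBase64String(encoded));
        int colon = decoded.IndexOf(':');
        if (colon < 0)
            throw new FormatException("missing colon between username and password");
        return (decoded.Substring(0, colon), decoded.Substring(colon + 1));
    }
}
```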
JSON / XML
JSON and XML are formats for sending and storing data.
JSON would look something like this:
{ "username" : "thisIsTheUsername", "password" : "password01" }
XML would look something like this:
<creds>
<username>thisIsTheUsername</username>
<password>password01</password>
</creds>
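For the JSON variant, a sketch using System.Text.Json (available in .NET Core 3.0 and later; on older frameworks you would need a different JSON library). The Credentials class and the CredentialsJson helper are invented for the example. The server still has to know where one JSON document ends, for instance by sending one document per connection.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

public sealed class Credentials
{
    // The attribute maps each property to the lower-case key used in the JSON above.
    [JsonPropertyName("username")]
    public string Username { get; set; }

    [JsonPropertyName("password")]
    public string Password { get; set; }
}

public static class CredentialsJson
{
    // Client side: object to JSON text (encode the text to bytes before writing it to the stream).
    public static string ToJson(Credentials credentials)
    {
        return JsonSerializer.Serialize(credentials);
    }

    // Server side: JSON text back to an object with two separate fields.
    public static Credentials FromJson(string json)
    {
        return JsonSerializer.Deserialize<Credentials>(json);
    }
}
```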
Can you serialize the object to binary and send it as a stream to the server?
Then, on the server, deserialize the binary stream back into an object.

Unity NullReference when trying to use ParseFacebookUtils to log in to parse

I'm not too familiar with the SO user tags, so I hope that this works: #aaron
This is the closest question that I could find that relates to my issue, but it's not exactly the issue. (I tried Google, Bing, and SO's own search.)
https://stackoverflow.com/questions/25014006/nullreferenceexception-with-parseobjects-in-array-of-pointers
My issue: I have a Unity Web-Player game that interfaces with both Facebook and Parse. After resolving many issues in it, I have it to where it will easily connect to Facebook and pull in the user's profile information and picture. It then attempts to connect to Parse to log the user in and retrieve their game-related data (like high scores, currency stats, power-ups, etc.), and when it tries to do that, I get a NullReferenceException. The specific contents of the error message are:
"NullReferenceException: Object reference not set to an instance of an object
at GameStateController.ParseFBConnect (System.String userId, System.String accessToken, DateTime tokenExpiration) [0x0001a] in C:...\Assets\Scripts\CSharp\CharacterScripts\GameStateController.cs:1581
at GameStateController.Update () [0x0011f] in C:\Users\Michieal\Desktop\Dragon Rush Game\Assets\Scripts\CSharp\CharacterScripts\GameStateController.cs:382"
The code that generates this error message is:
public void ParseFBConnect(string userId, string accessToken, DateTime tokenExpiration)
{
    Task<ParseUser> logInTask = ParseFacebookUtils.LogInAsync (userId, accessToken, tokenExpiration).ContinueWith<ParseUser> (t =>
    {
        if (t.IsFaulted || t.IsCanceled)
        {
            if (t.IsCanceled)
                Util.LogError ("LoginTask::ParseUser:: Cancelled. >.<");
            // The login failed. Check the error to see why.
            Util.LogError ("Error Result: " + t.Result.ToString ());
            Util.LogError ("Error Result (msg): " + t.Exception.Message);
            Util.LogError ("Error Result (inmsg): " + t.Exception.InnerException.Message);
        }
        if (t.IsCompleted)
        { // No need to link the user to a FB account, as there are no "real" (non fb) accounts yet.
            Util.Log ("PFBC::Login result reports successful. You Go Gurl!");
            // Login was successful.
            user = t.Result; // assign the resultant user to our "user"...
            RetryPFBC = false;
            return user;
        }
        return t.Result;
    });
    if (user.ContainsKey ("NotNew"))
    { // on true, then we don't have to set up the keys...
        Util.Log ("User, " + user.Username + ", contains NotNew Key.");
    }
    else
    {
        CreateKeys (); // Create Keys will only build MISSING Keys, setting them to the default data specifications.
        user.Email = Email;
        user.SaveAsync (); // if we have created the keys data, save it out.
    }
}
It is being passed the proper (post authenticated) Facebook values (FB.UserId, FB.AccessToken, FB.AccessTokenExpiresAt) in that order. I'm using FB Sdk version 6.0.0 and Parse Unity SDK version 1.2.16.
In the log file, instead of any of the debug.log/Util.log comments, it does the "Null Reference" error (above), followed by "About to parse url: https://api.parse.com/1/classes/_User
Checking if https://api.parse.com/1/classes/_User is a valid domain
Checking request-host: api.parse.com against valid domain: *"
And that is where it just stopped. So, I built a simple retry block in the Update() function to call the ParseFBConnect() function every 10 or so seconds. Which, seems to only fill up my log file with the same error sets. After searching across the internet for help, I tried changing the FB.AccessTokenExpiresAt to DateTime.UtcNow.AddDays(1) as others have said that this works for them. I cannot seem to get either to work for me. When I check the Dashboard in Parse to see if it shows any updates or activity, it doesn't, and hasn't for a few days now. The Script Execution Order is as follows:
Parse.ParseInitializeBehaviour -- -2875 (very first thing)
Facebook loaders (FB, Editor, Canvas, etc) -- -1000 to -900
GameStateController -- -875
...
So, I know that the Parse.ParseInitializeBehaviour is being loaded first (another result of searching), and I have tracked the NullReference down to the Parse.Unity.dll by changing where the login method is stored; The GSC is on the player (the Player starts in the splash screen and remains throughout the entire game). I have also tried placing the Parse.ParseInitializeBehaviour on the Player gameobject and on an empty gameobject (with a simple dontdestroy(gameObject) script on that). And, I have my Application ID and my DotNet Key correctly filled in on the object.
When I first set up parse, and on the current production version of the game, it can successfully create a user account based off of the code snippet above. Now, it just breaks at the trying to parse the url...
So, My Question is: What am I doing wrong? Does anyone else have this issue? and does anyone have any ideas/suggestions/etc., on what to do to fix this? I really just want this to work, and to log the user in, so that I can grab their data and go on to breaking other things in my game :D
All help is appreciated!! and I thank you for taking the time to read this.
The Answer to this question is as follows:
backend as a service - Bing
Ditch Parse, because it doesn't work well with Unity3d; the support (obviously, judging by the sheer number of answers to this question and to other questions from people who need help getting it to work) is extremely lacking; and most, if not all, of the examples showing how to use it are broken, missing parts, etc. I see this as a failure on the part of the creators. I also see it as False Advertising. So, to answer this question - PARSE is a waste of time; cut to the chase and grab something that does work, has real, WORKING examples, and is maintained by someone who knows HOW TO PROGRAM in the UNITY3d environment.
A great example of this would be the Photon Networking service, App42 by Shephertz.com, Azure by Microsoft, etc. With the overwhelming number of answers to this, if you exclude PARSE, This should have been a no-brainer. I've included a Bing Search link, you can also do a quick search for Gaming Backends (you'll find blogs and reviews.)
I strongly suggest that you read the reviews on these services, read the questions (and make note of whether they have been answered, or, in the case of Parse, whether they simply closed the forum rather than answering their customers)... And, just bounce around through the examples that they give for using their products. If they show how it works, and the expected results (like the MSDN does), you'll have a lot better and a lot easier time getting it to work.
FYI - My game now has all of the back end integrations, saves/creates/gets users & their data, has a store, and is moving right along, since I dropped PARSE.
To make things correct, I WAS using Parse 1.2.16 for Unity, and the Facebook for Unity SDK 6.0.0... I tried Parse 1.2.14, with the same results, and I tried both Unity 4.5.2 and 4.5.3. My opinion of Parse is based off of their documentation, these 2 API versions, and the very long days that I pulled trying to get it to work. I then added in the fact that NO ONE here answered this question AT ALL (Sadly, Stack Overflow is Facebook's preferred Help/Support forum. They even say in their blogs, help documentation, etc., that if you have an issue - post your question on Stack Overflow with relevant tags. Which, I did.)
Oh, and the final star rating of Parse (to clarify it) is "-2 out of 5" -- That's a NEGATIVE 2 stars... out of 5. as in, run away from it... far, far away. Just to clarify.

Traversing Pages with since_id and max_id

I am currently migrating code over to Twitter's API 1.1. Previously I was doing a series of GET requests to search/tweets, traversing pages using the page parameter. However, with 1.1 you have to use since_id and max_id. While I understand the idea behind the two params, I am wondering what the preferred way is of getting, say, 500 (or N) tweets using them.
Currently I am doing a GET request with a blank since_id param, and I then set this param to the id_str of the last tweet I got back. So for my next iteration of GET requests I have a since_id equal to the id of the last tweet I've got. I'm really not sure if I'm doing it right.
Anyone know a good solution to traversing pages using these two params?
Basically, you work up to newer (more recent) tweets from since_id, and you work back to older tweets from max_id. You can use the count parameter to size the timeline. The returned count may be constrained by rate limits. Hope that helps!
Refer:
See this document for working with timelines here: https://dev.twitter.com/docs/working-with-timelines
Or try building your query using the Twitter dev console here: https://dev.twitter.com/console
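As a sketch of walking back to older tweets with max_id until you have N of them: the fetchPage delegate below stands in for the actual API call (it receives the max_id to send, or null for the first request, plus a count, and returns the tweet IDs of that page). TimelinePaging and CollectIds are made-up names. Because max_id is inclusive, the next request uses the lowest ID seen so far minus one, so the oldest tweet is not returned twice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TimelinePaging
{
    // Collects up to 'wanted' tweet IDs, newest first, by paging backwards with max_id.
    public static List<long> CollectIds(Func<long?, int, IReadOnlyList<long>> fetchPage,
                                        int wanted, int pageSize)
    {
        var ids = new List<long>();
        long? maxId = null; // no max_id on the first request

        while (ids.Count < wanted)
        {
            int count = Math.Min(pageSize, wanted - ids.Count);
            IReadOnlyList<long> page = fetchPage(maxId, count);
            if (page.Count == 0)
                break; // nothing older is available

            ids.AddRange(page);

            // max_id is inclusive: step one below the oldest ID received so far.
            maxId = page.Min() - 1;
        }

        // Trim in case a page returned more than was asked for.
        return ids.Take(wanted).ToList();
    }
}
```

since_id works the other way round: once you have a batch, remember the highest ID and send it as since_id later to pick up only tweets newer than that.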

Best Way To Show Related Content Using C#

I work for a high traffic content website that offers news, photos & videos (amongst other things). One thing we struggle with is perfecting our "related items" module. If you are viewing a video, we show a list of 5 related videos. This applies to blog posts, articles, etc... Our team does a good job tagging the content w/ keywords as well as relating it to an appropriate category, and associating it with items of other content types, but whatever mechanism we try when displaying "related content" it's never close to 100% accurate.
Are there any tried and true ways to get fairly accurate results based on the tagged keyword, title, or category name? Our site is .net (c#) and uses SQL 2005. Let me know if you need me to elaborate.
Thanks!
It's already great that you are using tags to categorize your items.
This can be either very powerful or very weak depending on the tags you are using.
First of all: Make sure you are using meaningful tag names.
[ Bad ones: C#1, C#1.0, Ruby1, Ruby-1 and so on ]
[ Good ones: C#1, C#2, C#3, Ruby1, Ruby2 and so on ]
You can now build your GetRelatedItemsList method, which of course is generic, and do checks in that.
For example something like this:
List<T> GetRelatedItemsList<T> (T item) where T : IOurMediaItem // I used an interface here because I like them :P - it can also be a class.
{
    if (item.TagCount == 1)
    {
        // Get related items with the same tag and based on some keywords in title
    }
    else
    {
        // First: Get all items with exactly the tags
        // Second: Get all items with relating title and append it to the list
    }
}
Either way, you can also do a switch() on the item.TagCount property / method.
Using tags is the easiest way I see to get this kind of functionality. For example, if you are showing an article or video that has the tags arts and fun, you can load the top 3 videos for arts and fun under the related videos section.
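One possible way to turn that into code is to rank candidates by how many tags they share with the current item and take the top few. This is only a sketch done in memory with LINQ; MediaItem, RelatedItems and TopRelated are names invented here, and on a high-traffic site you would more likely do the counting in SQL or cache the result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class MediaItem
{
    public string Title;
    public HashSet<string> Tags = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
}

public static class RelatedItems
{
    // Returns up to 'take' candidates ordered by the number of tags shared with 'current'.
    public static List<MediaItem> TopRelated(MediaItem current, IEnumerable<MediaItem> candidates, int take)
    {
        return candidates
            .Where(c => !ReferenceEquals(c, current))                 // skip the item itself
            .Select(c => new { Item = c, Shared = c.Tags.Count(t => current.Tags.Contains(t)) })
            .Where(x => x.Shared > 0)                                  // drop items with no tag in common
            .OrderByDescending(x => x.Shared)                          // most shared tags first
            .Take(take)
            .Select(x => x.Item)
            .ToList();
    }
}
```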
I use Tags because they are much more flexible, but use strict rules when Tags are saved.
I hope this makes sense

Techniques to make autocomplete on website more responsive

In my website's advanced search screen there are about 15 fields that need an autocomplete field.
Their content is all depending on each other's value (so if one is filled in, the other's content will change depending on the first's value).
Most of the fields have a huge amount of possibilities (1000's of entries at least).
Currently I make an ajax call if the user stops typing for half a second. This ajax call makes a quick call to my Lucene index and returns a bunch of JSON objects. The method itself is really fast, but it's the connection and transferring of data that is too slow.
If I look at other sites (say facebook), their autocomplete is instant. I figure they put the possible values in their HTML, so they don't have to do a round trip. But I fear with the amounts of data I'm handling, this is not an option.
Any ideas?
Return only top x results.
Get some trends about what users are picking, and order based on that, preferably automatically.
Cache results for every URL & keystroke combination, so that you don't have to round-trip if you've already fetched the result before.
Share this cache with all autocompletes that use the same URL & keystroke combination.
Of course, enable gzip compression for the JSON, and ensure you're setting your cache headers to cache for some time. The time depends on your rate of change of autocomplete response.
Optimize the JSON to send down the bare minimum. Don't send down anything you don't need.
Are you returning ALL results for the possibilities, or just the top 10, as JSON objects?
I notice a lot of people send large numbers of results back to the screen, but then only show the first few. By sending back small numbers of results, you can reduce the data transfer.
Return the top "X" results, rather than the whole list, to cut back on the number of options. You might also want to put in some trending to track what users pick from the list, so you can make the top "X" the most used/most relevant. You could always return your most relevant list first, then return the full list if they are still struggling.
In addition to limiting the set of results to a top X set consider enabling caching on the responses of the AJAX requests (which means using GET and keeping the URL simple).
It's amazing how often users will backspace and then end up retyping exactly the same content. Also, by allowing public and server-side caching you could speed up the overall round-trip time.
Cache the results in System.Web.Cache
Use a Lucene cache
Use GET not POST as IE caches this
Only grab a subset of results (10 as people suggest)
Try a decent 3rd party autocomplete widget like the YUI one
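A rough server-side sketch of two of those points together (cache the results, and only return a subset). It uses a plain ConcurrentDictionary rather than System.Web.Cache so that it stands alone, which means nothing ever expires; a real cache would need an eviction policy. SuggestionCache and its members are invented names, the search delegate stands in for the Lucene lookup, and the context argument is meant to carry whatever the other fields' values are, since the suggestions depend on them.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

public sealed class SuggestionCache
{
    private readonly ConcurrentDictionary<string, IReadOnlyList<string>> _cache =
        new ConcurrentDictionary<string, IReadOnlyList<string>>(StringComparer.OrdinalIgnoreCase);
    private readonly Func<string, IEnumerable<string>> _search;
    private readonly int _top;

    public SuggestionCache(Func<string, IEnumerable<string>> search, int top)
    {
        _search = search;
        _top = top;
    }

    // Returns at most 'top' suggestions; identical context + prefix pairs are served from memory.
    public IReadOnlyList<string> Suggest(string context, string prefix)
    {
        string key = context + "|" + prefix;
        // Under contention the factory may run more than once, but only one result is kept.
        return _cache.GetOrAdd(key, _ => _search(prefix).Take(_top).ToList());
    }
}
```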
Returning the top-N entries is a good approach. But if you want/have to return all the data, I would try and limit the data being sent and the JSON object itself.
For instance:
"This Here Company With a Long Name" becomes "This Here Company..." (you put the dots in the name client side--again; transfer a minimum of data).
And as far as the JSON object goes:
{n: "This Here Company", v: "1"}
... Where "n" would be the name and "v" would be the value.
