I am currently migrating code over to Twitter's API 1.1. Previously I was doing a series of GET requests to search/tweets, traversing pages using the page parameter. However, with 1.1 you have to use since_id and max_id. While I understand the idea behind the two params, I am wondering what the preferred way is of getting, say, 500 (or N) tweets using these parameter options.
Currently I am doing a GET request with a blank since_id param; I then set this param to the id_str of the last tweet I got back. So for my next iteration of GET requests I have a since_id equal to the id of the last tweet I've received. I'm really not sure if I'm doing it right.
Does anyone know a good solution for traversing pages using these two params?
Basically, you work up to newer (more recent) tweets from since_id, and you work back to older tweets from max_id. You can use the count parameter to size each page. The number of tweets returned may be constrained by rate limits. Hope that helps!
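To collect N tweets, the usual pattern is to page backwards: issue a request, then set max_id to (lowest id in the page) - 1 and repeat. Here is a minimal sketch of that loop, assuming you already have an authenticated way to call search/tweets; the searchPage delegate below is a hypothetical stand-in for that call.

// A minimal sketch (not an official API client): page backwards with max_id.
// searchPage is a hypothetical stand-in for your authenticated GET to search/tweets;
// given (max_id, count) it should return the ids of the tweets in that page.
using System;
using System.Collections.Generic;
using System.Linq;

static class TweetPager
{
    public static List<long> CollectTweetIds(Func<long?, int, List<long>> searchPage, int total)
    {
        var ids = new List<long>();
        long? maxId = null;                       // first request: no max_id at all

        while (ids.Count < total)
        {
            var page = searchPage(maxId, Math.Min(100, total - ids.Count)); // count caps out at 100
            if (page.Count == 0)
                break;                            // no older tweets left

            ids.AddRange(page);
            maxId = page.Min() - 1;               // ask only for tweets older than the oldest one seen
        }

        return ids.Take(total).ToList();
    }
}

since_id works the other way around: remember the highest id you have already stored, and pass it as since_id on later polls so you only fetch tweets newer than that.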
Refer to the documentation on working with timelines here: https://dev.twitter.com/docs/working-with-timelines
Or try building your query using the Twitter dev console here: https://dev.twitter.com/console
There are a lot of websites where you can get a list of all followers of an Instagram profile, for example the profile of Jennifer Lopez. If I click on followers and scroll the whole list down, I only see around 1,000 users. Is there any way to get a list of all followers, or at least something in the range of 10 to 100 thousand users? How do the others do it?
Here are a few pages where it seems to work:
crowdbabble
Instagram Scraper
magimetrics
I would be very grateful if you could help me!
I believe most of the pages you are seeing are using the Instagram API (or the method described below). However, the API is a bit hard to get access to without an application they are happy with. As far as I have understood it, you have to make the application before you know whether you will get access, which is a bit stupid. I guess they are trying to stop new users from using it while letting the people already using it keep using it.
The documentation for their API seems to be missing a lot of what was available earlier, and right now there is no endpoint to get followers (that might be something temporarily wrong with the documentation page: https://www.instagram.com/developer/endpoints/).
You could get the followers the same way the Instagram web page does it. However, it seems to work only if you request up to around 5,000-6,000 followers at a time, and you might get rate limited.
The page makes a GET request to https://www.instagram.com/graphql/query/ with the query parameters query_hash and variables.
The query_hash is, I guess, a hash of the variables. However, I might be wrong, since it keeps working even though you change the variables. The same hash might not work forever, so it is possible you will have to get it the same way the Instagram page does. You get it even when you are not logged in, so I would not think that would be very hard.
The variables parameter is a URL-encoded JSON object containing your search variables.
The JSON should look like this:
{
    "id": "305701719",
    "first": 20
}
The id is the user's id, and first is the number of followers you want.
The URL would look like this when you encode it: https://www.instagram.com/graphql/query/?query_hash=bfe6fc64e0775b47b311fc0398df88a9&variables=%7B%22id%22%3A%22305701719%22%2C%22first%22%3A20%7D
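For illustration, here is a small sketch of issuing that request from C#. The class and method names are just placeholders, and the query_hash and id are the example values used above.

// Sketch: build the followers query described above and send it.
// The returned string is the raw JSON shown below.
using System;
using System.Net.Http;
using System.Threading.Tasks;

class FollowersQuery
{
    static async Task<string> GetFollowersPageAsync(string userId, int first)
    {
        var variables = "{\"id\":\"" + userId + "\",\"first\":" + first + "}";
        var url = "https://www.instagram.com/graphql/query/"
                + "?query_hash=bfe6fc64e0775b47b311fc0398df88a9"
                + "&variables=" + Uri.EscapeDataString(variables);

        using (var client = new HttpClient())
        {
            return await client.GetStringAsync(url);   // raw JSON response
        }
    }
}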
This will return a json object like this:
"data": {
"user": {
"edge_followed_by": {
"count": 73785285,
"page_info": {
"has_next_page": true,
"end_cursor": "AQDJzGlG3jGfM6KGYF7oOhlMqDm9_-db8DW_8gKYTeKO5eIca7cRqL1ODK1SsMA33BYBbAZz3BdC3ImMT79a1YytB1j9z7f-ZaTIkQKEoBGepA"
},
"edges": [
{
"node": {}
}
]
}
}
}
The edges array will contain a list of node elements containing user info about the people who are following the person you were searching for.
To get the next x followers, you would have to change the JSON used in the variables query to something like this:
{
    "id": "305701719",
    "first": 10,
    "after": "AQDJzGlG3jGfM6KGYF7oOhlMqDm9_-db8DW_8gKYTeKO5eIca7cRqL1ODK1SsMA33BYBbAZz3BdC3ImMT79a1YytB1j9z7f-ZaTIkQKEoBGepA"
}
after would be what you received as end_cursor in the previous request.
Your new URL would then look like this: https://www.instagram.com/graphql/query/?query_hash=bfe6fc64e0775b47b311fc0398df88a9&variables=%7B%22id%22%3A%22305701719%22%2C%22first%22%3A10%2C%22after%22%3A%22AQDJzGlG3jGfM6KGYF7oOhlMqDm9_-db8DW_8gKYTeKO5eIca7cRqL1ODK1SsMA33BYBbAZz3BdC3ImMT79a1YytB1j9z7f-ZaTIkQKEoBGepA%22%7D
This way you can keep looping until has_next_page is false in the response.
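Put together, the paging loop could look roughly like this. This is only a sketch: it uses Json.NET for parsing, the query_hash is the example value from above, and it ignores the rate limiting you will run into on large accounts.

// Sketch of the paging loop: keep requesting with the previous end_cursor
// until has_next_page comes back false.
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

class FollowerPager
{
    static async Task<List<JToken>> GetAllFollowersAsync(string userId)
    {
        var followers = new List<JToken>();
        string after = null;

        using (var client = new HttpClient())
        {
            while (true)
            {
                var variables = after == null
                    ? "{\"id\":\"" + userId + "\",\"first\":50}"
                    : "{\"id\":\"" + userId + "\",\"first\":50,\"after\":\"" + after + "\"}";

                var url = "https://www.instagram.com/graphql/query/"
                        + "?query_hash=bfe6fc64e0775b47b311fc0398df88a9"
                        + "&variables=" + Uri.EscapeDataString(variables);

                var json = JObject.Parse(await client.GetStringAsync(url));
                var edgeFollowedBy = json["data"]["user"]["edge_followed_by"];

                foreach (var edge in (JArray)edgeFollowedBy["edges"])
                    followers.Add(edge["node"]);          // one node per follower

                var pageInfo = edgeFollowedBy["page_info"];
                if (!(bool)pageInfo["has_next_page"])
                    break;                                // no more pages
                after = (string)pageInfo["end_cursor"];   // cursor for the next page
            }
        }
        return followers;
    }
}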
EDIT 23/08/2018
Instagram seems to have blocked any scrolling/query_hash request to get the followers list or the likers list on a post, at least on desktop, even for your own account.
https://developers.facebook.com/blog/post/2018/01/30/instagram-graph-api-updates/
It should still be possible from a phone, though, maybe using a Selenium-like tool for mobile such as Appium: http://appium.io/
Some reverse engineering of the app may also be the key, if anyone has ideas on that side:
https://www.blackhatworld.com/seo/journey-instagram-app-reverse-engineer.971468/
EDIT 25/08/2018
It seems to be working again... does anyone have more information about it?
I have an array of Object Ids which I need to retrieve from Parse. The size of the array varies greatly, and sometimes there are duplicates. Up until now, I've been prototyping, so I would use
string[] objectIds = new [] { "xT6...
...WhereContainedIn("objectId", objectIds);
And this would work okay. In real life, though, the size of the objectId array above can reach into the hundreds, and the query returns "operation was slow and timed out". I really have two questions here:
1) There has to be a better way to retrieve an array of objects, if you know the object Ids, but I couldn't find it. Is WhereContainedIn() the only solution here?
2) Are there any guidelines for how/when queries will simply fail? The documentation only mentions a limit of 1000 items to be retrieved, and nothing about the query going in. If it turns out that this query has to be batched, that would be okay, but there are no guidelines for batching, either.
So, I have never used (or even heard of) Parse, but reading through the documentation I found this text about the limit; maybe it will help.
"You can limit the number of results by calling Limit. By default, results are limited to 100, but anything from 1 to 1000 is a valid limit:"
https://www.parse.com/docs/dotnet_guide#queries-constraints
I'm not too familiar with the SO user tags, so I hope that this works: #aaron
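If you do end up batching, one possible shape for it is to split the id array into chunks and run one query per chunk. This is just a sketch: "MyClass" and the chunk size of 100 are placeholders, and WhereContainedIn is the same constraint the question already uses.

// Sketch: de-duplicate the ids, then fetch them in chunks so no single
// WhereContainedIn query has to match hundreds of ids at once.
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Parse;

static class BatchedFetch
{
    public static async Task<List<ParseObject>> FetchByIdsAsync(string[] objectIds)
    {
        var results = new List<ParseObject>();
        var ids = objectIds.Distinct().ToList();    // duplicates add nothing to the result

        for (int i = 0; i < ids.Count; i += 100)
        {
            var chunk = ids.Skip(i).Take(100).ToList();
            var query = ParseObject.GetQuery("MyClass")        // placeholder class name
                                   .WhereContainedIn("objectId", chunk)
                                   .Limit(chunk.Count);
            results.AddRange(await query.FindAsync());
        }
        return results;
    }
}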
This is the closest question that I could find that relates to my issue, but it's not exactly the issue. (I tried Google, Bing, and SO's own search.)
https://stackoverflow.com/questions/25014006/nullreferenceexception-with-parseobjects-in-array-of-pointers
My issue: I have a Unity web-player game that interfaces with both Facebook and Parse. After resolving many issues in it, I have it to where it will easily connect to Facebook and pull in the user's profile information and picture. It then attempts to connect to Parse to log the user in and retrieve their game-related data (like high scores, currency stats, power-ups, etc.), and when it tries to do that, I get a NullReferenceException. The specific contents of the error message are:
"NullReferenceException: Object reference not set to an instance of an object
at GameStateController.ParseFBConnect (System.String userId, System.String accessToken, DateTime tokenExpiration) [0x0001a] in C:...\Assets\Scripts\CSharp\CharacterScripts\GameStateController.cs:1581
at GameStateController.Update () [0x0011f] in C:\Users\Michieal\Desktop\Dragon Rush Game\Assets\Scripts\CSharp\CharacterScripts\GameStateController.cs:382"
The code that generates this error message is:
public void ParseFBConnect(string userId, string accessToken, DateTime tokenExpiration)
{
    Task<ParseUser> logInTask = ParseFacebookUtils.LogInAsync (userId, accessToken, tokenExpiration).ContinueWith<ParseUser> (t =>
    {
        if (t.IsFaulted || t.IsCanceled)
        {
            if (t.IsCanceled)
                Util.LogError ("LoginTask::ParseUser:: Cancelled. >.<");
            // The login failed. Check the error to see why.
            Util.LogError ("Error Result: " + t.Result.ToString ());
            Util.LogError ("Error Result (msg): " + t.Exception.Message);
            Util.LogError ("Error Result (inmsg): " + t.Exception.InnerException.Message);
        }
        if (t.IsCompleted)
        { // No need to link the user to a FB account, as there are no "real" (non fb) accounts yet.
            Util.Log ("PFBC::Login result reports successful. You Go Gurl!");
            // Login was successful.
            user = t.Result; // assign the resultant user to our "user"...
            RetryPFBC = false;
            return user;
        }
        return t.Result;
    });

    if (user.ContainsKey ("NotNew"))
    { // on true, then we don't have to set up the keys...
        Util.Log ("User, " + user.Username + ", contains NotNew Key.");
    }
    else
    {
        CreateKeys (); // Create Keys will only build MISSING Keys, setting them to the default data specifications.
        user.Email = Email;
        user.SaveAsync (); // if we have created the keys data, save it out.
    }
}
It is being passed the proper (post authenticated) Facebook values (FB.UserId, FB.AccessToken, FB.AccessTokenExpiresAt) in that order. I'm using FB Sdk version 6.0.0 and Parse Unity SDK version 1.2.16.
In the log file, instead of any of the debug.log/Util.log comments, it does the "Null Reference" error (above), followed by "About to parse url: https://api.parse.com/1/classes/_User
Checking if https://api.parse.com/1/classes/_User is a valid domain
Checking request-host: api.parse.com against valid domain: *"
And that is where it just stops. So, I built a simple retry block in the Update() function to call the ParseFBConnect() function every 10 seconds or so, which seems only to fill up my log file with the same error sets. After searching across the internet for help, I tried changing the FB.AccessTokenExpiresAt to DateTime.UtcNow.AddDays(1), as others have said that this works for them. I cannot seem to get either to work for me. When I check the dashboard in Parse to see if it shows any updates or activity, it doesn't, and hasn't for a few days now. The Script Execution Order is as follows:
Parse.ParseInitializeBehaviour -- -2875 (very first thing)
Facebook loaders (FB, Editor, Canvas, etc) -- -1000 to -900
GameStateController -- -875
...
So, I know that the Parse.ParseInitializeBehaviour is being loaded first (another result of searching), and I have tracked the NullReference down to the Parse.Unity.dll by changing where the login method is stored; the GSC is on the player (the Player starts in the splash screen and remains throughout the entire game). I have also tried placing the Parse.ParseInitializeBehaviour on the Player game object and on an empty game object (with a simple DontDestroyOnLoad(gameObject) script on that). And I have my Application ID and my DotNet Key correctly filled in on the object.
When I first set up Parse, and on the current production version of the game, it could successfully create a user account based off of the code snippet above. Now, it just breaks at trying to parse the URL...
So, my question is: What am I doing wrong? Does anyone else have this issue? And does anyone have any ideas/suggestions on what to do to fix this? I really just want this to work, and to log the user in, so that I can grab their data and go on to breaking other things in my game :D
All help is appreciated!! and I thank you for taking the time to read this.
The Answer to this question is as follows:
backend as a service - Bing
Ditch Parse, because it doesn't work well with Unity3D; the support (obviously, judging by the sheer number of unanswered questions like this one from people that need help getting it to work) is extremely lacking, and most, if not all, of the examples showing how to use it are broken / missing parts / etc. I see this as a failure on the part of the creators. I also see it as false advertising. So, to answer this question - PARSE is a waste of time; cut to the chase and grab something that does work, has real, WORKING examples, and is maintained by someone that knows HOW TO PROGRAM in the Unity3D environment.
A great example of this would be the Photon networking service, App42 by Shephertz.com, Azure by Microsoft, etc. With the overwhelming number of alternatives out there, once you exclude PARSE, this should be a no-brainer. I've included a Bing search link; you can also do a quick search for gaming backends (you'll find blogs and reviews).
I strongly suggest that you read the reviews of these services, read the questions (and make note of whether they have been answered, or, in the case of Parse, whether they simply closed the forum rather than answering their customers)... and just bounce around through the examples they give for using their products. If they show how it works and the expected results (like MSDN does), you'll have a much better and easier time getting it to work.
FYI - My game now has all of the back end integrations, saves/creates/gets users & their data, has a store, and is moving right along, since I dropped PARSE.
To make things correct, I WAS using Parse 1.2.16 for Unity and the Facebook for Unity SDK 6.0.0... I tried Parse 1.2.14 with the same results, and I tried both Unity 4.5.2 and 4.5.3. My opinion of Parse is based on their documentation, these two API versions, and the very long days I spent trying to get it to work. I then added in the fact that NO ONE here answered this question AT ALL. (Sadly, Stack Overflow is Facebook's preferred help/support forum. They even say in their blogs, help documentation, etc., that if you have an issue, post your question on StackOverflow.com with the relevant tags. Which I did.)
Oh, and to clarify the final star rating of Parse: it is "-2 out of 5" -- that's a NEGATIVE 2 stars... out of 5. As in, run away from it... far, far away. Just to clarify.
I recently launched my humble side project and would like to add a "related submissions" section when viewing a submission. Exactly like what SO is doing here - see right column, titled "Related"
Considering that each submission has a title and a set of tags, what is the most effective (optimum result), most efficient (fast, memory-friendly) way to query the database for related submissions?
I can think of one way to do this (which I'll post as an answer) but I'm very interested to see what others have to say. Or perhaps there's already a standard way of achieving this?
Here's my two cent solution:
To achieve the best output, we need to put “weight” on the query results.
To start with, each submission in the database is assumed to have a weight of zero.
Then, if a submission in the "pool" shares one tag with the current submission, we add +3 to that submission's weight. Hence, if another submission is found that shares two tags with the current submission, we add +6 to its weight.
Next, we split/tokenize the title of the current submission and remove "stop words".
I've seen a list of stop words from Google, but for now I'll define my stop words to be: ["of", "a", "the", "in"]
Example:
Title: "The Best Submission of All Times"
Resulting array: ["The", "Best", "Submission", "of", "All", "Times"]
After removing stop words: ["Best", "Submission", "All", "Times"]
Then we query the database for submissions whose titles contain any of the remaining words, and for each match we add +2 to the weight.
And finally, we sort the list by weight, descending, and take the top N results.
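In code, the scoring could look roughly like this. It is an in-memory sketch with made-up Submission fields; a real implementation would push the tag and title matching into the database query.

// Sketch of the weighting above: +3 per shared tag, +2 per shared
// non-stop-word title word, then order by weight and take the top N.
using System;
using System.Collections.Generic;
using System.Linq;

class Submission
{
    public int Id;
    public string Title;
    public List<string> Tags;
}

static class RelatedFinder
{
    static readonly HashSet<string> StopWords =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "of", "a", "the", "in" };

    public static List<Submission> FindRelated(Submission current, IEnumerable<Submission> pool, int topN)
    {
        var titleWords = new HashSet<string>(
            current.Title.Split(' ').Where(w => !StopWords.Contains(w)),
            StringComparer.OrdinalIgnoreCase);

        return pool
            .Where(s => s.Id != current.Id)
            .Select(s => new
            {
                Submission = s,
                Weight = 3 * s.Tags.Intersect(current.Tags).Count()                 // shared tags
                       + 2 * s.Title.Split(' ').Count(w => titleWords.Contains(w))  // shared title words
            })
            .Where(x => x.Weight > 0)
            .OrderByDescending(x => x.Weight)
            .Take(topN)
            .Select(x => x.Submission)
            .ToList();
    }
}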
What do you think? (be gentle!)
If I understand well, you need a technique to find whether two posts are "similar" to each other. You may want to use a probabilistic model for that:
http://en.wikipedia.org/wiki/Mutual_information
The idea would be to say that if two posts share a lot of "uncommon" words, they are probably about the same topic. For detecting uncommon words, depending on your application, you may use a general table of word frequencies, or, maybe better, build one yourself from the universe of the words in your posts (but you will need to have enough of them to get something relevant).
I would not limit myself to the title and tags, but I would overweight them in the search.
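As a rough illustration of that idea (not full mutual information, just an IDF-style approximation where rarer shared words count for more; all names here are made up):

// Sketch: score the similarity of two posts by summing, over the words they
// share, a rarity weight log(totalPosts / documentFrequency). Title/tag words
// could be counted twice to overweight them, as suggested above.
using System;
using System.Collections.Generic;
using System.Linq;

static class SimilarityScorer
{
    // wordDocumentCounts[word] = number of posts that contain the word
    public static double Score(ISet<string> wordsA, ISet<string> wordsB,
                               IDictionary<string, int> wordDocumentCounts, int totalPosts)
    {
        double score = 0.0;
        foreach (var word in wordsA.Intersect(wordsB))
        {
            int df;
            if (!wordDocumentCounts.TryGetValue(word, out df) || df == 0)
                df = 1;                                       // unseen words treated as very rare
            score += Math.Log((double)totalPosts / df);       // rare words contribute more
        }
        return score;
    }
}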
This kind of idea is very common in spam filtering. Unfortunately I don't have the time to make a full review, but a quick Google search gives:
http://www.aclweb.org/anthology/P/P04/P04-3024.pdf
karlmicha.googlepages.com/acl2004_poster.pdf
In my website's advanced search screen there are about 15 fields that need an autocomplete field.
Their content all depends on each other's values (so if one is filled in, the others' content will change depending on the first one's value).
Most of the fields have a huge number of possibilities (thousands of entries at least).
Currently I make an AJAX call if the user stops typing for half a second. This AJAX call makes a quick call to my Lucene index and returns a bunch of JSON objects. The method itself is really fast, but it's the connection and the transferring of data that is too slow.
If I look at other sites (say Facebook), their autocomplete is instant. I figure they put the possible values in their HTML, so they don't have to do a round trip. But I fear that with the amounts of data I'm handling, this is not an option.
Any ideas?
Return only the top X results.
Get some trends about what users are picking, and order based on that, preferably automatically.
Cache results for every URL & keystroke combination, so that you don't have to round-trip if you've already fetched the result before. Share this cache with all autocompletes that use the same URL & keystroke combination.
Of course, enable gzip compression for the JSON, and ensure you're setting your cache headers to cache for some time; the time depends on your rate of change of the autocomplete response (see the sketch after this list).
Optimize the JSON to send down the bare minimum. Don't send down anything you don't need.
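For the caching and cache-header points, here is a minimal sketch assuming an ASP.NET MVC endpoint (the controller/action names and the 60-second lifetime are placeholders; gzip itself is usually switched on in IIS / web.config rather than in code):

// Sketch: let the browser and public proxies cache each (URL, term) response,
// so a repeated keystroke combination never hits the server again within 60s.
using System.Web.Mvc;
using System.Web.UI;

public class AutoCompleteController : Controller
{
    [OutputCache(Duration = 60, VaryByParam = "term", Location = OutputCacheLocation.Any)]
    public ActionResult Suggest(string term)
    {
        var results = QueryLuceneIndex(term);                  // placeholder for your Lucene lookup
        return Json(results, JsonRequestBehavior.AllowGet);    // GET, so it is cacheable
    }

    private string[] QueryLuceneIndex(string term)
    {
        // ... call into the existing Lucene index here, returning only the top few hits ...
        return new string[0];
    }
}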
Are you returning ALL results for the possibilities, or just the top 10, as JSON objects?
I notice a lot of people send large numbers of results back to the screen, but then only show the first few. By sending back small numbers of results, you can reduce the data transfer.
Return the top "X" results, rather than the whole list, to cut back on the number of options? You might also want to try and put in some trending to track what users pick from the list so you can try and make the top "X" the most used/most relvant. You could always return your most relevant list first, then return the full list if they are still struggling.
In addition to limiting the result set to a top X, consider enabling caching on the responses to the AJAX requests (which means using GET and keeping the URL simple).
It's amazing how often users will backspace and then end up retyping exactly the same content. Also, by allowing public and server-side caching, you could speed up the overall round-trip time.
Cache the results in System.Web.Cache (see the sketch after this list)
Use a Lucene cache
Use GET, not POST, as IE caches GET requests
Only grab a subset of results (10, as people suggest)
Try a decent 3rd-party autocomplete widget like the YUI one
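A small sketch of the first point, keeping recent Lucene results in System.Web.Cache keyed by the typed prefix (the five-minute expiry and the queryLucene delegate are placeholders):

// Sketch: serve repeated prefixes from memory instead of hitting the index.
using System;
using System.Web;
using System.Web.Caching;

public static class SuggestionCache
{
    public static string[] GetSuggestions(string prefix, Func<string, string[]> queryLucene)
    {
        var cache = HttpRuntime.Cache;
        var key = "autocomplete:" + prefix.ToLowerInvariant();

        var cached = cache[key] as string[];
        if (cached != null)
            return cached;                              // seen this prefix recently

        var results = queryLucene(prefix);              // e.g. only the top 10 hits
        cache.Insert(key, results, null,
                     DateTime.UtcNow.AddMinutes(5),     // absolute expiry; tune to your data churn
                     Cache.NoSlidingExpiration);
        return results;
    }
}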
Returning the top N entries is a good approach. But if you want/have to return all the data, I would try to limit both the data being sent and the size of the JSON object itself.
For instance:
"This Here Company With a Long Name" becomes "This Here Company..." (you put the dots in the name client side--again; transfer a minimum of data).
And as far as the JSON object goes:
{n: "This Here Company", v: "1"}
... Where "n" would be the name and "v" would be the value.