Find retweeters 2 hops away from a tweet - twitter

Is there a way to get a list of the IDs of the retweeters that are two hops away from a tweet? For example, let's say I have a tweet T. User A retweets tweet T. User B retweets user A's retweet.
[T] -- retweets --> [A] -- retweets --> [B]
I know that I can find A's id with GET statuses/retweets, but how can I get B's id?
EDIT:
Let's say I know the id of the tweet. I can get A by using this code:
IDs ids = twitter.getRetweeterIds(tweet_id);

It is possible, but it is a little tricky...
You can use https://dev.twitter.com/docs/api/1.1/get/statuses/retweets/%3Aid to get the people who have retweeted a specific tweet. Well, you can get the 100 most recent.
I am @edent. @LJRich has retweeted me. @winechateau retweeted me via @LJRich.
This API call will show both retweeters:
https://api.twitter.com/1.1/statuses/retweets/493786060150046720.json
However! If you look through @LJRich's timeline, you'll see that her retweet has its own unique status ID:
{
"created_at": "Mon Jul 28 15:56:11 +0000 2014",
"id": 493787008138244100,
"id_str": "493787008138244096",
"text": "RT @edent: \"Accidentally\" drunk an entire case of wine?\nUse it to store all the cables you're too stubborn to throw out! http://t.co/rzAW89…",
....
"retweeted_status": {
"created_at": "Mon Jul 28 15:52:25 +0000 2014",
"id": 493786060150046700,
"id_str": "493786060150046720"
So, let's now search for who retweeted @LJRich's status (remember to use id_str, because the numeric id loses precision when it is parsed as a floating-point number):
https://api.twitter.com/1.1/statuses/retweets/493787008138244096.json
Hey presto! You can see that @winechateau retweeted from @LJRich, who retweeted from me.
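The two lookups above can be chained. Here is a minimal Python sketch, assuming you already have some function that calls GET statuses/retweets/:id and returns the parsed list of retweet statuses. That function is passed in as fetch_retweets, which is a hypothetical name of mine and not part of any Twitter library.

```python
from typing import Any, Callable, Dict, List

Status = Dict[str, Any]


def two_hop_retweeters(
    tweet_id: str,
    fetch_retweets: Callable[[str], List[Status]],
) -> Dict[str, List[str]]:
    """Map each first-hop retweeter's user id to the ids of the users
    who retweeted that retweet."""
    hops: Dict[str, List[str]] = {}
    for retweet in fetch_retweets(tweet_id):
        first_hop_user = retweet["user"]["id_str"]
        # Every retweet is a status of its own; use id_str, not the numeric id.
        retweet_id = retweet["id_str"]
        hops[first_hop_user] = [
            rt["user"]["id_str"] for rt in fetch_retweets(retweet_id)
        ]
    return hops
```

Each first-hop retweet costs one extra request, so with up to 100 retweets per call this can use up to 101 requests per tweet. Keep the rate limits in mind.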

Related

How to do grouping in Elasticsearch with Searchkick (Rails)

I have article data indexed in Elasticsearch as follows.
{
"id": 1011,
"title": "abcd",
"author": "author1",
"status": "published"
}
Now I want to get all the article ids grouped by status.
The result should look something like this:
{
"published": [1011, 1012, ....],
"draft": [2011],
"deleted": [3011]
}
NB: I tried a normal aggregation (Article.search('*', aggs: [:status], load: false).aggs), but it only gives me the count for each status; I want the ids for each status instead.
@Crazy Cat
You can transform your query in this way:
Sort the response from ES (in ascending or descending order) on the "status" field.
Ask ES to return only the id and status fields.
Sorting arranges the response like this: [the first N results with "deleted" status, then results N+1 to M with "draft" status, and then results M+1 to K with "published" status].
Advantages of this technique:
You get the id of every document, tagged with its status, so you can apply further operations in your application.
Your query is lightweight compared to an aggs query.
You also get the metadata of every document, such as the docId of that document.
Disadvantages:
You always have to give a high upper bound for your page size, although you can also make use of the count returned in the metadata.
It might use a bit more network bandwidth, as it returns a redundant status in every document.
I hope this redesign of your query is helpful to you.
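The grouping step on the application side could look like the sketch below. It is written in Python for illustration only; it assumes the hits have already been reduced to plain dictionaries holding just "id" and "status", and group_ids_by_status is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def group_ids_by_status(hits: Iterable[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Collect document ids under their status, keeping the hit order."""
    grouped: Dict[str, List[Any]] = defaultdict(list)
    for hit in hits:
        grouped[hit["status"]].append(hit["id"])
    return dict(grouped)
```

Because the hits arrive sorted by status, the keys of the result come out in that same order, but the function also works on unsorted input.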

Etsy API: Can I get the number of active listings by shop id without paginating through all listing items

I want to know the total number of active listings for a shop id. Is there any such API available?
I could only find the API that returns paginated results for all the listings in a shop:
'/shops/:shop_id/listings/active'
I cannot set a limit of over 100 in this API, so to get the total count of all listings I would have to make a lot of requests if the shop has, say, several thousand listings. A simple API endpoint that returns the count of total active listings would be really helpful.
It's included in the standard response:
{
"count":integer,
"results": [
{ result object }
],
"params": { parameters },
"type":result type
}
Docs can be found here: https://www.etsy.com/developers/documentation/getting_started/api_basics#section_standard_response_format
Got it.
The response contains a count field which gives the exact count of the active listings.
100 is the highest limit you can set; you will need to use the "page" parameter to move to the next 100, and so on.
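As a rough sketch (in Python, with a hypothetical helper name), the count field from a single response is enough to work out how many pages you would need if you did want to fetch every listing:

```python
import math
from typing import Any, Dict


def pages_needed(response: Dict[str, Any], limit: int = 100) -> int:
    """Number of pages required to cover response["count"] results
    when each page holds at most `limit` of them."""
    return math.ceil(response["count"] / limit)
```

If you only need the total, reading response["count"] from the first page is sufficient and no further requests are required.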

Get who retweeted me using the Twitter API

I want to get the users who retweeted my tweets.
$tweets2 = $connection->get("https://api.twitter.com/1.1/statuses/retweets_of_me.json");
This gives me a list of my tweets that have been retweeted by others.
But it does not give me details about who retweeted them. Is there any way to do this?
Can I get these details using the tweet ID?
In version 1.0,
https://api.twitter.com/1/statuses/21947795900469248/retweeted_by.json
was available, but it is not present in version 2.
I tried this:
https://api.twitter.com/1.1/statuses/retweet/241259202004267009.json
but it does not return any response.
Any idea or help is appreciated.
The scenario is like this:
user1 retweets me 5 times and user2 retweets me 7 times, which means I had 12 retweets.
user1 has 500 followers and user2 has 100 followers, which means my retweet reach was 5x500 + 7x100 = 3200. So, on my webpage, I would like to see 12 retweets and a retweet reach of 3200.
Use this API, https://dev.twitter.com/docs/api/1.1/get/statuses/retweets_of_me, to get the IDs of the users.
Pass the comma-separated user_id or screen_name values to https://dev.twitter.com/docs/api/1.1/get/users/lookup to get information about the users.
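Once you have the retweet count per user and each user's followers_count, the reach calculation from the question is a weighted sum. A small Python sketch follows; the function name and the two input dictionaries are my own assumptions about how you would hold the data.

```python
from typing import Dict, Tuple


def retweet_stats(
    retweets_per_user: Dict[str, int],
    followers: Dict[str, int],
) -> Tuple[int, int]:
    """Return (total retweets, retweet reach), where reach is the sum of
    retweets * followers over all retweeting users."""
    total = sum(retweets_per_user.values())
    reach = sum(n * followers[user] for user, n in retweets_per_user.items())
    return total, reach
```

For the scenario in the question, retweet_stats({"user1": 5, "user2": 7}, {"user1": 500, "user2": 100}) gives (12, 3200).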

JIRA REST API 6.0.1 - listing all groups

I am trying to use the JIRA REST API [1] to list all the groups in JIRA. I am currently using JIRA version 6.0.1.
I tried /rest/api/2/groups/picker [2] in JIRA REST API 6.0.1 but could not find a way to specify the "query" parameter the way I need.
If I use a whole group name in the "query" parameter, I receive the correct group, like this.
Request 1:
GET /jira/rest/api/2/groups/picker?query=jira-users
Response 1:
{
"header": "Showing 1 of 1 matching groups",
"total": 1,
"groups": [ {
"name": "jira-users",
"html": "<b>jira-users<\/b>"
}]
}
But if I use part of the group name in the "query" parameter, it does not give the expected results.
Request 2:
GET /jira/rest/api/2/groups/picker?query=j
According to the method spec [2], I expect to receive all groups whose names contain "j", but I do not receive any results.
Response 2:
{
"header": "Showing 0 of 0 matching groups",
"total": 0,
"groups": []
}
Can anyone please let me know the right way to give parameters?
Thank you
[1] https://developer.atlassian.com/static/rest/jira/6.0.1.html
[2] https://developer.atlassian.com/static/rest/jira/6.0.1.html#id150432
We're using JIRA 6.0.7 and can do:
/rest/api/2/groups/picker?maxResults=10000
This shows all groups, up to a maximum of 10000 results. The response is important because it shows the total number of groups; if your maxResults value is too small to show all results, you may need to increase it:
{
"header":"Showing 5014 of 5014 matching groups",
"total":5014,
"groups":{
...
}
}
If you omit maxResults, it returns just the first 20 out of 5014. However, for us,
/rest/api/2/groups/picker?query=j
returns all groups containing the letter j. Maybe it wasn't properly implemented in your version. If you are unable to get the query parameter working properly, you could fetch all results and then do your own filtering by checking the name of each group object returned.
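The client-side fallback could be sketched like this in Python. It assumes the parsed response has the shape shown in the question, that is, "groups" is a list of objects with a "name" field. filter_groups is a hypothetical helper, and the case-insensitive match is my own choice.

```python
from typing import Any, Dict, List


def filter_groups(response: Dict[str, Any], needle: str) -> List[str]:
    """Return the names of all groups whose name contains `needle`
    (case-insensitive)."""
    needle = needle.lower()
    return [
        group["name"]
        for group in response["groups"]
        if needle in group["name"].lower()
    ]
```

You would call it on the response of the maxResults request, after checking that "total" does not exceed the number of groups actually returned.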

How to design HBase schema for Twitter data?

I have the following Twitter data and I want to design a schema for it. The queries I need to perform are the following:
get the tweet volume for a time interval, tweets with the corresponding user info, tweets with the corresponding topic info, etc. Based on the data below, can anyone tell me whether this schema design is correct: make the row key id+timestamp, use "user" as one column family, and group everything else into a primary column family? Any suggestions?
{
"created_at":"Tue Feb 19 11:16:34 +0000 2013",
"id":303825398179979265,
"id_str":"303825398179979265",
"text":"Unleashing Innovation Conference Kicks Off - Wall Street Journal (India) http:\/\/t.co\/3bkXJBz1",
"source":"\u003ca href=\"http:\/\/dlvr.it\" rel=\"nofollow\"\u003edlvr.it\u003c\/a\u003e",
"truncated":false,
"in_reply_to_status_id":null,
"in_reply_to_status_id_str":null,
"in_reply_to_user_id":null,
"in_reply_to_user_id_str":null,
"in_reply_to_screen_name":null,
"user":{
"id":948385189,
"id_str":"948385189",
"name":"Innovation Plaza",
"screen_name":"InnovationPlaza",
"location":"",
"url":"http:\/\/tinyurl.com\/ee4jiralp",
"description":"All the latest breaking news about Innovation",
"protected":false,
"followers_count":136,
"friends_count":1489,
"listed_count":1,
"created_at":"Wed Nov 14 19:49:18 +0000 2012",
"favourites_count":0,
"utc_offset":28800,
"time_zone":"Beijing",
"geo_enabled":false,
"verified":false,
"statuses_count":149,
"lang":"en",
"contributors_enabled":false,
"is_translator":false,
"profile_background_color":"131516",
"profile_background_image_url":"http:\/\/a0.twimg.com\/profile_background_images\/781710342\/17a75bf22d9fdad38eebc1c0cd441527.jpeg",
"profile_background_image_url_https":"https:\/\/si0.twimg.com\/profile_background_images\/781710342\/17a75bf22d9fdad38eebc1c0cd441527.jpeg",
"profile_background_tile":true,
"profile_image_url":"http:\/\/a0.twimg.com\/profile_images\/3205718892\/8126617ac6b7a0e80fe219327c573852_normal.jpeg",
"profile_image_url_https":"https:\/\/si0.twimg.com\/profile_images\/3205718892\/8126617ac6b7a0e80fe219327c573852_normal.jpeg",
"profile_link_color":"009999",
"profile_sidebar_border_color":"FFFFFF",
"profile_sidebar_fill_color":"EFEFEF",
"profile_text_color":"333333",
"profile_use_background_image":true,
"default_profile":false,
"default_profile_image":false,
"following":null,
"follow_request_sent":null,
"notifications":null
},
"geo":null,
"coordinates":null,
"place":null,
"contributors":null,
"retweet_count":0,
"entities":{
"hashtags":[
],
"urls":[
{
"url":"http:\/\/t.co\/3bkXJBz1",
"expanded_url":"http:\/\/dlvr.it\/2yyG5C",
"display_url":"dlvr.it\/2yyG5C",
"indices":[
73,
93
]
}
],
"user_mentions":[
]
},
"favorited":false,
"retweeted":false,
"possibly_sensitive":false
}
If you are 100% sure that the ID is unique, you could use it as your row key to store the bulk of the data:
303825398179979265 -> data_CF
Your column family data_CF would be defined along these lines:
"created_at":"Tue Feb 19 11:16:34 +0000 2013"
"id_str":"303825398179979265"
...
"user_id":948385189 { note that I'm denormalizing your dictionary here }
"user_name":"Innovation Plaza"
It gets a little trickier for the lists. The solution is to use a column name made of the category as a prefix, followed by something that makes it unique:
"entities_hashtags_":"\x00" { Here \x00 is a dummy value }
For the URLs, if the ordering is not important, you may prefix them with a UUID, which guarantees uniqueness.
The advantage of this approach is that if you need to update a field in this data, it will be done atomically, since HBase guarantees row-level atomicity.
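To make the denormalization concrete, here is a rough Python sketch that flattens a nested tweet dictionary into column names for a single column family. The exact naming layout (category, then "_", then the item or a UUID) is my assumption of one way to apply the idea above, and treating hashtags as plain strings is a simplification.

```python
import uuid
from typing import Any, Dict

DUMMY = b"\x00"  # placeholder value for columns whose name carries the data


def flatten(record: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts and lists into a single level of column names."""
    columns: Dict[str, Any] = {}
    for key, value in record.items():
        name = prefix + key
        if isinstance(value, dict):
            # Nested dictionary: "user" -> "user_id", "user_name", ...
            columns.update(flatten(value, name + "_"))
        elif isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    # List of objects: a UUID keeps each entry's columns unique.
                    columns.update(flatten(item, f"{name}_{uuid.uuid4().hex}_"))
                else:
                    # List of scalars: the item goes into the column name.
                    columns[f"{name}_{item}"] = DUMMY
        else:
            columns[name] = value
    return columns
```

The resulting dictionary would then be written as one row, so all the columns of a tweet are updated together.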
For the second question, if you need instantaneous aggregated information, you will have to precompute it and store it, as you said, in another table. If you want this data to be generated through M/R, you may use timestamp + row id as the row key if it is to be time-based. By topic, it would be something like topic + row id. This allows you to write prefix scans with start and stop rows that include only the time range or the topic you are interested in.
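A minimal Python sketch of the time-based row key, assuming a zero-padded epoch-seconds prefix so that byte order matches chronological order. The "_" separator and the function names are hypothetical; the date format is the one used by created_at in the sample tweet.

```python
from datetime import datetime
from typing import Tuple


def time_row_key(created_at: str, tweet_id: str) -> bytes:
    """Build a 'timestamp + row id' key from a tweet's created_at string."""
    ts = datetime.strptime(created_at, "%a %b %d %H:%M:%S %z %Y")
    # Fixed-width seconds keep lexicographic order equal to time order.
    return f"{int(ts.timestamp()):010d}_{tweet_id}".encode()


def scan_range(start: datetime, stop: datetime) -> Tuple[bytes, bytes]:
    """Start (inclusive) and stop (exclusive) rows for a time-interval scan."""
    return (
        f"{int(start.timestamp()):010d}".encode(),
        f"{int(stop.timestamp()):010d}".encode(),
    )
```

Counting the rows returned by a scan between the two bounds then gives the tweet volume for that interval. Pass timezone-aware datetimes to scan_range, otherwise the local timezone is used.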
Have fun!
