Twitter search API: paging, max_id, and next_page

What is the purpose of paging + next_page in the Twitter search API? They don't pivot around the data as one would expect.
I'm experimenting with the search API and noticed that the following query changes over time.
This URL was returned in the search API's "next_page" field:
http://search.twitter.com/search.json?page=3&max_id=192123600919216128&q=IndieFilmLove&rpp=100&include_entities=1
Hit refresh on a trending topic and you will notice that the page contents are not constant.
When iterating through all 15 pages of a trending topic, you run into duplicates in the first few items on each page.
It seems the page and next_page parameters are useless if you are aggregating data: page 1 will become page 3 within a few minutes on a trending topic, so you end up with duplicates in the first one to three items of each page as new data pushes the older results down.
The only way to avoid this seems to be NOT using the next_page or page parameter, as discussed here:
https://dev.twitter.com/discussions/3809
I pass the oldest ID from my existing result set as max_id, and I do not pass a page.
Which approach is better for aggregating data?
1. Use next_page, but skip statuses already processed in this run of 15 pages, or
2. use max_id only and skip already-processed statuses.
==============

In their Working with Timelines document at http://dev.twitter.com/docs/working-with-timelines, Twitter recommends cursoring with the max_id parameter in preference to attempting to step through a timeline page by page.
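A minimal sketch of that cursoring pattern, assuming a hypothetical fetch_page(query, max_id, count) helper that wraps the actual search call and returns tweets newest-first:

```python
def collect_all(fetch_page, query, per_page=100):
    """Aggregate a search timeline by cursoring on max_id instead of page numbers.

    fetch_page(query, max_id, count) is a hypothetical wrapper around the
    search call; it returns a list of tweet dicts (newest first), each with
    an integer "id".
    """
    tweets = []
    max_id = None  # no cap on the first request: start at the newest tweet
    while True:
        page = fetch_page(query, max_id=max_id, count=per_page)
        if not page:
            break
        tweets.extend(page)
        # Next request: everything strictly older than the oldest tweet seen,
        # so new tweets arriving mid-crawl cannot shift results between pages.
        max_id = page[-1]["id"] - 1
    return tweets
```

Because each request is anchored to the oldest ID already collected rather than to a page number, no skip-already-processed bookkeeping is needed.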


Google Sheets - Scrape table involved with pagination

I'm trying to find a workaround with Google Sheets. I'm pulling data from finviz.com to build out custom stock screeners, but the site uses pagination, so only the first 20 rows are returned. I've checked that if I click the 2nd page in the pagination section of the table, only the URL changes, indicating the first row of the new page. So if my first result page has 20 rows, the second result page's URL carries a parameter like "r=21", indicating the first row of the second page of results. How would I go about pulling all the data when the table is paginated? Also, checking the page source, these parameters are stored in hrefs: if our pagination had 3 pages of results, then within the <table> element we would see the new URLs in hrefs, for example:
<table>
<a href="screener.ashx?v=111&f=targetprice_a5&r=21"/>
<a href="screener.ashx?v=111&f=targetprice_a5&r=41"/>
<a href="screener.ashx?v=111&f=targetprice_a5&r=61"/>
</table>
Take note that only one new parameter, "r=21", is added to the URL; the rest are consistent across result pages.
Is this even possible with Google Sheets?
Here's what I have. The goal is to build stock market screeners that update every 3 minutes, allowing an integrated view from Notion.
=QUERY(IMPORTHTML("https://finviz.com/screener.ashx?v=111&f=cap_smallover,earningsdate_thismonth,fa_epsqoq_o15,fa_grossmargin_o20,sh_avgvol_o750,sh_curvol_o1000,ta_perf_52w10o,ta_rsi_nob50&ft=4&o=perfytd&ar=180","Table","19"),"SELECT Col1,Col2,Col7,Col8,Col9,Col10,Col11")
Try:
=QUERY({
IMPORTHTML("https://finviz.com/screener.ashx?v=111&f=cap_smallover,earningsdate_thismonth,fa_epsqoq_o15,fa_grossmargin_o20,sh_avgvol_o750,sh_curvol_o1000,ta_perf_52w10o,ta_rsi_nob50&ft=4&o=perfytd&ar=180","Table","19");
IMPORTHTML("https://finviz.com/screener.ashx?v=111&f=cap_smallover,earningsdate_thismonth,fa_epsqoq_o15,fa_grossmargin_o20,sh_avgvol_o750,sh_curvol_o1000,ta_perf_52w10o,ta_rsi_nob50&ft=4&o=perfytd&r=21&ar=180","Table","19");
IMPORTHTML("https://finviz.com/screener.ashx?v=111&f=cap_smallover,earningsdate_thismonth,fa_epsqoq_o15,fa_grossmargin_o20,sh_avgvol_o750,sh_curvol_o1000,ta_perf_52w10o,ta_rsi_nob50&ft=4&o=perfytd&r=41&ar=180","Table","19");
IMPORTHTML("https://finviz.com/screener.ashx?v=111&f=cap_smallover,earningsdate_thismonth,fa_epsqoq_o15,fa_grossmargin_o20,sh_avgvol_o750,sh_curvol_o1000,ta_perf_52w10o,ta_rsi_nob50&ft=4&o=perfytd&r=61&ar=180","Table","19")},
"select Col1,Col2,Col7,Col8,Col9,Col10,Col11 where Col1 matches '\d+'", 1)
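The same idea in general form: the only thing that changes between the stacked IMPORTHTML calls is the r= offset, so the per-page URLs can be generated from the row offsets. A sketch (assuming 20 rows per page and a known page count):

```python
def finviz_page_urls(base_url, pages, rows_per_page=20):
    """Build the URL for each result page by varying only the r= offset.

    Page 1 has no offset; page n starts at row (n-1)*rows_per_page + 1,
    matching the r=21, r=41, r=61 links seen in the pagination hrefs.
    """
    urls = [base_url]  # first page: no r= parameter
    for page in range(2, pages + 1):
        offset = (page - 1) * rows_per_page + 1
        urls.append(f"{base_url}&r={offset}")
    return urls
```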

Max-results value is too high - YouTube API

I'm trying to load the first 100 videos of a YouTube channel, but I can only load 50. I always get this error message:
Max-results value is too high. Only up to 50 results can be returned per query.
I'm using this URL:
http://gdata.youtube.com/feeds/api/users/shaytards/uploads?&max-results=100
I'm not sure whether I need some kind of developer key to load 100 results. Even a way to load just videos 75-100 would be great, but any help would be appreciated.
The YouTube Data API requires you to implement your own pagination if you need more than 50 results. In this case, you would set max-results to 50 and run the query, then have some sort of "next" button on your page (or however you want to implement your pagination) trigger a new call (via Ajax, a new page, or whatever your workflow is). The new call goes to the same data feed, but appends the parameter &start-index=51, so you get 50 more results, numbered 51-100. You can continue like this up to 1000 results.
See https://developers.google.com/youtube/2.0/reference#Paging_through_Results for more details.
In the YouTube Data API V3 it is a little different. Each search query returns a "nextPageToken" that can be used in a subsequent query for more results. Documentation: https://developers.google.com/youtube/v3/guides/implementation/pagination
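The v3 token-following loop can be sketched like this, with the HTTP request abstracted behind a hypothetical fetch(params) function so only the pagination logic is shown:

```python
def fetch_all_items(fetch, base_params, max_items=100):
    """Follow nextPageToken links until max_items results are collected.

    fetch(params) is a hypothetical wrapper around the HTTP request to a
    v3 list endpoint; it returns the decoded JSON response, which contains
    "items" and, while more pages remain, a "nextPageToken" string.
    """
    items = []
    params = dict(base_params)  # e.g. part, playlistId, maxResults=50, key
    while len(items) < max_items:
        response = fetch(params)
        items.extend(response.get("items", []))
        token = response.get("nextPageToken")
        if not token:
            break  # last page reached
        params["pageToken"] = token  # request the next page
    return items[:max_items]
```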

JQGrid Pagination Issue

I am working on jqGrid pagination and am stuck on a very basic problem that is really irritating me.
There are two main aspects to what I am doing:
1. Server-side pagination for server-side data.
2. Client-side pagination for server-side searching.
In the first case I fetch 50 records for every pager button (next, previous, last, first), and if the user enters a page number, the correct service call likewise fetches exactly 50 records and sets the data. Also, per my requirements, jqGrid shows the total record count on the server at the bottom right of the grid even though the grid currently holds only 50 rows, and the total page count updates accordingly. This also works properly.
The actual conflict is here: if I search with some criteria, the service returns the whole result set for the search, say 300 records, in a single service call, so I want client-side pagination for it. I am able to set the 300 records and the page number, but the "View {} to {}" text and the page number in the textbox at the center of the pager do not get updated.
Is there any way to reset the page textbox and the "View {} to {}" text to their default values?
Please help.
It seems that you post wrong values in the total, page, and records fields of the server response. I suppose that you switched the values of records and total: total should be the total number of pages in the dataset, while records should be the number of rows (records, items) in the dataset. You should either adjust your server code or specify a jsonReader that correctly gets (calculates) total, page, and records from the data returned by the server.
I recommend reading the answer to understand why jqGrid sends additional parameters to the server and why the server is required to return total, page, and records.
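The expected envelope can be sketched as a small server-side helper (hypothetical; the field names total, page, records, and rows are the ones jqGrid reads by default):

```python
import math

def jqgrid_response(rows, page, rows_per_page, total_records):
    """Build the paging envelope jqGrid expects by default.

    Note the easy mix-up: "total" is the PAGE count, "records" is the
    ROW count for the whole dataset, not the other way around.
    """
    return {
        "page": page,                                       # current page
        "total": math.ceil(total_records / rows_per_page),  # total pages
        "records": total_records,                           # total rows
        "rows": rows,                                       # this page's rows
    }
```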

Is youtube data api paging consistent if you use pagetokens? (v3 data api)

The YouTube V3 Data API provides "page tokens" for paging result sets from search endpoints.
Is the dataset being viewed consistent if you make use of the page tokens?
e.g. if there is an ordering on view count, which is likely to change with time, then is the ordering over the total result set guaranteed across multiple queries, so the lowest result on the first page is guaranteed to be ordered greater than the highest result on the following page?
The ordering is not guaranteed when using page tokens. (Nor was it guaranteed when using start-index in v1/v2.) If the backend data source's ordering changes between fetching one page and the next, retrieving the next page will reflect the updated ordering.

Delay and Inconsistent results using Twitter search API when using "since_id" parameter

We've noticed what seems to be a delay and/or inconsistent results from the Twitter Search API when specifying since_id in the parameters. For example:
http://search.twitter.com/search?ors=%23b4esummit+#b4esummit+b4esummit&q=&result_type=recent&rpp=100&show_user=true&since_id=
Will give the most recent Tweets, but:
http://search.twitter.com/search?ors=%23b4esummit+#b4esummit+b4esummit&q=&result_type=recent&rpp=100&show_user=true&since_id= 12642940173
will often not return tweets posted after that ID for several hours (even though they're visible in the first query).
Has anyone had similar problems?
First off, those are not valid Twitter search API URLs. You should be querying the API like this:
http://search.twitter.com/search.json?q=%23b4esummit%20OR%20#b4esummit%20OR%20b4esummit&result_type=recent&rpp=100&show_user=true
Second, since_id cuts off from the bottom of the list. You can see the behavior illustrated in this documentation: https://dev.twitter.com/docs/working-with-timelines
For example, at the time of this writing, the above URL returns 31 entries. Picking the ID of a tweet in the middle of that list, I constructed:
http://search.twitter.com/search.json?q=%23b4esummit%20OR%20#b4esummit%20OR%20b4esummit&result_type=recent&rpp=100&show_user=true&since_id=178065448397574144
which returns only 12 entries, matching the top 12 entries of the first URL.
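That cut-off behavior can be sketched as a simple filter (assuming integer tweet IDs that increase over time, newest-first results):

```python
def apply_since_id(tweets, since_id):
    """Keep only tweets strictly newer than since_id.

    since_id cuts off from the bottom of the (newest-first) list: every
    tweet with an ID at or below it is dropped, so a since_id taken from
    the middle of a result set returns just the newer top half.
    """
    if since_id is None:
        return list(tweets)
    return [t for t in tweets if t["id"] > since_id]
```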
