YQL, returning only 100 values. Can I get more? - yql

I'm using YQL with JSON in order to retrieve a Twitter search. It only returns 100 values. Any chance to get more than that?

Doesn't look good, friend: "The maximum number of results that can be returned by a YQL query on this table is 100, which is defined by the attribute max."
From: http://developer.yahoo.com/yql/guide/yql-tutorials.html

The maximum number of items returned by a SELECT statement with YQL is 5,000. If the table in the query does not give enough results by default (assuming there are more available), you can ask for more results by using a remote limit.
select * from twitter.search(250) where q="lol"
For more details, see Paging and Table Limits in the YQL Guide.
Be aware that many data providers rate limit queries against their services; Twitter certainly does.

Related

Counting ALL rows in Dynamics CRM Online web api (ODATA)

Is it possible to count all rows in a given entity, bypassing the 5000 row limit and bypassing the pagesize limit?
I do not want to return more than 5000 rows in one request, but only want the count of all the rows in that given entity.
According to Microsoft, you cannot do it in the request URI:
The count value does not represent the total number of entities in the system.
It is limited by the maximum number of entities that can be returned.
I have tried this:
GET [Organization URI]/api/data/v9.0/accounts/?$count=true
Any other way?
Use function RetrieveTotalRecordCount:
If you want to retrieve the total number of records for an entity beyond 5000, use the RetrieveTotalRecordCount Function.
Your query will look like this:
https://<your api url>/RetrieveTotalRecordCount(EntityNames=['accounts'])
Update:
Latest release v9.1 has the direct function to achieve this - RetrieveTotalRecordCount
————————————————————————————
Unfortunately, we have to pick one of these routes to get the record count, depending on how the expected number of records compares with the limits.
1. If less than 5000, use this: (You already tried this)
GET [Organization URI]/api/data/v9.0/accounts/?$count=true
2. If less than 50,000, use this:
GET [Organization URI]/api/data/v8.2/accounts?fetchXml=[URI-encoded FetchXML query]
Exceeding the limit returns the error: AggregateQueryRecordLimit exceeded. Cannot perform this operation.
Sample query:
<fetch version="1.0" mapping="logical" aggregate="true">
<entity name="account">
<attribute name="accountid" aggregate="count" alias="count" />
</entity>
</fetch>
Do a browser address bar test with URI:
[Organization URI]/api/data/v8.2/accounts?fetchXml=%3Cfetch%20version=%221.0%22%20mapping=%22logical%22%20aggregate=%22true%22%3E%3Centity%20name=%22account%22%3E%3Cattribute%20name=%22accountid%22%20aggregate=%22count%22%20alias=%22count%22%20/%3E%3C/entity%3E%3C/fetch%3E
The only way to get around this is to partition the dataset based on some property so that you get smaller subsets of records to aggregate individually.
3. The last resort is iterating through @odata.nextLink and counting the records in each page with a counter variable in code (code example to query the next page)
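Route 3 above can be sketched roughly as follows. This is a minimal sketch in Python, not the answerer's code: fetch_page is a hypothetical callable that performs the authenticated GET and returns the parsed JSON body, and the sketch assumes each page carries its records in the "value" array and the link to the next page in "@odata.nextLink".

```python
from typing import Callable, Optional


def count_all_records(first_url: str, fetch_page: Callable[[str], dict]) -> int:
    """Follow @odata.nextLink from page to page and sum the page sizes."""
    total = 0
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url)  # hypothetical: one parsed JSON response
        total += len(page.get("value", []))
        url = page.get("@odata.nextLink")  # missing on the last page, which ends the loop
    return total
```

Passing the fetcher in as a parameter keeps the counting logic separate from authentication and HTTP details, which differ between environments.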
The XrmToolBox has a counting tool that can help with this.
Also, we here at MetaTools Inc. have just released an online tool called AggX that runs aggregates on any number of records in a Dynamics 365 Online org, and it's free during the beta release.
You may try OData's $inlinecount query option.
Adding only $inlinecount=allpages in the querystring will return all records, so add $top=1 in the URI to fetch only one record along with count of all records.
Your URL will look like /accounts/?$inlinecount=allpages&$top=1
For example, the response XML will contain the count as <m:count>11</m:count>
Note: This query option is only supported in OData version 2.0 and above.
This works:
[Organization URI]/api/data/v8.2/accounts?$count

Is there any limit to the number of rows returned by API?

I am making a bulk call with 30 posts and daily data for all of them. Are there any limits to the number of rows that will be returned by the API?
I am having a problem getting the results.
Can anyone please help?
YouTube doesn't return any rows ... it's not relational data. That may sound like a pedantic thing to point out, but it's crucial for this next point; the API will return 50 videos at a time, along with tokens to get more results based on the same query, up to a total of 500 ... because the data isn't relational, you can't just "select all rows" that match a certain criteria. Rather, it is probabilistically determining relevance to your search parameters, and after about 500 results the algorithms don't have enough certainty to make additional results relevant.
So in your case, where you can change the date as needed (to allow the algorithms to be more specific), you'll want to do a series of calls; perhaps one at a time (since you have to paginate anyway to get more than 50 results, it's probably not that much more expensive in terms of network bandwidth).
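One way to set up such a series of calls is to split the overall date range into smaller windows and issue one request per window. The sketch below only generates the windows; date_windows is a hypothetical helper, and how each window is passed to the API is left out because it depends on the endpoint being called.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def date_windows(start: datetime, end: datetime,
                 step: timedelta) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive (from, to) windows that cover [start, end)."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)  # clamp the last window to the end date
        yield cursor, upper
        cursor = upper


for lower, upper in date_windows(datetime(2020, 1, 1), datetime(2020, 1, 4),
                                 timedelta(days=1)):
    print(lower.date(), "to", upper.date())
```

Smaller windows mean more requests but fewer results per request, so each one is less likely to run into the result cap.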

Why limited number of next page tokens?

Through a script I can collect the sequence of videos that search list returns. The maxresults variable was set to 50. The total number of items is large, but there are not enough next page tokens to retrieve all the desired results. Is there any way to retrieve all the returned items, or is this a YouTube restriction?
Thank you.
No, retrieving the results of a search is limited in size.
The total number of results that you are allowed to retrieve seems to have been reduced to 500 (in the past it was limited to 1000). The API does not allow you to retrieve more from a single query. To try to get more, use a number of queries with different parameters, such as publishedAfter, publishedBefore, order, type, videoCategoryId, or vary the query tags, and keep track of which distinct video IDs are returned.
See for a reference:
https://code.google.com/p/gdata-issues/issues/detail?id=4282
BTW, "totalResults" is an estimate and its value can change on the next page call.
See: YouTube API v3 totalResults field is returning 1 000 000 when it shoudn't
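Keeping track of distinct video IDs across several differently parameterised queries could look roughly like this. It is a sketch under assumptions: search is a hypothetical callable that runs one query with the given parameters and returns the video IDs it found, and the parameter values in the usage lines are placeholders.

```python
from typing import Callable, Dict, Iterable, List


def collect_unique_ids(param_sets: Iterable[Dict[str, str]],
                       search: Callable[[Dict[str, str]], List[str]]) -> List[str]:
    """Run one search per parameter set and keep each video ID only once."""
    seen = set()
    ordered: List[str] = []
    for params in param_sets:
        for video_id in search(params):  # hypothetical: one (paged) query
            if video_id not in seen:
                seen.add(video_id)
                ordered.append(video_id)  # preserve first-seen order
    return ordered


# Stand-in for real API calls, for illustration only.
fake_results = {"date": ["a", "b", "c"], "viewCount": ["b", "c", "d"]}
ids = collect_unique_ids([{"order": "date"}, {"order": "viewCount"}],
                         lambda p: fake_results[p["order"]])
```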

Is there a way to tell if an activerecord query hit its limit

given the following query:
Cars.where(color: 'red').limit(5)
Is there a way to tell if the limit was hit? Say there are 6 red cars: do I have to do a separate query to count the total number of red cars?
Basically, I am trying to send a message to the user letting them know that a search was limited due to reaching max number of results allowed.
The only way to combine it into one query is to do a limit(6). If the size is 6, then remove the last element and record that there are more results.
Alternatively, do a separate query Cars.where(color: 'red').count. Although this will do a separate SQL query, count queries are sometimes very fast for databases.
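The limit-plus-one idea is not specific to ActiveRecord. Here is a minimal, language-neutral sketch of it in Python rather than Ruby; fetch is a hypothetical callable standing in for "run the query with this limit".

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def fetch_with_overflow_flag(fetch: Callable[[int], List[T]],
                             limit: int) -> Tuple[List[T], bool]:
    """Ask for one row more than needed to learn whether more rows exist."""
    rows = fetch(limit + 1)       # hypothetical query call with LIMIT limit + 1
    has_more = len(rows) > limit  # the extra row only signals truncation
    return rows[:limit], has_more


# Six matching rows, limit of five: the caller gets five rows and has_more is True.
six_cars = ["car1", "car2", "car3", "car4", "car5", "car6"]
shown, has_more = fetch_with_overflow_flag(lambda n: six_cars[:n], 5)
```

This tells you only that the result was truncated, not by how much; the exact total still needs a count query.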
The direct answer to your question is that you cannot get this information back in one single query. You would have to run two queries: one to return the limited set, and one to find the total count.
The only time this wouldn't be necessary is when you are querying the first page of results (the offset is 0) and the number of returned results is less than the limited value (e.g., you set the limit to 5 but get 4 results back).
Sample code would look like this:
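limit = 5
cars_scoped = Cars.where(color: 'red')
# Load the limited page of results.
cars = cars_scoped.limit(limit)
# A short page already gives the total; only a full page needs the COUNT query.
cars_count = cars.length < limit ? cars.length : cars_scoped.count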

Twitter search API results

I'm using the Twitter API atom format
http://search.twitter.com/search.atom?q=Name&:)&since:year-month-date&rpp=1500
but it's only returning 100 tweets. I tried using the JSON format as well, but it also returned only 100 results. Is there anything I'm doing wrong that limits me to 100 results?
Yes, you're limited on the number of results per page. In order to get more results, you have to use the page parameter like so:
http://search.twitter.com/search.atom?q=Name&:)&since:year-month-date&rpp=1500&page=2
EDIT
rpp: the number of tweets to return per page, up to a max of 100. E.g., http://search.twitter.com/search.atom?lang=en&q=devo&rpp=15
page: the page number to return, up to a max of roughly 1500 results (based on rpp * page)
Source: http://search.twitter.com/api/
In other words your rpp won't work as you expect because the max is 100.
My suggestion:
Make a request to the API and retrieve 100 results at a time.
Use a loop to check whether the result count is 100.
If it is, make a new request for page 2.
Test again and keep checking the number of items until the result set is smaller than 100.
The Twitter Search API has changed, including in the naming of the parameters: for instance, rpp is now count and the page parameter was removed in favor of max_id, a parameter based on a timeline concept:
"To use max_id correctly, an application’s first request to a timeline endpoint should only specify a count. When processing this and subsequent responses, keep track of the lowest ID received. This ID should be passed as the value of the max_id parameter for the next request, which will only return Tweets with IDs lower than or equal to the value of the max_id parameter."
https://developer.twitter.com/en/docs/tweets/timelines/guides/working-with-timelines
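The max_id loop described in the quote can be sketched as follows. This is an illustration under assumptions, not an official client: fetch is a hypothetical callable that performs one search request with the given count and max_id and returns a list of tweets as dicts with an integer "id" key. Because max_id is inclusive according to the quoted text, the sketch passes the lowest ID minus one so that the last tweet of a batch is not returned twice; that adjustment and the max_requests cap are choices made here.

```python
from typing import Callable, List, Optional


def fetch_all_tweets(fetch: Callable[[int, Optional[int]], List[dict]],
                     count: int = 100, max_requests: int = 10) -> List[dict]:
    """Page backwards through results by lowering max_id after every batch."""
    tweets: List[dict] = []
    max_id: Optional[int] = None  # the first request specifies only a count
    for _ in range(max_requests):  # cap the loop to stay within rate limits
        batch = fetch(count, max_id)
        if not batch:
            break
        tweets.extend(batch)
        lowest = min(tweet["id"] for tweet in batch)
        max_id = lowest - 1  # max_id is inclusive, so step just below it
    return tweets
```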
The updated link to the Twitter search api is:
https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets.html
Remember that not all tweets are indexed and if you are using the non-commercial version, you are limited to a 7-day search.

Resources