Watson Discovery News - filter query to show a maximum of 1 result per domain

Let's say I run a query for the company Dell, something like:
{WATSON_URL}/query?
version=2016-11-07&
query=entities.text%3ADell%2Centities.type%3ACompany%2Clanguage%3Aenglish&
count=&
offset=&
aggregation=&
filter=blekko.hostrank%3E200&
return=blekko.clean_title%2Centities.text%2Centities.type%2Cblekko.basedomain%2Cblekko.hostrank
All but one of the results are from the same publisher (Yahoo). Is there a way to filter the results so that only a single result can come from a given publisher?
I can use an aggregation and still get an article with the 'top_hits' command
term(blekko.basedomain).top_hits(1)
This aggregation gives me roughly what I want, but I can't choose which fields come back the way I could if the hits appeared in the 'results' section.
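One possible workaround, if no server-side option turns up, is to request more results than needed and keep only the first hit per domain on the client. The sketch below is a minimal illustration, not a Discovery feature: `first_per_domain` is a hypothetical helper, and it assumes each entry in `results` is a dictionary with a nested `blekko.basedomain` value, which may not match the real response shape exactly.

```python
def first_per_domain(results, fields):
    """Keep the first result seen for each blekko.basedomain, trimmed to `fields`."""
    seen = set()
    kept = []
    for doc in results:
        # Assumed shape: {"blekko": {"basedomain": "..."}, ...}
        domain = doc.get("blekko", {}).get("basedomain")
        if domain in seen:
            continue
        seen.add(domain)
        # Copy only the requested top-level fields that are present.
        kept.append({name: doc[name] for name in fields if name in doc})
    return kept


# Hypothetical sample shaped like the `results` array of a query response.
sample = [
    {"blekko": {"basedomain": "yahoo.com", "clean_title": "Dell earnings"}, "score": 3.1},
    {"blekko": {"basedomain": "yahoo.com", "clean_title": "Dell laptops"}, "score": 2.9},
    {"blekko": {"basedomain": "zdnet.com", "clean_title": "Dell servers"}, "score": 2.5},
]
deduped = first_per_domain(sample, fields=["blekko"])
print([d["blekko"]["basedomain"] for d in deduped])
```

Because results are already ordered by relevance, keeping the first hit per domain preserves the best-ranked article from each publisher. The page size has to be large enough that enough distinct domains survive the deduplication.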

Related

Filter GetRows on GoogleSheet Document via Logic Apps

I am reading from a Google Sheet of 100,000+ records, but I want to load only the records after a certain date (so applying a filter). I haven't been able to accomplish that because I do not know how to access the column name without using a For each loop.
Here is what the data looks like in the Google Sheet:
So basically, I would like to filter the records something like this.
Timestamp ge '10/13/2021' so it will only return records for 10/13/2021 and 10/14/2021... etc.
Is it possible to do that? If not, what is the recommended way to approach this, as I just want to load daily records into a SQL database in the next step?
The Get rows action of the Google Sheets connector doesn't support filtering. You can only specify how many rows you want returned, and how many rows you want to skip. E.g. if you have 100,000 rows in your sheet, you can easily get the rows between 90,001 and 90,200, should you wish to do so.
While you can't use this connector to retrieve filtered data from Google Sheets, you can use the Filter array action to filter the retrieved data as you wish.
You might still need to use the For each loop to retrieve and filter data in chunks.
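The chunked retrieve-then-filter logic described above can be sketched outside Logic Apps as follows. This is only an illustration of the idea: `get_rows` is a hypothetical stand-in for the Get rows action with its skip and top settings, and the code assumes the Timestamp column starts with a date in month/day/year form.

```python
from datetime import date, datetime


def rows_on_or_after(rows, cutoff, column="Timestamp"):
    """Return the rows whose date in `column` is on or after `cutoff`."""
    kept = []
    for row in rows:
        # Compare parsed dates, not strings: as text, "9/1/2021" sorts
        # after "10/13/2021" even though it is the earlier date.
        day = datetime.strptime(row[column].split()[0], "%m/%d/%Y").date()
        if day >= cutoff:
            kept.append(row)
    return kept


def load_recent(get_rows, cutoff, page_size=200):
    """Page through the sheet with skip/top and keep the recent rows."""
    recent, skip = [], 0
    while True:
        page = get_rows(skip, page_size)
        if not page:
            break
        recent.extend(rows_on_or_after(page, cutoff))
        skip += len(page)
    return recent


# Hypothetical sheet contents; the real column layout may differ.
sheet = [
    {"Timestamp": "9/1/2021 08:00:00", "Value": "a"},
    {"Timestamp": "10/13/2021 09:30:00", "Value": "b"},
    {"Timestamp": "10/14/2021 10:00:00", "Value": "c"},
]
recent = load_recent(lambda skip, top: sheet[skip:skip + top], date(2021, 10, 13), page_size=2)
print([row["Value"] for row in recent])
```

If new rows are only ever appended, remembering the last `skip` value between runs would avoid re-reading the whole sheet each day.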

Combined Select and Filter MS-Graph query parameters not working as expected for signInActivity/lastSignInDateTime

Query: https://graph.microsoft.com/beta/users?$select=id,displayName,signInActivity&$filter=signInActivity/lastSignInDateTime le 2020-03-01T00:00:00Z
I am trying to query for users based on "lastSignInDateTime". When I do this, the response gives all the properties for every user returned. I then try to reduce this response by adding a "select" parameter to limit the properties returned, but it seems to have no effect. Is it possible to combine the "filter" and "select" query parameters?
We have a bug in collection enumeration on that beta endpoint. It is due to be fixed within the next couple of months. As a workaround, you can export your dataset into a data structure and filter it in memory (preferred), or you can query specific users (expensive and not recommended).
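A minimal sketch of the in-memory workaround, assuming the users have already been exported into a list of dictionaries shaped like the Graph response. The helper names are hypothetical, and the timestamp parsing assumes whole-second UTC values ending in "Z".

```python
from datetime import datetime, timezone


def parse_graph_time(value):
    """Parse a UTC timestamp such as 2020-01-15T10:00:00Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def inactive_users(users, cutoff, fields=("id", "displayName")):
    """Filter by lastSignInDateTime <= cutoff, then keep only `fields`."""
    picked = []
    for user in users:
        last = (user.get("signInActivity") or {}).get("lastSignInDateTime")
        # Users without a recorded sign-in are skipped in this sketch.
        if last and parse_graph_time(last) <= cutoff:
            picked.append({name: user.get(name) for name in fields})
    return picked


# Hypothetical exported data.
users = [
    {"id": "1", "displayName": "Ann", "mail": "ann@example.com",
     "signInActivity": {"lastSignInDateTime": "2020-01-15T10:00:00Z"}},
    {"id": "2", "displayName": "Bob", "mail": "bob@example.com",
     "signInActivity": {"lastSignInDateTime": "2020-06-01T08:30:00Z"}},
    {"id": "3", "displayName": "Cy", "mail": "cy@example.com"},
]
cutoff = datetime(2020, 3, 1, tzinfo=timezone.utc)
print(inactive_users(users, cutoff))
```

This reproduces both halves of the original query on the client: the `le 2020-03-01T00:00:00Z` comparison and the reduction to `id` and `displayName`.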

Counting ALL rows in Dynamics CRM Online web api (ODATA)

Is it possible to count all rows in a given entity, bypassing the 5000 row limit and bypassing the pagesize limit?
I do not want to return more than 5000 rows in one request, but only want the count of all the rows in that given entity.
According to Microsoft, you cannot do it in the request URI:
The count value does not represent the total number of entities in the system.
It is limited by the maximum number of entities that can be returned.
I have tried this:
GET [Organization URI]/api/data/v9.0/accounts/?$count=true
Any other way?
Use function RetrieveTotalRecordCount:
If you want to retrieve the total number of records for an entity beyond 5000, use the RetrieveTotalRecordCount Function.
Your query will look like this:
https://<your api url>/RetrieveTotalRecordCount(EntityNames=['accounts'])
Update:
Latest release v9.1 has the direct function to achieve this - RetrieveTotalRecordCount
————————————————————————————
Unfortunately, we have to pick one of these routes to get the record count, depending on how many records we expect relative to the limits.
1. If less than 5000, use this: (You already tried this)
GET [Organization URI]/api/data/v9.0/accounts/?$count=true
2. Less than 50,000, use this:
GET [Organization URI]/api/data/v8.2/accounts?fetchXml=[URI-encoded FetchXML query]
Exceeding the limit returns the error: AggregateQueryRecordLimit exceeded. Cannot perform this operation.
Sample query:
<fetch version="1.0" mapping="logical" aggregate="true">
<entity name="account">
<attribute name="accountid" aggregate="count" alias="count" />
</entity>
</fetch>
Do a browser address bar test with URI:
[Organization URI]/api/data/v8.2/accounts?fetchXml=%3Cfetch%20version=%221.0%22%20mapping=%22logical%22%20aggregate=%22true%22%3E%3Centity%20name=%22account%22%3E%3Cattribute%20name=%22accountid%22%20aggregate=%22count%22%20alias=%22count%22%20/%3E%3C/entity%3E%3C/fetch%3E
The only way to get around this is to partition the dataset based on some property so that you get smaller subsets of records to aggregate individually.
3. The last resort is iterating through @odata.nextLink and counting the records in each page with a code variable (code example to query the next page).
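The paging approach in the last option can be sketched as below. This is an illustration rather than a ready-made client: `fetch_page` is a hypothetical callable that performs the authenticated GET and returns the parsed JSON body, and the URLs in the sample are placeholders.

```python
def count_all(first_url, fetch_page):
    """Follow @odata.nextLink from page to page and sum the page sizes."""
    total, url = 0, first_url
    while url:
        page = fetch_page(url)
        total += len(page.get("value", []))
        # The last page has no @odata.nextLink, which ends the loop.
        url = page.get("@odata.nextLink")
    return total


# Hypothetical responses keyed by URL, standing in for real HTTP calls.
pages = {
    "accounts?$select=accountid": {
        "value": [{"accountid": "a"}, {"accountid": "b"}],
        "@odata.nextLink": "accounts?page=2",
    },
    "accounts?page=2": {"value": [{"accountid": "c"}]},
}
total = count_all("accounts?$select=accountid", pages.__getitem__)
print(total)
```

Selecting a single small column such as the primary key keeps each page light, since only the number of rows matters here.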
The XrmToolBox has a counting tool that can help with this.
Also, we here at MetaTools Inc. have just released an online tool called AggX that runs aggregates on any number of records in a Dynamics 365 Online org, and it's free during the beta release.
You may try OData's $inlinecount query option.
Adding only $inlinecount=allpages in the querystring will return all records, so add $top=1 in the URI to fetch only one record along with count of all records.
Your URL will look like /accounts/?$inlinecount=allpages&$top=1
For example, the response XML will contain the count as <m:count>11</m:count>
Note: This query option is only supported in OData version 2.0 and above.
This works:
[Organization URI]/api/data/v8.2/accounts?$count

Microsoft Graph filter vs $filter

I am testing filtering using Microsoft Graph Explorer. I noticed odd behavior that I cannot figure out.
Using endpoint https://graph.microsoft.com/v1.0/me/events?filter=start/dateTime%20ge%20%272018-04-01%27 I get properly filtered data back.
However, using documented $ prefix, https://graph.microsoft.com/v1.0/me/events?$filter=start/dateTime%20ge%20%272018-04-01%27, I get nothing. There is no error, just no data coming back.
How do I query the data using the $filter?
You're not actually getting the results you think you are. When Microsoft Graph sees a query parameter it doesn't expect, it simply ignores it.
When you call /events?filter=start/dateTime ge '2018-04-01' it is simply ignoring the unknown filter parameter and returning you an unfiltered result.
When you call /events?$filter=start/dateTime ge '2018-04-01', it is filtering out anything prior to April 1, 2018. If there are no events with a start after this date, you will get an empty array as a result.
I assume you're using the default dataset included with Graph Explorer? The default Graph Explorer data set's most recent event is 2017-11-16T08:00:00.0000000.
The reason you see results from the /calendarView endpoint but not the /events endpoint is that /events only returns single-instance meetings and series masters, while /calendarView shows everything within a date range. In order to avoid having to maintain a dataset with updated events, the demo data relies on a handful of recurring event entries.
Since /events does not return individual occurrences of a meeting, you don't see any results from your query.
If you try this query, you'll see actual results:
https://graph.microsoft.com/v1.0/me/events?$filter=start/dateTime ge '2017-04-01'
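When building such a URL in code, it helps to let a library do the encoding so the `$` prefix and the quotes survive intact. The following is a small sketch using only the Python standard library; `events_url` is a hypothetical helper name, and authentication is left out.

```python
from urllib.parse import quote, urlencode

BASE = "https://graph.microsoft.com/v1.0/me/events"


def events_url(filter_expr):
    """Build an /events URL carrying the expression in the $filter parameter."""
    # Keep "$", "/" and "'" readable; spaces become %20.
    query = urlencode({"$filter": filter_expr}, quote_via=quote, safe="$/'")
    return f"{BASE}?{query}"


url = events_url("start/dateTime ge '2017-04-01'")
print(url)
```

Generating the parameter name in one place also avoids the silent failure described above, where a misspelled `filter` is ignored and unfiltered data comes back.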

How to build rails analytics dashboard

I'm looking to build an analytics dashboard for my data in a rails application.
Let's say I have a list of request types "Fizz", "Buzz", "Bang", "Bar".
I want to display a count for each day based on type.
How should I do this?
Here is what I plan on doing:
1. Add get_bazz_by_day, get_fizz_by_day, etc. to the appropriate models.
2. In each model, get all records of type Fizz, then create an array that stores date and count.
3. Format the result in the view so a JS library can render it as a graph.
Does this sound reasonable?
Depending on the number of records, your dashboard can soon run into performance problems.
Step 1 is misleading. Don't fetch the data for each day individually; try to get it all at once.
In Step 2 you can have the database do the aggregation over days, with the group method.
See http://guides.rubyonrails.org/active_record_querying.html#group
Fizz.select("date(created_at) as fizzed_day, count(*) as day_count").
group("date(created_at)")
In Step 3 you need to take care that days without any fizzbuzz are still displayed, as they are not returned in the query.
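The gap filling in Step 3 is independent of Rails, so it is sketched here in Python for illustration only. It assumes the grouped query result has already been turned into a mapping from day to count; `fill_missing_days` is a hypothetical helper.

```python
from datetime import date, timedelta


def fill_missing_days(counts, start, end):
    """Return (day, count) pairs for every day from start to end inclusive."""
    filled = []
    day = start
    while day <= end:
        # Days absent from the query result get an explicit zero.
        filled.append((day, counts.get(day, 0)))
        day += timedelta(days=1)
    return filled


# Hypothetical grouped counts: nothing was recorded on the 2nd.
counts = {date(2021, 3, 1): 4, date(2021, 3, 3): 2}
series = fill_missing_days(counts, date(2021, 3, 1), date(2021, 3, 3))
print([count for _, count in series])
```

Passing the chart an unbroken series like this keeps the x-axis evenly spaced, instead of silently joining the days on either side of a gap.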
