SageMaker groundtruth - seeing time it took to complete annotation? - data-annotations

When I use SageMaker A2I, the returned object includes timeSpentInSeconds, which is useful because we can gather stats on how long workers take to complete certain tasks and plan around it. With SageMaker Ground Truth, however, I receive a list of objects like this:
{
  "datasetObjectId": "0",
  "consolidatedAnnotation": {
    "content": {
      "translation2": {
        "annotationsFromAllWorkers": [
          {
            "workerId": "private.us-east-2.ex11121331faeb5c25c",
            "annotationData": {
              "content": "{\"semantic-similarity\":{\"label\":\"New\"}}"
            }
          }
        ]
      }
    }
  }
}
No information on time to complete is included. Is there a way to get it included?
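For reference, here is a minimal Python sketch of reading the record shown above. It only unpacks the fields that appear in the sample, which makes it easy to confirm that no timing field is present at this level. The function name `worker_annotations` is hypothetical, and the label attribute name (`translation2`) is taken from the sample.

```python
import json


def worker_annotations(record, label_attribute):
    """Return (workerId, decoded annotation) pairs from one record."""
    content = record["consolidatedAnnotation"]["content"]
    workers = content[label_attribute]["annotationsFromAllWorkers"]
    # annotationData.content is itself a JSON-encoded string, so decode it again
    return [
        (w["workerId"], json.loads(w["annotationData"]["content"]))
        for w in workers
    ]


record = {
    "datasetObjectId": "0",
    "consolidatedAnnotation": {
        "content": {
            "translation2": {
                "annotationsFromAllWorkers": [
                    {
                        "workerId": "private.us-east-2.ex11121331faeb5c25c",
                        "annotationData": {
                            "content": "{\"semantic-similarity\":{\"label\":\"New\"}}"
                        },
                    }
                ]
            }
        }
    },
}

pairs = worker_annotations(record, "translation2")
for worker_id, annotation in pairs:
    print(worker_id, annotation)
```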

Not able to retrieve the spreadsheet id from workspace add-on

I'm developing a Workspace add-on with an alternate runtime. I configured the add-on to work with spreadsheets, and I need to retrieve the spreadsheet ID when the user opens the add-on. For testing purposes I created a Cloud Function that contains the business logic.
My deployment.json file is the following:
{
  "oauthScopes": ["https://www.googleapis.com/auth/spreadsheets.currentonly", "https://www.googleapis.com/auth/drive.file"],
  "addOns": {
    "common": {
      "name": "My Spreadsheet Add-on",
      "logoUrl": "https://cdn.icon-icons.com/icons2/2070/PNG/512/penguin_icon_126624.png"
    },
    "sheets": {
      "homepageTrigger": {
        "runFunction": "cloudFunctionUrl"
      }
    }
  }
}
However, the request I receive seems to be empty: it does not contain the ID of the spreadsheet I have open, although according to the documentation it should.
Is there anything else I need to configure?
The relevant code is simple; I'm just printing the request:
exports.getSpreadsheetId = function addonsHomePage (req, res) {
  console.log('called', req.method);
  console.log('body', req.body);
  res.send(createAction());
};
The information shown in the log is:
sheets: {}
Thank you
UPDATE: It's a known issue on the engineering team's side, and there is a ticket for it.
The information around Workspace Add-ons is pretty new and the documentation is pretty sparse.
In case anyone else comes across this issue: I solved it in Python on Cloud Run by checking for the sheets object and, if it is missing, adding a button that requests access to the sheet in question.
from flask import Flask
from flask import request

app = Flask(__name__)


@app.route('/', methods=['POST'])
def test_addon_homepage():
    req_body = request.get_json()
    # 'sheets' stays empty until the user grants access to the current file
    sheet_info = req_body.get('sheets') or {}
    card = {
        "action": {
            "navigations": [
                {
                    "pushCard": {
                        "sections": [
                            {
                                "widgets": [
                                    {
                                        "textParagraph": {
                                            "text": f"Hello {sheet_info.get('title','Auth Needed')}!"
                                        }
                                    }
                                ]
                            }
                        ]
                    }
                }
            ]
        }
    }
    if not sheet_info:
        card = create_file_auth_button(card)
    return card


def create_file_auth_button(card):
    # Add a footer button that asks for access to the active file
    card['action']['navigations'][0]['pushCard']['fixedFooter'] = {
        'primaryButton': {
            'text': 'Authorize file access',
            'onClick': {
                'action': {
                    'function': 'https://example-cloudrun.a.run.app/authorize_sheet'
                }
            }
        }
    }
    return card


@app.route('/authorize_sheet', methods=['POST'])
def authorize_sheet():
    # Ask the host app to request file scope for the active document
    payload = {
        'renderActions': {
            'hostAppAction': {
                'editorAction': {
                    'requestFileScopeForActiveDocument': {}
                }
            }
        }
    }
    return payload
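To exercise the fallback logic without deploying anything, the card-building step can be pulled out into a plain function. This is only a sketch under assumptions: `build_homepage_response` and `AUTHORIZE_URL` are hypothetical names, the URL is the placeholder from the answer above, and the "Budget" title is made-up test data.

```python
# Placeholder endpoint copied from the answer above (not a real service)
AUTHORIZE_URL = "https://example-cloudrun.a.run.app/authorize_sheet"


def build_homepage_response(req_body):
    """Build the homepage card; add an authorize button when no sheet info arrives."""
    sheet_info = (req_body or {}).get("sheets") or {}
    push_card = {
        "sections": [
            {
                "widgets": [
                    {
                        "textParagraph": {
                            "text": f"Hello {sheet_info.get('title', 'Auth Needed')}!"
                        }
                    }
                ]
            }
        ]
    }
    if not sheet_info:
        # No sheet details yet: offer a button that triggers the file-scope request
        push_card["fixedFooter"] = {
            "primaryButton": {
                "text": "Authorize file access",
                "onClick": {"action": {"function": AUTHORIZE_URL}},
            }
        }
    return {"action": {"navigations": [{"pushCard": push_card}]}}


empty = build_homepage_response({"sheets": {}})
granted = build_homepage_response({"sheets": {"title": "Budget"}})
```

The Flask handler can then simply return `build_homepage_response(request.get_json())`, and both branches can be checked with ordinary dictionaries.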

Zapier - add data to JSON response (App development)

We are creating a Zapier app to expose our APIs to the public, so anyone can use them. The main endpoint that people use returns a very large and complex JSON object. Zapier seems to have a really hard time parsing complex nested JSON, but it does wonderfully with a very simple response object such as
{ "field": "value" }
The data we return has this structure, and we want to move some of the fields to the root of the response so that Zapier can parse them easily.
"networkSections": [
  {
    "identifier": "Deductible",
    "label": "Deductible",
    "inNetworkParameters": [
      {
        "key": "Annual",
        "value": " 600.00",
        "message": null,
        "otherInfo": null
      },
      {
        "key": "Remaining",
        "value": " 600.00",
        "message": null,
        "otherInfo": null
      }
    ],
    "outNetworkParameters": null
  },
So, can we do something to return for example the remaining deductible?
I got this far (adding outputFields) but this returns an array of values. I'm not sure how to parse through this array either in the Zap or in the App.
{key: 'networkSections[]inNetworkParameters[]key', label: 'xNetworkSectionsKey',type: 'string'},
i.e., this returns an array of "Annual", "Remaining", etc.
Great question. In this case, there's a lot going on, and outputFields can't quite handle it all. :(
In your example, inNetworkParameters contains an array of objects. Throughout our documentation, we refer to these as line items. These line items can be passed to other actions, but the different expected structures present a bit of a problem. We've handled this by letting users map line items from one step's output to another step's input, field by field. So if step 1 returns
{
  "some_array": [
    {
      "some_key": "some_value"
    }
  ]
}
and the next step needs to send
{
  "data": [
    {
      "some_other_key": "some_value"
    }
  ]
}
users can accomplish that by mapping some_array.some_key to data.some_other_key.
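Conceptually, that per-field mapping amounts to renaming keys on every line item. The Python sketch below is an illustration only, not Zapier's implementation; `map_line_items` is a hypothetical helper.

```python
def map_line_items(items, field_map):
    """Rename keys on each line item according to field_map (source key -> target key)."""
    return [
        {field_map[key]: value for key, value in item.items() if key in field_map}
        for item in items
    ]


step1_output = {"some_array": [{"some_key": "some_value"}]}
step2_input = {
    "data": map_line_items(step1_output["some_array"], {"some_key": "some_other_key"})
}
print(step2_input)
```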
All of that being said, if you want to always return a Remaining Deductible object, you'll have to do it by modifying the result object itself. As long as this data is always in that same order, you can do something akin to
var data = z.JSON.parse(bundle.response.content);
data["Remaining Deductible"] = data.networkSections[0].inNetworkParameters[1].value;
return data;
If the order differs, you'll have to implement some sort of search to find the objects you'd like to return.
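Such a search could look like the sketch below. It is written in Python purely to show the logic (a Zapier CLI app would express the same loop in JavaScript); `find_parameter` is a hypothetical helper, and the sample data mirrors the structure from the question.

```python
def find_parameter(data, identifier, key):
    """Return the value of the in-network parameter matching identifier and key, or None."""
    for section in data.get("networkSections") or []:
        if section.get("identifier") != identifier:
            continue
        # inNetworkParameters may be null, like outNetworkParameters in the sample
        for param in section.get("inNetworkParameters") or []:
            if param.get("key") == key:
                return param.get("value")
    return None


sample = {
    "networkSections": [
        {
            "identifier": "Deductible",
            "label": "Deductible",
            "inNetworkParameters": [
                {"key": "Annual", "value": " 600.00", "message": None, "otherInfo": None},
                {"key": "Remaining", "value": " 600.00", "message": None, "otherInfo": None},
            ],
            "outNetworkParameters": None,
        }
    ]
}

remaining = find_parameter(sample, "Deductible", "Remaining")
sample["Remaining Deductible"] = remaining
```

Matching on identifier and key rather than on position keeps the result stable if the API reorders sections or parameters.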
I hope that all helps!
Caleb got me where I wanted to go. For completeness this is the solution.
In the creates directory I have a js file for the actual call. The perform part is below.
perform: (z, bundle) => {
  const promise = z.request({
    url: 'https://api.example.com/API/Example/' + bundle.inputData.elgRequestID,
    method: 'GET',
    headers: {
      'content-type': 'application/json',
    }
  });
  return promise.then(function(result) {
    var data = JSON.parse(result.content);
    for (var i = 0; i < data.networkSections.length; i++) {
      for (var j = 0; j < data.networkSections[i].inNetworkParameters.length; j++) {
        // DEDUCT: copy the annual deductible to the root of the response
        if (data.networkSections[i].identifier == "Deductible" &&
            data.networkSections[i].inNetworkParameters[j].key == "Annual") {
          data["zAnnual Deductible"] = data.networkSections[i].inNetworkParameters[j].value;
        }
      } // inner for
    } // outer for
    return data;
  });
}

Automl image prediction problems

I get different results when using a model to get image annotation predictions from the web UI and from the API. Specifically, using the web UI I get predictions, but using the API I get nothing, just empty output.
It's this one that gives nothing using the API: https://cloud.google.com/vision/automl/docs/predict#automl-nl-example-cli
Specifically, the return value is {} - an empty JS object. So, the call goes through just fine, there's just no output.
Any hints as to how to debug the issue?
By default, the API returns only results with a prediction score > 0.5.
To get all predictions, you need to pass the extra argument 'score_threshold' in the predict request:
For the REST API:
{
  "payload": {
    "image": {
      "imageBytes": "YOUR_IMAGE_BYTES"
    },
    "params": { "score_threshold": "0.0" }
  }
}
For the python call:
payload = {'image': {'image_bytes': content }, "params": { "score_threshold": "0.0" }}
With this argument all predictions will be returned. The predictions will be ordered by the 'score'.
Hope that helps,
That doesn't work, at least at the moment.
Instead the params need to go at the same level as the payload. E.g.:
{
  "payload": {
    "image": {
      "imageBytes": "YOUR_IMAGE_BYTES"
    }
  },
  "params": { "score_threshold": "0.0" }
}
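As a minimal sketch, the REST request body with params at the top level can be assembled in Python as follows. It assumes the image bytes are sent base64-encoded, as is usual for byte fields in JSON request bodies; `build_predict_body` is a hypothetical helper, and the sample bytes are placeholders rather than a real image.

```python
import base64
import json


def build_predict_body(image_bytes, score_threshold=0.0):
    """Build a predict request body with params next to payload, not inside it."""
    return {
        "payload": {
            "image": {
                # Byte fields travel as base64 text inside JSON
                "imageBytes": base64.b64encode(image_bytes).decode("ascii")
            }
        },
        "params": {"score_threshold": str(score_threshold)},
    }


body = build_predict_body(b"placeholder image bytes", 0.0)
print(json.dumps(body, indent=2))
```

Serializing with `json.dumps` also avoids hand-written JSON mistakes such as trailing commas.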

How to order portable storage using SoftLayer API

Is there a simple way to order portable storage given an input datacenter such as WDC06 and a size of 500 GB?
At the moment the method I know of is painful, complex and manual if I have to do this in a new datacenter: first get the configuration through Product_Package, then go through a long list of items to find the right product ID, itemId, etc. This call also requires that I know the package ID beforehand.
categories = client['Product_Package'].getConfiguration(id=pkgId, mask='isRequired, itemCategory.id, itemCategory.name, itemCategory.categoryCode')
Please share some code samples if this ordering process can be simplified.
I have no idea how you are ordering the portable storage, but you need to use the placeOrder method and get the proper prices for the disk size that you want to order. These articles can help you understand how to place orders:
https://sldn.softlayer.com/blog/cmporter/location-based-pricing-and-you
https://sldn.softlayer.com/blog/bpotter/going-further-softlayer-api-python-client-part-3
Picking the correct prices is hard, but you can use object filters to get them:
https://sldn.softlayer.com/article/object-filters
Here is a sample using the SoftLayer Python client:
import sys

import SoftLayer

# Your SoftLayer API username and key.
API_USERNAME = 'set me'
API_KEY = 'set me'

datacenter = "wdc06"  # lower case
size = "500"  # the size of the disk
diskDescription = "optional value"

client = SoftLayer.Client(username=API_USERNAME, api_key=API_KEY)
package = 198  # this package is always the same

# Use a filter to get the price for a specific disk size
# in a specific datacenter
filter = {
    "itemPrices": {
        "pricingLocationGroup": {
            "locations": {
                "name": {
                    "operation": datacenter
                }
            }
        },
        "item": {
            "capacity": {
                "operation": size
            }
        }
    }
}
price = client['SoftLayer_Product_Package'].getItemPrices(id=package, filter=filter)

# If the request does not return any price, look for the standard price
if not price:
    filter = {
        "itemPrices": {
            "locationGroupId": {
                "operation": "is null"
            },
            "item": {
                "capacity": {
                    "operation": size
                }
            }
        }
    }
    price = client['SoftLayer_Product_Package'].getItemPrices(id=package, filter=filter)

if not price:
    print("there is no price for the selected datacenter %s and disk size %s" % (datacenter, size))
    sys.exit(0)

# Get the locationId for the order template
filter = {
    "regions": {
        "location": {
            "location": {
                "name": {
                    "operation": datacenter
                }
            }
        }
    }
}
location = client['SoftLayer_Product_Package'].getRegions(id=package, filter=filter)

# Now create the order template
orderTemplate = {
    "complexType": "SoftLayer_Container_Product_Order_Virtual_Disk_Image",
    "packageId": package,
    "location": location[0]["location"]["location"]["id"],
    "prices": [{"id": price[0]["id"]}],
    "diskDescription": diskDescription
}

# When you are ready to order, replace "verifyOrder" with "placeOrder"
order = client['SoftLayer_Product_Order'].verifyOrder(orderTemplate)
print(order)
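The two price filters in the sample differ only in their location clause, so they can be produced by small helpers and reused for other datacenters and sizes. This is a sketch: the helper names are hypothetical, while the filter keys are copied from the sample above. No API call is made here.

```python
def location_price_filter(datacenter, size):
    """Object filter for prices tied to a specific datacenter and disk size."""
    return {
        "itemPrices": {
            "pricingLocationGroup": {
                "locations": {"name": {"operation": datacenter.lower()}}
            },
            "item": {"capacity": {"operation": str(size)}},
        }
    }


def standard_price_filter(size):
    """Object filter for the standard (location-independent) price of a disk size."""
    return {
        "itemPrices": {
            "locationGroupId": {"operation": "is null"},
            "item": {"capacity": {"operation": str(size)}},
        }
    }


by_location = location_price_filter("WDC06", 500)
fallback = standard_price_filter(500)
```

Each result can be passed as the `filter` argument of `getItemPrices`, trying `by_location` first and `fallback` only when the first call returns nothing.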

Elasticsearch function_score not working?

I'm using the following function score for outfits purchased:
{
  "query": {
    "function_score": {
      "field_value_factor": {
        "field": "purchased",
        "factor": 1.2,
        "modifier": "sqrt",
        "missing": 1
      }
    }
  }
}
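For reference, with these settings field_value_factor evaluates to sqrt(1.2 * purchased), using the missing value of 1 when a document has no purchased field. The Python sketch below only reproduces that arithmetic (the function name is hypothetical), which also shows why the field has to be numeric.

```python
import math


def field_value_factor_sqrt(value, factor=1.2, missing=1):
    """Mimic field_value_factor with modifier 'sqrt': sqrt(factor * field value)."""
    field = missing if value is None else value
    return math.sqrt(factor * field)


with_purchases = field_value_factor_sqrt(30)
without_field = field_value_factor_sqrt(None)
print(with_purchases, without_field)
```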
However, when I run a search, I get the following error:
"type":"illegal_argument_exception","reason":"Fielddata is disabled on text fields by default. Set fielddata=true on [purchased] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
The syntax of the search is correct, as I've run it locally and it works perfectly. I'm now running it on my server and it's not working. Do I need to define purchased as an integer somewhere, or is this due to something else?
The purchased field is an analyzed string field, hence the error you see.
When indexing your documents, make sure that the numbers are not within double quotes, i.e.
Wrong:
{
"purchased": "324"
}
Right:
{
"purchased": 324
}
...or if you can't change the source documents (because you're not responsible for producing them), make sure that you create a mapping that defines the purchased field as being an integer field.
{
  "your_type": {
    "properties": {
      "purchased": {
        "type": "integer"
      }
    }
  }
}
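If you do control the indexing code, the first option can be applied just before documents are sent. A minimal Python sketch, assuming documents are plain dictionaries; `coerce_purchased` is a hypothetical helper and no Elasticsearch client is involved.

```python
def coerce_purchased(doc, field="purchased"):
    """Return a copy of doc with a numeric string in `field` converted to int."""
    value = doc.get(field)
    if isinstance(value, str):
        try:
            return {**doc, field: int(value)}
        except ValueError:
            # Not an integer string: leave the document unchanged
            return doc
    return doc


fixed = coerce_purchased({"purchased": "324"})
print(fixed)
```

Note that a field already mapped as text keeps that mapping for existing indices, so the index typically has to be recreated with the integer mapping and the documents reindexed.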
