Programmatically adding tags to Data Catalog Custom entries - google-data-catalog

I am trying to attach tags to Data Catalog custom entries. I am writing a Python function to perform Data Catalog operations, i.e. create/delete custom entries, create/delete tag templates, and attach tags to the fields of the created custom entries.
I was able to create a custom entry and a tag template using the datacatalog_v1 library; however, I can't find a method or a REST API to attach tag fields to the custom entry columns.
I am, however, able to do this via the GCP web console.

The next couple of examples show how to work with the Data Catalog REST API; also refer to the documentation that Google provides here.
Create an entry group
Before using any of the request data, make the following replacements:
project-id: your GCP project ID.
entryGroupId: the ID must begin with a letter or underscore, contain only English letters, numbers, and underscores, and be at most 64 characters.
displayName: the textual name for the entry group.
HTTP method and URL:
POST https://datacatalog.googleapis.com/v1/projects/project-id/locations/us-central1/entryGroups?entryGroupId=entryGroupId
Request JSON body:
{
"displayName": "Entry Group display name"
}
Save the request body in a file called request.json, and execute the following command:
$cred = gcloud auth application-default print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://datacatalog.googleapis.com/v1/projects/project-id/locations/us-central1/entryGroups?entryGroupId=entryGroupId" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
{
"name": "projects/my_projectid/locations/us-central1/entryGroups/my_entry_group",
"displayName": "Entry Group display name",
"dataCatalogTimestamps": {
"createTime": "2019-10-19T16:35:50.135Z",
"updateTime": "2019-10-19T16:35:50.135Z"
}
}
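If you prefer to stay in Python, the same call is available in the datacatalog_v1 client library the question already uses. A minimal sketch, assuming application-default credentials (the project, location, and IDs are placeholders):
from google.cloud import datacatalog_v1

client = datacatalog_v1.DataCatalogClient()

entry_group = datacatalog_v1.EntryGroup()
entry_group.display_name = "Entry Group display name"

created = client.create_entry_group(
    parent="projects/project-id/locations/us-central1",
    entry_group_id="my_entry_group",
    entry_group=entry_group,
)
print(created.name)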
You can structure your tags by topic using tag templates. For example:
A data governance tag with fields for: data governor, retention date, deletion date, PII (yes or no), data classification (public, confidential, sensitive, regulatory)
A data quality tag with fields for: quality issues, update frequency, SLO information
A data usage tag with fields for: top users, top queries, average daily users
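As for the part you couldn't find: tags are attached to entries (or to individual columns of an entry) via the tags.create call on the entry, i.e. projects.locations.entryGroups.entries.tags.create in the REST API, or create_tag in the Python client. A minimal sketch with placeholder resource and field names; the column attribute is what scopes a tag to one column instead of the whole entry:
from google.cloud import datacatalog_v1

client = datacatalog_v1.DataCatalogClient()

# Placeholder resource names: use the names returned when you created
# the custom entry and the tag template.
entry_name = "projects/project-id/locations/us-central1/entryGroups/my_entry_group/entries/my_entry"
template_name = "projects/project-id/locations/us-central1/tagTemplates/my_template"

tag = datacatalog_v1.Tag()
tag.template = template_name
tag.column = "my_column"  # omit this line to attach the tag to the entry itself

# Keys must be field IDs defined on the tag template.
tag.fields["my_field_id"] = datacatalog_v1.TagField()
tag.fields["my_field_id"].string_value = "some value"

created_tag = client.create_tag(parent=entry_name, tag=tag)
print(created_tag.name)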

Related

Sentence dictionary in Azure cognitive services

I'm having trouble finding the right way to create a sentence dictionary using the new portal.
There is still a way to create one in the legacy portal, but there are no clear examples. I'm also curious whether sentences would take grammar into account. I want to create some translations from English to Polish, which has quite complex grammar; depending on the grammatical case and context, a different output is expected.
You can use the translator dictionary from the new portal too, but you need to take the keys generated in the new portal over to the Custom Translator portal. Let's walk through the solution.
Part 1: Language translation using the Azure portal, with the built-in grammar (not complete).
Go to the Azure portal and search for a Translator resource.
Fill in the details according to your subscription.
This resource will convert English into Polish as required. Below is the Python code generated for translation; fill in the details required by your subscription.
import requests, uuid, json

# Add your key and endpoint
key = "<your-translator-key>"
endpoint = "https://api.cognitive.microsofttranslator.com"

# location, also known as region: required if you're using a multi-service
# or regional (not global) resource. It can be found in the Azure portal
# on the Keys and Endpoint page.
location = "<YOUR-RESOURCE-LOCATION>"

path = '/translate'
constructed_url = endpoint + path

params = {
    'api-version': '3.0',
    'from': 'en',
    'to': ['pl']  # Polish, per the question (the docs sample uses 'fr' and 'zu')
}

headers = {
    'Ocp-Apim-Subscription-Key': key,
    # location required if you're using a multi-service or regional (not global) resource.
    'Ocp-Apim-Subscription-Region': location,
    'Content-type': 'application/json',
    'X-ClientTraceId': str(uuid.uuid4())
}

# You can pass more than one object in body.
body = [{
    'text': 'I would really like to drive your car around the block a few times!'
}]

request = requests.post(constructed_url, params=params, headers=headers, json=body)
response = request.json()
print(json.dumps(response, sort_keys=True, ensure_ascii=False, indent=4, separators=(',', ': ')))
Get the keys before going to Part 2.
Part 2: To add a sentence dictionary, use the Custom Translator studio:
https://language.cognitive.azure.com/home
There you create a project where you choose the language pair and start the translation with a sentence dictionary. By default, the sentence dictionary is applied in new language translation projects.
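If you only need a handful of forced phrase translations rather than training a full custom model, the v3 Translator API also supports inline "dynamic dictionary" markup in the request text. A sketch, reusing constructed_url, params, and headers from the Python snippet above:
# Dynamic dictionary: the tagged phrase is translated with the supplied
# target text instead of whatever the engine would choose.
body = [{
    'text': 'The word <mstrans:dictionary translation="wordomatic">wordomatic</mstrans:dictionary> is a dictionary entry.'
}]
response = requests.post(constructed_url, params=params, headers=headers, json=body)
print(json.dumps(response.json(), ensure_ascii=False, indent=4))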

Expanding fields not working fully on SharePoint Lists

I am following the documentation for Get metadata for a list.
Querying using either PowerShell or the Graph Explorer fails to fully expand the fields for items in a SharePoint list.
An example of this is a lookup field called Responsible that looks up users in Azure Active Directory (or in SharePoint terms, the column is a Person or Group column, limited to people only).
Once selected via the GUI, it's populated with a display name (although I'd hope for more definitive information to be stored on the back end, like UPN).
When querying the Graph API using the form:
$Uri = "https://graph.microsoft.com/v1.0/sites/$($SPSite.id)/lists/$($ServiceList.id)/items?expand=fields"
$Data = Invoke-RestMethod -Headers @{Authorization = "Bearer $accesstoken"} -Uri $Uri -Method Get -ErrorAction Stop
we get something like this:
@odata.etag : "REMOVED"
Title : Storage Platform
Description : Central storage platform
ResponsibleLookupId : 14
Responsible2LookupId : 13
AccountableLookupId : 3
Features : NFS
AudienceLookupId : 92
RequestProcess : {@{LookupId=1; LookupValue=Service Desk}}
Support : {@{LookupId=1; LookupValue=Service Desk}}
AvailabilityLookupId : 1
DependsOn : {}
O365GroupLookupId : 87
LifecycleStageLookupId : 2
ConsultLookupId : 88
id : 1
ContentType : Item
Modified : 2017-11-17T10:47:07Z
Created : 2017-11-17T10:47:07Z
_UIVersionString : 1.0
Attachments : False
Edit :
LinkTitleNoMenu : Storage Platform
LinkTitle : Storage Platform
ItemChildCount : 0
FolderChildCount : 0
_ComplianceFlags :
_ComplianceTag :
_ComplianceTagWrittenTime :
_ComplianceTagUserId :
You can see that the field ResponsibleLookupId just gives a value of 14, which is not useful. Other fields link to Office 365 Groups but likewise return only numeric lookup IDs. As such, it's impossible to link any of this data to users/groups, and the output is of very limited value except when viewed through the portal.
How do we expand this data? Will it be provided by the API call at a later date, or do we have to perform further lookups?
By default, Microsoft Graph will return the LookupId for lookup fields. You can ask it to provide the actual value by specifically requesting that field in a $select parameter.
Using the following query will return the displayName rather than the LookupId for Responsible:
...items?expand=fields($select=Responsible)
You can read about how this works in the documentation for FieldValueSet.
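For example, the equivalent request in Python (the site ID, list ID, and token are placeholders; the same URL works with Invoke-RestMethod, but remember to escape the $ of $select in a double-quoted PowerShell string):
import requests

site_id = "<site-id>"      # placeholder
list_id = "<list-id>"      # placeholder
access_token = "<token>"   # placeholder

url = ("https://graph.microsoft.com/v1.0/sites/{}/lists/{}/items"
       "?expand=fields($select=Title,Responsible)").format(site_id, list_id)
resp = requests.get(url, headers={"Authorization": "Bearer " + access_token})
resp.raise_for_status()
for item in resp.json()["value"]:
    print(item["fields"])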
As for returning the userPrincipalName: currently you can't control which value it returns (it's either the LookupId or the displayName). I recommend visiting UserVoice and adding your suggestion.

Open new ticket in JIRA using REST api

I'd like to understand how to create a new ticket in JIRA using the REST API from Jenkins. Are there any limitations or special things I should be aware of?
I'm going to write a Python script which will parse the build log and then create a new ticket in a JIRA project.
I checked the plugins, but most of them can only update existing tickets.
Thanks
There's documentation here about the JSON schema and some example JSON which needs to go in the body of your POST request to /rest/api/2/issue
https://docs.atlassian.com/jira/REST/cloud/#api/2/issue-createIssue
Here's a basic Python 3 script to make the POST request:
import requests, json

base_url = "myjira.example.com"  # The base URL of the Jira instance.
auth_user = "simon"              # Jira username
auth_pass = "N0tMyRe3lP4ssw0rd"  # Jira password
url = "https://{}/rest/api/2/issue".format(base_url)

# Set issue fields in a Python dictionary. Besides "summary", Jira requires
# at least "project" and "issuetype" on create; the values here are
# placeholders. See the docs and the editmeta note below for the fields
# available in your project.
fields = {
    "project": {"key": "PROJ"},
    "issuetype": {"name": "Task"},
    "summary": "something is wrong"
}
payload = {"fields": fields}
headers = {"Content-Type": "application/json"}

response = requests.post(
    url,
    auth=(auth_user, auth_pass),
    headers=headers,
    data=json.dumps(payload))

print("POST {}".format(url))
print("Response {}: {}".format(response.status_code, response.reason))
_json = json.loads(response.text)
This uses the Requests HTTP library for Python: http://docs.python-requests.org/en/master/
To get a list of all the fields you can set (and which ones are required), make a GET request to /rest/api/2/issue/{issueIdOrKey}/editmeta, using the ID or key of an existing issue in the same project that the issues you create via the API will go into.
https://docs.atlassian.com/jira/REST/cloud/#api/2/issue-getEditIssueMeta
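A short sketch of that editmeta call, reusing the placeholder credentials from the script above (PROJ-1 stands in for any existing issue key in the target project):
import requests

base_url = "myjira.example.com"
auth = ("simon", "N0tMyRe3lP4ssw0rd")

url = "https://{}/rest/api/2/issue/{}/editmeta".format(base_url, "PROJ-1")
meta = requests.get(url, auth=auth).json()
for field_id, info in meta["fields"].items():
    print("{}: {}".format(field_id, "required" if info.get("required") else "optional"))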

404 response when creating a course with the API

I am attempting to create a course with the API, and no matter how I tweak what I am sending I keep getting back the same 404 error. I am posting the following to /d2l/api/lp/1.4/courses/ in our test instance.
{
"Name":"STLR Course-112",
"Code":"STLR.112.201420",
"Path":"",
"CourseTemplateId":22462,
"SemesterId":22460,
"StartDate":"2014-05-07T12:00:00.000Z",
"EndDate":"2014-05-07T13:00:00.000Z",
"LocaleId":null,
"ForceLocale":false,
"ShowAddressBook":false
}
I can confirm with a test instance here that this API works with data almost identical to the block you provided. POSTing a body like this (whitespace added for clarity):
{"CourseTemplateId": 8082,
"LocaleId": null,
"Code": "STLR.112.201420",
"Name": "STLR Course-112",
"Path": "",
"ShowAddressBook": false,
"EndDate": "2014-05-07T13:00:00.000Z",
"StartDate": "2014-05-07T12:00:00.000Z",
"ForceLocale": false,
"SemesterId": 6984}
Gets me a 200 with a response like this (white space added for clarity):
{"Identifier":"114119",
"Name":"STLR Course-112",
"Code":"STLR.112.201420",
"IsActive":true,
"Path":"/content/enforced/114119-STLR.112.201420/",
"StartDate":"2014-05-07T12:00:00.000Z",
"EndDate":"2014-05-07T13:00:00.000Z",
"CourseTemplate":{"Identifier":"8082",
"Name":"ExtensibilityTemplate",
"Code":"EXT-TMPL"},
"Semester":{"Identifier":"6984",
"Name":"Fall 2011",
"Code":"FA2011"},
"Department":{"Identifier":"8081",
"Name":"Extensibility",
"Code":"EXT"}
}
It appears to me that the only differences between my input block and yours are the IDs provided for the course template and semester, changed so that I could hook the new course into my local test instance. Otherwise, the input properties appear identical.
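For reference, a sketch of the same POST from Python. Note that this elides authentication entirely: the Valence API expects signed request URLs, so auth_params below is a hypothetical stand-in for whatever your ID-key signing code produces:
import requests

host = "https://myschool.example.com"  # placeholder LMS host
route = "/d2l/api/lp/1.4/courses/"
payload = {
    "Name": "STLR Course-112",
    "Code": "STLR.112.201420",
    "Path": "",
    "CourseTemplateId": 22462,
    "SemesterId": 22460,
    "StartDate": "2014-05-07T12:00:00.000Z",
    "EndDate": "2014-05-07T13:00:00.000Z",
    "LocaleId": None,
    "ForceLocale": False,
    "ShowAddressBook": False,
}
auth_params = {}  # hypothetical: produced by your D2L ID-key signing code
response = requests.post(host + route, params=auth_params, json=payload)
print(response.status_code, response.text)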
Some things you could look at:
Ensure you're using the right Org Unit Id values for your course template and your semester
Ensure that your LMS is configured to enforce content paths for new org units: this should provoke the LMS to auto-create the path for you when you create a course offering. If you don't have content-path enforcement on, you might instead have to specify a valid content path for your course offering on create; passing in an empty string probably won't be a valid path, and you might get a 404 back because the API service handler "can't find the content path" you've specified.
Is there a particular message that comes back with the 404?

Simultaneously get multiple resources by ID

There exists a DocsClient.get_resource_by_id function to get the document entry for a single ID. Is there a similar way to obtain (in a single call) multiple document entries given multiple document IDs?
My application needs to efficiently download the content from multiple files for which I have the IDs. I need to get the document entries to access the appropriate download URL (I could manually construct the URLs, but this is discouraged in the API docs). It is also advantageous to have the document type and, in the case of spreadsheets, the document entry is required in order to access individual worksheets.
Overall I'm trying to reduce I/O waits, so if there's a way I can bundle the doc ID lookup, it will save me some I/O expense.
[Edit] Backporting AddQuery to gdata v2.0 (from Alain's solution):
import atom.core
import gdata.data
import gdata.docs.client

client = gdata.docs.client.DocsClient()
# ...
request_feed = gdata.data.BatchFeed()
request_entry = gdata.data.BatchEntry()
request_entry.batch_id = gdata.data.BatchId(text=resource_id)
request_entry.batch_operation = gdata.data.BATCH_QUERY
request_feed.add_batch_entry(entry=request_entry, batch_id_string=resource_id, operation_string=gdata.data.BATCH_QUERY)
batch_url = gdata.docs.client.RESOURCE_FEED_URI + '/batch'
rsp = client.batch(request_feed, batch_url)
rsp.entry is a collection of BatchEntry objects, which appear to refer to the correct resources, but which differ from the entries I'd normally get via client.get_resource_by_id().
My workaround is to convert the gdata.data.BatchEntry objects into gdata.docs.data.Resource objects like this:
entry = atom.core.parse(entry.to_string(), gdata.docs.data.Resource)
You can use a batch request to send multiple "GET" requests to the API using a single HTTP request.
Using the Python client library, you can use this code snippet to accomplish that:
def retrieve_resources(gd_client, ids):
    """Retrieve Documents List API Resources using a batch request.

    Args:
        gd_client: authorized gdata.docs.client.DocsClient instance.
        ids: collection of resource ids to retrieve.

    Returns:
        ResourceFeed containing the retrieved resources.
    """
    # Feed that holds the batch request entries.
    request_feed = gdata.docs.data.ResourceFeed()
    for resource_id in ids:
        # Entry that holds one batch request.
        request_entry = gdata.docs.data.Resource()
        self_link = gdata.docs.client.RESOURCE_SELF_LINK_TEMPLATE % resource_id
        request_entry.id = atom.data.Id(text=self_link)
        # Add the request entry to the batch feed.
        request_feed.AddQuery(entry=request_entry, batch_id_string=resource_id)
    # Submit the batch request to the server.
    batch_url = gdata.docs.client.RESOURCE_FEED_URI + '/batch'
    response_feed = gd_client.Post(request_feed, batch_url)
    # Check each batch sub-request's status.
    for entry in response_feed.entry:
        print '%s: %s (%s)' % (entry.batch_id.text,
                               entry.batch_status.code,
                               entry.batch_status.reason)
    return response_feed
Make sure to sync to the latest version of the project repository.
