Phonograph2:ReadOnlyTables error in Slate writeback query - foundry-slate

I'm working with a sample dataset of airports as I continue exploring Slate features for my team. I copied the default airport dataset into my files, so this is a version I fully own (presumably no permission issues there). The dataset is properly available in my Slate application, since I'm also using it to display and filter data via Phonograph2 queries.
Based on the Phonograph2 docs, I created a new query to add a new airport to the dataset. I'm using the "Table Storage Service" and the "Post Event" endpoint. As a test, I configured my tableEditedEventPostRequest as:
{
  "primaryKey": {
    "airport": "ABC"
  },
  "payload": {
    "type": "rowAdded",
    "rowAdded": {
      "columns": {
        "display_name": "[ABC] My New Airport"
      }
    }
  }
}
(Once I get this working I'd switch the values out with dynamic values from widgets.)
When I run a test of this query, I get this error response:
{
  "errorCode": "INVALID_ARGUMENT",
  "errorName": "Phonograph2:ReadOnlyTables",
  "errorInstanceId": "17ec990d-5d58-479d-a1b6-5ad033c8c808",
  "parameters": {
    "tableRids": "[ri.phonograph2.main.table.f3f33f6e-801a-4454-98e9-f2df5f170559]",
    "dataInputLocatorRids": "[ri.foundry.main.dataset.6add7c46-d3c9-4056-89b6-a19dbe461ed4]"
  }
}
I can't find anything about this error, or anything in the docs (so far) about the target dataset being configured as read-only. There aren't any settings I can find on the dataset to make it more permissive, and I'm already the owner of the dataset. I'd appreciate any insights or tips to get past this roadblock.

For a Phonograph table to be "editable", it needs to be associated with a writeback dataset. If you had created the sync through the Ontology (which it seems you did not), you would do this on the "Datasources" configuration tab.
Since it sounds like you created the sync directly from the Dataset Details view (or maybe through the Slate Datasets tab), you should have an option in that configuration to create a new dataset for writeback. All you should need to do is provide a dataset name and folder location.

Related

dask - read_json into dataframe ValueError

A minimal example here: I have a JSON file xaa.json whose contents look like this (two rows from the Stack Overflow archive):
[
{"Id": 11, "Body": "<p>Given a specific <code>DateTime</code> value", "Title": "Calculate relative time in C#", "Comments": "There is the .net package https://github.com/NickStrupat/TimeAgo which pretty much does what is being asked."},
{"Id": 7888, "Body": "<p>You need to use an <code>ifstream</code> if you just want to read (use an <code>ofstream</code> to write, or an <code>fstream</code> for both).</p>
<p>To open a file in text mode, do the following:</p>
<pre><code>ifstream in(\\"filename.ext\\", ios_base::in); // the in flag is optional
</code></pre>
<p>To open a file in binary mode, you just need to add the \\"binary\\" flag.</p>
<pre><code>ifstream in2(\\"filename2.ext\\", ios_base::in | ios_base::binary );
</code></pre>
<p>Use the <code>ifstream.read()</code> function to read a block of characters (in binary or text mode). Use the <code>getline()</code> function (it's global) to read an entire line.</p>
", "Title": null, "Comments": "+1 for noting that the global getline() function is to be used instead of the member function."}
]
I want to load such json files into a dask dataframe. I use:
import dask.dataframe as dd

so_posts_df = dd.read_json('./xaa.json', orient='columns').compute()
I get this error:
ValueError: Unexpected character found when decoding object value
After looking into the contents, I figured the \\" sequences were causing it. When I removed them (the editor, IntelliJ, then said it was clean, nice-looking JSON) and ran the same read_json, it was able to read into a df and display everything nicely.
So, I have two questions: (a) What are the accepted values for the read_json argument "errors"? (b) How can I properly preprocess the JSON file before reading it into a dask dataframe? The double quotes and the double escaping seem to be causing the issue.
[This may not be a dask issue at all...]...
This also fails with pandas.read_json. I recommend first getting things to work well with pandas, and then trying the same workload with dask dataframe. You will likely get much better support when asking pandas questions.
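For what it's worth, here is a minimal sketch of that approach in Python. It assumes, as in your description, that the double-escaped quotes are the only problem and that stripping one level of escaping is an acceptable cleanup; the cleaned-file path is made up for illustration.

import json

import dask.dataframe as dd
import pandas as pd

path = "./xaa.json"              # the sample file from the question
clean_path = "./xaa_clean.json"  # hypothetical output for the cleaned copy

with open(path) as f:
    raw = f.read()

try:
    json.loads(raw)  # is the file valid JSON at all?
except json.JSONDecodeError as err:
    print("invalid JSON near char %d: %s" % (err.pos, err.msg))
    # Turn the double-escaped quotes (\\") into ordinary escaped quotes (\").
    raw = raw.replace('\\\\"', '\\"')
    json.loads(raw)  # should parse now; otherwise this raises again

with open(clean_path, "w") as f:
    f.write(raw)

# Sanity-check with pandas first, then do the same read with dask.
pd.read_json(clean_path, orient="columns")
so_posts_df = dd.read_json(clean_path, orient="columns").compute()
print(so_posts_df.head())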

Azure digital Twin UDF getting Space location

I have a location added to a space that sits two levels above the sensor, but I can't find any operation in the current client reference that returns that location. I want to enrich the telemetry with the space's location information.
I have used the following:
getSpaceMetadata
getSpaceExtendedProperty(spaceId, propertyName) // doesn't apply, since location is not an extended property
I need functionality similar to this:
https://urlofdigitaltwin/management/api/v1.0/spaces/633a40d6-790d-4bd5-92c5-1cc8b1a86141/?includes=location
Please let me know if there is a way to do this, even if it means reading these separately from some other Azure service.
space (with location)
  device
    sensor
      - matcher
      - udf
Thanks for the great question! Azure Digital Twins is undergoing continuous improvement. I hope you'll find the documentation significantly improved.
Assuming you have extracted the space ID from the sensor or device, you can look that space up and find the associated parentSpaceId:
{
  "id": "aa000aaa-a0a0-0000-a0aa-00000a000aa0",
  "name": "Example Room",
  "typeId": 14,
  "parentSpaceId": "1b1b1111-b1b1-1111-111b-1b1b11b11111",
  "subtypeId": 13,
  "statusId": 12
}
From there you can call the top-level space directly. You can combine that operation with several API query parameters, such as traverse, minLevel, and maxLevel, which should allow you to fetch everything you need in one call.
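For illustration, a rough Python sketch of that call; the base URL is the placeholder from your question, and the bearer token and space ID are stand-ins you would substitute:

import requests

# Placeholders: the management endpoint from the question, an OAuth bearer
# token you have already obtained, and the parentSpaceId (or any space ID)
# pulled from the sensor/device object.
base = "https://urlofdigitaltwin/management/api/v1.0"
token = "<bearer token>"
space_id = "1b1b1111-b1b1-1111-111b-1b1b11b11111"

resp = requests.get(
    "%s/spaces/%s" % (base, space_id),
    headers={"Authorization": "Bearer %s" % token},
    # includes=location is the parameter from the question; traverse, minLevel
    # and maxLevel can be added to the same params dict to walk the hierarchy
    # in one call.
    params={"includes": "location"},
)
resp.raise_for_status()
space = resp.json()
print(space.get("name"), space.get("location"))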
Two new resources that describe those API operations are now available:
https://learn.microsoft.com/azure/digital-twins/how-to-navigate-apis
https://learn.microsoft.com/azure/digital-twins/how-to-query-common-apis
Thanks!

Can Dataflow sideInput be updated per window by reading a gcs bucket?

I'm currently creating a PCollectionView by reading filtering information from a GCS bucket and passing it as a side input to different stages of my pipeline in order to filter the output. If the file in the GCS bucket changes, I want the currently running pipeline to use the new filter info. Is there a way to update this PCollectionView on each new window of data when my filter changes? I thought I could do it in a startBundle, but I can't figure out how, or whether it's possible. Could you give an example if it is possible?
PCollectionView<Map<String, TagObject>> tagMapView =
    pipeline.apply(TextIO.Read.named("TagListTextRead")
        .from("gs://tag-list-bucket/tag-list.json"))
    .apply(ParDo.named("TagsToTagMap").of(new Tags.BuildTagListMapFn()))
    .apply("MakeTagMapView", View.asSingleton());

PCollection<String> windowedData =
    pipeline.apply(PubsubIO.Read.topic("myTopic"))
        .apply(Window.<String>into(
            SlidingWindows.of(Duration.standardMinutes(15))
                .every(Duration.standardSeconds(31))));

PCollection<MY_DATA> lineData = windowedData
    .apply(ParDo.named("ExtractJsonObject")
        .withSideInputs(tagMapView)
        .of(new ExtractJsonObjectFn()));
You probably want something like "use an at-most-1-minute-old version of the filter as a side input" (since in theory the file can change frequently, unpredictably, and independently of your pipeline, there's no way to completely synchronize changes of the file with the behavior of the pipeline).
Here's a (granted, rather clumsy) solution I was able to come up with. It relies on the fact that side inputs are implicitly also keyed by window. In this solution we're going to create a side input windowed into 1-minute fixed windows, where each window will contain a single value of the tag map, derived from the filter file as-of some moment inside that window.
PCollection<Long> ticks = p
    // Produce 1 "tick" per second
    .apply(CountingInput.unbounded().withRate(1, Duration.standardSeconds(1)))
    // Window the ticks into 1-minute windows
    .apply(Window.into(FixedWindows.of(Duration.standardMinutes(1))))
    // Use an arbitrary per-window combiner to reduce to 1 element per window
    .apply(Count.globally());

// Produce a collection of tag maps, 1 per each 1-minute window
PCollectionView<TagMap> tagMapView = ticks
    .apply(MapElements.via((Long ignored) -> {
        ... manually read the json file as a TagMap ...
    }))
    .apply(View.asSingleton());
This pattern (joining against slowly changing external data as a side input) comes up repeatedly, and the solution I'm proposing here is far from perfect; I wish we had better support for this in the programming model. I've filed a BEAM JIRA issue to track this.
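For reference, the same tick-based idea written against the current Beam Python SDK looks roughly like the sketch below; PeriodicImpulse plays the role of the tick source, read_tag_map is a hypothetical helper that re-reads the GCS file, the topic path is a placeholder, and a streaming runner is assumed.

import apache_beam as beam
from apache_beam.transforms import window
from apache_beam.transforms.periodicsequence import PeriodicImpulse


def read_tag_map(_unused_tick):
    # Hypothetical helper: re-read gs://tag-list-bucket/tag-list.json here
    # and return it as a dict of tag -> tag metadata.
    ...


with beam.Pipeline() as p:
    # One tick per minute, each placed in its own 1-minute window and each
    # mapped to a freshly read copy of the tag map.
    tag_map = (
        p
        | PeriodicImpulse(fire_interval=60, apply_windowing=True)
        | beam.Map(read_tag_map))

    lines = (
        p
        | beam.io.ReadFromPubSub(topic="projects/<project>/topics/myTopic")
        | beam.WindowInto(window.SlidingWindows(size=15 * 60, period=31)))

    # Each main-input element sees the tag map from the side-input window
    # that its own window maps onto.
    tagged = lines | beam.Map(
        lambda line, tags: (line, tags),
        tags=beam.pvalue.AsSingleton(tag_map))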

Neo4j Java VM Tuning (v2.3 Community)

From what I can tell, I'm having an issue where my Neo4j v2.3 Community Java VM keeps adding items to the Old Gen heap and is never able to garbage collect them.
Here is a detailed outline of the situation.
I have a PHP file which calls the Dropbox Delta API and writes out the file structure to my Neo4j database. Each call to Delta returns a data set of 2000 items, from which I pull out the information I need. The following is an example of what that query looks like with just one item; usually I send in full batches of 2000 items, as that gave me the best results.
***Following is an example Query***
MERGE (c:Cloud { type:'Dropbox', id_user:'15', id_account:''})
WITH c
UNWIND [
{ parent_shared_folder_id:488417928, rev:'15e1d1caa88',.......}
]
AS items MERGE (i:Item { id:items.path, id_account:'', id_user:'15', type:'Dropbox' })
ON Create SET i = { id:items.path, id_account:'', id_user:'15', is_dir:items.is_dir, name:items.name, description:items.description, size:items.size, created_at:items.created_at, modified:items.modified, processed:1446769779, type:'Dropbox'}
ON Match SET i+= { id:items.path, id_account:'', id_user:'15', is_dir:items.is_dir, name:items.name, description:items.description, size:items.size, created_at:items.created_at, modified:items.modified, processed:1446769779, type:'Dropbox'}
MERGE (p:Item {id_user:'15', id:items.parentPath, id_account:'', type:'Dropbox'})
MERGE (p)-[:Contains]->(i)
MERGE (c)-[:Owns]->(i)
***The query is sent via Everyman***
static function makeQuery($client, $qry) {
return new Everyman\Neo4j\Cypher\Query($client, $qry);
}
This works fine and generally from start to finish takes 8-10 seconds to run.
The Dropbox account I am accessing contains around 35,000 items, and it takes around 18 runs of my PHP script to populate my Neo4j database with the folder/file structure of the Dropbox account.
With every run of this PHP script, around 50 MB of items is added to the Neo4j JVM Old Gen heap, and about 30 MB of that is never removed by GC.
The end result, obviously, is that the VM runs out of memory and gets stuck in a constant state of GC throttling.
I have tried a range of Neo4j VM settings, as well as an update from Neo4j v2.2.5 to v2.3, which actually appears to have made the problem worse.
My current settings are as follows,
-server
-Xms4096m
-Xmx4096m
-XX:NewSize=3072m
-XX:MaxNewSize=3072m
-XX:SurvivorRatio=1
I am testing on a Windows 10 PC with 8 GB of RAM and an i5 2.5GHz quad core, running Java 1.8.0_60.
Any info on how to solve this issue would be greatly appreciated.
Cheers, Jack.
Reduce the new generation size to 1024m. Change your settings to:
-server
-Xms4096m
-Xmx4096m
-XX:NewSize=1024m
It is most likely that the size of your transaction grows too large.
I recommend sending each of the parents in separately: instead of one big UNWIND over everything, send one statement per parent.
Make sure to use the new transactional HTTP endpoint; I recommend going with neoclient instead of Neo4jPHP.
You should also use parameters instead of literal values!
And don't repeat the user-id, type, etc. properties on every item.
Are you sure you want to connect everything to c, and not just the root of the directory structure? I would do the latter.
MERGE (c:Cloud:Dropbox { id_user:{userId}})
MERGE (p:Item:Dropbox {id:{parentPath}})
// owning the parent should be good enough
MERGE (c)-[:Owns]->(p)
WITH p
UNWIND {items} AS item
MERGE (i:Item:Dropbox { id:item.path})
ON CREATE SET i += { is_dir:item.is_dir, name:item.name, created_at:item.created_at }
SET i += { description:item.description, size:item.size, modified:item.modified, processed:timestamp()}
MERGE (p)-[:Contains]->(i);
Make sure to use 2.3.0 for best MERGE performance for relationships.
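As a rough illustration of the transactional endpoint with parameters (shown here with Python's requests just to keep it short; the PHP clients above wrap the same API, and the URL, credentials and items are placeholders):

import requests

# Placeholders: default local Neo4j 2.3 transactional endpoint and credentials.
url = "http://localhost:7474/db/data/transaction/commit"
auth = ("neo4j", "<password>")

statement = """
MERGE (c:Cloud:Dropbox { id_user:{userId}})
MERGE (p:Item:Dropbox {id:{parentPath}})
MERGE (c)-[:Owns]->(p)
WITH p
UNWIND {items} AS item
MERGE (i:Item:Dropbox { id:item.path})
ON CREATE SET i += { is_dir:item.is_dir, name:item.name, created_at:item.created_at }
SET i += { description:item.description, size:item.size, modified:item.modified, processed:timestamp()}
MERGE (p)-[:Contains]->(i)
"""

payload = {
    "statements": [{
        "statement": statement,
        "parameters": {
            "userId": "15",
            "parentPath": "/example/folder",  # made-up sample values
            "items": [{
                "path": "/example/folder/file.txt",
                "is_dir": False,
                "name": "file.txt",
                "created_at": "2015-11-05",
                "description": "",
                "size": 1024,
                "modified": "2015-11-05",
            }],
        },
    }]
}

resp = requests.post(url, json=payload, auth=auth)
resp.raise_for_status()
print(resp.json().get("errors"))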

How do you find a user's last used printer in SysLastValue

I've been trying to find where a user's last used printer is stored, so that I can clear this usage data (a few users have an issue where the remembered printer keeps defaulting to the XPS writer on certain AX forms, despite us having KB981681 installed and the correct printer being available, just not defaulted).
I know this data's somewhere in the Usage Data, which I can browse via AX:
Microsoft Dynamics AX > Tools > Development Tools > Application Objects > Usage Data
AOT > System Documentation > Tables > SysLastValue > (right click) > Add-Ins > Table Browser
Or through SQL:
use AXDB
go
select *
from SysLastValue
where userid in
(
select id
from userinfo
where networkalias in ('userid1','userid2')
)
and elementname like '%print%'
and iskernel = 1
However, so far I've not been able to work out which setting holds the last used printer information. Since the value field is of type image (i.e. a blob), I also can't search based on value.
Any advice on how to find this setting would be helpful.
Unfortunately there really isn't one "last used printer" stored, so much as each process packs and stores its own last used print settings. Here is an example of how you can pull the last used print settings after posting a picking slip from the sales form.
static void JobGetPrinterSettingsPickList(Args _args)
{
    container                   lastValues;
    SalesFormLetter_PickingList pickList = new SalesFormLetter_PickingList();
    SRSPrintDestinationSettings printSettings;

    // Fetch the packed usage data stored for the current user/company for the
    // SalesFormLetter_PickingList class as called from the SalesTable form.
    lastValues = xSysLastValue::getValue(curext(), curUserId(), UtilElementType::Class,
        classStr(SalesFormLetter_PickingList), formStr(SalesTable));

    // Restore the object's state from the packed container, then pull out the
    // stored print destination settings.
    pickList.unpack(lastValues);
    printSettings = new SRSPrintDestinationSettings(pickList.printerSettingsFormletter());

    info(strFmt("%1", printSettings.printerName()));
    info(strFmt("%1", printSettings.printerType()));
}
Edit: Ah, I see you're having a specific issue. Check the pack/unpack methods and the version of whatever object is having the issue; that is likely where the problem lies. Or, if it happens on several things, check whether they're all extended classes and you need to look at the parent class.
