Getting more than 1000 documents using folder() in Appian

I am writing an Appian web API to retrieve documents from our Appian system, which will be used to integrate with our other systems.
To this end, I am using the folder() method to get information about the contents of a folder in Appian.
folder(
theCaseFolder,
"documentChildren"
)
The problem I am having is that while this code works most of the time, we have some cases where there are more than 1000 documents stored against the case. I note that the Appian documentation states that:
The documentChildren and folderChildren properties return up to the first 1000 documents or folders, respectively, that are direct children of the selected folder.
My problem is that we have a few cases where there are more than 3000 documents attached to the case. Is there a way to get a list of those child documents, or am I plain out of luck?

In the long term, I would suggest storing some information about each document in a separate database table. That way you can query the database however you wish, from Appian or in SQL.
In the short term, you can get the first 1000 documents, as described in the documentation, and then move them to a subfolder/different folder or delete them. Repeating this multiple times will get you all the files in the folder.
Move Document Appian Function

Related

Suitelink Reference Table

I work with Wonderware software. One of the objects used to perform communication between Wonderware and the PLC is called Suitelink. In it, I have a table defined that has the name of one of my application fields on the left side and the name of the PLC tag providing its value on the right.
Once this is saved and activated (deployed), the PLC tags will feed values into the field attributes in Wonderware.
Does anyone know where this list is saved in the system?
I am working on a web page and want to retrieve this list dynamically, so I can have the page updated based on the current live value of the PLC tag being used.
I have looked in the database but could not find it.
C:\ProgramData\Wonderware\DAServer
Within there you'll have several subfolders, one for each of your DA Servers. Open the relevant subfolder to find a *.AAcfg file; its contents are in what looks like an XML format. You'll be hunting for all the <DeviceItem> tags.
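If it helps, here is a minimal Python sketch of that hunt, assuming the default install path above. The <DeviceItem> tag name comes from the answer, but the attribute layout inside each element is an assumption, so the script just dumps whatever attributes it finds:

import os
import xml.etree.ElementTree as ET

# Path from the answer above; adjust if your DAServer config lives elsewhere.
ROOT = r"C:\ProgramData\Wonderware\DAServer"

for dirpath, _dirnames, filenames in os.walk(ROOT):
    for name in filenames:
        if not name.lower().endswith(".aacfg"):
            continue
        path = os.path.join(dirpath, name)
        tree = ET.parse(path)
        # Dump every <DeviceItem>, wherever it sits in the tree (this
        # ignores XML namespaces). The exact attribute names holding the
        # field/tag mapping are not documented here, so print them all.
        for item in tree.getroot().iter("DeviceItem"):
            print(path, item.attrib)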

Differences in Umbraco cache structure?

OK, so I have just spent the last 6-8 weeks in the weeds of Umbraco and have made some fixes/improvements to our site and environments. I have spent a lot of that time trying to correct lower-level Umbraco caching issues. Reflecting on my experience, I still don't have a clue what the conceptual differences are between the following:
Examine indexes
umbraco.config
cached xml file in memory (supposedly similar to umbraco.config)
CMSContentXML Table
Thanks Again,
Devin
Examine indexes are indexes of Umbraco content.
Whenever you create/update/delete content, the current content information is indexed.
These indexes are used for searching - under the hood, they are Lucene indexes.
The Umbraco back office uses these indexes for searching.
You can create your own index if you want.
For more info, check out Overview & Explanation - "Examining Examine" by Peter Gregory.
umbraco.config and the cached XML in memory are really the same thing.
The front-end UmbracoHelper API gets content from the cache, not the database - the cache is loaded from umbraco.config.
CMSContentXML contains each content item's information as XML,
so essentially this XML represents all the information of a content node.
So in a nutshell, they really represent 3 things:
Examine is used for searching
umbraco.config caches data - saving round trips to the DB
CMSContentXML stores the full information of a content item
Edit: to include better clarification from Robert Foster's comment, and UmbracoHelper vs ExamineManager.
For the umbraco.config and CMSContentXML table, @robert-foster commented:
umbraco.config stores the most recent version of all published content only; the in-memory cache is a cached version of this file; and the cmscontentxml table stores a representation of all content and is used primarily for preview mode - it is updated every time a content item is saved. IIRC it also stores a representation of other content types
Regarding UmbracoHelper vs ExamineManager:
The UmbracoHelper API mostly gets its content from the memory cache - IMO it works best when locating direct content, such as when you know the id of the content you want and can just call Umbraco.TypedContent(id).
But where do you get that id in the first place? Put another way, if you want to find all content whose Title property contains the word "Test", you would use Examine to search for it. Because Examine is really a Lucene wrapper, it is going to be fast and efficient.
Although you can traverse the tree with methods such as Umbraco.TypedContent(id).Children and then use LINQ to filter the result, I believe this is done in memory using LINQ to Objects, so it is not as efficient and performant as Lucene.
So personally I think:
use Examine when you are searching for (locating) content - because you get the capabilities of a proper search engine, Lucene;
once you have the ids from the search result, use UmbracoHelper to get the full published content representation of each id as a strongly typed model, and work with the data.
One thing @robert-foster mentioned in the comments which I did not know is that UmbracoHelper provides a Search method which is a wrapper around Examine, so use that if you are more familiar with that API.
Lastly, if any statement above is wrong or not quite correct, please help me clarify it so that anyone looking at this later will not get it wrong. Thanks all.

Apache Solr: Merging documents from two sources before indexing

I need to index data from a custom application in Solr. The custom app stores metadata in an Oracle RDBMS and documents (PDF, MS Word, etc.) in a file store. The two are linked in the sense that the metadata in the database refers to a physical document (PDF) in the file store.
I am able to index the metadata from the RDBMS without issues. Now I would like to update the indexed documents with an additional field in which I can store the parsed content from the PDFs.
I have considered and tried the following:
1. Using the Update RequestHandler to try and update the indexed document. This didn't work, and the original document indexed from the RDBMS was overwritten.
2. Using SolrJ to do atomic updates, but I am not sure if this is a good approach for something like this.
Has anyone come across this issue before and what would be the recommended approach?
You can update the document, but it requires that you know the id of the existing document. For example:
{
"id": "5",
"parsed_content":{"set": "long text field with parsed content"}
}
Instead of just saying "parsed_content":"something" you have to wrap the value in "parsed_content":{"set":"something"} to trigger adding it to the existing document.
See https://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22field.22 for documentation on how to work with multivalued fields etc.
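For what it's worth, here is a minimal sketch of sending that atomic update over HTTP from Python with the requests library. The host and the core name ("mycore") are assumptions; adjust them to your setup:

import requests

# Assumed local Solr instance and core name; change to match your setup.
SOLR_UPDATE = "http://localhost:8983/solr/mycore/update?commit=true"

doc = {
    "id": "5",
    # The {"set": ...} wrapper makes this an atomic update rather than a
    # full replace, so the fields indexed from the RDBMS are preserved.
    "parsed_content": {"set": "long text field with parsed content"},
}

# The JSON update handler accepts a list of documents.
resp = requests.post(SOLR_UPDATE, json=[doc])
resp.raise_for_status()

Note that atomic updates require the updateLog to be enabled in solrconfig.xml, and every field other than copyField destinations must be stored; otherwise Solr cannot reconstruct the untouched fields of the existing document.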

Google Drive queryForFilesList not returning any results

I'm currently having issues with the iOS Google Drive SDK. I'm using GTLQueryDrive queryForFilesList to search for a file in my Google Drive. All the files I want have a path in the format directory-name/file-name. Since the SDK/API doesn't allow searching for files using a full path, I'm using the following query to ultimately get the file's downloadUrl:
((title = 'directory-name') AND ('root' in parents) AND (mimeType = 'application/vnd.google-apps.folder')) OR
((title = 'file-name') AND (not 'root' in parents) AND (mimeType != 'application/vnd.google-apps.folder'))
The first line is meant to find all directories in the root directory whose name matches mine, and the second line should match all files with the same name. This should return the directory I'm looking for, the file I'm looking for, and maybe some other stuff (e.g. files with the same name in other directories). I have some code to figure out which file is the correct one.
The problem I'm having is that sometimes I get no results from the query. This generally happens after I rename the file, and rename it back, or other things like that. The weird part is that if I run either of the two lines of the query independently, it returns correctly, but together they don't.
Any help would be greatly appreciated, and I would gladly provide more information if required.
And yes, I'm using the kGTLAuthScopeDrive scope.
The ideal solution would be if I could just search using a full path, so if there's a way to do that, I'm not aware of it.
Unfortunately, I was unable to get this to work. Also unfortunately, Google does not provide an API to query for a full path, so I resorted to iterating over the path components to get the directory IDs and then, once I reached the file, getting its ID and downloading it. Although slightly more complex, my solution was based on this question and its chosen answer: What's the right way to find files by "full path" in Google Drive API v2
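To make the approach concrete, here is a rough sketch of that path walk against the Drive v2 REST endpoint, written in Python with requests rather than the iOS SDK. The helper names are hypothetical, and obtaining the OAuth token is left out:

import requests

FILES_URL = "https://www.googleapis.com/drive/v2/files"

def find_child_id(token, parent_id, title, is_folder):
    # One files.list call, scoped to a single parent from the path walk.
    # Note: titles containing quotes would need escaping.
    q = "title = '%s' and '%s' in parents and trashed = false" % (title, parent_id)
    if is_folder:
        q += " and mimeType = 'application/vnd.google-apps.folder'"
    resp = requests.get(FILES_URL, params={"q": q},
                        headers={"Authorization": "Bearer " + token})
    resp.raise_for_status()
    items = resp.json().get("items", [])
    return items[0]["id"] if items else None

def resolve_path(token, path):
    # Walk "directory-name/file-name" one segment at a time,
    # starting from the special "root" alias.
    segments = path.split("/")
    parent = "root"
    for folder in segments[:-1]:
        parent = find_child_id(token, parent, folder, is_folder=True)
        if parent is None:
            return None  # a directory along the path was not found
    return find_child_id(token, parent, segments[-1], is_folder=False)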

Delete multiple documents in CouchDB

I have a "best practice" question on CouchDB (actually I'm using TouchDB, a CouchDB port for iOS) when using the CouchCocoa framework.
I need to delete a bunch of documents that I get via a query.
I know 3 ways to do this:
1) put all the documents into an NSArray, then use [CouchDatabase deleteDocuments:]
2) for each query row, call the delete method, like:
for (CouchQueryRow* row in query.rows)
[row.document DELETE];
3) create a query that emits the _id and _rev properties, add the _deleted property, then use a bulk update, like:
[couchDatabase putChanges:]
Which is better performance-wise? Is there a better way to do it?
At the HTTP API level, the fastest way to achieve this is to run a single batch request that provides the _id and current _rev of all documents to be removed.
Your job is to make sure that CouchCocoa actually does this. I know that CouchCocoa will try to cache the _rev of documents it reads, so if you are deleting documents that have just been read, [CouchDatabase deleteDocuments:] should be enough; otherwise you will have to call [CouchDatabase getDocumentsWithIDs:] first.
If your documents are very large, it might be better to get the _rev using a view instead of a bulk fetch. This forces you to use [CouchDatabase putChanges:] to perform the bulk deletion. I don't know where the document size threshold lies, so you will have to benchmark this one.
Of course, you also need to decide what happens when a conflict occurs.
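For reference, this is roughly what that single batch request looks like at the HTTP level; a Python sketch using requests, where the server URL and database name ("mydb") are assumptions:

import requests

DB_URL = "http://localhost:5984/mydb"

def bulk_delete(docs):
    # Each entry only needs _id, the current _rev, and the _deleted flag;
    # CouchDB processes the whole batch in one _bulk_docs request.
    payload = {
        "docs": [
            {"_id": d["_id"], "_rev": d["_rev"], "_deleted": True}
            for d in docs
        ]
    }
    resp = requests.post(DB_URL + "/_bulk_docs", json=payload)
    resp.raise_for_status()
    # The response lists a status per document; entries with an "error"
    # key (typically "conflict") were not deleted and need handling.
    return resp.json()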
