Which encryption algorithm is used by Watson Knowledge Studio to encrypt training data?

I would like to know which encryption algorithm Watson Knowledge Studio uses to encrypt training data. Is it AES-256?
Thanks in advance

The exported WKS model, which will be used by Watson Explorer, is partially encrypted, but we cannot disclose our encryption mechanism.

Related

Does TFF serialize functions from another library?

I'm planning a TFF scheme in which the clients send the server data besides the weights, such as their hardware information (e.g. CPU frequency). To achieve that, I need to call functions from third-party Python libraries, such as psutil. Is it possible to serialize such functions using tff.tf_computation?
If not, what could be a solution to achieve this objective in a scenario where I'm using a remote executor setting through gRPC?
Unfortunately no, this does not work without modification. TFF uses TensorFlow graphs to serialize the computation logic that runs on remote machines; TFF does not interpret Python code on the remote machines.
There may be a solution by creating a TensorFlow custom op. This would mean writing C++ code to retrieve the CPU frequency, plus a Python API to add the operation to the TensorFlow graph during computation construction. TensorFlow's "Create an op" guide provides detailed instructions.
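To illustrate the serialization point, here is a minimal sketch (my own illustration, not part of the original answer) of what happens if psutil is called inside a tff.tf_computation: the Python call runs once at tracing time on the machine that builds the computation, and only its constant result ends up in the serialized TensorFlow graph, so a remote executor never re-reads the client's hardware.

    import tensorflow as tf
    import tensorflow_federated as tff
    import psutil

    @tff.tf_computation
    def report_cpu_freq():
        # psutil runs here, at tracing time on the machine that constructs the
        # computation; the resulting number is baked into the TF graph as a
        # constant, so remote clients would all report the builder's CPU frequency.
        freq = psutil.cpu_freq().current
        return tf.constant(freq, dtype=tf.float32)

    print(report_cpu_freq())  # a fixed value, not a per-client measurement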

Transfer knowledge learned from distributed source domains

To resolve the problem of non-IID data in federated learning, I read a paper which adds a new node with a different data domain and transfers knowledge from the decentralized nodes. My question is: what information is transferred, the model updates or the data?
In layman's terms, non-IID means that class labels are not distributed evenly between the clients used for training. For obvious reasons, in a federated environment it is not feasible for every client to hold and train on IID data. Regarding your specific query about how this works in the paper mentioned in your question, please share a link to the paper.
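As a rough illustration (my own sketch, not taken from any particular paper), non-IID data is often simulated by giving each client only a couple of the available classes:

    import numpy as np

    # Toy labels: 10 classes with 100 samples each.
    labels = np.repeat(np.arange(10), 100)

    def non_iid_split(labels, num_clients=5, classes_per_client=2, seed=0):
        # Assign each client a small subset of classes, so no client sees
        # the full label distribution -- one common way to mimic non-IID data.
        rng = np.random.default_rng(seed)
        classes = np.unique(labels)
        split = {}
        for client in range(num_clients):
            chosen = rng.choice(classes, size=classes_per_client, replace=False)
            split[client] = np.where(np.isin(labels, chosen))[0]
        return split

    split = non_iid_split(labels)
    print({client: len(idx) for client, idx in split.items()})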

What tools do you know for storing, version-controlling, and deploying ML models as an API service?

I found https://dataversioncontrol.com and https://hydrosphere.io/ml-lambda/. What else is there?
Convert your ML pipeline to a standardized text-based representation, and use regular version control tools (such as Git). For example, the PMML standard can represent the most popular R, Scikit-Learn and Apache Spark ML transformation and model types. Better yet, after conversion to the standardized representation, all these models become directly comparable with one another (e.g. measuring the "complexity" of random forest model objects across different ML frameworks).
You can build whatever APIs you like on top of this versioned base layer.
To get started with the PMML standard, please check out the Java PMML API backend project, and its Openscoring REST API frontend project.
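As one hedged example of this workflow (assuming the sklearn2pmml package from the same JPMML ecosystem is installed, along with a Java runtime), a Scikit-Learn pipeline can be exported to a plain-XML PMML file that is easy to commit to Git and diff between versions:

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier
    from sklearn2pmml import sklearn2pmml
    from sklearn2pmml.pipeline import PMMLPipeline

    X, y = load_iris(return_X_y=True)

    # Wrap the estimator in a PMMLPipeline so it can be converted.
    pipeline = PMMLPipeline([
        ("classifier", DecisionTreeClassifier(max_depth=3)),
    ])
    pipeline.fit(X, y)

    # The output is plain XML, so Git history shows exactly what changed
    # between model versions; the same file can later be served through an
    # Openscoring-style REST service.
    sklearn2pmml(pipeline, "iris_tree.pmml")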

What is the recommended method to transport machine learning models?

I'm currently working on a machine learning problem and created a model in a Dev environment where the data set is small, on the order of a few hundred thousand records. How do I move the model to a Production environment where the data set is very large, on the order of billions?
Is there any generally recommended way to transport machine learning models?
It depends on which development platform you're using. I know that DL4J uses a Hadoop hyperparameter server. I write my ML programs in C++ and use my own generated data, while TensorFlow and others use data that is compressed and unpacked using Python.

For real-time data I would suggest one of the Boost libraries, as I have found them useful for dealing with large amounts of real-time data, for example image processing with OpenCV. There must be an equivalent set of libraries suited to your data. CSV data is easy to process using C++ or Python. In short: real-time data (Boost), images (OpenCV), CSV (Python), or you can just write a program that pipes the data into your program using Bash (tricky).

You could have it buffer the data somehow, routinely serve the data to your ML program, then retrieve the results and store them in a MySQL database. It sounds like you need a data server or a data-management program so the ML algorithm just works away on its chunk of data. Hope that helps.
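As a more direct answer to the original question, here is a minimal sketch (one common approach, not tied to any platform mentioned above): serialize the trained model in Dev and load the same artifact in Production, so only the model file moves between environments rather than the training data.

    import joblib
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    # Dev environment: train on the small data set and persist the model.
    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    model = LogisticRegression(max_iter=1000).fit(X, y)
    joblib.dump(model, "model.joblib")   # ship this single file to Production

    # Production environment: load the artifact and score the large data set
    # in batches; no retraining is required.
    model = joblib.load("model.joblib")
    print(model.predict(X[:5]))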

Does the IBM Watson API learn from my data?

I'm testing a couple of the IBM Watson APIs.
Does Watson get smarter and learn more about my data the more I use it?
I read that Watson gets smarter with the more data it learns from and processes. I'm not sure whether this happens only behind the scenes on the IBM Watson team's side, or whether these APIs also allow an instance of Watson to get smarter with the specific application I'm developing.
If you mean that Watson is using the data you input into your instances, then no. Watson is IBM's, but your data is always yours.
By default, instances are isolated.
By "smarter", they mean that IBM trains its very own instances of the APIs. They also improve the algorithms behind the scenes.
It depends on your definition of learning. Is it offline learning or online learning? Do you mean Watson learning from your corpus across the entire domain and using it later on, or learning just from your data?
It also depends on which services you use; check out Retrieve and Rank or the Natural Language Classifier for examples of services that learn from your data.
