Tensorflow federated : How to map the remote-worker with remote datasets in iterative_process.next?


I would like to point federated_train_data to remote client data, as shown in the code below. Is this possible, and if so, how?
If not, what further implementation is required for me to try this out? Kindly point me to the relevant code.
factory = tff.framework.create_executor_factory(make_remote_executor)
context = tff.framework.ExecutionContext(factory)
tff.framework.set_default_context(context)
state = iterative_process.initialize()
state, metrics = iterative_process.next(state, federated_train_data)
def make_remote_executor(inferred_cardinalities):
  """Make remote executor."""

  def create_worker_stack(ex):
    ex = tff.framework.ThreadDelegatingExecutor(ex)
    return tff.framework.ReferenceResolvingExecutor(ex)

  client_ex = []
  num_clients = inferred_cardinalities.get(tff.CLIENTS, None)
  if num_clients:
    print('Inferred that there are {} clients'.format(num_clients))
  else:
    print('No CLIENTS placement provided')

  for _ in range(num_clients or 0):
    channel = grpc.insecure_channel('{}:{}'.format(FLAGS.host, FLAGS.port))
    remote_ex = tff.framework.RemoteExecutor(channel, rpc_mode='STREAMING')
    worker_stack = create_worker_stack(remote_ex)
    client_ex.append(worker_stack)

  federating_strategy_factory = tff.framework.FederatedResolvingStrategy.factory(
      {
          tff.SERVER: create_worker_stack(tff.framework.EagerTFExecutor()),
          tff.CLIENTS: client_ex,
      })
  unplaced_ex = create_worker_stack(tff.framework.EagerTFExecutor())
  federating_ex = tff.framework.FederatingExecutor(federating_strategy_factory,
                                                   unplaced_ex)
  return tff.framework.ReferenceResolvingExecutor(federating_ex)
This is from https://github.com/tensorflow/federated/blob/master/tensorflow_federated/python/examples/remote_execution/remote_executor_example.py

In the linked example, you can see that the client data is coming from a tf.data.Dataset per-client generated by the make_federated data function.
Client data can be supplied in the form of a serializable tf.data.Dataset or, depending on how you're defining your iterative process, you can tff.federated_map some input data (such as client IDs) to datasets using TensorFlow.
Note that RemoteExecutors are not designed to run against data "on clients", that is, on the remote executor itself. They could perhaps be used this way using TensorFlow code to read data from the remote executor's filesystem into a dataset, but in general this is not a supported use-case. The recommended way to handle client data is to have a TensorFlow computation that can generate a tf.data.Dataset representing the client data based on a client ID or other input to the client's TensorFlow computation.

Related

Vertex AI - Deployment failed

I'm trying to deploy my custom-trained model using a custom-container, i.e. create an endpoint from a model that I created.
I'm doing the same thing with AI Platform (same model & container) and it works fine there.
At the first try I deployed the model successfully, but ever since whenever I try to create an endpoint it says "deploying" for 1+ hours and then it fails with the following error:
google.api_core.exceptions.FailedPrecondition: 400 Error: model server never became ready. Please validate that your model file or container configuration are valid. Model server logs can be found at (link)
The log shows the following:
* Running on all addresses (0.0.0.0)
WARNING: This is a development server. Do not use it in a production deployment.
* Running on http://127.0.0.1:8080
[05/Jul/2022 12:00:37] "GET /v1/endpoints/1/deployedModels/2025850174177280000 HTTP/1.1" 404 -
[05/Jul/2022 12:00:38] "GET /v1/endpoints/1/deployedModels/2025850174177280000 HTTP/1.1" 404 -
Where the last line is being spammed until it ultimately fails.
My flask app is as follows:
import base64
import os.path
import pickle
from typing import Dict, Any

from flask import Flask, request, jsonify
from streamliner.models.general_model import GeneralModel


class Predictor:
    def __init__(self, model: GeneralModel):
        self._model = model

    def predict(self, instance: str) -> Dict[str, Any]:
        decoded_pickle = base64.b64decode(instance)
        features_df = pickle.loads(decoded_pickle)
        prediction = self._model.predict(features_df).tolist()
        return {"prediction": prediction}


app = Flask(__name__)
with open('./model.pkl', 'rb') as model_file:
    model = pickle.load(model_file)
predictor = Predictor(model=model)


@app.route("/predict", methods=['POST'])
def predict() -> Any:
    if request.method == "POST":
        instance = request.get_json()
        instance = instance['instances'][0]
        predictions = predictor.predict(instance)
        return jsonify(predictions)


@app.route("/health")
def health() -> str:
    return "ok"


if __name__ == '__main__':
    port = int(os.environ.get("PORT", 8080))
    app.run(host='0.0.0.0', port=port)
The deployment code which I do through Python is irrelevant because the problem persists when I deploy through GCP's UI.
The model creation code is as follows:
def upload_model(self):
    model = {
        "name": self.model_name_on_platform,
        "display_name": self.model_name_on_platform,
        "version_aliases": ["default", self.run_id],
        "container_spec": {
            "image_uri": f'{REGION}-docker.pkg.dev/{GCP_PROJECT_ID}/{self.repository_name}/{self.run_id}',
            "predict_route": "/predict",
            "health_route": "/health",
        },
    }
    parent = self.model_service_client.common_location_path(project=GCP_PROJECT_ID, location=REGION)
    model_path = self.model_service_client.model_path(project=GCP_PROJECT_ID,
                                                      location=REGION,
                                                      model=self.model_name_on_platform)
    upload_model_request_specifications = {'parent': parent, 'model': model,
                                           'model_id': self.model_name_on_platform}
    try:
        print("trying to get model")
        self.get_model(model_path=model_path)
    except NotFound:
        print("didn't find model, creating a new one")
    else:
        print("found an existing model, creating a new version under it")
        upload_model_request_specifications['parent_model'] = model_path
    upload_model_request = model_service.UploadModelRequest(upload_model_request_specifications)
    response = self.model_service_client.upload_model(request=upload_model_request, timeout=1800)
    print("Long running operation:", response.operation.name)
    upload_model_response = response.result(timeout=1800)
    print("upload_model_response:", upload_model_response)
My problem is very close to this one with the difference that I do have a health check.
Why would it work on the first deployment and fail ever since? Why would it work on AI Platform but fail on Vertex AI?

This issue could have several causes:
Validate the container port configuration; it should use port 8080. This matters because Vertex AI sends liveness checks, health checks, and prediction requests to this port on the container. See the documentation about containers and the documentation about custom containers.
Another possible cause is quota limits, which may need to be increased. The quota documentation describes how to verify this.
In the health and predict routes, use the MODEL_NAME you are using, as in this example:
"predict_route": "/v1/models/MODEL_NAME:predict",
"health_route": "/v1/models/MODEL_NAME",
Validate that the account you are using has sufficient permissions to read your project's GCS bucket.
Validate that the model location is the correct path.
If none of the suggestions above work, contact GCP Support by creating a Support Case. The community cannot troubleshoot this further without access to internal GCP resources.
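The repeated 404s in the question's log are for /v1/endpoints/1/deployedModels/..., which suggests the health checks are arriving on a path the Flask app does not serve. Vertex AI documents the environment variables AIP_HTTP_PORT, AIP_HEALTH_ROUTE, and AIP_PREDICT_ROUTE for custom containers. The following is a minimal sketch, assuming those variables are set inside the container, that reads them instead of hard-coding the routes. The helper name resolve_serving_config and the fallback values are hypothetical.

```python
import os


def resolve_serving_config(env=None):
    """Read port and routes from the environment, with local fallbacks.

    `env` is any mapping; it defaults to os.environ. The fallback values
    are assumptions intended for local testing only.
    """
    env = os.environ if env is None else env
    return {
        "port": int(env.get("AIP_HTTP_PORT", "8080")),
        "health_route": env.get("AIP_HEALTH_ROUTE", "/health"),
        "predict_route": env.get("AIP_PREDICT_ROUTE", "/predict"),
    }


# Example with an explicit mapping, so the result does not depend on the
# machine this runs on.
example = resolve_serving_config({
    "AIP_HTTP_PORT": "8080",
    "AIP_HEALTH_ROUTE": "/v1/endpoints/1/deployedModels/2",
})
print(example["health_route"])
```

In the Flask app, the resolved routes could then be registered with app.add_url_rule(config["health_route"], view_func=health) and similarly for the predict handler, rather than with fixed decorator paths.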

If you haven't found a solution yet, you can try custom prediction routines. They remove the need to write the server part of the code, so you can focus on the logic of your ML model and any pre- or post-processing. See https://codelabs.developers.google.com/vertex-cpr-sklearn#0.

Dask - diagnostics dashboard - custom info about task

I'm using Dask to schedule and run research batches.
Those mostly produce side effects and are quite heavy (ranging from a few minutes to a couple of hours). There's no communication between the tasks.
In code it looks like this, first I'm passing all the batches to process:
def process_batches(batches: Iterator[Batch], log_dir: Path):
    cluster = LocalCluster(
        n_workers=os.cpu_count(),
        threads_per_worker=1
    )
    client = Client(cluster)
    futures = []
    for batch in batches:
        futures += process_batch(batch, client, log_dir)
    progress(futures)
Then I'm submitting repetitions from each batch as tasks:
def process_batch(batch: Batch, client: Client, log_dir: Path) -> List[Future]:
    batch_dir = log_dir.joinpath(batch.nice_hash)
    batch_futures = []
    num_workers = len(client.scheduler_info()['workers'])
    with Logger(batch_dir, clear_dir=True) as logger:
        logger.save_json(batch.as_dict, 'batch')
        for repetition in range(batch.n_repeats):
            cpu_index = repetition % num_workers
            future = client.submit(
                process_batch_repetition,
                batch,
                repetition,
                cpu_index,
                logger
            )
            batch_futures.append(future)
    return batch_futures
Is there any way to pass some custom info about the submitted task to the dashboard?
All I'm seeing are just tasks process_batch_repetition. Could I replace it with a custom string, so I can see what batch configurations are being processed at the moment?

Got an answer from Dask's BDFL mrocklin.
You can use the key= keyword to specify a key for the future. This should
be unique per future. Dask will use the prefix of the key name to
determine how it is rendered on the dashboard. See the docstring for
dask.utils.key_split for examples on how a key prefix is generated from a
key.
So you can use it like this:
future = client.submit(
    process_batch_repetition,
    batch,
    repetition,
    cpu_index,
    logger,
    key=f'{str(batch)}_repetition_{repetition}'
)
You just pass a unique string for this task. Some characters are forbidden (e.g. spaces), so expect some key errors until the string is cleaned up.
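Since some characters cause trouble in keys, it may help to build the key with a small helper. The sketch below is an assumption on my part, not part of the answer: make_task_key is a hypothetical name, and the choice to end the key with a hyphen and the repetition number relies on my reading of the dask.utils.key_split docstring, where a trailing numeric token after a hyphen is dropped from the prefix (so repetitions of one batch would share a name on the dashboard).

```python
import re


def make_task_key(batch_label, repetition):
    """Build a unique key for one repetition of a batch.

    Any run of characters other than letters, digits, underscores, and dots
    (including spaces and hyphens) is collapsed into a single underscore, so
    the only hyphen left in the key is the one before the repetition number.
    """
    safe_label = re.sub(r"[^A-Za-z0-9_.]+", "_", str(batch_label)).strip("_")
    return f"{safe_label}-{repetition}"


print(make_task_key("lr=0.1 depth 3", 2))
```

The result could then be passed as key=make_task_key(batch, repetition) in client.submit. Keys must still be unique per future, so two batches whose labels sanitize to the same string would need an extra distinguishing token.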

Using DASK to read files and write to NEO4J in PYTHON

I am having trouble parallelizing code that reads some files and writes to neo4j.
I am using dask to parallelize the process_language_files function (3rd cell from the bottom).
I try to explain the code below, listing out the functions (First 3 cells).
The errors are printed at the end (Last 2 cells).
I am also listing environments and package versions at the end.
If I remove dask.delayed and run this code sequentially, it works perfectly well.
Thank you for your help. :)
==========================================================================
Some functions to work with neo4j.
from neo4j import GraphDatabase
from tqdm import tqdm
def get_driver(uri_scheme='bolt', host='localhost', port='7687', username='neo4j', password=''):
    """Get a neo4j driver."""
    connection_uri = "{uri_scheme}://{host}:{port}".format(uri_scheme=uri_scheme, host=host, port=port)
    auth = (username, password)
    driver = GraphDatabase.driver(connection_uri, auth=auth)
    return driver

def format_raw_res(raw_res):
    """Parse neo4j results"""
    res = []
    for r in raw_res:
        res.append(r)
    return res

def run_bulk_query(query_list, driver):
    """Run a list of neo4j queries in a session."""
    results = []
    with driver.session() as session:
        for query in tqdm(query_list):
            raw_res = session.run(query)
            res = format_raw_res(raw_res)
            results.append({'query':query, 'result':res})
    return results

global_driver = get_driver(uri_scheme='bolt', host='localhost', port='8687', username='neo4j', password='abc123')  # neo4j driver object
This is how we create a dask client to parallelize.
from dask.distributed import Client
client = Client(threads_per_worker=4, n_workers=1)
The functions that the main code is calling.
import sys
import time
import json
import pandas as pd
import dask
def add_nodes(nodes_list, language_code):
    """Returns a list of strings. Each string is a cypher query to add a node to neo4j."""
    list_of_create_strings = []
    create_string_template = """CREATE (:LABEL {{node_id:{node_id}}})"""
    for index, node in nodes_list.iterrows():
        create_string = create_string_template.format(node_id=node['new_id'])
        list_of_create_strings.append(create_string)
    return list_of_create_strings

def add_relations(relations_list, language_code):
    """Returns a list of strings. Each string is a cypher query to add a relationship to neo4j."""
    list_of_create_strings = []
    create_string_template = """
        MATCH (a),(b) WHERE a.node_id = {source} AND b.node_id = {target}
        MERGE (a)-[r:KNOWS {{ relationship_id:{edge_id} }}]-(b)"""
    for index, relations in relations_list.iterrows():
        create_string = create_string_template.format(
            source=relations['from'], target=relations['to'],
            edge_id=''+str(relations['from'])+'-'+str(relations['to']))
        list_of_create_strings.append(create_string)
    return list_of_create_strings

def add_data(language_code, edges, features, targets, driver):
    """Add nodes and relationships to neo4j"""
    add_nodes_cypher = add_nodes(targets, language_code)  # Returns a list of strings. Each string is a cypher query to add a node to neo4j.
    node_results = run_bulk_query(add_nodes_cypher, driver)  # Runs each string in the above list in a neo4j session.
    add_relations_cypher = add_relations(edges, language_code)  # Returns a list of strings. Each string is a cypher query to add a relationship to neo4j.
    relations_results = run_bulk_query(add_relations_cypher, driver)  # Runs each string in the above list in a neo4j session.
    # Saving some metadata
    results = {
        "nodes": {"results": node_results, "length":len(add_nodes_cypher),},
        "relations": {"results": relations_results, "length":len(add_relations_cypher),},
    }
    return results

def load_data(language_code):
    """Load data from files"""
    # Saving file names to variables
    edges_filename = './edges.csv'
    features_filename = './features.json'
    target_filename = './target.csv'
    # Loading data from the file names
    edges = helper.read_csv(edges_filename)
    features = helper.read_json(features_filename)
    targets = helper.read_csv(target_filename)
    # Saving some metadata
    results = {
        "edges": {"length":len(edges),},
        "features": {"length":len(features),},
        "targets": {"length":len(targets),},
    }
    return edges, features, targets, results
The main code.
def process_language_files(language_code, driver):
    """Reads files, creates cypher queries to add nodes and relationships, runs cypher query in a neo4j session."""
    edges, features, targets, reading_results = load_data(language_code)  # Read files.
    writing_results = add_data(language_code, edges, features, targets, driver)  # Convert files nodes and relationships and add to neo4j in a neo4j session.
    return {"reading_results": reading_results, "writing_results": writing_results}  # Return some metadata

# Execution starts here
res = []
for index, language_code in enumerate(['ENGLISH', 'FRENCH']):
    lazy_result = dask.delayed(process_language_files)(language_code, global_driver)
    res.append(lazy_result)
Result from res. These are dask delayed objects.
print(*res)
Delayed('process_language_files-a73f4a9d-6ffa-4295-8803-7fe09849c068') Delayed('process_language_files-c88fbd4f-e8c1-40c0-b143-eda41a209862')
The errors. Even if use dask.compute(), I am getting similar errors.
futures = dask.persist(*res)
AttributeError Traceback (most recent call last)
~/Code/miniconda3/envs/MVDS/lib/python3.6/site-packages/distributed/protocol/pickle.py in dumps(x, buffer_callback, protocol)
48 buffers.clear()
---> 49 result = pickle.dumps(x, **dump_kwargs)
50 if len(result) < 1000:
AttributeError: Can't pickle local object 'BoltPool.open.<locals>.opener
==========================================================================
# Name                 Version    Build           Channel
dask                   2020.12.0  pyhd8ed1ab_0    conda-forge
jupyterlab             3.0.3      pyhd8ed1ab_0    conda-forge
neo4j-python-driver    4.2.1      pyh7fcb38b_0    conda-forge
python                 3.9.1      hdb3f193_2

You are getting this error because you are trying to share the driver object among your workers.
The driver object contains private data about the connection, data that do not make sense outside the process (and also are not serializable).
It is like trying to open a file somewhere and share the file descriptor somewhere else.
It won't work because the file number makes sense only within the process that generates it.
If you want your workers to access the database or any other network resource, you should give them the directions to connect to the resource.
In your case, you should not pass the global_driver as a parameter but rather the connection parameters and let each worker call get_driver to get its own driver.
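The difference between shipping a live driver and shipping connection parameters can be illustrated without a database. The sketch below uses a stand-in class, FakeDriver, which is hypothetical and only mimics the relevant property of a real driver: it holds process-local state (here a threading.Lock) that cannot be pickled. The function name process_language and the parameter dictionary are also illustrative assumptions.

```python
import pickle
import threading


class FakeDriver:
    """Stand-in for a database driver holding process-local state."""

    def __init__(self, uri, auth):
        self.uri = uri
        self.auth = auth
        self._lock = threading.Lock()  # locks cannot be pickled


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


def process_language(language_code, conn_params):
    """Task body: build the driver inside the task from plain parameters."""
    driver = FakeDriver(**conn_params)
    return f"{language_code} via {driver.uri}"


conn_params = {"uri": "bolt://localhost:8687", "auth": ("neo4j", "abc123")}

print(can_pickle(FakeDriver(**conn_params)))  # the live object cannot be sent
print(can_pickle(conn_params))                # the parameters can
print(process_language("ENGLISH", conn_params))
```

Applied to the question's code, this would mean passing the arguments of get_driver to dask.delayed(process_language_files) and calling get_driver inside the function, so each worker opens its own connection.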

How to find the concurrent.future input arguments for a Dask distributed function call

I'm using Dask to distribute work to a cluster. I'm creating a cluster and calling .submit() to submit a function to the scheduler. It returns a Futures object. I'm trying to figure out how to obtain the input arguments to that future object once it's been completed.
For example:
from dask.distributed import Client
from dask_yarn import YarnCluster
def somefunc(a, b, c, ..., n):
    # do something
    return
cluster = YarnCluster.from_specification(spec)
client = Client(cluster)
future = client.submit(somefunc, arg1, arg2, ..., argn)
# ^^^ how do I obtain the input arguments for this future object?
# `future.args` doesn't work

Futures don't hold onto their inputs. You can do this yourself though.
futures = {}
future = client.submit(func, *args)
futures[future] = args
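The same bookkeeping pattern can be tried with the standard library's concurrent.futures, whose futures are likewise usable as dictionary keys. This is a sketch of the idea rather than Dask-specific code; the helper submit_with_args and the function add are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def add(a, b):
    return a + b


def submit_with_args(executor, func, arg_tuples):
    """Submit func once per argument tuple and remember which args went where."""
    return {executor.submit(func, *args): args for args in arg_tuples}


with ThreadPoolExecutor(max_workers=2) as executor:
    futures = submit_with_args(executor, add, [(1, 2), (3, 4)])
    # Look up the original arguments as each future completes.
    results = {futures[future]: future.result() for future in as_completed(futures)}

print(results)
```

With Dask, the dictionary would be filled from client.submit instead of executor.submit, and dask.distributed.as_completed could play the role of the standard-library as_completed.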

A future only knows the key by which it is uniquely known on the scheduler. At the time of submission, if it has dependencies, these are transiently found and sent to the scheduler, but no copy is kept locally.
The pattern you are after sounds more like delayed, which keeps hold of its graph, and indeed client.compute(delayed_thing) returns a future.
d = delayed(somefunc)(a, b, c)
future = client.compute(d)
dict(d.dask) # graph of things needed by d
You could communicate directly with the scheduler to find the dependencies of some key, which will in general also be keys, and so reverse-engineer the graph, but that does not sound like a great path, so I won't try to describe it here.

Retrained inception_v3 model deployed in Cloud ML Engine always outputs the same predictions

I followed the codelab TensorFlow For Poets for transfer learning using inception_v3. It generates retrained_graph.pb and retrained_labels.txt files, which can be used to make predictions locally (running label_image.py).
Then, I wanted to deploy this model to Cloud ML Engine, so that I could make online predictions. For that, I had to export the retrained_graph.pb to SavedModel format. I managed to do it by following the indications in this answer from Google's @rhaertel80 and this python file from the Flowers Cloud ML Engine Tutorial. Here is my code:
import tensorflow as tf
from tensorflow.contrib import layers
from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import signature_def_utils
from tensorflow.python.saved_model import tag_constants
from tensorflow.python.saved_model import utils as saved_model_utils
export_dir = '../tf_files/saved7'
retrained_graph = '../tf_files/retrained_graph2.pb'
label_count = 5
def build_signature(inputs, outputs):
    signature_inputs = { key: saved_model_utils.build_tensor_info(tensor) for key, tensor in inputs.items() }
    signature_outputs = { key: saved_model_utils.build_tensor_info(tensor) for key, tensor in outputs.items() }
    signature_def = signature_def_utils.build_signature_def(
        signature_inputs,
        signature_outputs,
        signature_constants.PREDICT_METHOD_NAME
    )
    return signature_def

class GraphReferences(object):
    def __init__(self):
        self.examples = None
        self.train = None
        self.global_step = None
        self.metric_updates = []
        self.metric_values = []
        self.keys = None
        self.predictions = []
        self.input_jpeg = None

class Model(object):
    def __init__(self, label_count):
        self.label_count = label_count

    def build_image_str_tensor(self):
        image_str_tensor = tf.placeholder(tf.string, shape=[None])

        def decode_and_resize(image_str_tensor):
            return image_str_tensor

        image = tf.map_fn(
            decode_and_resize,
            image_str_tensor,
            back_prop=False,
            dtype=tf.string
        )
        return image_str_tensor

    def build_prediction_graph(self, g):
        tensors = GraphReferences()
        tensors.examples = tf.placeholder(tf.string, name='input', shape=(None,))
        tensors.input_jpeg = self.build_image_str_tensor()
        keys_placeholder = tf.placeholder(tf.string, shape=[None])
        inputs = {
            'key': keys_placeholder,
            'image_bytes': tensors.input_jpeg
        }
        keys = tf.identity(keys_placeholder)
        outputs = {
            'key': keys,
            'prediction': g.get_tensor_by_name('final_result:0')
        }
        return inputs, outputs

    def export(self, output_dir):
        with tf.Session(graph=tf.Graph()) as sess:
            with tf.gfile.GFile(retrained_graph, "rb") as f:
                graph_def = tf.GraphDef()
                graph_def.ParseFromString(f.read())
                tf.import_graph_def(graph_def, name="")
            g = tf.get_default_graph()
            inputs, outputs = self.build_prediction_graph(g)
            signature_def = build_signature(inputs=inputs, outputs=outputs)
            signature_def_map = {
                signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature_def
            }
            builder = saved_model_builder.SavedModelBuilder(output_dir)
            builder.add_meta_graph_and_variables(
                sess,
                tags=[tag_constants.SERVING],
                signature_def_map=signature_def_map
            )
            builder.save()

model = Model(label_count)
model.export(export_dir)
This code generates a saved_model.pb file, which I then used to create the Cloud ML Engine model. I can get predictions from this model using gcloud ml-engine predict --model my_model_name --json-instances request.json, where the contents of request.json are:
{ "key": "0", "image_bytes": { "b64": "jpeg_image_base64_encoded" } }
However, no matter which jpeg I encode in the request, I always get the exact same wrong predictions:
Prediction output
I guess the problem is in the way the CloudML Prediction API passes the base64 encoded image bytes to the input tensor "DecodeJpeg/contents:0" of inception_v3 ("build_image_str_tensor()" method in the previous code). Any clue on how can I solve this issue and have my locally retrained model serving correct predictions on Cloud ML Engine?
(Just to make it clear, the problem is not in retrained_graph.pb, as it makes correct predictions when I run it locally; nor is it in request.json, because the same request file worked without problems when following the Flowers Cloud ML Engine Tutorial pointed above.)

First, a general warning. The TensorFlow for Poets codelab was not written in a way that is very amenable to production serving (partly manifested by the workarounds you are having to implement). You would normally export a prediction-specific graph that doesn't contain all of the extra training ops. So while we can try and hack something together that works, extra work may be needed to productionize this graph.
The approach of your code appears to be to import one graph, add some placeholders, and then export the result. This is generally fine. However, in the code shown in the question, you are adding input placeholders without actually connecting them to anything in the imported graph. You end up with a graph containing multiple disconnected subgraphs, something like (excuse the crude diagram):
image_str_tensor [input=image_bytes] -> <nothing>
keys_placeholder [input=key] -> identity [output=key]
inception_subgraph -> final_graph [output=prediction]
By inception_subgraph I mean all of the ops that you are importing.
So image_bytes is effectively a no-op and is ignored; key gets passed through; and prediction contains the result of running the inception_subgraph. Since the subgraph is not using the input you are passing, it returns the same result every time (though I admit I actually expected an error here).
To address this problem, we would need to connect the placeholder you've created to the one that already exists in inception_subgraph to create a graph more or less like this:
image_str_tensor [input=image_bytes] -> inception_subgraph -> final_graph [output=prediction]
keys_placeholder [input=key] -> identity [output=key]
Note that image_str_tensor is going to be a batch of images, as required by the prediction service, but the inception graph's input is actually a single image. In the interest of simplicity, we're going to address this in a hacky way: we'll assume we'll be sending images one-by-one. If we ever send more than one image per request, we'll get errors. Also, batch prediction will never work.
The main change you need is the import statement, which connects the placeholder we've added to the existing input in the graph (you'll also see the code for changing the shape of the input):
Putting it all together, we get something like:
import tensorflow as tf
from tensorflow.contrib import layers
from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import signature_def_utils
from tensorflow.python.saved_model import tag_constants
from tensorflow.python.saved_model import utils as saved_model_utils
export_dir = '../tf_files/saved7'
retrained_graph = '../tf_files/retrained_graph2.pb'
label_count = 5
class Model(object):
    def __init__(self, label_count):
        self.label_count = label_count

    # Leftover from the question's code; export() below does not call it, and
    # keys_placeholder and tensors are not defined in this scope.
    def build_prediction_graph(self, g):
        inputs = {
            'key': keys_placeholder,
            'image_bytes': tensors.input_jpeg
        }
        keys = tf.identity(keys_placeholder)
        outputs = {
            'key': keys,
            'prediction': g.get_tensor_by_name('final_result:0')
        }
        return inputs, outputs

    def export(self, output_dir):
        with tf.Session(graph=tf.Graph()) as sess:
            # This will be our input that accepts a batch of inputs
            image_bytes = tf.placeholder(tf.string, name='input', shape=(None,))
            # Force it to be a single input; will raise an error if we send a batch.
            coerced = tf.squeeze(image_bytes)
            # When we import the graph, we'll connect `coerced` to `DecodeJPGInput:0`
            input_map = {'DecodeJPGInput:0': coerced}
            with tf.gfile.GFile(retrained_graph, "rb") as f:
                graph_def = tf.GraphDef()
                graph_def.ParseFromString(f.read())
                tf.import_graph_def(graph_def, input_map=input_map, name="")
            keys_placeholder = tf.placeholder(tf.string, shape=[None])
            inputs = {'image_bytes': image_bytes, 'key': keys_placeholder}
            keys = tf.identity(keys_placeholder)
            outputs = {
                'key': keys,
                'prediction': tf.get_default_graph().get_tensor_by_name('final_result:0')
            }
            # simple_save lives under tf.saved_model in TensorFlow 1.x
            tf.saved_model.simple_save(sess, output_dir, inputs, outputs)

model = Model(label_count)
model.export(export_dir)

I believe that your error is quite simple to solve:
{ "key": "0", "image_bytes": { "b64": "jpeg_image_base64_encoded" } }
You used " to specify what, I believe, is a string. By doing that, your program reads the literal text jpeg_image_base64_encoded instead of the actual value of the variable.
That's why you always get the same prediction.
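If the request file is generated by hand, one way to rule this out is to build it programmatically so the b64 field always holds real encoded bytes. The sketch below only assumes the request shape shown in the question ("key" plus "image_bytes" with a "b64" field); the function names and the sample bytes are hypothetical.

```python
import base64
import json


def build_instance(image_bytes, key="0"):
    """Wrap raw image bytes in the JSON shape used in the question."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"key": key, "image_bytes": {"b64": encoded}}


def to_request_line(instance):
    """Serialize one instance as a single JSON line."""
    return json.dumps(instance)


# Placeholder bytes for illustration; in practice these would come from
# open(path, "rb").read() on a real JPEG file.
sample_bytes = b"\xff\xd8\xff\xe0 not a real image"
line = to_request_line(build_instance(sample_bytes))
print(line)
```

Writing one such line per image into request.json gives a file that can be passed to the --json-instances flag used in the question.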

For anyone working on deploying TensorFlow image-based models on Google Cloud ML, in particular trying to get the base64 encoding working for images (as discussed in this question), I'd recommend also having a look at the following repo that I put together. I spent a lot of time working through the deployment process and was only able to find partial information across the web and on stack overflow. This repo has a full working version of deploying a TensorFlow tf.keras model onto google cloud ML and I think it will be of help to people who are facing the same challenges I faced. Here's the github link:
https://github.com/mhwilder/tf-keras-gcloud-deployment.
The repo covers the following topics:
Training a fully convolutional tf.keras model locally (mostly just to have a model for testing the next parts)
Example code for exporting models that work with the Cloud ML Engine
Three model versions that accept different JSON input types (1. An image converted to a simple list string, 2. An image converted to a base64 encoded string, and 3. A URL that points to an image in a Google Storage bucket)
Instructions and references for general Google Cloud Platform setup
Code for preparing the input JSON files for the 3 different input types
Google Cloud ML model and version creation instructions from the console
Examples using the Google Cloud SDK to call predict on the models
