How do you provide runtime inputs to Templates for google-dataflow? - google-cloud-dataflow

I have been triggering Google Dataflow templates through endpoints. Now I want to pass some input to the Dataflow template through these endpoints. These inputs are custom, e.g. the name of the output file. I have been reading about ValueProviders; would these help in this context?

ValueProviders are the way to add template support to a parameter.
If you want to be able to provide a runtime option for a job, you first need to define that ValueProvider in your user code:
https://cloud.google.com/dataflow/docs/templates/creating-templates
Once you do that, you'll be able to run the job providing a runtime value for the specific parameter:
https://cloud.google.com/dataflow/docs/templates/executing-templates

Yes, ValueProviders help with taking runtime parameters. However, you need to make a stub call to expose the custom parameters; otherwise Dataflow jobs will not accept them as input. Hope this helps others as well.

Use plugin config as parameter in services.yml

The documentation only shows that you can pass the SystemConfigService as a parameter to another service.
Is there also the possibility to pass directly the value from the plugin configuration?
Background of the question: I would like to initialize directly an instance of an external component. But this expects fixed arguments as strings. Alternatively, one would otherwise have to write some kind of factory.
Hm, it's possible to do. I have not done this directly myself, but I am 99% confident that it will work. You may need to play around with it a little.
In the services.xml you can use symfony expressions.
<argument type="expression">service('Shopware\Core\System\SystemConfig\SystemConfigService').get('SwagBasicExample.config.example')</argument>
You may need to find the alias name for the Shopware\Core\System\SystemConfig\SystemConfigService class instead. Also check the Symfony documentation, you can do a lot more with this!
I myself passed an array as an argument, but used a custom class as a config getter like so:
<argument type="expression">
{
"shop_is_active": service('config_bridge').get('isActive'),
"customer_number": service('config_bridge').get('customerNumber'),
"shop_number": service('config_bridge').get('shopNumber'),
"apikey": service('config_bridge').get('apiKey')
}
</argument>
This is not strictly necessary, as Shopware already requires it, but it is good practice to add the requirement to your plugin's composer file:
"require": {
...,
"symfony/expression-language": "~5.3.0|~5.4.0"
},
As of today it's not possible to inject specific system_config values in services.

Dataflow/Beam Templates, Productionization, Initialization, and ValueProviders

I have an Apache Beam job running on Google Cloud Dataflow, and as part of its initialization it needs to run some basic sanity/availability checks on services, pub/sub subscriptions, GCS blobs, etc. It's a streaming pipeline intended to run ad infinitum that processes hundreds of thousands of pub/sub messages.
Currently it needs a whole heap of required, variable parameters: which Google Cloud project it needs to run in, which bucket and directory prefix it's going to be storing files in, which pub/sub subscriptions it needs to read from, and so on. It does some work with these parameters before pipeline.run is called: validation, string splitting, and the like. In its current form, to start a job we've been passing these parameters to a PipelineOptionsFactory and issuing a new compile every single time, but it seems like there should be a better way. I've set up the parameters to be ValueProvider objects, but because they're being called outside of pipeline.run, Maven complains at compile time that ValueProvider.get() is being called outside of a runtime context (which, yes, it is).
I've tried using NestedValueProviders as in the Google "Creating Templates" document, but my IDE complains if I try to use NestedValueProvider.of to return a string as shown in the document. The only way I've been able to get NestedValueProviders to compile is as follows:
NestedValueProvider<String, String> pid = NestedValueProvider.of(
pipelineOptions.getDataflowProjectId(),
(SerializableFunction<String, String>) s -> s
);
(String pid = NestedValueProvider.of(...) results in the following error: "incompatible types: no instance(s) of type variable(s) T,X exist so that org.apache.beam.sdk.options.ValueProvider.NestedValueProvider conforms to java.lang.String")
I have the following in my pipelineOptions:
ValueProvider<String> getDataflowProjectId();
void setDataflowProjectId(ValueProvider<String> value);
Because of the volume of messages we're going to be processing, adding these checks at the front of the pipeline for every message that comes through isn't really practical; we'll hit daily account administrative limits on some of these calls pretty quickly.
Are templates the right approach for what I want to do? How do I go about actually productionizing this? Should (can?) I compile with maven into a jar, then just run the jar on a local dev/qa/prod box with my parameters and just not bother with ValueProviders at all? Or is it possible to provide a default to a ValueProvider and override it as part of the options passed to the template?
Any advice on how to proceed would be most appreciated. Thanks!
The way templates are currently implemented, there is no point at which to perform "post-template creation" but "pre-pipeline start" initialization/validation.
All of the existing validation executes during template creation. If the validation detects that the values aren't available (due to being a ValueProvider), the validation is skipped.
In some cases it is possible to approximate validation by adding runtime checks either as part of the initial splitting of a custom source or as part of the @Setup method of a DoFn. In the latter case, the @Setup method will run once for each instance of the DoFn that is created. If the pipeline is Batch, after 4 failures for a specific instance it will fail the pipeline.
Another option for productionizing pipelines is to build the JAR that runs the pipeline, and have a production process that runs that JAR to initiate the pipeline.
Regarding the compile error you received -- the NestedValueProvider returns a ValueProvider -- it isn't possible to get a String out of that. You could, however, put the validation code into the SerializableFunction that is run within the NestedValueProvider.
Although I believe this will currently re-run the validation every time the value is accessed, it wouldn't be unreasonable to have the NestedValueProvider cache the translated value.
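The caching idea above can be sketched as a small wrapper that runs the translation/validation function once and reuses the result. This is only an illustration under stated assumptions: plain java.util.function.Function stands in for Beam's SerializableFunction so the sketch runs without Beam on the classpath, and CachingTranslator and validateProjectId are hypothetical names, not part of Beam.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/**
 * Wraps a translation/validation function so that it runs at most once
 * successfully per instance; later calls reuse the cached result.
 * Assumption: the input is the same on every call, as it would be for a
 * single runtime parameter.
 */
final class CachingTranslator<T, X> implements Function<T, X> {
    private final Function<T, X> delegate;
    private boolean computed = false;
    private X cached;

    CachingTranslator(Function<T, X> delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized X apply(T input) {
        if (!computed) {
            // Validation/translation happens here. If it throws, nothing is
            // cached and the next call tries again.
            cached = delegate.apply(input);
            computed = true;
        }
        return cached;
    }
}

final class CachingTranslatorDemo {
    /** Hypothetical check: reject an empty project id, otherwise trim it. */
    static String validateProjectId(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            throw new IllegalArgumentException("project id must not be empty");
        }
        return raw.trim();
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        Function<String, String> translator = new CachingTranslator<>(raw -> {
            calls.incrementAndGet();
            return validateProjectId(raw);
        });
        System.out.println(translator.apply(" my-project "));
        System.out.println(translator.apply(" my-project "));
        System.out.println("validation runs: " + calls.get());
    }
}
```

In a real pipeline the wrapped function would have to be serializable, and the cache would live per deserialized instance on each worker, so the validation could still run once per worker rather than once per job.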

Generating a dynamic parameter in Jenkins based off a parameter from the same job

I've got a Jenkins build with a choice box for build prefixes by release. It helps trigger a job based off the value of whatever specific build the person wanted.
I wanted to take the value of that choice box and transform the variable into the correct prefix based off the naming conventions typically used on this server for triggering the job based off its name.
So let's say I've got build prefix choices specifically for,
ReleaseOne
ReleaseTwo
none
For none, meaning the parameters used won't try to access or set any specific release-based info by triggering the non-release-specified build.
I wanted to take the value of Release_Prefix and transform it, if needed, for the job that I trigger later. I was hoping to accomplish this with a dynamic parameter or similar mechanism. I'm not sure if my script is bugged, or something fundamental is not working to my intent. This might be the case, based off some alluded feedback from a similar question.
Can I do something like this snippet below? If not with Dynamic Parameter plugin + GroovyScript, what would you suggest? This currently seems to return nothing, regardless of my choice.
Formatted_Prefix parameter, Dynamic Parameter
switch(binding.getVariables().get("Release_Prefix"))
{
case "none":
return "";
case "ReleaseOne":
return "ReleaseOne_";
case "ReleaseTwo":
return "ReleaseTwo_";
default:
def prefix = binding.getVariables().get("Release_Prefix")
return "$prefix_";
}
There are multiple ways I can overcome this, but if I can do it at the initial parameter stage, that would be best for me.
You can use the EnvInject Plugin for this.
Check the "Prepare an environment for the run" checkbox and
write your script inside the "Evaluated Groovy script" text box:
def prefix1 = Release_Prefix + "mydata"
return [prefix: prefix1]

Dropwizard: customize health check address and format

Is it possible to customize Dropwizard's healthcheck output, e.g. to serve healthchecks at /health instead of /healthcheck and to return output like {"status": 200}?
I realise I could simply write a new resource that does what ever I need, I was just wondering if there is a more standard way to do this.
From what I have read in the 0.7.1 source code, it's unfortunately not possible to change the resource URI for healthchecks, and I highly doubt you can change the healthcheck format. I also remember people complaining about not being able to add REST resources to the admin page, only servlets. Maybe in 0.8.0?
Here are the details of what I've tracked down so far in the source code. I may have misread or misunderstood something, so feel free to correct it.
Metrics has actually written AdminServlet to add the healthcheck servlet in such a way that it checks the servlet config to see whether the URI is defined.
this.healthcheckUri = getParam(config.getInitParameter(HEALTHCHECK_URI_PARAM_KEY), DEFAULT_HEALTHCHECK_URI);
But dropwizard doesn't provide a way to inject this configuration in any way on AbstractServerFactory.
handler.addServlet(new NonblockingServletHolder(new AdminServlet()), "/*");
NonblockingServletHolder is the one which is providing the config to AdminServlet but is created by AbstractServerFactory with empty constructor and provides no way to change the config.
I've thought of and tried to access the ServletHolder from the Environment object on Application.run method but the admin servlets are not created until after run method is run.
environment.getAdminContext().getServletHandler().getServlets()[0].setInitParameter("healthcheck-uri", "/health");
Something like this in your run() function will help you control the URI of your healthchecks:
environment.servlets().addServlet(
"HealthCheckServlet",
new HealthCheckServlet(environment.healthChecks())
).addMapping("/health");
If you want to actually control what's returned, you need to write your own resource. Fetch all the healthchecks from the registry, run them, and return whatever aggregated value you want based on their results.
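The aggregation step described above might look roughly like the following. It is a standalone sketch, not Dropwizard code: the input is a plain map from check name to a healthy/unhealthy flag, which in a real application you would presumably build from the results of running the registered healthchecks. HealthSummary is a hypothetical name, and using 500 for "at least one check failed" is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Reduces individual healthcheck results to a single status payload. */
final class HealthSummary {
    /** Returns 200 when every check is healthy, 500 otherwise (assumed codes). */
    static int aggregateStatus(Map<String, Boolean> results) {
        for (boolean healthy : results.values()) {
            if (!healthy) {
                return 500;
            }
        }
        return 200;
    }

    /** Renders the aggregated status in the {"status": N} shape from the question. */
    static String toJson(Map<String, Boolean> results) {
        return "{\"status\": " + aggregateStatus(results) + "}";
    }

    public static void main(String[] args) {
        // Hypothetical check names, for illustration only.
        Map<String, Boolean> results = new LinkedHashMap<>();
        results.put("database", true);
        results.put("queue", true);
        System.out.println(toJson(results));
    }
}
```

A custom resource mapped to /health could return this string as its response body and set the HTTP status from aggregateStatus.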

Get methods params type parsing wsdl file in a rails/ruby application

I have a question about ruby and wsdl soap.
I couldn't find a way to get each method's params and their type.
For example, if I found out that a SOAP service has a method called "get_user_information" (using wsdlDriver), is there a way to know whether this method requires some params and what types of params it requires (int, string, complex type, etc.)?
I'd like to be able to build html forms from a remote wsdl for each method...
Sorry for my horrible English :D
Are you using soap4r?
Soap4r comes with a command line client to build proxies for accessing web services via SOAP. This is preferable to using the wsdlDriver which has to build the proxy dynamically every time it runs.
To build a "permanent" proxy then you need to run the following command
wsdl2ruby.rb --type client --wsdl http://some/path/to/the/wsdl
When this command runs then you should end up with a bunch of ruby files one of which (probably default.rb) will call each method in turn and document the necessary inputs and outputs.
Alternatively, you may find the Wsdl Analyser useful. This will allow you to enter the URL for a WSDL, which it will then analyse, listing all of the operations and (sometimes) the parameters required.
Thank you for the very quick response!
I'll try to explain myself a little better :D
I've tried soap4r, and I'm able to get soap's methods with something like this:
require "soap/wsdlDriver"
client = SOAP::WSDLDriverFactory.new(my_wsdl_url).create_rpc_driver
puts client.singleton_methods
What I'd like to know is:
If, for example, my soap has a method called "get_some_params_and_sum_them", is there a way to know how many params it takes and which type they should be?
Something like
puts client.method("get_some_params_and_sum_them").params
Wsdl Analyser does it, and I'd like to know if this is possible also in a ruby script without tons of code lines :D
