Duplicated port of child tasks in Spring Cloud Data Flow - spring-cloud-dataflow

When I launch new task (Spring Batch Job) using Spring Cloud Data Flow, I see that SCDF auto initialize Tomcat with some "random" ports but I do not know if there ports are created randomly or following any rule of the framework?
Therefore, I sometime have a trouble that "Web server failed to start. Port 123456 was already in use".
In conclusion, my questions are:
1) How does the framework choose ports for initializing? (randomly or by principle)?
2) Are there anyway to launch task effectively without duplicated ports(fixed configuration or method for choosing unused port at particular time)?

I don't think SCDF has anything to do with the port assignment etc.,
It is your task application that gets launched. You need to decide whether you really need the web dependency that brings in the tomcat to your application.
Assuming you use Spring Boot, you can either exclude the web starter dependency in your dependencies or pass the command line arg server.port=<?> to a specific port when launching the task (if you really need this task app to be a web app).

Related

Azure Cloud Service microservice to K8 Migration

I am in the process of evaluating moving a very large Azure Cloud Service (Web Role) microservice architecture to AKS and have been working through the necessary code and build changes to support it.
In order to replicate the production environment locally for the developers, we run nginx on the host with SSL offloading and DNS (hosted in Azure) A records pointing to 127.0.0.1. When running in the Azure Emulator, the net affect is the ability for both the developer to visit the various web front ends in their browser (i.e. https://myapp.mydomain.dev) as well as hit the various API's in the solution (Web API 2) in Postman/cURL, etc.
Additionally due to how the networking of the Azure Emulator works, the apps themselves can resolve each other through nginx on the host (i.e. MVC app at https://myapp.mydomain.dev can obtain a token from the IdP web API at https://identity.mydomain.dev and then use that token at the API at https://api.mydomain.dev). This is the critical piece and the source of my question.
All attempts at getting the containers themselves to resolve each other the same way the host OS can (browser/Postman, SSL offloading via nginx) have failed. Many of the instructions out there are understandably for linux containers but having adapted the various networking docker-compose settings for the windows container equivalent have not yet yielded an success. In order to keep the development environments aligned with the real work systems, which are tenantized and make sure of the default mapping in nginx to catch all incoming traffic and route it to a specific user facing app/container, it is not as simple as determining a "static" method of addressing these on startup and why the effort was put in to produce the development environments we have today.
Right now when one service (container) attempts to communication with another, it ultimately results in a resolution error as all requests resolve to https://127.0.0.1 due to the DNS A records hosted in Azure for the domain. Since this migration will be a longer term project, the environments need to co-exist so changing the way that DNS is resolved (real DNS A records pointing to 127.0.0.1), host running nginx and handling SSL offloading to the various webroles normally running in the Azure Emulator is not an option.
Is there a way (with Windows containers) to either:
Allow the container to utilize nginx on the host OS transparently (app must still call the API at https://api.mydomain.dev), which will cause the traffic to be routed properly to the correct container/port defined in the docker-compose file?
OR
Run nginx on each container, allowing each container to then resolve and route appropriately without knowing the IP of the other container, possibly through an alias which could be added to the containers nginx.conf before the service starts?
The platform utilizes OAuth2/OIDC and it is critical to maintain the full URL to the other services from the applications perspective. Beyond mirroring production and sandbox environments, this URL's are utilized for redirect URL and post logout redirect URL validation among other things so using "https://myContainerNameForOtherContainerAlias" is not a workable solution.
Will I have the same problem when setting up the AKS environment as well?

Configure Spring Cloud Task to use Kafa of Spring Cloud Data Flow server

I have a Spring Cloud Data Flow (SCDF) server running on Kubernetes cluster with Kafka as the message broker. Now I am trying to launch a Spring Cloud Task (SCT) that writes to a topic in Kafka. I would like the SCT to use the same Kafka that SCDF is using. This brings up two questions that I have and hope they can be answered:
How to configure the SCT to use the same Kafka as SCDF?
Is it possible to configure the SCT so that the Kafka server uri can be passed to the SCT automatically when it launches, similar to
the data source properties that get passed to the SCT at launch?
As I could not find any examples on how to achieve this, help is very appreciated.
Edit: My own answer
This is how I get it working for my case. My SCT requires spring.kafka.bootstrap-servers to be supplied. From SCDF's shell, I provide it as an argument --spring.kafka.bootstrap-servers=${KAFKA_SERVICE_HOST}:${KAFKA_SERVICE_PORT}, where KAFKA_SERVICE_HOST and KAFKA_SERVICE_PORT are environment variables created by SCDF's k8s setup script.
This is how to launch the task within SCDF's shell
dataflow:>task launch --name sample-task --arguments "--spring.kafka.bootstrap-servers=${KAFKA_SERVICE_HOST}:${KAFKA_SERVICE_PORT}"
You may want to review the Spring Cloud Task Events section in the reference guide.
The expectation is that you'd choose the binder of choice and pack that library in the Task application's classpath. With that dependency, you'd then configure the application with Spring Cloud Stream's Kafka binder properties such as the spring.cloud.stream.kafka.binder.brokers and others that are relevant to connect to the existing Kafka cluster.
Upon launching the Task application (from SCDF) with these configurations, you'd be able to publish or receive events in your Task app.
Alternatively, with the Kafka-binder in the classpath of the Task application, you can define the Kafka binder properties to all the Task's launched by SCDF via global configuration. See Common Application Properties in the ref. guide for more information. In this model, you don't have to configure each of the Task application with Kafka properties explicitly, but instead, SCDF would propagate them automatically when it launches the Tasks. Keep in mind that these properties would be supplied to all the Task launches.

Worker "dyno" in AWS Elastic Beanstalk

Amazon Web Service has now a worker tiers in their Elastic Beanstalk. But, it nevertheless confuse us who come from the days of Worker dyno.
As a comparison, in Heroku, one can configure two dynos (something like processor?) each for web and worker. The web will work for any request, and will timeout normally at 15 secs. Thus, if you have a request that last more than that, your request will simply timed-out although not terminated per se. In that case, you should use worker and your web dyno should visit the endpoint several times per minutes (maybe) to check if there is any result to be brought back to the user. To make either worker or web dyno, what you need is just slide the slider and you are good to go. Sometimes, you may need a Procfile. But there is nothing fancy, or something really difficult, or confusing about it.
In AWS EBS (Elastic Beanstalk), since day 1 you hit eb init, you will be asked whether it is a Standard or Worker. When you hit Standard, it seems there is no way to make it as worker as well.
In our situation, the worker and standard web is located under one application. So, how could we use an EBS instance both for worker and standard. Our worker is using sidekiq, and redis. Please, point to any guidance or help us in this matter.
AWS Elastic Beanstalk has two types of Environments - Web tier and Worker tier.
Web tier environments are meant for web applications - http/https request processing. You get one or more EC2 instances behind a load balancer. You can get other resources like database per your requirement. You can choose the platform you wish e.g. Ruby, Python, Java, Node.js, PHP, Docker.
Worker environments are meant for asynchronous message processing. When you create a worker environment you do not have a load balancer. All your EC2 instances are in an autoscaling group. All these instances are running a daemon which is polling a single SQS queue for messages. When a message is pulled by the daemon from the SQS queue, the daemon sends a HTTP Post request on localhost:80. You can configure the port but the important thing is that the daemon posts the message as an HTTP request on localhost. Your worker application is actually a web application that receives the post request and processes the message. After the message is successfully processed the worker daemon expects that your web application running on localhost returns a HTTP 200 OK response. The daemon then deletes the message from SQS queue. You can write your worker application for any platform just like standard web server applications - Ruby, Python, Java, Node.js, PHP, Docker.
Based on my understanding of your usecase I would recommend creating two Elastic Beanstalk environments - one Standard and one Worker environment. The Standard web server receives HTTP requests and processes them synchronously. This environment puts the relevant data in an SQS queue. The second environment is a worker and the daemon running on this environment polls this SQS queue for messages. Your second environment is a web application that is NOT open to the internet. The worker daemon posts the messages as HTTP requests to your worker environment. Thus you can process long running workloads asynchronously using this second worker environment.
With worker environments you can use your own queues or Elastic Beanstalk can generate a queue for you. You can configure parameters like message visibility timeout, http connections based on your requirements or you can use the defaults.
Below are some links that may be useful for you:
http://aws.amazon.com/blogs/aws/background-task-handling-for-aws-elastic-beanstalk/
http://blogs.aws.amazon.com/application-management/post/Tx1Y8QSQRL1KQZC/Elastic-Beanstalk-Video-Tutorial-Worker-Tier
https://stackoverflow.com/a/23942498/161628
Does this meet your requirements? Please let me know if you have further questions.
Update
You need to upload your source code at two places - once for the worker environment and once for the web server environment. If someone was starting from scratch then they might have two separate code bases. But I think in your case I think it should be perfectly fine to have a single code base shared between the two environments. Suppose your web request arrives at '/register', then the register() method in your application can post messages to an SQS queue and be done with the HTTP request. Now your worker environment will poll the SQS queue and post messages over HTTP on localhost to the URL '/async_register' which will invoke a method async_register() in your application and do the asynchronous processing. These two methods can live in the same source code bundle which can be shared by both the worker and web server environment. The code path taken by worker and web server will be different so that web server environments will invoke register() and worker environments will invoke the async_register() method.
Another caveat is that HTTP requests sent by the worker daemon on localhost will contain an HTTP header - "User-Agent": "aws-sqsd/1.1". Read more here. So in your web application you can have a single listener to post requests on "/register" and depending on whether this header is present or not you invoke register() or async_register() methods internally.
Also I think if you want to share the code base between the two environments you can upload the code base at only one place. Your environments are logically grouped into applications. So you can have a single application. You upload your source code to this application using the "CreateApplicationVersion" API call. Suppose you upload an application version with label 'v1'. You can now create a worker environment and a web server environment under the same application. When you create an environment you need to provide a version to deploy to your enviroment. In this case you can deploy v1 to both environments. So you will be sharing the same source code for both environments. When you have a new version "v2". You upload this version and then perform an update on both environments changing their version to "v2".
The same version of the source code can be deployed to both environments. They will be running on different EC2 instances because one environment is dedicated for responding to web requests and one environment is dedicated for responding to asynchronous web requests (worker).

Identify which server instance is represented by a particular java.exe on task manager

From my RAD workspace, I start a Websphere portal server and a Websphere App Server. Now when I go to Task Manager--> Processes, I see two java.exes running that represent the two server instances.
How can I tell which java.exe is for the portal and which java.exe is the App Server?
PROFILE_HOME/logs/SERVER_NAME/SERVER_NAME.pid contains the server PID. In task manager, ensure View > Select Columns > PID (Process Identifier) is checked.
Use Sysinternals Process Explorer instead of Task Manager. It can show you the full command line and the loaded libraries for every running process; also which process started it. Usually it is easy to distinguish different Java services that way. Some services might store their PID into some file, too, but that depends on the service.

Confusion about installing windows service using command prompts

I have designed a simple windows service in .NET 2.0.
I am trying to deploy it on my local machine. I have switched to design view, and setup ServiceInstaller and ServiceProcessInstaller objects. There is a Project Installer. I have also wrapped the Windows Service into a .NET setup project and install it, leaving an .exe in the specified directory.
I have fired up cmd and entered the path to installutil. This works fine, but then I typeinstallutil and the full path to the service, in Visual Studio command prompt, and this does not work (I've also tried installutil /i and all sorts of things out of desperation). The permissions are local system (extensive).
Any ideas what I am doing wrong? For those here who have installed Windows Services, what was your methodology to install the service?
Thanks
We actually create an installer built into our application. It's a console app that has a command line to install/uninstall the server as well as run as a service or in console mode.
See this article on a Self Installing Service for some details. I like this method as it provides flexibility.
DESCRIPTION:
SC is a command line program used for communicating with the
NT Service Controller and services.
USAGE:
sc [command] [service name] ...
The option has the form "\\ServerName"
Further help on commands can be obtained by typing: "sc [command]"
Commands:
query-----------Queries the status for a service, or
enumerates the status for types of services.
queryex---------Queries the extended status for a service, or
enumerates the status for types of services.
start-----------Starts a service.
pause-----------Sends a PAUSE control request to a service.
interrogate-----Sends an INTERROGATE control request to a service.
continue--------Sends a CONTINUE control request to a service.
stop------------Sends a STOP request to a service.
config----------Changes the configuration of a service (persistant).
description-----Changes the description of a service.
failure---------Changes the actions taken by a service upon failure.
qc--------------Queries the configuration information for a service.
qdescription----Queries the description for a service.
qfailure--------Queries the actions taken by a service upon failure.
delete----------Deletes a service (from the registry).
create----------Creates a service. (adds it to the registry).
control---------Sends a control to a service.
sdshow----------Displays a service's security descriptor.
sdset-----------Sets a service's security descriptor.
GetDisplayName--Gets the DisplayName for a service.
GetKeyName------Gets the ServiceKeyName for a service.
EnumDepend------Enumerates Service Dependencies.
The following commands don't require a service name:
sc
boot------------(ok | bad) Indicates whether the last boot should
be saved as the last-known-good boot configuration
Lock------------Locks the Service Database
QueryLock-------Queries the LockStatus for the SCManager Database
EXAMPLE:
sc start MyService
Here's another reference specific to .NET services.
http://bytes.com/forum/thread739857.html
I'm calling installutil in my setup package and it works for me just fine.
That'd be great if you posted an error message that you're getting when running installutil.

Resources