How can I determine the "reason" behind the status of an EC2 instance that has just been registered on Amazon ELB?

I'm working on a deployment script (more specifically, an Ansible module) that registers an EC2 instance with an Amazon ELB. The script uses the Boto library.
Here's a look at the relevant part of the script:
def register(self, wait):
    """Register the instance for all ELBs and wait for the ELB
    to report the instance in-service"""
    for lb in self.lbs:
        lb.register_instances([self.instance_id])
        if wait:
            self._await_elb_instance_state(lb, 'InService')

def _await_elb_instance_state(self, lb, awaited_state):
    """Wait for an ELB to change state

    lb: load balancer
    awaited_state: state to poll for (string)"""
    while True:
        state = lb.get_instance_health([self.instance_id])[0].state
        if state == awaited_state:
            break
        else:
            time.sleep(1)
(BTW the code above is from Ansible's ec2_elb module.)
So, when the instance is first registered it is 'OutOfService'. The script here "waits" for the instance to reach the state 'InService' after it has passed health checks etc.
So here's the problem: the process above is overly simplistic (which is why I'm trying to customize the module for my own purposes). The main issue I've hit is that if the load balancer is not configured to serve the availability zone the instance resides in, the instance will remain out of service indefinitely, and the script above will just hang.
What I'd like to do (and that's why I'm customizing this built in module) is find a way to determine if the ELB is just waiting around for the instance to pass the healthcheck OR if there is some other reason (like an unregistered availability zone) that's causing it to remain Out of Service.
The Boto library (via the Amazon ELB API) does provide slightly more detail than state: it has a "reason" attribute which is described in the Boto docs (and also the Amazon ELB API docs) as follows:
reason_code (str) – Provides information about the cause of an
OutOfService instance. Specifically, it indicates whether the cause is
Elastic Load Balancing or the instance behind the LoadBalancer.
I could find very little documentation on the reason_code attribute, so I'm not sure a) what the possible return values even are, and b) what they actually mean in relation to my question above.
I think what I want to do is doable, given that Amazon is able to display a detailed reason for why an instance is out of service in the management console -- and from what I understand they are dogfooding their API there.
So how/where can I find the more detailed reason behind the instance's status?

Ah, it's the description field of InstanceState:
description (str) – A description of the instance.
I guess that was so vague that my brain ignored it.
It also looks like the possible values of reason_code are two strings:
'ELB'
'Instance'
That's just from playing around with the API; it's not definitive or anything.
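For reference, here's a minimal polling sketch using boto's ELB API that surfaces both fields; the region, load balancer name, and instance id are placeholders:

import time
import boto.ec2.elb

conn = boto.ec2.elb.connect_to_region('us-east-1')
lb = conn.get_all_load_balancers(load_balancer_names=['my-load-balancer'])[0]

while True:
    # get_instance_health returns InstanceState objects carrying
    # state, reason_code and description.
    instance_state = lb.get_instance_health(['i-0123456789abcdef0'])[0]
    if instance_state.state == 'InService':
        break
    # reason_code should be 'ELB' or 'Instance'; description is the
    # human-readable explanation the management console shows.
    print(instance_state.reason_code, instance_state.description)
    time.sleep(1)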

Related

Fn project is missing http operations (CRUD)

I have spent my afternoon getting very excited about the container-native serverless platform 'fn project' - http://fnproject.io/.
I love the idea of the FaaS model but have no intention of locking myself into a particular cloud vendor for most of the lifetime of an app - and several other reasons including the desire to spin up the entire app on a small server anywhere if I choose.
fn project seems great for my needs until I finish perusing the documentation and all the relevant blog posts and suddenly think 'what? Wait....what??? Where are the http operations?'.
I cannot find a single reference anywhere that states whether it is even possible to have http triggers for different http operations (i.e. POST, PUT, PATCH, DELETE), let alone how I would do it.
I want to build REST APIs (or at the very least JSON-serving HTTP-based RPC APIs - if it doesn't have hypermedia links it isn't REST ;) but let's not get into that one in this thread).
Am I missing something here (certainly the correct bit of documentation)??
Can anybody please enlighten me as to how I would do this, or even tell me if I have totally misunderstood what I should use this for?
My excitement has gone soft for now, but I'm hoping that will change with the right information.
It feels odd that I can't find anyone else complaining about this, so perhaps that indicates a misunderstanding on my part.
Other solutions such as OpenFaaS look interesting, but I don't want to have to learn how to deploy Kubernetes and Docker swarms if I can avoid it :)
I'm not an expert, but as of now it does not seem possible to specify the HTTP method inside the trigger. Check the latest trigger spec: as you can see, there is no notion of an HTTP method there.
However, handling different HTTP methods can be done inside the function itself.
For example, in Java (with fdk-java v1.0.80), you can use com.fnproject.fn.api.httpgateway.HTTPGatewayContext as the first parameter of the function, as described in the section "Accessing HTTP Information From Functions" of the documentation:
In Fn for Java, when your function is being served by an HTTP trigger (or another compatible HTTP gateway) you can get access to both the incoming request headers for your function by adding a 'com.fnproject.fn.api.httpgateway.HTTPGatewayContext' parameter to your function's parameters.
Using this allows you to:
...
Access the method and request URL for the trigger
...
You can then retrieve the HTTP method by calling getMethod() on the HTTPGatewayContext passed as parameter.
In other languages (with the other FDKs), it's possible to do the same:
in Go : example calling RequestMethod() on context
in Ruby : class HTTPContext
in Python : class HTTPGatewayContext
in Node : class HTTPGatewayContext
From these different contexts, you'll then be able to get the method parameter passed with fn invoke --method=[GET|POST|...] (via the fn-http-method header).
The main drawback here is that all HTTP methods should be handled in the same function. Nonetheless, you can structure your code to have only one class per method.
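To make that concrete, here's a hypothetical Python sketch of the dispatch pattern. It assumes fdk-python's invoke context exposes the gateway headers via ctx.Headers() and that the verb arrives in the fn-http-method header as described above; treat those names as assumptions rather than the documented API:

import io
import json

from fdk import response


def do_get(ctx, data):
    return response.Response(ctx, response_data=json.dumps({"ok": True}))


def do_post(ctx, data):
    body = data.getvalue() if data else b""
    return response.Response(ctx, response_data=json.dumps({"received": len(body)}))


def handler(ctx, data: io.BytesIO = None):
    # Assumption: the invoke context exposes the request headers via Headers().
    method = ctx.Headers().get("fn-http-method", "GET").upper()

    # One function receives every verb; route to one helper per method.
    routes = {"GET": do_get, "POST": do_post}
    route = routes.get(method)
    if route is None:
        return response.Response(
            ctx,
            status_code=405,
            response_data=json.dumps({"error": "method not allowed"}))
    return route(ctx, data)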
After some further thought it seems fairly clear now what my actual misunderstanding was....
When I have built Serverless framework services in the past (or built and deployed Lambda functions using Terraform) I have been deploying to AWS, and so have been using AWS's API Gateway offering (their product is actually called API Gateway, but it's important to recognise that an API gateway is also a distributed systems / micro-services design pattern).
An API gateway makes it possible to route specific HTTP request types, including the method (GET, POST, PUT, DELETE), to the desired functions.
Platforms such as Fn project and OpenFaaS do not provide an out-of-the-box API gateway solution, and it seems we would need to take care of this ourselves.
These platforms are about deploying functions; we find the other pieces via our product of choice.

Detecting network request

When using gems for external services, sometimes I don't know whether a method makes a round trip to the service or just returns some local computation.
An example - I'm using an image hosting service (Cloudinary) that has a Ruby wrapper for their API. The following command returns a complete URL to a hosted image when providing an image_id:
Cloudinary::Utils.cloudinary_url(image_id)
#=> "http://res.cloudinary.com/my_service/image/upload/v1/some_image_id"
I'd probably construct that URL manually to skip an HTTP request (if that's what happens).
So to my question: is there any faster way than digging into the source code or unplugging internet access to detect whether a network request is made before a value is returned?
Cloudinary uses the RestClient gem, which allows setting up global logging.
https://github.com/rest-client/rest-client#logging
To enable logging globally you can set RestClient.log with a Ruby Logger, or set an environment variable to avoid modifying the code (in this case you can use a file name, "stdout" or "stderr"):
$ RESTCLIENT_LOG=stdout path/to/my/program

Is it possible to have Centralised Logging for ElasticBeanstalk Docker apps?

We have a custom Docker web app running in an Elastic Beanstalk Docker container environment.
We would like the application logs to be available for viewing outside, without downloading them through the instances or the AWS console.
So far none of the solutions below has been acceptable. Has anyone achieved centralised logging for Elastic Beanstalk Dockerized apps?
Solution 1: AWS Console log download
Not acceptable: requires downloading and extracting the logs every time, and is not real-time.
Solution 2: S3 + Elasticsearch + Fluentd
Fluentd does not have a plugin to retrieve logs from S3. There is an excellent S3 plugin, but it is only for log output to S3, not for reading logs from S3.
Solution 3: S3 + Elasticsearch + Logstash
Cons: you can only pull all logs from the entire bucket, or nothing.
The problem lies with the Elastic Beanstalk S3 log storage structure: you cannot specify a file name pattern, so it's either all logs or nothing.
Elastic Beanstalk saves logs on S3 in a path containing random instance and environment ids:
s3.bucket/resources/environments/logs/publish/e-<random environment id>/i-<random instance id>/my.log
The Logstash s3 plugin can be pointed only at resources/environments/logs/publish/. When you try to point it at environments/logs/publish/*/my.log it does not work.
This means you cannot pull a particular log and tag/type it so it can be found in Elasticsearch. Since AWS saves logs from all your environments and instances in the same folder structure, you cannot even choose the instance.
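For reference, the narrowest the s3 input gets is a literal key prefix with no wildcard support, so a sketch like this (the bucket name and region are placeholders) pulls everything under the publish folder:

input {
  s3 {
    bucket => "my-elasticbeanstalk-logs"
    prefix => "resources/environments/logs/publish/"
    region => "us-east-1"
  }
}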
Solution 4: AWS CloudWatch Console log viewer
It is possible to forward your custom logs to the CloudWatch console. To achieve that, put configuration files in the .ebextensions path of your app bundle:
http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/AWSHowTo.cloudwatchlogs.html
There's a file called cwl-webrequest-metrics.config which allows you to specify log files along with alerts, etc.
Great!? Except that the configuration file format is neither YAML, XML nor JSON, and it's not documented. There are absolutely zero mentions of that file or its format on the AWS documentation website or anywhere else on the net.
And getting one log file to appear in CloudWatch is not as simple as adding a configuration line.
The only possible way to get this working seems to be trial and error. Great!? Except that for every attempt you need to re-deploy your environment.
There's only one reference on how to make this work with a custom log: http://qiita.com/kozayupapa/items/2bb7a6b1f17f4e799a22 - I have no idea how that person reverse-engineered the file format.
cons:
CloudWatch does not seem to be able to split logs into columns when displaying them, so you can't easily filter by priority, etc.
AWS Console Log viewer does not have auto-refresh to follow logs.
Nightmare undocumented configuration file format with no way of testing; trial and error requires re-deploying the whole environment.
Perhaps an AWS Lambda function is applicable?
Write some javascript that dumps all notifications, then see what you can do with those.
After an object is written, you could rename it within the same bucket?
Or notify your own log-management service about the creation of a new object?
Lots of possibilities there...
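For instance, here is a minimal sketch of such a function, in Python rather than JavaScript, that just dumps incoming S3 event notifications; wiring the bucket's event notification to the function is assumed:

import json

def lambda_handler(event, context):
    # Dump the raw notification so you can see everything S3 sends.
    print(json.dumps(event))
    # Each record describes one object-level event (e.g. a new log file).
    for record in event.get('Records', []):
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        print('new object: s3://%s/%s' % (bucket, key))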
I've started using Sumologic for the moment. There's a free trial and then a free tier (500 MB/day, 7-day retention). I'm not out of the trial period yet, and my EB app does literally nothing (it's just a few HTML pages served by Nginx in a Docker container). It looks like it could get expensive once you hit any serious amount of logs, though.
It works OK so far. You need to create an IAM user that has access to the S3 bucket you want to read from, and then it sucks the logs over to the Sumologic servers and does all the processing and searching over there. A bit fiddly to set up, but I don't really see how it could be simpler, and it's reasonably well documented.
It lets you provide different path expressions with wildcards, then assign a "sourceCategory" to those different paths. You then use those sourceCategories to filter your log searching to a specific type of logging.
My plan long-term is to use something like your solution 3, but this got me going in very short order so I can move on to other things.
You can use a multicontainer environment, sharing the log folder with another Docker container running the tool of your preference to centralize the logs. In our case we connected Apache Flume to move the files to HDFS. Hope this helps.
The easiest method I found to do this was using Papertrail via rsyslog and .ebextensions; however, it is very expensive for logging everything.
The good part is that with rsyslog you can essentially send your logs anywhere, so you are not tied to Papertrail.
An example ebextension is sketched below.
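This is a hypothetical sketch rather than the actual file from my deployment; the Papertrail host and port are placeholders you'd replace with your own:

files:
  "/etc/rsyslog.d/90-papertrail.conf":
    mode: "000644"
    owner: root
    group: root
    content: |
      # Forward all syslog traffic to Papertrail (host/port are placeholders).
      *.* @logsN.papertrailapp.com:12345

commands:
  01_restart_rsyslog:
    command: service rsyslog restart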
I've found loggly to be the most convenient.
It is a hosted service, which might not be what you want. However, if you check out their setup page you can see a number of ways your situation is supported (Docker-specific solutions, as well as around ten Amazon-specific options). Even if Loggly isn't to your taste, you can look at those solutions and easily see how some of them could be applied to almost any centralized logging solution you might use or write.

ELB not routing traffic to healthy instance

This seems to have something to do with the subnet/availability zone, but I'm new to using a VPC and it's eluding me.
VPC: 10.80.0.0/16
subnet: 10.80.1.0/24 (us-east-1b)
subnet: 10.80.2.0/24 (us-east-1a)
All instances are Windows Server 2012.
I have an internet facing ELB created within my VPC (10.80.0.0/16). There is one instance added from AZ us-east-1a, which is on subnet 10.80.2.0/24. The instance is running IIS 7.5, with an app running on port 80 and /health.aspx set up for use as the ELB health check.
Internal traffic on the VPC is flowing normally (unrestricted). I can request health.aspx from this instance from another instance in us-east-1b (10.80.1.0/24). I can also copy files from one instance to another.
Outbound traffic is unrestricted. I can RDP to the instance (when connected to our VPN) and open a browser and request a web page and get it.
The ELB says the instance is healthy and I can see the requests to health.aspx in the IIS logs. Both the ELB and the instance are configured with a security group that allows 80 and 443.
But if I try to request {elb-url}/health.aspx over the open internet the request just times out. Similarly, with an elastic IP associated to the instance, a request to {elastic-ip}/health.aspx times out.
@Chris, thanks for the response... as it happens, I've already worked it out with some help from a friend. I'll post my findings here for posterity (in case anybody else is similarly confused about how ELB works).
This would be clearer with a diagram, but the summary is that in each availability zone you need to create both a public and a private subnet. When you add availability zones to your ELB, you need to select the public subnet for the zone. This had already been done in us-east-1b before I got to this setup, and I had simply missed this nuance of ELB configuration. So for the new availability zone, I had to do this...
us-east-1c
private subnet 10.1.3.0/24 (using nat instance as default route)
public subnet 10.1.4.0/24 (using internet gateway as default route)
Then my instance goes in the private subnet as expected.
And the linchpin of this whole thing is (drum roll....)
When I add us-east-1c to my ELB, I have to select the public subnet...10.1.4.0. Otherwise the instances will pass the health check (since the ELB can communicate with any instance within my entire VPC) but the responses from the servers cannot make it back out to the public internet.
This is what is so confusing, and I still don't fully understand it. The instance can make a request for, say, www.google.com. I can RDP to it, open a browser, and get the web page. But a request from an outside host (like my laptop at my house) will die. Strange.
PS: another note... make sure you have enough NAT capacity for your load. I think we ran into an issue where our NAT instance simply ran out of ports because too many web servers were trying to route outbound connections to 3rd-party APIs through it. Quite honestly, I'm not good enough at this level of network/OS troubleshooting to be sure, but my theory is that our 8 instances of IIS were holding too many connections open to the NAT instance. We were also abusing the NIC on that micro instance. I upped us to two large instances, one per AZ, and things smoothed back out. Both NAT instances are humming and we're not seeing the hung processes in IIS anymore.
Debugging this kind of issue is always a challenge. I have a few ideas to suggest based on what you have written (and which generally apply to this kind of problem) that come from dealing with it a number of times.
Have you checked both the security groups and network ACLs? Bear in mind that all network ACLs need to be specified in both directions, as they are stateless. Also bear in mind that ELBs are a bit unique in this regard. While they are associated with your VPC, they sometimes need extra rules to ensure connectivity. In the past I have debugged this by opening all network ACLs on all ports, then removing these rules until it has stopped working in order to identify where the block was.
Security groups should be checked too. They are stateful but ensure that your load balancer has permissions to be hit from the web.
Have you checked this isn't an application configuration problem? I don't know how IIS comes out of the box, but I would check that it is set up to respond to all hostnames.
Check the ELB isn't an internal one, as that wouldn't be publicly addressable.
You say the ELB is configured with the health check, but it's worth checking that you also have a listener set up for port 80. It's in a separate tab on the dashboard, and you will need this in addition to the health check for connectivity through the ELB.
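If you want to sanity-check those last two points programmatically, here's a small sketch with boto; the region and load balancer name are placeholders:

import boto.ec2.elb

conn = boto.ec2.elb.connect_to_region('us-east-1')
lb = conn.get_all_load_balancers(load_balancer_names=['my-load-balancer'])[0]

# 'internet-facing' vs 'internal' -- an internal ELB is not publicly addressable.
print(lb.scheme)

# The subnets/AZs the ELB is attached to (these should be the public subnets).
print(lb.subnets, lb.availability_zones)

# You need a listener forwarding port 80 in addition to the health check.
for listener in lb.listeners:
    print(listener)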
Hope one of these tips is useful to you.

How do I consume a real web service from a BPEL Process?

I've been doing some research on BPEL for about two weeks now and still don't quite get it.
I have deployed the HelloWorld sample in ODE and have also managed to deploy this other one.
My intention was to do something like the second example but with my own real WS deployed and working.
I'm now at the point of having a process with no errors and correctly deployed in ODE with the following structure:
I have started the project from a service definition importing my Multiply.wsdl. The Designer has composed the import tag into the MuktiplyProcessArtifacts.wsdl next to the PartnerLinkTypes, all automagically, so I assume all the namespaces etc. are OK.
There are a few concepts I misunderstand, which is keeping all of this from working:
In my original Multiply.wsdl I have
soap:address
location="http://localhost:8080/WS-multiply/multiply"
but ODE tells me my soap:address must have the form host:port/ode/processes..
This doesn't sound reasonable to me, since my WS could be deployed anywhere outside my ODE_HOME.
The second example I mentioned before explains how the Designer presumably creates a "Caller.wsdl", which in fact has the function I would desire, which is to implement a "wrapper" WSDL, providing the BPEL process with entry and exit points. The issue is the Designer does not generate that interface. Am I supposed to create it myself? Do I have to create it at all?
If that 3rd wsdl is really needed, is it the one I would have to call if I wanted to test the whole process?
It looks like your partner WSDL is associated with the myRole of a partnerlink. Partnerlinks and partnerlink types are a BPEL concept used to define dual interfaces, in the sense that if a partner A wants to communicate with a BPEL process as a buyer, it needs to provide a certain set of functionality that the process can use for further communication (i.e. sending a shipment confirmation to the buyer). Thus, a partnerlink maintains two roles: the myRole is the portType (aka interface) that the process itself provides, while the partnerRole refers to a portType the process expects to be implemented by the partner. MyRoles must of course be implemented by the BPEL process, and thus need to have an endpoint that is exposed by the BPEL engine. PartnerRoles can be bound to arbitrary endpoints. This happens in the deployment descriptor, which is the deploy.xml in ODE.
I guess you can fix your process by assigning your partner WSDL to a partner role.
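For illustration, here is a sketch of what that binding can look like in ODE's deploy.xml; the process, service, and port names are hypothetical placeholders based on the Multiply example:

<deploy xmlns="http://www.apache.org/ode/schemas/dd/2007/03"
        xmlns:pns="http://example.com/multiply/process"
        xmlns:wns="http://example.com/multiply/wsdl">
  <process name="pns:MultiplyProcess">
    <active>true</active>
    <!-- myRole: the interface the process itself provides; the engine
         exposes this endpoint under /ode/processes -->
    <provide partnerLink="client">
      <service name="wns:MultiplyProcessService" port="MultiplyProcessPort"/>
    </provide>
    <!-- partnerRole: bound to your real Multiply WS, which can live on
         any host, e.g. http://localhost:8080/WS-multiply/multiply -->
    <invoke partnerLink="multiplyPartner">
      <service name="wns:MultiplyService" port="MultiplyPort"/>
    </invoke>
  </process>
</deploy>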
I hope http://thiliniishaka.blogspot.com/2012/10/develop-ws-bpel-process-using-wso2.html and http://thiliniishaka.blogspot.com/2012/10/part-2-developing-ws-bpel-process-using.html may help you resolve the aforementioned queries.
Thanks
Thilini
It's mandatory to have ode.war deployed on the Tomcat server. Tomcat creates a path like the one below, and you need to configure your endpoint with the complete path /ode/processes:
c:\apache-tomcat-7.0.55\webapps\ode\WEB-INF\processes\BPEL_WS\
