Spring Security SAML infinite retry loop on bad SAMLResponse - spring-security

I am currently working on configuring the Spring Security SAML extension (I am a Service Provider who wants to integrate with multiple Identity Providers).
I noticed some bizarre behavior. When everything is configured correctly, federated login just works. But when there's a problem parsing the SAMLResponse, my app goes into a retry loop (which I have not configured anywhere). After a few hundred failed retry attempts, it returns a 500 to the user. This dumps a ton of useless stack traces into my logs.
The first SAMLResponse parsing error message contains the actual root cause. All the others carry the same message: InResponseToField in SAMLResponse did not match request Id in AuthnRequest. This is very strange, because in my browser I see only one request and one response (which appears after all retries finish and I receive the 500), and the IDs match in those two.
first failure:
{"#timestamp":"2019-05-10T13:20:17.628+0300","level":"INFO","thread":"http-nio-8000-exec-2","logger":"org.springframework.security.saml.log.SAMLDefaultLogger","message":"AuthNResponse;FAILURE;0:0:0:0:0:0:0:1;org:patkovskyi:test;http://www.okta.com/exkm1jqgci0YsCUgy0x7;;;org.opensaml.common.SAMLException: Response doesn't have any valid assertion which would pass subject validation\n\tat org.springframework.security.saml.websso.WebSSOProfileConsumerImpl.processAuthenticationResponse(WebSSOProfileConsumerImpl.java:229)}
retries:
{"#timestamp":"2019-05-10T13:20:17.630+0300","level":"INFO","thread":"http-nio-8000-exec-2","logger":"org.springframework.security.saml.log.SAMLDefaultLogger","message":"AuthNResponse;FAILURE;0:0:0:0:0:0:0:1;org:patkovskyi:test;http://www.okta.com/exkm1jqgci0YsCUgy0x7;;;org.opensaml.common.SAMLException: InResponseToField of the Response doesn't correspond to sent message a1cj420hf746f14746d8c2gb014a588\n\tat org.springframework.security.saml.websso.WebSSOProfileConsumerImpl.processAuthenticationResponse(WebSSOProfileConsumerImpl.java:139)}
Has anyone seen a similar problem? And is there some hidden retry mechanism in Spring Security SAML?

Related

WSO2 - Extend Allowed URI Length to Maximum

We have an API published on WSO2. It works perfectly.
When I send my request as shown in the picture below, it responds with 200 as expected:
I wanted to test my request by adding more deleted=false query parameters. I can keep sending the request until its size reaches 5.75 KB, and I still get 200 OK. You can see this in the picture below:
But if the request size reaches 5.76 KB after I add one more deleted=false parameter, I see this error:
From what I found online, the REST API supports Uniform Resource Locators (URLs) with a length of up to 6000 characters.
My question is, how can I extend this limit? Is there any way to do that?
As per the shared screenshot, it seems the Backend itself is responding with a 400 Bad Request status code. The API Manager doesn't have any restrictions on large query parameters in the URI. So this error is coming from your actual Backend service, which is not able to handle a large request.
To confirm this behavior, you can enable the WIRE logs in the API Manager server and troubleshoot from there. If the request is dispatched to the Backend and the Backend responds with 400 Bad Request, it means the Backend is only capable of handling requests up to 5.75 KB in your case.
As an alternate check, you can also try invoking the actual Backend service URL from Postman (direct invocation, not via WSO2) and verify the behavior with large requests.
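For the direct Backend check, a small helper can generate request URLs of a known size, so the point at which the 400 appears can be located precisely. The following is a minimal Java sketch: the class name LongQueryUrl, the host backend.example and the path are hypothetical placeholders, and only the deleted=false parameter comes from the question.

```java
import java.nio.charset.StandardCharsets;

class LongQueryUrl {

    /** Builds base?name=value&name=value... with the parameter repeated `repeats` times. */
    static String build(String base, String name, String value, int repeats) {
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < repeats; i++) {
            sb.append(i == 0 ? '?' : '&').append(name).append('=').append(value);
        }
        return sb.toString();
    }

    /** Size of the URL in bytes (the URL built here is plain ASCII). */
    static int sizeInBytes(String url) {
        return url.getBytes(StandardCharsets.US_ASCII).length;
    }

    public static void main(String[] args) {
        // Hypothetical backend address; replace with the real service URL.
        String base = "http://backend.example/items";
        for (int repeats : new int[] {100, 400, 450}) {
            String url = build(base, "deleted", "false", repeats);
            System.out.println(repeats + " parameters -> " + sizeInBytes(url) + " bytes");
        }
    }
}
```

Sending the generated URLs straight to the Backend (with Postman or any HTTP client) and noting the size at which the 400 first appears should show whether the limit belongs to the Backend rather than to the API Manager.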
Given below are a few documents on enabling and understanding WIRE logs:
WSO2 API Manager v3.1.0: Enable WIRE Logs
WSO2 API Manager v2.6.0: Enable WIRE Logs
How to read and understand WIRE Logs

Performing Poorly org.springframework.security.authentication.ProviderManager.authenticate(Authentication)

I am trying to figure out why this is performing so poorly:
execution(Authentication org.springframework.security.authentication.ProviderManager.authenticate(Authentication)) ->
ELAPSED_TIME="592 ms"
I am using an org.springframework.security.ldap.authentication.LdapAuthenticationProvider.
What can I log/capture to see why this is taking so long?
Thanks,
Brian
Since you are using LDAP authentication, the first authentication request must be verified against the LDAP server (that's why the first request takes so long). I don't know exactly how LDAP works, but it seems that subsequent requests also have to be verified against the server, only with some kind of cache on the LDAP server side (hence the faster response times).
Please see this thread:
Slow authentication to LDAP Server on initial login attempt
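To check whether only the first call is slow, as suggested above, one option is to time several consecutive calls and compare them. A minimal Java sketch follows; AuthTiming and measure are hypothetical names, and the supplier in main is only a stand-in for a real call such as providerManager.authenticate(token), which is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class AuthTiming {

    /** Runs the call `attempts` times and returns the elapsed milliseconds of each attempt. */
    static List<Long> measure(Supplier<?> call, int attempts) {
        List<Long> elapsedMs = new ArrayList<>();
        for (int i = 0; i < attempts; i++) {
            long start = System.nanoTime();
            call.get();
            elapsedMs.add((System.nanoTime() - start) / 1_000_000);
        }
        return elapsedMs;
    }

    public static void main(String[] args) {
        // Stand-in for () -> providerManager.authenticate(token); replace with the real call.
        Supplier<String> fakeAuthenticate = () -> "authenticated";
        System.out.println("elapsed ms per attempt: " + measure(fakeAuthenticate, 5));
    }
}
```

If the first entry is much larger than the rest, that would be consistent with a connection setup or cache warm-up cost rather than a cost paid on every call.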

Jersey Client: Authentication fails at redirect by Jenkins

I am attempting to use the REST api of Jenkins. Jenkins requires a POST request to a URL to delete a job. This results in the following:
1. I tell my chosen Client to send a POST to the appropriate URL.
2. The client sends a POST and authorizes itself with username and password.
3. Jenkins deletes the job.
4. Jenkins returns a "302 - Found" with the location of folder containing the deleted job.
5. Client automatically sends a POST to the location.
6. Jenkins answers with "200 - OK" and the full HTML of the folder page.
This works just fine with Postman (unless I disable "Automatically follow redirects" of course).
Jersey however keeps running into a "404" at step 5 because I blocked anonymous users from viewing the folder in question. (Or a "403" if I blocked anonymous users altogether.)
Note that the authentication works in step 1 because the job has been deleted successfully!
I was under the impression that Jersey should use the given authentication for all requests concerning the client.
Is there a way to actually make this true? I really don't want to forbid redirects just to do every single redirect myself.
To clarify: The problem is that Jersey follows the redirect but fails to authenticate itself again, which leads the server to reject the second request.
Code in question:
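// Imports assume the javax.ws.rs namespace (newer Jersey versions use jakarta.ws.rs).
import javax.ws.rs.client.Client;
import javax.ws.rs.client.ClientBuilder;
import javax.ws.rs.client.WebTarget;
import javax.ws.rs.core.Response;
import org.glassfish.jersey.client.authentication.HttpAuthenticationFeature;

// Basic authentication, registered on the client so it applies to its requests
HttpAuthenticationFeature auth = HttpAuthenticationFeature.basicBuilder()
        .credentials(username, token)
        .build();
Client client = ClientBuilder.newBuilder()
        .register(auth)
        .build();
WebTarget deleteTarget = client.target("http://[Jenkins-IP]/job/RestTestingArea/job/testJob/doDelete");
Response response = deleteTarget.request()
        .post(null);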
EDIT: The "302-Found" only has 5 headers according to Postman: Date, X-Content-Type-Options ("nosniff"), Location, Content-Length (0) and Server. So there are neither cookies nor tokens that Postman might use and Jersey might disregard.
Question loosely related to this one - if I were able to log the second request I might be able to understand what's happening behind the scenes.
EDIT2: I have also determined that the problem is clearly with the authentication. If I allow anonymous users to view the folder in question, the error disappears and the server answers with a 200.
I found the answer with the help of Paul Samsotha and Gautham.
TL;DR: This is intended behavior and you have to set the System property http.strictPostRedirect=true to make it work or perform the second request yourself.
As also described here, HttpURLConnection decided not to implement a redirect as it is defined in the HTTP standard but instead the way many browsers implemented it (so in layman's terms, "Do it like everyone else instead of how it is supposed to work"). This leads to the following behavior:
1. Send POST to URL_1.
2. Server answers with a "302 - Found" and includes URL_2.
3. Send GET to URL_2, dropping all the headers.
4. Server answers with a "404 - Not Found" as the second request does not include the correct authentication headers.
The "404" response is the one received by the code, as steps 2 and 3 are "hidden" by the underlying code.
Because all headers are dropped, the authentication fails. As Jersey uses this class by default, this led to the behavior I was experiencing.
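The two workarounds from the TL;DR can be sketched as follows. This is a minimal sketch that uses plain HttpURLConnection rather than Jersey, so the class name RedirectWorkarounds and its method names are hypothetical, and the second request is sent as a GET with the credentials attached again, mirroring the redirect sequence described above. Only the http.strictPostRedirect property name comes from the answer itself.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class RedirectWorkarounds {

    /** Option 1: set the system property before any request is sent (or pass -Dhttp.strictPostRedirect=true). */
    static void enableStrictPostRedirect() {
        System.setProperty("http.strictPostRedirect", "true");
    }

    /** Builds the value of a Basic Authorization header. */
    static String basicAuth(String user, String token) {
        String raw = user + ":" + token;
        return "Basic " + Base64.getEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    private static HttpURLConnection open(URI uri, String method, String auth) throws IOException {
        HttpURLConnection c = (HttpURLConnection) uri.toURL().openConnection();
        c.setInstanceFollowRedirects(false); // we follow the redirect ourselves
        c.setRequestMethod(method);
        c.setRequestProperty("Authorization", auth);
        return c;
    }

    /** Option 2: send the POST, then perform the redirected request manually with the same credentials. */
    static int postThenFollow(URI uri, String auth) throws IOException {
        HttpURLConnection first = open(uri, "POST", auth);
        first.setDoOutput(true);
        first.getOutputStream().close(); // empty POST body
        int status = first.getResponseCode();
        String location = first.getHeaderField("Location");
        if (status != HttpURLConnection.HTTP_MOVED_TEMP || location == null) {
            return status; // no 302 to follow
        }
        // Resolve relative Location values against the original URI.
        return open(uri.resolve(location), "GET", auth).getResponseCode();
    }
}
```

With Jersey itself, the equivalent of option 2 would be to disable automatic redirects on the client and issue the second request through the same authenticated client; the sketch above only shows the underlying idea.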

401s during low network connectivity with rails api using devise-token-auth

I have a Rails API using devise-token-auth. Recently I was on really spotty/slow Wifi and I noticed I was getting 401s from my app. My theory is that the refreshed auth token is either being lost or delayed by the bad network. That being said, I'm having a hard time reproducing the bug itself.
Three primary questions:
Could a spotty Wifi/network connection lead to 401s, due to loss or delay of the new auth-token? And if this is the case, is there a way to recover without needing the user to log back in?
How can I reproduce such an environment, so I can debug this scenario?
I was able to reproduce it by delaying the server response using a debugger. In my case, this happens when I enable the change_headers_on_each_request config: when the response that carries the new tokens fails to arrive, the following requests are rejected with a 401.
I recently opened an issue with the gem explaining this and asking how I can handle this situation on the frontend.
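The failure mode described above can be illustrated with a toy model. This is a minimal Java sketch under loud assumptions: it is not the gem's implementation, the names TokenRotationDemo and Server are hypothetical, and it ignores any grace period or batching that the real library may apply. It only shows why a lost response leaves the client holding a stale token when the token changes on every request.

```java
class TokenRotationDemo {

    /** Toy server that replaces the valid token on every successful request. */
    static class Server {
        private String currentToken = "t0";
        private int counter = 0;

        /** Returns the newly issued token, or null to signal a 401. */
        String handle(String presentedToken) {
            if (!currentToken.equals(presentedToken)) {
                return null; // stale or unknown token
            }
            currentToken = "t" + (++counter);
            return currentToken;
        }
    }

    /** Returns true if the request sent after a lost response is rejected. */
    static boolean nextRequestRejectedAfterLostResponse() {
        Server server = new Server();
        String clientToken = "t0";
        clientToken = server.handle(clientToken);  // normal round trip: client stores t1
        server.handle(clientToken);                // response lost: server expects t2, client keeps t1
        return server.handle(clientToken) == null; // client sends the stale t1
    }

    public static void main(String[] args) {
        System.out.println("401 after lost response: " + nextRequestRejectedAfterLostResponse());
    }
}
```

In this model the client cannot recover on its own once a response is lost, which matches the behavior reported with change_headers_on_each_request enabled.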

status code 500 internal server error in LoadRunner

I have a web application that I need to load test using LoadRunner. When I record the website using VuGen, it works well and there are no application bugs. But when I replay the script, it fails after login, while navigating to the next page (say, Transaction). At the end of the log, I receive this error:
Action.c(252): Error -26612: HTTP Status-Code=500 (Internal Server Error)
for "http://rob.com/common/transaction
Please help me to resolve this error.
LoadRunner generates HTTP requests just as your browser does, so this is the same error you would get if you went to that URL using your browser. Error code 500 is a generic server error that is returned when there is no better (more specific) error to return.
Most likely the login process requires some form of authentication which is protected against a replay attack by using some form of token. It is up to you to capture this token using Correlations in LoadRunner and replay it as the server expects. The Correlation Studio in VuGen should detect and identify the token for you, but since authentication methods vary it is sometimes impossible to do this automatically and you will have to create a manual correlation. Please consult the product documentation for more details on how to do it. If your website is publicly available online then post its URL and I will try to record the script on my machine.
Thanks,
Boris.
Most common reasons
You are not checking each request for a valid result and are treating a 200 HTTP status as proof that the step succeeded, without examining the content that is returned. As a result, when the returned data is incorrect, you are not branching the code to handle the exception. Go one or two steps beyond the point where your business process came off the rails on an assumed success, and you will get a 500 status for an out-of-context action 100% of the time.
Missed dynamic element. Record three times. Compare the code. Address the changing components.
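Both answers come down to the same idea: capture the dynamic value from the previous response and stop when it is missing, instead of replaying a recorded one. The following is a minimal sketch written in Java to illustrate the idea only; it is not LoadRunner's API, and the class name Correlation, the boundary strings and the HTML fragment are hypothetical.

```java
import java.util.Optional;

class Correlation {

    /** Returns the text between the first `left` boundary and the next `right` boundary, if both exist. */
    static Optional<String> extractBetween(String body, String left, String right) {
        int start = body.indexOf(left);
        if (start < 0) {
            return Optional.empty();
        }
        start += left.length();
        int end = body.indexOf(right, start);
        if (end < 0) {
            return Optional.empty();
        }
        return Optional.of(body.substring(start, end));
    }

    public static void main(String[] args) {
        // Hypothetical fragment of a login page response.
        String loginPage = "<input type=\"hidden\" name=\"token\" value=\"abc123\"/>";
        String token = extractBetween(loginPage, "name=\"token\" value=\"", "\"")
                .orElseThrow(() -> new IllegalStateException("token not found, stop instead of replaying a stale value"));
        System.out.println(token); // prints abc123
    }
}
```

Failing fast when the boundaries are not found is the content check the second answer asks for: a 200 status alone does not prove the page contained what the next step needs.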

Resources