Web Crawler for testing purpose? - ruby-on-rails

I want to test a set of ruby-on-rails applications. Specifically, I want to trigger all possible GET/POST requests available. I am considering using some web crawler-like tool, which could (recursively) send requests to my web server, get responses, and parse the response HTML file to get all possible "href tags", "form submission buttons", etc.
Essentially I want to see the performance of these web applications and get some logs of things like what are the request routes, parameters, database accesses, queries, transactions, etc.
Sending GET requests is relatively easy to handle, I would need to simply parse the HTML response and extract the href attributes of all anchors. However, I don't know how to handle those POST requests; they would require me to fill in all these parameter fields included in the form fields. I am wondering if there exist some tools doing such work. Or some tools I can easily modify (not too much) code to achieve my functionality?
Thanks a lot.

Related

Load-testing in Orbeon - request generation

I've been trying to use Gatling to load-test our Orbeon servers. More specifically we want to know how many concurrent users the server can handle submitting forms.
I've already captured the requests using Gatling (one request per form field that is filled in). However, the requests are not working when I replay them. My first thought upon inspecting the requests is that it should contain a valid UUID. But where can I generate this ID, or parse it from the initial request? Is it even possible to manually generate these requests?
Any other suggestion for a load-testing tool for Orbeon would also be helpful.
We often do something similar here, using JMeter, but the idea is the same whatever tool you're using. Indeed, Ajax requests:
Need to have to be "in" the same session used to generate the page to which they are related, i.e. typically carry the correct JESSSIONID cookie.
Need to refer to the proper UUID. You can find the UUID in the HTML of the page, in the <input type="hidden" name="$uuid" value="…">.
Need to have the correct <xxf:sequence>1</xxf:sequence> number. I.e. 1 for the first request made after the page is loaded, then 2, and so on.

MVC 5 how to achieve POST that behaves like a redirect to GET with content

My client redirects to a https://domain.com/Controller/GetInfo?Querystring method. Now my query string is getting dangerously close to the 2K limit, so I need to reproduce this behavior but pack my query string into the content of the messages. Since it would be heresy (etc.) to try a GET with content, I'll use a POST. However, I can't redirect to a POST since a Redirect has no content.
So, what I am looking for is the best MVC 5 pattern to resolve this: I need to provide lots of content, but I want the resulting page hosted on my remote server (i.e. as if I had redirected)
Also, since I use load balanced servers in azure, I'd prefer maintaining my clean stateless server if at all possible (else I'll have to introduce session caching).
#AntP is absolutely right in the comments above. If your query string is approaching 2K, then you're abusing it.
If there's a particular object you're referencing, then you can simply include the id or some other identifying piece of it and use that to look it up again from your data store.
If there's no persistent record of the object, then you can use something like Session or TempData to store it between one request and the next.
Regardless, it's not possible to redirect with a request body, with also means it's not possible to redirect using POST. The reason for this that the a redirect is not something the server does, but rather the client. The server merely suggests that the client go to a different URL. It's then up to the client (web browser) to issue a new request for that URL. Since the client is the one issuing the request, it makes the decision about what data is or isn't included in that request, not the server.

Ping/Post Form Handling with PHP?

I'm working with a company on lead delivery, and they sent me some info regarding a Ping Post form setup. I've built hundreds of HTML forms processed by PHP (ie. sending an email/etc), but never something that would Ping a url, then return a value. The value it returns is XML.
Here's the purpose of the process:
I send a lead (form data) using the form with a particular zip code
This company parses that info, decides if it wants to "buy" it
Returns XML saying "Approved" or "Denied"
If "approved", I then post the data, and if "denied", I can do whatever I want
What is a common PHP method for doing this? I can research the code and put something together, just need to know what structure or PHP methods would work?
Thanks in advance.
You should be looking into RESTful Web Services.
here's a few examples that might help you
http://markroland.com/blog/restful-php-api/
http://coreymaynard.com/blog/creating-a-restful-api-with-php/
I did not create these examples, just what I found on Google.
I used file_get_contents(url) to handle the posting. The url contains inputs from the HTML form added as a query string, and the response is in XML which gets handled with simplexml_load_file().
As far as I understand your question what you need is to make an HTTP POST request and parse the incoming XML data.
I would rather not use file_get_contents() on remote servers - there are some potential security issues and it was missing some features the last time I checked. I strongly recommend cURL for remote HTTP/HTTPS communication.
Depending on the API you are posting to you might be able to use the SOAPclient class, but from the look of the response you got all you need is XML parser or Simple XML.
Anyway if you just need to check if a certain keyword (like Approved or Denied) is present you can use a simple string matching like this
if(strpos($response,'<STATUS>APPROVED</STATUS')!==false){
//approved
}
...

How to properly load HTML data from third party website using MVC+AJAX?

I'm building ASP.NET MVC2 website that lets users store and analyze data about goods found on various online trade sites. When user is filling a form to create or edit an item, he should have a button "Import data" that automatically fills some fields based on data from third party website.
The question is: what should this button do under the hood?
I see at least 2 possible solutions.
First. Do the import on client side using AJAX+jQuery load method.
I tried it in IE8 and received browser warning popup about insecure script actions. Of course, it is completely unacceptable.
Second. Add method ImportData(string URL) to ItemController class. It is called via AJAX, does the import + data processing server-side and returns JSON-d result to client.
I tried it and received server exception (503) Server unavailable when loading HTML data into XMLDocument. Also I have a feeling that dealing with not well-formed HTML (missing closing tags, etc.) will be a huge pain. Any ideas how to parse such HTML documents?
Unfortunately you can't do cross-site loading usting JavaScript without using JSONP. This is a security issue. Your best bet would be to AJAX a request to one of your controller's actions and have it do the web request and return the result to the client.
As far as the 503 Server Unavailable goes, does this happen on every request? It sounds like you're parsing information from WoW Armory. They throttle web requests and will ban you after a certain about of time.
Use http://htmlagilitypack.codeplex.com/ to process HTML on server. Or regexps. Or string.IndexOf. Or import MSHTML via Interop library and use it. Do not load HTML into XML documents unless you're absolutely sure it's pure XHTML.
Also, try to see if 3rd party websites provide more direct access to data - XML, REST, web services.

how to write an artificial request

how can i construct a artificial request to login to twitter or any site for that matter that accpets post forms.
what i've been trying is to extract the headers and post request parameters from the origional request(directed at the action atribute of the form) and copy it to the outgoing url object that i am making.but it just won't work.
And i am aware of the apis and i don't wanna use them i am trying this to write a web proxy site.
I don't fully understand your question (e.g. "aware of the APIs and I don't want to use them") but urlib may be useful, particularly urllib.FancyURLopener(...).
Are you looking for libcurl ?
It's a library that allows you to interact with servers using a bunch of different protocoles, including HTTP. So, for instance, you can simulate POST or GET request.
You can use it as a command line tool or as a library from many languages (PHP, C, etc ...)

Resources