Rails request statistics gem?

I'm looking for a rails plugin to show request statistics (# of sql queries, time, etc) on each request while in development mode. Something like http://getglimpse.com/ would be great. I've seen one or two of these before but for the life of me I can't find them. Any help?
Ideally, it would show in the header or the footer of every page.

I found a few including the one I was thinking about and some others.
This one is amazing so far:
https://github.com/dsboulder/query_reviewer
This is the one I was thinking of:
https://github.com/josevalim/rails-footnotes
This seems to be a similar but better plugin to rails-footnotes:
https://github.com/brynary/rack-bug
It may be best to mix and match these; try rack-bug and query_reviewer.
A few others are linked from query_reviewer.
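For reference, wiring one of these in is usually just a Gemfile entry plus a middleware line in the development environment. A minimal sketch for rack-bug, from memory of its README (check the version you install; it also supports restricting the toolbar by IP or password):

# Gemfile
group :development do
  gem "rack-bug", :require => "rack/bug"
end

# config/environments/development.rb
config.middleware.use "Rack::Bug"  # injects its toolbar into HTML responses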

Related

From a development perspective, how does the indeed.com URL structure and site work?

On the webmaster's Q and A site, I asked the following:
https://webmasters.stackexchange.com/questions/42730/how-does-indeed-com-make-it-to-the-top-of-every-single-search-for-every-single-c
But, I would like a little more information about this from a development perspective.
If you search Google for anything job related, for example, Gastonia Jobs (City + jobs), then, in addition to their search results dominating the first page of Google, you get a URL structure back that looks like this:
indeed.com/l-Gastonia,-NC-jobs.html
I am assuming that the L stands for location in the URL structure. If you do a search for an industry-related job, or a job with a specific company name, you will get back something like the following (Microsoft jobs):
indeed.com/q-Microsoft-jobs.html
With just over 40,000 cities in the USA, I thought: OK, maybe it's possible they looped through them and created a page for every single one. That would not be hard for a computer. But the site is obviously dynamic, as each of those pages has tens of thousands of results paginated by 10. The q above obviously stands for query. The locations I can understand, but they cannot possibly have created a web page for every single query combination, could they?
OK, it gets a tad weirder. I wanted to see if they had a sitemap, so I typed "indeed.com sitemap.xml" into Google and got the response:
indeed.com/q-Sitemap-xml-jobs.html
Again, I searched for "indeed.com url structure" and, as I mentioned in the other post on Webmasters, I got back:
indeed.com/q-change-url-structure-l-Arkansas.html
Is indeed.com somehow using programming to create a web page on the fly based on my search input into Google? If they are not, how are they able to have a static page for millions upon millions of possible query combinations, have them dynamically paginate, and then have all of those dominate Google's first page of results (albeit that last question may be best for the Webmasters Q&A)?
Does the JavaScript in the page somehow interact with the URL?
It's most likely not a bunch of pages. The "actual" page might be http://indeed.com/?referrer=google&searchterm=jobs%20in%20washington. The site then cleverly produces a human-readable URL using URL rewriting, fetches the jobs in the database that match the query, and voilà...
I could be dead wrong, of course. Truth be told, the technical aspect of it can probably be solved in a multitude of ways. Every time a job is added to the site, all the pages needed to match that job might be created, producing an enormous number of pages for Google to crawl.
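To make the rewrite idea concrete in Rails terms (Indeed almost certainly isn't a Rails app; the route shapes and the Job.search call are purely illustrative), a single dynamic action can serve every one of those static-looking URLs:

# config/routes.rb
# Both "pretty" URL shapes funnel into one search action; the trailing
# .html is absorbed by the implicit (.:format) segment.
get "/l-:location-jobs", to: "jobs#search"
get "/q-:query-jobs",    to: "jobs#search"

# app/controllers/jobs_controller.rb
class JobsController < ApplicationController
  def search
    # Job.search stands in for whatever full-text backend is used.
    @jobs = Job.search(location: params[:location], query: params[:query])
  end
end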
This is a great question; however, it remains unanswered, on the grounds that a basic Google search using
site:indeed.com
returns over 120 million results, and secondly that a query such as "product manager new york" ranks #1 in results. These pages are obviously pre-generated, which is confirmed by the fact that the page cached by the search engine (sometimes several days earlier) has different results from a live query on the site.
Easy: when Google's search bot crawls the pages on Indeed or any other job search site, those pages are dynamically created. Here is another site I run that works similarly to Indeed: http://jobuzu.co.uk
PHP is your friend in this, and Indeed don't just use standard databases; look into Sphinx and Solr, as they offer full-text search with better performance than MySQL etc.
They also make clever use of rel="canonical" and thorough internal linking:
http://www.indeed.com/find-jobs.jsp
Notice that all the pages that actually rank can be found from that direct internal link structure.

Advice on a Full-Text Search Engine in Rails

I am trying to add to my website a search bar like the one on Facebook. I want my users to be able to search through my products, my other users... But I also want the results to be displayed in real time, without pressing a button. I am currently looking at several options (thinking-sphinx, ferret, ...) but I am not sure which one to use, and that's why I would like to get advice from the pros ;)
So my requirements are:
Results displayed in real time in a box on the current page.
CSS customizable.
Able to search through several tables in my PostgreSQL DB.
I am currently using Heroku for production.
I want to choose the best one for my needs, and that's why I am asking your opinion.
Thanks in advance!
Make sure to separate how you want data displayed (your first two requirements) from how you want it indexed (third) from how you want to deploy.
Let's start backwards. Heroku provides limited support for machine configuration; both options you mention require installation of a service that reads and writes files. Heroku has such an option in two ways: 1) PostgreSQL has a built-in full-text search capability, and 2) Heroku has made Flying Sphinx an option, as well as these other options documented on the Heroku site. The first two options may provide the easiest linkage to your database, but I haven't tried other options, so it's possible they do too. So now you have a search index, deployed.
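As a minimal sketch of option 1 (Postgres' built-in full-text search) driven from a Rails model - the products table and its name/description columns here are assumptions for illustration:

# app/models/product.rb
class Product < ActiveRecord::Base
  # Matches rows whose name or description contains the search terms.
  def self.search(term)
    where("to_tsvector('english', name || ' ' || coalesce(description, '')) " \
          "@@ plainto_tsquery('english', ?)", term)
  end
end

Product.search("wireless keyboard")  # => ActiveRecord::Relation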
Real-time "incremental" search is purely a matter of presentation... and maybe performance. Start typing and you start getting results: it's nothing more than sending requests via AJAX to the search server, typically after a short pause in typing (maybe 50ms), and handling the display of results. There are a couple of ways to make that simple, written up in this SO answer.
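The server side of that incremental search can be a plain JSON endpoint that the page queries as the user types; a hedged sketch reusing the Product.search scope above (the {id, text} shape happens to be what select2, mentioned below, expects):

# app/controllers/searches_controller.rb
class SearchesController < ApplicationController
  def index
    results = Product.search(params[:q]).limit(10)
    render json: results.map { |p| { id: p.id, text: p.name } }
  end
end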
I ended up keeping the PostgreSQL engine for now, but used select2 as the jQuery plugin for the presentation and was able to get a Facebook-like search box:
Select2 with Rails and JSON
Thanks for your help!

Can anyone recommend a gem for searching that actually allows me to easily filter my results in Ruby on Rails?

I've tried Thinking Sphinx after being pointed in that direction, and simple filtering seems impossible. I've googled and asked questions for two days now, and it seems it can't be done, which is shocking because it's something commonly done when searching on websites.
All I would like to do is add filtering options to my search form, such as filtering by one or a combination of:
When a user hits the browse page, all the site's users are returned, showing 20 results per page.
Filtering options
in: location
who are: sexual preference
between the ages: age range
and located in: country
My search page works fine because all I require is one text field a user uses for finding users by email, username, or full name. My browse page is a different story, because I'm using one form with multiple text fields and one or two select fields.
Example
Is there a gem that does this easily and performs well at the same time?
or would doing this manually via find methods be the only way?
Kind regards
Apart from Sphinx and Thinking Sphinx, you can consider these gems: meta_where and meta_search.
However, after reading your description, I think Sphinx is indeed the best choice here.
You wrote that it seems impossible to apply simple filtering using Thinking Sphinx. Let me explain a bit of Thinking Sphinx in the context of the post you mentioned under the link: Example
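For the record, filtering with Thinking Sphinx is done by declaring attributes (has) in the index and passing :with at search time. A sketch assuming a User model with age, gender_id, and country_id columns (pre-v3 define_index syntax, which was current at the time):

# app/models/user.rb
class User < ActiveRecord::Base
  define_index do
    indexes username, email, full_name  # fields: matched by the text query
    has age, gender_id, country_id      # attributes: usable as filters
  end
end

# A text query combined with attribute filters:
User.search params[:q],
            with: { country_id: params[:country_id].to_i, age: 18..30 },
            page: params[:page], per_page: 20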
You can go for Elasticsearch. Ruby has the Tire gem, which is a client for Elasticsearch: http://www.elasticsearch.org/
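A similarly hedged sketch of the Tire route (the model and column names are again assumed; the locals are assigned first because Tire instance_evals a no-argument block):

# app/models/user.rb
class User < ActiveRecord::Base
  include Tire::Model::Search     # adds indexing and search methods
  include Tire::Model::Callbacks  # reindexes on save/destroy
end

# A text query with a term filter via Tire's search DSL:
q, country = params[:q], params[:country_id]
User.tire.search do
  query { string q }
  filter :term, country_id: country
end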

Has anyone got any experience using Pfeed? I have several issues building up a 'recent activity' log

I have started trying to use the Pfeed plugin for my Rails app. Apart from the four wiki pages on GitHub, I only found this blog post helpful for getting started.
I have managed to get simple feeds working, like "User bought 12 items about 1 minute ago" etc. But when it comes to customizing the feed items, that's where I'm having issues proceeding. Pfeed uses model & view items for each feed configuration, and I found that the models are working as they should. Very frustrating.
Has anyone used this plugin before? If so, please do let me know how it went. Also, if you have ever used any other good plugins for this sort of recent-activity feature, please show me the way.
Many thanks.
Phyo
Not sure if it's the answer you're looking for, but I looked through a bunch of timeline generators and settled on timeline_fu.
It's quite easy to use (I ended up forking it and making some edits to save some extra variables etc., but it was really easy to do).
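For context, timeline_fu's core is a one-line fires declaration in the model; a sketch assuming a Post model with an author association (the event and association names are illustrative):

# app/models/post.rb
class Post < ActiveRecord::Base
  belongs_to :author, class_name: "User"

  # Creates a TimelineEvent record whenever a post is created.
  fires :new_post, on: :create, actor: :author
end

# A recent-activity log is then just a query over TimelineEvent
# (Rails 3 query syntax shown):
TimelineEvent.order("created_at DESC").limit(20)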

How do I block my Rails app from being hit by bots?

I'm not even sure I'm using the right terminology, whether this is actually bots or not. I didn't want to use the word 'spam' because it's not like I have comments or posts that are being created/spammed. It looks more like something is making the same repeated request to my domain, which is what made me think it was some kind of bot.
I've opened my first Rails app to the 'public', which is really a small group of users, <50 currently. That was last Friday. I started having performance issues today, so I looked at the log, and I see tons of these RoutingErrors:
ActionController::RoutingError (No route matches "/portalApp/APF/pages/business/util/whichServer.jsp" with {:method=>:get}):
They are filling up the log, and I'm assuming this is causing the slowdown. Note the .jsp on the end; this is a Rails app, so I've got no URLs remotely like this in my app. I mean, I don't even have /portalApp, so I don't know where this is coming from.
This is hosted at DreamHost, and I chatted with one of their support people, who suggested a couple of sites that detail using .htaccess to block things. But it looks like you need to know the IP or domain the requests are coming from, which I don't.
How can I block this? How can I find the IP or domain from the request? Any other suggestions?
Follow up info:
After looking at the access logs, it looks like it's not a bot. Maybe I'm not reading the logs right, but there are valid URL requests (generated from within my Flex app) coming from the same IP. So now I'm wondering if it's some kind of plugin generating the requests, but I really don't know. Now I'm wondering if it's possible to block a certain URL request based on a pattern, but I suppose that's a separate question.
Old question, but for people who are still looking for alternatives, I suggest checking out Kickstarter's rack-attack gem. It allows not only blacklisting and whitelisting, but also throttling.
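A minimal sketch of what that looks like for the .jsp probes above (the blacklist method matches the older API this answer mentions; newer rack-attack releases renamed it blocklist):

# config/initializers/rack_attack.rb
class Rack::Attack
  # Drop the bogus JSP probes outright; a Rails app never serves .jsp.
  blacklist("jsp probes") do |req|
    req.path.end_with?(".jsp")
  end

  # Rate-limit everything else per IP.
  throttle("req/ip", limit: 300, period: 5.minutes) do |req|
    req.ip
  end
end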
This page seems to offer some good advice:
Here
The section on blocking by user agent may be something you could look at implementing. Is there any way you can get the bot's user agent from your logs? If so, look for the unique aspect of the user agent that identifies the bot, and add the following to .htaccess, replacing the relevant bits:
# Set the bad_bot env var when the user agent matches (case-insensitive)
BrowserMatchNoCase SpammerRobot bad_bot
# Deny any request flagged as bad_bot
Order Deny,Allow
Deny from env=bad_bot
It's detailed at that link in more depth, and of course, if you can't get the user agent from your logs then this will be of no use to you!
You can also update your public/robots.txt file to allow/disallow robots.
http://www.robotstxt.org/wc/robots.html
