I am working on some code that scrapes a page for elements with either of two CSS classes. I am simply using the Hpricot search method for this, like so:
webpage.search("body").search("div.first_class | div.second_class")
...for each item found I create an object and put it into an array. This works great except for one thing.
The search goes through the entire HTML page and adds an object to the array every time it comes across '.first_class', and then it goes through the document again looking for '.second_class'. The final array therefore contains the items in the wrong order: all of the '.first_class' objects, followed by all of the '.second_class' objects.
Is there a way I can get this to search the document in one go and add an object to the array each time it comes across one of the specified classes, giving me an array of items in the order they appear on the page I am scraping?
Any help much appreciated. Thanks

See the section here on "Checking for a Few Attributes":
You should be able to stack the elements in the same way as you do attributes. This feature is apparently possible in Hpricot versions after 2006 Mar 17... An example with elements is:

OK, so it turned out I was mistaken and this didn't do anything different from what I previously had. However, I have come up with a solution; whether it is the most suitable or not I am not sure. It seems like a fairly straightforward fix for an annoying problem, though.
I now perform the search for the two classes as I mentioned above:
However, this still returned an array containing first all the divs with a class of 'first_class', followed by all divs with a class of 'second_class'. So to fix this and get an array of all the items in the order they appear on the page, I simply chain the 'add_class' method with my own custom class, e.g. 'foo_bar'. This then allows me to perform another search on the page for all divs with just this one class, thus returning an array of all the items I am after, in the order they appear on the page.

Thanks for the tip. I hadn't spotted that in the documentation, and I also found another page I hadn't seen either. I have fixed this with the following line:
This now adds an object into the array each time it comes across one of the above classes in the document. Brilliant!
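To illustrate what "one pass in document order" means, independent of Hpricot itself, here is a minimal plain-Ruby sketch. The Node struct and collect_by_class helper are hypothetical stand-ins invented for this example, not part of Hpricot's API; the point is only that a single depth-first walk that tests each div against both classes yields matches in page order.

```ruby
require "set"

# Hypothetical stand-in for a parsed element; this is NOT an Hpricot class.
Node = Struct.new(:tag, :classes, :children)

# Single depth-first walk: each div is tested against every wanted class,
# so matches are collected in the order they appear in the document.
def collect_by_class(node, wanted, found = [])
  if node.tag == "div" && node.classes.any? { |c| wanted.include?(c) }
    found << node
  end
  node.children.each { |child| collect_by_class(child, wanted, found) }
  found
end

page = Node.new("body", [], [
  Node.new("div", ["second_class"], []),
  Node.new("div", ["other"], [Node.new("div", ["first_class"], [])]),
  Node.new("div", ["first_class", "second_class"], []),
  Node.new("p", ["first_class"], [])
])

matches = collect_by_class(page, Set["first_class", "second_class"])
puts matches.map { |n| n.classes.join(" ") }
```

The paragraph with class 'first_class' is skipped because only divs are considered, mirroring the 'div.first_class' and 'div.second_class' selectors in the question.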


How to create a tagging system like on Stack Overflow or Quora

I want to create a tagging system like seen here on Stack Overflow or on Quora. It'll be its own model, and I'm planning on using this autocomplete plugin to help users find tags. I have a couple of questions:
I want tags to be entirely user-generated. If a user inputs a new tag by typing it and pressing an "Add" button, then that tag is added to the db, but if a user types in an existing tag, then it uses that one. I'm thinking of using code like this:
def create
  @video.tags = find_or_create_by_name(@video.tags.name)
end
Am I on the right track?
I'd like to implement something like on Stack Overflow or Quora such that when you click a tag from the suggested list or click an "Add" button, that tag gets added right above the text field with ajax. How would I go about implementing something like that?
I know this is kind of an open-ended question. I'm not really looking for the exact code as much as a general nudge in the right direction. Of course, code examples wouldn't hurt :)
Note I am NOT asking for help on how to set up the jQuery autocomplete plugin... I know how to do that. Rather, it seems like I'll have to modify the code in the plugin so that instead of the tags being added inside the text field, they are added above the text field. I'd appreciate any direction with this.
mbleigh's acts_as_taggable_on gem is a feature-complete solution that you should definitely look into a little more closely. The implementation is rock-solid and flexible to use. However, it is mostly concerned with attaching tags to objects, retrieving tags on objects, and searching for tagged items. This is all backend server stuff.
Most of the functionality you are looking to change (based on your comments) is actually related more to your front-end UI implementation, and the gem doesn't really do much for you there. I'll take your requests one-by-one.
If user inputs a new tag, that tag gets added, if user inputs an existing tag, the existing tag gets used. acts_as_taggable_on does this.
Click a tag from suggested list to add that tag. This is an implementation issue - on the back-end you'll need to collect the suggested list of tags, then display those in your presentation as links to your processing function.
Autocomplete as user enters potential tag. You'll use the jQuery autocomplete plugin against a list of items pulled off the tags table. With additional jQuery, you can capture when they've selected one of the options, or completed entering their new tag, and then call the processing function.
Restrict users to entering only one tag. This will be your UI implementation - once they've entered or selected a tag, you process it. If they enter two words separated by a comma, then before or during processing you have to either treat it as one tag, or take only the text up to the first comma and discard the rest.
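As a rough illustration of the first and last points (reuse an existing tag, otherwise create it, and keep only the text up to the first comma), here is a plain-Ruby sketch. TagStore is a hypothetical in-memory stand-in for the tags table, and the case-insensitive matching is an assumption of this example; it does not show how acts_as_taggable_on works internally.

```ruby
# Hypothetical in-memory store; a real app would use the tags table instead.
class TagStore
  def initialize
    @tags = {} # downcased name => name as first entered
  end

  # Keep only the text before the first comma, trimmed of whitespace.
  def self.single_tag(input)
    input.to_s.split(",", 2).first.to_s.strip
  end

  # Return the existing tag if one matches (case-insensitively, by assumption),
  # otherwise store the new tag and return it. Blank input returns nil.
  def find_or_create(input)
    name = self.class.single_tag(input)
    return nil if name.empty?
    @tags[name.downcase] ||= name
  end

  def names
    @tags.values
  end
end

store = TagStore.new
store.find_or_create("Ruby, rails") # text after the first comma is discarded
store.find_or_create("ruby")        # reuses the existing "Ruby" tag
puts store.names.inspect
```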
When you process the addition of a tag, you will have to do two things. First, you'll need to handle the UI display changes to reflect that a tag has been entered/chosen. This includes placing the tag in the "selected" area, removing it from the "available" display, updating any counters, etc. Second, you'll need to send a request to the server to actually add the tag to the object and persist that fact to the database (where the taggable gem will take over for you). You can either do this via an individual AJAX request per tag, or you can handle it when you submit the form. If the latter, you'll need a var to keep the running list of tags that have been added/removed and you'll need code to handle adding/removing values to that var.
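If you go the submit-with-the-form route, the server eventually has to turn that running list into a final set of tags. A minimal sketch of that step in plain Ruby follows; the apply_tag_changes helper and the [action, tag] pair format are assumptions made for this example, not something the gem or any plugin prescribes.

```ruby
# Hypothetical helper: replay the add/remove actions recorded while the
# form was open on top of the tags the object already had.
def apply_tag_changes(current, changes)
  changes.each_with_object(current.dup) do |(action, tag), tags|
    case action
    when :add
      tags << tag unless tags.include?(tag) # ignore duplicate adds
    when :remove
      tags.delete(tag)
    end
  end
end

changes = [[:add, "rails"], [:add, "ruby"], [:remove, "ruby"], [:add, "ajax"]]
puts apply_tag_changes(["ruby"], changes).inspect
```

The input array is duplicated first, so the object's original tag list is left untouched until you decide to save.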
For an example of saving tags while editing but not sending to server/db until saving a form, you might take a look at the tagging functionality on Tumblr's new post page. You can add/remove tags at will while creating the post, but none of it goes to the database until you click save.
As you can see, most of this is on you to determine and code, but has very little to do with the backend part. The gem will take care of that for you quite nicely.
I hope this helps get you moving in the right direction.
The more I try to force the acts-as-taggable-on gem to work, the more I think these are fundamentally different types of problems, specifically because of aliases. The gem considers each tag to be its own special snowflake, making it difficult to create synonyms. In some cases it doesn't go far enough: if you want the Tag to have a description, you'd need to edit the given migrations (which isn't hard to do).
Here's what I'm considering implementing, given the trouble I've had implementing via the gem. Let's assume you want to create a tagging system for Technologies.
Consider the following pseudocode; I haven't tested it yet.
rails g model Tech usage_count:integer description:text icon_url:string etc. Run the migration. Note the
Now in the controller you will need to increment usage_count each time something happens, e.g. the user submits a new question tagged with the given text.
rails g model Name::Tech tech:belongs_to name:string
Name::Tech model
belongs_to :tech
Then you could search via something like:
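# usage_count lives on Tech, so join through the association before ordering.
search = Name::Tech.joins(:tech)
                   .where("name LIKE :prefix", prefix: "word_start%")
                   .merge(Tech.order(usage_count: :desc))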
This is a starting point. It's fundamentally different from the gem, as each tag is just a string on its own, but references a richer data table on the back end. I'll work on implementing it and come back to update with a better solution.
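To make the alias idea concrete without a database, here is a small plain-Ruby sketch. The Tech and TechName structs are hypothetical in-memory stand-ins for the two tables described above, and suggest is an illustrative helper; the sample names and counts are made up for the example.

```ruby
# Hypothetical in-memory stand-ins for the Tech and Name::Tech tables.
Tech = Struct.new(:description, :usage_count)
TechName = Struct.new(:name, :tech)

# Prefix search over names, most-used technology first; ties are broken
# alphabetically so the ordering is deterministic.
def suggest(names, prefix)
  wanted = prefix.downcase
  names
    .select { |n| n.name.downcase.start_with?(wanted) }
    .sort_by { |n| [-n.tech.usage_count, n.name] }
end

js = Tech.new("JavaScript language", 120)
jq = Tech.new("jQuery library", 80)
rb = Tech.new("Ruby language", 50)

names = [
  TechName.new("jquery", jq),
  TechName.new("js", js),         # alias: points at the same Tech record
  TechName.new("javascript", js),
  TechName.new("ruby", rb)
]

puts suggest(names, "j").map(&:name).inspect
```

Because "js" and "javascript" reference the same Tech record, they share one description and one usage_count, which is the synonym behaviour that is hard to get when every tag is a standalone row.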

