How to search an element in Nokogiri using its pointer id - ruby-on-rails

In the Nokogiri documentation you can find the following:
node.pointer_id # internal pointer number
This returns the internal pointer number as an integer. However, the documentation says nowhere how this can be used to look up a node.
I would have expected something like this:
p_id = node.pointer_id
element = page.with_pointer_id(p_id)
UPDATE...to give you an idea of the use case.
I am caching lots of HTML pages as Nokogiri objects and scanning them for specific nodes. I save those nodes to a hash, together with the number of occurrences:
{"node1" => 8}
Right now it's saving the whole node as the key, but it would be much more convenient to have an identifier for it. After clustering those hashes I want to retrieve the nodes again; that's where the id should come in.

You can do this using the #traverse method available through the Nokogiri::XML::Document instance.
Here is #traverse wrapped in your #with_pointer_id method, added by reopening the Document class.
class Nokogiri::XML::Document
  # Returns the first node whose pointer_id matches p_id, or nil if none does.
  def with_pointer_id(p_id)
    traverse { |node| return node if node.pointer_id == p_id }
    nil
  end
end
Now you can do this:
element = page.with_pointer_id(p_id)
This will find the node with a pointer_id matching p_id, if it exists.
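For the caching use case in the question, traversing the whole document on every lookup may get slow, so it could be worth building an id-to-node index once. The following is only a sketch of that idea. To keep it runnable without Nokogiri, it uses a hypothetical TreeNode Struct whose pointer_id is just Ruby's object_id; build_index is likewise a made-up helper name. With real Nokogiri nodes the key would be node.pointer_id, and the ids would presumably only stay meaningful while the parsed document is kept in memory.

```ruby
# Stand-in for a parsed document tree; TreeNode is hypothetical, not a Nokogiri class.
TreeNode = Struct.new(:name, :children) do
  # Depth-first walk: children first, then the node itself.
  def traverse(&block)
    children.each { |child| child.traverse(&block) }
    block.call(self)
  end

  # object_id plays the role that pointer_id plays for Nokogiri nodes.
  def pointer_id
    object_id
  end
end

# Walk the tree once and map each id to its node.
def build_index(root)
  index = {}
  root.traverse { |node| index[node.pointer_id] = node }
  index
end

leaf_a = TreeNode.new("a", [])
leaf_b = TreeNode.new("b", [])
root   = TreeNode.new("root", [leaf_a, leaf_b])

index  = build_index(root)
counts = { leaf_a.pointer_id => 8 } # occurrence counts keyed by id, not by node

counts.each do |id, count|
  node = index[id] # hash lookup, no second traversal
  puts "#{node.name} occurs #{count} times"
end
```

The trade-off is memory for speed: the index holds a reference to every node, but each later lookup is a single hash access.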

Related

multiple line where clauses

I've got a search page with multiple inputs (text fields). These inputs may or may not be empty - depending on what the user is searching for.
In order to accommodate this I create a base searchQuery object that pulls in all the correct relationships, and then for each non-empty input I modify the query using the searchQuery.Where function.
If I place multiple conditions in the WHERE clause I get the following error:
Cannot compare elements of type 'System.Collections.Generic.ICollection`1'. Only primitive types, enumeration types and entity types are supported.
searchQuery = searchQuery.Where(Function(m) (
(absoluteMinimumDate < m.ClassDates.OrderBy(Function(d) d.Value).FirstOrDefault.Value) _
OrElse (Nothing Is m.ClassDates)
)
)
I know that code looks funky, but I was trying to format it so you didn't have to scroll horizontally to see it all
Now, if I remove the ORELSE clause, everything works (but of course I don't get the results I need).
searchQuery = searchQuery.Where(Function(m) (
(absoluteMinimumDate < m.ClassDates.OrderBy(Function(d) d.Value).FirstOrDefault.Value)
)
)
This one works fine
So, what am I doing wrong? How can I make a multi-condition where clause?
Multiple conditions in the Where isn't the problem. m.ClassDates Is Nothing will never be true and doesn't make sense in SQL terms. You can't translate "is the set of ClassDates associated with this record NULL?" into SQL. What you mean is, are there 0 of them.
If there are no attached ClassDate records, m.ClassDates will be an empty list. You want m.ClassDates.Count = 0 OrElse...

ActiveRecord Integer Column won't accept some Integer assignments

I'm seeing some genuinely bizarre behavior with ActiveRecord as it relates to assignment. I have an ActiveRecord model named Venue that includes the measurements of the venue, all integers less than 1K. We add Venues via an XML feed. On the model itself, I have a Venue.from_xml_feed method that takes the XML, parses it, and creates Venues.
The problem comes from the measurements. Using Nokogiri, I'm parsing out the measurements like so:
elems = xml.xpath("//*[@id]")
elems.each do |node|
  distance = node.css("distances")
  rs = distance.attr("rs")
  # get the rest of the sides
  # using new instead of create to print right_side, behavior is the same
  venue = Venue.new right_side: rs # etc
  venue.save
  puts venue.right_side
end
The problem is that venue.right_side ALWAYS evaluates to nil, even though distance.attr("rs") contains a legal value, say 400. So this code:
rs = distance.attr("rs")
puts rs
Venue.new right_side: rs
Will print 400, then save rs as nil. If I try any type of Type Conversions, like so:
content = distance.attr("rs").content
str = content.to_s
int = Integer(str)
puts "Is int an Integer? #{int.is_a? Integer}"
Venue.new right_side: int
It will print Is int an Integer? true, then again save Venue.right_side as nil.
However, if I just explicitly create a random integer like so:
int = 400
Venue.new right_side: int
It will save Venue.right_side as 400. Can anyone tell me what's going on with this?
Well, you failed to include the prerequisite sample XML to confirm this, so you get a fairly generic answer.
In your code you're using:
distance = node.css("distances")
rs = distance.attr("rs")
css doesn't return what you think it does. It returns a NodeSet, which is similar to an Array. When you try to use attr on a NodeSet, you're going to set the value, not retrieve it. From the documentation:
#attr(key, value = nil, &blk) ⇒ Object (also: #set, #attribute)
Set the attribute key to value or the return value of blk on all Node objects in the NodeSet.
Because you're not using a value, the resulting action is to remove the attribute from the tag, which will then return nil and Ruby will assign nil to rs.
If you want to get the attribute of a node, you need to point to the node itself, so use at, or at_css, either of which returns a Node. Once you have the node, you can use attribute to retrieve the value, or use the [] shortcut similar to this untested code:
rs = node.at('distances')['rs']
Again though, because you didn't supply XML it's not possible to tell what else you might be trying to do, or whether this code is entirely accurate.

Rails isn't correctly rendering nested JSON results from Postgres JSON functions

For various reasons, I'm creating an app that takes a SQL query string as a URL parameter and passes it off to Postgres(similar to the CartDB SQL API, and CFPB's Qu). Rails then renders a JSON response of the results that come from Postgres.
Snippet from my controller:
@table = ActiveRecord::Base.connection.execute(@query)
render json: @table
This works fine. But when I use Postgres JSON functions (row_to_json, json_agg), it renders the nested JSON property as a string. For example, the following query:
query?q=SELECT max(municipal) AS series, json_agg(row_to_json((SELECT r FROM (SELECT sch_yr,grade_1 AS value ) r WHERE grade_1 IS NOT NULL))ORDER BY sch_yr ASC) AS values FROM ed_enroll WHERE grade_1 IS NOT NULL GROUP BY municipal
returns:
{
series: "Abington",
values: "[{"sch_yr":"2005-06","value":180}, {"sch_yr":"2005-06","value":180}, {"sch_yr":"2006-07","value":198}, {"sch_yr":"2006-07","value":198}, {"sch_yr":"2007-08","value":158}, {"sch_yr":"2007-08","value":158}, {"sch_yr":"2008-09","value":167}, {"sch_yr":"2008-09","value":167}, {"sch_yr":"2009-10","value":170}, {"sch_yr":"2009-10","value":170}, {"sch_yr":"2010-11","value":153}, {"sch_yr":"2010-11","value":153}, {"sch_yr":"2011-12","value":167}, {"sch_yr":"2011-12","value":167}]"
},
{
series: "Acton",
values: "[{"sch_yr":"2005-06","value":353}, {"sch_yr":"2005-06","value":353}, {"sch_yr":"2006-07","value":316}, {"sch_yr":"2006-07","value":316}, {"sch_yr":"2007-08","value":323}, {"sch_yr":"2007-08","value":323}, {"sch_yr":"2008-09","value":327}, {"sch_yr":"2008-09","value":327}, {"sch_yr":"2009-10","value":336}, {"sch_yr":"2009-10","value":336}, {"sch_yr":"2010-11","value":351}, {"sch_yr":"2010-11","value":351}, {"sch_yr":"2011-12","value":341}, {"sch_yr":"2011-12","value":341}]"
}
So, it only partially renders the JSON, running into problems when I have nested JSON arrays created with the Postgres functions in the query.
I'm not sure where to start with this problem. Any ideas? I am sure this is a problem with Rails.
ActiveRecord::Base.connection.execute doesn't know how to unpack database types into Ruby types, so everything you get back from it (numbers, booleans, JSON, everything) will be a string. If you want sensible JSON to come out of your controller, you'll have to convert the data in @table to Ruby types by hand and then convert the Ruby-ified data to JSON in the usual fashion.
Your @table will actually be a PG::Result instance, and those have methods such as ftype (get a column type) and fmod (get a type modifier for a column) that can help you figure out what sort of data is in each column. You'd probably ask the PG::Result for the type and modifier of each column and then hand those to the format_type PostgreSQL function to get some intelligible type strings; then you'd map those type strings to conversion methods and use that mapping to unpack the strings you get back. If you dig around inside the ActiveRecord source, you'll see AR doing similar things. The AR source code is not for the faint-hearted, though; sorry, but this is par for the course when you step outside the narrow confines of how AR thinks you should interact with databases.
You might want to rethink your "sling hunks of SQL around" approach. You'll probably have an easier time of things (and be able to whitelist what the queries do) if you can figure out a way to build the SQL yourself.
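The by-hand conversion described above might look roughly like this for the JSON column alone. This is a sketch under assumptions: the rows are simulated as an array of string-valued hashes (the shape PG::Result#to_a would be expected to give), the sample data is made up, and decode_json_columns and json_columns are hypothetical names. It only needs the json standard library.

```ruby
require "json"

# Simulated rows in which every value is a string, as described above.
# The data is invented for illustration.
rows = [
  { "series" => "Abington", "values" => '[{"sch_yr":"2005-06","value":180}]' },
  { "series" => "Acton",    "values" => '[{"sch_yr":"2005-06","value":353}]' }
]

# Hypothetical list of columns known to contain JSON text.
json_columns = %w[values]

# Replace the JSON text in the listed columns with parsed Ruby structures;
# all other columns (and NULLs) pass through unchanged.
def decode_json_columns(rows, columns)
  rows.map do |row|
    row.to_h do |key, value|
      decoded = columns.include?(key) && value ? JSON.parse(value) : value
      [key, decoded]
    end
  end
end

decoded = decode_json_columns(rows, json_columns)
puts JSON.generate(decoded)
```

In a controller, the decoded array could then be handed to render json: in place of the raw result, so the nested arrays are serialized as JSON rather than as quoted strings.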
The PG::Result class (the type of @table) uses TypeMaps to cast result values to Ruby objects. For your example, you could use PG::TypeMapByColumn as follows:
@table = ActiveRecord::Base.connection.execute(@query)
@table.type_map = PG::TypeMapByColumn.new [nil, PG::TextDecoder::JSON.new]
render json: @table
A more generic approach would be to use the PG::TypeMapByOid TypeMap class. This requires you to provide OIDs for each PG attribute type. A list of these can be found in pg_type.dat.
tm = PG::TypeMapByOid.new
tm.add_coder PG::TextDecoder::Integer.new oid: 23
tm.add_coder PG::TextDecoder::Boolean.new oid: 16
tm.add_coder PG::TextDecoder::JSON.new oid: 114
@table.type_map = tm

Creating a where query in rails from an array

I need to make a where query from an array where each member of the array is a 'like' operation that is ANDed. Example:
SELECT ... WHERE property like '%something%' AND property like '%somethingelse%' AND ...
It's easy enough to do using the ActiveRecord where function, but I'm unsure how to sanitize it first. I obviously can't just create a string and stuff it in the where function, but there doesn't seem to be a way to do it using the ? placeholder.
Thanks
The easiest way to build your LIKE patterns is string interpolation:
where('property like ?', "%#{str}%")
and if you have all your strings in an array then you can use ActiveRecord's query chaining and inject to build your final query:
a = %w[your strings go here]
q = a.inject(YourModel) { |q, str| q.where('property like ?', "%#{str}%") }
Then you can q.all or q.limit(11) or whatever you need to do to get your final result.
Here's a quick tutorial on how this works; you should review the Active Record Query Interface Guide and the Enumerable documentation as well.
If you had two things (a and b) to match, you could do this:
q = Model.where('p like ?', "%#{a}%").where('p like ?', "%#{b}%")
The where method returns an object that supports all the usual query methods so you can chain calls as M.where(...).where(...)... as needed; the other query methods (such as order, limit, ...) return the same sort of object so you can chain those as well:
M.where(...).limit(11).where(...).order(...)
You have an array of things to LIKE against and you want to apply where to the model class, then apply where to what that returns, and so on until you've used up your array. Things that look like a feedback loop tend to call for inject (AKA reduce, of "map-reduce" fame):
inject(initial) {| memo, obj | block } → obj
Combines all elements of enum by applying a binary operation, specified by a block or a symbol that names a method or operator.
If you specify a block, then for each element in enum the block is passed an accumulator value (memo) and the element [...] the result becomes the new value for memo. At the end of the iteration, the final value of memo is the return value for the method.
So inject takes the block's output (which is the return value of where in our case) and feeds that as an input to the next execution of the block. If you have an array and you inject on it:
a = [1, 2, 3]
r = a.inject(init) { |memo, n| memo.m(n) }
then that's the same as this:
r = init.m(1).m(2).m(3)
Or, in pseudocode:
r = init
for n in a
r = r.m(n)
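To see the accumulation without a database, here is a small sketch that follows the same shape as the inject call above. FakeRelation is a hypothetical stand-in for whatever where returns; it only records its conditions and is not part of ActiveRecord.

```ruby
# FakeRelation is a hypothetical stand-in for an ActiveRecord relation: each
# call to #where returns a new object carrying one more condition.
class FakeRelation
  attr_reader :conditions

  def initialize(conditions = [])
    @conditions = conditions
  end

  def where(sql, bind)
    FakeRelation.new(conditions + [[sql, bind]])
  end

  # Rough textual rendering, for display only; a real adapter quotes and
  # binds values rather than pasting them into the SQL string.
  def to_s
    conditions.map { |sql, bind| sql.sub("?", "'#{bind}'") }.join(" AND ")
  end
end

terms = %w[something somethingelse]

# The memo is the query built so far; each step adds one LIKE condition.
query = terms.inject(FakeRelation.new) do |memo, term|
  memo.where("property like ?", "%#{term}%")
end

puts query
```

Because where returns a new object each time, the starting value is never mutated, which is what lets inject thread the growing query through the block.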
If you're using AR, do something like Model.where(property: your_array) or Model.where("property in (?)", your_array). This way, everything is sanitized.
Let's say your array is model_array, try Array select:
model_array.select{|a|a.property=~/something/ and a.property=~/somethingelse/}
Of course you can use any regex as you like.

How to match ets:match against a record in Erlang?

I have heard that specifying records through tuples in the code is a bad practice: I should always use record fields (#record_name{record_field = something}) instead of plain tuples {record_name, value1, value2, something}.
But how do I match the record against an ETS table? If I have a table with records, I can only match with the following:
ets:match(Table, {'$1','$2','$3',something})
It is obvious that once I add some new fields to the record definition this pattern match will stop working.
Instead, I would like to use something like this:
ets:match(Table, #record_name{record_field=something})
Unfortunately, it returns an empty list.
The cause of your problem is what the unspecified fields are set to when you do a #record_name{record_field=something}. This is the syntax for creating a record, here you are creating a record/tuple which ETS will interpret as a pattern. When you create a record then all the unspecified fields will get their default values, either ones defined in the record definition or the default default value undefined.
So if you want to give fields specific values then you must explicitly do this in the record, for example #record_name{f1='$1',f2='$2',record_field=something}. Often when using records and ets you want to set all the unspecified fields to '_', the "don't care variable" for ets matching. There is a special syntax for this using the special, and otherwise illegal, field name _. For example #record_name{record_field=something,_='_'}.
Note that in your example you have set the record name element in the tuple to '$1'. The tuple representing a record always has the record name as the first element. This means that when you create the ets table you should set the key position with {keypos,Pos} to something other than the default 1; otherwise there won't be any indexing and, worse, if you have a table of type 'set' or 'ordered_set' you will only get 1 element in the table. To get the index of a record field you can use the syntax #Record.Field, in your example #record_name.record_field.
Try using
ets:match(Table, #record_name{record_field=something, _='_'})
See this for explanation.
Format you are looking for is #record_name{record_field=something, _ = '_'}
http://www.erlang.org/doc/man/ets.html#match-2
http://www.erlang.org/doc/programming_examples/records.html (see 1.3 Creating a record)
