The ordinal parsing problem

Rails has a nice function, ordinalize, which converts an integer to a friendly string representation: 1 becomes 1st, 2 becomes 2nd, and so on. How might one implement the inverse?
More generally, I'd like to handle both of the following cases:
>> s = "First"
>> s.integerize
=> 1
>> s = "1st"
>> s.integerize
=> 1
I am looking for a smart way to do this, as opposed to a giant lookup table or just hacking off the last two characters. Any ideas would be appreciated.

to_i does essentially half of that:
"72nd".to_i
=> 72
It doesn't check validity, but if you need to fail on bad input like "72x", you can just re-ordinalize and compare to the original input string.
For parsing ordinal words, Wikipedia seems impressively helpful.
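The re-ordinalize-and-compare check described above could be sketched roughly as follows. The names ordinal_suffix and parse_ordinal are hypothetical, and the suffix rule is written out by hand only so the sketch runs without Rails; with ActiveSupport loaded you would presumably compare against n.ordinalize instead.

```ruby
# Hypothetical helper: the English ordinal suffix for an integer,
# written by hand so the sketch does not depend on ActiveSupport.
def ordinal_suffix(n)
  return "th" if (11..13).cover?(n.abs % 100)

  case n.abs % 10
  when 1 then "st"
  when 2 then "nd"
  when 3 then "rd"
  else "th"
  end
end

# Hypothetical strict parser: returns the integer when the input round-trips
# to the same ordinal string, and nil otherwise (e.g. "72x" or "11st").
def parse_ordinal(str)
  n = str.to_i
  "#{n}#{ordinal_suffix(n)}" == str.strip.downcase ? n : nil
end
```

Here parse_ordinal("72nd") gives 72, while parse_ordinal("72x") gives nil because "72nd" does not equal "72x".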

The first case is relatively hard - I'd say the smart way to do it is find someone who's already done it and use their code. If you can't find someone, the next smartest thing would probably be restating (or renegotiating) the problem so that it's not needed. Beyond that, I think you're into parser-writing...
The second case is as trivial as the to_i already offered. You could also use a regex, I suppose:
"1000000th".scan(/\d+/).first.to_i #=> 1000000
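For the word case, a very small parser can compose compound ordinals from short word lists rather than enumerating every value. The following is only a sketch under stated assumptions: it covers "first" through "ninety-ninth", the name parse_ordinal_word and the three constants are hypothetical, and anything larger (hundreds, thousands) would need a real grammar or an existing library.

```ruby
# Short vocabularies; compound forms such as "twenty-first" are composed.
UNIT_ORDINALS = %w[first second third fourth fifth sixth seventh eighth ninth].freeze
TEEN_ORDINALS = %w[tenth eleventh twelfth thirteenth fourteenth fifteenth
                   sixteenth seventeenth eighteenth nineteenth].freeze
TENS_WORDS    = %w[twenty thirty forty fifty sixty seventy eighty ninety].freeze

# Hypothetical parser: returns 1..99 for a recognised ordinal word, else nil.
def parse_ordinal_word(str)
  words = str.strip.downcase.split(/[\s-]+/)
  case words.length
  when 1
    w = words.first
    if (i = UNIT_ORDINALS.index(w)) then i + 1
    elsif (i = TEEN_ORDINALS.index(w)) then i + 10
    # "twentieth" -> "twenty", "fortieth" -> "forty", and so on
    elsif w.end_with?("ieth") && (i = TENS_WORDS.index(w.sub(/ieth\z/, "y"))) then (i + 2) * 10
    end
  when 2
    tens = TENS_WORDS.index(words[0])
    unit = UNIT_ORDINALS.index(words[1])
    (tens + 2) * 10 + unit + 1 if tens && unit
  end
end
```

With this sketch, parse_ordinal_word("First") gives 1 and parse_ordinal_word("twenty-first") gives 21; unrecognised input gives nil.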

Related

Character Replacements

I have a Unicode string, UniStr.
I also have a MAP of { UnicodeChar : otherMappedStrs }.
I need the 'otherMappedStrs' version of UniStr.
E.g.: UniStr = 'ABC', MAP = { 'A':'233','B':'#$','C':'9ij' }, Result = '233#$9ij'
I have come up with the formula below, which works:
=ArrayFormula(JOIN("",VLOOKUP(REGEXEXTRACT(A1,REPT("(.)",LEN(A1))),MapRange,2,FALSE)))
The MAP covers a whole character set (40 chars), so it is quite large.
I need to use this function in multiple spreadsheets. How can I subsume the MAP into the formula for portability?
Is there a better way to iterate over a string than the REGEXEXTRACT method in a formula? This method has limitations for long strings.
I also tested the formula below. The problem is that it gives 2 results (one per element of the array passed to SUBSTITUTE). If 3 substitutions are made, it gives three results. Can this be resolved?
=ArrayFormula(SUBSTITUTE(A1,{"s","i"},{"#","#"}))
EDIT:
@Tom's first solution appears best for my case: (1) REGEX has an upper limit on search criteria, which is not a hindrance in your solution; (2) it feels fast (I did not do empirical testing); (3) I believe this is a better way to iterate over string characters (you answered my Q2, thanks).
I digress here. I wish Google would introduce Named Formulas or Formula Aliases, hypothetically like the one below. I have sent feedback along those lines many times. Nothing :(
MyFormula($str) == ArrayFormula(join(,vlookup(mid($str,row(indirect("1:"&len($str))),1), { "A","233";"B","#$";"C","9ij" },2,false)))

Not sure how long you want your strings to be, but the more traditional
=ArrayFormula(join(,vlookup(mid(A1,row(indirect("1:"&len(A1))),1), { "A","233";"B","#$";"C","9ij" },2,false)))
seems a bit more robust for long strings.
For a more radical idea, supposing the maximum length of your otherMappedStrings is 3 characters, then you could try:
=ArrayFormula(join(,trim(mid("233 #$9ij",find(mid(A1,row(indirect("1:"&len(A1))),1), "ABC")*3-2,3))))
where I have put a space in before #$ to pad it out to 3 characters.
Incidentally the original VLOOKUP is not case sensitive. If you want this behaviour, use SEARCH instead of FIND.

You seem to have several different Qs, but considering only portability, perhaps something like the following would help:
=join(,switch(arrayformula(regexextract(A1&"",rept("(.)",len(A1)))),"A",233,"B","#$","C","9ij"))
extended with 37 more pairs.

Lua if A==1 or 2 or 3 then

I have a lot of music chords that can have alternate names. I'd rather not write long lines with a lot of == and or's for each alternate name, and if chord=="Maj" or "maj" or ... doesn't work in Lua:
if chord=="Maj9" or chord=="maj9" or chord=="M9" or chord=="Maj7(add9)" or chord=="M7(add9)" then notes="0,4,7,11,14" end
I need a simpler way to do it, maybe by just reformatting the lines in Notepad++ to use an array.
At the moment, each of the 200+ chords is on its own line:
if chord=="Maj9,maj9,M9,Maj7(add9),M7(add9)" then notes="0,4,7,11,14" end
if chord=="mMaj7,minmaj7,mmaj7,min/maj7,mM7,m(addM7),m(+7),-(M7)" then notes="0,3,7,11" end

The correct way is to normalize your input. For example, take whatever chord value comes in and use Lua’s string.lower() function to make the string all lowercase. By normalizing your input, you simplify the logic you need to write to work with that data. Consider other ways as well to normalize the data. You might, for example, write a method that converts all notes into an enumerated list (C = 1, C# = 2, etc.). That way equivalent notes get the same in-memory values.
Those are just a few ideas to get you on track. You should not try to think up and then hard-code every possible way a user may input a chord name.
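One way to avoid hard-coding long or-chains is to expand the comma-separated alias lists from the question into a lookup table once, then do a single lookup per chord. The sketch below is in Ruby only to match the other examples on this page; the same table shape carries over to a Lua table keyed by chord name. The names CHORD_ALIASES, NOTES_BY_CHORD and notes_for are hypothetical. Note that the lookup is deliberately case-sensitive, because chord names such as M7 and m7 conventionally differ only by case, so blanket lowercasing would merge distinct chords.

```ruby
# Alias lists copied from the question; each list maps to one interval string.
CHORD_ALIASES = {
  "Maj9,maj9,M9,Maj7(add9),M7(add9)" => "0,4,7,11,14",
  "mMaj7,minmaj7,mmaj7,min/maj7,mM7,m(addM7),m(+7),-(M7)" => "0,3,7,11"
}.freeze

# Expand to one entry per alias so each lookup is a single hash access.
NOTES_BY_CHORD = CHORD_ALIASES.each_with_object({}) do |(names, notes), table|
  names.split(",").each { |name| table[name] = notes }
end.freeze

# Hypothetical accessor: returns the interval string, or nil for unknown chords.
def notes_for(chord)
  NOTES_BY_CHORD[chord.strip]
end
```

For example, notes_for("M9") and notes_for("Maj7(add9)") both return "0,4,7,11,14".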

How do Ruby min and max compare strings?

I'm playing around with iterators. What is Ruby comparing using max and min?
If I have this array:
word_array = ["hi", "bob", "how's", "it", "going"]
and I run:
puts word_array.max
puts word_array.min
I expect to get "going" or "how's" for max, since they're both five characters long, or "going" on the theory that it's the last item in the array. I expect min to return "hi" or "it", since they're tied for the shortest string. Instead, I get back:
puts word_array.max -> it
puts word_array.min -> bob
What is Ruby measuring to make this judgement? It selected the shortest string for max and a middle length string for min.

Actually, you are (kind of) asking the wrong question. max and min are defined on Enumerable. They don't know anything about Strings.
They use the <=> combined comparison operator (aka "spaceship"). (Note: pretty much any method that compares two objects will do so using the <=> operator. That's a general rule in Ruby that you can both rely on when using objects other people have written and that you should adhere to when writing your own objects: if your objects are going to be compared to one another, you should implement the <=> operator.)
So, the question you should be asking is, how does String#<=> compare Strings? Unfortunately, the answer is not quite clear from the documentation. However, it does mention that the shorter string is considered to be less than the longer string if the two strings are equal up to that point. So, it is clear that length is used only as a tie-breaker, not as the primary criterion (as you are assuming in your question).
And really, lexicographic ordering is just the most natural thing to do. If I gave you a list of words and asked you to order them without giving you any criteria, would you order them by length or alphabetically? (Note, however, that this means that '20' is less than '3'.)
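A short illustration of the ordering described above, using the array from the question (written here as a constant, WORD_ARRAY, purely for the sketch):

```ruby
WORD_ARRAY = ["hi", "bob", "how's", "it", "going"].freeze

# String#<=> compares character by character, so sorting is dictionary-like.
p WORD_ARRAY.sort          # => ["bob", "going", "hi", "how's", "it"]

# min and max are the two ends of that same ordering.
p WORD_ARRAY.min           # => "bob"
p WORD_ARRAY.max           # => "it"

# Length only matters when one string is a prefix of the other.
p("how" <=> "how's")       # => -1

# Numeric strings compare as text, not as numbers.
p("20" <=> "3")            # => -1

# To rank by length instead, pass the criterion explicitly.
p WORD_ARRAY.max_by(&:length)   # one of the five-character words
```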

I believe it's doing a lexical sort, exactly like a dictionary order. The maximum would be the furthest one in the dictionary.

http://ruby-doc.org/core-2.1.0/Enumerable.html#method-i-max_by
ar = ['one','two','three','four','five']
ar.max_by(&:length) # => "three"

regex for a full name

I've recently been receiving a lot of first name only entries in a form. While maybe I should have had 2 separate first and last name fields this always seemed to me a bit much. But I would like to try and get a full name which basically can only be determined by having at least one space.
I came up with this, but I'm wondering if someone has a better and possibly simpler solution?
/([a-zA-ZàáâäãåèéêëìíîïòóôöõøùúûüÿýñçčšžÀÁÂÄÃÅÈÉÊËÌÍÎÏÒÓÔÖÕØÙÚÛÜŸÝÑßÇŒÆČŠŽ∂ð,.'-]{2,}) ([a-zA-ZàáâäãåèéêëìíîïòóôöõøùúûüÿýñçčšžÀÁÂÄÃÅÈÉÊËÌÍÎÏÒÓÔÖÕØÙÚÛÜŸÝÑßÇŒÆČŠŽ∂ð,.'-]{2,})/
This is basically /([a-zA-Z,.'-]{2,}) ([a-zA-Z,.'-]{2,})/ plus Unicode support.

I'd first make sure that you really do need people to give you a last name. Is that a genuine requirement? If not, I'd skip it because it adds unnecessary complication and barriers to entry. If it really IS a requirement, it probably makes sense to have separate first and last name fields in your UI so that it's explicit.
The fact that you didn't do that to begin with suggests that you might not really need the last name as much as you think you do.
To answer your original question, this expression might give you what you're looking for without the guesswork:
/\w+(\s+\w+)+/
It checks that the string contains at least 2 words separated by whitespace. As Tim Pietzcker pointed out, validating the words themselves is prone to error.

In Ruby 1.9, you have access to Unicode properties (\p{L} is a Unicode letter). But trying to validate a name in any way (regex or not) is prone to failure because names are not what you think they are.
Your theory that "if there's a space, there must be a last name there" is incorrect, too - think of first and middle names...
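If a check is still wanted despite those caveats, the Unicode property class keeps the pattern far shorter than a hand-listed alphabet. This is only a sketch: FULL_NAME and looks_like_full_name? are hypothetical names, and the rule it encodes (two or more words made of letters, apostrophes, periods and hyphens) is an assumption that will reject some real names and accept inputs that are not full names.

```ruby
# Hypothetical pattern: two or more whitespace-separated words, each starting
# with a Unicode letter and continuing with letters, apostrophes, periods or hyphens.
FULL_NAME = /\A\p{L}[\p{L}'.-]*(?:\s+\p{L}[\p{L}'.-]*)+\z/

# Returns true when the trimmed input has at least two such words.
def looks_like_full_name?(input)
  FULL_NAME.match?(input.strip)
end
```

Note that looks_like_full_name?("Mary Ann") is true even though that may be a first and middle name, which is exactly the limitation described above.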

Trouble with custom validation of Rails app

I'm making a web app where the point is to change a given word by one letter. For example, if I make a post by selecting the word: "best," then the first reply could be "rest," while the one after that should be "rent," "sent", etc. So, the word a user enters must have changed by one letter from the last submitted word. It would be constantly evolving.
Right now you can make a game and respond just by typing a word. I coded up a custom validation using functionality from the Amatch gem:
http://flori.github.com/amatch/doc/index.html
Posts have many responses, and responses belong to a post.
Here's the code:
def must_have_changed_by_one_letter
m = Amatch::Sellers.new(title.strip)
errors.add_to_base("Sorry, you must change the last submitted word by one letter")
if m.match(post.responses.last.to_s.strip) != 1.0
end
When I try entering a new response for a test post I made (original word "best", first response is "rest") I get this:
ActiveRecord::RecordInvalid in ResponsesController#create
Validation failed: Sorry, you must change the last submitted word by one letter
Any thoughts on what might be wrong?
Thanks!

Looks like there are a couple of potential issues here.
For one, is your if statement actually on a separate line from your errors.add_to_base... statement? If so, your syntax is wrong; a trailing if needs to be on the same line as the statement it's modifying. Even if it is actually on the correct line, I would recommend against using a trailing if statement on such a long line; it makes the conditional hard to find. A block form reads more clearly:
if m.match(post.responses.last.to_s.strip) != 1.0
errors.add_to_base("Sorry, you must change the last submitted word by one letter")
end
Second, doing exact equality comparison on floating point numbers is almost never a good idea. Because floating point numbers involve approximations, you will sometimes get results that are very close, but not quite exactly equal, to a given number that you are comparing against. It looks like the Amatch library has several different classes for comparing strings; the Sellers class allows you to set different weights for different kinds of edits, but given your problem description, I don't think you need that. I would try using the Levenshtein or Hamming distance instead, depending on your exact needs.
Finally, if neither of those suggestions works, try writing the exact values of title.strip and post.responses.last.to_s.strip to a log or to the response, to make sure you are actually comparing the values that you think you're comparing. I don't know the rest of your code, so I can't tell you whether those are correct or not, but if you print them out somewhere, you should easily be able to check them yourself.
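To make the "changed by one letter" rule concrete without depending on any particular Amatch class, here is a minimal pure-Ruby sketch of a Hamming-distance check. The method name changed_by_one_letter? is hypothetical, and it assumes the rule means "same length, exactly one position differs"; if insertions or deletions should also count, a Levenshtein distance would be the closer fit.

```ruby
# Hypothetical helper, not part of Amatch: true when the two words have the
# same length and differ in exactly one position (Hamming distance of 1).
def changed_by_one_letter?(previous, current)
  a = previous.strip.downcase
  b = current.strip.downcase
  return false unless a.length == b.length

  a.chars.zip(b.chars).count { |x, y| x != y } == 1
end
```

Because the result is an integer count compared with 1, there is no floating point equality involved. For instance, changed_by_one_letter?("best", "rest") is true and changed_by_one_letter?("best", "best") is false.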
