Hey guys this has been tripping me up quite a bit. So here is the general problem:
I am writing an application that requires users to enter their Summoner Names from League of Legends. I do a pretty simple data scrape of a match and enter the data into my database. Unfortunately, I am getting errors when registering users whose names contain "special characters".
For this example I will use one problem user: RIÇK
As you can see, RICK != RIÇK. When I scrape the data from the site I get the correct value, which I push onto an array for later use.
Once I need the player names I pull from the array as follows: (player_names is the array)
#temp_player = User.find_by_username(player_names[i].to_s)
The problem is that users with any special characters in their names are not being found. Should I not be using find_by? Is to_s changing my original values? I am really quite lost on what to do and would greatly appreciate any help or advice.
Thanks in advance,
Dan
I would like to thank Brian Kung for the link to the following: joelonsoftware.com/articles/Unicode.html. It does a great job of giving the bare minimum a programmer truly needs to understand.
For my particular issue, I had used an HTML scraper to get the contents, but it kept HTML entities throughout the text. When I used these strings in my SQL lookups, the matching records were naturally not found. To fix this I used the HTMLEntities gem to decode the text as follows (as soon as I put the names into the array originally):
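require 'rubygems' # only needed on old Rubies; without it the htmlentities gem cannot be loaded
require 'htmlentities'
coder = HTMLEntities.new
line = '&lt;' # an entity-encoded string as it comes back from the scraper
player_names.push(coder.decode(line)) # pushes the decoded text, here "<"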
The Takeaway
When working with text and running into errors, I would strongly recommend tracing the strings you are working with back to their origin and truly understanding what encoding is being used at each step. By doing this you can easily find where things are going wrong.
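To make the "trace your strings" advice concrete, here is a minimal sketch in plain Ruby. The helper name describe_string and the entity in the scraped value are my own assumptions for illustration; the point is only that dumping the encoding, code points and bytes shows immediately why two names that look alike will never match in a lookup.

```ruby
# Summarise how a string is actually stored, so two names that look
# the same on screen can be compared piece by piece.
def describe_string(str)
  {
    encoding: str.encoding.name,
    valid: str.valid_encoding?,
    codepoints: str.codepoints.map { |cp| format('U+%04X', cp) },
    bytes: str.bytes.map { |b| format('%02X', b) }
  }
end

scraped = 'RI&Ccedil;K' # hypothetical raw scraper output, entity still encoded
stored  = 'RIÇK'        # the name as the user registered it

puts describe_string(scraped)
puts describe_string(stored)
puts scraped == stored   # the strings differ, so find_by cannot match
```

Running this at each hand-off (scraper, array, database query) narrows down the step where the value stops being what you expect.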
My goal is to write a validation class for Rails that takes OCR-recognised text from a business card, detects the individual string snippets and assigns them to the correct attributes. I know this probably cannot be 100% perfect, but I want to get as close as possible. Here is my approach so far:
I scan business cards in the browser via navigator.mediaDevices
I send the scanned image to a third party API Service, called OCRSpace (a gem is available here: https://github.com/suyesh/ocr_space)
I then get an unformatted array of recognised text snippets back, for example:
result = [['John Doe'], ['+49 160 123456'], ['Mainstr. 45a'], ['12345 Berlin'], ['CEO'], ['johndoe#business-website.de'], ['www.business-website.de']]
I then iterate through the array and do some checks, for example
Using the people library (https://github.com/mericson/people) to split the name into first name and last name (plus the title or middle names)
Using the phonelib library (https://github.com/daddyz/phonelib) to look up a valid phone number and format it as an international string
Doing a basic regex check on the email address and storing it
What I miss now is:
How can I find out what the name-string would possibly be? Right now I let the user choose it (in my example he defines "John Doe" as the name and then the library does the rest). I'm sure I would run into conflicts when using a regex as strings like "Main Street" would then also be recognized as a name?
How do I regex a combination of ZIP-Code and City name? I'm not a regex expert, do you know any good sources that would help? Couldn't find any so far except some regex-checkers in general.
In general: Do you like my approach or is this way too complicated? And do you know some best-practices that look better?
Don't consider this a full answer, but it was too much to make it a comment.
Your way of working seems OK, but I wouldn't use the OCR service since there are other options; Tesseract is the best known.
If you do use it and all the results are presented in a comparable way, it does not seem too difficult, since every piece of information has its own characteristics.
You can identify the name part because it won't have numbers in it, while most of the rest does. You can also expect it to contain "Mr." or "Mrs." or the like, and not "Str.", "street" and so on. You could also use Google Maps to check for correct addresses; there are Ruby gems for that, but I have no experience with them.
Your people gem could also help.
You could guess all of this, present the results in your web page and let the user confirm or adjust them.
You could also use a regular expression for the postal code and city combination by looking for a number and a string in either order, but you could also use a gem like ZipCodes to help.
I'm sorry, I don't have the time right now to test some regular expressions, and I don't publish code without testing.
Hope this was of some help, good luck!
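As a rough illustration of the heuristics described above (names have no digits, street words rule out a name, ZIP and city come as a number plus a word), a sketch like the following could be a starting point. All patterns and the classify helper are my own assumptions: the ZIP pattern assumes German five-digit codes written before the city, and every regex would need tuning against real cards.

```ruby
# Hypothetical patterns; order in classify matters (most specific first).
EMAIL    = /\A[^@\s]+@[^@\s]+\.[^@\s]+\z/
URL      = %r{\A(?:https?://|www\.)\S+\z}i
ZIP_CITY = /\A(?<zip>\d{5})\s+(?<city>\p{L}[\p{L}\s.-]*)\z/ # German style: "12345 Berlin"
PHONE    = %r{\A\+?[\d\s/().-]{6,}\z}
# Anything with a digit or a typical street word is treated as a street line.
STREET   = /\d|str\.|stra(?:ss|ß)e\b|\b(?:street|road|avenue)\b/i
# Two or more words made of letters only.
NAME     = /\A\p{L}[\p{L}'.-]*(?:\s+\p{L}[\p{L}'.-]*)+\z/

def classify(snippet)
  case snippet
  when EMAIL    then :email
  when URL      then :url
  when ZIP_CITY then :zip_city
  when PHONE    then :phone
  when STREET   then :street
  when NAME     then :name
  else :unknown
  end
end

result = [['John Doe'], ['+49 160 123456'], ['Mainstr. 45a'], ['12345 Berlin'],
          ['CEO'], ['www.business-website.de']]
result.flatten.each { |snippet| puts "#{snippet} => #{classify(snippet)}" }
```

Single words such as "CEO" fall through to :unknown here, which fits the idea of showing the guesses to the user and letting them confirm or correct each one.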
Newbie here, wrapping my head 'round this stuff!
I'd like to use the hex number as my url (external identifier) and keep the uuid within the database for a ruby on rails application. Is this even possible?
Thanks a bunch
Many people advise you against it but, yes, it is possible. It will need some code for it, and the solution depends on which version of Rails you use and what you use for the database, which is why I'm going to answer in a generic way.
You will want to have two different fields for the model: one for the external hex representation and another one for a separate UUID. Then, you can use the hex string to find instances in your controller actions, for example.
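A minimal sketch of the two-field idea in plain Ruby, assuming the "hex number" is simply the UUID with its dashes removed (the question does not say exactly). The helper names are hypothetical, and hex_id below is a made-up column name.

```ruby
require 'securerandom'

# External identifier: the UUID without dashes (32 hex characters).
def uuid_to_hex(uuid)
  uuid.delete('-')
end

# Rebuild the canonical 8-4-4-4-12 form from the 32-character hex string.
def hex_to_uuid(hex)
  m = /\A(\h{8})(\h{4})(\h{4})(\h{4})(\h{12})\z/.match(hex.downcase)
  raise ArgumentError, 'expected 32 hex characters' unless m

  m.captures.join('-')
end

uuid = SecureRandom.uuid
hex  = uuid_to_hex(uuid)
puts "stored UUID: #{uuid}"
puts "URL segment: #{hex}"
puts hex_to_uuid(hex) == uuid
```

In a Rails controller the lookup would then be something along the lines of Model.find_by(hex_id: params[:id]) if you store the hex string in its own column, or a lookup by hex_to_uuid(params[:id]) if you only store the UUID.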
Please take a look at the following (they don't seem to have the two fields but will point you in the right direction anyway):
Problems setting a custom primary key in a Rails 4 migration
Change Primary Key Issue Rails 4.0
http://www.speakingcode.com/2013/12/07/gracefully-using-custom-primary-keys-in-rails-4-routes-controllers-models-associations-and-migrations/
And a longer post of a similar thing to do: http://ruby-journal.com/how-to-override-default-primary-key-id-in-rails/
Also, the FriendlyId gem might do what you want.
I am building a simple invoice application, and I would like to allow the users to customize the text on the invoice. In addition to this, they should be able to reference specific attributes in my models, i.e. "This is a test {{Model.attribute}}", and once the text is parsed the tag is replaced with the value of that attribute.
I have looked a bit at RedCloth, Textile and Handlebars, but they look like a bit of overkill, to be honest. For instance, I would not like to allow the users to input any HTML.
I would really appreciate it if someone could point me in the right direction. There is probably a gem for this that I just haven't found yet.
Thanks in advance
I use Liquid together with the simple_format helper, which sanitises the text.
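To show the idea behind such a template step without pulling in a gem, here is a hand-rolled sketch. It is not Liquid's API; the whitelist, the render_text helper and the model names are all assumptions for illustration. It escapes the user's text so no HTML gets through, then replaces only {{Model.attribute}} tags that are explicitly allowed.

```ruby
require 'erb'

# Hypothetical whitelist: only these attributes may be referenced in user text.
ALLOWED = { 'Invoice' => %w[number total], 'Customer' => %w[name] }.freeze

PLACEHOLDER = /\{\{\s*(\w+)\.(\w+)\s*\}\}/

def render_text(template, objects)
  escaped = ERB::Util.html_escape(template) # user-supplied HTML is neutralised
  escaped.gsub(PLACEHOLDER) do
    whole = Regexp.last_match(0)
    model, attr = Regexp.last_match.captures
    target = objects[model]
    if target && ALLOWED.fetch(model, []).include?(attr)
      ERB::Util.html_escape(target.public_send(attr).to_s)
    else
      whole # leave unknown or disallowed tags untouched
    end
  end
end

invoice = Struct.new(:number, :total).new('2024-001', 99)
puts render_text('Invoice {{Invoice.number}}, total {{ Invoice.total }}', { 'Invoice' => invoice })
```

The whitelist matters: without it a tag could call any public method on your models.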
Before I start, I would like to say that I'm quite a newbie to Xcode and the C Language, and I'm trying my best to learn as much as I can. I have researched for about 2 days now before posting this question but could not find anything helpful :( I am genuinely stuck and would appreciate ANY help. This is most likely a very simple/basic question:
Basically, I am trying to get this data (LINK), which is apparently UTF-8 JSON, and display it in a simple label in Xcode. However, I do not know how to get that data and parse it at all. I've followed a tutorial online with success, but that deals with JSON objects rather than arrays (which I think I am dealing with).
I would HIGHLY appreciate it if someone could show, in code, how to extract and parse the data from the first link into a basic label in Xcode, preferably with comments on what most lines of code are doing, as this would really help me understand how it works. Hopefully from there I would be able to make good progress.
Once again this is highly appreciated!
Thank you.
Here's a sample of the JSON URL for convenience if you don't want to click the link:
[4,"1.0",1343920773538]
[1,"Spring Gardens","59581","275","Barkingside",1343920940000,1343920940000]
[1,"Spring Gardens","59581","275","Barkingside",1343921717000,1343921717000]
[1,"Spring Gardens","59581","549","Loughton",1343921858000,1343921858000]
[1,"Spring Gardens","59581","275","Barkingside",1343922204000,1343922204000]
[2,"Spring Gardens","59581","8a56a0ab37b72b400137cb7cfd954038_29222",0,3,"Bus routes serving this stop are subject to change during the Olympics and Paralympics games. For more information visit www.tfl.gov.uk/buses for more information.",1344668400000]
Use JSONObject like that tutorial shows. You should get an NSDictionary or NSArray at the end which will contain all your values; just map those to the label.
If you don't want to do that, save the response in a char array and walk through it while checking for [ or " characters. When you find one, read the chars until the next occurrence and save all the data you read into an array or something similar. This is messy, though, and involves at least 3-4 hours of writing your own custom logic for decoding JSON data. You should use JSONObject, which is pretty simple.
Hi, I have this same problem. I have been looking for a JSON solution, and so far I have found that the best way to deal with this data is to parse it as CSV instead. The solution seems straightforward when you try to parse it as CSV instead of JSON.
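One thing worth noting about the sample: the response as a whole is not a single JSON document, but each line on its own is a valid JSON array. So a middle road between a full hand-written parser and CSV is to split on line breaks and JSON-parse every line. Below is a sketch of that idea in Ruby (not Objective-C, just to show the approach); the meaning I give to the individual fields is a guess from the sample, not something documented here.

```ruby
require 'json'

# Parse a response in which every non-empty line is its own JSON array.
def parse_rows(body)
  body.each_line.map(&:strip).reject(&:empty?).map { |line| JSON.parse(line) }
end

sample = <<~DATA
  [4,"1.0",1343920773538]
  [1,"Spring Gardens","59581","275","Barkingside",1343920940000,1343920940000]
  [1,"Spring Gardens","59581","549","Loughton",1343921858000,1343921858000]
DATA

rows = parse_rows(sample)

# Assumption: rows starting with 1 are arrivals, index 3 is the route,
# index 4 the destination, index 5 a timestamp in milliseconds since the epoch.
rows.select { |row| row.first == 1 }.each do |row|
  time = Time.at(row[5] / 1000).utc
  puts "#{row[3]} to #{row[4]} at #{time.strftime('%H:%M')} UTC"
end
```

The same per-line approach carries over to NSJSONSerialization: split the response string into lines and parse each line separately, which gives you one NSArray per line.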
I want to have Greek support in my Rails app for the messages, flashes etc. I took a quick look at the I18n framework, but it seems like a lot of configuration for something so simple. I think there should be a very easy way to do something like this, but (apparently) I don't know it. If anyone would be willing to help me I'd be glad. Thanks.
Rails has conventions so you don't have to configure.
Create a file in config/locales/el.yml with the contents:
el:
  flash_messages:
    success: "επιτυχία"
    fail: "αποτυγχάνουν"
Then in your controller:
flash[:notice] = t('flash_messages.success')
and you'll get the translated string in your view.
You can change the locale like this:
I18n.locale = :el
I don't know how it could be easier. The Rails I18n Guide (http://guides.rubyonrails.org/i18n.html) has all the gory details if you want to fight the conventions or go beyond simple.
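If it helps to see why so little configuration is needed, the lookup is conceptually just a walk through the nested YAML along the dotted key. The sketch below imitates that in plain Ruby; lookup is a hypothetical stand-in, not the real t helper, and the real I18n library does a good deal more (fallbacks, interpolation, pluralisation).

```ruby
require 'yaml'

LOCALE_YAML = <<~YAML
  el:
    flash_messages:
      success: "επιτυχία"
      fail: "αποτυγχάνουν"
YAML

TRANSLATIONS = YAML.safe_load(LOCALE_YAML)

# Hypothetical stand-in for t: follow the dotted key through the nested hash.
def lookup(locale, key)
  TRANSLATIONS.dig(locale.to_s, *key.split('.')) ||
    "translation missing: #{locale}.#{key}"
end

puts lookup(:el, 'flash_messages.success')
puts lookup(:el, 'flash_messages.unknown')
```

In the real app you only write the YAML file and call t; Rails loads everything under config/locales for you.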
I18N ain't simple.
Here are some of the reasons why internationalization is hard.
First, every piece of text that might be shown to the user is a potential candidate for translation. That means not just properties of components (like menu items and button labels), but also text drawn with stroke drawing calls, and text embedded in pixel images (like this one taken from the MIT EECS web page). Translation can easily change the size or aspect ratio of the text; German labels tend to be much longer than English ones, for example.
Error messages also need to be translated, of course — which is another reason not to expose internal system names in error messages. An English-reading user might be able to figure out what FileNotFoundException means, but it won’t internationalize well.
Don't know Ruby so I can't help you more.