Ruby: how to generate unique alphabetic string in ruby - ruby-on-rails

Is there a built-in method in Ruby that generates a unique alphabetic string every time (letters only, no digits)?
I have tried SecureRandom, but it doesn't provide a method that returns a string containing only letters.

SecureRandom has a method choose which:
[...] generates a string that randomly draws from a source array of characters.
Unfortunately it's private, but you can call it via send:
SecureRandom.send(:choose, [*'a'..'z'], 8)
#=> "nupvjhjw"
You could also monkey-patch Random::Formatter:
module Random::Formatter
  def alphabetic(n = 16)
    choose([*'a'..'z'], n)
  end
end
SecureRandom.alphabetic
#=> "qfwkgsnzfmyogyya"
Note that the result is totally random and therefore not necessarily unique.
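If you would rather not call a private method, the same idea can be sketched with public SecureRandom calls only. The helper names random_letters and unique_letters are made up for this illustration, and the Set stands in for whatever store (for example a database column) records the values already issued; uniqueness comes from that check, not from the randomness itself.

```ruby
require 'securerandom'
require 'set'

# Random lowercase string built only from public SecureRandom calls.
def random_letters(length = 16)
  letters = [*'a'..'z']
  Array.new(length) { letters[SecureRandom.random_number(letters.size)] }.join
end

# Hypothetical helper: retries until the candidate has not been issued before.
# Set#add? returns nil when the value is already present.
def unique_letters(issued, length = 16)
  loop do
    candidate = random_letters(length)
    return candidate if issued.add?(candidate)
  end
end

issued = Set.new
tokens = Array.new(1000) { unique_letters(issued, 8) }
```

With 8 letters there are 26**8 possible values, so retries are rare at this scale, but the check is what actually guarantees that no value is handed out twice.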

UUIDs are designed to have an extremely low chance of collision. Since a UUID uses only 17 distinct characters (the 16 hex digits and the hyphen), it's easy to map the non-alphabetic characters onto unused letters.
SecureRandom.uuid.gsub(/[\d-]/) { |x| (x.ord + 58).chr }
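To see why the offset of 58 works, here is a small check of the mapping. The method name alphabetic_uuid is only for illustration.

```ruby
require 'securerandom'

# Shifting the character code by 58 maps "-" to "g" and "0".."9" to "j".."s".
# The hex letters "a".."f" are left untouched, so the result is purely alphabetic.
def alphabetic_uuid
  SecureRandom.uuid.gsub(/[\d-]/) { |x| (x.ord + 58).chr }
end

# Table of every character that gets replaced and what it becomes.
mapping = ['-', *'0'..'9'].to_h { |c| [c, (c.ord + 58).chr] }
sample = alphabetic_uuid
```

The result keeps the 36-character length of a UUID and only ever contains the letters a to g and j to s.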

Is there any in built method in ruby which will generate unique alphabetic string every time(it should not have numbers only alphabets)?
This is not possible. The only way that a string can be unique if you are generating an unlimited number of strings is if the string is infinitely long.
So, it is impossible to generate a string that will be unique every time.

def get_random_string(length=2)
  source = ("a".."z").to_a + ("A".."Z").to_a
  key = ""
  length.times { key += source[rand(source.size)].to_s }
  key
end
How about something like this? The default length is 2 here; feel free to change it to suit your needs:
get_random_string(7)

I used the current time with sub-second precision and converted it to base 36, which gives an alphanumeric value (digits as well as letters). Since it depends on the time, it will not repeat as long as no two values are generated at the same instant.
Example: Time.now.to_f.to_s.gsub('.', '').ljust(17, '0').to_i.to_s(36) # => "4j26m4zm2ss"
Take a look at this for full answer: https://stackoverflow.com/a/72738840/7365329

Try this one
length = 50
Array.new(length) { [*"A".."Z", *"a".."z"].sample }.join
# => bDKvNSySuKomcaDiAlTeOzwLyqagvtjeUkDBKPnUpYEpZUnMGF

Related

Remove from a string the characters where bytesize is greater than 2 with Ruby

I have a problem with mysql and certain characters. If a user enters "hello ●", I obtain this error:
Mysql2::Error: Incorrect string value: '\\xE2\\x97\\x8F he...' for column 'subject'
I would like to exclude all characters whose bytesize is greater than two, i.e., keep French characters like é, à, ç, and remove emojis or characters like ●.
Given string = "hèllö>●!", I would like to obtain "hèllö>!". In order to do so, I wrote this:
def bytesize(var)
  var.each_char do |char|
    puts char.bytesize
  end
end
bytesize(string)
1
2
1
1
2
1
3
1
# => "hèllö>●!"
which is not what I expected. What is the best way to remove all characters whose bytesize is greater than two from a string?
I don't do that in the model because I can manage this with a gem, but my problem appears when a job wants to put the string in the logs of Amazon SES.
Elaborating on OP's efforts, not using regular expressions:
string = "hèllö>●!"
cleaned = string.each_char.with_object("") do |char, str|
  str << char unless char.bytesize > 2
end
p cleaned
I suspect that you are getting that error message because the column has the wrong text encoding. If you are using Unicode in your system, and in this day and age you should be, the column's character set should be utf8mb4. See this on how to change your column types.
Taking your comment into account, the following will remove any characters outside the BMP:
sentence.gsub(/[\u{10000}-\u{10FFFF}]/,'')
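The two approaches are not equivalent, which a short comparison makes visible. The sample string below is the one from the question with an emoji appended as an assumption for illustration; the characters are written as escapes so the snippet does not depend on the file encoding.

```ruby
# "\u25CF" is the black circle from the question (3 bytes in UTF-8);
# "\u{1F600}" is an emoji outside the BMP (4 bytes in UTF-8).
string = "h\u00E8ll\u00F6>\u25CF!\u{1F600}"

# Drops every character encoded in more than two UTF-8 bytes.
by_bytesize = string.each_char.select { |char| char.bytesize <= 2 }.join

# Drops only characters outside the BMP.
outside_bmp = string.gsub(/[\u{10000}-\u{10FFFF}]/, '')
```

The bytesize filter removes both the circle and the emoji, while the BMP regex keeps the circle and removes only the emoji. Which one you want depends on what the column's character set can actually store.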

Specifying minimum length when using FFaker::Internet.user_name

I have a spec that keeps failing because FFaker::Internet.user_name generates a word that is shorter than 5 characters.
How do I specify a minimum length in this statement:
username { FFaker::Internet.user_name }
String#ljust — Returns a copy of self of a given length, right-padded with a given other string.
username { FFaker::Internet.user_name.ljust(5,"12345") }
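A quick illustration of how ljust behaves here, using plain strings in place of generated usernames. The helper name pad_username is made up for this sketch.

```ruby
# The pad string is truncated (or repeated) as needed to reach the target width;
# strings that are already long enough are returned unchanged.
def pad_username(name, min_length = 5, pad = '12345')
  name.ljust(min_length, pad)
end

short = pad_username('al')      # => "al123"
long  = pad_username('steven')  # => "steven"
```

So the result is always at least 5 characters, and names that already satisfy the minimum are left alone.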
From what I can see, you are trying to use FFaker in your factory. Why overcomplicate things for your specs when you could define a sequence?
sequence(:username) do |n|
"username-#{n}"
end
But the question is valid and you may have a legitimate need to use ffaker, and there are many ways to do it. You can just concatenate two usernames, why not?
username { FFaker::Internet.user_name + FFaker::Internet.user_name }
Or keep generating usernames until one reaches the minimum length:
username do
  name = FFaker::Internet.user_name
  # Ruby has no do...while; regenerate while the name is too short.
  name = FFaker::Internet.user_name while name.length < 5
  name
end
Or monkeypatch ffaker and implement it yourself, based on https://github.com/ffaker/ffaker/blob/0578f7fd31c9b485e9c6fa15f25b3eca724790fe/lib/ffaker/internet.rb#L43 and https://github.com/ffaker/ffaker/blob/0578f7fd31c9b485e9c6fa15f25b3eca724790fe/lib/ffaker/name.rb#L75
for example
module FFaker
  module Name
    # Sketch only: assumes fetch_sample and FIRST_NAMES are available here, as in the linked source.
    def long_username(min_length = 5)
      fetch_sample(FIRST_NAMES.select { |name| name.length >= min_length })
    end
  end
end
There're many ways you can achieve this, but if I had to do it, I'd do something like
(FFaker::Internet.user_name + '___')[0...5]
#=> "Lily_"
There are three underscores because, after a quick look at the name list, I found that the shortest first name is two characters long, so two plus three will always be at least five characters.
I'm only taking five character substring so as to not always have trailing underscore, but that's just my personal preference, you can use username plus three underscores and your test case will do fine.
You can't, but you could do FFaker::Name.name.split.join, which generates a full name and joins its parts into a single word.
You can also build it manually with regex using FFaker::String like so:
# 5 characters username, with first character being a letter
username { FFaker::String.from_regexp(/[a-zA-Z]\w{4}/) }
# random 5-10 characters username with first character being a letter
regex = Regexp.new("[a-zA-Z]\\w{#{Random.rand(4..9)}}")
username { FFaker::String.from_regexp(regex) }
Another idea is to add some random numbers to the end. This worked for me, was easy to implement, and looks fairly natural (since humans tend to do it when creating usernames in the wild).
E.g. this adds "1234" to the end of usernames:
"steve" + "1234"
and with Faker:
Faker::Internet.unique.user_name + "1234"
Note: if you want a random string instead of "1234", try some of these approaches. However, it may not be necessary if you're already using the Faker .unique method as in the above example.

Isolating/removing Characters from string using rails

I am using Ruby on Rails.
I have
article.id = 509969989168Q000475601
I would like the output to be
article.id = 68Q000475601
Basically, I want to get rid of everything before the 68Q.
The numbers in front of the 68Q can be of various lengths.
Is there a way to remove everything up to "68Q"?
It will always be 68Q, and Q is always the only letter.
Is there a way to say "remove all characters up to 2 digits before the Q"?
I'd use:
article.id[/68Q.*/]
Which will return everything from 68Q to the end of the string.
article.id.match(/68Q.+\z/)[0]
You can do this easily with the split method:
'68Q' + article.id.split('68Q')[1]
This splits the string into an array based on the delimiter you give it, then takes the second element of that array. For what it's worth though, @theTinMan's solution is far more elegant.
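For comparison, here are the approaches side by side on the sample value, treated as a String (an assumption, since the question does not say how the id is stored). The second pattern is an extra variation covering the "2 digits before Q" wording in the question.

```ruby
id = '509969989168Q000475601'

from_literal = id[/68Q.*/]                      # everything from the literal "68Q" onward
from_pattern = id[/\d{2}Q.*/]                   # everything from two digits before the first "Q"
from_split   = '68Q' + id.split('68Q', 2)[1]    # rebuild from the part after the delimiter
```

All three give "68Q000475601" for this input. The \d{2}Q version keeps working if the two digits in front of the Q ever change.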

How would i generate a random and unique string in Ruby?

In a Ruby on Rails app I am working on I allow users to upload files and want to give these files a short, random alphanumeric name. (Eg 'g7jf8' or '3bp76'). What is the best way to do this?
I was thinking of generating a hash / encrypted string from the original filename and timestamp, then querying the database to double check that it doesn't exist. If it does, generate another and repeat.
The issue I see with this approach is that if there is a high probability of duplicate strings, it could add quite a lot of database load.
I use this :)
def generate_token(column, length = 64)
  begin
    self[column] = SecureRandom.urlsafe_base64 length
  end while Model.exists?(column => self[column])
end
Replace Model by your model name
SecureRandom.uuid
Will give you a globally unique String. http://en.m.wikipedia.org/wiki/Universally_unique_identifier
SecureRandom.hex 32
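Will give a random String, but its algorithm is not optimised for uniqueness. Note that the argument is the number of random bytes, so SecureRandom.hex 32 returns 64 hex digits (256 random bits). Assuming true randomness, the chance of collision at that size is purely theoretical: even a version 4 UUID, with only 122 random bits, would need about 1 billion values per second for roughly 86 years to reach a 50% chance of a single collision.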
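If you want to sanity-check figures like that yourself, the usual birthday-bound approximation is easy to compute. This is a rough sketch assuming ideally random values; the method name collision_probability is made up, and the bit counts are 122 random bits for a version 4 UUID and 256 for the 32 random bytes behind SecureRandom.hex(32).

```ruby
# Birthday-bound approximation: p ~ 1 - exp(-n^2 / (2N))
# for n draws from N = 2**bits equally likely values.
def collision_probability(draws, bits)
  x = draws.to_f**2 / (2.0 * 2.0**bits)
  # For very small x, 1 - exp(-x) rounds to zero in floating point, so return x itself.
  x < 1e-9 ? x : 1.0 - Math.exp(-x)
end

seconds_per_year = 365.25 * 24 * 3600
draws = 1e9 * 86 * seconds_per_year          # one billion values per second for 86 years

uuid_v4 = collision_probability(draws, 122)  # version 4 UUID
hex_32  = collision_probability(draws, 256)  # SecureRandom.hex(32)
```

Under these assumptions uuid_v4 comes out at roughly one half, while hex_32 is so small that it is zero for any practical purpose.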
Use Ruby's SecureRandom.hex with an optional argument giving the number of random bytes; the returned string has twice that many hex characters.
This produces a new 40-character hexadecimal string (a SHA-1 digest) that includes the timestamp, and regenerates it if the value already exists in the database.
require 'digest'
loop do
  random_token = Digest::SHA1.hexdigest([Time.now, rand(111..999)].join)
  break random_token unless Model.exists?(column_name: random_token)
end
Note: Replace Model with your model name and column_name with an existing column of your model.
You can assign a unique id by incrementing it each time a new file is added, and convert that id into an encrypted string using OpenSSL::Cipher with a constant key that you save somewhere.
If you end up generating a hex or numeric digest, you can keep the code shorter by representing the number as e.g. Base 62:
# This is a lightweight base62 encoding for Ruby integers.
B62CHARS = ('0'..'9').to_a + ('a'..'z').to_a + ('A'..'Z').to_a
def base62_string nbr
  b62 = ''
  while nbr > 0
    b62 << B62CHARS[nbr % 62]
    nbr /= 62
  end
  b62.reverse
end
If it is important for you to restrict the character set used (for instance not have uppercase chars in file names), then this code can easily be adapted, provided you can find a way of feeding in a suitable random number.
If your file names are supposed to be semi-secure, you need to arrange that there are many more possible names than actual names in storage.
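One possible way of feeding in a suitable random number is SecureRandom.random_number. The sketch below repeats the base 62 conversion so that it runs on its own, adds an explicit case for zero, and pads the result to a fixed length. The names b62_chars, to_base62 and random_name are made up for this illustration.

```ruby
require 'securerandom'

# Digits first, then lowercase, then uppercase: 62 symbols in total.
def b62_chars
  [*'0'..'9', *'a'..'z', *'A'..'Z']
end

# Digit-by-digit base 62 conversion, with an explicit case for zero.
def to_base62(number)
  chars = b62_chars
  return chars[0] if number.zero?

  digits = +''
  while number > 0
    digits << chars[number % 62]
    number /= 62
  end
  digits.reverse
end

# Hypothetical helper: a random name of exactly `length` characters.
def random_name(length = 5)
  to_base62(SecureRandom.random_number(62**length)).rjust(length, '0')
end

name = random_name(5)
```

Five characters give 62**5 (about 916 million) possible names, so you would still check the database for an existing name before using one.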
It looks like you actually need unique filenames, right? Why not forget about complex solutions and simply use Time#nsec?
t = Time.now #=> 2007-11-17 15:18:03 +0900
"%10.9f" % t.to_f #=> "1195280283.536151409"
You can use the current time with sub-second precision and then convert it to base 36 to reduce its length.
Since it depends on the time, it will not repeat unless two values are generated at the same instant.
Example:
Time.now.to_f.to_s.gsub('.', '').ljust(17, '0').to_i.to_s(36) # => "4j26lna7g62"
Take a look at this answer:
https://stackoverflow.com/a/72738840/7365329

Completely random identifier of a given length

I would like to generate a completely random "unique" (I will ensure that using my model) identifier of a given length (the length may vary) containing numbers, letters and special characters.
For example:
161551960578281|2.AQAIPhEcKsDLOVJZ.3600.1310065200.0-514191032|
Can someone please suggest the most efficient way to do that in Ruby on Rails?
EDIT: IMPORTANT:
If it is possible please comment on how efficient your proposed solution is because this will be used every time a user enters a website!
Thanks
Using this for an access token is a different story than UUIDs. You need not only pseudo-randomness; the generator also has to be a cryptographically secure PRNG. If you don't really care what characters you use (they don't add anything to the security) you could use something like the following, producing a URL-safe Base64-encoded access token. URL-safeness becomes important in case you append the token to URLs, similar to what some Java web apps do: "http://www.bla.com/jsessionid=". If you used raw Base64 strings for that purpose, you would produce potentially invalid URLs.
require 'securerandom'
def produce_token(length=32)
  SecureRandom.urlsafe_base64(length)
end
The probability that two given tokens are equal is 2^(-8 * length), because length is the number of random bytes. Since the output is Base64-encoded, the actual output will be about 4/3 * length characters long. If installed, this is based on the native OpenSSL PRNG implementation, so it should be pretty efficient in terms of performance. Should the OpenSSL extension not be installed, /dev/urandom will be used if available and finally, if you are on a Windows machine, CryptGenRandom would be used as fallback. Each of these options should be sufficiently performant. E.g., on my laptop, running produce_token a million times finishes in ~6s.
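A quick check of the output size and character set for the default of 32 random bytes:

```ruby
require 'securerandom'

token = SecureRandom.urlsafe_base64(32)        # 32 random bytes = 256 random bits

length   = token.length                        # 43, i.e. (32 * 4 / 3.0).ceil, since padding is omitted
url_safe = token.match?(/\A[A-Za-z0-9_-]+\z/)  # only letters, digits, "-" and "_"
```

The length argument therefore controls the amount of randomness, not the number of characters in the token.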
The best solution is:
require 'active_support/secure_random'
ActiveSupport::SecureRandom.hex(16) # => "00c62d9820d16b52740ca6e15d142854"
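This will generate a cryptographically secure random string (i.e. completely unpredictable). ActiveSupport::SecureRandom has since been removed from Rails; on current versions, require 'securerandom' and call SecureRandom.hex(16) instead.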
Similarly, you could use a library to generate UUIDs as suggested by others. In that case, be sure to use the random version (version 4) and make sure the implementation uses a cryptosecure random generator.
As with anything related to security, rolling your own is not the best idea (even though I succumbed to it too, see first versions! :-). If you really want a homemade random string, here's a rewrite of tybro0103's approach:
require 'digest/sha1'
ALPHABET = "|,.!-0123456789".split(//) + ('a'..'z').to_a + ('A'..'Z').to_a
def random_string
  not_quite_secure = Array.new(32){ ALPHABET.sample }.join
  Digest::SHA1.hexdigest(not_quite_secure)
end
random_string # => "2555265b2ff3ecb0a13d65a3d177b326733bc143"
Note that it hashes the random string, otherwise it could be subject to attack.
Performance should be similar.
Universally Unique Identifiers (UUIDs) are tricky to generate yourself ;-) If you want something really reliable, use the uuid4r gem and call it with UUID4R::uuid(1). This will spit out a UUID based on the time and a hardware id (the computer's MAC address), so it's unique even across multiple machines generating at the exact same time.
A requirement for uuid4r is the ossp-uuid C library, which you can install with the package manager of your choice (apt-get install libossp-uuid libossp-uuid-dev on Debian or brew install ossp-uuid on a Mac with Homebrew, for example) or by manually downloading and compiling it, of course.
The advantage of using uuid4r over a manual (simpler?) implementation is that a) it is truly unique and not just "some sort of pseudo random number generator kind of sometimes reliable" and b) it's fast (even with higher UUID versions) because it uses a native extension to the C library.
require 'rubygems'
require 'uuid4r'
UUID4R::uuid(1) #=> "67074ea4-a8c3-11e0-8a8c-2b12e1ad57c3"
UUID4R::uuid(1) #=> "68ad5668-a8c3-11e0-b5b7-370d85fa740d"
update:
Regarding speed, see my (totally not scientific!) little benchmark over 50k iterations:
               user     system      total        real
version 1  0.600000   1.370000   1.970000 (  1.980516)
version 4  0.500000   1.360000   1.860000 (  1.855086)
So on my machine, generating a UUID takes ~0.04 milliseconds (about 1.9 seconds for the 50000 iterations of each benchmark). Hope that's fast enough for you.
The benchmark code:
require 'rubygems'
require 'uuid4r'
require 'benchmark'
n = 50000
Benchmark.bm do |bm|
  bm.report("version 1") { n.times { UUID4R::uuid(1) } }
  bm.report("version 4") { n.times { UUID4R::uuid(4) } }
end
Update on heroku: the gem is available on heroku as well
def random_string(length=32)
  chars = (0..9).to_a.concat(('a'..'z').to_a).concat(('A'..'Z').to_a).concat(['|',',','.','!','-'])
  str = ""; length.times {str += chars.sample.to_s}
  str
end
The Result:
>> random_string(42)
=> "a!,FEv,g3HptLCImw0oHnHNNj1drzMFM,1tptMS|rO"
It is a bit trickier to generate random letters in Ruby 1.9 vs 1.8 due to the change in behavior of characters. The easiest way to do this in 1.9 is to generate an array of the characters you want to use, then randomly grab characters out of that array.
See http://snippets.dzone.com/posts/show/491
You can check implementations here; I used this one.
I used the current time, down to fractions of a second, to generate a unique identifier.
Time.now.to_f # => 1656041985.488494
Time.now.to_f.to_s.gsub('.', '') # => "16560419854884948"
This gives a 17-digit number.
Sometimes it gives fewer digits, because trailing zeros after the decimal point are dropped when the float is converted to a string.
So I used ljust(17, '0').
Example:
Time.now.to_f.to_s.gsub('.', '').ljust(17, '0') # => "16560419854884900"
Then I used to_s(36) to convert it into a shorter alphanumeric string.
Time.now.to_f.to_s.gsub('.', '').ljust(17, '0').to_i.to_s(36) # => "4j26hz9640k"
The argument of to_s(36) is the radix (base 36):
https://apidock.com/ruby/v2_5_5/Integer/to_s
If you want to limit the length, you can keep only the first few digits of the timestamp (String#first comes from ActiveSupport; in plain Ruby use [0, 12]):
Time.now.to_f.to_s.gsub('.', '').ljust(17, '0').first(12).to_i.to_s(36) # => "242sii2l"
But if you want uniqueness down to the millisecond, keep at least the first 13 digits (10 for the seconds plus 3 for the milliseconds); I would suggest first(15) to be safe.
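The same idea can be sketched without going through a Float and a String. This is an alternative for illustration, not the code from the answer above: Process.clock_gettime returns the wall-clock time as an Integer number of microseconds, which can be converted to base 36 directly. The method name time_token is made up.

```ruby
# Microseconds since the Unix epoch as an Integer, encoded in base 36.
def time_token
  Process.clock_gettime(Process::CLOCK_REALTIME, :microsecond).to_s(36)
end

token = time_token

# The token is just an encoded timestamp, so it can be decoded again.
decoded = Time.at(token.to_i(36) / 1_000_000.0)
```

Because the token can be decoded back to a time, it is guessable and should not be used as a secret. Two processes or machines generating a token in the same microsecond would also get the same value, so a uniqueness check is still needed where that matters.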
