How to extract differing part from many strings? - ruby-on-rails

I have many strings. At least 4, max 12. All the Strings have a differing part in the middle. I want to extract the part of the string for each string that differs from all the other strings. How can I practive this?

http://users.cybercity.dk/~dsl8950/ruby/diff.html
That's all what you need

Related

What is the most efficient way to concatenate strings in Dart?

Languages like Java, let you concatenate Strings using '+" operator.
But as strings are immutable, they advise one to use StringBuilder for efficiency if one is going to repeatedly concatenate a string.
What is the most efficient way to concatenate Strings in Dart ?
https://api.dart.dev/stable/2.9.1/dart-core/StringBuffer-class.html
StringBuffer can be used for concatenating strings efficiently.
Allows for the incremental building of a string using write*() methods. The strings are concatenated to a single string only when toString is called.
It appears that if one uses StringBuffer, one is postponing the performance hit till toString is called?
There are a number of ways to concatenate strings:
String.operator +: string1 + string2. This is the most straightforward. However, if you need to concatenate a lot of strings, using + repeatedly will create a lot of temporary objects, which is inefficient. (Also note that unlike other concatenation methods, + will throw an exception if either argument is null.)
String interpolation: '$string1$string2'. If you need to concatenate a fixed number of strings that are known in advance (such that you can use a single interpolating string), I would expect this to be reasonably efficient. If you need to incrementally build a string, however, this would have the same inefficiency as +.
StringBuffer. This is efficient if you need to concatenate a lot of strings.
Iterable.join: [string1, string2].join(). This internally uses a StringBuffer so would be equivalent.
If you need to concatenate a small, fixed number of strings, I would use string interpolation. It's usually more readable than using +, especially if there are string literals involved. Using StringBuffer in such cases would add some unnecessary overhead.
Here is what I understood:
It appears that performance will vary depending on your use case. If you use StringBuffer, and intend to concatenate a very large string, then as the concatenation occurs only when you call toString, it is at that point where you get the "performance hit".
But if you use "+", then each time you call "+" there is a performance hit. As strings are immutable, each time you concatenate two strings you are creating a new object which needs to be garbage collected.
Hence the answer seems to be to test for your situation, and determine which option makes sense.

Character Replacements

I have a UniCode string UniStr.
I also have a MAP of { UniCodeChar : otherMappedStrs }
I need the 'otherMappedStrs' version of UniStr.
Eg: UniStr = 'ABC', MAP = { 'A':'233','B':'#$','C':'9ij' }, Result = '233#$9ij'
I have come up with the formula below which works;
=ArrayFormula(JOIN("",VLOOKUP(REGEXEXTRACT(A1,REPT("(.)",LEN(A1))),MapRange,2,FALSE)))
The MAP being a whole character set (40 chars) is quite large.
I need to use this function in multiple spreadsheets. How can I subsume the MAP into the formula for portability ?
Is there a better way to iterate a string other than the REGEXEXTRACT method in formula ? This method has limitation for long strings.
I also tested the below formula. Problem here is it gives 2 results (or the size of the array within SUBSTITUTE replacement). If 3 substitutions made, then it gives three results. Can this be resolved ?
=ArrayFormula(SUBSTITUTE(A1,{"s","i"},{"#","#"}))
EDIT;
#Tom 's first solution appears best for my case (1) REGEX has an upper limit on search criteria which does not hinder in your solution (2) Feels fast (did not do empirical testing) (3) This is a better way to iterate string characters, I believe (you answered my Q2 - thanks)
I digress here. I wish google would introduce Named-Formulas or Formula-Aliases. In this case, hypothetically below. I have sent feed back along those lines many times. Nothing :(
MyFormula($str) == ArrayFormula(join(,vlookup(mid($str,row(indirect("1:"&len($str))),1), { "A","233";"B","#$";"C","9ij" },2,false)))
Not sure how long you want your strings to be, but the more traditional
=ArrayFormula(join(,vlookup(mid(A1,row(indirect("1:"&len(A1))),1), { "A","233";"B","#$";"C","9ij" },2,false)))
seems a bit more robust for long strings.
For a more radical idea, supposing the maximum length of your otherMappedStrings is 3 characters, then you could try:
=ArrayFormula(join(,trim(mid("233 #$9ij",find(mid(A1,row(indirect("1:"&len(A1))),1), "ABC")*3-2,3))))
where I have put a space in before #$ to pad it out to 3 characters.
Incidentally the original VLOOKUP is not case sensitive. If you want this behaviour, use SEARCH instead of FIND.
You seem to have several different Qs, but considering only portability, perhaps something like the following would help:
=join(,switch(arrayformula(regexextract(A1&"",rept("(.)",len(A1)))),"A",233,"B","#$","C","9ij"))
extended with 37 more pairs.

How do I keep my rails integer from being converted to binary?

As you may be able to see in the image, I have a User model and #user.zip is stored as an integer for validation purposes (ie, so only digits are stored, etc.). I was troubleshooting an error when I discovered that my sample zip code (00100) was automatically being converted to binary, and ending up as the number 64.
Any ideas on how to keep this from happening? I am new to Rails, and it took me a few hours to figure out the cause of this error, as you might imagine :)
I can't imagine any other information would be helpful here, but please inform me if otherwise.
This is not binary, this is octal.
In Ruby, any number starting with 0 will be treated as an octal number. You should check the Ruby number literals to learn more about this, here's a quote:
You can use a special prefix to write numbers in decimal, hexadecimal, octal or binary formats. For decimal numbers use a prefix of 0d, for hexadecimal numbers use a prefix of 0x, for octal numbers use a prefix of 0 or 0o, for binary numbers use a prefix of 0b. The alphabetic component of the number is not case-sensitive.
For your case, you should not store zipcodes as numbers. Not only in the database, but even as variables don't treat them as numeric values. Instead, store and treat them as strings.
The zip should probably be stored as a string since you can't have a valid integer with leading zeroes.

convert hash value into string in objective c

I have a hash value and I want to convert it into string formate but I do not know how to do that.
Here is the hash value
7616db6c232292d2e56a2de9da49ea810d5bb80d53c10e7b07d9521dc88b3177
Hashing technique is a one way algorithm. So that is not possible, sorry.
You can't. A hash is a one-way function. Plus, two strings could theoretically generate the same hash.
BUT. If you had a large enough, pre-computed list of as many hashes as you could generate for each unique string, you could potentially convert your hash back into a string. Although it would require thousands of GB in entries and a lot of time to compute each and every hash. So it practically isn't possible.

Extended Huffman code

I have this homework: finding the code words for the symbols in any given alphabet. It says I have to use binary Huffman on groups of three symbols. What does that mean exactly? Do i use regular Huffman on [alphabet]^3? If so, how do I then tell the difference between the 3 symbols in a group?
I can't quite tell, because your description of the problem isn't all that detailed, but I would guess that they mean that instead of encoding each symbol in your alphabet individually, you are supposed to tread each triple of symbols as a group.
So, for instance, if your alphabet consists of a, b, and c, instead of generating an encoding for each of those individually, you would create an encoding for aaa, aab, aac, etc. Each one of these strings would be treated as a separate symbol in the Huffman algorithm; you can tell them apart simply by doing string comparison on them. If you need to accept input of arbitrary length, you will also need to include in your alphabet symbols that are strings of length 1 or 2. For instance, if you're encoding the string aabacab, you would need to break that down into the symbols aab, aca, and b.
Does that help answer your question? I wasn't quite sure what you're looking for, so please feel free to edit your question or reply in a comment if this hasn't cleared anything up.
Food for thought: what about shorter strings, and permutations of "block boundaries"? What about 1 and 2 character strings? Do you just count off 3, 6, 9, 12, ... chars into your input text and then null pad any uneven lengths at the end?
If the chunks can be of variable size, then it gets really interesting to find the best fit. I suspect it degenerates into a traveling salesman kind of problem, but maybe there's a neat "theorem" or other tool out there for this kind of thing.
Perhaps try all permutations of 3 chars, saving the most frequently used, then try to come up with a good fit for the 1 and 2 char long gaps? Hmm, sounds like it might be really slow, but doable using some kind of recursive divide and counquer approach: pull out the long string of block length N, then recurse into encoding the gaps as length N - 1.
More questions than answers, I'm afraid.

Resources