I have a MySQL table that contains words joined by underscores and also words joined by hyphens.
Example: Engineering-Service_Civil-Geotech
I am able to replace the underscore with an ampersand with a space on either side, but I'm stuck on how to also replace the hyphen with a single space.
$cleanCat = str_replace( '_', ' & ', $Cat);
echo $cleanCat;
The code above performs only one of the two replacements:
Example: Engineering-Service & Civil-Geotech
Do I have to use a different function to achieve this?
Thanks in advance.
$cleanCat = str_replace('-', ' ', str_replace( '_', ' & ', $Cat));
A second call, $cleanCat = str_replace('-', ' ', $cleanCat);, run on the result of the first replacement
should work.
I'm having trouble finding the right pattern to get the string I want.
My starting string is:
",,,,C3:,D3,E3,F3,,"
I would like to have (with 8 leading spaces):
"        C3: [D3,E3,F3]"
That is, I would like to:
Replace each leading comma with a double space
Replace the comma after the colon with a double space and a left square bracket
Replace the trailing commas with a right square bracket
So far, I have tried this:
> a = ",,,,C3:,D3,E3,F3,,"
=> ",,,,C3:,D3,E3,F3,,"
> b = a.gsub(/^,*/, " ").gsub(/(?<=:),/, " [").gsub(/[,]*$/,"" ).gsub(/[ ]*$/, "]")
=> " C3: [D3,E3,F3]"
> b == "        C3: [D3,E3,F3]"
=> false
I can't manage to replace each leading comma with a double space, which would give 8 spaces in this case.
Could you help me find the right regexp and, if possible, improve my code?
To replace each leading comma with a double space, use the \G anchor, i.e. .gsub(/\G,/, '  '). \G tells the regex engine to match at the start of the string and then only at the position where the previous successful match ended. As a result, only the consecutive commas at the beginning of the string are replaced.
Then, you can add the other replacements:
s.gsub(/\G,/, '  ').sub(/,+\z/, ']').sub(/:,+/, ': [')
See the IDEONE demo:
s = ",,,,C3:,D3,E3,F3,,"
puts s.gsub(/\G,/, '  ').sub(/,+\z/, ']').sub(/:,+/, ': [')
# prints "        C3: [D3,E3,F3]" (8 leading spaces, shown here in quotes)
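To see what \G changes, here is a minimal Ruby sketch comparing it with the /^,*/ pattern from the question. The variable names collapsed and expanded are only for illustration.

```ruby
s = ",,,,C3:,D3,E3,F3,,"

# /^,*/ matches the whole run of leading commas in one go,
# so the run is replaced by a single double space.
collapsed = s.sub(/^,*/, "  ")

# /\G,/ matches one comma at a time, and only where the previous
# match ended, so each leading comma gets its own double space.
expanded = s.gsub(/\G,/, "  ")

p collapsed  # => "  C3:,D3,E3,F3,,"
p expanded   # => "        C3:,D3,E3,F3,,"
```

The commas after "C3:" are left alone in both cases, which is why the remaining replacements are still needed.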
To construct the desired string, one needs to know:
the number of leading commas (the size of the string comprised of the leading commas)
the string following the leading commas up to and including the colon
the string between the comma following the colon and two or more commas
It is a simple matter to construct a regex that saves each of these three strings to a capture group:
r = /
    (,*)   # match the leading commas in capture group 1
    (.+:)  # match up to and including the colon in capture group 2
    ,      # match the comma after the colon
    (.+)   # match one or more characters in capture group 3
    ,,     # match the two trailing commas
    /x     # extended/free-spacing regex definition mode
",,,,C3:,D3,E3,F3,," =~ r
We can now form the desired string from the contents of the three capture groups, using two spaces per leading comma:
"#{'  '*$1.size}#{$2} [#{$3}]"
#=> "        C3: [D3,E3,F3]"
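The same idea can be wrapped in a small method that avoids the global $1..$3 variables. This is only a sketch: the names PATTERN and format_row and the named groups are hypothetical, and the pattern is a slightly stricter variant (anchored, allowing any number of trailing commas) rather than the exact regex above.

```ruby
# Named groups: leading commas, label up to the colon, and the items
# between the comma after the colon and the trailing commas.
PATTERN = /\A(?<lead>,*)(?<label>.+:),(?<items>.+?),*\z/

def format_row(str)
  m = PATTERN.match(str)
  return nil unless m # input does not have the expected shape

  # Two spaces per leading comma, then "label [items]".
  "#{'  ' * m[:lead].size}#{m[:label]} [#{m[:items]}]"
end

p format_row(",,,,C3:,D3,E3,F3,,")  # => "        C3: [D3,E3,F3]"
p format_row("no colon here")       # => nil
```

Returning nil for input that does not match makes the failure case explicit instead of raising on a nil $1.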
I need to display "#" in a UITextView, followed by some other information.
I searched on Google but didn't find an explanation that made the right approach clear to me.
Can you give me some advice?
Thanks!
You can simply do it like this:
_textView.text = @"#Hi Hello"; // result: #Hi Hello
However, if you want to use " in the text, you need to escape it with a backslash (\):
_textView.text = @"#Hi \"Hello"; // result: #Hi "Hello
You can enter almost all special characters without any problem, but you need to take care with double quotes:
_textView.text = @"#Hi \"Hello * ! @ # $ % ^ & ( ) _ + - [ ] ; ' {} <> ,. / ? : \" ";
Is there a way to add a space after each comma in a string, but only if one isn't already there?
word word,word,word,
Would end up as
word word, word, word,
Is there a function in Ruby or Rails to do this?
This will be used on hundreds of thousands of sentences, so it needs to be fast (performance would matter).
Use a negative lookahead to check that the comma is not followed by a space, then replace it with a comma and a space. Note that this also appends a space after a trailing comma.
print 'word word,word,word,'.gsub(/,(?![ ])/, ', ')
Just use a regular expression to replace all instances of "," not followed by a space with ", ".
str = "word word,word,word,"
str = str.gsub(/,([^ ])/, ', \1') # "word word, word, word,"
If the string contains no multiple adjacent spaces (or should not contain such), you don't need a regex:
"word word, word, word,".gsub(',', ', ').squeeze(' ')
#=> "word word, word, word, "
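The approaches above differ mainly in how they treat a trailing comma. A minimal sketch comparing them on one input that mixes spaced and unspaced commas (the variable names are only for illustration):

```ruby
input = "word word,word, word,"

# Negative lookahead: any comma not followed by a space, including the last one.
lookahead = input.gsub(/,(?![ ])/, ", ")

# Capture group: needs a following non-space character, so the last comma is skipped.
captured = input.gsub(/,([^ ])/, ', \1')

# No regex: space every comma, then collapse runs of spaces.
squeezed = input.gsub(",", ", ").squeeze(" ")

p lookahead  # => "word word, word, word, "
p captured   # => "word word, word, word,"
p squeezed   # => "word word, word, word, "
```

Only the capture-group version leaves the trailing comma without a space, which matches the output asked for in the question.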
Add the missing space:
"word word,word,word,".gsub(/,(?=\w)/, ', ') # "word word, word, word,"
and remove the trailing comma if necessary:
"word word,word,word,".gsub(/,(?=\w)/, ', ').sub(/,\Z/, '') # "word word, word, word"
I'd like to use the convert_uudecode function, but the encoded string contains both a quotation mark ( " ) and an apostrophe ( ' ).
I can't just do it like this:
print convert_uudecode("M:'1T<#HO+V1N87=R;W0N;F%Z=V$N<&PO;&EC96YC97,O8F5S="UD96%L'0` ` ");
because, as you can see, the string already contains a quotation mark.
I also can't do it this way:
print convert_uudecode('M:'1T<#HO+V1N87=R;W0N;F%Z=V$N<&PO;&EC96YC97,O8F5S="UD96%L'0` ` ');
because the string also contains an apostrophe.
Any help?
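Replace each apostrophe ' inside the string with its HTML entity code,
and do the same for each quotation mark ".
Another alternative is to escape the delimiter with a backslash: in a double-quoted string write \" for each ", and in a single-quoted string write \' for each '.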
For reference, see a table of hexadecimal values, entity encodings, etc.
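I want to create a very simple parser to convert:
"I wan't this to be ready by 10:15 p.m. today Mr. Gönzalés.!"
to:
['I', ' ', 'wan', "'", 't', ' ', 'this', ' ', 'to', ' ', 'be', ' ', 'ready', ' ',
 'by', ' ', '10', ':', '15', ' ', 'p', '.', 'm', '.', ' ', 'today', ' ',
 'Mr', '.', ' ', 'Gönzalés', '.', '!']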
So basically I want consecutive letters and numbers to be grouped into a single string. I'm using Python 3 and I don't want to install external libs. I also would like the solution to be as efficient as possible as I will be processing a book.
What approaches would you recommend for solving this problem? Any examples?
The only way I can think of right now is to step through the text, character by character, in a for loop. But I'm guessing there's a better, more elegant approach.
You are looking for a procedure called tokenization. That means splitting raw text into discrete "tokens", in our case just words. For programming languages this is fairly easy, but unfortunately it is not so for natural language.
You need to do two things: split the text into sentences, and split the sentences into words. Usually we do this with regular expressions. Naïvely, you could split sentences on the pattern ". ", i.e. a period followed by a space, and then split each sentence into words on spaces. This won't work very well, however, because abbreviations often also end in periods. As it turns out, tokenization and sentence segmentation are fairly tricky to get right. You could experiment with several regexps, but it would be better to use a ready-made tokenizer. I know you didn't want to install any external libs, but I'm sure this will spare you pain later on. NLTK has good tokenizers.
I believe this is a solution:
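import regex  # third-party "regex" module (pip install regex), not the standard "re"
text = "123 2 can't, 4 Å, é, and 中ABC _ sh_t"
# runs of digits and runs of letters are grouped; anything else is a single-character token
print(regex.findall(r'\d+|\P{alpha}|\p{alpha}+', text))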
Can it be improved?