Capitalize first letter of every word in Lua - lua

I'm able to capitalize the first letter of my string using:
str:gsub("^%l", string.upper)
How can I modify this to capitalize the first letter of every word in the string?

I wasn't able to find any fancy way to do it.
str = "here you have a long list of words"
str = str:gsub("(%l)(%w*)", function(a,b) return string.upper(a)..b end)
print(str)
This code outputs "Here You Have A Long List Of Words". Changing %w* to %w+ leaves one-letter words untouched.
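To make the difference between the two quantifiers concrete, here is a small sketch that runs both patterns on the same input. The variable names all_words and longer_words are only illustrative.

```lua
local str = "here you have a long list of words"

-- %w* allows an empty tail, so single-letter words such as "a" match too.
local all_words = str:gsub("(%l)(%w*)", function(a, b) return a:upper() .. b end)

-- %w+ requires at least one more character, so "a" no longer matches.
local longer_words = str:gsub("(%l)(%w+)", function(a, b) return a:upper() .. b end)

print(all_words)     --> Here You Have A Long List Of Words
print(longer_words)  --> Here You Have a Long List Of Words
```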
Fancier solution:
str = string.gsub(" "..str, "%W%l", string.upper):sub(2)
Lua patterns are much simpler than full regular expressions (for example, there is no \b word-boundary escape), which is why the string is padded with a leading space before matching.
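As a side note, Lua also has the frontier pattern %f[set] (documented from Lua 5.2 onward), which matches the boundary between a character outside the set and one inside it. The following is a minimal sketch that avoids the padding trick; capitalize_words is a hypothetical helper name, not part of any answer above.

```lua
-- %f[%a] matches the empty position where a non-letter is followed by a letter,
-- including the very start of the string.
local function capitalize_words(s)
  -- The outer parentheses drop gsub's second return value (the match count).
  return (s:gsub("%f[%a]%l", string.upper))
end

print(capitalize_words("here you have a long list of words"))
--> Here You Have A Long List Of Words
```

Like the %W%l version, this treats any non-letter, including an apostrophe, as a word break.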

The alternative answer above gives inconsistent results for words containing apostrophes:
str = string.gsub(" "..str, "%W%l", string.upper):sub(2)
This capitalizes the first letter after every apostrophe, regardless of whether it is the first letter of the word.
E.g. "here's a long list of words" becomes "Here'S A Long List Of Words".
To fix this, I found a clever solution elsewhere that uses this code:
function titleCase( first, rest )
  return first:upper()..rest:lower()
end
str = string.gsub(str, "(%a)([%w_']*)", titleCase)
Because the apostrophe is part of the second capture, it counts as part of the word and the letter after it is left alone. Note that rest:lower() also lowercases the remainder of each word.


I have a feeling I will be returning to this question when I need to put something in proper title case.
Below is the Lua code to do exactly that.
It has the disadvantage of not preserving the original spacing between words but it's good enough for now.
-- Lua is like python in syntax, and barebones like C -_-
function Set (list)
  local set = {}
  for _, l in ipairs(list) do set[l] = true end
  return set
end
function firstToUpper(str)
  -- Parentheses drop gsub's second return value (the substitution count)
  return (str:gsub("^%l", string.upper))
end
function titlecase(str)
  -- We need to break the string into pieces
  local words = {}
  for word in string.gmatch(str, '([^%s]+)') do
    table.insert(words, word)
  end
  -- Nothing to capitalize in an empty or all-whitespace string
  if #words == 0 then return "" end
  -- We need to capitalize anything that is not a:
  -- - Article
  -- - Coordinating Conjunction
  -- - Preposition
  -- Thus we have a blacklist of such words
  local blacklist = Set {
    "at", "but", "by", "down", "for", "from",
    "in", "into", "like", "near", "of", "off",
    "on", "onto", "out", "over", "past", "plus",
    "to", "up", "upon", "with", "nor", "yet",
    "so", "the"
  }
  for index, word in ipairs(words) do
    if not blacklist[word] then
      words[index] = firstToUpper(word)
    end
  end
  -- First and last words are always capitalized
  words[1] = firstToUpper(words[1])
  words[#words] = firstToUpper(words[#words])
  -- Concat elements in list via space character
  return table.concat(words, ' ')
end
print(titlecase("the world"))
print(titlecase("I walked my dog this morning ..."))
print(titlecase("The art of Lua"))
--- Output:
----------------------
--- The World
--- I Walked My Dog This Morning ...
--- The Art of Lua
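The spacing limitation mentioned above can be avoided by rewriting the words in place with gsub instead of splitting and rejoining. The following is only a sketch under stated assumptions: titlecase_keep_spacing and small_words are hypothetical names, and the word list is shortened for brevity.

```lua
-- Abbreviated list of words that stay lowercase (assumption for this sketch).
local small_words = {}
for _, w in ipairs({ "at", "by", "for", "in", "of", "on", "to", "the" }) do
  small_words[w] = true
end

local function titlecase_keep_spacing(str)
  -- Locate the first and last words so they are always capitalized.
  local first_start = str:find("%S")
  local last_start = str:find("%S+%s*$")
  -- "()" is a position capture: it yields the index where each word begins.
  return (str:gsub("()(%S+)", function(pos, word)
    if pos == first_start or pos == last_start or not small_words[word] then
      return (word:gsub("^%l", string.upper))
    end
    -- Returning nothing keeps the original match unchanged.
  end))
end

print(titlecase_keep_spacing("the  art of   Lua"))  --> The  Art of   Lua
```

Because only the matched words are replaced, every run of whitespace between them survives untouched.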

Related

Lua unusual variable name (question mark variable)

I have stumbled upon this line of code and I am not sure what the [ ? ] part represents (my guess is it's a sort of a wildcard but I searched it for a while and couldn't find anything):
['?'] = function() return is_canadian and "eh" or "" end
I understand that RHS is a functional ternary operator. I am curious about the LHS and what it actually is.
Edit: reference (2nd example):
http://lua-users.org/wiki/SwitchStatement
Actually, it is quite simple.
local t = {
a = "aah",
b = "bee",
c = "see",
It maps each letter to its pronunciation: a is pronounced "aah", b is pronounced "bee", and so on. Some letters are pronounced differently in American and Canadian English, so not every letter can be mapped to a single sound.
z = function() return is_canadian and "zed" or "zee" end,
['?'] = function() return is_canadian and "eh" or "" end
In the mapping, the letter z and the character ? are pronounced differently in American and Canadian English. When the program looks up the pronunciation of 'z', it calls a function that checks whether the user wants Canadian English and returns either "zed" or "zee"; for '?', the function returns either "eh" or an empty string.
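A lookup over such a table has to handle both plain strings and functions. Here is a minimal sketch of how that could look; the pronounce helper and the hard-coded is_canadian flag are assumptions made for illustration, not taken from the linked wiki page.

```lua
local is_canadian = true  -- assumed flag, set by hand for this sketch

local t = {
  a = "aah",
  z = function() return is_canadian and "zed" or "zee" end,
  ['?'] = function() return is_canadian and "eh" or "" end,
}

-- Hypothetical helper: call the entry if it is a function, otherwise return it as-is.
local function pronounce(key)
  local entry = t[key]
  if type(entry) == "function" then
    return entry()
  end
  return entry
end

print(pronounce("a"), pronounce("z"), pronounce("?"))  --> aah	zed	eh
```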
Finally, the 2 following notations have the same meaning:
local t1 = {
a = "aah",
b = "bee",
["?"] = "bee"
}
local t2 = {
["a"] = "aah",
["b"] = "bee",
["?"] = "bee"
}
If you look closely at the code linked in the question, you'll see that this line is part of a table constructor (the part inside {}). It is not a full statement on its own. As mentioned in the comments, it would be a syntax error outside of a table constructor. ['?'] is simply a string key.
The other posts already explained what that code does, so let me explain why it needs to be written that way.
['?'] = function() return is_canadian and "eh" or "" end is embedded in {}
It is part of a table constructor and assigns a function value to the string key '?'
local tbl = {a = 1} is syntactic sugar for local tbl = {['a'] = 1} or
local tbl = {}
tbl['a'] = 1
String keys that allow this convenient syntax must be valid Lua names: they may only contain letters, digits, and underscores, must not start with a digit, and must not be a reserved word.
So local a = {? = 1} is not possible; it causes the syntax error "unexpected symbol near '?'". Therefore you have to provide the key explicitly as a string in square brackets, as in local a = {['?'] = 1}.
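The rule can be checked directly. This sketch assumes nothing beyond the standard library; the compile alias is only there so the same code runs on Lua 5.1 (loadstring) and later versions (load).

```lua
-- Name-style keys and bracketed string keys refer to the same entries.
local t = { a = 1, ["b"] = 2, ["?"] = 3, ["end"] = 4 }
print(t.a, t["a"])      --> 1	1
print(t.b, t["?"])      --> 2	3
print(t["end"])         --> 4  (reserved words also need the bracket form)

-- Compiling the shorthand with an invalid name fails with a syntax error.
local compile = loadstring or load
local chunk, err = compile("return { ? = 1 }")
print(chunk, err)       -- nil, plus a message mentioning "unexpected symbol"
```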
They also gave each table element its own line:
local a = {
1,
2,
3
}
This greatly improves readability for long table elements or very long tables and allows you to maintain a maximum line length.
You'll agree that
local tbl = {
z = function() return is_canadian and "zed" or "zee" end,
['?'] = function() return is_canadian and "eh" or "" end
}
looks a lot cleaner than
local tbl = {z = function() return is_canadian and "zed" or "zee" end,['?'] = function() return is_canadian and "eh" or "" end}

Trying to make function which takes string as input and returns no. of words in whole string

It takes a string as input, such as 'Nice one', and the output should be 4,3 (the number of characters in each word of the string).
function countx(str)
local count = {}
for i = 1, string.len(str) do
s = ''
while (i<=string.len(str) and string.sub(str, i, i) ~= ' ' ) do
s = s .. string.sub(str, i, i)
i = i+1
end
if (string.len(s)>0) then
table.insert(count,string.len(s))
end
end
return table.concat(count, ',')
end
You can find a simple alternative with your new requirements:
function CountWordLength (String)
  local Results = { }
  local Continue = true
  local Position = 1
  local SpacePosition
  while Continue do
    SpacePosition = string.find(String, " ", Position)
    if SpacePosition then
      Results[#Results + 1] = SpacePosition - Position
      -- if needed to print the word
      -- local SubString = String:sub(Position, SpacePosition - 1)
      -- print(SubString)
      Position = SpacePosition + 1
    else
      Continue = false
    end
  end
  -- The last word runs from Position to the end of the string
  Results[#Results + 1] = #String - Position + 1
  return Results
end
local Results = CountWordLength('I am a boy')
for Index, Value in ipairs(Results) do
  print(Value)
end
Which gives the following results:
1
2
1
3
def countLenWords(s):
    s = s.split(" ")
    s = map(len, s)
    s = map(str, s)
    s = list(s)
    return s
The above function returns a list containing the number of characters in each word (as strings).
s=s.split(" ") splits string with delimiter " " (space)
s=map(len,s) maps the words into length of the words in int
s=map(str,s) maps the values into string
s=list(s) converts map object to list
Short version of the above function (all in one line):
def countLenWords(s):
    return list(map(str, map(len, s.split(" "))))
-- Localise for performance.
local insert = table.insert
local text = 'I am a poor boy straight. I do not need sympathy'
local function word_lengths (text)
local lengths = {}
for word in text:gmatch '[%l%u]+' do
insert (lengths, word:len())
end
return lengths
end
print ('{' .. table.concat (word_lengths (text), ', ') .. '}')
gmatch returns an iterator over matches of a pattern in a string.
[%l%u]+ is a Lua pattern (see http://lua-users.org/wiki/PatternsTutorial) matching at least one lowercase or uppercase letter:
[] is a character class: a set of characters. It matches anything inside brackets, e.g. [ab] will match both a and b,
%l is any lowercase Latin letter,
%u is any uppercase Latin letter,
+ means one or more repeats.
Therefore, text:gmatch '[%l%u]+' returns an iterator that produces the words (runs of Latin letters) one by one until the text is exhausted. This iterator is used in a generic for (see https://www.lua.org/pil/4.3.5.html); on each iteration, word contains a full match of the pattern.

Lua need to split at comma

I've googled and I'm just not getting it. Seems like such a simple function, but of course Lua doesn't have it.
In Python I would do
string = "cat,dog"
one, two = string.split(",")
and then I would have two variables: one = "cat" and two = "dog".
How do I do this in Lua!?
Try this
str = 'cat,dog'
for word in string.gmatch(str, '([^,]+)') do
print(word)
end
'[^,]' means "any character except a comma", and the + sign means "one or more of them". The parentheses create a capture (not really needed in this case).
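To get closer to the Python one-liner from the question, the matches can be collected into a table and unpacked into variables. This is a sketch only: split is a hypothetical helper, and it assumes the separator is a single character with no special meaning inside a pattern set.

```lua
local unpack = table.unpack or unpack  -- table.unpack in Lua 5.2+, unpack in 5.1

-- Hypothetical helper: collect the non-empty fields between separators.
local function split(s, sep)
  local fields = {}
  for field in s:gmatch("([^" .. sep .. "]+)") do
    fields[#fields + 1] = field
  end
  return fields
end

local one, two = unpack(split("cat,dog", ","))
print(one, two)  --> cat	dog
```

Since the pattern needs at least one character per field, empty fields (as in "a,,b") are skipped rather than returned as empty strings.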
If you can use libraries, the answer is (as often in Lua) to use Penlight.
If Penlight is too heavy for you and you just want to split a string with a single comma like in your example, you can do something like this:
-- named str so the global string library is not shadowed
str = "cat,dog"
one, two = str:match("([^,]+),([^,]+)")
Add this split function on the top of your page:
function string:split( inSplitPattern, outResults )
  if not outResults then
    outResults = { }
  end
  local theStart = 1
  local theSplitStart, theSplitEnd = string.find( self, inSplitPattern, theStart )
  while theSplitStart do
    table.insert( outResults, string.sub( self, theStart, theSplitStart-1 ) )
    theStart = theSplitEnd + 1
    theSplitStart, theSplitEnd = string.find( self, inSplitPattern, theStart )
  end
  table.insert( outResults, string.sub( self, theStart ) )
  return outResults
end
Then do as follows:
local myString = "Flintstone, Fred, 101 Rockledge, Bedrock, 98775, 555-555-1212"
local myTable = myString:split(", ")
for i = 1, #myTable do
print( myTable[i] ) -- This will give your needed output
end
For more information, see the tutorial "Lua String Magic".
-- like C's strtok: splits on one or more delimiter characters (finds every substring not containing any of the delimiters)
function split(source, delimiters)
  local elements = {}
  local pattern = '([^'..delimiters..']+)'
  string.gsub(source, pattern, function(value) elements[#elements + 1] = value end)
  return elements
end
-- example: local elements = split("bye# bye, miss$ american# pie", ",#$# ")
-- returns a table containing "bye", "bye", "miss", "american", "pie"
To also handle optional white space you can do:
str = "cat,dog,mouse, horse"
for word in str:gmatch('[^,%s]+') do
print(word)
end
Output will be:
cat
dog
mouse
horse
This is how I do that on mediawiki:
str = "cat,dog"
local result = mw.text.split(str,"%s*,%s*")
-- result[1] will give "cat", result[2] will give "dog"
Actually, if you don't care about spaces, you can use:
str = "cat,dog"
local result = mw.text.split(str,",")
-- result[1] will give "cat", result[2] will give "dog"
The API used here is implemented in the Scribunto MediaWiki extension. The split() method is described in the Scribunto reference documentation, and its source code is available as well. It relies on a lot of other capabilities in Scribunto's Lua common libraries, so it will only work for you if you are actually using MediaWiki or plan to import most of the Scribunto common library.
Functions like string.split() are largely unnecessary in Lua since you can
express string operations in LPEG.
If you still need a dedicated function, a convenient approach is
to define a splitter factory (mk_splitter() in the snippet below)
from which you can then derive custom splitters.
local lpeg = require "lpeg"
local lpegmatch = lpeg.match
local P, C = lpeg.P, lpeg.C
local mk_splitter = function (pat)
  if not pat then
    return
  end
  pat = P (pat)
  local nopat = 1 - pat
  local splitter = (pat + C (nopat^1))^0
  return function (str)
    return lpegmatch (splitter, str)
  end
end
The advantage of using LPEG is that the function accepts
both plain Lua strings and LPEG patterns as its argument.
Here is how you would use it to create a function that
splits strings at the , character:
commasplitter = mk_splitter ","
print (commasplitter [[foo, bar, baz, xyzzy,]])
print (commasplitter [[a,b,c,d,e,f,g,h]])

Lua String Split

Hi I've got this function in JavaScript:
function blur(data) {
var trimdata = trim(data);
var dataSplit = trimdata.split(" ");
var lastWord = dataSplit.pop();
var toBlur = dataSplit.join(" ");
}
What this does is take a string such as "Hello my name is bob" and return
toBlur = "Hello my name is" and lastWord = "bob"
Is there a way I can rewrite this in Lua?
You could use Lua's pattern matching facilities:
function blur(data)
  return string.match(data, "^(.*)[ ][^ ]*$")
end
How does the pattern work?
^      # start matching at the beginning of the string
(      # open a capturing group ... what is matched inside will be returned
.*     # as many arbitrary characters as possible
)      # end of capturing group
[ ]    # a single literal space (you could omit the square brackets, but I think
       # they increase readability)
[^ ]*  # match anything BUT literal spaces... as many as possible
$      # marks the end of the input string
So [ ][^ ]*$ has to match the last word and the preceding space. Therefore, (.*) will return everything in front of it.
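If the last word is needed as well, as in the JavaScript original, a second capture can return it. The following is a sketch under assumptions: blur_both is a hypothetical name, and the first line stands in for the JavaScript trim() call.

```lua
local function blur_both(data)
  -- Strip leading and trailing whitespace, mirroring trim() in the JavaScript.
  data = data:match("^%s*(.-)%s*$")
  -- First capture: everything before the last space; second: the last word.
  return data:match("^(.*) ([^ ]*)$")
end

local toBlur, lastWord = blur_both("Hello my name is bob")
print(toBlur)    --> Hello my name is
print(lastWord)  --> bob
```

When the input contains no space at all, the second match fails and the function returns nil, so callers may want to handle that case separately.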
For a more direct translation of your JavaScript, first note that there is no split function in Lua. There is table.concat though, which works like join. Since you have to do the splitting manually, you'll probably use a pattern again:
function blur(data)
  local words = {}
  for m in string.gmatch(data, "[^ ]+") do
    words[#words+1] = m
  end
  words[#words] = nil -- pops the last word
  return table.concat(words, " ")
end
gmatch does not give you a table right away, but an iterator over all matches instead. So you add them to your own temporary table, and call concat on that. words[#words+1] = ... is a Lua idiom to append an element to the end of an array.
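For a version that also hands back the popped word, table.remove without an index removes and returns the last element, much like pop() in JavaScript. This is a minimal sketch; blur_words is a hypothetical name.

```lua
local function blur_words(data)
  local words = {}
  for word in data:gmatch("[^ ]+") do
    words[#words + 1] = word
  end
  -- table.remove with no position removes and returns the last element.
  local lastWord = table.remove(words)
  return table.concat(words, " "), lastWord
end

local toBlur, lastWord = blur_words("Hello my name is bob")
print(toBlur)    --> Hello my name is
print(lastWord)  --> bob
```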

Ruby function to merge two string into one

Given two strings like the ones below, I would like to merge them to generate the following. The result makes little sense; however, both strings have 'a sentence' in common, which is what counts as the connector between the two strings:
"This is a sentence is a great thing"
s1 = "This is a sentence"
s2 = "a sentence is a great thing"
Is there a function for this in ruby?
Here's a solution that works.
def str_with_overlap(s1, s2)
  result = nil
  (0...(s2.length)).each do |idx|
    break result = s1 + s2[(idx + 1)..-1] if s1.end_with?(s2[0..idx])
  end
  result
end
str_with_overlap("This is a sentence", "a sentence is a great thing")
# => This is a sentence is a great thing
As far as I know, there is no built-in function for this in Ruby.
You probably have to write your own function for this. The straightforward approach runs in quadratic time in the input length. However, it is possible to do it in linear time in the input size by using this algorithm.
There is no built-in method in Ruby, but you can try this one:
class String
  def merge str
    result = self + str
    for i in 1..[length,str.length].min
      result = self[0,length-i] + str if self[-i,i] == str[0,i]
    end
    result
  end
end
"This is a sentence".merge "a sentence is a great thing"
Functional approach (works at word-level):
ws1, ws2 = [s1, s2].map(&:split)
# idx is the index in ws1 where the overlap with the start of ws2 begins
idx = 0.upto(ws1.size-1).detect { |i| ws1[i..-1] == ws2[0, ws1.size-i] } || ws1.size
(ws1[0, idx] + ws2).join(" ")
# => "This is a sentence is a great thing"
