I want to count the occurrences of each word in a text and output the ten most frequent words with their counts.
I use io.open() to open an input file as a file handle, process it, and put the results in a table. Then I close the input file handle and open a new output file, and try to write the results to it, but it does not work. The code follows.
"ioinput.txt" is the input file, which contains an article, and "iooutput.txt" is the output file.
input_file = io.open("ioinput.txt", r)
--[[
This block of code is to count the number of word,
which has been verified by the print function in the following.
--]]
input_file:close()
output_file = io.open("iooutput.txt", a)
local n = 10
for i = 1, n do
output_file:write(words[i], "\t", counter[words[i]], "\n")
--print(words[i], "\t", counter[words[i]], "\n")
end
output_file:flush()
output_file:close()
Please refer to the Lua 5.4 Reference Manual: io.open
io.open (filename [, mode])
This function opens a file, in the mode specified in the string mode.
In case of success, it returns a new file handle.
The mode string can be any of the following:
"r": read mode (the default);
"w": write mode;
"a": append mode;
"r+": update mode, all previous data is preserved;
"w+": update mode, all previous data is erased;
"a+": append update mode, previous data is preserved, writing is only allowed at the end of file.
The mode string can also have a 'b' at the end, which is needed in
some systems to open the file in binary mode.
Please note that the optional mode is to be provided as a string.
In your code
input_file = io.open("ioinput.txt", r) and output_file = io.open("iooutput.txt", a)
you're passing r and a as modes. Both are undefined variables, so both evaluate to nil. The mode therefore defaults to "r", which is read mode, and you cannot write to a file opened in read mode.
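A minimal sketch of the corrected output step, assuming the file name from the question. The `words` and `counter` values are hypothetical stand-ins for the results of the counting block, which the question omits.

```lua
-- Hypothetical stand-ins for the results of the (omitted) counting block.
local words = { "the", "of", "to" }
local counter = { the = 12, of = 7, to = 5 }

-- The mode must be a string; "w" creates the file or truncates an existing one.
local output_file, err = io.open("iooutput.txt", "w")
if not output_file then
  error("cannot open output file: " .. tostring(err))
end

for i = 1, #words do
  output_file:write(words[i], "\t", counter[words[i]], "\n")
end
output_file:close()
```

Passing "a" instead of "w" would keep the file's earlier content and append to it, which is what the original call appears to intend. Checking the first return value of io.open also surfaces problems such as a missing directory or missing permissions.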
Probably it's an easy thing, but I'm a Lua beginner...
I'm creating a very simple QSC QSYS plugin to control a projection server using KVL API. Server API is based on hex strings.
For example, this command asks the server to load the playlist with the UUID 9bf5455689ed4c019731c6dd3c071f0e:
Controls["LoadSPL"].EventHandler = function()
sock:Write(
"\x06\x0e\x2b\x34\x02\x05\x01\x0a\x0e\x10\x01\x01\x01\x03\x09\x00\x83\x00\x00\x14\x00\x00\x00\x01\x9b\xf5\x45\x56\x89\xed\x4c\x01\x97\x31\xc6\xdd\x3c\x07\x1f\x0e"
)
end
Now I need to be able to create a string with a variable UUID, according to the text indicated in a textbox (or a list of available UUIDs read from the server) in the user interface.
I will concatenate this string to the fixed part of the command.
How can I correctly make a string like
ad17fc696b49454db17d593db3e553e5 become
\xad\x17\xfc\x69\x6b\x49\x45\x4d\xb1\x7d\x59\x3d\xb3\xe5\x53\xe5?
Try this:
local input = "ad17fc696b49454db17d593db3e553e5"
local output = input:gsub("%w%w", function(s) return string.char(tonumber(s, 16)) end)
Explanation: this takes every pair of characters, interprets them as base 16 numeric string, and then takes the character with that number, and uses that to replace the original characters.
EDIT: To make it clear what's going on, and why the other answers are wrong: backslash escape sequences like \xad are a feature of Lua source code. In memory, \xad is a single byte with value 173, just as A is a single byte with value 65. Concatenating a literal backslash character with hexadecimal characters does not create an escape sequence, so the conversion has to be done manually with string.char.
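The difference can be checked directly. The following is a minimal sketch rather than the answer's exact code: it uses %x (hexadecimal digit) in place of %w, and the variable names are illustrative.

```lua
local input = "ad17fc696b49454db17d593db3e553e5"

-- Convert each pair of hex digits into the byte it names.
-- The extra parentheses drop gsub's second return value (the match count).
local bytes = (input:gsub("%x%x", function(pair)
  return string.char(tonumber(pair, 16))
end))

-- Prefixing each pair with a literal backslash and "x" only produces text.
local text = (input:gsub("%x%x", "\\x%0"))

print(#bytes)         -- 16 (raw bytes)
print(#text)          -- 64 (plain characters)
print(bytes:byte(1))  -- 173, i.e. 0xad
```

Only `bytes` is suitable for concatenating onto the fixed part of the command; `text` is what the source code of such a string would look like, not the string itself.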
#! /usr/bin/env lua
str = 'ad17fc696b49454db17d593db3e553e5'
strx = ''
for i = 1, #str, 2 do -- loop through every-other position in your string
chars = str :sub( i, i+1 ) -- capture every 2 chars
strx = strx ..'\\x' ..chars
end -- append a literal backslash, the letter x, then those 2 chars
target = [[\xad\x17\xfc\x69\x6b\x49\x45\x4d\xb1\x7d\x59\x3d\xb3\xe5\x53\xe5]]
print( strx, strx == target ) -- print results, and test if it meets expected target
\xad\x17\xfc\x69\x6b\x49\x45\x4d\xb1\x7d\x59\x3d\xb3\xe5\x53\xe5 true
This can be code-golfed into a one-liner
x=''for i=1,#s,2 do x=x..'\\x'..s:sub(i,i+1)end
Let's suppose that I have this .txt file:
this is line one
hello world
line three
In Lua, I want to create a string containing only the content of line two.
That is, I want to get a specific line from this file and put it into a string, something like:
io.open('file.txt', 'r')
-- reads only line two and put this into a string, like:
local line2 = "hello world"
Lua file handles have the same methods as the io library.
That means file handles have read() with all its options as well.
Example:
local f = io.open("file.txt") -- 'r' is unnecessary because it's a default value.
print(f:read()) -- '*l' is unnecessary because it's a default value.
f:close()
If you want a specific line, you can call f:read() and discard the result until you reach the required line.
But a more proper solution is the f:lines() iterator:
function ReadLine(f, line)
local i = 1 -- line counter
for l in f:lines() do -- lines iterator, "l" returns the line
if i == line then return l end -- we found this line, return it
i = i + 1 -- counting lines
end
return "" -- Doesn't have that line
end
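A usage sketch for the iterator approach, assuming the three-line file from the question. It writes file.txt itself so that it runs standalone. The name `read_line` and the choice to return nil (rather than "") for a missing line are illustrative, not part of the answer.

```lua
-- Create the sample file from the question so the sketch runs on its own.
local path = "file.txt"
local out = assert(io.open(path, "w"))
out:write("this is line one\nhello world\nline three\n")
out:close()

-- Return line number `wanted` from the open file handle `f`,
-- or nil if the file has fewer lines.
local function read_line(f, wanted)
  local i = 0
  for l in f:lines() do
    i = i + 1
    if i == wanted then return l end
  end
  return nil
end

local f = assert(io.open(path, "r"))
local line2 = read_line(f, 2)
f:close()
print(line2)  -- hello world
```

Returning nil lets the caller tell a missing line apart from an empty one, which the "" return value cannot do.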
So I'm writing a Lua script and I tested it but I got an error that I don't know how to fix:
.\search.lua:10: malformed pattern (missing ']')
Below is my code. If you know what I did wrong, it would be very helpful if you could tell me.
weird = "--[[".."\n"
function readAll(file)
local c = io.open(file, "rb")
local j = c:read("*all")
c:close()
return(j)
end
function blockActive()
local fc = readAll("functions.lua")
if string.find(fc,weird) ~= nil then
require("blockDeactivated")
return("false")
else
return("true")
end
end
print(blockActive())
Edit: first comment had the answer. I changed
weird = "--[[".."\n" to weird = "%-%-%[%[".."\r" The \n to \r change was because it was actually supposed to be that way in the first place.
This errors because string.find uses Lua Patterns.
Most non-alpha-numeric characters, such as "[", ".", "-" etc. convey special meaning.
string.find(fc,weird), or better, fc:find(weird) is trying to parse these special characters, and erroring.
However, you can use a pattern to escape the special characters in your string.
weird = ("--[["):gsub("%W","%%%0") .. "\r?\n"
This is a little daunting, but it will hopefully make sense.
("--[[") is the original first part of your weird string, unchanged.
:gsub() is a function that replaces every match of a pattern with a replacement. Once again, see Patterns.
"%W" is a pattern that matches any single character that isn't a letter or a digit.
"%%%0" replaces each match with a % (written %% because % itself must be escaped) followed by the match itself (%0 stands for the whole match).
So this means that --[[ will be turned into %-%-%[%[, which is how find and similar pattern functions escape special characters.
The reason \n is now \r?\n refers back to these patterns. It still matches if the line ends with \n, like it did before. However, if this is running on Windows, a newline might look like \r\n. (You can read up on this HERE). A ? following a character, \r in this case, means that character is optional. So this matches both --[[\n and --[[\r\n, supporting both Windows and Linux.
Now, when you run your fc:find(weird), it's running fc:find("%-%-%[%[\r?\n"), which should be exactly what you want.
Hope this has helped!
Finished code if you're a bit lazy
weird = ("--[["):gsub("%W","%%%0") .. "\r?\n" -- Escape "--[[", add a newline. Used in our find.
-- readAll(file)
-- Takes a string as input representing a filename, returns the entire contents as a string.
function readAll(file)
local c = io.open(file, "rb") -- Open the file specified by the argument. Read-only, binary (no translation of \r\n).
local j = c:read("*all") -- Read the contents of the file into a string.
c:close() -- Close the file, free up resources.
return j -- Return the contents as a string.
end
-- blockActive()
-- Returns false if the weird string was matched in 'functions.lua' (and executes 'blockDeactivated.lua' in that case), true otherwise.
function blockActive()
local fc = readAll("functions.lua") -- Read the contents of 'functions.lua' into a string.
if fc:find(weird) then -- If functions.lua contains the blocker.
require("blockDeactivated") -- Require (thus execute; consider loadfile instead) 'blockDeactivated.lua'.
return false -- Return false.
else
return true -- Return true.
end
end
print(blockActive()) -- Test the blockActive code.
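As an alternative to escaping, string.find accepts a fourth argument that turns off pattern matching entirely. A minimal sketch comparing the two, with a hypothetical in-memory string standing in for the contents of functions.lua:

```lua
-- Hypothetical stand-in for the contents of functions.lua.
local fc = "local x = 1\r\n--[[\r\nfunction hidden() end\r\n--]]\r\n"

-- Escaped pattern, as in the answer: "--[[" followed by \n or \r\n.
local escaped = ("--[["):gsub("%W", "%%%0") .. "\r?\n"
print(escaped == "%-%-%[%[\r?\n")  -- true
print(fc:find(escaped))            -- 14  19

-- Plain search: passing true as the fourth argument disables pattern
-- matching, so "--[[" needs no escaping (but "\r?" is no longer available).
print(fc:find("--[[", 1, true))    -- 14  17
```

The plain search is simpler when only the literal "--[[" matters. The escaped pattern is still needed when the match has to include an optional carriage return before the newline.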
I would like to use FORTRAN streaming I/O to make a program that tells me how many lines a text-file has. The idea is to make something like this:
OPEN(UNIT=10,ACCESS='STREAM',FILE='testfile.txt')
nLines=0
bContinue=.TRUE.
DO WHILE (bContinue)
READ(UNIT=10) cCharacter
IF (cCharacter.EQ.{EOL-char}) nLines=nLines+1
IF (cCharacter.EQ.{EOF-char}) bContinue=.FALSE.
ENDDO
(I didn't include the variable declarations, but I think you get the idea of what they are; the only important clarification is that cCharacter has LEN=1)
My problem is that I don't know how to check if the character I just read from the file is an end-of-line or end-of-file (the "ifs" in the code). When you read and print characters this way, you eventually get newlines in the same place you had them in the original text, so I think it does read and recognize them as "characters", somehow. Perhaps turning the characters into integers and comparing to the appropriate number? Or is there a more direct way?
(I know that you can use the register reading (EDIT: I meant record reading) to do a program that reads lines more easily and add an IOstatus to check for eof, but the "line counter" is just a useful example, the idea is to learn how to move in a more controlled way through a textfile)
Checking for a specific character as line terminator makes your program OS dependent. It would be better to use the facilities of the language so that your program is compiler and OS independent. Since lines are basically records, why do this with stream I/O? That request seems to make an easy job into a hard one. If you can use regular I/O, here is an example program to count the lines in a text file.
EDIT: the code fragment was changed into a program to answer questions in the comments. With "line" as a character variable, when I test the program with gfortran and ifort I don't see a problem when the input file has empty or blank lines.
program test_lc
use, intrinsic :: iso_fortran_env
integer :: LineCount, Read_Code
character (len=200) :: line
open (unit=51, file="temp.txt", status="old", access='sequential', form='formatted', action='read' )
LineCount = 0
ReadLoop: do
read (51, '(A)', iostat=Read_Code) line
if ( Read_Code /= 0 ) then
if ( Read_Code == iostat_end ) then
exit ReadLoop ! end of file --> line count found
else
write ( *, '( / "read error: ", I0 )' ) Read_Code
stop
end if
end if
LineCount = LineCount + 1
write (*, '( I0, ": ''", A, "''" )' ) LineCount, trim (line)
if ( len_trim (line) == 0 ) write (*, '("The above is an empty or all blank line.")' )
end do ReadLoop
write (*, *) "found", LineCount, " lines"
end program test_lc
If you want to do further processing of the file, you can rewind it.
P.S.
The main reason that I have used Fortran Stream IO is to read files produced by other languages, e.g., C
Portable methods are provided to write new-line boundaries; I'm not aware of a portable method to test for such.
I'm writing an Applescript to parse an iOS Localization file (/en.lproj/Localizable.strings), translate the values and output the translation (/fr.lproj/Localizable.strings) to disk in UTF-16 (Unicode) encoding.
For some reason, the generated file has an extra space between every letter. After some digging, I found the cause of the problem in Learn AppleScript: The Comprehensive Guide to Scripting.
"If you accidently read a UTF-16 file
as MacRoman, the resulting value may
look at first glance like an ordinary
string, especially if it contains
English text. You'll quickly discover
that something is very wrong when you
try to use it, however: a common
symptom is that each visible character
in your "string" seems to have an
invisible character in front of it.
For example, reading a UTF-16 encoded
text file containing the phrase "Hello
World!" as a string produces a string
like " H e l l o W o r l d ! ", where
each " " is really an invisible ASCII
0 character."
So for example my English localization string file has:
"Yes" = "Yes";
And the generated French localization string file has:
" Y e s " = " O u i " ;
Here is my createFile method:
on createFile(fileFolder, fileName)
tell application "Finder"
if (exists file fileName of folder fileFolder) then
set the fileAccess to open for access file fileName of folder fileFolder with write permission
set eof of fileAccess to 0
write ((ASCII character 254) & (ASCII character 255)) to fileAccess starting at 0
--write «data rdatFEFF» to fileAccess starting at 0
close access the fileAccess
else
set the filePath to make new file at fileFolder with properties {name:fileName}
set the fileAccess to open for access file fileName of folder fileFolder with write permission
write ((ASCII character 254) & (ASCII character 255)) to fileAccess starting at 0
--write «data rdatFEFF» to fileAccess starting at 0
close access the fileAccess
end if
return file fileName of folder fileFolder as text
end tell
end createFile
And here is my writeFile method:
on writeFile(filePath, newLine)
tell application "Finder"
try
set targetFileAccess to open for access file filePath with write permission
write newLine to targetFileAccess as Unicode text starting at eof
close access the targetFileAccess
return true
on error
try
close access file filePath
end try
return false
end try
end tell
end writeFile
Any idea what I'm doing wrong?
Here are the handlers I use to read and write UTF-16. You don't need a separate "create file" handler; the write handler will create the file if it doesn't exist. Set the "appendText" parameter to true or false: false means overwrite the file, and true means add the new text to the end of the current text in the file. I hope this helps.
on writeTo_UTF16(targetFile, theText, appendText)
try
set targetFile to targetFile as text
set openFile to open for access file targetFile with write permission
if appendText is false then
set eof of openFile to 0
write (ASCII character 254) & (ASCII character 255) to openFile starting at eof -- UTF-16 BOM
else
tell application "Finder" to set fileExists to exists file targetFile
if fileExists is false then
set eof of openFile to 0
write (ASCII character 254) & (ASCII character 255) to openFile starting at eof -- UTF-16 BOM
end if
end if
write theText to openFile starting at eof as Unicode text
close access openFile
return true
on error theError
try
close access file targetFile
end try
return theError
end try
end writeTo_UTF16
on readFrom_UTF16(targetFile)
try
set targetFile to targetFile as text
targetFile as alias -- if file doesn't exist then you get an error
set openFile to open for access file targetFile
set theText to read openFile as Unicode text
close access openFile
return theText
on error
try
close access file targetFile
end try
return false
end try
end readFrom_UTF16
If you're getting actual spaces between every character, you've probably got the '(characters i thru j of someText) as string' anti-pattern in your code [1]. That will split a string into a list of characters, then coerce it back into a string with your current text item delimiter inserted between each character. The correct (i.e. fast and safe) way to get a sub-string is this: 'text i thru j of someText' (p179-181).
OTOH, if you are getting invisible characters between each character [2], then yes, that'll be an encoding issue, typically reading a UTF16-encoded file using MacRoman or other single-byte encoding. If your file has a valid Byte Order Mark then any Unicode-savvy text editor should read it using the correct encoding.
[1] p179 states that this idiom is unsafe, but forgets to provide a practical demonstration of the problems it causes. [3]
[2] IIRC the example on p501 was meant to use rectangle symbols to represent invisible characters, i.e. "⃞H⃞e⃞l⃞l⃞o" not " H e l l o", but didn't come out quite that way so might be misread as meaning visible spaces. [3]
[3] Feel free to submit errata to Apress.
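The invisible-character effect described above can be reproduced outside AppleScript. A minimal sketch in Lua, assuming ASCII-only input and big-endian UTF-16 (the byte order implied by the FE FF mark written in the handlers); `to_utf16be` is a hypothetical helper name.

```lua
-- Hypothetical helper: encode an ASCII-only string as UTF-16BE with a BOM.
local function to_utf16be(ascii)
  -- Each ASCII character becomes a zero byte followed by the character.
  local body = ascii:gsub(".", function(c) return "\0" .. c end)
  return string.char(0xFE, 0xFF) .. body
end

local encoded = to_utf16be("Yes")

-- Looking at the same data one byte at a time, as a single-byte encoding
-- would, shows a zero byte in front of every visible character.
local shown = {}
for i = 1, #encoded do
  shown[#shown + 1] = string.format("%02X", encoded:byte(i))
end
print(table.concat(shown, " "))  -- FE FF 00 59 00 65 00 73
```

An editor that decodes these bytes as MacRoman or another single-byte encoding displays each 00 as an invisible character, which is the symptom the book passage describes.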