I made an AppleScript earlier today that displays the subscriber count of a YouTube channel for GeekTool, but I wanted it to be easier for people to use, so I tried to make it work off the name of the file (e.g. taking subcount-PewDiePie.scpt and outputting PewDiePie's sub count). Reading the name from the file name works, but it's giving me errors when I try to take the number out of the API's response.
The working (original) code:
set apiResponse to (do shell script "curl -s 'https://www.googleapis.com/youtube/v3/channels?part=statistics&forUsername=PewDiePie&fields=items%2Fstatistics%2FsubscriberCount&key=AIzaSyAEQGj2ZcDrTU0ZqzteD8eDVJwB9cpmvEo'")
on returnNumbersInString(inputString)
set s to quoted form of inputString
do shell script "sed s/[a-zA-Z\\']//g <<< " & s
set dx to the result
set numlist to {}
repeat with i from 1 to count of words in dx
set this_item to word i of dx
try
set this_item to this_item as number
set the end of numlist to this_item
end try
end repeat
end returnNumbersInString
returnNumbersInString(apiResponse)
The broken customizable code:
set channelName to path to me as text
set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to {"subcount-"}
set nameFilter to text items of channelName
set channelName to item 2 of nameFilter
set AppleScript's text item delimiters to {"."}
set nameFilter to the text items of channelName
set channelName to item 1 of nameFilter
set curlLink to "https://www.googleapis.com/youtube/v3/channels?part=statistics&forUsername=" & channelName & "&fields=items%2Fstatistics%2FsubscriberCount&key=AIzaSyAEQGj2ZcDrTU0ZqzteD8eDVJwB9cpmvEo"
set curlCommand to "curl -s " & (quoted form of curlLink)
set apiResponse to {do shell script curlCommand}
on returnNumbersInString(inputString)
set s to quoted form of inputString
do shell script "sed s/[a-zA-Z\\']//g <<< " & s
set dx to the result
set numlist to {}
repeat with i from 1 to count of words in dx
set this_item to word i of dx
try
set this_item to this_item as number
set the end of numlist to this_item
end try
end repeat
end returnNumbersInString
returnNumbersInString(apiResponse)
Every time I run the second one it outputs this error:
Can’t get quoted form of {"{
\"items\": [
{
\"statistics\": {
\"subscriberCount\": \"76957805\"
}
}
]
}"}.
It's failing immediately after it gets the info from the website, which doesn't make any sense to me, because none of the code beyond how it builds the link has been changed. Can anyone help me resolve this?
You've enclosed your do shell script command in braces here:
set apiResponse to {do shell script curlCommand}
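Therefore, apiResponse is now a list containing a JSON string, instead of simply a JSON string. quoted form is a property of text, not of lists, which is why "quoted form of inputString" fails inside the handler. Remove the braces so the line reads: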
set apiResponse to do shell script curlCommand
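As a side note, the number could also be extracted in the shell itself rather than by looping over words. The following is only a minimal sketch, assuming the response keeps the shape shown in the error message above (the count is a quoted string of digits on its own line); extract_sub_count is a hypothetical helper name, not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper: print the digits of the "subscriberCount" value
# found in the JSON text passed as $1.
# Assumption: the count is a quoted run of digits, as in the question.
extract_sub_count() {
    printf '%s\n' "$1" |
        sed -n 's/.*"subscriberCount": *"\([0-9][0-9]*\)".*/\1/p'
}

# Sample response copied from the error message in the question.
response='{
 "items": [
  {
   "statistics": {
    "subscriberCount": "76957805"
   }
  }
 ]
}'

extract_sub_count "$response"   # prints 76957805
```

The same sed expression could be appended to the curl command with a pipe, so that do shell script returns only the number.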
I have an AppleScript that is supposed to take its input from the most recently copied text, which is a comma-separated list. I want the script to recognise the copied text as a list.
Here is my example.
I have a list of names:
Apple Watch
iPhone
iPad
macBook
To have it recognised as a list, I rewrote it as a readable list like this:
"Apple Watch","iPhone","iPad","macBook"
and copied it to my clipboard, expecting it to be used automatically as the input to my code. Unfortunately the code doesn't treat each string separately and outputs the whole text at once, like this: "Apple Watch","iPhone","iPad","macBook" instead of this:
Apple Watch
iPhone
iPad
macBook
tell application "Safari"
activate
set Storage to get clipboard
set theList to {Storage}
tell application "System Events"
set varX to 1
set condition to 0
repeat until condition = length of theList
set varName to item varX of theList
keystroke of varName
delay 0.2
keystroke return
delay 0.2
set varX to varX + 1
set condition to condition + 1
end repeat
end tell
end tell
The same code works the way I need if I paste the list directly in place of Storage,
but I need it to happen automatically, without pasting the list into the script every time I open it.
I apologise for being so wordy.
Can anybody please give me a solution?
As I understand it, you have the text "Apple Watch","iPhone","iPad","macBook" in the clipboard, and you want to keystroke it line by line into the Safari window. If that is the case, you just have to replace the commas with returns in the clipboard's text, then keystroke the whole resulting text:
set Storage to the clipboard
set {ATID, AppleScript's text item delimiters} to {AppleScript's text item delimiters, ","}
set theList to text items of Storage
set AppleScript's text item delimiters to return
set theList to theList as text
set AppleScript's text item delimiters to ATID
tell application "System Events"
set frontmost of process "Safari" to true
keystroke theList
end tell
If your task is to fill in individual fields of some form in Safari, then time delays are needed there and the text should be sent in parts. More or less like this:
set Storage to the clipboard
set {ATID, AppleScript's text item delimiters} to {AppleScript's text item delimiters, ","}
set theList to text items of Storage
set AppleScript's text item delimiters to ATID
tell application "System Events"
set frontmost of process "Safari" to true
repeat with anItem in theList
keystroke (contents of anItem) & return
delay 0.2
end repeat
end tell
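Both scripts above type the quotation marks along with the names. If the quotes should be dropped as well, the splitting could be handed to the shell. This is a rough sketch only, assuming no name contains a comma or a double quote; split_quoted_list is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: turn "A","B","C" into one unquoted name per line.
# Assumption: names never contain commas or double quotes themselves.
split_quoted_list() {
    printf '%s\n' "$1" | tr ',' '\n' | tr -d '"'
}

split_quoted_list '"Apple Watch","iPhone","iPad","macBook"'
```

On macOS the input could come from pbpaste instead of a literal string, and the result could be read back in AppleScript with paragraphs of (do shell script ...).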
I have a text file to process, with some example content as follows:
[FCT-FCTVALUEXXXX-IA]
Name=value
Label = value
Zero or more lines of text
Abbr=
Zero or more lines of text
Field A=1
Field B=0
Zero or more lines of text
Hidden=N
[Text-FCT-FCTVALUEXXXX-IA-Note]
One or more note lines
[FCT-FCT-FCTVALUEZ-IE-DETAIL]
Zero or more lines of text
[FCT-FCT-FCTVALUEQ-IA-DETAIL]
Zero or more lines of text
[FCT-_FCTVALUEY-IA]
Name=value
Zero or more lines of text
Label=value
Zero or more lines of text
Field A=1
Abbr=value
Field A=1
Zero or more lines of text
Hidden=N
I need to find sections like this:
[FCT-FCTVALUEXXXX-IA]
Name=value
Label = value
Zero or more lines of text
Abbr=
Zero or more lines of text
Field A=1
Field B=0
Zero or more lines of text
Hidden=N
and extract FCT-FCTVALUEXXXX-AA, Name, Label, Abbr, Field A and B and Hidden, and then find a corresponding section (if it exists):
[Text-FCT-FCTVALUEXXXX-IA-Note]
One or more note lines
and extract the Note lines as a single string.
I don't care about the sections
[FCT-FCT-FCTVALUEZ-IE-DETAIL]
Zero or more lines of text
All three sorts of sections can appear anywhere in the file, including right at the end, and there's no predictable relationship in position between the sections.
The order of Abbr and Fields A and B cannot be guaranteed but they always appear after Name and Label and before Hidden.
What I have so far:
strParse = "(%[FCT%-.-%-)([IF])([EA])%]%c+Name=(.-)%c.-Label=(.-)%c(.-)Hidden=(%a)%c" --cant pull everything out at once because the order of some fields is not predictable
for id, rt, ft, name, label, detail, hidden in strFacts:gmatch(strParse) do
--extract details
abbr=detail:match("Abbr=(.-)%c") --may be blank
if abbr == nil then abbr = "" end
FieldA = detail:match("Field A=(%d)")
FieldB = detail:match("Field B=(%d)")
--need to sanitise id which could have a bunch of extraneous material tacked on the front and use it to get the Note
ident=id:match(".*(%[FCT%-.-%-$)")..rt..ft
note = ParseAutonote(ident) --this is a function to parse the note which I've yet to test so a dummy function returns ""; lower-case name so it matches the variable stored in tblResults below
tblResults[name]={ident, rt, ft, name, label, abbr, FieldA, FieldB, hidden, note}
end
Most of it works OK (after many hours of working on it), but the piece that isn't working is:
(".*(%[FCT%-.-%-$)")
which is supposed to pull out the final occurrence of FCT-sometext- in the string id.
My logic: anchor the search to the end of the string and capture the shortest possible string beginning with "[FCT-" and ending with "-" at the end of the string.
Given a value of either "[FCT-_ABCD-PDQR-" or
"[FCT-XYZ-DETAIL]lines of text[FCT-_ABCD-PDQR-" it returns nil when I want it to return "FCT-_ABCD-PDQR-". (Note ABCD, PDQR etc can be any length of text containing Alpha, - and _).
As you discovered yourself, (".*(%[FCT%-.-%-)$") works the way you want, whereas (".*(%[FCT%-.-%-$)") does not. $ and ^ act as anchors only when they come at the very end or the very beginning of the pattern; they cannot appear inside a capture.
When the anchor characters appear anywhere else in the pattern they are treated as literal characters to be matched, except where ^ opens a set to negate it, e.g. [^A-Z] matches any character other than an upper-case letter.
Here are examples of the pattern matching, using an example string and the pattern from your question.
print(string.match("[FCT-_ABCD-PDQR-", (".*(%[FCT%-.-%-$)"))) -- initial pattern
> nil
print(string.match("[FCT-_ABCD-PDQR-$", (".*(%[FCT%-.-%-$)"))) -- $ added to end of string
> [FCT-_ABCD-PDQR-$
print(string.match("[FCT-_ABCD-PDQR-", (".*(%[FCT%-.-%-)$"))) -- $ moved to end of pattern
> [FCT-_ABCD-PDQR-
I have a problem trying to execute shell scripts from AppleScript. I run a "grep", but as soon as the search string contains special characters it doesn't work as intended.
(The script reads a list of subfolders in a directory and checks if any of the subfolder names appear in a file.)
Here is my script:
set searchFile to "/tmp/output.txt"
set theCommand to "/usr/local/bin/pdftotext -enc UTF-8 some.pdf" & space & searchFile
do shell script theCommand
tell application "Finder"
set companies to get name of folders of folder ("/path/" as POSIX file)
end tell
repeat with company in companies
set theCommand to "grep -c " & quoted form of company & space & quoted form of searchFile
try
do shell script theCommand
set CompanyName to company as string
return CompanyName
on error
end try
end repeat
return false
The problem occurs, for example, with strings containing umlauts. "theCommand" is somehow encoded differently than when I type it on the CLI directly.
$ grep -c 'Württemberg' '/tmp/output.txt' --> typed on command line
3
$ grep -c 'Württemberg' '/tmp/output.txt' --> copy & pasted from AppleScript
0
$ grep -c 'rttemberg' '/tmp/output.txt' --> no umlauts, no problems
3
The "ü" in the first and the second line are different; an echo 'Württemberg' | openssl base64 shows this.
I tried several encoding tricks at different places, basically everything I could find or think of.
Does anyone have any idea? How can I check which encoding a string has?
Thanks in advance!
Sebastian
Overview
This can work by escaping each character that has an accent in each company name before they are used in the grep command.
So, you'll need to escape each one of those characters (i.e. those which have an accent) with double backslashes (i.e. \\). For example:
The ü in Württemberg will need to become \\ü
The ö in Königsberg will need to become \\ö
The ß in Einbahnstraße will need to become \\ß
Why is this necessary:
These accented characters, such as a u with diaeresis, are certainly getting encoded differently. Which encoding they receive is difficult to ascertain. My assumption is that the encoding used begins with a backslash, which would explain why escaping those characters with backslashes fixes the issue. For instance, in C/C++ source code the ü is written as \u00FC.
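To check how a given string is actually encoded, one option is to dump its raw bytes. The sketch below rests on an assumption that is not established above: that the two strings differ in Unicode normalization, with one "ü" precomposed (a single code point, U+00FC) and the other decomposed (a plain u followed by the combining diaeresis U+0308), as file names on macOS often are. hex_bytes is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the bytes of $1 as lowercase hex, no separators.
hex_bytes() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \t\n'
    printf '\n'
}

# The two candidate spellings of "ü", written as UTF-8 octal escapes so
# the bytes do not depend on how this file itself is saved.
precomposed=$(printf '\303\274')   # U+00FC
decomposed=$(printf 'u\314\210')   # "u" followed by U+0308

hex_bytes "$precomposed"   # prints c3bc
hex_bytes "$decomposed"    # prints 75cc88
```

Running hex_bytes on a folder name returned by the Finder and on the same word as it appears in /tmp/output.txt would show whether the bytes really differ in this way.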
Solution
In the complete script below you'll notice the following:
set accentedChars to {"ü", "ö", "ß", "á", "ė"} has been added to hold a list of all characters that will need to be escaped. You'll need to explicitly state each one as there doesn't seem to be a way to infer whether the character has an accent.
Before assigning the grep command to the theCommand variable, we first escape the necessary characters via the line reading:
set company to escapeChars(company, accentedChars)
As you can see here we are passing two arguments to the escapeChars sub-routine, (i.e. the non-escaped company variable and the list of accented characters).
In the escapeChars sub-routine we iterate over each char in the accentedChars list and invoke the findAndReplace sub-routine. This will escape any instances of those characters with backslashes found in the company variable.
Complete script:
set searchFile to "/tmp/output.txt"
set accentedChars to {"ü", "ö", "ß", "á", "ė"}
set theCommand to "/usr/local/bin/pdftotext -enc UTF-8 some.pdf" & ¬
space & searchFile
do shell script theCommand
tell application "Finder"
set companies to get name of folders of folder ("/path/" as POSIX file)
end tell
repeat with company in companies
set company to escapeChars(company, accentedChars)
set theCommand to "grep -c " & quoted form of company & ¬
space & quoted form of searchFile
try
do shell script theCommand
set CompanyName to company as string
return CompanyName
on error
end try
end repeat
return false
(**
* Checks each character of a given word. If any characters of the word
* match a character in the given list of characters they will be escaped.
*
* @param {text} searchWord - The word to check the characters of.
* @param {list} charactersList - List of characters to be escaped.
* @returns {text} The new text with the matching characters escaped.
*)
on escapeChars(searchWord, charactersList)
repeat with char in charactersList
set searchWord to findAndReplace(char, ("\\" & char), searchWord)
end repeat
return searchWord
end escapeChars
(**
* Replaces all occurrences of findString with replaceString
*
* @param {text} findString - The text string to find.
* @param {text} replaceString - The replacement text string.
* @param {text} searchInString - Text string to search.
* @returns {text} The new text with the item(s) replaced.
*)
on findAndReplace(findString, replaceString, searchInString)
set oldTIDs to text item delimiters of AppleScript
set text item delimiters of AppleScript to findString
set searchInString to text items of searchInString
set text item delimiters of AppleScript to replaceString
set searchInString to "" & searchInString
set text item delimiters of AppleScript to oldTIDs
return searchInString
end findAndReplace
Note about current counts:
Currently your grep command (with the -c option) only reports the number of lines on which the word was found, not how many instances of the word were found.
If you want the actual number of instances of the word then use the -o option with grep to output each occurrence. Then pipe that to wc with the -l option to count the number of lines. For example:
grep -o 'Württemberg' /tmp/output.txt | wc -l
and in your AppleScript that would be:
set theCommand to "grep -o " & quoted form of company & space & ¬
quoted form of searchFile & "| wc -l"
Tip: If you want to remove the leading spaces in the count that gets logged, pipe it to sed to strip the spaces. For example, via your script:
set theCommand to "grep -o " & quoted form of company & space & ¬
quoted form of searchFile & "| wc -l | sed -e 's/ //g'"
and the equivalent via the command line:
grep -o 'Württemberg' /tmp/output.txt | wc -l | sed -e 's/ //g'
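The difference between the two counts can be seen on a small sample file. This is an illustrative sketch only; the helper names count_lines and count_occurrences, the word Acme and the sample text are all made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical helpers wrapping the two commands discussed above.
# count_lines: number of lines that contain the pattern (grep -c).
count_lines() {
    grep -c -- "$1" "$2"
}
# count_occurrences: number of individual matches (grep -o, one per line,
# counted by wc -l; tr strips the padding some wc versions add).
count_occurrences() {
    grep -o -- "$1" "$2" | wc -l | tr -d ' '
}

# Made-up sample: the word appears twice on line 1 and once on line 2.
sample=$(mktemp)
printf '%s\n' 'Acme and Acme again' 'Acme once' 'nothing here' > "$sample"

count_lines Acme "$sample"         # prints 2
count_occurrences Acme "$sample"   # prints 3

rm -f "$sample"
```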
This is my code:
set source_failed = `cat mine.log`
set dest_failed = `cat their.log`
foreach t ($source_failed)
set isdiff = 0
set sflag = 0
foreach t2 ($dest_failed)
if ($t2 == $t) then
set sflag = 1
break
endif
end
...
end
The problem is that the inner foreach loop runs fine for the first 10 or so iterations. After that, I suddenly get:
foreach: no match
Moreover, I am iterating over the array of strings, not files. What is the reason behind this error?
The problem is (probably) that mine.log and/or their.log contain special globbing characters, such as * or ?. The shell will try to expand this to a file. There are no matches for this accidental pattern, and hence the error "no match".
The easiest way to prevent this behaviour is to add set noglob to the top. From tcsh(1):
noglob If set, Filename substitution and Directory stack substitution
(q.v.) are inhibited. This is most useful in shell scripts
which do not deal with filenames, or after a list of filenames
has been obtained and further expansions are not desirable.
You can re-enable globbing afterwards with unset noglob.
Alternatively, you can use :q. From tcsh(1):
Unless enclosed in `"' or given the `:q' modifier the results of variable
substitution may eventually be command and filename substituted.
[..]
When the `:q' modifier is applied to a substitution the variable will expand
to multiple words with each word separated by a blank and quoted to
prevent later command or filename substitution.
But you need to be very careful about quoting when you use the variable. In the below example, the echo command will fail if you don't add quotes (set noglob is much easier):
set source_failed = `cat source`
foreach t ($source_failed:q)
echo "$t"
end
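The underlying mechanism, an unquoted variable being expanded as a filename pattern, can be tried out in POSIX sh as well. Note that this is an analogy rather than tcsh code: in sh the switch is set -f (the counterpart of set noglob), and an unmatched pattern is left as-is instead of raising "no match". expand_words is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print each word of $2 on its own line.
# $1 is "glob" (default shell behaviour) or "noglob" (set -f first).
expand_words() {
    (
        if [ "$1" = noglob ]; then
            set -f   # disable filename expansion, like tcsh's noglob
        fi
        # $2 is deliberately unquoted so splitting and globbing apply.
        for w in $2; do
            printf '%s\n' "$w"
        done
    )
}

# Demo in a scratch directory that holds two matching files.
(
    cd "$(mktemp -d)" || exit 1
    touch a.txt b.txt
    expand_words glob '*.txt'     # prints a.txt, then b.txt
    expand_words noglob '*.txt'   # prints *.txt
)
```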
What I'm trying to do is to get the names of all TV shows on this Wikipedia page.
Ok, so I did this first:
property showsWebList : {}
tell application "Safari"
set loadDelay to 2 -- in seconds; test for your system
make new document at end of every document
set URL of document 1 to "http://en.wikipedia.org/wiki/List_of_television_programs_by_name"
delay loadDelay
set nrOfUls to do JavaScript "document.getElementById('mw-content-text').querySelectorAll('ul').length;" in document 1
set nrOfUls to nrOfUls - 1 as number
log nrOfUls
repeat with ws from 1 to nrOfUls
delay loadDelay
set nrOfLis to do JavaScript "document.getElementById('mw-content-text').getElementsByTagName('UL')[" & ws & "].querySelectorAll('li').length;" in document 1
set nrOfLis to nrOfLis - 1 as number
log nrOfLis
repeat with rs from 0 to nrOfLis
delay 0.3
set aShow to do JavaScript "document.getElementById('mw-content-text').getElementsByTagName('UL')[" & ws & "].getElementsByTagName('LI')[" & rs & "].getElementsByTagName('I')[0].getElementsByTagName('A')[0].innerHTML;" in document 1
if aShow is not "" or "missing value" then
copy aShow to end of showsWebList
end if
end repeat
end repeat
end tell
And this works exactly how I want it to. The problem is that it takes 15 minutes until it's done and you gotta have the safari document in front the whole time. So my thought was to pick up the whole code and parse it. Not that easy. This is how my code looks now:
tell application "Safari"
make new document at end of every document
set URL of document 1 to "http://en.wikipedia.org/wiki/List_of_television_programs_by_name"
delay 4
set orgHTML to do JavaScript "document.getElementById('mw-content-text').innerHTML;" in document 1
set orgHTML to orgHTML as text
set readyText to my extractBetween(orgHTML, "<li><i><a ", "</a></i></li>")
log (item 0 of readyText)
set removeArray to my extractBetween(readyText, "href", ">")
set completeArray to {}
repeat with rt from 0 to (count readyText)
repeat with ra from 0 to (count removeArray)
if (item ra of removeArray) is in (item rt of readyText) then
set completeName to trim_line((item rt of readyText), (item ra of removeArray), 1)
set end of completeArray to completeName
end if
end repeat
end repeat
log completeArray
end tell
on extractBetween(SearchText, startText, endText)
set tid to AppleScript's text item delimiters -- save them for later.
set AppleScript's text item delimiters to startText -- find the first one.
set liste to text items of SearchText
set AppleScript's text item delimiters to endText -- find the end one.
set extracts to {}
repeat with subText in liste
if subText contains endText then
copy text item 1 of subText to end of extracts
end if
end repeat
set AppleScript's text item delimiters to tid -- back to original values.
return extracts
end extractBetween
on trim_line(this_text, trim_chars, trim_indicator)
-- 0 = beginning, 1 = end, 2 = both
set x to the length of the trim_chars
-- TRIM BEGINNING
if the trim_indicator is in {0, 2} then
repeat while this_text begins with the trim_chars
try
set this_text to characters (x + 1) thru -1 of this_text as string
on error
-- the text contains nothing but the trim characters
return ""
end try
end repeat
end if
-- TRIM ENDING
if the trim_indicator is in {1, 2} then
repeat while this_text ends with the trim_chars
try
set this_text to characters 1 thru -(x + 1) of this_text as string
on error
-- the text contains nothing but the trim characters
return ""
end try
end repeat
end if
return this_text
end trim_line
Not that smooth and not working. Somehow it seems like I can't get the items out of the list, because it doesn't see it as a list item. Can someone help me out?
Cheers
I would recommend a different approach. Download the source, and then just grab the title between tags. The whole script takes under two seconds. Start with:
property baseURL : "http://en.wikipedia.org/wiki/List_of_television_programs_by_name"
set rawHTML to do shell script "curl '" & baseURL & "'"
set preTag to "\" title=\"" -- " title="
set otid to AppleScript's text item delimiters
set AppleScript's text item delimiters to preTag
set rawList to text items of rawHTML
set nameList to {}
repeat with eachLine in rawList
set theOff to offset of ">" in eachLine
-- skip chunks with no usable text before the ">" instead of raising an error
if theOff > 2 then
set thisName to text 1 thru (theOff - 2) of eachLine
-- add more checks here to skip the opening non-title hits, and to fine-tune the precise title string
set end of nameList to thisName
end if
end repeat
set AppleScript's text item delimiters to otid
return nameList
Add a little error checking, and tweak which preTag and postTag fits best.
I suggest you make use of a specialized 3rd-party tool for this task, which can greatly speed things up.
Here's a solution using the multi-platform web-scraping CLI xidel:
A shell command to demonstrate its brevity and speed (takes less than 1 sec. on my system) - extracts all show names from the page:
xidel -e '//*[@id="mw-content-text"]/ul/li/i/a' https://en.wikipedia.org/wiki/List_of_television_programs_by_name
An equivalent AppleScript snippet - be sure to fill in the path to where you place xidel on your system below:
set targetUrl to "https://en.wikipedia.org/wiki/List_of_television_programs_by_name"
set xPathExpr to "//*[@id=\"mw-content-text\"]/ul/li/i/a"
# Fill in the path to `xidel` on your system here:
set xidelPath to "/path/to/xidel"
# Perform scraping and convert result into an AppleScript list.
set showNames to paragraphs of ¬
(do shell script ¬
quoted form of xidelPath & " -e " & quoted form of xPathExpr & " " & ¬
quoted form of targetUrl)
Here's another solution, which uses JavaScript to get the names without any AppleScript loop.
The JavaScript takes less than one second to get the names.
tell application "Safari"
make new document at end of every document with properties {URL:"http://en.wikipedia.org/wiki/List_of_television_programs_by_name"}
delay 2 -- in seconds; test for your system
set showsWebList to do JavaScript "var a=new Array();var ul=document.getElementById('mw-content-text').querySelectorAll('UL'); for (var i=1;i<ul.length;i++){li=ul[i].querySelectorAll('LI'); for (var j=0; j< li.length; j++){try {var t=li[j].getElementsByTagName('I')[0].getElementsByTagName('A')[0].innerText; a.push(t)} catch(e) {}}} a;" in document 1
end tell
curl/sed/perl solution:
do shell script "curl 'http://en.wikipedia.org/wiki/List_of_television_programs_by_name' | sed -n '/0-9/,/NewPP/p' | sed -n '/^<li/ s/^.*title=.\\([^\"]*\\).*$/\\1/p' | perl -n -mHTML::Entities -e ' ; print HTML::Entities::decode_entities($_);'"
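The title-extraction step of such a pipeline can be tried offline on a few lines of sample markup. This is a simplified sketch, not the exact command above: it assumes each show sits on its own line starting with <li><i><a and carries a title="..." attribute, and the sample HTML and the helper name extract_titles are made up.

```shell
#!/bin/sh
# Hypothetical helper: read HTML on stdin and print the title attribute of
# every line that starts with <li><i><a ...>.
# Assumption: one list item per line, title value without double quotes.
extract_titles() {
    sed -n 's/^<li><i><a [^>]*title="\([^"]*\)".*/\1/p'
}

# Made-up sample markup; the middle item is not italic and is skipped.
extract_titles <<'EOF'
<ul>
<li><i><a href="/wiki/Example_Show" title="Example Show">Example Show</a></i></li>
<li><a href="/wiki/Other" title="Not italic">skipped</a></li>
<li><i><a href="/wiki/Another_Show" title="Another Show">Another Show</a></i></li>
</ul>
EOF
```

Titles that contain HTML entities would still need decoding, which is what the perl step in the command above is for.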
Here is another solution, using awk with a very simple script. If a line begins with <li><i>, the HTML tags are removed (gsub) and the line is printed. Then every paragraph of converts the return-separated output into a list.
set theURL to "http://en.wikipedia.org/wiki/List_of_television_programs_by_name"
every paragraph of (do shell script "curl " & theURL & " | awk '/^\\<li\\>\\<i\\>/{gsub(\"<[^>]*>\", \"\");print}'")