Watch a web page for changes - comparison

I googled and couldn't find any could that would compare a webpage to a previous version.
In this case the page I'm trying to watch is link text. There are services that can watch a page, but I'd like to set this up on my own server.
I've set this up as a wiki so anyone can add to the code. Here's my idea
Check if previous version of file exists. If false then download page
If page exists, diff to find differences and email the new content along with dates of new and old versions.
This script would be called nightly via cron or on-demand via the browser (the latter is not a priority)
Sounds simple, maybe I'm just not looking in the right place.

Perhaps a simple sh-script like this, featuring wget, diff & test?
#!/bin/sh
WWWURI="http://foo.bar/testfile.html"
LOCALCOPY="testfile.html"
TMPFILE="tmpfile"
WEBFILE="changed.html"
MAILADDRESS="$(whoami)"
SUBJECT_NEWFILE="$LOCALCOPY is new"
BODY_NEWFILE="first version of $LOCALCOPY loaded"
SUBJECT_CHANGEDFILE="$LOCALCOPY updated"
SUBJECT_NOTCHANGED="$LOCALCOPY not updated"
BODY_CHANGEDFILE="new version of $LOCALCOPY"
# test for old file
if [ -e "$LOCALCOPY" ]
then
mv "$LOCALCOPY" "$LOCALCOPY.bak"
wget "$WWWURI" -O"$LOCALCOPY" -o/dev/null
diff "$LOCALCOPY" "$LOCALCOPY.bak" > $TMPFILE
# test for update
if [ -s "$TMPFILE" ]
then
echo "$SUBJECT_CHANGEDFILE"
( echo "$BODY_CHANGEDFILE" ; cat "$TMPFILE" ) | tee "$WEBFILE" | mail -s "$SUBJECT_CHANGEDFILE" "$MAILADDRESS"
else
echo "$SUBJECT_NOTCHANGED"
fi
else
wget "$WWWURI" -O"$LOCALCOPY" -o/dev/null
echo "$BODY_NEWFILE"
echo "$BODY_NEWFILE" | tee "$WEBFILE" | mail -s "$SUBJECT_NEWFILE" "$MAILADDRESS"
fi
[ -e "$TMPFILE" ] && rm "$TMPFILE"
Update: Pipe through tee, little spelling & remove of $TMPFILE

You can check This SO posting to get a few ideas and also information about the challenge of detecting "true" changes to a web page (with fluctuating advertisement block, and other "noise")

Related

Running iOS UIAutomation as a post-action build script is return as a posix spawn error

I'm entirely new to using bash and Xcode build scripts and so my code is probably a jungle full of errors.
The idea here is to trigger the script below which will scrape the directory that it is saved in for any .js automation scripts. It will then send these scripts to instruments to be run one at a time. I found some nifty code that created time stamped files and so I used that to create a more meaningful storage system.
#!/bin/bash
# This script should run all (currently only one) tests, independently from
# where it is called from (terminal, or Xcode Run Script).
# REQUIREMENTS: This script has to be located in the same folder as all the
# UIAutomation tests. Additionally, a *.tracetemplate file has to be present
# in the same folder. This can be created with Instruments (Save as template...)
# The following variables have to be configured:
#EXECUTABLE="Plans.app"
# Find the test folder (this script has to be located in the same folder).
ROOT="$( cd -P "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
# Prepare all the required args for instruments.
TEMPLATE=`find $ROOT -name '*.tracetemplate'`
#EXECUTABLE=`find ~/Library/Application\ Support/iPhone\ Simulator | grep "${EXECUTABLE}$"`
echo "$BUILT_PRODUCTS_DIR"
echo "$PRODUCT_NAME"
EXECUTABLE="${BUILT_PRODUCTS_DIR}/${PRODUCT_NAME}.app/"
SCRIPTS=`find $ROOT -name '*.js'`
# Prepare traces folder
TRACES="${ROOT}/Traces/`date +%Y-%m-%d_%H-%M-%S`"
mkdir -p "$TRACES"
printf "\n" >> "$ROOT/results.log"
echo `date +%Y-%m-%d_%H-%M-%S` >> "$ROOT/results.log"
# Get the name of the user we should use to run Instruments.
# Currently this is done, by getting the owner of the folder containing this script.
USERNAME=`ls -l "${ROOT}/.." | grep \`basename "$ROOT"\` | awk '{print $3}'`
# Bring simulator window to front. Depending on the localization, the name is different.
osascript -e 'try
tell application "iPhone Simulator" to activate
on error
tell application "iOS Simulator" to activate
end try'
# Prepare an Apple Script that promts for the password.
PASS_SCRIPT="tell application \"System Events\"
activate
display dialog \"Password for user $USER:\" default answer \"\" with hidden answer
text returned of the result
end tell"
# Run all the tests.
for SCRIPT in $SCRIPTS; do
echo -e "\nRunning test script $SCRIPT"
TESTC="sudo -u ${USER} xcrun instruments -l -c -t ${TEMPLATE} ${EXECUTABLE} -e UIARESULTSPATH ${TRACES}/${TRACENAME} -e UIASCRIPT ${SCRIPT} >> ${ROOT}/results.log"
#echo "$COMMAND"
echo "Executing command $TESTC" >> "$ROOT/results.log"
echo "here $TESTC" >> "$ROOT/results.log"
OUTPUT=$(TESTC)
echo $OUTPUT >> "$ROOT/results.log"
echo "Finished logging" >> "$ROOT/results.log"
SCRIPTNAME=`basename "$SCRIPT"`
TRACENAME=`echo "$SCRIPTNAME" | sed 's_\.js$_.trace_g'`
for i in $(ls -A1t $PWD | grep -m 1 '.trace')
do
TRACEFILE="$PWD/$i"
done
if [ -e $TRACEFILE ]; then
mv "$TRACEFILE" "${TRACES}/${TRACENAME}"
fi
if [ `grep " Fail: " results.log | wc -l` -gt 0 ]; then
echo "Test ${SCRIPTNAME} failed. See trace for details."
open "${TRACES}/${TRACENAME}"
exit 1
break
fi
done
rm results.log
A good portion of this was taken from another Stack Overflow answer but because of the repository setup that I'm working with I needed to keep the paths abstract and separate from the root folder of the script. Everything seems to work (although probably not incredibly efficiently) except for the actual xcrun command to launch instruments.
TESTC="sudo -u ${USER} xcrun instruments -l -c -t ${TEMPLATE} ${EXECUTABLE} -e UIARESULTSPATH ${TRACES}/${TRACENAME} -e UIASCRIPT ${SCRIPT} >> ${ROOT}/results.log"
echo "Executing command $TESTC" >> "$ROOT/results.log"
OUTPUT=$(TESTC)
This is turned into the following by whatever black magic Bash runs on:
sudo -u Braains xcrun instruments -l -c -t
/Users/Braains/Documents/Automation/AppName/TestCases/UIAutomationTemplate.tracetemplate
/Users/Braains/Library/Developer/Xcode/DerivedData/AppName-
ekqevowxyipndychtscxwgqkaxdk/Build/Products/Debug-iphoneos/AppName.app/ -e UIARESULTSPATH
/Users/Braains/Documents/Automation/AppName/TestCases/Traces/2014-07-17_16-31-49/ -e
UIASCRIPT /Users/Braains/Documents/Automation/AppName/TestCases/Test-Case_1js
(^ Has inserted line breaks for clarity of the question ^)
The resulting error that I am seeing is:
posix spawn failure; aborting launch (binary ==
/Users/Braains/Library/Developer/Xcode/DerivedData/AppName-
ekqevowxyipndychtscxwgqkaxdk/Build/Products/Debug-iphoneos/AppName.app/AppName).
I have looked all over for a solution to this but I can't find anything because Appium has a similar issue. Unfortunately I don't understand the systems well enough to know how to translate the fixes to Appium to my own code but I imagine it's a similar issue.
I do know that the posix spawn failure is related to threading, but I don't know enough about xcrun to say what's causing the threading issue.
Related info:
- I'm building for the simulator but it'd be great to work on real devices too
- I'm using xCode 5.1.1 and iOS Simulator 7.1
- This script is meant to be run as a build post action script in xCode
- I did get it briefly working once before I broke it and couldn't get it back to the working state. So I think that means all of my permissions are set correctly.
UPDATE: So I've gotten to the root of this problem although I have not found a fix yet. First of all I have no idea what xcrun is for and so I dropped it. Then after playing around I found that my Xcode environment variables are returning the wrong path, probably because of some project setting somewhere. If you copy the Bash command from above but replace Debug-iphoneos with Debug-iphonesimulator the script can be run from the command line and will work as expected.
So for anyone who happens across this the only solution I could find was to hardcode the script for the simulator.
I changed EXECUTABLE="${BUILT_PRODUCTS_DIR}/${PRODUCT_NAME}.app/" to be EXECUTABLE="${SYMROOT}/Debug-iphonesimulator/${EXECUTABLE_PATH}". This is obviously not a great solution but it works for now.

jenkins plugin for triggering build whenever any file changed in a given directory

I am looking for functionality where we have a directory with some files in it.
Whenever any one makes a change in any of the files in the directory, jenkins shoukd trigger a build.
Is there any plugin or mathod for this functionality. Please advise.
Thanks in advance.
I have not tried it myself, but The FSTrigger plugin seems to do what you want:
FSTrigger provides polling mechanisms to monitor a file system and
trigger a build if a file or a set of files have changed.
If you can monitor the directory with a script, you can trigger the build with a HTTP GET, for example with wget or curl:
wget -O- $JENKINS_URL/job/JOBNAME/build
Although slightly related.. it seems like this issue was about monitoring static files on system.. however there are many version control systems for just this purpose.
I answered this in another post if you're using git to track changes on the files themselves:
#!/bin/bash
set -e
job_name="whatever"
JOB_URL="http://myserver:8080/job/${job_name}/"
FILTER_PATH="path/to/folder/to/monitor"
python_func="import json, sys
obj = json.loads(sys.stdin.read())
ch_list = obj['changeSet']['items']
_list = [ j['affectedPaths'] for j in ch_list ]
for outer in _list:
for inner in outer:
print inner
"
_affected_files=`curl --silent ${JOB_URL}${BUILD_NUMBER}'/api/json' | python -c "$python_func"`
if [ -z "`echo \"$_affected_files\" | grep \"${FILTER_PATH}\"`" ]; then
echo "[INFO] no changes detected in ${FILTER_PATH}"
exit 0
else
echo "[INFO] changed files detected: "
for a_file in `echo "$_affected_files" | grep "${FILTER_PATH}"`; do
echo " $a_file"
done;
fi;
You can add the check directly to the top of the job's exec shell, and it will exit 0 if no changes detected.. Hence, you can always poll the top level of the repo for check-in's to trigger a build. And only complete a build if the files in question change.

Unpredictable result with echo command when used in a for loop

I am trying to write a small shell script to find out the signed and unsigned jars in particular directory. While the script works fine until 4-5 jars, it starts showing unpredictable results if my directory has more jars ( currntly about 25 jars). I am not sure but a possible reason for this might be related to buffering of STDIN/STDOUT. I tried to find a resolution for this in related posts but could not get a clear answer.
Here goes my script: ( it takes JAVA_HOME as an argument ):
#! /bin/bash
JV_HOME=$1
for i in `ls *.jar`
do
echo "scanning $i ..."
FILE=$i
$JV_HOME/bin/jarsigner -verify -verbose -certs $FILE | grep "jar verified" ;
if [ $? -eq 0 ]; then
echo "\n$FILE is code-signed\n"
else
echo "\n$FILE is unsigned/unverified..\n"
fi
done
For some of the jars, it says the jar to be unsigned, when they are actually signed when checked individually with the following command:
$JAVA_HOME/bin/jarsigner -verify -verbose -certs
What could possibly be wrong with the above script?
Thanks in advance,
Pabi

How can I remove duplicates (deduplicate) a mbox format email mailbox?

I've got a mbox mailbox containing duplicate copies of messages, which differ only in their "X-Evolution:" header.
I want to remove the duplicate ones, in as quick and simple a way as possible. It seems like this would have been written already, but I haven't found it, although I've looked at the Python mailbox module, the various perl mbox parsers, formail, and so forth.
Does anyone have any suggestions?
This a small script, which I used for it:
#!/bin/bash
IDCACHE=$(mktemp -p /tmp)
formail -D $((1024*1024*10)) ${IDCACHE} -s
rm ${IDCACHE}
The mailbox needs to be piped through it, and in the meantime it will be deduplicated.
-D $((1024*1024*10)) sets a 10 Mebibyte cache, which is more than 10x the amount needed to deduplicate an entire year of my mail. YMMV, so adjust it accordingly. Setting it too high will cause some performance loss, setting it to low will let it slip duplicates.
formail is part of the procmail utility bundle, mktemp is part of coreutils.
I didn't look at formail (part of procmail) in enough detail. It does have such such an option, as mentioned in places like: http://hints.macworld.com/comment.php?mode=view&cid=115683 and http://us.generation-nt.com/answer/deleting-duplicate-mail-messages-help-172481881.html
'formail -D' and 'reformail -D' can only process one email per execution. Each mail needs to be separated from mbox first before being processed. I use reformail from maildrop instead since it's still in active development.
remove old idcache, tmpmail, nmbox
run dedup.sh .
nmbox is the output with duplicate messages removed.
dedup.sh
#! /bin/sh
# $1 = mbox, thunderbird mailbox
# wmbox.sh is called for each mail.
cat $1 | reformail -s ./wmbox.sh
wmbox.sh
#! /bin/sh
# stdin: a email
# called by dedup.sh
TM=tmpmail
if [ -f $TM ] ; then
echo error!
exit 1
fi
cat > $TM
# mbox format, each mail end with a blank line
echo "" >> $TM
cat $TM | reformail -D 99999999 idcache
# if this mail isn't a dup (reformail return 1 if message-id is not found)
if [ $? != 0 ]; then
# each mail shall have a message-id
if grep -q -i '^message-id:' $TM; then
cat tmpmail >> nmbox
fi
fi
rm $TM

Is there a good macvim git diff?

I have been using textmate for many years and I just made the switch to macvim and one thing that I used all the time with textmate was the command git df which in my .gitconfig was just an alias for
[alias]
df = !git diff | mate
and what that did was give me a screen like this
Is there a replacement in mvim that I can add somewhere for me to get similar behavior
I describe what I use here.
Basically, add the following lines to your "~/.gitconfig":
[diff]
tool = default-difftool
[difftool "default-difftool"]
cmd = default-difftool.sh $LOCAL $REMOTE
With the following wrapper script:
#! /bin/bash
if [[ -f /Applications/MacVim.app/Contents/MacOS/Vim ]]
then
# bypass mvim for speed
VIMPATH='/Applications/MacVim.app/Contents/MacOS/Vim -g -dO -f'
elif [[ -f /usr/local/bin/mvim ]]
then
# fall back to mvim
VIMPATH='mvim -d -f'
else
# fall back to original vim
VIMPATH='vimdiff'
fi
$VIMPATH $#
You can get the diff one file at a time by doing:
git difftool -t vimdiff
vimdiff can be replaced with gvimdiff for gvim, so I would assume you can also replace it with mvimdiff for macvim.
I am not sure of a way to pipe the entirety of git diff into vim though.

Resources