Strip parameter values from URL - parsing

I'd like to strip parameter values from a URL but leave the parameter names in place. That is,
change
http://abc.def.edu/pager/page.cfm?pai=97878&pager=123
into
http://abc.def.edu/pager/page.cfm?pai=&pager=
I've tried:
sed "s/=.*\&/=\&/g"
With no success. Am I getting close? I've seen lots of posts about stripping parameters entirely, but nothing about just stripping the values. Please redirect me and accept my apologies if this has already been addressed.
Thanks,
Al

$ sed -r 's/=[^\&]+/=/g' <<< 'http://abc.def.edu/pager/page.cfm?pai=97878&pager=123'
OUTPUT:
http://abc.def.edu/pager/page.cfm?pai=&pager=

A bit more heavy-weight than sed:
$ perl -pe 's/(?<==).+?(?=&|$)//g' <<< "$url"
http://abc.def.edu/pager/page.cfm?pai=&pager=
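
For what it's worth, the attempt in the question falls short because .* is greedy and the pattern requires a trailing &, so the last parameter (which has no & after it) keeps its value. Below is a minimal sketch of the same idea as the sed answer, wrapped in a function. The name strip_param_values is made up for illustration, and -E is used instead of -r on the assumption that your sed accepts it (GNU and BSD sed both do).

```shell
# Replace every "=value" with "=", where a value is any run of
# characters that are not "&". Reads URLs on stdin, one per line.
strip_param_values() {
    sed -E 's/=[^&]*/=/g'
}

# Example:
#   echo 'http://abc.def.edu/pager/page.cfm?pai=97878&pager=123' | strip_param_values
```

Note that this treats everything after an = as a value, so a URL fragment following the last parameter would be stripped too.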

Related

How can I grep for a hyphenated string within a very long single line string?

I am a Windows guy, so I am out of my element here. I apologize if this is a stupid question, but I have failed to obtain an answer after six hours of research, trial and error. It would make my week if a pro could weigh in!!
This is my string:
server ITW-F280 ITW-F280HQ-AV up numProc=1 numFD=5 mem=23720kB ITW-F280HQ-DHC up numProc=2 numFD=16 mem=47040kB ITW-F280HQ-NGF up numProc=1 numFD=4 mem=117424kB ITW-F280HQ-VPN up numProc=11 numFD=118 mem=2880536kB
The string within the aforementioned string that I wish to grep is ITW-F280HQ-AV. For context, this value changes depending on which firewall the command is executed on, but the one commonality is the HQ-AV tail—everything before that is different depending on the environment. In this example there are two hyphens, but other firewalls may only have the one common hyphen.
Here's what I have tried so far, to no avail...
Returns nothing:
grep '\b\W[[:space:]]*HQ-AV[[:space:]]'
Returns the entire string, unfiltered:
grep '\b\W[[:space:]]*AV[[:space:]]'
Returns the entire string, unfiltered:
grep '[[:space:]]*AV\b'
Returns only -F280HQ-AV, so this theoretically would work on firewalls w/ only one hyphen:
grep -o '\-\w*-AV\b'
I have tried hundreds of combinations—far too many to list here. I have reviewed the documentation, but due to my weak Linux background, I am firing in the dark.
Any help would be GREATLY appreciated!! :)
I was able to solve this by trimming everything before the desired string, leaving ONLY this:
ITW-F280HQ-AV up numProc=1 numFD=5 mem=23720kB ITW-F280HQ-DHC up numProc=2 numFD=16 mem=47040kB ITW-F280HQ-NGF up numProc=1 numFD=4 mem=117424kB ITW-F280HQ-VPN up numProc=11 numFD=118 mem=2880536kB
And then I used sed to trim everything after the first space:
sed 's/\s.*$//'
Hope this helps someone!
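
Another way to get there without trimming in two steps, as a rough sketch: put each whitespace-separated word on its own line, then keep the word that ends in HQ-AV. The function name extract_av_name is hypothetical, and this assumes the name never contains spaces.

```shell
# Split stdin into one word per line, then keep only words ending in
# "HQ-AV". Works regardless of how many hyphens precede that tail.
extract_av_name() {
    tr -s '[:space:]' '\n' | grep -- 'HQ-AV$'
}

# Example (some_command stands in for whatever prints the status line):
#   some_command | extract_av_name
```

Because the match is anchored to the end of a word, entries such as ITW-F280HQ-DHC are skipped.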

bitbucket API 2.0 page parameters using non-default pagelen

I have run into a cumbersome limitation of the bitbucket API 2.0 - I am hoping there is a way to make it more usable.
When one wants to retrieve a list of repositories from the bitbucket API 2.0, this url can be used:
https://api.bitbucket.org/2.0/repositories/{teamname}
This returns the first 10 repos in the list. To access the next 10, one simply needs to add a page parameter:
https://api.bitbucket.org/2.0/repositories/{teamname}?page=2
This returns the next 10. One can also adjust the number of results returned using the pagelen parameter, like so:
https://api.bitbucket.org/2.0/repositories/{teamname}?pagelen=100
The maximum number can vary per account, but 100 is the maximum any team is able to request with each API call. The cumbersome part is that I cannot find a way to get page 2 with a pagelen of 100. I have tried variations on the following:
https://api.bitbucket.org/2.0/repositories/{teamname}?pagelen=100&page=2
https://api.bitbucket.org/2.0/repositories/{teamname}?page=2&pagelen=100
I've also tried using parameters such as limit or size to no avail. Is the behavior I seek even possible? Some relevant documentation can be found here.
EDIT: It appears this behavior is possible. The problem was the shell rather than the API: an unquoted & ends the command and runs it in the background, so the parameters after it never reach curl. Putting the entire URL in quotes fixes this.
Example:
curl "https://api.bitbucket.org/2.0/repositories/{teamname}?pagelen=100&page=2"
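
A small sketch of building such a URL for any page, so that the quoting lives in one place. The function name repo_page_url and the team name myteam are placeholders, not part of the API.

```shell
# Print the repositories URL for a given team, page length and page.
# Arguments: $1 = team name, $2 = pagelen, $3 = page number.
repo_page_url() {
    printf 'https://api.bitbucket.org/2.0/repositories/%s?pagelen=%s&page=%s\n' "$1" "$2" "$3"
}

# Example (always keep the command substitution inside double quotes):
#   curl -s "$(repo_page_url myteam 100 2)"
```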
ORIGINAL ANSWER: I was able to get around this by creating a bash script that looped through each page of 10 results, adding each new 10 repos to a temporary file and then cloning into those 10 repos. The only manual thing that needs to be done is to update the upper limit in the for loop to be the last page expected.
Here is an example script:
for thisPage in {1..23}
do
    curl "https://api.bitbucket.org/2.0/repositories/[organization]?page=$thisPage" -u "[username]:[password]" > repoinfo
    for repo_name in $(sed -r 's/("slug": )/\n\1/g' repoinfo | sed -r 's/"slug": "(.*)"/\1/' | sed -e 's/{//' | cut -f1 -d\" | tr '\n' ' ')
    do
        echo "Cloning $repo_name"
        git clone "https://[username]@bitbucket.org/[organization]/$repo_name"
        echo "---"
    done
done
Much help was gleaned from:
https://haroldsoh.com/2011/10/07/clone-all-repos-from-a-bitbucket-source/
and http://adomingues.github.io/2015/01/10/clone-all-repositories-from-a-user-bitbucket/ Thanks!

Search string occurrence and display directory wise count

We have an error log directory structure wherein we store all error log files for a particular day in datewise directories -
errorbackup/20150629/errorlogFile3453123.log.xml
errorbackup/20150629/errorlogFile5676934.log.xml
errorbackup/20150629/errorlogFile9812387.log.xml
errorbackup/20150628/errorlogFile1097172.log.xml
errorbackup/20150628/errorlogFile1908071_log.xml
errorbackup/20150627/errorlogFile5675733.log.xml
errorbackup/20150627/errorlogFile9452344.log.xml
errorbackup/20150626/errorlogFile6363446.log.xml
I want to search for a particular string in the error log files and get a per-directory count of that string's occurrences. For example, grep "blahblahSQLError" should output something like -
20150629:0
20150628:0
20150627:1
20150626:1
This is needed because we fixed some errors in one of the releases and I want to make sure that there are no occurrences of that error since the day it was deployed to Prod. Also note that thousands of error log files are created every day. Each error log file is created with a random number in its name to ensure uniqueness.
If you are sure the filenames of the log files will not contain any "odd" characters or newlines then something like the following should work.
for dir in errorbackup/*; do
    printf '%s:%s\n' "${dir#*/}" "$(grep -l blahblahSQLError "$dir/"*.xml | wc -l)"
done
If they can have unexpected names, then I believe you would need to call grep once per file and count the matching files manually. Something like this:
for dir in errorbackup/*; do
    _dcount=0
    for log in "$dir"/*.xml; do
        # -q: only the exit status matters here, so print nothing
        grep -q blahblahSQLError "$log" && _dcount=$((_dcount + 1))
    done
    printf '%s:%s\n' "${dir#*/}" "$_dcount"
done
Something like this should do it:
for dir in errorbackup/*
do
    awk -v dir="${dir##*/}" -v OFS=':' '/blahblahSQLError/{c++} END{print dir, c+0}' "$dir"/*
done
There's probably a cuter way to do it with find and xargs to avoid the loop and you could certainly do it all within one awk command but life's too short....
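
For the curious, here is one rough sketch of a loop-free variant. It uses find with -exec ... + rather than xargs (the batching is similar), and counts matching files per directory, not matching lines. The function name count_matches_by_dir is made up, -mindepth and -maxdepth are extensions that GNU and BSD find both provide, and file names are assumed to contain no tabs or newlines.

```shell
# Print "directory:count" for each date directory under a root, where
# count is the number of *.xml files containing a fixed string.
# Arguments: $1 = string to search for, $2 = root (e.g. errorbackup).
count_matches_by_dir() {
    pattern=$1
    root=${2%/}
    {
        # One "dir" line per date directory, so empty ones still print 0.
        find "$root" -mindepth 1 -maxdepth 1 -type d |
            awk -F/ '{ print "dir\t" $NF }'
        # One "hit" line per matching file, tagged with its parent directory.
        find "$root" -type f -name '*.xml' -exec grep -lF -- "$pattern" {} + |
            awk -F/ '{ print "hit\t" $(NF-1) }'
    } |
        awk -F'\t' '
            $1 == "dir" { count[$2] += 0 }
            $1 == "hit" { count[$2]++ }
            END { for (d in count) print d ":" count[d] }
        ' |
        sort -r
}

# Example:
#   count_matches_by_dir blahblahSQLError errorbackup
```

With thousands of files per day this has the advantage that grep is started once per batch of files instead of once per file.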

jq substring gives "jq: error: Cannot index string with object"

Problem
I'm trying to filter a JSON jq result to only show a substring of the original string. For example, if a jq filter grabbed the value
4ffceab674ea8bb5ec421c612536696839bbaccecf64e851dfc270d795ee55d1
I want it to only return the first 10 characters 4ffceab674.
What I've tried
On the official jq website you can find an example that should give me what I need:
Command: jq '.[2:4]'
Input: "abcdefghi"
Output: "cd"
I've tried to test this out with a simple example in the unix terminal:
# this works fine, => "abcdefghi"
echo '"abcdefghi"' | jq '.'
# this doesn't work => jq: error: Cannot index string with object
echo '"abcdefghi"' | jq '.[2:4]'
So, it turns out most of these filters are not yet in the released version. For reference see issue #289
What you could do is download the latest development version and compile from source. See download page > From source on Linux
After that, if indexing still doesn't work for strings, you should, at least, be able to do explode, index, implode combination, which seems to have been your plan.
Looking at the jq-1.3 manual I suspect there isn't a solution using that version since it offers no primitives for extracting parts of a string.
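
If upgrading isn't an option, one possible workaround is to do the trimming outside jq. The sketch below is only that: the function name first_chars is hypothetical, the .id filter in the usage comment is a placeholder, and it assumes your jq build supports -r for raw (unquoted) string output.

```shell
# Print the first N characters of each line read from stdin.
# Argument: $1 = number of characters to keep (defaults to 10).
first_chars() {
    cut -c "1-${1:-10}"
}

# Example (the .id filter is a placeholder for your own):
#   jq -r '.id' input.json | first_chars 10
```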

Using an external variable as the search pattern in awk

I have the following example script.
The goal is to find ID = 67109AB inside "file.txt" using an external variable, in this case called var.
But when I run the script, it doesn't use the value of the variable as the search pattern.
Can someone help me see what is missing?
Thanks for your help.
fu="/67109AB/"
awk -v var="$fu" '
var {
print $0
}' file.txt
I think you're looking for something like this:
fu="67109AB"; awk -v var="$fu" '$0 ~ var' file.txt
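
Some background on why the original prints every line: a bare variable used as a pattern is only tested for being non-empty (or non-zero), and the slashes in "/67109AB/" are just characters in a string, not regex delimiters. The sketch below shows two ways to use the variable; the function names match_regex and match_literal are made up for illustration.

```shell
# Print lines of a file where the given text matches as a regular
# expression. Arguments: $1 = regex, $2 = file.
match_regex() {
    awk -v var="$1" '$0 ~ var' "$2"
}

# Print lines of a file containing the given text as a plain substring,
# so characters like "." are not treated as regex metacharacters.
match_literal() {
    awk -v var="$1" 'index($0, var)' "$2"
}

# Example:
#   match_regex 67109AB file.txt
```

For an ID made only of letters and digits both behave the same; the literal version is the safer choice if the value may contain regex metacharacters.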
