Extract Unique IP sources from pcap file - wireshark

My approach is to:
1 - Use tshark to export the list to a text file.
tshark -r file.cap -T fields -e ip.src > output.txt
2 - Use sort and uniq to remove the duplicate IPs.
sort output.txt | uniq > uniqueip.txt
3 - Count the lines in uniqueip.txt with
wc -l uniqueip.txt
I noticed that output.txt has some strange formatting where some IPs share a line. Why are they not each on a new line?
This is output.txt:
"58.176.204.64"
"180.168.211.204"
"103.248.63.253"
"216.245.214.196,146.231.254.240"
"112.104.105.79"
"216.245.214.196,146.231.254.131"
"112.104.105.79"
"10.0.61.65,146.231.254.12"

Some lines contain more than one IP address, separated by a comma, because the packet itself contains more than one IP header. This happens with tunneling protocols, with ICMP error packets whose payload carries the original IP header of the packet that triggered the error, and with other types of packets as well.
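If the goal is a count of distinct addresses, the comma-joined lines need to be split first, otherwise "216.245.214.196,146.231.254.240" is counted as one address of its own. A minimal sketch, assuming output.txt looks like the sample in the question (quoted values, commas between multiple addresses); the helper name unique_ips is made up for illustration:

```shell
# Read tshark field output on stdin, one packet per line, and print each
# distinct IP address once. unique_ips is a hypothetical helper name.
unique_ips() {
    tr -d '"\r' |      # drop the surrounding quotes (and any CRs)
    tr ',' '\n' |      # one address per line, even for multi-header packets
    grep -v '^$' |     # skip packets that had no IP header at all
    sort -u
}

# Example usage (file names taken from the question):
#   unique_ips < output.txt > uniqueip.txt
#   wc -l < uniqueip.txt
```

Alternatively, tshark's -E occurrence=f option prints only the first occurrence of each field, which should leave just the outermost source address per packet.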

Related

How to use grep to extract IP addresses and date/time strings from log file?

I have a log file that looks like this:
May 25 05:34:16 server sshd[1203]: Received disconnect from 192.0.2.2 port 39102:11
May 25 05:34:16 server sshd[1203]: Disconnected from 192.0.2.1 port 39102
Now I want to extract all of the IP addresses and the date/time strings at the beginning using grep. I already know how to get the IPs:
grep -o '[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}' /log.txt
and the dates/times:
grep -o '[A-Z][a-z][a-z] [0-3][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]' /log.txt
but I don't know how to get both at the same time in a format like:
May 25 05:34:16 192.0.2.1
I've read something like:
grep -oE 'match1|match2' /log.txt
but that doesn't seem to work.
Printing two matches on a single line is easier with awk. The following prints the date (fields $1, $2 and $3) and the IP address, provided the address is valid:
gawk '{match($0,/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/,a);split(a[0],b,".")} b[1]<=255&& b[2]<=255 && b[3]<=255 && b[4]<=255 &&length(a[0]){print $1,$2,$3, a[0]}' log_file
May 25 05:34:16 192.0.2.2
May 25 05:34:16 192.0.2.1
Explanation: match() stores the first string of the form digits.digits.digits.digits in the array "a". split() then breaks a[0] on the dot (.) characters into the array "b", and each part is checked to be <= 255 to ensure the IP address is valid.
Note that GNU awk is required here, because the three-argument form of match() is a gawk extension.
Also note that the regex you mentioned will also print invalid IP addresses (e.g. 333.222.555.666).
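If gawk is not available, the same check can probably be done with the two-argument match() that every POSIX awk has, using RSTART and RLENGTH to cut the match out of the line. A minimal sketch under that assumption; the function name date_and_ip is hypothetical:

```shell
# Print "Mon DD HH:MM:SS IP" for each log line whose first dotted quad is a
# valid IPv4 address. date_and_ip is a hypothetical name; it reads the files
# given as arguments, or stdin when there are none.
date_and_ip() {
    awk '
    match($0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/) {
        ip = substr($0, RSTART, RLENGTH)   # text of the first match
        n = split(ip, octet, ".")
        ok = (n == 4)
        for (i = 1; i <= n; i++)
            if (octet[i] > 255) ok = 0     # reject e.g. 333.222.555.666
        if (ok) print $1, $2, $3, ip
    }' "$@"
}

# Example: date_and_ip log_file
```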
You could put your two patterns in capturing groups and refer to them in the replacement using sed (note that -i rewrites log.txt in place):
sed -i -E 's#^([A-Z][a-z][a-z] [0-3][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]).* ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*$#\1 \2#g' log.txt
That will match:
^ Start of string
([A-Z][a-z][a-z] [0-3][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]) Your date/time like pattern
.*  Match any char 0+ times, followed by a space
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}) Your IP-like pattern
.* Match any char 0+ times
$ End of string
Result
May 25 05:34:16 192.0.2.2
May 25 05:34:16 192.0.2.1
With any awk in any shell on any UNIX box:
$ awk '{print $1, $2, $3, $(NF-2)}' file
May 25 05:34:16 192.0.2.2
May 25 05:34:16 192.0.2.1

how to remove zero packets (empty streams) records in wireshark

I am very new to Wireshark. In my day-to-day job I need to remove the records with zero packet bytes from a captured PCAP file. Please help me with this process. The attached image is for reference.
Since you have 47 TCP Streams and 28 that you want to remove, it might be a bit faster to filter for all the TCP streams that you do want to keep since there are only 19 of those.
For the 19 streams you want:
Right-click on the first TCP conversation and choose "Prepare a Filter -> Selected -> A<-->B".
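For the next 17 TCP conversations, right-click on each one and choose "Prepare a Filter -> ... Or Selected -> A<-->B", so that each conversation is added to the filter as an alternative. (Combining them with "And" would match nothing, since no packet belongs to two conversations at once.)
Finally, for the last TCP stream, right-click on the TCP conversation and choose "Apply as Filter -> ... Or Selected -> A<-->B".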
You may wish to export the resulting filtered packets to a new file via "File -> Export Specified Packets... -> All packets:Displayed" so you won't have to keep filtering for those streams anymore.
If you have a large number of streams to filter, then you are better off scripting something. Here's a script you can use that seems to work well in my testing on my Linux machine. If you're using Windows, you will need to write an equivalent batch file, or you may be able to use it as is if you have Cygwin installed.
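#!/bin/bash
# Uses [[ ]] and +=, so it needs bash rather than a plain POSIX sh.
# Check usage
if [ "$#" -lt 2 ] ; then
    echo "Usage: $0 <infile> <outfile>" >&2
    exit 1
fi
infile=$1
outfile=$2
# TODO: Could also pass the filter on the command-line too.
filter="ip.dst eq 192.168.10.44 and tcp.len > 0"
# Build "tcp.stream eq A or tcp.stream eq B ..." from the matching streams.
stream_filter=
for stream in $(tshark -r "${infile}" -Y "${filter}" -T fields -e tcp.stream | sort -u | tr -d '\r')
do
    if [[ -z ${stream_filter} ]] ; then
        stream_filter="tcp.stream eq ${stream}"
    else
        stream_filter+=" or tcp.stream eq ${stream}"
    fi
done
# With no matching streams the filter would be empty, so stop here.
if [[ -z ${stream_filter} ]] ; then
    echo "No streams matched: ${filter}" >&2
    exit 1
fi
tshark -r "${infile}" -Y "${stream_filter}" -w "${outfile}"
echo "Wrote ${outfile}"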

How to extract full set of features from an existing pcap file using tshark or any other tool?

I am new to network traffic analysis.
I have used the following Tshark command, but no luck.
C:\Program Files\Wireshark>tshark -r C:\Users\Ravi\Desktop\IDS-augustdocuments\iscxdataset\testbed13jun.pcapCopy\split\small_00057_20100613213752.pcap separator=, -R "tcp.data" -T fields frame.number -e appName -e totalSourceBytes > C:\Users\Ravi\Desktop\IDS-augustdocuments\iscxdataset\testbed13jun.pcapCopy\split\18oct.csv
tshark: "=" was unexpected in this context.
Any suggestions on how to extract features like Direction (for the flows), totalSourceBytes, totalDestinationBytes, totalDestinationPackets, totalSourcePackets, sourceTCPFlagsDescription, etc.?
Yes. Bro IDS (now called Zeek) or Argus (Auditing Network Activity).
Argus example:
racluster -L0 -m proto -r filepcap.arg -s proto saddr daddr spkts dpkts sbytes dbytes
Proto  SrcAddr         DstAddr       SrcPkts  DstPkts  SrcBytes  DstBytes
udp    84.125.xxx.xxx  0.0.0.0          2634     2580    205131    317889
tcp    84.125.xxx.xxx  0.0.0.0         34143    42585   6078099  48276978
arp    84.125.xxx.xxx  84.xxx.xxx.x        3        3       126       180
You have to use quotes, and the option needs the -E flag in front of it:
-E separator=","
I used Bro IDS to get the required fields from the conn.log file.
1) Install and configure Bro IDS (this guide covers the installation):
https://www.digitalocean.com/community/tutorials/how-to-install-bro-ids-2-2-on-ubuntu-12-04
2) Start Bro IDS.
3) Run the command "bro -r <your pcap file>.pcap". This generates a set of .log files in the current directory.
4) Inspect logs such as conn.log, dns.log and http.log for the different kinds of information extracted from the pcap file.
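To get from conn.log to a CSV of per-flow features, the columns can be looked up by name in the log's "#fields" header line. A rough sketch, assuming the default tab-separated Bro log format; log_columns is a hypothetical helper, and treating orig_bytes/resp_bytes/orig_pkts/resp_pkts as the counterparts of totalSourceBytes, totalDestinationBytes, totalSourcePackets and totalDestinationPackets is my own assumption:

```shell
# Print selected columns of a Bro/Zeek TSV log as CSV, looking the column
# positions up in the log's own "#fields" header line.
# log_columns is a hypothetical helper; the first argument is a
# comma-separated list of field names, the log is read from stdin.
log_columns() {
    awk -F '\t' -v want="$1" '
    BEGIN { nwant = split(want, name, ",") }
    $1 == "#fields" {                       # header: "#fields<TAB>ts<TAB>uid..."
        for (i = 2; i <= NF; i++) col[$i] = i - 1
        next
    }
    /^#/ { next }                           # skip the other metadata lines
    {
        line = ""
        for (k = 1; k <= nwant; k++) {
            # unknown field names produce an empty CSV cell
            value = (name[k] in col) ? $(col[name[k]]) : ""
            line = line (k > 1 ? "," : "") value
        }
        print line
    }'
}

# Example:
#   log_columns id.orig_h,id.resp_h,orig_bytes,resp_bytes,orig_pkts,resp_pkts < conn.log > features.csv
```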

Plot RTT histogram using wireshark or other tool

I have a little office network and I'm experiencing a huge internet link latency. We have a simple network topology: a computer configured as router running ubuntu server 10.10, 2 network cards (one to internet link, other to office network) and a switch connecting 20 computers. I have a huge tcpdump log collected at the router and I would like to plot a histogram with the RTT time of all TCP streams to try to find out the best solution to this latency problem. So, could somebody tell me how to do it using wireshark or other tool?
Wireshark or tshark can give you the TCP RTT for each received ACK packet using tcp.analysis.ack_rtt which measures the time delta between capturing a TCP packet and the ACK for that packet.
You need to be careful with this as most of your ACK packets will be from your office machines ACKing packets received from the internet, so you will be measuring the RTT between your router seeing the packet from the internet and seeing the ACK from your office machine.
To measure your internet RTT you need to look for ACKS from the internet (ACKing data sent from your network). Assuming your office machines have IP addresses like 192.168.1.x and you have logged all the data on the LAN port of your router you could use a display filter like so:
tcp.analysis.ack_rtt and ip.dst==192.168.1.255/24
To dump the RTTs into a .csv for analysis you could use a tshark command like so:
tshark -r router.pcap -Y "tcp.analysis.ack_rtt and ip.dst==192.168.1.255/24" -e tcp.analysis.ack_rtt -T fields -E separator=, -E quote=d > rtt.csv
The -r option tells tshark to read from your .pcap file
The -Y option specifies the display filter to use (-R without -2 is deprecated)
The -T fields and -e options select field output and name the field to print
The -E options set the output formatting (comma separator, double-quoted values)
You can use the mergecap utility to merge all your pcap files into one file before running this command. Turning this output into a histogram should be easy!
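For the histogram step itself, one simple option is to bucket the values with awk. A minimal sketch, assuming rtt.csv holds one (possibly double-quoted) RTT in seconds per line; the function name rtt_histogram and the default bucket width of 0.01 s are illustrative choices:

```shell
# Read RTT values in seconds (optionally double-quoted) on stdin, one per
# line, and print "lower-upper count" for every non-empty bucket.
# rtt_histogram and its bucket-width argument (default 0.01 s) are
# illustrative choices, not part of tshark.
rtt_histogram() {
    tr -d '"\r' |
    awk -v width="${1:-0.01}" '
    $1 != "" {
        b = int($1 / width)                 # bucket index for this sample
        count[b]++
        if (b > max) max = b
    }
    END {
        for (b = 0; b <= max; b++)
            if (count[b])
                printf "%.3f-%.3f %d\n", b * width, (b + 1) * width, count[b]
    }'
}

# Example: rtt_histogram 0.005 < rtt.csv
```

Values that sit exactly on a bucket boundary may land on either side because of floating-point rounding, which should not matter for a rough latency picture.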
Here's the 5-min Perl script inspired by rupello's answer:
#!/usr/bin/perl
# For a live histogram of rtt latencies, save this file as /tmp/hist.pl and chmod +x /tmp/hist.pl, then run:
# tshark -i wlp2s0 -Y "tcp.analysis.ack_rtt and ip.dst==192.168.0.0/16" -e tcp.analysis.ack_rtt -T fields -E separator=, -E quote=d | /tmp/hist.pl
# Don't forget to update the interface "wlp2s0" and "and ip.dst==..." bits as appropriate, type "ip addr" to get those.
$t[$m=0]=20;
$t[++$m]=10;
$t[++$m]=5;
$t[++$m]=2;
$t[++$m]=1;
$t[++$m]=0.9;
$t[++$m]=0.8;
$t[++$m]=0.7;
$t[++$m]=0.6;
$t[++$m]=0.5;
$t[++$m]=0.4;
$t[++$m]=0.3;
$t[++$m]=0.2;
$t[++$m]=0.1;
$t[++$m]=0.05;
$t[++$m]=0.04;
$t[++$m]=0.03;
$t[++$m]=0.02;
$t[++$m]=0.01;
$t[++$m]=0.005;
$t[++$m]=0.001;
$t[++$m]=0;
$h[0]=0;
while (<>) {
s/\"//g; $n=$_; chomp($n); $o++;
for ($i=$m;$i>=0;$i--) { if ($n<=$t[$i]) { $h[$i]++; $i=-1; }; };
if ($i==-1) { $h[0]++; };
print "\033c";
for (0..$m) { printf "%6s %6s %8s\n",$t[$_],sprintf("%3.2f",$h[$_]/$o*100),$h[$_]; };
}
The newer versions of tshark seem to work better with a "stdbuf -i0 -o0 -e0 " in front of the "tshark".
PS Does anyone know if wireshark has DNS and ICMP rtt stats built in or how to easily get those?
2018 Update: See https://github.com/dagelf/pping

Extracting n rows of text from a large csv file

I have a CSV file (foo.csv) with 200,000 rows. I need to break it into four files (foo1.csv, foo2.csv... etc.) with 50,000 rows each.
I already tried simple copy and paste (Ctrl-C/Ctrl-V) in GUI text editors, but my computer slows to a halt.
What unix command(s) could I use to accomplish this task?
I don't have a terminal handy to try it out, but it should be just split -d -l 50000 foo.csv.
Hopefully the naming isn't terribly important because with the -d option, the output files will be named foo.csv00 .. foo.csv03. You can add the -a 1 option so that the suffixes are 0-3, but there's no simple way to get the suffix to be injected into the middle of the filename.
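With a reasonably recent GNU split, the number can in fact be placed before the extension by combining --numeric-suffixes with --additional-suffix. A sketch under that assumption (it will not work with BSD/macOS split); split_keep_ext is a hypothetical wrapper name:

```shell
# Split a file into numbered pieces that keep the original extension,
# e.g. foo.csv -> foo1.csv, foo2.csv, ...
# Relies on GNU split (coreutils 8.16 or later); split_keep_ext is a
# hypothetical wrapper name.
split_keep_ext() {
    file=$1 lines=$2
    ext=${file##*.}              # "csv"
    base=${file%.*}              # "foo"
    split -l "$lines" -a 1 --numeric-suffixes=1 \
          --additional-suffix=".$ext" "$file" "$base"
}

# Example: split_keep_ext foo.csv 50000
```

Because of -a 1 the suffix is a single digit starting at 1, so this form only covers up to nine pieces; a larger -a value allows more.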
You should use head and tail.
head -n 50000 myfile > part1.csv
head -n 100000 myfile | tail -n 50000 > part2.csv
head -n 150000 myfile | tail -n 50000 > part3.csv
etc ...
Alternatively, if you don't need control over the file names, you can use the Unix split command.
sed -n 2000,4000p somefile.txt
will print from lines 2000 to 4000 to stdout.
split -l50000 foo.csv
You can use sed
I wrote this little shell script for a task very similar to yours.
This shell script + awk works fine for me:
#!/bin/bash
# Usage: script.sh <first line> <last line> <file>
awk -v initial_line="$1" -v end_line="$2" '{
    if (NR >= initial_line && NR <= end_line)
        print $0
}' "$3"
Used with this sample file (file.txt):
one
two
three
four
five
six
The command (it extracts the second to fourth lines of the file):
edu@debian5:~$ ./script.sh 2 4 file.txt
Output of this command:
two
three
four
Of course, you can improve it, for example by testing that all argument values are as expected :-)
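One possible shape for such a checked version, written as a shell function; print_range is a hypothetical name and the exact error messages are just placeholders:

```shell
# Print lines START..END of FILE, refusing anything but whole numbers.
# print_range is a hypothetical name for the checked variant.
print_range() {
    if [ "$#" -ne 3 ]; then
        echo "Usage: print_range <start> <end> <file>" >&2
        return 2
    fi
    for n in "$1" "$2"; do
        case $n in
            ''|*[!0-9]*) echo "not a line number: $n" >&2; return 2 ;;
        esac
    done
    [ "$1" -le "$2" ] || { echo "start must not exceed end" >&2; return 2; }
    [ -r "$3" ] || { echo "cannot read $3" >&2; return 2; }
    awk -v first="$1" -v last="$2" '
        NR > last   { exit }    # stop early instead of reading the whole file
        NR >= first             # default action: print the line
    ' "$3"
}

# Example: print_range 2 4 file.txt
```

Exiting as soon as the end line has been passed matters on large files like the 200,000-row CSV, since awk would otherwise keep reading to the end.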

Resources