How to remove data link layer from pcap file? - wireshark

I'm writing a script that inspects packets, but the headers are giving me a headache. I have a DSL connection and wireless at home, and the data link layer shows up in the Wireshark capture as either PPP or WLAN, depending on which one I'm currently using.
I've been searching for a tshark, editcap, tcpdump or similar tutorial, but I couldn't find any.
Basically all I need is: < program_name_which_nicely_removes_data_link_layer > input.pcap < output.pcap_only_containing_ethernet_header+ip+tcp+data > or something similar.
I have found a program named bittwiste, but as far as I can tell it operates with fixed sizes. I need something 'universal', where I don't have to determine the link type and header size myself.
Any help is appreciated!
Thank you in advance.

With Perl and the libraries Net::Pcap and Net::PcapWriter you could do the following to remove the PPPoE layer. This works at least for my router (fritz.box). This needs to be adapted to other encapsulations:
#!/usr/bin/perl
# usage: pcap_remove_pppoe.pl infile.pcap outfile.pcap
use strict;
use warnings;
use Net::Pcap qw(pcap_open_offline pcap_loop pcap_datalink :datalink);
use Net::PcapWriter;
my $infile = shift or die "no input file";
my $outfile = shift; # use stdout if not given
my $err;
my $pcap = pcap_open_offline($infile,\$err) or
die "failed to open $infile: $err";
my $ll = pcap_datalink($pcap);
die "expected DLT_EN10MB, got $ll" if $ll != DLT_EN10MB;
my $offset = 14; # DLT_EN10MB
# open and initialize output
my $pw = Net::PcapWriter->new($outfile);
# process packets
pcap_loop($pcap, -1, sub {
    my (undef,$hdr,$data) = @_;
    my $dl = substr($data,0,$offset,''); # remove data link layer
    my $etype = unpack("n",substr($dl,-2,2)); # get ethernet type
    $etype == 0x8864 or return; # ignore any type except PPPoE Session
    substr($data,0,8,''); # remove 8 byte PPPoE layer
    substr($dl,-2,2,pack("n",0x0800)); # set ethernet type to IPv4
    # output: data link layer with IP type + IP layer (PPPoE removed)
    $pw->packet($dl.$data, [ $hdr->{tv_sec},$hdr->{tv_usec} ]);
},undef);
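For readers who don't use Perl, the per-packet transformation can be sketched in Python roughly as follows. This is only an illustration operating on one raw Ethernet frame held in memory, not a complete pcap rewriter. The function name strip_pppoe and the constants are made up for this sketch, and unlike the Perl script it also checks that the PPP protocol field says IPv4 (0x0021) before relabelling the frame.

```python
import struct

ETH_HEADER_LEN = 14      # dst MAC (6) + src MAC (6) + EtherType (2)
PPPOE_PPP_LEN = 8        # 6-byte PPPoE header + 2-byte PPP protocol field
ETHERTYPE_PPPOE_SESSION = 0x8864
ETHERTYPE_IPV4 = 0x0800
PPP_PROTO_IPV4 = 0x0021


def strip_pppoe(frame):
    """Return the frame with the PPPoE/PPP layer removed, or None.

    None is returned for frames that are too short, are not PPPoE
    session frames, or do not carry IPv4 inside PPP.
    """
    if len(frame) < ETH_HEADER_LEN + PPPOE_PPP_LEN:
        return None
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_PPPOE_SESSION:
        return None
    (ppp_proto,) = struct.unpack_from("!H", frame, ETH_HEADER_LEN + 6)
    if ppp_proto != PPP_PROTO_IPV4:
        return None
    # Keep both MAC addresses, rewrite the EtherType, drop the 8 PPPoE/PPP bytes.
    return (frame[:12]
            + struct.pack("!H", ETHERTYPE_IPV4)
            + frame[ETH_HEADER_LEN + PPPOE_PPP_LEN:])
```

Reading and writing the capture file itself would still need a pcap library, which is left out here.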

Related

How to find the file paths of all namespace loaded in an application using tcl/TK under Unix?

In an existing flow, a whole bunch of namespaces get loaded when running a script job.
However, if I want to check and trace the usage of some command in some namespace, I need to find the script path for that namespace.
Is there some way to get that? In particular, I'm talking about PrimeTime scripts.
Technically, namespaces don't have script paths. But we can do something close enough:
proc report_current_file {call code result op} {
    if {$code == 0} {
        # If the [proc] call was successful...
        set cmd [lindex $call 1]
        set qualified_cmd [uplevel 1 [list namespace which $cmd]]
        set file [file normalize [info script]]
        puts "Defined $qualified_cmd in $file"
    }
}
trace add execution proc leave report_current_file
It's not perfect if you've got procedures creating procedures dynamically — the current file might be wrong — but that's fortunately not what most code does.
Another option that might work for you is to use tcl::unsupported::getbytecode, which produces a lot of information in machine-readable format (a dictionary). One of the pieces of information is the sourcefile key. Here's an example running interactively on my machine:
% parray tcl_platform
tcl_platform(byteOrder) = littleEndian
tcl_platform(engine) = Tcl
tcl_platform(machine) = x86_64
tcl_platform(os) = Darwin
tcl_platform(osVersion) = 20.2.0
tcl_platform(pathSeparator) = :
tcl_platform(platform) = unix
tcl_platform(pointerSize) = 8
tcl_platform(threaded) = 1
tcl_platform(user) = dkf
tcl_platform(wordSize) = 8
% dict get [tcl::unsupported::getbytecode proc parray] sourcefile
/opt/local/lib/tcl8.6/parray.tcl
Note that the procedure has to be already defined for this to work. And if Tcl's become confused about what file the code was in (because of dynamic programming trickery) then that key is absent.

What standard input and output would be if there's no terminal connected to server?

This question came to mind yesterday while I was thinking about approaches to server logging.
Normally, we open a terminal connected to local computer or remote server, run an executable, and print (printf, cout) some debug/log information in the terminal.
But for those processes/executables/scripts running on the server which are not connected to a terminal, what are the standard input and output?
For example:
Suppose I have a crontab task that runs a program on the server many times a day. If I write something like cout << "blablabla" << endl; in the program, what happens? Where does that output go?
Another example I came up with: if I write a CGI program (in C or C++) for, say, an Apache web server, what are the standard input and output of my CGI program? (According to this C++ CGI tutorial, I guess the standard input and output of the CGI program are somehow redirected to the Apache server, because it uses cout to output the HTML content rather than returning it.)
Before asking, I read What is “standard input”?, which told me that standard input isn't necessarily tied to a keyboard and standard output isn't necessarily tied to a terminal/console/screen.
OS is Linux.
The standard input and standard output (and standard error) streams can point to basically any I/O device. This is commonly a terminal, but it can also be a file, a pipe, a network socket, a printer, etc. What exactly those streams direct their I/O to is usually determined by the process that launches your process, be that a shell or a daemon like cron or apache, but a process can redirect those streams itself if it would like.
I'll use Linux as an example, but the concepts are similar on most other OSes. On Linux, the standard input and standard output stream are represented by file descriptors 0 and 1. The macros STDIN_FILENO and STDOUT_FILENO are just for convenience and clarity. A file descriptor is just a number that matches up to some file description that the OS kernel maintains that tells it how to write to that device. That means that from a user-space process's perspective, you write to pretty much anything the same way: write(some_file_descriptor, some_string, some_string_length) (higher-level I/O functions like printf or cout are just wrappers around one or more calls to write). To the process, it doesn't matter what type of device some_file_descriptor represents. The OS kernel will figure that out for you and pass your data to the appropriate device driver.
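To see what a process's standard streams are actually attached to (for example when it is started by cron or a web server), one can ask the kernel about file descriptors 0, 1 and 2. The following is a minimal Python sketch and not part of the original answer; the helper name describe_fd is made up for illustration.

```python
import os
import stat


def describe_fd(fd):
    """Return a short label for what file descriptor `fd` refers to."""
    try:
        mode = os.fstat(fd).st_mode
    except OSError:
        return "closed"            # the descriptor is not open at all
    if os.isatty(fd):
        return "terminal"
    if stat.S_ISFIFO(mode):
        return "pipe"
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISCHR(mode):
        return "character device"  # e.g. /dev/null
    return "other"


if __name__ == "__main__":
    # Report on stderr so the report itself is visible even when
    # stdout has been redirected somewhere else.
    for name, fd in (("stdin", 0), ("stdout", 1), ("stderr", 2)):
        os.write(2, ("%s: %s\n" % (name, describe_fd(fd))).encode())
```

Running such a script from an interactive shell, through a pipe, and from a cron job should give different labels, which is one way to find out where the output of a particular job ends up.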
The standard way to launch a new process is to call fork to duplicate the parent process, and then later to call one of the exec family of functions in the child process to start executing some new program. In between, it will often close the standard streams it inherited from its parent and open new ones to redirect the child process's output somewhere new. For instance, to have the child pipe its output back to the parent, you could do something like this in C++:
#include <cstdlib>
#include <iostream>
#include <string>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    // create a pipe for the child process to use for its
    // standard output stream
    int pipefds[2];
    pipe(pipefds);
    // spawn a child process that's a copy of this process
    pid_t pid = fork();
    if (pid == 0)
    {
        // we're now in the child process
        // we won't be reading from this pipe, so close its read end
        close(pipefds[0]);
        // we won't be reading anything
        close(STDIN_FILENO);
        // close the stdout stream we inherited from our parent
        close(STDOUT_FILENO);
        // make stdout's file descriptor refer to the write end of our pipe
        dup2(pipefds[1], STDOUT_FILENO);
        // we don't need the old file descriptor anymore.
        // stdout points to this pipe now
        close(pipefds[1]);
        // replace this process's code with another program
        execlp("ls", "ls", nullptr);
    } else {
        // we're still in the parent process
        // we won't be writing to this pipe, so close its write end
        close(pipefds[1]);
        // now we can read from the pipe that the
        // child is using for its standard output stream
        std::string read_from_child;
        ssize_t count;
        constexpr size_t BUF_SIZE = 100;
        char buf[BUF_SIZE];
        while((count = read(pipefds[0], buf, BUF_SIZE)) > 0) {
            std::cout << "Read " << count << " bytes from child process\n";
            read_from_child.append(buf, count);
        }
        std::cout << "Read output from child:\n" << read_from_child << '\n';
        return EXIT_SUCCESS;
    }
}
Note: I've omitted error handling for clarity
This example creates a child process and redirects its output to a pipe. The program run in the child process (ls) can treat the standard output stream just as it would if it were referencing a terminal (though ls changes some behaviors if it detects its standard output isn't a terminal).
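The same fork, dup2 and exec sequence can be sketched in Python with the os module, which exposes these system calls fairly directly. This is an illustrative sketch for Unix-like systems only; the function name run_and_capture is invented here, and error handling is kept to the minimum needed to avoid a stray child process.

```python
import os


def run_and_capture(program, *args):
    """Run `program` in a child process and return what it wrote to stdout."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: make fd 1 (stdout) refer to the write end of the pipe,
        # then replace this process image with the requested program.
        try:
            os.close(read_end)
            os.dup2(write_end, 1)
            os.close(write_end)
            os.execlp(program, program, *args)
        finally:
            # Only reached if exec failed; never return into the caller.
            os._exit(127)
    # Parent: close the unused write end so that read() sees end-of-file
    # once the child has exited, then drain the pipe.
    os.close(write_end)
    chunks = []
    while True:
        chunk = os.read(read_end, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_end)
    os.waitpid(pid, 0)
    return b"".join(chunks)
```

In everyday Python code the subprocess module would normally be used instead; the low-level version is shown only because it mirrors the steps described above one to one.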
This sort of redirection can also be done from a terminal. When you run a command you can use the redirection operators to tell your shell to redirect that command's standard streams to some location other than the terminal. For instance, here's a convoluted way to copy a file from one machine to another using an sh-like shell:
gzip < some_file | ssh some_server 'zcat > some_file'
This does the following:
create a pipe
run gzip redirecting its standard input stream to read from "some_file" and redirecting its standard output stream to write to the pipe
run ssh and redirect its standard input stream to read from the pipe
on the server, run zcat with its standard input redirected from the data read from the ssh connection and its standard output redirected to write to "some_file"
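The plumbing the shell does for the | operator can be imitated with Python's subprocess module. The sketch below is a local stand-in for the command above, under stated assumptions: there is no ssh involved, and both stages are small Python one-liners (compress, then decompress) so that the example does not depend on gzip or zcat being installed. The name roundtrip_through_pipeline is made up for this illustration.

```python
import subprocess
import sys

# Stage programs: each reads all of stdin and writes the result to stdout.
COMPRESS = ("import sys, gzip; "
            "sys.stdout.buffer.write(gzip.compress(sys.stdin.buffer.read()))")
DECOMPRESS = ("import sys, gzip; "
              "sys.stdout.buffer.write(gzip.decompress(sys.stdin.buffer.read()))")


def roundtrip_through_pipeline(data):
    """Send `data` through 'compress | decompress' and return the output."""
    first = subprocess.Popen([sys.executable, "-c", COMPRESS],
                             stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # The second stage's stdin is the read end of the first stage's stdout pipe.
    second = subprocess.Popen([sys.executable, "-c", DECOMPRESS],
                              stdin=first.stdout, stdout=subprocess.PIPE)
    # Drop our own copy of that read end; only the second stage needs it.
    first.stdout.close()
    first.stdin.write(data)
    first.stdin.close()          # signals end-of-input to the first stage
    output, _ = second.communicate()
    first.wait()
    return output
```

Neither stage knows or cares that its standard streams are pipes rather than a terminal, which is the point the answer is making.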

Get application data in net frame via tshark command line

I need to parse a custom protocol in many .pcapng files, and I want to filter and output the raw application data directly with a tshark command.
At first I used the "-e data.data" option, but some of the application data gets decoded as another protocol and is then not output by -e data.data.
Then I found that I can specify a "disable-protocol" file under the Wireshark profile folder, but I would have to take the profile file and deploy it before running the parsing program on another PC.
I also tried disabling all protocols except UDP and TCP, but that didn't work.
Disabling the known conflicting protocols works, but the same mistake may occur with some other, unknown protocol, so tshark's output still can't be trusted completely.
I work on Windows 7 with Wireshark 2.2 and use Python 2.7 for the parsing.
In summary, what I want is a portable command line that can flexibly and directly output all the data that follows the UDP header in a network frame.
Could I disable decoding on some ports just by adding options on the command line?
EDIT1:
I found that Wireshark 1.12 has a "Do not decode" option in the "Decode As..." dialog; with it enabled, the display is what I want. In Wireshark 2.2 that option was removed, and I still need a command line to do this filtering.
After 48 hours and 26 views, there is still no response, just one upvote.
I have given up on this approach and decode the frames myself.
What I want is the UDP source port and destination port, and the application data.
In practice, every network frame has a header of the same length, so it's easy to strip the header at a fixed offset and get the data of interest.
In my case, I just do some filtering and use the -x option for output, like this:
tshark -r xxx.pcapng -j udp -x
The output may look like this:
(just an example, not a real case)
Every line contains three parts: the first column is the offset, the next 16 columns are the bytes in hex, and the rest are the characters that those bytes map to.
My code:
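import subprocess

def load_tshark_data(tshark_file_path):
    # Python 2.7: check_output returns a str, so iterating yields characters
    tshark_exe = "c:/Program Files/Wireshark/tshark.exe"
    output = subprocess.check_output([
        tshark_exe,
        "-r",tshark_file_path,
        "-j","udp",
        "-x"
    ])
    hex_buff = ""   # hex columns of the current frame, 3 characters per byte
    line_buff = ""  # characters of the current output line
    for c in output:
        if c == "\n":
            if len(line_buff) > 54:
                # hex dump line: keep only the 16 hex columns
                hex_buff += line_buff[5:53]
                line_buff = ''
            else:
                # short (blank) line: the frame is complete
                src_port = int(hex_buff[0x22*3 : 0x24*3].replace(" ",""),16)
                dst_port = int(hex_buff[0x24*3 : 0x26*3].replace(" ",""),16)
                app_data = hex_buff[0x2a*3 : ].strip(" ")
                hex_buff = ""
                line_buff = ''  # drop leftovers such as "\r" on Windows
                yield [src_port, dst_port, app_data]
        else:
            line_buff += c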
I hope this helps anyone else who is blocked by the same problem.
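For comparison, the same fixed-offset idea can be sketched in Python 3 working on bytes instead of hex text. This is only an illustration under several assumptions that are not guaranteed by the post: the dump text has already been captured into a string, the hex bytes sit in columns 6 to 53 of each line, frames are separated by blank lines, and every frame is Ethernet II + IPv4 without options + UDP (hence the offsets 0x22, 0x24 and 0x2a used above). The function names are invented for this sketch.

```python
import struct


def frames_from_hex_dump(text):
    """Yield one bytes object per frame from hex-dump text.

    Assumed layout per line: 4-digit offset, two spaces, up to 16
    space-separated hex bytes (columns 6..53), then an ASCII column.
    A blank line ends the current frame.
    """
    current = bytearray()
    for line in text.splitlines():
        if not line.strip():
            if current:
                yield bytes(current)
                current = bytearray()
            continue
        # bytes.fromhex ignores the spaces between and after the hex pairs
        current += bytes.fromhex(line[6:54])
    if current:
        yield bytes(current)


def udp_fields(frame):
    """Return (src_port, dst_port, payload) using fixed header offsets.

    Assumes a 14-byte Ethernet header and a 20-byte IPv4 header, so the
    UDP header starts at 0x22 and the payload at 0x2a.
    """
    src_port, dst_port = struct.unpack_from("!HH", frame, 0x22)
    return src_port, dst_port, frame[0x2a:]
```

Frames with VLAN tags, IPv4 options or IPv6 would shift these offsets, so a real parser would need to read the header length fields rather than hard-code them.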

ipython redirect stdout display corruption

I'm developing a system in python, and one functionality I need is the ability to have console output go to both the console and a user-specified file. This is replicating the Diary function in MATLAB. I have the following that works perfectly well on both IDLE on windows and python cmdline in ubuntu (this all exists inside a module that gets loaded):
import sys

this_diary = None  # module-level handle to the active diary, if any

class diaryout(object):
    def __init__(self):
        self.terminal = sys.stdout
        self.save = None
    def __del__(self):
        try:
            self.save.flush()
            self.save.close()
        except:
            # do nothing, just catch the error; maybe it self was instantiated, but never opened
            pass
        self.save = None
    def dclose(self):
        self.__del__()
    def write(self, message):
        self.terminal.write(message)
        self.save.write(message)
    def dopen(self,outfile):
        self.outfile = outfile
        try:
            self.save = open(self.outfile, "a")
        except Exception, e:
            # just pass out the error here so the Diary function can handle it
            raise e

def Diary(outfile = None):# NEW TO TEST
    global this_diary
    if outfile == None:
        # None passed, so close the diary file if one is open
        if isinstance(this_diary, diaryout):
            sys.stdout = this_diary.terminal # set the stdout back to stdout
            this_diary.dclose() # flush and close the file
            this_diary = None # "delete" it
    else:
        # file passed, so let's open it and set it for the output
        this_diary = diaryout() # instantiate
        try:
            this_diary.dopen(outfile) # open & test that it opened
        except IOError:
            this_diary = None # must uninstantiate it, since already did that
            raise IOError("Can't open %s for append!"%outfile)
        except TypeError:
            this_diary = None # must uninstantiate it, since already did that
            raise TypeError("Invalid input detected - must be string filename or None: %s"%Diary.__doc__)
        sys.stdout = this_diary # set stdout to it
I'm using IPython, which is far superior to both IDLE and the plain Python command line, and this is where my problem lies. I can turn on the "diary" with no error, but the display on the console gets messed up. The attached screenshot shows this. The output file becomes similarly garbled. Everything goes back to normal when I undo the redirection with Diary(None). I have tried editing the code so that it never even writes to the file, with no effect. It almost seems as if something is forcing an unsupported character set, or something else I don't understand.
Anyone have an idea about this?
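For reference, a replacement for sys.stdout is often written so that it forwards not only write but also flush and any other attribute to the real stream. The sketch below shows that pattern in Python 3 (the code above is Python 2). Whether missing methods or attributes on the replacement object are what IPython trips over here is an assumption and is not confirmed anywhere in this thread; the names Tee and diary are made up for the example.

```python
import sys
from contextlib import contextmanager


class Tee:
    """File-like wrapper that copies everything written to two streams."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def write(self, message):
        self.primary.write(message)
        self.secondary.write(message)
        return len(message)

    def flush(self):
        self.primary.flush()
        self.secondary.flush()

    def __getattr__(self, name):
        # Anything not defined here (encoding, isatty, ...) is answered
        # by the primary stream, so callers see a "normal" stdout.
        return getattr(self.primary, name)


@contextmanager
def diary(path):
    """Temporarily send stdout to both the console and `path` (append mode)."""
    original = sys.stdout
    with open(path, "a") as log:
        sys.stdout = Tee(original, log)
        try:
            yield
        finally:
            sys.stdout = original
```

Usage would look like `with diary("session.log"): print("hello")`, after which sys.stdout is the original stream again.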

How to get netmask?

I know how to get it from ifconfig (on Linux).
But is there another way? Can it be found via Socket?
You need to use IO#ioctl. This is totally non-portable. On my Linux box this code works:
require 'socket'
sock = Socket.new(Socket::AF_INET, Socket::SOCK_DGRAM,0)
buf = ["eth0",""].pack('a16h16')
sock.ioctl(0x891b, buf) # 0x891b is SIOCGIFNETMASK on Linux
# read the four address bytes as integers (works on Ruby 1.8 and 1.9+)
netmask = buf[20, 4].unpack('C4').join('.') #=> "255.255.255.240"
Ioctl differs considerably between systems, and I had to look through a few system header files to get the right sizes for the [].pack, the location of the address in buf, and the numeric value for SIOCGIFNETMASK (the first argument to ioctl).
If these values don't work for you I can give you more information on how to find them.
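The same ioctl call can be sketched in Python, which may make the buffer layout easier to follow. This is a Linux-only illustration under the same assumptions as the Ruby code: the request number 0x891b and the position of the address at bytes 20 to 23 of the buffer are taken from that code and may differ on other systems. The helper names are invented for this sketch, and the interface name passed in (such as "eth0") has to exist on the machine.

```python
import fcntl
import socket
import struct

SIOCGIFNETMASK = 0x891B  # Linux value, same number as in the Ruby code above


def pack_ifreq(ifname):
    """Build the request buffer: interface name, zero-padded."""
    # 256 bytes is more than struct ifreq needs; the kernel only touches
    # the start. Names are limited to 15 bytes plus a terminating NUL.
    return struct.pack("256s", ifname.encode()[:15])


def netmask_from_ifreq(buf):
    """Extract the dotted-quad address stored at bytes 20..23."""
    # Layout assumed: 16-byte name, 2-byte family, 2-byte port, 4-byte address.
    return socket.inet_ntoa(buf[20:24])


def get_netmask(ifname):
    """Ask the kernel for the IPv4 netmask of interface `ifname`."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        reply = fcntl.ioctl(sock.fileno(), SIOCGIFNETMASK, pack_ifreq(ifname))
    return netmask_from_ifreq(reply)
```

Calling get_netmask with an interface that does not exist or has no IPv4 address raises OSError.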
