How is data divided into packets? - network-programming

Hi, sorry if this is a stupid question (I just started learning network programming), but I've been looking all over Google for how files/data are divided into packets. I've read everywhere that files are somehow broken up into packets, have headers/footers applied as they go through the OSI model, and are sent over the wire, where the recipient basically does the reverse and removes the headers.
My question is how exactly are files/data broken up into packets and how are they reassembled at the other end?
How does whatever is doing the reassembling know when the last packet of the data has arrived, etc.?
Is it possible to reassemble packets captured from another machine? And if so how?
(Also, if it means anything, I'm mostly interested in how this works for TCP packets.)
I also have some packets captured from an application on my computer with Wireshark. They're labeled as TCP, and what I want to do is reassemble them back into the original data, but how can you tell which packets belong to which set of data?
Any pointers towards resources are much appreciated, thank you!

My question is how exactly are files/data broken up into packets
What's being sent over a network isn't necessarily a file. In the cases where it is a file, there are several different protocols that can send files, and the answer to the question depends on the protocol.
For FTP and HTTP, the entire contents of the file is probably being sent as a single data stream over TCP (preceded by headers in the case of HTTP, and just raw, over the connection, in the case of FTP).
For TCP, there's a "maximum segment size" negotiated by the client and server, based on factors such as the maximum packet size on the various networks between the server and client, and the file data is sent, sequentially, in chunks whose size is limited by the maximum packet size and the size of IP and TCP headers.
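As a rough illustration of the chunking described above, here is a minimal Python sketch. It assumes a 1500-byte Ethernet MTU and 20-byte IP and TCP headers with no options; real connections may use other values, and the function name split_into_segments is made up for this example.

```python
# Illustrative sizes only: a 1500-byte MTU and option-free headers are assumptions.
MTU = 1500
IP_HEADER = 20
TCP_HEADER = 20
MSS = MTU - IP_HEADER - TCP_HEADER  # payload bytes that fit in one segment


def split_into_segments(data: bytes, mss: int = MSS) -> list[tuple[int, bytes]]:
    """Return (byte offset, payload) pairs, each payload at most mss bytes long."""
    return [(offset, data[offset:offset + mss]) for offset in range(0, len(data), mss)]


file_data = bytes(4000)  # stand-in for the contents of a file
segments = split_into_segments(file_data)
for offset, payload in segments:
    print(f"offset={offset:5d} length={len(payload)}")
```

The last chunk is simply shorter than the others; nothing requires the data to be a multiple of the segment size.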
For remote file access protocols such as SMB, NFS, and AFP, what goes over the wire are "file read" and "file write" requests; the reply to a "file read" request includes some reply headers and, if the read is successful, the chunk of file data that the read request asked for, and a "file write" request includes some request headers and the chunk of file data being written. Those are not guaranteed to be an entire file, in order, but if the program reading or writing the file is reading or writing the entire file in sequential order, the entire file's data will be available. The packet sizes will depend on the size of the read reply/write request headers and on the read or write size being used; those packets might be broken into multiple TCP segments, based on the TCP "maximum segment size" and the size of the IP and TCP headers.
How does whatever is doing the reassembling know when the last packet of the data has arrived?
For FTP, the recipient of the data knows that there is no more data when the side of the TCP connection over which the data is being transmitted is closed.
For HTTP, the recipient of the data knows that there is no more data when the side of the TCP connection over which the data is being transmitted is closed or, if the connection is "persistent" (i.e., it remains open for more requests and replies), when the amount of data specified by the "Content-Length:" header, sent before the data, has been transmitted (or by other similar mechanisms, such as the "last chunk" indication for chunked encoding).
For file access protocols, there's no real "we're at the end of data" indication; the closest approximation, for SMB, AFP, and NFSv4, is a "file close" operation.
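To make the HTTP case concrete, here is a hedged Python sketch of a reader that stops either when "Content-Length:" bytes have arrived or when the peer closes the connection. The helper name read_http_body is hypothetical, chunked encoding is deliberately left out, and a local socket pair stands in for a real client/server connection.

```python
import socket


def read_http_body(sock: socket.socket) -> bytes:
    """Read one HTTP response and return its body.

    Uses Content-Length when present; otherwise reads until the peer closes.
    Chunked encoding is not handled in this sketch.
    """
    buffer = b""
    while b"\r\n\r\n" not in buffer:           # the headers end at the blank line
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("connection closed before headers were complete")
        buffer += chunk
    head, _, body = buffer.partition(b"\r\n\r\n")

    length = None
    for line in head.split(b"\r\n")[1:]:        # skip the status line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())

    while length is None or len(body) < length:
        chunk = sock.recv(4096)
        if not chunk:                            # peer closed its side
            break
        body += chunk
    return body if length is None else body[:length]


# A local socket pair stands in for a real connection.
server, client = socket.socketpair()
server.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 11\r\n\r\nhello ")
server.sendall(b"world")
received = read_http_body(client)
print(received)
server.close()
client.close()
```

Note that the reader never looks at packet boundaries: the body may arrive in one read or in several, and only the byte count (or the close) marks the end.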
Is it possible to reassemble packets captured from another machine? And if so how?
It depends on the protocol, but, for HTTP and SMB, if the capture has been read into Wireshark (and all the file data is in the capture!), you can use the "Export Objects" menu, and, for some protocols, you might also be able to use tcpflow.

My question is how exactly are files/data broken up into packets and how are they reassembled at the other end?
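They are basically just chopped up. Each internet packet (with header info added) can only hold a limited amount of actual data, typically no more than about 1,460 bytes of TCP payload on an Ethernet network.
How does whatever is doing the reassembling know when the last packet of the data has arrived, etc.?
For a TCP transfer, every segment carries a sequence number giving the position of its first byte in the stream, so the receiving side knows how to put the data back together in order. If a segment is lost, the receiver's acknowledgments stop advancing and the sender retransmits it.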
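The reordering idea can be sketched in a few lines of Python. This is a simplified model, not how any particular TCP stack is written: it assumes each captured segment is available as a (sequence number, payload) pair, ignores sequence-number wraparound, and the function name reassemble is invented for the example.

```python
def reassemble(segments: list[tuple[int, bytes]], initial_seq: int) -> tuple[bytes, int]:
    """Rebuild a byte stream from (sequence number, payload) pairs.

    Returns the contiguous data starting at initial_seq and the next sequence
    number still expected (where a gap, if any, begins).
    """
    stream = bytearray()
    expected = initial_seq
    for seq, payload in sorted(segments, key=lambda s: s[0]):
        if seq > expected:
            break                            # gap: a segment is missing
        overlap = expected - seq             # bytes already seen (retransmission)
        if overlap < len(payload):
            stream += payload[overlap:]
            expected = seq + len(payload)
    return bytes(stream), expected


# Out-of-order arrival plus one duplicate, as might appear in a capture.
captured = [(1006, b"world"), (1000, b"hello "), (1000, b"hello "), (1011, b"!")]
data, next_seq = reassemble(captured, initial_seq=1000)
print(data, next_seq)
```

If a segment in the middle is absent, the function returns only the data before the gap, which mirrors why a receiver cannot deliver later bytes until the missing segment is resent.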
Is it possible to reassemble packets captured from another machine? And if so how?
I don't understand the question. How would you get these packets unless you were a man-in-the-middle?
These answers are true for TCP packets.

First, decide what size you want each transmission to be.
Then put a header, the data, and a footer in each transmission.
Make sure the buffer length is evenly divisible by the number of packets, so that no packet ends up with a fractional share of the data.
The header should contain the frame number, a time stamp, and the packet number.
After the header comes the payload data.
The footer can hold your company information.
Prepare the data fragments before sending.
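A minimal Python sketch of this kind of application-level framing follows. The field layout (frame number, time stamp, packet number, payload length), the fixed FOOTER bytes, and the function names are all assumptions made for illustration; they are not part of any standard protocol.

```python
import struct
import time

# Hypothetical layout: frame number (uint32), time stamp (double),
# packet number (uint16), payload length (uint16), all in network byte order.
HEADER = struct.Struct("!IdHH")
FOOTER = b"END!"  # hypothetical fixed trailer standing in for the footer


def make_packets(frame_number: int, data: bytes, payload_size: int) -> list[bytes]:
    """Split data into packets of header + payload + footer."""
    timestamp = time.time()
    packets = []
    for number, start in enumerate(range(0, len(data), payload_size)):
        payload = data[start:start + payload_size]
        header = HEADER.pack(frame_number, timestamp, number, len(payload))
        packets.append(header + payload + FOOTER)
    return packets


def parse_packet(packet: bytes) -> tuple[int, int, bytes]:
    """Return (frame number, packet number, payload) from one packet."""
    frame_number, _timestamp, number, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    return frame_number, number, payload


packets = make_packets(frame_number=7, data=b"x" * 2500, payload_size=1000)
rebuilt = b"".join(parse_packet(p)[2] for p in packets)
print(len(packets), len(rebuilt))
```

Because this sketch stores the payload length in the header, the last packet may be shorter than the others, so the data does not strictly have to divide evenly.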

Related

Can I remove a packet payload inside a .p4 program?

I would like to know if it's possible to completely remove a packet payload from a packet inside a .p4 program or at least modify it to random data. The reason behind this is that I'm cloning a packet and sending it to a different host (monitor) and this host does not need the packets payload.
It depends on what you are trying to do. If you would like to remove some kind of header, it's enough to call
hdr.random_header.setInvalid()
If you call that in egress, it should remove the header's fields from the packet.
If you have length fields in the headers, you might also use
truncate(new_size)
when you know the size of the packet without the payload. If you already know an easier option, please share it here.

Google Nearby connections - Not able to transfer large bytes between 2 devices

When I try to send an object with multiple images (converted to strings using Base64) as a STREAM payload, I can see a "Failure" result in the onPayloadTransferUpdate() method, and the devices (tested only with 2 devices connected) automatically disconnect after that. Is Google Nearby Connections not the right option for sending large amounts of data?
Nearby Connections should be able to handle that. There's no explicit size limit on STREAM payloads.
I would suggest chunking the bytes (eg. send a couple KB at a time) and seeing if that helps. You can get into weird situations when you send entire files at once because it loads the bytes into memory twice (once inside your app, and once inside the Nearby process) which can cause out of memory errors. Binder, the interprocess communication layer on Android, also has a limited buffer to send data between processes.
You can also save it as a temporary file and send it as a FILE payload, in which case we will handle the chunking for you.
Disclaimer: I work on Nearby Connections.
1) You don't need to Base64-encode the data for the sake of Nearby Connections -- your STREAM can have raw binary data, and that'll work just fine.
2) How big is this data you're sending, and at what byte offset (you can see this in the PayloadTransferUpdate you get with Status.ERROR) does it fail at? It sounds like your devices are just getting disconnected.
3) What Strategy are you using?
4) If you still have discovery ongoing (i.e. you haven't called stopDiscovery()), try stopping that and then sending your Payload -- discovery is a heavyweight operation that can make it hard to reliably maintain connections between devices for long intervals.

Missing bytes on IdUDPServer.OnRead event in buffer array - Delphi XE3

I can't seem to find any information about this anywhere, but does the TIdUDPServer.OnRead event pass everything that comes in to the AData array or not?
According to Wireshark, I'm missing 42 bytes of data: while I should be getting 572 bytes on each read, the AData size is always 530, and it seems that the same bytes are always missing.
The device that sends data is broadcasting it, and I can get everything I need except for 2 bytes, which seems to be 2 of those that are missing.
Any hints on this one?
Edit:
I should mention that these are the very first 42 bytes; everything afterwards is received fine.
The OnUDPRead event passes everything the socket receives from the OS. UDP operates on messages. Unlike TCP, a UDP read is an all-or-nothing operation: either a whole UDP message is read or an error occurs; there is no in-between.
If you are missing data, then either the OS is not providing it (such as if it belongs to the UDP and/or IP headers), or you are not reading data from the AData parameter correctly. If you think this is not the case, then you need to update your question to show your actual OnUDPRead handler code, an example Wireshark dump showing the data being captured from the network, and the data that is making it to your OnUDPRead handler.
Update: The OS does not provide access to the packet headers (unless you are using a RAW socket, which TIdUDPServer does not use, but that is a whole other topic of discussion). The AData parameter of the OnUDPRead event provides only the application data portion of a packet, as that is what the OS provides. You cannot access the packet headers.
That being said, you can get the packet's source IP:Port, at least, via the ABinding.PeerIP and ABinding.PeerPort properties of the OnUDPRead event. However, there is no way to retrieve the other packet header values (nor should you ever need them in most situations), unless you sniff the network yourself, such as with a pcap library.
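The numbers in the question are consistent with the missing bytes being exactly the packet headers. A quick check, assuming an untagged Ethernet frame and an IPv4 header without options (both assumptions, since the capture itself is not shown):

```python
ETHERNET_HEADER = 14   # destination MAC, source MAC, EtherType
IPV4_HEADER = 20       # minimum IPv4 header, no options
UDP_HEADER = 8         # source port, destination port, length, checksum

frame_on_wire = 572    # frame size reported by Wireshark in the question
headers = ETHERNET_HEADER + IPV4_HEADER + UDP_HEADER
payload = frame_on_wire - headers
print(headers, payload)
```

Under those assumptions the headers add up to the 42 leading bytes, and the remainder matches the 530 bytes delivered in AData.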

How to know whether this is the last packet sent from a bulletin board system

I am writing a bulletin board system (BBS) reader on iOS. I use the GCDAsyncSocket library to handle sending and receiving packets. The issue I have is that the server always splits the data it sends into multiple packets. I can see that happening by printing out the received string in the didReceiveData() function.
From the GCDAsyncSocket readme, I understand that TCP is a stream. I also know there are some end-of-stream mechanisms, such as double CR LFs at the end. I have used Wireshark to parse the packets, but there is no sign of any pattern in the last data packet. The site is not owned by me, so I can't make it send certain bytes. There must be some way to detect the last packet; otherwise, how do BBS clients handle displaying data?
Double CR LFs are not end of stream. That is just part of the details of HTTP protocol, for example, but has nothing to do with closing the stream. HTTP 1.1 allows me to send multiple responses on a single stream, with double CR LF after HTTP header, without end of stream.
The TCP socket stream will return 0 on a read when it is closed from the other end, blocking or non-blocking.
So, assuming the server closes the socket when it is done sending to you, you can loop on a blocking read: if it returns > 0, process the data and read again; if it returns < 0, handle the error code (which may or may not be fatal); and if it returns 0, the socket has been closed from the other side, so don't read again.
For non-blocking sockets, you can use select() or some other API to detect when the stream becomes ready to read. I'm not familiar with the specific lib you are using but if it is a POSIX / Berkeley sockets API, it will work that way.
In either case, you should build a buffer of input, concatenating the results of each read until you are ready to process. As you've found, you can't assume that a single read will return a complete application level packet. But as to your question, unless the server wants you to close the socket, you should wait for read to return 0.
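The read-until-closed loop looks roughly like this in Python, where a closed connection shows up as recv() returning an empty bytes object and errors are raised as OSError rather than returned as a negative value. The helper name read_until_closed is made up, and a local socket pair stands in for the BBS server.

```python
import socket


def read_until_closed(sock: socket.socket, bufsize: int = 4096) -> bytes:
    """Accumulate data from a blocking socket until the peer closes it."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)   # may return any amount of the pending data
        if not chunk:                # empty result: the other side closed
            break
        chunks.append(chunk)
    return b"".join(chunks)


# A local socket pair stands in for the server and the client.
sender, receiver = socket.socketpair()
sender.sendall(b"first part, ")
sender.sendall(b"second part")
sender.close()                       # this is what ends the loop on the reader
message = read_until_closed(receiver)
receiver.close()
print(message)
```

How many recv() calls it takes to collect the data is not something the reader can rely on, which is why the chunks are concatenated before processing.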

IOS NSInputStream

I have a problem when using NSInputStream.
I have a client app that connects to a server, after which the server starts sending messages to my client app over TCP, about one message per second. The server is just broadcasting messages to clients, and each message is in XML format. The server sends each message as one packet.
Now the problem is that when I read bytes from NSInputStream, the data gets truncated: instead of receiving one complete message, from time to time I get two separate pieces of data (partial XML). I am not able to debug this because it has already happened by the time I read the bytes from NSInputStream.
I used Wireshark to analyse every packet I receive, and when this happens the data is truncated there too; because it is TCP, the partial data is retransmitted to my client.
I have tried logging every partial piece of data; the sum of the partial pieces is always around 1600 bytes.
I have no idea how the server side was designed and implemented, but I do know that many people connect to that server and continuously receive broadcast messages from it.
Has anyone encountered this problem? Can anyone help? Is it possible that the data exceeds the maximum size and gets split?
This is not a problem per se. It is part of the design of TCP and also of NSInputStream. You may receive partial messages. It's your job to deal with that fact, wait until you receive a full message, and then process the completed message.
1600 bytes is a little strange. I would expect 1500 bytes, since that's the standard Ethernet MTU (or, more precisely, somewhere around 1460, which is what is left of a 1500-byte MTU after the IP and TCP headers). Or I might expect a multiple of 1k or 4k due to buffering in NSInputStream. But none of that matters. You have to deal with the fact that you will not necessarily get messages at their boundaries.
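One common way to deal with this is to buffer whatever each read returns and only hand off complete messages. The Python sketch below shows the idea in a language-neutral way; it assumes every XML message ends with a known closing tag, and the tag `</message>` and the class name MessageBuffer are hypothetical, since the server's actual format is not given.

```python
class MessageBuffer:
    """Collects stream data and yields only complete, terminated messages."""

    def __init__(self, terminator: bytes):
        self._terminator = terminator
        self._buffer = bytearray()

    def feed(self, data: bytes) -> list[bytes]:
        """Add newly read bytes and return every message completed so far."""
        self._buffer += data
        messages = []
        while True:
            end = self._buffer.find(self._terminator)
            if end == -1:
                break                          # no full message yet; keep waiting
            end += len(self._terminator)
            messages.append(bytes(self._buffer[:end]))
            del self._buffer[:end]             # keep any leftover partial message
        return messages


# Three reads whose boundaries do not line up with the message boundaries.
reads = [b"<message>hel", b"lo</message><mess", b"age>bye</message>"]
buf = MessageBuffer(b"</message>")
complete = []
for data in reads:
    complete.extend(buf.feed(data))
print(complete)
```

The same structure applies to length-prefixed protocols: only the test for "is a full message in the buffer yet" changes.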
