I have a huge pcap file and want to measure Facebook usage in terms of data transferred (upload and download). I am using Wireshark to read the file. According to a question on Stack Overflow, there are many fields that can be used to count bytes:
frame.len == 243
ip.len == 229
udp.length == 209
data.len == 201
I have tested frame.len and ip.len, and they give different results. Which one should I consider correct? I am new to networking terminology and just need to find the correct amount of data transferred.
Here is what happens when you connect to a server and request a simple page:
The server application generates the requested data (e.g. the string <body>Hello world</body>) and passes it to the HTTP layer
The HTTP layer generates the necessary header according to the RFC (HTTP version, status code, content type etc.), prepends it to the generated data and passes everything to the TCP layer
The TCP layer may break the data into more than one piece (not in our case, since the message is small enough to fit in one segment) and prepends the transport-layer info to each piece (src/dst port numbers, sequence number, some flags, checksum etc.), then passes it to the IP layer
The IP layer prepends the info needed for routing (source/destination addresses, TTL and other fields), then passes it to the lower layer (e.g. Ethernet)
Ethernet adds its part (MAC addresses, maybe VLAN tags) and pushes everything to the physical device
The resulting data is sent byte by byte from the server's NIC to the network
So the answer is actually up to you. What do you want to measure? Is it "the data I need to display, excluding all auxiliary info"? Or is it "the total number of bytes I need to send/receive to get this lovely cat picture"? Here is a list of fields that give the size of each part:
To get the data length only (as a string, unfortunately): http.content_length_header == "606"
To get the (data + HTTP header) length: tcp.len == 973
To get (data + HTTP + TCP + IP layers): ip.len == 1013
To get every byte sent: frame.len == 1027
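Subtracting neighbouring fields shows how much each layer adds. A small sketch using the numbers from the list above; the function and key names are only illustrative, and reading the 40-byte gap as 20 bytes of IPv4 header plus 20 bytes of TCP header assumes neither header carries options:

```python
def layer_overhead(frame_len, ip_len, tcp_len, content_length):
    """Return the number of bytes contributed by each layer of one packet."""
    return {
        "link_header": frame_len - ip_len,        # e.g. Ethernet header
        "ip_tcp_headers": ip_len - tcp_len,       # IP header + TCP header
        "http_header": tcp_len - content_length,  # HTTP status line and headers
        "body": content_length,                   # the data itself
    }


# Values of frame.len, ip.len, tcp.len and the Content-Length from the list above.
overhead = layer_overhead(1027, 1013, 973, 606)
for layer, size in overhead.items():
    print(f"{layer:15} {size:5} bytes")
```

For this packet only 606 of the 1027 bytes on the wire are page content, which is why the choice of field matters.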
If you want to measure bandwidth occupation, use frame.len. If you're interested in "pure site weight", it should be independent of the environment, so use http.content_length_header. Things can get more complicated at the higher layers, considering the following:
The era of HTTPS means you can't easily observe HTTP content in traces, so tcp.len might be the highest layer you can measure
Some data (e.g. audio, video) is transferred over a different protocol stack (e.g. IP - UDP - RTP)
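For a capture too large to handle comfortably in the GUI, the frame.len total can also be computed directly from the file. A minimal sketch, assuming a classic pcap file with microsecond timestamps (not pcapng); the function name is made up for illustration, and it counts every packet rather than only those exchanged with one particular site:

```python
import struct


def total_frame_bytes(stream):
    """Sum the original on-the-wire length of every packet in a classic pcap stream."""
    header = stream.read(24)  # the pcap global header is 24 bytes
    if len(header) < 24:
        raise ValueError("truncated pcap global header")
    magic = header[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written on a big-endian machine
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")

    total = 0
    while True:
        record = stream.read(16)  # per-packet header: ts_sec, ts_usec, incl_len, orig_len
        if len(record) < 16:
            break
        _sec, _usec, incl_len, orig_len = struct.unpack(endian + "IIII", record)
        stream.seek(incl_len, 1)  # skip the captured bytes themselves
        total += orig_len         # orig_len is what Wireshark shows as frame.len
    return total
```

Usage would be along the lines of `with open("capture.pcap", "rb") as f: print(total_frame_bytes(f))`, where the file name is a placeholder.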
I have read a lot of specifications and still can't work out a simple thing.
All UDS requests are encapsulated in ISO-TP packets, which are in turn encapsulated in plain CAN frames, so the ECU constantly receives a stream of frames from the CAN bus.
How does the ECU decide that a given CAN frame is part of a high-level protocol?
For example, if I send a Security request to the ECU, the CAN frame data will look like this:
02 27 01
How does ECU determine that this is not just a chunk of data but a part of the protocol?
I wasn't able to find any relation to the ISO/OSI stack, where high-level protocols "talk to each other" using headers so that we know how to decode data packets.
The CAN message IDs that are used for specific protocols are defined per system.
In most cases OBD-II queries are sent on CAN ID 7DFh, with higher IDs used for responses from the different modules, but even that might differ on specific car models.
One way of figuring out the CAN IDs that are used for UDS-based communication is to send simple tester-present (SID 3Eh) messages and watch for CAN IDs that seem to carry an appropriate response.
UDS via CAN is specified in the DoCAN part, ISO 15765-2, which describes the network and transport layer for functional (broadcast) and physical (point-to-point) communication between ECUs, or rather between control functions.
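Once the CAN ID has marked a frame as diagnostic traffic, the first payload byte tells the ECU how to read the rest. A minimal sketch of decoding the 02 27 01 frame from the question, assuming classic CAN with normal addressing and an ISO-TP single frame; the function name is illustrative only:

```python
def decode_single_frame(data):
    """Decode an ISO-TP single frame and return (service_id, parameters)."""
    pci = data[0]            # protocol control information byte
    if pci >> 4 != 0:        # high nibble 0 means "single frame"
        raise ValueError("not a single frame")
    length = pci & 0x0F      # low nibble is the payload length
    if length == 0 or length > 7 or length > len(data) - 1:
        raise ValueError("invalid single-frame length")
    payload = bytes(data[1:1 + length])
    return payload[0], payload[1:]


sid, params = decode_single_frame(bytes.fromhex("022701"))
print(hex(sid), params.hex())  # service 0x27 is SecurityAccess; 01 is the sub-function
```

So 02 is not UDS data at all: it is the ISO-TP header saying "single frame, two bytes follow", and only 27 01 reaches the UDS layer.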
Normal CAN IDs don't implement any network functionality such as addressing. For that purpose the SAE J1939 network layer is used. In a J1939 network each CAN client has an address and each function has a parameter group number (PGN). All of this is encoded in a 29-bit CAN ID. Take the CAN ID 0x18EF8081 as an example: it transports a message from CAN client 0x81 to 0x80 via the PGN 0xEF, and 0x18 carries the priority.
In UDS over CAN, the PGNs 0xDA (physical) and 0xDB (functional) are used for all communication. With that information you can implement a CAN ID filter that matches only the PGN part of the CAN ID.
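The ID layout described above can be unpacked with a few shifts. A sketch under the assumption of the usual J1939-style layout (3 priority bits, two page bits, then one byte each for PGN/format, target and source); the function and field names are made up for illustration:

```python
UDS_PHYSICAL = 0xDA
UDS_FUNCTIONAL = 0xDB


def decode_29bit_id(can_id):
    """Split a 29-bit CAN ID into its J1939-style fields."""
    return {
        "priority": (can_id >> 26) & 0x7,  # top 3 bits; 0x18 in the top byte means priority 6
        "pgn_byte": (can_id >> 16) & 0xFF,
        "target": (can_id >> 8) & 0xFF,
        "source": can_id & 0xFF,
    }


def is_uds_id(can_id):
    """Filter that looks only at the PGN part of the ID."""
    return decode_29bit_id(can_id)["pgn_byte"] in (UDS_PHYSICAL, UDS_FUNCTIONAL)


print(decode_29bit_id(0x18EF8081))
print(is_uds_id(0x18DA10F1))  # hypothetical physical request from 0xF1 to 0x10
```

The same check could be expressed as a hardware acceptance filter by masking the ID with 0x00FF0000 and comparing against 0x00DA0000 or 0x00DB0000.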
I basically understand the concept of PDO mapping in CANopen networks. It allows real-time data to be broadcast with a small header.
But how is it done? How do I set up my devices so that they know how to send/receive PDOs? Do I need some kind of software for that?
A lot of the answers to your questions depend on the specific devices you are using, but in general...
Do I need some kind of software for that?
You do not need specialized software to configure a CANopen device. Devices can be configured over the CAN bus using SDOs. A USB CAN dongle is more than sufficient, although manually constructing SDOs is tedious. Companies such as Vector provide software to configure any CANopen device. Vendors often provide a specialized GUI to configure their devices, e.g. AMC's DriveWare. If one is available you should probably use it.
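To give an idea of what "manually constructing SDOs" involves, here is a sketch that builds the CAN frame for an expedited SDO write. It assumes the default SDO request COB-ID of 0x600 + node ID; the function name, the node ID 5 and the value 0xFF are illustrative choices, not taken from any particular device:

```python
import struct


def sdo_expedited_write(node_id, index, subindex, value, size):
    """Return (cob_id, payload) for an expedited SDO download of 1 to 4 bytes."""
    if size not in (1, 2, 3, 4):
        raise ValueError("expedited transfers carry 1 to 4 data bytes")
    # 0x23 = download request, expedited, size indicated; bits 2-3 hold the unused byte count.
    command = 0x23 | ((4 - size) << 2)
    data = value.to_bytes(size, "little").ljust(4, b"\x00")
    payload = struct.pack("<BHB", command, index, subindex) + data
    return 0x600 + node_id, payload


# Write an UNSIGNED8 to index 0x1800, sub-index 0x02 of node 5.
cob_id, payload = sdo_expedited_write(5, 0x1800, 0x02, 0xFF, 1)
print(hex(cob_id), payload.hex(" "))
```

Sending the frame and checking the device's response is left to whatever CAN interface library the dongle comes with.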
How is it made?
PDOs (Process Data Objects), in contrast to SDOs (Service Data Objects), do not include metadata about the contents of the message, and TPDOs may be transmitted without a specific request from the master. This allows PDOs to use the bus more efficiently. The trick is that the contents of PDO messages must be agreed upon ahead of time. This agreement is specified using the PDO Communication Parameters and PDO Mapping Parameters entries of your device's Object Dictionary. How they may be configured, or whether they can be configured at all, is device dependent. Most commonly PDOs can be configured at run time through SDOs while the device is in pre-operational mode, though this may be (and is likely to be) unnecessary if the defaults provided by your device are sufficient.
The contents of a PDO are configured through its corresponding "Mapping Parameters" in the device's Object Dictionary. TPDO mapping parameters start at index 0x1A00. TPDO0 corresponds to 0x1A00, TPDO1 to 0x1A01 etc.
The mappings are held in the sub-indexes and are encoded as 32-bit unsigned integers. The format is first the 16-bit index, then the 8-bit sub-index, and lastly the size in bits of the parameter to use. The granularity of the size is device dependent. Some devices may only provide byte-level granularity. E.g. if you had a REAL32 variable in the object dictionary at 0x2000,0x02 that you wanted sent as the only parameter of TPDO0, you would set 0x1A00,0x01 to 0x20000220. RPDOs are configured in the same fashion with their indexes starting at 0x1600.
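The mapping word described above is just three fields packed into 32 bits. A short sketch that reproduces the 0x20000220 example (the function names are illustrative):

```python
def encode_pdo_mapping(index, subindex, bit_length):
    """Pack a mapping entry: 16-bit index, 8-bit sub-index, 8-bit size in bits."""
    return (index << 16) | (subindex << 8) | bit_length


def decode_pdo_mapping(value):
    """Unpack a mapping entry into (index, subindex, bit_length)."""
    return (value >> 16) & 0xFFFF, (value >> 8) & 0xFF, value & 0xFF


# A REAL32 (32 bits) at 0x2000,0x02:
mapping = encode_pdo_mapping(0x2000, 0x02, 32)
print(hex(mapping))
```

Note that the last byte is the size in bits, so a 32-bit value ends in 0x20, not 0x04.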
The next piece of the puzzle is the communication parameters. RPDOs usually do not need to be configured in this fashion. TPDOs do need configuration. The indexes start at 0x1800 and correspond to the TPDOs in the same fashion as the mapping parameter indexes.
COBID (0x01) UNSIGNED32 Arbitration ID/COB-ID the PDO will use
XMIT_TYPE (0x02) UNSIGNED8 When the PDO is transmitted
INHIBIT_TIME (0x03) UNSIGNED16 Minimum time between PDO messages (in multiples of 100 microseconds)
EVENT_TIME (0x05) UNSIGNED16 Timeout for sending (milliseconds)
A PDO message uses the COB-ID from the associated TPDO communication parameters as its arbitration ID, and its payload is each of the mapped parameters from the TPDO mapping parameters, in order. For TPDOs the device does this internally and sends the message. For RPDOs the master builds and sends the PDO, and the device decodes it, writing each parameter to its Object Dictionary.
How do I setup my devices to know how to send/receive PDO's?
The default connection set includes four TPDOs (transmitted from node), and four RPDOs (received by node). More can be specified (up to 512 each) depending on your device.
PDOs are only transmitted/received when a CANopen node is brought into "Operational Mode". To do this you need to send an NMT (Network ManagemenT) start command (Code Specifier = 1). Using 0 for the node ID indicates a broadcast message that all nodes will respond to.
NMT Messages:
Have a COB-ID of 0
Have a payload of 2 bytes
NMT Message Format (CAN-bus payload):
+--------------------------+
| Code Specifier | Node ID |
+----------------+---------+
| ff | ff |
+----------------+---------+
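The two-byte NMT payload shown above is simple enough to build by hand. A sketch, with illustrative names, that produces the start command mentioned earlier (code specifier 1) for either one node or all nodes:

```python
NMT_COB_ID = 0x000  # NMT messages always use COB-ID 0
NMT_START = 0x01    # code specifier for "start remote node"


def nmt_message(code_specifier, node_id=0):
    """Return (cob_id, payload); node_id 0 addresses every node on the bus."""
    if not 0 <= node_id <= 127:
        raise ValueError("node ID must be in the range 0..127")
    return NMT_COB_ID, bytes([code_specifier, node_id])


print(nmt_message(NMT_START))       # broadcast: bring all nodes to Operational
print(nmt_message(NMT_START, 0x05)) # only node 5
```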
Well, even Embarcadero states that ReceiveLength is not guaranteed to return an accurate count of the bytes ready to read in the socket buffer. But if you look at the implementation, passing -1 to Socket.ReceiveBuf (which is what ReceiveLength wraps) calls ioctlsocket with FIONREAD to determine the amount of data pending in the network's input buffer that can be read from socket s.
So how is it unsafe or bad?
E.g.: ioctlsocket(Socket.SocketHandle, FIONREAD, Longint(i));
The documentation you mention specifically says (emphasis mine)
Note: ReceiveLength is not guaranteed to be accurate for streaming socket connections.
This means that the length is not known ahead of time because it's being supplied by a stream of data. Obviously, if you don't know how big the data is that's being sent ahead of time, you can't properly set the length the client should expect.
Consider it like generic code to copy a file. If you don't know ahead of time how big the file is you'll be copying, you can't predict how many bytes you'll be copying. In the case of the socket, the stream size that's supplying the socket isn't known in advance (for instance, for data being generated real-time and sent), so there's no way to inform the client socket how much to expect.
I am about to write a message protocol going over a TCP stream. The receiver needs to know where the message boundaries are.
I can either send 1) fixed length messages, 2) size fields so the receiver knows how big the message is, or 3) a unique message terminator (I guess this can't be used anywhere else in the message).
I won't use #1 for efficiency reasons.
I like #2 but is it possible for the stream to get out of sync?
I don't like idea #3 because it means receiver can't know the size of the message ahead of time and also requires that the terminator doesn't appear elsewhere in the message.
With #2, if it's possible to get out of sync, can I add a terminator or am I guaranteed to never get out of sync as long as the sender program is correct in what it sends? Is it necessary to do #2 AND #3?
Please let me know.
Thanks,
jbu
You are using TCP, so delivery is reliable. Either the connection drops or times out, or you will read the whole message.
So option #2 is ok.
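A minimal sketch of option #2, assuming a 4-byte big-endian length prefix (the prefix size and the function names are choices made for this example, not part of any standard). The important detail is that recv may return fewer bytes than requested, so the receiver has to loop:

```python
import socket
import struct


def send_message(sock, payload):
    """Send one message as a 4-byte big-endian length followed by the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv may return less than asked for."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("connection closed in the middle of a message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Receive one length-prefixed message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)
```

As long as both sides follow this pattern, message boundaries stay aligned for the lifetime of the connection; a real protocol would also cap the accepted length.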
I agree with sigjuice.
If you have a size field, it's not necessary to add an end-of-message delimiter; however, it's a good idea.
Having both makes things much more robust and easier to debug.
Consider using the standard netstring format, which includes both a size field and an end-of-string character.
Because it has a size field, it's OK for the end-of-string character to be used inside the message.
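For reference, a netstring is the decimal length, a colon, the bytes, and a trailing comma. A small sketch of an encoder and a decoder that works on an already-received buffer (the function names are illustrative):

```python
def encode_netstring(payload):
    """Encode bytes as <decimal length>:<payload>,"""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstring(buffer):
    """Decode one netstring from the front of buffer; return (payload, rest)."""
    head, sep, rest = buffer.partition(b":")
    if not sep or not head.isdigit():
        raise ValueError("missing or invalid length prefix")
    length = int(head)
    if len(rest) < length + 1:
        raise ValueError("incomplete netstring")
    if rest[length:length + 1] != b",":
        raise ValueError("missing trailing comma")
    return rest[:length], rest[length + 1:]


encoded = encode_netstring(b"hello, world")
print(encoded)
print(decode_netstring(encoded))
```

The payload here contains a comma, which is harmless because the decoder finds the end by counting bytes and only then checks that a comma follows.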
If you are developing both the transmit and receive code from scratch, it wouldn't hurt to use both length headers and delimiters. This would provide robustness and error detection. Consider the case where you just use #2. If you write a length field of N to the TCP stream, but end up sending a message which is of a size different from N, the receiving end wouldn't know any better and end up confused.
If you use both #2 and #3, while not foolproof, the receiver can have a greater degree of confidence that it received the message correctly if it encounters the delimiter after consuming N bytes from the TCP stream. You can also safely use the delimiter inside your message.
Take a look at HTTP Chunked Transfer Coding for a real world example of using both #2 and #3.
Depending on the level at which you're working, #2 may not actually have any issue with going out of sync (TCP has sequence numbers in its packets, and reassembles the stream in the correct order for you if data arrives out of order).
Thus, #2 is probably your best bet. In addition, knowing the message size early on in the transmission will make it easier to allocate memory on the receiving end.
Interesting there is no clear answer here. #2 is generally safe over TCP, and is done "in the real world" quite often. This is because TCP guarantees that all data arrives both uncorrupted* and in the order that it was sent.
*Unless corrupted in such a way that the TCP checksum still passes.
Answering an old question since there is something to correct:
Contrary to what many answers here claim, TCP does not guarantee that data arrives uncorrupted. Not even practically.
TCP has a 16-bit checksum (a ones' complement sum rather than a true CRC), so a corrupted segment has roughly a 1 in 65536 chance of still passing the check if more than one bit flips. That chance is small enough that you will never see it in tests, but if you are developing something that transmits large amounts of data or is used by very many end users, that die gets thrown trillions of times (not kidding, YouTube throws it about 30 times a second per user).
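To make the weakness concrete: the checksum TCP uses is the 16-bit ones' complement sum described in RFC 1071, and a plain sum cannot notice 16-bit words changing places. The sketch below computes the sum over raw bytes only (the real TCP checksum also covers a pseudo-header, which is left out here) and shows two different byte strings with the same checksum:

```python
def internet_checksum(data):
    """16-bit ones' complement checksum over data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:   # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = bytes.fromhex("0001f203f4f5f6f7")
swapped = bytes.fromhex("f2030001f4f5f6f7")  # first two 16-bit words exchanged

print(hex(internet_checksum(original)))
print(internet_checksum(original) == internet_checksum(swapped))
```

Reordered words are only one example; any corruption that leaves the sum unchanged passes unnoticed, which is the reason for adding a stronger check at the application level.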
Option 2: size field is the only practical option for the reasons you yourself listed. Fixed length messages would be wasteful, and delimiter marks necessitate running the entire payload through some sort of encoding-decoding stage to replace at least three different symbols: start-symbol, end-symbol, and the replacement-symbol that signals replacement has occurred.
In addition to this one will most likely want to use some sort of error checking with a serious checksum. Probably implemented in tandem with the encryption protocol as a message validity check.
As to the possibility of getting out of sync:
This is possible per message, but has a remedy.
A useful scheme is to start each message with a header. This header can be quite short (<30 bytes) and contain the message payload length, the expected checksum of the payload, and a checksum for that first portion of the header itself. Messages will also have a maximum length. Such a short header can also be delimited with known symbols.
Now the receiving end will always be in one of two states:
Waiting for new message header to arrive
Receiving more data to an ongoing message, whose length and checksum are known.
This way the receiver will in any situation get out of sync for at most the maximum length of one message. (Assuming there was a corrupted header with corruption in message length field)
With this scheme all messages arrive as discrete payloads, the receiver cannot get stuck forever even with maliciously corrupted data in between, the length of arriving payloads is known in advance, and a successfully transmitted payload has been verified by an additional, longer checksum, which has itself been verified. The overhead for all this can be a mere 26-byte header containing three 64-bit fields and two delimiting symbols.
(The header does not require replacement-encoding since it is expected only in a state without an ongoing message, and the entire 26 bytes can be processed at once.)
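One possible reading of the 26-byte header, sketched in code. Everything concrete here is an assumption made for illustration: the delimiter bytes, the field order, and the use of a truncated BLAKE2b digest as the 64-bit checksum are not specified in the description above.

```python
import hashlib
import struct

START, END = b"\x01", b"\x04"  # hypothetical delimiter symbols
HEADER_SIZE = 26               # 1 + 8 + 8 + 8 + 1 bytes


def checksum64(data):
    """64-bit checksum; a truncated BLAKE2b digest stands in for 'a serious checksum'."""
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def build_header(payload):
    """Header = START, payload length, payload checksum, checksum of those two, END."""
    body = struct.pack("!QQ", len(payload), checksum64(payload))
    return START + body + struct.pack("!Q", checksum64(body)) + END


def parse_header(header, max_length):
    """Validate a header and return (payload_length, payload_checksum)."""
    if len(header) != HEADER_SIZE or header[:1] != START or header[-1:] != END:
        raise ValueError("bad header framing")
    body = header[1:17]
    (header_sum,) = struct.unpack("!Q", header[17:25])
    if checksum64(body) != header_sum:
        raise ValueError("header checksum mismatch")
    length, payload_sum = struct.unpack("!QQ", body)
    if length > max_length:
        raise ValueError("declared length exceeds the maximum message size")
    return length, payload_sum
```

A receiver in the "waiting for header" state would read 26 bytes, call parse_header, and on failure discard input until it finds the next plausible header; after reading the payload it would compare checksum64(payload) with the value from the header.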
There is a fourth alternative: a self-describing protocol such as XML.
I need to calculate the total amount of data transferred when sending a fixed-size piece of data from client to server over TCP/IP. This includes connecting to the server, sending the request and headers, receiving the response, receiving data etc.
More precisely, how to get total data transfer while using POST and GET method?
Is there any formula for that? Even a theoretical one will do fine (not considering packet loss or connection retries etc)
FYI I tried RFC2616 and RFC1180. But those are going over my head.
Any suggestion?
Thanks in advance.
You can't know the total transfer size in advance, even ignoring retransmits. There are several things that will stop you:
TCP options are negotiated between the hosts when the connection is established. Some options (e.g., timestamp) add additional data to the TCP header
"total data transfer size" is not clear. Ethernet, for example, adds quite a few more bits on top of whatever IP used. 802.11 (wireless) will add even more. So do HDLC or PPP going over a T1. Don't even think about frame relay. Some links may use compression (which will reduce the total size). The total size depends on where you measure it, even for a single packet.
Assuming you're just interested in the total octet size at layer 2, and you know in advance which TCP options will be negotiated, you still can't know the path MTU, which may change even while the connection is in progress. Or if you're not doing path MTU discovery (which would be weird), the packet may get fragmented somewhere, and the remote end will see a different amount of data transferred than you do.
I'm not sure why you need to know this, but I suggest that:
If you just want an estimate, watch a typical connection in Wireshark. Calculate the percent overhead (vs. the size of data you gave to TCP, and received from TCP). Use that number to estimate: it will be close enough, except in pathological situations.
If you need to know for sure how much data your end saw transmitted and received, use libpcap to capture the packet stream and check.
I'd say that on average the request and the response have about 8 lines of headers each and about 30 characters per line. Then allow for the size increase from converting any uploaded binary to Base64.
You didn't say whether you also want to count TCP packet headers. If so, you could assume an MTU of about 1500 and add at least 20 bytes (the TCP header without options) per 1500 data bytes.
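A rough per-segment estimate along these lines can be written down directly. The defaults below are assumptions for this sketch only: a 1460-byte segment payload (1500-byte MTU minus headers), 20-byte TCP and IPv4 headers without options, and a 14-byte Ethernet header. Handshake packets, ACKs and retransmissions are ignored.

```python
import math


def estimate_wire_bytes(payload_bytes, mss=1460, tcp_header=20,
                        ip_header=20, link_header=14):
    """Estimate bytes on the wire for sending payload_bytes in one direction."""
    segments = math.ceil(payload_bytes / mss)
    return payload_bytes + segments * (tcp_header + ip_header + link_header)


payload = 1_000_000
wire = estimate_wire_bytes(payload)
print(wire, f"{100 * (wire - payload) / payload:.1f}% overhead")
```

Under these assumptions the header overhead for a bulk transfer comes out at a few percent; small requests are dominated by HTTP headers and connection setup instead.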
Finally, you could always setup a packet sniffer and count actual bytes for a sample of data.
Oh yeah, and you may need to allow for deflate/gzip encoding as well.