When parsing network data, is there any sort of procedure to determine what sort of frame you see in a packet?
For example, I can see some packets that are Ethernet II, or IEEE 802.3. Is there a way of figuring out which frame type you are looking at just by reading the bits?
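Yes — the two-byte field at offset 12 (immediately after the destination and source MAC addresses) settles it: a value of 0x0600 (1536) or greater is an EtherType, so the frame is Ethernet II; a value of 0x05DC (1500) or less is a payload length, so the frame is IEEE 802.3. A minimal sketch in Python, assuming the buffer starts at the destination MAC address:

```python
def classify_frame(frame: bytes) -> str:
    """Classify an Ethernet frame by its Type/Length field (bytes 12-13).

    >= 0x0600 means the field is an EtherType (Ethernet II);
    <= 0x05DC (1500) means it is a payload length (IEEE 802.3 + LLC).
    """
    type_or_length = int.from_bytes(frame[12:14], "big")
    if type_or_length >= 0x0600:
        return "Ethernet II"
    if type_or_length <= 0x05DC:
        return "IEEE 802.3"
    return "undefined"

# Example: 6-byte dst MAC, 6-byte src MAC, then 0x0800 (the IPv4 EtherType)
frame = bytes(12) + b"\x08\x00" + bytes(46)
print(classify_frame(frame))  # Ethernet II
```

Values between 1501 and 1535 are deliberately left undefined by the standard, which is what makes the discrimination unambiguous.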
Let's assume we have a 64-bit CPU which will always read 8 bytes of memory at a time, and I want to store a 4-byte int. According to the definition of natural alignment, a 4-byte object is aligned to an address that's a multiple of 4 (e.g. 0x0000, 0x0004). But here is the problem: why can't I store it at address 0x0001, for example? To my understanding, since the CPU will always read 8 bytes of data, reading from address 0x0000 can still get the int stored at 0x0001 in one go. So why is natural alignment needed in this case?
Modern CPUs (Intel, Arm) will quite happily read from unaligned addresses. The CPUs are architected typically to read much more than 8 bytes per cycle: perhaps 16 bytes or 32 bytes, and the deep pipelines of the CPUs manage quite nicely to extract the wanted 8 bytes from arbitrary addresses without any visible penalties.
Often, but not always, algorithms can be written without much concern about the alignment of arrays (or the start of each row of 2-dimensional array).
The pipelined architectures possibly read aligned blocks of 16-bytes at a time, meaning that when 8 bytes are read from address 0x0009, the CPU actually needs to read 2 16-byte blocks, combine those and extract the middle 8 bytes. Things become even more complicated, when the memory is not available at first level cache and a full cache line of 64 bytes needs to be fetched from next level cache or from main memory.
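The straddling cost comes down to simple block arithmetic; the 16-byte fetch width used here is just an illustrative assumption, and real cores differ:

```python
BLOCK = 16  # hypothetical aligned fetch width, for illustration only

def blocks_touched(addr: int, size: int, block: int = BLOCK) -> int:
    """Number of aligned `block`-byte units a read of `size` bytes
    starting at `addr` has to touch."""
    first = addr // block
    last = (addr + size - 1) // block
    return last - first + 1

print(blocks_touched(0x0008, 8))  # 1: the read fits inside one 16-byte block
print(blocks_touched(0x0009, 8))  # 2: the read straddles two blocks
```

A naturally aligned access can never straddle a block (or a cache line), which is exactly the guarantee the CPU designer gets for free.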
In my experience (writing and optimising image processing algorithms for SIMD), many Arm64 implementations hide the cost of loading from unaligned addresses almost perfectly for algorithms with simple, linear memory access. Things become worse if the algorithm needs to read heavily from many unaligned addresses, such as when filtering with a 3x3 or larger kernel, or when calculating high-radix FFTs, suggesting that the CPU's capabilities for transferring memory and combining the pieces are soon exhausted.
I am an undergraduate student volunteering on a computer vision research project. As part of the project, I wish to make a dependent data stream (a stream in which the value of each data sample depends on the previous sample) independent. For this, I need to determine a scalar interval at which I must sample the stream so that no two consecutive samples are dependent.
For instance, maybe at a jump factor of 10, that is, sampling after every 10 data points in the stream, the resultant reduced data stream is independent.
My question is how can we determine this scalar jump factor for effective sampling such that the new data stream has independent data points?
From my research, I have been unable to find any statistical test that could be helpful.
Thanks in advance.
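One common, if imperfect, approach is to thin the stream by the smallest lag at which the sample autocorrelation falls inside the white-noise confidence band (roughly ±2/√n). A sketch of that idea follows, with the caveat that zero autocorrelation implies independence only for Gaussian/linear processes; for general dependence you would need something stronger, such as a mutual-information test:

```python
import numpy as np

def smallest_independent_lag(x, threshold=None):
    """Smallest lag k at which the sample autocorrelation of x falls
    inside the approximate 95% white-noise band (|r_k| < 2/sqrt(n)).
    Caveat: zero autocorrelation only implies independence for
    Gaussian/linear processes."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    if threshold is None:
        threshold = 2.0 / np.sqrt(n)
    x = x - x.mean()
    denom = float(np.dot(x, x))
    for k in range(1, n // 2):
        r_k = float(np.dot(x[:-k], x[k:])) / denom
        if abs(r_k) < threshold:
            return k
    return None

# Toy dependent stream: AR(1), where each sample leans on the previous one.
rng = np.random.default_rng(0)
x = np.zeros(10_000)
for i in range(1, len(x)):
    x[i] = 0.8 * x[i - 1] + rng.normal()

print("suggested jump factor:", smallest_independent_lag(x))
</imports>```

For this AR(1) example the theoretical autocorrelation at lag k is 0.8^k, so the returned jump factor lands roughly where 0.8^k drops below 2/√n.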
I have converted 349,900 words from a dictionary file to MD5 hashes. Samples are below:
74b87337454200d4d33f80c4663dc5e5
594f803b380a41396ed63dca39503542
0b4e7a0e5fe84ad35fb5f95b9ceeac79
5d793fc5b00a2348c3fb9ab59e5ca98a
3dbe00a167653a1aaee01d93e77e730e
ffc32e9606a34d09fca5d82e3448f71f
2fa9f0700f68f32d2d520302906e65ce
1c9b32ff1b53bd892b87578a11cbd333
26a10043bba821303408ebce568a2746
c3c32ff3481e9745e10defa7ce5b511e
I want to train a neural network to decrypt a hash using a simple architecture like a multilayer perceptron. Since every hash value has length 32, I was thinking the number of input nodes should be 32, but the problem is the number of output nodes. Since the outputs are words in the dictionary, they don't have any specific length; they can be of various lengths. That is why I'm confused about how many output nodes I should have.
How will I encode my data, so that I can have specific number of output nodes?
I have found a paper in this link that actually decrypts a hash using a neural network. The paper says:
The input to the neural network is the encrypted text that is to be decoded. This is fed into the neural network either in bipolar or binary format. This then traverses through the hidden layer to the final output layer which is also in the bipolar or binary format (as given in the input). This is then converted back to the plain text for further process.
How will I implement what is being said in the paper? I am thinking of limiting the number of characters to decrypt. Initially, I can limit it to 4 characters only (just for test purposes).
My input layer will have 32 nodes, one for each character of the hash. Each input node will hold the ASCII value of that hash character divided by 256. My output layer will also have 32 nodes, in binary format. Since 8 bits/8 nodes represent one character, my network will be able to decrypt up to 4 characters only, because 32/8 = 4. (I can increase it if I want to.) I'm planning to use 33 hidden nodes. Is my network architecture feasible: 32 x 33 x 32? If not, why? Please guide me.
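For what it's worth, the fixed-length encoding described above can be written down directly. The helper names below are mine, not from the paper, and none of this says anything about whether a network can actually learn to invert MD5 (the answers below argue it cannot); it only shows how to get a fixed number of input and output nodes:

```python
def encode_hash(h):
    """32 input values: each hex character's ASCII value scaled by 1/256,
    as the question proposes."""
    assert len(h) == 32
    return [ord(c) / 256 for c in h]

def encode_word(word, max_len=4):
    """32 binary outputs: 8 bits per character, zero-padded to max_len."""
    padded = word.ljust(max_len, "\x00")[:max_len]
    bits = []
    for c in padded:
        bits.extend((ord(c) >> i) & 1 for i in range(7, -1, -1))
    return bits

def decode_word(bits):
    """Inverse of encode_word: group 8 bits per character, drop padding."""
    chars = []
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        chars.append(chr(byte))
    return "".join(chars).rstrip("\x00")

print(decode_word(encode_word("cat")))  # cat
```

So a 32 x 33 x 32 network is at least dimensionally consistent for words up to 4 characters; the zero-padding is what absorbs the variable word length.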
You could map the words in the dictionary into a vector space (e.g. bag of words, word2vec, ...). In that case the words are encoded with a fixed length. The number of neurons in the output layer will match that length.
There's a great discussion about the possibility of cracking SHA256 hashes using neural networks in another Stack Exchange forum: https://security.stackexchange.com/questions/135211/can-a-neural-network-crack-hashing-algorithms
The accepted answer was that:
No.
Neural networks are pattern matchers. They're very good pattern matchers, but pattern matchers just the same. No more advanced than the biological brains they are intended to mimic. More thorough, more tireless, but not more sophisticated.

The patterns have to be there to be found. There has to be a bias in the data to tease out. But cryptographic hashes are explicitly and extremely carefully designed to eliminate any bias in the output. No one bit is more likely than any other, no one output is more likely to correlate to any given input. If such a correlation were possible, the hash would be considered "broken" and a new algorithm would take its place.

Flaws in hash functions have been found before, but never with the aid of a neural network. Instead it's been with the careful application of certain mathematical principles.
The following answer also makes a funny comparison:
SHA256 has an output space of 2^256, and an input space that's essentially infinite. For reference, the time since the big bang is estimated to be about 13.8 billion years, which is about 4.4 x 10^26 nanoseconds, which is about 2^89 ns. So assuming each training iteration takes 1 ns, you would need about 2^167 ages of the universe to train your neural net.
How do we use one-hot encoding if the number of values a categorical variable can take is large?
In my case it is 56 values. So, per the usual method, I would have to add 56 columns (56 binary features) to the training dataset, which will immensely increase the complexity and hence the training time.
So how do we deal with such cases?
Use a compact encoding. This trades a little time for a lot of space, and compared with explicit one-hot encodings the time penalty is often very small.
The most accessible idea is a vector of 56 booleans, if your data format supports that. The one with the most direct mapping is to use a 64-bit integer, each bit being a boolean. This is how we implement one-hot vectors in hardware design. Most 4G languages (and mature 3G languages) include fast routines for bit manipulation. You will need get, set, clear, and find bits.
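A sketch of those four bit operations on a single 64-bit-style integer (the function names are mine, not from any particular library):

```python
# A 56-value categorical fits in one 64-bit integer, one bit per category.

def one_hot(category):
    """Set the single bit for `category` (0..55)."""
    return 1 << category

def get_bit(v, i):
    return (v >> i) & 1

def clear_bit(v, i):
    return v & ~(1 << i)

def find_bit(v):
    """Index of the highest (here: only) set bit; -1 if none are set."""
    return v.bit_length() - 1 if v else -1

v = one_hot(37)
print(get_bit(v, 37), get_bit(v, 12), find_bit(v))  # 1 0 37
```

In a language with fixed-width integers you would use a `uint64_t`-style type and the CPU's find-first-set instruction for `find_bit`; Python's arbitrary-precision ints make the same idea work with `bit_length`.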
Does that get you moving?
I'm looking at designing a low-level radio communications protocol, and am trying to decide what sort of checksum/CRC to use. The hardware provides a CRC-8; each packet has 6 bytes of overhead in addition to the data payload. One of the design goals is to minimize transmission overhead. For some types of data, the CRC-8 should be adequate, but for other types it would be necessary to supplement it to avoid accepting erroneous data.
If I go with a single-byte supplement, what would be the pros and cons of using a CRC8 with a different polynomial from the hardware CRC-8, versus an arithmetic checksum, versus something else? What about for a two-byte supplement? Would a CRC-16 be a good choice, or given the existence of a CRC-8, would something else be better?
In 2004 Philip Koopman from CMU published a paper on choosing the most appropriate CRC: http://www.ece.cmu.edu/~koopman/crc/index.html
This paper describes a polynomial selection process for embedded network applications and proposes a set of good general-purpose polynomials. A set of 35 new polynomials in addition to 13 previously published polynomials provides good performance for 3- to 16-bit CRCs for data word lengths up to 2048 bits.
That paper should help you analyze how effective that 8 bit CRC actually is, and how much more protection you'll get from another 8 bits. A while back it helped me to decide on a 4 bit CRC and 4 bit packet header in a custom protocol between FPGAs.
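For experimenting with candidate polynomials before committing one to the protocol, a bitwise CRC-8 is only a few lines. The polynomial 0x07 below is the well-known CRC-8/SMBus polynomial, used here purely as a placeholder; substitute whichever polynomial Koopman's tables recommend for your data-word length:

```python
def crc8(data, poly=0x07, init=0x00):
    """Bitwise CRC-8, MSB-first, no reflection, no final XOR.
    poly=0x07, init=0 is the common CRC-8/SMBus variant."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc

print(hex(crc8(b"123456789")))  # 0xf4 (the standard CRC-8 check value)
```

Running both your hardware polynomial and a candidate supplement over the same corrupted-packet corpus is a quick way to estimate how much extra protection the second byte actually buys.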