Every YouTube video has two links - youtube
I've been playing with YouTube video URLs and found that every video seems to have two links. For example:
Suppose a video has the following link
https://www.youtube.com/watch?v=YykjpeuMNEk
Now I change the last letter of the link to make it
https://www.youtube.com/watch?v=YykjpeuMNEl
Try both links: they open the same video.
The trick is to replace the last character with the next consecutive character; the characters are case-sensitive.
So if the last letter is 'a' change it to 'b', if 'A' change it to 'B', if '1' change it to '2'.
Can someone explain to me what is happening here?
This is because YouTube IDs use a variant of Base64. Each Base64 character encodes only 6 bits, while the decoded data is a whole number of 8-bit bytes. The two rarely line up exactly, and unless the mismatch is specifically indicated with extra end characters, some of the lowest bits simply have no meaning.
YouTube ID: 6 bits * 11 = 66 bits.
The given data seems to indicate that YouTube video IDs are actually a 64-bit number converted to Base64. Since we have 66 bits, and need only 64, that would mean that the last 2 bits are simply ignored.
Checking this against the example works out, as long as you use the Base64 value of the last character rather than its ASCII code (the ASCII code of k is 1101011, which is misleading here).
YykjpeuMNEk => k = index 36 in the Base64 alphabet = 100100
If we ignore the last 2 bits there, k is the lowest value of its group (it ends in 00), and the other three are l, m and n; respectively 100101, 100110 and 100111. That matches what is observed: the IDs ending in k, l, m and n all open the same video.
So the theory holds: YouTube IDs are only accurate up to 64 bits, despite being 66-bit Base64 strings.
Meaning, every YouTube URL has not two, but in fact four IDs that match it.
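The 66-bit versus 64-bit argument above can be checked with a short Python sketch. It assumes the IDs use the URL-safe Base64 alphabet (A-Z, a-z, 0-9, then - and _) and that the lowest 2 bits are simply dropped; both are assumptions taken from the reasoning in this answer, not documented YouTube behaviour. The helper names are made up for illustration.

```python
import string

# Assumed alphabet: URL-safe Base64 (A-Z, a-z, 0-9, '-', '_').
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"

def id_to_int64(video_id: str) -> int:
    """Read the characters as 6-bit groups and drop the 2 lowest bits."""
    value = 0
    for ch in video_id:
        value = (value << 6) | ALPHABET.index(ch)
    return value >> 2  # 66 bits in, 64 bits out

def equivalent_ids(video_id: str) -> list:
    """List the four IDs that differ only in the 2 ignored bits."""
    base = ALPHABET.index(video_id[-1]) & ~0b11  # clear the 2 lowest bits
    return [video_id[:-1] + ALPHABET[base + low] for low in range(4)]

ids = equivalent_ids("YykjpeuMNEk")
print(ids)  # ['YykjpeuMNEk', 'YykjpeuMNEl', 'YykjpeuMNEm', 'YykjpeuMNEn']
print(len({id_to_int64(v) for v in ids}))  # 1: all four map to the same number
```

Under these assumptions, the "next consecutive letter" trick from the question only works while the new character stays inside the same group of four.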
Related
How to filter a random string generator from certain results
Is it possible to add a filter to a random string generator so that it cannot produce certain strings? I am using this to create unique codes for my users, and I need to make sure that a code is not assigned more than once. This is how I am generating random alphanumeric strings:
func randomString(length: Int) -> String {
    let letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
    return String((0..<length).map { _ in letters.randomElement()! })
}
As long as your strings are long enough, your current scheme is already everything you need. You can make collisions of random values arbitrarily unlikely. This is the basis of UUIDv4. There are just so many possible values (>10^36) that no one will ever pick a duplicate. If you can use UUIDs directly, I recommend that. It's well supported.
If you need to use letters and numbers, then you can do the same thing, though. UUID gets that by having 122 bits of randomness. You have 62 symbols. To get 122 bits of randomness, you need ~20 characters. If that's ok, you're done. Generate 20-character random codes and I promise, they will never collide.
For each character you shorten it by, the likelihood of a collision goes up. This is called a Birthday Attack. Back-of-the-envelope, you expect (50% likely) your first collision after 1.25*sqrt(H), where H is the number of values you can encode. So for 10 characters, you expect your first collision around 1.25*sqrt(62^10), or around 1 billion documents. If your total number of documents is a few orders of magnitude smaller than that (a few million or fewer), then 10 characters will be fine. Other scales can be calculated similarly. But if you can, just use UUID.
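The back-of-the-envelope numbers above can be reproduced with a few lines of Python. This is only a sketch of the 1.25*sqrt(H) rule of thumb quoted in the answer; the function names are made up for illustration.

```python
import math

def entropy_bits(alphabet_size: int, length: int) -> float:
    """Bits of randomness in a uniformly random code of the given length."""
    return length * math.log2(alphabet_size)

def chars_needed(alphabet_size: int, bits: float) -> int:
    """Smallest code length that carries at least `bits` bits of randomness."""
    return math.ceil(bits / math.log2(alphabet_size))

def first_collision_estimate(alphabet_size: int, length: int) -> float:
    """Rule of thumb: first collision expected after about 1.25 * sqrt(H) codes."""
    return 1.25 * math.sqrt(alphabet_size ** length)

print(round(entropy_bits(62, 20)))               # 119 bits for 20 characters
print(chars_needed(62, 122))                     # 21 characters for a full 122 bits
print(round(first_collision_estimate(62, 10)))   # 1145166040, about 1 billion
```

Twenty characters give about 119 bits, so the "~20 characters" figure is a fair approximation of UUIDv4's 122 bits.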
Bug in ARCore acquireDepthImage?
The documentation states that it's 16 bits each with top 3 bits set to 0, meaning the range should be about 8,192. My code is calling depthImage.getPlanes()[0].getBuffer().asShortBuffer().get(x) The range of these numbers is about the full signed short range and seemingly random. To debug, I tried printing the following values: depthImage.getPlanes()[0].getBuffer().get(0); depthImage.getPlanes()[0].getBuffer().get(1); The first one oscillates randomly and the second one is almost always in the low numbers such as 0-6, but I've seen as high as 21. It seems like the 2nd byte is the most significant byte of the number and the 1st byte is the least significant byte (i.e. they are reversed).
The depth data is little-endian, while a ByteBuffer reads multi-byte values as big-endian by default. Use Short.reverseBytes() to swap the bytes:
depthMm = java.lang.Short.reverseBytes(depthImage.planes[0].buffer.getShort(offset))
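To see why the byte order matters, here is a small Python sketch that decodes the same two bytes both ways. The sample bytes are invented for illustration and do not come from a real depth image; `read_depth_mm` is a hypothetical helper, not an ARCore API.

```python
def read_depth_mm(buffer: bytes, index: int) -> int:
    """Decode the 16-bit little-endian sample at position `index`."""
    offset = 2 * index  # two bytes per sample
    return int.from_bytes(buffer[offset:offset + 2], "little")

# Invented sample: low byte first (0xD2), then high byte (0x04).
sample = bytes([0xD2, 0x04])

print(read_depth_mm(sample, 0))                     # 1234, a plausible depth below 8192
print(int.from_bytes(sample, "big", signed=True))   # -11772, the wrong-order reading
```

The second value shows the symptom from the question: read in the wrong order, the noisy low byte ends up in the high position and the result spans the whole signed short range.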
Does a stuffing bit in CAN count towards the next stuffing group
If you have a sequence of bits in CAN data: 011111000001 There will need to be a stuffed 0 after the ones, and a stuffed 1 after the 0s. But I'm not sure where the 1 should go. The standard seems ambiguous to me because sometimes it talks about "5 consecutive bits during normal operation", but sometimes it says "5 consecutive bits of data". Does a stuffing bit count as data? i.e. should it be: 01111100000011 Or 01111100000101
Bit stuffing only applies to the CAN frame until the ACK-bit. In the End-Of-Frame and Intermission fields, no bit stuffing is applied. It does not matter whether a bit is data or a stuff bit: after 5 consecutive bits of the same value, one complementary bit is inserted. The second of your examples is correct. 6 consecutive bits of the same value make the message invalid.
From the old Bosch CAN2.0B spec, chapter 5:
"The frame segments START OF FRAME, ARBITRATION FIELD, CONTROL FIELD, DATA FIELD and CRC SEQUENCE are coded by the method of bit stuffing."
Meaning everything from the start of the frame to the 15-bit CRC can have bit stuffing, but not the 1-bit CRC delimiter and the rest of the frame.
"Whenever a transmitter detects five consecutive bits in the bit stream to be transmitted"
This "bit stream" refers to all the fields mentioned in the previously quoted sentence.
"...in the actual transmitted bit stream"
The actual transmitted bit stream is the original data + appended stuffing bit(s).
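The rule from both answers (a stuff bit counts towards the next run of five) can be written as a short Python sketch. It works on a string of '0' and '1' characters and ignores frame fields entirely; `stuff_bits` is a hypothetical helper written for this example.

```python
def stuff_bits(bits: str) -> str:
    """Insert a complementary bit after every 5 equal bits in the transmitted stream."""
    out = []
    run_bit = None  # value of the current run
    run_len = 0     # length of the current run, counting stuff bits too
    for b in bits:
        out.append(b)
        if b == run_bit:
            run_len += 1
        else:
            run_bit, run_len = b, 1
        if run_len == 5:
            stuffed = "1" if b == "0" else "0"
            out.append(stuffed)
            # The stuff bit itself starts the next run.
            run_bit, run_len = stuffed, 1
    return "".join(out)

print(stuff_bits("011111000001"))  # 01111100000101
```

The stuffed 0 after the five 1s is counted together with the next four 0s, so the stuffed 1 lands one position earlier than it would if stuff bits were ignored. This gives the second sequence from the question.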
How are Urbit phonetic names encoded?
Urbit points (network addresses) are identified by 32-bit integers, but they're typically not referred to by their number. Instead, I usually see them represented in a human-pronounceable form where every byte is converted into a three-letter syllable. For example:
8 bits    galaxy  ~lyt
16 bits   star    ~diglyt
32 bits   planet  ~picder-ragsyt
64 bits   moon    ~diglyt-diglyt-picder-ragsyt
128 bits  comet   ~racmus-mollen-fallyt-linpex--watres-sibbur-modlux-rinmex
I initially assumed that every byte had a single text representation, but I have seen that planet names usually don't include the name of their star, so it must be more complicated than that. How does Urbit's phonetic name encoding system (#p-names) work?
Urbit's phonetic naming system encodes unsigned integers as human-readable strings. These unsigned integers sometimes stand in for byte strings, read in big-endian order (although that representation can't track leading zeros, so the byte length must be communicated out-of-band if needed). The phonetic naming scheme operates on these big-endian bytes.
The phonetic naming system has two variants. For general use there is #q-encoding, which is suitable for values of any length, and is frequently used to represent binary data in Hoon code or when interacting with the Dojo REPL. For Urbit point names there is #p-encoding, which is based on #q-encoding but modifies certain cases.
#q-Encoding: Pairs of Syllables
Urbit phonetic names are made up of 3-letter syllables, organized in two lists of 256 syllables each. Each syllable consists of a consonant, a vowel, then another consonant. The "prefix" syllable list uses the vowels a, i, and o, and the "suffix" syllable list uses the vowels e, u, and y, with one exception: zod, the first entry in the suffix list. The full syllable lists are included below.
Values fitting in one byte, from 0x00 to 0xFF, are encoded by taking the corresponding syllable from the suffix list. Examples: 0x00 becomes ~zod, 0x01 becomes ~nec.
Values fitting in two bytes, from 0x0100 to 0xFFFF, are encoded by looking up the syllable corresponding to the high byte in the prefix list and concatenating the syllable corresponding to the low byte in the suffix list. Examples: 0x0100 becomes ~marzod, 0x0101 becomes ~marnec.
Larger values are encoded by splitting them into two-byte pairs in big-endian order, encoding each as described above for values fitting in two bytes, and joining the results with - hyphen/minus characters. If the value is an odd number of bytes, the first byte pair is padded with a leading zero. Examples: 0x01_0000 becomes ~doznec-dozzod, 0x0101_0101 becomes ~marnec-marnec.
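The #q rules above can be sketched in Python as follows. This is an illustration, not the urbit-ob implementation: `encode_q` is a made-up name, and the two tables are truncated to the first three entries of the full syllable lists, which is just enough for the examples in this section. A real encoder needs all 256 entries per list.

```python
# Truncated tables (assumption for the demo): first three entries of each full list.
PREFIXES = ["doz", "mar", "bin"]
SUFFIXES = ["zod", "nec", "bud"]

def encode_q(value: int, prefixes=PREFIXES, suffixes=SUFFIXES) -> str:
    """Encode an unsigned integer as syllable pairs, following the rules above."""
    if value < 0:
        raise ValueError("only unsigned integers can be encoded")
    if value <= 0xFF:
        # One byte: a single suffix syllable.
        return "~" + suffixes[value]
    raw = value.to_bytes((value.bit_length() + 7) // 8, "big")
    if len(raw) % 2:
        raw = b"\x00" + raw  # pad an odd byte count with a leading zero
    pairs = [prefixes[raw[i]] + suffixes[raw[i + 1]] for i in range(0, len(raw), 2)]
    return "~" + "-".join(pairs)

print(encode_q(0x01))         # ~nec
print(encode_q(0x0100))       # ~marzod
print(encode_q(0x01_0000))    # ~doznec-dozzod
print(encode_q(0x0101_0101))  # ~marnec-marnec
```

With the truncated tables, any byte value above 2 raises an IndexError, so swap in the full lists before using it on real values.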
#p-Encoding: Scrambling Planets
The #p-encoding scheme is the same as #q for most values. However, it is different for values between 17 and 64 bits, which correspond to the IDs of planets and moons.
Planets are intended to correspond to real individuals on the Urbit network. Each planet is spawned from a star, and the 16 lower bits of the planet's ID are those of its parent star's ID. Under the #q-encoding system, this would also mean that the last two syllables of every planet's name would be its star's name.
The Urbit developers didn't want each individual's name on the network to include the name of the star that happened to spawn their planet initially: that would artificially associate them with the star forever, even though they could immediately transfer their planet to a different star. Their solution was to scramble all planet names randomly, to obfuscate the relationship between a planet's name and its parent star's name.
This is implemented as a custom (obviously non-secure) cipher over the space of possible planet IDs. Because each star has 2^16 - 1 planets, the number of planets is not a power of two, so a conventional block cipher won't work directly. Instead, they use the construction described in Ciphers with Arbitrary Finite Domains (Black and Rogaway 2002) over a custom Feistel-style block cipher optimized for speed (and compatibility).
This scrambling is applied on planet IDs, and on the lower 32 bits of a moon ID (which correspond to its parent planet's ID). Under #p-encoding, the planet with ID 0x01_0101 becomes ~ralnyt-botdyt, showing no connection to its parent star ~marnec.
The star-planet relationship is the only one that is obfuscated. If you look at the names of a planet's moons, they include the name of the planet directly: for example, ~ralnyt-botdyt's moon 0x01_0001_0101 becomes ~doznec-ralnyt-botdyt, and 0x02_0001_0101 becomes ~dozbud-ralnyt-botdyt.
Implementations
When writing Hoon code, such as at the Dojo REPL, you can use the standard #p and #q functions directly to encode values to the corresponding phonetic names. In Hoon, a #p-encoded value is identified with the prefix ~ and a #q-encoded value is identified with the prefix .~, and either can be decoded back with the #u function. Hoon also uses . the period character as a (mandatory) thousands separator in integer literals.
> `#p`1.529.729.032
~diglyt-diglyt
> `#q`1.529.729.032
.~fonbyn-mopful
> `#u`~diglyt-diglyt
1.529.729.032
> `#u`.~diglyt-diglyt
3.246.440.832
In JavaScript, the official urbit-ob package provides similar functions.
import ob from "urbit-ob";
ob.patp(1529729032); // ~diglyt-diglyt
ob.patq(1529729032); // ~fonbyn-mopful
ob.patp2dec("~diglyt-diglyt"); // 1529729032
ob.patq2dec("~diglyt-diglyt"); // 3246440832
Full Syllable Lists
prefixes = ["doz","mar","bin","wan","sam","lit","sig","hid","fid","lis","sog",
  "dir","wac","sab","wis","sib","rig","sol","dop","mod","fog","lid","hop","dar",
  "dor","lor","hod","fol","rin","tog","sil","mir","hol","pas","lac","rov","liv",
  "dal","sat","lib","tab","han","tic","pid","tor","bol","fos","dot","los","dil",
  "for","pil","ram","tir","win","tad","bic","dif","roc","wid","bis","das","mid",
  "lop","ril","nar","dap","mol","san","loc","nov","sit","nid","tip","sic","rop",
  "wit","nat","pan","min","rit","pod","mot","tam","tol","sav","pos","nap","nop",
  "som","fin","fon","ban","mor","wor","sip","ron","nor","bot","wic","soc","wat",
  "dol","mag","pic","dav","bid","bal","tim","tas","mal","lig","siv","tag","pad",
  "sal","div","dac","tan","sid","fab","tar","mon","ran","nis","wol","mis","pal",
  "las","dis","map","rab","tob","rol","lat","lon","nod","nav","fig","nom","nib",
  "pag","sop","ral","bil","had","doc","rid","moc","pac","rav","rip","fal","tod",
  "til","tin","hap","mic","fan","pat","tac","lab","mog","sim","son","pin","lom",
  "ric","tap","fir","has","bos","bat","poc","hac","tid","hav","sap","lin","dib",
  "hos","dab","bit","bar","rac","par","lod","dos","bor","toc","hil","mac","tom",
  "dig","fil","fas","mit","hob","har","mig","hin","rad","mas","hal","rag","lag",
  "fad","top","mop","hab","nil","nos","mil","fop","fam","dat","nol","din","hat",
  "nac","ris","fot","rib","hoc","nim","lar","fit","wal","rap","sar","nal","mos",
  "lan","don","dan","lad","dov","riv","bac","pol","lap","tal","pit","nam","bon",
  "ros","ton","fod","pon","sov","noc","sor","lav","mat","mip","fip"]
suffixes = ["zod","nec","bud","wes","sev","per","sut","let","ful","pen","syt",
  "dur","wep","ser","wyl","sun","ryp","syx","dyr","nup","heb","peg","lup","dep",
  "dys","put","lug","hec","ryt","tyv","syd","nex","lun","mep","lut","sep","pes",
  "del","sul","ped","tem","led","tul","met","wen","byn","hex","feb","pyl","dul",
  "het","mev","rut","tyl","wyd","tep","bes","dex","sef","wyc","bur","der","nep",
  "pur","rys","reb","den","nut","sub","pet","rul","syn","reg","tyd","sup","sem",
  "wyn","rec","meg","net","sec","mul","nym","tev","web","sum","mut","nyx","rex",
  "teb","fus","hep","ben","mus","wyx","sym","sel","ruc","dec","wex","syr","wet",
  "dyl","myn","mes","det","bet","bel","tux","tug","myr","pel","syp","ter","meb",
  "set","dut","deg","tex","sur","fel","tud","nux","rux","ren","wyt","nub","med",
  "lyt","dus","neb","rum","tyn","seg","lyx","pun","res","red","fun","rev","ref",
  "mec","ted","rus","bex","leb","dux","ryn","num","pyx","ryg","ryx","fep","tyr",
  "tus","tyc","leg","nem","fer","mer","ten","lus","nus","syl","tec","mex","pub",
  "rym","tuc","fyl","lep","deb","ber","mug","hut","tun","byl","sud","pem","dev",
  "lur","def","bus","bep","run","mel","pex","dyt","byt","typ","lev","myl","wed",
  "duc","fur","fex","nul","luc","len","ner","lex","rup","ned","lec","ryd","lyd",
  "fen","wel","nyd","hus","rel","rud","nes","hes","fet","des","ret","dun","ler",
  "nyr","seb","hul","ryl","lud","rem","lys","fyn","wer","ryc","sug","nys","nyl",
  "lyn","dyn","dem","lux","fed","sed","bec","mun","lyr","tes","mud","nyt","byr",
  "sen","weg","fyr","mur","tel","rep","teg","pec","nel","nev","fes"]
Why do the bytes of a word have the opposite order in binary files?
I was reading a BMP file in a hex editor when I discovered something odd. The first two letters, "BM", are written in order, but the next word (2 bytes), which holds the file size, is 36 30 in hex, while the actual size is 0x3036. I've noticed that other numbers are stored the same way. I'm also using the MARS MIPS emulator, which can display memory by words; there, the string in.bmp is stored as b . n i / \0 p m. Why isn't the data stored contiguously?
It depends not on the data itself but on how you store this data: per byte, per word (2 bytes, usually), or per long (4 bytes -- again, usually). As long as you store data per byte you don't see anything unusual; data appears "continuous". However, with longer units, you are subject to endianness. It appears your emulator is assuming all words need to have their bytes reversed; and you can see in your example that this assumption is not always valid.
As for the BM "magic" signature: it's not meant to be read as a word value "BM", but rather as "first, a single byte B, then a single byte M". All subsequent values are written in little-endian order, not only 'exchanging' your 36 and 30 but also the 2 zeroes 'before' (or 'after') them (the larger values in the BMP header are 4 bytes long).
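As a small illustration of the point above, this Python sketch reads the first six bytes of a BMP header: two single bytes for the signature, then a 4-byte little-endian file size. The header bytes are built by hand from the values in the question (36 30 followed by two zero bytes), not read from a real file.

```python
import struct

# Hand-built header start: 'B', 'M', then the size bytes as seen in the hex editor.
header = b"BM" + bytes([0x36, 0x30, 0x00, 0x00])

# "<" = little-endian, "2s" = two raw bytes, "I" = unsigned 4-byte integer.
magic, file_size = struct.unpack("<2sI", header)

print(magic)            # b'BM': read byte by byte, so the order is unchanged
print(hex(file_size))   # 0x3036: the four size bytes combined low byte first

# Reading the first two size bytes as big-endian gives the "odd" value instead.
(wrong,) = struct.unpack(">H", header[2:4])
print(hex(wrong))       # 0x3630
```

The bytes on disk are the same in both cases; only the rule for combining them into a number changes.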