NSUUID duplication chance from different devices - ios

I need to generate a unique ID for the device when the application is installed, store this value on the device, and then use this UUID to communicate with the server. NSUUID seems to suit the situation, but I am unsure whether there is any chance of the same UUID being generated on multiple devices. I already found the answer https://stackoverflow.com/a/6963990/1573209, which describes that version 1 uses the MAC address and a 60-bit clock to generate the UUID, so the chance of duplication is negligible, whereas version 4 uses some fixed bits and some random bits. The documentation says that UUIDs created by NSUUID conform to RFC 4122 version 4 and are created with random bytes.
Does that mean the chance of duplication is higher?
If so, how can I use a version 1 UUID generator? I can't find any documentation for it.

Have a look at RFC 4122. UUIDs conforming to RFC 4122 are practically unique across space and time. You can also see Random UUID probability of duplicates.
Out of a total of 128 bits, two bits indicate an RFC 4122 ("Leach-Salz") UUID and four bits the version (0100 indicating "randomly generated"), so randomly generated UUIDs have 122 random bits. The chance of two such UUIDs having the same value can be calculated using probability theory (birthday problem). The probability of an accidental clash after generating n UUIDs, with x = 122 random bits, is found to be very close to zero.
For n = 2^36 (68,719,476,736), the probability of a collision is about 0.0000000000000004. For smaller values of n the probability is even lower; it increases as more UUIDs are generated. Here n is the number of UUIDs generated.
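The figure above can be checked with a few lines of Python. This is only a sketch using the standard birthday approximation p ≈ 1 - exp(-n(n-1) / (2 · 2^x)); the function name is made up for illustration.

```python
import math

def collision_probability(n: int, random_bits: int = 122) -> float:
    """Approximate chance that at least two of n random values collide."""
    space = 2 ** random_bits
    # Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * space)).
    # expm1 keeps precision when the exponent is extremely small.
    return -math.expm1(-n * (n - 1) / (2 * space))

p = collision_probability(2 ** 36)
print(f"{p:.1e}")  # on the order of 4e-16, matching the figure quoted above
```

Lowering n makes the result shrink roughly with n squared, which is why smaller deployments are even less likely to see a clash.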

Related

Is there an EDI Segment that can contain more than 256 characters?

Is there an EDI x12 segment that has no character limit? We often use the MSG segment for open text fields but this is capped at 256 characters, so we’re looking for an alternative that can handle 500+ characters.
The short answer
The MTX Text segment allows you to send messages of up to 4096 characters long, which is the longest available in X12. You can’t just swap out an MSG segment for an MTX segment, though. You can only use MTX if it’s included in the transaction set, and that depends on which X12 'release' (version) you're using.
For the 005010 release (one of the more popular ones), here are the transaction sets that MTX appears in:
105 Business Entity Filings
113 Election Campaign and Lobbyist Reporting
150 Tax Rate Notification
155 Business Credit Report
179 Environmental Compliance Reporting
194 Grant or Assistance Application
251 Pricing Support
274 Healthcare Provider Information
284 Commercial Vehicle Safety Reports
500 Medical Event Reporting
620 Excavation Communication
625 Well Information
650 Maintenance Service Order
805 Contract Pricing Proposal
806 Project Schedule Reporting
814 General Request, Response or Confirmation
832 Price/Sales Catalog
836 Procurement Notices
840 Request for Quotation
843 Response to Request for Quotation
850 Purchase Order
855 Purchase Order Acknowledgment
860 Purchase Order Change Request - Buyer Initiated
865 Purchase Order Change Acknowledgment/Request - Seller Initiated
Some additional clarification
Technically, character limits don't apply to X12 segments – what you're referring to is an X12 element. A segment is just a container for elements, and the element you're referring to is the element referenced in MSG01 (the first element of the MSG segment).
Each X12 element references an ID number. For each element, the ID number points to a dictionary that specifies the name, description, type, minimum length, and maximum length. In the case of MSG01, it points to data element [933][1].
Data element 933 – the one you're currently using – actually has a character limit of 264 characters (more than 256 characters, but not by much). Note: the link above is to the 005010 X12 release, but I checked back to 003010 and up to 008030 and it seems to be 264 characters all the way through.
Now, back to your original question: is there a data element that allows for a larger character payload?
The answer is that there are 8 data elements that accept a payload larger than 264 characters.
Two of them are binary data types, which we can likely eliminate off the bat:
785. Binary Data. A string of octets which can assume any binary pattern from hexadecimal 00 to FF. Note: The maximum length is dependent upon the maximum data value that can be entered in DE 784, which value is 999,999,999,999,999. Max characters: 999999999999999.
1700. Transformed Data. Binary or filtered data having one or more security policy options applied; transformed data may represent compressed, encrypted, or compressed and encrypted plaintext. Max characters: 10000000000000000.
The rest are strings, which is promising:
364. Communication Number. Complete communications number including country or area code when applicable. Max characters: 2048.
1565. Look-up Value. Value used to identify a certificate containing a public key. Max characters: 4096.
1566. Keying Material. Additional material required for decrypting the one-time key. Max characters: 512.
1567. One-time Encryption Key. Hexadecimally filtered encrypted one-time key. Max characters: 512.
1573. Encoded Security Value. Encoded representation of the Security Value specified by the Security Value Qualifier. Max characters: 10000000000000000.
And, last but not least:
1551. Textual Data. To transmit large volumes of message text. Max characters: 4096.
Looks like a winner!
Note that element 1551 appears in only one segment: MTX, which was introduced in the 003060 X12 release. And in the initial 003060 release, it was only included in one X12 Transaction Set: 194 Grant or Assistance Application (which makes sense – a longer field was needed for grant applications).
It seems that as new releases were developed, the MTX segment made its way into more and more transaction sets – likely for exactly the reason you're asking. In 003070, it was included in 5 transaction sets; in 004010, 15; in 005010, 24, and so on.
The MTX segment uses element 1551 in both MTX02 and MTX03, so you can get double the length by using both of them. Note that there's a 'relational condition': if MTX03 is present, then MTX02 is required (in other words, you can't use MTX03 if you don't use MTX02 first).
And depending on the transaction set, the MTX segment may be able to be repeated as well.
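As a rough sketch of how a long text could be spread across MTX02 and MTX03, and across repeated MTX segments where the transaction set allows repetition: the function below is hypothetical, only splits the text, and does not build actual segment strings. It fills MTX02 first so that MTX03 is never used on its own.

```python
MAX_1551 = 4096  # maximum length of data element 1551, as quoted above

def split_for_mtx(text: str, limit: int = MAX_1551) -> list[tuple[str, str]]:
    """Return (MTX02, MTX03) pairs; MTX03 is only filled after MTX02 is full."""
    chunks = [text[i:i + limit] for i in range(0, len(text), limit)]
    pairs = []
    for i in range(0, len(chunks), 2):
        mtx02 = chunks[i]
        # An empty string means MTX03 is simply omitted for this segment.
        mtx03 = chunks[i + 1] if i + 1 < len(chunks) else ""
        pairs.append((mtx02, mtx03))
    return pairs

pairs = split_for_mtx("x" * 9000)
print([(len(a), len(b)) for a, b in pairs])
```

Whether more than one pair can actually be sent depends on the repeat count the transaction set defines for MTX.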
Long story short: if the MTX segment is in the transaction set / release you're using, you're likely in luck.
Hope this helps.
Use multiple MSG segments and split the text so that each piece fits within the maximum allowed length. Free-text segments are usually defined with a repeat count greater than 1, so you should be okay. That's how everybody does it.
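A minimal sketch of that splitting step, assuming the 264-character limit of data element 933 mentioned in the other answer and assuming it is acceptable to break the text at whitespace. The function name is hypothetical, and `textwrap.wrap` normalizes whitespace at the break points, so this suits ordinary free text rather than data where exact spacing matters.

```python
import textwrap

MSG01_MAX = 264  # maximum length of data element 933 (MSG01)

def split_for_msg(text: str, limit: int = MSG01_MAX) -> list[str]:
    """Split free text at word boundaries into pieces of at most `limit` characters."""
    return textwrap.wrap(text, width=limit)

note = " ".join(["word"] * 200)  # 999 characters of sample text
pieces = split_for_msg(note)
print(len(pieces), [len(p) for p in pieces])
```

Each piece would then go into its own MSG segment, up to the repeat count the transaction set allows.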

Is information stored in registers/memory structured as binary?

Looking at this question on Quora ("Are data stored in registers and memory in hex or binary?"), I think the top answer is saying that data persistence is achieved through physical properties of hardware and is not directly relatable to either binary or hex.
I've always thought of computers as 'binary', but have just realized that that only applies to the usage of components (magnetic up/down or an on/off transistor) and not necessarily the organisation of, for example, memory contents.
i.e. you could, theoretically, create an abstraction in memory that used 'binary components' but that wasn't binary, like this:
100000110001010001100
100001001001010010010
111101111101010100001
100101000001010010010
100100111001010101100
And then recognize that as the (badly-drawn) image of 'hello', rather than the ASCII encoding of 'hello'.
An answer on SO (What's the difference between a word and byte?) mentions that processors can handle 'words', i.e. several bytes at a time, so while information representation has to be binary I don't see why information processing has to be.
Can computers do arithmetic on hex directly? In this case, would the internal representation of information in memory/registers be in binary or hex?
Perhaps "digital computer" would be a good starting term and then from there "binary digit" ("bit"). Electronically, the terms for the values are sometimes "high" and "low". You are right, everything after that depends on the operation. Most of the time, groups of bits are operated on together. Common group sizes are 1, 8, 16, 32 and 64 bits. The meaning of the bits depends on the program, but some operations go hand-in-hand with some level of meaning.
When the meaning of a group of bits is not known or not important, humans like to be able to discern the value of each bit. Binary could be used, but more than 8 bits is hard to read. Although it is rare to operate on groups of 4 bits, hexadecimal is much more readable and is generally used regardless of the number of bits. Sometimes octal is used, but that depends on contexts where a subgrouping of 3 bits has some meaning or where digits beyond 9 are to be avoided.
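To illustrate that these are only different notations for the same stored bits, here is a small Python sketch that prints one arbitrary 16-bit pattern in binary, hexadecimal and octal. The value itself is just an example.

```python
value = 0b1010110000111101  # an arbitrary 16-bit pattern

print(format(value, "016b"))  # binary: every bit visible, but long
print(format(value, "04X"))   # hexadecimal: one digit per group of 4 bits
print(format(value, "06o"))   # octal: one digit per group of 3 bits
```

The register or memory cell holds the same pattern in all three cases; only the text shown to the human changes.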
Integers can be stored in two's complement format, and CPUs often have instructions for such integers. One such operation is negation. For a group of 8 bits, it maps 1 to -1, …, 127 to -127; -1 to 1, …, -127 to 127; 0 to 0; and -128 to -128. Decimal is likely the most useful base for humans here, not base 256, base 2 or base 16. In unsigned hexadecimal, that would be 01 to FF, …, 00 to 00, 80 to 80.
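The 8-bit negation described above can be sketched in Python as "invert the bits, add one, keep the low 8 bits". The helper names are made up for illustration; a real CPU does this in hardware.

```python
def negate8(bits: int) -> int:
    """Two's complement negation of an 8-bit pattern: invert, add 1, keep 8 bits."""
    return (~bits + 1) & 0xFF

def as_signed8(bits: int) -> int:
    """Interpret an 8-bit pattern as a signed two's complement integer."""
    return bits - 0x100 if bits & 0x80 else bits

for pattern in (0x01, 0x7F, 0x00, 0x80):
    result = negate8(pattern)
    print(f"{pattern:02X} -> {result:02X}  ({as_signed8(pattern)} -> {as_signed8(result)})")
```

The same bit pattern reads as 01 to FF in hexadecimal and as 1 to -1 in signed decimal, which is the point about meaning depending on interpretation.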
For an intro to how a CPU might do integer addition on a group of bits, see adder circuits.
Other number formats include IEEE-754 floating point and binary-coded decimal.
I think you understand that digital circuits are binary. So, based on the above, yes, operations do operate on a higher conceptual level despite the actual storage.

Are UUIDs, at the most basic level, just a string of unique characters?

I am currently learning about UUID in iOS, and of course I'm trying to make sense of them. From what I can gather, when you call NSUUID(), it returns a 128 bit string that is completely unique (though I'm not currently interested in how it can ensure a completely unique string, I figure it takes into account the date, time, and device identity). To make use of this string, you can append it to the end of the Document Directory (which I believe is unique to each application) to ensure a unique file path that can be used to access files later. Is this a correct understanding of the concept?
Globally Unique Identifiers are 128-bit binary strings.
Microsoft COM uses them to prevent "name collisions" between components without needing some "central naming authority" (like we have for DNS names, IP addresses, broadcast frequencies, etc etc).
GUIDs are likely to be unique ... but it's not guaranteed.
Here is a good article explaining more:
http://betterexplained.com/articles/the-quick-guide-to-guids/
And yes, your understanding of iOS NSUUIDs is exactly right:
http://nshipster.com/nstemporarydirectory/
http://nshipster.com/uuid-udid-unique-identifier/
It depends on the version of the universally unique identifier. Version 4 is almost guaranteed to be unique, but not completely. Wikipedia states the following:
"Out of a total of 128 bits, two bits indicate an RFC 4122 ("Leach-Salz") UUID and four bits the version (0100 indicating "randomly generated"), so randomly generated UUIDs have 122 random bits. The chance of two such UUIDs having the same value can be calculated using probability theory (birthday paradox). Using the approximation"
Reference: https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_.28random.29
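For illustration only, here is the same idea in Python rather than iOS code: a version 4 UUID is 128 bits with the version and variant fields fixed, and its string form can be appended to a directory to get a path that is very unlikely to clash. The directory name and helper function are hypothetical.

```python
import uuid
from pathlib import Path

u = uuid.uuid4()
print(u)                            # 36-character text form of the 128 bits
print(len(u.bytes) * 8)             # 128
print(u.version)                    # 4, i.e. randomly generated
print(u.variant == uuid.RFC_4122)   # True

def unique_path(directory: Path) -> Path:
    """Build a path inside `directory` named after a fresh random UUID."""
    return directory / str(uuid.uuid4())

print(unique_path(Path("Documents")))
```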

What should I do to maintain the performance of a mobile app that uses a database?

I'm building an app that uses a database.
I have a words table, and every time the user types something, the app records it and updates the word in the database.
The frequency field is incremented each time the user enters a matching word.
The trouble is that the user types day after day, and I'm afraid that search performance will degrade over time and that the Int field will eventually reach its maximum value.
So I limit the database to fewer than about 50,000 records.
I delete less-used records after a certain time.
But I don't know how to deal with the frequency Int field of each word.
How can I track how often each word is used without increasing the field forever?
I recommend that you use a logarithmic scale for the frequency values. That's what is often done in situations like this. See Wikipedia to learn about logarithmic scales.
For example, if you have a word MAN that has a frequency of 15, the value you store in the database would be log(15) ~= 1.17609125906.
If you then find 4 new occurrences of MAN, then you want to add 4 to the field. You cannot add the log values directly because log(x)+log(y)=log(x*y). (See the Logarithm Rules section of this article for more information on log rules.)
Instead -- assuming you use a base 10 logarithm, you would use this formula:
SET frequency = log(10^frequency+4)
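The update formula can be tried out in Python before putting it into SQL. This is a sketch assuming a base 10 logarithm, as above; the function name is hypothetical.

```python
import math

def add_occurrences(stored: float, new_count: int) -> float:
    """Add new_count raw occurrences to a frequency stored as log10(count)."""
    # Convert back to the raw count, add, and take the logarithm again.
    return math.log10(10 ** stored + new_count)

stored = math.log10(15)               # MAN seen 15 times
updated = add_occurrences(stored, 4)  # four more occurrences
print(stored, updated, math.log10(19))
```

The updated value equals log10(19) up to floating-point rounding, which is what adding 4 to a raw count of 15 should give.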
Depending on the length of your words, the few bytes for the frequency don't matter. With an unsigned four-byte integer you can count up to more than four billion, which is far more words than a user can type in their whole lifetime.
So you may want to go for two or three bytes, but the savings may be negligible.
Anyway, there are the following approaches for preventing overflow:
You can detect it, and then undo the operations, scale everything down by some factor of two, and then redo.
You can periodically check all your numbers and do the scaling when approaching the limit.
You can do a probabilistic update like below.
Probabilistic update
Instead of simply incrementing the frequency every time by one, you do it only with a probability which gets lower and lower as the counter grows. For example, you can do the increment with a probability of 1.0 / (oldValue + 1) or 2 ** -oldValue. The latter leads to a logarithmic growth, but, unlike the idea in the other answer, it works.
There are obviously some disadvantages due to the randomness and precision loss, but when all you care about is the relative frequency, it should be good enough.
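A minimal sketch of the 2 ** -oldValue variant in Python. The function names and the injectable `rng` parameter are assumptions for illustration. The estimate 2 ** value - 1 is not from the answer above; it is the usual way to read such a counter back as an approximate count.

```python
import random

def probabilistic_increment(old_value: int, rng=random.random) -> int:
    """Increment with probability 2 ** -old_value; otherwise keep the old value."""
    if rng() < 2.0 ** -old_value:
        return old_value + 1
    return old_value

def estimated_count(value: int) -> int:
    """Approximate number of events represented by a counter value."""
    return 2 ** value - 1

counter = 0
for _ in range(100_000):
    counter = probabilistic_increment(counter)
print(counter, estimated_count(counter))
```

After 100,000 events the stored value is typically somewhere around 16 or 17, so even a single byte would last a very long time, at the cost of the randomness and precision loss mentioned above.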

How Unique is CRC16 Value?

I'm developing an open-source .NET licensing engine.
The engine uses a hardware ID (the hard disk serial number) as the lock and takes the CRC16 of this value to get a shorter identifier.
An example value is MAXTOR ST3100, 476300BE, and its CRC16 result is 3FF0.
My concern is how often two different values produce the same CRC16 value. Should I use CRC32 instead?
Probability of collision between 2 items = 1 ⁄ 0x10000 = 0.00152%...
But if you have more than 2 items, see the Birthday Problem -- it gets a lot more likely:
You just need 300 items to get a 50% probability of collision.
$$\left(1 - \frac{0}{2^{16}}\right)\left(1 - \frac{1}{2^{16}}\right)\left(1 - \frac{2}{2^{16}}\right)\left(1 - \frac{3}{2^{16}}\right)\cdots\left(1 - \frac{N}{2^{16}}\right) = 50\% \quad\Rightarrow\quad N \approx 300$$
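The 300-item figure can be reproduced by multiplying out the product directly. This sketch assumes CRC16 outputs are spread uniformly over the 2^16 possible values, which is an idealization; the function names are hypothetical.

```python
def crc_collision_probability(items: int, bits: int = 16) -> float:
    """Exact birthday probability that at least two of `items` values collide."""
    space = 2 ** bits
    no_collision = 1.0
    for i in range(items):
        # The (i + 1)-th item must avoid the i values already taken.
        no_collision *= 1 - i / space
    return 1 - no_collision

def items_for_half(bits: int = 16) -> int:
    """Smallest number of items with at least a 50% chance of a collision."""
    n = 1
    while crc_collision_probability(n, bits) < 0.5:
        n += 1
    return n

print(crc_collision_probability(300))
print(items_for_half())
```

Calling the same functions with bits=32 shows how much later collisions become likely with CRC32.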
As CRC16 is a 16-bit value, I'd say that the chance is around 1 in 65536.
No hashing method generates unique values, collisions being guaranteed at some point. The closest bet based on your requirements is simply to use the harddisk serial number as-is.
Hackers will crack it easily though.

Resources