What are the advantages and disadvantages of NAND gates in an SR latch? - digital

Four NAND gates are used to implement an SR latch, which can easily be done with two NOR gates. So why do we prefer NAND over NOR? What are its advantages and disadvantages?

Related

Gaussian Process Confidence vs Credible Intervals

Since a Gaussian process returns a distribution and not a point estimate, why does this example (and, actually, every example with GPs) talk about confidence intervals rather than their Bayesian analogue, credible intervals?
I was wondering about that, too. My assumption is the following: GaussianProcessRegressor from sklearn implements Algorithm 2.1 from Rasmussen & Williams (2006). Throughout this book, the ±2σ interval around µ is referred to as "95% confidence region". They simply do not make a distinction between "confidence" and "credible" region. I think the authors of sklearn adopted that.
C. E. Rasmussen & C. K. I. Williams, Gaussian Processes for Machine Learning, MIT Press, 2006
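For what it's worth, sklearn's GaussianProcessRegressor exposes that same ±2σ band directly. A minimal sketch (the toy data and kernel choice here are placeholders):

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    X = np.linspace(0, 10, 20).reshape(-1, 1)  # toy inputs
    y = np.sin(X).ravel()                      # toy targets

    gpr = GaussianProcessRegressor(kernel=RBF()).fit(X, y)
    mu, sigma = gpr.predict(X, return_std=True)

    # The "95% confidence region" in Rasmussen & Williams' sense:
    lower, upper = mu - 1.96 * sigma, mu + 1.96 * sigma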

State-of-the-art method for large-scale near-duplicate detection of documents?

To my understanding, the scientific consensus in NLP is that the most effective method for near-duplicate detection in large-scale scientific document collections (more than 1 billion documents) is the one found here:
http://infolab.stanford.edu/~ullman/mmds/ch3.pdf
which can be briefly described by:
a) shingling of documents
b) minhashing to obtain the minhash signatures of the shingles
c) locality-sensitive hashing to avoid doing pairwise similarity calculations for all signatures, focusing instead only on pairs within buckets (a minimal sketch of the full pipeline follows).
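For concreteness, here is a minimal sketch of steps a)-c), using the datasketch library as one possible implementation (the names, shingle length, and threshold are illustrative, not prescribed by the book):

    from datasketch import MinHash, MinHashLSH

    def shingles(text, k=5):
        # a) k-character shingles of the document
        return {text[i:i + k] for i in range(len(text) - k + 1)}

    def signature(text, num_perm=128):
        # b) minhash signature of the shingle set
        m = MinHash(num_perm=num_perm)
        for s in shingles(text):
            m.update(s.encode("utf8"))
        return m

    # c) LSH buckets signatures so only same-bucket pairs are compared
    lsh = MinHashLSH(threshold=0.8, num_perm=128)
    docs = {"d1": "some document text ...", "d2": "some document text ..!"}
    for key, text in docs.items():
        lsh.insert(key, signature(text))
    candidates = lsh.query(signature(docs["d1"]))  # near-duplicate candidates of d1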
I am ready to implement this algorithm in MapReduce or Spark, but because I am new to the field (I have been reading up on large-scale near-duplicate detection for about two weeks) and the above was published quite a few years ago, I am wondering whether there are known limitations of this algorithm and whether there are different approaches that are more efficient (offering a more appealing performance/complexity trade-off).
Thanks in advance!
Regarding step b), there are recent developments which significantly speed up the calculation of the signatures:
Optimal Densification for Fast and Accurate Minwise Hashing, 2017,
https://arxiv.org/abs/1703.04664
Fast Similarity Sketching, 2017, https://arxiv.org/abs/1704.04370
SuperMinHash - A New Minwise Hashing Algorithm for Jaccard Similarity Estimation, 2017, https://arxiv.org/abs/1706.05698
ProbMinHash - A Class of Locality-Sensitive Hash Algorithms for the (Probability) Jaccard Similarity, 2019, https://arxiv.org/pdf/1911.00675.pdf

Hierarchical Clustering with branching factor > 2?

All the hierarchical clustering methods that I have seen implemented in Python (scipy, scikit-learn, etc.) split or combine two clusters at a time, which forces the branching factor to be 2 at each node. For my purposes, I want the model to allow a branching factor greater than 2. That is helpful in situations where there are ties between clusters.
I'm not familiar with any hierarchical clustering techniques that have a branching factor greater than 2; do they exist?
Cluster this data set with single link:
0 0
0 1
1 0
1 1
And you will see a 4-way merge.
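You can check this with scipy; it performs binary merges internally, but here they all happen at the same height, which is effectively one 4-way merge:

    import numpy as np
    from scipy.cluster.hierarchy import linkage

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    Z = linkage(X, method="single")
    print(Z)  # every merge occurs at distance 1.0 -- a tie across all four points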
But for other linkages, always finding the best 3-way split would likely increase the runtime cost to O(n^4). You really don't want that.

How could I deal with the sparse feature with high dimension in an SVR task?

I have a Twitter-like (another microblog) data set with 1.6 million data points and tried to predict each post's retweet count from its content. I extracted keywords and used them as bag-of-words features, which gave me a 1.2-million-dimensional feature vector. The feature vectors are very sparse, usually with only about ten nonzero dimensions per data point, and I use SVR to do the regression. Training has now been running for 2 days, and I expect it to take quite a long time. I don't know whether approaching the task this way is normal. Is there any way to optimize this, and is it even necessary?
BTW, in this case, if I don't use any kernel, and the machine has 32 GB of RAM and a 16-core i7, roughly how long would training take? I used the PyML library.
You need to find a dimensionality reduction approach that works for your problem.
I've worked on a similar problem to yours and I found that Information Gain worked well, but there are others.
I found this paper (Fabrizio Sebastiani, Machine Learning in Automated Text Categorization, ACM Computing Surveys, Vol. 34, No.1, pp.1-47, 2002) to be a good theoretical treatment of text classification, including feature reduction by a variety of methods from the simple (Term Frequency) to the complex (Information-Theoretic).
These functions try to capture the intuition that the best terms for c_i are the ones distributed most differently in the sets of positive and negative examples of c_i. However, interpretations of this principle vary across different functions. For instance, in the experimental sciences χ² is used to measure how the results of an observation differ (i.e., are independent) from the results expected according to an initial hypothesis (lower values indicate lower dependence). In DR we measure how independent t_k and c_i are. The terms t_k with the lowest value for χ²(t_k, c_i) are thus the most independent from c_i; since we are interested in the terms which are not, we select the terms for which χ²(t_k, c_i) is highest.
These techniques help you choose terms that are most useful in separating the training documents into the given classes; the terms with the highest predictive value for your problem.
I've been successful using Information Gain for feature reduction and found this paper (Christine Largeron, Christophe Moulin, and Mathias Géry, "Entropy based feature selection for text categorization", SAC 2011, pp. 924-928) to be a very good practical guide.
Here the authors present a simple formulation of entropy-based feature selection that's useful for implementation in code:
Given a term t_j and a category c_k, ECCD(t_j, c_k) can be computed from a contingency table. Let A be the number of documents in the category containing t_j; B, the number of documents in the other categories containing t_j; C, the number of documents of c_k which do not contain t_j; and D, the number of documents in the other categories which do not contain t_j (with N = A + B + C + D):
Using this contingency table, Information Gain can be estimated (in its standard contingency-table form) by:

    IG(t_j, c_k) = (A/N) · log( A·N / ((A+B)·(A+C)) )
                 + (B/N) · log( B·N / ((A+B)·(B+D)) )
                 + (C/N) · log( C·N / ((C+D)·(A+C)) )
                 + (D/N) · log( D·N / ((C+D)·(B+D)) )
This approach is easy to implement and provides very good Information-Theoretic feature reduction.
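As an illustration, here is a direct Python translation of that contingency-table estimate (a generic helper; the function and variable names are mine, not the paper's):

    import math

    def information_gain(A, B, C, D):
        # A: docs in the category containing the term
        # B: docs in other categories containing the term
        # C: docs in the category not containing the term
        # D: docs in other categories not containing the term
        N = A + B + C + D

        def part(joint, term_marg, cat_marg):
            # one cell's contribution: P(t,c) * log(P(t,c) / (P(t) * P(c)))
            if joint == 0:
                return 0.0
            return (joint / N) * math.log((joint * N) / (term_marg * cat_marg))

        return (part(A, A + B, A + C) + part(B, A + B, B + D)
                + part(C, C + D, A + C) + part(D, C + D, B + D))

    # Rank all terms by information_gain(...) and keep the top k as features.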
You needn't use a single technique either; you can combine them. Term Frequency is simple but can also be effective. I've successfully combined the Information Gain approach with Term Frequency to do feature selection. You should experiment with your data to see which technique or techniques work most effectively.
First, you can simply remove all words with very high frequency and all words with very low frequency, since neither tells you much about the content of a text; then apply word stemming.
After that you can try to reduce the dimensionality of your space with feature hashing, with a more advanced dimensionality reduction technique (PCA, ICA), or even with both.
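As a rough illustration of that pipeline (assuming scikit-learn; the sizes here are arbitrary), feature hashing followed by a truncated SVD might look like:

    from sklearn.feature_extraction.text import HashingVectorizer
    from sklearn.decomposition import TruncatedSVD

    # Hash raw text into a fixed, much smaller sparse space.
    hasher = HashingVectorizer(n_features=2**18)  # ~260k dims instead of 1.2M
    X = hasher.transform(texts)                   # texts: your iterable of posts

    # Optionally compress further with a linear projection that accepts sparse input.
    svd = TruncatedSVD(n_components=300)
    X_reduced = svd.fit_transform(X)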

Best 8-bit supplemental checksum for CRC8-protected packet

I'm looking at designing a low-level radio communications protocol, and am trying to decide what sort of checksum/CRC to use. The hardware provides a CRC-8; each packet has 6 bytes of overhead in addition to the data payload. One of the design goals is to minimize transmission overhead. For some types of data the CRC-8 should be adequate, but for other types it would be necessary to supplement it to avoid accepting erroneous data.
If I go with a single-byte supplement, what would be the pros and cons of using a CRC8 with a different polynomial from the hardware CRC-8, versus an arithmetic checksum, versus something else? What about for a two-byte supplement? Would a CRC-16 be a good choice, or given the existence of a CRC-8, would something else be better?
In 2004 Phillip Koopman from CMU published a paper on choosing the most appropriate CRC, http://www.ece.cmu.edu/~koopman/crc/index.html
This paper describes a polynomial selection process for embedded network applications and proposes a set of good general-purpose polynomials. A set of 35 new polynomials in addition to 13 previously published polynomials provides good performance for 3- to 16-bit CRCs for data word lengths up to 2048 bits.
That paper should help you analyze how effective that 8 bit CRC actually is, and how much more protection you'll get from another 8 bits. A while back it helped me to decide on a 4 bit CRC and 4 bit packet header in a custom protocol between FPGAs.
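If it helps with experimenting, here is a generic bitwise CRC-8 with a selectable polynomial (a sketch for exploration; the polynomials shown are common published choices, not a recommendation for your particular link):

    def crc8(data: bytes, poly: int = 0x07, init: int = 0x00) -> int:
        # MSB-first CRC-8; `poly` is the low 8 bits of the generator
        # polynomial (the x^8 term is implicit). 0x07 is CRC-8/ATM.
        crc = init
        for byte in data:
            crc ^= byte
            for _ in range(8):
                if crc & 0x80:
                    crc = ((crc << 1) ^ poly) & 0xFF
                else:
                    crc = (crc << 1) & 0xFF
        return crc

    # Compare two different polynomials over the same packet:
    packet = b"\x01\x02\x03\x04"
    print(hex(crc8(packet, poly=0x07)), hex(crc8(packet, poly=0x2F)))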
