Savitzky - Golay filter for 2D Matrices - image-processing

i am doing some research about implementing a Savitzky-Golay filter for images. As far as i have read, the main application for this filter is signal processing, e.g. for smoothing audio-files.
The idea is fitting a polynomial through a defined neighbourhood around point P(i) and setting this point P to his new value P_new(i) = polynomial(i).
The problem in 2D-space is - in my opinion - that there is not only one direction to do the fitting. You can use different "directions" to find a polynomial. Like for
[51 52 11 33 34]
[41 42 12 24 01]
[01 02 PP 03 04]
[21 23 13 43 44]
[31 32 14 53 54]
It could be:
[01 02 PP 03 04], (horizontal)
[11 12 PP 23 24], (vertical)
[51 42 PP 43 54], (diagonal)
[41 42 PP 43 44], (semi-diagonal?)
but also
[41 02 PP 03 44], (semi-diagonal as well)
(see my illustration)
So my question is: Does the Savitzky-Golay filter even make sense for 2D-space, and if yes, is there and any defined generalized form for this filter for higher dimensions and larger filter masks?
Thank you !

A first option is to use SG filtering in a separable way, i.e. filtering once on the horizontal rows, then a second time on the vertical rows.
A second option is to rewrite the equations with a bivariate polynomial (bicubic f.i.) and solve for the coefficients by least-squares.

Related

Google sheets importHTML removes zero and treats commas as decimal

I'm trying to import a table where the commas are the 1000 separator,
example: 32,100 is 32100 but it is treating it as 32.1 instead.
This is a similar table (first one / top left):
https://en.wikipedia.org/wiki/Demographics_of_the_world
imgur for screenshots:
https://imgur.com/a/hJR9tox
I want it to say:
Year million
1500 458
1600 580
1700 682
1750 791
1800 978
1850 1262
1900 1650
1950 2521
1999 5978
2008 6707
2011 7000
2015 7350
2018 7600
2020 7750
But it comes out as:
Year million
1500 458
1600 580
1700 682
1750 791
1800 978
1850 1,262
1900 1,65
1950 2,521
1999 5,978
2008 6,707
2011 7
2015 7,35
2018 7,6
2020 7,75
This is the function I'm using:
=IMPORTHTML("https://en.wikipedia.org/wiki/Demographics_of_the_world"; "table"; 1)
I have also tried using this function:
=IMPORTXML("https://en.wikipedia.org/wiki/Demographics_of_the_world"; "//*[#id='mw-content-text']/div/table[1]/tbody")
But that shows as this witch is extremely hard to understand since it looks like this and still removes the zeros:
World Population[1][2] Yearmillion 1500458 1600580 1700682 1750791 1800978 18501,262 19001,65 19502,521 19995,978 20086,707 20117 20157,35 20187,6 20207,75
Other things i have tried is:
forsing it to always print out three decimals, that wont work since it adds more numbers to the end of all numbers.
The main & easiest possible solution that you have is to change your Spreadsheet's locale setting to one that uses the , as mile separator.
As an alternative, if changing this setting is really not a possibility, you could create a script that uses URLFetchApp to retrieve the page's contents and parses the values, taking into considerations the usage of , as mile separator.

how to use seq2seq to decode concatenated string

Am trying to decode a concatenated String like below ...
SQCB7A750BATWE SQ CB 7 A 750 B A T WE
PT05A1219PY023 PT 05 A 12 19 P Y 023
PT55A1019PX02 PT 55 A 10 19 P X 02
PT33SE2215SW023 PT 33 SE 22 15 S W 023
PT05A2216PW023(LC) PT 05 A 22 16 P W 023 (LC)
am looking for a smarter way rather than hard-coded rules as the input will have variations(number of characters and digits), I came across SEQ2SEQ model and I want to know if it's possible to use it in such problem
I already followed some tutorials to get a taste of it, but the results weren't even close
it also seems there are 2 approaches character level and word level as per this tutorial
Character level:
Input sentence: SQCACA333BA71A
Decoded sentence: P 9(PDD366AZ2IDD4K )F)F(L)L)1)1)1) 6A
-
Input sentence: SQCAAC152DA71A
Decoded sentence: P 9(PDD366AZ2IDD4K )F)F(L)L)1)1)1) 6A
am still trying to implement the word level, but I'd like to know if the problem can be solved using this approach (seq2seq)

The DHT in JPEG does not contain the actual huffman code, how come?

The DHT contains 16 bytes that just contains count of how many values were encoded with huffman code of each length from 1 bit all the way to 16 bits. After this, it contains the actual values that were encoded, all these value are 8 bits in size.
Q: Why is huffman code not stored, how does decoder derive the codes?
Q: If there are say 4 values that have huffman code of 3 bits long, we shall write them as 4 bytes. Does it matter what order they are in or they have to be in ascending or descending order? I do know that the values must be in order such that the values with 1 bit huffman code are then followed by values with 2 bit huffman code e.t.c.
Q: I have used jpegsnoop to look at huffman table of different files, some made in MS paint and some were downloaded. I find that they all have the SAME table.
Here are the DHT entries I got from JPEG snoop:
Destination ID = 1
Class = 1 (AC Table)
Codes of length 01 bits (000 total):
Codes of length 02 bits (002 total): 00 01
Codes of length 03 bits (001 total): 02
Codes of length 04 bits (002 total): 03 11
Codes of length 05 bits (004 total): 04 05 21 31
Codes of length 06 bits (004 total): 06 12 41 51
Codes of length 07 bits (003 total): 07 61 71
Codes of length 08 bits (004 total): 13 22 32 81
Codes of length 09 bits (007 total): 08 14 42 91 A1 B1 C1
Codes of length 10 bits (005 total): 09 23 33 52 F0
Codes of length 11 bits (004 total): 15 62 72 D1
Codes of length 12 bits (004 total): 0A 16 24 34
Codes of length 13 bits (000 total):
Codes of length 14 bits (001 total): E1
Codes of length 15 bits (002 total): 25 F1
Codes of length 16 bits (119 total): 17 18 19 1A 26 27 28 29 2A 35 36 37 38 39 3A 43
44 45 46 47 48 49 4A 53 54 55 56 57 58 59 5A 63
64 65 66 67 68 69 6A 73 74 75 76 77 78 79 7A 82
83 84 85 86 87 88 89 8A 92 93 94 95 96 97 98 99
9A A2 A3 A4 A5 A6 A7 A8 A9 AA B2 B3 B4 B5 B6 B7
B8 B9 BA C2 C3 C4 C5 C6 C7 C8 C9 CA D2 D3 D4 D5
D6 D7 D8 D9 DA E2 E3 E4 E5 E6 E7 E8 E9 EA F2 F3
F4 F5 F6 F7 F8 F9 FA
Total number of codes: 162
And
Destination ID = 1
Class = 0 (DC / Lossless Table)
Codes of length 01 bits (000 total):
Codes of length 02 bits (003 total): 00 01 02
Codes of length 03 bits (001 total): 03
Codes of length 04 bits (001 total): 04
Codes of length 05 bits (001 total): 05
Codes of length 06 bits (001 total): 06
Codes of length 07 bits (001 total): 07
Codes of length 08 bits (001 total): 08
Codes of length 09 bits (001 total): 09
Codes of length 10 bits (001 total): 0A
Codes of length 11 bits (001 total): 0B
Codes of length 12 bits (000 total):
Codes of length 13 bits (000 total):
Codes of length 14 bits (000 total):
Codes of length 15 bits (000 total):
Codes of length 16 bits (000 total):
Total number of codes: 012
The AC table compresses RRRRSSSS that contain zero-run length and AC coefficient magnitude while the DC table compresses SSSS. Thus, I think that the AC table must contain total of 255 entries (exlcuded 0) while the DC table must be 15 entries (excluded 0). However, neither of the tables contain this many total number of codes. WHY?
Q: Why is huffman code not stored, how does decoder derive the codes?
The reason the Huffman tables is defined as they are rather than with the actual codes is that it is much smaller and simpler to encode that way. PNG uses a similar but different method.
Keep in mind that to store the Huffman codes in the JPEG stream you would need to include both the length and the code itself. The result would be much larger than encoding a count of lengths.
Q: If there are say 4 values that have huffman code of 3 bits long, we shall write them as 4 bytes. Does it matter what order they are in or they have to be in ascending or descending order?
If the Huffman code has 3 bits, it is written as three bits to the JPEG stream. The codes are generated in ascending order.
Q: I have used jpegsnoop to look at huffman table of different files, some made in MS paint and some were downloaded. I find that they all have the SAME table.
The encoder is being lazy and using a fixed Huffman table. There is a sample Huffman table in the JPEG standard that they often use. To generate optimal Huffman codes, the encoder must make two passes over the data. With a preset table, the encoder only needs to make one pass.
F.1.2.1.2 and F.1.2.2.1 of the JPEG Specification explain why the Huffman tables are not fully populated. For baseline encoding DC difference values are limited to 11 bits (table F.1) and AC values are limited to 10 bits (table F.2).
Since DC Huffman symbols only need SSSS values from 0 to 11 their Huffman trees need only 12 codes as you've reported.
AC Huffman symbols have a prefix zero run count from 0 to 15. With 11 bit sizes that works out to 16 * 11 = 176 symbols. However, they don't include the symbols 0x10, 0x20, ... 0xE0 because they are redundant. They encode a run of 1, 2, ... 14 zeros followed by a 0 value. If an encoder has, say, 7 zero values followed by a 3 bit value it can encode that as 0x73. There would be no point encoding it with two symbols 0x60;0x03.
Ignoring those 14 useless symbols we end up with 162 codes as you have reported.
By the way, the 0xF0 (ZRL) value is needed because there isn't a symbol that can express a run of 16 zeros followed by a value thus is cannot be merged.
I don't know why the JPEG spec limits the DC and AC values to a certain number of bits. I speculate that the extra precision would have no effect or is typically thrown away by quantization. Or maybe it has to do with the mathematics of the Inverse Discrete Cosine Transform. Keep in mind that these Huffman encoded values are (quantized) coefficients for the IDCT and are only indirectly related to the continuous tone RGB output.
The Huffman encoding is almost fully determined by the relative frequencies of all 256 symbols (except tiebreaker rules). This means you can choose many, many formats to encode those relative frequencies; the most simple one would be to simply store all these frequencies. The receiver can then rebuild the encoding from that order.
Background: the two least frequent characters of a Huffman encoding share the same (long) prefix, and differ only in the last bit. This combination is then assigned a joint frequency (sum of both combinations), which is used recursively to determine the prefix. Finally, you end up with two sets, one holding X characters and the other holding 256-X characters. The first set has prefix 0 and the second set has prefix 1.
Yes, that's arbitrary, you could swap those 0 and 1, and have a similar table and the same compression ratio - a 0 is just as long as a one. That's why you have detailed rules (e.g. most common set gets 0, tiebreaker is first byte in set)
Back to encoding. You want to store these relative frequencies efficiently, as we're using compression here. Now, as I pointed out, when we have codes suffix-0 and suffix-1, they're both equally long (namely the suffix length plus one). So we know from the fact that there are 119 unique 16 bits codes that there are 60 unique prefixes with length 15. Calculating backwards, we also know that there are two unique symbols with length 15, total 62, so there must be 31 prefixes of length 14. We can backtrack again to prefixes of length 1.
Again, it's necessary to point out that we don't know here the exact values of those prefixes, and the matching symbols. This depends on the tiebreaker rules, as pointed out, but those rules are fixed for JPEG.
JPEG does have a bit of a special case: Huffman codes for very rare symbols should be longer than 16 bits. That's inconvenient, so in building the table you don't choose the two sets with the least combined frequency if either of them already has a long suffix - combining those two subsets would just make the suffix even longer. You see this with all the 16-bit codes in the example: most should have had longer codes in a pure Huffman encoding.
I think the worst case is if the most frequent character appears 50%, the next 25%, etc. You'll get codes 0, 10, 110, 1110 etcetera. That's unary counting, which is indeed optimal for that case, but the longest code is now 256 bits. You'd need a document with 2^256 bytes to have a frequency of 2^-256, though.

VARSTOCASES (in SPSS) function with unequal spacing of waves

My problem with the VARSTOCASES is that I'm unable to deal with unequal spacing of waves in longitudinal data (I'm using the NLSY79). My dependent variable (log of wage) is not available for all years. But with R, I can easily deal with that using a syntax like this :
ld = reshape(d, varying = c("logwage1989", "logwage1990", "logwage1991", "logwage1992", "logwage1993", "logwage1994", "logwage1996", "logwage1998", "logwage2000", "logwage2002", "logwage2004", "logwage2006", "logwage2008", "logwage2010"), v.names = "logwage", timevar = "year", times = c("1989", "1990", "1991", "1992", "1993", "1994", "1996", "1998", "2000", "2002", "2004", "2006", "2008", "2010"), direction = "long")
And in SPSS, what I do is something like this :
VARSTOCASES
/make logwage from logwage1989 logwage1990 logwage1991 logwage1992 logwage1993 logwage1994 logwage1996 logwage1998 logwage2000 logwage2002 logwage2004 logwage2006 logwage2008 logwage2010
/index= year(14)
/keep=grade AFQT educmom educdad occupationmom occupationdad familyincome.
In the above, 14 is the total number of waves. And what SPSS outputs is a series of numbers going from 1 to 14. The data is collected once every year first, and then it's collected once every two years. For SPSS, the values 1 and 2 in the year variable correspond to 1989 and 1990 while values 13 and 14 correspond to 2008 and 2010, respectively. And that's the problem.
How would you write the reshape function in SPSS as I did in R ?
On the VARSTOCASES command instead of using a numeric index you can use a string index, which will put the original variable names into the column. This can then be converted to a numeric column of the years.
DATA LIST FREE /logwage1989 logwage1990 logwage1991 logwage1992 logwage1993 logwage1994 logwage1996 logwage1998
logwage2000 logwage2002 logwage2004 logwage2006 logwage2008 logwage2010.
BEGIN DATA.
89 90 91 92 93 94 96 98 00 02 04 06 08 10
END DATA.
VARSTOCASES
/MAKE logwage FROM logwage1989 TO logwage2010
/INDEX=year (logwage).
*Now convert to an actual year.
COMPUTE year = REPLACE(year,"logwage","").
ALTER TYPE year (F4.0).

How correlation help in matching two images?

It is known that if we are finding the most matching window to the current window in the entire image, then wherever the correlation is maximised then that is the matching window.
[22 12 14] (window)
(image)
[22 12 34 54 ]
[112 34 54 111 ]
[12 22 12 34 ]
[11 22 12 14 ]
But correlation is product of corresponding values in two windows.
So, if some of the window have high intensity values then they will always provide better match. e.g. in above example we will have higher correlation value for 2nd row.
Probably you need Normalized Cross Correlation, the maximum will be in 4th row.

Resources