WEKA Regression Model - machine-learning

I'm trying to build and test a regression model in WEKA. The problem is that I don't know enough about WEKA to accomplish what I'm trying to do. The data set I'm using is a sample set from a WEKA repository. Here are the first few lines of my .arff file:
#relation autompg
#attribute MPG numeric
#attribute Cynlinders numeric
#attribute Displacement numeric
#attribute Horsepower numeric
#attribute Weight numeric
#attribute Acceleration numeric
#attribute Year numeric
#attribute Origin numeric
#attribute me {'chevrolet chevelle malibu','buick skylark 320','plymouth satellite','amc rebel sst','ford torino','ford galaxie 500','chevrolet impala','plymouth fury iii','pontiac catali','amc ambassador dpl','citroen ds-21 pallas','chevrolet chevelle concours (sw)','ford torino (sw)','plymouth satellite (sw)','amc rebel sst (sw)','dodge challenger se','plymouth cuda 340','ford mustang boss 302','chevrolet monte carlo','buick estate wagon (sw)','toyota coro mark ii','plymouth duster','amc hornet','ford maverick','datsun pl510','volkswagen 1131 deluxe sedan','peugeot 504','audi 100 ls','saab 99e','bmw 2002','amc gremlin','ford f250','chevy c20','dodge d200','hi 1200d','chevrolet vega 2300','toyota coro','ford pinto','volkswagen super beetle 117','plymouth satellite custom','ford torino 500','amc matador','pontiac catali brougham','dodge moco (sw)','ford country squire (sw)','pontiac safari (sw)','amc hornet sportabout (sw)','chevrolet vega (sw)','pontiac firebird','ford mustang','mercury capri 2000','opel 1900','peugeot 304','fiat 124b','toyota corolla 1200','datsun 1200','volkswagen model 111','plymouth cricket','toyota coro hardtop','dodge colt hardtop','volkswagen type 3','chevrolet vega','ford pinto rubout','amc ambassador sst','mercury marquis','buick lesabre custom','oldsmobile delta 88 royale','chrysler newport royal','mazda rx2 coupe','amc matador (sw)','ford gran torino (sw)','plymouth satellite custom (sw)','volvo 145e (sw)','volkswagen 411 (sw)','peugeot 504 (sw)','reult 12 (sw)','ford pinto (sw)','datsun 510 (sw)','toyouta coro mark ii (sw)','dodge colt (sw)','toyota corolla 1600 (sw)','buick century 350','chevrolet malibu','ford gran torino','dodge coronet custom','mercury marquis brougham','chevrolet caprice classic','ford ltd','plymouth fury gran sedan','chrysler new yorker brougham','buick electra 225 custom','amc ambassador brougham','plymouth valiant','chevrolet nova custom','volkswagen super beetle','ford country','plymouth custom 
suburb','oldsmobile vista cruiser','toyota cari','datsun 610','maxda rx3','mercury capri v6','fiat 124 sport coupe','chevrolet monte carlo s','pontiac grand prix','fiat 128','opel manta','audi 100ls','volvo 144ea','dodge dart custom','saab 99le','toyota mark ii','oldsmobile omega','chevrolet nova','datsun b210','chevrolet chevelle malibu classic','plymouth satellite sebring','buick century luxus (sw)','dodge coronet custom (sw)','audi fox','volkswagen dasher','datsun 710','dodge colt','fiat 124 tc','honda civic',subaru,'fiat x1.9','plymouth valiant custom','mercury morch','chevrolet bel air','plymouth grand fury','buick century','chevroelt chevelle malibu','plymouth fury','buick skyhawk','chevrolet monza 2+2','ford mustang ii','toyota corolla','pontiac astro','volkswagen rabbit','amc pacer','volvo 244dl','honda civic cvcc','fiat 131','capri ii','reult 12tl','dodge coronet brougham','chevrolet chevette','chevrolet woody','vw rabbit','dodge aspen se','ford grada ghia','pontiac ventura sj','amc pacer d/l','datsun b-210','volvo 245','plymouth volare premier v8','mercedes-benz 280s','cadillac seville','chevy c10','ford f108','dodge d100','honda accord cvcc','buick opel isuzu deluxe','reult 5 gtl','plymouth arrow gs','datsun f-10 hatchback','oldsmobile cutlass supreme','dodge moco brougham','mercury cougar brougham','chevrolet concours','buick skylark','plymouth volare custom','ford grada','pontiac grand prix lj','chevrolet monte carlo landau','chrysler cordoba','ford thunderbird','volkswagen rabbit custom','pontiac sunbird coupe','toyota corolla liftback','ford mustang ii 2+2','dodge colt m/m','subaru dl','datsun 810','bmw 320i','mazda rx-4','volkswagen rabbit custom diesel','ford fiesta','mazda glc deluxe','datsun b210 gx','oldsmobile cutlass salon brougham','dodge diplomat','mercury morch ghia','pontiac phoenix lj','ford fairmont (auto)','ford fairmont (man)','plymouth volare','amc concord','buick century special','mercury zephyr','dodge aspen','amc concord 
d/l','buick regal sport coupe (turbo)','ford futura','dodge magnum xe','datsun 510','dodge omni','toyota celica gt liftback','plymouth sapporo','oldsmobile starfire sx','datsun 200-sx','audi 5000','volvo 264gl','saab 99gle','peugeot 604sl','volkswagen scirocco','honda accord lx','pontiac lemans v6','mercury zephyr 6','ford fairmont 4','amc concord dl 6','dodge aspen 6','ford ltd landau','mercury grand marquis','dodge st. regis','chevrolet malibu classic (sw)','chrysler lebaron town # country (sw)','vw rabbit custom','maxda glc deluxe','dodge colt hatchback custom','amc spirit dl','mercedes benz 300d','cadillac eldorado','plymouth horizon','plymouth horizon tc3','datsun 210','fiat strada custom','buick skylark limited','chevrolet citation','oldsmobile omega brougham','pontiac phoenix','toyota corolla tercel','datsun 310','ford fairmont','audi 4000','toyota coro liftback','mazda 626','datsun 510 hatchback','mazda glc','vw rabbit c (diesel)','vw dasher (diesel)','audi 5000s (diesel)','mercedes-benz 240d','honda civic 1500 gl','reult lecar deluxe','vokswagen rabbit','datsun 280-zx','mazda rx-7 gs','triumph tr7 coupe','ford mustang cobra','honda accord','plymouth reliant','dodge aries wagon (sw)','toyota starlet','plymouth champ','honda civic 1300','datsun 210 mpg','toyota tercel','mazda glc 4','plymouth horizon 4','ford escort 4w','ford escort 2h','volkswagen jetta','reult 18i','honda prelude','datsun 200sx','peugeot 505s turbo diesel','saab 900s','volvo diesel','toyota cressida','datsun 810 maxima','oldsmobile cutlass ls','ford grada gl','chrysler lebaron salon','chevrolet cavalier','chevrolet cavalier wagon','chevrolet cavalier 2-door','pontiac j2000 se hatchback','dodge aries se','ford fairmont futura','amc concord dl','volkswagen rabbit l','mazda glc custom l','mazda glc custom','plymouth horizon miser','mercury lynx l','nissan stanza xe','honda civic (auto)','datsun 310 gx','buick century limited','oldsmobile cutlass ciera (diesel)','chrysler lebaron 
medallion','ford grada l','toyota celica gt','dodge charger 2.2','chevrolet camaro','ford mustang gl','vw pickup','dodge rampage','ford ranger','chevy s-10'}
#data
18,8,307,130,3504,12,70,1,'chevrolet chevelle malibu'
15,8,350,165,3693,11.5,70,1,'buick skylark 320'
18,8,318,150,3436,11,70,1,'plymouth satellite'
I then run a test set with the following data:
#relation autompg
#attribute MPG numeric
#attribute Cynlinders numeric
#attribute Displacement numeric
#attribute Horsepower numeric
#attribute Weight numeric
#attribute Acceleration numeric
#attribute Year numeric
#attribute Origin numeric
#attribute me {'chevrolet chevelle malibu','buick skylark 320','plymouth satellite','amc rebel sst','ford torino','ford galaxie 500','chevrolet impala','plymouth fury iii','pontiac catali','amc ambassador dpl','citroen ds-21 pallas','chevrolet chevelle concours (sw)','ford torino (sw)','plymouth satellite (sw)','amc rebel sst (sw)','dodge challenger se','plymouth cuda 340','ford mustang boss 302','chevrolet monte carlo','buick estate wagon (sw)','toyota coro mark ii','plymouth duster','amc hornet','ford maverick','datsun pl510','volkswagen 1131 deluxe sedan','peugeot 504','audi 100 ls','saab 99e','bmw 2002','amc gremlin','ford f250','chevy c20','dodge d200','hi 1200d','chevrolet vega 2300','toyota coro','ford pinto','volkswagen super beetle 117','plymouth satellite custom','ford torino 500','amc matador','pontiac catali brougham','dodge moco (sw)','ford country squire (sw)','pontiac safari (sw)','amc hornet sportabout (sw)','chevrolet vega (sw)','pontiac firebird','ford mustang','mercury capri 2000','opel 1900','peugeot 304','fiat 124b','toyota corolla 1200','datsun 1200','volkswagen model 111','plymouth cricket','toyota coro hardtop','dodge colt hardtop','volkswagen type 3','chevrolet vega','ford pinto rubout','amc ambassador sst','mercury marquis','buick lesabre custom','oldsmobile delta 88 royale','chrysler newport royal','mazda rx2 coupe','amc matador (sw)','ford gran torino (sw)','plymouth satellite custom (sw)','volvo 145e (sw)','volkswagen 411 (sw)','peugeot 504 (sw)','reult 12 (sw)','ford pinto (sw)','datsun 510 (sw)','toyouta coro mark ii (sw)','dodge colt (sw)','toyota corolla 1600 (sw)','buick century 350','chevrolet malibu','ford gran torino','dodge coronet custom','mercury marquis brougham','chevrolet caprice classic','ford ltd','plymouth fury gran sedan','chrysler new yorker brougham','buick electra 225 custom','amc ambassador brougham','plymouth valiant','chevrolet nova custom','volkswagen super beetle','ford country','plymouth custom 
suburb','oldsmobile vista cruiser','toyota cari','datsun 610','maxda rx3','mercury capri v6','fiat 124 sport coupe','chevrolet monte carlo s','pontiac grand prix','fiat 128','opel manta','audi 100ls','volvo 144ea','dodge dart custom','saab 99le','toyota mark ii','oldsmobile omega','chevrolet nova','datsun b210','chevrolet chevelle malibu classic','plymouth satellite sebring','buick century luxus (sw)','dodge coronet custom (sw)','audi fox','volkswagen dasher','datsun 710','dodge colt','fiat 124 tc','honda civic',subaru,'fiat x1.9','plymouth valiant custom','mercury morch','chevrolet bel air','plymouth grand fury','buick century','chevroelt chevelle malibu','plymouth fury','buick skyhawk','chevrolet monza 2+2','ford mustang ii','toyota corolla','pontiac astro','volkswagen rabbit','amc pacer','volvo 244dl','honda civic cvcc','fiat 131','capri ii','reult 12tl','dodge coronet brougham','chevrolet chevette','chevrolet woody','vw rabbit','dodge aspen se','ford grada ghia','pontiac ventura sj','amc pacer d/l','datsun b-210','volvo 245','plymouth volare premier v8','mercedes-benz 280s','cadillac seville','chevy c10','ford f108','dodge d100','honda accord cvcc','buick opel isuzu deluxe','reult 5 gtl','plymouth arrow gs','datsun f-10 hatchback','oldsmobile cutlass supreme','dodge moco brougham','mercury cougar brougham','chevrolet concours','buick skylark','plymouth volare custom','ford grada','pontiac grand prix lj','chevrolet monte carlo landau','chrysler cordoba','ford thunderbird','volkswagen rabbit custom','pontiac sunbird coupe','toyota corolla liftback','ford mustang ii 2+2','dodge colt m/m','subaru dl','datsun 810','bmw 320i','mazda rx-4','volkswagen rabbit custom diesel','ford fiesta','mazda glc deluxe','datsun b210 gx','oldsmobile cutlass salon brougham','dodge diplomat','mercury morch ghia','pontiac phoenix lj','ford fairmont (auto)','ford fairmont (man)','plymouth volare','amc concord','buick century special','mercury zephyr','dodge aspen','amc concord 
d/l','buick regal sport coupe (turbo)','ford futura','dodge magnum xe','datsun 510','dodge omni','toyota celica gt liftback','plymouth sapporo','oldsmobile starfire sx','datsun 200-sx','audi 5000','volvo 264gl','saab 99gle','peugeot 604sl','volkswagen scirocco','honda accord lx','pontiac lemans v6','mercury zephyr 6','ford fairmont 4','amc concord dl 6','dodge aspen 6','ford ltd landau','mercury grand marquis','dodge st. regis','chevrolet malibu classic (sw)','chrysler lebaron town # country (sw)','vw rabbit custom','maxda glc deluxe','dodge colt hatchback custom','amc spirit dl','mercedes benz 300d','cadillac eldorado','plymouth horizon','plymouth horizon tc3','datsun 210','fiat strada custom','buick skylark limited','chevrolet citation','oldsmobile omega brougham','pontiac phoenix','toyota corolla tercel','datsun 310','ford fairmont','audi 4000','toyota coro liftback','mazda 626','datsun 510 hatchback','mazda glc','vw rabbit c (diesel)','vw dasher (diesel)','audi 5000s (diesel)','mercedes-benz 240d','honda civic 1500 gl','reult lecar deluxe','vokswagen rabbit','datsun 280-zx','mazda rx-7 gs','triumph tr7 coupe','ford mustang cobra','honda accord','plymouth reliant','dodge aries wagon (sw)','toyota starlet','plymouth champ','honda civic 1300','datsun 210 mpg','toyota tercel','mazda glc 4','plymouth horizon 4','ford escort 4w','ford escort 2h','volkswagen jetta','reult 18i','honda prelude','datsun 200sx','peugeot 505s turbo diesel','saab 900s','volvo diesel','toyota cressida','datsun 810 maxima','oldsmobile cutlass ls','ford grada gl','chrysler lebaron salon','chevrolet cavalier','chevrolet cavalier wagon','chevrolet cavalier 2-door','pontiac j2000 se hatchback','dodge aries se','ford fairmont futura','amc concord dl','volkswagen rabbit l','mazda glc custom l','mazda glc custom','plymouth horizon miser','mercury lynx l','nissan stanza xe','honda civic (auto)','datsun 310 gx','buick century limited','oldsmobile cutlass ciera (diesel)','chrysler lebaron 
medallion','ford grada l','toyota celica gt','dodge charger 2.2','chevrolet camaro','ford mustang gl','vw pickup','dodge rampage','ford ranger','chevy s-10'}
#data
14,8,455,225,4425,10,70,1,'pontiac catali'
15,8,390,190,3850,8.5,70,1,'amc ambassador dpl'
My question is this: no matter what is in my data, when I choose my test set, the number of instances is always shown as "?", and the output never evaluates the data. Other than the long formula that the filter creates, this is my output:
Time taken to build model: 3.98 seconds
=== Evaluation on test set ===
=== Summary ===
Correlation coefficient 0.9917
Mean absolute error 0.5322
Root mean squared error 0.971
Relative absolute error 7.7403 %
Root relative squared error 11.6685 %
Total Number of Instances 223

Click on 'More Options..' in the Classify tab and select 'Output Predictions' to show the predictions on the test set.

Related

Census data extraction for time series

I am trying to download the average population for AZ counties using tidycensus, using the code below. How can I download population data as a time series for 2000-2019 (interpolating for years that do not have decennial census or ACS data)?
library(tidycensus)
library(tidyverse)
soc.2010 <- get_decennial(geography = "county", state = "AZ", year = 2010, variables = (c(pop="P001001")), survey="sf1")
soc.16 <- get_acs(geography = "county", year=2016, variables = (c(pop="B01003_001")),state="AZ", survey="acs5") %>% mutate(Year = "2016")
You can use the tidycensus function get_estimates() to get population estimates by county for each year beginning in 2010.
library(tidycensus)
library(dplyr)
get_estimates(
  geography = "county",
  state = "AZ",
  product = "population",
  time_series = TRUE
) %>%
  filter(DATE >= 3) %>%
  mutate(year = DATE + 2007)
#> # A tibble: 300 x 6
#> NAME DATE GEOID variable value year
#> <chr> <dbl> <chr> <chr> <dbl> <dbl>
#> 1 Pima County, Arizona 3 04019 POP 981620 2010
#> 2 Pima County, Arizona 4 04019 POP 988381 2011
#> 3 Pima County, Arizona 5 04019 POP 993052 2012
#> 4 Pima County, Arizona 6 04019 POP 997127 2013
#> 5 Pima County, Arizona 7 04019 POP 1004229 2014
#> 6 Pima County, Arizona 8 04019 POP 1009103 2015
#> 7 Pima County, Arizona 9 04019 POP 1016707 2016
#> 8 Pima County, Arizona 10 04019 POP 1026391 2017
#> 9 Pima County, Arizona 11 04019 POP 1036554 2018
#> 10 Pima County, Arizona 12 04019 POP 1047279 2019
#> # ... with 290 more rows
The API returns somewhat confusing date codes that I've converted to years. See the date code to year mapping for 2019 population estimates for more information.
For years prior to 2010, the Census API uses a different format that is not accessible via tidycensus. But here is an API call that gives you population by county by year for 2000 to 2010:
https://api.census.gov/data/2000/pep/int_population?get=GEONAME,POP,DATE_DESC&for=county:*&in=state:04
["Graham County, Arizona","33356","7/1/2001 population estimate","04","009"],
["Graham County, Arizona","33224","7/1/2002 population estimate","04","009"],
["Graham County, Arizona","32985","7/1/2003 population estimate","04","009"],
["Graham County, Arizona","32703","7/1/2004 population estimate","04","009"],
["Graham County, Arizona","32964","7/1/2005 population estimate","04","009"],
["Graham County, Arizona","33701","7/1/2006 population estimate","04","009"],
["Graham County, Arizona","35175","7/1/2007 population estimate","04","009"],
["Graham County, Arizona","36639","7/1/2008 population estimate","04","009"],
["Graham County, Arizona","37525","7/1/2009 population estimate","04","009"],
["Graham County, Arizona","37220","4/1/2010 Census 2010 population","04","009"],
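The rows returned by that call can be reshaped into one record per county and year. The following is a minimal sketch, assuming each row has the five fields requested above (GEONAME, POP, DATE_DESC, state, county), that the first row of the response is a header, and that DATE_DESC always begins with a month/day/year date. The function name parse_pep_rows is hypothetical and not part of any Census library.

```python
import re

def parse_pep_rows(rows):
    """Turn [GEONAME, POP, DATE_DESC, state, county] rows into dicts.

    Assumes DATE_DESC starts with a date such as "7/1/2001"; rows that do
    not match (for example a header row) are skipped.
    """
    records = []
    for geoname, pop, date_desc, state, county in rows:
        match = re.match(r"\d{1,2}/\d{1,2}/(\d{4})", date_desc)
        if match is None:
            continue
        records.append({
            "name": geoname,
            "geoid": state + county,
            "year": int(match.group(1)),
            "population": int(pop),
            "description": date_desc,
        })
    return records

# Assumed header row plus three rows copied from the output above.
sample = [
    ["GEONAME", "POP", "DATE_DESC", "state", "county"],
    ["Graham County, Arizona", "33356", "7/1/2001 population estimate", "04", "009"],
    ["Graham County, Arizona", "37525", "7/1/2009 population estimate", "04", "009"],
    ["Graham County, Arizona", "37220", "4/1/2010 Census 2010 population", "04", "009"],
]
records = parse_pep_rows(sample)
for record in records:
    print(record["year"], record["name"], record["population"])
```

The description is kept in each record because a year may be reported more than once (the 2010 row above is a census count rather than a July estimate), so you may want to filter on it before joining with the get_estimates() results.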

Data Visualization & Machine Learning

Preprocess the data and compare the results before and after preprocessing (report as accuracy).
Draw the following charts:
Correlation heatmap chart
Missing values heatmap chart
Line chart / scatter chart for Country vs Purchased, Age vs Purchased and Salary vs Purchased
Country  Age  Salary  Purchased
France   44   72000   No
Spain    27   48000   Yes
Germany  30   54000   No
Spain    38   61000   No
Germany  40           Yes
France   35   58000   Yes
Spain         52000   No
France   48   79000   Yes
Germany  50   83000   No
France   37           Yes
France        18888   No
Spain    17   67890   Yes
Germany       12000   No
Spain    38   98888   No
Germany  50           Yes
France   35   58000   Yes
Spain         12345   No
France   23           Yes
Germany  55   78456   No
France        43215   Yes
A scatter plot of two categorical columns such as Country vs Purchased can be hard to read: each of the three countries in your list has some purchases, so the points alone tell you little. A heatmap can be more helpful here.
import pandas as pd
from matplotlib import pyplot as plt

# Read the CSV with pandas
df = pd.read_csv('Data.csv')
# Keep an independent copy so the "before" data is not changed below
copydf = df.copy()
# Before preprocessing
print(copydf)
# Fill missing Age and Salary values with the column mean
df['Age'] = df['Age'].fillna(df['Age'].mean())
df['Salary'] = df['Salary'].fillna(df['Salary'].mean())
# After preprocessing
print(df)

plt.figure(1)
# Country vs Purchased
plt.subplot(221)
plt.scatter(df['Country'], df['Purchased'])
plt.title('Country vs Purchased')
plt.grid(True)
# Age vs Purchased
plt.subplot(222)
plt.scatter(df['Age'], df['Purchased'])
plt.title('Age vs Purchased')
plt.grid(True)
# Salary vs Purchased
plt.subplot(223)
plt.scatter(df['Salary'], df['Purchased'])
plt.title('Salary vs Purchased')
plt.grid(True)
plt.subplots_adjust(top=0.92, bottom=0.08, left=0.10, right=0.95,
                    hspace=0.75, wspace=0.5)
plt.show()
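One way to draw the Country vs Purchased heatmap is to count the rows in each Country/Purchased combination and colour the resulting grid. The following is a minimal sketch, assuming the 20 rows from the question; the variable names are illustrative, and the Agg backend line is only there so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; remove this line for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# (Country, Purchased) pairs copied from the 20 rows of the table in the question.
pairs = [
    ("France", "No"), ("Spain", "Yes"), ("Germany", "No"), ("Spain", "No"),
    ("Germany", "Yes"), ("France", "Yes"), ("Spain", "No"), ("France", "Yes"),
    ("Germany", "No"), ("France", "Yes"), ("France", "No"), ("Spain", "Yes"),
    ("Germany", "No"), ("Spain", "No"), ("Germany", "Yes"), ("France", "Yes"),
    ("Spain", "No"), ("France", "Yes"), ("Germany", "No"), ("France", "Yes"),
]
pairs_df = pd.DataFrame(pairs, columns=["Country", "Purchased"])

# Count how many rows fall into each Country/Purchased combination.
counts = pd.crosstab(pairs_df["Country"], pairs_df["Purchased"])
print(counts)

fig, ax = plt.subplots()
image = ax.imshow(counts.values, cmap="Blues")
ax.set_xticks(range(len(counts.columns)))
ax.set_xticklabels(counts.columns)
ax.set_yticks(range(len(counts.index)))
ax.set_yticklabels(counts.index)
# Write the count inside each cell so the colours can be read exactly.
for i in range(counts.shape[0]):
    for j in range(counts.shape[1]):
        ax.text(j, i, int(counts.values[i, j]), ha="center", va="center")
ax.set_title("Country vs Purchased (row counts)")
fig.colorbar(image, ax=ax)
# Call fig.savefig(...) or plt.show() here to view the chart.
```

The same imshow pattern can be applied to a numeric correlation matrix or to a boolean missing-value mask for the other two charts.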

Vowpal Wabbit - precision recall f-measure

How do you usually get precision, recall and f-measure from a model created in Vowpal Wabbit on a classification problem?
Are there any available scripts or programs that are commonly used for this with vw's output?
To make a minimal example using the following data in playtennis.txt :
2 | sunny 85 85 false
2 | sunny 80 90 true
1 | overcast 83 78 false
1 | rain 70 96 false
1 | rain 68 80 false
2 | rain 65 70 true
1 | overcast 64 65 true
2 | sunny 72 95 false
1 | sunny 69 70 false
1 | rain 75 80 false
1 | sunny 75 70 true
1 | overcast 72 90 true
1 | overcast 81 75 false
2 | rain 71 80 true
I create the model with:
vw playtennis.txt --oaa 2 -f playtennis.model --loss_function logistic
Then, I get predictions and raw predictions of the trained model on the training data itself with:
vw -t -i playtennis.model playtennis.txt -p playtennis.predict -r playtennis.rawp
Going from here, what scripts or programs do you usually use to get precision, recall and f-measure, given training data playtennis.txt and the predictions on the training data in playtennis.predict?
Also, if this were a multi-label classification problem (each instance can have more than one target label, which vw can also handle), would your proposed scripts or programs be able to process it?
Given that you have a pair of 'predicted vs actual' value for each example, you can use Rich Caruana's KDD perf utility to compute these (and many other) metrics.
In the case of multi-class, you should simply consider every correctly classified case a success and every class-mismatch a failure to predict correctly.
Here's a more detailed recipe for the binary case:
# get the labels into *.actual (correct) file
$ cut -d' ' -f1 playtennis.txt > playtennis.actual
# paste the actual vs predicted side-by-side (+ cleanup trailing zeros)
$ paste playtennis.actual playtennis.predict | sed 's/\.0*$//' > playtennis.ap
# convert original (1,2) classes to binary (0,1):
$ perl -pe 's/1/0/g; s/2/1/g;' playtennis.ap > playtennis.ap01
# run perf to determine precision, recall and F-measure:
$ perf -PRE -REC -PRF -file playtennis.ap01
PRE 1.00000 pred_thresh 0.500000
REC 0.80000 pred_thresh 0.500000
PRF 0.88889 pred_thresh 0.500000
Note that as Martin mentioned, vw uses the {-1, +1} convention for binary classification, whereas perf uses the {0, 1} convention so you may have to translate back and forth when switching between the two.
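If perf is not available, the same three numbers can be computed directly from the actual and predicted label files. The following is a minimal sketch, not a replacement for perf: the function names are hypothetical, the labels in the example are made up for illustration (they are not the playtennis predictions), and read_labels assumes one label at the start of each line.

```python
def precision_recall_f1(actual, predicted, positive=1):
    """Return (precision, recall, F1), treating `positive` as the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def read_labels(path):
    """Read one label per line; parse via float so '2' and '2.000000' both work."""
    with open(path) as handle:
        return [int(float(line.split()[0])) for line in handle if line.strip()]

# Made-up labels for illustration only.
actual = [2, 2, 1, 1, 1, 2, 1, 2, 1, 2]
predicted = [2, 2, 1, 1, 1, 2, 1, 2, 1, 1]
precision, recall, f1 = precision_recall_f1(actual, predicted, positive=2)
print(f"precision={precision:.5f} recall={recall:.5f} f1={f1:.5f}")
```

With real files you would call read_labels on playtennis.actual and playtennis.predict instead of using the inline lists. For a multi-class problem, calling the function once per class with a different positive value gives per-class figures.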
For binary classification, I recommend using labels +1 (play tennis) and -1 (don't play tennis) with --loss_function=logistic (although --oaa 2 and labels 1 and 2 can be used as well). VW then reports the logistic loss, which may be a more informative evaluation measure than accuracy, precision, recall or F1, depending on the application. If you want 0/1 loss (i.e. "one minus accuracy"), add --binary.
For precision, recall, f1-score, auc and other measures, you can use the perf tool as suggested in arielf's answer.
For standard multi-class classification (one correct class for each example), use --oaa N --loss_function=logistic and VW will report the 0/1 loss.
For multi-label multi-class classification (more correct labels per example allowed), you can use --multilabel_oaa N (or convert each original example into N binary-classification examples).

SPSS Calculate percentiles with weighted average

My background is in databases and SQL coding. I’ve used the CTABLES feature in SPSS a little, mostly for calculating percentiles, which is slow in SQL. But now I have a data set where I need to calculate percentiles for a weighted average, which is not as straightforward, and I can’t figure out whether it’s possible in SPSS.
I have data similar to the following
Country Region District Units Cost per Unit
USA Central DivisionQ 10 3
USA Central DivisionQ 12 2.5
USA Central DivisionQ 25 1.5
USA Central DivisionQ 6 4
USA Central DivisionA 3 3.25
USA Central DivisionA 76 1.75
USA Central DivisionA 42 1.5
USA Central DivisionA 1 8
USA Eastern DivisionQ 14 3
USA Eastern DivisionQ 25 2.5
USA Eastern DivisionQ 75 1.5
USA Eastern DivisionQ 9 4
USA Eastern DivisionA 100 3.25
USA Eastern DivisionA 4 1.75
USA Eastern DivisionA 33 1.5
USA Eastern DivisionA 17 8
452 51
For every possible segmentation (Country, Country-Region, Country-Region-District, Country-District, etc.) I want to get the average Cost per Unit, i.e. Cost per Unit weighted by Units, which is SUM(Units*CostPerUnit)/SUM(Units).
I also need the 10th, 25th, 50th, 75th and 90th percentiles for each possible segmentation.
The way I do this part in SQL is to extract all the rows in the segment, then sort and rank them by Cost per Unit. I compute a running sum of Units for each row and the ratio of that running sum to the total units; that ratio determines which row holds the Cost per Unit for each percentile. An example, for Country = USA and District = DivisionQ:
Country  Region   District   Units  Cost per Unit  Running Units  Running Units / Total Units  Percentile
USA      Central  DivisionQ  25     1.5            25             0.14                         10th
USA      Eastern  DivisionQ  75     1.5            100            0.56                         25th/50th
USA      Central  DivisionQ  12     2.5            112            0.63
USA      Eastern  DivisionQ  25     2.5            137            0.77                         75th
USA      Central  DivisionQ  10     3              147            0.83
USA      Eastern  DivisionQ  14     3              161            0.91                         90th
USA      Central  DivisionQ  6      4              167            0.94
USA      Eastern  DivisionQ  9      4              176            1
This takes a very long time to do for each segment. Is it possible to leverage SPSS to do the same thing more easily?
Use SPLIT FILE (Data > Split File) to define the groups and then use FREQUENCIES (Analyze > Descriptive Statistics > Frequencies) to calculate the statistics. Suppress the actual frequency tables with /FORMAT=NOTABLE.
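To cross-check whatever SPSS reports, the running-sum procedure described in the question can be sketched in a few lines of Python. This is only an illustration of that procedure, assuming the rule "take the first row whose running share of units reaches the percentile"; the function names are hypothetical, and SPSS may use a different interpolation rule.

```python
def weighted_average(rows):
    """rows: list of (units, cost_per_unit). Returns SUM(units*cost)/SUM(units)."""
    total_units = sum(units for units, _ in rows)
    return sum(units * cost for units, cost in rows) / total_units

def weighted_percentile(rows, fraction):
    """Cost of the first row whose running share of units reaches `fraction`."""
    ordered = sorted(rows, key=lambda row: row[1])  # sort by cost per unit
    total_units = sum(units for units, _ in ordered)
    running = 0
    for units, cost in ordered:
        running += units
        if running / total_units >= fraction:
            return cost
    return ordered[-1][1]

# (Units, Cost per Unit) for Country = USA, District = DivisionQ, from the question.
division_q = [(10, 3), (12, 2.5), (25, 1.5), (6, 4),
              (14, 3), (25, 2.5), (75, 1.5), (9, 4)]

print(round(weighted_average(division_q), 4))
for pct in (10, 25, 50, 75, 90):
    print(pct, weighted_percentile(division_q, pct / 100))
```

On this segment the percentiles come out as 1.5, 1.5, 1.5, 2.5 and 3, matching the rows marked in the worked table in the question.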

3D reconstruction, matlab

I have image A and image B from same camera.
Points in Image A
PA=[
1 2172 998.9
2 2405 225.2
3 1480 1420
4 1045 1342
5 3039 1789
6 3727 968.7
7 1038 443.1
8 3606 856.6
9 1248 520.1
10 2189 976.8
]
Points in Image B
PB=[
1 2363 1598
2 2551 840.7
3 1768 2045
4 1404 1985
5 3040 2335
6 3636 1485
7 1393 1142
8 3514 1379
9 1550 1199
10 2378 1575]
t=1e-4;
Fundamental matrix
[F, inliers] = ransacfitfundmatrix(x1, x2, t);
F=[ 5.12243654806919e-009 -5.65511649689218e-008 -3.90901140383986e-006
9.48853562184938e-008 4.56036186476569e-008 -0.00133231474573608
-0.000178137312702315 0.00112651242300972 1.10421882784367]
Camera file
focallength =18.6188 mm
format size
width =22.6791 mm
height=15.1130 mm
Image size
5184*3456 pixel
Principal point
x0=11.5399 mm
y0=07.8574 mm
lens distortion (ideal)
K1=0 mm
K2=0 mm
K3=0
P1=0mm
P2=0 mm
Homography
H = vgg_H_from_x_lin(x1,x2)
Question A: I want to get back PointsB, e.g.
PointsB(:,1) == H*x1(:,1)
The results are wrong. Why? Is anything missing?
More detail:
x2(:,1)'*F*x1(:,1) = -0.000644154818346676 % I guess that is OK.
H*x1(:,1) = [ 2240.66095080911
1522.92361373263
0.953866074561989 ] % why is the last component not 1?
PB(1,:) = [ 1 2363 1598 ] % what it should be
Question B: How can I get 3D points from the information above?
Any link or MATLAB code would be helpful.
How can I use vgg_X_from_xP_lin.m (3D point from image projections and cameras, linear)?
X = vgg_X_from_xP_lin(u,P,imsize) % what is u?
Are the two images taken with the same camera?
Question A: What you are looking for is point correspondences between the two images. One way to find corresponding points is local feature matching. There are many algorithms for detecting interest points and computing feature descriptors, such as SIFT, SURF, BRISK and FREAK.
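Separately from feature matching, the "why not 1" in the question may have a simple explanation worth checking: a homography acts on homogeneous coordinates, which are only defined up to scale, so the product has to be divided by its last component before it is compared with pixel coordinates. The following is a minimal sketch of that step, using the vector quoted in the question; the function name is hypothetical.

```python
def normalize_homogeneous(point):
    """Divide a homogeneous 2D point (x, y, w) by w so the last entry is 1."""
    x, y, w = point
    if w == 0:
        raise ValueError("point at infinity: w is zero")
    return (x / w, y / w, 1.0)

# Result of H*x1(:,1) quoted in the question.
mapped = (2240.66095080911, 1522.92361373263, 0.953866074561989)
x, y, w = normalize_homogeneous(mapped)
print(round(x, 1), round(y, 1), w)
```

After the division the point is roughly (2349.0, 1596.6), much closer to the measured (2363, 1598). The remaining difference is expected if the scene is not planar, because a single homography maps all points exactly only for a plane or a purely rotating camera.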
Question B: You can get the 3D points using triangulation. Also see the Direct Linear Transformation in Multiple View Geometry in Computer Vision by Hartley and Zisserman.
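Linear triangulation can be sketched as follows, assuming the two 3x4 camera matrices are already known. The cameras below are synthetic values chosen for illustration, not the camera in the question; in that setting the matrices would first have to be recovered, for example from an essential matrix built from F and the calibration. The function names are hypothetical.

```python
import numpy as np

def triangulate_point(x1, x2, P1, P2):
    """Linear (DLT) triangulation of one 3D point from two views.

    x1, x2: (x, y) image coordinates; P1, P2: 3x4 camera matrices.
    """
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point with camera matrix P and normalize to (x, y)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Synthetic cameras (assumed values, not the camera in the question).
K = np.array([[1000.0, 0.0, 500.0], [0.0, 1000.0, 400.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.5, 0.2, 4.0])
X_est = triangulate_point(project(P1, X_true), project(P2, X_true), P1, P2)
print(np.round(X_est, 6))
```

With noise-free projections the estimate reproduces the original point; with real measurements the result is a least-squares solution that is usually refined afterwards.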
