VGG features extraction in certain format - machine-learning

I'm trying to get this repo to work. I followed the instruction and get the sample data using this script (taken from the same repo):
#!/usr/bin/env sh
# This script downloads the trained S2VT VGG (RGB) model,
# associated vocabulary, and frame features for the validation set.
echo "Downloading Model and Data [~400MB] ..."
wget --no-check-certificate https://www.dropbox.com/s/wn6k2oqurxzt6e2/s2s_vgg_pstream_allvocab_fac2_iter_16000.caffemodel
wget --no-check-certificate https://www.dropbox.com/s/20mxirwrqy1av01/yt_allframes_vgg_fc7_val.txt
wget --no-check-certificate https://www.dropbox.com/s/v1lrc6leknzgn3x/yt_coco_mvad_mpiimd_vocabulary.txt
echo "Organizing..."
DIR="./snapshots"
if [ ! -d "$DIR" ]; then
mkdir $DIR
fi
mv s2s_vgg_pstream_allvocab_fac2_iter_16000.caffemodel $DIR"/s2vt_vgg_rgb.caffemodel"
echo "Done."
In the next step they said that I need to sample video frames and extract VGG features for the frames. I'm not exactly sure how to do this. I have followed the instruction on Caffe but the features are not in the same format.
So how do I extract the VGG features in the same format as the yt_allframes_vgg_fc7_val.txt?

This repo provides the script for extracting VGG features from videos -
https://github.com/jesu9/VGGFeatExtract
In particular, see the script video_demo.py. This will output mat files which you will have to convert to txt files.
VGG 16 layer model and prototxt files available at -
https://gist.github.com/ksimonyan/211839e770f7b538e2d8

Related

Is there a guide on porting edk2 to a new ARM64 platform?

I am new to EDK2.
For porting ekd2 firmware to a new ARM64 platform, it would be good to first get a minimum edk2 port which can run UEFI Shell at least, improvements can be added gradually based on that.
It seems that the first step is rather steep, e.g., how to determine a minimal set of "items" in .dsc and .fdf file for a platform? In my case, I would like to build the .fd for my platform and treat it as BL33 of TF-A, effectively I would like to build an edk2 firmware to replace u-boot.
It seems that such a guide is hard to find on the web. I found a old version of edk2 which contains some instructions, but apparently they are obsolete (not exist in latest master branch, while can be found in UDK branches such as UDK2014), and I am not sure why those documents are removed from master branch.
Currently I can build .fd for FVP (edk2-platforms/Platform/ARM/VExpressPkg/ArmVExpress-FVP-AArch64.dsc), and it seems that the build output FVP_AARCH64_EFI.fd is supposed to be treated as BL33. Theoretically this could be a prototype for my new ARM64 platform, but to me it's too complex to start with: the firmware is about 2.5MiB in size (as compare to 500K of u-boot), so I guess it's far from a "minimum" version. but it's hard to figure out what features to be removed (and how).
I am wondering if there is a detailed guide on such topic...
After 1 month of trial and error, today I managed to bring my ARM64 platform into a UEFI Shell environment. I treat it as my 1st milestone on the EDK2 journey. Below I will try to summarize the steps I took so far, as a tentative answer to my question above. Guidance/corrections/comments are welcomed.
Get familiar with UEFI/PI spec and EDK2 implementation by reading books/specs/articles. Well, UEFI/PI specs are thousands of pages long...how to start? My main reading list is:
"Beyond Bios--Developing with the Unified Extensible Firmware Interface", 3rd ed, by Vincent Zimmer, et al. As the authors explained, the book is a kind of high level summary of the thousands-paged specs. And I find that the book is well organized for a new comer to get familiar with various UEFI related concepts. The main purpose of the 1st read (before playing with edk2 code base) is to get familiar with concepts and architectural ideas, not the details yet. Related sections need to be consulted later when reading EDK2 implementations.
EDK2 specs, including:
EDKII User Manual
EDKII Build Specification
EDKII DSC/FDF/DEC/INF File Specification
Various articles on the web...
Get a reference platform which can correctly boot a FD image built from latest EDK2 source, and play with the boot manager and Shell environment a bit. In my case, I chose RPi4B. For me, this is very important, as the reference platform serves as a handrail during the whole process, that whenever I encounter bugs or have doubts, I check the source/log of the reference platform. This solves most of the problems I encountered. Btw, always generating "build log" and "build report" for both reference platform and the target platform, as the two files contains very detailed information for comparison and check. Consult the EDK2 build spec on how to generate these two files during build.
I use the following script to build for RPi4B platform:
#!/bin/bash
# https://github.com/tianocore/edk2-platforms#how-to-build-linux-environment
export WORKSPACE=/home/bruin/work/tianocore
export PACKAGES_PATH=$WORKSPACE/edk2:$WORKSPACE/edk2-platforms:$WORKSPACE/edk2-non-osi
pushd $WORKSPACE
rm -rf ./Build/RPi4
source edk2/edksetup.sh
echo "Building BaseTools..."
make -C edk2/BaseTools all
#sudo apt install acpica-tools # iasl
# pip install antlr4-python3-runtime # -Y EXECUTION_ORDER
echo "Building firmware for Pi4B..."
GCC5_AARCH64_PREFIX=aarch64-none-linux-gnu- build \
-n 4 \
-a AARCH64 \
-p Platform/RaspberryPi/RPi4/RPi4.dsc \
-t GCC5 \
-b NOOPT \
-v -d 9 -j RPi4-build.log \
-y RPi4-build-report.txt \
-Y PCD \
-Y LIBRARY \
-Y DEPEX \
-Y HASH \
-Y BUILD_FLAGS \
-Y FLASH \
-Y FIXED_ADDRESS \
-Y EXECUTION_ORDER \
all
How to use the build result RPI_EFI.fd on RPi4B, consult the following:
edk2-platforms/Platform/RaspberryPi/RPi4/Readme.md
readme.md inside https://github.com/pftf/RPi4/releases/download/v1.17/RPi4_UEFI_Firmware_v1.32.zip. btw, I need to replace the original start4.elf and fixup4.dat with the ones in the zip file, otherwise, the boot of RPi4 will fail, complaining something like below:
RpiFirmwareGetClockRate: Get Clock Rate return: ClockRate=0 ClockId=C
ASSERT [ArasanMMCHost] /home/bruin/work/tianocore/edk2-platforms/Platform/RaspberryPi/
Drivers/ArasanMmcHostDxe/ArasanMmcHostDxe.c(263): BaseFrequency != 0
It's worth to analysis the RPI_EFI.fd content to some extend, by using some UEFI utilities. I mainly use the GUI version UEFITool of sudo apt install uefitool uefitool-cli. Other tools are also available. The anotomy of RPI_EFI.fd is of help when reading EDK2 build specs for checking understanding of the concepts.
One special aspect of RPI_EFI.fd is that the 1st 128K is bl31.bin binary from ATF. I guess this is due to the special booting connfiguration methods for RPi. For my platform, I don't need such kind of packaging, I only need to build the UEFI image MY.fd, which is treated as BL33 image and packaged into fip.bin togehter with BL2 and BL31 images by ATF build script.
Another aspect to notice is the "reset vector" in the begining of the .fd file. This related to the entry point of UEFI image (and entry point of each EDK2 modules), as well as interpreting the BL instruction for AArch64. Basically, it can be summarized as below:
The first [Components] in RPI_EFI.fd is ArmPlatformPkg/PrePi/PeiUniCore.inf, which is of MODULE_TYPE = SEC.
What's this component: this is the first (and only) SEC (Security) module in RPi4. What the name PrePi and Pei implies?
... the PI spec is not tied to edk2 PEIMs, and I don't see where EDKII PEI modules are currently the only "acknowledged" silicon init environment. The edk2 tree itself seems to contain platforms that don't use the edk2 PEI module set at all, but (IIRC) jump from SEC to DXE. I believe "ArmPlatformPkg/PrePi" and "ArmVirtPkg/PrePi" are related to this.
--- https://listman.redhat.com/archives/edk2-devel-archive/2020-November/msg00021.html
Its entry point: all UEFI components have the same entry point (_ModuleEntryPoint).
By "component", it means either a UEFI driver and UEFI app, both are PE32 executables, usually with suffix .efi.
The .efis are converted from ELF executables (.dll) by GenFw tool: modifying the file headers.
To verify that "all components' entry point is _ModuleEntryPoint":
Check the .dll generating command line in build report (build -y <BUILD_REPORT_FILE>), we have two flags "aarch64-none-linux-gnu-gcc" -o xxx.dll -u _ModuleEntryPoint -Wl,-e,_ModuleEntryPoint ...:
-u: gcc --help -v|grep "undefined SYMBOL" gives -u SYMBOL --undefined SYMBOL: star with undefined reference to SYMBOL.
Wl,-e: ld --help|grep "entry" gives -e ADDRESS, --entry ADDRESS Set start address.
Check all .dll files that Entry point address == _ModuleEntryPoint: find . -type f -name "*.dll" -exec sh -c "readelf -a {} |grep -E 'Entry point address|_ModuleEntryPoint'" \;
Its entry point is the entry point of whole UEFI FD image (i.e., from bl33_base_addr jump to this _ModuleEntryPoint):
Topology of the UEFI Firmware File
A UEFI Firmware File (actually a UEFI Firmware Device - FD file) is a collection of UEFI binaries encapsulated into a single image. The format of this image is defined by the Platform Initialization Specification Volume 3. A Vector Table is located at the base of this file. A 'BL' branch instruction at the base of the firwmare (location of the Reset Entry into the Vector Table) will jump to the first 'SEC' module of the UEFI Firmware Image.
--- https://github.com/lzeng14/tianocore/wiki/ArmPkg-Debugging
To verify the statements above:
Disassember the reset vector (i.e., the 1st word) of generated .FD (we got offset=0x360):
$ xxd -l 4 -e TEST.fd <== dump 4 bytes in little endian
00000000: 140000d8 <== BL {PC}+(0xd8<<2); offset=0x360
Check the Entry point in .dll (we got offset=0x240):
$ aarch64-none-elf-objdump -t ArmPlatformPrePiUniCore.dll|grep _ModuleEntryPoint
0000000000000240 g F .text 0000000000000000 _ModuleEntryPoint
$ readelf -h ArmPlatformPrePiUniCore.dll|grep Entry
Entry point address: 0x240
Compare contents of two files at different offset (we got identicial content):
$ xxd -s 0x360 -l 64 TEST.fd <== skip 0x360 bytes, dump 64 bytes
00000360: 901e 0094 050a 0094 ea03 00aa a1cd 0a58 ...............X
00000370: 0200 e0d2 2200 c0f2 0240 a0f2 0200 80f2 ...."....#......
00000380: c303 a0d2 e3ff 9ff2 6304 00d1 6300 028b ........c...c...
00000390: 0400 a1d2 0400 80f2 2000 03eb 8400 0054 ........ ......T
$ xxd -s 0x240 -l 64 ArmPlatformPrePiUniCore.dll <== skip 0x240 bytes
00000240: 901e 0094 050a 0094 ea03 00aa a1cd 0a58 ...............X
00000250: 0200 e0d2 2200 c0f2 0240 a0f2 0200 80f2 ...."....#......
00000260: c303 a0d2 e3ff 9ff2 6304 00d1 6300 028b ........c...c...
00000270: 0400 a1d2 0400 80f2 2000 03eb 8400 0054 ........ ......T
Prepare an empty pkg, and make it build ok. The main purpuse is to do some exercise with EDK2 build system, and use the empty pkg as the start point for the new platform.
Make a copy of RaspberryPi.dec, change all gRaspberry to gMyPlatform.
Make a copy of RPi4.dsc and RPi4.fdf, and comment out all stuff in DSC and FDF file.
Replace all GUIDs in DSC/FDF/DEC files, generating new ones using online guid generator.
Note that PCD are declared in DEC files, and DEC files are refered by modules (INF files). As the empty package contains no module, no PCD definition will be available in FDF. So for a success build of the empty package, we need to comment out all PCD reference in FDF.
The NOOPT build command for MyPlatform is as below:
#!/bin/bash
export WORKSPACE=/home/bruin/work/tianocore
export PACKAGES_PATH=$WORKSPACE/edk2:$WORKSPACE/edk2-platforms:$WORKSPACE/edk2-non-osi
pushd $WORKSPACE
source edk2/edksetup.sh
echo "Building BaseTools..."
make -C edk2/BaseTools all
echo "Building UEFI firmware for MyPlatform..."
GCC5_AARCH64_PREFIX=aarch64-none-linux-gnu- build \
-n 4 \
-a AARCH64 \
-p Platform/MyCorp/MyPlatform/MyPlatform.dsc \
-t GCC5 \
-b NOOPT \
-v -d 9 -j MyPlatform-build.log \
-y MyPlatform-build-report.txt \
-Y EXECUTION_ORDER \
-Y PCD \
-Y LIBRARY \
-Y DEPEX \
-Y HASH \
-Y BUILD_FLAGS \
-Y FLASH \
-Y FIXED_ADDRESS \
all
popd
Add the 1st component ArmPlatformPrePiUniCore. This component is to prepare the HOBs for DXE phae. The main purpose is to get serial port working and memory config correct. Another purpose of this step is to familiar with steps for adding a component/module/lib. Below is a brief summary of the steps:
Uncomment the module's INF into both DSC ([Components] section), and FDF ([FV.FVMAIN_COMPACT]).
Rebuild the pkg, and resolve all Instance of library class [xxxLib] is not found errors reported, by updating [LibraryClasses] sections of DSC.
This step is a repeating process for dozens of times.
Some lib-class has multiple lib-instances, making sure choose the appropriate lib-instance (ref the build-report of RPi4).
if encounter ModuleEntryPoint.iiii:31: Error: immediate out of range: enable gArmTokenSpaceGuid.PcdFdBaseAddress and gArmTokenSpaceGuid.PcdFdSize in FDF.
if encounter undefined reference to _gPcd_BinaryPatch_PcdSerialClockRate: set PcdSerialClockRate in [PcdsPatchableInModule] section in DSC. FIXME: why? ref.
Check the PCDs listed in build log: inspect any abnormal PCD values, and supply correct values.
Customize platform-specific drivers or libraries.
SerialPortLib: locate the lib-class header file (MdePkg/Include/Library/SerialPortLib.h) by find edk2 -type f -name "*.dec" -exec grep -Hn SerialPortLib. The following functions are required:
SerialPortInitialize()
SerialPortWrite()
SerialPortRead()
SerialPortPoll()
SerialPortSetControl(): RETURN_UNSUPPORTED
SerialPortGetControl(): RETURN_UNSUPPORTED
SerialPortSetAttributes(): RETURN_UNSUPPORTED
ArmPlatformLib: interface header at Include/Library/ArmPlatformLib.h. The following functions are required:
ArmPlatformGetCorePosition(): return cpu idx in the cluster given the MPIDR value. this function is used in _ModuleEntryPoint for setting stack for secondary cores. Assuming one cluster for now.
ArmPlatformIsPrimaryCore()
ArmPlatformGetPrimaryCoreMpId()
ArmPlatformGetBootMode()
ArmPlatformPeiBootAction()
ArmPlatformInitialize()
ArmPlatformGetVirtualMemoryMap()
ArmPlatformGetPlatformPpiList()
etc...
Uncomment more modules in DSC/FDF, module by module...For driver/libs which are RPi platform specific, we can:
either search the edk2/edk2-platform for similiar driver or lib instances, or
copy the RPi4 implementation and comment out most of the content, make the pkg build success first, and then bug fixing.
Debugging: my current main debugging method is through adding "printf()", i.e., the edk2 macro DEBUG((DEBUG_INFO,)). One needs to set gEfiMdePkgTokenSpaceGuid.PcdDebugPrintErrorLevel to an appropriate value to see more debug info.

How visualize j48 tree in weka

I have an output tree in weka but can't view it (right click ...). Is there a tool to generate the resulting tree in an understandable way from the copy of the log (figures)?
The above textual representation cannot be converted into other formats, unless you write your own parser.
However, if you use the -g option on the command-line, the tree will get output on stdout in dot-notation. You can then take this output and convert it into other formats, like PNG or PDF using the GraphViz software.
You can run Weka from the command line if you have java installed.
On my Windows machine from the Weka-3-9-5 directory:
C:\Weka-3-9-5> java -cp weka.jar weka.classifiers.trees.J48 -C 0.25 -M 2 -t .\data\iris.arff
This gives you the output that you current have with the trees. However:
C:\Weka-3-9-5> java -cp weka.jar weka.classifiers.trees.J48 -C 0.25 -M 2 -t .\data\iris.arff -g
gives you a different format:
digraph J48Tree {
N0 [label="petalwidth" ]
...
}
and you can feed this to GraphViz to get a nice printed tree.
I put the digraph output into a tree.txt file and then generated a png image file through GraphViz:
C:\GraphViz> dot -Tpng tree.txt > tree.png

Invalid JPEG data, size 43 error in building part of retraining inceptionv3 of tensorflow for poets

I am trying to train my model of image recognition using tensorflow for poets .
It is getting build for the example case of flower data and generates bottle necks too but when i try to retrain it for another test case (where i made four folders namely Chandler,Darth,Ross,Joey in a folder named tf_files this folder is present in my home)
when i run this command
python tensorflow/examples/image_retraining/retrain.py \
--bottleneck_dir=/tf_files/bottlenecks \
--how_many_training_steps 500 \
--model_dir=/tf_files/inception \
--output_graph=/tf_files/retrained_graph.pb \
--output_labels=/tf_files/retrained_labels.txt \
--image_dir /tf_files
I am getting this error
InvalidArgumentError (see above for traceback): Invalid JPEG data, size
43[[Node: DecodeJpeg = DecodeJpeg[acceptable_fraction=1, channels=3,
dct_method="", fancy_upscaling=true, ratio=1, try_recover_truncated=false,
_device="/job:localhost/replica:0/task:0/cpu:0"](_recv_DecodeJpeg
/contents_0)]]
and many warning
But when i run the above command just for flower standard example
python tensorflow/examples/image_retraining/retrain.py \
--bottleneck_dir=/tf_files/bottlenecks \
--how_many_training_steps 500 \
--model_dir=/tf_files/inception \
--output_graph=/tf_files/retrained_graph.pb \
--output_labels=/tf_files/retrained_labels.txt \
--image_dir /flower_photos
I am not getting any error .My model gets retrained and i get a prediction
Being very new in this field please guide where i am wrong be its installation mistake or something
The error that you get is because there might be a few images in your dataset that are of png format. On the other hand, the flower dataset has all images of jpeg format.
Also, the retrain.py is written to handle images of jpeg format. You might want to follow this thread to know more about handling png images

What's the /path/to/initial/clusters parameter means in Mahout 0.5 kmeans example?

I tried to run the kmeans example in Mahout 0.5, but failed! I found in kmeans.props that it required a strange parameter, -c, which means path_to_initial_clusters.
What's this stuff? How could I prepare for it?
kmeans.props:
The following parameters must be specified
i|input = /path/to/input
c|clusters = /path/to/initial/clusters
So mahout cannot needs input in specific format to carry out its clustering algorithm.
So have a look at
seq2sparse: : Sparse Vector generation from Text sequence files
seqdirectory: : Generate sequence files (of Text) from a directory
As an example say for Reuters 21587 data set.
Following are the steps :
1.mahout seqdirectory -c UTF-8
-i examples/reuters-extracted/ -o reuters-seqfiles
2.mahout seq2sparse -i reuters-seqfiles/ -o reuters-vectors -ow
3.mahout kmeans -i reuters-vectors/tfidf-vectors/ \
-c reuters-initial-clusters \
-o reuters-kmeans-clusters \
-dm org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure \
-cd 1.0 -k 20 -x 20 -cl
Hope it helps
K-means requires initial clusters in order to iteratively update the centroid (which is the center of a cluster) until it converge.
-c, path_to_initial_clusters ask you just to give a directory for mahout to store its initial clusters.
You can specify any path for mahout to store the initial clusters and mahout will compute the initial clusters and store in the directory. Or you can compute the initial cluster by canopy clustering or other method, and tell mahout the directory of the initial cluster you have computed to initialize K-means clustering.

Can't read Mahout generated sequence files with hadoop streaming

I am trying to stream a sequence file generated by one of the Mahout examples to see its contents:
hadoop jar hadoop-streaming-0.20.2-cdh3u0.jar \
-input /tmp/mahout-work-me/20news-bydate/bayes-test-input-output/ \
-output /tmp/me/mm \
-mapper "cat" \
-reducer "wc -l" \
-inputformat SequenceFileAsTextInputFormat
The job starts successfully and eventually dies with:
11/11/30 21:08:39 INFO streaming.StreamJob: map 0% reduce 0%
11/11/30 21:09:17 INFO streaming.StreamJob: map 100% reduce 100%
java.lang.RuntimeException: java.io.IOException: WritableName can't load class: org.apache.mahout.common.StringTuple
I wonder if something is wrong with my streaming jar file, if I I need to point explicitly to the Mahout jar that has this class (tried setting HADOOP_CLASSPATH to the location of mahout-core-0.5-cdh3u2.jar but did not work), or maybe even something else?
Any help is appreciated. Thanks.
Add this option:
-libjars mahout-core-0.5-cdh3u2.jar

Resources