PETSc vectorize operations with neighboring vector values - vectorization

I'm implementing finite difference algorithm from uFDTD book. Many FDM equations involve operations on adjoined vector elements.
For example, an update equation for electric field
ez[m] = ez[m] + (hy[m] - hy[m-1]) * imp0
uses adjoined vector values hy[m] and hy[m-1].
How can I implement these operations in PETSc efficiently? Is there something beyond local vector loops and scatterers?

If my goal was efficiency, I would call a stencil engine. There are many many many papers, and sometimes even open source code, for example, Devito. The idea is that PETSc manages the data structure and parallelism. Then you can feed the local data brick to your favorite stencil computer.

Related

machine learning algorithm with list of arrays as data set

i'm new in data science and i'm searching for machine learning algorithm that take data set as List of arrays each array have sequence of floats data
A little bit of context: we have some angels that took from user motion ,
by these angels we determines if the user make the correct motion or not ,
the motion represented in our system in list of array each array has sequence of angels
any help please ? i searched for a lot of time but have no result !
Check out neupy. It is a great library for new machine learning users. I would suggest just the standard back propagation algorithm with momentum. It has been proven that newer adaptive learning techniques don't do as well as the simple gradient back propagation algorithm with momentum.
It is easy to implement. It would be implemented for example using the following code,
A: Create data set
x = np.zeros((len(list[0]),len(list)))
for i in np.arange(len(list)):
for j in np.arange(len(list[0]):
x[i][j] = list[i][j]
This would be the input. Then you create the architecture
B: Create Architecture
network = layers.Input(len(list[0])) > layers.Sigmoid(int(len(list[0])/2)) > layers.Sigmoid(2)
C: Use Gradient Descent With Momentum
gdnet = layers.Algorithms.Momentum(network,momentum=0.1)
gdnet.train(x,y, max_iter=1000)
Where y is the movement of interest.
D: Predict Motion
y_predicted = gdnet(x)
In general, most libraries take in numpy arrays as inputs.
There are a number of ways to wrangle your data into that format. I find pandas (https://pandas.pydata.org/pandas-docs/stable/) to be the most convenient way. If you have the data in .csv file, excel sheet or some other common, structured format, pandas has functions for loading that in with no pain at all
If you give some more details (Are you using a machine learning library (like sci-kit), what format the data is in) i can be of more help.

Feed a complex-valued image into Neural network (tensorflow)

I'm working on a project which tries to "learn" a relationship between a set of around 10 k complex-valued input images (amplitude/phase; real/imag) and a real-valued output-vector with 48 entries. This output-vector is not a set of labels, but a set of numbers which represents the best parameters to optimize the visual impression of the given complex-valued image. These parameters are generated by an algorithm. It's possible, that there is some noise in the data (comming from images and from the algorithm which generates the parameter-vector)
Those parameters more-less depends on the FFT (fast-fourier-transform) of the input image. Therfore I was thinking of feeding the network (5 hidden-layers, but architecture shouldn't matter right now) with a 1D-reshaped version of the FFT(complexImage) - some pseudocode:
// discretize spectrum
obj_ft = fftshift(fft2(object));
obj_real_2d = real(obj_ft);
obj_imag_2d = imag(obj_ft);
// convert 2D in 1D rows
obj_real_1d = reshape(obj_real_2d, 1, []);
obj_imag_1d = reshape(obj_imag_2d, 1, []);
// create complex variable for 1d object and concat
obj_complx_1d(index, :) = [obj_real_1d obj_imag_1d];
opt_param_1D(index, :) = get_opt_param(object);
I was wondering if there is a better approach for feeding complex-valued images into a deep-network. I'd like to avoid the use of complex gradients, because it's not really necessary?! I "just" try to find a "black-box" which outputs the optimized parameters after inserting a new image.
Tensorflow gets the input: obj_complx_1d and output-vector opt_param_1D for training.
There are several ways you can treat complex signals as input.
Use a transform to make them into 'images'. Short Time Fourier Transforms are used to make spectrograms which are 2D. The x-axis being time, y-axis being frequency. If you have complex input data, you may choose to simply look at the magnitude spectrum, or the power spectral density of your transformed data.
Something else that I've seen in practice is to treat the in-phase and quadrature (real/imaginary) channels separate in early layers of the network, and operate across both in higher layers. In the early layers, your network will learn characteristics of each channel, in higher layers it will learn the relationship between the I/Q channels.
These guys do a lot with complex signals and neural nets. In particular check out 'Convolutional Radio Modulation Recognition Networks'
https://radioml.com/research/
The simplest way to feed complex valued numbers with out using complex gradients in your models is to represent the complex values in a different representation. The two main ways are:
Magnitude/Angle components
Real/Imaginary components
I'll show this idea using magnitude/angle components. Assuming you have a 2d numpy array representing an image with shape = (WIDTH, HEIGHT)
import numpy as np
kSpace = np.fft.ifftshift(np.fft.fft2(img))
This would give you a 2D complex array. You can then transform the array into a
data = np.dstack((np.abs(kSpace), np.angle(kSpace)))
This array will be a numpy array with shape = (WIDTH, HEIGHT, 2). This array represents one complex valued image. For a set of images, make sure to concatenate them together to get an array of shape = (NUM_IMAGES, WIDTH, HEIGHT, 2)
I made a simple example of using tensorflow to learn an Fourier Transform with a simple neural network. You can find this example at https://github.com/michaelmendoza/learning-tensorflow

Discrete Wavelet Transform (Daubechies wavelet) of an array complex numbers

Say, I have a signal represented as an array of real numbers y = [1,2,0,4,5,6,7,90,5,6]. I can use Daubechies-4 coefficients D4 = [0.482962, 0.836516, 0.224143, -0.129409], and apply a wavelet transform to receive high- and low-frequencies of the signal. So, the high frequency component will be calculated like this:
high[v] = y[2*v]*D4[0] + y[2*v+1]*D4[1] + y[2*v+2]*D4[2] + y[2*v+3]*D4[3],
and the low frequency component can be calculated using other D4 coefs permutation.
The question is: what if y is complex array? Do I just multiply and add complex numbers to receive subbands, or is it correct to get amplitude and phase, treat each of them like a real number, do the wavelet transform for them, and then restore complex number array for each subband using formulas real_part = abs * cos(phase) and imaginary_part = abs * sin(phase)?
To handle the case of complex data, you're looking at the Complex Wavelet Transform. It's actually a simple extension to the DWT. The most common way to handle complex data is to treat the real and imaginary components as two separate signals and perform a DWT on each component separately. You will then receive the decomposition of the real and imaginary components.
This is commonly known as the Dual-Tree Complex Wavelet Transform. This can best be described by the figure below that I pulled from Wikipedia:
Source: Wikipedia
It's called "dual-tree" because you have two DWT decompositions happening in parallel - one for the real component and one for the imaginary. In the above diagram, g0/h0 represent the low-pass and high-pass components of the real part of the signal x and g1/h1 represent the low-pass and high-pass components of the imaginary part of the signal x.
Once you decompose the real and imaginary parts into their respective DWT decompositions, you can combine them to get the magnitude and/or phase and proceed to the next step or whatever you desire to do with them.
The mathematical proof regarding the correctness of this is outside the scope of what we're talking about, but if you would like to see how this got derived, I refer you to the canonical paper by Kingsbury in 1997 in the work Image Processing with Complex Wavelets - http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=835E60EAF8B1BE4DB34C77FEE9BBBD56?doi=10.1.1.55.3189&rep=rep1&type=pdf. Pay close attention to the noise filtering of images using the CWT - this is probably what you're looking for.

Kohonen Self Organizing Maps: Determining the number of neurons and grid size

I have a large dataset I am trying to do cluster analysis on using SOM. The dataset is HUGE (~ billions of records) and I am not sure what should be the number of neurons and the SOM grid size to start with. Any pointers to some material that talks about estimating the number of neurons and grid size would be greatly appreciated.
Thanks!
Quoting from the som_make function documentation of the som toolbox
It uses a heuristic formula of 'munits = 5*dlen^0.54321'. The
'mapsize' argument influences the final number of map units: a 'big'
map has x4 the default number of map units and a 'small' map has
x0.25 the default number of map units.
dlen is the number of records in your dataset
You can also read about the classic WEBSOM which addresses the issue of large datasets
http://www.cs.indiana.edu/~bmarkine/oral/self-organization-of-a.pdf
http://websom.hut.fi/websom/doc/ps/Lagus04Infosci.pdf
Keep in mind that the map size is also a parameter which is also application specific. Namely it depends on what you want to do with the generated clusters. Large maps produce a large number of small but "compact" clusters (records assigned to each cluster are quite similar). Small maps produce less but more generilized clusters. A "right number of clusters" doesn't exists, especially in real world datasets. It all depends on the detail which you want to examine your dataset.
I have written a function that, with the data set as input, returns the grid size. I rewrote it from the som_topol_struct() function of Matlab's Self Organizing Maps Toolbox into a R function.
topology=function(data)
{
#Determina, para lattice hexagonal, el número de neuronas (munits) y su disposición (msize)
D=data
# munits: número de hexágonos
# dlen: número de sujetos
dlen=dim(data)[1]
dim=dim(data)[2]
munits=ceiling(5*dlen^0.5) # Formula Heurística matlab
#munits=100
#size=c(round(sqrt(munits)),round(munits/(round(sqrt(munits)))))
A=matrix(Inf,nrow=dim,ncol=dim)
for (i in 1:dim)
{
D[,i]=D[,i]-mean(D[is.finite(D[,i]),i])
}
for (i in 1:dim){
for (j in i:dim){
c=D[,i]*D[,j]
c=c[is.finite(c)];
A[i,j]=sum(c)/length(c)
A[j,i]=A[i,j]
}
}
VS=eigen(A)
eigval=sort(VS$values)
if (eigval[length(eigval)]==0 | eigval[length(eigval)-1]*munits<eigval[length(eigval)]){
ratio=1
}else{
ratio=sqrt(eigval[length(eigval)]/eigval[length(eigval)-1])}
size1=min(munits,round(sqrt(munits/ratio*sqrt(0.75))))
size2=round(munits/size1)
return(list(munits=munits,msize=sort(c(size1,size2),decreasing=TRUE)))
}
hope it helps...
Iván Vallés-Pérez
I don't have a reference for it, but I would suggest starting off by using approximately 10 SOM neurons per expected class in your dataset. For example, if you think your dataset consists of 8 separate components, go for a map with 9x9 neurons. This is completely just a ballpark heuristic though.
If you'd like the data to drive the topology of your SOM a bit more directly, try one of the SOM variants that change topology during training:
Growing SOM
Growing Neural Gas
Unfortunately these algorithms involve even more parameter tuning than plain SOM, but they might work for your application.
Kohenon has written on the issue of selecting parameters and map size for SOM in his book "MATLAB Implementations and Applications of the Self-Organizing Map". In some cases, he suggest the initial values can be arrived at after testing several sizes of the SOM to check that the cluster structures were shown with sufficient resolution and statistical accuracy.
my suggestion would be the following
SOM is distantly related to correspondence analysis. In statistics, they use 5*r^2 as a rule of thumb, where r is the number of rows/columns in a square setup
usually, one should use some criterion that is based on the data itself, meaning that you need some criterion for estimating the homogeneity. If a certain threshold would be violated, you would need more nodes. For checking the homogeneity you would need some records per node. Agai, from statistics you could learn that for simple tests (small number of variables) you would need around 20 records, for more advanced tests on some variables at least 8 records.
remember that the SOM represents a predictive model. So validation is the key, absolutely mandatory. Yet, validation of predictive models (see typeI / II error entry in Wiki) is a subject on its own. And the acceptable risk as well as the risk structure also depend fully on your purpose.
You may test the dynamics of the error rate of the model by reducing its size more and more. Then take the smallest one with acceptable error.
It is a strength of the SOM to allow for empty nodes. Yet, there should not be too much of them. Let me say, less than 5%.
Taken all together, from experience, I would recommend the following criterion a minimum of the absolute number of 8..10 records, but those should not be more than 5% of all clusters.
Those 5% rule is of of course a heuristics, which however can be justified by the general usage of the confidence level in statistical tests. You may choose any percentage from 1% to 5%.

How to normalize tf-idf vectors for SVMs?

I am using Support Vector Machines for document classification. My feature set for each document is a tf-idf vector. I have M documents with each tf-idf vector of size N.
Giving M * N matrix.
The size of M is just 10 documents and tf-idf vector is 1000 word vector. So my features are much larger than number of documents. Also each word occurs in either 2 or 3 documents. When i am normalizing each feature ( word ) i.e. column normalization in [0,1] with
val_feature_j_row_i = ( val_feature_j_row_i - min_feature_j ) / ( max_feature_j - min_feature_j)
It either gives me 0, 1 of course.
And it gives me bad results. I am using libsvm, with rbf function C = 0.0312, gamma = 0.007815
Any recommendations ?
Should i include more documents ? or other functions like sigmoid or better normalization methods ?
The list of things to consider and correct is quite long, so first of all I would recommend some machine-learning reading before trying to face the problem itself. There are dozens of great books (like ie. Haykin's "Neural Networks and Learning Machines") as well as online courses, which will help you with such basics, like those listed here: http://www.class-central.com/search?q=machine+learning .
Getting back to the problem itself:
10 documents is rows of magnitude to small to get any significant results and/or insights into the problem,
there is no universal method of data preprocessing, you have to analyze it through numerous tests and data analytics,
SVMs are parametrical models, you cannot use a single C and gamma values and expect any reasonable results. You have to check dozens of them to even get a clue "where to search". The most simple method for doing so is so called grid search,
1000 of features is a great number of dimensions, this suggest that using a kernel, which implies infinitely dimensional feature space is quite... redundant - it would be a better idea to first analyze simplier ones, which have smaller chance to overfit (linear or low degree polynomial)
finally is tf*idf a good choice if "each word occurs in 2 or 3 documents"? It can be doubtfull, unless what you actually mean is 20-30% of documents
finally why is simple features squashing
It either gives me 0, 1 of course.
it should result in values in [0,1] interval, not just its limits. So if this is a case you are probably having some error in your implementation.

Resources