How can I create a function to randomly generate inclusions in an RVE to import into Abaqus

I am trying to create a function that will generate random inclusions in an RVE. I have tried 2D scripts, but I need 3D. A sketch of one possible approach is shown below.
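For illustration, here is a minimal sketch of one common approach, random sequential addition of non-overlapping spheres, assuming spherical inclusions in a cubic RVE; the function and parameter names are illustrative, not from any existing script, and the centres would still need to be turned into geometry through the Abaqus scripting interface:
import random

def generate_inclusions(n, radius, rve_size=1.0, max_tries=10000):
    """Randomly place up to n non-overlapping spheres inside a cubic RVE
    (random sequential addition); returns a list of (x, y, z) centres."""
    centres = []
    tries = 0
    while len(centres) < n and tries < max_tries:
        tries += 1
        # keep the whole sphere inside the cube
        c = tuple(random.uniform(radius, rve_size - radius) for _ in range(3))
        # accept only if it does not overlap any sphere placed so far
        if all(sum((a - b) ** 2 for a, b in zip(c, p)) >= (2 * radius) ** 2
               for p in centres):
            centres.append(c)
    return centres

centres = generate_inclusions(n=20, radius=0.05)
Each centre could then be used to create a spherical part in Abaqus, or written to a file that an Abaqus/CAE script reads.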

Related

Creating a simple Rcpp package with a dependency on other Rcpp packages

I am trying to improve my loop computation speed by using foreach, but there is a simple Rcpp function I defined inside this loop. I saved the Rcpp function as mproduct.cpp, and I load the function simply using
sourceCpp("mproduct.cpp")
The Rcpp function is a simple one that performs a matrix product in C++:
// [[Rcpp::depends(RcppArmadillo, RcppEigen)]]
#include <RcppArmadillo.h>
#include <RcppEigen.h>
// [[Rcpp::export]]
SEXP MP(const Eigen::Map<Eigen::MatrixXd> A, Eigen::Map<Eigen::MatrixXd> B){
    Eigen::MatrixXd C = A * B;
    return Rcpp::wrap(C);
}
So, the function in the Rcpp file is MP, short for matrix product. I need to perform the following foreach loop (I have simplified the code for illustration):
foreach(j = 1:n, .packages = 'Rcpp', .noexport = c("mproduct.cpp"), .combine = rbind) %dopar% {
    n <- 1000000
    A <- matrix(rnorm(n), 1000, 1000)
    B <- matrix(rnorm(n), 1000, 1000)
    S <- MP(A, B)
    return(S)
}
Since matrices A and B are large, I want to use foreach to reduce the computational cost.
However, the above code does not work; it gives me the error message:
task 1 failed - "NULL value passed as symbol address"
I added .noexport = c("mproduct.cpp") to follow suggestions from people who solved similar issues (Can't run Rcpp function in foreach - "NULL value passed as symbol address"), but somehow this does not solve my issue.
So I tried to install my Rcpp function as a library. I used the following code:
Rcpp.package.skeleton('mp',cpp_files = "<my working directory>")
but it returns a warning message:
The following packages are referenced using Rcpp::depends attributes however are not listed in the Depends, Imports or LinkingTo fields of the package DESCRIPTION file: RcppArmadillo, RcppEigen
so when I tried to install my package using
install.packages("<my working directory>",repos = NULL,type='source')
I got the following error:
Error in untar2(tarfile, files, list, exdir, restore_times) :
incomplete block on file
In R CMD INSTALL
Warning in install.packages :
installation of package ‘C:/Users/Lenovo/Documents/mproduct.cpp’ had non-zero exit status
So can someone help me work out how to 1) use foreach with the Rcpp function MP, or 2) install the Rcpp file as a package?
Thank you all very much.
The first step would be making sure that you are optimizing the right thing. For me, this would not be the case as this simple benchmark shows:
set.seed(42)
n <- 1000
A <- matrix(rnorm(n * n), n, n)
B <- matrix(rnorm(n * n), n, n)
MP <- Rcpp::cppFunction("SEXP MP(const Eigen::Map<Eigen::MatrixXd> A, Eigen::Map<Eigen::MatrixXd> B){
    Eigen::MatrixXd C = A * B;
    return Rcpp::wrap(C);
}", depends = "RcppEigen")
bench::mark(MP(A, B), A %*% B)[1:5]
#> # A tibble: 2 x 5
#> expression min median `itr/sec` mem_alloc
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt>
#> 1 MP(A, B) 277.8ms 278ms 3.60 7.63MB
#> 2 A %*% B 37.4ms 39ms 22.8 7.63MB
So for me the matrix product via %*% is several times faster than the one via RcppEigen. However, I am using Linux with OpenBLAS for matrix operations, while you are on Windows, which often means reference BLAS for matrix operations. It might be that RcppEigen is faster on your system. I am not sure how difficult it is for Windows users to get a faster BLAS implementation (https://csgillespie.github.io/efficientR/set-up.html#blas-and-alternative-r-interpreters might contain some pointers), but I would suggest spending some time investigating this.
Now if you come to the conclusion that you do need RcppEigen or RcppArmadillo in your code and want to put that code into a package, you can do the following. Instead of Rcpp::Rcpp.package.skeleton() use RcppEigen::RcppEigen.package.skeleton() or RcppArmadillo::RcppArmadillo.package.skeleton() to create a starting point for a package based on RcppEigen or RcppArmadillo, respectively.

Delayed dask.dataframe.DataFrame.to_hdf computations crashing

I'm using Dask to execute the following logic:
read in a master delayed dd.DataFrame from multiple input files (one pd.DataFrame per file)
perform multiple query calls on the master delayed DataFrame
use DataFrame.to_hdf to save all dataframes from the DataFrame.query calls.
If I use compute=False in my to_hdf calls and feed the list of Delayeds returned by each to_hdf call to dask.compute then I get a crash/seg fault. (If I omit compute=False everything runs fine). Some googling gave me some information about locks; I tried adding a dask.distributed.Client with a dask.distributed.Lock fed to to_hdf, as well as a dask.utils.SerializableLock, but I couldn't solve the crash.
Here's the flow:
import uproot
import dask
import dask.dataframe as dd
from dask.delayed import delayed

def delayed_frame(files, tree_name):
    """create master delayed DataFrame from multiple files"""
    @delayed
    def single_frame(file_name, tree_name):
        """read external file, convert to pandas.DataFrame, return it"""
        tree = uproot.open(file_name).get(tree_name)
        return tree.pandas.df()  ## this is the pd.DataFrame
    return dd.from_delayed([single_frame(f, tree_name) for f in files])

def save_selected_frames(df, selections, prefix):
    """perform queries on a delayed DataFrame and save HDF5 output"""
    queries = {sel_name: df.query(sel_query)
               for sel_name, sel_query in selections.items()}
    computes = []
    for dfname, df in queries.items():
        outname = f"{prefix}_{dfname}.h5"
        computes.append(df.to_hdf(outname, f"/{prefix}", compute=False))
    dask.compute(*computes)

selections = {"s1": "(A == True) & (N > 1)",
              "s2": "(B == True) & (N > 2)",
              "s3": "(C == True) & (N > 3)"}

from glob import glob
df = delayed_frame(glob("/path/to/files/*.root"), "selected")
save_selected_frames(df, selections, "selected")
## expect output files:
## - selected_s1.h5
## - selected_s2.h5
## - selected_s3.h5
Maybe the HDF library that you're using isn't entirely threadsafe? If you don't mind losing parallelism then you could add scheduler="single-threaded" to the compute call.
You might want to consider using Parquet rather than HDF. It has fewer issues like this.
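For illustration, a minimal sketch of both suggestions, reusing the computes list, queries dict, and prefix from the question's save_selected_frames (assumed names from the question's code, not a tested fix):
# Option 1: give up parallelism and run the combined graph on one thread
dask.compute(*computes, scheduler="single-threaded")

# Option 2: write Parquet instead of HDF5 (one output directory per query)
for dfname, q in queries.items():
    q.to_parquet(f"{prefix}_{dfname}.parquet")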

Dask: what function variable is best to choose for visualize()

I am trying to understand Dask delayed more deeply, so I decided to work through the examples here. I modified some of the code to reflect how I want to use Dask (see below), but the results are different from what I expected, i.e. a tuple vs. a list. When I try to apply .visualize() to see what the execution graph looks like, I get nothing.
I worked through all the examples in 'delayed.ipynb' and they all work properly, including all the visualizations. I then modified the 'for' loop for one example:
for i in range(256):
    x = inc(i)
    y = dec(x)
    z = add(x, y)
    zs.append(z)
to a function call that uses a list comprehension. The result is a variation on the original working example.
%%time
import time
import random
from dask import delayed, compute, visualize

zs = []

@delayed
def inc(x):
    time.sleep(random.random())
    return x + 1

@delayed
def dec(x):
    time.sleep(random.random())
    return x - 1

@delayed
def add(x, y):
    time.sleep(random.random())
    return x + y

def myloop(x):
    x.append([add(inc(i), dec(inc(i))) for i in range(8)])
    return x

result = myloop(zs)
final = compute(*result)
print(final)
I have tried printing out result (the function call), which provides the expected list of delayed calls, but when I print the result of compute I unexpectedly get the desired list as part of a tuple. Why don't I get just a list?
When I try to 'visualize' the execution graph I get nothing at all. I was expecting to see as many nodes as are in the generated list.
I did not think I made any significant modifications to the example so what am I not understanding?
The visualize function has the same call signature as compute. So if your compute(*result) call works, then try visualize(*result).
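For example, a minimal sketch assuming the same result list from the question (visualize writes mydask.png by default and requires the graphviz package to be installed):
from dask import visualize  # already imported in the question's code

# render the task graph for the same delayed objects that compute() receives
visualize(*result)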

How to avoid error converting Cypher Query into NetworkX Graph in Python

I want to convert the results of a Neo4j query into a graph in Python using py2neo.
The link below provides detailed program code:
http://nicolewhite.github.io/neo4j-jupyter/hello-world.html
import networkx as nx
%matplotlib inline
results = %cypher MATCH p = (:Person)-[:LIKES]->(:Drink) RETURN p
g = results.get_graph()
nx.draw(g)
The get_graph function throws an error:
TypeError: error() takes exactly 2 arguments (4 given).
The function expects two arguments, but I do not understand what is wrong with the Cypher query. Thank you.

Biopython: Can't use .count()

My goal here is to get the number of times 'g' appears in a DNA sequence.
I imported a DNA sequence via Biopython using a list comprehension:
seq = [record for record in SeqIO.parse('sequences/hiv.gbk.rtf', 'fasta')]
I then tried using the .count() method on the newly created list comprehension variable:
print(seq.count('g'))
I get an error that reads:
NotImplementedError: SeqRecord comparison is deliberately not
implemented. Explicitly compare the attributes of interest.
Does anyone know what the deal is? Biopython's manual says all standard Python methods should work.
You are trying to apply count to a list. You would need to apply it to the sequence of each element, e.g.
print(seq[0].seq.count('g'))
or, if you want the total over all sequences:
print(sum([s.seq.count('g') for s in seq]))
Here is a minimal working example:
from Bio import SeqIO
txt = """>gnl|TC-DB|O60669|2.A.1.13.5 Monocarboxylate transporter 2 - Homo sapiens (Human).
MPPMPSAPPVHPPPDGGWGWIVVGAAFISIGFSYAFPKAVTVFFKEIQQIFHTTYSEIAW
>gnl|TC-DB|O60706|3.A.1.208.23 ATP-binding cassette sub-family C member 9 OS=Homo sapiens GN=ABCC9 PE=1 SV=2
MSLSFCGNNISSYNINDGVLQNSCFVDALNLVPHVFLLFITFPILFIGWGSQSSKVQIHH
>gnl|TC-DB|O60721|3.A.1.208.23 Sodium/potassium/calcium exchanger 1 OS=Homo sapiens GN=SLC24A1 PE=1 SV=1
MGKLIRMGPQERWLLRTKRLHWSRLLFLLGMLIIGSTYQHLRRPRGLSSLWAAVSSHQPI
>gnl|TC-DB|O60779|2.A.1.13.5 Thiamine transporter 1 (THTR-1) (ThTr1) (Thiamine carrier 1) (TC1) - Homo sapiens (Human).
MDVPGPVSRRAAAAAATVLLRTARVRRECWFLPTALLCAYGFFASLRPSEPFLTPYLLGP"""
filename = 'sequences.fa'
with open(filename, 'w') as f:
    f.write(txt)

seqs = [record for record in SeqIO.parse(filename, 'fasta')]
print(sum([s.seq.count('P') for s in seqs]))
>>> 21
print(seqs[0].seq.count('P'))
>>> 9
