Can't figure out error while running CVB0Driver in Mahout - mahout

I've been trying for the last few hours to get CVB0Driver working and after much trial and error I've come to the following error which I can't figure out. (Using mahout-integration 0.7)
java.lang.Error: Unresolved compilation problem:
at org.apache.mahout.math.function.Functions.mult(Functions.java:770)
at org.apache.mahout.clustering.lda.cvb.TopicModel.<init>(TopicModel.java:139)
at org.apache.mahout.clustering.lda.cvb.TopicModel.<init>(TopicModel.java:113)
at org.apache.mahout.clustering.lda.cvb.TopicModel.<init>(TopicModel.java:108)
at org.apache.mahout.clustering.lda.cvb.TopicModel.<init>(TopicModel.java:92)
at org.apache.mahout.clustering.lda.cvb.CachingCVB0Mapper.setup(CachingCVB0Mapper.java:103)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
Here's the code I'm using. Since I have yet to get it working, I'm not sure I'm on the right path, so feel free to comment if you see a mistake I'm making.
String [] args = {"-c","UTF-8","-i",input,"-o",output};
//create the seq file from the directory of text documents
ToolRunner.run(new SequenceFilesFromDirectory(),args);
//tokenize the documents
DocumentProcessor.tokenizeDocuments(new Path(inputDir), analyzer.getClass().asSubclass(Analyzer.class), tokenizedPath, conf);
//create tf vectors
DictionaryVectorizer.createTermFrequencyVectors(tokenizedPath,new Path(outputDir), DictionaryVectorizer.DOCUMENT_VECTOR_OUTPUT_FOLDER, conf, minSupport, maxNGramSize, minLLRValue, -1.0f, true, reduceTasks, chunkSize, sequentialAccessOutput, true);
//calculate the document frequencies
Pair<Long[], List<Path>> dfData = TFIDFConverter.calculateDF( new Path(outputDir, DictionaryVectorizer.DOCUMENT_VECTOR_OUTPUT_FOLDER), new Path(outputDir), conf, chunkSize);
//create tfidf vectors
TFIDFConverter.processTfIdf( new Path(outputDir , DictionaryVectorizer.DOCUMENT_VECTOR_OUTPUT_FOLDER), new Path(outputDir), conf, dfData, minDf, maxDFPercent, norm, true, sequentialAccessOutput, true, reduceTasks);
args = new String[]{"-i","tfidf-vectors/part-r-00000","-o","cvb"};
//create the matrix for cvb
RowIdJob.main(args);
CVB0Driver.run(conf, new Path("cvb/matrix"), mto, numTopics, numTerms, alpha, eta, maxIterations, iterationBlockSize, convergenceDelta, dictionaryPath, dto, msto, randomSeed, testFraction, numTrainThreads, numUpdateThreads, maxItersPerDoc, numReduceTasks, backfillPerplexity);
Any help would be much appreciated.

Okay, it seems this was a conflict between Maven and Eclipse projects.
I had recently imported the mahout-integration 0.7 source into Eclipse and somehow built it badly. There were issues with mahout-math, and my other project may have started referencing the badly built jar. I'm not too familiar with Maven, so I don't know whether that was the case or Eclipse just went a bit crazy.
After I deleted that project from Eclipse, everything started to run fine.
This question helped resolve this one - java-unresolved-compilation-problem
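Since the cause here was a badly built jar shadowing the good one, a quick way to see which copy of a class is actually being loaded is to print its code source. The following is a minimal sketch using only the JDK. The class name WhereIsClass and the helper locationOf are made up for illustration; in the situation above you would run it on your project's classpath and pass org.apache.mahout.math.function.Functions as the argument.

```java
import java.security.CodeSource;

public class WhereIsClass {
    // Returns the jar or directory a class was loaded from.
    // Classes loaded by the bootstrap loader (e.g. java.lang.String) have no code source.
    static String locationOf(Class<?> c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(no code source: JDK or dynamically defined class)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Default to this class itself so the sketch runs without arguments.
        String name = args.length > 0 ? args[0] : WhereIsClass.class.getName();
        System.out.println(name + " -> " + locationOf(Class.forName(name)));
    }
}
```

If the printed location points at an Eclipse project's output folder rather than the Maven repository jar, that would be consistent with the stale-build explanation above.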

Related

Properly Load Microsoft.ML model in Xamarin App?

I was learning a bit about how machine learning works and, following a tutorial, built a small C# application in which a model determines what species a particular bee is. It works well in that application. I was also developing an app project with picture-taking capabilities in mind, so I thought, "No reason the model I saved in the first project can't work in the second, right?" Apparently there is a reason, because here is the relevant code from the app project:
MLContext m_mlObj;
DataViewSchema m_modelSchema;
ITransformer m_loadedTrainedModel;
var folderPath = DependencyService.Get<IFileSystem>().GetExternalStorage();
var fileDir = Path.Combine(folderPath, "trainedModel.zip");
bool testValue = File.Exists(fileDir);
if (testValue)
{
    Console.WriteLine("File in fact exists.");
    try
    {
        m_loadedTrainedModel = m_mlObj.Model.Load(fileDir, out m_modelSchema);
    }
    catch (Exception e)
    {
        Console.WriteLine("\n\n\nInner exception: " + e.InnerException);
    }
}
else
{
    Console.WriteLine("File does not exist");
}
I have a breakpoint on the "m_loadedTrainedModel = m_mlObj.Model.Load(fileDir, out m_modelSchema);" line of course and it always triggers an exception.
The exception states:
---> System.TypeLoadException: Could not load type of field 'Microsoft.ML.Transforms.DnnRetrainTransformer:_tfInputShapes' (9) due to: Could not resolve type with token 01000060 from typeref (expected class 'Tensorflow.TensorShape' in assembly 'TensorFlow.NET, Version=0.20.1.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51') assembly:TensorFlow.NET, Version=0.20.1.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51 type:Tensorflow.TensorShape member:(null)
What I THINK this means is that it's looking for Tensorflow.TensorShape inside the TensorFlow.NET package and not finding it. If that's the case, maybe I don't have the correct package installed for my Xamarin project to load the model correctly? If not, what is the real underlying issue? The code seems straightforward, so I'm a bit perplexed as to why this is failing.
If the solution is blatantly going over my head, my apologies and thank you to anyone who is willing to help.
At the time of writing, TensorFlow is one of the components that ML.NET does not support on ARM.
https://devblogs.microsoft.com/dotnet/ml-net-june-updates-model-builder/#ml-net-on-arm
What you might want to consider in the meantime is deploying your model as a Web API and making requests to that API from your mobile app.
https://learn.microsoft.com/dotnet/machine-learning/how-to-guides/serve-model-web-api-ml-net

Errors - bootstrap GLM model - glmmBoot package

I'm trying to run a bootstrap resampling on a GLM model, but I keep running into errors and I can't find any solution online.
The first problem I ran into is "variable lengths differ", which appears when I try to use a normalized variable (sub.size_z) and a continuous variable transformed into a factor (hours). I checked my dataset and there are no NAs anywhere, so I don't know how to tackle the problem.
The second error is "Error: Naming mismatch from base to list of coefs", which happens when I try to run the bootstrap using a GLM without having transformed the variables I mentioned above.
Can anyone point me towards a solution? Thanks a lot!
Here's the code I used:
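# Load the dataset
data <- read.delim2("C:/Users/BRIZ_/Desktop/analisi scratching/Hypothesis 3/Hypothesis 3/database.txt")
library(car)
library(lme4)
library(MASS)
library(dplyr)
library(blmeco)
library(MuMIn)
library(lmerTest)
library(AER)
library(DHARMa)
library(glmmTMB)   # provides glmmTMB() and genpois()
library(glmmboot)  # provides bootstrap_model()
# Standardize sub.size and convert hour to a factor
sub.size_z <- scale(sub.size, center = TRUE, scale = TRUE)
hours <- as.factor(hour)
# Generalized Poisson model with a log(DURATION) offset
modelboot <- glmmTMB(sc.count ~ offset(log(DURATION)) + hours + general.activity + sub.size_z + approach5 + id, data = data, family = genpois(link = "log"))
summary(modelboot)
# Bootstrap the fitted model with 100 resamples
bootstrap_model(modelboot, base_data = data, resamples = 100)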

H2O POJO causing javac java.lang.IllegalArgumentException

I have a distributed random forest POJO model using the default model setting except for:
ntrees = 150
max_depth = 50
min_rows = 5
Here are the full settings:
buildModel 'drf', {"model_id":"drf-335270ee-8970-4855-b521-c4fb4ca184f5","training_frame":"frame_0.750","validation_frame":"frame_0.250","nfolds":0,"response_column":"DENIAL","ignored_columns":["tx_match_date"],"ignore_const_cols":true,"ntrees":"150","max_depth":"50","min_rows":"5","nbins":20,"seed":-1,"mtries":-1,"sample_rate":0.6320000290870667,"score_each_iteration":true,"score_tree_interval":0,"balance_classes":false,"nbins_top_level":1024,"nbins_cats":1024,"r2_stopping":1.7976931348623157e+308,"stopping_rounds":0,"stopping_metric":"AUTO","stopping_tolerance":0.001,"max_runtime_secs":0,"checkpoint":"","col_sample_rate_per_tree":1,"min_split_improvement":0.00001,"histogram_type":"AUTO","categorical_encoding":"AUTO","build_tree_one_node":false,"sample_rate_per_class":[],"binomial_double_trees":true,"col_sample_rate_change_per_level":1,"calibrate_model":false}
When I try to compile the pojo with:
$ javac -cp "h2o-genmodel.jar" -J-Xmx2g -J-XX:MaxPermSize=128m drf_335270ee_8970_4855_b521_c4fb4ca184f5.java
I get the following error.
An exception has occurred in the compiler (1.8.0_131). Please file a bug against the Java compiler via the Java bug reporting page (http://bugreport.java.com) after checking the Bug Database (http://bugs.java.com) for duplicates. Include your program and the following diagnostic in your report. Thank you.
java.lang.IllegalArgumentException
at java.nio.ByteBuffer.allocate(ByteBuffer.java:334)
at com.sun.tools.javac.util.BaseFileManager$ByteBufferCache.get(BaseFileManager.java:325)
at com.sun.tools.javac.util.BaseFileManager.makeByteBuffer(BaseFileManager.java:294)
at com.sun.tools.javac.file.RegularFileObject.getCharContent(RegularFileObject.java:114)
at com.sun.tools.javac.file.RegularFileObject.getCharContent(RegularFileObject.java:53)
at com.sun.tools.javac.main.JavaCompiler.readSource(JavaCompiler.java:602)
at com.sun.tools.javac.main.JavaCompiler.parse(JavaCompiler.java:665)
at com.sun.tools.javac.main.JavaCompiler.parseFiles(JavaCompiler.java:950)
at com.sun.tools.javac.main.JavaCompiler.compile(JavaCompiler.java:857)
at com.sun.tools.javac.main.Main.compile(Main.java:523)
at com.sun.tools.javac.main.Main.compile(Main.java:381)
at com.sun.tools.javac.main.Main.compile(Main.java:370)
at com.sun.tools.javac.main.Main.compile(Main.java:361)
at com.sun.tools.javac.Main.compile(Main.java:56)
at com.sun.tools.javac.Main.main(Main.java:42)
I don't get this error when replacing the DRF model with a deep learning pojo that I have also downloaded from h2o's Flow UI, so I'm thinking it is likely related to the drf_335270ee_8970_4855_b521_c4fb4ca184f5.java file (note that the POJO was too big to preview in H2O's Flow UI). What could be going on here?
Thanks
Instead of trying to compile an H2O random forest POJO, you can download and use a MOJO instead in almost exactly the same way without needing the compile step.
See:
http://docs.h2o.ai/h2o/latest-stable/h2o-genmodel/javadoc/index.html
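On the javac failure itself: the trace bottoms out in ByteBuffer.allocate, which throws IllegalArgumentException when asked for a negative capacity. One explanation that fits, though nothing in this thread confirms it, is that the generated source file is so large that a buffer size computed in 32-bit int arithmetic overflows. The following is a minimal sketch of that failure mode. The class and helper names are hypothetical, and the size formula is an illustrative assumption rather than javac's confirmed code.

```java
import java.nio.ByteBuffer;

public class BufferOverflowSketch {
    // Hypothetical size computation done in int arithmetic: the sum wraps
    // around to a negative number once capacity exceeds 2^30.
    static int paddedCapacity(int capacity) {
        return (capacity + capacity) >> 1;
    }

    // Returns true if ByteBuffer.allocate rejects the requested capacity.
    // Only call this with small or negative values; a large positive value
    // would attempt a real allocation.
    static boolean allocationRejected(int capacity) {
        try {
            ByteBuffer.allocate(capacity);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int small = 1024;
        int huge = (1 << 30) + 1; // just over 1 GiB
        System.out.println("small -> " + paddedCapacity(small));
        System.out.println("huge  -> " + paddedCapacity(huge));
        System.out.println("rejected: " + allocationRejected(paddedCapacity(huge)));
    }
}
```

Checking the size of the generated .java file on disk is a quick way to test this theory.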

SharpDX ShaderBytecode.CompileFromFile to PixelShader

I am currently migrating from SlimDX to SharpDX. Some things differ between them, such as how shaders are loaded.
I have a problem creating the PixelShader class (the same applies to the VertexShader class): none of the examples I found on this subject will compile.
For example:
using (var pixelShaderByteCode = ShaderBytecode.CompileFromFile(filename, "PS", "ps_5_0", shaderFlags))
shader.PixelShader = new SharpDX.Direct3D11.PixelShader(device, pixelShaderByteCode);
The problem is that SharpDX.Direct3D11.PixelShader does not take a CompilationResult as a parameter. I could use the vertexShaderByteCode.Bytecode, which is a ShaderBytecode, but this is also invalid.
There is a vertexShaderByteCode.Bytecode.Data, which is a DataStream. I might create the byte[] from it, but I think there should be an easier way. Did I miss something?
using: SharpDX 3.1.1
I have found the problem:
It looks like I also need to reference SharpDX.D3DCompiler to compile for DX11. It was using the DX9 compiler.
SharpDX.D3DCompiler.ShaderBytecode (DX11) vs SharpDX.Direct3D9.ShaderBytecode (DX9)
I'll leave this for anyone who has the same struggles.
CompilationResult is returned by the compile call. You can test whether the bytecode is null; if it is, you can then check the error codes (best done in debug, though :)).
HasErrors : boolean
Message : string
Check these also.

Simple OpenCV problem

When I try to run the following OpenCV program, it shows the following error:
ERROR:
test_1.exe - Application Error
The application failed to initialize properly (0x80000003).
Click on OK to terminate the application.
CODE:
#include "cv.h"
#include "highgui.h"
int main()
{
    IplImage *img = cvLoadImage("C:\\face.bmp");
    cvSetImageROI(img, cvRect(100, 100, 100, 100));
    cvAddS(img, cvScalar(50), img);
    cvResetImageROI(img);
    cvShowImage("Test", img);
    cvWaitKey(0);
    return 0;
}
When I press F5 (I'm using VS2008 Express), the program hits a breakpoint. I have attached a picture; I don't know whether it will help.
Error Snapshot Link
This program isn't the only one producing the error; any program that contains an OpenCV image manipulation function, such as cvSmooth, ends up in the same situation.
One last thing: is there a dedicated OpenCV forum or something like that?
I am an administrator, so yes, I have the permission.
a version mismatch.
Sorry, I didn't get it. Version mismatch with what?
However, I have found the error using Dependency Walker:
Warning: At least one module has an unresolved import due to a missing export
function in a delay-load dependent module.
I also found that it is a common problem, and found some info in the Dependency Walker FAQ:
Why am I seeing a lot of applications where MPR.DLL shows up in red under
SHLWAPI.DLL because it is missing a function named WNetRestoreConnectionA?
I also get a "Warning: At least one module has an unresolved import due to
a missing export function in a delay-load dependent module" message.
Function name : WNetRestoreConnectionA
But there is no guidance on how to solve it, though they say it is not a problem.
I googled a little and found a suggestion. It says:
Turn off your compilers setting to assume you are programming for Win9x.
(I just lost which setting but it is not that difficult, use a #define...)
But I have no idea how to do that in Visual Studio 2008 Express.
Any suggestions on how to solve it?
This usually indicates a problem with a dll; either you don't have permission, or a version is mismatched. Try running as Administrator to see if it is a permissions problem. If that doesn't help, try using the Dependency Walker.
