SPARQL queries and user defined vocabulary - rdflib

I am trying to play with rdflib and my own user-defined vocabulary (named ODE).
To do that I have generated a class in namespace/_ODE.py derived from DefinedNamespace:
from rdflib.term import URIRef
from rdflib.namespace import DefinedNamespace, Namespace


class ODE(DefinedNamespace):
    """
    DESCRIPTION_EDIT_ME_!

    Generated from: SOURCE_RDF_FILE_EDIT_ME_!
    Date: 2022-05-02 08:38:55.619901
    """

    _fail = True

    Function: URIRef
    Equation: URIRef
    hasDerivative: URIRef
    Polynomial: URIRef
    Ode: URIRef

    _NS = Namespace("ode#")
As all the new "classes" of the ODE vocabulary are specializations of the class "Seq", I have created the module rdflib/ode.py:
from rdflib import Seq, BNode
from rdflib.namespace import RDF, ODE, MATH

__all__ = ["Function", "Equation", "Polynomial", "Ode"]


class Ode(Seq):
    def __init__(self, graph, uri, seq=[], rtype="Ode"):
        """Creates a Container

        :param graph: a Graph instance
        :param uri: URI or Blank Node of the Container
        :param seq: the elements of the Container
        :param rtype: the type of Container, one of "Bag", "Seq" or "Alt"
        """

        self.graph = graph
        self.uri = uri or BNode()
        self._len = 0
        self._rtype = rtype  # rdf:Bag or rdf:Seq or rdf:Alt

        self.append_multiple(seq)

        # adding triple corresponding to container type
        self.graph.add((self.uri, RDF.type, ODE[self._rtype]))


class Function(Ode):
    def __init__(self, graph, uri, seq=[]):
        Ode.__init__(self, graph, uri, seq, "Function")


class Equation(Ode):
    def __init__(self, graph, uri, seq=[]):
        Ode.__init__(self, graph, uri, seq, "Equation")


class Polynomial(Ode):
    def __init__(self, graph, uri, seq=[]):
        Ode.__init__(self, graph, uri, seq, "Polynomial")
With these two classes I can generate an RDF file in a declarative way.
For example, we can create the function c(t):
from rdflib import Graph, URIRef, RDF, BNode, RDFS, Literal, Seq, Bag, Function, Equation, Times, Minus, Polynomial, Ode
from rdflib.namespace import ODE, MATH

graph = Graph()

# the time t
t = BNode("t")
graph.add((t, RDFS.label, Literal("t")))

c_of_t_label = BNode("c")
graph.add((c_of_t_label, RDFS.label, Literal("c")))
c_of_t_bn = BNode("c_of_t")

Function(graph, c_of_t_bn, [c_of_t_label, t])
And we obtain the following RDF:
_:c rdfs:label "c" .
_:t rdfs:label "t" .
_:c_of_t a ode:Function ;
    rdf:_1 _:c ;
    rdf:_2 _:t .
So far, so good. Now I want to execute a SPARQL query on this RDF to retrieve the function.
import rdflib

from rdflib import Graph, URIRef, RDF, BNode, RDFS, Literal, Seq, Bag, Function, Equation, Times, Minus, Polynomial, Ode
from rdflib.namespace import ODE, MATH


def main():
    g = rdflib.Graph()
    g.parse("ode_spe", format="ttl")

    function = ODE.Function

    query_test = "SELECT ?e WHERE {?e rdf:type ode:Function . }"
    qres = g.query(query_test)

    print(len(qres))


if __name__ == "__main__":
    main()
But I get no results.
I am probably not doing the right thing with ode:Function.
I have two questions:
Is this the right way to add a user-defined vocabulary?
And what can I do to retrieve the function with a SPARQL query?
Thanks for your help.
Olivier

My eye was drawn to SELECT ?e WHERE {?e rdf:type ode:Function . }. Check that ode is known by the graph: either add a PREFIX declaration in the SPARQL query or pass an initNs keyword argument to the g.query invocation. And/or use g.namespace_manager to bind "ode" to ODE.
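For illustration, a minimal sketch of those three options, assuming the ODE class shown above is importable from rdflib.namespace and that the parsed data really uses the same ode# namespace:

from rdflib import Graph, RDF
from rdflib.namespace import ODE  # the user-defined DefinedNamespace from above

g = Graph()
g.parse("ode_spe", format="ttl")

# Option 1: declare the prefixes inside the SPARQL string itself
# (the ode IRI must match the _NS declared in the ODE class)
q1 = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ode: <ode#>
SELECT ?e WHERE { ?e rdf:type ode:Function . }
"""
print(len(g.query(q1)))

# Option 2: pass the namespaces with the initNs keyword argument
q2 = "SELECT ?e WHERE { ?e rdf:type ode:Function . }"
print(len(g.query(q2, initNs={"rdf": RDF, "ode": ODE})))

# Option 3: bind the prefix once on the graph's namespace manager,
# so later queries on this graph can use the bare ode: prefix
g.namespace_manager.bind("ode", ODE)
print(len(g.query(q2)))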

Related

YouTube Data API returning inconsistent results with duplicates

There have been numerous questions about inconsistent results from the YouTube Data API: 1, 2, 3, 4, 5, 6. Most of them have accepted answers that seem to indicate there was a problem with the API request that was fixed by the instructions in the answers. But none of those situations apply to the API request discussed here.
There have also been two questions about duplicates in the API results: 1, 2. Both of them have the same answer, which says to use the next-page token. But both questions say the token was used, so that answer is not helpful.
Yesterday, I submitted a series of API requests to get the list of most-viewed videos about 3D printing. The first request in the series was:
https://www.googleapis.com/youtube/v3/search?q=3D print&type=video&maxResults=50&part=id,snippet&order=viewCount&key=<my key>
I ran that in a VBA sub, which took the next-page token from each result and resubmitted the URL with &pageToken=<nextPageToken> inserted.
The result was a list of 649 unique video IDs. So far so good.
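That loop is easy to mirror outside VBA; here is a rough Python equivalent for reference (fetch_all_video_ids and YOUR_API_KEY are illustrative names, not from the original code):

import requests

def fetch_all_video_ids(query, api_key):
    """Page through search results, following nextPageToken until it is absent."""
    url = "https://www.googleapis.com/youtube/v3/search"
    params = {
        "q": query,
        "type": "video",
        "maxResults": 50,
        "part": "id,snippet",
        "order": "viewCount",
        "key": api_key,
    }
    ids = []
    while True:
        data = requests.get(url, params=params).json()
        ids += [item["id"]["videoId"] for item in data.get("items", [])]
        token = data.get("nextPageToken")
        if not token:
            return ids
        params["pageToken"] = token  # resubmit with the next-page token

video_ids = fetch_all_video_ids("3D print", "YOUR_API_KEY")
print(len(video_ids), len(set(video_ids)))  # total vs. unique IDs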
After making some changes in the VBA code and seeing some duplicates in the result set, I went back today and ran the original VBA sub again. The result was again a list of 649 video IDs, but this time the list included duplicates and it also included IDs that were not in yesterday's list and was missing IDs that were there yesterday. Here is a comparison from the first two pages and the last two pages of the two result sets:
Page  # on page  # overall  Run 1        Run 2        Same as  Seq  Dup
1     1          1          f2mdMcf-fJs  f2mdMcf-fJs  1
1     2          2          WSauz5KVKTU  WSauz5KVKTU  2        Seq
1     3          3          zsSCUWs7k9Q  XYIUM5TkhMo  None
1     4          4          B5Q1J5c8oNc  zsSCUWs7k9Q  3        Seq
1     5          5          cUxIb3Pt-hQ  B5Q1J5c8oNc  4        Seq
1     6          6          4yyOOn7pWnA  LDjE28szwr8  None
1     7          7          3N46jQ0Xi3c  cUxIb3Pt-hQ  5        Seq
1     8          8          08dBVz8_VzU  4yyOOn7pWnA  6        Seq
...
1     13         13         oeKIe1ik2O8  e1rQ8YwNSDs  11       Seq
1     14         14         FrG_eSECfps  RVB2JreIcoc  12       Seq
1     15         15         pPQCwz2q96o  oeKIe1ik2O8  13       Seq
1     16         16         uo3KuoEiu3I  pPQCwz2q96o  15       NOT
1     17         17         0U6aIwd5h9s  uo3KuoEiu3I  16       Seq
...
1     47         47         ShGsW68zbIo  iu9rhqsvrPs  46       Seq
1     48         48         0q0xS7W78KQ  ShGsW68zbIo  47       Seq
1     49         49         UheJQsXOAnY  0q0xS7W78KQ  48       Seq  Dup
1     50         50         H8AcqOh0wis  H8AcqOh0wis  50       NOT  Dup
2     1          51         EWq3-2VuqbQ  0q0xS7W78KQ  48       NOT  Dup
2     2          52         scuTZza4f_o  H8AcqOh0wis  50       NOT  Dup
2     3          53         bJWJW-mz4_U  UheJQsXOAnY  49       NOT
2     4          54         Ii4VYsh9OlM  EWq3-2VuqbQ  51       NOT
2     5          55         r2-OGUu57pU  scuTZza4f_o  52       Seq
2     6          56         8KTnu18Mi9Q  bJWJW-mz4_U  53       Seq
2     7          57         DconsfGsXyA  Ii4VYsh9OlM  54       Seq
2     8          58         UttEvLJP3l8  8KTnu18Mi9Q  56       NOT
2     9          59         GJOOLH9ZP2I  DconsfGsXyA  57       Seq
2     10         60         ewgmg9Q5Ab8  UttEvLJP3l8  58       Seq
...
13    35         635        qHpR_p8lA4I  FFVOzo7tSV8  639      Seq
13    36         636        DplwDDZNTRc  76IBjdM9s6g  640      Seq
13    37         637        3AObqGsimr8  qEh0uZuu7_U  None
13    38         638        88keQ4PWH18  RhfGJduOlrw  641      Seq
13    39         639        FFVOzo7tSV8  QxzH9QkirCU  643      NOT
13    40         640        76IBjdM9s6g  Qsgz4GbL8O4  None
13    41         641        RhfGJduOlrw  BSgg7mEzfqY  644      Seq
13    42         642        lVEqwV0Nlzg  VcmjbJ2q8-w  645      Seq
13    43         643        QxzH9QkirCU  gOU0BCL-TXs  None
13    44         644        BSgg7mEzfqY  IoOXQUcW24s  646      Seq
13    45         645        VcmjbJ2q8-w  o4_2_a6LzFU  647      Seq  Dup
14    1          646        IoOXQUcW24s  o4_2_a6LzFU  647      NOT  Dup
14    2          647        o4_2_a6LzFU  ijVPcGaqVjc  648      Seq
14    3          648        ijVPcGaqVjc  nk3FlgEuG-s  649      Seq
14    4          649        nk3FlgEuG-s  27ZLFn8Dejg  None
The last three columns have the following meanings:
Same as: If an ID from Run 2 is the same as an ID from Run 1, then this column has the # overall for Run 1.
Seq: Indicates whether the number in column "Same as" is one more than the previous number in that column.
Dup: Indicates whether an ID from Run 2 occurred more than once in that run.
Problems:
The videos XYIUM5TkhMo, LDjE28szwr8, qEh0uZuu7_U, Qsgz4GbL8O4, gOU0BCL-TXs, and 27ZLFn8Dejg were returned as #3, #6, #637, #640, #643, and #649 in Run 2, but were not returned at all in Run 1.
The videos FrG_eSECfps, r2-OGUu57pU, and lVEqwV0Nlzg were returned as #14, #55, and #642 in Run 1, but were not in Run 2.
The videos 0q0xS7W78KQ, H8AcqOh0wis, and o4_2_a6LzFU were returned as #49, #50, and #645 in Run 2, but then each appears a second time in that run (as well as appearing in Run 1 as #48, #50, and #647).
These results are troubling. They mean that no single search will return a reliable list of videos for a given value of q.
I mentioned at the beginning that previous questions about inconsistent results from the YouTube Data API had answers that seemed to resolve those inconsistencies. Is there a way to do that for this search? Is there something wrong with the way I'm composing the search that is causing the problem?
If there isn't a way to fix the search, then I suppose the only way to get a list of videos on the topic with high confidence of it being complete is to run the search multiple times and merge the results until no new IDs appear that were not in a previous result set. But even then, one would not know if there are other videos lurking undetected.
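For what it's worth, that merge-until-stable idea is only a handful of lines; a sketch reusing the hypothetical fetch_all_video_ids helper from above (a workaround, not a fix for the underlying API behaviour):

def merged_search(query, api_key, max_runs=10):
    """Union repeated searches until a run contributes no new IDs."""
    seen = set()
    for _ in range(max_runs):
        ids = set(fetch_all_video_ids(query, api_key))
        if ids <= seen:  # this run added nothing new: stop
            return seen
        seen |= ids
    return seen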

Using Dask throws ImportError when run inside SageMath

Recently, I have been trying to parallelize some Sage (Sage 9.4 on a MacBook Pro running OSX 11.2.3) code using Dask. The problem I run into is that while I can run Dask inside Sage, it will break whenever I include any code that isn't "pure python." In particular, it keeps throwing an ImportError. Here is a basic example of what I am running into
import time
from dask import delayed
from dask.distributed import Client
from time import sleep

client = Client(n_workers=4)

def Hello():
    1+1  # this line breaks things by adding a sage operation
         # if I remove it the code runs fine
    return 'Hello World'

z = delayed(Hello)()
z.compute()
This code throws the following error
Traceback
ImportError Traceback (most recent call last)
<timed eval> in <module>
~/.sage/local/lib/python3.9/site-packages/dask/base.py in compute(self, **kwargs)
284 dask.base.compute
285 """
--> 286 (result,) = compute(self, traverse=False, **kwargs)
287 return result
288
~/.sage/local/lib/python3.9/site-packages/dask/base.py in compute(*args, **kwargs)
566 postcomputes.append(x.__dask_postcompute__())
567
--> 568 results = schedule(dsk, keys, **kwargs)
569 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
570
~/.sage/local/lib/python3.9/site-packages/distributed/client.py in get(self, dsk, keys, workers, allow_other_workers, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, actors, **kwargs)
2669 should_rejoin = False
2670 try:
-> 2671 results = self.gather(packed, asynchronous=asynchronous, direct=direct)
2672 finally:
2673 for f in futures.values():
~/.sage/local/lib/python3.9/site-packages/distributed/client.py in gather(self, futures, errors, direct, asynchronous)
1946 else:
1947 local_worker = None
-> 1948 return self.sync(
1949 self._gather,
1950 futures,
~/.sage/local/lib/python3.9/site-packages/distributed/client.py in sync(self, func, asynchronous, callback_timeout, *args, **kwargs)
843 return future
844 else:
--> 845 return sync(
846 self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
847 )
~/.sage/local/lib/python3.9/site-packages/distributed/utils.py in sync(loop, func, callback_timeout, *args, **kwargs)
324 if error[0]:
325 typ, exc, tb = error[0]
--> 326 raise exc.with_traceback(tb)
327 else:
328 return result[0]
~/.sage/local/lib/python3.9/site-packages/distributed/utils.py in f()
307 if callback_timeout is not None:
308 future = asyncio.wait_for(future, callback_timeout)
--> 309 result[0] = yield future
310 except Exception:
311 error[0] = sys.exc_info()
/var/tmp/sage-9.4-current/local/lib/python3.9/site-packages/tornado/gen.py in run(self)
733
734 try:
--> 735 value = future.result()
736 except Exception:
737 exc_info = sys.exc_info()
~/.sage/local/lib/python3.9/site-packages/distributed/client.py in _gather(self, futures, errors, direct, local_worker)
1811 exc = CancelledError(key)
1812 else:
-> 1813 raise exception.with_traceback(traceback)
1814 raise exc
1815 if errors == "skip":
~/.sage/local/lib/python3.9/site-packages/distributed/protocol/pickle.py in loads()
73 return pickle.loads(x, buffers=buffers)
74 else:
---> 75 return pickle.loads(x)
76 except Exception:
77 logger.info("Failed to deserialize %s", x[:10000], exc_info=True)
/var/tmp/sage-9.4-current/local/lib/python3.9/site-packages/sage/rings/integer.pyx in init sage.rings.integer (build/cythonized/sage/rings/integer.c:54201)()
----> 1 r"""
2 Elements of the ring `\ZZ` of integers
3
4 Sage has highly optimized and extensive functionality for arithmetic with integers
5 and the ring of integers.
/var/tmp/sage-9.4-current/local/lib/python3.9/site-packages/sage/rings/rational.pyx in init sage.rings.rational (build/cythonized/sage/rings/rational.cpp:40442)()
98
99
--> 100 import sage.rings.real_mpfr
101 import sage.rings.real_double
102 from libc.stdint cimport uint64_t
/var/tmp/sage-9.4-current/local/lib/python3.9/site-packages/sage/rings/real_mpfr.pyx in init sage.rings.real_mpfr (build/cythonized/sage/rings/real_mpfr.c:46795)()
----> 1 r"""
2 Arbitrary Precision Real Numbers
3
4 AUTHORS:
5
/var/tmp/sage-9.4-current/local/lib/python3.9/site-packages/sage/libs/mpmath/utils.pyx in init sage.libs.mpmath.utils (build/cythonized/sage/libs/mpmath/utils.c:9062)()
----> 1 """
2 Utilities for Sage-mpmath interaction
3
4 Also patches some mpmath functions for speed
5 """
/var/tmp/sage-9.4-current/local/lib/python3.9/site-packages/sage/rings/complex_mpfr.pyx in init sage.rings.complex_mpfr (build/cythonized/sage/rings/complex_mpfr.c:34594)()
----> 1 """
2 Arbitrary Precision Floating Point Complex Numbers
3
4 AUTHORS:
5
/var/tmp/sage-9.4-current/local/lib/python3.9/site-packages/sage/rings/complex_double.pyx in init sage.rings.complex_double (build/cythonized/sage/rings/complex_double.c:25284)()
96 from cypari2.convert cimport new_gen_from_double, new_t_COMPLEX_from_double
97
---> 98 from . import complex_mpfr
99
100 from .complex_mpfr import ComplexField
ImportError: cannot import name complex_mpfr
The only other time I have seen an ImportError like this is when I was running Sage inside Python and did not include a from sage.all import *, so I am wondering whether what is happening is that Dask is trying to run my code in plain Python. I'm also not sure whether this qualifies as a Sage or a Dask problem. Any help would be greatly appreciated!
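One hedged way to probe that hypothesis (a guess, not a known fix): dask.distributed's Client.run executes a function on every worker, so forcing each worker to import sage.all up front should reveal whether the workers are plain Python processes that lack Sage's initialization.

from dask.distributed import Client

client = Client(n_workers=4)

def preload_sage():
    # Runs on each worker process; only succeeds if the worker can import Sage at all.
    import sage.all  # noqa: F401
    return "sage.all imported"

print(client.run(preload_sage))  # maps each worker address to its return value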

Debugging APL code: how to use `@` (index) and `⊢` (right tack) together?

I am attempting to read Aaron Hsu's thesis on A data parallel compiler hosted on the GPU, where I have landed at some APL code I am unable to fix. I've attached both a screenshot of the offending page (page 74 as per the thesis numbering at the bottom) and a transcription.
The transcribed code is as follows:
d ← 0 1 2 3 1 2 3 3 4 1 2 3 4 5 6 5 5 6 3 4 5 6 5 5 6 3 4
This makes sense: create an array named d.
⍳≢d
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
This too makes sense. Count the number of elements in d and create a sequence of
that length.
⍉↑d,¨⍳≢d
0 1 2 3 1 2 3 3 4 1 2 3 4 5 6 5 5 6 3 4 5 6 5 5 6 3 4
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
This is slightly challenging, but let me break it down:
zip the sequence ⍳≢d = 1..27 with the d array using the ,¨ idiom, which zips the two arrays using a catenation.
Then, split into two rows using ↑ and transpose to get columns using ⍉
Now the biggie:
(⍳≢d)@(d,¨⍳≢d)⊢7 27⍴' '
INDEX ERROR
(⍳≢d)@(d,¨⍳≢d)⊢7 27⍴' '
Attempting to break it down:
⍳≢d counts number of elements in d
(d,¨⍳≢d) creates an array of pairs (d, index of d)
7 27⍴' ' creates a 7 x 27 grid: presumably 7 because that's the max value of d + 1, for indexing reasons.
Now I'm flummoxed about how the use of ⊢ works: as far as I know, it just ignores everything to the left! So I'm missing something about the parsing of this expression.
I presume it is parsed as:
(⍳≢d)@((d,¨⍳≢d)⊢(7 27⍴' '))
which according to me should be evaluated as:
(⍳≢d)@((d,¨⍳≢d)⊢(7 27⍴' '))
= (⍳≢d)@((7 27⍴' '))   [using a⊢b = b]
= not the right thing
As I was writing this down, I managed to fix the bug by sheer luck: if we increment d to be d + 1 so we are 1-indexed, the bug no longer manifests:
d ← d + 1
d
1 2 3 4 2 3 4 4 5 2 3 4 5 6 7 6 6 7 4 5 6 7 6 6 7 4 5
then:
(⍳≢d)@(d,¨⍳≢d)⊢7 27⍴' '
1
2 5 10
3 6 11
4 7 8 12 19 26
9 13 20 27
14 16 17 21 23 24
15 18 22 25
However, I still don't understand how this works! I presume the context will be useful for others attempting to read the thesis, so I'm going to leave the rest of it up.
Please explain what (⍳≢d)@(d,¨⍳≢d)⊢7 27⍴' ' does!
I've attached the raw screenshot to make sure I didn't miss something:
I'm happy to see that you found the off-by-one error. It stems from Aaron Hsu working with index origin 0. If you set ⎕IO←0 then his code will work.
Some dyadic operators can take an array operand, giving the sequence OPERATOR operand argument, e.g. in -@(1 2 3)(4 5 6 7). This poses a problem because both the operand and the argument are arrays, and juxtaposition of arrays forms a new array with those arrays as elements by a process known as stranding. Compare:
(1 2 3)(4 5 6 7)
┌─────┬───────┐
│1 2 3│4 5 6 7│
└─────┴───────┘
However, in the case of the operator with its array operand, we want to "break" this strand so the left part can act as operand while the right part acts as argument. One way to break the stranding up is by applying a function to the argument, giving the sequence OPERATOR operand Function argument. Now, we don't actually need any transformation of the argument, so an identity function will do: -@(1 2 3)⊢(4 5 6 7).
As for what (⍳≢d)@(d,¨⍳≢d)⊢7 27⍴' ' actually does:
7 27⍴' ' creates a blank matrix.
(⍳≢d) are indices to insert into specified slots in the matrix.
@(d,¨⍳≢d) indicates at which locations in the matrix the above should replace the existing values.
⊢ serves solely to separate (d,¨⍳≢d) from 7 27⍴' '. The code could also have been written as ((⍳≢d)@(d,¨⍳≢d))7 27⍴' ' with the parentheses serving to "bind" the operand to the operator.
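As a cross-check (not part of the original answer), the expression amounts to "place the value k at row d[k], column k of a blank matrix", which a short NumPy sketch can mimic, assuming the 1-indexed d from above (NumPy indexing is 0-based, hence the -1 shifts):

import numpy as np

d = np.array([1,2,3,4,2,3,4,4,5,2,3,4,5,6,7,6,6,7,4,5,6,7,6,6,7,4,5])  # d after d←d+1
k = np.arange(1, len(d) + 1)               # ⍳≢d : 1..27

grid = np.full((7, 27), "", dtype=object)  # 7 27⍴' '
grid[d - 1, k - 1] = k                     # (⍳≢d)@(d,¨⍳≢d)⊢ ... : value k at row d[k], column k

for row in grid:                           # print the non-blank entries of each row
    print(" ".join(str(v) for v in row if v != ""))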

Which clustering model can I use to predict the following outcome?

I have three columns in my dataset: latitude, longitude, and check-ins for each restaurant in the list of restaurants that come under the category 'pizza'. The data was derived from the Yelp dataset. I am supposed to build a model that predicts the coordinates (latitude, longitude) where I should start a new restaurant so that the number of check-ins can be high. There are 4951 rows in total.
checkins latitude longitude
0 2 33.394877 -111.600194
1 2 43.841217 -79.303936
2 1 40.442828 -80.186293
3 1 41.141631 -81.356603
4 1 40.434399 -79.922983
5 1 33.552870 -112.133712
6 1 43.686836 -79.293838
7 2 41.131282 -81.490180
8 1 40.500796 -79.943429
9 12 36.010086 -115.118656
10 2 41.484475 -81.921150
11 1 43.842450 -79.027990
12 1 43.724840 -79.289919
13 2 45.448630 -73.608719
14 1 45.577027 -73.330855
15 1 36.238059 -115.210341
16 1 33.623055 -112.339758
17 1 43.762768 -79.491417
18 1 43.708415 -79.475884
19 1 45.588257 -73.428926
20 4 41.152875 -81.358754
21 1 41.608833 -81.525020
22 1 41.425152 -81.896178
23 1 43.694716 -79.304879
24 1 40.442147 -79.956513
25 1 41.336466 -81.784790
26 1 33.231942 -111.721218
27 2 36.291436 -115.287016
28 2 33.641847 -111.995571
29 1 43.570217 -79.566431
... ... ... ...
I tried to approach the problem with clustering using DBSCAN and ended up with the following graph, but I am not able to make any sense of it. How do I proceed further, or how do I approach the problem in a different way to get my results?
import pandas as pd
from sklearn.cluster import DBSCAN
import numpy as np
import matplotlib.pyplot as plt
review=pd.read_csv('pizza_category.csv')
checkin=pd.read_csv('yelp_academic_dataset/yelp_checkin.csv')
final=pd.merge(review,checkin,on='business_id',how='inner')
final.dropna()
final=final.reset_index(drop=True)
X=final[['checkins']]
X['latitude']=final[['latitude']].astype(dtype=np.float64).values
X['longitude']=final[['longitude']].astype(dtype=np.float64).values
print(X)
arr=X.values
db = DBSCAN(eps=2,min_samples=5)
y_pred = db.fit_predict(arr)
plt.figure(figsize=(20,10))
plt.scatter(arr[:, 0], arr[:, 1], c=y_pred, cmap="plasma")
plt.xlabel("Feature 0")
plt.ylabel("Feature 1")
Here's the plot I got
This is not a clustering problem.
What you want to do is density estimation, where you estimate density based on previous check-in frequencies.
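As a hedged illustration of that suggestion (not part of the answer), a check-in-weighted kernel density estimate over the coordinates can be fit with scikit-learn, reusing the merged final dataframe from the question; the bandwidth and grid size are arbitrary illustration values:

import numpy as np
from sklearn.neighbors import KernelDensity

# Fit a density over (latitude, longitude), weighting each restaurant by its check-ins.
coords = final[['latitude', 'longitude']].to_numpy()
kde = KernelDensity(kernel='gaussian', bandwidth=0.1)
kde.fit(coords, sample_weight=final['checkins'].to_numpy())

# Score a lat/long grid and take the highest-density point as a candidate site.
lat = np.linspace(coords[:, 0].min(), coords[:, 0].max(), 200)
lon = np.linspace(coords[:, 1].min(), coords[:, 1].max(), 200)
grid = np.array([(la, lo) for la in lat for lo in lon])
log_density = kde.score_samples(grid)
print(grid[log_density.argmax()])  # candidate (latitude, longitude)

Because the data spans several metro areas, a single global argmax will simply land in the densest city; in practice one would restrict the grid to the region where the new restaurant is to be opened.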

Problems Implementing AR, ARMA, and possibly more complex timeseries models in pymc3 using theano.scan

I am trying to implement a simple ARMA model, but I am having serious difficulties getting it to run. When I add a parameter to the error term everything works fine (see the return x_m1 + a*e statement, commented out below); however, if I add a parameter to the autoregressive part, I get a FloatingPointError, LinAlgError, or PositiveDefiniteError, depending on the initialization method I use.
The code is also put into a gist you can find here. The model definition is replicated here:
with pm.Model() as model:
    a = pm.Normal("a", 0, 1)
    sigma = pm.Exponential('sigma', 0.1, testval=F(.1))
    e = pm.Normal("e", 0, sigma, shape=(N-1,))

    def x(e, x_m1, a):
        # return x_m1 + a*e
        return a*x_m1 + e

    x, updates = theano.scan(
        fn=x,
        sequences=[e],
        outputs_info=[tt.as_tensor_variable(data.iloc[0])],
        non_sequences=[a]
    )
    x = pm.Deterministic('x', x)

    lam = pm.Exponential('lambda', 5.0, testval=F(.1))
    y = pm.StudentT("y", mu=x, lam=lam, nu=1, observed=data.values[1:])

with model:
    trace = pm.sample(2000, init="NUTS", n_init=1000)
Here are the errors for the respective initialization methods:
"ADVI" / "ADVI_MAP": FloatingPointError: NaN occurred in ADVI optimization.
"MAP": LinAlgError: 35-th leading minor not positive definite
"NUTS": PositiveDefiniteError: Scaling is not positive definite. Simple check failed. Diagonal contains negatives. Check indexes [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71]
For details on the error messages, please look at this github issue posted at pymc3.
To be explicit, I really would like to have a scan-like solution that is easily extendable to, for instance, a full ARMA model. I know that one can represent the presented AR(1) model without scan by defining logP as already done in pymc3/distributions/timeseries.py#L18-L46, but I was not able to extend this vectorized style to a full ARMA model. The use of theano.scan seems preferable, I think.
Any help is highly appreciated!
