Highcharts: Disconnect line when there's no data for position - highcharts

In my project I'm using Highcharts/Highstock. See my JSFiddle example for a simple graph with an X-axis from 1 to 10, but no data for "position 9". I want no line to be drawn between 8 and 10, since there is no data for 9.
I've been playing with the connectNulls option, but that only works when providing null as value for position 9. Since Highcharts figures out the intervals by itself I hoped it would recognise on its own that there's no data for that position. Is there any way to make Highcharts not draw a line between 8 and 10 without specifying null for position 9?
Thanks in advance for your replies.
http://jsfiddle.net/bnqhuqt3/

You cannot cut a line within a single series without a null point. Instead, define the data after the gap as a separate series:
series: [{
    data: [
        [1, 98.87],
        [2, 98.45],
        [3, 98.52],
        [4, 99.34],
        [5, 98.56],
        [6, 98.61],
        [7, 98.12],
        [8, 98.03]
    ]
}, {
    data: [
        [10, 0],
        [11, 150]
    ]
}]
Fiddle Example
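If the data is prepared on the server before it reaches the chart, the split into separate series can be automated. The following is a minimal Python sketch under that assumption; the helper name split_on_gaps and its step parameter are hypothetical and not part of Highcharts. It starts a new series whenever two neighbouring x values are more than one step apart.

```python
import json

def split_on_gaps(points, step=1):
    """Split sorted [x, y] points into runs whose neighbouring x values
    are at most `step` apart, and wrap each run as a series dict."""
    runs = []
    for x, y in points:
        if runs and x - runs[-1][-1][0] <= step:
            runs[-1].append([x, y])   # continues the current run
        else:
            runs.append([[x, y]])     # gap found: start a new run
    return [{"data": run} for run in runs]

points = [
    [1, 98.87], [2, 98.45], [3, 98.52], [4, 99.34],
    [5, 98.56], [6, 98.61], [7, 98.12], [8, 98.03],
    [10, 0], [11, 150],
]
series = split_on_gaps(points)
print(json.dumps(series))  # JSON usable as the chart's `series` option
```

Because every run becomes its own series, each one gets its own default color and legend entry unless you set those options yourself.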

Related

highchartr: Synchronize point and line colors and legend

I'm using the highchartr package to create an interactive chart.
My chart has lines on it corresponding to different series. In addition, I would like to have shapes at certain points on the lines.
It's very easy to get the points in the right place. However, I would like to map the point color to the line it is associated with. And when the user clicks on the legend entry for the line, I'd like the associated points to be toggled as well.
The code looks like this:
highchart() %>%
  hc_add_series(
    type="line",
    marker=list(enabled=F),
    data=input_data,
    mapping=hcaes(x=x, y=y, group=series_name)
  ) %>%
  hc_add_series(
    type="point",
    data=input_data %>% filter(! is.na(marker)),
    mapping=hcaes(x=x, y=y, color=series_name, fill=series_name, group=series_name, shape=marker)
  )
The result gets the points in the right place. But the point color is on a different color mapping from the lines. Clicking on the entry for the line in the legend toggles only the line - the points show up as separate entries by series_name.
What can I do so that:
- The points and lines share the same color mapping
- The points and lines can be toggled together by clicking on the line in the legend
- The points show up separately in the legend based on their shape rather than their color?
Thanks!
Generally, this can be achieved in at least a few different ways. It all depends on your data, which you haven't provided (I created some sample data).
Additionally, I will provide all the examples in jsFiddle (JavaScript), because it is faster to explain something with a quick online example.
The final answer will contain R code (maybe with some custom JavaScript if needed, but all of it will be reproducible in R).
First of all, your assumption that you need a separate series is wrong and causes problems. If you want markers on your line with the same color and you want to toggle them together on legend click, then you don't need separate series - one series with markers enabled on some points is enough, see this example: https://jsfiddle.net/BlackLabel/s24rk9x7/
In this case, the R data needs to be defined properly.
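To show what "defined properly" could look like for the single-series approach, here is a small sketch in Python rather than R. It assumes that Highcharts honors a per-point marker.enabled option that overrides the series-level marker setting; the helper to_points and its marked_indices parameter are hypothetical names used only for illustration.

```python
import json

def to_points(ys, marked_indices):
    """Turn y values into point dicts; only the points whose index is in
    `marked_indices` get a visible marker."""
    marked = set(marked_indices)
    return [
        {"x": x, "y": y, "marker": {"enabled": x in marked}}
        for x, y in enumerate(ys)
    ]

# One line series; markers appear only on the points at x = 1 and x = 2.
series = {
    "type": "line",
    "data": to_points([4, 3, 5, 6, 2, 3], marked_indices=[1, 2]),
}
print(json.dumps(series, indent=2))
```

Since the markers belong to the line series itself, they share its color and are toggled together with it from the legend.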
If you don't want to keep it simple as described above, you can keep lines and markers as separate series as in your original question.
In this case, you can use the series.linkedTo property to connect your "point" series to the line series. (By the way, Highcharts has no "point" series type; it is called "scatter". This is another reason why your code does not work and why you were downvoted.) However, linkedTo does not seem to work in Highcharter. This looks like a bug and should be reported on the Highcharter GitHub repo.
This is a JavaScript version which works fine: https://jsfiddle.net/BlackLabel/3mtdfqLo/
In this example, if you want to keep markers and line series in the same color, you can define colors manually or you can write some custom code (like I did) that will change the color for you automatically.
And this is the equivalent R version, which should work but does not:
library(highcharter)
highchart() %>%
  # ids are placed as in the JavaScript version, so each scatter point lies on its linked line
  hc_add_series(
    data=list(4, 3, 5, 6, 2, 3),
    id="first"
  ) %>%
  hc_add_series(
    data=list(14, 13, 15, 16, 12, 13),
    id="second"
  ) %>%
  hc_add_series(
    data=list(10, 8, 6, 2, 5, 12)
  ) %>%
  hc_add_series(
    type="scatter",
    linkedTo="first",
    data=list(list(1, 3), list(2, 5))
  ) %>%
  hc_add_series(
    type="scatter",
    linkedTo="second",
    data=list(list(1, 13), list(2, 15), list(3, 16))
  ) %>%
  hc_plotOptions(
    line = list(marker=list(enabled=FALSE))
  )
There is probably something wrong with the hc_add_series function.
As a workaround, you can write it all as custom JavaScript code, which (again) works fine:
library(highcharter)
highchart() %>%
  hc_plotOptions(
    line = list(marker=list(enabled=FALSE))
  ) %>%
  hc_chart(
    events = list(load = JS("function() {
      this.addSeries({
        data: [4, 3, 5, 6, 2, 3],
        id: 'first'
      });
      this.addSeries({
        data: [14, 13, 15, 16, 12, 13],
        id: 'second'
      });
      this.addSeries({
        data: [10, 8, 6, 2, 5, 12]
      });
      this.addSeries({
        type: 'scatter',
        linkedTo: 'first',
        data: [[1, 3], [2, 5]]
      });
      this.addSeries({
        type: 'scatter',
        linkedTo: 'second',
        data: [[1, 13], [2, 15], [3, 16]]
      });
    }")))
Of course, the last two examples don't include the code that changes the colors; you can copy it from the jsFiddle above.

Correct way to split data to batches for Keras stateful RNNs

As the documentation states
the last state for each sample at index i in a batch will be used as
initial state for the sample of index i in the following batch
Does this mean that I need to split the data into batches in the following way?
For example, let's assume that I am training a stateful RNN to predict the next integer in range(0, 5) given the previous one:
# batch_size = 3
# 0, 1, 2 etc in x are samples (timesteps and features omitted for brevity of the example)
x = [0, 1, 2, 3, 4]
y = [1, 2, 3, 4, 5]
batches_x = [[0, 1, 2], [1, 2, 3], [2, 3, 4]]
batches_y = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
then the state after learning on x[0, 0] will be initial state for x[1, 0]
and x[0, 1] for x[1, 1] (0 for 1 and 1 for 2 etc)?
Is it the right way to do it?
The following is based on this answer, which I verified with some tests.
Stateful=False:
Normally (stateful=False), you have one batch with many sequences:
batch_x = [
[[0],[1],[2],[3],[4],[5]],
[[1],[2],[3],[4],[5],[6]],
[[2],[3],[4],[5],[6],[7]],
[[3],[4],[5],[6],[7],[8]]
]
The shape is (4,6,1). This means that you have:
1 batch
4 individual sequences = this is batch size and it can vary
6 steps per sequence
1 feature per step
Every time you train, whether you repeat this batch or pass a new one, the layer sees individual sequences. Every sequence is a unique entry.
Stateful=True:
When you go to a stateful layer, you no longer pass individual sequences. You pass very long sequences divided into small batches, so you need more batches:
batch_x1 = [
[[0],[1],[2]],
[[1],[2],[3]],
[[2],[3],[4]],
[[3],[4],[5]]
]
batch_x2 = [
[[3],[4],[5]], #continuation of batch_x1[0]
[[4],[5],[6]], #continuation of batch_x1[1]
[[5],[6],[7]], #continuation of batch_x1[2]
[[6],[7],[8]] #continuation of batch_x1[3]
]
Both shapes are (4,3,1). And this means that you have:
2 batches
4 individual sequences = this is batch size and it must be constant
6 steps per sequence (3 steps in each batch)
1 feature per step
Stateful layers are meant for huge sequences, long enough to exceed your memory or the time available for a task. You slice the sequences and process them in parts. The results do not differ; the layer is not smarter and has no additional capabilities. It simply does not assume that the sequences have ended after it processes one batch; it expects the continuation of those sequences.
In this case, you decide yourself when the sequences have ended and call model.reset_states() manually.
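The slicing described above can be sketched in plain Python, without Keras. The helper name split_stateful and its steps_per_batch parameter are assumptions made for this illustration. It cuts every long sequence at the same positions, so that row i of each batch continues row i of the previous batch.

```python
def split_stateful(sequences, steps_per_batch):
    """Cut equally long sequences into consecutive batches.

    Row i of every returned batch comes from sequences[i], so row i of
    batch k + 1 is the continuation of row i of batch k.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    if length % steps_per_batch != 0:
        raise ValueError("sequence length must be a multiple of steps_per_batch")
    return [
        [seq[start:start + steps_per_batch] for seq in sequences]
        for start in range(0, length, steps_per_batch)
    ]

# The four sequences from the stateful=False example, one feature per step.
sequences = [[[v] for v in range(first, first + 6)] for first in range(4)]
batch_x1, batch_x2 = split_stateful(sequences, steps_per_batch=3)
print(batch_x1)
print(batch_x2)
```

The batches would then be fed to the model in this order, without shuffling, followed by the manual model.reset_states() call once all parts of the sequences have been processed.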

Splitting complex PDF files using Watson Document Conversion Service

We are implementing a question-answering system using Watson Discovery Service (WDS). We require each answer unit to be available as a single document. Our corpus consists of complex PDF files that contain two-column text, tables, and images. Instead of ingesting whole PDF files into WDS and using passage retrieval, we use Watson Document Conversion Service (WDC) to split each PDF file into answer units, and we then ingest these answer units into WDS.
We are facing two issues with the Watson Document Conversion service when splitting complex PDFs:
- We expect each heading to become a title and the corresponding text to become the data (answer). However, the service splits each chapter into a single answer unit. Is there any way to split a two-column document based on its headings?
- When the input PDF file contains a table, the service reads the structured data as plain text (the table formatting is lost). Is there any way to read structured data from a PDF into an answer unit?
I would recommend that you first convert your PDF to normalized HTML by using this setting:
"conversion_target": "normalized_html"
and inspect the generated HTML. Look for the places where headings (<h1>, <h2>, ..., <h6>) are detected. Those are the tags that will be used to split by answer units when you switch back to answer_units.
The reason you are currently seeing each chapter split into a single answer unit is that each chapter probably starts with a heading, but no headings are detected within the chapter.
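Inspecting the normalized HTML by hand gets tedious for long documents. As a rough aid, the sketch below uses Python's standard html.parser module to list every heading it finds. The class name HeadingLister and the sample_html string are made up for this illustration; in practice you would feed it the HTML returned by the conversion step.

```python
from html.parser import HTMLParser

class HeadingLister(HTMLParser):
    """Collect (tag, text) pairs for every h1..h6 element."""
    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []
        self._current = None  # tag of the heading being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._current = tag
            self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            self.headings.append((tag, "".join(self._text).strip()))
            self._current = None

# Made-up stand-in for the HTML produced by the conversion step.
sample_html = "<h1>Chapter 1</h1><p>Intro text.</p><h2>Setup</h2><p>More text.</p>"
lister = HeadingLister()
lister.feed(sample_html)
lister.close()
print(lister.headings)
```

If the list contains only one heading per chapter, that matches the behaviour described above: the content between two detected headings ends up in one answer unit.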
In order to generate more answer units, you will need to tweak the PDF input configurations as described here, so that more headings are generated from the PDF to HTML conversion step and hence more answer units are generated.
For example, the following configuration will detect headings at 6 different levels, based on certain font characteristics for each level:
{
  "conversion_target": "normalized_html",
  "pdf": {
    "heading": {
      "fonts": [
        {"level": 1, "min_size": 24},
        {"level": 2, "min_size": 18, "max_size": 23, "bold": true},
        {"level": 3, "min_size": 14, "max_size": 17, "italic": false},
        {"level": 4, "min_size": 12, "max_size": 13, "name": "Times New Roman"},
        {"level": 5, "min_size": 10, "max_size": 12, "bold": true},
        {"level": 6, "min_size": 9, "max_size": 10, "bold": true}
      ]
    }
  }
}
You can start with a configuration like this and keep tweaking it until the produced normalized HTML contains the headings at the places that you expect the answer units to be. Then, take the tweaked configuration, switch to answer_units and put it all together:
{
  "conversion_target": "answer_units",
  "answer_units": {
    "selector_tags": ["h1", "h2", "h3", "h4", "h5", "h6"]
  },
  "pdf": {
    "heading": {
      "fonts": [
        {"level": 1, "min_size": 24},
        {"level": 2, "min_size": 18, "max_size": 23, "bold": true},
        {"level": 3, "min_size": 14, "max_size": 17, "italic": false},
        {"level": 4, "min_size": 12, "max_size": 13, "name": "Times New Roman"},
        {"level": 5, "min_size": 10, "max_size": 12, "bold": true},
        {"level": 6, "min_size": 9, "max_size": 10, "bold": true}
      ]
    }
  }
}
Regarding your second question about tables, unfortunately there is no way to convert table content into answer units. As explained above, answer unit generation is based on heading detection. That being said, if there is a table between two detected headings, that table will be part of the answer unit as any other content between the two headings.

How to add lines between individual data points?

I'm trying to make a scatter plot in Highcharts that only connects two individual points to each other, but doesn't connect to any other points. (To show the change in a data point over time).
Here I illustrate my question. I'd like for there to be a line between the points
[20, 20] and [80, 80]
and a separate line connecting
[60, 40] to [85, 60]
but no line connecting
[80, 80] to [60, 40]
Is there an easily configurable way to do this, or do I have to manually render each line?
You can simply add a null between these points:
data: [[20, 20], [80, 80], null, [60, 40], [85, 60]]
Demo
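When the point pairs come from code rather than being typed by hand, the null separators can be inserted automatically. Here is a minimal Python sketch, assuming the chart data is serialized with the standard json module (which writes None as null); the helper name join_with_gaps is hypothetical.

```python
import json

def join_with_gaps(segments):
    """Concatenate point lists, placing None between consecutive segments."""
    data = []
    for i, segment in enumerate(segments):
        if i > 0:
            data.append(None)  # serialized as null, which breaks the line
        data.extend(segment)
    return data

segments = [
    [[20, 20], [80, 80]],
    [[60, 40], [85, 60]],
]
data = join_with_gaps(segments)
print(json.dumps(data))  # [[20, 20], [80, 80], null, [60, 40], [85, 60]]
```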

More efficient way to parse a matrix in python?

I have some data in a text file that looks like this:
1895723957
8599325893
5723857831
5025852920
and I'd like to parse it into a list of lists in Python, so the output is
[[1, 8, 9, 5, 7, 2, 3, 9, 5, 7], [8, 5, ...
Right now, I have
data = open('file.txt')
rows = [str(line).strip() for line in data]
matrix = []
for r in rows:
    matrix.append(list(r))
but are there different ways to do this, such as using fewer lines of code or exploiting comprehensions?
I've tried looking around, but I'm not exactly sure what keywords to use here...
Thanks much!
I'd try something like this:
with open('file.txt', 'r') as handle:
    # list() is needed in Python 3, where map() returns a lazy iterator
    matrix = [list(map(int, line.strip())) for line in handle]
I came up with the following way after playing around with comprehensions:
data = open('file.txt')
matrix = [[int(c) for c in row.rstrip()] for row in data]
rstrip is thanks to Blender above.
