JUNG: Custom DijkstraShortestPath

I discovered this library this week, and my first project works: I am modeling a simple flight-booking system.
For the edges I created a Flight class.
For the vertices I created an Airport class.
I gave each flight a duration and managed to run Dijkstra's algorithm (DijkstraShortestPath) on the graph:
class Airport {
    final String name;
    Airport(String name) { this.name = name; }
}
class Flight {
    final String flight;
    final int duration;
    Flight(String flight, int duration) { this.flight = flight; this.duration = duration; }
}

DirectedSparseMultigraph<Airport, Flight> g = new DirectedSparseMultigraph<Airport, Flight>();
Flight AXXX = new Flight("A57", 3);
Flight AYYY = new Flight("A53", 1);
Airport ORY = new Airport("ORY");
Airport LYS = new Airport("LYS");
g.addEdge(AXXX, ORY, LYS);
g.addEdge(AYYY, LYS, ORY);
// The weight of an edge is the duration of the flight.
Transformer<Flight, Integer> wtTransformer = new Transformer<Flight, Integer>() {
    @Override
    public Integer transform(Flight link) {
        return link.duration;
    }
};
DijkstraShortestPath<Airport, Flight> alg = new DijkstraShortestPath<Airport, Flight>(g, wtTransformer);
Number dist = alg.getDistance(ORY, LYS);
This simple case works well, but now I would like to compute the duration differently. For example:
Flight1 departs on 12/01/13 at 12:00 and arrives on 13/01/13 at 14:00
Flight2 departs on 13/01/13 at 18:00 and arrives on 13/01/13 at 20:00
In this case I want to count both the duration of the flights AND the time between flights, because to find the shortest path we need to account for the waiting time between flights and not only for the flight durations.
But DijkstraShortestPath only accepts an edge-weight Transformer such as Transformer<Flight, Integer>, so I can't get a reference to the previous flight to compute the total duration (wait + flight).
So my question is: what is the best approach for my case?
Create a new algorithm (inheriting from DijkstraShortestPath…)
Create a new graph type (inheriting from DirectedSparseMultigraph…)
Thanks for your answers ;) !

If you are trying to minimize total travel time, then this is, indeed, not a shortest path problem, but a different kind of discrete optimization problem. JUNG does not provide a general solver for discrete optimization problems.
Even if you're trying to minimize flight time only (that is, time spent in the air) then you need to be able to filter the graph (more precisely, the outgoing edges) at each step, because only flights that depart after the previous flight arrives are relevant, i.e., the local topology is a function of time.
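To illustrate the filtering step described above, here is a minimal sketch in plain Java of an earliest-arrival search. It does not use JUNG, and every name in it (TimedFlight, EarliestArrival, the NCE airport, the "TooEarly" flight) is hypothetical. It assumes that times are minutes counted from 12/01/13 00:00, that a flight never arrives before it departs, and that a connection needs no minimum transfer time. At each airport it only considers flights that depart at or after the time the traveller gets there.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical stand-in for the question's Flight class, extended with
// departure and arrival times (minutes since 12/01/13 00:00).
final class TimedFlight {
    final String id;
    final String from;
    final String to;
    final long departs;
    final long arrives;

    TimedFlight(String id, String from, String to, long departs, long arrives) {
        this.id = id;
        this.from = from;
        this.to = to;
        this.departs = departs;
        this.arrives = arrives;
    }
}

final class EarliestArrival {
    // Earliest time at which target can be reached when the traveller is at
    // source at startTime, or -1 if no usable sequence of flights exists.
    static long earliestArrival(List<TimedFlight> flights, String source,
                                String target, long startTime) {
        // Index the flights by departure airport.
        Map<String, List<TimedFlight>> outgoing = new HashMap<>();
        for (TimedFlight f : flights) {
            outgoing.computeIfAbsent(f.from, k -> new ArrayList<>()).add(f);
        }
        Map<String, Long> best = new HashMap<>();
        best.put(source, startTime);
        // Queue entries are (airport, arrival time), ordered by arrival time.
        PriorityQueue<Map.Entry<String, Long>> queue =
                new PriorityQueue<>(Map.Entry.<String, Long>comparingByValue());
        queue.add(new SimpleEntry<String, Long>(source, startTime));
        while (!queue.isEmpty()) {
            Map.Entry<String, Long> label = queue.poll();
            String airport = label.getKey();
            long now = label.getValue();
            if (now > best.get(airport)) {
                continue; // an earlier arrival at this airport is already known
            }
            if (airport.equals(target)) {
                return now;
            }
            List<TimedFlight> candidates =
                    outgoing.getOrDefault(airport, Collections.<TimedFlight>emptyList());
            for (TimedFlight f : candidates) {
                if (f.departs < now) {
                    continue; // the filter: this flight left before we arrived
                }
                Long known = best.get(f.to);
                if (known == null || f.arrives < known) {
                    best.put(f.to, f.arrives);
                    queue.add(new SimpleEntry<String, Long>(f.to, f.arrives));
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<TimedFlight> flights = Arrays.asList(
                new TimedFlight("Flight1", "ORY", "LYS", 720, 2280),
                new TimedFlight("Flight2", "LYS", "NCE", 2520, 2640),
                new TimedFlight("TooEarly", "LYS", "NCE", 2200, 2300));
        System.out.println(earliestArrival(flights, "ORY", "NCE", 0));
    }
}
```

With the two flights from the question, the hypothetical "TooEarly" flight is skipped because it leaves LYS before Flight1 lands, so the wait for Flight2 is counted in the result. Whether a search like this is enough depends on what exactly is being minimized; it is only meant to show where the time-dependent filter sits.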

Related

Forecasting.ForecastBySsa with Multiple variables as input

I've got this code to predict a time series. I want a prediction based on a time series of prices and a correlated indicator.
So, together with the value to forecast, I want to pass a side value, but I cannot tell whether it is taken into account, because the prediction doesn't change with or without it. How do I tell the algorithm to take these extra parameters into account?
public static TimeSeriesForecast PerformTimeSeriesProductForecasting(List<TimeSeriesData> listToForecast)
{
    var mlContext = new MLContext(seed: 1); // Fixed seed so that runs are deterministic.
    var productModelPath = "product_month_timeSeriesSSA.zip";
    if (File.Exists(productModelPath))
    {
        File.Delete(productModelPath);
    }
    IDataView productDataView = mlContext.Data.LoadFromEnumerable<TimeSeriesData>(listToForecast);
    var singleProductDataSeries = mlContext.Data.CreateEnumerable<TimeSeriesData>(productDataView, false).OrderBy(p => p.Date);
    TimeSeriesData lastMonthProductData = singleProductDataSeries.Last();
    const int numSeriesDataPoints = 2500; // Number of data points used for training and for forecasting.
    // Create and add the forecast estimator to the pipeline.
    IEstimator<ITransformer> forecastEstimator = mlContext.Forecasting.ForecastBySsa(
        outputColumnName: nameof(TimeSeriesForecast.NextClose),
        inputColumnName: nameof(TimeSeriesData.Close), // This is the column being forecasted.
        windowSize: 22, // Length of the window over the series used by the SSA model.
        seriesLength: numSeriesDataPoints, // This parameter specifies the number of data points that are used when performing a forecast.
        trainSize: numSeriesDataPoints, // This parameter specifies the total number of data points in the input time series, starting from the beginning.
        horizon: 5, // Number of values to forecast.
        confidenceLevel: 0.98f, // Indicates the likelihood the real observed value will fall within the specified interval bounds.
        confidenceLowerBoundColumn: nameof(TimeSeriesForecast.ConfidenceLowerBound), // Column that stores the lower interval bound for each forecasted value.
        confidenceUpperBoundColumn: nameof(TimeSeriesForecast.ConfidenceUpperBound)); // Column that stores the upper interval bound for each forecasted value.
    // Fit the forecasting model to the specified product's data series.
    ITransformer forecastTransformer = forecastEstimator.Fit(productDataView);
    // Create the forecast engine used for creating predictions.
    TimeSeriesPredictionEngine<TimeSeriesData, TimeSeriesForecast> forecastEngine = forecastTransformer.CreateTimeSeriesEngine<TimeSeriesData, TimeSeriesForecast>(mlContext);
    // Save the forecasting model so that it can be loaded within an end-user app.
    forecastEngine.CheckPoint(mlContext, productModelPath);
    ITransformer forecaster;
    using (var file = File.OpenRead(productModelPath))
    {
        forecaster = mlContext.Model.Load(file, out DataViewSchema schema);
    }
    // We must create a new prediction engine from the persisted model.
    TimeSeriesPredictionEngine<TimeSeriesData, TimeSeriesForecast> forecastEngine2 = forecaster.CreateTimeSeriesEngine<TimeSeriesData, TimeSeriesForecast>(mlContext);
    // Get the prediction; it contains as many forecasted values as the `horizon` parameter given when the forecast estimator was created.
    TimeSeriesForecast prediction = forecastEngine2.Predict();
    return prediction;
}
TimeSeriesData has multiple attributes, not only the value of the series that I want to forecast. I just wonder whether they are taken into account when forecasting or not.
Is there a better method to forecast this type of series, like LSTM? Is this method available in ML.NET?
There is a new ticket for enhancement: Multivariate Time based series forecasting to ML.Net
See ticket: github.com/dotnet/machinelearning/issues/5638

Dexie updates to a table with 2 compound keys get very slow in iOS

I am trying to sort out a performance issue when doing a data restore on iOS (i.e. data exists in a table, then I update some of it from a flat file backup). My actual case is 1250 times slower on iOS than on Windows. I started with the raindrops example from Dexie.bulkPut, which does not exhibit this behavior (it's slower, but only by about 15%), and gradually modified it to be more like what I need to do.
What I have found is that if my table has a single compound key, I can bulkPut the data twice, and it takes nearly the same amount of time both times. But if there are two compound keys, the second write takes about the same time on the computer, but much, much longer on iOS (times in seconds).
Records   Windows first write   Windows second write   iOS first write   iOS second write
20,000    2.393                 1.904                  5.057             131.127
40,000    5.231                 3.941                  9.533             509.616
60,000    7.808                 8.331                  14.188            1205.181
Here is my test program:
var db = new Dexie("test");
db.delete().then(function() {
    db.version(1).stores({
        raindrops: '[matrix+row+col], [matrix+done]'
    });
    db.open();
    var t1 = new Date();
    var t2, t3; // timestamps taken after the first and second bulkPut
    var drops = [];
    for (var i = 0; i < 20000; ++i) { // make a matrix
        drops.push({matrix: 0, row: Math.floor(i / 100) + 1, col: i % 100 + 1, done: i % 2});
    }
    db.raindrops.bulkPut(drops).then(function(lastKey) {
        t2 = new Date();
        console.log("Time in seconds " + (t2.getTime() - t1.getTime()) / 1000);
        db.raindrops.bulkPut(drops).then(function(lastKey) {
            t3 = new Date();
            console.log("Reputting -- Time in seconds " + (t3.getTime() - t2.getTime()) / 1000);
        });
    });
});
I am wondering if anyone has any suggestions. I need that second index but I also need for people to be able to do a restore in finite time (my users are not very careful about clearing browser data). Is there any chance of this performance improving? The fact that it's so much better for Windows suggests that it doesn't HAVE to be that way. Or is there a way I could drop the second index before doing a restore and then re-indexing? Done is 0 or 1 and that index is there so I can get a quick count of records with done set (or not set), but maybe it would be better to count manually.

How to write unit tests for session windows in a Beam pipeline?

I am writing a pipeline that is processing product events (create, update, delete). Each product belongs to a sale that has a certain duration. I want to be able to perform some aggregation on all the products in a given sale. For the purpose of this example, let's assume I just want a list of unique product IDs per sale.
Therefore, my pipeline is using session windows on the sale id with a very long gap duration (so when the sale closes and there are no more product updates being published, the window for that sale closes too). My question is, how do I write a unit test for that?
For the sake of this test, let's assume the following:
the events are just Strings with the sale ID and the product ID, separated by a space,
the applyDistinctProductsTransform will basically perform what I've said above. Create KV<String, String> elements where the key is the sale id; set session windows with a gap duration of 600 seconds; and finally create a concatenated string of all product IDs per sale.
Here is what I have so far:
I create a TestStream and add some elements: 3 products for sale1. Next, I advance the watermark to 700, well beyond the gap duration. Another product is added and finally the watermark is advanced to infinity.
@Test
public void TestSessionWindow() {
    Coder<String> utfCoder = StringUtf8Coder.of();
    TestStream<String> onTimeProducts =
        TestStream.create(utfCoder).addElements(
            TimestampedValue.of("sale1 product1", new Instant(0)),
            TimestampedValue.of("sale1 product2", new Instant(0)),
            TimestampedValue.of("sale1 product3", new Instant(0))
        )
        .advanceWatermarkTo(new Instant(700)) // watermark passes trigger time
        .addElements(
            TimestampedValue.of("sale1 product9", new Instant(710))
        )
        .advanceWatermarkToInfinity();
    PCollection<KV<String, String>> results = applyDistinctProductsTransform(pipeline, onTimeProducts);
    PAssert.that(results).containsInAnyOrder(
        KV.of("sale1", "product1,product2,product3"),
        KV.of("sale1", "product9")
    );
    pipeline.run().waitUntilFinish();
}
However:
1) The pipeline outputs a KV of sale1, product1,product2,product3,product9, so product9 is appended to the window. I would have expected this product to be processed in a separate window and hence to end up in a different row in the output PCollection.
2) How can I get only the results of a single window in the PAssert? I know there is the inWindow function and I've found an example for a fixed time window, but I don't know how to do the same for a session window.
You can check out the full code of the PTransform and the unit test.
1) I believe you have a simple unit issue. The window gap duration of 600 is specified in seconds (Duration.standardSeconds), yet new Instant(long) takes milliseconds. The 600-second gap is therefore larger than the 700-millisecond interval, which causes the sessions to be merged.
2) Sessions still use interval windows internally. You will need to compute what the output window would be after all sessions are merged based upon your trigger strategy. By default, a session window uses the IntervalWindow(timestamp, gap duration), and merges all overlapping windows to create a larger window. For example, if you had the windows (start time, end time), [10, 14], [12, 18], [4, 14] for the same session key, they would all be merged producing a single [4, 18] window.
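Both points can be checked with a few lines of plain Java. This is only an illustration of the merging arithmetic, not Beam code: SessionMergeSketch, protoWindow and merge are hypothetical names, windows are represented as long[] {start, end} in milliseconds, and they are treated as half-open intervals [start, end), which is how IntervalWindow defines its bounds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Plain-Java illustration of session merging; this is not Beam code.
final class SessionMergeSketch {
    // Window an element starts with: [timestamp, timestamp + gap), in millis.
    static long[] protoWindow(long timestampMillis, long gapMillis) {
        return new long[] {timestampMillis, timestampMillis + gapMillis};
    }

    // Merges half-open windows [start, end) that overlap each other.
    static List<long[]> merge(List<long[]> windows) {
        List<long[]> sorted = new ArrayList<>(windows);
        sorted.sort(Comparator.comparingLong((long[] w) -> w[0]));
        List<long[]> merged = new ArrayList<>();
        for (long[] w : sorted) {
            long[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && w[0] < last[1]) {
                last[1] = Math.max(last[1], w[1]); // overlap: extend the session
            } else {
                merged.add(new long[] {w[0], w[1]}); // gap exceeded: new session
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        long gapMillis = 600 * 1000L; // a 600-second gap expressed in millis
        // Timestamps as written in the test: 0 ms and 710 ms.
        List<long[]> asInTest = merge(Arrays.asList(
                protoWindow(0, gapMillis), protoWindow(710, gapMillis)));
        // Timestamps if 710 had been meant as seconds.
        List<long[]> inSeconds = merge(Arrays.asList(
                protoWindow(0, gapMillis), protoWindow(710 * 1000L, gapMillis)));
        System.out.println(asInTest.size() + " window(s) vs " + inSeconds.size());
    }
}
```

With the timestamps from the test the two elements fall into one merged window, while timestamps that are really 710 seconds apart give two windows. Computing the merged bounds this way is also how one could work out which window to pass to inWindow.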

ios- thread 1 exc_bad_instruction error in app

I am writing an app to simulate the NBA lottery. I have already written the code to generate the random combinations and to assign them to each team.
Here is my method to simulate the drawings and assign the draft positions to each team. standingsArray is an array of Team items of type ObjectWrapper, with values for name, seed, wins, losses, draft position etc. for each team. Basically, I have 14 balls and randomly choose 4 of them, which constitutes a combination (order doesn't matter). So there are 1001 possible combinations in total, but one is thrown out. (You can ignore the first while loop; it is only there so that the thrown-out combination isn't selected.) A number of combinations is assigned to each of the 14 lottery teams based on record (250 for the worst team, 199 for the second worst etc.). The standingsArray argument of my method already has the possibilities assigned to each team. Next, I randomly pull 4 balls from the total possibilities, and the team with that combination gets the first pick. Because none of the combinations of the selected team can be chosen again for the second pick, I would have to remove all of them, but that is very complicated. Instead, I make a new array called tempPossibilities, to which I append the combinations of every team except the one just selected; this then allows me to generate a new combination to select from.
However, I am getting an error at this line: for j in 0...(standingsArray[i].possibilities?.count)!-1{ It says bad instruction error, and I cannot figure out why. What also doesn't make sense is that the for loop works and the tempPossibilities array is fully populated with the correct number of combinations (without the lottery team), even though the error happens at the for loop.
The code is below. Any help is appreciated, thank you, and sorry for the really long paragraph.
func setDraftPositions(var standingsArray: [Team])->[Team]{
    var lottery: [Team]=[]
    var totalPossibilities: [[Int]]=combosOfLength(14, m: 4)
    var tempPossibilities = []
    var rand = Int(arc4random_uniform(UInt32(totalPossibilities.count)))
    var draw = totalPossibilities[rand]
    while (draw==(unused?.first)!) {
        rand = Int(arc4random_uniform(UInt32(totalPossibilities.count)))
        draw = totalPossibilities[rand]
    }
    s: for x in 0...13{
        for a in 0...(standingsArray[x].possibilities?.count)!-1{
            if(draw==standingsArray[x].possibilities![a]){
                standingsArray[x].setDraftingPosition(1)
                standingsArray[x].isLottery=true;
                lottery.append(standingsArray[x])
                for i in 0...(standingsArray.count-1) {
                    if(standingsArray[i].firstName != standingsArray[x].firstName!) {
                        for j in 0... (standingsArray[i].possibilities?.count)!-1{ //ERROR is happening here
                            tempPossibilities.append(standingsArray[i].possibilities![j])
                        }
                    }
                }
                standingsArray.removeAtIndex(x)
                break s;
            }
        }
    }
(repeat this for the next 2 picks)
Try this. The line
for j in 0...(standingsArray[i].possibilities?.count)!-1{
should be written like this:
for j in 0...(standingsArray[i].possibilities?.count)! - 1{
It needs proper spacing around the minus sign.

Weighted Graph DijkstraShortestPath : getpath() does not return path with least cost

Thanks for the prompt responses, chessofnerd and Joshua. I am sorry for the unclear logs and the unclear question. Let me rephrase it.
Joshua:
I store my weights in a DB and retrieve them from the DB in the transformer.
I have 4 devices connected in my topology. Between some devices there are multiple connections, and between 2 devices there is only a single connection, as shown below.
I am using an undirected weighted graph.
Initially all links are assigned a weight of 0. When I request a path between D1 and D4, I increase the weight of each link on that path by 1.
When a second request comes in for another path, I feed all the weights through the Transformer.
On that second request, I correctly feed a weight of 1 for links L1, L2, L3 and 0 for the other links.
Since the weight of (L4,L5,L3), (L6,L7,L3) or (L8,L9,L3) is less than the weight of (L1,L2,L3), I expect to get one of those paths. But I get (L1,L2,L3) again.
D1---L1-->D2---L2--->D3--L3--->D4
D1---L4-->D2---L5--->D3--L3--->D4
D1---L6-->D2---L7--->D3--L3--->D4
D1---L8-->D2---L9--->D3--L3--->D4
The transformer simply returns the weight previously stored for the link.
Graph topology = new UndirectedSparseMultigraph();
DijkstraShortestPath pathCalculator = new DijkstraShortestPath(topology, wtTransformer);
List path = pathCalculator.getPath(node1, node2);
private final Transformer<Link, Integer> wtTransformer = new Transformer<Link, Integer>() {
    public Integer transform(Link link) {
        int weight = getWeightForLink(link, true);
        return weight;
    }
};
You're creating DijkstraShortestPath so that it caches results. Add a "false" parameter to the constructor to change this behavior.
http://jung.sourceforge.net/doc/api/edu/uci/ics/jung/algorithms/shortestpath/DijkstraShortestPath.html
(And no, the cache does not get poisoned if you change an edge weight; if you do that, it's your responsibility to create a new DSP instance, or not use caching in the first place.)
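The effect of caching can be reproduced with a toy class that has nothing to do with JUNG's internals. In the sketch below, CachedLinkChooser is a hypothetical stand-in that picks the cheapest of several parallel links and, when caching is on, keeps returning its first answer without looking at the weights again. In JUNG itself the fix is the one given above, i.e. passing false as the third constructor argument: new DijkstraShortestPath(topology, wtTransformer, false).

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in, not JUNG code: chooses the cheapest of several parallel links
// and, when caching is on, keeps returning the first answer it computed.
final class CachedLinkChooser {
    private final Map<String, Integer> weights; // live view of the link weights
    private final boolean cached;
    private String remembered; // previous answer, only used when cached

    CachedLinkChooser(Map<String, Integer> weights, boolean cached) {
        this.weights = weights;
        this.cached = cached;
    }

    String cheapest(String... links) {
        if (cached && remembered != null) {
            return remembered; // the weights are not consulted again
        }
        String best = links[0];
        for (String link : links) {
            if (weights.get(link) < weights.get(best)) {
                best = link;
            }
        }
        if (cached) {
            remembered = best;
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Integer> weights = new HashMap<>();
        weights.put("L1", 0);
        weights.put("L4", 0);
        CachedLinkChooser caching = new CachedLinkChooser(weights, true);
        CachedLinkChooser fresh = new CachedLinkChooser(weights, false);
        System.out.println(caching.cheapest("L1", "L4") + " " + fresh.cheapest("L1", "L4"));
        weights.put("L1", 1); // the first request increased the weight of L1
        System.out.println(caching.cheapest("L1", "L4") + " " + fresh.cheapest("L1", "L4"));
    }
}
```

After the weight of L1 is raised, the caching instance still answers L1 while the non-caching one switches to L4, which mirrors the behaviour described in the question.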
