Greedy Algorithm implementation - greedy

So I have some questions concerning the solution to the problem of scheduling n activities that may overlap using the least amount of classrooms possible. The solution is below:
Find the smallest number of classrooms to schedule a set of activities S in. To do this efefficiently
move through the activities according to starting and finishing times. Maintain two lists of classrooms: Rooms that are busy at time t and rooms that are free at time t. When t is the starting time
for some activity schedule this activity to a free room and move the room to the busy list.
Similarly, move the room to the free list when the activity stops. Initially start with zero rooms. If
there are no rooms in the free list create a new room.
The algorithm can be implemented by sorting the activities. At each start or finish time we can
schedule the activities and move the rooms between the lists in constant time. The total time is thus
dominated by sorting and is therefore O(n lg n).
My questions are
1) First, how do you move through the activities by both starting and finishing time at the same time?
2) I don't quite understand how it's possible to move the rooms between lists in constant time. If you want to move rooms from the busy list to the free list, don't you have to iterate over all the rooms in the busy list and see which ones have end times that have already passed?
3) Are there any 'state' variables that we need to keep track of while doing this to make it work?

The way the algorithm works, you need to create a list containing an element for each start time and an element for each end time (so 2n elements in total if there are n activities). Sort this list. When an end time and a start time are equal, sort the end time first -- this will cause back-to-back bookings for halls to work.
If you use linked lists for holding the free and booked halls, you can have the elements you created in step 1 hold pointers back to an activity structure, and this structure can hold a pointer to the list element containing the hall that this activity is assigned to. This will be NULL initially, and will take on a value when that hall is used for that activity. Then when that activity ends, its hall can be looked up in constant time by following two pointers from the activity-end element (first to the activity object, and from there to the hall element).
That should be clear from the above description, hopefully.

Related

Create a model that predicts an event based on other time series events and properties of an object

I have the following data:
Identifier of a person
Days in location (starts at 1 and runs until event)
Age of person in months at that time (so this increases as the days in location increase too).
Smoker (boolean), doesn't change over time in our case
Sex, doesn't change over time
Fall (boolean) this is an event that may never happen, or can happen multiple times during the complete period for a certain person
Number of wounds: (this can go from 0 to 8), a wound mostly doesn't heal immediately so it mostly stays open for a certain period of time
Event we want to predict (boolean), only the last row of a person will have value true for this
I have this data for 1500 people (in total 1500000 records so on average about 1000 records per person). For some people the event I want to predict takes place after a couple of days, for some after 10 years. For everybody in the dataset the event will take place, so the last record for a certain identifier will always have the event we want to predict as 1.
I'm new to this and all the documentation I have found so far doesn't demonstrate time series for multiple persons or objects. When I for example split the data in the machine learning studio, I want to keep records of the same person over time together.
Would it be possible to feed the system after the model is trained with new records and for each day that passes it would give the estimate of the event taking place in the next 5 days?
Edit: sample data of 2 persons: http://pastebin.com/KU4bjKwJ
sounds like very similar to this sample:
https://gallery.cortanaintelligence.com/Experiment/df7c518dcba7407fb855377339d6589f
Unfortunately there is going to be a bit of R code involved. Yes you should be able to retrain the model with new data.

Periodic snapshot fact table with large dimensions

I have been asked to model a star diagram.
I have 3 dimensions:
Date (day,month, year, week, quarter, ...)
place (500 distinct values)
Product (80k different products)
The main question is how many items (products) are stored at the end of a day in every place.
After some study-time with regards to dimensional modeling. I think I should implement a Periodic snapshot table. However reading trough the Kimball Docs, I noticed that a periodic snapshot demands an entry for every combination of the dimensions. This means I should add 40M rows every day (80k*500).
Knowing that the products are (real) slow movers and that many places store zero products during long periods, this sounds like an extreme overkill.
FYI the transactions in the source DB are 150k rows after three years.
So should I really add 40M rows every day, or could I just add the non-empty stores with their products specified? Also if for whatever reason one day all stores are empty, should I make an entry for that day (with dimensions N/A for store and product)?
You modeled correctly. It depends from the specifications, but normally you store only the products that are present in a location (you do not store zeroes), which could yield a number substantially lower than the maximum 80k.
If you want to further reduce your numbers, you could store the last N days and then start to move data in a "cold" table. You store (say) last 10 day snapshot, then only monthly snapshots in the main "hot" Fact Table.
Do not exclude the possibility to calculate the snapshot on the fly in report system, depending on your environment it could be easy (in MDX or DAX for example it is). Mixed solutions are also possible (i.e only the last month calculated on the fly).

Maintaining a Relative Order of Flex Tasks in a SQL DB

I have a task scheduling app that allows people to create 2 types of tasks...
•Strict- tasks with a set start time and duration
•Flex- tasks that have a duration, but no specific start time
Its also important to understand how flex tasks operate- Flex tasks will continuously reschedule themselves throughout your day in the nearest time you have open...so for example if the only task on your schedule today is a flex task like "Go workout - duration:60mins" and you open the app at 4pm it will have "Go workout" scheduled from 4-5pm for you , if you dont click the checkbox indicating you completed the task and open the app again at 5PM "Go workout will be rescheduled to 5-6pm so that the stuff you are meaning to get done is constantly in your face and trying to fit itself into the gaps of your life.
When a user views their schedule here are the steps I go through:
•Grab a array of all strict tasks
•Grab a array of all flex tasks
•Loop through each strict task and figure out how big of a time gap there is between the task currently being looped's end time and the next tasks start time.
•if a gap exists loop through the flex tasks and see if any of them will fit in the time gap in question. if a flex task is small enough to fit in the time gap add it to the strictTasksArray between the task being currently looped and the next task.
This works great as long as there is no need for any kind of ordering when it comes to flex tasks, but now we have added the ability for users to drag and drop flex tasks into a specific relative order aka if I have Task A,B,C,D
and I drag Task D & B to the front so that its now D,B,A,C it needs to save that ordering so that if you close and reopen the app the app will still remember to try to fit task D in , followed by B, A & C .....but im having big trouble thinking of a efficient way to do considering the ordering is relative and not strict...any ideas how to save relative ordering in a SQLIte DB without having to update every tasks's DB record every time a user drag/drops a task and changes the relative ordering?
If you have ever coded in Basic, you might remember numbering code lines. It was advisable to number in increments of 10 so that if later on you would have to insert a line or two you won't have to re-number all the code, just assign a new number in-between those of the previous and the next lines.
So, in your situation I would create a numeric field for Rank and for each new Flex task assign Rank = max(Rank) + 1024 (for example). Afterwards if the tasks are rearranged I would update just one "moved" task's Rank with the average Rank of it's new previous and next neighbours. That way any Rank change would be an update for one row only. Of cause if the Rank is int and I run out of integers in-between two tasks I would have to update them all, but that should be a rear occasion and I would just re-Rank them in new increments of 1024.
Sounds like you'd need some sort of either priority or order_number column to set the order in which the tasks come in. Just make it an int, and weight them accordingly. If you needed the DBMS to keep them in order using a query, you'd have to use sorting:
SELECT task_id, task_group_id, task_name, completed, priority
FROM tasks WHERE user = ? and task_group_id = ? and completed = 0 ORDER BY priority ASC
you can use some sort of foreign key to a task_group table to actually group certain tasks together if they're multipart, and then build a query to find all the ones that are either complete or incomplete. The weightage assigned would still be correct, because the tasks don't refer to each other by ID.

Algorithm for tracking changes in value over time

I am writing a rails app that deals with product inventory. I would like to include the following features, and am struggling with developing an efficient algorithm:
View stock history (how many were in stock on each date)
Quantity removed from warehouse, and quantity added to warehouse over specific periods of time
Amount of time the product was out of stock in any given period
My questions are as follows:
What is the best way of tracking changes? In addition to my Products
table, should I create another table called
HistoricProductQuantities, and insert a new record each time there
is a change in the quantity?
What number should I track? The historic stock quantity (i.e. 50 in
stock on this day, 24 in stock on that day), or the CHANGE in stock
quantity i.e. -5 (5 sold) or 15 (15 added to inventory)? Or do I
track both in separate tables?
Thanks for your help.
First of all I recommend implementing Date Dimensions on your application, as it seems like you will be doing a lot of Time related calculations. Search on Google for date dimensions as it's beyond the scope of your questions. That said, I believe it will be of great benefit for your app to implement and use date dimensions.
As far as your direct questions go:
What is the best way of tracking changes? In addition to my Products table, should I create another table called HistoricProductQuantities, and insert a new record each time there is a change in the quantity?
Yes you could do this, I would probably call it HistoricProductSnapshot and keep track of the product activity in there on daily basis. With this information as well as time dimensions you could do calculations such as "how many of Product X Did we have 5 days ago or a month ago etc etc."
What number should I track? The historic stock quantity (i.e. 50 in stock on this day, 24 in stock on that day), or the CHANGE in stock quantity i.e. -5 (5 sold) or 15 (15 added to inventory)? Or do I track both in separate tables?
I do not have experience writing inventory control software but I believe with the Snapshot table I mentioned on the question above you would only have to keep track of quantities per day. The Change in product counts could then be calculated from your snapshot table. You could for example have a function that will output the product amount in a given time range as an array. Example: From March 1 to March 7 these were the stock amounts for Product Y [45,40,39,27,22,45,44].
Hope that helps. As I said I am not a product inventory guy but I have worked with Point of Sales Systems and the procedure above should give you a could enough start for what you are trying to do.
This gem could be usefull for tracking changes in models https://github.com/collectiveidea/audited
Keep the data raw. I would personally create a new data entry every day, displaying how much items you have in stock per day. Or you can make the interval much shorter, such as every 12 hours.
For our particular use case:
We had a table called Days, which had a many to many relationship with products, and each "relationship" will have a value called quantity (to keep track of quantity of product per day). Additionally per relationship, we had another value for the relationship with transactions (a one to many relationship) that has the entries for the time of transaction and remaining stocks.
I would personally advise you to use the quantity of stock as the raw data, as it will enable you to gather the data such as how much items were removed during a certain transaction, when the item was out of stock and when it became in stock, all through the data. When you have data in which you need to perform statistical calculations on, it's best to store this data as raw values (quantity of the item).

How to calculate user's info completeness in Rails

I have problem in my new rails project.I want to implement a function which can show the user's info completeness by a bar like Linkedin.
I think I can use a variable to record the completeness,but I don't have any idea about how to calculate it.
P.S I have two Model,one is the User Model,another is the Info Model.
This is, in fact, completely arbitrary. It's based entirely on which activities on the site you want to encourage.
A couple of mechanisms you can consider:
Model "accomplishments" with a completed/not completed status. Count up the ones you care about. Store the accomplishments based on activity either as they happen or at the end of the day in some batch job. For each user, calculate the percentage with the usual math (accomplishments completed/sum of available accomplishments) * 100 = percentage.
A variation of the same, but weighted based on what you consider more valuable contributions. In this case, the math is basically sum of (weight n * accomplishment n)/total weight.
The previous Careers.stackoverflow.com model made a geeky joke about Spinal Tap by making it possible to have counts greater than 100%. You can do that simply by undercounting the maximum accomplishments.

Resources