F# parent/child lists to Excel - f#

I'm still new to F# so hopefully my question isn't too dumb. I'm creating an Excel file. I've done a lot of Excel with C# so that isn't a problem. I have a list of parent rows and then a list of child rows. What's the best way to spit that into Excel and keep track of the row in Excel that it belongs in.
Assuming my list rowHdr is a list of Row types, I have something like this:
let setCellText (x : int) (y : int) (text : string) =
let range = sprintf "%c%d" (char (x + int 'A')) (y+1)
sheet.Range(range).Value(Missing.Value) <- text
type Row =
{ row_id:string
parent_id:string
text:string }
let printRowHdr (rowIdx:int) (colIdx:int) (rowToPrint:Row) rows =
setCellText colIdx rowIdx rowToPrint.text
List.iteri (fun i x -> printRowHdr (i+1) 0 x rows) <| rowHdr
I still have trouble thinking about what the best functional approach is at times. Somewhere in the printRowHdr function I need to iterate through the child rows for the rows where the parent_id is equal to parent row id. My trouble is knowing what row in Excel it belongs in. Maybe this is totally the wrong approach, but I appreciate any suggestions.
Thanks for any help, I sincerely appreciate it.
Thanks,
Nick
Edited to add:
Tomas - Thanks for the help. Let's say I have two lists, one with US states and another with cities. The cities list also contains the state abbreviation. I would want to loop through the states and then get the cities for each state. So it might look something like this in Excel:
Alabama
Montgomery
California
San Francisco
Nevada
Las Vegas
etc...
Given those two lists could I join them somehow into one list?

I'm not entirely sure if I understand your question - giving a concrete example with some inputs and a screenshot of the Excel sheet that you're trying to get would be quite useful.
However, the idea of using ID to model parent/child relationship (if that's what you're trying to do) does not sound like the best functional approach. I imagine you're trying to represent something like this:
First Row
Foo Bar
Foo Bar
Second Row
More Stuff Here
Some Even More Neste Stuff
This can be represented using a recursive type that contains the list of items in the current row and then a list of children rows (that themselves can contain children rows):
type Row = Row of list<string> * list<Row>
You can then process the structure using recursive function. An example of a value (representing first three lines from the example above) may be:
Row( ["First"; "Row"],
[ Row( ["Foo"; "Bar"], [] )
Row( ["Foo"; "Bar"], [] ) ])
EDIT: The Row type above would be useful if you had arbitrary nesting. If you have just two layers (states and cities), then you can use list of lists. The other list containing state name together with a nested list that contains all cities in that state.
If you start with two lists, then you can use a couple of F# functions to turn the input into a list of lists:
let states = [ ("WA", "Washington"); ("CA", "California") ]
let cities = [ ("WA", "Seattle"); ("WA", "Redmond"); ("CA", "San Francisco") ]
cities
// Group cities by the state
|> Seq.groupBy (fun (id, name) -> id)
|> Seq.map (fun (id, cities) ->
// Find the state name for this group of cities
let _, name = states |> Seq.find (fun (st, _) -> st = id)
// Return state name and list of city names
name, cities |> Seq.map snd)
Then you can recursively iterate over the nested lists (in the above, they are actually sequences, so you can turn them to lists using List.ofSeq) and keep an index of the current row and column.

Related

How can I iterate over list items in Pandoc's lua-filter function?

Pandoc's lua filter makes it really easy to iterate over a document and munge the document as you go. My problem is I can't figure out how to isolate list item elements. I can find lists and the block level things inside each list item, but I can't figure out a way to iterate over list items.
For example let's say I had the following Markdown document:
1. One string
Two string
2. Three string
Four string
Lets say I want to make the first line of each list item bold. I can easily change how the paragraphs are handled inside OrderedLists, say using this filter and pandoc --lua-filter=myfilter.lua --to=markdown input.md
local i
OrderedList = function (element)
i = 0
return pandoc.walk_block(element, {
Para = function (element)
i = i + 1
if i == 1 then return pandoc.Para { pandoc.Strong(element.c) }
else return element end
end
})
end
This will indeed change the first paragraph element to bold, but it only changes the first paragraph of the first list item because it's iterating across all paragraphs in all list items in the list, not on each list item, then on each paragraph.
1. **One string**
Two string
2. Three string
Four string
If I separate the two list items into two separate lists again the first paragraph of the first item is caught, but I want to catch the first paragraph of every list item! I can't find anything in the documentation about iterating over list items. How is one supposed to do that?
The pandoc Lua filter docs have recently been updated with more info on the properties of each type. E.g., for OrderedList elements, the docs should say (it currently says items instead of content, which is a bug):
OrderedList
An ordered list.
content: list items (List of Blocks)
listAttributes: list parameters (ListAttributes)
start: alias for listAttributes.start (integer)
style: alias for listAttributes.style (string)
delimiter: alias for listAttributes.delimiter (string)
tag, t: the literal OrderedList (string)
So the easiest way is to iterate over the content field and change items therein:
OrderedList = function (element)
for i, item in ipairs(element.content) do
local first = item[1]
if first and first.t == 'Para' then
element.content[i][1] = pandoc.Para{pandoc.Strong(first.content)}
end
end
return element
end

How do I remove rows of an RDD whose key is not in another RDD?

Let's say I have a PairRDD, students (id, name). I would like to only keep rows where id is in another RDD, activeStudents (id).
The solution I have is to create a PairDD from activeStudents, (id, id), and the do a join with students.
Is there a more elegant way of doing this?
Thats a pretty good solution to start with. If active students is small enough you could collect the ids as a map and then filter with the id presence (this avoids having to a do a shuffle).
Much like you thought, you can do an outer join if both RDDs contain keys and values.
val students: RDD[(Long, String)]
val activeStudents: RDD[Long]
val activeMap: RDD[(Long, Unit)] = activeStudents.map(_ -> ())
val activeWithName: RDD[(Long, String)] =
students.leftOuterJoin(activeMap).flatMapValues {
case (name, Some(())) => Some(name)
case (name, None) => None
}
If you don't have to join those two data sets then you should definitely avoid it.
I had a similar problem recently and I successfully solved it using a broadcasted Set, which I used in UDF to check whether each RDD row (rather value from one of its columns) is in that Set. That UDF is than used as the basis for the filter transformation.
More here: whats-the-most-efficient-way-to-filter-a-dataframe.
Hope this helps. Ask if it's not clear.

Determine if record with field of certain value exists in list

I need to determine if a record with a given value exists in a list, what is the most efficient way to do this?
i think like this:
[ L || L = #record{state=determined} <- List ].
And the most efficient way is:
lists:any(fun(#record{state=deter}) -> true; (_) -> false end, List).
The first aproach is applicable if your list contains few records with determined field in the list and you'll get it all.
The second aproach is the most efficient because we are using standart library and if we'll get nedeed record we'll will stop iteration over the list.

How to add to an existing value in a map in Cypher?

I want to replace the value of the 'Amount' key in a map (literal) with the sum of the existing 'Amount' value plus the new 'Amount' value such where both the 'type' and 'Price' match. The structure I have so far is:
WITH [{type:1, Orders:[{Price:10,Amount:100},{Price:11,Amount:200},{Price:12,Amount:300}]},
{type:2, Orders:[{Price:10,Amount:100},{Price:11,Amount:200},{Price:12,Amount:300}]},
{type:3, Orders:[{Price:10,Amount:100},{Price:11,Amount:200},{Price:12,Amount:300}]}] as ExistingOrders,
{type:2, Order:{Price:11,Amount:50}} as NewOrder
(I'm trying to get it to:)
RETURN [{type:1, Orders:[{Price:10,Amount:100},{Price:11,Amount:200},{Price:12,Amount:300}]},
{type:2, Orders:[{Price:10,Amount:100},{Price:11,Amount:250},{Price:12,Amount:300}]},
{type:3, Orders:[{Price:10,Amount:100},{Price:11,Amount:200},{Price:12,Amount:300}]}] as CombinedOrders
If there is no existing NewOrder.type and NewOrder.Price then it should obviously insert the new record rather than add it together.
Sorry, this is possibly really straight forward, but I'm not very good at this yet.
thanks
Edit:
I should add, that I have been able to get this working for a simpler map structure as such:
WITH [{type:1, Amount:100},{type:2, Amount:200},{type:3, Amount:300}] as ExistingOrders,
{type:2, Amount:50} as NewValue
RETURN reduce(map=filter(p in ExistingOrders where not p.type=NewValue.type),x in [(filter(p2 in ExistingOrders where p2.type=NewValue.type)[0])]|CASE x WHEN null THEN NewValue ELSE {type:x.type,Amount:x.Amount+NewValue.Amount} END+map) as CombinedOrders
But I'm struggling I think because of the Orders[array] in my first example.
I believe you are just trying to update the value of the appropriate Amount in ExistingOrders.
The following query is legal Cypher, and should normally work:
WITH ExistingOrders, NewOrder, [x IN ExistingOrders WHERE x.type = NewOrder.type | x.Orders] AS eo
FOREACH (y IN eo |
SET y.Amount = y.Amount + CASE WHEN y.Price = NewOrder.Order.Price THEN NewOrder.Order.Amount ELSE 0 END
)
However, the above query produces a (somewhat) funny ThisShouldNotHappenError error with the message:
Developer: Stefan claims that: This should be a node or a relationship
What the message is trying to say (in obtuse fashion) is that you are not using the neo4j DB in the right way. Your properties are way too complicated, and should be separated out into nodes and relationships.
So, I will a proposed data model that does just that. Here is how you can create nodes and relationships that represent the same data as ExistingOrders:
CREATE (t1:Type {id:1}), (t2:Type {id:2}), (t3:Type {id:3}),
(t1)-[:HAS_ORDER]->(:Order {Price:10,Amount:100}),
(t1)-[:HAS_ORDER]->(:Order {Price:11,Amount:200}),
(t1)-[:HAS_ORDER]->(:Order {Price:12,Amount:300}),
(t2)-[:HAS_ORDER]->(:Order {Price:10,Amount:100}),
(t2)-[:HAS_ORDER]->(:Order {Price:11,Amount:200}),
(t2)-[:HAS_ORDER]->(:Order {Price:12,Amount:300}),
(t3)-[:HAS_ORDER]->(:Order {Price:10,Amount:100}),
(t3)-[:HAS_ORDER]->(:Order {Price:11,Amount:200}),
(t3)-[:HAS_ORDER]->(:Order {Price:12,Amount:300});
And here is a query that will update the correct Amount:
WITH {type:2, Order:{Price:11,Amount:50}} as NewOrder
MATCH (t:Type)-[:HAS_ORDER]->(o:Order)
WHERE t.id = NewOrder.type AND o.Price = NewOrder.Order.Price
SET o.Amount = o.Amount + NewOrder.Order.Amount
RETURN t.id, o.Price, o.Amount;
There's two parts to your question - one with a simple answer, and a second part that doesn't make sense. Let me take the simple one first!
As far as I can tell, it seems you're asking how to concatenate a new map on to a collection of maps. So, how to add a new item in an array. Just use + like this simple example:
return [{item:1}, {item:2}] + [{item:3}];
Note that the single item we're adding at the end isn't a map, but a collection with only one item.
So for your query:
RETURN [
{type:1, Orders:[{Price:10,Amount:100},
{Price:11,Amount:200},
{Price:12,Amount:300}]},
{type:2, Orders:[{Price:10,Amount:100},
{Price:11,Amount:**250**},
{Price:12,Amount:300}]}]
+
[{type:3, Orders:[{Price:10,Amount:100},
{Price:11,Amount:200},{Price:12,Amount:300}]}]
as **CombinedOrders**
Should do the trick.
Or you could maybe do it a bit cleaner, like this:
WITH [{type:1, Orders:[{Price:10,Amount:100},{Price:11,Amount:200},{Price:12,Amount:300}]},
{type:2, Orders:[{Price:10,Amount:100},{Price:11,Amount:200},{Price:12,Amount:300}]},
{type:3, Orders:[{Price:10,Amount:100},{Price:11,Amount:200},{Price:12,Amount:300}]}] as ExistingOrders,
{type:2, Order:{Price:11,Amount:50}} as NewOrder
RETURN ExistingOrders + [NewOrder];
OK now for the part that doesn't make sense. In your example, it looks like you want to modify the map inside of the collection. But you have two {type:2} maps in there, and you're looking to merge them into something with one resulting {type:3} map in the output that you're asking for. If you need to deconflict map entries and change what the map entry ought to be, it might be that cypher isn't your best choice for that kind of query.
I figured it out:
WITH [{type:1, Orders:[{Price:10,Amount:100},{Price:11,Amount:200},Price:12,Amount:300}]},{type:2, Orders:[{Price:10,Amount:100},{Price:11,Amount:200},{Price:12,Amount:300}]},{type:3, Orders:[{Price:10,Amount:100},{Price:11,Amount:200},{Price:12,Amount:300}]}] as ExistingOrders,{type:2, Orders:[{Price:11,Amount:50}]} as NewOrder
RETURN
reduce(map=filter(p in ExistingOrders where not p.type=NewOrder.type),
x in [(filter(p2 in ExistingOrders where p2.type=NewOrder.type)[0])]|
CASE x
WHEN null THEN NewOrder
ELSE {type:x.type, Orders:[
reduce(map2=filter(p3 in x.Orders where not (p3.Price=(NewOrder.Orders[0]).Price)),
x2 in [filter(p4 in x.Orders where p4.Price=(NewOrder.Orders[0]).Price)[0]]|
CASE x2
WHEN null THEN NewOrder.Orders[0]
ELSE {Price:x2.Price, Amount:x2.Amount+(NewOrder.Orders[0]).Amount}
END+map2 )]} END+map) as CombinedOrders
...using nested Reduce functions.
So, to start with it combines a list of orders without matching type, with a list of those orders (actually, just one) with a matching type. For those latter ExistingOrders (with type that matches the NewOrder) it does a similar thing with Price in the nested reduce function and combines non-matching Prices with matching Prices, adding the Amount in the latter case.

F# Excel Range.Sort Fails or Rearranges Columns

I have two cases. The preliminary code:
open Microsoft.Office.Interop.Excel
let xl = ApplicationClass()
xl.Workbooks.OpenText(fileName...)
let wb = xl.Workbooks.Item(1)
let ws = wb.ActiveSheet :?> Worksheet
let rows = string ws.UsedRange.Rows.Count
First, I try the following to sort:
ws.Sort.SortFields.Clear()
ws.Sort.SortFields.Add(xl.Range("A8:A" + rows), XlSortOn.xlSortOnValues, XlSortOrder.xlAscending, XlSortDataOption.xlSortNormal) |> ignore
ws.Sort.SortFields.Add(xl.Range("H8:H" + rows), XlSortOn.xlSortOnValues, XlSortOrder.xlAscending, XlSortDataOption.xlSortNormal) |> ignore
ws.Sort.SetRange(xl.Range("A7:I" + rows))
ws.Sort.Header <- XlYesNoGuess.xlYes
ws.Sort.MatchCase <- false
ws.Sort.Orientation <- XlSortOrientation.xlSortRows
ws.Sort.SortMethod <- XlSortMethod.xlPinYin
ws.Sort.Apply()
This results in "The sort reference is not valid. Make sure that it's within the data you want to sort, and the first Sort By box isn't the same or blank.".
Then I try the following to sort:
ws.Range("A7:I" + rows).Sort(xl.Range("A8:A" + rows), XlSortOrder.xlAscending,
xl.Range("H8:H" + rows), "", XlSortOrder.xlAscending,
"", XlSortOrder.xlAscending, XlYesNoGuess.xlYes,
XlSortOrientation.xlSortRows) |> ignore
This results in my columns being rearranged, though I don't see any logic to the rearrangement. And the rows are not sorted.
Please tell me what I'm doing wrong.
I haven't figured out why the first Sort attempt doesn't work. But I checked out the documentation on the second Sort. Using named parameters and dropping all but necessary parameters, I came up with the following:
ws.Range("A8:H" + rows).Sort(Key1=xl.Range("A8:A" + rows), Key2=xl.Range("H8:H" + rows),
Orientation=XlSortOrientation.xlSortColumns) |> ignore
The default Orientation is xlSortRows. I interpret this as sorting the rows, the normal, default, intuitive sort that Excel does when one sorts manually. Oh no. If you want the normal, default, intuitive sort that Excel does when one sorts manually, specify xlSortColumns.
The first Sort doesn't work because you're probably trying to sort a table (judging by property Header set to XlYesNoGuess.xlYes) by columns located on A and H. In this case you can only sort using "Sort top to bottom"(xlSortColumns).
You can also convince yourself by trying this in Excel first and see what options are available:
1) Select the range you want to apply sorting (e.g. in your case "A7:I" + rows)
2) Click on "Data" menu -> "Sort & Filter" task pane -> Sort
3) Add/Delete columns/rows by using "Add Level"/"Delete Level" buttons
4) Then you can select whatever level you want and click "Options..." button and you will see your available "Sort Options".
You will notice that in case of a table you will have only one available option for "Orientation", which is "Sort top to bottom"
xlSortColumns - sorts by columns (meaning rows are sorted based on those columns)
xlSortRows - sort by rows (meaning columns are sorted based on those rows)

Resources