Modx revolution: Conditional loading of chunks into website - modx-revolution

I employ a snippet to get the number of child elements, returning an integer:
[[!getChildCount? &id=`12345`]]
Depending on the result of this (zero or greater than zero), I want to load two different chunks into my website. Here is my "Modx-Pseudocode" of what I would like to achieve. Been fiddling for hours now and just can't find out about the right syntax. This is what I'd like to write inside the content field:
[[!If [[!getChildCount? &id=`45`]] > 0
then=`[load_chunk_A]`
else=`[load_chunk_B]`]]
Any hints of how this is properly expressed in Modx revolution?

Seems you are talking about IF extra that you should install, if you haven't already done it, to achieve your goal.
[[!If?
&subject=`[[!getChildCount? &id=`45`]]`
&operator=`>`
&operand=`0`
&then=`[[!$chunk_A]]`
&else=`[[!$chunk_B]]`
]]

Related

How to create a Save/Load function on Scratch?

Im trying to make a game on Scratch that will use a feature to generate a special code, and when that code is input into a certain area it will load the stats that were there when the code was generated. I've run into a problem however, I don't know how to make it and I couldn't find a clear cut answer for how to make it.
I would prefer that the solution be:
Able to save information for as long as needed (from 1 second to however long until it's input again.)
Doesn't take too many blocks to make, so that the project won't take forever to load it.
Of course i'm willing to take any solution in order to get my game up and running, those are just preferences.
You can put all of the programs in a custom block with "Run without screen refresh" on so that the program runs instantly.
If you save the stats using variables, you could combine those variable values into one string divided by /s. i.e. join([highscore]) (join("/") (join([kills]) (/))
NOTE: Don't add any "/" in your stats, you can probably guess why.
Now "bear" (pun) with me, this is going to take a while to read
Then you need the variables:
[read] for reading the inputted code
[input] for storing the numbers
Then you could make another function that reads the code like so: letter ([read]) of (code) and stores that information to the [input] variable like this: set [input] to (letter ([read]) of (code)). Then change [read] by (1) so the function can read the next character of the code. Once it letter ([read]) of (code) equals "/", this tells the program to set [*stat variable*] to (input) (in our example, this would be [highscore] since it was the first variable we saved) and set [input] to (0), and repeat again until all of the stats variables are filled (In this case, it repeats 2 times because we saved two variables: [highscore] and [kills]).
This is the least amount of code that it takes. Jumbling it up takes more code. I will later edit this answer with a screenshot showcasing whatever I just said before, hopefully clearing up the mess of words above.
The technique you mentioned is used in many scratch games but there is two option for you when making the save/load system. You can either do it the simpler way which makes the code SUPER long(not joking). The other way is most scratchers use, encoding the data into a string as short as possible so it's easy to transfer.
If you want to do the second way, you can have a look at griffpatch's video on the mario platformer remake where he used a encode system to save levels.https://www.youtube.com/watch?v=IRtlrBnX-dY The tips is to encode your data (maybe score/items name/progress) into numbers and letters for example converting repeated letters to a shorter string which the game can still decode and read without errors
If you are worried it took too long to load, I am pretty sure it won't be a problem unless you really save a big load of data. The common compress method used by everyone works pretty well. If you want more data stored you may have to think of some other method. There is not an actual way to do that as different data have different unique methods for things working the best. Good luck.

erlang list lines from position from end of the file

Is it possible to have a function that will return x lines of file from the end? The function will take parameter defining how far from end we want to read from(in lines measure) and how much lines we want to be returned from that position:
get_lines_file_end(IoDevice, LineNumberPositionFromEnd, LineCount) ->
Example:
We have file with 30 lines 0-29
get_lines_file_end(IoDevice, -10, 10) // will return lines 20-29
get_lines_file_end(IoDevice, -20, 10) // will return lines 10-19
The problem in this is that I can seek only with file:position by certain number of bytes ..
Purpose:
View large log file(hundreds of MB) in page manner starting from last "page".
Erlang is used for rest api which is used by javascript web.
The usage of such function is to view whole log files page by page, where page is represented by x lines of text. No processing of log files, or getting certain information of it is needed.
Thanks
Two points to be made:
To make this efficient you must create metadata about your text file contents to amortize the work involved. This way you can directly skip to the bits you need by seeking using file:position/2 after you have created this metadata.
If this is your use case then you should be partitioning the work differently. The huge text files should either be broken down into smaller text files, or (more likely) you shouldn't be using text files at all. Depending on what your goal is (which you haven't mentioned; I strongly suspect this is to be an X-Y problem) you probably don't want text at all but rather want to know something represented by the text. It may be a good idea to keep the raw text around somewhere just in case, but for actual processing of the data is is almost certainly a better idea to create symbolic data that (much more briefly) represents whatever you find interesting about the data, and store that in a database where seeking, scanning, indexing and doing whatever other things you might want are natural operations.
To build metadata about the files, you will need to do something analogous to:
1> {ok, Data} = file:read_file("TheLongDarkTeaTimeOfTheSoul.txt").
{ok,<<"Douglas Adams. The Long Dark Tea-Time of the Soul\r\n\r\n"...>>}
2> LineEnds = binary:matches(Data, <<"\r\n">>).
[{49,2},
{51,2},
{53,2},
{...}|...]
And then save LineEnds somewhere separately as meta about the file itself. Using this seeking within the file data is elementary (as in, use file:position/2 with the data at linebreak X, or at length(LineEnds) - X or whatever).
But this is still silly.
If you want to hop around within log files, and especially if you want to be able to locate patterns within them, count certain aspects of them, etc. then you would almost certainly do better reading them into a database like Postgres line by line, counting the line numbers as you go. At that point, pagination becomes a trivial issue.
Log files, however, are usually full of the sort of data that is best represented by symbols, not actual text, and it is probably an even better idea to tokenize the log file. Consider the case of access log files. A repeating number of visitors access from a finite number of access points (IPs, or devices, or whatever) an arbitrary number of times. Each aspect of this can be separately indexed and compared rather trivially within a database. The tokenization itself is rather trivial as well. Not only is this solution much faster when it comes to later analysis of the data, but it lends itself naturally to answering otherwise very difficult to answer questions about the contents of the data in a very straightfoward and familiar manner. ...And you don't even have to lose any of the raw data, or intermediate stages of processing (which may all be independently useful in different ways).
Also... note that all of the above work can be made parallel very easily in Erlang. Whatever your computing resource situation is, writing a solution that best leverages your hardware is certainly within grasp (assuming you have enough total data that this is even an issue).
Just like many "How to do X with data Y?" questions, the real answer is always going to revolve around "What is your goal regarding the data and why?"
You can use the file:read_line/1 function to read lines, discarding those that doesn't match your range:
get_lines(File, From) when From > 0 ->
get_lines(File, file:read_line(File), From, 1).
get_lines(_File, eof, _From, _Current) ->
[];
get_lines(File, {ok, _Line}, From, Current) when Current < From ->
get_lines(File, file:read_line(File), From, Current + 1);
get_lines(File, {ok, Line}, From, Current) ->
[Line|get_lines(File, file:read_line(File), From, Current + 1)];
get_lines(_IoDevice, Error, _From, _Current) ->
Error.

Generating data with Google Dataflow

Let's say I want to generate 100 trillion pieces of data (random numbers to keep it simple), and I'd like to use Google Dataflow to do it.
I can think of a dumb way to do this (I'm not 100% sure this would work, but this is where I'd start trying): take a text file that's 10 million lines long, and for every line in the input text file have a DoFn that loops for 10 million iterations, outputting a randomly generated number each iteration that are all eventually outputted to a text file. (whatever is in the original text file would just be ignored).
But I can't help but think there might be a better, less-hacky way to generate data using Dataflow. Any suggestions on a better way to do this?
Thank you!
For small dataset, you can just use pipeline.apply(Create.of(...)) to generate, but it won't scale (the generation code will be executed locally).
A better way may be:
List<Integer> l = ...; // 100k integers inside
pipeline.apply(Create.of(l)).apply(ParDo.of(new Generate100MDoFn())).apply(TextIO.Write.to(...));
so it will make dataflow generate a lot of data evenly in parallel.
Easy, just extend Source class with your own number generator: https://cloud.google.com/dataflow/model/custom-io

How to count occurrences of a substring within string fast with Ruby

I have a text file sized 300MB, I want to count the occurrences of each 10,000 substrings in the file. I want to know how to do it fast.
Now, I use the following code:
content = IO.read("path/to/mytextfile")
Word.each do |w|
w.occurrence = content.scan(w.name).size
w.save
end
Word is an ActiveRecord class.
It took me almost 1 day to finish the counting. Is there anyway to do it faster? Thanks.
Edit1:
Thank you again. I am running rails 2.3.9. The name filed of words table contains what I am searching for, and it contains only unique values. Instead of using Word.each, I use batch(1000 rows a time) load. It should help.
I rewrited the whole code with the idea from bpaulon. Now it only took a few hours to finish the counting.
I profiled the new version code, now the largest time costing methods are utf8 encode supported string truncating code
def truncate(n)
self.slice(/\A.{0,#{n}}/m)
end
and characters counting code
def utf8_length
self.unpack('U*').size
end
Any other faster methods to replace them?
Your use of scan creates an array, counts the size of it, then throws it away. If you have a lot of occurrences of the substring inside a big file, you will create a big array temporarily, potentially burning up CPU time with memory management, but that should still run pretty quickly, even with 300MB.
Because Word is an ActiveRecord class, it is dependent on the schema and any indexes in your database, plus any issues your database server might be having. If the database is not optimized or is responding slowly or the query used to retrieve the data is not efficient, then the iteration will be slow. You might find it a lot faster to grab groups of Word so they are in RAM, then iterate over them.
And, if the database and your code are running on the same machine, you could be suffering from resource constraints like having only one drive, not enough RAM, etc.
Without knowing more about your environment and hardware it's hard to say.
EDIT:
I can grab the substrings into an array/hash first, then add the count results to the array or hash, and write the results back to database after all the counting is done. You think it be faster, right?
No, I doubt that will help a lot, and, without knowing where the problem lies all you might do is make the problem worse because you'll have to load 10,000 records as objects from the database, then build a 10,000 element hash or array which will also be in memory along with the DB records, then write them out.
Ruby will only use a single core, currently, but you can gain speed by using Ruby 1.9+. I'd recommend installing RVM and letting it manage your Ruby. Be sure to read the instructions on that page, then run rvm notes and follow those directions.
What is your Word model and the underlying schema and indexes look like? Is the database on the same machine?
EDIT: From looking at your table schema, you have no indexes except for id which really won't help much for normal look-ups. I'd recommend presenting your schema on Stack Overflow's sibling site https://dba.stackexchange.com/ and explain what you want to do. At a minimum I'd add a key to the text fields to help avoid full table scans for any searches you do.
What might help more is to read: Retrieving Multiple Objects in Batches from "Active Record Query Interface".
Also, look at the SQL being emitted when your Word.each is running. Is it something like "select * from word"? If so, Rails is pulling in 10,000 records to iterate over them one by one. If it is something like "select * from word where id=1" then for every record you have a database read followed by a write when you update the count. That is the scenario that the "Retrieving Multiple Objects in Batches" link will help fix.
Also, I am guessing that content is the text you are searching for, but I can't tell for sure. Is it possible you have duplicated text values causing you to do scans more than once for the same text? If so, select your records using a unique condition on that field and then update your counts for all matching records at one time.
Have you profiled your code to see if Ruby itself can help you pinpoint the problem? Modify your code a little to process 100 or 1000 records. Start the app with the -r profile flag. When the app exits profiler will output a table showing where time was spent.
What version of Rails are you running?
I think you could approach this problem differently
You do not need to scan the file this many times, you could create a db, like in mongo or mysql, and for each word you find, you fetch the db for it and then adds on some "counter" field.
You could ask me "but then I will have to scan my database a lot and it could take a lot more". Well, sure you wouldn't ask this, but it won't take more time because databases are focused in IO, besides you could always index it.
EDIT: There is no way to delimit at all?? Let's say that where you have the a Word.name string you really holds a (not simple) regex. Could the regex contain the \n? Well, if the regex can contain any value, you should estimate the maximum size of string the regex can fetch, double it, and scan the file by that ammount of chars but moving the cursor by that number.
Lets say your estimate of the maximum your regex could fetch it is like 20 chars nad your file has from 0 to 30000 chars. You pass each regex you have from 0 to 40 chars, then again from 20 to 60, from 40 to 80, etc...
You should also hold the position you found of your smaller regex so it wouldn't repeat it.
Finally, this solution seems to be not worth the effort, your problem may have a greater solution based on what that regexes are, but it will be faster than invoke scan Words.count times your your 300Mb string.
You could load your entire "Word" table into a Trie, then do back-tracking since you said there are no delimiters in the text.
So for each character in the text, go down the Trie of words. If you hit a word, increment its count. "Going down the trie" involves three cases:
There's no node at this character. (If you're mid-search, pop the back-tracking stack)
There's a node at this character. (But it's not a Word)
There's a node at this character. (It's a Word - increment and "dirty")
Back-tracking is just keeping track of places you want to go after you've exhausted this "search" of the Trie, which is when you run out of nodes to visit. This will probably be each character you visit that is a root of the Trie.
After you've done this, you can then visit all the nodes you changed and just update the records they represent.
This will take some time to implement, but will surely be faster than each & scan.

Are helpers really faster than partials? What about string building?

I've got a fancy-schmancy "worksheet" style view in a Rails app that is taking way too long to load. (In dev mode, and yes I know there's no caching there, "Completed in 57893ms (View: 54975, DB: 855)") The worksheet is rendered using helper methods, because I couldn't stand maintaining umpteen teeny little partials for the different sorts of rows in the worksheet. Now I'm wondering whether partials might actually be faster?
I've profiled the page load and identified a few cases where object caching will shave a few seconds off, but the profile output suggests that a large chunk of time is spent simply looping through the Worksheet model's constituent objects and appending the string output from the helper. Here's an example of what I'm talking about:
def header_row(wksht)
content_tag(:thead, :class => "ioe") do
content_tag(:tr) do
html_row = []
for i in (0...wksht.class::NUM_COLS) do
html_row << content_tag(:th, h(wksht.column_headings[i].upcase),
:class => wksht.column_classes[i])
end
html_row.join("\n")
end
end
end
OTOH using partials means opening files, spinning off the Ruby interpreter, and in the long run, aggregating a bunch of strings, right? So I'm wondering whether there is another way to speed things up in the helpers. Should I be using something like a stringstream (does that exist in Ruby?), should I get rid of content_tag calls in favor of my own "" string interpolation... I'm willing to write my own performance tests, and share the results, if you have any suggested alternatives to the approach I've already taken.
As it's a fairly complex view (and has an editable version as well), I'd rather not rewrite-and-profile the whole thing more than once. :)
Some related reading:
http://www.viget.com/extend/helpers-vs-partials-a-performance-question/ (old)
http://www.breakingpointsystems.com/community/blog/ruby-string-processing-overhead/
http://blog.purepistos.net/index.php/2008/07/14/benchmarking-ruby-string-interpolation-concatenation-and-appending/
#tadman:
There are row totals and column totals (and more columnar arithmetic), and since they're not all just totals, but also depend on other "magic numbers" from the database, I implemented them in the Ruby code rather than Javascript. (DRY and unit testable.) Javascript is used only in the edit view, and just to add/delete rows (client side only) and to fetch a sheet with fresh totals when the cell contents change. It fetches the whole table because nearly half of the values get updated when an input cell changes.
The worksheet and its rows are actually virtual models; they don't live in the DB, but rather aggregate a boatload of real AR objects. They get created every time a view renders (but that takes 1.7 secs in dev mode, so I'm not worried about it).
I suppose I could transmit a matrix of numbers, rather than marked-up content, and have JS unpack it into the sheet. But that gets unmaintainable fast.
I ended up reading an excellent article at http://www.infoq.com/articles/Rails-Performance ("A Look At Common Performance Problems In Rails"). Then I followed the author's suggestion to cache computations during request processing:
def estimated_costs
#estimated_costs ||=
begin
# tedious vector math
end
end
Because my worksheet does stuff like the above over and over, and then builds on those results to calculate some more rows, this resulted in a 90% speedup right off the bat. Should have been plain as day, but it started with just a few totals, then I showed the prototype to the customer, and it snowballed from there :)
I also wondered whether my array-based math might be inefficient, so I replaced the Ruby Arrays of numbers with NArray (http://narray.rubyforge.org/). The speedup was negligible but the code's cleaner, so it's staying that way.
Finally, I put some object caching in place. The "magic numbers" in the database only change a few times a year at most, and some of them are encrypted, but they need to be used in most of the calculations. That's low-hanging fruit ripe for caching, and it shaved off another 1.25 seconds.
I'll look at eager loading of associations next, as there's probably some time to save there, and I'll do a quick comparison of sending "just the data" vs sending the HTML, as #tadman suggested. About the only partial I can cache is the navigation sidebar. All of the other content depends on the request parameters.
Thanks for your suggestions!
Internally all partials are translated into a block of executable Ruby code and run through exactly the same runtime as any helper methods. Periodically you can see glimpses of this when a malformed template causes the generated code to fail to compile.
Although it stands to reason that helper methods are faster than partials, and a straightforward string interpolation is faster still, it's hard to say if the performance gain from this would make it worth pursuing. Rendering a very large number of partials can be a bottleneck in terms of logging in the development environment, but in a production environment their impact seems less severe.
The only way to figure this one out is to benchmark your pages using two different rendering methods.
As you point out, caching is where you get the big gains. Using Memcached to save large chunks of pre-rendered HTML content can give you exponentially faster load times. Rendering 10,000 rows into HTML will always be slower than retrieving the same snippet from the Rails.cache subsystem.
It's also the case that the content you don't render is always rendered the quickest, so anything you can do to reduce the amount of content you generate for each helper call will provide big gains. If you're building a large spread-sheet style app that's entirely dependent on JavaScript, you may find that bundling up the data as a JSON array and expanding it client-side is significantly faster than unrolling the HTML on the server and shipping it over that way.

Resources