Removing newlines in Ruby JSON - ruby-on-rails

So, I am serialising an object in Ruby on Rails into JSON format, using to_json. The output produced is:
'{"description":"---\n- Nulla adipisci quia consequuntur nam ab et. Eius enim ad aut. Asperiores recusandae\n labore exercitationem.\n quos provident.\n","id":295,"name":"Animi enim dolorem soluta eligendi inventore quia distinctio magni.","privacy":0,"updated_at":"2012-11-18T22:24:17Z","user_id":1}'
This needs to be parsed by JSON.parse to deserialise the object in client-side JavaScript. At the moment, this is failing because of the newline characters \n in the "description" value. I've tried to encode the characters appropriately using gsub("\n", "\\n") and other permutations, but I can't find a string or regular expression that correctly matches the newlines (and only the newlines). I have tried /\n/, '\n', "\n", "\\n" (this matches everywhere in the string, for some reason), /\\n/ and so on, but haven't been able to find anything that works. Any ideas what I am missing?
Update: here's the code (JavaScript, but with embedded Ruby) I'm trying to use to populate the JavaScript object (it's in an ERB view, hence the angle brackets):
var object = JSON.parse('<%= raw @object.to_json %>');
to_json is not overridden in my object's code; it's just the standard Rails method.

j = %Q!{"description":"---\n- Nulla adipisci quia consequuntur nam ab et. Eius enim ad aut. Asperiores recusandae\n labore exercitationem.\n quos provident.\n","id":295,"name":"Animi enim dolorem soluta eligendi inventore quia distinctio magni.","privacy":0,"updated_at":"2012-11-18T22:24:17Z","user_id":1}!
j.gsub!(/\n/, '\\n')
JSON.parse j
# => {"description"=>"---\n- Nulla adipisci quia consequuntur nam ab et. Eius enim ad aut. Asperiores recusandae\n labore exercitationem.\n quos provident.\n", "id"=>295, "name"=>"Animi enim dolorem soluta eligendi inventore quia distinctio magni.", "privacy"=>0, "updated_at"=>"2012-11-18T22:24:17Z", "user_id"=>1}
Make your life easy: use single quotes around escaped characters when you need to manipulate them.
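For instance, here's the difference in irb (a quick sketch; the sample string is made up):
"\n".length  # => 1; double quotes interpret the escape into a real newline
'\n'.length  # => 2; single quotes keep a literal backslash followed by an "n"
"a\nb".gsub(/\n/, '\\n')  # => "a\\nb" in inspect form; the raw text is a\nb, which is valid inside a JSON string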
After the update…
var object = JSON.parse('<%= raw @object.to_json.gsub(/\n/, %q!\\n!) %>');

Your JSON includes a YAML string, so don't waste time trying to remove the newlines; you'll make things worse, or at least do more work than necessary.
require 'json'
require 'yaml'
json = '{"description":"---\n- Nulla adipisci quia consequuntur nam ab et. Eius enim ad aut. Asperiores recusandae\n labore exercitationem.\n quos provident.\n","id":295,"name":"Animi enim dolorem soluta eligendi inventore quia distinctio magni.","privacy":0,"updated_at":"2012-11-18T22:24:17Z","user_id":1}'
hash = JSON[json]
puts YAML.load(hash['description'])
Outputs:
Nulla adipisci quia consequuntur nam ab et. Eius enim ad aut. Asperiores recusandae labore exercitationem. quos provident.
The JSON, after decoding back into a Ruby hash, looks like:
{"description"=>
"---\n- Nulla adipisci quia consequuntur nam ab et. Eius enim ad aut. Asperiores recusandae\n labore exercitationem.\n quos provident.\n",
"id"=>295,
"name"=>"Animi enim dolorem soluta eligendi inventore quia distinctio magni.",
"privacy"=>0,
"updated_at"=>"2012-11-18T22:24:17Z",
"user_id"=>1}
To turn it back into a true JSON string, with description not encoded as YAML, use:
hash['description'] = YAML.load(hash['description']).shift
puts hash.to_json
Which now looks like:
{"description":"Nulla adipisci quia consequuntur nam ab et. Eius enim ad aut. Asperiores recusandae labore exercitationem. quos provident.","id":295,"name":"Animi enim dolorem soluta eligendi inventore quia distinctio magni.","privacy":0,"updated_at":"2012-11-18T22:24:17Z","user_id":1}

Related

Is it possible to control font size in the tables while using rmarkdown/bookdown?

I'm using bookdown to prepare some documents.
For various reasons I need a bigger font size for headings and main text, and a smaller font size for tables.
A minimal reproducible example is below:
---
papersize: a6
geometry: 'landscape'
site: bookdown::bookdown_site
output:
  bookdown::pdf_document2:
    latex_engine: xelatex
header-includes:
  - \usepackage[fontsize=15pt]{scrextend}
---
Below is a table with a narrow first column and a wide second column:
| **Seq** | **Description** |
|:---:|-------------|
| `1` | Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. |
| `2` | Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. |
Is it possible to decrease the font size for tables so that it is smaller than the main text?
You can force a smaller font size for longtable by adding
\AtBeginEnvironment{longtable}{\tiny}
to your header-includes.
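For example, the front matter from the question would become (a sketch: \AtBeginEnvironment is defined by the etoolbox package, which is loaded explicitly here in case your template doesn't already pull it in):
header-includes:
  - \usepackage[fontsize=15pt]{scrextend}
  - \usepackage{etoolbox}
  - \AtBeginEnvironment{longtable}{\tiny}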

Splitting a text into chunks smaller than a specific size, at a newline, in F#

Let's say I have some text:
Lorem ipsum dolor sit amet, consectetur adipiscing elit,\n
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.\n
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris\n
nisi ut aliquip ex ea commodo consequat.\n
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore\n
eu fugiat nulla pariatur.\n
Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia\n
deserunt mollit anim id est laborum.\n
What is the most efficient way to cut it into chunks of x bytes, where the cut can only happen at a newline?
Two methods come to mind:
- split the text into lines, add lines to a buffer until the buffer is full, roll back the last line that caused the overflow, and repeat
- find the offset in the text at the buffer length and walk back to the previous newline, with proper handling of the beginning and end of the text
I couldn't find a solution online, but I can't believe that this problem hasn't already been solved many times, and there may be a common implementation of this.
Edit:
more information about my use case:
The code is for a Telegram bot which is used as a communication tool with an internal system.
Telegram allows up to 4kb per message and throttles the number of calls.
Right now I collect all the messages, put them in a concurrent queue, and then a task flushes the queue every second.
Messages can be a single line, can be a collection of lines and can sometimes be larger than 4kb.
I take all the messages (some being multiple lines in one block), aggregate them into a single string, then split the string on newlines, and then I can compose blocks of up to 4kb.
One additional problem I haven't tackled yet, but that's for later, is that Telegram will reject incomplete markup, so I will also need to cut the text based on that at some point.
It's not very efficient, and it labours under two assumptions: that you want to preserve the newline separators, and that the end of the string can be treated as a single newline. With those caveats, an implementation along the lines of your first approach is both functional and straightforward: just split into lines and combine them unless their combined length exceeds the threshold.
// Comma-separated output of the string lengths
// (plus 1 to compensate for the absence of the EOL)
let printLengths =
    Array.map (String.length >> (+) 1 >> string)
    >> String.concat ", "
    >> printfn "%s"
let text =
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore
eu fugiat nulla pariatur.
Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia
deserunt mollit anim id est laborum.
"
text.Split '\n' |> printLengths
// prints 57, 67, 67, 41, 77, 26, 74, 37, 1, 1
let foo n (text : string) =
    (text.Split '\n', [])
    ||> Array.foldBack (fun t -> function
        | x :: xs when String.length x + t.Length + 1 < n -> x + "\n" + t :: xs
        | xs -> t :: xs)
text |> foo 108 |> List.toArray |> printLengths
// prints 57, 67, 108, 77, 100, 39
Most common stream-related tasks are already implemented very efficiently in the BCL, so it's probably a good idea to stick with the tried-and-tested Stream classes.
let lipsum =
    """
Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore
eu fugiat nulla pariatur.
Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia
deserunt mollit anim id est laborum.
"""
open System.IO
open System.Text

use stream = new MemoryStream(Encoding.UTF8.GetBytes(lipsum))
use reader = new StreamReader(stream)

// A line that overflowed the previous block, carried over to the next one.
// (Seeking the underlying stream back by line.Length is unreliable here,
// because StreamReader buffers ahead of the stream position.)
let mutable pending : string option = None

let readBlock blockSize =
    let writer = StringBuilder(capacity = blockSize)
    let rec readNextLine () =
        let line =
            match pending with
            | Some l -> pending <- None; Some l
            | None when not reader.EndOfStream -> Some(reader.ReadLine())
            | None -> None
        match line with
        | Some l when writer.Capacity < l.Length + writer.Length ->
            // Block is full: keep the line for the start of the next block.
            pending <- Some l
        | Some l ->
            writer.AppendLine(l) |> ignore
            readNextLine ()
        | None -> ()
    readNextLine ()
    writer.ToString()

readBlock 300 |> printfn "%s"
You can just flush the queue by writing to the same MemoryStream, and call readBlock to keep getting new blocks of at most your specified size.

Filter using a predicate takes a lot of time

I have 40k strings in an array, and I want to filter the array so that I get only the matched strings. I have some preconditions: matches can have a separator in between, it should be a whole-word search, and the search can contain multiple words. So I went with a regex, and it's taking a lot of time.
The following code was put together just for representation purposes:
var arr = [String]()
for index in stride(from: 0, to: 40000, by: 1) {
    arr.append("Lorem ipsum dolor sit er elit lamet, consectetaur cillium adipisicing pecu, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. Nam liber te conscient to factor tum poen legum odioque civiuda.")
}
// We specify the words to be searched here
let searchTexts = ["aliqua", "Ut"]
// The time the execution started
print(Date().timeIntervalSince1970)
let predicate = NSPredicate(format: "SELF matches[cd] %@", ".*\\b\(searchTexts.joined(separator: "[ ,.!?;:\"(')-]*"))\\b.*")
let fil = arr.filter { (str) -> Bool in
    return predicate.evaluate(with: str)
}
// The time the execution stopped
print(Date().timeIntervalSince1970)
The time taken is 2 seconds in the iOS simulator, and it takes longer on devices.
How can I improve the regex? I have searched a lot of sites, but nothing helped.
Edit:
The above question has been modified since it involved Core Data.
My actual question now is: how do we apply the same logic to a Core Data fetch?
Do not use a method that requires a whole string match if all you need is a partial match. NSPredicate with MATCHES requires a full string match and you have to use .* or similar to ensure that. However, the .* greedy dot pattern grabs the whole line and then backtracks to accommodate text for the subsequent patterns. The more patterns there are after .*, the less efficient the pattern is.
You need to use a method that will allow partial matches and thus will let you get rid of .*, e.g. a range(of:options:range:locale:) while passing the .regularExpression option.
In your scenario above, you may remove the let predicate = NSPredicate(format: "SELF matches[cd] %@", ".*\\b\(searchTexts.joined(separator: "[ ,.!?;:\"(')-]*"))\\b.*") line and replace return predicate.evaluate(with: str) with
return str.range(of: "\\b\(searchTexts.joined(separator: "[ ,.!?;:\"(')-]*"))\\b", options: .regularExpression) != nil
See the new regex demo (56 steps) versus the original regex demo (541 steps). Note that matches[cd] was case- and diacritic-insensitive; to keep that behaviour, pass options: [.regularExpression, .caseInsensitive, .diacriticInsensitive].

How to change the colour of a certain sentence in an iOS UILabel?

Suppose I have text like this in a label
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam
nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat,
sed diam voluptua. At vero eos et accusam et justo duo dolores et ea
rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem
ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur
sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et
dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam
et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea
takimata sanctus est Lorem ipsum dolor sit amet.
How do I change the colour of the sentence "At vero eos et..." starting in the second line?
I don't believe this is possible with the UILabel class. However, it wouldn't be too hard to do this with some simple HTML in a UIWebView.
[myWebView loadHTMLString:@"Some normal text. <font color=\"red\">Some red text.</font>" baseURL:nil];
Let me know if this works well for you.
You can't do that with a standard UILabel. You'd either have to use multiple UILabels, or take a look at Core Text, which would also do what you want: http://developer.apple.com/library/mac/#documentation/StringsTextFonts/Conceptual/CoreText_Programming/Introduction/Introduction.html

How to split text per paragraph based on length?

Hi, I am using RedCloth on Rails 3.
Currently I split a long text based on the string "-BREAK-".
How do I split the text based on character length without splitting in the middle of a sentence?
E.g.,
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Maecenas at purus eu nisl consequat mattis. Morbi pretium eros eget erat ornare elementum.
Vivamus in dui sit amet tellus bibendum volutpat. Sed lorem sem, porttitor at mattis quis, volutpat sed quam. Vestibulum eu justo nec dui ullamcorper molestie. Sed eleifend malesuada mattis. Curabitur eleifend elit vitae justo feugiat iaculis. Etiam sed lectus eu quam suscipit fermentum id a sem.
Phasellus sed odio eu urna gravida venenatis venenatis non justo. Praesent tincidunt velit adipiscing ligula pretium commodo. Cras blandit, nibh ac sagittis egestas, enim odio rutrum metus, vel hendrerit felis urna cursus odio. Maecenas elementum erat et arcu vulputate eu fermentum orci semper. Proin luctus purus sit amet nibh blandit cursus.
That will comprise one page; it's about 794 characters.
First you should split your text into single sentences.
Here's a simple, far-from-perfect way of doing this (I'm sure you could find plenty of more complete patterns elsewhere):
'Gsda asd. Gasd sasd. Tfed fdd.'.scan(/(.+?\.) ?/).map(&:first)
#=> ["Gsda asd.", "Gasd sasd.", "Tfed fdd."]
Then you should join these sentences, keeping an eye on the paragraph length. You can use something like this:
# using words as units, but sentences are just the same:
s = ['foo', 'bar', 'beef', 'baz', 'hello', 'chunky', 'bacon']
LEN = 7 # minimum length of a paragraph
s.inject([]) { |a, i|
  if !a.last || a.last.length > LEN
    a << i
  else
    a.last << " #{i}"
  end
  a
}
#=> ["foo bar beef", "baz hello", "chunky bacon"]
I don't think there's any built-in logic for this, so you should just look for "." with a nice regex, also specifying that it has to come straight after a word (not whitespace), followed by a space and a capital letter.
Edit: that should give you an array of occurrences from which you can pick the one closest to the character limit.
