How to split text per paragraph based on length? - ruby-on-rails

Hi, I am using RedCloth with Rails 3.
Currently I am splitting a long text based on the string "-BREAK-".
How do I split text based on character length without splitting in the middle of a sentence?
E.g.,
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Maecenas at purus eu nisl consequat mattis. Morbi pretium eros eget erat ornare elementum.
Vivamus in dui sit amet tellus bibendum volutpat. Sed lorem sem, porttitor at mattis quis, volutpat sed quam. Vestibulum eu justo nec dui ullamcorper molestie. Sed eleifend malesuada mattis. Curabitur eleifend elit vitae justo feugiat iaculis. Etiam sed lectus eu quam suscipit fermentum id a sem.
Phasellus sed odio eu urna gravida venenatis venenatis non justo. Praesent tincidunt velit adipiscing ligula pretium commodo. Cras blandit, nibh ac sagittis egestas, enim odio rutrum metus, vel hendrerit felis urna cursus odio. Maecenas elementum erat et arcu vulputate eu fermentum orci semper. Proin luctus purus sit amet nibh blandit cursus.
That will comprise one page. It's about 794 characters.

First you should split your text to single sentences.
Here's a simple, far-from-perfect way for doing this (I'm sure you could find plenty of more complete patterns elsewhere):
'Gsda asd. Gasd sasd. Tfed fdd.'.scan(/(.+?\.) ?/).map(&:first)
#=> ["Gsda asd.", "Gasd sasd.", "Tfed fdd."]
Then, you should join these sentences, keeping an eye on the paragraph length. You can use something like this:
# using words as units, but sentences are just the same:
s = ['foo', 'bar', 'beef', 'baz', 'hello', 'chunky', 'bacon']
LEN = 7 # minimum length of a paragraph
s.inject([]){|a,i|
  if !a.last || a.last.length > LEN
    a << i
  else
    a.last << " #{i}"
  end
  a
}
#=> ["foo bar beef", "baz hello", "chunky bacon"]
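Putting the two steps together could look roughly like the sketch below. It is only an illustration: the method names split_sentences and paginate are made up here, and unlike the snippet above it treats the length as a maximum per page rather than a minimum. A single sentence longer than the limit still gets a page of its own, and any trailing text without closing punctuation is ignored.

```ruby
# Hypothetical helper: a sentence is the shortest run of text that ends in
# ".", "!" or "?" and is followed by whitespace or the end of the input.
def split_sentences(text)
  text.scan(/.+?[.!?](?=\s|\z)/m).map(&:strip)
end

# Hypothetical helper: greedily pack whole sentences into pages of at most
# max_len characters, joining sentences with a single space.
def paginate(text, max_len)
  split_sentences(text).each_with_object([]) do |sentence, pages|
    current = pages.last
    if current && current.length + 1 + sentence.length <= max_len
      current << ' ' << sentence
    else
      pages << sentence.dup
    end
  end
end

pages = paginate('Gsda asd. Gasd sasd. Tfed fdd.', 20)
#=> ["Gsda asd. Gasd sasd.", "Tfed fdd."]
```

For the example in the question, something like paginate(text, 800) would keep each page under roughly 800 characters without cutting a sentence.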

I don't think there's any built-in logic for this, so you should just look for "." with a regex that also requires it to come straight after a word character (not whitespace) and to be followed by a space and a capital letter.
Edit: that should give you an array of occurrences from which you can pick the one closest to the character limit.
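A rough sketch of that idea, assuming the simple rule described above (a period right after a word character, followed by a space and a capital letter). The method name closest_break is made up for illustration, and note that the closest boundary may sit slightly past the limit.

```ruby
# Hypothetical helper: returns the index just past the sentence-ending period
# closest to `limit`, or nil when the text has no such boundary.
def closest_break(text, limit)
  ends = []
  # Collect the offset right after every matching period.
  text.scan(/(?<=\w)\.(?= [A-Z])/) { ends << Regexp.last_match.end(0) }
  ends.min_by { |pos| (pos - limit).abs }
end

text = 'Gsda asd. Gasd sasd. Tfed fdd.'
cut = closest_break(text, 12)        #=> 9
first_page = text[0...cut]           #=> "Gsda asd."
remainder = text[cut..-1].lstrip     #=> "Gasd sasd. Tfed fdd."
```

Calling the method again on the remainder would yield the next page, and so on until it returns nil.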

Related

Is it possible to control font size in the tables while using rmarkdown/bookdown?

I'm using bookdown to prepare some documents.
For some reasons I need to have bigger font size for headings and main text and smaller font-size for tables.
The simple minimal reproducible example is below:
---
papersize: a6
geometry: 'landscape'
site: bookdown::bookdown_site
output:
  bookdown::pdf_document2:
    latex_engine: xelatex
header-includes:
  - \usepackage[fontsize=15pt]{scrextend}
---
Below is a table with narrow first column and wide second column:
| **Seq** | **Description** |
|:---:|-------------|
| `1` | Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. |
| `2` | Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. |
Link to intermediate LaTeX file.
Is it possible to decrease font size for tables to make it smaller than now?
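You can force a smaller font size for longtable by adding
\AtBeginEnvironment{longtable}{\tiny}
to your header-includes. \AtBeginEnvironment is provided by the etoolbox package, so add \usepackage{etoolbox} as well if it is not already loaded.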

SwiftUI ZStack alignment bug

Here is a simple code that produces a strange top alignment and shrunk text of the second element inside ZStack.
BUT if you change the second text a bit (replace text2 by text2Alt1 or by text2Alt2) making it longer or shorter everything becomes correct.
What is the reason for this behavior?
struct VCardView: View {
    let text: String
    let color: UIColor
    var body: some View {
        VStack {
            Rectangle()
                .fill(Color(self.color))
                .frame(idealWidth: 800, idealHeight: 500)
                .aspectRatio(contentMode: .fit)
            Text(self.text)
        }
    }
}
struct ContentView: View {
    var body: some View {
        ZStack(alignment: .topLeading) {
            VCardView(text: text1, color: .blue)
                .frame(width: 180, height: nil)
                .alignmentGuide(.leading) { _ in 180 }
                .alignmentGuide(.top) { _ in 0 }
            //Replace text2 by text2Alt1 or text2Alt2 here:
            VCardView(text: text2, color: .green)
                .frame(width: 180, height: nil)
                .alignmentGuide(.leading) { _ in 0 }
                .alignmentGuide(.top) { _ in 0 }
        }
    }
let text1 = "1Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Integer tincidu!"
let text2 = "2Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis nat!"
let text2Alt1 = "2Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. C"
let text2Alt2 = "2Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis nat! Cum sociis nat!"
}
Also, you could replace Rectangle() by Image() with appropriate proportions because it was the initial state. Rectangle with the ideal size is just for demonstration.
Here I use ZStack with explicit alignment guides (not HStack) because it's a part of another library. It is essential.
Xcode 11.5
iOS 13.5 iPhone SE 2020
Bug looks like this:
Expected layout:
I've often seen such Text truncation problems in SwiftUI. The workaround that helps most of the time: ensure that Text is allowed to adjust its font size automatically using the minimumScaleFactor modifier.
By modifying VCardView
Text(self.text)
.minimumScaleFactor(0.5)
The misalignment problem disappears for all possible text values (text1, text2, text2Alt1 and text2Alt2).
The resulting text does not look scaled at all. Everything fits nicely.
I do not have a good explanation, but I think that if you tell SwiftUI that your Text is not 100% rigid, it does a better job of calculating the extents of the elements.
Perhaps it is a SwiftUI bug, but I was always OK with this minimumScaleFactor workaround as it hadn't negatively impacted my results.
It doesn't look like a ZStack problem. Try removing .aspectRatio(contentMode: .fit) from Rectangle() in VCardView.
For rectangle proportion check something like this.

Which is more performant: replacingOccurrences or components(separatedBy:).joined()

I wish to remove all intermediate spaces in a String.
Option 1:
func removingWhitespaces() -> String {
return replacingOccurrences(of: " ", with: "")
}
Option 2:
func removingWhitespaces() -> String {
return components(separatedBy: .whitespaces).joined()
}
Which is more performant?
From my point of view, option 1 is faster than option 2.
Reason:
In option 2, you chain the return value of components(separatedBy:) into joined(), so the final result comes from joined(), whereas in option 1 you directly call a single built-in String method.
As per my understanding, I would suggest Option 1, because
replacingOccurrences(of: " ", with: "")
performs only a single operation, whereas
components(separatedBy: .whitespaces).joined()
performs two operations and takes more time: first it separates the elements by whitespace and creates an array, then it performs a join on that array to give you the output.
Figure it out yourself. Basic performance testing is very simple in Xcode. In an XCTestCase class, run these 2 tests
func testPerformance1() {
    let string = "I wish to remove all intermediate spaces in a String"
    self.measure {
        for _ in 0..<10000 {
            _ = string.replacingOccurrences(of: " ", with: "")
        }
    }
}
func testPerformance2() {
    let string = "I wish to remove all intermediate spaces in a String"
    self.measure {
        for _ in 0..<10000 {
            _ = string.components(separatedBy: .whitespaces).joined()
        }
    }
}
and read the result in the console. replacingOccurrences is much faster.
There is no big difference between components(separatedBy: " ") and components(separatedBy: .whitespaces)
Space
When talking about performance, one should take space complexity into account. What's meant by this term is how much memory will be needed to run this piece of code and describes the relationship between the number of elements in the input and the reserved memory. For example, we talk about:
O(n) space complexity when the reserved memory grows with the number of elements in the input.
O(1) space complexity when the reserved memory doesn't grow when the number of the input elements grows.
Between replacingOccurrences(of: " ", with: "") and components(separatedBy: .whitespaces).joined(), the former wins on space complexity since the latter creates an intermediary array, and in performance, less is more.
Time
Given this string:
let str = "Lorem ipsum dolor sit amet, tempor nulla integer morbi, amet non amet pede quis enim, ipsum in a in congue etiam, aenean orci wisi, habitant ipsum magna auctor quo odio leo. Urna nunc. Semper mauris felis vivamus dictumst. Tortor volutpat fringilla sed, lorem dui bibendum ligula faucibus massa, dis metus volutpat nec ridiculus, ac vel vitae. At pellentesque, at sed, fringilla erat, justo eu at porttitor vestibulum hac, morbi in etiam sed nam. Elit consectetuer lorem feugiat, ante turpis elit et pellentesque erat nec, vitae a fermentum vivamus ut. Orci donec nulla justo non id quis, ante vestibulum nec, volutpat a egestas pretium aliquam non sed, eget vivamus vestibulum, ornare sed tempus. Suscipit laoreet vivamus congue, tempor amet erat nulla, nostrum justo, wisi cras ac tempor tincidunt eu, hac faucibus convallis. Ac massa aenean nunc est orci, erat facilisis. Aliquam donec. Ut blandit potenti quam quis pellentesque, cursus imperdiet morbi ea ut, non mauris consectetuer mauris risus vehicula in, sed rutrum pellentesque turpis. Eros gravida volutpat justo proin donec penatibus, suspendisse fermentum sed proin fringilla libero malesuada, nulla lectus ligula, aliquam amet, nemo quis est. Quis imperdiet, class leo, lobortis etiam volutpat lacus wisi. Vestibulum vitae, nibh sem molestie natoque. Elementum ornare, rutrum quisque ultrices odio mauris condimentum et, auctor elementum erat ultrices. Ex gravida libero molestie facilisi rutrum, wisi quam penatibus, dignissim elementum elit mi, mauris est elit convallis. Non etiam mauris pretium id, tempus neque magna, tincidunt odio metus habitasse in maecenas nonummy. Suspendisse eget neque, pretium fermentum elementum."
The benchmarking code is given below. Each code block will be run separately, while the others will be commented out:
do {
    let start = Date()
    let result = str.components(separatedBy: " ").joined()
    let end = Date()
    print(result.count, end.timeIntervalSince(start))
}
do {
    let start = Date()
    let result = str.split(separator: " ").joined()
    let end = Date()
    print(result.count, end.timeIntervalSince(start))
}
do {
    let start = Date()
    let result = str.filter { !$0.isWhitespace }
    let end = Date()
    // Print the filtered result (no variable named s exists in this block)
    print(result.count, end.timeIntervalSince(start))
}
do {
    let start = Date()
    var s = str
    s.removeAll { $0.isWhitespace }
    let end = Date()
    print(s.count, end.timeIntervalSince(start))
}
do {
    let start = Date()
    let result = str.components(separatedBy: .whitespaces).joined()
    let end = Date()
    print(result.count, end.timeIntervalSince(start))
}
do {
    let start = Date()
    var result = ""
    for char in str where char != " " {
        result.append(char)
    }
    let end = Date()
    print(result.count, end.timeIntervalSince(start))
}
do {
    let start = Date()
    let result = str.replacingOccurrences(of: " ", with: "")
    let end = Date()
    print(result.count, end.timeIntervalSince(start))
}
do {
    let start = Date()
    var arr = str.utf8CString
    // 32 is the ASCII code of the space character: remove the spaces and
    // keep everything else, including the terminating null byte
    arr.removeAll(where: { $0 == 32 })
    var result = ""
    arr.withUnsafeBufferPointer { ptr in
        result = String(cString: ptr.baseAddress!)
    }
    let end = Date()
    print(result.count, end.timeIntervalSince(start))
}
Compiled with optimization in the terminal using this command:
xcrun swiftc -O ReplaceStr.swift -o replaceStr
-O: with optimizations
ReplaceStr.swift: the name of the file containing the code. You should cd to the location of this file first.
-o: to specify the name of the output compiled file
replaceStr is an example name for the output file
And then run with ./replaceStr
After running each block multiple times, here are the best timings:
components(separatedBy: " ").joined() : 0.77ms
components(separatedBy: .whitespaces).joined() : 0.75ms
str.split(separator: " ").joined() : 0.54ms
filter { !$0.isWhitespace } : 0.52ms
removeAll { $0.isWhitespace } : 0.52ms
for char in str where char != " " : 0.26ms
replacingOccurrences(of: " ", with: "") : 0.23ms
str.utf8CString : 0.18ms
Comparable results were found with a shorter string:
let str = "Lorem ipsum dolor sit amet, tempor nulla integer morbi, amet non amet pede quis enim, ipsum in a in congue etiam, aenean orci wisi, habitant ipsum magna auctor quo odio leo."
Verdict
replacingOccurrences(of: " ", with: "") is faster than components(separatedBy: .whitespaces).joined() in these timings too. This is partially because replacingOccurrences(of:with:) is defined on NSString and not String. In a sense it's like comparing 🍏🍎 to 🍊🍊.
Manipulating the underlying CString beats them all 🥊 and is the overall best.
For more on benchmarking code, here is a good thread.
Using split with joined is faster than the other 2 options
import XCTest

class new: XCTestCase {
    func testOption1() {
        let string = String(repeating: "This is an example of a performance test case.", count: 10000)
        self.measure {//0.0231s
            _ = string.replacingOccurrences(of: " ", with: "")
        }
    }
    func testOption2() {
        let string = String(repeating: "This is an example of a performance test case.", count: 10000)
        self.measure {//0.194s
            _ = string.components(separatedBy: " ").joined()
        }
    }
    func testOption3() {
        let string = String(repeating: "This is an example of a performance test case.", count: 10000)
        self.measure {//0.0184s
            _ = string.split(separator: " ").joined()
        }
    }
}

Removing newlines in ruby JSON

So, I am serialising an object in Ruby on Rails into JSON format, using to_json. The output produced is:
'{"description":"---\n- Nulla adipisci quia consequuntur nam ab et. Eius enim ad aut. Asperiores recusandae\n labore exercitationem.\n quos provident.\n","id":295,"name":"Animi enim dolorem soluta eligendi inventore quia distinctio magni.","privacy":0,"updated_at":"2012-11-18T22:24:17Z","user_id":1}'
This needs to be parsed by JSON.parse to deserialise the object in client-side javascript. At the moment, this is failing because of the newline characters \n in the "description" value. I've tried to encode the characters appropriately using gsub("\n","\\n") and other permutations, but I can't seem to find a string or regular expression that will correctly match the newlines (and only the newlines). I have tried /\n/, '\n', "\n", "\\n" (this matches everywhere on the string, for some reason), /\\n/ and so on, but haven't been able to find anything. Any ideas what I am missing?
Update: here's the code (javascript, but with embedded ruby) I'm trying to use to populate the javascript object (it's in an ERB view, hence the angle brackets):
var object = JSON.parse('<%= raw @object.to_json %>');
to_json is not overridden in my object code, just the standard rails method.
j = %Q!{"description":"---\n- Nulla adipisci quia consequuntur nam ab et. Eius enim ad aut. Asperiores recusandae\n labore exercitationem.\n quos provident.\n","id":295,"name":"Animi enim dolorem soluta eligendi inventore quia distinctio magni.","privacy":0,"updated_at":"2012-11-18T22:24:17Z","user_id":1}!
j.gsub!(/\n/, '\\n')
JSON.parse j
# => {"description"=>"---\n- Nulla adipisci quia consequuntur nam ab et. Eius enim ad aut. Asperiores recusandae\n labore exercitationem.\n quos provident.\n", "id"=>295, "name"=>"Animi enim dolorem soluta eligendi inventore quia distinctio magni.", "privacy"=>0, "updated_at"=>"2012-11-18T22:24:17Z", "user_id"=>1}
Make your life easy, use single quotes around escaped characters when you need to manipulate them.
After the update…
var object = JSON.parse('<%= raw @object.to_json.gsub(/\n/, %q!\\n!) %>');
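It may help to check what to_json actually emits before choosing a pattern. The sketch below uses only Ruby's standard json library and a made-up hash standing in for the real object. It suggests that the output contains a backslash followed by "n" rather than a real line feed, which would explain why /\n/ finds nothing to replace. Doubling the backslashes is one possible way to keep that escape intact inside a single-quoted JavaScript string; it does not handle single quotes in the data.

```ruby
require 'json'

# Hypothetical record standing in for the serialised object.
record = { 'description' => "---\n- Nulla adipisci\n", 'id' => 295 }
json = record.to_json

# to_json has already escaped the line feeds: the output holds a backslash
# followed by "n", not an actual newline character.
has_real_newline = json.include?("\n")      # false
has_escaped_newline = json.include?('\n')   # true ('\n' is two characters)

# Double every backslash so the escape survives being pasted into a
# single-quoted JavaScript string literal (block form avoids the special
# meaning of backslashes in gsub replacement strings).
js_safe = json.gsub('\\') { '\\\\' }

# The unmodified JSON still round-trips to the original value.
round_trip = JSON.parse(json)['description'] == record['description']
```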
Your JSON includes a YAML string, so don't waste time trying to remove the line feeds, or you'll make things worse, or at least cause yourself to do too much work.
require 'json'
require 'yaml'
json = '{"description":"---\n- Nulla adipisci quia consequuntur nam ab et. Eius enim ad aut. Asperiores recusandae\n labore exercitationem.\n quos provident.\n","id":295,"name":"Animi enim dolorem soluta eligendi inventore quia distinctio magni.","privacy":0,"updated_at":"2012-11-18T22:24:17Z","user_id":1}'
hash = JSON[json]
puts YAML.load(hash['description'])
Outputs:
Nulla adipisci quia consequuntur nam ab et. Eius enim ad aut. Asperiores recusandae labore exercitationem. quos provident.
The JSON, after decoding back into a Ruby hash, looks like:
{"description"=>
"---\n- Nulla adipisci quia consequuntur nam ab et. Eius enim ad aut. Asperiores recusandae\n labore exercitationem.\n quos provident.\n",
"id"=>295,
"name"=>"Animi enim dolorem soluta eligendi inventore quia distinctio magni.",
"privacy"=>0,
"updated_at"=>"2012-11-18T22:24:17Z",
"user_id"=>1}
To turn it back into a true JSON string, with description not encoded as YAML, use:
hash['description'] = YAML.load(hash['description']).shift
puts hash.to_json
Which now looks like:
{"description":"Nulla adipisci quia consequuntur nam ab et. Eius enim ad aut. Asperiores recusandae labore exercitationem. quos provident.","id":295,"name":"Animi enim dolorem soluta eligendi inventore quia distinctio magni.","privacy":0,"updated_at":"2012-11-18T22:24:17Z","user_id":1}

How to change the colour of a certain sentence in an iOS UILabel?

Suppose I have text like this in a label
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam
nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat,
sed diam voluptua. At vero eos et accusam et justo duo dolores et ea
rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem
ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur
sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et
dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam
et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea
takimata sanctus est Lorem ipsum dolor sit amet.
How do I change the colour of the sentence "At vero eos et..." starting in the second line?
I don't believe this is possible with the UILabel class. However, it wouldn't be too hard to do this with some simple HTML in a UIWebView.
[myWebView loadHTMLString:@"Some normal text. <font color=\"red\">Some red text.</font>" baseURL:nil];
Let me know if this works well for you.
You can't do that with a standard UILabel. You'd either have to use multiple UILabels or take a look at CoreText which would also do what you want - http://developer.apple.com/library/mac/#documentation/StringsTextFonts/Conceptual/CoreText_Programming/Introduction/Introduction.html.
