Ruby: Convert <br> to newline and URI-encode


I want to share some text on WhatsApp, so I'm converting HTML to plain text; otherwise it displays all the tags.

Currently I'm using strip_tags to remove the tags, but that also removes the line breaks from the text. How do I convert the HTML to text, turn <br> tags into newline characters, and URL-encode the result?

Currently I'm using the following:

@whatsapp_text = u strip_tags(@post.summary)

There are 2 best solutions below

davidb On BEST ANSWER

I suggest using Nokogiri to solve this problem. Nokogiri can parse HTML and turn a page's source into human-readable text. Although it doesn't convert HTML breaks to line breaks on its own, it takes care of many other problems for you. To use it, add the following line to your Gemfile:

gem 'nokogiri'

Run bundle install. Then you can solve your problem like this:

Nokogiri::HTML.parse(@post.summary.gsub("<br>", "\r\n").gsub("<br/>", "\r\n")).inner_text

That should do it for you.
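Note that the two literal gsub calls above only match "<br>" and "<br/>", so a variant such as "<br />" or "<BR>" would slip through, and the question's URL-encoding step is not covered. The following is a minimal sketch of the whole pipeline using only Ruby's standard library. The helper name html_to_encoded_text is hypothetical, and the regex-based tag stripping is a simplifying assumption that stands in for strip_tags or Nokogiri; it is only reasonable for simple, well-formed markup.

```ruby
require "cgi"
require "erb"

# Hypothetical helper (not part of Rails or Nokogiri): converts an HTML
# fragment to plain text with newlines, then percent-encodes it.
def html_to_encoded_text(html)
  # Match <br>, <br/>, <br /> and upper-case variants, and emit a newline.
  text = html.gsub(/<br\s*\/?>/i, "\n")
  # Naive tag stripping; an assumption that the markup is simple.
  text = text.gsub(/<[^>]+>/, "")
  # Decode entities such as &amp; back into plain characters.
  text = CGI.unescapeHTML(text)
  # Percent-encode: a newline becomes %0A and a space becomes %20.
  ERB::Util.url_encode(text)
end

encoded = html_to_encoded_text("Hello<br>world &amp; <b>friends</b><br />bye")
# => "Hello%0Aworld%20%26%20friends%0Abye"
```

In a Rails view, the u helper from the question is an alias for ERB::Util.url_encode, so the last step matches what the original code already does.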

ilyazub On

ActionView::Helpers::SanitizeHelper#sanitize with scrubber: :newline_block_elements option can preserve whitespace characters (ref: https://github.com/rails/rails-html-sanitizer/issues/154#issuecomment-1551819784).

Mentioning ActionView here because the question is tagged ruby-on-rails. It's possible to use the Loofah gem with Loofah::Scrubbers::NewlineBlockElements scrubber directly.

# $ rails console
helper.sanitize("<div><p>text<br><br></p><span>another text</span><p>wow nested paragraph!!</p></p>", scrubber: :newline_block_elements)
# => "\n\ntext\nanother text\nwow nested paragraph!!\n\n"
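The sanitized string still needs to be URL-encoded before it can go into a share link, which is the last part of the question. A short sketch of that step, using the output above copied as a literal so it runs without Rails; the wa.me link format is an assumption here and should be checked against WhatsApp's own documentation.

```ruby
require "erb"

# Output of the sanitize call above, copied as a string literal.
plain = "\n\ntext\nanother text\nwow nested paragraph!!\n\n"

# Trim the leading and trailing newlines, then percent-encode the rest.
encoded = ERB::Util.url_encode(plain.strip)
# => "text%0Aanother%20text%0Awow%20nested%20paragraph%21%21"

# Assumed share-link format; verify before relying on it.
share_url = "https://wa.me/?text=#{encoded}"
```

Stripping first avoids sending a message that starts and ends with blank lines; drop the strip call if that whitespace is wanted.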