Distinguishing multiline strings that need indentation from those that do not


---+ BRIEF

This question is about indenting multiline strings, e.g. for Perl, in emacs' cperl-mode, so that they do not break up the flow of the code.

I know how to get the indentation I want: apply a transformation like =~ s/^[^\S\n]*\|//mgr to the string, and give emacs :around advice on cperl-calculate-indent that uses syntax-ppss to recognize when point is inside a string.

Trouble is, sometimes I want to do such indentation, and sometimes not.

I seek advice about conventions or best known methods (BKMs) for hinting which indentation is desired.

my $legacy = q{
This string
   should
not be indented};

my $needs_trimming = (q{
                          |BEGIN
                          |    SUB
                          |END
                      } =~ s/^[^\S\n]*\|//mgr); 

my $needs_different_trimming = (q{   + -
                                     , ? :
                                     | & ^
                                     || &&
                                } =~ s/\s+/ /gr);

I expect others have solved this problem before me.

---+ DETAIL:

I ask this as an emacs/cperl-mode question, but the problem is generic to languages that allow multiline strings: not just Perl, but also C/C++, JavaScript (see "Multiline strings that don't break indentation"), Lisp and elisp, etc.

I often write Perl code that uses multiline strings.

I dislike how multiline strings break indentation:

  if(cond) {
      my $var = q{
START Line crossing string needs
    to be indented
    differently than rest of code
FINISH
};
      my $var2 = ...
  }

I prefer to keep the same indentation, so I often do things like

   if(cond) {
      my $var = (q{
                  |START Line crossing string needs
                  |     to be indented
                  |     differently than rest of code
                  |FINISH
               } =~ s/^[^\S\n]*\|//mgr);
      my $var2 = ...
   }

   if(cond) {
      my $var = fix_string( {unindent=>'|',trim=>1,}, 
                  q{
                    |START Line crossing string needs
                    |     to be indented
                    |     differently than rest of code
                    |FINISH
                  } =~ s/^[^\S\n]*\|//mgr);
      my $var2 = ...
   }

OK, so this works fine, and I have been doing it for years, possibly decades.
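
For readers who want something concrete, here is a minimal sketch of what a helper like fix_string might look like. The option names unindent and trim come from the call above; the behaviour shown for them is an assumption for illustration, not my actual implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: option names mirror the call above, but what each
# option does here is an assumption made for this sketch.
sub fix_string {
    my ($opts, $str) = @_;
    if (defined(my $mark = $opts->{unindent})) {
        # Drop leading horizontal whitespace plus the marker on every line.
        $str =~ s/^[^\S\n]*\Q$mark\E//mg;
    }
    if ($opts->{trim}) {
        $str =~ s/\A\s+//;    # leading whitespace, including the first newline
        $str =~ s/\s+\z//;    # trailing whitespace, including the closing indent
    }
    return $str;
}

my $var = fix_string({ unindent => '|', trim => 1 }, q{
                    |START Line crossing string needs
                    |     to be indented
                    |FINISH
                  });
print "$var\n";
```

The marker is passed through \Q...\E so that characters such as | are matched literally rather than as regex syntax.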

What did not work so well in the past was indentation, using packages like perl-mode and cperl-mode.

OK, so I fixed that, using

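(defun cperl-calculate-indent--hack--/ag (orig-fun &rest args)
  "Hack / experiment: indent continuation lines of multiline strings.
Inside a string that began on an earlier line, indent 4 columns past
the start of the string; otherwise defer to ORIG-FUN with ARGS."
  (let ((state (syntax-ppss)))
    (cond
     ((and (nth 3 state)                    ; inside a string
           (nth 8 state)                    ; position where the string starts
           (< (nth 8 state) (line-beginning-position))) ; started on an earlier line
      (save-excursion
        (goto-start-of-string--ag)          ; my helper (not shown here)
        (+ 4 (current-column))))
     (t (apply orig-fun args)))))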

(advice-add 'cperl-calculate-indent :around #'cperl-calculate-indent--hack--/ag)
;;(advice-remove 'cperl-calculate-indent #'cperl-calculate-indent--hack--/ag)

So now I can indent multiline strings. If that is what I want to do.

Unfortunately, sometimes I do want to indent the multiline string. And sometimes I don't.

  • E.g. I may inherit legacy code that I do not want to accidentally break if I edit it in cperl-mode.
  • Furthermore, there are different types of indentation / transformations that I may do to such multiline strings:

    1. None - standard cperl-mode indentation for multiline strings
    2. Trim prefix at beginning of line, e.g. s/^[^\S\n]*\|//mg
    3. Sometimes I want the same sort of indentation that cperl-mode gives to qw{} - basically, I create my own qw that allows things like commas without warnings, s/\s+/ /g
    4. Sometimes I may actually want the string to be indented like HTML, or C, or ...
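
As an illustration of case 3, here is a minimal sketch of a qw-like helper built on s/\s+/ /g. The name my_qw is hypothetical (it is not a core Perl function), and the exact trimming is an assumption.

```perl
use strict;
use warnings;

# Hypothetical qw replacement: accepts commas and other punctuation
# without triggering the warning that qw{} gives for commas.
sub my_qw {
    my ($str) = @_;
    my $flat = $str =~ s/\s+/ /gr;   # collapse each run of whitespace to one space
    $flat =~ s/\A | \z//g;           # drop the single space left at either end
    return split / /, $flat;
}

my @ops = my_qw(q{   + -
                     , ? :
                     | & ^
                     || &&
                });
print join(' ', @ops), "\n";
```

Because whitespace inside such a string is insignificant, an editor can re-indent its lines freely, which is what distinguishes this case from case 1.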

I know how to do each of these. I can cons up a recognizer for any convention.

But... I am looking for suggestions, ideally standard practices or BKMs, as to how to make this distinction. Something that others will find readable.

E.g. I considered creating my own operators using PerlX::QuoteOperator my_q. But that is non-standard, may confuse others, and may break in future. Besides, what I really want to do is just provide an indentation hint, not change the language.

E.g. as explained above, I already use fixup functions fixup_string( q{ ... } ). I could pattern match on these. But this breaks when the user adds new fixup functions, etc.

E.g. I have considered looking at the first line of the string, e.g.

fixup_string( q{:
                 :not indented 
                 :     indented
              } )

This works for my prefix s/^[^\S\n]*://mgr, but not in other cases.
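
On the Perl side, the same first-line convention could be made generic by letting the string name its own marker. A minimal sketch, assuming the marker is a single punctuation character alone on the first line; the function name unindent_by_first_char is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: the character right after the opening delimiter,
# if it is the only thing on the first line, is taken as the marker.
sub unindent_by_first_char {
    my ($str) = @_;
    # No marker line: return the string untouched (the legacy case).
    return $str unless $str =~ s/\A([[:punct:]])[^\S\n]*\n//;
    my $mark = $1;
    # Strip leading horizontal whitespace plus the marker on every line.
    my $out = $str =~ s/^[^\S\n]*\Q$mark\E//mgr;
    $out =~ s/[^\S\n]+\z//;   # drop the indentation before the closing delimiter
    return $out;
}

my $text = unindent_by_first_char(q{:
                 :not indented
                 :     indented
              });
print $text;
```

This shares the weakness of any first-line heuristic: a legacy string whose first line really is a lone punctuation character would be rewritten.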

(I am currently doing (looking-at ".[~!@#$%^&*_+=|:;'?/]$"), which is easy enough, but which has obvious problems: what if I want a string to start with "|\n..."? I.e. it fails the test "don't break reasonably likely existing code".)

If Perl had a comment that was NOT to end of line, like C's /*...*/, I might do

fixup_string( q/*indent-here*/{|
                 | 
                 |
              } )

but Perl does not have such non-end-of-line comments. (Does it? Is there any syntactic trick that is the equivalent?)

Therefore my question.

I am sure that others have solved this problem before. I would love to know what they did.


In many ways this is just a special case of multiple modes inside the same buffer, https://emacswiki.org/emacs/MultipleModes. I would like to avoid the problems mentioned in https://www.emacswiki.org/emacs/HtmlModeDeluxe.
