Some quick thoughts on the first two problems listed: Inscrutability - use the `...

andrewl-hn · on May 28, 2020

Also, named capture groups. Often I find my code much more readable when I name different fragments of the regex and then refer to those names later on in my code.

brigandish · on May 28, 2020

Yes, that would be my next top tip too.

    SHARPS_AND_FLATS = /\^+ | _+ | =/x
    NOTES = /[a-g]/i

    p = %r<^
      (?<pitch>#{SHARPS_AND_FLATS})?  # sharps and flats
      (?<notes>#{NOTES})              # notes
      ( ,+ | '+ )?
      ( \d* )                # duration
      ( / )?                  # fractions
      ( \d* )                # more duration? I'd need to read the spec
    $>x

    music = "^A2"  # sample input, assumed here for illustration
    match_data = p.match(music)
    match_data[:pitch] # => "^"
    match_data[:notes] # => "A"

Much more readable.

njonsson · on May 28, 2020

I got a lot of mileage out of the x modifier, too, until I bumped into the Composed Regex pattern. https://www.martinfowler.com/bliki/ComposedRegex.html

brigandish · on May 28, 2020

Nice explanation (as expected from Fowler). Yep, mix that with a bit of interpolation and an iterator, maybe wrapped in an object and you have something readable and flexible. Not sure why people want to make regex any harder than they need to be?
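A rough sketch of what that could look like in Ruby. The class name, the line format ("score 400 for 2 nights 3 days") and the group names are all made up for illustration; this is not Fowler's code, just small regexes interpolated into a bigger one and wrapped in an object:

```ruby
# Hypothetical example: every name and the input format are assumptions.
class ReservationLine
  NUMBER = /\d+/
  SCORE  = /score \s+ (?<score>#{NUMBER})/x
  NIGHTS = /(?<nights>#{NUMBER}) \s+ nights?/x
  DAYS   = /(?<days>#{NUMBER}) \s+ days?/x

  # The full pattern reads almost like the line it describes.
  PATTERN = /\A #{SCORE} \s+ for \s+ #{NIGHTS} \s+ #{DAYS} \z/x

  # Returns a hash of the named captures as integers, or nil if no match.
  def self.parse(line)
    m = PATTERN.match(line)
    return nil unless m
    m.names.to_h { |name| [name.to_sym, m[name].to_i] }
  end

  # Iterates over several lines, dropping the ones that do not match.
  def self.parse_all(lines)
    lines.map { |line| parse(line) }.compact
  end
end

ReservationLine.parse("score 400 for 2 nights 3 days")
# => {:score=>400, :nights=>2, :days=>3}
```

Each piece can be tested on its own, and the named groups survive interpolation, so the calling code never has to count parentheses.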

riffraff · on May 28, 2020

In Perl's "Apocalypse" on pattern matching[0] (i.e. one of the Perl 6 regex design documents), Larry Wall immediately states that /x should be the default (and not be an option at all!) for regexes.

It's incredible how useful it is, and yet by not making it the default, every regex implementation has nudged its users to write inscrutable regular expressions for decades.

[0] https://raku.org/archive/doc/design/apo/A05.html

brigandish · on May 28, 2020

I must give Perl 6 another look, I was using 5 up until about 2009 when I replaced it with Ruby. Be nice to see what's been done in that time.

lizmat · on June 1, 2020

Please note that Perl 6 has been renamed to Raku (https://raku.org using the #rakulang tag on social media).

wodenokoto · on May 28, 2020

Do you have a tutorial or article you recommend to read up on x-modifier?

brigandish · on May 28, 2020

My favourite regex site is Rexegg[1]. I thought it had died so I was using archive.org but it appears to be back. Unmatched, in my opinion.

Still, there's not much to the `x` modifier. You just need to know that it ignores literal whitespace and comments (# blah blah) in the pattern, so you must use `\s` or `\t` etc. when you really want to match some whitespace. Otherwise you can do things like:

    "Hello, world!".split(/ [ \s , ]+ # this splits on at least one space or comma /x)

Which is overkill for a simple pattern (then again, why not?) but unbelievably useful for a complex one. Add spaces and newlines until it's readable.
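A small sketch of both points in Ruby: ignored whitespace first, then a pattern spread over several lines. The constant names and the date format are just assumptions for the example:

```ruby
# Under /x the literal space in the pattern is ignored,
# so this is really the pattern "helloworld".
LOOSE = /hello world/x
LOOSE.match?("helloworld")   # => true
LOOSE.match?("hello world")  # => false

# To match actual whitespace, say so explicitly with \s.
STRICT = /hello \s world/x
STRICT.match?("hello world") # => true

# Hypothetical date pattern, laid out one piece per line.
DATE = /
  (?<year>  \d{4} ) -   # four-digit year
  (?<month> \d{2} ) -   # two-digit month
  (?<day>   \d{2} )     # two-digit day
/x

m = DATE.match("2020-05-28")
m[:year]  # => "2020"
m[:month] # => "05"
m[:day]   # => "28"
```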

It's a small change but a big effect.

[1] https://www.rexegg.com/regex-modifiers.html

amarshall · on May 28, 2020

Implementations may vary, so check your regex library’s docs (usually listed under “flags”). Generally there’s not much to it: specify the flag, and then whitespace and some form of comment within the pattern string are ignored.
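In Ruby, for instance, the flag is also available as the constant `Regexp::EXTENDED`, which helps when the pattern arrives as a string rather than a literal. A minimal sketch, with a made-up pattern and constant name:

```ruby
# Hypothetical pattern kept in a plain string (heredoc, no escaping needed).
source = <<~'PATTERN'
  \A
  [a-z]+   # a lowercase word
  \d*      # optional trailing digits
  \z
PATTERN

# Passing the flag explicitly is equivalent to writing /.../x.
WORD = Regexp.new(source, Regexp::EXTENDED)

WORD.match?("abc123") # => true
WORD.match?("ABC")    # => false
```

Other libraries spell the same idea differently, so the exact flag name is worth looking up rather than guessing.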