This post is a long-format reply to Jonathan Jordan's recent post. Jonathan's post was about the non-capturing backreference in regular expressions. He and I are both working a lot in Behat, which relies heavily on regular expressions to map human-like sentences to PHP code. One of the common patterns in that space is the quoted string, which is a fantastic context in which to discuss the backreference (and also introduce lookarounds). Please read his post first, as from here on the tone and perspective of this post are reply-oriented.
In Behat, this (non-capturing groups) is helpful to not pollute your step-definition arguments with certain groups that improve the usability of your step but don't change how it behaves. For example, you can offer steps a choice between "click" and "press" with (?:click|press) without adding an argument to your method.
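To make the difference concrete, here is a minimal sketch in plain PHP. The step text and both patterns are made up for illustration and are not taken from Behat or Mink; the point is only that the capturing version produces an extra match (which Behat would pass along as an extra argument), while the non-capturing version does not.

```php
<?php
// Hypothetical step text and patterns, for illustration only.
$step = 'I press "Save"';

// Capturing group: the verb is stored as its own match.
preg_match('/^I (click|press) "([^"]*)"$/', $step, $capturing);

// Non-capturing group: the verb is grouped but not stored.
preg_match('/^I (?:click|press) "([^"]*)"$/', $step, $nonCapturing);

print_r($capturing);    // full match, then "press", then "Save"
print_r($nonCapturing); // full match, then "Save"
```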
But it's worth noting that what you're discussing here (?:) is essentially the non-back-reference. It's a way to group things and tell the engine that you don't need to refer back to what was consumed by the group. Behat doesn't often use the real notion of backreferences: re-using the captured group as part of the matching requirements. The basic regex feature is that once you have a capturing group, you can use the expression \1 to refer to that group. You showed an example snippet used often in Behat's Mink extension:
This is essentially an attempt to match any string of characters up to a closing quote, considering that we should allow people to escape their quotes like this: "some \"value\" is safe". However, I think this pattern, while clean, is lackluster in that it doesn't support single quotes. What a great opportunity to explore how useful backreferences can be! Basically, we can use a capturing group's backreference to tell the regex engine that a string should end with the same quote character that started it. So, what follows is an example of how Behat could improve its usual quoted-string pattern with this backreference feature, along with some negative lookbehind/lookahead assertions (which I understand might be more than this conversation wanted, but it's a cool thing to explore and this is a great context). This will feel a bit complicated at first, but we'll break it down. Here is my proposed replacement pattern:
This translates to English like this:
1. Match a single or double quote, as long as it's not preceded by \
2. Store that match in a way that I can reference later. (with \1)
3. Continue matching ANY characters...
3.1 As long as they aren't followed by the same quote that was matched in #1...
3.2 unless that quote was itself preceded by a \, then go ahead and proceed.
4. Once you stop matching (because the next character is followed by the ending quote), match that last character.
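The steps above can be sketched in PHP roughly as follows. The pattern here is my reconstruction from the numbered steps, so treat it as an assumption rather than the exact pattern from the original post; the sample sentence is also made up. A nowdoc is used so that the backslashes and both quote characters stay literal.

```php
<?php
// Sketch only: a pattern rebuilt from the numbered steps above,
// not necessarily the exact pattern from the original post.
$pattern = <<<'REGEX'
/(?<!\\)(['"])((?:.(?!(?<!\\)\1))*.?)\1/
REGEX;

// Hypothetical sample text with one double-quoted and one single-quoted string.
$subject = <<<'TEXT'
He said "some \"value\" is safe" and 'it\'s fine' too
TEXT;

preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);

foreach ($matches as $match) {
    // $match[1] is the opening quote, $match[2] is the quoted content.
    echo $match[1], ' => ', $match[2], PHP_EOL;
}
// " => some \"value\" is safe
// ' => it\'s fine
```

Note that group 1 holds the opening quote, so the interesting content lands in group 2.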
So, chunk by chunk:
This is our opening chunk, which essentially matches any single or double quote, unless that quote is preceded by a backslash. That's a "Negative Lookbehind", and that part does not consume any characters. If we didn't care to be careful about erroneously matching escaped quotes, it could simply be this:
Next we want to match any string until we encounter an un-escaped quote, but it must be the SAME quote (e.g. single vs. double) that was matched at the beginning. This is where backreferences come in (we need to reference what was matched at the start in order to tell the engine what to look for). We also need a way to say "anything except", but it's not a character class, so we need negative lookahead for this. The basic algorithm is to keep matching characters as long as they are not followed by the same quote that was used to start the string. Here is the simplified version (e.g. without concern for erroneously stopping for an escaped quote):
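As a rough sketch of that simplified idea in PHP (again a reconstruction from the description, not necessarily the original snippet), the pattern below captures the opening quote, keeps consuming characters that are not followed by that same quote, and then closes on the backreference. The sample text is hypothetical.

```php
<?php
// Sketch only: simplified pattern with no handling of escaped quotes.
$pattern = <<<'REGEX'
/(['"])((?:.(?!\1))*.?)\1/
REGEX;

// Hypothetical sample: single quotes nested inside a double-quoted string.
$subject = <<<'TEXT'
'single' and "double with 'inner' quotes"
TEXT;

preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);

foreach ($matches as $match) {
    echo $match[2], PHP_EOL;
}
// single
// double with 'inner' quotes
```

Because the closing quote must equal whatever group 1 captured, the inner single quotes do not end the double-quoted string.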
We can break this down even further. I'll strip out some of the parens for readability; they are essentially to manage what gets captured in the end. The following will match ANY single character that is not followed by the string matched in the first backreference.
Think of this as similar to the following:
The next important thing to realize is that the last character before the ending quote will not get matched. That's why we add one final single-character match, to grab that last character. Finally, the whole "interesting" part is wrapped up in the necessary parentheses to capture it.
I've posted an example to Rubular for you to play with!
In case it helps cut through the complexity, here's a comparison of the "escaping quotes is not supported" version and the "escaping quotes works" version. When matching "this \"string", the first one will only capture 'this \'. The second one will capture 'this \"string'.
It's probably worth noting that one reason this may be avoided by the community is that the captured results include a group just for the opening quote. The way Behat works, this would garbage-up your method parameters (and you can't not capture that opening quote if you want to use \1). To me, this is a very reasonable thing to be concerned about, and it doesn't appear to be easy to work around without adding some assumptions and/or complexity that isn't needed.
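That comparison can be checked with a short PHP script. Both patterns are my reconstructions from the descriptions in this post (assumptions, not the original snippets), and the variable names are made up for the example.

```php
<?php
// Sketch only: both patterns are reconstructions, not the original snippets.
$withoutEscapes = <<<'REGEX'
/(['"])((?:.(?!\1))*.?)\1/
REGEX;

$withEscapes = <<<'REGEX'
/(?<!\\)(['"])((?:.(?!(?<!\\)\1))*.?)\1/
REGEX;

// The subject is the literal text: "this \"string"
$subject = <<<'TEXT'
"this \"string"
TEXT;

preg_match($withoutEscapes, $subject, $simple);
preg_match($withEscapes, $subject, $escaped);

echo $simple[2], PHP_EOL;  // this \
echo $escaped[2], PHP_EOL; // this \"string
```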
This issue is for beginners - Basic PHP Strings.
1. View.php - Mix text strings & variables
A lot of times when you want to create a custom block template, you need to change the core block markup and change or add CSS declarations inside your duplicate.
Maybe you found a JS slider and you need to change the core image slider block output from:
If you are new to PHP, I believe you will hit a lot of weird small errors in this process, so I hope this tutorial will give you ideas about where to look in your code.
2. The problem - "unexpected (T_VARIABLE)"
Single and double quotes
If you don't want to insert variable values into your string, it's OK to use either single or double quotes. But if you insert variable values inside single quotes, the $var will not be parsed (PHP will print $var as regular text). See example 1-5.
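A minimal sketch of the difference, assuming a variable named $myClass with a made-up value:

```php
<?php
// 'slider' is a hypothetical value used only for this illustration.
$myClass = 'slider';

$single = 'class is $myClass'; // single quotes: the variable is not parsed
$double = "class is $myClass"; // double quotes: the variable is parsed

echo $single, PHP_EOL; // class is $myClass
echo $double, PHP_EOL; // class is slider
```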
Echo function, vars and Double quotes
The echo function uses quotes ("") to define the start and end of a string. But in HTML markup we also use quotes ("") to define the start and end of a CSS declaration, as shown in Example 2-5.
Concrete5 throws an error:
syntax error, unexpected '$myClass' (T_VARIABLE), expecting ',' or ';'
3. Two Ways to solve this issue
3.1. Wrong approach - "Echo spaghetti"
In Example 3-5, we solve the problem by adding some echo statements. It works fine, but the code is now "dirty", less readable, and easy to get confused by in the process.
3.2. Escape quotes
Quotes are special characters. Escape quotes within the string with a backslash. How does it work? You escape a character by typing a backslash before it: `\"$myClass\"`.
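For instance, a small sketch of escaped quotes around an interpolated variable (the markup and the value of $myClass are made up for the example):

```php
<?php
$myClass = 'slider'; // hypothetical value

// The inner double quotes are escaped, so they do not end the PHP string.
$html = "<div class=\"$myClass\">Hello</div>";

echo $html, PHP_EOL; // <div class="slider">Hello</div>
```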
3.3. Concatenation operator
Combine one or more variables and text strings. Concatenation saves you a lot of echo statements and keeps the code clean and readable.
Step 1: Use single quotes (') for quotes inside your string
Step 2: Add concatenation
Step 3: Use single quotes (') for quotes inside your string
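Put together, the steps might look like the sketch below. The markup and the value of $myClass are assumptions for illustration; the dot (.) is PHP's concatenation operator.

```php
<?php
$myClass = 'slider'; // hypothetical value

// Single quotes are used for the HTML attribute inside the string,
// and the dot operator joins the text pieces with the variable.
$html = "<div class='" . $myClass . "'>Hello</div>";

echo $html, PHP_EOL; // <div class='slider'>Hello</div>
```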
4. Read about these issues in the official PHP docs: