Early morning, tenth cup of coffee, and fruitless attempts to understand why your client (or, even worse, server) Java application hangs dead while evaluating a simple regexp on a short line... If you have been in a similar situation, you probably already know about backtracking and the dark side of regular expressions. Everyone else, welcome under the cut!
Backtracking, or waiting forever for the result
(, , ), . – "evil regexes" ( ):
// (0*)*1 nests one quantifier inside another; the input contains no '1', so the match must fail.
@Test
public void testRegexJDK8Only() {
    final Pattern pattern = Pattern.compile("(0*)*1");
    Assert.assertFalse(pattern.matcher("0000000000000000000000000000000000000000").matches());
}
: * (" ") . , ?, +, {n} (n – ).
JDK8 ( – ), JVM matches(). , .
? Pattern/Matcher java.util.regex
:
* - , . , (0) , , , . .
(backtrack) . ; . (0) , , . .
(0) . (0)! , , .
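To make the cost of backtracking on `(0*)*1` concrete, here is a minimal sketch of a toy backtracking search that counts its own steps. It is an illustration only, not the algorithm used by java.util.regex: the class `BacktrackingSteps` and its methods are hypothetical names, and the model assumes that empty group iterations are skipped so that the search terminates.

```java
// Toy model of a naive backtracking search for the pattern (0*)*1.
// Assumption: this is NOT how java.util.regex is implemented; it only
// shows how many alternatives a naive search has to explore.
final class BacktrackingSteps {
    private long steps;

    // Tries to match (0*)*1 against the whole input, starting at pos.
    private boolean match(String input, int pos) {
        steps++;
        // Count the zeros available at the current position.
        int zeros = 0;
        while (pos + zeros < input.length() && input.charAt(pos + zeros) == '0') {
            zeros++;
        }
        // Greedy: one group iteration takes as many zeros as possible,
        // then gives them back one at a time (the backtracking step).
        // Empty iterations are skipped (assumption) to guarantee termination.
        for (int k = zeros; k >= 1; k--) {
            if (match(input, pos + k)) {
                return true;
            }
        }
        // No further group iteration: the remainder must be exactly "1".
        return pos + 1 == input.length() && input.charAt(pos) == '1';
    }

    static long countSteps(String input) {
        BacktrackingSteps search = new BacktrackingSteps();
        search.match(input, 0);
        return search.steps;
    }

    static boolean matches(String input) {
        return new BacktrackingSteps().match(input, 0);
    }

    // Builds a string of n zeros (kept JDK 8 compatible).
    static String zeros(int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append('0');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (int n = 5; n <= 20; n += 5) {
            System.out.println(n + " zeros -> " + countSteps(zeros(n)) + " steps");
        }
    }
}
```

In this model every additional zero doubles the work: an input of n zeros with no trailing `1` costs 2^n steps (1,024 for 10 zeros, 1,048,576 for 20). A real engine differs in its details, but the exponential growth is the same kind of effect that makes the 40-zero input above impractical.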
" ! ?" - . : , . , - , , 10 , . , :
// The input contains digits, which the character class does not allow, so matches() must return false.
@Test
public void testRegexAnyJDK() {
    final Pattern pattern = Pattern.compile("([A-Za-z,.!?]+( |\\-|\\')?){1,10}");
    Assert.assertFalse(pattern.matcher("scUojcUOWpBImlSBLxoCTfWxGPvaNhczGpvxsiqagxdHPNTTeqkoOeL3FlxKROMrMzJDf7rvgvSc72kQ").matches());
}
80 . JVM JDK8+ – 30 – , . - ReDoS-. , , , – "+" "{1,10}" – .
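One commonly used defensive measure against ReDoS-style hangs is to give a match a deadline. The sketch below is not something the article prescribes, and `DeadlineCharSequence` and `matchesWithin` are hypothetical names. It relies on the assumption that the matcher reads its input through `CharSequence.charAt`, as OpenJDK's java.util.regex does, so that a check placed there runs throughout the search.

```java
import java.util.regex.Pattern;

// Sketch of a deadline guard for regex matching; all names are hypothetical.
// Assumption: the matcher accesses its input via charAt (true for OpenJDK).
final class DeadlineCharSequence implements CharSequence {
    private final CharSequence inner;
    private final long deadlineNanos;

    private DeadlineCharSequence(CharSequence inner, long deadlineNanos) {
        this.inner = inner;
        this.deadlineNanos = deadlineNanos;
    }

    static DeadlineCharSequence withTimeoutMillis(CharSequence inner, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        return new DeadlineCharSequence(inner, deadline);
    }

    @Override
    public char charAt(int index) {
        // Abort the match once the deadline has passed.
        if (System.nanoTime() - deadlineNanos > 0) {
            throw new IllegalStateException("regex match exceeded its deadline");
        }
        return inner.charAt(index);
    }

    @Override
    public int length() {
        return inner.length();
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        // Sub-sequences share the original deadline.
        return new DeadlineCharSequence(inner.subSequence(start, end), deadlineNanos);
    }

    @Override
    public String toString() {
        return inner.toString();
    }

    // Full-input match that throws IllegalStateException when the deadline passes.
    static boolean matchesWithin(Pattern pattern, CharSequence input, long timeoutMillis) {
        return pattern.matcher(withTimeoutMillis(input, timeoutMillis)).matches();
    }
}
```

A call such as `DeadlineCharSequence.matchesWithin(pattern, line, 500)` would then either return the match result or throw once roughly half a second has elapsed. The clock check on every character access adds overhead, so this is a trade-off rather than a free fix.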
Java SDK?
, . , , . , . : JDK-5026912, JDK-7006761, JDK-8139263. StackOverflowError, (JDK-5050507). : " ", " ", " ".
"" , . (, - ), API java.util.regex
JDK ( JDK-8234713, JDK-8054028, JDK-7178072). ; , " , , " ().
. , JDK9 : , , , , . , , (JDK-6328855, ). testRegexJDK8Only()
jdk9-b119, JDK. , .
:
, , , ; , "" . , , npmjs.com. , , , , , . – , .
, , . , , . RE2, DFA - . ; – RE2 Rust, Google. JDK7+ RE2/J, C++ - .
2021 JEP-, , RE2.
RE2/J - ?
, RE2/J – . ?
, , : RE2/J – ; . , .