Once again about regexps, backtracking and how you can put two lines of "harmless" code on the JVM blades

Early morning, tenth cup of coffee, unsuccessful attempts to understand why your client (or even worse - server) java application is deadly hanging while calculating a simple regexp on a small line ... If a similar situation has already occurred in your life, you probably already know about backtracking and dark aside of regular expressions. The rest - welcome under the cut!





Backtracking, or eternal expectation of the result

(, , ), . – "evil regexes" ( ):





@Test
public void testRegexJDK8Only() {
  final Pattern pattern = Pattern.compile("(0*)*1");
  Assert.assertFalse(pattern.matcher("0000000000000000000000000000000000000000").matches());
}
      
      



: * (" ") . , ?, +, {n} (n – ).





JDK8 ( – ), JVM matches(). , .





? Pattern/Matcher java.util.regex



:





  1. * - , . , (0) , , , . .





  2. (backtrack) . ; . (0) , , . .





  3.   (0) . (0)! , , .





, "" (0), . ; – – , .





" ! ?" - . : , . , - , , 10 , . , :





@Test
public void testRegexAnyJDK() {
	final Pattern pattern = Pattern.compile("([A-Za-z,.!?]+( |\\-|\\')?){1,10}");
  Assert.assertFalse(pattern.matcher("scUojcUOWpBImlSBLxoCTfWxGPvaNhczGpvxsiqagxdHPNTTeqkoOeL3FlxKROMrMzJDf7rvgvSc72kQ").matches());
}
      
      



80 . JVM JDK8+ – 30 – , . - ReDoS-. , , , – "+" "{1,10}" – .





Java SDK?

, . , , . , . : JDK-5026912, JDK-7006761, JDK-8139263. StackOverflowError, (JDK-5050507). : " ", " ", " ". 





"" , . (, - ), API java.util.regex



JDK ( JDK-8234713, JDK-8054028, JDK-7178072). ; , " , , " (). 





. , JDK9 : , , , , . , , (JDK-6328855, ). testRegexJDK8Only()



jdk9-b119, JDK. , .





:

, , , ; , "" . , , npmjs.com. , , , , , . – , .





, , . , , . RE2, DFA - . ; – RE2 Rust, Google. JDK7+ RE2/J, C++ - .





2021 JEP-, , RE2.





RE2/J - ?

, RE2/J – . ?





  • RE2/J API Matcher;





  • ( RE2/J , – , backreferences). , ;





  • , Google, , – .





  • : " RE2/J . , , java.util.regex ".





, , : RE2/J – ; . , .





  1. ReDoS.





  2. , , , .





  3. , .





, , – , , RE2/J.








All Articles