The Original Jelani Harris The one and only

10 Sep 2009

Case-insensitive replaceAll in Java

The replaceAll method in the java.lang.String class replaces each substring of the string that matches the given regular expression with the given replacement.

String sentence = "The sly brown fox jumped over the lazy fox.";
String result = sentence.replaceAll("fox", "doggie");
System.out.println("Input: " + sentence);
System.out.println("Output: " + result);

Would output:

Input: The sly brown fox jumped over the lazy fox.
Output: The sly brown doggie jumped over the lazy doggie.

However, there are cases where we want to replace all matching substrings regardless of case, that is, make the replacement case-insensitive. By default the match is case-sensitive, so nothing is replaced here:

String sentence = "The sly brown Fox jumped over the lazy foX.";
String result = sentence.replaceAll("fox", "dog");
System.out.println("Input: " + sentence);
System.out.println("Output: " + result);

Input: The sly brown Fox jumped over the lazy foX.
Output: The sly brown Fox jumped over the lazy foX.

To create a case-insensitive version of replaceAll, we do not need to write a new wrapper function or a utility class. All we need to do is prepend the case-insensitive pattern modifier (?i) to our regex to indicate that the match should ignore case.

String sentence = "The sly brown Fox jumped over the lazy foX.";
String result = sentence.replaceAll("(?i)fox", "dog");
System.out.println("Input: " + sentence);
System.out.println("Output: " + result);

Input: The sly brown Fox jumped over the lazy foX.
Output: The sly brown dog jumped over the lazy dog.
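
The inline (?i) modifier has a flag equivalent in java.util.regex, which can be handy when the same pattern is reused many times. The following is only a minimal sketch of that alternative; the class name CaseInsensitiveReplace and the helper replaceFox are made up for illustration and are not part of the original example.

```java
import java.util.regex.Pattern;

class CaseInsensitiveReplace {
    // Compiled once and reused. Pattern.CASE_INSENSITIVE is the flag form
    // of the inline (?i) modifier used above.
    private static final Pattern FOX = Pattern.compile("fox", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: replaces every match of FOX in the input.
    static String replaceFox(String input, String replacement) {
        return FOX.matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String sentence = "The sly brown Fox jumped over the lazy foX.";
        System.out.println("Input: " + sentence);
        System.out.println("Output: " + replaceFox(sentence, "dog"));
    }
}
```

String.replaceAll compiles its regex on every call, so a precompiled Pattern mostly pays off inside loops. For a one-off replacement the (?i) form is shorter.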

Comments (13) Trackbacks (0)
  1. Hi! It doesn’t work with accents … :/
    Try: Águia
    I consider this a bug.

    Do you know some workaround?

  2. If you have a $ sign in your sentence, it won't work.
    For example:
    String sentence = "The sly brown Fox jumped$ over the lazy foX.";

  3. Hello Shahin,
    I added a $ to my example string and the replaceAll still worked.

  4. String regex = "[^']";
    String insentiveCase = "(?i)";
     
    String DOCUMENTO_SAIDA = "celso foi  celso Celso CELSO";
    String anotacao = "";
     String termo = "celso";
     
    DOCUMENTO_SAIDA = DOCUMENTO_SAIDA.replaceAll(insentiveCase + termo + regex, anotacao);
     
    System.out.println(DOCUMENTO_SAIDA);

    foi CELSO

    But… it eats the space character after the replaced “termo”

  5. Hello Celso,

    I had a hard time trying to follow what exactly you’re trying to do here so I’ll make a few quick assumptions. I’m assuming that you don’t want to eat the whitespace after the search term.

    The regex you have set up now is looking for the term “celso” and one other character that is not a single quote (‘). Because the space character is not a quote the regex matches the term and a single space – then does the replacement. If you were to set the regex to:
    String regex = "";
    Then you’d be matching all variations of celso without the spaces.

    However, if you just wanted to match the WORD “celso”, and not part of another word or a mistype (such as “celsOOOOO”, which would turn into “OOOO”), you can use the word boundary regex:
    String regex = "\\b";

    I hope I helped.

  6. Hi, just wanted to let you know that this blog entry came up as the top hit on a Google search “replaceall case insensitive” (without quotes), and I found it helpful and well-written. Thanks.

  7. thank you!

  8. Thanks buddy,
    it was a great help to me. Searching for something like that
    for many hours. Finally ur site helps.

    Thanks again, keep up the good work.

  9. Incredible though it may seem, Java defaults to only understanding ASCII; it ignores its own native character set. So you always need to add “(?u)” to make case insensitivity work on normal Java text, which is Unicode not ASCII.

    Also, if you expect things like \w to work on regular Java text, you need to add “(?U)”, which isn’t even supported until Java 7.

    In short, you normally want “(?iu)” for case insensitivity, and you want “(?U)” for the UNICODE_CHARACTER_CLASS flag.

    Note that this is still only the simplistic kind of case insensitivity as provided by the Character class, not full casemappings such as provided by the String class. That will be ok for Spanish and Portuguese, but not for German or Greek.

  10. What if you want to highlight the matches that were found AND keep the original case? :)

    eg:

    Input: “The sly brown Fox jumped over the lazy foX.”
    Replace Term: fox
    Output: “The sly brown [bold]Fox[bold] jumped over the lazy [bold]foX[bold].”

  11. [bold] … … You should look at the Matcher and Pattern classes to achieve this. You can iterate over each match and replace it.

  12. "The sly brown Fox jumped over the lazy foX.".replaceAll("(?i)(fox)", "[bold]$1[/bold]");

  13. The correct phrase is “the quick brown fox jumps over the lazy dog”, which contains all 26 letters of the English alphabet.
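
Several points raised in the comments above (accented characters, the $ sign, and highlighting matches while keeping their original case) can be sketched in one place. This is only an illustration under assumptions: the class ReplaceIgnoreCase and both helper methods are hypothetical names, and the search term is treated as literal text rather than as a regex. It uses (?iu) for Unicode-aware case folding, Pattern.quote for the term, and Matcher.quoteReplacement so that $ and \ in the replacement are taken literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceIgnoreCase {
    // Hypothetical helper: replaces every case-insensitive occurrence of a literal term.
    static String replaceAllIgnoreCase(String input, String term, String replacement) {
        // (?iu) turns on CASE_INSENSITIVE and UNICODE_CASE, so accented letters fold too.
        // Pattern.quote keeps regex metacharacters in the term from being interpreted.
        Pattern p = Pattern.compile("(?iu)" + Pattern.quote(term));
        // quoteReplacement keeps '$' and '\' in the replacement literal.
        return p.matcher(input).replaceAll(Matcher.quoteReplacement(replacement));
    }

    // Hypothetical helper: wraps each match in markers and keeps the matched text's case.
    static String highlightIgnoreCase(String input, String term, String open, String close) {
        Pattern p = Pattern.compile(Pattern.quote(term),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        Matcher m = p.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // m.group() is the text exactly as it appeared in the input.
            m.appendReplacement(sb, Matcher.quoteReplacement(open + m.group() + close));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u00C1 is the capital A with acute accent, \u00E1 the lowercase one.
        System.out.println(replaceAllIgnoreCase("\u00C1guia", "\u00E1guia", "eagle"));
        System.out.println(replaceAllIgnoreCase("The lazy foX.", "fox", "$5 dog"));
        System.out.println(highlightIgnoreCase(
                "The sly brown Fox jumped over the lazy foX.", "fox", "[bold]", "[/bold]"));
    }
}
```

As comment 9 notes, this is still the simple per-character case folding, not full case mapping, so it will not treat, for example, German ß and SS as equal.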

