Java regex to strip out XML tags is not working

Jukebox

New Member
I am trying to remove any XML tags from a Java string. The way I see it, something is an XML tag if it follows either of these forms:
  • \[code\]<*>*<*/*>\[/code\], such as \[code\]<fizz>buzz</fizz>\[/code\]; or
  • \[code\]<*/*>\[/code\], such as \[code\]<fizz/>\[/code\]
My regex is simple:
\[code\]
String tagful = "Hello <fizz>buzz</fizz>Regexes!";
String tagless = tagful.replaceAll("<*>*<*/*>", "");
tagless = tagless.replaceAll("<*/*>", "");
System.err.println("TAGLESS:\n\t" + tagless);
\[/code\]
When I run this I get \[code\]Hello <fizzbuzz</fizzRegexes!\[/code\] as the output, whereas (if my XML-stripping code were correct) I should be getting \[code\]Hello Regexes!\[/code\]. Where am I going astray?

Please note: I do not want to use any existing libraries; I am looking for a pure Java regex solution here. Thanks in advance!
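
For reference, a likely cause of the output above: in a Java regex, `*` is a quantifier meaning "zero or more of the preceding element", not a glob-style wildcard. Read that way, `<*>*<*/*>` asks for zero or more `<`, zero or more `>`, zero or more `<`, zero or more `/`, and then one `>`, which in practice matches only the `>` characters. The sketch below shows one way the two intended forms might be written using negated character classes. It assumes tags are not nested and that no attribute value contains `>`; the class name `TagStripper` and the method `stripTags` are hypothetical names chosen for illustration.

```java
public class TagStripper {

    // Hypothetical helper: removes simple, non-nested XML elements from a string.
    static String stripTags(String input) {
        // Form 1: an opening tag, its text content, and a closing tag,
        // e.g. <fizz>buzz</fizz>. [^>]* means "any run of characters except '>'".
        String result = input.replaceAll("<[^>]*>[^<]*</[^>]*>", "");
        // Form 2: a self-closing tag, e.g. <fizz/> or < fizz />.
        result = result.replaceAll("<[^>]*/>", "");
        return result;
    }

    public static void main(String[] args) {
        String tagful = "Hello <fizz>buzz</fizz>Regexes!";
        // Prints "Hello Regexes!" after the TAGLESS label.
        System.err.println("TAGLESS:\n\t" + stripTags(tagful));
    }
}
```

A single pass like this only removes the innermost element of a nested structure, so it is a sketch for flat input rather than a general XML stripper.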
 