Replacing Nested Double Quotes in a String

BrettJ

New Member
So I am a typography-nazi (they are like grammar-nazis on steroids), and I have a string that may contain several levels of nested double quotes, such as:
\[code\]$str = 'Outer text "first level "second level "third level" second level" first level" outer text';\[/code\]
In my native language, up to three levels of quoting are typographically correct, and each level has its own quotation marks. I would like to replace every pair of double quotes with the corresponding entities for its level:
  • 1st level: „text” (\[code\]&bdquo;\[/code\] and \[code\]&rdquo;\[/code\])
  • 2nd level: »text« (\[code\]&raquo;\[/code\] and \[code\]&laquo;\[/code\])
  • 3rd level: ’text’ (\[code\]&rsquo;\[/code\])
  • any additional levels: ’text’ (\[code\]&rsquo;\[/code\])
So the string above should be output as:
\[quote\]Outer text „first level »second level ’third level’ second level« first level” outer text\[/quote\]
It is also possible that there are sibling \[code\]""\[/code\] pairs in the string:
\[code\]$str = 'Quote from my book: "She didn\'t feel "depressed", "tired" or "sad"."';\[/code\]
This should be output as:
\[quote\]Quote from my book: „She didn't feel »depressed«, »tired« or »sad«.”\[/quote\]
(This could be tricky, but we know that a \[code\]"\[/code\] is always followed or preceded by a space \[code\] \[/code\] or by punctuation: \[code\],\[/code\], \[code\].\[/code\], \[code\];\[/code\], \[code\]?\[/code\], \[code\]!\[/code\].)
Finally, \[code\]$str\[/code\] may contain HTML as well, and the quotation marks around attribute values must not be changed:
\[code\]$str = '<p class="quote">The error said: <span class="error_msg">"Please restart your "fancy" computer!"</span></p>';\[/code\]
I've heard that a recursive regexp would be a possible solution, but I'm looking for a more efficient way because the strings might be long HTML texts.
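One non-recursive way to approach the requirements above is a single left-to-right pass with a depth counter. The sketch below is only an illustration under stated assumptions, not a tested production solution. The function name `nest_quotes` is hypothetical, and the named entities are my reading of "corresponding entities". It assumes that an opening quote is preceded by whitespace (or starts a text run) and that a closing quote is followed by whitespace, punctuation, or the end of a text run. It also assumes tags can be recognised with the simple pattern `<[^>]*>`, which will not cope with a literal `>` inside an attribute value, with comments, or with script blocks.

```php
<?php
// Hypothetical helper (not from the original post): one linear pass, no recursion.
function nest_quotes(string $html): string
{
    // Assumed entities per nesting level: [opening, closing].
    // Level 3 is reused for any deeper level.
    $marks = [
        1 => ['&bdquo;', '&rdquo;'],
        2 => ['&raquo;', '&laquo;'],
        3 => ['&rsquo;', '&rsquo;'],
    ];
    $depth = 0;

    // Split into tags and text. With PREG_SPLIT_DELIM_CAPTURE the tags
    // end up at the odd indexes and the text runs at the even ones.
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            continue; // a tag: attribute quotes stay untouched
        }
        $parts[$i] = preg_replace_callback(
            // Group 1 matches only an opening quote: nothing but whitespace,
            // another quote or "(" before it, and a character after it that
            // is neither whitespace nor closing punctuation.
            // Any other double quote is treated as a closing quote.
            '/((?<![^\s"(])"(?=[^\s,.;?!]))|"/',
            function (array $m) use (&$depth, $marks): string {
                if (!empty($m[1])) {
                    $depth++;
                    return $marks[min($depth, 3)][0];
                }
                if ($depth === 0) {
                    return '"'; // unbalanced closing quote: leave as-is
                }
                $close = $marks[min($depth, 3)][1];
                $depth--;
                return $close;
            },
            $part
        );
    }

    return implode('', $parts);
}
```

Calling `nest_quotes($str)` on each of the three example strings should give the nesting shown in the quotes above, with entities in place of the literal characters. The depth counter is shared across text runs, so a quotation that spans several tags keeps its level. A quote that sits directly against a tag (for example an opening quote immediately before `<b>`) is ambiguous under these rules and would need extra handling.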
 