Numbering the sentences inside a <P> in a .xml file?

joetcochran

New Member
I'm a beginner programmer and I'm stuck on this possibly easy problem: I want to automatically add numbers to the sentences contained in the P tags of an .xml file. A sample paragraph in the .xml file looks like:

[code]
<P>Sentence1. Sentence2. Sentence3.</P>
[/code]

I want to transform this into:

[code]
<P><SUP>1</SUP>Sentence1.<SUP>2</SUP> Sentence2.<SUP>3</SUP> Sentence3.</P>
[/code]

However, only P tags containing at least two sentences should be numbered; if a P tag contains only one sentence, I want to leave it unchanged.

Here is the approach I have come up with so far, using regular expressions:

[code]
\.\s.*  # Reliably finds the second sentence. Insert <SUP>2</SUP> after it.
<P>[^>]*<SUP>2  # Finds the beginning of the first sentence if a second sentence exists.
[/code]

However, I feel this is a really awkward approach, and I wouldn't know how to extend it to paragraphs containing 20 or more sentences, or to .xml documents containing many paragraphs. Is there a better regular expression to achieve this, or a better (Python) tool than regular expressions?
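One possible direction, offered as a rough sketch rather than a definitive answer: let an XML parser find the P elements and use a regular expression only for splitting the text into sentences. The example below uses Python's standard `xml.etree.ElementTree`. The function name `number_sentences` and the `SENTENCE_BOUNDARY` pattern are made up for illustration. It assumes each P contains plain text only (no nested tags) and that a sentence ends with `.`, `!` or `?` followed by whitespace, which will misfire on abbreviations such as "e.g. this".

```python
import re
import xml.etree.ElementTree as ET

# Naive sentence boundary (assumption): the position right after ., ! or ?
# when whitespace follows. Abbreviations will be split incorrectly.
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])(?=\s)")


def number_sentences(root):
    """Prefix each sentence with <SUP>n</SUP> in every <P> holding 2+ sentences."""
    for p in root.iter("P"):
        # Assumption: <P> holds plain text only; skip empty or nested paragraphs.
        if len(p) or not p.text:
            continue
        sentences = [s for s in SENTENCE_BOUNDARY.split(p.text) if s.strip()]
        if len(sentences) < 2:
            continue  # single-sentence paragraphs stay unchanged
        p.text = None
        for n, sentence in enumerate(sentences, start=1):
            sup = ET.SubElement(p, "SUP")
            sup.text = str(n)
            sup.tail = sentence  # the sentence text follows its number
    return root


# Small demonstration on an in-memory document.
demo = ET.fromstring(
    "<doc><P>Sentence1. Sentence2. Sentence3.</P><P>Only one.</P></doc>"
)
print(ET.tostring(number_sentences(demo), encoding="unicode"))
```

For a real file, the same function could be applied to `ET.parse(path).getroot()` and the tree written back with `tree.write(...)`. Because the loop visits every P element and numbers however many sentences it finds, the paragraph count and the sentence count need no special handling. Splitting on a zero-width pattern requires Python 3.7 or later.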
 