Need awk script to concatenate some XML nodes

AdamWalker

New Member
I'm new to awk and need some help with a simple awk script that strips all the character metrics and concatenates the c attributes of the char nodes into the text of their parent span, which should squash the XML by quite a bit.

Input:

[code]
<?xml version="1.0"?>
<document>
  <page>
    <block bbox="270 163.717 363.262 224.155">
      <line bbox="270 163.717 274.453 182.669">
        <span bbox="270 163.717 274.453 182.669" font="Helvetica-Bold" size="16.02">
          <char bbox="270 200.519 284.425 224.155" c="f"/>
          <char bbox="284.43 200.519 291.082 224.155" c="o"/>
          <char bbox="291.087 200.519 297.74 224.155" c="o"/>
        </span>
      </line>
      <line bbox="270 200.519 363.262 224.155">
        <span bbox="270 200.519 363.262 224.155" font="Helvetica-Bold" size="19.98">
          <char bbox="270 200.519 284.425 224.155" c="b"/>
          <char bbox="284.43 200.519 291.082 224.155" c="a"/>
          <char bbox="291.087 200.519 297.74 224.155" c="r"/>
        </span>
      </line>
    </block>
  </page>
</document>
[/code]

Desired output:

[code]
<?xml version="1.0"?>
<document>
  <page>
    <block bbox="270 163.717 363.262 224.155">
      <line bbox="270 163.717 274.453 182.669">
        <span bbox="270 163.717 274.453 182.669" font="Helvetica-Bold" size="16.02">foo</span>
      </line>
      <line bbox="270 200.519 363.262 224.155">
        <span bbox="270 200.519 363.262 224.155" font="Helvetica-Bold" size="19.98">bar</span>
      </line>
    </block>
  </page>
</document>
[/code]

Thanks!
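For reference, here is a rough sketch of one way the transformation described above might be done. It assumes the file has one XML element per line (as a pretty-printed version of the sample would), that every char element carries its character in a c="..." attribute, and that span elements are never self-closing or nested. The function name squash_xml is made up for illustration. This is line-oriented text processing, not real XML parsing, so it will likely break on input laid out differently.

```shell
# squash_xml: hypothetical helper name, not from the original post.
# Assumes one XML element per line, as in a pretty-printed sample.
squash_xml() {
    awk '
    # Opening <span ...>: print it without a newline and start collecting.
    /<span[ >]/ && !/\/>/ {
        sub(/[ \t]+$/, "")
        printf "%s", $0
        inspan = 1
        next
    }
    # Inside a span: emit only the value of the c attribute, drop the rest.
    inspan && /<char / {
        if (match($0, / c="[^"]*"/))
            printf "%s", substr($0, RSTART + 4, RLENGTH - 5)
        next
    }
    # Closing </span>: finish the line that the opening tag started.
    inspan && /<\/span>/ {
        print "</span>"
        inspan = 0
        next
    }
    # Everything else passes through unchanged.
    { print }
    ' "$@"
}
```

It could then be called as squash_xml input.xml > output.xml, where both file names are placeholders. If the real file has everything on a single line, it would need to be split into one tag per line first, or handled with a proper XML tool instead.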
 