Regular expression challenge for awk and grep

Linux

Guest
I've inherited a massive orders archive that I need to extract a total from. Fortunately, I only need a grand total, so I don't need to worry about separating products and getting totals per product - just the total number of products appearing.

The archive is a giant text file. Each line has shipping address info, date info, a series of products, and a quantity for each product. I need to extract just the quantities.

The problem is that the number of products is not consistent among lines, so the number of quantities on each line is arbitrary - making it difficult (or maybe impossible) to use awk to just print the necessary fields.

The one thing all the quantities have in common is that each is a single digit, preceded by a colon and followed by a less-than (<) sign. (There is never a quantity greater than 9, so it's always one digit.)

A regular expression that matches them all is /:[0-9]</

How can I create output that consists of every match of that expression, even when there are multiple matches per line, with one number per line of output?

Ideally, the output would look something like this:

3
2
0
0
1
3
2
... ad nauseam.

If the output includes the : and < for each number, that's fine too, since I can easily use awk at that point to strip them out.

My first instinct is to try grep or awk - but grep will just print the matching lines (which is exactly what I already have), and awk would require me to know the exact field numbers (or would it?).

This is on a Linux box, by the way.

Any ideas?

Thanks,
Wayne
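One possible sketch, assuming GNU grep is available: its -o flag prints each match on its own line rather than the whole matching line, which is exactly the shape of output described above. The sample orders.txt data here is hypothetical, invented only to illustrate the format:

```shell
# Hypothetical sample data in the described format: products as name:digit<
printf 'addr1 2024-01-05 widget:3< gadget:2<\naddr2 2024-01-06 gizmo:0<\n' > orders.txt

# -o emits every match on its own line, even multiple matches per input line;
# tr strips the surrounding : and < so only the digit remains
grep -o ':[0-9]<' orders.txt | tr -d ':<'
# prints:
# 3
# 2
# 0
```

Since the end goal is a grand total, the stream of digits could then be summed with a short awk one-liner such as `| awk '{s += $1} END {print s}'`.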