Python: Run Script to Write Several Files at Once

rusuh

New Member
New to programming. I wrote a Python script (with help from Stack Overflow) to read the XML files in a directory, use BeautifulSoup to extract the text of an element, and write the result to an output directory. Here is my code:

[code]
from bs4 import BeautifulSoup
import os, sys

source_folder = '/Python27/source_xml'

for article in os.listdir(source_folder):
    soup = BeautifulSoup(open(source_folder + '/' + article))
    # I think this strips any tags that are nested in the sample_tag
    clean_text = soup.sample_tag.get_text(" ", strip=True)
    # This grabs an ID which I used as the output file name
    article_id = soup.article_id.get_text(" ", strip=True)
    with open(article_id, "wb") as f:
        f.write(clean_text.encode("UTF-8"))
[/code]

The problem is that I need this to run over 100,000 XML files. I started the program last night and, by my estimate, it took 15 hours to run. Is there a way to read and write, say, 100 or 1,000 files at a time instead of one at a time?