How can I store XML groups in a dict in the correct grouping order?

breepapse

I have an XML file of about 4000 tags. It looks something like this:

\[code\]
<root>
  <in-ca:a context="I2010">$</in-ca:a>
  <in-gaap:b context="I2010">$</in-gaap:b>
  <in-gaap:c>
    <in-ca:d context="Icuryear">1</in-ca:d>
    <in-ca:e context="Icuryear">2</in-ca:e>
  </in-gaap:c>
  <in-ca:t>
    <in-ca:ff context="Icuryear">ffcuryear</in-ca:ff>
    <in-ca:gg context="Icuryear">
      <in-ca:gg1 context="Icuryear">gg1curyear</in-ca:gg1>
      <in-gaap:gg2 context="Icuryear">gg2curyear</in-gaap:gg2>
    </in-ca:gg>
    <in-gaap:kk context="Icuryear">kkcuryear</in-gaap:kk>
  </in-ca:t>
</root>
\[/code\]

I am already able to store the tags that are not in groups in a separate dict. The tags that contain groups have to go into another dict, and that one is not coming out properly. I am trying to store only the group tags, and I need the dict in this form:

\[code\]
groupdict = {
    'c': {
        'd': {'subgroup': '', 'val': '<in-ca:d context="Icuryear">1</in-ca:d>'},
        'e': {'subgroup': '', 'val': '<in-ca:e context="Icuryear">2</in-ca:e>'},
    },
    't': {
        'ff': {'subgroup': '', 'val': '<in-ca:ff context="Icuryear">ffcuryear</in-ca:ff>'},
        'gg1': {'subgroup': 'gg', 'val': '<in-ca:gg1 context="Icuryear">gg1curyear</in-ca:gg1>'},
        'gg2': {'subgroup': 'gg', 'val': '<in-gaap:gg2 context="Icuryear">gg2curyear</in-gaap:gg2>'},
        'kk': {'subgroup': '', 'val': '<in-gaap:kk context="Icuryear">kkcuryear</in-gaap:kk>'},
    },
}
\[/code\]

The subgroup field is filled in only when a group contains a further subgroup. In that case the subgroup's name is stored in the subgroup field of the elements inside that subgroup. For all other elements it is blank.

I tried this code:

\[code\]
import re

tempvar = ''
groupdict = {}
with open("fname.xml") as f:
    for line in f:
        if '$' not in line and (line.startswith('<in-gaap') or line.startswith('<in-ca')):
            var = re.findall(r':(.*?)>', line)  # to get the group name
            tempvar = var[0]
            groupdict[tempvar] = {}
        else:
            temp = re.findall(r'contextRef="(\w+)"', line)
            if temp[0].isalpha():  # because group items have context="some chars"
                val = re.findall(r'\$(\w+)curyear', line)  # to get the group tag name
                if len(val):
                    if tempvar:
                        groupdict[tempvar][val[0]] = line
\[/code\]

Can anyone help? Thanks in advance.
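
One possible direction, offered as a sketch and not a definitive answer: instead of matching lines with regular expressions, walk the document with a parser and keep a stack of the currently open tags, so that the group and subgroup of each leaf tag can be read off the stack. The version below uses the standard-library `xml.parsers.expat` module without namespace processing, so the `in-ca:` and `in-gaap:` prefixes do not need to be declared. The names `build_groupdict`, `local_name`, and `SAMPLE` are made up for illustration. It assumes the second tag in the sample was meant to be `<in-gaap:b context="I2010">`, that Python 3.7+ is used (so dicts keep insertion order), and that `val` may be rebuilt from the tag name, attributes, and text instead of being copied byte for byte from the file.

```python
import xml.parsers.expat
from xml.sax.saxutils import escape, quoteattr

# Sample document from the question (second tag written as assumed above).
SAMPLE = """<root>
<in-ca:a context="I2010">$</in-ca:a>
<in-gaap:b context="I2010">$</in-gaap:b>
<in-gaap:c>
  <in-ca:d context="Icuryear">1</in-ca:d>
  <in-ca:e context="Icuryear">2</in-ca:e>
</in-gaap:c>
<in-ca:t>
  <in-ca:ff context="Icuryear">ffcuryear</in-ca:ff>
  <in-ca:gg context="Icuryear">
    <in-ca:gg1 context="Icuryear">gg1curyear</in-ca:gg1>
    <in-gaap:gg2 context="Icuryear">gg2curyear</in-gaap:gg2>
  </in-ca:gg>
  <in-gaap:kk context="Icuryear">kkcuryear</in-gaap:kk>
</in-ca:t>
</root>"""


def local_name(qname):
    """Drop the prefix: 'in-ca:gg1' -> 'gg1'."""
    return qname.split(":", 1)[-1]


def build_groupdict(xml_text):
    groupdict = {}
    stack = []  # one frame per currently open element, root first

    def start(name, attrs):
        if stack:
            stack[-1]["has_children"] = True
        stack.append({"name": name, "attrs": attrs,
                      "text": [], "has_children": False})

    def chars(data):
        if stack:
            stack[-1]["text"].append(data)

    def end(name):
        node = stack.pop()
        # Skip containers, the root, and leaves sitting directly under the root.
        if node["has_children"] or len(stack) < 2:
            return
        group = local_name(stack[1]["name"])  # top-level group under the root
        # The immediate parent is a subgroup only if it is nested inside a group.
        subgroup = local_name(stack[-1]["name"]) if len(stack) > 2 else ""
        attrs = "".join(" %s=%s" % (k, quoteattr(v))
                        for k, v in node["attrs"].items())
        val = "<%s%s>%s</%s>" % (name, attrs,
                                 escape("".join(node["text"])), name)
        groupdict.setdefault(group, {})[local_name(name)] = {
            "subgroup": subgroup, "val": val}

    parser = xml.parsers.expat.ParserCreate()  # no namespace processing
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)
    return groupdict


groupdict = build_groupdict(SAMPLE)
```

For a real file, `build_groupdict(open("fname.xml").read())` would be the equivalent call. Note that two leaf tags with the same local name inside one group would overwrite each other in this sketch, so the inner dict key may need to be made unique if that can happen in the real data.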
 