Xml simplification/extraction of distinct values - possible LINQ

Ardesa.regilase

New Member
Sorry for the long post, but this task is giving me a headache.

I have a mile-long XML document from which I need to extract a list of distinct values and pass it on for transformation for the web. I have completed the task using XSLT and keys, but the effort is bringing the server to its knees.

Description: hundreds of products in XML, each with a number of categories that have a name and an id, and each category with at least one subcategory with a name and an id. The categories are unique by id; subcategories are unique WITHIN their category.

Simplified example from the huge file (I left out tons of info irrelevant to the task):

[code]
<?xml version="1.0" encoding="utf-8"?>
<root>
  <productlist>
    <product id="1">
      <name>Some Product</name>
      <categorylist>
        <category id="1">
          <name>cat1</name>
          <subcategories>
            <subcat id="1"><name>subcat1</name></subcat>
            <subcat id="2"><name>subcat1</name></subcat>
          </subcategories>
        </category>
        <category id="2">
          <name>cat1</name>
          <subcategories>
            <subcat id="1"><name>subcat1</name></subcat>
          </subcategories>
        </category>
        <category id="3">
          <name>cat1</name>
          <subcategories>
            <subcat id="1"><name>subcat1</name></subcat>
          </subcategories>
        </category>
      </categorylist>
    </product>
    <product id="2">
      <name>Some Product</name>
      <categorylist>
        <category id="1">
          <name>cat1</name>
          <subcategories>
            <subcat id="2"><name>subcat2</name></subcat>
            <subcat id="4"><name>subcat4</name></subcat>
          </subcategories>
        </category>
        <category id="2">
          <name>cat2</name>
          <subcategories>
            <subcat id="1"><name>subcat1</name></subcat>
          </subcategories>
        </category>
        <category id="3">
          <name>cat3</name>
          <subcategories>
            <subcat id="1"><name>subcat1</name></subcat>
          </subcategories>
        </category>
      </categorylist>
    </product>
  </productlist>
</root>
[/code]

DESIRED RESULT:

[code]
<?xml version="1.0" encoding="utf-8"?>
<root>
  <maincat id="1">
    <name>cat1</name>
    <subcat id="1"><name>subcat1</name></subcat>
    <subcat id="2"><name>subcat2</name></subcat>
    <subcat id="3"><name>subcat3</name></subcat>
  </maincat>
  <maincat id="2">
    <name>cat2</name>
    <subcat id="1"><name>differentsubcat1</name></subcat>
    <subcat id="2"><name>differentsubcat2</name></subcat>
    <subcat id="3"><name>differentsubcat3</name></subcat>
  </maincat>
  <maincat id="2">
    <name>cat2</name>
    <subcat id="1"><name>differentsubcat1</name></subcat>
    <subcat id="2"><name>differentsubcat2</name></subcat>
    <subcat id="3"><name>differentsubcat3</name></subcat>
  </maincat>
</root>
[/code]

(From 2000 products, the original will produce 10 categories with 5 to 15 subcategories each.)

Things tried:
  • XSLT with keys: works fine, but very poor performance
  • Played around with LINQ:

[code]
// Select every <category> element under every <product>
IEnumerable<XElement> mainCats =
    from Category1 in doc.Descendants("product").Descendants("category")
    select Category1;

// Wrap the selected categories in a new document
var cDoc = new XDocument(
    new XDeclaration("1.0", "utf-8", null),
    new XElement("root"));
cDoc.Root.Add(mainCats);
cachedCategoryDoc = cDoc.ToString();
[/code]

The result was a "categories only" document (no distinct values of categories or subcategories). I applied the same XSLT to that and got somewhat better performance, but it is still far from usable.

Can I apply some sort of magic to the LINQ statement to get the desired output?

A truckload of good karma goes out to anyone who can point me in the right direction.

//Steen

NOTE:
  • I am not stuck on using LINQ/XDocument if anyone has better options
  • Currently on .NET 3.5, can switch to 4 if needed
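
For illustration, one way the LINQ query in the post might be extended to produce distinct values is to group the categories by id and then group each category's subcategories by id. This is only a sketch under stated assumptions, not a tested solution for the real file: `CategoryIndex` and `Build` are hypothetical names, every `category` and `subcat` is assumed to carry an `id` attribute, and the first `name` seen for a given id is assumed to be the one to keep. It uses only LINQ to XML as available in .NET 3.5.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the original post.
public static class CategoryIndex
{
    // Collapses all <category> elements in the product document into one
    // <maincat> per category id, each holding one <subcat> per subcategory id.
    public static XDocument Build(XDocument doc)
    {
        var mainCats =
            from category in doc.Descendants("category")
            // Assumption: every <category> has an id attribute.
            group category by (string)category.Attribute("id") into catGroup
            select new XElement("maincat",
                new XAttribute("id", catGroup.Key),
                // Assumption: the first name seen for an id is the one to keep.
                new XElement("name", (string)catGroup.First().Element("name")),
                // Pool the subcategories of every occurrence of this category,
                // then keep one entry per subcategory id.
                from subcat in catGroup.Elements("subcategories").Elements("subcat")
                group subcat by (string)subcat.Attribute("id") into subGroup
                select new XElement("subcat",
                    new XAttribute("id", subGroup.Key),
                    new XElement("name", (string)subGroup.First().Element("name"))));

        return new XDocument(
            new XDeclaration("1.0", "utf-8", null),
            new XElement("root", mainCats));
    }
}
```

With the `doc` variable from the post, this could be called as `cachedCategoryDoc = CategoryIndex.Build(doc).ToString();`. Groups come out in order of first appearance rather than sorted by id, and `XDocument.ToString()` does not emit the XML declaration, so `Save` to a writer may be preferable if the declaration matters.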
 