I have a task that requires me to pull in a set of xml files, which are all related, then pick out a subset of records from these files, transform some columns, and export to a a single xml file using a different format.I'm new to SSIS, and so far I've managed to first import two of the xml files (for simplicity, starting with just two files). The first file we can call "Item", containing some basic metadata, amongst those an ID, which is used to identify related records in the second file "Milestones". I filter my "valid records" using a lookup transformation in my dataflow - now I have the valid Item ID's to fetch the records I need. I funnel these valid ID's (along with the rest of the columns from Item.xml through a Sort, then into a merge join.The second file is structured with 2 outputs, one containing two columns (ItemID and RowID). The second containing all of the Milestone related data plus a RowID. I put these through a inner merge join, based on RowID, so I have the ItemID in with the milestone data. Then I do a full outer join merge join on both files, using ItemID.This gives me data sort of like this:
- ItemID[1] - MilestoneData[2]
- ItemID[1] - MilestoneData[3]
- ItemID[1] - MilestoneData[4]
- ItemID[1] - MilestoneData[5]
- ItemID[2] - MilestoneData[6]
- ItemID[2] - MilestoneData[7]
- ItemID[2] - MilestoneData[8]