Using text(), is there a way to convert empty text to 'None' with scrapy

zedstar

New Member
I'm running into a problem. The XML I'm scraping has some values that are empty, but I need to preserve the order of the values. Sample:

\[code\]
<thedata>
  <some-item>
    <value xsi:nil="true"/>
    <value xsi:nil="true"/>
    <value xsi:nil="true"/>
    <value xsi:nil="true"/>
    <value xsi:nil="true"/>
    <value>44</value>
    <value>32</value>
    <value>31</value>
    <value xsi:nil="true"/>
    <value xsi:nil="true"/>
    <value>32</value>
    <value>31</value>
    <value>34</value>
    <value>34</value>
    <value>33</value>
  </some-item>
</thedata>
\[/code\]

Using \[code\]text()\[/code\] skips the empty values:

\[code\]
class MySpider(XMLFeedSpider):
    name = 'myspider'
    start_urls = ['http://www.example.com/somexml.xml']
    itertag = 'thedata'

    # Using XMLFeedSpider
    def parse_node(self, response, node):
        item_vals = node.select('some-item/value/text()').extract()
        print item_vals
\[/code\]

This prints a list containing only the values that hold an integer.

Since I need to preserve order, is there a way to tell Scrapy to replace any empty values with \[code\]''\[/code\] or \[code\]None\[/code\]?

EDIT: @unutbu: I'm still getting the same problem:

\[code\]
item_vals = node.select('some-item/value/text()').extract()
print item_vals
item_vals2 = node.select('some-item/value/text()').extract() or None
print item_vals2
\[/code\]

Output:

\[code\]
[u'44',u'32',u'31',u'32',u'31',u'34',u'34',u'33']
[u'44',u'32',u'31',u'32',u'31',u'34',u'34',u'33']
\[/code\]

What I want is:

\[code\]
[None,None,None,None,None,u'44',u'32',u'31',None,None,u'32',u'31',u'34',u'34',u'33']
\[/code\]

Or anything else that represents an empty value wherever one is encountered.
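For reference, the `or None` in the EDIT has no effect because it applies to the whole extracted list, which is non-empty, rather than to each value. One possible approach is to select the `value` elements themselves and read the text of each one in turn, substituting a placeholder when there is none. The sketch below illustrates that per-node idea with the standard library's `xml.etree.ElementTree` instead of Scrapy, so that it runs on its own. The function name `values_with_placeholders`, the shortened sample, and the `xmlns:xsi` declaration on the root (needed for the parser to accept `xsi:nil`) are assumptions made for illustration. In the spider, the analogous change would presumably be to iterate over `node.select('some-item/value')` and call `select('text()').extract()` on each node, using the placeholder when the result is empty.

```python
import xml.etree.ElementTree as ET

# Shortened sample; the xmlns:xsi declaration is an assumption added so that
# the xsi:nil attribute parses (the original snippet omits the root namespaces).
SAMPLE = """<thedata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <some-item>
    <value xsi:nil="true"/>
    <value>44</value>
    <value xsi:nil="true"/>
    <value>32</value>
  </some-item>
</thedata>"""


def values_with_placeholders(xml_text, placeholder=None):
    """Return one entry per <value> element, in document order.

    Elements without text content contribute `placeholder` instead of
    being skipped, so positions are preserved.
    """
    root = ET.fromstring(xml_text)
    result = []
    # Select the elements, not their text nodes, so empty ones are still visited.
    for value in root.findall("some-item/value"):
        text = (value.text or "").strip()
        result.append(text if text else placeholder)
    return result


item_vals = values_with_placeholders(SAMPLE)
print(item_vals)
```

Passing `placeholder=''` would give empty strings instead of `None` in the same positions.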
 