Ulexdbulex
New Member
I represent each XML document as a feature matrix stored as a csr_matrix. With around 3000 XML documents, I end up with a list of csr_matrices. I want to flatten each of these matrices into a feature vector, then combine all of the feature vectors into a single csr_matrix representing all the XML documents, where each row is a document and each column is a feature.

One way to achieve this is \[code\]X = csr_matrix([a.toarray().ravel().tolist() for a in ls])\[/code\] where ls is the list of csr_matrices. However, this is highly inefficient because it converts every matrix to a dense array first; with 3000 documents it simply crashes!

In other words, my question is: how can I flatten each csr_matrix in the list 'ls' without turning it into an array, and how can I append the flattened csr_matrices to another csr_matrix?

Please note that I am using Python with SciPy.

Thanks in advance!
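
One possible approach, offered as a minimal sketch rather than a definitive answer: reshape each matrix to a single row while it is still sparse, then stack the rows with `scipy.sparse.vstack`. This assumes every matrix in the list has the same shape and that the installed SciPy supports `reshape` on sparse matrices (1.1 or later, as far as I know). The helper name `flatten_and_stack` and the toy list `ls` below are made up for illustration.

```python
import numpy as np
from scipy import sparse


def flatten_and_stack(matrices):
    """Flatten each sparse matrix to one 1 x (rows*cols) row and stack the rows.

    Hypothetical helper: assumes all matrices share the same shape.
    """
    rows = []
    for m in matrices:
        n_rows, n_cols = m.shape
        # Reshape in COO form so nothing is densified; the default
        # row-major order matches what toarray().ravel() would give.
        rows.append(m.tocoo().reshape((1, n_rows * n_cols)))
    # vstack concatenates the sparse rows and returns a single CSR matrix.
    return sparse.vstack(rows, format="csr")


# Toy stand-in for the list `ls` from the question (3 documents, 4x5 features).
ls = [sparse.random(4, 5, density=0.3, format="csr", random_state=i)
      for i in range(3)]
X = flatten_and_stack(ls)
```

Here `X` has one row per document and one column per flattened feature, and only the nonzero entries are ever stored, so memory use should scale with the number of nonzeros rather than with the full dense size.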