SQL/XML Data Management - Formatting - Escaping

fady88

I've been working on a basic Content Management System lately that would allow my users to upload the following types of content:
  • Chat messages: I'm handling these with SQL only, connecting to a separate DB with only SELECT/UPDATE privileges.
  • Articles: I'm using an SQL/XML mix for these. I use SQL to store all the data about an article that is sensitive (e-mails, authorship, etc.) and the data that will come in handy within an SQL query (date/time, tags, category, etc.). The body of the article, together with other information on how to display it, its structure and a list of associated resources (galleries, videos, podcasts), is kept inside an XML file.
  • Comments: associated with articles but completely SQL-handled.
Since I'm working with data coming from different sources and ending up in different places, I'm trying to figure out a system to accomplish the following results:
  • Prevent injections
  • Prevent any content from being active at HTML rendering time: i.e. whatever you insert will be rendered exactly as you inserted it, so that any tags in it do not become part of the page's HTML structure
  • Safely store data inside XML documents which may be displayed via the browser's default parser: I know, for example, that & and # symbols that are not part of a complete entity pattern cause parsing errors when displaying XML files
  • Perform a check on any user-submitted data before saving it.
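To make the second and third goals concrete, here is a minimal sketch of what "rendered exactly as inserted" could look like. It assumes JavaScript, and the names `ENTITIES` and `escapeMarkup` are hypothetical helpers invented for illustration, not part of my CMS. The idea is only that the characters with special meaning in HTML/XML markup get replaced by references before the text is written into a page or an XML file.

```javascript
// Hypothetical helper (not from the CMS itself): map the five characters
// that are special in HTML/XML markup to entity or character references.
const ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;", // numeric reference, accepted by both HTML and XML parsers
};

// Replace every special character in a single pass, so the "&" produced
// by one replacement is never escaped a second time.
function escapeMarkup(text) {
  return String(text).replace(/[&<>"']/g, (ch) => ENTITIES[ch]);
}

// Example input that would otherwise become live markup.
const comment = '<b onclick="x()">Tom & Jerry</b>';
const safe = escapeMarkup(comment);
console.log(safe);
```

With this kind of escaping the tags are shown as literal text instead of becoming part of the page structure, and the bare & no longer breaks an XML parser.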
What I'm doing so far is this:
  • Retrieve the data from the input form via JavaScript
  • Execute a formatting sequence of string replaces to replace any symbol which is not a letter (lowercase or uppercase) or a number with its encoded equivalent
  • Send the formatted data over to the server
  • Strip from the received data all the unexpected symbols (everything [^(a-Z)(0-9)\&\#\;])
  • Compare the stripped string's length with the original length; if they do not match, flag the data as corrupted.
This way I think I can accomplish what follows:
  • Since every symbol other than a-Z and 0-9 is encoded, I don't have any problems with HTML encoding characters such as '
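
The encode-then-verify steps described above can be sketched roughly as follows. This is only an illustration under a few assumptions: that the "equivalent" of a symbol is its numeric character reference (`&#NN;`), that the server-side check is also written in JavaScript (the real server language may differ), and that the pattern is read as `[^a-zA-Z0-9&#;]`. The function names `encodeNonAlphanumeric` and `looksCorrupted` are hypothetical.

```javascript
// Client side (assumed encoding): every character that is not an ASCII
// letter or digit becomes a numeric character reference such as "&#38;".
function encodeNonAlphanumeric(text) {
  // Array.from iterates by code point, so characters outside the BMP
  // are encoded as one reference rather than two broken halves.
  return Array.from(text, (ch) =>
    /^[a-zA-Z0-9]$/.test(ch) ? ch : `&#${ch.codePointAt(0)};`
  ).join("");
}

// Server side (assumed to be JavaScript here): remove everything outside
// the expected alphabet and compare lengths. A difference means that
// something unexpected was present, so the data is flagged as corrupted.
function looksCorrupted(received) {
  const stripped = received.replace(/[^a-zA-Z0-9&#;]/g, "");
  return stripped.length !== received.length;
}

const encoded = encodeNonAlphanumeric("Tom & Jerry's <b>");
console.log(encoded);
console.log(looksCorrupted(encoded));    // encoded input passes the check
console.log(looksCorrupted("<script>")); // raw markup is flagged
```

One thing the sketch makes visible: the length comparison only confirms that the received string uses the allowed alphabet. A string such as `&&##` would also pass, so it does not by itself prove that every & starts a complete reference.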
 