[ale] OT: Illegal XML Characters

Mike Millson mgm at atsga.com
Thu Mar 21 13:00:47 EST 2002


I have to parse XML that another company sends to a client of mine, and the
data often contains illegal characters. The usual culprits are the bytes hex
C0, 80, and B7. Any of these bytes in the XML stream causes my parser to
die. I ran the XML through an independent XML validator on the web, and the
validator also says the XML is bad. I have forgotten the error messages for
C0 and 80. The message for B7 is
"Error: Input error: Illegal UTF-8 start byte <0xb7>."
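
One way to pin down exactly where a stream goes wrong is to try decoding it
as UTF-8 and report the offset of the first offending byte. This is only a
minimal Python sketch, assuming the document is meant to be UTF-8 (the XML
default when no other encoding is declared); the function name
first_bad_utf8_byte is hypothetical.

```python
def first_bad_utf8_byte(data: bytes):
    """Return (offset, byte value) of the first byte that breaks UTF-8
    decoding, or None if the whole buffer decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte the decoder rejected.
        return exc.start, data[exc.start]
    return None


if __name__ == "__main__":
    sample = b"<a>\xb7</a>"  # made-up input containing a bare B7 byte
    hit = first_bad_utf8_byte(sample)
    if hit is not None:
        offset, value = hit
        print(f"illegal byte 0x{value:02x} at offset {offset}")
```

An offset and a byte value are easier to hand to the sender than a parser
stack trace.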

I need to be able to explain clearly to the client that the problem is the
other company sending invalid XML, not my parsing, and that they should
validate their XML before sending it out the door. I'm going through the
W3C documentation and haven't found anything that clearly explains (at
least to me) why the above bytes are not legal.

Does anyone have any ideas on how to explain why C0, 80, and B7 are not
valid XML characters?
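
For reference, the role each byte value can play in UTF-8 can be written
down in a few lines. This is a sketch based on the UTF-8 definition in
RFC 3629, not on any particular parser; the function name
classify_utf8_byte is hypothetical.

```python
def classify_utf8_byte(b: int) -> str:
    """Describe what role a single byte value (0-255) can play in UTF-8."""
    if not 0 <= b <= 0xFF:
        raise ValueError("not a byte value")
    if b <= 0x7F:
        return "ASCII, a complete one-byte character"
    if b <= 0xBF:
        # Bit pattern 10xxxxxx: only legal after a lead byte.
        return "continuation byte, cannot start a character"
    if b <= 0xC1:
        # C0 and C1 could only introduce overlong encodings of ASCII.
        return "never valid, would start an overlong encoding"
    if b <= 0xDF:
        return "lead byte of a two-byte sequence"
    if b <= 0xEF:
        return "lead byte of a three-byte sequence"
    if b <= 0xF4:
        return "lead byte of a four-byte sequence"
    return "never valid, would encode beyond U+10FFFF"


if __name__ == "__main__":
    for value in (0xC0, 0x80, 0xB7):
        print(f"0x{value:02X}: {classify_utf8_byte(value)}")
```

Under this reading, 80 and B7 are continuation bytes that appear where a
character should start, and C0 is never legal at all. A plausible cause,
though only a guess without seeing the data, is that the sender writes a
single-byte encoding such as ISO-8859-1 (where B7 is a middle dot and C0 is
a capital A with grave accent) without declaring it in the XML declaration.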

Thank you,
Mike


---
This message has been sent through the ALE general discussion list.
See http://www.ale.org/mailing-lists.shtml for more info. Problems should be 
sent to listmaster at ale dot org.





