Update XMLStringifier.attEscape method #86
Thank you. Manually merged. |
Ah, here's the issue. I've just added a comment on the commit: cf22542#commitcomment-15572748 It's true that these characters are valid in attribute values, but that doesn't mean it isn't a good idea to escape them. Only when they are escaped will they survive without getting converted to spaces. Can we revert this change please? |
Here's an example with JavaScript (libxmljs):
Ruby (Nokogiri):
Python (ElementTree):
As you can see, the tab character doesn't survive. With some parsers (e.g. xmldoc, xml2js) it does survive, but only because those parsers don't conform to the spec. |
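The normalization described above can be reproduced with a minimal sketch using Python's standard-library ElementTree. The helper name `parsed_attr` and the element/attribute names `e` and `a` are made up for illustration and are not taken from the thread.

```python
import xml.etree.ElementTree as ET


def parsed_attr(raw_value: str) -> str:
    """Parse <e a="..."/> with raw_value placed verbatim between the quotes.

    Hypothetical helper for illustration; raw_value is assumed to be
    already well-formed attribute content.
    """
    return ET.fromstring(f'<e a="{raw_value}"/>').get("a")


# A literal tab character in the markup: the parser normalizes it.
literal = parsed_attr("x\ty")

# The same tab written as a character reference: the parser keeps it.
escaped = parsed_attr("x&#x9;y")

print(repr(literal))
print(repr(escaped))
```

Comparing the two printed values shows the difference between a literal whitespace character and its character reference after parsing.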
I like the current behavior since whitespace chars are valid according to the spec. However, I am not against escaping whitespace if it is done explicitly, i.e. controlled with a flag which defaults to false. Would you like to send another pull request if that is OK with you? |
Please reconsider:
|
Can you take a look at 272b53f and let me know what you think? This should be close to your previous PR #54 with one difference: Both the two char newline |
Hi, on a mobile phone so being brief. I don't think replacing CR like this is right. Firstly it can still leave newlines (as opposed to escaped newlines) in the result. Secondly the canonical representation spec (linked to above) says:
To me this seems pretty straightforward, just direct replacement. The spec you linked to applies only at the entity level as far as I can tell. I can't check right now how all the other implementations behave but I'd be surprised if they mangled CRLF like that. I can check early next week if it helps. |
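As a rough sketch of the "direct replacement" idea, the function below replaces each special character with its character reference, one for one, so that CR and LF are never merged or dropped. It is written in Python for illustration only and is not the library's actual `attEscape`; the name `att_escape` and the `escape_whitespace` parameter (modeled on the opt-in flag suggested earlier in the thread, defaulting to off) are assumptions.

```python
import xml.etree.ElementTree as ET

# "&" must be replaced first so the references added later are not re-escaped.
_BASE = [("&", "&amp;"), ("<", "&lt;"), ('"', "&quot;")]
_WHITESPACE = [("\t", "&#x9;"), ("\n", "&#xA;"), ("\r", "&#xD;")]


def att_escape(value: str, escape_whitespace: bool = False) -> str:
    """Escape an attribute value by direct one-for-one replacement.

    Hypothetical sketch: escape_whitespace is an assumed opt-in flag.
    """
    pairs = _BASE + (_WHITESPACE if escape_whitespace else [])
    for char, ref in pairs:
        value = value.replace(char, ref)
    return value


original = "a\r\nb\tc"
markup = f'<e a="{att_escape(original, escape_whitespace=True)}"/>'

# Parse the generated markup back to check that the whitespace survives.
round_tripped = ET.fromstring(markup).get("a")
print(repr(round_tripped))
```

With the flag on, CR, LF and tab each become their own reference, so a conforming parser should hand back the original string. With the flag off, those characters are emitted literally and are subject to the parser's normalization.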
According to http://stackoverflow.com/questions/449627/are-line-breaks-in-xml-attribute-values-valid and http://www.w3.org/TR/REC-xml/#NT-AttValue, the \r, \n and \t characters are valid in attribute values.