Ticket #3083 (reopened defect)

Opened 17 months ago

Last modified 2 weeks ago

xmpppy bug

Reported by: bb Owned by: asterix
Priority: low Milestone:
Component: xmpppy Version: svn
Severity: critical Keywords: xml prefix
Cc: OS: All

Description

<message from='gajim@conference.gajim.org/zinid' type='groupchat' id='1' xml:lang='ru-RU'>\n <x:y/>\n</message>

This is why we need more mods in gajim@.

Attachments

xmpppy.diff (1.3 kB) - added by asterix 12 months ago.
stream-dropping-unbound-xmlns-prefix-fix.patch (2.5 kB) - added by bct 2 months ago.

Change History

  Changed 16 months ago by asterix

  • milestone 0.11.1 deleted

I don't see anything we can do. it's a parser error that can't be catched.

follow-up: ↓ 6   Changed 14 months ago by misc

  • os set to All

For the record, can someone explain the problem ?

The xml snippet just arrived from the muc componant and it did something wrong to gajim ( crash ? traceback ? error ? )

Has this problem is rated as "critical", it would be nice to have more information, or to close it according to comment 1.

  Changed 14 months ago by evgs

this problem is not directly related to muc - problem is mostly noticeable in muc:

if such message sent to conference gajim can't enter this conference until this message will be moved out from room history delivered at entering conference.

if muc is not used gajim only reconnects silently. but silent reconnect is critical because messages arrived into TCP buffer may be lost.

this bug is xmpppy bug, there are some another apps using xmpppy and affected the same problem.

stanza displayed above is valid, but not well-formed. most parsing library will return warnings if xml is not well-formed. but non-well-formed xml is fatal for xmpppy.

background: non-well-formed stanzas may be sent by j2j transport Jabber-To-Jabber Transport (GTalk, LiveJournal? inside), http://wiki.JRuDevels.org/index.php/J2J bug is in the deep of used library (Twisted), and can't be resolved without hard tracing.

follow-up: ↓ 26   Changed 12 months ago by tb0hdan@…

guys, check out this: http://svn.hypothetic.org/neutron/branches/gh0st-dev/modules/xmpp/dispatcher.py --- WBR, Gh0st, one of the Neutron developers

  Changed 12 months ago by asterix

it works nicely for the described problem, but it break things when we receive big stanza (vcard with avatar for exemple) those stanza are received in 2 (or more) part. and each part is tested and of course rejected as it's not a complete stanza, so they are discarded ...

in reply to: ↑ 2   Changed 12 months ago by bb

  • keywords xml prefix added

Replying to misc:

For the record, can someone explain the problem ?

Try it out yourself with the XML console, you'll notice the problem soon enough.

Anyway, as Asterix said, it's not a problem of code written by the Gajim team indeed. The faulty code is, however, shipped with Gajim and as such users are going to see this as a Gajim bug, no matter how you put it :/

If you want to fix this, you'll have to work on the parser code, I guesss. If you do not, all gajim clients will be an easy target to shoot down for everybody who wants to be annoying.

Changed 12 months ago by asterix

  Changed 12 months ago by asterix

here is a patch that doesn't check big packet (or packet for which previous is big ans it's the end of the big stanza, so invalid too)

That's definitly not very good, it's why I don't commit it

  Changed 9 months ago by anonymous

<presence><caps:c/></presence>

  Changed 8 months ago by steve-e

  Changed 2 months ago by js

  • status changed from new to closed
  • resolution set to fixed
  • milestone set to 0.12

(In [9859]) Don't disconnect on unbound prefixes. Fixes #3083. I can't believe this was unfixed for so long, as it's really a rather easy fix.

  Changed 2 months ago by bct

  • status changed from closed to reopened
  • resolution fixed deleted

http://tools.ietf.org/html/draft-saintandre-rfc3920bis-05#section-12.3

A XMPP entity MUST NOT accept XML data from another entity if that data is not well-formed in accordance with both the definition of "well-formed" in Section 2.1 of [XML] and the definition of "namespace-well-formed" in Section 7 of [XML-NAMES]. In an XMPP entity receives XML data that is not so well-formed, it MUST return an <xml-not-well-formed/> stream error and close the stream over which the data was sent.

This message is not "namespace-well-formed", so we're doing the right thing according to the spec.

It may make sense for us to handle this specific case if the server isn't following the spec (but IMO handling arbitrary XML-like strings is a path we don't want to go down).

  Changed 2 months ago by bct

(reopened because js reverted his fix)

  Changed 2 months ago by steve-e

Furtheremore the problem hasn't only to do with namespaces. Any malformed xml can cause the stream to drop.

  Changed 2 months ago by js

Sorry bct and steve-e, but your both wrong: The problem is not malformed XML. The server never routes that, it is required to drop that according to XMPP Core and all servers do so. But unbound namespaces MAY be routed, and ejabberd decides to do so. On malformed XML, we SHOULD disconnect and we do so, but on unbound namespaces we SHOULD just ignore that.

Quoting XMPP core: Except as noted with regard to 'to' and 'from' addresses for stanzas within the 'jabber:server' namespace, a server is not responsible for validating the XML elements forwarded to a client or another server; an implementation MAY choose to provide only validated data elements but this is OPTIONAL (although an implementation MUST NOT accept XML that is not well-formed). Clients SHOULD NOT rely on the ability to send data which does not conform to the schemas, and SHOULD ignore any non-conformant elements or attributes on the incoming XML stream. Validation of XML streams and stanzas is OPTIONAL, and schemas are included herein for descriptive purposes only.

This means we can receive invalid XML (for example, unbound namespace, which is the problem here), but NEVER malformed XML. Thus we handle malformed XML correctly, but don't ignore non-conformant like we should.

  Changed 2 months ago by bct

js, the stanza is *not* namespace well-formed.

http://www.w3.org/TR/REC-xml-names/

The Prefix provides the namespace prefix part of the qualified name, and MUST be associated with a namespace URI reference in a namespace declaration.

...

Definition: A document is namespace-well-formed if it conforms to this specification.

  Changed 2 months ago by js

By well-formed, one usually refers to the XML being not malformed, whereas invalid means something makes it non-conformant, for example namespaces like in this case.

  Changed 2 months ago by bct

"well-formed" = follows the XML spec "namespace-well-formed" = follows the XML namespaces spec

As I quoted in http://trac.gajim.org/ticket/3083#comment:11 , the revised RFC specifically says to close the stream if it isn't namespace-well-formed.

  Changed 2 months ago by js

The revised does? Well, that'd be a bad change, or will jabber servers filter it then? Can you please quote that part or give a link? If the server should filter it in the revised RFC, I'll talk to the ejabberd devs again.

  Changed 2 months ago by bct

http://tools.ietf.org/html/draft-saintandre-rfc3920bis-05#section-12.3

A XMPP entity MUST NOT accept XML data from another entity if that data is not well-formed in accordance with both the definition of "well-formed" in Section 2.1 of [XML] and the definition of "namespace-well-formed" in Section 7 of [XML-NAMES]. In an XMPP entity receives XML data that is not so well-formed, it MUST return an <xml-not-well-formed/> stream error and close the stream over which the data was sent.

So the server should be dropping it. I logged it as a bug against ejabberd: https://support.process-one.net/browse/EJAB-680

  Changed 2 months ago by bct

I wrote a test for this issue: source:trunk/test/test_dispatcher_nb.py

It should help shorten the debugging cycle. Run it with python test/test_dispatcher_nb.py

  Changed 2 months ago by js

bct, the problem is that this is only for the revised RFC, not for the current RFC :(. For the current RFC, we do it wrong whereas with the revised RFC, ejabberd does it wrong. So, this still is a issue on our end.

  Changed 2 months ago by bct

The current RFC isn't specific about what it means by "well-formedness". It's not wrong to drop the message, it's just using a strict interpretation. It makes sense for the server to reject it anyways IMO, there's no reason not to.

Anyhow, we have to deal with it so I've attached my patch attempt. It buffers the stream until it thinks it's reached the end of the bad stanza, drops the whole thing and starts parsing again.

I don't think dropped messages will even end up in the XML console, so it needs to be tested carefully.

Changed 2 months ago by bct

  Changed 2 months ago by bct

Did some digging in pyexpat's internals. It looks like we can turn off expat's namespace processing by removing the namespace_separator=' ' from line 297 of simplexml.py. I don't know what side effects this would have.

  Changed 2 months ago by js

That does not work as all our code relies on a namespace aware expat, I already tried it. Most of simplexml.py would need to be rewritten, I already had a few tries with it.

  Changed 2 months ago by asterix

  • priority changed from highest to low
  • milestone 0.12 deleted

we choosed to live with this big for the moment. That's no SO critical as that closes stream and we don't lloose anything.

in reply to: ↑ 4   Changed 2 weeks ago by anonymous

Add/Change #3083 (xmpppy bug)

Author



Change Properties
<Author field>
Action
as reopened
as The resolution will be set. Next status will be 'closed'
to The owner will change. Next status will be 'new'
 
Note: See TracTickets for help on using tickets.