[jdom-interest] Parsing HTML elements
jdom at tuis.net
Tue Nov 20 09:14:02 PST 2012
Hmmm, so you are not using the default parser.
JDOM expects the SAX Attributes getURI() method to return a non-empty value if the attribute name has a prefix. This is reasonable... ;)
This indicates the SAX stream is broken. JDOM should be throwing "Namespace URIs must be non-null and non-empty Strings".
If you cannot fix the SAX stream code, you could write a proxy class that fixes the URIs as the events pass through.
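A rough sketch of such a proxy, built on the standard org.xml.sax.helpers.XMLFilterImpl. The class name AttributeUriFixer and the placeholder URI scheme "urn:x-undeclared:" are made up for illustration, and I have not run this against NekoHTML, so treat it as a starting point rather than a tested fix:

```java
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

/**
 * Sketch of a SAX proxy: any attribute that has a prefix but no namespace
 * URI is given a placeholder URI before the event is passed downstream.
 */
class AttributeUriFixer extends XMLFilterImpl {

    // Hypothetical placeholder scheme; substitute the real URIs if you know them.
    static final String URI_BASE = "urn:x-undeclared:";

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        // Work on a copy so the parser's own Attributes object is left alone.
        AttributesImpl fixed = new AttributesImpl(atts);
        for (int i = 0; i < fixed.getLength(); i++) {
            String attQName = fixed.getQName(i);
            int colon = attQName.indexOf(':');
            if (colon <= 0) {
                continue; // no prefix, nothing to repair
            }
            String prefix = attQName.substring(0, colon);
            if (prefix.equals("xmlns") || prefix.equals("xml")) {
                continue; // reserved prefixes are left untouched
            }
            String attUri = fixed.getURI(i);
            if (attUri == null || attUri.isEmpty()) {
                fixed.setURI(i, URI_BASE + prefix);
                fixed.setLocalName(i, attQName.substring(colon + 1));
            }
        }
        super.startElement(uri, localName, qName, fixed);
    }
}
```

To wire it in, SAXBuilder.setXMLFilter(new AttributeUriFixer()) is probably the easiest route, since the builder then places the filter between the parser and JDOM's own handler. Whether a rewritten URI alone is enough for your JDOM version, or whether matching startPrefixMapping events are also needed, is something you would have to check.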
Paul Libbrecht <paul at hoplahup.net> wrote:
Hello JDOM experts,
I'm hitting a wall here and I am not sure who is responsible.
Just like in the previous series of posts, I am trying to parse an HTML document.
In this case I use the CyberNeko HTML parser http://nekohtml.sourceforge.net/, which creates a SAX stream and hence is easily convertible to a JDOM document.
Now, my big issue is that the document I have (which I cannot easily change right now) contains undeclared namespace-prefixed attribute-names!
Do I have a way to predefine the namespace somewhere?
Thanks in advance