Opened 11 years ago

Closed 9 years ago

#11593 closed enhancement (fixed)

Add support for wxXML_PI_NODE to wxXmlDocument

Reported by: rapses Owned by:
Priority: low Milestone: 2.9.2
Component: base Version: stable-latest
Keywords: wxXML XML wxXML_PI_NODE processing node Cc: vslavik@…, paul@…
Blocked By: Blocking:
Patch: yes

Description

As far as I know from looking at the source of xml.cpp there is no handling of wxXML_PI_NODE

You can do things like that:

------------------------------

wxXmlDocument doc;
Create the Processing node, enabeling the xsl transformation
<?xml-stylesheet type="text/xsl" href="File.xsl" ?>
wxXmlNode* style = new wxXmlNode(NULL,wxXML_PI_NODE,wxT("xml-stylesheet"));
style->AddProperty(wxT("type"),wxT("text/xsl"));
style->AddProperty(wxT("href"),wxT("file.xsl"));               
doc.SetRoot(style);
doc.SetFileEncoding(wxT("UTF-8"));
doc.SetVersion(wxT("1.0"));
doc.Save(wxT("file.xml"));

-------------------------------

Instead of adding the node, this will just add an empty line to the file.

I think this is important for handling XML and it would be really nice to have that working in wx3.0

kind regards,

Chris

Attachments (1)

xmlip.patch download (23.0 KB) - added by nickmat 10 years ago.
Add prolog and IP nodes to wxXmlDocument

Download all attachments as: .zip

Change History (19)

comment:1 Changed 11 years ago by vadz

  • Milestone 3.0 deleted
  • Priority changed from high to low
  • Status changed from new to confirmed
  • Summary changed from wxXML_PI_NODE Handling to Add support for wxXML_PI_NODE to wxXmlDocument

There indeed doesn't seem to be any support for this in wxXmlDocument. Presumably you could use Expat API directly though.

Anyhow, I agree that some sort of support for this could be nice. However AFAIK there are no plans to implement it currently, so the only hope for having this in 3.0 would be for someone else to submit a patch implementing it.

comment:2 Changed 11 years ago by nickmat

  • Patch set
  • Version changed from 2.8.x to 2.9-svn

Hi all,

Since I've already implemented this in my own project it seems a good opportunity to learn a few new skills and contribute a patch.

The patch creates wxXML_PI_NODE type nodes but since these are often included in the documents prolog before the root element, I've also added that.

To include the prolog I've added a wxXMLDOC_INCLUDE_PROLOG_NODES flag to the wxXmlDocument::Load member. When added the m_root node will point to the 1st item in the prolog (or the root element as normal if there is no prolog). Since in these circumstances GetRoot() will not always return a wxXML_ELEMENT_NODE type node, I've added the member GetRootElement() which will. Note that if the wxXMLDOC_INCLUDE_PROLOG_NODES flag is not used, GetRoot() and GetRootElement() are identical.

The patch has the bonus that comments before (and after) the root element can now be included in the document - so removing a FIXME from the code.

I hope this works and is useful, please let me know if it can be improved.
Regards,
Nick Matthews

comment:3 Changed 11 years ago by vadz

  • Cc vslavik@… added
  • Milestone set to 2.9.1

Patch looks very good to me, thanks!

If there are no objections (especially from Vaclav) I'm going to commit it soon, thanks!

comment:4 Changed 10 years ago by nickmat

I've just updated the patch to remove a memory leak.

comment:5 Changed 10 years ago by vaclavslavik

I have some serious issues with the patch:

  1. I don't think it's good idea to make this conditional on a flag; IMNSHO the prolog should always be read and written, it's regular part of the document.
  1. I understand (well, assume) that you did that so that you can assign very different meaning to GetRoot(). I think that's a bad idea too, the meaning of a method should not depend on the flags used to create the object. If nothing else, it makes following the code harder. To make the ensuing confusion even worse, the meaning of SetRoot() didn't change.
  1. Finally, the method used to read the prolog shouldn't be called GetRoot() (which takes care of point 2. above). The meaning of word "root" in XML specification is clear, it's the document element, i.e. what wxXmlDocument::GetRoot() returns (without your patch). Virtually all XML APIs I've seen deal with this by having node-like API in the document class too, and I think we should follow suit. Or, we may have GetProlog() method that may return NULL and is documented to return root siblings; something sensible would have to be done with SetRoot().

And a nitpick:

  1. Documentation is missing @since tags.

comment:6 Changed 10 years ago by vadz

  • Milestone 2.9.1 deleted

comment:7 follow-up: Changed 10 years ago by pjs

  • Cc paul@… added

Please consider adding this patch, or some support for wxXML_PI_NODE.

I'm currently working on an application that integrates with QuickBooks using Intuit's XML-based interface. QuickBooks absolutely will not accept XML unless it has a "qbxml" processing instruction specifying the version used, such as <?qbxml version="2.0"?>

I ended up using this ugly hack.

bool save_xml(wxXmlDocument &doc, const wxString &filename, const wxString &ver)
{
        wxMemoryOutputStream out;
        if (!doc.Save(out)) return false;
        FILE *fp;
        fp = fopen(filename.c_str(), "wb");
        if (!fp) return false;
        wxMemoryInputStream in(out);
        wxTextInputStream text(in);
        wxString line;
        int count = 0;
        while (!in.Eof()) {
                line = text.ReadLine();
                if (!line.IsEmpty()) {
                        fprintf(fp, "%s\n", line.c_str());
                        count++;
                        if (count != 1) continue;
                        fprintf(fp, "<?qbxml version=\"%s\"?>\n", ver.c_str());
                }
        }
        fclose(fp);
        return true;   // doc.Save(filename) would have been much simpler!
}

comment:8 in reply to: ↑ 7 Changed 10 years ago by vaclavslavik

Replying to pjs:

Please consider adding this patch

Did you read the comments above? We did consider it, but the patch has some quality issues that prevent it from being applied. If you want this included in wx, the best you could do would be to fix the problems described above yourself and submit updated patch.

comment:9 Changed 10 years ago by vadz

This is a nice patch and would be a pity for it to bit rot but unfortunately I just don't have time/motivation to improve it myself. Any volunteers for this would be really welcome.

comment:10 Changed 10 years ago by nickmat

I will take another look at this. It seems to be a problem of nomenclature since I was using "root" to mean the root entity, whereas vaclavslavik clearly intends it to mean the root element.

The problem is the member function DetachRoot and its partner SetRoot. I suspect that, dispite their name, anyone using these functions are actually expecting them to be handling the entire document contents and not just the root element. So when the prolog gets left behind it could have some unexpected consequences. Trailing comments also become a problem as these will need special handling - since detaching the root element effectively attaches them to the end of the prolog.

Nick

comment:11 Changed 10 years ago by vadz

I think that any existing code using these functions simply can't rely on prologue being returned by them because this isn't supported by the current code anyhow. So from compatibility point of view it seems clearly better to keep it this way. I don't care that much about compatibility in this concrete case though, I don't think there is that much code using these functions, but if the correct (at least according to Vaclav) behaviour is also backwards compatible, I really don't see any reason to not do it like this.

BTW, if you think the correct meaning of "root" is different I'd like to see any pointers to the relevant documentation/standards/whatever (this is not necessary for the patch, it's just out of personal curiosity).

comment:12 Changed 10 years ago by nickmat

I don't think there is a correct meaning of the word "root", when the XML spec uses the word it is always fully qualified. The root element is a key part of a XML document but the spec also talks of the root entity, as in Section 2 Documents: ... A document begins in a "root" or document entity. ... and again in section 4.8 [Definition: The document entity serves as the root of the entity tree and a starting-point for an XML processor.] ... So I don't my use of root as root entity is wrong, however, on reflection, it was probably not wise. I think Vaclav is correct insofar as most people will associate "root" with "xml" with the root element.

My understanding of the XML model is that the document (which is itself a node) has a list of children consisting of the various nodes making up the prolog followed by the root element node and possibly followed by more comment nodes. So based on that understanding I propose making the following changes to the patch.

  1. Remove the wxXMLDOC_INCLUDE_PROLOG_NODES flag.
  1. Ignore trailing comments [Although legal, taking account of them is probably more trouble than it's worth.]
  1. Scrap the new member GetRootElement() and instead add two new members to wxXmlDocument, wxXmlNode *DetachChildren() and SetChildren(wxXmlNode*) [I've used the term Children based on the document also being a node.]
  1. Rename the wxXmlDocument private member m_root to m_children.
  1. Modify the DetachRoot and SetRoot member functions to detach/attach the root element from/to the end of prolog chain. [I personally would also depreciate them - but I'm happy to take instruction on this.]

comment:13 follow-up: Changed 10 years ago by nickmat

Actually I've just realized we need some more access functions for the Documents children. Ideally, the wxXmlDocument would be derived from a wxXmlNode when these functions would come for free, but that is probably a change too far.

I will try to get something working before coming back.

Nick

comment:14 in reply to: ↑ 13 Changed 10 years ago by vaclavslavik

Replying to nickmat:

wxXmlDocument would be derived from a wxXmlNode

No. A document is not a node (and doesn't have all its things), it doesn't make sense to do it like this.

As for modifying the patch, I think I already described what an acceptable patch should look like and at first glance, your suggestion appears to fulfill it (if it's backward compatible).

Changed 10 years ago by nickmat

Add prolog and IP nodes to wxXmlDocument

comment:15 Changed 10 years ago by nickmat

I've updated the patch which now does the following to the wxXmlDocument class.

  1. The private member m_root is replaced by private member m_docNode. This points to node of type wxXML_DOCUMENT_NODE which is created (and removed) as required. It acts as the parent of the prolog and root element entities as well as any trailing comments.
  1. Three new member functions have been added to allow handling of the document tree in its entirety: SetDocumentNode, GetDocumentNode and DetachDocumentNode.
  1. A new member function AppendToProlog has been added to simplify creating the prolog
  1. The SetRoot, GetRoot and DetachRoot member functions have been modified to work on the root element which is now a child of the document node. These functions are backwards compatable but the do not have any effect on the document node or prolog (other to update them as necessary).
  1. The Load and Save member functions now load and save the document prolog and Processing Instructions (PIs).

I've tested this on Windows 7, compiled with VC9. Apart from the CppUnit unit test (which I have added to) I've also tested the xrc and richtext samples. If there any other samples using XML I can test them as well.

Let me know of any changes you require.

comment:16 Changed 10 years ago by vadz

  • Milestone set to 2.9.2

The patch looks good to me so I'm going to apply it soon if I don't hear any objections.

Thanks!

comment:17 Changed 9 years ago by VZ

(In [67345]) Fix memory leaks in wxXml unit test.

Ensure that the root wxXmlNode is deleted by storing it in a wxScopedPtr
instead of a raw pointer.

See #11593.

comment:18 Changed 9 years ago by VZ

  • Resolution set to fixed
  • Status changed from confirmed to closed

(In [67346]) Add support for elements preceding the document node in wxXML.

This is mainly useful for parsing and generating processing instructions but
can be used for any kind of elements, e.g. also comments, occurring before the
document node in XML documents.

Closes #11593.

Note: See TracTickets for help on using tickets.