OmniWeb Bookmark Format

Andreas Åkre Solberg mac at solweb.no
Fri Aug 26 15:53:55 PDT 2005


I am sharing these discoveries for others parsing the bookmark files.
- - - -

I guess everybody hates the OmniWeb bookmark format. It is neither
HTML nor XML!

I am working on parsing these files, and trust me, it's a lot of
work. It could be done in 10 minutes if Omni Group used standard
XML.

The current status is that I needed to patch htmltidy so that it does
not throw exceptions when it encounters the bookmarkinfo tag. This
patched version got me an XML version of the bookmark files. These
files follow (or come close to) this DTD:

<!ELEMENT a (#PCDATA)>
<!ATTLIST a validator CDATA #IMPLIED>
<!ATTLIST a status CDATA #IMPLIED>
<!ATTLIST a checkfrequency CDATA #IMPLIED>
<!ATTLIST a wordcount CDATA #IMPLIED>
<!ATTLIST a lastcheckedtime CDATA #IMPLIED>
<!ATTLIST a bookmarkid CDATA #IMPLIED>
<!ATTLIST a href CDATA #IMPLIED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT html (#PCDATA | head | body)*>
<!ELEMENT head (#PCDATA | meta | title)*>

<!ELEMENT bookmarkinfo (#PCDATA | dl)*>
<!ATTLIST bookmarkinfo rootid CDATA #IMPLIED>
<!ATTLIST bookmarkinfo nextid CDATA #IMPLIED>

<!ELEMENT dl (#PCDATA | dd | dt)*>

<!ELEMENT dd (#PCDATA | h3 | dl)*>
<!ELEMENT dt (#PCDATA | a)*>

<!ATTLIST dl archivedictionary CDATA #IMPLIED>
<!ATTLIST dl delegateclass CDATA #IMPLIED>

<!ELEMENT h3 (#PCDATA | a)*>
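For anyone who wants to walk the tidied XML, here is a minimal Python
sketch of how the structure above could be read. It assumes that a dt
holds a bookmark link and that a dd holds a folder (an h3 title
followed by a nested dl), which is my reading of the DTD rather than
anything documented. The SAMPLE data and the function names walk and
bookmarks are made up for illustration; they are not taken from a real
OmniWeb file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like the DTD above; not real OmniWeb output.
SAMPLE = """\
<bookmarkinfo rootid="1" nextid="4">
  <dl>
    <dt><a bookmarkid="2" href="http://uninett.no">UNINETT</a></dt>
    <dd><h3>Mac</h3>
      <dl>
        <dt><a bookmarkid="3" href="http://www.omnigroup.com/">Omni Group</a></dt>
      </dl>
    </dd>
  </dl>
</bookmarkinfo>
"""


def walk(dl, path=()):
    """Yield (folder_path, title, href) for every link below a <dl>."""
    for child in dl:
        if child.tag == "dt":
            # Assumption: a <dt> wraps one bookmark link.
            for a in child.findall("a"):
                yield path, (a.text or "").strip(), a.get("href")
        elif child.tag == "dd":
            # Assumption: a <dd> is a folder, an <h3> title plus a nested <dl>.
            name = ""
            for sub in child:
                if sub.tag == "h3":
                    name = "".join(sub.itertext()).strip()
                elif sub.tag == "dl":
                    yield from walk(sub, path + (name,) if name else path)


def bookmarks(xml_text):
    """Parse tidied XML and yield bookmarks from every <bookmarkinfo>."""
    root = ET.fromstring(xml_text)
    # iter() also matches the root element itself, so this works whether
    # bookmarkinfo is the root or sits somewhere inside html.
    for info in root.iter("bookmarkinfo"):
        for dl in info.findall("dl"):
            yield from walk(dl)


if __name__ == "__main__":
    for folder, title, href in bookmarks(SAMPLE):
        print("/".join(folder) or "(top)", title, href)
```

The other attributes on a (status, lastcheckedtime and so on) can be
read the same way with a.get(...), since the DTD declares them all as
optional CDATA.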

-- 
Andreas Åkre Solberg
Andreas.Solberg at uninett.no
UNINETT - http://uninett.no

Contact Info and PGP Public Key:
http://andreas.solweb.no/?Account=Work




