On 12/09/28 18:42, Michael[tm] Smith wrote:
Dunno if you also looked at pdftohtml or some other PDF-to-HTML converter,
but that might be worth trying. Just an idea.

Another idea: A while back Jason Orendorff set up a fully automated way (I
think) to generate an HTML version of the EcmaScript 5.1 spec -- generated
from either the PDF or .doc sources. Result is here:


It might be possible to repurpose whatever toolchain he set up to do that.


Thanks for the suggestions. However, I think it would be more productive to contact someone who can fix the bugs in either OpenOffice or LibreOffice. They don't look hard to fix for someone who knows the code. The worst problem is that the hrefs in the table of contents use names in a different format from the anchor names they are supposed to point to, so the two do not match. Writing a script to fix this up would be non-trivial, while fixing the bug itself should be trivial.
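
For anyone who wants to see the extent of the mismatch, here is a rough sketch in Python that only detects the problem and does not attempt to repair it. It assumes the exported file is plain HTML in which the table of contents links with href="#..." and the targets are defined with either <a name="..."> or an id attribute. The class name, the function name and the sample names below are made up for illustration and are not taken from the actual export.

```python
from html.parser import HTMLParser


class AnchorAudit(HTMLParser):
    """Collect in-page link targets and the anchor names a page defines."""

    def __init__(self):
        super().__init__()
        self.targets = []     # fragments referenced via href="#..."
        self.anchors = set()  # names defined via <a name="..."> or id="..."

    def handle_starttag(self, tag, attrs):
        for key, value in attrs:
            if value is None:
                continue
            if tag == "a" and key == "href" and value.startswith("#"):
                self.targets.append(value[1:])
            elif key == "id" or (tag == "a" and key == "name"):
                self.anchors.add(value)


def dangling_fragments(html_text):
    """Return the href fragments that have no matching anchor in the page."""
    audit = AnchorAudit()
    audit.feed(html_text)
    return [t for t in audit.targets if t not in audit.anchors]


# Hypothetical sample: the first link points at a name no anchor defines.
sample = (
    '<a href="#__RefHeading__12">1 Scope</a>'
    '<a href="#sec2">2 Conformance</a>'
    '<h1><a name="Scope"></a>1 Scope</h1>'
    '<h1 id="sec2">2 Conformance</h1>'
)
print(dangling_fragments(sample))
```

Listing the dangling links is the easy part. Working out which anchor each one was meant to point to is where a fix-up script gets non-trivial, which is why I would rather see the exporter fixed.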



