So to continue on ways to convert existing documents to wiki code, next is formatted text documents, which is typically word DOC files, but may also be something like RTF files.
Most sites I found actually just instructed people to use a 2 step conversion. From Word to HTML and then to wiki code. While this may work, it’s much less efficient and I can imagine more things are lost in the process. Admittedly, the converters that I have found are all geared towards MediaWiki, so if you’re using a different wiki then these converters may not work so well. Nevertheless, MediaWiki provides a list of Word to Wiki converters the most basic of which does not seem to be specifically geared to MediaWiki.
OpenOffice Sun Wiki Publisher Plugin (MAC and Windows compatible, not sure about other platforms)
(the wiki converter is built-in, the publishing part of it is optional)
The downside of OpenOffice is that it does not always interpret word documents very well. Embedded images tend to turn into hex code (ex. ffd8ffe000104a46494600010201 etc.) and tables aren’t always interpreted correctly either. The one I tried turned into overlapping text. So, in part, the usefulness of the outputted wiki code will depend on how well OpenOffice has read the word DOC itself, but it should handle ODT and RTF just fine.
Word2MediaWikiPlus Macro (Windows Only)
Word is the better choice for documents that OpenOffice can’t seem to handle very well. There is also a Word2MediaWiki Macro which is easier to use, but does not convert tables or deal with images very well.
For the OpenOffice plugin, ‘special characters’ (used loosely here) sometimes turn into weird symbols or random special characters. As with the HTML converters from the last post, something like ’ (not straight apostrophe) gets changed into ‚Äô, or a bullet point (which isn’t recognized to be in a bulleted list) turns into ‚Ä¢.
The Word2MediaWikiPlus (W2MWP) converter is better at dealing with special characters. The macro will simply insert the character as is and at times put a nowiki tag around it, but regardless, it displays just fine.
For some reason, the W2MWP plugin turns text boxes into a single cell table and then repeats the same text again as regular text (not inside a table). The OpenOffice plugin strips the text of formatting and leaves it as regular text in the wiki output.
When tables are interpreted correctly, I think the OpenOffice plugin does a better job overall. The W2MWP macro is better at keeping formatting, such as colours and border style (below right), but OpenOffice one seems to interpret things inside a table better, such as type of lists (below left). (It’s supposed to be a bulleted list, not a numbered list.)
Needs Good Original Document Formatting
In both cases, the usefulness of the wiki code will depend on how well the original document was formatted. For example, in one of the documents I tested, a number of the number and bullet lists were not formatted as such, but instead, numbers and bullets were just manually added. In both plugins, they were considered to be regular text with a ‘special’ character or number at the beginning of it.
Whether the Word2Wiki or the OpenOffice plugin is better depends on your priorities. OpenOffice seems to interpret lists and text boxes better, and doing a replace all for characters that weren’t interpreted properly is a pretty quick step. W2MWP is better at keeping formatting and interpreting all characters. So, if you like the way your document looks and you want to keep it that way, use the W2MWP macro. The big downside of course is that it doesn’t work on MACs (which I’m using right now, yay for VMware). Nevertheless, my conclusion is that the DOC2Wiki Converters are useful, but may not be the optimal solution depending on how much you’re willing to install and play around with. And if the document isn’t formatted like it should be, then manual wiki formatting might be the way to go.