juliendutant
Posts: 4
Joined: Sat Aug 27, 2016 1:46 pm

Tutorial: export .docx or .odt with bibliography  Topic is solved

Postby juliendutant » Sat Aug 27, 2016 6:13 pm

How to export to MS Word Office XML (.docx) and Libreoffice OpenDocument (.odt) with bibliography, almost out of the box with LyX!

I've been using and enjoying LyX for years, for its clean output and bibliography management (from LaTeX) and its convenient, distraction-free interface. However, in my line of work (Humanities) I'm constantly required to produce MS Word documents. It's been a pain that there was no simple and reasonable way to convert LaTeX to Word output. In my case I didn't need any fancy conversion of LaTeX math or graphs, only the basic layout (headings, italics, ...), the bibliography and citations, and cross-references within the document. Still, it was far from straightforward to get that. I used latex2rtf, which did a good job but is no longer developed and is becoming increasingly outdated, as well as hard to get up and running on Windows and Mac OS X.

Fortunately the trouble is over now. Thanks to improvements in LyX and a wonderful modern converter (pandoc), you can now export to MS Word and OpenDocument practically out of the box in LyX. With a few tweaks, the export will include your bibliography too. I think that's a dramatic improvement that makes it much easier for people to use LyX even if they're stuck in MS Word-based environments.

This tutorial details what you need to do to get the conversion up and running. I haven't found these instructions elsewhere online, so hopefully they'll be useful to some. I'm pitching this to the non-tech-savvy user, so bear with me if it's more detailed than you need.

Export to MS Word Office XML and to OpenDocument out of the box

Export to MS Word Office XML and OpenDocument now works practically out of the box in LyX, though without bibliography. Here are the steps:

1. Get the latest version of LyX. I think you need versions 2.1 or 2.2 or more recent for the out-of-the-box configuration of the pandoc converter to work. If you use older versions you need to enter the configuration manually, see below.

2. Install the document converter pandoc. Check the installation page of pandoc's website for instructions specific to your system.

For Mac OS X users: it's a good idea to install Homebrew first. That's a "package manager", a utility that installs and keeps updated various software, notably pandoc. Follow the installation instructions on Homebrew's website. Then follow the instructions on pandoc's website to install pandoc using Homebrew. Note: currently pandoc's website only tells you to install pandoc by entering the following command in the Terminal:
  brew install pandoc

(The Terminal is in Applications > Utilities; you can use it to enter "commands". That is, you copy the text above, paste it in the Terminal, and press "Enter".) However, that will not install pandoc-citeproc, which is needed to process the bibliography. To get that as well you should also enter:
  brew install pandoc-citeproc
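
Before moving on to step 3, it can help to confirm that the Terminal actually finds both programs. The following is a minimal sketch, not something that comes with pandoc or LyX: have_tool is a helper name made up for this example, and it relies only on the standard shell built-in "command -v".

```shell
# have_tool NAME: succeeds if a program called NAME is found on the PATH.
# (have_tool is a hypothetical helper for this example.)
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report on the two programs this tutorial needs.
for tool in pandoc pandoc-citeproc; do
    if have_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: NOT found"
    fi
done
```

If pandoc is reported as not found, LyX's reconfiguration in step 3 will most likely not detect it either.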

3. Launch LyX's autoconfiguration. To do this, open LyX and go to Tools > Reconfigure.

That takes a few moments during which LyX scrutinizes the system and finds what tools and formats it can use. If all goes well it will detect pandoc and configure itself for using it. When the process is over, LyX displays a message that says that you must restart it to use the new configuration.

4. Restart LyX. In the menus: File > Quit LyX, or on Mac OS X: menu LyX > Quit LyX. Then start LyX again.

5. You should now be able to export a file in MS Word Office XML format and OpenDocument format. Open a document in LyX, or create a new one and save it in a location you can find. Then choose File > Export > MS Word Office XML, or alternatively File > Export > OpenDocument (Pandoc). You will see processing messages at the bottom of the window. When "Export successful" appears, go to the location of your LyX document. You should find the converted .docx or .odt document next to it.

If you do not see the options OpenDocument (Pandoc) and MS Word Office XML in the Export submenu, that means that pandoc has not been automatically configured. Follow the steps in "Older versions of LyX" below.

If export works correctly, but you want to export bibliographies as well, go on to step 11.

TROUBLESHOOTING

During step 5, you may get an error warning like "Error while running pandoc -s -f latex -o $$o -t docx $$i". It means that the pandoc converter encountered an error while processing the file that LyX sent it. The most likely cause for that is that LyX didn't export your file with the character encoding that pandoc requires, namely utf8. I detail the steps to fix that below in the post. The short version: in LyX go to Document > Settings, select "Language", and under "Encoding" select "Other: Unicode (utf8)". Click on "Save" to save the settings. Try to export again (step 5 above).

If you still get an error message, there's another problem with pandoc. To inquire into it, activate the Messages pane (View > Message Pane) and look at the error messages that appear when you try to do the pandoc export. Then search online or ask in forums to figure out what causes them.
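
Another way to see what went wrong is to run the converter command yourself in a terminal, where pandoc prints its own error message. In LyX's converter commands, $$i stands for the input file and $$o for the output file. The sketch below fills in those two placeholders. The name expand_converter is made up for this example, and mydocument.tex is assumed to be a file you exported with File > Export > LaTeX (plain).

```shell
# expand_converter TEMPLATE INPUT OUTPUT
# Prints TEMPLATE with LyX's placeholders replaced: $$i by INPUT, $$o by OUTPUT.
# (Hypothetical helper; assumes file names without the characters | & \ .)
expand_converter() (
    printf '%s\n' "$1" | sed -e 's|\$\$i|'"$2"'|g' -e 's|\$\$o|'"$3"'|g'
)

# The template is in single quotes so that the shell leaves $$ alone.
expand_converter 'pandoc -s -f latex -o $$o -t docx $$i' mydocument.tex mydocument.docx
# prints: pandoc -s -f latex -o mydocument.docx -t docx mydocument.tex
```

You can then paste the printed line in the terminal, from the folder where mydocument.tex is, and read what pandoc answers.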

OLDER VERSIONS OF LYX

Steps 6-10 below are only for older versions of LyX (2.1 and below?) that do not configure pandoc automatically. They should work, though I haven't tested them.

6. If not done already, install pandoc. See step 2 above.

7. In LyX open the Preferences box, Converters section: from the menu, choose Edit > Preferences (Linux and, I think, Windows) or LyX > Preferences (Mac OS X). Then in the box select the section File Handling > Converters.

8. Create a converter from LaTeX (plain) to MS Word Office XML. To do that:
a. Under "From format" select "LaTeX (plain)".
b. Under "To format" select "MS Word Office XML" or, if not available, "MS Word".
c. In the "Converter" field enter the following:
  pandoc -s -f latex -o $$o -t docx $$i

d. Click "Add" or "Modify" (one, and only one, of the two buttons will be active). This will add ("Add") a converter or replace ("Modify") an existing converter for the same formats.

9. Create a converter from LaTeX (plain) to OpenDocument. To do that:
a. Under "From format" select "LaTeX (plain)".
b. Under "To format" select "OpenDocument".
c. In the "Converter" field enter the following:
  pandoc -s -f latex -o $$o -t odt $$i

d. Click "Add" or "Modify" (one, and only one, of the two buttons will be active). This will add ("Add") a converter or replace ("Modify") an existing converter for the same formats.

10. Click "Save" to save preferences and try to export a document. See step 5 above.

Export to MS Word or Libreoffice WITH bibliography

If your LyX documents typically include a bibliography, you are or should be using a BibTeX bibliography file (.bib). Note that a BibTeX bibliography database can be used to insert references in Word or Libreoffice documents too, if you use a bibliography manager like Zotero (https://www.zotero.org/). For more information on using BibTeX for bibliographies in LyX see LyX's User Guide, the section More Tools > Bibliography > Bibliography databases (bibtex). To read the guide open LyX and go to Help > User Guide; you can find the relevant section by using the Navigate menu.

In its default setup for the pandoc converter, LyX exports to MS Word Office XML and OpenDocument do not include bibliographies. Perhaps that's a good thing, as bibliography files with the wrong encoding can cause an error that the average user will have a hard time to uncover (see below, utf8 encoding issue). Below are the steps to allow exports with bibliography.

Note for advanced users: This works even if your .bib file is in a different folder, and whether its path is given in absolute or relative terms.

I assume that you have a LyX document with a bibliography in BibTeX. This requires a separate bibliography file with the extension .bib. BibTeX files can be created using an ordinary text editor or a bibliography manager like JabRef or Zotero (for the latter, you need to export as BibTeX).

11. If not done already, install pandoc and pandoc-citeproc. See step 2 above. Depending on your system pandoc-citeproc is included in pandoc or must be installed separately.

12. If not done already, reconfigure LyX after pandoc installation. See step 3 above. If your version of LyX is old, you'll need to go through steps 7-10 instead.

13. Open the LyX Preferences box, Converters section. In LyX, open the Preferences box, via the menu Edit > Preferences (Linux and, I think, Windows) or LyX > Preferences (Mac OS X). In the Preferences box, select the File Handling section, and within that, the Converters section.

14. Change the LaTeX to MS Word converter.
a. Scroll down the list of converters and select “Latex (plain) -> MS Word Office Open XML”.
b. In the "Converter" field, replace the default command line:
  pandoc -s -f latex -o $$o -t docx $$i

with the following:
  pandoc -s --filter pandoc-citeproc -f latex -o $$o -t docx $$i

The "--filter pandoc-citeproc" addition tells pandoc to use the tool "pandoc-citeproc" to process bibliographic references.
c. Click the button "Modify".

15. Change the LaTeX to OpenDocument converter. As in the previous step, except that you scroll down instead to "Latex (plain) > OpenDocument (Pandoc)" and that you have "odt" instead of "docx" in the command lines. That is, you replace:
  pandoc -s -f latex -o $$o -t odt $$i

with the following:
  pandoc -s --filter pandoc-citeproc -f latex -o $$o -t odt $$i


16. Click "Save" to save the new preferences.

17. You should now be able to export your document with bibliography. See step 5 above. Look at the resulting OpenDocument or MS Word file, and see your references.
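
A note for readers with a newer pandoc: from pandoc 2.11 on, citation processing is built into pandoc itself and is switched on with --citeproc instead of --filter pandoc-citeproc. The sketch below picks the option from a version number. The name citeproc_option is made up for this example, and you can read your version number from the first line printed by "pandoc --version".

```shell
# citeproc_option VERSION
# Prints the pandoc option that turns on bibliography processing for that version.
# (Hypothetical helper for this example.)
citeproc_option() (
    major=${1%%.*}                 # "2.11.4" -> "2"
    case "$1" in
        *.*) rest=${1#*.}          # "2.11.4" -> "11.4"
             minor=${rest%%.*} ;;  # "11.4"   -> "11"
        *)   minor=0 ;;
    esac
    if [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 11 ]; }; then
        echo "--citeproc"
    else
        echo "--filter pandoc-citeproc"
    fi
)

citeproc_option 1.17.2   # prints: --filter pandoc-citeproc
citeproc_option 2.11     # prints: --citeproc
```

With a newer pandoc, the converter command of step 14 would then presumably read "pandoc -s --citeproc -f latex -o $$o -t docx $$i".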

TROUBLESHOOTING

When you try to export you may get an error message: "Error while running pandoc -s --filter pandoc-citeproc -f latex -o $$o -t docx $$i." If so, that's most likely because the files LyX sends to the converter (a LaTeX .tex file with your text and a BibTeX .bib file with your bibliography) are, either or both, in the wrong "encoding". Encodings are ways in which the characters of a text file are stored. Until recently .tex and .bib files were typically encoded in "ISO-8859 Latin text", but more recently the encoding "UTF-8" has become standard, as it handles pretty much any character, including e.g. Chinese. Pandoc requires UTF-8 encoding, but LyX may need some set-up to send files in the proper encoding. If LyX sends a file with the wrong encoding, and the file contains accented or special characters such as é î ø etc., pandoc will not convert the file.

In short, if you encounter that error, try the steps below and try to export your file again.
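
To make the encoding issue concrete, here is a small sketch you can paste in a terminal. It prints the bytes that stand for the letter é in each of the two encodings. It assumes the standard tools od, tr and iconv are available (they usually are on Linux and Mac OS X); hex_bytes is a name made up for this example.

```shell
# hex_bytes: print the bytes read from standard input as hexadecimal digits.
# (Hypothetical helper for this example.)
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
    echo
}

# The letter é in UTF-8 (written with octal escapes so that this script is plain ASCII).
printf '\303\251' | hex_bytes                                  # prints: c3a9
# The same letter converted to ISO-8859-1 (Latin-1).
printf '\303\251' | iconv -f UTF-8 -t ISO-8859-1 | hex_bytes   # prints: e9
```

The single byte e9 on its own is not valid UTF-8, which is why a Latin-1 file containing such characters makes pandoc stop.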

Making sure your files are in the proper encoding

18. Make sure your bibliography file is encoded in UTF-8.

For Zotero users. If you created a bibliography file with Zotero from scratch, it's already in UTF-8 and you can skip to step 19. If, however, you've imported a non-UTF-8 .bib file in Zotero, then Zotero has not converted it and your bibliography may contain non-UTF-8 characters. (You can spot them in Zotero: they show up as little lozenges with a question mark inside.) If so, you should export a .bib file from Zotero and then apply one of the conversions below. But in the longer term, you'll need to clean up your bibliography by removing the non-UTF-8 characters.

For all users. Warning. Before attempting a conversion by one of the methods below, it's a good idea to make a back-up of your bibliography file.

Method 1. Use the JabRef bibliography manager (http://www.jabref.org). Open your .bib file in JabRef, go to File > Database properties. At the top, in the entry "Encoding", select "UTF-8". Click Ok to exit the database properties, and then File > Save database. Your .bib is converted.

Note for running JabRef on Mac OS X. There may be a couple of hurdles when installing JabRef in OS X 10.10 or 10.11. Check http://www.jabref.org/faq/ (Mac OS X tab) to solve them.

Method 2. The .bib file is an ordinary text file. If you're advanced enough to search online how to check the encoding of a text file and convert it, that's an option. You'll typically find methods using the command line or some advanced text editor like EditPadLite on Windows.
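
As one possible command-line route on Linux or Mac OS X, the sketch below uses the standard iconv tool. It assumes that the old file is in ISO-8859-1 (Latin-1); if yours is in another encoding, change that name accordingly. The names is_utf8 and latin1_to_utf8 are made up for this example, and so are the file names.

```shell
# is_utf8 FILE: succeeds if FILE is valid UTF-8.
# Trick: ask iconv to convert from UTF-8 to UTF-8; it fails on invalid input.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# latin1_to_utf8 OLD NEW: write a UTF-8 copy of OLD to NEW; OLD is left untouched.
# Assumes OLD is encoded in ISO-8859-1.
latin1_to_utf8() {
    iconv -f ISO-8859-1 -t UTF-8 "$1" > "$2"
}

# Example use (hypothetical file names):
#   is_utf8 mybibliography.bib || latin1_to_utf8 mybibliography.bib mybibliography-utf8.bib
```

Since the converted text goes to a new file, your original .bib stays as a back-up; point LyX to the new file once you've checked that it looks right.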

19. Make sure that LyX sends a UTF-8 LaTeX to the converter.

In LyX, open the document you want to convert to MS Word or Libreoffice format. Go to Document > Settings, select the section "Language". Under “Encoding”, select “Other: Unicode (utf8)”. Click Ok to save the settings.

Exporting should now work (step 5 above). If you still get an error message, then there is some further problem with pandoc. Try viewing the message pane (see Troubleshooting section above) and look for help online and in forums.

20. (Optional) Make sure all your new LyX documents have the same setting.

In LyX, create a new document. Go to Document > Settings, and in the section "Language", under "Encoding" choose "Other: Unicode (utf8)". Click "Save as Document Defaults". Click Ok to save these settings. Close and discard the new document.

This ensures that all your new documents have the same UTF-8 setting that allows straightforward conversion to Libreoffice or MS Word with pandoc.

On using bibliography styles

After the set-up above you can straightforwardly export MS Word and Libreoffice files from LyX, with most formatting and bibliography references preserved. However, as of now, the conversion will ignore your choice of bibliography style. That is because pandoc does not use BibTeX bibliography styles. By default it will produce a bibliography following the Chicago Manual of Style author-date model.

There are fairly easy ways to get pandoc to generate the style you want, however. Here are two, one using LyX only, the other using the command line. The first step is common to both.

21. Get the CSL file for the bibliography format you want. Go to https://www.zotero.org/styles and look for the bibliography format you want (e.g. the journal name, Chicago manual of style, IEEE, APA, etc.). Click on the name to download a .csl file. Save the file in the same folder as your LyX file.

We'll refer to it below as mycitationfile.csl; replace that by the actual name of your file, for instance chicago-author-date.csl.

22. Method 1, LyX only: adjust the converter command. Do as in step 14 or 15 above adding "--csl mycitationfile.csl" in the converter command. You only need to change the converter command for the format you're interested in, MS Word or Libreoffice. For Libreoffice for instance (step 15), you would replace the command:
  pandoc -s --filter pandoc-citeproc -f latex -o $$o -t odt $$i

with:
  pandoc -s --filter pandoc-citeproc --csl mycitationfile.csl -f latex -o $$o -t odt $$i


You can then export the document from LyX and the bibliographic style will be that of mycitationfile.csl. This method is good if you want to generate the file from LyX several times. The downside is that you have to change the LyX converter settings every time you need to export with another style format.

23. Method 2, command line: export in LaTeX and use pandoc from the command line. This method doesn't require any setting in LyX, that is, none of the steps 3-20 above. You only need to get the CSL file (step 21) and have pandoc installed (step 2).

In LyX, choose File > Export > LaTeX (plain). This will generate a file with a .tex extension next to your .lyx file. Say it is named mydocument.tex (replace that by the actual name of your file below). Now open a terminal and navigate to the folder where your file is. (If you don't know how to do that, search online for "command line tutorial".) For OpenDocument conversion, enter the command:

  pandoc -s --filter pandoc-citeproc --csl mycitationfile.csl -f latex -o mydocument.odt -t odt mydocument.tex


For MS Word Office XML conversion, enter the command:

  pandoc -s --filter pandoc-citeproc --csl mycitationfile.csl -f latex -o mydocument.docx -t docx mydocument.tex


In case you're curious: "-s" means "make a stand-alone document", "--filter pandoc-citeproc" means "use the processor pandoc-citeproc (that generates a bibliography)", "--csl mycitationfile.csl" means "use the bibliographic format file mycitationfile.csl", "-f latex" means "convert from a document in latex", "-o mydocument.docx" means "save with filename mydocument.docx" (note, it will erase and replace any previous file with that name), "-t docx" means "convert to a document in MS Word Office XML" and "mydocument.tex" is the name of the file to be converted.

This generates a .docx or .odt document in the same folder, with the requisite bibliography style.

The second method is good if you want to try several bibliographic formats quickly and if you're comfortable with the command line.
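
Trying several styles can itself be scripted. The following is a minimal sketch under a few assumptions: the .csl files sit in the current folder next to mydocument.tex, export_with_styles is a name made up for this example, and the PANDOC variable is only there so that you can preview the commands (by setting it to echo) before running them.

```shell
# export_with_styles TEXFILE STYLE.csl [STYLE.csl ...]
# Runs pandoc once per style and writes e.g. mydocument-apa.docx.
# (Hypothetical helper; set PANDOC=echo to print the arguments instead of converting.)
export_with_styles() (
    tex=$1
    shift
    base=${tex%.tex}                       # mydocument.tex -> mydocument
    for style in "$@"; do
        name=$(basename "$style" .csl)     # apa.csl -> apa
        ${PANDOC:-pandoc} -s --filter pandoc-citeproc --csl "$style" \
            -f latex -o "$base-$name.docx" -t docx "$tex"
    done
)

# Example use:
#   export_with_styles mydocument.tex apa.csl chicago-author-date.csl
```

Each style gets its own output file, so you can open the results side by side and compare the bibliographies.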

Tip: store your CSL files in a dedicated folder

The methods above assume that you have the requisite CSL bibliography format file in the same folder as your LyX document. If you don't want to clutter your folders with CSL files, you can put them elsewhere and modify the pandoc command accordingly. For instance, suppose you have a folder named "Papers", with one sub-folder for each paper you're working on, say a sub-folder "Wombats" with file wombat.lyx and a sub-folder "Otter" with file otter.lyx. Say you want to export both of them with the APA bibliographic style, for which you have the file apa.csl. Here's a solution:

- create a subfolder within Papers, call it "CSL".
- put apa.csl in the CSL folder.
- in the LyX converter command (method 1, step 22) or the command line (method 2, step 23), replace:
  --csl apa.csl

with:
  --csl ../CSL/apa.csl

(Note, if you're using the command line, you must be located in one of the subfolders of Papers, say Wombats.)

This tells pandoc to go and look for apa.csl at the location "../CSL/". "../" tells it to go one folder up from the current folder (in LyX, the current folder is the one where your document is) and "CSL/" tells it to go to the folder CSL/ below that. So from Papers/Wombats/, for instance, that leads you first to Papers/ then to Papers/CSL/, where pandoc will find the file.

It's also possible to give the absolute location of the .csl file, for instance:

  --csl /Users/arthur/Papers/CSL/apa.csl


Note that if the path contains spaces, you should put it in double quotes:

  --csl "C:\My Documents\styles\apa.csl"


Absolute paths are useful if you use the LyX converter command (method 1, step 22) to convert several LyX files scattered in a variety of folders and subfolders. Relative paths are useful if you use the same folders on different computers (e.g. with Dropbox) and if your LyX files are at the same depth.
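
If pandoc complains that it cannot find the style file, you can check whether a path leads to it when starting from a given folder. A minimal sketch, where csl_reachable is a name made up for this example and the folder names are those of the example above:

```shell
# csl_reachable DOCFOLDER CSLPATH
# Succeeds if CSLPATH (relative or absolute) leads to a file when starting from DOCFOLDER.
# (Hypothetical helper for this example.)
csl_reachable() (
    cd "$1" && [ -f "$2" ]
)

# Example use, from anywhere:
#   csl_reachable /Users/arthur/Papers/Wombats ../CSL/apa.csl && echo "found"
```

The function changes folder inside a subshell only, so your terminal stays where it was.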

That set-up allows you to convert various files to MS Word Office XML or OpenDocument formats with the APA bibliographic style without having to scatter the apa.csl in all your working folders.
Last edited by juliendutant on Mon Aug 29, 2016 10:03 am, edited 1 time in total.

Stefan Kottwitz
Site Admin
Posts: 8075
Joined: Mon Mar 10, 2008 9:44 pm
Location: Hamburg, Germany

Postby Stefan Kottwitz » Sat Aug 27, 2016 9:25 pm

Hi Julien,

welcome to the forum!

That is a very thorough and detailed tutorial. Thank you for this!

Stefan
Site admin

scottkosty
Site Moderator
Posts: 470
Joined: Sat Sep 01, 2012 6:38 am

Postby scottkosty » Sun Aug 28, 2016 3:45 pm

This looks great! Can you make a wiki page on http://wiki.lyx.org/ ?


scottkosty
Site Moderator
Posts: 470
Joined: Sat Sep 01, 2012 6:38 am

Postby scottkosty » Thu Sep 01, 2016 11:51 pm



Thanks so much for taking the time to do that! I'm sure a lot of users will find that helpful.

