Bob_the_Topper
Posts: 15
Joined: Thu May 03, 2012 4:54 pm

hyperref | Problem with "ß" in PDF Meta Data

Post by Bob_the_Topper »

Hi!

I'm using LaTeX to generate a PDF and need some special characters from the German language. That works in the document itself, but I have a problem with "ß" in the PDF meta data: it gets stripped. Other special characters like "ä" or "Ü" work, though.

Here's a pretty minimal example:

Code:

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\usepackage[ngerman]{babel}

%\usepackage{hyperxmp}	% provide PDF with XMP metadata
\usepackage[%
 pdfauthor={äÄöÖüÜß},
 pdftitle={äÄöÖüÜß},
 pdflang={de},
 pdfpagelabels=true,
% unicode=true,
]{hyperref}
 
\begin{document}
äÄöÖüÜß --- blah blah blah
\end{document}
In the generated PDF, title and author read "äÄöÖüÜ". During the pdflatex run I also get warnings about this:

Code:

Package hyperref Warning: Token not allowed in a PDF string (PDFDocEncoding):
(hyperref)                removing `\T1\ss' on input line 2864.


Package hyperref Warning: Token not allowed in a PDF string (PDFDocEncoding):
(hyperref)                removing `\T1\ss' on input line 2864.
Where does the extremely large line number come from? Why are some special characters fine in PDF meta data while others are not? Is there anything I can do about this?

localghost
Site Moderator
Posts: 9202
Joined: Fri Feb 02, 2007 12:06 pm


Post by localghost »

The solution to this problem has two parts.
  1. For accented characters from languages other than English, the »unicode« option is needed in any case.
  2. The actual meta data like title and author should be given via \hypersetup after loading the package, to make sure that Unicode is active.
Both points are incorporated in the code below.

Code:

\documentclass[ngerman]{scrartcl}
\usepackage[T1]{fontenc}
\usepackage{selinput}     % Replacement for »inputenc«
\SelectInputMappings{     % Semi-automatic input selection
  adieresis={ä},          % by a list of selected glyphs
  germandbls={ß},         % see: http://partners.adobe.com/public/developer/en/opentype/glyphlist.txt
  Euro={€}
}
\usepackage{babel}
\usepackage{lmodern}

\usepackage[%
  pdfpagelabels=true,
  unicode=true
]{hyperref}

\hypersetup{
  pdfauthor={äÄöÖüÜß},
  pdftitle={äÄöÖüÜß},
  pdflang={de}
}

\begin{document}
  äÄöÖüÜß
\end{document}
For a deeper understanding please read the manuals of the involved packages, especially the hyperref manual.


Best regards and welcome to the board
Thorsten
Bob_the_Topper
Posts: 15
Joined: Thu May 03, 2012 4:54 pm


Post by Bob_the_Topper »

localghost wrote: The solution to this problem has two parts.
  1. For accented characters from languages other than English, the »unicode« option is needed in any case.
  2. The actual meta data like title and author should be given via \hypersetup after loading the package, to make sure that Unicode is active.
Thanks a lot!

I had already experimented with the »unicode« option, but it didn't work. Most likely because I used all the options when loading the package. With this "split" everything works fine now :-)

[Addition:
More or less by accident I found that »unicode« is not what made the "ß" work. It seems to be enough to use \hypersetup for the meta data declaration:

Code:

\documentclass{article}

\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\usepackage[ngerman]{babel}

\usepackage [
 pdftitle={äÄöÖüÜß}
]{hyperref}

\hypersetup{
 pdfauthor={äÄöÖüÜß}
}
 
\begin{document}
äÄöÖüÜß --- blah blah blah
\end{document}
Here the PDF title is still truncated but the author is complete, including the "ß". I kept the »unicode« in my main document nevertheless, just thought this was interesting...]
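
For reference, the two observations from this thread can be combined into one small example. This is only a sketch under assumptions: it keeps the UTF-8 input via inputenc from the first post (instead of selinput), loads hyperref with the »unicode« option, and moves every meta data entry into \hypersetup. It has not been checked against every hyperref version.

```latex
\documentclass{article}

\usepackage[utf8]{inputenc}   % assumption: the source file is saved as UTF-8
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\usepackage[ngerman]{babel}

% Only non-text options go into the package options.
\usepackage[
  unicode=true,
  pdfpagelabels=true
]{hyperref}

% All text-valued meta data is set after the package has been loaded.
\hypersetup{
  pdfauthor={äÄöÖüÜß},
  pdftitle={äÄöÖüÜß},
  pdflang={de}
}

\begin{document}
äÄöÖüÜß --- blah blah blah
\end{document}
```

If title and author both show the trailing "ß" in the PDF properties and the "Token not allowed in a PDF string" warnings are gone from the log, the split between package options and \hypersetup is doing its job.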