mèlèth
Posts: 6
Joined: Wed Oct 01, 2014 10:24 am

Issue with \input and French accents

Post by mèlèth »

Hello everyone, I'm new to LaTeX and I hope one of you could help me out.

I'm using inputenc so that I can type accents like é, è, à directly. I have no problem with the characters that I type inside my main file. I keep each chapter in a separate file and use \input to put everything together. However, the accented characters from those separate files come out wrong when I compile to dvi or pdf (é turns into Ã©, etc).

For now I've given up on \input and type everything inside the same file, but the report is rather large and I'd prefer to split it.

Thank you in advance


Johannes_B
Site Moderator
Posts: 4182
Joined: Thu Nov 01, 2012 4:08 pm

Re: Issue with \input and French accents

Post by Johannes_B »

Hi and welcome,

that is a very common beginner's problem.
Your main file is (I guess) latin1 encoded, while the other files you input are utf8 encoded.
- Make a backup copy of your files.
- Open the main file with your favourite text editor (notepad, mousepad, gedit, ...).
- Open your LaTeX editor, go to the settings and switch to utf8 encoding.
- Create a new file, copy/paste the contents of the main file into it, and save.
- Change the option of package inputenc to utf8.
- Try it out. Works? Perfect. Something went wrong? Good thing you made the backup.
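
Once everything is utf8, the setup could look roughly like the sketch below. The file name chapitre1.tex is made up for the example, and filecontents* is only used so that the whole thing fits into one file (it assumes a reasonably recent LaTeX). In a real project the chapter is simply a separate file that your editor saves as utf8.

```latex
% main.tex -- save as UTF-8 and compile with pdflatex.
% chapitre1.tex is a hypothetical file name used only for this sketch;
% filecontents* writes it to disk (if it does not exist yet) with the
% same bytes as this file, i.e. also UTF-8.
\begin{filecontents*}{chapitre1.tex}
Le résumé a été rédigé très tôt, à la fin de l'été.
\end{filecontents*}

\documentclass{article}
\usepackage[utf8]{inputenc} % must match how ALL input files are saved
\usepackage[T1]{fontenc}
\begin{document}
Fichier principal: é, è, à.

\input{chapitre1} % same encoding as main.tex, so the accents survive
\end{document}
```

The point is that the inputenc option applies to every file that gets pulled in with \input, so all of them have to be saved in the encoding you declare.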
The smart way: Calm down and take a deep breath, read posts and provided links attentively, try to understand and ask if necessary.
mèlèth
Posts: 6
Joined: Wed Oct 01, 2014 10:24 am

Re: Issue with \input and French accents

Post by mèlèth »

I should have specified that I'm working on Windows. So far it seems that inputenc[utf8] doesn't work on Windows. Am I wrong?
Johannes_B
Site Moderator
Posts: 4182
Joined: Thu Nov 01, 2012 4:08 pm

Issue with \input and French accents

Post by Johannes_B »

Yes you are ;-)

Windows is more than capable of using utf8 (it uses utf16 internally at a very low level).

But be aware that the syntax is

Code: Select all

\usepackage[utf8]{inputenc}
Copy the following example, save and compile it. Remember, the editor must save it as utf8.

Code: Select all

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\begin{document}
á è œ â ü
\end{document}
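
If you are not sure how your editor saves the file, some editors (TeXworks, TeXShop and TeXstudio, as far as I know) look for a "magic comment" in the first lines and pick the encoding from it. This is only a sketch and assumes your editor honours the comment; LaTeX itself ignores it like any other comment.

```latex
% !TeX encoding = UTF-8
% The line above is a hint for the editor only; LaTeX treats it as a comment.
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\begin{document}
á è œ â ü
\end{document}
```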
mèlèth
Posts: 6
Joined: Wed Oct 01, 2014 10:24 am

Re: Issue with \input and French accents

Post by mèlèth »

Saving my main file in utf-8 format has solved the issue, thank you very much!

Is there any reason as to why tutorials everywhere tell me to use latin1 with inputenc instead of utf8?
Johannes_B
Site Moderator
Posts: 4182
Joined: Thu Nov 01, 2012 4:08 pm

Issue with \input and French accents

Post by Johannes_B »

Latin1 was the quasi-standard for a pretty long time. It is an 8-bit encoding (meaning 256 slots), where each slot maps a codepoint to a letter/symbol. Latin1 covers only the western European languages (have a look at the Wikipedia entry). So if you had a French, English or German tutorial, it most likely recommended latin1. Let's count: the first 32 codepoints are special, the latin base alphabet has 26 letters in upper and lower case, which makes 84 in total; the 10 digits plus the symbols that sit on them with shift make 104. Some other stuff, and the numbers keep adding up quickly. And now comes the big point of change.
For example, the Greeks have the latin base alphabet (the first 128 slots are the same), but they don't need accented letters, they need Greek letters. The Russians need Cyrillic letters.
This made transferring information a bit complicated, so a new standard was born: Unicode (with utf8 as its most common encoding). Unicode doesn't only cover latin letters, it also includes Greek, Cyrillic, CJK, mathematical symbols ... (Klingon was proposed as well, but didn't make it in).
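
To see why a mismatch garbles the accents: in latin1, é is the single byte E9, while in utf8 it is stored as the two bytes C3 A9. If LaTeX is told to read those two bytes as latin1, it takes them for two separate letters, Ã (C3) and © (A9). The following sketch reproduces the effect on purpose; it assumes the file is saved as utf8 and compiled with pdflatex.

```latex
% Save this file as UTF-8, but deliberately declare latin1.
\documentclass{article}
\usepackage[latin1]{inputenc} % wrong on purpose: does not match the file
\usepackage[T1]{fontenc}
\begin{document}
% The é below is stored as the two bytes C3 A9.
% Read as latin1, C3 is Ã and A9 is ©, so the pdf shows Ã© instead of é.
é
\end{document}
```

Change the option to utf8 and the same file prints é again.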

Modern engines like XeLaTeX and LuaLaTeX are designed to use unicode and expect utf8 input. No need to use package inputenc anymore. You can do this.

Code: Select all

\documentclass{article}
\usepackage[english,russian,greek,ngerman,french]{babel}
\usepackage{fontspec}
\setmainfont{Linux Libertine O}
\begin{document}

UTF-8  est un codage de caractères informatiques conçu pour coder
l’ensemble des caractères du «répertoire universel de caractères
codés», initialement développé par l’ISO dans la norme
internationale ISO/CEI 10646, aujourd’hui totalement compatible
avec le standard Unicode, en restant compatible avec la norme
ASCII limitée à l’anglais de base (et quelques autres langues
beaucoup moins fréquentes), mais très largement répandue depuis
des décennies.

\selectlanguage{ngerman}
UTF-8  ist die am weitesten verbreitete Kodierung für
Unicode-Zeichen (Unicode und UCS sind praktisch identisch). Die
Kodierung wurde im September 1992 von Ken Thompson und Rob Pike
bei Arbeiten am Plan-9-Betriebssystem festgelegt. Die Kodierung
wurde zunächst im Rahmen von X/Open als FSS-UTF (filesystem safe
UTF in Abgrenzung zu UTF-1, das diese Eigenschaft nicht hat)
bezeichnet, in den Folgejahren erfolgte im Rahmen der
Standardisierung die Umbenennung auf die heute übliche
Bezeichnung UTF-8.

\selectlanguage{greek}
Το UTF-8  είναι ένα μη-απωλεστικό σχήμα κωδικοποίησης χαρακτήρων
μεταβλητού μήκους για το πρότυπο Unicode που δημιουργήθηκε από
τους Ken Thompson και Rob Pike. Χρησιμοποιεί ομάδες από byte για
να αναπαραστήσει τα κωδικά σημεία του Unicode. Είναι ιδιαίτερα
χρήσιμο για μετάδοση δεδομένων σε 8bit συστήματα ηλεκτρονικού
ταχυδρομείου.

\selectlanguage{russian}
UTF-8  — одна из общепринятых и стандартизированных кодировок
текста, которая позволяет хранить символы Unicode.

\selectlanguage{english}
\[  ⑴  a√∂b ∓ ∑m → ℏ/∞² \]

Equation ⑴  is complete and utter nonsense. 
\end{document}
which gives the attached pdf (melethAccentsExt.pdf).
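
If Linux Libertine O is not installed on your system, a stripped-down variant along the following lines should do for a first test. It is just a sketch: it relies on the default font that fontspec loads, has to be saved as utf8, and must be compiled with xelatex or lualatex (not pdflatex).

```latex
% Compile with xelatex or lualatex; save the file as UTF-8.
\documentclass{article}
\usepackage{fontspec}       % no \setmainfont: the default font is used
\usepackage[french]{babel}
\begin{document}
Les caractères accentués é, è, à, ç et œ sont saisis directement,
sans le package inputenc.
\end{document}
```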
Stefan Kottwitz
Site Admin
Posts: 10335
Joined: Mon Mar 10, 2008 9:44 pm

Re: Issue with \input and French accents

Post by Stefan Kottwitz »

Nice explanation! Btw, I like the upside-down letters in Unicode. No idea if inputenc might support that. :-)

Best regards, and welcome to the forum, mèlèth!

Stefan
LaTeX.org admin
mèlèth
Posts: 6
Joined: Wed Oct 01, 2014 10:24 am

Re: Issue with \input and French accents

Post by mèlèth »

That was very helpful, thank you once more. :)