Don't split words on certain letters


Post by eleanor »

Hi,

I'm writing a document in LaTeX in my native language, Slovenian. The rules for splitting words across lines are a little different there: the part of a split word that goes onto the new line should not start with the letters a, e, i, o, u, and there are some other rules.

Is there any way to let LaTeX know about these rules? Does LaTeX have a command that can declare the appropriate rules for any language, including Slovenian?

If not, how can I prevent LaTeX from splitting words at such letters? I want a solution that works correctly for the whole document; I don't want to mark each word separately.

Thanks



Post by Stefan Kottwitz »

Hi Eleanor,

I recommend using babel:

Code:

\usepackage[slovene]{babel}
or polyglossia, the latter with XeTeX.
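
For the polyglossia route, a minimal document might look like the sketch below. It assumes compilation with xelatex and a TeX distribution that has the Slovenian hyphenation patterns installed; the sample sentence is taken from later in this thread.

```latex
% Compile with xelatex (assumption: Slovenian hyphenation patterns are installed).
\documentclass{article}
\usepackage{polyglossia}
\setdefaultlanguage{slovenian} % polyglossia's name for the language babel calls "slovene"
\begin{document}
Napad na spletno stran z vrinjenjem zlonamerne kode.
\end{document}
```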

You could implement your own special hyphenation rules using the pre_linebreak_filter callback (or a token_filter callback) with LuaTeX. Or, with XeTeX, you could use \XeTeXinterchartoks. Either way, you could insert \nobreak commands.
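
To make the LuaTeX idea more concrete, here is a rough sketch under stated assumptions, not a tested solution. Instead of inserting \nobreak, it uses a pre_linebreak_filter callback to delete the discretionary (possible hyphenation) nodes that sit directly before a vowel. The function name drop_breaks_before_vowels and the vowel list are made up for illustration; the rule is reduced to the lowercase letters a, e, i, o, u from the question, and the "other rules" are not covered. It must be compiled with lualatex.

```latex
% Compile with lualatex. Function name and vowel set are illustrative assumptions.
\documentclass{article}
\usepackage[slovene]{babel}
\usepackage{luacode}
\begin{luacode*}
local DISC  = node.id("disc")
local GLYPH = node.id("glyph")

-- Lowercase vowels that must not start a line after a hyphen.
local vowels = {}
for _, c in ipairs({ "a", "e", "i", "o", "u" }) do
  vowels[string.byte(c)] = true
end

-- Walk the paragraph and delete every discretionary break that is
-- immediately followed by a vowel glyph. Discretionaries that carry
-- replacement material (e.g. from ligatures) are left alone so that
-- no characters are lost.
local function drop_breaks_before_vowels(head)
  local n = head
  while n do
    local nxt = n.next
    if n.id == DISC and not n.replace
       and nxt and nxt.id == GLYPH and vowels[nxt.char] then
      head = node.remove(head, n)
      node.free(n)
    end
    n = nxt
  end
  return head
end

luatexbase.add_to_callback("pre_linebreak_filter",
  drop_breaks_before_vowels, "drop_breaks_before_vowels")
\end{luacode*}
\begin{document}
\parbox{8cm}{Napad na spletno stran z vrinjenjem zlonamerne kode,
  napisane v skriptnem jeziku.}
\end{document}
```

Because the filter runs on every paragraph, including the paragraphs inside p{...} table columns, it applies to the whole document without marking individual words. It only removes break points; the remaining ones still come from the loaded hyphenation patterns.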

Stefan
LaTeX.org admin

Post by eleanor »

Hi, thanks for the quick reply.

I'm already using:

Code:

\usepackage[slovene]{babel}
From your answer I gather that this should be enough, but it isn't. The words are still split incorrectly. A small example is the following code:

Code:

\documentclass[12pt,a4paper,openany]{book}
\usepackage{fancyhdr}
\usepackage{graphicx,epsfig}
\usepackage[slovene]{babel}
\usepackage{longtable}
\usepackage[raggedrightboxes]{ragged2e}


\begin{document}
\chapter*{Testing}
  \begin{longtable}[l]{p{3.5cm}p{3.5cm}p{8cm}}
    \hline
    \textbf{Napad z vrinjenjem zlonamerne kode} & Cross-site scripting\newline XSS & Napad na spletno stran z vrinjenjem zlonamerne kode, napisane v skriptnem jeziku, npr. z namenom kraje piškotkov. \\\hline
  \end{longtable}

\end{document}
When running pdflatex test.tex, the third column contains the following text:
Napad na spletno stran z vrinjenjem zlon-
amerne kode, napisane v skriptnem jeziku,
npr. z namenom kraje pikotkov.
Notice that the word "zlonamerne" is split just before the 'a'. The letter 'a' should not start the next line: it should either stay on the previous line, or the whole word should move to the new line.

Any ideas?
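
One way to check which break points LaTeX actually considers for such a word is \showhyphens, which writes the hyphenated form to the terminal and the log file. The sketch below is a stripped-down diagnostic, not part of the original example. It assumes pdflatex and adds T1 font encoding, which the example above does not load.

```latex
% Compile with pdflatex and look for the hyphenated words in the .log file.
\documentclass{article}
\usepackage[T1]{fontenc}      % assumption: not part of the original example
\usepackage[slovene]{babel}
\begin{document}
\showhyphens{zlonamerne vrinjenjem}
\end{document}
```

If the reported break points look like those of another language, it is worth searching the log for a babel warning that no hyphenation patterns were preloaded for Slovene. In that case babel cannot apply the Slovenian rules, whatever options are passed to it.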

Post by Stefan Kottwitz »

Sure: the other ideas I listed above. Perhaps have a look at the links to XeTeX and LuaTeX to evaluate whether it's worth working out.

Stefan
LaTeX.org admin

Post by localghost »

eleanor wrote: […] When running pdflatex test.tex, the third column contains the following text:
Napad na spletno stran z vrinjenjem zlon-
amerne kode, napisane v skriptnem jeziku,
npr. z namenom kraje pikotkov.
Notice that the word "zlonamerne" is split just before the 'a'. The letter 'a' should not start the next line: it should either stay on the previous line, or the whole word should move to the new line. […]
When I compile your example as is, I get the result pictured in the attachment.
Attachment: hyphenation-slovene.png (the obtained output of the provided example)