Hi Stefan, thank you for uploading the file and for the inspiring suggestions!
As for the numbers, I have 50+ PDF pages included using \includepdf, with 40+ items added via the addtotoc option and 70+ figures/tables added to the LoF/LoT via the addtolist option. I also expect a significant number of index terms to be added. Therefore I would like a reasonably compact solution in which page numbers are not repeated where they do not need to be, etc.
This probably requires writing a command that offers an interface similar to \includepdf, e.g.:
Code:
\MyIncludePdf[
linkname=pdf1,
pages={14-21},
% list with n*5 elements:
addtotoc={
14, section, 1, {Paper Title}, sec:pdf1_title,
14, subsection, 2, {Section 1}, sec:pdf1_sec1,
% ...
21, subsection, 2, {Section 8}, sec:pdf1_sec8,
21, subsection, 2, {Section 9}, sec:pdf1_sec9
},
% list with n*4 elements:
addtolist={
14, figure, {Fig.~1. Caption}, fig:pdf1_fig1,
14, figure, {Fig.~2. Caption}, fig:pdf1_fig2,
% ...
21, table, {Tab.~1. Caption}, fig:pdf1_tab1,
21, figure, {Fig.~8. Caption}, fig:pdf1_fig8
},
% list with n*2 elements:
mypicturecommands={ % this option is not present in the standard \includepdf
14, {\put(10,50){sometext}\index{someterm}},
% ...
21, {\put(10,50){othertext}\index{otherterm}}
}
]{filename.pdf}
I started implementing it using expl3 syntax. Below is an example; I tried to cut out the non-essential parts (input checking etc.) to keep it reasonably small.
Code:
\documentclass{book}
\usepackage{pdfpages}
\ExplSyntaxOn
\keys_define:nn { mykeys } {
linkname .tl_set:N = \l_linkname_tl,
pages .tl_set:N = \l_pages_tl,
addtotoc .clist_set:N = \l_addtotoc_clist,
}
\NewDocumentCommand{\MyIncludePdf}{ O{} m }{
\group_begin:
\keys_set:nn { mykeys } { #1 } % read key-value pairs from the first optional argument
% ==> `pages' option <== % now only supports format: 1-5
\seq_new:N \l_page_range_seq % 2-element page range seq
\int_new:N \l_first_page_int % first page number
\int_new:N \l_last_page_int % last page number
\seq_new:N \l_pages_seq % sequence of all pages between the first and the last
\seq_set_split:NnV \l_page_range_seq {-} \l_pages_tl
\int_set:Nn \l_first_page_int {\seq_item:Nn \l_page_range_seq 1}
\int_set:Nn \l_last_page_int {\seq_item:Nn \l_page_range_seq 2}
\int_step_inline:nnn \l_first_page_int \l_last_page_int {
\seq_put_right:Nn \l_pages_seq {##1}
}
Page~numbers~sequence:~\seq_use:Nn \l_pages_seq {,~} \\ % working fine with proper input
% ==> `addtotoc' option <== texdoc pdfpages, see \includepdf, addtotoc option
% addtotoc={<page number>,<section>,<level>,<heading>,<label>,<page number>,...}
% number of elements in a `addtotoc' list should be a multiple of 5
\prop_new:N \l_per_page_addtotoc_prop % key: page number, value: clist (subset of addtotoc for a given page)
\tl_new:N \l_page_number_tl
\tl_new:N \l_section_tl
\tl_new:N \l_level_tl
\tl_new:N \l_heading_tl
\tl_new:N \l_label_tl
\bool_until_do:nn {\clist_if_empty_p:N \l_addtotoc_clist} {
% get 5-element spec from addtotoc list
\clist_pop:NN \l_addtotoc_clist \l_page_number_tl
\clist_pop:NN \l_addtotoc_clist \l_section_tl
\clist_pop:NN \l_addtotoc_clist \l_level_tl
\clist_pop:NN \l_addtotoc_clist \l_heading_tl
\clist_pop:NN \l_addtotoc_clist \l_label_tl
% append 5-element spec to a list for a specific page
\prop_get:NVNTF \l_per_page_addtotoc_prop \l_page_number_tl \l_tmpa_clist
{} {\clist_clear:N \l_tmpa_clist}
\clist_put_right:NV \l_tmpa_clist \l_page_number_tl
\clist_put_right:NV \l_tmpa_clist \l_section_tl
\clist_put_right:NV \l_tmpa_clist \l_level_tl
\clist_put_right:NV \l_tmpa_clist { { \l_heading_tl } } % PROBLEM 1. What if \l_heading_tl contains commas and is surrounded with braces? It seems not to be working as documented
\clist_put_right:NV \l_tmpa_clist \l_label_tl
\clist_show:N \l_tmpa_clist % DEBUG (see terminal)
% put updated spec back in the property list (key is the page number)
\prop_put:NVV \l_per_page_addtotoc_prop {\l_page_number_tl} {\l_tmpa_clist}
}
\prop_show:N \l_per_page_addtotoc_prop % DEBUG (see terminal)
% ==> call \includepdf for each page <==
\prop_new:N \l_includepdf_prop % property list of key-value pairs to be passed as a first arg to \includepdf (after expansion?)
\seq_map_inline:Nn \l_pages_seq { % for each page
PAGE~NUMBER:~##1 \\
\prop_clear:N \l_includepdf_prop
\prop_put:NnV \l_includepdf_prop {linkname} {\l_linkname_tl}
\prop_put:Nnn \l_includepdf_prop {pages} {##1} % braces keep multi-digit page numbers together
% not all pages will have addtotoc items defined, that's why I build key-val in advance!
% I cannot pass addtotoc={} to \includepdf, this results in warnings
\prop_get:NnNTF \l_per_page_addtotoc_prop {##1} \l_tmpa_clist {
\prop_put:NnV \l_includepdf_prop {addtotoc} \l_tmpa_clist
}{}
\prop_show:N \l_includepdf_prop % DEBUG (see terminal)
% PROBLEM 2. This fails. How to pass a list of key-value pairs as a single variable to \includepdf?
% \exp_args:Nn \includepdf [\l_includepdf_prop] {#2}
% I suppose that \l_includepdf_prop must be either converted to clist or another datatype and expanded
% but all of my attempts to make it work failed
% can this help?
% use \prop_to_keyval and treat as clist or tl?
% \clist_set:NV \l_tmpb_clist {\prop_to_keyval:N \l_includepdf_prop}
% \clist_show:N \l_tmpb_clist % DEBUG (see terminal) strange output
% \tl_set:NV \l_tmpb_tl {\prop_to_keyval:N \l_includepdf_prop}
% \tl_show:N \l_tmpb_tl % DEBUG (see terminal) strange output
}
\group_end:
}
\ExplSyntaxOff
% ================================================================== %
\begin{document}
\mainmatter
\MyIncludePdf[
linkname=pdf:carbon,
pages={2-4},
addtotoc={
2, section, 1, {Paper \textbf{The, title}. Some text, WITH, COMMAS! }, sec:title1, % PROBLEM 1, see line 50
2, subsection, 2, {Section~1. Title}, sec:sec1,
4, subsection, 2, {Section~3. Another text}, sec:sec3
},
]{1602.03837.pdf}
\end{document}
I got stuck on two (possibly newbie) problems:
PROBLEM 1. Every element at position 5*n+4 of the addtotoc list is the heading that ends up in the table of contents. As such, it may contain commas, so I put it in braces, e.g. {Paper \textbf{The, title}. Some text, WITH, COMMAS! } in the example above. In line 42 of the code, I pop this heading (\l_heading_tl) from one comma-separated list (clist type), and in line 50 I append it to another clist. When I show the contents of that other list (line 52), I see the following:
Code:
The comma list \l_tmpa_clist contains the items (without outer braces):
> {2}
> {section}
> {1}
> {Paper \textbf {The, title}. Some text}
> {WITH}
> {COMMAS!}
> {sec:title1}.
But I would expect:
Code:
The comma list \l_tmpa_clist contains the items (without outer braces):
> {2}
> {section}
> {1}
> {Paper \textbf {The, title}. Some text, WITH, COMMAS!}
> {sec:title1}.
I suppose something is wrong with the way I put the heading token list back into the clist in line 50:
Code:
\clist_put_right:NV \l_tmpa_clist { { \l_heading_tl } }
I tried to adhere to texdoc interface3, page 185 in the PDF (TL2023): "To append some ⟨tokens⟩ as a single ⟨item⟩ even if the ⟨tokens⟩ contain commas or spaces, add a set of braces: \clist_put_right:Nn ⟨clist var⟩ { {⟨tokens⟩} }."
I do not know what I am doing wrong, but the extra set of braces does not change anything.
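One thing that may be worth checking (a minimal sketch, not a confirmed diagnosis): as far as I understand the expansion rules, a V-type argument expects a single variable, so the braces in { { \l_heading_tl } } are probably stripped before the value is retrieved and never reach the clist. The sketch below contrasts that with an e-type variant that wraps the unexpanded value in braces. The names \l_demo_clist and \l_demo_heading_tl are made up for this illustration, and the Ne variant is generated explicitly in case the installed kernel does not already provide it.

```latex
\documentclass{article}
\ExplSyntaxOn
% Generate the e-type variant explicitly (harmless if it already exists).
\cs_generate_variant:Nn \clist_put_right:Nn { Ne }
% Hypothetical demo variables, not part of the code above.
\clist_new:N \l_demo_clist
\tl_new:N \l_demo_heading_tl
\tl_set:Nn \l_demo_heading_tl { Some~text,~WITH,~COMMAS! }

% (a) V-type: the value is inserted without braces, so its commas
%     act as item separators.
\clist_set:Nn \l_demo_clist { sec }
\clist_put_right:NV \l_demo_clist \l_demo_heading_tl
\clist_log:N \l_demo_clist % written to the .log file

% (b) e-type: the braces are part of the expanded argument, and
%     \exp_not:V delivers the value without expanding it further.
\clist_set:Nn \l_demo_clist { sec }
\clist_put_right:Ne \l_demo_clist { { \exp_not:V \l_demo_heading_tl } }
\clist_log:N \l_demo_clist % written to the .log file
\ExplSyntaxOff
\begin{document}
See the log file.
\end{document}
```

If this behaves as I expect, the log should show the heading split into three items in case (a) and kept as a single item in case (b).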
PROBLEM 2. Assuming that I call \includepdf page by page, not all pages will use addtotoc etc. I cannot pass addtotoc={} to the pages that do not, because this results in warnings. Therefore I decided to build a list of key-value pairs using the prop data type and then pass it as the optional first argument to \includepdf for each PDF page separately. This does not work, and after trial and error I still do not know which data type and expansion type to choose to make it work (rumors about incompatibilities between the LaTeX2e, pgfkeys and expl3 key-value interfaces confuse me further). The code related to this problem is in lines 72-82 and is currently commented out.
I would appreciate any tips on how to solve this.