TeX without TeX

From LuaTeXWiki

  
 
With the Lua part of LuaTeX you have access to most of TeX's capabilities for creating PDF documents. Most of the [http://www.luatex.org/svn/trunk/manual/luatexref-t.pdf LuaTeX reference manual] is about this topic.
 
[[Writing_Lua_in_TeX|Remember]] that you can jump into Lua mode from TeX with <tt>\directlua{...Lua code...}</tt>. For example, this plain TeX document prints the first digits of pi: <tt>$\pi = \directlua{ tex.print(math.pi)}$ \bye</tt>. You can produce the resulting PDF by running <tt>luatex testfile.tex</tt>.
 
  
 
The underlying idea is to build TeX's internal data structures directly, just as TeX itself does when it transforms macros and primitives into so-called nodes. These nodes are then transformed into instructions for the PDF file (PDF objects). The following pages therefore deal with node creation and the resulting structures.
 
In the example code we use black squares as the contents of the pages instead of normal text, because character handling takes more code and will be covered later on.

Save the following file as <tt>myprogram.lua</tt> and use the TeX stub from above (or get the source [https://gist.github.com/pgundlach/1062041 here]).
 
<pre>
 
 
do
 
 
start_link.link_attr = "/C [0.9 1 0] /Border [0 0 2]"
-- start_link.action = node.new("action")
-- LuaTeX has since been updated; the action node is now created as a whatsit subtype:
start_link.action = node.new("whatsit","pdf_action")
start_link.action.action_type = 1
start_link.action.action_id  = dest
 
   for s in string.utfvalues( text ) do
     local char = unicode.utf8.char(s)
     if unicode.utf8.match(char,"^%s$") then
       -- it's a space
       n = node.new("glue")
  
 
You might be able to guess what happens here. We load the image [[Media:Oilpainting.jpg|oilpainting.jpg]] with <tt>img.scan</tt> but do not write it to the PDF file yet. We then create a copy of the image data so that we can, for example, change its size (other options would be to select a specific page of a PDF document or to rotate the image). After that we write a reference to the image to the PDF file. In the case above, two references to the image (XObject) are written, but the image data itself is written only once. That is easy, right?

The resulting document looks like this:
 
 
[[File:Oilpaintingdocument.jpg]]
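
The steps described above can be sketched roughly as follows. This is only a minimal illustration, not necessarily the code used for the document shown: the target size of 4cm by 3cm and the variable names are assumptions, and placing the resulting box on a page works the same way as for the black squares.

<pre>
-- Scan the image; nothing is written to the PDF file at this point.
local original = img.scan({ filename = "oilpainting.jpg" })

-- Work on a copy so the size can be changed independently of the original.
local resized = img.copy(original)
resized.width  = tex.sp("4cm")  -- assumed target size
resized.height = tex.sp("3cm")

-- Each call to img.node returns a node that references the image.
local first  = img.node(original)
local second = img.node(resized)

-- Chain the two nodes and pack them into a horizontal box.
local head = node.insert_after(first, first, second)
local hbox = node.hpack(head)
</pre>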
 
 
== Hyphenation ==
 
 
Every now and then you have to hyphenate words in other languages. When you create the paragraph to be hyphenated (with <tt>lang.hyphenate()</tt>), you assemble it from glyph nodes, which have a field <tt>lang</tt> (see above). This field accepts a number, the language id. So how do we "create" a language? The first steps are simple; we start with our TeX stub again:
 
 
<pre>
 
\directlua{dofile("hyphenation.lua")}
 
\end
 
</pre>
 
 
and the file hyphenation.lua is:
 
<pre>
 
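-- Locate the German (1996 orthography) hyphenation patterns via kpathsea
local path = kpse.find_file("hyph-de-1996.pat.txt")
-- Create a new, empty language object
local l = lang.new()

-- Read the whole pattern file and register its patterns with the language
local hyph_file = io.open(path)
lang.patterns(l,hyph_file:read("*all"))
hyph_file:close()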
 
</pre>
 
 
Now the function <tt>lang.id(l)</tt> returns the language id of the language <tt>l</tt>, which we can use in the glyph nodes. A peculiarity of TeX prevents us from using it right away: all characters in the paragraph are lowercased before the hyphenation rules are applied. So how is a word like "Œuvre" converted to lowercase? TeX has an internal table called ''lccode'' in which it looks up a character to find its lowercase variant. The lccode of Œ would be œ, but since TeX stores numbers, the lccode of 338 is 339. Only the lccodes of the letters A-Z and a-z are set by default, so we need to set the codes for all other characters that appear in a text. In TeX you would write:
 
 
<pre>
 
\lccode`\Œ=`\œ
 
\lccode`\œ=`\œ
 
</pre>
 
or
 
 
<pre>
 
\lccode338=339
 
\lccode339=339
 
</pre>
 
 
You see that you need to set the lccode of the lowercase characters as well. In Lua you do something similar:
 
 
<pre>
 
tex.lccode[unicode.utf8.byte("Œ")] = unicode.utf8.byte("œ")
tex.lccode[unicode.utf8.byte("œ")] = unicode.utf8.byte("œ")
 
</pre>
 
 
or shorter:
 
 
<pre>
 
tex.lccode[338] = 339
tex.lccode[339] = 339
 
</pre>
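
If a text needs more than one or two extra characters, the assignments can be generated in a loop. The following is only a sketch: the table <tt>case_pairs</tt> and its contents are an assumption chosen for illustration, and the Lua file must be saved as UTF-8.

<pre>
-- Illustrative list of uppercase/lowercase pairs (an assumption; extend as needed).
local case_pairs = {
  { "Œ", "œ" },
  { "Ä", "ä" },
  { "Ö", "ö" },
  { "Ü", "ü" },
}

for _, pair in ipairs(case_pairs) do
  local upper = unicode.utf8.byte(pair[1])
  local lower = unicode.utf8.byte(pair[2])
  tex.lccode[upper] = lower  -- the uppercase letter maps to the lowercase one
  tex.lccode[lower] = lower  -- the lowercase letter maps to itself
end
</pre>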
 
 
Once you have set all the necessary lccodes (remember you don't need to do this with a to z and A to Z), you can expect the hyphenation to work.
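
To check that a language is set up as intended, you can hyphenate a single word and count the discretionary nodes that <tt>lang.hyphenate</tt> inserts. The helper below is a hypothetical sketch, not part of the original example: the function name, the hyphenmin values of 2 and 3, and the use of the current font are all assumptions.

<pre>
-- Hypothetical helper: build a glyph list for `word`, hyphenate it with
-- `language` (an object created by lang.new()) and count the break points.
local function count_hyphenation_points(word, language)
  local head, tail
  for codepoint in string.utfvalues(word) do
    local g = node.new("glyph", 1)  -- subtype 1 marks a character
    g.char   = codepoint
    g.font   = font.current()       -- assumption: the current font is good enough here
    g.lang   = lang.id(language)
    g.left   = 2                    -- assumed \lefthyphenmin
    g.right  = 3                    -- assumed \righthyphenmin
    g.uchyph = 1                    -- also hyphenate words that start with an uppercase letter
    if head then
      head, tail = node.insert_after(head, tail, g)
    else
      head, tail = g, g
    end
  end
  if not head then return 0 end     -- empty word: nothing to hyphenate

  lang.hyphenate(head)

  local count = 0
  for _ in node.traverse_id(node.id("disc"), head) do
    count = count + 1
  end
  node.flush_list(head)
  return count
end
</pre>

Appended to <tt>hyphenation.lua</tt>, it could be called as <tt>texio.write_nl(tostring(count_hyphenation_points("Silbentrennung", l)))</tt>, where <tt>l</tt> is the language created above.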
 
