Editing Sort a token list

From LuaTeXWiki

Warning: You are not logged in. Your IP address will be publicly visible if you make any edits. If you log in or create an account, your edits will be attributed to your username, along with other benefits.

The edit can be undone. Please check the comparison below to verify that this is what you want to do, and then save the changes below to finish undoing the edit.
Latest revision Your text
Line 1: Line 1:
= Sorting a list of tokens =
+
http://luatex.bluwiki.com/go/Japanese_and_more_generally_CJK_typesetting
 
 
Sorting in TeX can be done, as demonstrated in the paper "Sorting in TeX's mouth" by Bernd Raichle, but it is slow. Using Lua, sorting can be done much quicker.
 
 
 
The following code demonstrates sorting in PlainTeX-LuaTeX:
 
 
 
% We `preload' the necessary functions into the Lua state
 
%
 
% Note that comments inside \directlua must be TeX comments (`%');
 
% Lua comments (`--') do not work correctly
 
%
 
% The code for transforming a string into a table and back
 
% has been copied from the Lua users wiki (page `MakingLuaLikePhp')
 
function explode(str,divider)
 
local result = {}
 
% Sanity check:
 
if divider == "" then
 
table.insert(result,str)
 
return result
 
end
 
% We search for the start of <divider> or an opening/closing
 
% brace. That way, we can jump through <str> instead of having to
 
% check every character
 
local search = "([{}" .. string.sub(divider,1,1) .. "])"
 
% The current search position
 
local curpos = 0
 
% The position of the last divider
 
local divpos = 0
 
% The number of open left braces
 
local braces = 0
 
% The length of <divider>
 
local divlen = string.len(divider)
 
 
% Search for whatever comes first
 
for st,sp,first in function() return string.find(str,search,curpos) end
 
do
 
% Braces? Then increase/decrease the brace count
 
if first == "{" then
 
braces = braces + 1
 
curpos = st + 1
 
elseif first == "}" then
 
braces = braces - 1
 
curpos = st + 1
 
else
 
% Otherwise, if we are not inside braces, check if the
 
% position found is the start of <divider>. Then split
 
% the substring off <str> and store the first part in <result>
 
if braces == 0 and string.sub(str,st,st + divlen - 1) == divider then
 
table.insert(result,string.sub(str,divpos,st - 1))
 
divpos = st + divlen
 
curpos = st + divlen
 
else
 
curpos = st + 1
 
end
 
end
 
end
 
 
% Do not forget to add the last substring to the result table
 
if divpos < string.len(str) then
 
table.insert(result,string.sub(str,divpos))
 
end
 
 
% Done
 
return result
 
end
 
 
function implode(t,div)
 
return table.concat(t,div)
 
end
 
}
 
 
% #1 = continuation (called with the sorted list as its only
 
% parameter), #2 = delimiter, #3 = <delimiter>-separated list
 
\def\sort#1#2#3{%
 
\directlua0{%
 
% We prevent expansion and then escape all characters special
 
% to lua in the parameters. This is necessary to prevent them
 
% from (a) executed right now and (b) being misunderstood by
 
% Lua.
 
%
 
% We convert the string into a table, delimited by a
 
% (unexpanded, escaped) delimiter:
 
local t = explode("\luaescapestring{\unexpanded{#3}}",
 
  "\luaescapestring{\unexpanded{#2}}")
 
% We sort the table in-place:
 
table.sort(t)
 
% We output the table as a string.
 
%
 
% We use a single call to tex.print to prevent spaces in
 
% the output, since all tex.print calls except the last add
 
% an \endlinechar (LuaTeX manual 4.1.10.1):
 
tex.print("\luaescapestring{\unexpanded{#1}}{" .. implode(t," ") .. "}")
 
}%
 
}
 
 
 
% Output tokens without expansion
 
\def\typeout#1{\message{\unexpanded{[#1]}}}
 
 
 
% Example (text taken from Wikipedia, ``Hyperreality'')
 
\sort{\typeout}{ }{%
 
In semiotics and postmodern philosophy, the term hyperreality
 
characterizes the inability of consciousness to distinguish reality
 
from fantasy, especially \in technologically advanced postmodern
 
cultures. Hyperreality is a means of characterising the way
 
consciousness defines what is actually `real' in a world where a
 
multitude of media can radically shape and filter the original event
 
or experience being depicted. Some famous theorists of hyperreality
 
include Jean Baudrillard, \Albert Borgmann, Daniel Boorstin, and
 
Umberto Eco.}
 
 
\bye
 
 
 
Note: The above code correctly skips the delimiter if enclosed in braces.
 

Please note that all contributions to LuaTeXWiki are considered to be released under the GNU Free Documentation License 1.3 (see LuaTeXWiki:Copyrights for details). If you do not want your writing to be edited mercilessly and redistributed at will, then do not submit it here.
You are also promising us that you wrote this yourself, or copied it from a public domain or similar free resource. Do not submit copyrighted work without permission!

Cancel Editing help (opens in new window)