symbol2unicode: Generate Unicode symbols from similar ASCII character combinations
Huub de Beer
June, 2016
1 Introduction
While reading Nederpelt and Kamareddine (2011), Logical reasoning: A first course, for a project to explore constructionist learning approaches, I found myself entering Unicode symbols a lot. There is nothing wrong with entering one or two Unicode symbols now and then (a fitting symbol enhances the readability of a text enormously), but when a text is symbol-heavy it soon becomes a chore.
For example, in Vim, the text editor I use for all my writing, you enter the logical not operator “¬” by pressing Control + “v”, then “u”, and then “00ac”. This is a lot more typing than, say, ! to denote “not”. Would it not be great if there were a program where I could enter ASCII representations of the symbols I want to use, which would then be converted to their Unicode equivalents?
symbol2unicode is such a program!
symbol2unicode is free software, licensed under the GNU General Public Licence Version 3. You will find its source code on GitHub.
There are two ways to use symbol2unicode: via a web interface and via a command-line interface, which has an interactive mode. Both interfaces work mostly the same: you enter an ASCII representation of a symbol, such as =>, and pressing ENTER converts it to Unicode. You can also supply the ASCII representation as a parameter to the symbol2unicode program.
2 Command-line interface
2.1 Install
You can install symbol2unicode via npm as follows:
npm install -g symbol2unicode
If you do not want to install the program globally, remove the -g parameter from the line above.
2.2 Usage
Run the program symbol2unicode with the ASCII representations of the symbols you want to convert as parameters. For example, to convert => to ⇒, run the program as follows:
symbol2unicode "=>"
You can specify as many parameters as you like. They are joined with a single space (" ") and run through the converter as one long string. For example,
symbol2unicode "P /\ Q" "=>" "!Q \/ P === !P"
results in the output P ∧ Q ⇒ ¬Q ∨ P ≡ ¬P. The input string (<forall i: i in ZZ:i <= i^2>) will be converted to 〈∀ i: i ∈ ℤ:i ≤ i²〉.
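The joining step can be sketched in JavaScript as follows. This is a minimal illustration, not the program's actual source: the function name convertArguments and the five-entry RULES table are assumptions made for this example, and the real mappings live in src/DEFAULT_REPLACEMENTS.js.

```javascript
// Hypothetical subset of the replacement table; the real rules live in
// src/DEFAULT_REPLACEMENTS.js and may be stored differently.
const RULES = [
  ["===", "≡"],
  ["=>", "⇒"],
  ["/\\", "∧"],
  ["\\/", "∨"],
  ["!", "¬"],
];

// Join all command-line parameters with a single space, then replace every
// ASCII representation in the resulting string by its Unicode symbol.
// No sample rule is a substring of another, so for the input below the
// order in which the rules are applied does not matter.
function convertArguments(args) {
  let text = args.join(" ");
  for (const [ascii, unicode] of RULES) {
    text = text.split(ascii).join(unicode);
  }
  return text;
}

// The three parameters from the example above (backslashes escaped for JavaScript).
console.log(convertArguments(["P /\\ Q", "=>", "!Q \\/ P === !P"]));
// prints: P ∧ Q ⇒ ¬Q ∨ P ≡ ¬P
```

Because the parameters are joined before conversion, passing one long quoted string or several short ones gives the same result.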
If the symbol2unicode program is executed without any parameters, it will run in interactive mode. The interactive mode starts by printing the following short welcome message:
Welcome by symbol2unicode.
Usage:
Enter a string of ascii symbols after the prompt (? ) and press
ENTER to convert it to unicode. Press CONTROL+C to quit.
After this message, you can enter ASCII representations of the symbols you want to convert at the ? prompt. Press ENTER to convert your input to Unicode. To quit interactive mode and the program, press CONTROL+C.
Finally, it is possible to use the symbol2unicode program with pipes. For example:
echo "P /\ Q === true" | symbol2unicode
will result in P ∧ Q ≡ true.
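A command-line tool that supports parameters, an interactive prompt, and pipes has to decide which of the three applies. The sketch below shows one way to make that decision in Node.js. The helper chooseMode is hypothetical and its logic is an assumption, not taken from the symbol2unicode source. In a real script the second argument would typically come from process.stdin.isTTY, which Node.js sets to true only when standard input is a terminal.

```javascript
// Decide how the program should obtain its input.
//   args:       command-line parameters (without the node and script paths)
//   stdinIsTTY: true when standard input is attached to a terminal
function chooseMode(args, stdinIsTTY) {
  if (args.length > 0) {
    return "parameters"; // convert the joined parameters
  }
  if (stdinIsTTY) {
    return "interactive"; // nothing piped in: show the ? prompt
  }
  return "pipe"; // read standard input until it ends, then convert
}

// In a real script: chooseMode(process.argv.slice(2), Boolean(process.stdin.isTTY))
console.log(chooseMode(["=>"], true)); // prints: parameters
console.log(chooseMode([], true)); // prints: interactive
console.log(chooseMode([], false)); // prints: pipe
```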
2.2.1 Use in Vim
As I am a heavy Vim user, I like to use symbol2unicode from inside Vim. Of course, I can call it like any other external program in Vim:
:r !symbol2unicode "(forall i:i in NN:i <= i^2)"
This will insert (∀ i:i ∈ ℕ:i ≤ i²) on the line below the one where the cursor is. This works fine, but the command is quite a “lineful”, particularly if you only want to insert a single symbol now and then. A simple way to shorten the invocation is to create an alias for symbol2unicode in Bash (or any other shell that supports aliases), such as s2u or uu.
A better way, however, is to create a custom Vim command that feeds its argument to symbol2unicode and inserts the output in the current file. I like the sound of S2u for that (custom commands must start with a capital letter). The above example then becomes:
:S2u (forall i:i in NN: i <= i^2)
To create the S2u command, run
:command -nargs=+ S2u r! symbol2unicode "<args>"
or add it to your .vimrc. As a next step, you could map S2u to a key, such as F7, with
:map <F7> <Esc>:S2u <Space>
All you now have to do is press that key, type your ASCII string of symbols, and press ENTER.
3 Overview of ASCII-Unicode mappings
For a full overview of the ASCII to Unicode mappings, see the source code file src/DEFAULT_REPLACEMENTS.js.
The replacement rules obey two simple constraints:
- An ASCII symbol representation can occur in only one rule, but a Unicode symbol can occur in as many rules as needed.
- A Unicode symbol is exactly one character, but an ASCII symbol representation can have as many characters as needed.
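These two constraints can be checked mechanically. The sketch below is an assumption about how such a check might look, not the project's code: addRule and the Map-based table are hypothetical names, the ==> rule is made up for illustration, and "exactly one character" is interpreted here as exactly one Unicode code point.

```javascript
// Hypothetical rule table: ASCII representation -> Unicode symbol.
const rules = new Map();

// Add a rule, enforcing the two constraints described above.
function addRule(ascii, unicode) {
  // Constraint 1: an ASCII representation may appear in only one rule.
  if (rules.has(ascii)) {
    throw new Error(`"${ascii}" is already mapped to "${rules.get(ascii)}"`);
  }
  // Constraint 2: the Unicode side is a single symbol. Spreading a string
  // iterates by code point, so symbols outside the BMP count as one.
  if ([...unicode].length !== 1) {
    throw new Error(`"${unicode}" is not exactly one Unicode symbol`);
  }
  rules.set(ascii, unicode);
}

addRule("=>", "⇒");
addRule("==>", "⇒"); // allowed (made-up rule): a Unicode symbol may occur more than once
try {
  addRule("=>", "⋔"); // rejected: "=>" already has a rule
} catch (error) {
  console.log(error.message);
}
```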
Where there are clear conventions for ASCII symbol representations, such as in programming languages, these conventions have priority over more “logical” representations. Therefore, <= is converted to ≤ rather than ⇐ (which you get with <==).
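For both <= and <== to coexist, a converter has to prefer the longer representation when several rules match at the same position. This document does not say how symbol2unicode resolves that internally. The sketch below shows one common approach, longest match first, using a hypothetical two-rule table and a hypothetical convert function.

```javascript
// Hypothetical two-rule table taken from the example above.
const RULES = new Map([
  ["<=", "≤"],
  ["<==", "⇐"],
]);

// ASCII representations sorted from longest to shortest, so that "<=="
// is tried before "<=" at every position.
const KEYS = [...RULES.keys()].sort((a, b) => b.length - a.length);

// Scan the input left to right; at each position emit the symbol for the
// longest matching representation, or copy the character unchanged.
function convert(text) {
  let result = "";
  let i = 0;
  while (i < text.length) {
    const key = KEYS.find((k) => text.startsWith(k, i));
    if (key === undefined) {
      result += text[i];
      i += 1;
    } else {
      result += RULES.get(key);
      i += key.length;
    }
  }
  return result;
}

console.log(convert("i <= j")); // prints: i ≤ j
console.log(convert("P <== Q")); // prints: P ⇐ Q
```

Without the length-based ordering, <== could be read as <= followed by a stray =.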
You can add a symbol to the default list of replacements by either opening a pull request or sending me an email. Before you do, however, check that your new replacement rule does not interfere with pre-existing rules. You can check that in one of two ways:
- On the command line: add your rules to the src/DEFAULT_REPLACEMENTS.js file, build the program with npm run build, and start the program:
  bin/cli
  If your new rule is in conflict with a pre-existing rule, the program will complain and exit.
- In the web browser: go to the web interface and open the JavaScript console. The converter is in the global scope. You can try to add your rules by calling the rule method on the converter like so:
  converter.rule("=>", "⋔");
  The first argument to the rule method is the ASCII representation and the second one is the Unicode symbol. Again, if your rule interferes with a pre-existing rule or is otherwise not okay, it will complain.
Of course, as symbol2unicode is free software, you are free to create your own set of (default) translation rules.