Here I'll try to provide a step-by-step explanation of how to use this library and tool.
To use it, you need to have Java installed (the JRE is enough for the
tool; the library requires the whole JDK). Search the web for these abbreviations if you
need more info.
Running command-line tool
Download frej.jar and try to run it:
$ java -jar frej.jar
Since you have not given the pattern to use, you will receive a
short usage example:
Pattern should be specified. Example:
java -jar frej.jar "(give,(#)~A,(^doll*,buck*,usd))|got_$A_dollars"
give 5 dollars
giv 70 bucks
gave 1000 usd
Ok. Now you at least know that you can run the utility.
Parsing input lines
Let us try to run a simple example and watch how the utility processes input:
$ java -jar frej.jar "(^(barack,(?h*),obama)|44-th,(=george,washington)|1-st,((^abraham,abe),lincoln)|16-th)~A|$A_president_of_USA"
After the utility starts, it waits for input lines. Let us try some variants
(answers from frej.jar, originally highlighted in yellow, are the lines ending
in "president of USA"; inputs that do not match produce no answer):
Barack Obama
44-th president of USA
Barak Boama
44-th president of USA
Barrac Obamah
44-th president of USA
Gorge Woshingtone
1-st president of USA
Washignton Goerg
washignton Georg
1-st president of USA
Barack Hussein Obama
44-th president of USA
Vladimir Putin
Abe Linkonl
16-th president of USA
Ab Linkoln
Abbe Linkoln
16-th president of USA
Avraam Lncoln
16-th president of USA
Here we see that FREJ can be used to describe simple "recognize-and-substitute"
functionality. Note the following points:
- elements of a regular expression are fragments in brackets, with an optional
element-type character at the beginning and, possibly, deep nesting;
- for example, (^expr1,...,exprN) means that the string should be matched
against any one of the provided inner expressions (expr1 .. exprN);
- an element without a special type mark, (expr1,...,exprN), means that the
string must match all the inner expressions in the same order, for example
(barack,obama) would match "Barack Obama" but not "Obama Barack";
- in contrast, (=exprA,exprB) would match the string against the pair of
expressions in any order (see the example with "George Washington" and
"Washington George");
- if there are too many typos, the program decides that the string does not match
the provided regular expression and outputs nothing (as happens with
"Vladimir Putin" or "Washignton Goerg"); the threshold for the percentage of
typos can be changed by the programmer;
- the symbol "|" (pipe) introduces the intended substitution for
an element of the regexp (used if it matched some part of the string);
- the symbol "~" (tilde) with a letter 'A'..'Z' remembers the
result of the element's match (or substitution) as a named group, which can
later be used in an outer replacement string (with a dollar sign "$"
followed by the same letter).
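The typo threshold mentioned in the list above can be illustrated with a small, self-contained sketch. It is not FREJ's code: this page does not say which distance measure FREJ uses, so the optimal-string-alignment edit distance, the class name FuzzyTokenSketch and the threshold value 0.34 are all assumptions made for illustration only.

```java
import java.util.Locale;

/** Illustrative only: a guess at how a typo threshold can work, not FREJ's matcher. */
final class FuzzyTokenSketch {

    /** Edit distance counting insertions, deletions, substitutions and adjacent swaps. */
    static int distance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
                // Two adjacent characters typed in the wrong order count as one typo.
                if (i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
                }
            }
        }
        return d[a.length()][b.length()];
    }

    /** Accepts the input when typos per pattern character stay within the threshold. */
    static boolean fuzzyEquals(String pattern, String input, double threshold) {
        String p = pattern.toLowerCase(Locale.ROOT);
        String s = input.toLowerCase(Locale.ROOT);
        if (p.isEmpty()) return s.isEmpty();
        return (double) distance(p, s) / p.length() <= threshold;
    }

    public static void main(String[] args) {
        double threshold = 0.34; // assumed value, chosen only for this demo
        System.out.println(fuzzyEquals("obama", "Boama", threshold)); // true  (one swap in 5 letters)
        System.out.println(fuzzyEquals("obama", "Putin", threshold)); // false (every letter differs)
    }
}
```

Dividing the distance by the pattern length is one simple way to express a "percentage of typos"; a real matcher may weigh errors differently.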
Some more abilities
- (#M:N) matches an integer in the range from M to N, (#N) matches a number
in the range from 1 to N, and (#) matches any number. This is useful for
matching numbered geographical names (1-st to 25-th avenue, for example);
- a token with a star at the end matches any word that begins with the
literal part before the star, so (barack,h*,obama) would match "Barack H. Obama" and
"Barack Hussein Obama";
- a special kind of element is (?expr): it can match the string against the
specified expression, but it can also match nothing, which is why we
could easily match both "Barack Obama" and "Barack Hussein Obama";
- spaces, tabs and line-feeds are skipped, but you can use an underscore to
specify a space or "\_" to specify an underscore, and "\r", "\n", "\\" behave
as in C-derived languages;
- for better readability you can use not only round brackets, but
also square and curly ones. The pattern in the next section uses all three.
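The numeric elements from the list above, (#), (#N) and (#M:N), lend themselves to a short sketch. This is a hedged illustration, not FREJ's parser: the class NumberElementSketch and its matches method are hypothetical, and it assumes that a "number" means a plain run of decimal digits with no sign.

```java
/** Illustrative only: mimics the documented ranges of (#), (#N) and (#M:N). */
final class NumberElementSketch {

    /**
     * @param spec  the text after '#': "" for (#), "N" for (#N), "M:N" for (#M:N)
     * @param token the input word to test
     */
    static boolean matches(String spec, String token) {
        // Assumption: only unsigned digit runs count; 9 digits keeps parseInt safe.
        if (!token.matches("\\d{1,9}")) return false;
        int value = Integer.parseInt(token);
        if (spec.isEmpty()) return true;             // (#) accepts any number
        int colon = spec.indexOf(':');
        int low = colon < 0 ? 1 : Integer.parseInt(spec.substring(0, colon));
        int high = Integer.parseInt(spec.substring(colon + 1));
        return low <= value && value <= high;        // (#N) is treated as 1..N
    }

    public static void main(String[] args) {
        System.out.println(matches("", "70"));       // true
        System.out.println(matches("25", "26"));     // false
        System.out.println(matches("10:20", "15"));  // true
    }
}
```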
Formatting patterns using extended syntax
This is how the example with presidents could look if the pattern is placed in
a separate file:
[^
(barack, {?h*}, obama)
| 44-th,
(= george, washington)
| 1-st,
( {^ abraham, abe}, lincoln)
| 16-th
]~A
| $A_president_\rof_U\_S\_A
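To make the whitespace and escape rules concrete, here is a minimal sketch of how the literal part of such a replacement string could be decoded. It follows only the rules stated on this page (unescaped whitespace skipped, underscore as space, "\_", "\r", "\n", "\\"); the class EscapeSketch is hypothetical, it does not expand "$A" groups, and passing unknown escapes through unchanged is an assumption rather than documented FREJ behaviour.

```java
/** Illustrative only: decodes pattern literals according to the rules described above. */
final class EscapeSketch {

    static String decode(String raw) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '\\' && i + 1 < raw.length()) {
                char next = raw.charAt(++i);
                switch (next) {
                    case 'r': out.append('\r'); break;
                    case 'n': out.append('\n'); break;
                    // Covers "\_" and "\\"; other escapes pass through (assumption).
                    default:  out.append(next);
                }
            } else if (c == '_') {
                out.append(' ');                     // underscore stands for a space
            } else if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
                // Unescaped whitespace is layout only and is skipped.
            } else {
                out.append(c);                       // includes a lone trailing backslash
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("| 1-st _president"));  // prints "|1-st president"
        System.out.println(decode("U\\_S\\_A"));          // prints "U_S_A"
    }
}
```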
Using library
If you need to use the project as a library rather than a stand-alone tool, you only need
to add frej.jar to the classpath and create instances of class frej.Regex, specifying
the necessary patterns. Then you can test and replace strings with the methods
match, getReplacement and others. Download the javadocs to find out
which helpful methods this class offers.