201 lines
6.5 KiB
Org Mode
201 lines
6.5 KiB
Org Mode
#+HTML_HEAD: <link rel="stylesheet" href="../../static/style.css">
|
|
#+HTML_HEAD: <link rel="icon" href="../../static/grus/favicon.png" type="image/png">
|
|
#+EXPORT_FILE_NAME: index
|
|
#+OPTIONS: toc:nil
|
|
#+TOC: headlines 3
|
|
#+TITLE: Grus
|
|
|
|
Grus is a simple word unjumbler written in Go.
|
|
|
|
| Project Home | [[https://andinus.nand.sh/grus/][Grus]] |
|
|
| Source Code | [[https://tildegit.org/andinus/grus][Andinus / Grus]] |
|
|
| GitHub (Mirror) | [[https://github.com/andinus/grus][Grus - GitHub]] |
|
|
*Tested on*:
|
|
- OpenBSD 6.6 (with /pledge/ & /unveil/)
|
|
|
|
| Demo Video | System Information |
|
|
|-------------+------------------------------------|
|
|
| [[https://diode.zone/videos/watch/515e2528-a731-4c73-a0da-4f8da21a90c0][Grus v0.2.0]] | OpenBSD 6.6 (with /pledge/ & /unveil/) |
|
|
|
|
* Installation
|
|
** Pre-built binaries
|
|
Pre-built binaries are available for OpenBSD, FreeBSD, NetBSD, DragonFly BSD,
|
|
Linux & macOS.
|
|
|
|
This will just print the steps to install grus & you have to run those commands
|
|
manually. Piping directly to =sh= is not a good idea, don't run this unless you
|
|
understand what you're doing.
|
|
*** v0.3.0
|
|
#+BEGIN_SRC sh
|
|
curl -s https://tildegit.org/andinus/grus/raw/tag/v0.3.0/scripts/install.sh | sh
|
|
#+END_SRC
|
|
*** v0.2.0
|
|
#+BEGIN_SRC sh
|
|
curl -s https://tildegit.org/andinus/grus/raw/tag/v0.2.0/scripts/install.sh | sh
|
|
#+END_SRC
|
|
** Post install
|
|
You need to have a dictionary for grus to work, if you don't have one then you
|
|
can download the Webster's Second International Dictionary, all 234,936 words
|
|
worth. The 1934 copyright has lapsed.
|
|
#+BEGIN_SRC sh
|
|
curl -L -o /usr/local/share/dict/web2 \
|
|
https://archive.org/download/grus-v0.2.0/web2
|
|
#+END_SRC
|
|
|
|
There is also another big dictionary with around half a million english words.
|
|
I'm not allowed to distribute it, you can get it directly from GitHub.
|
|
#+BEGIN_SRC sh
|
|
curl -o /usr/local/share/dict/words \
|
|
https://raw.githubusercontent.com/dwyl/english-words/master/words.txt
|
|
#+END_SRC
|
|
* Documentation
|
|
*Note*: This is documentation for latest release, releases are tagged & so
|
|
previous documentation can be checked by browsing source at tags.
|
|
|
|
** Environment Variables
|
|
*** =GRUS_SEARCH_ALL=
|
|
Search in all dictionaries, by default Grus will exit after searching in first
|
|
dictionary.
|
|
*** =GRUS_ANAGRAMS=
|
|
Prints all anagrams if set to true, by default Grus will print all anagrams.
|
|
*** =GRUS_PRINT_PATH=
|
|
Prints path to dictionary if set to true, this is set to false by default.
|
|
*** =GRUS_STRICT_UNJUMBLE=
|
|
Overrides everything & will try to print at least one match, if it doesn't find
|
|
any then it will exit the program with a non-zero exit code. This will ignore
|
|
=GRUS_SEARCH_ALL= till it finds at least one match.
|
|
** Default Dictionaries
|
|
These files will be checked by default (in order).
|
|
- =/usr/local/share/dict/words=
|
|
- =/usr/local/share/dict/web2=
|
|
- =/usr/share/dict/words=
|
|
- =/usr/share/dict/web2=
|
|
- =/usr/share/dict/special/4bsd=
|
|
- =/usr/share/dict/special/math=
|
|
** Examples
|
|
#+BEGIN_SRC sh
|
|
# print grus version
|
|
grus version
|
|
|
|
# print grus env
|
|
grus env
|
|
|
|
# unjumble word
|
|
grus word
|
|
|
|
# don't print all anagrams
|
|
GRUS_ANAGRAMS=false grus word
|
|
|
|
# search for word in custom dictionaries too
|
|
grus word /path/to/dict1 /path/to/dict2
|
|
|
|
# search for word in all dictionaries
|
|
GRUS_SEARCH_ALL=true grus word /path/to/dict1 /path/to/dict2
|
|
|
|
# print path to dictionary
|
|
GRUS_PRINT_PATH=1 grus word
|
|
|
|
# find at least one match
|
|
GRUS_STRICT_UNJUMBLE=1 grus word
|
|
#+END_SRC
|
|
* History
|
|
Initial version of Grus was just a simple shell script that used the slowest
|
|
method of unjumbling words, it checked every permutation of the word with all
|
|
words in the file with same length.
|
|
|
|
Later I rewrote the above logic in python, I wanted to use a better method. Next
|
|
version used logic similar to the current one. It still had to iterate through
|
|
all the words in the file but it eliminated lots of cases very quickly so it was
|
|
faster. It first used the length check then it used this little thing to match
|
|
the words.
|
|
|
|
#+BEGIN_SRC python
|
|
import collections
|
|
|
|
match = lambda s1, s2: collections.Counter(s1) == collections.Counter(s2)
|
|
#+END_SRC
|
|
|
|
I don't understand how it works but it's fast, faster than convert the string to
|
|
list & sorting the list. Actually I did that initially & you'll still find it in
|
|
grus-add script.
|
|
|
|
#+BEGIN_SRC python
|
|
lexical = ''.join(sorted(word))
|
|
if word == lexical:
|
|
print(word)
|
|
#+END_SRC
|
|
|
|
This is equivalent to lexical.SlowSort in current version.
|
|
|
|
#+BEGIN_SRC go
|
|
package lexical
|
|
|
|
import (
|
|
"sort"
|
|
"strings"
|
|
)
|
|
|
|
// SlowSort returns string in lexical order. This function is slower
|
|
// than Lexical.
|
|
func SlowSort(word string) (sorted string) {
|
|
// Convert word to a slice, sort the slice.
|
|
t := strings.Split(word, "")
|
|
sort.Strings(t)
|
|
|
|
sorted = strings.Join(t, "")
|
|
return
|
|
}
|
|
#+END_SRC
|
|
|
|
Next version was also in python & it was stupid, for some reason using a
|
|
database didn't cross my mind then. It sorted the word & then created a file
|
|
with name as lexical order of that word (if word is "test" then filename would
|
|
be "estt"), and it appended the word to that file.
|
|
|
|
It took user input & sorted the word, then it just had to print the file (if
|
|
word is "test" then it had to print "estt"). This was a lot faster than
|
|
iterating through all the words but we had to prepare the files before we could
|
|
do this.
|
|
|
|
This was very stupid because the dictionary I was using had around 1/2 million
|
|
words so this meant we got around half a million files, actually less than that
|
|
because anagrams got appended into a single file but it was still a lot of small
|
|
files. Handling that many small files is stupid.
|
|
|
|
I don't have previous versions of this program. I decided to rewrite this in Go,
|
|
this version does things differently & is faster than all previous versions.
|
|
Currently we first sort the word in lexical order, we do that by converting the
|
|
string to =[]rune= & sorting it, this is faster than lexical.SlowSort.
|
|
lexical.SlowSort converts the string to =[]string= & sorts it.
|
|
|
|
#+BEGIN_SRC go
|
|
package lexical
|
|
|
|
import "sort"
|
|
|
|
// Sort takes a string as input and returns the lexical order.
|
|
func Sort(word string) (sorted string) {
|
|
// Convert the string to []rune.
|
|
var r []rune
|
|
for _, char := range word {
|
|
r = append(r, char)
|
|
}
|
|
|
|
sort.Slice(r, func(i, j int) bool {
|
|
return r[i] < r[j]
|
|
})
|
|
|
|
sorted = string(r)
|
|
return
|
|
}
|
|
#+END_SRC
|
|
|
|
Instead of creating lots of small files, entries are stored in a sqlite3
|
|
database.
|
|
|
|
This was true till v0.1.0, v0.2.0 was rewritten & it dropped the use of database
|
|
or any form of pre-parsing the dictionary. Instead it would look through each
|
|
line of dictionary & unjumble the word, while this may be slower than previous
|
|
version but this is simpler.
|