# sr-71 Reference Documentation
# Configuration Options In Detail
The configuration file is built out of three things: servers, directives, and modifiers. Servers specify a protocol to be hosted and where to host it, and directives inside the server definition fill out the details. Directives can appear inside servers or globally, and modifiers are applied to certain directives to further customize their behavior.
## Servers
A server is structured as a block of configuration, starting with the protocol name and optionally the IP:port on which to host.
```
<protocol> [<IP>][:<port>] {
# directives...
}
# for example:
gopher 1.2.3.4:7070 {
# directives...
}
# or:
gemini {
# directives...
}
# or:
finger :7979 {
# directives...
}
```
The protocol must be supplied, but the IP and colon-prefixed port are both optional. By default "0.0.0.0" will be used as the IP (meaning it will serve on all available interfaces) and the port will be the default for the protocol.
Just for simplicity of the configuration parser, the opening brace ({) must be at the end of the same line as the protocol name, and the closing brace (}) must be on a line of its own.
### gopher server
Gopher servers are pretty straightforward. The default port is 70, and the only special thing you have to get right inside a "gopher" block is that the server must know *where* it is, so it needs a single "host" directive with exactly one hostname. It supports all other directives.
### finger server
Finger is a little more constrained as a protocol than the others here, because requests pretty much just point to a user name. So a "finger {...}" block must contain a single "static" or "cgi" directive, which must have a tilde (~) placeholder in the file system path and must not have an "at <url path>" clause. See the sections below on the "static" and "cgi" directives for more details.
Finger's default port is 79.
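A finger server following these rules might look roughly like the sketch below. The directory name ".finger" is purely an assumption for illustration; the only parts taken from the rules above are the single "static" directive, the tilde in the file system path, and the missing "at" clause.
```example finger server (the path is hypothetical)
finger {
# one routing directive, a ~ in the file system path, and no "at" clause
static ~/.finger
}
```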
### gemini server
Gemini's default port is 1965.
Because of the TLS requirement, a gemini server must contain a "servertls" directive.
sr-71 also supports virtualhosts in gemini, where multiple domains can be hosted on the same IP/port. This is done by having multiple "gemini {...}" blocks with the same IP and port (potentially defaults), but with "host" directives differentiating them. More on this in the section below on virtualhosting.
### spartan server
Spartan's default port is 300.
It also supports virtualhosting.
### nex server
Nex's default port is 1900.
Nex doesn't support virtualhosting, and in fact "host" directives have no effect as the protocol doesn't allow the request to specify the host it's targeting.
## Directives
Directives really come in two flavors: global directives and server directives. In either case, a directive is always contained on a single line, which begins (after any leading whitespace) with the type of the directive. What follows the directive type depends on that type.
### [global] systemuser
A config file may have one "systemuser" directive. In this case the type is followed by just a single word which is the user (or numeric user ID) the server should run as.
When there is a systemuser directive, sr-71 will drop privileges to the named user after doing a few things:
* parsing the config file
* reading any TLS-related files into memory
* starting listening sockets on all the server ports
If sr-71 is not started as root but has a "systemuser" directive, it will fail at startup with an error message.
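As a rough sketch, a config using "systemuser" could look like this. The user name "gemini" and all of the paths are assumptions for illustration, not defaults.
```example systemuser directive (user name and paths are hypothetical)
# global directive, outside of any server block
systemuser gemini
gemini {
servertls key /etc/ssl/example.key cert /etc/ssl/example.crt
static /var/gemini at /
}
```
Since TLS files are read into memory before privileges are dropped, the key file in a setup like this should only need to be readable by root.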
### [global] loglevel
There may be a single "loglevel" directive, in which the word "loglevel" is followed by one of the levels: "debug", "info", "warn", or "error". Logs sent to stdout will then be filtered to only include the specified level and above.
Having no "loglevel" directive is the same as "loglevel debug", which allows everything through.
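For instance, a config that wants to keep routine messages out of its logs could use something like:
```example loglevel directive
# global directive: only "warn" and "error" logs are sent to stdout
loglevel warn
```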
### [global] auth
The global "auth" directive sets up an authentication mechanism which can then be used in server directives.
The first word following "auth" is a name which can later be used to reference the auth mechanism.
Next is a token specifying how the auth will be accomplished, which can be any of:
* "hasclienttls" stands alone and means that requests with no client TLS certificate will be rejected (but any client TLS cert is allowed)
* "clienttls" is followed by a comma-separated list of client certificate SHA256 hashes, indicating which client certs will be allowed to pass this auth
* "clienttlsfile" is followed by the path of a file which contains a line-delimited list of client certificate SHA256 hashes which will be allowed to pass. This file path may also include ~ characters, in which case its location will be expanded according to "User-Custom Paths" rules.
```some example auth directives
auth is_named hasclienttls
auth is_tony clienttls 0284bcb38d7c98548df4a67587163276373ea8f9a8cc931a89f475557bd9f3a3
auth dev_team clienttls 95927842cde8d9bfe121602db7178f7d5d9b0d73ad6a815a6631f0af2d6c4ebf, f8f5c6e9e7f8e19852247648114d99e7ae682454d5f093afc695da58d9b2dabf
auth user_private_gemini clienttlsfile ~/.private_gemini
```
### [server] host
The "host" directive tells a server what hostname(s) it is serving. It is followed by one or more comma-separated hostnames.
A gopher server *must* contain a single host directive with just one hostname - this is to enable it to generate local links on generated gopher menus.
In gemini and spartan servers, host directives control virtual hosting behavior.
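A sketch of both uses follows. The hostnames and paths are placeholders chosen for illustration.
```example host directives (hostnames and paths are hypothetical)
gopher {
# gopher needs exactly one hostname, used when generating menu links
host gopher.example.com
static /var/gopher at / with dirlist
}
gemini {
# several comma-separated hostnames served by the same block
host example.com, www.example.com
servertls key /etc/ssl/example.key cert /etc/ssl/example.crt
static /var/gemini at /
}
```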
### [server] servertls
"servertls" provides servers with the paths to their TLS server credentials. It is followed by two clauses: "key <path to key file>", and "cert <path to cert file>". Both clauses are required (if a single file contains both then use that path for both clauses).
Gemini servers must always host with TLS and so require a "servertls" directive. In gopher, finger, and nex, the presence of a "servertls" directive will cause them to host their content tls-encrypted. It is not allowed in spartan servers.
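The two ways of supplying credentials might be written as follows. All file paths, the hostname, and the port are assumptions for illustration.
```example servertls directives (paths are hypothetical)
gemini {
# separate key and cert files
servertls key /etc/ssl/private/example.key cert /etc/ssl/certs/example.crt
static /var/gemini at /
}
gopher :7070 {
host gopher.example.com
# one file holding both the key and the cert is given to both clauses
servertls key /etc/ssl/gopher.pem cert /etc/ssl/gopher.pem
static /var/gopher at /
}
```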
### [server] static
The "static" directive causes the contents of a directory to be served directly. "static" directives only allow world-readable files to be read.
"static" is immediately followed by the file system path to host. Next comes an "at" clause, which is usually required and provides the URL prefix to host at (use "/" to host at the domain root). An optional "with" clause then introduces modifiers, and finally an optional "auth" clause names an auth defined elsewhere in a global "auth" directive.
The file system path and url prefix support user-custom path expansion as well.
It ends up looking like:
```static directive structure
gemini {
#static <filesystem path to documents> [at <url path prefix>] [with <modifiers>] [auth <name>]
# for example:
static /var/gemini at / with dirdefault index.gmi
static /var/gemini/private at /private auth private_gemini
static /my/project/documentation at /docs with dirlist
}
```
Finger servers are a little special, in that the file system path *must* use a tilde (~) for user-custom path expansion, and it's the only context which does not support the "at" clause.
### [server] cgi
"cgi" directives are similar to "static" in that they define a filesystem directory to be hosted at a URL path. However "cgi" directives cause any world-readable and world-executable files to be *run* instead, and their stdout to be served as the default type for the protocol. There is more detail on this in the section "Running CGIs".
CGI directives support fewer modifiers than "static": only "cmd", "autoatom" (in gemini and spartan servers), and "extendedgophermap" (in gopher servers), as listed under each modifier's allowed contexts below. Find more on the "extendedgophermap" modifier below in the section "Extended Gophermap Parsing".
Otherwise the structure is the same as "static":
```example cgi directives
gemini {
cgi /var/gemini/cgis at /cgi-bin
}
gopher {
cgi ~/public_gopher/cgi-bin at /~/cgi-bin with extendedgophermap
}
```
### [server] git
"git" directives point to a filesystem directory which contains git repositories, and serve up git repo viewers at a given URL prefix. As another routing directive, it is structured similarly to the "static" and "cgi" directives.
```example git directive
gopher {
git /var/repos at /code
}
gemini {
git ~/code at /~/code
}
```
"git" is not supported in finger, but otherwise it builds appropriate views according to the protocol of the server it is under (gemtext on gemini and spartan, gopher menu on gopher, plain text on nex).
The only supported modifier is "templates" - find more details on that in the section on "Git Viewing Templates".
## Modifiers
The routing directives "static", "cgi", and "git" support modifiers in the "with" clause. These are comma-separated, and each modifier begins with the modifier type and then may be followed by type-specific additional information.
### dirdefault <filename>
"dirdefault" is followed by a file name, and it customizes the behavior of requests for directory paths. If the path exists and "dirdefault" is given, it will look for the provided file name within the requested directory and serve that *as the directory itself*. Think "index.html" in web servers.
Allowed contexts: static directive (neither cgi nor git), gemini, spartan, gopher, or nex servers (no finger).
### dirlist
"dirlist" takes no additional parameter and causes a protocol-appropriate listing to be built for directory requests.
If both "dirdefault" and "dirlist" are in use then "dirdefault" will take precedence and the listing will only be built if the dirdefault filename doesn't exist.
Allowed contexts: static directive (neither cgi nor git), gemini, spartan, gopher, or nex servers (no finger).
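A short sketch of the two modifiers working together, with hypothetical paths:
```example dirdefault and dirlist together (paths are hypothetical)
gemini {
servertls key /etc/ssl/example.key cert /etc/ssl/example.crt
# a request for /docs/ serves /var/gemini/docs/index.gmi if it exists,
# and falls back to a generated listing of /var/gemini/docs otherwise
static /var/gemini/docs at /docs with dirdefault index.gmi, dirlist
}
```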
### exec
"exec" allows static directives to also run CGIs. The major difference between "static...with exec" and "cgi" is that the former will also serve file contents for non-executable files, and can support more additional modifiers such as dirdefault and dirlist.
There is more detail in the section "Running CGIs" below.
Allowed contexts: static directive (neither cgi nor git), gemini/gopher/spartan/finger/nex servers.
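For example, a single route that mixes documents and CGIs could be sketched like this (paths are assumptions for illustration):
```example static with exec (paths are hypothetical)
gemini {
servertls key /etc/ssl/example.key cert /etc/ssl/example.crt
# world-readable and world-executable files under /var/gemini are run as CGIs,
# other world-readable files are served as-is
static /var/gemini at / with exec, dirdefault index.gmi, dirlist
}
```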
### cmd <file path>
"cmd" overrides the CGI behavior of either "static...with exec" or "cgi" directives. It causes the named file to be run as the executable in place of the located executable file.
Importantly, in all other ways it will still run as the located file:
* It will not be invoked at all unless the request maps to a world-readable and world-executable file
* The working directory will be set to the location of the target file
* All the CGI environment variables are the same, for instance SCRIPT_NAME and PATH_INFO.
So it can, for instance, be a good opportunity for a system administrator to impose boundaries on user CGIs in a shared hosting environment. The "cmd" script can set a nice level, increment a semaphore potentially waiting for a slot, set system resource limitations, chroot, and finally "exec ./$(basename $SCRIPT_NAME)".
Allowed contexts: "static...with exec" or cgi (no git), gemini/gopher/spartan/finger/nex servers.
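In a shared hosting setup this might be configured as in the sketch below. The wrapper path /usr/local/bin/cgi-sandbox is hypothetical and stands for whatever script the administrator writes.
```example cmd modifier (the wrapper path is hypothetical)
gemini {
servertls key /etc/ssl/example.key cert /etc/ssl/example.crt
# every user CGI is launched through the administrator's wrapper script
cgi ~/public_gemini/cgi-bin at /~/cgi-bin with cmd /usr/local/bin/cgi-sandbox
}
```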
### extendedgophermap
"extendedgophermap" enables lots of additional flexibility in writing the gopher menu format. The ideas are mostly borrowed from gophernicus, and sr-71's implementation is documented in more detail below in "Extended Gophermap Parsing".
Allowed contexts: static, cgi directives (no git), gopher servers (no gemini, spartan, nex, or finger).
### autoatom
The "autoatom" modifier customizes routing to recognize "<any other valid path>.atom", and parses any gemtext response, transforming it into atom XML according to the gemini companion specification.
=> gemini://geminiprotocol.net/docs/companion/subscription.gmi "Subscribing to Gemini pages" gemini companion specification
Allowed contexts: static, cgi directives (no git), gemini and spartan servers (no gopher, nex, or finger).
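A minimal sketch of enabling it on a route follows. The paths are hypothetical, and the comment reflects one reading of the "<any other valid path>.atom" rule above.
```example autoatom modifier (paths are hypothetical)
gemini {
servertls key /etc/ssl/example.key cert /etc/ssl/example.crt
# assuming /blog/index.gmi is a valid path, a request for /blog/index.gmi.atom
# would be answered with atom XML built from that gemtext
static /var/gemini/blog at /blog with autoatom
}
```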
### auth <auth name>
The "auth" modifier takes the name of an auth (defined in a global "auth" directive) and sets it as a requirement to access the modified route. All the supported auth mechanisms are based on client TLS certificates, so they only work in servers with a "servertls" directive.
Allowed contexts: static, cgi, git directives, and gemini, gopher, finger, and nex servers (no spartan, and gopher/finger/nex only with "servertls").
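Putting a global "auth" directive together with a protected route might look like the sketch below, which follows the form used in the other examples in this document. The auth name "members", the hostname, and the paths are assumptions for illustration.
```example auth on a gopher route (names and paths are hypothetical)
# any client TLS certificate passes this auth
auth members hasclienttls
gopher {
host gopher.example.com
# client TLS auth in gopher only works with "servertls"
servertls key /etc/ssl/gopher.pem cert /etc/ssl/gopher.pem
static /var/gopher/members at /members with dirlist auth members
}
```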
### titan <auth name>
The "titan" modifier takes an auth name (defined in a global "auth" directive) and enables the titan file upload protocol in a static route. Titan requests specifically will have to pass the named auth mechanism.
Allowed contexts: static (neither cgi nor git), gemini servers (no gopher, spartan, nex, or finger).
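As a sketch, a route that accepts uploads from one known client certificate could be written like this. The auth name "editors" and the paths are hypothetical, and the hash is the sample value from the auth examples earlier in this document.
```example titan modifier (names and paths are hypothetical)
auth editors clienttls 0284bcb38d7c98548df4a67587163276373ea8f9a8cc931a89f475557bd9f3a3
gemini {
servertls key /etc/ssl/example.key cert /etc/ssl/example.crt
# titan uploads under /wiki must pass the "editors" auth
static /var/gemini/wiki at /wiki with titan editors
}
```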
### templates <dir path>
"templates" is a "git" directive modifier which allows custom templates (golang's text/template style) to control the rendering of the various pages in the git repo viewer.
The supported template names and their execution contexts are documented below in the section on "Git Viewing Templates".
Allowed contexts: git (neither static nor cgi), gemini or gopher servers (no spartan, nex, or finger).
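A sketch of a git route with overridden templates, assuming a hypothetical directory /etc/sr-71/git-templates that holds the template files:
```example templates modifier (paths are hypothetical)
gemini {
servertls key /etc/ssl/example.key cert /etc/ssl/example.crt
# the directory is expected to hold files named after the supported templates
git /var/repos at /code with templates /etc/sr-71/git-templates
}
```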
# Git Viewing Templates
The request handlers for "git" directives are built around rendering templates. There are default templates built into sr-71, but they can also be overridden with the "templates" modifier.
All templates are parsed and rendered with the golang standard library "text/template" package.
=> https://pkg.go.dev/text/template Documentation on the "text/template" format
The directory path given to "templates" should contain template files which define any of the supported template names. There are differences in the templates depending on the protocol in use though.
## Types
These are the types you will encounter as context objects in rendered templates.
The Repository object at ".Repo" has methods:
* .Name() is a more readable name (for instance, trimming .git off the end).
* .NameBytes() is a byte slice of the Name.
* .Type(ctx, hash) produces the git type of an object (blob/tree/commit/etc) as a string.
* .Refs(ctx) returns a list of the branch and tag Ref objects.
* .Commits(ctx, head, num) returns a list of Commits, counting <num> back from <head>.
* .Commit(ctx, ref) returns a single Commit by its ref string.
* .Diffstat(ctx, from_ref, to_ref) returns the plain text diffstat between two refs.
* .Diff(ctx, from_ref, to_ref) returns the diff string between two refs.
* .Readme(ctx, ref) returns any README file found in the given ref as a Readme object.
* .Description() reads the "description" file from the git repository and returns its contents.
* .Blob(ctx, ref, path) reads the file at <path> from commit <ref> and returns its contents.
* .Tree(ctx, ref, path) retrieves the directory contents at <path> from commit <ref> and returns a list of ObjectDescriptions.
### Ref objects
* .Repo is a pointer back to the Repository.
* .Name is the full ref name, starting with "refs/heads/" or "refs/tags/".
* .Hash is the commit hash the ref points to.
* .IsBranch() returns whether the ref is a branch.
* .IsTag() returns whether the ref is a tag.
* .ShortName() returns just the branch or tag name, with "refs/heads/" or "refs/tags/" stripped off.
### Commit objects
* .Repo is a pointer back to the Repository.
* .Hash is the full commit hash.
* .Parents is a list of strings, of the parent commit hashes.
* .CommitterName is the committer's name as a string.
* .CommitterEmail is the committer's email as a string.
* .CommitDate is the commit date as a Time object (not documented here, it's time.Time from the golang standard library).
* .AuthorName is the author's name string.
* .AuthorEmail is the author's email as a string.
* .AuthorDate is the authorship date as a Time.
* .Message is the full commit message as a string.
* .ParentHash() just returns the Hash with a "^" appended. This is usable for fetching the parent commit.
* .ShortMessage() returns just the first line of the commit message.
* .RestOfMessage() returns all but the first line of the commit message after trimming off an empty line at the start.
### Readme objects
* .Filename is the filename at which the README was found.
* .RawContents is a string of the file contents.
* .GeminiEscapedContents() produces the file contents but with any triple-backtick-leading lines prefixed with a space.
* .GopherEscapedContents(selector, host, port) produces the file contents formatted as gopher menu with each line an info-message line.
### ObjectDescription objects
ObjectDescription is a representation of a blob or tree:
* .Mode is an integer of the file permission bits.
* .Type is a string of the object type ("blob", "tree", etc).
* .Hash is the git hash of the object, as a string.
* .Size is the integer length of blob objects in bytes, or 0 for trees.
* .Path is the file or directory name of the object.
## Gopher Templates
The routes defined by the gopher git router are:
* / - a gophermap listing of the repositories in the directory, rendered by repo_root.gph
* /:repository - a gophermap overview of the repository, rendered by repo_home.gph
* /:repository/branches - a gophermap list of the branches and HEAD, rendered by branch_list.gph
* /:repository/tags - a gophermap listing of the tags, rendered by tag_list.gph
* /:repository/refs/:ref - a gophermap overview of a ref (commit), rendered by ref.gph
* /:repository/refs/:ref/tree - gophermap listing of a ref's root directory, rendered by tree.gph
* /:repository/refs/:ref/tree/*path - for paths to files the raw file content is sent, directories are rendered by tree.gph
* /:repository/diffstat/:fromref/:toref - the plain text diffstat between two refs, rendered by diffstat.gph.txt
* /:repository/diff/:fromref/:toref - the text/x-diff between two refs, rendered by diff.gph.txt
The standard object for rendering gopher templates has 6 fields:
* .Ctx: the context.Context from the request
* .Repo: the Repository object for the selected repo
* .Params: the string hashmap of the route parameters
* .Host: the hostname of the server
* .Port: the port number of the server
* .Selector: the selector of the current request
All templates use the above object as their rendering context except repo_root.gph (which doesn't have a repository selected). repo_root.gph is rendered with a list of the repository names found in the directory.
Gopher templates also have three additional functions defined:
* combine(string, ...string) -> string - successively combines paths by resolving relative references.
* join(string, ...string) -> string - successively joins path segments.
* rawtext(selector, host, port, text) -> string - renders text as gopher info-message lines.
## Gemini Templates
The gemini and spartan git routers define these routes:
* / - gemtext listing of the repos in the directory, rendered by repo_root.gmi
* /:repository/ - gemtext overview of the repository, rendered by repo_home.gmi
* /:repository/branches - gemtext list of branches/heads, rendered by branch_list.gmi
* /:repository/tags - gemtext listing of tags, rendered by tag_list.gmi
* /:repository/refs/:ref/ - gemtext overview of a ref, rendered by ref.gmi
* /:repository/refs/:ref/tree/*path - raw file contents for blobs, gemtext listing of directories for trees, the latter rendered by tree.gmi
* /:repository/diffstat/:fromref/:toref - text/plain diffstat rendered by diffstat.gmi.txt
* /:repository/diff/:fromref/:toref - text/plain diff rendered by diff.gmi.txt
The standard object for gemini templates has:
* .Ctx: the context.Context from the request
* .Repo: the Repository object for the selected repo
* .Params: the string hashmap of the route parameters
This object is used for all templates except repo_root.gmi, which instead gets a list of the repository name strings.
# Running CGIs
sr-71 can run child processes in a CGI environment to respond to requests. It can do this as a result of a "cgi" directive, or "static...with exec". While parsing extended gophermap, it can also execute other files included with an "=" line (but even then, it will only execute the file if it is under "cgi" or "static...with exec"). Also, it only ever executes files which are both world-readable and world-executable.
When it runs a CGI process, sr-71 sets up a standard CGI environment for it, trying to follow RFC 3875 as closely as possible.
* the current working directory is the directory in which the executed file resides
* SCRIPT_NAME is the portion of the request path which resolved to the executable file
* PATH_INFO is the remainder of the request path after SCRIPT_NAME, or "/" if the entire path was used in SCRIPT_NAME
* CONTENT_LENGTH, CONTENT_TYPE, PATH_TRANSLATED, REMOTE_HOST, REMOTE_IDENT are all set empty
* GATEWAY_INTERFACE is "CGI/1.1"
* QUERY_STRING is the raw URL query string (following the ? in the URL)
* SERVER_NAME is the hostname attached to the server, if any
* SERVER_PORT is the port on which the server is listening
* SERVER_PROTOCOL is the protocol: "GOPHER", "FINGER", or "GEMINI"
* SERVER_SOFTWARE is set to "SLIDERULE" (this is a library used behind the scenes)
* AUTH_TYPE is "Certificate" if there is a client TLS certificate, empty otherwise
* TLS_CLIENT_HASH is the sha256 hash of the client TLS certificate, if there is one
* TLS_CLIENT_CERT is the hex-encoded raw client TLS certificate, if there is one
* TLS_CLIENT_ISSUER is the issuer on the client TLS certificate, if there is one
* TLS_CLIENT_ISSUER_CN is the common-name of the client TLS cert issuer, if there is one
* TLS_CLIENT_SUBJECT is the subject field of the client TLS cert, if there is one
* TLS_CLIENT_SUBJECT_CN is the subject's common name of the client TLS cert, if there is one
Standard input of the CGI process is set to the request body, if there is one (this is only the case for titan requests). Standard output of the CGI process is used as the response body, and the default format for the protocol is assumed (gemtext for gemini and spartan, gopher menu for gopher, unformatted text for finger). "extendedgophermap" is usable in conjunction with "cgi" or "static...with exec" in gopher servers, in which case the standard output will be processed with the gophermap extensions.
The "cmd" modifier can be used on "cgi" and "static...with exec" directives, in which case the given world-executable file will be run in place of the resolved CGI program. However, the environment will still be set up entirely as if the executable pointed at by the request were being run: the working directory is that of the located program (not the cmd override), and SCRIPT_NAME and PATH_INFO are set as they would be for that program. A no-op cmd override could therefore just "exec ./$(basename $SCRIPT_NAME)".
The cmd override can be a way to limit system resources, chroot, or otherwise impose a security sandbox around the CGI program.
# User-Custom Paths
Shared-hosting environments can benefit from utilizing /~username paths for the various users on a host. sr-71 has a compact solution for this situation. In routing directives (static, cgi, git) both the file system path and URL path may include the tilde (~) character. In the URL path it will match a "~username" request path segment and capture the user name. If the file system path begins with a ~ character, that ~ will be replaced with the requested user's home directory; anywhere else in the path it will simply be replaced by the username.
This captured username and file system path handling is usable in a few other contexts as well, such as "clienttlsfile" auth.
```user-custom path configuration examples
# The "user_private" auth lets users on the system define their own allowed users
auth user_private clienttlsfile /var/gemini/users/~/private/allowed_users
gemini {
# The "at /~" will route any paths beginning with "/~<username>/..." to /var/gemini/users/<username>/...
static /var/gemini/users/~ at /~ with dirdefault index.gmi, dirlist, autoatom
# Here we're overriding the "private" subdirectory to let users set their own allowed list of client TLS certs.
# Paths beginning with "/~<username>/private/..." will route to "/var/gemini/users/<username>/private/..." but
# will ONLY be allowed through if their client TLS cert hash is in /var/gemini/users/<username>/private/allowed_users
static /var/gemini/users/~/private at /~/private with dirdefault index.gmi, dirlist auth user_private
}
gopher {
# Here in gopher we're using the users' $HOME directories instead (because the fs path *begins* with ~).
# Paths beginning with "/~<username>/..." will be routed to "<$HOME for username>/public_gopher/..."
static ~/public_gopher at /~ with dirdefault gophermap, dirlist, extendedgophermap
}
```
# Virtualhosting
Gopher and Finger have no guarantee that a domain name will appear in a request. Therefore virtualhosting by domain doesn't make sense in these protocols. Multiple gopher or finger servers may still be defined in a configuration file, but they will have to appear on separate IPs and/or ports.
With Gemini and Spartan, however, there can also be multiple servers defined on the same IP and port (perhaps defaults), which can be differentiated at request-time based on the requested domain and "host" directives in each server.
```example of gemini virtualhosting
gemini {
# default 0.0.0.0:1965
servertls key /etc/ssl/mydomain.pem cert /etc/ssl/mydomain.pem
# only requests for "mydomain.com" will be served by the logic in this server
host mydomain.com
static /var/gemini at / with dirdefault index.gmi, dirlist, autoatom
}
gemini {
# ALSO default 0.0.0.0:1965
# "code.mydomain.com" does the git hosting defined here
host code.mydomain.com
servertls key /etc/ssl/code.mydomain.pem cert /etc/ssl/code.mydomain.pem
git /var/repos at /
}
```
TLS negotiation is done before the request is sent, but sr-71 can use SNI to select the correct certificate to use. So separate gemini servers, even when listening on the same IP and port, can have separate "servertls" directives.
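Spartan virtualhosting can be sketched the same way, minus the "servertls" directives, which are not allowed in spartan servers. The hostnames and paths below are assumptions for illustration.
```example of spartan virtualhosting (hostnames and paths are hypothetical)
spartan {
# default 0.0.0.0:300
host mydomain.com
static /var/spartan at / with dirdefault index.gmi, dirlist
}
spartan {
# ALSO default 0.0.0.0:300, differentiated by the requested host
host code.mydomain.com
git /var/repos at /
}
```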
# Extended Gophermap Parsing
sr-71's gopher server supports extensions to the gophermap format to make it easier to write, and to provide flexibility to CGIs. The ideas here are mostly borrowed from the Gophernicus server. These aren't on by default but can be enabled with the "extendedgophermap" modifier for static and cgi directives.
=> gopher://gopher.gophernicus.org:70/0/docs/README.Gophermap documentation on Gophernicus's gophermap extensions
## Automatic info-message lines
Any lines in extended gophermap which contain NO tab characters (normally exactly 3 tabs are required on all lines) will be converted to an info-message line (type 'i'), with the current document's selector, and the current server's hostname and port. The end result is that such lines are rendered as-is in gopher clients.
## Additional line types
* '#' lines are treated as comments and filtered out.
* A '!' line is treated as the gopher menu title.
* lines with just '-<filename>' are used to hide a file from a subsequent '*' listing.
* lines with just ':<ext>=<type>' are used to override gopher file types for this directory and will affect a '*' file listing. The '<ext>' may be a file extension or a full file name, and '<type>' must be the single character code of a gopher file type.
* lines with just '=<filepath>' will include or execute another (possibly extended) gophermap. If the current directive is a "cgi" or "static...with exec", the file path may be executed as another CGI, otherwise its contents will be parsed and dropped in place.
* '*' behaves like '.' (stop processing the gophermap) but it adds a listing of the current directory at the end.