# mill.py, Markdown interface for llama.cpp
# Copyright (C) 2024 unworriedsafari <unworriedsafari@tilde.club>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>

"""
## Markdown tutorial

This section describes the Markdown language module of `mill.py`.

`mill.py` is controlled with variables embedded in the Markdown document.

In general, variables take the form

```variable-type [reset]
name
[value]
```

Variables are assigned to in fenced code blocks. The syntax follows the
CommonMark spec as much as possible. The first line inside the block is the
name of the variable. The name can contain spaces in principle. The text of
the block from the second line onward is the value. Nothing prevents you from
having a multi-line value. It depends on the variable whether or not this makes
sense. The value of a block with only a variable name is the empty string.
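
As a rough illustration of the block layout described above, the following
sketch splits one fenced block into its parts. `parse_variable_block` is a
hypothetical helper, not a function of `mill.py`, and the variable name and
value are only examples. It assumes the block is given as a list of lines
without trailing newlines, opening and closing fence included.

```python
import os


def parse_variable_block(lines):
    # Hypothetical helper, for illustration only.
    # The info string follows the fence characters on the first line.
    info = lines[0].lstrip('`~').split()
    variable_type = info[0]
    reset = 'reset' in info[1:]
    # The first line inside the block is the variable name.
    name = lines[1].strip()
    # Everything between the name and the closing fence is the value.
    value = os.linesep.join(lines[2:-1])
    return variable_type, reset, name, value


block = ['```mill-llm', '--grammar', 'root ::= "yes" | "no"', '```']
print(parse_variable_block(block))
```

A block that holds only a name, such as a `reset` block, yields the empty
string as its value.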

Variables are either syntax variables or LLM variables. The distinction is made
based on the variable type contained in the info string. Syntax variables have
type `mill` and are handled directly by `mill.py` while LLM variables have
other types and are passed on to the LLM-engine module.

Syntax variables and LLM variables exist in two different namespaces. The
namespace is implied by the variable type. If the `reset` flag is given, then
the variable value must be absent. The variable is reset to its default value.
If the variable has no default value, then it ceases to exist in the namespace
until it is assigned to again.

`mill.py` parses the text in a single pass from top-to-bottom and then calls
the LLM at the end. Some syntax variables affect input parsing. Assignments to
a variable overwrite any existing value. For LLM variables, the final value of
a variable is the value passed on to the LLM.
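
The overwrite and reset rules can be pictured with a plain dictionary per
namespace. This is a minimal sketch under assumptions: `assign`, `lookup` and
`DEFAULTS` are hypothetical names, and the default shown for `prompt indent`
is only illustrative.

```python
# Illustrative defaults; only variables listed here have a default value.
DEFAULTS = {'prompt indent': '> '}


def assign(namespace, name, value='', reset=False):
    # Hypothetical helper: assign a variable, or reset it.
    if reset:
        if value:
            raise SyntaxError('value specified in reset (not allowed)')
        # Resetting removes the entry, so lookups fall back to the default.
        namespace.pop(name, None)
    else:
        # A later assignment overwrites any existing value.
        namespace[name] = value


def lookup(namespace, name):
    # Returns None when the variable has neither a value nor a default.
    return namespace.get(name, DEFAULTS.get(name))


syntax_vars = {}
assign(syntax_vars, 'prompt indent', '>> ')
assign(syntax_vars, 'prompt indent', '# ')
assign(syntax_vars, 'prompt indent', reset=True)
print(repr(lookup(syntax_vars, 'prompt indent')))
```

After the reset, the lookup falls back to the default because the variable no
longer exists in the namespace.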

The following two subsections explain variables in more detail. For each
variable, the default value is given as the value.


### Syntax variables

The following variables are syntax variables.


```mill
prompt start
```

The `prompt start` variable marks the start of the prompt. `mill.py` excludes
_everything_ before the last occurrence of this variable from the prompt.
_However_, if this variable does not exist, then `mill.py` considers that the
potential prompt starts at the beginning of the document. In other words,
omitting it is the same as putting it at the very start of the document.

When the prompt size exceeds the LLM's context limit, you can either move the
`prompt start` variable down or create another one. The value of this variable
doesn't matter. It's only its position in the document that counts.
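
The effect of `prompt start` can be sketched as a slice of the document. The
helper below is hypothetical and simplified: it assumes the document is a list
of lines without trailing newlines and that every `prompt start` block uses a
plain three-backtick fence with no value.

```python
def after_last_prompt_start(lines):
    # Hypothetical helper: drop everything up to and including the last
    # `prompt start` block. Without such a block, keep the whole document.
    marker = ['```mill', 'prompt start', '```']
    start = 0
    for idx in range(len(lines) - 2):
        if lines[idx:idx + 3] == marker:
            start = idx + 3
    return lines[start:]


doc = ['> old question', '```mill', 'prompt start', '```', '> new question']
print(after_last_prompt_start(doc))
```

Adding a second `prompt start` block further down moves the slice point without
touching the earlier one.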


```mill
prompt indent
>
```

The value of the `prompt indent` variable must be (at most) one line. It's a
line prefix. Only blocks for which the lines start with this prefix are
considered to be part of the prompt. These blocks are called _prompt indent
blocks_ throughout the tutorial. The `prompt indent` variable affects input
parsing. For each line of input, the most recent value of this variable is used
to identify prompt indent blocks.

Technically, you can set `prompt indent` to the empty string. _No variables are
parsed in a prompt indent block._ So, in this situation, if the prompt starts
before the setting, then all the text below the assignment is considered to be
part of the prompt, and any variable assignments below the setting are ignored.
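
The prefix test behind prompt indent blocks can be illustrated as follows.
`prompt_lines` is a hypothetical helper, and the sketch assumes a single fixed
prefix for the whole input, whereas `mill.py` uses the most recent value for
each line.

```python
def prompt_lines(lines, prompt_indent):
    # Hypothetical helper: keep only the lines that start with the prefix.
    return [line for line in lines if line.startswith(prompt_indent)]


doc = ['Some notes.', '> Hello', 'More notes.', '> Hi']
print(prompt_lines(doc, '> '))
# Every string starts with the empty string, so an empty prefix selects
# every line.
print(prompt_lines(doc, '') == doc)
```

The second call shows why an empty `prompt indent` turns all following text
into prompt text.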


```mill
message template
```

The `message template` variable contains the template for each message. When
`mill.py` responds to a prompt, the value of this variable is added at the end
of the output of the LLM.

Note that `mill.py` does not add extra newlines to the output of the LLM in
general. You can add blank lines at the start of the message template instead.
This is by design. Some models are sensitive to newlines, so the user should be
able to control newlines.
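
A minimal sketch of how the response and the template fit together, assuming
the LLM output is available as one string. `render_response` is a hypothetical
helper, and the prompt indent and template values are only examples.

```python
import os


def render_response(generated_text, message_template, prompt_indent='> '):
    # Continuation lines of the LLM output get the prompt indent.
    body = (os.linesep + prompt_indent).join(
        generated_text.split(os.linesep))
    # The template is appended verbatim; no newline is added in between.
    return body + message_template


# Two leading newlines in the template produce one blank line before the
# next message.
template = os.linesep + os.linesep + '> '
print(render_response('Hi' + os.linesep + 'there', template))
```

Because nothing is inserted between the output and the template, any spacing
has to come from the template itself.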


### LLM variables

There are three different variable types for LLM variables:

1. `mill-llm`
2. `mill-llm-file`
3. `mill-llm-b64-gz-file`

The first type simply assigns the value to the name.

For some LLM engines (like `llama.cpp`), it's useful to pass arguments via a
file. This can be done using the second and third variable types. For example,
you can pass a grammar via either `--grammar` or `--grammar-file`. However,
grammars can contain tokens that `mill.py` does not know how to shell-escape.
In that case, you have to use `--grammar-file`. The next paragraph explains how
to use it.

To pass an argument via a file, use `mill-llm-file` or `mill-llm-b64-gz-file`.
The former is for text data, the latter for binary data. The value is stored in
a temporary file. The name of the temporary file subsequently becomes the new
value of the variable. Binary data must be a base64 representation of a
gzipped file. The file is uncompressed by `mill.py` before passing it to the
LLM. The base64 data can be split across multiple lines. The newlines are
removed in that case.
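
To produce a value for a `mill-llm-b64-gz-file` block, the binary data has to
be gzipped and then base64-encoded. The sketch below shows one way to do that
and to reverse it. `encode_for_mill` and `decode_from_mill` are hypothetical
helpers, and the line width of 76 is an arbitrary choice.

```python
import base64
import gzip


def encode_for_mill(data, width=76):
    # gzip first, then base64, then split into lines of at most `width`.
    encoded = base64.standard_b64encode(gzip.compress(data)).decode('ascii')
    return [encoded[i:i + width] for i in range(0, len(encoded), width)]


def decode_from_mill(lines):
    # Join the lines again (dropping the newlines), then undo both steps.
    return gzip.decompress(base64.standard_b64decode(''.join(lines)))


payload = bytes(range(256)) * 4
value_lines = encode_for_mill(payload)
print(len(value_lines), decode_from_mill(value_lines) == payload)
```

The lines returned by `encode_for_mill` can be pasted as the value of the
block, below the variable name.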


### Prompt construction

The algorithm to construct the entire prompt is simple and can be stated in one
line: _concatenate the text of all the prompt indent blocks below the last
prompt start._

The text of a prompt indent block does not include the prompt indent for each
line. Everything else is included, even newlines, with one exception: the
newline that ends the block is excluded.
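
The rule above can be sketched in a few lines. `build_prompt` is a hypothetical
helper, not the parser of `mill.py`. It assumes the input already starts after
the last prompt start, that lines carry no trailing newline, and that the
prompt indent stays fixed.

```python
import os


def build_prompt(lines, prompt_indent='> '):
    # Group consecutive prefixed lines into blocks, stripping the prefix.
    blocks, current = [], None
    for line in lines:
        if line.startswith(prompt_indent):
            if current is None:
                current = []
                blocks.append(current)
            current.append(line[len(prompt_indent):])
        else:
            # Any other line ends the current block.
            current = None
    # Newlines inside a block are kept. The newline that ends a block is
    # dropped, so the blocks are joined without a separator.
    return ''.join(os.linesep.join(block) for block in blocks)


doc = ['> Hello', '> world', 'a note outside the prompt', '> !']
print(repr(build_prompt(doc)))
```

Text between two prompt indent blocks never reaches the prompt, and the two
blocks end up directly adjacent.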
"""

import base64, contextlib, gzip, os, re, sys, tempfile


def parse(input_lines):
    return Language(input_lines)


class Language(contextlib.AbstractContextManager):
    _default_prompt_indent = ' >'


    def __init__(self, input_lines):
        self._input_lines = input_lines


    def __enter__(self):
        self.prompt = ''
        self.llm_vars = {}
        self.returncode = 0
        self._syntax_vars = {}
        self._temp_files = []

        complete_prompt = ''

        # The stripping and adding of newlines is a bit complicated. This is
        # the result of some trial and error with/without prompt indents.
        last_line_in_prompt = False
        var_update_lines = 0
        for idx, line in enumerate(self._input_lines):
            prompt_indent = self._syntax_vars.get(
                    'prompt indent',
                    self._default_prompt_indent)

            if last_line_in_prompt and prompt_indent:
                print()

            # Still inside last variable update
            if var_update_lines:
                print(line, end='')
                var_update_lines -= 1
                continue

            current_line_in_prompt = line.startswith(prompt_indent)

            if not current_line_in_prompt:
                namespace, updated_variable, var_update_lines = \
                        self._var_parse(idx)

                if namespace is self._syntax_vars:
                    if updated_variable == 'prompt start':
                        if 'prompt start' in self._syntax_vars:
                            self.prompt = ''
                            print(f'[DEBUG] Prompt start: {idx}', file=sys.stderr)
                        else:
                            self.prompt = complete_prompt

                    elif updated_variable == 'prompt indent':
                        if len(namespace.get(updated_variable, '').split(os.linesep)) > 1:
                            raise SyntaxError(f'line {idx+4}: value for prompt indent must be at most one line')

                if var_update_lines:
                    var_update_lines -= 1

                print(line, end='')

                last_line_in_prompt = False

            else:
                new_part = ''

                if last_line_in_prompt and prompt_indent:
                    new_part += os.linesep

                if prompt_indent and line.endswith(os.linesep):
                    print(line[:-len(os.linesep)], end='')
                    new_part += line[len(prompt_indent):-len(os.linesep)]
                else:
                    print(line, end='')
                    new_part += line[len(prompt_indent):]

                self.prompt += new_part
                complete_prompt += new_part

                last_line_in_prompt = True

        sys.stdout.flush()

        return self


    def __exit__(self, exc_type, exc_value, traceback):
        for f in self._temp_files:
            os.remove(f)
        self._temp_files = []
        return None


    def _var_parse(self, start_idx):
        input_lines = self._input_lines[start_idx:]
        if not input_lines:
            return {}, '', 0

        # Do we have an opening code fence?
        opening_fence = input_lines[0].lstrip(' ')
        indent_len = len(input_lines[0]) - len(opening_fence)
        if indent_len > 3:
            return {}, '', 0

        # Determine fence string
        fence_string = opening_fence[:3]
        if fence_string not in ['```', '~~~']:
            return {}, '', 0

        while len(fence_string) < len(opening_fence) and \
                opening_fence[len(fence_string)] == fence_string[0]:
            fence_string += fence_string[0]

        # Determine variable type
        info_string = opening_fence[len(fence_string):].strip()
        variable_types = ['mill-llm-file',
                          'mill-llm-b64-gz-file',
                          'mill-llm',
                          'mill']
        variable_type = [t for t in variable_types if \
                info_string.split(' ')[0] == t]

        if not variable_type:
            return {}, '', 0
        else:
            variable_type = variable_type[0]

        namespace = self._syntax_vars if variable_type == 'mill' \
                else self.llm_vars

        # Determine variable name
        variable_name = input_lines[1].strip() if len(input_lines) >= 2 else ''
        if not variable_name:
            raise SyntaxError(f'line {start_idx+2}: expected variable name')

        # Gather variable value
        variable_value = ''
        num_block_lines = 2

        for idx, line in enumerate(input_lines[2:], start=3):
            # Strip indentation from line
            for i in range(0,indent_len):
                if line.startswith(' '):
                    line = line[1:]

            if line.startswith(fence_string):
                num_block_lines = idx
                break

            # Base64 data may span several lines; drop the line endings.
            variable_value += line.rstrip() \
                    if variable_type == 'mill-llm-b64-gz-file' \
                    else line

        if variable_value.endswith(os.linesep):
            variable_value = variable_value[:-len(os.linesep)]

        if 'reset' in info_string.split(' '):
            if variable_value:
                raise SyntaxError(f'line {start_idx+3}: value specified in reset (not allowed)')
            if variable_name in namespace:
                del namespace[variable_name]
        else:
            # Handle file variables
            if variable_type == 'mill-llm-b64-gz-file':
                variable_value = gzip.decompress(
                        base64.standard_b64decode(variable_value))
            elif variable_type == 'mill-llm-file':
                variable_value = variable_value.encode('utf-8')

            if variable_type in ['mill-llm-file',
                                 'mill-llm-b64-gz-file']:
                with tempfile.NamedTemporaryFile(delete=False) as fp:
                    fp.write(variable_value)
                    fp.flush()
                    variable_value = fp.name
                    self._temp_files += [variable_value]

            namespace[variable_name] = variable_value

        print(f'[DEBUG] {variable_name}: {variable_value}', file=sys.stderr)
        print(f'[DEBUG] temp files: {self._temp_files}', file=sys.stderr)

        return namespace, variable_name, num_block_lines


    def print_message_template(self):
        message_template = self._syntax_vars.get('message template', ' >')
        lines = message_template.split(os.linesep)
        for idx, line in enumerate(lines):
            if idx != len(lines)-1:
                line += os.linesep
            print(line, end='')

        sys.stdout.flush()


    def print_generated_text(self, generated_text):
        if not generated_text:
            return

        # True but irrelevant
        # self.prompt += generated_text

        prompt_indent = self._syntax_vars.get('prompt indent',
                                              self._default_prompt_indent)
        lines = generated_text.split(os.linesep)
        for idx, line in enumerate(lines):
            if idx != 0:
                print(os.linesep + prompt_indent, end='')
            print(line, end='')

        sys.stdout.flush()