Initial code commit

commit 469335d73a

@@ -0,0 +1,148 @@
# Created by https://www.toptal.com/developers/gitignore/api/python
# Edit at https://www.toptal.com/developers/gitignore?templates=python

### Python ###
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
pytestdebug.log

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/
doc/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
pythonenv*

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# profiling data
.prof

# End of https://www.toptal.com/developers/gitignore/api/python

# orbit.json file (contains the URLs in the orbit)
orbit.json

@@ -0,0 +1,44 @@

# molniya - Gemini orbit software

An orbit is like a webring, but for Gemini. This code was originally written to service LEO[1], the first (to my knowledge) orbit ever made.

A Molniya orbit is a type of satellite orbit designed to provide communications and remote sensing coverage over high latitudes (thanks, Wikipedia[2]). It was invented by the Russians for their spy satellites. I feel like it's a fitting name.

[1]: gemini://tilde.team/~khuxkm/leo/
[2]: https://en.wikipedia.org/wiki/Molniya_orbit

## How to use Molniya

In order to use Molniya, do the following:

1. Clone this repository into a place in your Gemini root directory. (If you're using a userdir for this, that's fine too.)

2. Modify `config.py`. Namely, update `MAIN_PAGE` to be the link to the index.gmi file, and update `REQUIRED_LINKS` to be the links to `next.cgi`, `prev.cgi`, and `rand.cgi`, wherever they are accessible via Gemini. `BACKLINKS` and `determine_capsule` shouldn't need to be changed (unless GUS is down, or you want to change how Zenit decides what makes up a capsule).

3. Modify `index.gmi` to link to your files instead of mine.

4. Get people to link to your main page and one of `next.cgi`, `prev.cgi`, and `rand.cgi`.

5. Set a cronjob to run Zenit every now and again. (If I knew how often GUS indexed, I'd give a specific frequency, but I don't, so I won't.)

## What is the orbit.json file?

The `orbit.json` file contains all of the URLs in the orbit. It's how Molniya and Zenit know what URLs are there, and in what order.
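As a sketch, here is what a minimal `orbit.json` might look like and how the library reads it. The `urls` key is the only field the code actually consumes (it loads the file with `json.load(f)["urls"]`); the member URLs below are hypothetical.

```python
import json

# Hypothetical example orbit; the real orbit.json is maintained by Zenit.
orbit = {
    "urls": [
        "gemini://tilde.team/~khuxkm/leo/",     # the main page comes first
        "gemini://one.example/",                # hypothetical member
        "gemini://two.example/~user/",          # hypothetical member
    ]
}

# Round-trip through JSON, the same way Molniya and Zenit read/write the file.
text = json.dumps(orbit)
urls = json.loads(text)["urls"]
print(len(urls))
```

Keeping the file valid JSON with a top-level `urls` list is all that matters; order determines the ring order.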
## You keep mentioning Zenit. What is it?

Zenit is the Molniya indexer. It uses GUS's backlinks feature to get a list of pages that link to the orbit, then checks each of them for a link that lets people continue traversing the ring.

## Can you add links to orbit.json manually?

Of course you can! Just make sure `orbit.json` stays valid JSON, or it'll break the Molniya library, and with it the entire ring.

## Why does Molniya want the next and previous links to contain the URL of the page as a query?

Because Gemini lacks referrers (a good choice), there's no way to tell where a client came from just by studying the request. As such, the orbit needs some other way of knowing where in the ring the client is.

`rand.cgi` lacks this requirement because it just picks a random page anyway. That being said, it's best if you also put your URL in the `rand.cgi` link, just so that users selecting the random link aren't sent back to the very site they just came from.
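The query mechanism above can be sketched in a few lines: the caller's URL arrives as `QUERY_STRING`, and its position in the orbit list determines the redirect target. The URLs here are hypothetical.

```python
from urllib.parse import unquote

# Hypothetical orbit and request; mirrors how the CGI scripts locate the caller.
urls = ["gemini://a.example/", "gemini://b.example/", "gemini://c.example/"]
query = "gemini://b.example/"  # QUERY_STRING carried by the next.cgi link

current = unquote(query)               # undo any percent-encoding
index = urls.index(current)            # where the caller sits in the ring
print(urls[(index + 1) % len(urls)])   # -> gemini://c.example/
```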
## Why does Molniya abuse redirects to send the user on to the next page? Why not have a landing page?

Because I don't feel like implementing a landing page just to have the user click off of it. The redirect *should* replace the orbit link in the client's history, so the user shouldn't be adversely affected by this.

@@ -0,0 +1,31 @@

# ------------------------------------------------------------------------------
# The configuration for the Molniya orbit software.
# Implemented as a Python library to allow for determine_capsule logic.
# ------------------------------------------------------------------------------

# The main page of the orbit.
# Used to seed the orbit when it's first created, as well as to find new pages to include in the orbit.
MAIN_PAGE = "gemini://tilde.team/~khuxkm/leo/"

# The backlinks base page.
# Takes a URL as a query and returns a list of pages that backlink to it.
# In case GUS ever goes down, is replaced, or changes URL, this makes fixing Zenit easy.
BACKLINKS = "gemini://gus.guru/backlinks"

# Determine a "capsule".
# Used to prevent one person from flooding the orbit with pages.
# One page is allowed in the list per "capsule".
# This function is passed a urllib.parse.ParseResult object and returns a string
# that identifies which capsule the URL belongs to.
def determine_capsule(parsed):
    capsule = parsed.netloc
    if parsed.path.startswith("/~"):  # allow each userdir to be its own capsule
        capsule += "/" + parsed.path.split("/")[1]
    return capsule

# Required links.
# A site must have one of these links to be included in the orbit.
REQUIRED_LINKS = [
    "gemini://tilde.team/~khuxkm/leo/next.cgi",
    "gemini://tilde.team/~khuxkm/leo/prev.cgi",
    "gemini://tilde.team/~khuxkm/leo/rand.cgi"
]
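For illustration, here is how the default `determine_capsule` groups URLs (a standalone copy of the function, applied to some hypothetical URLs): a userdir gets its own capsule, while any other path collapses to the host.

```python
from urllib.parse import urlparse

# Standalone copy of the default determine_capsule, for illustration only.
def determine_capsule(parsed):
    capsule = parsed.netloc
    if parsed.path.startswith("/~"):  # each userdir is its own capsule
        capsule += "/" + parsed.path.split("/")[1]
    return capsule

print(determine_capsule(urlparse("gemini://tilde.team/~khuxkm/leo/")))    # tilde.team/~khuxkm
print(determine_capsule(urlparse("gemini://tilde.team/~other/page.gmi"))) # tilde.team/~other
print(determine_capsule(urlparse("gemini://example.org/some/page.gmi")))  # example.org
```

So two pages under the same `~user` directory count as one capsule, and only one of them can be in the orbit.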

@@ -0,0 +1,51 @@

import os, subprocess
from urllib.parse import urljoin

def start_response(meta, code="20"):
    print(f"{code} {meta}", end="\r\n")

def get_input():
    return os.environ.get("QUERY_STRING", "")

# utility func
def has_input():
    return bool(get_input())

def link(text, url):
    print(f"=> {url} {text}")

def header(text, level=1):
    print("#" * level + " " + text)

# named for convenience; in practice just use the header func with different levels
def h1(text):
    header(text, 1)

def h2(text):
    header(text, 2)

def h3(text):
    header(text, 3)

def text(text=""):
    print(text)

def figlet(tex, **opts):
    if len(opts) == 0:
        opts = {"f": "standard"}
    ct = ["/usr/bin/figlet"]
    for k in opts:
        ct.append("-" + k)
        ct.append(opts[k])
    ct.append(tex)
    text("```")
    command_out(ct)
    text("```")

def command_lines(ct):
    # decode as UTF-8 rather than ASCII so non-ASCII output doesn't crash us
    for l in subprocess.check_output(ct).decode("utf-8").split("\n"):
        yield l

def command_out(ct, f=text):
    for l in command_lines(ct):
        f(l)
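The helpers above just print gemtext lines to stdout, which is all a CGI response is. As a sketch of what they emit, here are standalone copies that return strings instead of printing, purely to make the output easy to check (the URLs are hypothetical):

```python
# Standalone copies of the gemtext helpers, returning strings for illustration.
def start_response(meta, code="20"):
    return f"{code} {meta}"            # Gemini response header line

def header(text, level=1):
    return "#" * level + " " + text    # gemtext heading

def link(text, url):
    return f"=> {url} {text}"          # gemtext link line

print(start_response("text/gemini"))                      # 20 text/gemini
print(header("LEO"))                                      # # LEO
print(link("Next Page", "next.cgi?gemini://a.example/"))  # => next.cgi?gemini://a.example/ Next Page
```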

@@ -0,0 +1,33 @@

# LEO

LEO is an orbit, a term used here to mean "a webring but for Gemini instead of the web". The name is both a play on Low Earth Orbit, as well as the constellation Leo, the Lion.

## How to join LEO

To join LEO, here's what you need to do:

1. Link to `gemini://tilde.team/~khuxkm/leo/` on your page. This is what will get the indexer to notice your page.
2. Make sure your page is indexed by GUS (gemini://gus.guru/). GUS's backlink page is what drives the indexer.
3. Also include a link to one of the following:

```
Next Page - gemini://tilde.team/~khuxkm/leo/next.cgi?<your url>
Last Page - gemini://tilde.team/~khuxkm/leo/prev.cgi?<your url>
Random Page - gemini://tilde.team/~khuxkm/leo/rand.cgi
```

If your page links to this page and at least one (1) of the links above, your page will be added to the orbit the next time the indexer runs.

Also, only one page can be added to the orbit per capsule (a capsule being defined as the host:port, plus an optional userdir). This is to prevent any one person from flooding LEO with their own pages.

## How do I start my own orbit?

To start your own orbit, download Molniya at the link below. Place it in a Gemini root, and modify config.py as directed in Molniya's README.

=> https://tildegit.org/khuxkm/molniya Molniya (will take you to HTTP land, tildegit.org)

## Can I go to the next page now?

Yep!

=> next.cgi Beam me up, Scotty!

@@ -0,0 +1,41 @@

"""The utility library for the Molniya CGI pages.
|
||||
|
||||
Handles loading the orbit, as well as next/previous/random links."""
|
||||
import json, random, os, urllib.parse
|
||||
# stolen from AV-98
|
||||
urllib.parse.uses_relative.append("gemini")
|
||||
urllib.parse.uses_netloc.append("gemini")
|
||||
|
||||
# The URL of the main page of the orbit
|
||||
MAIN_PAGE = "gemini://tilde.team/~khuxkm/leo/"
|
||||
|
||||
URLS = [MAIN_PAGE]
|
||||
try:
|
||||
with open("orbit.json") as f:
|
||||
URLS = json.load(f)["urls"]
|
||||
except:
|
||||
pass
|
||||
|
||||
CURRENT_URL = urllib.parse.unquote(os.environ.get("QUERY_STRING",MAIN_PAGE))
|
||||
CURRENT_URL_PARSED = urllib.parse.urlparse(CURRENT_URL)
|
||||
try:
|
||||
CURRENT_URL_INDEX = URLS.index(CURRENT_URL)
|
||||
except:
|
||||
CURRENT_URL = MAIN_PAGE
|
||||
try:
|
||||
CURRENT_URL_INDEX = URLS.index(CURRENT_URL)
|
||||
except:
|
||||
# give up and just start at 0
|
||||
CURRENT_URL_INDEX = 0
|
||||
|
||||
def next_url():
|
||||
return URLS[(CURRENT_URL_INDEX+1)%len(URLS)]
|
||||
|
||||
def prev_url():
|
||||
return URLS[(CURRENT_URL_INDEX-1)%len(URLS)]
|
||||
|
||||
def rand_url():
|
||||
ret = random.choice(URLS)
|
||||
while ret==CURRENT_URL or ret==MAIN_PAGE:
|
||||
ret = random.choice(URLS)
|
||||
return ret
|
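The random pick excludes both the caller's own page and the main page. With a hypothetical three-entry orbit, that makes the outcome fully determined, which is handy for seeing what the exclusion buys you:

```python
import random

# Hypothetical three-entry orbit; main page first, as the library seeds it.
urls = ["gemini://main.example/", "gemini://a.example/", "gemini://b.example/"]
MAIN_PAGE = urls[0]
current = "gemini://a.example/"   # pretend the visitor came from here

# Excluding the caller and the main page leaves exactly one candidate.
candidates = [u for u in urls if u != current and u != MAIN_PAGE]
choice = random.choice(candidates)
print(choice)  # -> gemini://b.example/
```

With more members, `random.choice` over the filtered list picks uniformly among the other capsules' pages.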

@@ -0,0 +1,4 @@

#!/usr/bin/env python3
import molniya, gemini

gemini.start_response(molniya.next_url(), "30")

@@ -0,0 +1,4 @@

#!/usr/bin/env python3
import molniya, gemini

gemini.start_response(molniya.prev_url(), "30")

@@ -0,0 +1,4 @@

#!/usr/bin/env python3
import molniya, gemini

gemini.start_response(molniya.rand_url(), "30")

@@ -0,0 +1,195 @@

"""Zenit - the Molniya indexer.
|
||||
|
||||
Zenit was a series of military photoreconnaissance satellites launched by the Soviet Union between 1961 and 1994. In keeping with the Soviet spy satellite theme, I chose this name for the indexer."""
|
||||
import json, urllib.parse, traceback, sys, ssl, socket, string
|
||||
from config import *
|
||||
# stolen from AV-98
|
||||
urllib.parse.uses_relative.append("gemini")
|
||||
urllib.parse.uses_netloc.append("gemini")
|
||||
|
||||
# Load URL list
|
||||
URLS = [MAIN_PAGE]
|
||||
try:
|
||||
with open("orbit.json") as f:
|
||||
URLS = json.load(f)["urls"]
|
||||
except IOError as e: # we can be a bit more outgoing about our errors here
|
||||
print(f"Error loading orbit.json: {e!r}")
|
||||
print("Continuing on anyways with a list containing only the URL of the main page.")
|
||||
except KeyError as e:
|
||||
print("Malformed orbit.json: no urls list")
|
||||
print("Continuing on anyways with a list containing only the URL of the main page.")
|
||||
except:
|
||||
print("Error loading orbit.json (not IOError or KeyError):")
|
||||
traceback.print_exc()
|
||||
print("Exiting.")
|
||||
sys.exit(1)
|
||||
|
||||
# Utility function to parse a MIME type
|
||||
def parse_mime(mimetype):
|
||||
mimetype = mimetype.strip()
|
||||
index = 0
|
||||
type = ""
|
||||
# type is everything before the /
|
||||
while index<len(mimetype) and mimetype[index]!="/":
|
||||
type+=mimetype[index]
|
||||
index+=1
|
||||
index+=1
|
||||
subtype = ""
|
||||
# subtype is everything after the slash and before the semicolon (if the latter exists)
|
||||
while index<len(mimetype) and mimetype[index]!=";":
|
||||
subtype+=mimetype[index]
|
||||
index+=1
|
||||
index+=1
|
||||
# if there's no semicolon, there are no params
|
||||
if index>=len(mimetype): return [type,subtype], dict()
|
||||
params = dict()
|
||||
while index<len(mimetype):
|
||||
# skip whitespace
|
||||
while index<len(mimetype) and mimetype[index] in string.whitespace:
|
||||
index+=1
|
||||
paramName = ""
|
||||
# the parameter name is everything before the = or ;
|
||||
while index<len(mimetype) and mimetype[index] not in "=;":
|
||||
paramName+=mimetype[index]
|
||||
index+=1
|
||||
# if the string is over or there isn't an equals sign, there's no param value
|
||||
if index>=len(mimetype) or mimetype[index]==";":
|
||||
index+=1
|
||||
params[paramName]=None
|
||||
continue
|
||||
# otherwise, grab the param value
|
||||
index+=1
|
||||
paramValue = ""
|
||||
if mimetype[index]=='"':
|
||||
index+=1
|
||||
while True:
|
||||
while index<len(mimetype) and mimetype[index] not in '\\"':
|
||||
paramValue+=mimetype[index]
|
||||
index+=1
|
||||
if index>=len(mimetype): break
|
||||
c = mimetype[index]
|
||||
index+=1
|
||||
if c=="\\":
|
||||
if index>=len(mimetype):
|
||||
paramValue+=c
|
||||
break
|
||||
paramValue+=mimetype[index]
|
||||
index+=1
|
||||
else:
|
||||
break
|
||||
# skip until next ;
|
||||
while index<len(mimetype) and mimetype[index]!=";": index+=1
|
||||
else:
|
||||
while index<len(mimetype) and mimetype[index]!=";":
|
||||
paramValue+=mimetype[index]
|
||||
index+=1
|
||||
if paramName: params[paramName]=paramValue
|
||||
return [type, subtype], params
|
||||
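As a rough standalone illustration of what `parse_mime` computes, here is a simplified version built from plain string splits. It ignores the quoted/escaped parameter values that the full parser handles, but shows the shape of the result for the common case:

```python
# Simplified sketch of parse_mime; does NOT handle quoted or escaped values.
def parse_mime_simple(mimetype):
    mime, _, rest = mimetype.strip().partition(";")
    maintype, _, subtype = mime.partition("/")
    params = {}
    for piece in rest.split(";"):
        if not piece.strip():
            continue
        name, eq, value = piece.strip().partition("=")
        params[name] = value if eq else None  # bare params map to None
    return [maintype, subtype], params

print(parse_mime_simple("text/gemini; charset=utf-8"))
# -> (['text', 'gemini'], {'charset': 'utf-8'})
```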
# Utility function to grab content from a URL
# Context setup courtesy of my own half-baked spartan client.
# One TLS context, created once at module level; Gemini servers commonly use
# self-signed certs, so certificate verification is disabled.
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

def grab_content(url, redirect_num=0):
    if redirect_num >= 5:
        return "Too many redirects!", "text/plain"
    parsed = urllib.parse.urlparse(url)
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        ss = ctx.wrap_socket(s, server_hostname=parsed.hostname)
        ss.connect((parsed.hostname, parsed.port or 1965))
        ss.sendall((url.strip() + "\r\n").encode("UTF-8"))
        out = b""
        while (data := ss.recv(2048)):
            out += data
        header, content = out.split(b"\r\n", 1)
        status, meta = header.decode("utf-8").split(None, 1)
        assert len(meta) < 1024
        if status[0] == "2":
            types, params = parse_mime(meta)
            if types[0] == "text":
                # assume UTF-8...
                charset = "utf-8"
                # ...but if another charset is given, accept it
                if "charset" in params:
                    charset = params["charset"]
                # decode and return
                return content.decode(charset), meta
            else:
                # if it's not a text result, just return the content
                return content, meta
        elif status[0] == "3":
            # if it's a redirect, then let's follow it
            return grab_content(meta, redirect_num + 1)
        else:
            # Either:
            #   1x - it wants an input, which we have no agency to give
            #   6x - it wants a client cert, which we have no agency to give
            #   4x or 5x - there's an error
            # Return the header with a mimetype of text/plain. If this were a
            # real library I might throw an error here, but this is just to
            # make Zenit work.
            return header.decode("utf-8"), "text/plain"
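The header handling above boils down to one split on the first CRLF (header vs. body) and one whitespace split (status vs. meta). A small sketch on hypothetical raw response bytes:

```python
# Hypothetical raw bytes as a Gemini server might return them.
out = b"20 text/gemini; charset=utf-8\r\n# Hello\r\nSome body text\r\n"

header, content = out.split(b"\r\n", 1)            # header line vs. payload
status, meta = header.decode("utf-8").split(None, 1)  # "20" vs. the MIME meta

print(status)             # 20
print(meta)               # text/gemini; charset=utf-8
print(status[0] == "2")   # True -> success, content is the payload
```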
CAPSULES_IN_ORBIT = set(determine_capsule(urllib.parse.urlparse(url)) for url in URLS)
modified_orbit = False

backlinks_url = "?".join([BACKLINKS, urllib.parse.quote(MAIN_PAGE)])
response, mime = grab_content(backlinks_url)
# Backlinks page should return a text/gemini doc
assert mime.startswith("text/gemini"), f"Backlinks URL returned a response that wasn't text/gemini! ({response},{mime})"

links = []
stage = 1
for line in response.splitlines():
    if stage == 1:
        if line.startswith("=>"):
            stage = 2
    if stage == 2:
        if line.startswith("=>"):
            parts = line.split(None, 2)
            links.append(parts[1])
        else:
            stage = 3
            break
# stage 3 is to ignore the rest of the lines.
# if we're in stage 3 and somehow miss breaking out of the loop, nothing will happen
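The backlinks scan just collects `=>` link lines: stage 1 skips the preamble, stage 2 grabs consecutive link lines, and the first non-link line after them ends the run. Here is the same logic on a hypothetical text/gemini snippet:

```python
# Hypothetical backlinks page; only the first contiguous run of => lines counts.
doc = """# Backlinks (hypothetical)
=> gemini://a.example/ A capsule
=> gemini://b.example/page.gmi Another capsule
Some trailing text that ends the link run.
=> gemini://ignored.example/ Not collected (after the run)
"""

links = []
stage = 1
for line in doc.splitlines():
    if stage == 1 and line.startswith("=>"):
        stage = 2                       # first link line starts the run
    if stage == 2:
        if line.startswith("=>"):
            links.append(line.split(None, 2)[1])  # URL is the second field
        else:
            break                       # first non-link line ends the run
print(links)  # -> ['gemini://a.example/', 'gemini://b.example/page.gmi']
```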
# filter out just the new links
links = [link for link in links if link not in URLS]

print("Found {} new link{} to index{}".format(len(links), "" if len(links) == 1 else "s", "..." if len(links) > 0 else "."))

for link in links:
    # Things to consider for a new link:
    # Does its capsule already have representation in the orbit?
    capsule = determine_capsule(urllib.parse.urlparse(link))
    if capsule in CAPSULES_IN_ORBIT:
        # skip
        print(f"Skipping {link} (capsule already in orbit)...")
        continue
    # Does it link to any of the required links?
    response, mime = grab_content(link)
    try:
        assert mime.startswith("text/gemini"), f"{mime} response isn't text/gemini and therefore can't link back"
        links_to_orbit = False
        for line in response.splitlines():
            if line.startswith("=>"):
                parts = line.split(None, 2)
                for reqlink in REQUIRED_LINKS:
                    links_to_orbit = links_to_orbit or parts[1].startswith(reqlink)
        assert links_to_orbit, "doesn't link back to orbit"
    except AssertionError as e:
        print(f"Skipping {link} ({e.args[0]})...")
        continue
    # If we haven't continue'd by now, the link meets all of the criteria
    print(f"Adding {link} to the orbit...")
    URLS.append(link)
    # record the capsule so a second new link from the same capsule is skipped this run
    CAPSULES_IN_ORBIT.add(capsule)
    modified_orbit = True

if modified_orbit:
    print("Saving modified orbit...")
    with open("orbit.json", "w") as f:
        json.dump(dict(urls=URLS), f)