Compare commits
18 Commits
1d458c88f1
...
257ee2f66d
Author | SHA1 | Date | |
---|---|---|---|
| 257ee2f66d | | |
| 305a736074 | | |
| b3821d7719 | | |
| 1d5f3b94b3 | | |
| a92130db00 | | |
| a75cd96f65 | | |
| 6a77582606 | | |
| 35b884e9ff | | |
| 8f5240f982 | | |
| 1d780fb1a3 | | |
| 1110bf5e0c | | |
| d92a6db6b6 | | |
| 1edeab027c | | |
| fd88a5181f | | |
| c9a478d0eb | | |
| 2f7fa5daae | | |
| a79f5da2cc | | |
| 78c2728e96 | | |
32
README.md
|
@ -1,8 +1,8 @@
|
|||
# OFFPUNK
|
||||
|
||||
A command-line, text-based and offline-first Gemini browser by [Ploum](https://ploum.net).
|
||||
A command-line, text-based and offline-first Gemini and Web browser by [Ploum](https://ploum.net).
|
||||
|
||||
Focused on Gemini first but with some Gopher/web support available or projected, the goal of Offpunk is to be able to synchronise your content once (a day, a week, a month) and then browse it while staying disconnected.
|
||||
Focused on Gemini first but with text-mode support for HTTP/HTML (gopher is planned), the goal of Offpunk is to be able to synchronise your content once (a day, a week, a month) and then browse it while staying disconnected.
|
||||
|
||||
Offpunk is a fork of the original [AV-98](https://tildegit.org/solderpunk/AV-98) by Solderpunk and was originally called AV-98-offline as an experimental branch.
|
||||
|
||||
|
@ -10,12 +10,12 @@ Offpunk is a fork of the original [AV-98](https://tildegit.org/solderpunk/AV-98)
|
|||
|
||||
Offmini is a single python file. Installation is optional, you can simply download and run "./offmini.py" or "python3 offmini.py" in a terminal.
|
||||
|
||||
You use the `go` command to visit a URL, e.g. `go gemini.circumlunar.space`.
|
||||
You use the `go` command to visit a URL, e.g. `go gemini.circumlunar.space`. (gemini:// is assumed if no protocol is specified).
|
||||
|
||||
Links in Gemini documents are assigned numerical indices. Just type an index to
|
||||
follow that link. If a Gemini document is too long to fit on your screen, the content is displayed in the less pager (by default). Type `q` to quit and go back to Offpunk prompt.
|
||||
Links in pages are assigned numerical indices. Just type an index to
|
||||
follow that link. If a page is too long to fit on your screen, the content is displayed in the less pager (by default). Type `q` to quit and go back to the Offpunk prompt. Type `less` or `l` to display it again in less.
|
||||
|
||||
Use `add` to add a capsule to your bookmarks and `bookmarks` or `bm` to show your bookmarks (which are stored in a file in your .config).
|
||||
Use `add` to add a capsule to your bookmarks and `bookmarks` or `bm` to show your bookmarks (which are stored in an editable file in your .config).
|
||||
|
||||
Use `offline` to only browse cached content and `online` to go back online. While offline, the `reload` command will force a re-fetch during the next synchronisation.
|
||||
|
||||
|
@ -38,15 +38,14 @@ At the moment, caching only work for gemini:// ressources. gopher:// is not impl
|
|||
Known issues in the code:
|
||||
* WONTFIX: Sync is slow if you have bookmarks with lot of links that change very often.
|
||||
* FIXME0: Certificates error are not handled in --sync
|
||||
* FIXME1: consider root file is always index.gmi
|
||||
* FIXME2: offline web browser use os.system because it’s the only one that understands the ">> file.txt"
|
||||
* FIXME1: consider root file is always index.gmi or index.html
|
||||
|
||||
* TODO: Update blackbox to reflect cache hits.
|
||||
* TODO: allow to search cache while offline
|
||||
|
||||
## More
|
||||
|
||||
See how I browse Gemini offline => gemini://rawtext.club/~ploum/2021-12-17-offline-gemini.gmi
|
||||
See how I browse Web/Gemini offline => gemini://rawtext.club/~ploum/2021-12-17-offline-gemini.gmi
|
||||
|
||||
Announces about Offpunk will be made on Ploum’s Gemlog => gemini://rawtext.club/~ploum/
|
||||
|
||||
|
@ -56,20 +55,25 @@ Announces about Offpunk will be made on Ploum’s Gemlog => gemini://rawtext.cl
|
|||
Offpunk has no "strict dependencies", i.e. it will run and work without anything
|
||||
else beyond the Python standard library. However, it will "opportunistically
|
||||
import" a few other libraries if they are available to offer an improved
|
||||
experience.
|
||||
experience or some other features. The Python libraries requests, bs4 and readability are required for http/html support.
|
||||
|
||||
To avoid using unstable or too recent libraries, the rule of thumb is that a library should be packaged in Debian/Ubuntu.
|
||||
|
||||
* [Python-requests](http://python-requests.org) is needed to handle http/https requests natively (apt-get install python3-requests). Without it, http links will be opened in an external browser
|
||||
* [BeautifulSoup4](https://www.crummy.com/software/BeautifulSoup) and [Readability](https://github.com/buriy/python-readability) are both needed to render HTML. Without them, HTML will not be rendered or be sent to an external parser like Lynx. (apt-get install python3-bs4 python3-readability)
|
||||
* The [ansiwrap library](https://pypi.org/project/ansiwrap/) may result in
|
||||
neater display of text which makes use of ANSI escape codes to control colour.
|
||||
neater display of text which makes use of ANSI escape codes to control colour (not in Debian?).
|
||||
* The [cryptography library](https://pypi.org/project/cryptography/) will
|
||||
provide a better and slightly more secure experience when using the default
|
||||
TOFU certificate validation mode and is highly recommended.
|
||||
TOFU certificate validation mode and is highly recommended (apt-get install python3-cryptography).
|
||||
* [Python magic](https://github.com/ahupp/python-magic/) is useful to determine the MIME type of cached object. If not present, the file extension will be used but some capsules provide wrong extension or no extension at all. (apt-get install python3-magic)
|
||||
* [Xsel](http://www.vergenet.net/~conrad/software/xsel/) allows to `go` to the URL copied in the clipboard without having to paste it (both X and traditional clipboards are supported). Also needed to use the `copy` command. (apt-get install xsel)
|
||||
|
||||
## Features
|
||||
|
||||
* Offline mode to browse cached content without a connection. Requested elements are automatically fetched during the next synchronization and are added to your tour.
|
||||
* Support "subscriptions" to gemlogs. New content seen in bookmarked gemlogs are automatically added to your next tour.
|
||||
* HTML pages are prettified to focus on content. Read without being disturbed.
|
||||
* Support "subscriptions" to a page. New content seen in bookmarked pages are automatically added to your next tour.
|
||||
* TOFU or CA server certificate validation
|
||||
* Extensive client certificate support if an `openssl` binary is available
|
||||
* Ability to specify external handler programs for different MIME types
|
||||
|
@ -99,6 +103,6 @@ it will create `~/.offpunk/`.
|
|||
|
||||
## Cache design
|
||||
|
||||
The offline content is stored in ~/.cache/offmini/gemini/ as plain .gmi files. The structure of the Gemini-space is tentatively recreated. One key element of the design is to not have any database. The cache can thus be modified by hand, content can be removed, used or added by software other than offpunk.
|
||||
The offline content is stored in ~/.cache/offmini/ as plain .gmi/.html files. The structure of the Gemini-space is tentatively recreated. One key element of the design is to not have any database. The cache can thus be modified by hand, content can be removed, used or added by software other than offpunk.
|
||||
|
||||
There’s no feature to automatically trim the cache. It is believed that gemini content being lightweight, one would have to seriously browse a lot before cache size is an issue. If cache becomes too big, simply rm -rf the folders of the capsules taking too much space.
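As a rough sketch of the database-free cache layout described above, a URL can be mapped to a plain file path. The function name `cache_path_for` and the exact `<root>/<scheme>/<host>/<path>` layout are assumptions made for illustration; the index file names follow the README (`index.gmi`, or `index.html` for HTTP). The check for an existing directory in the cache, which the diff also performs, is left out here.

```python
import os
from urllib.parse import urlparse


def cache_path_for(url, cache_root="~/.cache/offmini"):
    """Map a URL to a file path inside the cache (hypothetical layout)."""
    parsed = urlparse(url)
    # HTTP(S) roots are assumed to be index.html, everything else index.gmi.
    index = "index.html" if parsed.scheme.startswith("http") else "index.gmi"
    path = parsed.path or "/"
    # A path ending in "/" is a folder, so the index file name is appended.
    if path.endswith("/"):
        path += index
    root = os.path.expanduser(cache_root)
    return os.path.join(root, parsed.scheme, parsed.netloc + path)


print(cache_path_for("gemini://example.org", "/tmp/cache"))
print(cache_path_for("https://example.org/blog/", "/tmp/cache"))
```

Because each resource is an ordinary file, trimming the cache by hand amounts to deleting the folder of a given host.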
|
||||
|
|
298
offpunk.py
|
@ -61,6 +61,18 @@ try:
|
|||
except ModuleNotFoundError:
|
||||
_HAS_MAGIC = False
|
||||
|
||||
try:
|
||||
import requests
|
||||
_DO_HTTP = True
|
||||
except ModuleNotFoundError:
|
||||
_DO_HTTP = False
|
||||
|
||||
try:
|
||||
from readability import Document
|
||||
from bs4 import BeautifulSoup
|
||||
_DO_HTML = True
|
||||
except ModuleNotFoundError:
|
||||
_DO_HTML = False
|
||||
_VERSION = "0.1"
|
||||
|
||||
_MAX_REDIRECTS = 5
|
||||
|
@ -104,7 +116,7 @@ _MIME_HANDLERS = {
|
|||
"audio/mpeg": "mpg123 %s",
|
||||
"audio/ogg": "ogg123 %s",
|
||||
"image/*": "feh -. %s",
|
||||
"text/html": "lynx -dump -force_html %s",
|
||||
#"text/html": "lynx -dump -force_html %s",
|
||||
}
|
||||
|
||||
|
||||
|
@ -159,17 +171,12 @@ class GeminiItem():
|
|||
self.host = None
|
||||
h = self.url.split('/')
|
||||
self.host = h[0:len(h)-1]
|
||||
self.scheme = 'local'
|
||||
self._cache_path = None
|
||||
# localhost:/ is 11 char
|
||||
if self.url.startswith("localhost://"):
|
||||
self.path = self.url[11:]
|
||||
else:
|
||||
self.path = self.url
|
||||
if self.name != "":
|
||||
self.title = self.name
|
||||
else:
|
||||
self.title = self.path
|
||||
else:
|
||||
self.path = parsed.path
|
||||
self.local = False
|
||||
|
@ -181,31 +188,44 @@ class GeminiItem():
|
|||
# index.gmi. I don’t know how to know the real name
|
||||
# of the file. But first, we need to ensure that the domain name
|
||||
# finish by "/". Else, the cache will create a file, not a folder.
|
||||
# SPECIFIC GEMINI
|
||||
if self.scheme.startswith("http"):
|
||||
index = "index.html"
|
||||
else:
|
||||
index = "index.gmi"
|
||||
if self.path == "" or os.path.isdir(self._cache_path):
|
||||
if not self._cache_path.endswith("/"):
|
||||
self._cache_path += "/"
|
||||
if not self.url.endswith("/"):
|
||||
self.url += "/"
|
||||
if self._cache_path.endswith("/"):
|
||||
self._cache_path += "index.gmi"
|
||||
self._cache_path += index
|
||||
|
||||
self.port = parsed.port or standard_ports.get(self.scheme, 0)
|
||||
|
||||
def get_title(self):
|
||||
#small intelligence to try to find a good name for a capsule
|
||||
#we try to find eithe ~username or /users/username
|
||||
#else we fallback to hostname
|
||||
self.title = self.host
|
||||
if "user" in parsed.path:
|
||||
i = 0
|
||||
splitted = parsed.path.split("/")
|
||||
while i < (len(splitted)-1):
|
||||
if splitted[i].startswith("user"):
|
||||
self.title = splitted[i+1]
|
||||
i += 1
|
||||
if "~" in parsed.path:
|
||||
for pp in parsed.path.split("/"):
|
||||
if pp.startswith("~"):
|
||||
self.title = pp[1:]
|
||||
if self.scheme == "localhost":
|
||||
if self.name != "":
|
||||
self.title = self.name
|
||||
else:
|
||||
self.title = self.path
|
||||
|
||||
else:
|
||||
self.title = self.host
|
||||
if "user" in self.path:
|
||||
i = 0
|
||||
splitted = self.path.split("/")
|
||||
while i < (len(splitted)-1):
|
||||
if splitted[i].startswith("user"):
|
||||
self.title = splitted[i+1]
|
||||
i += 1
|
||||
if "~" in self.path:
|
||||
for pp in self.path.split("/"):
|
||||
if pp.startswith("~"):
|
||||
self.title = pp[1:]
|
||||
return self.title
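The title heuristic in `get_title` can be tried in isolation with a standalone sketch like the one below. The function name `guess_title` is hypothetical and the local-file branch is omitted; the logic otherwise follows the diff: fall back to the host name, prefer the segment after a `user...` path element, and prefer a `~username` element over both.

```python
from urllib.parse import urlparse


def guess_title(url):
    """Guess a short capsule name from a URL (illustrative sketch)."""
    parsed = urlparse(url)
    title = parsed.hostname or url
    parts = parsed.path.split("/")
    # /users/<name>/... : take the segment that follows a "user..." segment.
    for i in range(len(parts) - 1):
        if parts[i].startswith("user"):
            title = parts[i + 1]
    # /~<name>/... : a tilde segment wins over the previous guesses.
    for part in parts:
        if part.startswith("~"):
            title = part[1:]
    return title


print(guess_title("gemini://rawtext.club/~ploum/2021-12-17-offline-gemini.gmi"))
```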
|
||||
|
||||
def is_cache_valid(self,validity=0):
|
||||
# Validity is the acceptable time for
|
||||
|
@ -253,7 +273,7 @@ class GeminiItem():
|
|||
with open(path) as f:
|
||||
body = f.read()
|
||||
f.close()
|
||||
return body
|
||||
return body
|
||||
else:
|
||||
print("ERROR: NO CACHE for %s" %self._cache_path)
|
||||
return FIXME
|
||||
|
@ -262,18 +282,18 @@ class GeminiItem():
|
|||
filename = os.path.basename(self._cache_path)
|
||||
return filename
|
||||
|
||||
def write_body(self,body,mode,encoding):
|
||||
def write_body(self,body,mime):
|
||||
## body is a copy of the raw gemtext
|
||||
## Tmpf is the temporary cache (historically, the only cache)
|
||||
tmpf = tempfile.NamedTemporaryFile(mode, encoding=encoding, delete=False)
|
||||
size = tmpf.write(body)
|
||||
tmpf.close()
|
||||
tmp_filename = tmpf.name
|
||||
#self._debug("Wrote %d byte response to %s." % (size, self.tmp_filename))
|
||||
|
||||
# Maintain cache and log : FIXME
|
||||
#self._log_visit(gi, address, size)
|
||||
## We create the permanent cache
|
||||
## Write_body() also create the cache !
|
||||
# DEFAULT GEMINI MIME
|
||||
if mime == "":
|
||||
mime = "text/gemini; charset=utf-8"
|
||||
mime, mime_options = cgi.parse_header(mime)
|
||||
self.mime = mime
|
||||
if self.mime and self.mime.startswith("text/"):
|
||||
mode = "w"
|
||||
else:
|
||||
mode = "wb"
|
||||
cache_dir = os.path.dirname(self._cache_path)
|
||||
# If the subdirectory already exists as a file (not a folder)
|
||||
# We remove it (happens when accessing URL/subfolder before
|
||||
|
@ -283,7 +303,9 @@ class GeminiItem():
|
|||
if os.path.isfile(cache_dir):
|
||||
os.remove(cache_dir)
|
||||
os.makedirs(cache_dir,exist_ok=True)
|
||||
shutil.copyfile(tmp_filename,self._cache_path)
|
||||
with open(self._cache_path, mode=mode) as f:
|
||||
f.write(body)
|
||||
f.close()
|
||||
|
||||
|
||||
def get_mime(self):
|
||||
|
@ -295,9 +317,9 @@ class GeminiItem():
|
|||
elif not _HAS_MAGIC :
|
||||
print("Cannot guess the mime type of the file. Install Python-magic")
|
||||
if mime.startswith("text"):
|
||||
#if mime == "text/gemini":
|
||||
#SPECIFIC GEMINI
|
||||
mime = "text/gemini"
|
||||
#by default, we consider it’s gemini except for html
|
||||
if "html" not in mime:
|
||||
mime = "text/gemini"
|
||||
self.mime = mime
|
||||
return self.mime
|
||||
|
||||
|
@ -446,7 +468,6 @@ class GeminiClient(cmd.Cmd):
|
|||
self.synconly = synconly
|
||||
self.tmp_filename = ""
|
||||
self.visited_hosts = set()
|
||||
#self.waypoints = []
|
||||
self.offline_only = False
|
||||
self.sync_only = False
|
||||
self.tourfile = os.path.join(self.config_dir, "tour")
|
||||
|
@ -507,24 +528,7 @@ class GeminiClient(cmd.Cmd):
|
|||
and calling a handler program, and updating the history.
|
||||
Nothing is returned."""
|
||||
# Don't try to speak to servers running other protocols
|
||||
if gi.scheme in ("http", "https") and not self.sync_only:
|
||||
if not self.options.get("http_proxy",None) and not self.offline_only:
|
||||
webbrowser.open_new_tab(gi.url)
|
||||
return
|
||||
elif self.offline_only and self.options.get("offline_web"):
|
||||
offline_browser = self.options.get("offline_web")
|
||||
cmd = offline_browser % gi.url
|
||||
print("Save for offline web : %s" %gi.url)
|
||||
#FIXME : subprocess doesn’t understand shell redirection
|
||||
os.system(cmd)
|
||||
return
|
||||
else:
|
||||
print("Do you want to try to open this link with a http proxy?")
|
||||
resp = input("(Y)/N ")
|
||||
if resp.strip().lower() in ("n","no"):
|
||||
webbrowser.open_new_tab(gi.url)
|
||||
return
|
||||
elif gi.scheme == "gopher" and not self.options.get("gopher_proxy", None)\
|
||||
if gi.scheme == "gopher" and not self.options.get("gopher_proxy", None)\
|
||||
and not self.sync_only:
|
||||
print("""Offpunk does not speak Gopher natively.
|
||||
However, you can use `set gopher_proxy hostname:port` to tell it about a
|
||||
|
@ -541,7 +545,7 @@ you'll be able to transparently follow links to Gopherspace!""")
|
|||
else:
|
||||
print("Sorry, file %s does not exist."%gi.path)
|
||||
return
|
||||
elif gi.scheme not in ("gemini", "gopher") and not self.sync_only:
|
||||
elif gi.scheme not in ("gemini", "gopher", "http", "https") and not self.sync_only:
|
||||
print("Sorry, no support for {} links.".format(gi.scheme))
|
||||
return
|
||||
|
||||
|
@ -556,7 +560,7 @@ you'll be able to transparently follow links to Gopherspace!""")
|
|||
if self.offline_only:
|
||||
|
||||
if not gi.is_cache_valid():
|
||||
print("Content not available, marked for syncing")
|
||||
print("%s not available, marked for syncing"%gi.url)
|
||||
with open(self.syncfile,mode='a') as sf:
|
||||
line = gi.url.strip() + '\n'
|
||||
sf.write(line)
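The "marked for syncing" step above appends one URL per line to a plain text file, and the sync loop later in this diff iterates over a `set` of those lines. A minimal sketch of that round trip, assuming hypothetical helper names `mark_for_sync` and `pending_urls`:

```python
import os


def mark_for_sync(url, syncfile):
    """Append a URL to the sync queue file, one URL per line."""
    with open(syncfile, mode="a") as sf:
        sf.write(url.strip() + "\n")


def pending_urls(syncfile):
    """Return the queued URLs as a set, so repeated requests are fetched once."""
    if not os.path.exists(syncfile):
        return set()
    with open(syncfile) as sf:
        return {line.strip() for line in sf if line.strip()}
```

Appending without checking for duplicates keeps the offline path cheap; deduplication is deferred to the moment the queue is read.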
|
||||
|
@ -565,7 +569,15 @@ you'll be able to transparently follow links to Gopherspace!""")
|
|||
|
||||
elif not self.offline_only:
|
||||
try:
|
||||
gi = self._fetch_over_network(gi)
|
||||
if gi.scheme in ("http", "https"):
|
||||
if _DO_HTTP:
|
||||
gi = self._fetch_http(gi)
|
||||
else:
|
||||
print("Install python3-requests to handle http requests natively")
|
||||
webbrowser.open_new_tab(gi.url)
|
||||
return
|
||||
else:
|
||||
gi = self._fetch_over_network(gi)
|
||||
except UserAbortException:
|
||||
return
|
||||
except Exception as err:
|
||||
|
@ -601,10 +613,11 @@ you'll be able to transparently follow links to Gopherspace!""")
|
|||
|
||||
# Pass file to handler, unless we were asked not to
|
||||
#SPECIFIC GEMINI : default handler should be provided by the GI.
|
||||
if handle :
|
||||
if gi and handle :
|
||||
if gi.get_mime() == "text/gemini":
|
||||
self._handle_gemtext(gi, display=not self.sync_only)
|
||||
|
||||
elif gi.get_mime() == "text/html":
|
||||
self._handle_html(gi,display=not self.sync_only)
|
||||
elif not self.sync_only :
|
||||
cmd_str = self._get_handler_cmd(gi.get_mime())
|
||||
try:
|
||||
|
@ -620,7 +633,20 @@ you'll be able to transparently follow links to Gopherspace!""")
|
|||
if update_hist:
|
||||
self._update_history(gi)
|
||||
|
||||
#SPECIFIC GEMINI : fetch_over_network should be part of gi so each could have its own.
|
||||
|
||||
def _fetch_http(self,gi):
|
||||
response = requests.get(gi.url)
|
||||
mime = response.headers['content-type']
|
||||
body = response.content
|
||||
if "text/" in mime:
|
||||
body = response.text
|
||||
else:
|
||||
body = response.content
|
||||
gi.write_body(body,mime)
|
||||
return gi
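Both fetch paths hand a raw MIME header to `write_body`, which defaults an empty header to `text/gemini; charset=utf-8` and writes text types with mode `"w"` and everything else with `"wb"`. The sketch below isolates that decision. It is an illustration only: the names `parse_content_type` and `write_mode` are hypothetical, and it parses the header with `email.message.Message` from the standard library instead of `cgi.parse_header`, since the `cgi` module was removed in Python 3.13.

```python
from email.message import Message


def parse_content_type(header, default="text/gemini; charset=utf-8"):
    """Split a Content-Type style header into (mime, charset or None)."""
    msg = Message()
    # An empty header falls back to the Gemini default, as in write_body.
    msg["Content-Type"] = header or default
    return msg.get_content_type(), msg.get_content_charset()


def write_mode(mime):
    """Text types are written as str ("w"), anything else as bytes ("wb")."""
    return "w" if mime.startswith("text/") else "wb"


mime, charset = parse_content_type("text/html; charset=UTF-8")
print(mime, charset, write_mode(mime))
```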
|
||||
|
||||
# fetch_over_network will modify with gi.write_body(body,mime)
|
||||
# before returning the gi
|
||||
def _fetch_over_network(self, gi):
|
||||
|
||||
# Be careful with client certificates!
|
||||
|
@ -739,33 +765,25 @@ you'll be able to transparently follow links to Gopherspace!""")
|
|||
assert status.startswith("2")
|
||||
|
||||
mime = meta
|
||||
# Read the response body over the network
|
||||
fbody = f.read()
|
||||
# DEFAULT GEMINI MIME
|
||||
if mime == "":
|
||||
mime = "text/gemini; charset=utf-8"
|
||||
gi.mime, mime_options = cgi.parse_header(mime)
|
||||
shortmime, mime_options = cgi.parse_header(mime)
|
||||
if "charset" in mime_options:
|
||||
try:
|
||||
codecs.lookup(mime_options["charset"])
|
||||
except LookupError:
|
||||
raise RuntimeError("Header declared unknown encoding %s" % mime_options["charset"])
|
||||
|
||||
# Read the response body over the network
|
||||
body = f.read()
|
||||
|
||||
# Save the result in a temporary file
|
||||
## Set file mode
|
||||
if gi.get_mime().startswith("text/"):
|
||||
mode = "w"
|
||||
if shortmime.startswith("text/"):
|
||||
encoding = mime_options.get("charset", "UTF-8")
|
||||
try:
|
||||
body = body.decode(encoding)
|
||||
body = fbody.decode(encoding)
|
||||
except UnicodeError:
|
||||
raise RuntimeError("Could not decode response body using %s encoding declared in header!" % encoding)
|
||||
else:
|
||||
mode = "wb"
|
||||
encoding = None
|
||||
## body is a copy of the raw gemtext
|
||||
gi.write_body(body,mode,encoding)
|
||||
raise RuntimeError("Could not decode response body using %s\
|
||||
encoding declared in header!" % encoding)
|
||||
gi.write_body(body,mime)
|
||||
return gi
|
||||
|
||||
def _send_request(self, gi):
|
||||
|
@ -1040,6 +1058,105 @@ you'll be able to transparently follow links to Gopherspace!""")
|
|||
self._debug("Using handler: %s" % cmd_str)
|
||||
return cmd_str
|
||||
|
||||
|
||||
# Red title above rendered content
|
||||
def _make_terminal_title(self,gi):
|
||||
title = gi.get_title()
|
||||
if gi.is_cache_valid() and self.offline_only and not gi.local:
|
||||
last_modification = gi.cache_last_modified()
|
||||
str_last = time.ctime(last_modification)
|
||||
title += " \x1b[0;31m(last accessed on %s)"%str_last
|
||||
rendered_title = "\x1b[31m\x1b[1m"+ title + "\x1b[0m"+"\n"
|
||||
return rendered_title
|
||||
# Our own HTML engine (crazy, isn’t it?)
|
||||
def _handle_html(self,gi,display=True):
|
||||
if not _DO_HTML:
|
||||
print("HTML document detected. Please install python-bs4 and python readability.")
|
||||
return
|
||||
# This method recursively parse the HTML
|
||||
def recursive_render(element):
|
||||
rendered_body = ""
|
||||
if element.name == "div":
|
||||
rendered_body += "\n"
|
||||
for child in element.children:
|
||||
rendered_body += recursive_render(child)
|
||||
elif element.name in ["h1","h2","h3","h4","h5","h6"]:
|
||||
line = element.get_text()
|
||||
if element.name in ["h1","h2"]:
|
||||
rendered_body += "\n"+"\x1b[1;34m\x1b[4m" + line + "\x1b[0m"+"\n"
|
||||
elif element.name in ["h3","h4"]:
|
||||
rendered_body += "\n" + "\x1b[34m" + line + "\x1b[0m" + "\n"
|
||||
else:
|
||||
rendered_body += "\n" + "\x1b[34m\x1b[2m" + line + "\x1b[0m" + "\n"
|
||||
elif element.name == "pre":
|
||||
if element.string:
|
||||
rendered_body += "\n"
|
||||
rendered_body += element.string
|
||||
rendered_body += "\n"
|
||||
elif element.name == "li":
|
||||
for child in element.children:
|
||||
line = recursive_render(child)
|
||||
wrapped = textwrap.fill(line,self.options["width"])
|
||||
rendered_body += " * " + wrapped + "\n"
|
||||
elif element.name == "p":
|
||||
temp_str = ""
|
||||
if element.string:
|
||||
temp_str += element.string.strip()
|
||||
#print("p string : ",element.string)
|
||||
else:
|
||||
#print("p no string : ",element.contents)
|
||||
for child in element.children:
|
||||
temp_str += recursive_render(child)
|
||||
#temp_str += " "
|
||||
wrapped = textwrap.fill(temp_str,self.options["width"])
|
||||
if wrapped.strip() != "":
|
||||
rendered_body += wrapped + "\n\n"
|
||||
elif element.name == "a":
|
||||
text = element.get_text().strip()
|
||||
link = element.get('href')
|
||||
if link:
|
||||
line = "=> " + link + " " +text
|
||||
link_id = " [%s] "%(len(self.index)+1)
|
||||
temp_gi = GeminiItem.from_map_line(line, gi)
|
||||
self.index.append(temp_gi)
|
||||
rendered_body = "\x1b[34m\x1b[2m " + text + link_id + "\x1b[0m"
|
||||
elif element.string:
|
||||
#print("tag without children:",element.name)
|
||||
#print("string : **%s** "%element.string.strip())
|
||||
#print("########")
|
||||
rendered_body = element.string.strip()
|
||||
else:
|
||||
#print("tag children:",element.name)
|
||||
for child in element.children:
|
||||
rendered_body += recursive_render(child)
|
||||
#print("body for element %s: %s"%(element.name,rendered_body))
|
||||
return rendered_body
|
||||
|
||||
# the real _handle_html method
|
||||
self.index = []
|
||||
if self.idx_filename:
|
||||
os.unlink(self.idx_filename)
|
||||
tmpf = tempfile.NamedTemporaryFile("w", encoding="UTF-8", delete=False)
|
||||
self.idx_filename = tmpf.name
|
||||
tmpf.write(self._make_terminal_title(gi))
|
||||
body = gi.get_body()
|
||||
title = Document(body).title()
|
||||
tmpf.write("\x1b[1;34m\x1b[4m" + title + "\x1b[0m""\n")
|
||||
summary = Document(body).summary()
|
||||
soup = BeautifulSoup(summary, 'html.parser')
|
||||
rendered_body = ""
|
||||
for el in soup.body.contents:
|
||||
rendered_body += recursive_render(el)
|
||||
rendered_body = rendered_body.rstrip()
|
||||
tmpf.write(rendered_body)
|
||||
tmpf.close()
|
||||
self.lookup = self.index
|
||||
self.page_index = 0
|
||||
self.index_index = -1
|
||||
if display:
|
||||
cmd_str = self._get_handler_cmd("text/gemini")
|
||||
subprocess.call(shlex.split(cmd_str % self.idx_filename))
|
||||
|
||||
# Gemtext Rendering Engine
|
||||
# this method renders the original Gemtext then call the handler to display it.
|
||||
def _handle_gemtext(self, menu_gi, display=True):
|
||||
|
@ -1051,12 +1168,7 @@ you'll be able to transparently follow links to Gopherspace!""")
|
|||
# to display it. This is the output, not native gemtext.
|
||||
tmpf = tempfile.NamedTemporaryFile("w", encoding="UTF-8", delete=False)
|
||||
self.idx_filename = tmpf.name
|
||||
title = menu_gi.title
|
||||
if menu_gi.is_cache_valid() and self.offline_only and not menu_gi.local:
|
||||
last_modification = menu_gi.cache_last_modified()
|
||||
str_last = time.ctime(last_modification)
|
||||
title += " \x1b[0;31m(last accessed on %s)"%str_last
|
||||
tmpf.write("\x1b[31m\x1b[1m"+ title + "\x1b[0m""\n")
|
||||
tmpf.write(self._make_terminal_title(menu_gi))
|
||||
for line in menu_gi.get_body().splitlines():
|
||||
if line.startswith("```"):
|
||||
preformatted = not preformatted
|
||||
|
@ -1133,7 +1245,7 @@ you'll be able to transparently follow links to Gopherspace!""")
|
|||
self.log["ipv6_bytes_recvd"] += size
|
||||
|
||||
def _get_active_tmpfile(self):
|
||||
if self.gi.get_mime() == "text/gemini":
|
||||
if self.gi.get_mime() in ["text/gemini","text/html"]:
|
||||
return self.idx_filename
|
||||
else:
|
||||
return self.tmp_filename
|
||||
|
@ -1655,7 +1767,7 @@ Use 'ls -l' to see URLs."""
|
|||
"""Run most recently visited item through "less" command."""
|
||||
cmd_str = self._get_handler_cmd(self.gi.get_mime())
|
||||
cmd_str = cmd_str % self._get_active_tmpfile()
|
||||
subprocess.call("%s | less -R" % cmd_str, shell=True)
|
||||
subprocess.call("%s | less -RM" % cmd_str, shell=True)
|
||||
|
||||
@needs_gi
|
||||
def do_fold(self, *args):
|
||||
|
@ -1927,12 +2039,11 @@ def main():
|
|||
if args.sync:
|
||||
# fetch_cache is the core of the sync algorithm.
|
||||
# It takes as input :
|
||||
# - a list of GeminiItems to be fetched (TODO: convert to list)
|
||||
# - a list of GeminiItems to be fetched
|
||||
# - depth : the degree of recursion to build the cache (0 means no recursion)
|
||||
# - validity : the age, in seconds, existing caches need to have before
|
||||
# being refreshed (0 = never refreshed if it already exists)
|
||||
# - savetotour : if True, newly cached items are added to tour
|
||||
# (this option does not apply recursively)
|
||||
def add_to_tour(gitem):
|
||||
if gitem.is_cache_valid():
|
||||
print(" -> adding to tour: %s" %gitem.url)
|
||||
|
@ -1964,10 +2075,6 @@ def main():
|
|||
subcount = [0,len(temp_lookup)]
|
||||
for k in temp_lookup:
|
||||
#recursive call
|
||||
#To not refresh already cached ressource too often
|
||||
#we impose a random validity
|
||||
#randomval = int(refresh_time*random.uniform(10,100))
|
||||
#never saving recursion to tour
|
||||
substri = strin + " -->"
|
||||
subcount[0] += 1
|
||||
fetch_cache(k,depth=d,validity=0,savetotour=savetotour,\
|
||||
|
@ -1976,7 +2083,7 @@ def main():
|
|||
if args.cache_validity:
|
||||
refresh_time = int(args.cache_validity)
|
||||
else:
|
||||
# if no refresh time, a default of 1h is used
|
||||
# if no refresh time, a default of 0 is used (which means "infinite")
|
||||
refresh_time = 0
|
||||
gc.sync_only = True
|
||||
# We start by syncing the bookmarks
|
||||
|
@ -2003,7 +2110,7 @@ def main():
|
|||
#always get to_fetch and tour, regardless of refreshtime
|
||||
#we don’t save to tour (it’s already there)
|
||||
counter += 1
|
||||
if l.startswith("gemini://"):
|
||||
if l.startswith("gemini://") or l.startswith("http"):
|
||||
fetch_cache(GeminiItem(l.strip()),depth=1,validity=refresh_time,\
|
||||
savetotour=False,count=[counter,tot])
|
||||
# Then we get ressources from syncfile
|
||||
|
@ -2019,11 +2126,12 @@ def main():
|
|||
if tot > 0:
|
||||
print(" * * * %s to fetch from your offline browsing * * *" %tot)
|
||||
for l in set(lines_lookup):
|
||||
#always fetch the cache (we allows only a 3 minutes time)
|
||||
#always fetch the cache (we allows only a 3 minutes time
|
||||
# to avoid multiple fetch in the same sync run)
|
||||
#then add to tour
|
||||
counter += 1
|
||||
gitem = GeminiItem(l.strip())
|
||||
if l.startswith("gemini://"):
|
||||
if l.startswith("gemini://") or l.startswith("http"):
|
||||
fetch_cache(gitem,depth=1,validity=180,\
|
||||
savetotour=False,count=[counter,tot])
|
||||
add_to_tour(gitem)
|
||||
|
|