percent encoded spaces break links #216

Closed
opened 2022-02-27 23:17:05 +00:00 by bencollver · 3 comments

bombadillo is not able to follow gopher links that have percent encoded spaces.

Example:

$ bombadillo gemini://tilde.pink/~bencollver/

What did i do?

Tried to use the link specified below.

=> gopher://gopherpedia.com:70/0/Accident%20of%20birth "Accident of birth" explanation

What did i expect to see?

The gopherpedia page about "Accident of birth"

What did i see instead?

i ____ _ null (FALSE) 0
i / __ \ | | null (FALSE) 0
i | | | | ___ ___ _ __ ___| | null (FALSE) 0
i | | | |/ _ \ / _ \| '_ \/ __| | null (FALSE) 0
i | |__| | (_) | (_) | |_) \__ \_| null (FALSE) 0
i \____/ \___/ \___/| .__/|___(_) null (FALSE) 0
i | | null (FALSE) 0
i |_| null (FALSE) 0
i null (FALSE) 0
iLooks like something went wrong with that request null (FALSE) 0
i null (FALSE) 0
i null (FALSE) 0
3Error 400 null (FALSE) 0
i null (FALSE) 0
i null (FALSE) 0
1back to gopherpedia / gopherpedia.com 70

What does the gemini specification say?

gemini://gemini.circumlunar.space/docs/specification.gmi

5.4.2 Link lines

URLs in link lines must have reserved characters and spaces percent-encoded as per RFC 3986.

Owner

I am not sure that this is an issue/bug. I agree it is broken... but it is more of an incompatibility between gopher and gemini. Gopher does not percent encode links. Gopher clients and servers do not use the general URL spec; they have their own URL spec that does not include percent encoding/decoding. If you go to the root of that URL (gopher://gopherpedia.com), choose the full text search, and search for accident of birth (either in bombadillo or a different gopher client), you will likely see that the URL loads and does not get percent encoded. It just has spaces in it, because in gopher spaces are perfectly valid URL characters.

Gemini, however, is unable to cope with this because of how its link lines work. A space is a delimiter for the link line itself and thus breaks links.
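
To illustrate why a raw space breaks the link, here is a minimal sketch of how a client might split a gemtext link line. The function name `parseLinkLine` is hypothetical and is not taken from Bombadillo's source; it only assumes the link line layout of `=>`, then the URL, then whitespace, then an optional label.

```go
package main

import (
	"fmt"
	"strings"
)

// parseLinkLine splits a gemtext link line into its target URL and label.
// Hypothetical helper for illustration only, not Bombadillo's parser.
func parseLinkLine(line string) (target, label string, ok bool) {
	if !strings.HasPrefix(line, "=>") {
		return "", "", false
	}
	rest := strings.TrimSpace(line[2:])
	if rest == "" {
		return "", "", false
	}
	// The URL ends at the first space or tab; everything after is the label.
	i := strings.IndexAny(rest, " \t")
	if i < 0 {
		return rest, "", true
	}
	return rest[:i], strings.TrimSpace(rest[i:]), true
}

func main() {
	// With raw spaces, the URL is cut off after "Accident".
	t, l, _ := parseLinkLine("=> gopher://gopherpedia.com:70/0/Accident of birth")
	fmt.Printf("%q | %q\n", t, l)

	// With percent-encoded spaces, the whole URL survives the split.
	t, l, _ = parseLinkLine("=> gopher://gopherpedia.com:70/0/Accident%20of%20birth explanation")
	fmt.Printf("%q | %q\n", t, l)
}
```

In the first call the words "of birth" end up in the label rather than in the URL, which is the breakage described above.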

So, the question becomes: should URLs to gopher from gemini be unencoded before the request is sent to the gopher server? I would argue not. URLs in gopher can use a percent sign for a meaning other than encoding characters, and treating those as encoded characters would break existing gopher content.

In general I consider Bombadillo to be a "gopher first" client. It existed before gemini (both bombadillo and gopher), and while I love gemini and was involved in a lot of the early discussions... bombadillo is, at its heart, a gopher client. It is an unfortunate circumstance that there is this incompatibility. Other clients may handle it other ways and I am interested to see how they do so... but I do think it is a broken situation where one side has to lose to some degree.

If you think I am incorrect in any of the above info, or want to make a case contrary to where this comment seems to have landed, definitely do so. I'm open to hearing arguments one way or the other. I'm also interested to see if the community has standardized on anything relating to this in multi-protocol clients.

Alternatively, content authors can be encouraged to link to a gemini based resource, such as: gemini://vault.transjovian.org:1965/search/en/accident%20of%20birth, which does load as expected in bombadillo.

Author

Thanks for your quick and thoughtful response!

Lagrange dereferences the link and its gopher request succeeds.

I am not attached to whether or how this gets fixed in bombadillo.

Just to play "devil's advocate": Since the gemtext spec says to percent encode URLs, then it would still be possible to have percent signs in a gopher URL. They would just need to be encoded as %25. Bombadillo could handle gemini content differently from gopher content and decode URLs from gemini content prior to making gopher requests.
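
A rough sketch of that idea in Go, assuming the URL came from a gemtext link line and is therefore percent-encoded. The type `gopherRequest` and the function `fromGemtextLink` are hypothetical names for illustration, not Bombadillo's or Lagrange's actual code. It relies on the standard library's `net/url`, whose `Path` field holds the already-decoded path.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// gopherRequest holds what a client needs to open a gopher connection.
// Hypothetical type for illustration only.
type gopherRequest struct {
	Host     string // host:port to dial
	ItemType byte   // gopher item type, e.g. '0' for a text file
	Selector string // decoded selector sent to the server
}

// fromGemtextLink decodes a percent-encoded gopher URL taken from gemtext.
// It would NOT be applied to links found in gopher menus.
func fromGemtextLink(raw string) (gopherRequest, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return gopherRequest{}, err
	}
	if u.Scheme != "gopher" {
		return gopherRequest{}, fmt.Errorf("not a gopher URL: %q", raw)
	}
	host := u.Host
	if u.Port() == "" {
		host += ":70" // default gopher port
	}
	// u.Path is percent-decoded: %20 becomes a space, %25 a literal '%'.
	path := strings.TrimPrefix(u.Path, "/")
	req := gopherRequest{Host: host, ItemType: '1'} // assume a menu if no type is given
	if path != "" {
		req.ItemType = path[0]
		req.Selector = path[1:]
	}
	return req, nil
}

func main() {
	req, err := fromGemtextLink("gopher://gopherpedia.com:70/0/Accident%20of%20birth")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s %c %q\n", req.Host, req.ItemType, req.Selector)
}
```

Under this scheme a selector containing a literal percent sign would be written as `%25` in gemtext and would reach the gopher server as a plain `%`. The sketch ignores query strings and other edge cases.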

sloum closed this issue 2022-03-06 21:44:38 +00:00
Owner

I think it accidentally submitted a comment that was just a draft that I did not intend to send, so you may have gotten an e-mail with it. Sorry, if so. It was not meant to be sent.

I am closing this as I think that the two things are just incompatible. I definitely get what you are saying with using a translation layer, but do not think it is a route I want to go for bombadillo.

Good to know about lagrange doing translation for this though. It is interesting, and this is certainly a muddy area where opinions will diverge (specifically between those wanting a practical solution and those wanting to stick to individual but conflicting specs/rfcs).

Definitely thanks for bringing it to our/my attention. Having a history of the issue here can be referenced later if anything changes in either specs or in the developer opinions for bombadillo, so definitely still useful to have on record :-D

sloum added the wontfix label 2022-03-06 21:48:13 +00:00
Reference: sloum/bombadillo#216