
New article about decentralized forge

master · southerntofu · 6 months ago · commit 21e581b9e5
Changed files:

1. .gitignore (2)
2. .gitmodules (3)
3. content/blog/decentralized-forge/2010_and_2020_2x.jpg (BIN)
4. content/blog/decentralized-forge/2010_and_2020_2x.png (BIN)
5. content/blog/decentralized-forge/Shamirs-secret-sharing-scheme.ppm.png (BIN)
6. content/blog/decentralized-forge/force-push.jpg (BIN)
7. content/blog/decentralized-forge/force-push.png (BIN)
8. content/blog/decentralized-forge/fuck-cloudflare.jpg (BIN)
9. content/blog/decentralized-forge/fuck-cloudflare.jpg.jpg (BIN)
10. content/blog/decentralized-forge/git-ssb.png (BIN)
11. content/blog/decentralized-forge/github-politics.jpg (BIN)
12. content/blog/decentralized-forge/github-politics.png (BIN)
13. content/blog/decentralized-forge/index.md (335)
14. content/blog/decentralized-forge/index2.md (39)
15. content/blog/decentralized-forge/index3.md (279)
16. content/blog/decentralized-forge/index_old.md (253)
17. content/blog/decentralized-forge/index_old2.md (365)
18. content/blog/decentralized-forge/launchpad.jpg (BIN)
19. content/blog/decentralized-forge/launchpad.jpg.jpg (BIN)
20. content/blog/decentralized-forge/radicle.png (BIN)
21. content/blog/decentralized-forge/sat-forge-cut.png (BIN)
22. content/blog/decentralized-forge/sat-forge.png (BIN)
23. themes/water (1)

.gitignore (2)

@@ -19,3 +19,5 @@ tags
# Persistent undo
[._]*.un~
content/drafts
public

.gitmodules (3)

@@ -0,0 +1,3 @@
[submodule "themes/water"]
path = themes/water
url = https://tildegit.org/southerntofu/zola-water
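For reference, the `.gitmodules` entry above is exactly what `git submodule add` writes: one command clones the theme and records both its `path` and `url` (a sketch, run from the site's root; the URL is the real theme repository):

```shell
# Add the theme as a submodule: clones it into themes/water and
# records the mapping in .gitmodules in one step.
git submodule add https://tildegit.org/southerntofu/zola-water themes/water
cat .gitmodules   # shows the [submodule "themes/water"] section
```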

content/blog/decentralized-forge/index.md (335)

@@ -0,0 +1,335 @@
+++
title = "Decentralized forge: distributing the means of digital production"
date = 2021-01-07
+++
This article began as a draft in February 2020, which I revised because the Free Software Foundation is looking for feedback on their high-priority projects list.
Our world is increasingly controlled by software. From medical equipment to political repression to interpersonal relationships, software is everywhere, shaping our lives. As the Luddites did two centuries ago, we're evaluating whether new technologies empower us or, on the contrary, reinforce existing systems of social control.
This question is often phrased through a [software freedom](https://www.gnu.org/philosophy/free-sw.html) perspective: am I free to study the program, modify it according to my needs, and distribute original or modified copies of it? However, individual programs cannot be studied out of their broader context. Just as the human and ecological impact of a product cannot be assessed by looking at the product itself, a binary program tells us little about its conditions of production and envisioned goals.
A forge is another name for a software development platform. The past two decades have been rich in progress on forging ecosystems, empowering many projects to integrate tests, builds, fuzzing, or deployments as part of their development pipeline. However, there is a growing privatisation of the means of digital production: Github, in particular, is centralizing a lot of forging activities on its closed platform.
In this article, I argue decentralized forging is a key issue for the future of free software and non-profit software development that empowers users. It will cover:
- Github considered harmful (what Github did well, and why they are now a problem)
- git is decentralized, but the workflows aren't standardized (why email forging is not yet practical for everyone)
- Selfhosted walled gardens are not a solution (how selfhosted forges can limit cooperation)
- Centralized and decentralized trust (how to discover and authenticate content in a decentralized setting)
- Popular self-defense against malicious activities (what are the threats, and how to empower our communities against them)
- Existing decentralized forging solutions (both federated and peer-to-peer)
- With interoperability, please
- Code signing and nomadic identity
- Social recovery and backup mechanisms
- and a short conclusion
This article assumes you are somewhat familiar with software development, [decentralized version control systems](https://en.wikipedia.org/wiki/Distributed_version_control) (DVCS) such as git, cooperative forging (collaboration between multiple users), and some notions of asymmetric cryptography (public/private keys, signatures).
# Glossary
- forge: a program facilitating cooperation on a project
- DVCS: a version control system such as git
- repository: a folder, containing versioning metadata, to represent several versions of the same project
- source: the contents of the repository (excluding versioning metadata)
- commit: a specific version of a repository, whose name is the [checksum](https://en.wikipedia.org/wiki/Checksum) of that version's contents and metadata (such as its parent commits)
- branch: a user-friendly name pointing to a specific commit, which can be updated to point to newer commits (this is what happens when an update is pushed)
- patch: a proposal to change content from the repository, that can be understood by humans and by machines (typically, a [diff](https://en.wikipedia.org/wiki/Diff)), also known as a pull request or merge request (though slightly different)
- issue: a comment/discussion about the project, which is usually not contained within the repository itself (also called a ticket or a bug)
- contributors: people cooperating on a project by submitting issues and patches
- maintainers: contributors with permission to push updates to the repository (validate patches)
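The "commit name is a checksum" point from the glossary can be verified directly: re-hashing a commit's raw object content reproduces its id exactly (a small demonstration, run inside any git repository):

```shell
# A commit id is just the hash of the underlying commit object:
# hashing the raw object content yields the same name git assigned.
git rev-parse HEAD                                             # the commit id...
git cat-file commit HEAD | git hash-object -t commit --stdin   # ...recomputed from content
```

Both commands print the same hash, which is why any tampering with a repository's history changes every subsequent commit name.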
# Github considered harmful
More and more, software developers and other digital producers are turning to Github to host their projects and handle cooperation across users, groups and projects. Github is a web interface for git. From this web interface, you can browse the source of the project, submit/triage issues, and propose/review patches. While Github internals rely on git, Github does more than git itself. Notably, Github lets you define fine-grained permissions for different users/groups.
Github is what we call a web forge, that is, a software suite that makes cooperation easier. It was inspired by many others that came before, like Sourceforge and Launchpad. What's different about Github, compared to previous forges, is the emphasis on user experience. While Sourceforge is great for those of us familiar with [DAGs](https://en.wikipedia.org/wiki/Directed_acyclic_graph) and git vocabulary (*refs*, *HEAD*, *branches*), Github was conceived from the beginning to be friendlier to newcomers. Using Github to produce software still requires git knowledge, but submitting a ticket (to report a bug) is accessible to anyone familiar with a web interface.
To illustrate their difference, we can look at how different forges treat the [README](https://en.wikipedia.org/wiki/README) file. From [Inkscape's Launchpad project](https://launchpad.net/inkscape), it takes me one click to reach the list of files (sometimes called *tree*), and yet another click to open the README. In contrast, [zola's project page](https://github.com/getzola/zola) on Github instantly displays the list of files, as well as the complete README.
![The inkscape project page on Launchpad](launchpad.jpg.jpg)
Every project is organized differently. That's why displaying a list of branches or the latest commits on the project page brings limited value: the people who need this information likely know how to retrieve it from the git command line (`git branch` or `git log`). README files, however, were created precisely to introduce newcomers to your project. Whether your README file is intended for novice users or people from a specific background is up to you, but Github does not stand in the way.
But not all is so bright with Github. First and foremost, Github is a private company trying to make money on other people's work. They are more than happy to use free-software projects to build their platform, but always refused to publish their own sourcecode. Such hypocrisy has long been denounced by the Free Software Foundation, which in its [ethical forging evaluation](https://www.gnu.org/software/repo-criteria-evaluation.html) gave Github the worst note possible ([criteria](https://www.gnu.org/software/repo-criteria.html)).
Like Facebook, Google, and other centralized platforms, Github also has a reputation for banning users from the platform with little/slow recourse. Some take to forums and social media platforms to get the attention of the giant in the hope of getting their account restored. That's what happened, for instance, when Github decided to [ban all users from Iran, Crimea and Syria](https://www.theverge.com/2019/7/29/8934694/github-us-trade-sanctions-developers-restricted-crimea-cuba-iran-north-korea-syria): users organized [massive petitions](https://github.com/1995parham/github-do-not-ban-us), but were left powerless in the end.
Github has also been involved in politically questionable decisions. If your program interacts in any way with copyrighted materials (such as Popcorn Time or youtube-dl), chances are it will be taken down at some point. This is even more true if your project displeases a colonial empire (or pretend "republic") like Spain or the United States. An application promoting Catalan independence ⁽²⁾ [was removed](https://www.bbc.com/news/technology-50232902). And so was a list of (ICE) police officers compiled from entirely public data, which was finally [hosted by Wikileaks](https://www.rt.com/usa/430505-ice-database-hosted-wikileaks/).
![Github logo parody](github-politics.jpg)
Github has chosen their side: they will protect the privileged against the oppressed, in an effort to safeguard their own profits. They will continue to support [Palantir](https://www.fastcompany.com/90348304/exclusive-tech-workers-organize-protest-against-palantir-on-the-github-coding-platform) (surveillance company) and [ICE](https://techcrunch.com/2019/11/13/github-faces-more-resignations-in-light-of-ice-contract/) (racist police detaining/deporting people) despite protests from their users and employees, and the only option left for us is to boycott their platform entirely.
To be fair, there are examples for which I personally agree with Github's repressive actions, like when they banned a dystopian [DeepNude application](https://www.vice.com/en/article/8xzjpk/github-removed-open-source-versions-of-deepnude-app-deepfakes) encouraging hurtful [stalking](https://en.wikipedia.org/wiki/Stalking)/harassing behavior. My point is not that Github's positions are wrong, but rather that nobody should ever hold such power.
It is entirely understandable for a person or entity to refuse to be affiliated with certain kinds of activities. However, given Github's quasi-monopoly on software forging, banning people from Github means exiling them from many projects. Likewise, banning projects from Github disconnects them from a vast pool of potential contributors. There should never be a global button to dictate who can contribute to a project, and what kind of project they may contribute to.
# git is decentralized, but the workflows aren't standardized
To oppose the ever-growing influence of Github, much of the free-software community pointed out that we simply don't need it, because [git is already a decentralized system](https://drewdevault.com/2018/07/23/Git-is-already-distributed.html), contrary to centralized versioning systems like subversion (svn). Every contributor to a git repository has a complete copy of the project's history, can work offline, then push their changes to any remote repository (called a *remote* in git terminology).
This property ensures that anyone can at any time copy a repository, take it elsewhere, and start their own fork (modified version). However, what lives outside the repository itself cannot always be migrated so easily: tickets, submitted patches, and other contributions and discussions may each have their own export procedure, when exporting is possible at all.
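Concretely, the code itself (though not the tickets) can be moved to another host in two commands; both URLs below are hypothetical examples:

```shell
# Migrate a repository, with its full history and all refs, to another host.
# --mirror copies every branch and tag, not just the checked-out one.
git clone --mirror https://github.com/example/project.git
cd project.git
git push --mirror https://codeberg.org/example/project.git
```

This is the easy half of project migration; recovering issues and pull requests requires whatever export API the old forge happens to offer.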
Historically, git was developed for the Linux kernel community, where email is the core of the cooperation workflow. Tickets and patches are submitted and commented on mailing lists. git even provides subcommands for such a workflow (e.g. `git send-email`). This workflow is [less prone to censorship](https://sourcehut.org/blog/2020-10-29-how-mailing-lists-prevent-censorship/) than centralized forges like Github, but has other challenges.
For example, when forking a project, how do you advertise your fork to the community without spamming existing contributors? Surely, if the original project is dead and the mailing list doesn't exist anymore, a fork's notification email would be welcomed by most people. In other cases, it may be perceived as an invasive action, because these contributors did not explicitly consent (opt in) to your communication, as required by privacy regulations in many countries.
Moreover, there is no standardized protocol for managing tickets and patch submissions over email. [sourcehut's todo](https://man.sr.ht/todo.sr.ht/) and Debian's [debbugs](https://www.debian.org/Bugs/) have very good documentation for new contributors to submit issues, but in other communities it can be hard to understand the [netiquette](https://en.wikipedia.org/wiki/Etiquette_in_technology) or expected ticket format for a project, even more so when this project has been going on for a long time with an implicit internal culture.
With the email workflow, every project is left to implement its own way of doing things. That can be quite powerful for complex use cases, but inexperienced people who just want to start a cooperative project will be left powerless. Creating a project on Github takes a few clicks. In comparison, setting up a git server and a couple of mailing lists (one for bugs, one for patches), then configuring correct permissions for all of that, is more exhausting and error-prone, unless you use an awesome shared hosting service like [sr.ht](https://sr.ht/).
From a contributor's perspective, it can take some time getting used to an email workflow. In particular, receiving and applying patches from other people may require using a developer-friendly email client such as [emacs](https://www.gnu.org/software/emacs/), [aerc](https://aerc-mail.org/) or [neomutt](https://neomutt.org/). Sending patches, on the other hand, is not complicated at all. If you want to learn, there's an amazing interactive tutorial at [git-send-email.io](https://git-send-email.io/).
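The whole round trip fits in three commands; a sketch where the list address is a made-up example, `git send-email` assumes an SMTP server configured under `sendemail.*`, and `received.patch` stands for whatever your mail client saved:

```shell
# Contributor side: turn the last commit into an emailable patch and send it.
git format-patch -1 HEAD
git send-email --to="~example/project-devel@lists.sr.ht" 0001-*.patch

# Maintainer side: apply a received patch, preserving author and message.
git am received.patch
```

`git am` is where a developer-friendly mail client helps: it wants the raw message, which some webmail interfaces make awkward to export.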
We have to keep in mind that forging platforms are used not only by developers, but also by designers, translators, editors, and many other kinds of contributors. Many of them already struggle to learn git, with its [infamous inconsistencies](https://stevelosh.com/blog/2013/04/git-koans/). Achieving a full-featured, newcomer-friendly email forging workflow is still a challenge, and this is currently a hard limit on contributions from less technical users.
Developing interoperable standards for forging over email (such as ticket management) would help a great deal. It would enable mail clients, desktop environments and integrated development environments (IDEs) to provide familiar interfaces for all the interactions we need with the forge. For example, Github's [issue templates](https://docs.github.com/en/free-pro-team@latest/github/building-a-strong-community/manually-creating-a-single-issue-template-for-your-repository) could be supported in email workflows. An open email forging standard would also make it easier for web forges (and others) to enable contributions via email, reuniting two forging ecosystems that currently rarely interact, because most people are familiar with one or the other, not both.
# Selfhosted walled gardens are not a solution
Over the past decade, some web forges have been developed with the goal of replicating the Github experience, but in a selfhosted environment. The most popular of those are [Gitlab](https://gitlab.com/), [Gogs](https://gogs.io/) and [Gitea](https://gitea.io/) (which is a fork of Gogs).
Such modern, selfhosted web forges are very popular with bigger communities and projects which already have their own infrastructure and avoid relying on 3rd-party service providers, out of reliability or security concerns. Forging can be integrated into their project ecosystem, for example to manage user accounts (e.g. with [LDAP](https://en.wikipedia.org/wiki/Lightweight_Directory_Access_Protocol)).
However, for hobbyists and smaller communities, these solutions are far from ideal, because they were specifically developed to replicate a centralized development environment like Github. The forge is usually shut off from the outside world, and cooperation between users is only envisioned in a local context. Two users on [Codeberg](https://codeberg.org/) may cooperate on a project, but a user from [0xacab.org](https://0xacab.org/) may not contribute to the same project without creating an account on Codeberg.
Some people may argue this is a feature and not a bug, for three reasons:
1. *easier enforcement of server-wide rules, guidelines and access rules*: this may be an advantage in corporate settings (or for big community projects), but doesn't apply to popular hacking use cases, where all users are treated equally and settings are defined on a per-project basis (not server-wide)
2. *an account on every server is more resilient to censorship or server shutdown*: while true, I would argue the issue should be tackled in more comprehensive and user-friendly ways, through easier project migration and nomadic identity systems (explained later in this article)
3. *an account on every server isn't a problem, because there are only so many projects you contribute to*: though a person may only contribute seriously and frequently to a limited number of projects, there are many more projects we use on a daily basis and don't report bugs to, because unless the project has privacy-invading telemetry and one-click bug reporting, figuring out a specific project's bug reporting guidelines can be tedious
In contrast, the bug reporting workflow achieved by Github and mailing lists is more accessible: you use your usual account, and the interface you're used to, to submit bugs to many projects. If a project uses mailing lists for cooperation, you can contribute to it from your usual mail client. Your bug reports and patches may be moderated before they appear publicly, but you don't have to create a new account and learn a new workflow/interface just to submit a bug report.
Creating a new account for every community and project you'd like to join is not a user-friendly approach. This [antipattern](https://en.wikipedia.org/wiki/Anti-pattern) was already observed in a different area, with selfhosted social networks: [Elgg](https://elgg.org/) could never replace Facebook entirely, nor could [Postmill](https://postmill.xyz/)/[Lobsters](https://lobste.rs/) replace Reddit, because participation was restricted to a local community. In some cases that's a feature: a family's private social network should not connect to the outside world, and a focused and friendly community like [raddle.me](https://raddle.me/) or [lobsters](https://lobste.rs/) may wish to preserve itself from nazi trolls.
But in many cases, not being able to federate across instances (and communities) is a bug. Selfhosted centralized services cater to niche use cases, not because they're too different from Facebook/Reddit, but because they're technically so similar to them. Instead of dealing with a gigantic walled garden (Github), or a wild jungle (mailing lists), we now end up with a collection of tiny closed gardens. The barrier to entry to those gardens is low: you just have to introduce yourself at the front door and define a password. But this barrier to entry, however low, is already too high.
I suspect that for smaller volunteer-run projects, the ratio of bug reporters to code committers is much higher on Github and development mailing lists than it is on smaller, selfhosted forges. If you think that's a bad thing, try shifting your reasoning: if only people familiar with programming are reporting bugs, and your project is not only aimed at developers, it means most of your users are either taking bugs for granted, or abandoning your project entirely.
# Centralized and decentralized trust
When we're fetching information about a project, how do we ensure it is correct? In traditional centralized and federated systems, we rely on location-addressed sources of trust. We define *where* to find reliable information about a project (such as a git remote). To ensure the authenticity of the information, we rely on additional security layers:
- Transport Layer Security ([TLS](https://en.wikipedia.org/wiki/Transport_Layer_Security)) or Tor's [onion services](https://community.torproject.org/onion-services/) to ensure the remote server's authenticity, that is to make it harder for someone to impersonate a forge to serve you malicious updates
- Pretty Good Privacy ([PGP](https://en.wikipedia.org/wiki/Pretty_Good_Privacy)) to ensure the document's authenticity, that is to make it harder for someone who took control of your account/forge to serve malicious updates
How we bootstrap trust (from the ground up) for those additional layers, however, is not a simple problem. Traditional TLS setups rely on absolute trust in a pre-defined list of 3rd-party [Certificate Authorities](https://en.wikipedia.org/wiki/Certificate_authority), and CAs abusing their immense power is far from unheard of. Onion services and PGP, on the other hand, require prior knowledge of authentic keys ([key exchange](https://en.wikipedia.org/wiki/Key_exchange)). With the [DANE](https://en.wikipedia.org/wiki/DNS-based_Authentication_of_Named_Entities) protocol, we can bootstrap TLS keys from the domain name system ([DNS](https://en.wikipedia.org/wiki/Domain_Name_System)) instead of the CA cartel. However, DANE is still not supported by many clients, and in any case is only as secure as DNS itself: that is, very insecure, despite recent progress with [DNSSEC](https://en.wikipedia.org/wiki/Domain_Name_System_Security_Extensions). For a location-based system to be secure, we need a secure naming system like the [GNU Name System](https://gnunet.org/en/gns.html) to enable secure key exchange.
These difficulties are inherent properties of location-addressed storage, in which we describe *where* a valid source of the information we're looking for is. Centralized and federated systems are by definition location-addressed systems. Peer-to-peer systems, on the other hand, don't place trust in specific entities. In decentralized systems, trust is established via [cryptographic identities and signatures](https://en.wikipedia.org/wiki/Digital_signature), content-addressed storage ([CAS](https://en.wikipedia.org/wiki/Content-addressable_storage)), or both.
Signatures verify the authenticity of a document compared to a known public key. For example, when we trust the Tor project's PGP key (`EF6E286DDA85EA2A4BA7DE684E2C6E8793298290`), we can obtain the Tor browser (and corresponding signature) from any source, and verify the file was indeed *signed* by Tor developers. Content-addressed storage, in comparison, merely matches a document with a [checksum](https://en.wikipedia.org/wiki/Checksum), and does not provide authorship information.
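The distinction is easy to see with everyday shell tools. A sketch: the `gpg` line is illustrative only (it assumes you hold the real Tor key and release files, with example filenames), while the checksum half runs anywhere:

```shell
# Authorship: verify a detached PGP signature against a trusted key
# (illustrative; filenames are examples and the Tor key must be imported):
#   gpg --verify tor-browser.tar.xz.asc tor-browser.tar.xz

# Integrity only: a checksum ties bytes to a name, but says nothing
# about who produced them.
echo "some release" > release.bin
sha256sum release.bin > release.bin.sha256
sha256sum -c release.bin.sha256   # passes from any mirror serving the same bytes
```

This is why content addressing alone suffices for *distribution* (any peer can serve the bytes), but signatures are still needed to know *whose* bytes you asked for.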
So, with these building blocks in place, how do we discover new content in a decentralized system? There are typically two approaches to this problem: consensus and gossip. There may be more, but I'm not aware of them.
## Consensus
Consensus is an approach in which all participating peers should agree on a single truth (a global state). They take votes following a protocol like [Raft](https://en.wikipedia.org/wiki/Raft_(algorithm)), and usually the majority wins. In a closed-off system controlled by a limited number of people, a restricted set of trusted peers is allowed to vote. These peers can either be manually approved (static configuration), or be bootstrapped from a third party, such as a local certificate authority controlled by the same operators.
But these traditional consensus algorithms do not work for public systems. If anyone can join the network and participate in establishing consensus (not just a limited set of peers), then anyone may create many peers to try and take control of the consensus. This attack is often known as a [Sybil attack](https://en.wikipedia.org/wiki/Sybil_attack), pseudospoofing, or 51% attack.
Public systems of consensus like [Bitcoin](https://en.wikipedia.org/wiki/Bitcoin) use a Proof-of-Work algorithm ([PoW](https://en.wikipedia.org/wiki/Proof_of_work)) to reduce the risk of a Sybil attack. Such blockchains are not determined by a simple majority vote, but rather by a vote by the majority of global computing power. While this represents a mathematical achievement, it means whoever controls the majority of computing power controls the network. This situation already happened in Bitcoin's past, and may happen again.
As we speak, the Bitcoin network already relies on a handful of giant computing pools, mostly running in China where coal-produced electricity is cheap ⁽³⁾. As time goes by, two things happen:
- difficulty goes up so mining becomes less rewarding, and may even cost you money depending on your hardware and your source of electricity; there was a time when mining on a GPU was profitable, and even before that, mining on a CPU was profitable for quite a while
- the blockchain grows more and more (7.5MB in 2010, 28GB in 2015, 320GB nowadays), so joining the network requires ever-growing resources
![XKCD comic about buying hand-sanitizer for 1BTC, CC BY-SA](2010_and_2020_2x.jpg)
To recap, hardware requirements go up, while economic incentives go down. What could possibly go wrong with this approach? Even worse, Bitcoin's crazy cult of raw computational power has serious ecological consequences. Bitcoin, a single application, uses more electricity than many countries. Also, all the dedicated hardware ([ASIC](https://en.wikipedia.org/wiki/Application-specific_integrated_circuit)) built for Bitcoin will likely never be usable for anything else. As such, it appears global consensus is a dead end for decentralized systems.
## Gossip
[Gossip](https://en.wikipedia.org/wiki/Gossip_protocol) is a conceptual shift, in which we explicitly avoid global consensus. Instead, each peer has their own view of the network (truth), but can ask other peers for more information. Gossip is closer to how actual human interactions work: my local library may not have all books ever printed, but whatever I find in there I can freely share with my friends and neighbors. ⁽⁴⁾ Identity in gossip protocols is usually powered by asymmetric cryptography (like PGP), so that all messages across the network can be signed and authenticated.
Gossip can be achieved through any channel. Usually, it involves USB keys and local area networks (LAN). But nothing prevents us from using well-known locations on the Internet to exchange gossiped information, much like a local newspaper or community center would achieve in the physical world. That's essentially what the Secure ScuttleButt ([SSB](https://scuttlebutt.nz/)) protocol is doing with its *[pubs](https://ssbc.github.io/scuttlebutt-protocol-guide/#pubs)*, or PGP with *[keyservers](https://en.wikipedia.org/wiki/Key_server_%28cryptographic%29)*.
In my view, gossip protocols include [IPFS](https://en.wikipedia.org/wiki/InterPlanetary_File_System) and [Bittorrent](https://en.wikipedia.org/wiki/BitTorrent), because they rely on Distributed Hash Tables ([DHTs](https://en.wikipedia.org/wiki/Distributed_hash_table)). Compared to a Bitcoin-style blockchain (where every peer needs to know about everything for consistency), in a DHT no peer knows about everything (reducing the hardware requirements to join the DHT), and consistency is ensured by content addressing (checksumming the stored information).
The database (DHT) is partitioned (divided) across many peers who each have their own view of the network, but existing peers will gladly help you discover content they don't have, and ensuring the authenticity of the data is not hard, thanks to checksums. In this sense, I consider DHTs to be a kind of globally-consistent gossip.
It's not (yet) a widely-researched topic, but it seems IPv6 multicast could be used to make gossiping a lower-level concern (on the network layer). If you're interested in this, be sure to check out a talk called [Privacy and decentralization with Multicast](https://archive.fosdem.org/2020/schedule/event/dip_librecast/).
# Popular self-defense against malicious activity
One may object that favoring interaction across many networks will open big avenues for malicious activity. However, I would argue this does not have to be true. The same could be said of the Internet as a whole, or email in particular. But in practice, we have decades of experience (including many failures) in protecting users from spam and malicious activities in a decentralized context.
Even protocols that initially discarded such concerns as secondary are eventually rediscovering well-known countermeasures. Some talks from the last [ActivityPub conference](https://conf.activitypub.rocks/#talks) ([watch on Peertube](https://conf.tube/video-channels/apconf_channel/videos)) touch on these topics. I personally recommend a talk entitled Architectures of Robust Openness.
What about information which you published by mistake, or information which you willingly published but may cause you harm? And what about spam?
## Revocation and deniability
In the first case, how do you take down a very secret piece of information you did not intend to publish? I am unaware of any way of achieving this in a decentralized manner. Key revocation processes (like with PGP) rely on good faith from other peers, who have incentives to honor your revocation certificate, as the revoked material was a public key, not a secret document.
However, even in a centralized system, there's only so much you can do following an accidental publication. If you published a password or private key, you can simply rotate it. However, if you published a secret document, it may be mirrored forever on some other machines, even if you force-pushed it out. In that sense, a forging repository is similar to a newspaper: better think twice about what you write in your article, because it will be impossible to destroy every copy once it's been distributed.
![git push --force may burn down your house](force-push.jpg)
[Plausible deniability](https://en.wikipedia.org/wiki/Plausible_deniability) addresses the second concern. In modern encrypted messengers (OTR/Signal/OMEMO), encryption keys are constantly rotated, and previous keys are shared. Authenticity of a message is proven in the present, but you cannot be held responsible for a past message, because anyone could have forged the signature by recycling your now-public private key.
While it may seem like a secondary concern, plausible deniability can be a (literally) life-saving property in case of political dissent. That's why Veracrypt has [hidden volumes](https://www.veracrypt.fr/en/Hidden%20Volume.html). That's also why some people are now [calling on mail providers to publish their previous DKIM private keys](https://blog.cryptographyengineering.com/2020/11/16/ok-google-please-publish-your-dkim-secret-keys/): plaintext emails would be plausibly deniable, while retaining strong authenticity for PGP-signed emails (as intended).
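To illustrate the mechanism, here is a minimal Python sketch of MAC-based deniable authentication, the building block used by OTR-style messengers: since both parties hold the same MAC key, the recipient can verify that a message is authentic, but cannot prove it to anyone else, because they could have forged the tag themselves.

```python
import hashlib
import hmac
import os

# Both parties derive the same MAC key (in OTR, from a Diffie-Hellman
# exchange; here we just generate one for illustration).
shared_mac_key = os.urandom(32)

def tag(message: bytes, key: bytes) -> bytes:
    """Authenticate a message for whoever holds `key`."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(message: bytes, mac: bytes, key: bytes) -> bool:
    return hmac.compare_digest(tag(message, key), mac)

msg = b"meet at the usual place"
mac = tag(msg, shared_mac_key)

# The recipient is convinced the message is authentic...
assert verify(msg, mac, shared_mac_key)
# ...but cannot prove it to a judge: holding the same key, the recipient
# could have produced a valid tag for *any* message themselves.
forged = tag(b"i never wrote this", shared_mac_key)
assert verify(b"i never wrote this", forged, shared_mac_key)
```

Publishing old MAC keys, as OTR does, extends this deniability to everyone: once a key is public, anyone could have forged past conversations.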
In decentralized public systems like software forges, i am unaware of any effort to implement plausible deniability. By instinct, i feel like plausible deniability is incompatible with authentication, which is a desired property for secure forging. However, i don't know that for sure. I'm also not sure about the need for plausibly-deniable forging, as developers are usually tracked and persecuted through other means, like [mailing lists](https://en.wikipedia.org/wiki/DeCSS) or [mobile phone networks](https://news.ycombinator.com/item?id=21747424).
## SPAM
Another vector of harm is spam and otherwise undesired content. It is often said that combatting spam in a decentralized environment is a harder problem. However, as previously explained, some decentralized systems have decades of experience in fighting malicious activities.
Simple techniques like [rate-limiting](https://en.wikipedia.org/wiki/Rate_limiting), webs of trust ([WoT](https://en.wikipedia.org/wiki/Web_of_trust)), or (sometimes user-overridable) [allow](https://en.wikipedia.org/wiki/Whitelisting)/[deny lists](https://en.wikipedia.org/wiki/Blacklist_(computing)) can go very far to protect us from spam. However, there are techniques invented for the web and for emails which should never be reused, because they are user-hostile antipatterns: IP range bans, and 3rd party denylisting.
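As an illustration of how simple these countermeasures can be, here is a sketch of a per-peer [token bucket](https://en.wikipedia.org/wiki/Token_bucket) rate limiter in Python (the parameters are arbitrary):

```python
import time

class TokenBucket:
    """Allow `rate` actions per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per peer, so a flood from one peer never starves the others.
buckets: dict = {}

def accept_interaction(peer: str, rate: float = 1.0, burst: float = 5.0) -> bool:
    return buckets.setdefault(peer, TokenBucket(rate, burst)).allow()
```

A spammer burns through their own budget in five messages while every other peer keeps full service; combined with a web of trust, trusted peers could simply be granted bigger buckets.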
Banning entire IP address ranges is a common practice on the web, and the reasoning is that if you received malicious activity from more than one address in an IP range, you'd better ban the entire range, or even all addresses registered from the same country. While this may protect you from an unskilled script kiddie spamming a forum, it will prevent a whole bunch of honest users from using your service.
For example, banning Tor users from your service will do little to protect you from undesired activities. Bad actors usually have a collection of machines/addresses they can start attacks from. When that is not the case, a few dollars will get them a bunch of virtual private servers, each with a dedicated address. For a few dollars more, they'll get access to a botnet running from residential IP addresses. This means blocking entire IP ranges will only block legitimate users, but will not stop bad actors.
Renting under-the-radar residential IP addresses was already common practice years ago, with people being offered money or free TV services in exchange for placing [a blackbox](https://www.reddit.com/r/raspberry_pi/comments/3nc5mv/what_are_they_trying_to_do_with_this_pi_seems/) in their network. Nowadays, "smart" TVs, lightbulbs, cameras and other abominations will do the trick, with even fewer mitigation options. The Internet of Things is a capitalist nightmare that hinders our security (see [this talk](https://www.usenix.org/conference/usenixsecurity18/presentation/mickens)) and destroys the environment.
The other common antipattern in fighting malicious activities is delegating access control to a third party. In the context of a web browser's [adblocker](https://en.wikipedia.org/wiki/Adblocker), it makes sense to rely on denylists maintained by others: you may not have time to do it yourself, and if you feel like something is missing on the page, you can instantly disable the adblocker or remove specific rules. On the server side however, using a 3rd party blocklist may introduce new problems.
How can users report that they are (wrongfully) blocked from your services, if they cannot access those services in the first place to find your contact information? How can they reach you if their email server is blocked, because a long time ago someone spammed from their IP address? In the specific case of email, most shared blocklists have a procedure to get unlisted. But some other blocklists don't.
On the web, CloudFlare is well-known as a privacy-hostile actor: they will terminate TLS connections intended for your service, snooping on communications between your service and its users. Moreover, CloudFlare blocks by default any activity that comes from privacy-friendly networks like Tor, trapping users in endless CAPTCHA loops. They also block [user agents](https://en.wikipedia.org/wiki/User_agent) they suspect to be bots, preventing legitimate [scrapers](https://en.wikipedia.org/wiki/Web_scraping) from indexing or archiving content.
![Fuck cloudflare stickers](fuck-cloudflare.jpg.jpg)
Would you allow a private multinational corporation to stop and [stripsearch](https://en.wikipedia.org/wiki/Strip_search) anyone trying to reach your home's doorbell? Would you allow them to prevent your friends and neighbors from reaching you, because they refuse to be stripped? That's exactly what CloudFlare is doing in the digital world, and that is utterly unacceptable. All in all, **Fuck CloudFlare**! Yes, there's even [a song](https://polarisfm.bandcamp.com/releases) about it.
So, there's nothing wrong with banning malicious actors from your network and services. But what defines a malicious actor? Who gets to decide? These are extremely political questions, and delegating such power to privacy-hostile third-party services is definitely not the way to go.
# Existing decentralized forging solutions
Now that we have covered some of the reasons why decentralized forging is important and how to deal with malicious activities, let's take a look at projects people are actually working on.
## Federated authentication
Some web forges like Gitea propose [OpenID](https://en.wikipedia.org/wiki/OpenID#Technical_overview) federated authentication: you can use any OpenID account to authenticate yourself against such a selfhosted forge. Compared to [OAuth](https://en.wikipedia.org/wiki/OAuth), OpenID does not require the service operator to explicitly list all accepted identity providers. Instead of having a predetermined list of "login with microfacegoogapple" buttons, you have a free form for your OpenID URL.
Whether you're signing up for the first time or signing in, you give your OpenID URL to the forge. You will then be redirected to the corresponding OpenID server, authenticated (if you are not yet logged in), and prompted whether you want to authenticate on the forge. If you accept, you will be redirected back to the forge, which will know that the OpenID server vouched for your identity.
Newer standards like [OpenID Connect](https://en.wikipedia.org/wiki/OpenID_Connect) also feature a well-known discovery mechanism, so you don't have to use a full URL to authenticate yourself, but a simple *user@server* address, as we are already used to. Federated authentication can also be achieved via other protocols, such as email confirmation, [IndieAuth](https://indieauth.net/) or XMPP/Jabber (XEP [0070](https://xmpp.org/extensions/xep-0070.html) or [0101](https://xmpp.org/extensions/xep-0101.html)).
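To make the discovery step concrete, here is a Python sketch of the two well-known URLs involved: a WebFinger lookup (RFC 7033) resolves a *user@server* address to its identity provider, and the provider publishes its endpoints at a well-known path (OpenID Connect Discovery). The `alice@forge.example` address is hypothetical:

```python
from urllib.parse import quote

OIDC_ISSUER_REL = "http://openid.net/specs/connect/1.0/issuer"

def webfinger_url(address: str) -> str:
    """Build the WebFinger (RFC 7033) lookup URL that resolves a
    user@server address to its OpenID Connect identity provider."""
    user, _, server = address.partition("@")
    return (f"https://{server}/.well-known/webfinger"
            f"?resource={quote(f'acct:{user}@{server}', safe='')}"
            f"&rel={quote(OIDC_ISSUER_REL, safe='')}")

def provider_config_url(issuer: str) -> str:
    """The issuer then publishes its endpoints at a well-known path
    (OpenID Connect Discovery)."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"

# The client only needs the user@server address to start discovery.
lookup = webfinger_url("alice@forge.example")
```

A client fetches the first URL, reads the issuer out of the JSON response, then fetches the second to learn the authorization and token endpoints.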
The federated authentication approach is brilliant because it's simple and focuses on the practicality for end-users. However, it does not solve the problem of migrating projects between forges, nor does it enable you to forge from your usual tools/interfaces.
## Federated forging
Federated forging relies on a federation protocol and standard vocabulary to let users cooperate across servers. That means a whole ecosystem of interoperable clients and servers can be developed to suit everyone's needs. This approach is exemplified by the ForgeFed and Salut-à-Toi projects.
[ForgeFed](https://forgefed.peers.community/) is a forging extension for the [ActivityPub](https://activitypub.rocks/) federation protocol (the fediverse). It has a proof-of-concept implementation called [vervis](https://dev.angeley.es/s/fr33domlover/r/vervis) and aims to be implementable for any web forge. However, despite some interesting discussions on their forums, there seems to be limited activity implementation-wise.
[Salut-à-Toi](https://salut-a-toi.org/) on the other hand, is an existing suite of clients for the Jabber federation (XMPP protocol). They have CLI, web, desktop and TUI frontends to do social networking on Jabber. From this base, they released support for decentralized forging [in july 2018](https://www.goffi.org/b/Uj5MCqezCwQUuYvKhSFAwL/salut-alpha-contributors,-take-your-keyboards).
![A proposed patch on the salut-à-toi forge](sat-forge-cut.png)
It's still a proof-of-concept, but it's reliable enough for the project to be selfhosted. In this context, *selfhosted* means that salut-à-toi is the forging software used to develop salut-à-toi itself. This all happens [here](https://bugs.goffi.org/).
While such features are not implemented yet, the fact that these federated forges rely on a standard vocabulary would help with migration between forges, without having to use custom APIs for every forge, as is common for Github/Sourceforge/Bugzilla migrations.
As the user interactions themselves are federated, and not just authentication, folks may use their client of choice to contribute to remote projects. This means fewer concerns about color themes or accessibility on the server side, because all of these questions would be addressed on the client side. This is very important for [accessibility](https://en.wikipedia.org/wiki/Accessibility), ensuring your needs are covered by the client software, and that a remote server cannot impact you in negative ways.
If your email client is hard for you to use, or otherwise unpleasant, you may use any email-compatible client that better suits your needs. With selfhosted, centralized forges, where the client interface is tightly-coupled to the server, every forge server needs to take extra steps to please everyone. Every forge you join to contribute to a project can make your user experience miserable. Imagine if you had to use a different user interface for every different server you're sending emails to?!
Federated forging, despite being in early stages, is an interesting approach. Let servers provide functions tailored to the project owners, and clients provide usability on your own terms.
## Gossip
An early attempt based on Bittorrent's DHT was [Gittorrent](https://blog.printf.net/articles/2015/05/29/announcing-gittorrent-a-decentralized-github/). Another one was [git-remote-ipfs](https://github.com/larsks/git-remote-ipfs), based on [IPFS](https://ipfs.io/) instead. The project is now unmaintained, but it can be replicated [with a simple IPFS HTTP gateway](https://docs.ipfs.io/how-to/host-git-style-repo/). While these proof-of-concept systems did not support tickets and patches, they were inspirational for more modern attempts like git-ssb and radicle.
[git-ssb](https://scuttlebot.io/apis/community/git-ssb.html) implements forging capabilities on top of the Secure ScuttleButt ([SSB](https://scuttlebutt.nz/)) gossip protocol. SSB was designed for an offline-first usage, with local wifi hotspots or sharing USB keys. This makes the system very resilient, though worldwide cooperation between strangers is an afterthought.
![Git SSB's web interface](git-ssb.png)
[radicle](https://radicle.xyz/) is another project, with a homegrown gossip protocol explicitly designed for forging. The project is still in early stages, but there's growing interest around it. They have a dedicated development team and a roadmap, and they claim Monadic (their employer) [will not own](https://monadic.xyz/what-is-monadic.html) any of the Radicle ecosystem, but rather contribute to it under the guidance of a non-profit foundation.
![radicle screenshot, taken from their homepage](radicle.png)
While radicle's funding isn't clear from my perspective, and therefore triggers my cryptostartup vaporware bullshit-o-meter, what i could read from their forums and hear from their presentations gave me confidence they really intend to build a community/standards-driven project for the improvement of the ecosystem. Also worth noting, their [jobs page](https://monadic.xyz/jobs.html) claims decision-making is a horizontal process, and all of their employees are paid the same.
Even more exciting, radicle uses strong cryptography to sign commits in a very integrated and user-friendly way. That's a strong advantage over most systems in which signatures are optional and delegated to third-party tooling which can be hard to setup.
Although these peer-to-peer forges are less polished for day-to-day use, they helped pave the way for research into decentralized forging by showing that git and other decentralized version control systems (DVCS) play well with content-addressed storage (given that a commit ID is an actual checksum) and peer-to-peer networks.
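The content-addressing at play is easy to demonstrate: a git object ID is just the SHA-1 checksum of a short header plus the object's content, so any content-addressed store can hold git objects natively. A Python sketch:

```python
import hashlib

def git_object_id(obj_type: str, content: bytes) -> str:
    """A git object ID is the SHA-1 of '<type> <size>' plus a NUL byte
    and the content: the name of the object *is* a checksum of its data."""
    header = f"{obj_type} {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()

# Same result as `echo 'test content' | git hash-object --stdin`
# (echo appends the trailing newline).
blob_id = git_object_id("blob", b"test content\n")
```

Commits, trees and blobs all follow this scheme, which is why mapping a git repository onto IPFS or a DHT is so natural.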
## Consensus
Apart from sketchy Silicon Valley startups, nobody is attempting to build consensus-based forging solutions, because of the many problems i explained before (with Bitcoin). When i started drafting this article, Radicle seemed keen on using blockchains and global consensus. But since then, they've reevaluated their decision, though not for any of the reasons i proposed, but rather because of legal problems due to [blockchain poisoning](https://jennyleung.io/2019/03/29/blockchain-poisoning-and-the-proliferation-of-state-privacy-rights/). Nowadays, Radicle intends to use [Ethereum](https://en.wikipedia.org/wiki/Ethereum), but only as a complement (not replacement) to their gossip protocol.
## Not covered here
Many more projects over the years have experimented with storing forging interactions (metadata like bugs and pull requests) as well-known files within the repository itself. Some of them are specific to git: [git-dit](https://github.com/neithernut/git-dit), [ticgit](https://github.com/jeffWelling/ticgit), [git-bug](https://github.com/MichaelMure/git-bug), [git-issue](https://github.com/dspinellis/git-issue). Others intend to be used with any versioning system (DVCS-agnostic): [artemis](https://www.mrzv.org/software/artemis/), [bugs-everywhere](https://github.com/kalkin/be), [dits](https://github.com/jashmenn/ditz), [sit](https://github.com/sit-fyi/issue-tracking).
I will not go into more details about them, because these systems only worry about the semantics of forging (vocabulary), but do not emphasize how to publicize changes. For example, these tools would be great for a single team having access to a common repository to update tickets in an offline-first setting, then merging them on the shared remote when they're back online. But they do not address cooperation with strangers, unless you give anyone permission to publish a new branch to your remote, which is probably a terrible idea. However, that's just my personal, uninformed opinion: if you have counter-arguments about how in-band storage of forging interactions could be used for real-world cooperation with strangers, i'd be glad to hear about it!
Lastly, i didn't mention [Fossil SCM](https://fossil-scm.org/) because i'm not familiar with it, and from reading the docs, i'm very confused about how it approaches cooperation with strangers. It appears forging interactions are stored within the repository itself, but then does that mean that Fossil pulls every interaction it hears about from strangers? Or is Fossil only intended for use in a closed team, as [this page](https://fossil-scm.org/home/doc/trunk/www/fossil-v-git.wiki) seems to indicate? Let me know if you have interesting articles to learn more about Fossil.
# With interoperability, please
After this brief review of the existing landscape of decentralized forging, i would like to argue for [interoperability](https://en.wikipedia.org/wiki/Interoperability). If you're not familiar with this concept, it's a key concern for accessibility/usability of both physical and digital systems: interoperability is the property that two systems addressing the same use cases can be used interchangeably. For example, a broken lightbulb can be replaced by any lightbulb following the same socket/voltage standards, no matter how it works internally to produce light.
In fact, interoperability is the default state of things throughout nature. To make a fire, you can burn any sort of wood. If your window is broken and you don't have any glass at hand, you can replace it with any material that will prevent air flowing through. Interoperability is a very political topic, and a key concern to prevent the emergence of monopolies. If you'd like to know more about it, i strongly recommend a talk called [We used to have cake, now we've barely got icing](https://media.ccc.de/v/pw20-392-we-used-to-have-cake-now-we-ve-barely-got-icing).
So while the approaches to decentralized forging we've talked about are very different in some regards, there is no technical reason why they could not play well together and interoperate consistently. As a proof of concept, git-issue, mentioned in the previous section, can actually synchronise issues contained within the repository with Github and Gitlab repositories. It could just as well synchronise with any selfhosted forge (federated or not), or even gossip them to git-ssb or radicle.
The difference between federated and p2p systems is big, but hybrid p2p/federated systems have a lot of value. If we develop open standards, there is no technical barrier for a peer-to-peer forge to synchronise with a federated web/XMPP forge. It may be hard to wrap one's head around, and may require more implementation work, but it's entirely possible. Likewise, a federated forge could federate both via ForgeFed and via XMPP. And it could itself be a peer in a peer-to-peer forge, so that pull requests submitted on Radicle may automatically appear on your web forge.
Not all forges have to understand each other. But it's important that we at least try, because the current fragmentation across tiny different ecosystems is hostile to new contributions from people who are used to different workflows and interfaces.
Beyond cooperation, interoperability would also ease backups, forks and migrations. Migrating your whole project from a forge to another would only take a single, unprivileged action. When forking a project, you would have a choice whether to inherit all of its issues and pull requests or not. So if you're working on a single patch, you would discard it. But in case you want to take over an abandoned project, you would inherit all of the project's history and discussions, not just the commits.
You may have noticed i did not mention the email workflow in this section about interoperability. That's because email bugtracking and patching is far from being standardized. Many concerns expressed in this article would equally apply to email forging. But to be honest, i'm not knowledgeable enough about email-based forging to provide good insight on this topic. I'm hoping people from the [sourcehut forge community](https://sourcehut.org/) and other git email wizards can find inspiration in this call to decentralized forging, come around the table, and figure out clever ways to integrate into a broader ecosystem.
# Code signing and nomadic identity
How do we ensure the authenticity of interactions across different networks? Code signing in forging usually relies on PGP keys and signatures to authenticate commits and *refs*. In most cases, it is considered a DVCS-level concern and is left untouched by the forge, except maybe to display a symbol alongside commits with a valid signature. While we may choose to trust the forge regarding commit signatures, we may also verify these on our end. The tooling for verifying signatures is lacking, although there is recent progress, with the [GNU Guix project](https://guix.gnu.org/) releasing the amazing [guix git authenticate](https://guix.gnu.org/manual/en/html_node/Invoking-guix-git-authenticate.html#Invoking-guix-git-authenticate) command for bootstrapping a secure software supply chain.
However, forging interactions such as issues are typically unsigned, and cannot be verified. In systems like ActivityPub and radicle, these interactions are signed, but with varying levels of reliability. While radicle has strong security guarantees because every client owns their keys, email/ActivityPub lets the server perform signatures on behalf of users: a compromised server could compromise a lot of users, and therefore such signatures are unreliable from a security perspective. We could take this into consideration when developing forging protocols, and ensure we can embed signatures (like PGP) into forging interactions such as tickets.
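As a sketch of what signing interactions could involve, the crucial step is agreeing on a canonical byte representation, so that every implementation signs and verifies the exact same bytes. The ticket fields below are purely hypothetical, and the digest stands in for whatever a client-held key (PGP, ed25519...) would actually sign:

```python
import hashlib
import json

def canonical_bytes(interaction: dict) -> bytes:
    """Serialize an interaction deterministically: sorted keys and fixed
    separators guarantee every implementation gets identical bytes."""
    return json.dumps(interaction, sort_keys=True,
                      separators=(",", ":")).encode()

# Hypothetical ticket fields, for illustration only.
ticket = {
    "type": "ticket",
    "project": "example/forge",
    "title": "Crash when cloning over Tor",
    "author": "alice@forge.example",
}

# The digest below is what a client-held key would sign; the detached
# signature would then travel alongside the ticket across networks.
digest = hashlib.sha256(canonical_bytes(ticket)).hexdigest()

# Key order no longer matters once canonicalized.
shuffled = dict(reversed(list(ticket.items())))
assert canonical_bytes(shuffled) == canonical_bytes(ticket)
```

Without such canonicalization, two forges could disagree on whitespace or key order and reject perfectly valid signatures.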
For interoperability concerns, each forge could implement different security levels, and let maintainers choose the security properties they expect for external contributions, depending on their practical security needs. A funny IRC bot may choose to emphasize low-barrier contribution across many forges over security, while a distribution may enforce stricter security guidelines, allowing contributions only from a trusted webforge and PGP-signed emails. In any case, we need more user-friendly tools for making and verifying signatures.
Another concern is how to deal with migrations. If my personal account is migrated across servers, or i'm rotating/changing keys, how do i let others know about it in a secure manner? In the federated world, this concern has been addressed by the [ZOT protocol](https://zotlabs.org/help/en/developer/zot_protocol), which was initially developed for [Hubzilla](https://zotlabs.org/page/hubzilla/hubzilla-project)'s nomadic identity system. ZOT lets you take your content and your friends to a new server at any given moment.
This is achieved by adding a crypto-identity layer around server-based identity (`user@server`). This crypto-identity, corresponding to a keypair (think PGP), is bootstrapped in a [TOFU](https://en.wikipedia.org/wiki/Trust_on_first_use) manner (Trust On First Use) when federating with a remote user on a server that supports the ZOT protocol. The server will give back the information you requested, let you know the nomadic public key for the corresponding user, and list other identities signed with this keypair.
For example, let's imagine for a second that [tildegit.org](https://tildegit.org/) and [framagit.org](https://framagit.org/) both supported the ZOT protocol and some form of federated forging. My ZOT tooling would generate a keypair, that would advertise my accounts on both forges. Whenever i push changes to a repository, these changes would be pushed to the two servers simultaneously. When someone clones one of my projects, their ZOT-enabled client would save my nomadic identity somewhere. This way, if one of the two servers ever closes, the client would immediately know to try and find my project on the other forge.
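The TOFU bootstrapping described above essentially boils down to key pinning: remember the first key seen for an identity, and raise an alarm if it silently changes. A minimal sketch (the keys are made up):

```python
import hashlib

class TofuStore:
    """Trust On First Use: pin the first public key seen for each
    nomadic identity, and reject silent key changes afterwards."""

    def __init__(self):
        self.pins = {}  # identity -> pinned key fingerprint

    @staticmethod
    def fingerprint(public_key: bytes) -> str:
        return hashlib.sha256(public_key).hexdigest()[:16]

    def check(self, identity: str, public_key: bytes) -> bool:
        fp = self.fingerprint(public_key)
        pinned = self.pins.setdefault(identity, fp)  # first use: pin it
        return pinned == fp

store = TofuStore()
assert store.check("southerntofu@tildegit.org", b"fake-key-A")      # pinned
assert store.check("southerntofu@tildegit.org", b"fake-key-A")      # same key
assert not store.check("southerntofu@tildegit.org", b"fake-key-B")  # alarm!
```

A real client would additionally accept a new key when the change is signed by the previously-pinned one, which is how legitimate rotations differ from impersonation.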
In practice, there would be a lot more subtlety to represent actual mapping between projects (mirrors), and to map additional keypairs on p2p networks (such as radicle) to a single identity. However, a nomadic identity system doesn't have to be much more complex than that.
The more interesting implementation concern is how to store, update and retrieve information about a nomadic identity. With the current ZOT implementations, identities are stored as signed JSON blobs that you retrieve opportunistically from a remote server. However, that means that if all of your declared servers are offline (for instance, if there's only one of them), one cannot automatically discover updates to your nomadic identity (new forges to find your projects).
I believe a crypto-secure, decentralized naming system such as [GNS](https://gnunet.org/en/gns.html) or [IPNS](https://docs.ipfs.io/concepts/ipns/) would greatly benefit the nomadic identity experience. DNS could also be used here, but as explained before, DNS is highly vulnerable to determined attackers. Introducing DNS as discovery mechanism for nomadic identities would weaken the whole system, and make it much harder to get rid of in the future (for backwards-compatibility).
With GNS/IPNS (or any other equivalent system), people would only need to advertise their public key on every forge, and the nomadic identity mapping would be fetched in a secure manner. Considering GNS is in fact a signed and encrypted peer-to-peer key-value store itself, we could use GNS itself to store nomadic identity information (using well-known keys). IPNS, on the other hand, only contains an updatable pointer to an IPFS content-addressed directory. In this case, we would use well-known files within the target directory.
So, migration and failover across forges should also be feasible, despite other challenges not presented here, such as how to ensure consistency across decentralized mirrors, and what to do in case of conflicts.
# Social recovery and backup mechanisms
Introducing strong cryptographic measures as proposed in the previous sections introduces new challenges, in particular about account recovery mechanisms in case of lost key/password. In the case of federated forges, for instance, the server does not have access to users' private keys and therefore cannot help users recover their account.
That's a good thing, because it means a compromised server cannot compromise its users, whether willingly or not. However, this is counterintuitive for most users, who are used to password reset forms and are (understandably) pissed to lose everything simply because they forgot a password.
To work around this issue, many encrypted service providers offer recovery codes that you should write down and store safely, enabling you to regenerate your private key and/or reset your password in the future. However, wherever you store this highly-secret recovery code can be compromised (eg. by physical intrusion into your home), and in any case, losing access to this recovery code will leave you helpless.
That's why the hosting cooperative Riseup also enables account recovery via email, but whatever recovery mechanism you use, [your entire mailbox will be deleted](https://support.riseup.net/en/knowledgebase/1-accounts/docs/14-how-can-i-access-my-account-if-i-forgot-my-password). In this case, the recovery mechanism will let you recover your identity, but without compromising you and other people by revealing your secrets to a potential attacker.
Another approach to this problem is [Shamir's secret sharing scheme](https://en.wikipedia.org/wiki/Shamir%27s_Secret_Sharing) (ssss), which divides a secret into several parts, a certain number of which have to be reunited to recover the secret. These smaller parts of the secret can then be encrypted for people you trust, and kept safely with them.
![Shamir's secret sharing scheme illustration, CC BY Raylin Tso, Zi-Yuan Liu and Jen-Ho Hsiao](Shamirs-secret-sharing-scheme.ppm.png)
Shamir's approach is interesting and can be successful because it considers the problem of account recovery from a trust perspective: how can we empower users to delegate trust to people of their choice, and not to a single third-party service which could compromise them? Shamir's algorithm also successfully accounts for a trusted party becoming unavailable or uncooperative, by not requiring that all parties agree to recover the secret (only a configurable fraction of them). In short, it [allows the employment of not fully trusted people](http://point-at-infinity.org/ssss/).
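For the curious, the core of Shamir's scheme fits in a few lines: the secret is the constant term of a random polynomial over a finite field, each share is a point on that polynomial, and any k points uniquely determine it via Lagrange interpolation. A toy sketch for integer secrets (real tools like [ssss](http://point-at-infinity.org/ssss/) add proper encoding and handle arbitrary data):

```python
import secrets as rng

# All arithmetic happens modulo a prime; this Mersenne prime
# accommodates secrets up to 127 bits.
PRIME = 2**127 - 1

def split(secret: int, n: int, k: int):
    """Split `secret` into n shares; any k of them recover it."""
    # Random polynomial of degree k-1 with the secret as constant term.
    coeffs = [secret] + [rng.randbelow(PRIME) for _ in range(k - 1)]
    def f(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n + 1)]

def recover(shares):
    """Lagrange interpolation at x=0 yields the constant term (the secret)."""
    result = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * -xj % PRIME
                den = den * (xi - xj) % PRIME
        result = (result + yi * num * pow(den, -1, PRIME)) % PRIME
    return result

shares = split(secret=2**100 + 7, n=5, k=3)
assert recover(shares[:3]) == 2**100 + 7  # any 3 of the 5 shares suffice
```

Each share can then be encrypted for a different trusted person; with a 3-of-5 setup, losing contact with two of them still leaves recovery possible.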
In this area also, we need research and work on better tooling focused on user experience. Conceiving of availability/recovery issues as social problems opens the door to solving many problems, not just for end-users in a federated/peer-to-peer setup. For instance, tooling that enables sysadmins to perform/receive backups in a socially-aware manner, like [Yunohost does](https://github.com/YunoHost-Apps/borg_ynh/blob/master/README.md), goes a great deal towards improving the reliability of services for resource-constrained sysadmins.
# Conclusion
Decentralized forging is, in my view, the top priority for free software in the coming decade. The Internet and free software have a symbiotic relationship where one cannot exist without the other. They are two facets of the same software supply chain, and any harm done to one side will have negative consequences for the other. Both are under relentless attack by tyrants (including pretend-democracies like France or the USA) and multinational corporations (like Microsoft and Google). Developing decentralized forging tooling is the only way to save free software and the Internet, and may even lower the barrier to contribution for smaller community projects.
Of course, decentralized forging will not save us from the rise of fascism in the physical and digital world. People will have to stand up to their oppressors. Companies and State infrastructure designed to destroy nature and make people's lives miserable will have to burn. And we have to start developing right now the world we want to see for the future.
« We are not in the least afraid of ruins. We are going to inherit the earth; there is not the slightest doubt about that. The bourgeoisie might blast and ruin its own world before it leaves the stage of history. We carry a new world here, in our hearts. That world is growing in this minute. » -- [Buenaventura Durruti](https://libcom.org/library/durruti-spanish-revolution)
⁽¹⁾ Microsoft has tried to buy themselves a free-software friendly public image in the past years. This obvious [openwashing](https://en.wiktionary.org/wiki/openwashing) process has consisted in bringing free software to their closed platform (eg. [Windows Subsystem for Linux](https://en.wikipedia.org/wiki/Windows_Subsystem_for_Linux)), and open-sourcing a few projects that they could not monetize (eg. [VSCode](https://en.wikipedia.org/wiki/Visual_Studio_Code)), while packaging them with spyware (telemetry). Furthermore, Microsoft has been known for decades to cooperate with nation States and the [military industrial complex](https://en.wikipedia.org/wiki/Military%E2%80%93industrial_complex).
⁽²⁾ Catalonia has a long history of oppression and repression by the spanish State. Microsoft is just the latest technological aid for that aim, just like Hitler and Mussolini in their time provided weapons to support Franco's coup (in 1936), and crush the democratically-elected social front, then the social revolution.
⁽³⁾ Cheap electricity on the other side of the world is the only reason we have cheap hardware. It takes considerably more energy to produce a low-powered device like a Raspberry Pi than to keep an old computer running for decades, but the economic incentives are not aligned.
⁽⁴⁾ In many countries, sharing copyrighted material in a restricted circle is legal as long as you obtained the material in a legal manner. For example, copying a book or DVD obtained from a library is legal. Throughout the french colonial empire, this is defined by the right to private copy ([copie privée](https://fr.wikipedia.org/wiki/Copie_priv%C3%A9e)).

content/blog/decentralized-forge/index2.md
+++
title = "Decentralized forge: distributing the means of digital production"
date = 2020-11-20
+++
This article began as a draft in february 2020, which i revised because the Free Software Foundation is looking for feedback on their high-priority projects list.
Our world is increasingly controlled by software. From medical equipment to political repression to interpersonal relationships, software is eating our world (TODO: i don't remember this article was it good?). As luddites and other worker collectives did with previous technologies in past centuries, we've been wondering whether a particular piece of software empowers, or controls, its users. In the software world, this question is often asked through the frame of free software: am i free to study the program, modify it according to my needs and distribute original or modified copies of it?
However, considering individual pieces of software without their broader context is a lack of perspective. Just like in the physical world, the immense human and ecological impact of a product simply cannot be imagined by looking at the final outcome, so does a binary program tell us very little about its goal and the social conditions in which it was conceived and produced.
# Github considered harmful
# The means of production
Imagine there's a single workshop on the entire planet where you can build things. Whether you want to build a shelf or repair your car, you'll find there all the tools you need and a broad community of people who may be able to help you out if you need it. Nothing actually prevents you from crafting something in your own neighborhood, but you probably don't have all the tools you're gonna need, and your neighbors are not much more skilled or equipped than you are.
So you decide you want to go to this famous Github workshop island you keep hearing about. After a long plane flight, you reach this remote island where products are born. Leaving the plane, you are interrogated by Microsoft's immigration services who, after hearing your name and contact information, decide to let you in on the island.
Finally, there you are! Every street is filled with stressed people rushing from one place to another. Some are just standing there shouting their `CHANGELOG`, others are sitting around a beer or a card game arguing endlessly over technical details. You notice some even wear star badges on their clothes as a sign of social distinction for their successful product; these stars, you learn, are offered to people who built something useful for you.
Folks are having a chat in a corner, but suddenly a black van appears out of nowhere. Faceless officers step out of the van, seize all equipment and arrest the chatters in a brutal manner. Lots of people seem concerned, even terrified. A few cry, some scream, others try to articulate words for what's happening before their eyes. Very few dare to hit back: a couple punks here and there throw rocks and glass bottles at those strange officers, then run into the crowd before the cops can get their hands on them. And before we know it, the van is back on its way, already far from our eyes.
As you're not exactly sure what just happened, you ask around. It appears this black van belongs to the political police, and this arrest was not isolated, as several thousand people were detained at the exact same time. Why? It appears they were crafting pens and paper, which are highly illegal due to obvious copyright regulations to prevent ideas from being stolen. At least this time, the issue seems to be dividing the community: some people find it reasonable that people are arrested for practicing their craft, because it's for the good of copyright.
That sounds completely surreal to you, as where you're from everybody has pens and paper, and they're very useful for poetry, design, keeping notes, and many other activities. What the people here call "stealing" ideas doesn't sound like stealing at all, as nobody is ever dispossessed of an idea.
A few years ago, the situation was not as clear. When the police came to arrest Catalan craftspeople and official diplomats, many people were concerned, but simply looked away and resumed their activities. There was very little outrage, despite Microsoft's obvious collaboration with the Spanish colonial republic in oppressing and terrorizing an entire population. Back then, it was as if nobody cared, and few people have even mentioned it since. Don't you think it would make a bad impression to expose your generous host as a supporter of many fascist dictatorships and colonialist pretend-republics?
This particular place is called Github in the digital world.

279
content/blog/decentralized-forge/index3.md

@ -0,0 +1,279 @@
+++
title = "Decentralized forge: distributing the means of digital production"
date = 2020-11-20
+++
[PAD](https://cryptpad.fr/pad/#/2/pad/edit/QXiMGyeOy3M5OcCc-QN8VtDV/)
This article began as a draft in february 2020, which i revised because the Free Software Foundation is looking for feedback on their high-priority projects list.
Our world is increasingly controlled by software. From medical equipment to political repression to interpersonal relationships, software is eating our world (TODO: i don't remember this article was it good?). As the luddites did in other times, we've been wondering whether a particular piece of software empowers its users or controls them. This question is often phrased through a software freedom perspective, as defined by the GNU project (TODO): am i free to study the program, modify it according to my needs and distribute original or modified copies of it?
However, individual programs cannot be studied out of their broader context. In the physical world, the immense human and ecological impact of a product simply cannot be imagined by looking at the final outcome. In the digital world, a binary program tells us very little about its goals and the social conditions in which it was conceived and produced.
There is a growing monopoly (privatisation) on the means of digital production. This can be illustrated by Adobe and other publishers abandoning their licence schemes in favor of monthly subscriptions. Many services now refuse to set up or operate properly without Internet access, so that a remote party can decide at any time to revoke your access to your programs, whether you've paid for them or not. Other examples from many domains can be taken from Cory Doctorow's talk about the current war on general computation (TODO).
A forge is another name for a software development platform. The past two decades have been rich in progress on forging ecosystems, empowering many projects to integrate tests, builds, fuzzing, or deployments as part of their development pipeline. However, in this field too, there is a growing centralization in the hands of a few nefarious corporations.
In this article, i argue decentralized forging is a key issue for the future of free software and non-profit software development that empowers users. It will cover:
- what's wrong with Github and other centralized forging platforms
- why traditional git workflows (git-email) are not for everyone
- how selfhosted forges as they are (Gitlab/Gitea) limit cooperation
- different forms of truth: centralized, consensus and gossip
- how to deal with bad actors spamming your forge
- real-world projects for federated or peer-to-peer forging
- a call for interoperability between decentralized forges
- nomadic identity and censorship-resilience
- code-signing and secure project bootstrapping
# Github considered harmful
More and more, software developers and other digital producers are turning to Github to host their project and handle cooperation across users, groups and projects. Github is a web interface for the git decentralized version control system (DVCS). If you're unfamiliar with git or DVCS in general, i recommend reading [an introduction](TODO) before proceeding with reading this article.
From this web interface, Github enables you to browse issues (sometimes called tickets or bugs) and propose patches (changes) to other projects, or accept/deny those proposed to yours. Github also allows you to define permissions so that only certain groups of users may access the code or push changes. What lies beneath those features is actually handled by the git software itself, which is in no way affiliated to Github.
Github is a web forge, and as such was inspired by many others that came before, like Sourceforge and Launchpad. But these previous forges were simple web interfaces naively exposing git internals: they were great, as long as you were familiar with Directed Acyclic Graphs and git-specific vocabulary (refs, HEAD, branches..). Github, on the other hand, managed to design its interface from the beginning as a collaboration tool that is usable by people unfamiliar with git, if only to submit tickets.
A minor but significant difference in their approach to user experience is that a project's main page on Github displays the repository's folder as well as the README file. In contrast, other solutions displayed a lot more information, but not those. For example, on Inkscape's Launchpad project, it takes me two clicks from the homepage to reach the README file.
While displaying a lot of technical information on the homepage seems like a good idea at first, it's really confusing to whoever has no clue what a *tree* or a *ref* is. Technical details bring only moderate value for developers familiar with the tooling (who can use the git CLI), but may considerably reduce the signal/noise ratio for newcomers. Moreover, every project is structured differently, and a rigid, uniform page cannot possibly reflect that.
Placing the README file on the project's homepage is not a secondary concern. It enables direct communication from project developers to end-users, which is what README files were invented for. Whether those end-users need prior knowledge of other tools and jargon to understand the README is up to you, but Github does not place the bar higher than it needs to be. Also, rendering Markdown to HTML allows simpler projects to do without a dedicated website, and use their project repository as documentation.
But not all is so bright with Github, and from its very first days some serious critiques have emerged. First and foremost, Github is a private company trying to make money on other people's work. They are more than happy to leverage the whole free-software ecosystem and to provide forging services for it, but have always refused to publish their own sourcecode. Such hypocrisy has long been criticized by the Free Software Foundation (TODO).
Github has also been involved in very questionable political decisions. This is especially true since they were bought by Microsoft, one of the worst digital empires conspiring against our freedom. ⁽¹⁾ If your program is in any way interacting with copyrighted materials (such as Popcorn Time or youtube-dl), it may be taken down from Github at any time. The same goes if you or your project displeases a colonial empire (or pretend-republic) such as the United States or Spain: Iranian people have been entirely banned from Github, and an application promoting Catalan independence has been removed. ⁽²⁾
As we can observe, giving an evil multinational life-or-death power over many projects is doomed to end in catastrophe. No single entity should ever have the power to dictate who can participate in a project, or what this project may be.
# git is already decentralized, but
To answer both the enthusiasm and the critique of Github, much of the free-software community argued that git is already decentralized, and therefore we don't need a platform like Github. git is a decentralized version control system: that means every contributor has a complete copy of the repository, and can work on their own without a central server, contrary to previous systems such as svn (TODO).
Due to this, git was conceived to synchronise from/to multiple remote repositories, called "remotes". So even if you use Github, you can at any time mirror or import your repository to another git host, or even setup your own.
Setting up your own git server doesn't have to be complicated, and offers a lot of flexibility in the workflow, thanks to the exposed [git hooks](TODO). What's more complicated is providing a decent and coherent user experience for other contributors. The git server does not manage patches and tickets: it's a simple versioned file store where a defined set of people can push changes, and little more.
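As an illustration, a bare repository with an executable hook is already a minimal self-hosted git server. The sketch below uses a throwaway directory and a made-up project name; the hook body is purely illustrative:

```shell
# A minimal self-hosted git remote: a bare repository plus a post-receive
# hook (a sketch; the project name and hook body are illustrative).
repo=$(mktemp -d)/myproject.git
git init -q --bare "$repo"
cat > "$repo/hooks/post-receive" <<'EOF'
#!/bin/sh
# Runs after every push: a forge could trigger builds or deploys from here.
echo "Thanks for pushing to myproject!"
EOF
chmod +x "$repo/hooks/post-receive"
# Contributors would then add it as a remote, e.g. over SSH:
# git remote add upstream ssh://user@example.org/srv/git/myproject.git
```

The whole "server" is a directory any SSH-accessible machine can serve; everything beyond that (tickets, patches, permissions) is what forges layer on top.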
Historically, git was developed for the Linux kernel community, which places email at the core of its cooperation workflow. Bugs are submitted and commented on a Mailing List, and so are patches. git even ships a few subcommands for such a workflow: git-am, git-send-email..
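The whole round-trip can be exercised locally as a sketch: a contributor turns a commit into a mail-formatted patch, and a maintainer applies it with `git am`. In a real exchange the patch would travel through `git send-email` and a mailing list; all paths and identities below are throwaway examples:

```shell
set -e
tmp=$(mktemp -d)
# A tiny "upstream" repository the maintainer controls:
git init -q "$tmp/upstream"
git -C "$tmp/upstream" -c user.name=Maint -c user.email=maint@example.org \
    commit -q --allow-empty -m "Initial commit"
# A contributor clones it and commits a change:
git clone -q "$tmp/upstream" "$tmp/contrib"
echo "fix" > "$tmp/contrib/file.txt"
git -C "$tmp/contrib" add file.txt
git -C "$tmp/contrib" -c user.name=Contrib -c user.email=contrib@example.org \
    commit -q -m "Add file.txt"
# Export the commit as a mail-formatted patch (git send-email would mail it):
git -C "$tmp/contrib" format-patch -1 -o "$tmp/outbox"
# The maintainer applies the "received" patch, preserving authorship:
git -C "$tmp/upstream" am -q "$tmp"/outbox/0001-*.patch
git -C "$tmp/upstream" log --format='%an %s' -1
```

Note that `git am` keeps the contributor's name on the commit, which is exactly what mailing-list-driven projects rely on for attribution.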
However, using git with email is not for everyone. Beyond simple CLI commands you could just memorize, the email workflow is intended to be integrated in your development environment. But most mail clients do not integrate such a workflow, and the best solution seems to be using a dedicated mail client for this purpose, such as [aerc](TODO).
While CLI-savvy users might install a new mail client just for collaborating with other folks on a project, this is beyond the reach of most people. To be clear, i don't think there's anything wrong with specific tools tailoring to a specific audience. But DVCS like git are employed by people from all backgrounds, including designers, translators, editors.. Many of these people will not bother to learn two tools and a bunch of keyboard shortcuts at the same time just to start contributing to a project.
Actually, they may do so if they understand the benefits of this approach, and find a tutorial in their own language that explains just how to achieve a consistent git email workflow that's well-integrated into their usual environment. Maybe if there was a well-known file within a repository describing ticketing/patching workflow in a machine-readable manner, then we could have many mail clients implementing this standard to facilitate cooperation.
But as of now, I don't know of any simple starter kit for newcomers to get started with git-email workflows without having to learn a whole bunch of other concepts, and this places the bar to contribution higher than it should be. Implicitly expecting new contributors to understand a developer-oriented workflow, or have a couple hours to learn how to send a simple patch, simply means fewer contributors.
# Smaller walled gardens is a burden for users
Over the past decade, selfhosted alternatives to Github have been developed. Gitlab, Gogs and Gitea (to name a few) have contributed a great deal to the free-software ecosystem, by empowering communities who were critical of Github to setup their own modern forging platform. For example, [Debian](https://debian.org/) has an ongoing love affair with Gitlab on their [*Salsa* instance](https://salsa.debian.org/).
Turnkey selfhosted forges are a great tool to avoid Github for big organizations, with a lot of subprojects and contributors, who need to selfhost their whole infrastructure for reliability or security concerns. However, as these solutions were modeled for this specific usecase, they have one major drawback compared to a git-email workflow.
As the forge is usually shut off from the outside world, cooperation between users is only envisioned in a local context. Two users on [Codeberg](https://codeberg.org/) may cooperate on a project, but a user from [tildegit](https://tildegit.org/) or [0xacab](https://0xacab.org/) may not. As it stands, a user from another forge would have to create a new account on your forge to participate in your project. Some may argue that this approach is a feature and not a bug, for three reasons.
First, because it makes it easier to enforce server-wide rules/guidelines and access roles. While this may be an advantage in corporate settings (or for big community projects), it does not apply to popular hacking usecases, where all users are treated equally, and project owners individually setup access roles for their project (not server-wide).
Second, having many accounts over many servers makes it harder to lose all your accounts at once, due to censorship, compromise or other discontinuation of server activities. This is a good point, but i would argue it can be tackled in more comprehensive and user-friendly ways through easier project migration and nomadic identity systems, as will be explained further in this article.
Third, because having one account per server is usually not a problem, since you only contribute to so many projects. This last argument, in my view, is very misinformed. Although a person may only contribute seriously and frequently to a certain number of projects, we all use many more software projects in our day-to-day lives. Bug reporting outside of privacy-invading telemetry is often a tedious process: create a new account on a bugtracker, read bug reporting guidelines, figure out the bugtracker's syntax for links/screenshots, and finally submit a bug.
The bug reporting workflow as achieved by Github and mailing lists is more accessible: use your usual account, and the interface you're used to, to submit bugs to many projects. Any project you'd like to contribute to that uses Mailing Lists for cooperation, you can contribute to from your usual mail client. Your bug reports and patches may be moderated before they appear publicly, but you don't have to create a new account and learn a new workflow just to submit a bug report.
Also, different projects may expect different formats of bug reports. For example, the [debbugs](TODO) bugtracker for Debian expects email formatted in a specific way in order to assign them to the corresponding projects/maintainers. In that, it is a bug-reporting protocol built on top of mail technology, but to my knowledge there is no standard document describing this protocol and no other implementation.
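For illustration, a debbugs-style report is an ordinary plain-text mail whose body begins with pseudo-headers such as `Package:` and `Version:`, which the bugtracker parses to route the report. The package, version and report contents below are only examples:

```shell
# Print a debbugs-style bug report; in real use this would be sent by mail
# to submit@bugs.debian.org (package and contents are illustrative).
report=$(cat <<'EOF'
To: submit@bugs.debian.org
Subject: inkscape: crashes when opening large SVG files

Package: inkscape
Version: 1.0.1-1
Severity: normal

Steps to reproduce: open an SVG file larger than a few hundred MB, then ...
EOF
)
echo "$report"
```

The machine-readable part is only those first body lines; the rest is free-form prose, which is what makes the protocol easy for humans but undocumented for implementers.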
Github projects, on the other hand, worry more about the actual contents of the bug report. That is because semantic actions (bugreport metadata) are already handled by Github at the web request level. For bug report formatting, Github lets you write a `BUG.md` (TODO) template at the root of your project (alongside the README file). This template is presented to users submitting a bug. This lets users figure out what you're expecting from a bug report in certain circumstances (version/stacktrace/etc), but still allows them to write their own text (disregarding the template) when they feel it's more appropriate.
So, while selfhosted forges have introduced a lot of good stuff, they have broken the user expectation that you can contribute to any project from your personal account. Creating a new account for every piece of software you'd like to report bugs (or submit patches) to is not a user-friendly approach.
This phenomenon was already observed in a different area, with selfhosted social networks: Elgg could never replace Facebook entirely, nor could Postmill/Lobsters replace Reddit, because participation was restricted to a local community. In some cases it's a feature: a family's private social network should not connect to the outside world, and a focused and friendly community like [raddle.me](https://raddle.me/) or [lobsters](https://lobste.rs/) may wish to preserve itself from nazi trolls from other forums.
But in many cases, not being able to federate across instances (across communities) is a bug. I would argue such selfhosted services cater to niche usecases, not because they're too different from Facebook/Reddit, but because they're technically so similar to them. In copying/reimplementing "upstream" features ⁽³⁾, the user management aspects of a centralized system were carried over as well.
So, instead of dealing with a gigantic walled garden (Github), or a wild jungle (mailing lists), we now end up with a collection of tiny closed gardens. The barrier to entry to those gardens is low: you just have to introduce yourself at the front door and define a password. But this barrier to entry, however low it is, is too high for most non-technical users to feel comfortable submitting bug reports to all the projects they make use of.
I suspect that for smaller volunteer-run projects, the ratio of bug reporters to code committers is much higher on Github and development mailing lists than it is on smaller, selfhosted forges. If you think that's a bad thing, try shifting your reasoning: if only people familiar with programming are reporting bugs, and your project is not only aimed at developers, it means most of your users are either taking bugs for granted, or abandoning your project entirely.
# Centralized trust, consensus and gossip
One of the hard problems in computing is establishing trust. When we're fetching information from a project, how do we ensure we have the correct information?
In traditional centralized and federated systems, we rely on location-addressed sources of trust. We define *where* to find reliable information about a project (such as a git remote). To ensure authenticity of the information, we rely on additional security layers:
- Transport Layer Security (TLS) or Tor's [onion services](TODO) to ensure the remote server's authenticity, that is to make it harder for someone to impersonate a forge to serve you malicious updates
- Pretty Good Privacy (PGP) to ensure the document's authenticity, that is to make it harder for someone who took control of your account/forge to serve malicious updates
How we bootstrap trust (from the ground up) for those additional layers, however, is not a simple problem. Traditional TLS setup can be abused by any member of the Certificate Authorities cartel, while onion services and PGP require prior knowledge of authentic keys (key exchange). With the [DANE](TODO) protocol, we can bootstrap TLS keys from the DNS instead of the CA cartel. However, this is still not supported by many clients, and in any case is only as secure as DNS itself. That is, very insecure even with DNS security extensions (DNSSEC). For a location-based system to be secure, we need a secure naming system like the [GNU Name System](TODO) to enable further key exchange.
These difficulties are inherent properties of location-addressed storage, in which we describe *where* is a valid source of the information we're looking for, which requires additional security measures. Centralized and federated systems are by definition location-addressed systems. Peer-to-peer systems, on the other hand, don't place trust in specific entities. In decentralized systems, trust is established either via consensus or gossip.
Consensus is an approach in which all participating peers should agree on a single source of truth. They take votes following a protocol like Raft, and the majority wins. In smaller, closed-off systems where a limited number of people control the network, consensus is achieved by acknowledging a limited set of peers. These approved peers can either be manually defined (static configuration), or be bootstrapped from a centralized third party such as a local certificate authority controlled by the same operators.
But these traditional consensus algorithms do not work for public systems. If anyone can join the network and participate in establishing consensus (not just a limited set of peers), then anyone may create many peers to try and take control of the consensus. This is often known as a Sybil attack (TODO) or 51% attack.
This problem has spawned two approaches: Proof of Work and gossip.
Proof-of-Work (PoW) is consensus achieved through raw computational power. PoW systems such as Bitcoin consider that, out of the global computing power in a given network, the peers representing the majority of the computing power must be right. While this approach is very interesting conceptually and was a mathematical achievement, it leads to terrible consequences like Bitcoin using on its own more electricity than many countries. I repeat, a single application is responsible for several percents of global electricity usage.
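The core idea of PoW can be sketched in a few lines: brute-force a nonce until the hash of the data plus the nonce meets a difficulty target. The target here is trivially small, which is exactly the point: real systems demand many more leading zeroes, and that is where the enormous energy cost comes from.

```shell
# Toy proof-of-work: find a nonce such that sha256(data + nonce) starts
# with "00" (a real network would require a far harder target).
data="some-block-data"
nonce=0
while true; do
  prefix=$(printf '%s%s' "$data" "$nonce" | sha256sum | cut -c1-2)
  [ "$prefix" = "00" ] && break
  nonce=$((nonce + 1))
done
echo "found nonce=$nonce (hash prefix $prefix)"
```

Verifying the result takes one hash; finding it takes hundreds here, and astronomically more at real difficulty — that asymmetry is what makes the majority-of-compute consensus work.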
Gossip is a conceptual shift, in which we explicitly avoid global consensus, because as we've seen, establishing consensus is hard. Instead, each peer has their own truth/view of the network, but can ask other peers for more information. Gossip is closer to how actual human interactions work: my local library may not have all books ever printed, but whatever i find in there i can share with my friends and neighbors.
Authenticity in gossip protocols is also provided by asymmetric cryptography (like PGP), but gossip protocols usually employ cryptographic identifiers (tied to a public key) to designate users. In addition, all messages across the network are signed, so that any piece of content can be mapped to a unique identity and authenticated.
Gossip can be achieved through any channel. Usually, it involves USB keys and local area networks (LAN). But nothing prevents us from using well-known locations on the Internet to exchange gossiped information, much like a newspaper or community center would achieve in the physical world. That's essentially what the [Secure ScuttleButt](TODO) (SSB) protocol is doing with its *hubs*, or what the PGP [Web of Trust](TODO) is doing with *keyservers*.
In my view, gossip protocols include IPFS and Bittorrent. Don't be surprised: a Distributed Hash Table (a distributed content-discovery database) is a form of globally-consistent gossip. It's rather similar to a blockchain, in that you can query any peer about specific information. However, in a Bitcoin-style blockchain, every peer needs to know about everything in order to ensure consistency. In a DHT, no peer knows about everything (reducing requirements to join the DHT), and consistency is ensured by content addressing (checksumming the information stored).
That means although a DHT is partitioned across several peers who each have their view of the network, it is built so that peers will help you find information they don't know about, and checking information correctness (detecting bad actors) is not hard. When you load a [magnet:](TODO) URL in your Bittorrent client, it loads a list of DHT peers and asks them about what you're looking for (the checksum in the magnet link). If these peers have no idea what piece of content you're talking about, they may point you to other peers who may help you find it. In that, i consider DHTs to be some form of global crypto-gossip protocols.
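git's own content addressing is easy to verify by hand: a blob's name is the SHA-1 of a small `type length` header followed by its content, so the same content gets the same name on any machine, with no authority involved:

```shell
# A git object name is a checksum of its content (plus a type/length
# header), recomputable by anyone:
echo 'hello' | git hash-object --stdin
# git stores the blob as "blob 6\0hello\n"; hashing that by hand gives
# the same digest, ce013625030ba8dba906f756967f9e9ca394464a:
printf 'blob 6\0hello\n' | sha1sum
```

This is the property that lets DHTs and tools like git-ipfs serve git objects from untrusted peers: the name itself proves the content is correct.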
Although it's not a widely-researched topic, it seems [IPv6 multicast](TODO: link to conf) could be used to make gossiping a network-level concern. This would avoid the overhead of keeping a local copy of all content you encounter simply to propagate it (a common criticism of SSB). In such a hypothetical setup, one could advertise all new content to a broader audience, while choosing to keep a local archive of select content they may want to access again later. If you're interested in this, be sure to check out a talk called [Privacy and decentralization with Multicast](TODO).
# Dealing with bad actors and malicious activity
One may object that favoring interaction across many networks will open big avenues for malicious activity. However, i would argue this is far from the truth. In practice, we have decades of experience from the email community about how to protect users from spam and malicious activities.
Even protocols that initially discarded such concerns as secondary are eventually rediscovering [rate-limiting](TODO), [webs of trust](TODO), and user-overridable allow/deny lists on the server level. A talk entitled [Architectures of Robust Openness](TODO) from the latest ActivityPub conference touches on those topics.
Another concern with authenticated, decentralized forging is [repudiation](TODO) and [plausible deniability](TODO). For example, if you commit a very secret piece of information to your forge, how can you take it back? Or if you publish some information that displeases higher powers, how to pretend you are not responsible for it? This is a hard problem to tackle.
In secure instant messaging systems (such as OTR/Signal/OMEMO encryption), encryption keys are constantly rotated, and previous keys are published. By design, this allows anyone to impersonate anyone from the past. This way, there is no way you can be proven to be responsible for a past message (plausible deniability), because anyone could have forged it. Lately, people from the email ecosystem have called on server operators to publish their previous DKIM keys. This would enable plaintext emails to be plausibly denied, while retaining authenticity for PGP-signed emails.
However, what works for private communications may not be suited to public cooperation. I do not know of any way to achieve plausible deniability, repudiation and authentication in a public context. If you have readings on this subject, please send me an email and i will update the article.
Note also that forbidding Tor or certain IP ranges is NOT protection from malicious activity, because the most malicious actors have almost unlimited resources; such blanket bans mostly hurt legitimate users.
# What's happening in the real world
So, i think we've talked enough about theoretical approaches to federated/decentralized systems and some of their properties. Now, it's time to take a look at projects people are actually working on.
## Federated authentication
Some forges like Gitea propose OpenID Connect federated authentication: you can use any OpenID Connect account to authenticate against a selfhosted forge. Previous OpenID specifications required a specific list of servers you allowed authentication from: "login with microfacegoople". OpenID Connect is a newer standard which features a well-known endpoint discovery mechanism, so the software can detect the authentication server for a given domain, and start authenticating a user against it.
So, whether you're signing up for the first time or signing in, you need to give the forge your OpenID Connect server. You will then be redirected to this OpenID server, authenticated (if you are not yet logged in) and prompted whether you want to login on this forge. If you accept, you will be redirected to the forge, and it will know the OpenID server vouched for your identity.
This approach to decentralized forging is brilliant because it's simple and focuses on the practicality for end-users. However, it does not solve the problem of migrating projects between forges.
## Federated forging
Federated forging relies on forging vocabulary exchanged over established federation protocols. That means, a whole ecosystem of clients, servers and protocols is reused as a basis for forging systems. This is exemplified by the ForgeFed and Salut-à-Toi projects.
ForgeFed is a forging vocabulary for the ActivityPub federation protocol (the fediverse). It has a proof-of-concept implementation (TODO) and aims to be implementable for any web forge. However, despite some interesting discussions on their forums, there seems to be little implementation-focused work at the moment.
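To give an idea of what such a vocabulary looks like on the wire, a federated "open a ticket" activity could look something like the ActivityStreams object below. Field names loosely follow the ForgeFed draft (which does define a `Ticket` type), but every URL, actor and value here is made up for illustration:

```shell
# Sketch of a ForgeFed-style activity as it might travel between two
# federated forges (all identifiers are hypothetical):
activity=$(cat <<'EOF'
{
  "@context": ["https://www.w3.org/ns/activitystreams",
               "https://forgefed.org/ns"],
  "type": "Create",
  "actor": "https://forge.example.org/users/alice",
  "to": "https://otherforge.example.net/projects/myproject",
  "object": {
    "type": "Ticket",
    "attributedTo": "https://forge.example.org/users/alice",
    "summary": "Crash when opening large files",
    "content": "Steps to reproduce: ..."
  }
}
EOF
)
echo "$activity"
```

Because the envelope is plain ActivityPub, any fediverse server that understands the extra vocabulary could receive, store or forward such a ticket.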
Salut-à-Toi (TODO), on the other hand, is an actual suite of clients (a library with several frontends) for the Jabber federation (XMPP protocol). It was a client project from the beginning, and only started to implement forging features two years ago. While still a proof-of-concept, it's reliable enough for the project to be selfhosted. In this context, *selfhosted* means that salut-à-toi is the forging software used to develop the project itself.
TODO: salut à toi forging screenshot
While such features are not implemented yet, the fact that these federated forges rely on a standard vocabulary would theoretically enable migration between a lot of forges, without having to use custom APIs for every forge, as is common for Github/Sourceforge/etc migrations.
Also, as the user interactions themselves are federated, and not just authentication, folks may use their client of choice to contribute to remote projects. There would be fewer concerns about color themes or accessibility on the server side, because all of these questions would be addressed on the client side. This is a very important property for accessibility, ensuring your needs are covered by the client software, and that a remote server cannot impact you in negative ways.
If your email client is hard for you to use, or otherwise unpleasant, you may use any email-compatible client that better suits your needs. With selfhosted, centralized forges, where the client is tightly-coupled to the server, every forge needs to make sure their service is accessible. Every forge you join to contribute to a project can make your user experience miserable. Imagine if you had to use a different user interface for every different server you're sending emails to?!
The same would apply to federated forging, in which your favorite client would allow you to participate in many projects. The server provides function, and your client provides usability on your own terms.
## Blockchain consensus
Apart from sketchy Silicon Valley startups, the consensus approach is only being explored by [Radicle](https://radicle.xyz/), as far as i know. A blockchain is a strange approach for a community-oriented project. However, it appears they intend to harness crypto-speculation to benefit contributors to the free software ecosystem.
TODO: radicle screenshot
I'm tempted to just say *How could this possibly go wrong?!*. After all, remember that Bitcoin was envisioned as a popular value-exchange system outside the reach of bad actors (States, banks), which could be used to empower local communities in their day-to-day business. And look what we got: a global speculation ring consuming vast amounts of resources, controlled by fewer and fewer actors as time goes by, and unusable by ordinary people (because of transaction costs and delays).
But in the end, it all boils down to the political and technical decisions Radicle will make as a community. As this specific community seems composed entirely of good-faith enthusiasts, i wish them all the best, and can only encourage inspiration from, and cooperation with, the broader decentralized forging ecosystem.
On the more exciting side of things, radicle uses strong cryptography to sign commits in a very integrated and user-friendly way. That's a strong advantage over most systems, in which signatures are optional and delegated to third-party tooling that can be hard to set up for newcomers.
## Gossip
Gossip systems are a common approach for decentralized forging. The most recent and most polished attempt at gossiped forging is [git-ssb](TODO) over the Secure ScuttleButt protocol. Other examples are [git-ipfs](TODO) and [Gittorrent](TODO).
TODO: git-ssb screenshot.
Although they're less polished for day-to-day use, these projects are very interesting. They helped pave the way for research into decentralized forging by showing that git and other distributed version control systems (DVCS) play well with content-addressed storage, given that commits themselves are content-addressed (a commit name is a mathematical checksum of everything it contains).
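As a concrete illustration, a git object name really is nothing more than a checksum over the object's bytes. Assuming `git` and `sha1sum` are installed, both commands below print the same identifier:

```shell
# A blob's ID is the SHA-1 of a short header followed by the file contents:
printf 'hello world\n' | git hash-object --stdin

# The exact same ID can be recomputed by hand, without git: the header is
# "blob <size>\0", and "hello world\n" is 12 bytes long.
printf 'blob 12\0hello world\n' | sha1sum
# both print 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Because the name is derived from the content, any peer on a gossip network can serve the object, and the requester can verify it received the right bytes.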
## Not covered here
Many more projects over the years have experimented with storing forging interactions (metadata like bugs and pull requests) as well-known files within the repository itself. Some of them are specific to git: [git-dit](https://github.com/neithernut/git-dit), [ticgit](https://github.com/jeffWelling/ticgit), [git-bug](https://github.com/MichaelMure/git-bug), [git-issue](https://github.com/dspinellis/git-issue). Others intend to be used with other versioning systems (DVCS-agnostic): [artemis](https://www.mrzv.org/software/artemis/), [bugs-everywhere](https://github.com/kalkin/be), [ditz](https://github.com/jashmenn/ditz), [sit](https://github.com/sit-fyi/issue-tracking).
I will not go into more details about them, because these systems only worry about the semantics of forging (vocabulary), but do not emphasize how to publicize changes. For example, these tools would be great for a single team having access to a common repository to update tickets in an offline-first setting, then merging them on the shared remote when they're back online. But they do not address cooperation with strangers, unless you give anyone permission to publish a new branch to your remote, which is probably a terrible idea. However, that's just my personal, uninformed opinion: if you have counter-arguments about how in-band storage of forging interactions could be used for real-world cooperation with strangers, i'd be glad to hear about it!
Lastly, i didn't mention [Fossil SCM](TODO) because i'm not familiar with it, and from reading the docs, i'm very confused about how it approaches cooperation with strangers. It appears forging interactions are stored within the repository itself, but then does that mean that Fossil merges every interaction it hears about? Or is Fossil only intended for use in a closed team? Let me know if you have interesting articles to learn more about Fossil.
# With interoperability, please
After this brief review of the existing landscape of decentralized forging, i would like to argue for [interoperability](TODO). If you're not familiar with this concept, it's a key concern for accessibility/usability of both physical and digital systems: interoperability is the property whereby two systems addressing the same use cases can be used interchangeably. For example, a broken lightbulb can be replaced by any lightbulb following the same socket/voltage standards, no matter how it works internally to produce light.
In fact, interoperability is the default state of things throughout nature. To make fire, you can burn any sort of wood. If your window is broken and you don't have any glass at hand, you can replace it with any material that prevents air from flowing through. Interoperability is a very political topic, and a key concern to prevent the emergence of monopolies. If you'd like to know more about it, i strongly recommend a talk called [We used to have cake, now we've barely got icing](TODO).
So while the approaches to decentralized forging we've talked about are very different in some regards, there is no technical reason why they could not play well together and interoperate consistently. As a proof of concept, the [git-issue](TODO) tool mentioned in the previous section can actually synchronise issues stored within the repository with Github and Gitlab issues. It could just as well synchronise with any selfhosted forge (federated or not), or publish the issues on the radicle blockchain.
The difference between federated and p2p systems is big, but [hybrid p2p/federated systems have a lot of value](TODO: my own article to finish). If we develop open standards, there is no technical barrier for a peer-to-peer forge to synchronise with a federated web/XMPP forge. It may be hard to wrap one's head around, and may require a lot of work for implementation, but it's entirely possible. Likewise, a federated forge could federate both via ForgeFed, and via XMPP. And it could itself be a peer in a peer-to-peer forge, so that pull requests submitted on Radicle may automatically appear on your web forge.
Not all forges have to understand each other. But it's important that we at least try, because the current fragmentation across tiny different ecosystems is hostile to new contributions from people who are used to different workflows and interfaces.
Beyond cooperation, interoperability would also ease backups, forks and migrations. Migrating your whole project from one forge to another would only take a single, unprivileged action. When forking a project, you would have the choice whether or not to inherit all of its issues and pull requests. If you're only forking to work on a single patch, you would discard them; but to take over an abandoned project, you would inherit all of the project's history and discussions, not just the commits.
You may have noticed i did not mention the email workflow in this section about interoperability. That's because email bugtracking and patching is far from being standardized. An issue tracker like debbugs could rather easily be interoperated with, because it has a somewhat-specified grammar for interacting with tickets. But what about less specified workflows? My personal feeling is that these different workflows should be standardized.
Many vocabulary and security concerns expressed in this article would equally apply to email forging. But to be honest, i'm not knowledgeable enough about email-based forging to provide a good insight on this topic. I'm hoping people from the [sourcehut forge community](https://sourcehut.org/) and other git-email wizards can find inspiration in this call to decentralized forging, come around the table, and figure out clever ways to integrate into the broader ecosystem.
# Code signing and nomadic identity
So far, i've talked about different approaches to decentralized forging and how they could interoperate. However, one question i've left in the cupboard: how do we ensure the authenticity of interactions across different networks?
Code signing in forging usually uses PGP keys and signatures to authenticate commits and *refs*. In most cases, it is considered a DVCS-level concern and is left untouched by the forge, except maybe to display a symbol for a valid signature alongside a commit. While we may choose to trust the forge regarding commit signatures, we may also verify these on our end. The tooling for verifying signatures is lacking, although there has been recent progress with the [GNU Guix project](https://guix.gnu.org/) releasing the amazing [guix git authenticate](TODO) command for bootstrapping a secure software supply chain.
However, forging interactions such as issues are typically unsigned, and cannot be verified. In systems like ActivityPub and radicle, these interactions are signed, but with varying levels of reliability. While radicle has strong security guarantees because every client owns their keys, email/ActivityPub let the server perform signatures for the users: a compromised server could compromise many users, and therefore such signatures are unreliable from a security perspective. We could take this into consideration when developing forging protocols, and ensure we can embed client-held signatures (like PGP) into interactions.
For interoperability concerns, each forge could implement different security levels, and let maintainers choose the security properties they expect for external contributions, depending on their practical security needs. A funny IRC bot may choose to emphasize low-barrier contribution across many forges over security, while a distribution may enforce stricter security guidelines, allowing contributions only from a trusted webforge and PGP-signed emails. In any case, we need more user-friendly tools for making and verifying signatures.
Another concern is how to deal with migrations. If my personal account is migrated across servers, or i'm rotating/changing keys, how do i let others know about it in a secure manner? In the federated world, this concern has been addressed by the ZOT protocol, which was initially developed for Hubzilla's nomadic identity system. ZOT lets you take your content and your friends to a new server at any given moment.
This is achieved by adding a crypto-identity layer around server-based identity (`user@server`). This crypto-identity, corresponding to a keypair (think PGP), is bootstrapped in a [TOFU](TODO) manner (Trust On First Use) when federating with a remote user on a server that supports the ZOT protocol. The server will give back the information you requested, and let you know the nomadic identity keypair for the corresponding user. Then, you can fetch the corresponding ZOT profile from the server to discover other identities signed with this keypair.
For example, let's imagine for a second that [tildegit.org](https://tildegit.org/) and [framagit.org](https://framagit.org/) both supported the ZOT protocol and some form of federated forging. My ZOT tooling would generate a keypair that would advertise my accounts on both forges. When someone clones one of my projects, their ZOT-enabled client would save this identity mapping somewhere. This way, if one of the two servers ever closes, the client would immediately know to try and find my project on the other forge.
In practice, there would be a lot more subtlety to represent actual mapping between projects (mirrors), and to map additional keypairs on p2p networks (such as radicle) to a single identity. However, a nomadic identity system doesn't have to be much more complex than that.
The more interesting implementation concern is how to store, update and retrieve information about a nomadic identity. With the current ZOT implementations (to my knowledge), identities are stored as signed JSON blobs that you retrieve opportunistically from a remote server (TOFU). However, that means that if all of your declared servers are offline (for instance, if there's only one of them), i cannot automatically discover your updated nomadic identity (with your new forge servers).
I believe a crypto-secure, decentralized naming system such as [GNS](TODO) or [IPNS](TODO) would greatly benefit the nomadic identity experience. DNS could also be used here, but as explained before, DNS is highly vulnerable to determined attackers. Introducing DNS as a discovery mechanism for nomadic identities would weaken the whole system, and make it much harder to get rid of in the future (for backwards-compatibility reasons).
With GNS/IPNS (or any other equivalent system), people would only need to advertise their public key on every forge, and the nomadic identity mapping would be fetched in a secure manner. Considering GNS is in fact a signed and encrypted peer-to-peer key-value store itself, we could use GNS itself to store nomadic identity information (using well-known keys). IPNS, on the other hand, only contains an updatable pointer to an IPFS content-addressed directory. In this case, we would use well-known files within the target directory.
So, migration and failover across forges should also be feasible, despite other challenges not presented here, such as how to ensure consistency across decentralized mirrors, and what to do in case of conflicts.
# Conclusion
Decentralized forging is in my view the top priority for free-software in the coming decade. The Internet and free-software have a symbiotic relationship where one cannot exist without the other. They are two facets of the same software supply chain, and any harm done on one side will have negative consequences on the other. Both are under relentless attack by tyrants (including pretend-democracies like France or the USA) and multinational corporations (like Microsoft and Google).
Developing decentralized forging tooling is the only way to save free software and the Internet as we know them, and may even lower the barrier to contribution for smaller community projects.
Of course, decentralized forging will not save us from the rise of fascism in the physical and digital world. People will have to stand up to their oppressors. Companies and State infrastructure designed to destroy nature and make people's lives miserable will have to burn, as explained in a talk about [Climate Change, Computing, and All our relationships](TODO). *But we are not afraid of ashes, because we carry a new world, right here in our hearts. And this world is growing, at this very minute.*
Finally, you may wonder how i envision my own contribution to the decentralized forging ecosystem. I may not be competent enough to contribute useful code to the projects i listed above, but i may articulate critical feedback from a user's perspective (as i did in this post). But to be honest with you, i have other plans.
In the past years, i've been struggling with shell scripts to articulate various repositories and trigger tasks from updates. From my painful experiences automatically deploying this website (in which the theme is a submodule), i've come up with what i think is a simple, coherent, and user-friendly Continuous Integration and Delivery platform: the [forgesuite](TODO), which contains two tools: forgebuild and forgehook.
At a high level, forgehook can receive update notifications from many forges (typically, webhooks) and expose a standard semantic (ForgeFed) representation of each notification, indicating whether it's a push event, a comment on a ticket, or a new/updated pull request. forgehook then manages local and federated subscriptions to those notifications, filters them according to your subscription settings, and transmits them to other parts of your infrastructure. For example, maybe your IRC bot would like to know about all events happening across your project in order to announce them on IRC, while a CI test suite may only be interested in push and pull-request events.
forgebuild, on the other side of the chain, fetches updates from many remote repositories, and applies local settings to decide whether or not to run specific tasks. For now, only git and mercurial are supported, but support for any other version control system can be added by following a simple interface. Automated submodule updates are a key feature of forgebuild: you can update any submodule and automatically trigger the corresponding tasks if submodule updates are enabled. forgebuild follows a simple CLI interface, and as such your CI/CD tasks can be written in your favorite language.
While the forgesuite is still in its early stages, i believe it's already capable of empowering people. Less-experienced users who are somewhat familiar with the command line should find it very convenient for automating simple tasks, while power users should be able to integrate it with their existing tooling and infrastructure without concern. I know there is room for improvement, so if the forgesuite fails you somehow, i consider that a bug! Don't hesitate to report critical feedback.
I will write more about the forgesuite in another blogpost, so stay tuned. In the meantime, happy hacking!
⁽¹⁾ Microsoft has tried to buy itself a free-software-friendly public image in the past years. This obvious openwashing process has consisted of bringing free software to their closed platform (Windows Subsystem for Linux) and open-sourcing a few projects that they could not monetize (VSCode), while packaging them with spyware (telemetry) for free platforms. Furthermore, Microsoft has been known for decades to cooperate with intelligence services (PRISM, NSAKEY) and oppressive regimes.
⁽²⁾ Catalonia has a long history of repression by the Spanish state. Microsoft is just the latest technological aid for that aim, just as Hitler and Mussolini in their time provided weapons to support Franco's coup and crush the social revolution in Barcelona.
⁽³⁾ "upstream" here is meant as the source for inspiration, not source for code.
content/blog/decentralized-forge/index_old.md
+++
title = "Decentralized forge: distributing the digital means of production"
date = 2020-02-26
+++
In the world of today, more and more software gets built. But increasingly, the tools used to produce software are falling into the hands of a few giant corporations. Just last year, Github was acquired by Microsoft for more than 7 billion dollars. In this article, we'll explore what's wrong with Github, discuss alternatives that popped up in the past decade, and see what kind of decentralized cooperation the future holds.
In this article, I will argue that federated forges, decentralized authentication, and peer-to-peer approaches are all worth pursuing: one way or another, we need to do something.
# Github considered harmful
For those unaware, Github is an online forge. That is, a place where ordinary people and professionals alike sign up to cooperate on projects. While the more complicated aspects of code collaboration are handled by the `git` version control system, Github brings a user-friendly web interface where you can browse/file bugs (called `issues`) and submit or comment on patches (`pull requests`).
Github looked really nice, because it was so much less frightening to newcomers than previous web forges such as Sourceforge or Launchpad. But of course, the startup developing Github did not intend to help humanity: instead, it captured a whole generation of developers into a walled garden, by making it impossible to participate in software development without a Github account.
By building such a popular centralized forge, they created a strange situation in which most public software development takes place on Github, with a single company that can decide the fate of your project:
- in October 2019, Github removed a [militant application in Catalonia](https://techcrunch.com/2019/10/30/github-removes-tsunami-democratics-apk-after-a-takedown-order-from-spain/), on orders from the Spanish government, which has been busy crushing opposition
- in July 2019, [Github banned](https://techcrunch.com/2019/07/29/github-ban-sanctioned-countries/) all accounts who had used their services from Iran-attributed IP addresses (TODO: see how IP subnets work and why banning per country is stupid)
- a few years before, [Popcorn Time](https://www.techdirt.com/articles/20140711/18044627859/mpaa-stretches-dmca-to-breaking-point-with-questionable-take-down-request-popcorn-time-repositories.shtml) repositories have been taken down
These painful developer stories with Github are unfortunately far from isolated cases, even though just a tiny portion of Github's wrongdoings attracts media coverage. But to understand how we've reached this situation, we need to understand what made the success of Github in the first place.
# The point of Github
Even before Github existed, free software forges were already widespread. GNU Savane (forked from Sourceforge), Launchpad and Fossil SCM are notable names in the forge ecosystem.
However, it is often argued that their user interfaces are not very user-friendly. Many new developers would be surprised that, although `README` files have been around for a long time, most forges of that era did not display the README content on the project homepage, but would rather display a list of the latest commits.
For example, from a project's homepage on Launchpad, it takes me one click to get to the project's files, and an additional click to see the README.
![Inkscape project on Launchpad](launchpad.jpg)
Arguably, their UIs were centered around technical notions (imported from the mathematics lying underneath) such as branches and refs, catering to an already-familiar crowd while newcomers were left confused.
![GRUB project on GNU Savannah](savannah.jpg)
Github's new features were not attractive to well-established projects, which had the time to develop their own solutions and were rightfully concerned about Github's mischiefs. But to many smaller projects, Github's features were crucial to get up to speed and produce quality software: continuous integration/delivery (CI/CD), an integrated task board (kanban), and 3rd party tooling for code coverage and testing.
Nothing Github did was new. But surely, Github was breaking away from the [Keep It Simple Stupid](https://en.wikipedia.org/wiki/KISS_principle) approach and building an integrated solution where everything you need for software development is within reach. This appealed to many people.
# Selfhostable centralized forges
With Github's growing powers and mischiefs, we've seen in the last decade a few projects rising up to provide the same kind of experience, but in a selfhosted environment. There are quite a few of those, but in this article I'll focus on Gitlab and Gitea.
## Gitlab
[Gitlab](https://gitlab.com) started about a decade ago as a free-software alternative to Github. According to their [history](https://en.wikipedia.org/wiki/Gitlab#History) on Wikipedia, they started adopting an « open-core » model in 2014. This means they now release a free-software Community Edition which has fewer features than the Enterprise Edition.
While Gitlab is not the worst business in the field, it's important to remember they are a profit-driven organization with a track record of bad decisions:
- renting "cloud" servers from Google (used to be Microsoft), so giving money to Gitlab means you give money to Google
- forbidding political discussions between workers with regard to serving unethical clients ([decision reversed](https://www.theregister.co.uk/2019/10/17/gitlab_reverse_ferret/) due to uproar)
- including [invasive user tracking](https://gitlab.com/gitlab-org/growth/product/issues/164) on gitlab.com (decision reversed due to uproar)
- despite aggressive pink-washing, [asking women](https://www.theregister.co.uk/2020/02/06/gitlab_sales_women/) to wear "short but somewhat formal dress and heels" then trying to cover up the story by marking forum posts addressing the issue as confidential
On the more technical side of things:
- Gitlab works very poorly without Javascript in the browser: even README files cannot be displayed with Noscript or a Tor Browser
- Gitlab is very resource-hungry, recommending 4GB of RAM for hosting an instance
- Gitlab feels very slow to use, even self-hosted on a local network
- Gitlab is notorious in the selfhosting community for painful upgrades
## Gitea
[Gitea](https://gitea.io) is a community fork of [Gogs](https://gogs.io) started in 2016. It should be praised that a maintainer refuses to « move fast and break things »; however, i'm glad the project was forked to promote community organizing. Over the past years, Gitea has implemented a lot of new features. Some of these have found their way upstream into the Gogs codebase, albeit sometimes with different implementations. This way, everybody wins!
Gitea currently powers the [tildeverse.org](https://tildeverse.org) forge, and every time I've used it so far has been a pleasure. However, Gitea is not as featureful as Gitlab, as explained on their website [in a detailed comparison](https://docs.gitea.io/en-us/comparison/). For example, it does not support continuous integration/delivery pipelines out-of-the-box.
## The "many accounts" problem
Over the past decade, Gogs/Gitea and Gitlab have contributed a great deal to the free software ecosystem. Adopting a modern forge has allowed projects such as Debian and Gnome to build more reliable software more quickly. But for developers contributing to several projects, a new problem has emerged.
Part of the user-friendliness of the Github experience is the ability to use a single account to contribute to any project... or at least any project hosted on this giant walled garden. Both Gitlab and Gitea have been designed with this centralized mindset: you need an account on every server where you want to contribute something, whether it's reporting a bug or submitting a merge request.
This approach is not a problem for a huge corporation that seeks to entrap users in its ecosystem (it's actually a feature), but is a massive pain for end-users who want to report issues. Because many projects now selfhost their own forge, you may need to create separate accounts on [0xacab.org](https://0xacab.org) (Riseup forge), [framagit.org](https://framagit.org) (very popular in France) and [tildegit.org](https://tildegit.org) (the tildeverse forge).
As a result, we end up with a collection of very small walled gardens. Their cultures are thriving, but what's the point in all this if we can't cooperate across gardens? As an end-user, I would like to be able to contribute to any public project from a single identity, without having to deal with « many accounts ». And as we'll see, there's more than one way to tackle the problem.
# Git is already decentralized
As we've stated before, a forge is a collection of tools to ease cooperation on a project. A forge itself is built on top of one or more [Version Control](https://en.wikipedia.org/wiki/Revision_control) Systems (VCS). By far the most popular VCS today is git, but [mercurial](https://www.mercurial-scm.org/) is still pretty popular, and [darcs](http://darcs.net/) is still maintained and even has a modern reimplementation in Rust ([pijul](https://pijul.org/)).
More specifically, these VCS just mentioned are Distributed Version Control Systems (DVCS). This means they can be used without a central server, contrary to previous systems. If you're not familiar with the differences between client/server VCS and decentralized VCS, I'd recommend reading the Wikipedia page on [Decentralized version control](https://en.wikipedia.org/wiki/Distributed_version_control).
The key point is that in a DVCS every person has a complete copy of the project, so that they can apply proposed changes to their copy of the repo. Historically, git is meant to be used with email to send and review patches. That's how [Linux kernel development](https://www.kernel.org/doc/html/latest/process/2.Process.html#the-lifecycle-of-a-patch) works, and such [distributed workflows](https://git-scm.com/book/en/v2/Distributed-Git-Distributed-Workflows) for [email patches](https://git-scm.com/book/en/v2/Appendix-C%3A-Git-Commands-Email) are documented in the official git book.
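To make this less abstract, here is a minimal sketch of that email workflow (assuming `git` is installed; the throwaway repository, commit messages, and list address below are made up for the demo):

```shell
set -e
# Demonstrate the email workflow in a throwaway repository.
repo=$(mktemp -d) && cd "$repo" && git init -q
git -c user.name=demo -c user.email=demo@example.org \
    commit -q --allow-empty -m "first change"
git -c user.name=demo -c user.email=demo@example.org \
    commit -q --allow-empty -m "second change"

# Turn the two most recent commits into one mailable patch file each:
git format-patch -2 -o outgoing/

# The series would then be sent to the project's list for review
# (the address below is a placeholder, not a real list):
#   git send-email --to="devel@lists.example.org" outgoing/*.patch
ls outgoing/
```

A maintainer on the receiving end applies the series with `git am`, no forge account required on either side.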
[An article](https://drewdevault.com/2018/07/23/Git-is-already-distributed.html) by Drew Devault, a maintainer of the Sourcehut forge, got a lot of attention lately by advertising existing federated cooperation workflows built on email. In the article, he rightfully argues that we should do a better job of teaching such workflows to newcomers, and integrate them with smoother tooling.
## git email is not enough
But the point missed by the "git+email" clique, in my view, is that **different tools cater to different audiences**. An email-based workflow isn't tedious for terminal-friendly developers, but it considerably raises the bar for people of different backgrounds to contribute to projects. These contributions can be simple bug reports, but let's not forget that forges are used to manage more than code: documentation, websites, and translations (among others).
Versioning tools have long been used by designers, writers, and many other folks, who back in the day favored mercurial and darcs for their user-friendliness. In comparison, git, the latest newcomer, was invented by and for kernel people to streamline very complex workflows, and is notoriously complex for newcomers even for very simple tasks, such as keeping different branches up to date with minor, non-conflicting changes (TODO: ensure that's true and investigate other major git painpoints).
After years of using git, I still sometimes have to read docs and tutorials to achieve what I consider basic functionality. Sometimes I end up force-pushing commits because it lets me reach my goal state in seconds instead of minutes, even though I'm convinced that being able to rewrite history is the worst kind of anti-feature for a VCS. I'm far from an isolated case, and it's become a meme in many workplaces and non-profit communities that you often have to spend more time making git happy than working on your actual patches.
TODO: insert mandatory tar/git MEME (xkcd or commitstrip)
Github and other forges have succeeded in making git workflows transparent and usable by people who do not have the time or knowledge to script their mailbox and connect many smaller tools together. For example, to a person reporting a bug, a forge's issue tracker makes it easier to see if this bug has already been reported, and maybe even patched. This is something that can and should be addressed in email-based workflows by piping more parts together, like the Debian project does with its bugtracker. (TODO: link)
But at some point, we can acknowledge that modern centralized forges have brought something to the table, and we don't currently replicate such a smooth experience easily with decentralized workflows. In the rest of this post, I'll outline what it would take to achieve a user-friendly experience with a decentralized forge, and take a look at current strategies being implemented by various projects.
TODO: git vs mercurial link (mercurial not dead) ?? maybe not the right place and time but many can be mentioned
TODO: mention git email niceties
https://git.causal.agency/imbox/about/git-fetch-email.1
# What it takes to build a decentralized forge
As we've seen in the previous sections, a forge consists of higher-level abstractions and tools to ease cooperation through a version control system. While this is not an exhaustive list, we'd like a decentralized forge to provide:
- tickets: users can open issues and comment on them to report bugs and have debates
- merge requests: users can propose patches and maintainers can review them and merge them
- continuous integration/delivery: running code in an automated fashion every time changes have been submitted
While each of these features is different in its implementation, they all require some form of signaling mechanism (to advertise new changes) and some semantics (to describe those changes). The signaling can either take place within the git repository itself (in-band) or use another protocol (out-of-band).
With centralized forges, we use some out-of-band signaling with custom semantics. Github and Gitlab, for instance, provide HTTP APIs for this purpose. In a decentralized setup, we need a mechanism to propagate the state of a repository throughout the network. This state can be based on a consensus of peers, on gossip between peers, or can be derived from authoritative sources.
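To make "out-of-band signaling with custom semantics" concrete, here is a minimal sketch with Python's standard library of the HTTP call a client makes to open an issue. The endpoint shape follows Github's documented REST API, but the repository, title and token are placeholder values, and we only build the request without sending it:

```python
import json
import urllib.request

def build_create_issue_request(owner, repo, title, body, token):
    """Build (without sending) the out-of-band API call that opens an issue
    on a centralized forge, Github-style."""
    url = f"https://api.github.com/repos/{owner}/{repo}/issues"
    payload = json.dumps({"title": title, "body": body}).encode()
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )

req = build_create_issue_request("foo", "bar", "Crash on start", "Steps: ...", "<token>")
print(req.get_method(), req.full_url)
# POST https://api.github.com/repos/foo/bar/issues
```

Note that everything here, from the hostname to the JSON field names, is specific to one provider: another forge exposes a different API, which is exactly why this signaling channel does not decentralize by itself.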
## Source of trust
At some point in this article, we have to take a minute and talk about trust. Because everything we do on the Internet eventually boils down to trust and how we establish it. [Zero-knowledge proofs](https://en.wikipedia.org/wiki/Zero-knowledge_proof) are gaining traction and producing useful results, but large portions of our computing still rely on establishing a chain-of-trust with some 3rd parties, if only for routing to another machine (eg. BGP, TCP).
The same goes for a forge: starting with a repository ID, we establish a chain of trust to retrieve the current state of the project. In both centralized and federated setups, we rely on location-based addressing, typically via the HTTP or SSH protocol. That means we declare **where** we want to find the latest information about the project, and explicitly trust this provider (location) to serve us correct information.
This is precisely what we do when we `git clone https://github.com/foo/bar`:
- establish a chain of trust over the DNS protocol to determine what IP addresses we can reach `github.com` with
- open a connection to one of those addresses, and trust our ISP's chain of trust to deliver packets to Github
- ask Github to provide us with the repository `foo/bar`
However, distributed peer-to-peer protocols take a different approach and usually resort to content-based addressing. Instead of specifying where to find a specific piece of content, we describe **what** piece of content we want. By performing calculations (hashing) on the content, we can determine a unique identifier (a hash) for it.
This hash can then be used as a secure identifier for the content. When we're trying to get this piece of content on a second machine, we can verify the content we receive by computing its hash and comparing it to the hash of the content we're looking for. This is, on a high level, how the [Bittorrent](https://en.wikipedia.org/wiki/BitTorrent) and [IPFS](https://en.wikipedia.org/wiki/InterPlanetary_File_System) protocols work.
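git itself already works this way internally: every object is stored under the SHA-1 hash of its content, which is how any peer can verify what it received. A quick Python sketch of the idea, reproducing git's blob hashing:

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Compute the content address git assigns to file contents (a "blob"):
    SHA-1 over a small header followed by the raw bytes."""
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()

def verify(content: bytes, expected_id: str) -> bool:
    """A peer can check data from an untrusted source
    against the identifier it asked for."""
    return git_blob_id(content) == expected_id

blob_id = git_blob_id(b"hello\n")  # same value `git hash-object` would print
assert verify(b"hello\n", blob_id)
```

The key property is that the identifier depends only on the content, not on who serves it: any machine on the network can hand us the bytes, and we can check them locally.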
## Semantics
TODO: explaining the semantics of a forge and difference between checkout based and patch based VCS
## Storage
With proper semantics in mind, we can now determine how and where to store the issues and merge requests. Some projects attempt to store information directly into the versioned repository, by using well-known folders and sometimes a dedicated branch.
TODO: give names and examples
While these projects are very promising for decentralized information tracking, providing useful semantics over a VCS, I did not find one that tried to implement a specific merging strategy. While that flexibility suits projects with an established workflow, it does not crack the decentralized forge problem on its own.
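To illustrate what in-band storage can look like, here is a hypothetical sketch (the `.tickets/` layout is my own invention for this example, not the actual format of any of the projects above) of content-addressed tickets stored as versioned files, for instance on a dedicated branch:

```python
import hashlib
import json
import tempfile
from pathlib import Path

def store_ticket(repo: Path, title: str, body: str) -> str:
    """Store a ticket as a file named after the hash of its content,
    so every peer derives the same ticket id independently."""
    ticket = json.dumps({"title": title, "body": body}, sort_keys=True).encode()
    ticket_id = hashlib.sha256(ticket).hexdigest()[:12]
    tickets_dir = repo / ".tickets"
    tickets_dir.mkdir(parents=True, exist_ok=True)
    (tickets_dir / f"{ticket_id}.json").write_bytes(ticket)
    return ticket_id

# Usage: because the id is derived from the content, two peers opening
# the "same" ticket converge on the same file, and git merges it cleanly.
repo = Path(tempfile.mkdtemp())
tid = store_ticket(repo, "Crash on start", "Steps to reproduce: ...")
print((repo / ".tickets" / f"{tid}.json").exists())
# True
```

Since the tickets travel with `git push` and `git clone`, no extra protocol is needed, which is the appeal of the in-band approach.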
Some other approaches described in the next section either rely on communicating with a forge server, or propagating the changes through global consensus (blockchain) or gossip.
## Signaling
Back to our problem: how do we submit new issues and merge requests, and let our peers know about them?
### Federated system
In a centralized or federated setup, we use a separate, out-of-band protocol to communicate with our forge (client-to-server) and tell it to open an issue. As our forge is considered the central source of truth for the repository, it can then spread this information to our peers and we can call it a day.
But what if I want to open an issue on your repository, on your server, where I don't have an account? Either your forge allows me to log in and use the previously-mentioned client-to-server protocol to open an issue, or we need a federation protocol (server-to-server) so that I can create an issue on my forge, and my forge can let yours know about it.
In both scenarios, my forge vouches for my identity. But the second approach has two clear upsides:
- having a federation protocol from the beginning gives us building blocks for backup/migration mechanisms across servers
- letting strangers log in on your forge can lead to mayhem if permissions are misconfigured (should someone from another server be able to create a project on your forge?)
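As an illustration of the second approach, an issue-opening activity delivered from my forge to yours could look like the following JSON, loosely modeled on ActivityPub/ForgeFed vocabulary. All URLs here are placeholders and the exact field names are illustrative assumptions, not the ForgeFed specification:

```python
import json

# My forge vouches for my identity ("actor") and delivers this
# server-to-server activity to your forge's inbox.
activity = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Create",
    "actor": "https://my-forge.example/users/alice",        # hypothetical
    "to": "https://your-forge.example/projects/bar/inbox",  # hypothetical
    "object": {
        "type": "Ticket",  # ForgeFed-style ticket object
        "attributedTo": "https://my-forge.example/users/alice",
        "context": "https://your-forge.example/projects/bar",
        "summary": "Crash on start",
        "content": "Steps to reproduce: ...",
    },
}

print(json.dumps(activity, indent=2))
```

The point of such a shared vocabulary is that any forge implementation, web-based or not, can produce and consume these activities, instead of each one exposing its own private API.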
### Peer-to-peer system
In a peer-to-peer setup, things are different because there is not a central source of truth. So in order to propagate changes to the repository, we either need to convince everyone on the network to advertise our changes (global **consensus**), or we can advertise them ourselves and wait for trustful peers to propagate them (**gossip**).
The first approach is usually achieved through blockchains. While pursuing a global consensus is a noble task, it's a famously hard problem to crack! Bitcoin, for instance, wastes countless computing resources in order to achieve consensus through a process called Proof of Work (TODO: link). While consuming more electricity than many small countries, the Bitcoin network is still famous for its slow throughput (processing transactions is either very costly or very slow) and can be abused by Sybil-type attacks (TODO: link).
Also, in order to reach global consensus within the blockchain, each active node usually needs to know the whole network's history of transactions. Verifying each transaction that ever occurred on the network requires a lot of computing power, and a lot of storage space. Therefore, as the network grows, more and more user devices become incapable of active participation in the network, and the power gets concentrated into the hands of the most powerful (TODO: maybe give example bitcoin? or maybe too much information?)
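To make the cost asymmetry of Proof of Work concrete, here is a toy sketch in Python: finding a nonce whose hash carries a few leading zeroes is expensive by design, while checking someone else's nonce takes a single hash. The difficulty here is tiny; real networks like Bitcoin use a vastly higher one, which is where the electricity goes:

```python
import hashlib
from itertools import count

DIFFICULTY = 4  # leading zero hex digits; expected work grows as 16**DIFFICULTY

def proof_of_work(data: bytes) -> int:
    """Brute-force a nonce until the hash meets the difficulty target.
    This is the expensive part, repeated by miners for every block."""
    for nonce in count():
        digest = hashlib.sha256(data + str(nonce).encode()).hexdigest()
        if digest.startswith("0" * DIFFICULTY):
            return nonce

def verify(data: bytes, nonce: int) -> bool:
    """Verification is one hash: cheap for every other peer."""
    digest = hashlib.sha256(data + str(nonce).encode()).hexdigest()
    return digest.startswith("0" * DIFFICULTY)

nonce = proof_of_work(b"new commit announcement")
assert verify(b"new commit announcement", nonce)
```

This asymmetry (hard to produce, trivial to check) is what lets a permissionless network rate-limit who gets to extend the shared history, at the price of the waste described above.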
Of course, some mitigations can be put in place, and not all consensus-reaching systems are as defective-by-design as Bitcoin. However, until some such technology proves capable of high throughput while upholding decentralization, so-called blockchains appear to be a dead end filled with corporate snake-oil.
A notable exception is the [radicle](https://radicle.xyz) project, which wants to build a [reward mechanism for miners](https://radicle.community/t/block-rewards-in-the-registry/152) on a proof-of-work blockchain. This is surprising, because the radicle project seems to be driven by enthusiasts and not a greedy corporation.
TODO: explain gossip protocols in more depth
There was a discussion on radicle forums about using a [gossip protocol](https://radicle.community/t/radicle-but-using-ssb-instead-of-ipfs/53) instead of a blockchain, but this discussion has not reached a conclusion (yet).
A DHT strikes a very good balance: it achieves local consensus (local not geographically, but among peers interested in the same content) to share pieces of global state through gossip.
Global consensus is usually an anti-pattern, as the design of naming systems has shown (see the reasoning behind DNS delegation and GNS petnames).
TODO: also mention SSB?
## Access control / spam prevention
Now that we can send messages, we want to make sure we don't receive too many of them. Of course, rate limiting applies everywhere.
In a peer-to-peer setting, I don't know of a solution apart from a web of trust.
In a federated setting, we can do opt-in federation, or something TOFU-like: if one issue/merge request from a server is accepted, that server is whitelisted for federation with ours. Even more complex reputation/karma systems are conceivable.
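A sketch of such a TOFU-style moderation policy (entirely hypothetical, not taken from any existing forge): a server's first submission waits for a maintainer, and once one submission is approved, the server is trusted for subsequent ones.

```python
class FederationPolicy:
    """Trust-on-first-use: a server's first submission awaits moderation;
    after one approval, later submissions from it are accepted directly."""

    def __init__(self):
        self.allowed = set()  # servers with at least one approved submission
        self.queue = []       # (server, submission) pairs awaiting moderation

    def submit(self, server: str, submission: str) -> str:
        if server in self.allowed:
            return "accepted"
        self.queue.append((server, submission))
        return "pending"

    def approve(self, server: str) -> None:
        """A maintainer approved a pending submission: whitelist the server."""
        self.allowed.add(server)
        self.queue = [(s, m) for s, m in self.queue if s != server]

policy = FederationPolicy()
print(policy.submit("forge.example", "issue #1"))  # pending
policy.approve("forge.example")
print(policy.submit("forge.example", "issue #2"))  # accepted
```

Real deployments would need more nuance (revocation, per-project rules, expiry), but the shape of the trade-off is visible: moderation load concentrates on first contact with each server rather than on every message.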
# A look at some solutions
Some solutions:
- [Artemis](https://www.mrzv.org/software/artemis/): integrated into mercurial (an `hg` subcommand), with partial git support
- [bugseverywhere](https://github.com/kalkin/be): VCS-agnostic
- [git-dit](https://github.com/neithernut/git-dit): git-specific
- [ditz](https://github.com/jashmenn/ditz): VCS-agnostic
- [ticgit](https://github.com/jeffWelling/ticgit): git-specific, with a web interface
- [git-bug](https://github.com/MichaelMure/git-bug): git-specific
- [git-ssb](https://git.scuttlebot.io/%25n92DiQh7ietE%2BR%2BX%2FI403LQoyf2DtR3WQfCkDKlheQU%3D.sha256): git-specific, but very interesting anyway, because it could work with other VCS and handles signaling out-of-band
- [git-issue](https://github.com/dspinellis/git-issue): git-specific (bridges to Github/Gitlab)
- [sit](https://github.com/sit-fyi/issue-tracking): VCS-agnostic
In the same family, to investigate:
- [git-appraise](https://github.com/google/git-appraise)
- [git-notes](https://git-scm.com/docs/git-notes)
TODO: why have i not mentioned fossil SCM? i need to dig into this again but the docs are so dense it's hard to have a global overview of how it operates in a decentralized manner
TODO: nice-to-haves from radicle: signed commits and maintainership identity bootstrapping through crypto
# Conclusion
Many of the projects we just reviewed are far from ready for decentralized forging. They still do a very good job of pushing the ecosystem towards selfhosted, free-software solutions that empower us and not our corporate overlords. This deserves praise.
The intent of this article is not to criticize or anger people because their favorite forge is the best in the universe. I just wanted to take a look at the ecosystem, examine the reasons that brought us here, and reflect on how we go on from here and bridge the gaps.
I don't think there's one good and one bad way to do this (although there are many bad ways to do this), but we need both federated and p2p solutions to these problems, and maybe we need to break the false federated/p2p binary: they serve different purposes, but they mix very well together, and I don't see why a federated forge could not cooperate with a p2p forge.
There might be some information loss along the way: for example, unsigned commits from Gitlab may not be imported with a proper signature into radicle, although some trusted radicle nodes (the maintainers) could sign those commits in good faith.
-------
# Decentralized VCS and decentralized forge
git is already a decentralized VCS.
We can add a layer on top (federation semantics with ActivityPub (forgefed) or XMPP),
or we can encourage out-of-band decentralized interactions (like sourcehut does with its email-first workflow),
or we can take the problem in-band (git-issue), but that does not solve everything!
Or we can do like [radicle](https://radicle.xyz/): a bit of all of this on top of a p2p network.
What it takes to build a decentralized forge:
- either I can open issues/merge requests and receive notifications from any instance (federated forge); this can of course be turned off for some internal projects that should be private
- or we consider instances and web gateways to be mere components that should not play any role beyond displaying information

content/blog/decentralized-forge/index_old2.md
+++
title = "Decentralized forge: distributing the means of digital production"
date = 2020-11-20
+++
[**PAD**](https://cryptpad.fr/pad/#/2/pad/edit/QXiMGyeOy3M5OcCc-QN8VtDV/)
This article began as a draft in February 2020, which I revised because the Free Software Foundation is looking for feedback on their high-priority projects list.
Our world is increasingly controlled by software. From medical equipment to political repression to interpersonal relationships, software is everywhere, shaping our lives. As the Luddites did centuries ago, we're evaluating whether new technologies empower us or, on the contrary, reinforce existing systems of social control.
This question is often phrased through a [software freedom](https://www.gnu.org/philosophy/free-sw.html) perspective: am I free to study the program, modify it according to my needs, and distribute original or modified copies of it? However, individual programs cannot be studied out of their broader context. Just as the human and ecological impact of a product cannot be imagined by looking at the product itself, a binary program tells us little about its conditions of production and envisioned goals.
A forge is another name for a software development platform. The past two decades have been rich in progress on forging ecosystems, empowering many projects to integrate tests, builds, fuzzing, or deployments as part of their development pipeline. However, there is a growing privatisation of the means of digital production: Github, in particular, is centralizing a lot of forging activities on its closed platform.
In this article, I argue that decentralized forging is a key issue for the future of free software and of non-profit software development that empowers users. It will cover:
- Github considered harmful (what Github did well, and why they are now a problem)
- git is decentralized, but the workflows aren't standardized (why email forging is not yet practical for everyone)
- Selfhosted walled gardens are not a solution (how selfhosted forges can limit cooperation)
- Centralized and decentralized trust (how to discover and authenticate content in a decentralized setting)
- Popular self-defense against malicious activities (what the threats are, and how to empower our communities against them)
- Existing decentralized forging solutions
- A call for interoperability between decentralized forges
- Nomadic identity and censorship-resilience
- Code-signing and secure project bootstrapping
This article assumes you are somewhat familiar with software development, [decentralized version control systems](https://en.wikipedia.org/wiki/Distributed_version_control) (DVCS) such as git, and cooperative forging (collaboration between multiple users).
# Glossary
- forge: a program facilitating cooperation on a project
- DVCS: a version control system such as git
- repository: a folder, containing versioning metadata, to represent several versions of the same project
- source: the contents of the repository (excluding versioning metadata)
- commit: a specific version of a repository, in which the commit name is the [checksum](https://en.wikipedia.org/wiki/Checksum) of all file contents for this version of the repository
- branch: a user-friendly name pointing to a specific commit, which can be updated to point to newer commits (this is what happens when an update is pushed)
- patch: a proposal to change content from the repository, that can be understood by humans and by machines (typically, a [diff](https://en.wikipedia.org/wiki/Diff))
- issue: a comment/discussion about the project, which is usually not contained within the repository itself (also called a ticket or a bug)
- contributors: people cooperating on a project by submitting issues and patches
- maintainers: contributors with permission to push updates to the repository (validate patches)
# Github considered harmful
More and more, software developers and other digital producers are turning to Github to host their projects and handle cooperation across users, groups and projects. Github is a web interface for git. From this web interface, you can browse the source of the project, submit/triage issues, and propose/review patches. While Github internals rely on git, Github does more than git itself. Notably, Github lets you define fine-grained permissions for different users/groups.
Github is what we call a web forge, that is, a software suite that makes cooperation easier. It was inspired by many others that came before, like Sourceforge and Launchpad. What's different about Github, compared to previous forges, is the emphasis on user experience. While Sourceforge is great for those of us familiar with [DAGs](https://en.wikipedia.org/wiki/Directed_acyclic_graph) and git vocabulary (*refs*, *HEAD*, *branches*), Github was conceived from the beginning to be friendlier to newcomers. Using Github to produce software still requires git knowledge, but submitting a ticket (to report a bug) is accessible to anyone familiar with a web interface.
To illustrate their difference, we can look at how different forges treat the [README](https://en.wikipedia.org/wiki/README) file. From [Inkscape's Launchpad project](https://launchpad.net/inkscape), it takes me one click to reach the list of files (sometimes called *tree*), and yet another click to open the README. In contrast, [zola's project page](https://github.com/getzola/zola) on Github instantly displays the list of files, as well as the complete README.
![The inkscape project page on Launchpad](launchpad.jpg)
Every project is architectured differently. That's why displaying a list of branches or the latest commits on the project page brings limited value: the people who need this information likely know how to retrieve it from the git command-line (`git branch` or `git log`). README files, however, were created precisely to introduce newcomers to your project. Whether your README file is intended for novice users or people from a specific background is up to you, but Github does not stand in the way.
But not all is so bright with Github. First and foremost, Github is a private company trying to make money on other people's work. They are more than happy to use free-software projects to build their platform, but have always refused to publish their own source code. Such hypocrisy has long been denounced by the Free Software Foundation, which in its [ethical forging evaluation](https://www.gnu.org/software/repo-criteria-evaluation.html) gave Github the worst possible grade ([criteria](https://www.gnu.org/software/repo-criteria.html)).
Like Facebook, Google, and other centralized platforms, Github also has a reputation for banning users from the platform with little/slow recourse. Some take to forums and social media platforms to get the attention of the giant, in the hope of getting their account restored. That's what happened, for instance, when Github decided to [ban all users from Iran, Crimea and Syria](https://www.theverge.com/2019/7/29/8934694/github-us-trade-sanctions-developers-restricted-crimea-cuba-iran-north-korea-syria): users organized [massive petitions](https://github.com/1995parham/github-do-not-ban-us), but were left powerless in the end.
Github has also been involved in very questionable political decisions. If your program interacts in any way with copyrighted materials (such as Popcorn Time or youtube-dl), chances are it will be taken down at some point. This is even more true if your project displeases a colonial empire (or pretend "republic") like Spain or the United States. An application promoting Catalan independence ⁽²⁾ [was removed](TODO). And so was a list of police officers (ICE) compiled from entirely public data, which was finally [hosted by Wikileaks](https://www.rt.com/usa/430505-ice-database-hosted-wikileaks/).
![Github logo parody](github-politics.png)
Github has chosen their side: they will protect the privileged against the oppressed, in an effort to safeguard their own profits. They will continue to support [Palantir](https://www.fastcompany.com/90348304/exclusive-tech-workers-organize-protest-against-palantir-on-the-github-coding-platform) (surveillance company) and [ICE](https://techcrunch.com/2019/11/13/github-faces-more-resignations-in-light-of-ice-contract/) (racist police detaining/deporting people) despite protests from their users and employees, and the only option left for us is to boycott their platform entirely.
To be fair, there are examples for which I personally agree with Github's repressive actions, like when they banned a dystopian [DeepNude application](https://www.vice.com/en/article/8xzjpk/github-removed-open-source-versions-of-deepnude-app-deepfakes) encouraging hurtful [stalking](https://en.wikipedia.org/wiki/Stalking)/harassing behavior. My point is not that Github's positions are wrong, but rather that nobody should ever hold such power.
It is entirely understandable for a person or entity to refuse to be affiliated with certain kinds of activities. However, given Github's quasi-monopoly on software forging, banning people from Github means exiling them from many projects. Likewise, banning projects from Github disconnects them from a vast pool of potential contributors. There should never be a global button to dictate who can contribute to a project, and what kind of project they may contribute to.
# git is decentralized, but the workflows aren't standardized
To oppose the ever-growing influence of Github, much of the free-software community pointed out that we simply don't need it, because [git is already a decentralized system](https://drewdevault.com/2018/07/23/Git-is-already-distributed.html), contrary to centralized versioning systems like subversion (svn). Every contributor to a git repository has a complete copy of the project's history, can work offline, then push their changes to any remote repository (called a *remote* in git terminology).
This property ensures that anyone can at any time take a repository, take it elsewhere, and start their own fork (modified version). However, what is outside the repository itself cannot always so easily be migrated: tickets, submitted patches, and other contributions and discussions may each have their own procedure to export, when that is possible at all.
Historically, git was developed for the Linux kernel community, where email is the core of the cooperation workflow. Tickets and patches are submitted and commented on mailing lists. git even provides subcommands for such a workflow (eg `git send-email`). This workflow is [less prone to censorship](https://sourcehut.org/blog/2020-10-29-how-mailing-lists-prevent-censorship/) than centralized forges like Github, but has other challenges.
For example, when forking a project, how do you advertise your fork to the community without spamming existing contributors? Surely, if the original project is dead and the mailing list doesn't exist anymore, a fork's notification email would be welcome to most people. In other cases, it may be perceived as an invasive action, because these contributors did not explicitly consent (opt in) to your communication, as required by privacy regulations in many countries.
Moreover, there is no standardized protocol for managing tickets and patch submissions over email. [sourcehut's todo](https://man.sr.ht/todo.sr.ht/) and Debian's [debbugs](https://www.debian.org/Bugs/) have very good documentation for new contributors to submit issues, but in other communities it can be hard to understand the [netiquette](https://en.wikipedia.org/wiki/Etiquette_in_technology) or expected ticket format for a project, even more so when this project has been going on for a long time with an implicit internal culture.
With the email workflow, every project is left to implement its own way. That can be quite powerful for complex usecases, but inexperienced persons who just want to start a cooperative project will be left powerless. Creating a project on Github takes a few clicks. In comparison, setting up a git server and a couple of mailing lists (one for bugs, one for patches), and configuring correct permissions for all that, is more exhausting and error-prone, unless you use an awesome shared hosting service like [sr.ht](https://sr.ht/).
From a contributor's perspective, it can take some time getting used to an email workflow. In particular, receiving and applying patches from other people may require using a developer-friendly email client such as [emacs](https://www.gnu.org/software/emacs/), [aerc](https://aerc-mail.org/) or [neomutt](https://neomutt.org/). Sending patches, on the other hand, is not complicated at all. If you want to learn, there's an amazing interactive tutorial at [git-send-email.io](https://git-send-email.io/).
We have to keep in mind forging platforms are not only used by developers, but also designers, translators, editors, and many other kinds of contributors. Many of them already struggle learning git, with its [infamous inconsistencies](https://stevelosh.com/blog/2013/04/git-koans/). Achieving a full-featured newcomer-friendly email forging workflow is currently still a challenge, and this is currently a hard limit on contributions from less-technical users.
Developing interoperable standards for forging over email (such as ticket management) would help a great deal. It would enable mail clients, desktop environments and integrated development environments (IDEs) to provide familiar interfaces for all the interactions we need with the forge. For example, Github's [issue templates](https://docs.github.com/en/free-pro-team@latest/github/building-a-strong-community/manually-creating-a-single-issue-template-for-your-repository) could be supported in email workflows. An open email forging standard would also make it easier for web forges (and others) to enable contributions via email, reuniting two forging ecosystems which currently rarely interact, because most people are familiar with one or the other, not both.
# Selfhosted walled gardens are not a solution
Over the past decade, some web forges have been developed with the goal of replicating the Github experience, but in a selfhosted environment. The most popular of those are [Gitlab](https://gitlab.com/), [Gogs](https://gogs.io/) and [Gitea](https://gitea.io/) (which is a fork of Gogs).
Such modern, selfhosted web forges are very popular with bigger communities and projects which already have their own infrastructure and avoid relying on 3rd-party service providers, out of reliability or security concerns. Forging can be integrated into their project ecosystem, for example to manage user accounts (eg. with [LDAP](https://en.wikipedia.org/wiki/Lightweight_Directory_Access_Protocol)).
However, for hobbyists and smaller communities, these solutions are far from ideal, because they were specifically developed to replicate a centralized development environment like Github. The forge is usually shut off from the outside world, and cooperation between users is only envisioned in a local context. Two users on [Codeberg](https://codeberg.org/) may cooperate on a project, but a user from [0xacab.org](https://0xacab.org/) may not contribute to the same project without creating an account on Codeberg.
Some people may argue this is a feature and not a bug, for three reasons:
- 1. *easier enforcement of server-wide rules, guidelines and access rules*: this may be an advantage in corporate settings (or for big community projects), but doesn't apply to popular hacking usecases, where all users are treated equally, and settings are defined on a per-project basis (not server-wide)
- 2. *an account on every server is more resilient to censorship or server shutdown*: while true, i would argue the issue should be tackled in more comprehensive and user-friendly ways through easier project migration and nomadic identity systems (explained later in this article)
- 3. *an account on every server isn't a problem, because there's only so many projects you contribute to*: though a person may only contribute seriously and frequently to a limited number of projects, there's so many more projects we use on a daily basis and don't report bugs to, because unless the project has privacy-invading telemetrics and click-to-go bug reporting, figuring out a specific project's bug reporting guidelines can be tedious
In contrast, the bug reporting workflow as achieved by Github and mailing lists is more accessible: you use your usual account, and the interface you're used to, to submit bugs to many projects. If a project uses mailing lists for cooperation, you can contribute to it from your usual mail client. Your bug reports and patches may be moderated before they appear publicly, but you don't have to create a new account and learn a new workflow/interface just to submit a bug report.
Creating a new account for every community and project you'd like to join is not a user-friendly approach. This [antipattern](https://en.wikipedia.org/wiki/Anti-pattern) was already observed in a different area, with selfhosted social networks: [Elgg](https://elgg.org/) could never replace Facebook entirely, nor could [Postmill](https://postmill.xyz/)/[Lobsters](https://lobste.rs/) replace Reddit, because participation was restricted to a local community. In some cases that's a feature: a family's private social network should not connect to the outside world, and a focused and friendly community like [raddle.me](https://raddle.me/) or [lobsters](https://lobste.rs/) may wish to preserve itself from nazi trolls.
But in many cases, not being able to federate across instances (and communities) is a bug. Selfhosted centralized services cater to niche usecases, not because they're too different from Facebook/Reddit, but because they're technically so similar to them. Instead of dealing with a gigantic walled garden (Github), or a wild jungle (mailing lists), we now end up with a collection of tiny closed gardens. The barrier to entry to those gardens is low: you just have to introduce yourself at the front door and define a password. But this barrier to entry, however low it is, is already too high.
I suspect that for smaller volunteer-run projects, the ratio of bug reporters to code committers is much higher on Github and development mailing lists than it is on smaller, selfhosted forges. If you think that's a bad thing, try shifting your reasoning: if only people familiar with programming are reporting bugs, and your project is not only aimed at developers, it means most of your users are either taking bugs for granted, or abandoning your project entirely.
# Centralized and decentralized trust
When we're fetching information about a project, how to ensure it is correct? In traditional centralized and federated systems, we rely on location-addressed sources of trust. We define *where* to find reliable information about a project (such as a git remote). To ensure authenticity of the information, we rely on additional security layers:
- Transport Layer Security ([TLS](https://en.wikipedia.org/wiki/Transport_Layer_Security)) or Tor's [onion services](https://community.torproject.org/onion-services/) to ensure the remote server's authenticity, that is to make it harder for someone to impersonate a forge to serve you malicious updates
- Pretty Good Privacy ([PGP](https://en.wikipedia.org/wiki/Pretty_Good_Privacy)) to ensure the document's authenticity, that is to make it harder for someone who took control of your account/forge to serve malicious updates
How we bootstrap trust (from the ground up) for those additional layers, however, is not a simple problem. Traditional TLS setups rely on absolute trust in a pre-defined list of 3rd-party [Certificate Authorities](https://en.wikipedia.org/wiki/Certificate_authority) (CAs), and CAs abusing their immense power is far from unheard of. Onion services and PGP, on the other hand, require prior knowledge of authentic keys ([key exchange](https://en.wikipedia.org/wiki/Key_exchange)). With the [DANE](https://en.wikipedia.org/wiki/DNS-based_Authentication_of_Named_Entities) protocol, we can bootstrap TLS keys from the domain name system ([DNS](https://en.wikipedia.org/wiki/Domain_Name_System)) instead of the CA cartel. However, this is still not supported by many clients, and in any case is only as secure as DNS itself; that is, very insecure, despite recent progress with [DNSSEC](https://en.wikipedia.org/wiki/Domain_Name_System_Security_Extensions). For a location-based system to be secure, we need a secure naming system like the [GNU Name System](https://gnunet.org/en/gns.html) to enable secure key exchange.
These difficulties are inherent properties of location-addressed storage, in which we describe *where* is a valid source of the information we're looking for. Centralized and federated systems are by definition location-addressed systems. Peer-to-peer systems, on the other hand, don't place trust in specific entities. In decentralized systems, trust is established either via [cryptographic identities and signatures](https://en.wikipedia.org/wiki/Digital_signature) and/or content-addressed storage ([CAS](https://en.wikipedia.org/wiki/Content-addressable_storage)).
Signatures verify the authenticity of a document compared to a known public key. For example, when we trust the Tor project's PGP key (`EF6E286DDA85EA2A4BA7DE684E2C6E8793298290`), we can obtain the Tor browser (and corresponding signature) from any source, and verify the file was indeed *signed* by Tor developers. Content-addressed storage, in comparison, merely matches a document with a [checksum](https://en.wikipedia.org/wiki/Checksum), and does not provide authorship information.
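The distinction can be made concrete with a minimal Python sketch of content addressing (the document contents here are made up for illustration): a checksum proves *what* we fetched, from any untrusted source, but says nothing about *who* authored it.

```python
import hashlib

# A made-up document: in content-addressed storage its name *is* its checksum.
document = b"tor-browser.tar.xz contents (illustrative)"
address = hashlib.sha256(document).hexdigest()

def fetch_and_verify(blob: bytes, expected_address: str) -> bool:
    """Accept a blob from ANY untrusted mirror: it matches the address
    if and only if it is bit-for-bit the document we asked for.
    This proves integrity, not authorship; authorship is what signatures add."""
    return hashlib.sha256(blob).hexdigest() == expected_address

assert fetch_and_verify(document, address)
assert not fetch_and_verify(b"tampered contents", address)
```

The same property is what lets a signed release be mirrored anywhere: the address (or the signature) travels separately from the data itself.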
So, with these building blocks in place, how do we discover new content in a decentralized system? There are typically two approaches to this problem: consensus and gossip. There may be more, but i'm not aware of them.
## Consensus
Consensus is an approach in which all participating peers should agree on a single truth (a global state). They take votes following a protocol like [Raft](https://en.wikipedia.org/wiki/Raft_(algorithm)), and usually the majority wins. In a closed-off system controlled by a limited number of people, a restricted set of trusted peers is allowed to vote. These peers can either be manually approved (static configuration), or be bootstrapped from a third party such as a local certificate authority controlled by the same operators.
But these traditional consensus algorithms do not work for public systems. If anyone can join the network and participate in establishing the consensus (not just a limited set of peers), then anyone may create many peers to try and take control of the consensus. This attack is often known as a [Sybil attack](https://en.wikipedia.org/wiki/Sybil_attack), pseudospoofing, or 51% attack.
Public systems of consensus like [Bitcoin](https://en.wikipedia.org/wiki/Bitcoin) use a Proof-of-Work algorithm ([PoW](https://en.wikipedia.org/wiki/Proof_of_work)) to reduce the risk of a Sybil attack. Such blockchains are not determined by a simple majority vote, but rather by a vote by the majority of global computing power. While this represents a mathematical achievement, it means whoever controls the majority of computing power controls the network. This situation already happened in Bitcoin's past, and may happen again.
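The mechanism itself fits in a few lines. Here is a minimal Python sketch of hash-based proof-of-work (a deliberate simplification: real Bitcoin hashes block headers with double SHA-256 against a much finer-grained difficulty target):

```python
import hashlib
from itertools import count

def mine(block: bytes, difficulty: int) -> int:
    """Find a nonce such that sha256(block + nonce) starts with
    `difficulty` zero hex digits: expensive to find, trivial to verify."""
    for nonce in count():
        digest = hashlib.sha256(block + str(nonce).encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce

nonce = mine(b"block data", difficulty=4)  # ~65536 hashes on average
proof = hashlib.sha256(b"block data" + str(nonce).encode()).hexdigest()
assert proof.startswith("0000")  # anyone can check this with a single hash
```

Raising `difficulty` by one hex digit multiplies the average mining cost by 16 while verification stays constant, which is exactly the asymmetry that makes Sybil attacks expensive (and mining an arms race).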
As we speak, the Bitcoin network already relies on a handful of giant computing pools, mostly running in China where coal-produced electricity is cheap ⁽³⁾. As time goes by, two things happen:
- difficulty goes up so mining becomes less rewarding, and may even cost you money depending on your hardware and your source of electricity; there was a time when mining on a GPU was profitable, and even before that, mining on a CPU was profitable for quite a while
- the blockchain grows more and more (7.5MB in 2010, 28GB in 2015, 320GB nowadays), so joining the network requires ever-growing resources
![XKCD comic about buying hand-sanitizer for 1BTC](2010_and_2020_2x.png)
To recap, hardware requirements go up, while economic incentives go down. What could possibly go wrong with this approach? Even worse, Bitcoin's crazy cult of raw computational power has serious ecological consequences. Bitcoin, a single application, uses more electricity than many countries. Also, the dedicated hardware ([ASIC](https://en.wikipedia.org/wiki/Application-specific_integrated_circuit)) built for Bitcoin will likely never be usable for anything else. As such, it appears global consensus is a dead end for decentralized systems.
## Gossip
[Gossip](https://en.wikipedia.org/wiki/Gossip_protocol) is a conceptual shift, in which we explicitly avoid global consensus. Instead, each peer has their own view of the network (truth), but can ask other peers for more information. Gossip is closer to how actual human interactions work: my local library may not have all books ever printed, but whatever i find in there i can freely share with my friends and neighbors. ⁽⁴⁾ Identity in gossip protocols is usually powered by asymmetric cryptography (like PGP), so that all messages across the network can be signed and authenticated.
Gossip can be achieved through any channel. Usually, it involves USB keys and local area networks (LAN). But nothing prevents us from using well-known locations on the Internet to exchange gossiped information, much like a local newspaper or community center would achieve in the physical world. That's essentially what the Secure ScuttleButt ([SSB](https://scuttlebutt.nz/)) protocol is doing with its *[pubs](https://ssbc.github.io/scuttlebutt-protocol-guide/#pubs)*, or PGP with *[keyservers](https://en.wikipedia.org/wiki/Key_server_%28cryptographic%29)*.
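The core mechanic can be modeled in a few lines of Python (a toy simulation, not any real protocol): each peer holds its own set of messages, and whenever two peers meet, over whatever channel, they merge their views.

```python
import random

# Each peer's view of the network: just the set of messages it has seen.
peers = {name: set() for name in "abcdefgh"}
peers["a"].add("release announcement v1.0")  # published on a single peer

def gossip_round(peers):
    """Every peer syncs with one random other peer; both end up
    with the union of their two message sets."""
    names = list(peers)
    for name in names:
        other = random.choice(names)
        merged = peers[name] | peers[other]
        peers[name] = peers[other] = merged

# No global coordination: the message reaches every peer's local
# truth after a handful of pairwise exchanges.
for _ in range(1000):
    if all("release announcement v1.0" in view for view in peers.values()):
        break
    gossip_round(peers)
assert all("release announcement v1.0" in view for view in peers.values())
```

Swap the random pairing for USB keys, LAN neighbors, or an SSB pub, and the model stays the same: availability spreads epidemically, with no single point of truth.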
In my view, gossip protocols include [IPFS](https://en.wikipedia.org/wiki/InterPlanetary_File_System) and [Bittorrent](https://en.wikipedia.org/wiki/BitTorrent), because they rely on Distributed Hash Tables ([DHTs](https://en.wikipedia.org/wiki/Distributed_hash_table)). Compared to a Bitcoin-style blockchain (where every peer needs to know about everything for consistency), in a DHT no peer knows about everything (reducing the hardware requirements to join), and consistency is ensured by content addressing (checksumming the stored information).
The database (DHT) is partitioned (divided) across many peers who each have their view of the network, but existing peers will gladly help you discover content they don't have, and ensuring authenticity of the data is not hard thanks to checksums. In this sense, i consider DHTs to be some kind of globally-consistent gossip.
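This partitioning can be sketched in the spirit of Kademlia, the DHT design behind Bittorrent and IPFS (the peer names and content below are purely illustrative): peers and documents share one checksum-derived key space, and a document lands on the peers whose identifiers are closest to its address by XOR distance.

```python
import hashlib

def key(data: bytes) -> int:
    """Map peers and documents into one shared 160-bit key space."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")

peers = {name: key(name.encode()) for name in
         ["alice.example", "bob.example", "carol.example", "dave.example"]}

def closest_peers(content: bytes, k: int = 2) -> list:
    """The k peers responsible for storing (or pointing to) this content:
    those with the smallest XOR distance to its checksum. No peer needs
    a full view of the table; each only tracks its part of the space."""
    target = key(content)
    return sorted(peers, key=lambda name: peers[name] ^ target)[:k]

holders = closest_peers(b"some git pack data")
assert len(holders) == 2 and set(holders) <= set(peers)
```

Lookups then hop greedily towards the target key, and the retrieved data is verified against its checksum, which is why a DHT can stay consistent without consensus.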
It's not (yet) a widely-researched topic, but it seems IPv6 multicast could be used to make gossiping a lower-level concern (on the network layer). If you're interested in this, be sure to check out a talk called [Privacy and decentralization with Multicast](https://archive.fosdem.org/2020/schedule/event/dip_librecast/).
# Popular self-defense against malicious activity
One may object that favoring interaction across many networks will open up big avenues for malicious activity. However, i would argue this does not have to be true. The same could be said of the Internet as a whole, or email in particular. But in practice, we have decades of experience (including many failures) in protecting users from spam and malicious activities in a decentralized context.
Even protocols that initially dismissed such concerns as secondary are eventually rediscovering well-known countermeasures. Some talks from the last [ActivityPub conference](https://conf.activitypub.rocks/#talks) ([watch on Peertube](https://conf.tube/video-channels/apconf_channel/videos)) touch on these topics. I personally recommend a talk entitled Architectures of Robust Openness.
What about information which you published by mistake, or information which you willingly published but may cause you harm? And what about spam?
## Revocation and deniability
In the first case, how do you take down a very secret piece of information you did not intend to publish? I am unaware of any way of achieving this in a decentralized manner. Key revocation processes (like PGP's) rely on good faith from other peers, who have incentives to honor your revocation certificate, as the revoked material was a public key, not a secret document.
However, even in a centralized system, there's only so much you can do following an accidental publication. If you published a password or private key, you can simply rotate it. However, if you published a secret document, it may be mirrored forever on some other machines, even if you force-pushed it out. In that sense, a forging repository is similar to a newspaper: better think twice about what you write in your article, because it will be impossible to destroy every copy once it's been distributed.
![git push --force may burn down your house](force-push.png)
[Plausible deniability](https://en.wikipedia.org/wiki/Plausible_deniability) addresses the second concern. In modern encrypted messengers (OTR/Signal/OMEMO), encryption keys are constantly rotated, and old keys are then published. Authenticity of a message is proven in the present, but you cannot be held responsible for a past message, because anyone could have forged the signature by recycling your now-public private key.
While it may seem like a secondary concern, plausible deniability can be a (literally) life-saving property in case of political dissent. That's why Veracrypt has [hidden volumes](https://www.veracrypt.fr/en/Hidden%20Volume.html). That's also why some people are now [calling on mail providers to publish their previous DKIM private keys](https://blog.cryptographyengineering.com/2020/11/16/ok-google-please-publish-your-dkim-secret-keys/): plaintext emails would be plausibly deniable, while retaining strong authenticity for PGP-signed emails (as intended).
In decentralized public systems like software forges, i am unaware of any effort to implement plausible deniability. By instinct, i feel like plausible deniability is incompatible with authentication, which is a desired property for secure forging. However, i don't know that for sure. I'm also not sure about the need for plausibly-deniable forging, as developers are usually tracked and persecuted through other means, like [mailing lists](https://en.wikipedia.org/wiki/DeCSS) or [mobile phone networks](https://news.ycombinator.com/item?id=21747424).
## SPAM
Another vector of harm is spam and otherwise undesired content. It is often said that combating spam in a decentralized environment is a harder problem. However, as previously explained, some decentralized systems have decades of experience in fighting malicious activities.
Simple techniques like [rate-limiting](https://en.wikipedia.org/wiki/Rate_limiting), webs of trust ([WoT](https://en.wikipedia.org/wiki/Web_of_trust)), or (sometimes user-overridable) [allow](https://en.wikipedia.org/wiki/Whitelisting)/[deny lists](https://en.wikipedia.org/wiki/Blacklist_(computing)) can go very far to protect us from spam. However, there are techniques invented for the web and for emails which should never be reused, because they are user-hostile antipatterns: IP range bans, and 3rd party denylisting.
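Rate-limiting in particular is cheap and requires no third party at all. Here is a minimal token-bucket sketch in Python (the parameters are illustrative; a real forge would keep one bucket per client address, account, or public key):

```python
import time

class TokenBucket:
    """Allow short bursts, but cap the sustained request rate."""
    def __init__(self, rate: float, burst: int):
        self.rate, self.burst = rate, burst   # tokens per second, max tokens
        self.tokens, self.last = float(burst), time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # over quota: reject, delay, or challenge

bucket = TokenBucket(rate=1.0, burst=5)
results = [bucket.allow() for _ in range(10)]  # a rapid 10-request burst
assert results[:5] == [True] * 5  # the burst is absorbed...
assert results.count(True) <= 6   # ...but the sustained rate is capped
```

Unlike IP-range bans, this degrades gracefully: an honest Tor user is slowed down at worst, while a flood is stopped regardless of where it comes from.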
Banning entire IP address ranges is a common practice on the web, and the reasoning is that if you received malicious activity from more than one address in an IP range, you'd better ban the entire range, or even all addresses registered from the same country. While this may protect you from an unskilled script kiddie spamming a forum, it will prevent a whole bunch of honest users from using your service.
For example, banning Tor users from your service will do little to protect you from undesired activities. Bad actors usually have a collection of machines/addresses they can start attacks from. When that is not the case, a few dollars will get them a bunch of virtual private servers, each with a dedicated address. For a few dollars more, they'll get access to a botnet running from residential IP addresses. This means blocking entire IP ranges will only block legitimate users, but will not stop bad actors.
Renting under-the-radar residential IP addresses was already common practice years ago, with people offered money or free TV services in exchange for placing [a blackbox](https://www.reddit.com/r/raspberry_pi/comments/3nc5mv/what_are_they_trying_to_do_with_this_pi_seems/) in their network. Nowadays, "smart" TVs, lightbulbs, cameras and other abominations will do the trick with even fewer safeguards. The Internet of Things is a capitalist nightmare that hinders our security (see [this talk](https://www.usenix.org/conference/usenixsecurity18/presentation/mickens)) and destroys the environment.
The other common antipattern in fighting malicious activities is delegating access control to a third party. In the context of a web browser's [adblocker](https://en.wikipedia.org/wiki/Adblocker), it makes sense to rely on denylists maintained by other people: you may not have time to do it yourself, and if you feel like something is missing on the page, you can instantly disable the adblocker or remove specific rules. On the server side, however, using a 3rd-party blocklist may introduce new problems.
How can users report that they are (wrongfully) blocked from your services, if they cannot access those services in the first place to find your contact information? How can they reach you if their email server is blocked, because a long time ago someone spammed from their IP address? In the specific case of email, most shared blocklists have a procedure to get unlisted. But some other blocklists don't.
On the web, CloudFlare is well-known as a privacy-hostile actor: they will terminate TLS connections intended for your service, snooping on communications between your service and its users. Moreover, CloudFlare blocks by default any activity that comes from privacy-friendly networks like Tor, trapping users in endless CAPTCHA loops. They also block [user agents](https://en.wikipedia.org/wiki/User_agent) they suspect to be bots, preventing legitimate [scrapers](https://en.wikipedia.org/wiki/Web_scraping) indexing or archiving content.
![Fuck cloudflare stickers](fuck-cloudflare.jpg)
Would you allow a private multinational corporation to stop and [stripsearch](https://en.wikipedia.org/wiki/Strip_search) anyone trying to reach your home's doorbell? Would you allow them to prevent your friends and neighbors from reaching you, because they refuse to be stripped? That's exactly what CloudFlare is doing in the digital world, and that is utterly unacceptable. All in all, **Fuck CloudFlare**! Yes, there's even [a song](https://polarisfm.bandcamp.com/releases) about it.
So, there's nothing wrong with banning malicious actors from your network and services. But what defines a malicious actor? Who gets to decide? These are extremely political questions, and delegating such power to privacy-hostile third-party services is definitely not the way to go.
# Existing decentralized forging solutions
Now that we have covered some of the reasons why decentralized forging is important and how to deal with malicious activities, let's take a look at projects people are actually working on.
## Federated authentication
Some web forges like Gitea offer [OpenID](https://en.wikipedia.org/wiki/OpenID#Technical_overview) federated authentication: you can use any OpenID account to authenticate yourself against such a selfhosted forge. Compared to [OAuth](https://en.wikipedia.org/wiki/OAuth), OpenID does not require the service operator to explicitly list all accepted identity providers. Instead of having a predetermined list of "login with microfacegoogapple" buttons, you have a free form for your OpenID URL.
Whether you're signing up for the first time or signing in, you give your OpenID URL to the forge. You will then be redirected to the corresponding OpenID server, authenticated (if you are not yet logged in), and prompted whether you want to authenticate on the forge. If you accept, you will be redirected back to the forge, which will know the OpenID server vouched for your identity.
Newer standards like [OpenID Connect](https://en.wikipedia.org/wiki/OpenID_Connect) also feature a well-known discovery mechanism, so you don't have to use a full URL to authenticate yourself, but a simple *user@server* address, as we are already used to. Federated authentication can also be achieved via other protocols, such as email confirmation, [IndieAuth](https://indieauth.net/) or XMPP/Jabber (XEP [0070](https://xmpp.org/extensions/xep-0070.html) or [0101](https://xmpp.org/extensions/xep-0101.html)).
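A simplified Python sketch of that discovery step (the full standard first resolves the issuer via WebFinger, and the HTTP fetch of the JSON configuration is elided here):

```python
def discovery_url(address: str) -> str:
    """Turn a user@server address into the provider-configuration URL
    defined by OpenID Connect's well-known discovery mechanism."""
    _user, server = address.rsplit("@", 1)
    return f"https://{server}/.well-known/openid-configuration"

assert (discovery_url("alice@forge.example")
        == "https://forge.example/.well-known/openid-configuration")
# The JSON document served at that URL contains "authorization_endpoint",
# where the forge redirects the user to authenticate and approve the login.
```

The point is that the forge needs zero prior configuration per identity provider: everything follows mechanically from the address the user typed.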
The federated authentication approach is brilliant because it's simple and focuses on practicality for end-users. However, it does not solve the problem of migrating projects between forges, nor does it enable you to forge from your usual tools/interfaces.
## Federated forging
Federated forging relies on a federation protocol and standard vocabulary to let users cooperate across servers. That means a whole ecosystem of interoperable clients and servers can be developed to suit everyone's needs. This approach is exemplified by the ForgeFed and Salut-à-Toi projects.
[ForgeFed](https://forgefed.peers.community/) is a forging extension for the [ActivityPub](https://activitypub.rocks/) federation protocol (the fediverse). It has a proof-of-concept implementation called [vervis](https://dev.angeley.es/s/fr33domlover/r/vervis) and aims to be implementable for any web forge. However, despite some interesting discussions on their forums, there seems to be limited activity implementation-wise.
[Salut-à-Toi](https://salut-a-toi.org/), on the other hand, is an existing suite of clients for the Jabber federation (XMPP protocol). They have CLI, web, desktop and TUI frontends to do social networking on Jabber. From this base, they released support for decentralized forging [in July 2018](https://www.goffi.org/b/Uj5MCqezCwQUuYvKhSFAwL/salut-alpha-contributors,-take-your-keyboards).
![A proposed patch on the salut-à-toi forge](sat-forge.png)
It's still a proof-of-concept, but it's reliable enough for the project to be selfhosted. In this context, *selfhosted* means that salut-à-toi is the forging software used to develop salut-à-toi itself. This all happens [here](https://bugs.goffi.org/).
While such features are not implemented yet, the fact that these federated forges rely on a standard vocabulary would help with migration between forges, without having to use custom APIs for every forge, as is common for Github/Sourceforge/Bugzilla migrations.
As the user interactions themselves are federated, and not just authentication, folks may use their client of choice to contribute to remote projects. This means lesser concerns for color themes or accessibility on the server side, because all of these questions would be addressed on the client side. This is very important for [accessibility](https://en.wikipedia.org/wiki/Accessibility), ensuring your needs are covered by the client software, and that a remote server cannot impact you in negative ways.
If your email client is hard for you to use, or otherwise unpleasant, you may use any email-compatible client that better suits your needs. With selfhosted, centralized forges, where the client interface is tightly-coupled to the server, every forge server needs to take extra steps to please everyone. Every forge you join to contribute to a project can make your user experience miserable. Imagine if you had to use a different user interface for every different server you're sending emails to?!
Federated forging, despite being in early stages, is an interesting approach. Let servers provide functions tailored to the project owners, and clients provide usability on your own terms.
## Gossip
An early attempt based on Bittorrent's DHT was [Gittorrent](https://blog.printf.net/articles/2015/05/29/announcing-gittorrent-a-decentralized-github/). Another one was [git-remote-ipfs](https://github.com/larsks/git-remote-ipfs), based on [IPFS](https://ipfs.io/) instead of Bittorrent. The project is now unmaintained, but it can be replicated [with the simple IPFS HTTP gateway](https://docs.ipfs.io/how-to/host-git-style-repo/). While these systems did not support tickets and patches, they were inspirational for more modern attempts like git-ssb and radicle.
[git-ssb](https://scuttlebot.io/apis/community/git-ssb.html) implements forging capabilities on top of the Secure ScuttleButt ([SSB](https://scuttlebutt.nz/)) protocol.
![Git SSB's web interface](git-ssb.png)
On the more exciting side of things, radicle uses strong cryptography to sign commits in a very integrated and user-friendly way. That's a strong advantage over most systems, in which signatures are optional and delegated to third-party tooling that can be hard to set up.
![radicle screenshot, taken from their homepage](radicle.png)
Although they're less polished for day-to-day use, these projects are very interesting. They helped pave the way for research into decentralized forging by showing that git and other decentralized version control systems (DVCS) play well with content-addressed storage, given that the commits themselves are content-addressed (a commit name is a mathematical checksum of everything it contains).
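This content addressing is easy to verify from first principles: a git object's name is the SHA-1 checksum of a short header plus the raw content, so a few lines of Python reproduce what `git hash-object` computes.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Reproduce git's object naming: sha1 over a 'blob <size>\\0'
    header followed by the raw content."""
    return hashlib.sha1(b"blob %d\0" % len(content) + content).hexdigest()

# Matches `git hash-object` for the empty file: any peer serving this
# blob can be checked against the identifier alone.
assert git_blob_id(b"") == "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
```

Commit objects are named the same way, and since a commit's content includes its tree and parent checksums, the whole history forms a Merkle chain, which is precisely what DHT-style storage needs.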
## Consensus
Apart from sketchy Silicon Valley startups, nobody is attempting to build consensus-based forging solutions, for all the reasons explained previously. When i started drafting this article, Radicle seemed keen on using blockchains and global consensus. But since then, they've reevaluated their decision, though not for any of the reasons i proposed, but rather because of legal problems due to [blockchain poisoning](https://jennyleung.io/2019/03/29/blockchain-poisoning-and-the-proliferation-of-state-privacy-rights/). Nowadays, Radicle uses Ethereum, but only as a complement (not a replacement) to their gossip protocol.
## Not covered here
Many more projects over the years have experimented with storing forging interactions (metadata like bugs and pull requests) as well-known files within the repository itself. Some of them are specific to git: [git-dit](https://github.com/neithernut/git-dit), [ticgit](https://github.com/jeffWelling/ticgit), [git-bug](https://github.com/MichaelMure/git-bug), [git-issues](https://github.com/dspinellis/git-issue). Others intend to be used with other versioning systems (DVCS-agnostic): [artemis](https://www.mrzv.org/software/artemis/), [bugs-everywhere](https://github.com/kalkin/be), [ditz](https://github.com/jashmenn/ditz), [sit](https://github.com/sit-fyi/issue-tracking).
I will not go into more details about them, because these systems only worry about the semantics of forging (vocabulary), but do not emphasize how to publicize changes. For example, these tools would be great for a single team having access to a common repository to update tickets in an offline-first setting, then merging them on the shared remote when they're back online. But they do not address cooperation with strangers, unless you give anyone permission to publish a new branch to your remote, which is probably a terrible idea. However, that's just my personal, uninformed opinion: if you have counter-arguments about how in-band storage of forging interactions could be used for real-world cooperation with strangers, i'd be glad to hear about it!
Lastly, i didn't mention [Fossil SCM](https://fossil-scm.org/) because i'm not familiar with it, and from reading the docs, i'm very confused about how it approaches cooperation with strangers. It appears forging interactions are stored within the repository itself, but does that mean that Fossil merges every interaction it hears about? Or is Fossil only intended for use in a closed team? Let me know if you have interesting articles to learn more about Fossil.
# With interoperability, please
After this brief review of the existing landscape of decentralized forging, i would like to argue for [interoperability](https://en.wikipedia.org/wiki/Interoperability). If you're not familiar with this concept, it's a key concern for the accessibility/usability of both physical and digital systems: interoperability is the property whereby two systems addressing the same use cases can be used interchangeably. For example, a broken lightbulb can be replaced by any lightbulb following the same socket/voltage standards, no matter how it works internally to produce light.
In fact, interoperability is the default state of things throughout nature. To make fire, you can burn any sort of wood. If your window is broken and you don't have any glass at hand, you can replace it with any material that will prevent air flowing through. Interoperability is a very political topic, and a key concern to prevent the emergence of monopolies. If you'd like to know more about it, i strongly recommend a talk called [We used to have cake, now we've barely got icing](TODO).
So while the approaches to decentralized forging we've talked about are very different in some regards, there is no technical reason why they could not play well together and interoperate consistently. As a proof of concept, [git-issue](https://github.com/dspinellis/git-issue), which we mentioned in the previous section, can actually synchronise issues contained within the repository with Github and Gitlab issues. It could just as well synchronise with any selfhosted forge (federated or not), or publish the issues on the radicle network.
The difference between federated and p2p systems is big, but hybrid p2p/federated systems have a lot of value. If we develop open standards, there is no technical barrier for a peer-to-peer forge to synchronise with a federated web/XMPP forge. It may be hard to wrap one's head around, and may require a lot of implementation work, but it's entirely possible. Likewise, a federated forge could federate both via ForgeFed and via XMPP. And it could itself be a peer in a peer-to-peer forge, so that pull requests submitted on Radicle may automatically appear on your web forge.
Not all forges have to understand each other. But it's important that we at least try, because the current fragmentation across tiny different ecosystems is hostile to new contributions from people who are used to different workflows and interfaces.
Beyond cooperation, interoperability would also ease backups, forks and migrations. Migrating your whole project from one forge to another would only take a single, unprivileged action. When forking a project, you would have a choice whether or not to inherit all of its issues and pull requests. So if you're only forking to work on a single patch, you could discard them; but in case you want to take over an abandoned project, you would inherit all of the project's history and discussions, not just the commits.
You may have noticed i did not mention the email workflow in this section about interoperability. That's because email bugtracking and patching is far from being standardized. An issue tracker like debbugs could rather easily be interoperated with, because it has a somewhat-specified grammar for interacting with tickets. But what about less specified workflows? My personal feeling is that these different workflows should be standardized.
Many vocabulary and security concerns expressed in this article would equally apply to email forging. But to be honest, i'm not knowledgeable enough about email-based forging to provide a good insight on this topic. I'm hoping people from the [sourcehut forge community](https://sourcehut.org/) and other git-email wizards can find inspiration in this call to decentralized forging, come around the table, and figure out clever ways to integrate into a broader ecosystem.
# Code signing and nomadic identity
So far, i've talked about different approaches to decentralized forging and how they could interoperate. However, one question i've left in the cupboard is how to ensure the authenticity of interactions across different networks.
Code signing in forging usually uses PGP keys and signatures to authenticate commits and *refs*. In most cases, it is considered a DVCS-level concern and is left untouched by the forge, except maybe to display a symbol for a valid signature alongside a commit. While we may choose to trust the forge regarding commit signatures, we may also verify these on our end. The tooling for verifying signatures is lacking, although there is recent progress with the [GNU Guix project](https://guix.gnu.org/) releasing the amazing [guix git authenticate](https://guix.gnu.org/manual/en/html_node/Invoking-guix-git-authenticate.html) command for bootstrapping a secure software supply chain.
However, forging interactions such as issues are typically unsigned, and cannot be verified. In systems like ActivityPub and radicle, these interactions are signed, but with varying levels of reliability. While radicle has strong security guarantees because every client owns their keys, email/ActivityPub lets the server perform signatures for the users: a compromised server could compromise a lot of users and therefore such signatures are unreliable from a security perspective. We could take this into consideration when developing forging protocols, and ensure we can embed signatures (like PGP) into interactions.
For interoperability concerns, each forge could implement different security levels, and let maintainers choose the security properties they expect for external contributions, depending on their practical security needs. A funny IRC bot may choose to emphasize low-barrier contribution across many forges over security, while a distribution may enforce stricter security guidelines, allowing contributions only from a trusted webforge and PGP-signed emails. In any case, we need more user-friendly tools for making and verifying signatures.
Another concern is how to deal with migrations. If my personal account is migrated across servers, or i'm rotating/changing keys, how do i let others know about it in a secure manner? In the federated world, this concern has been addressed by the ZOT protocol, which was initially developed for Hubzilla's nomadic identity system. ZOT lets you take your content and your friends to a new server at any given moment.
This is achieved by adding a crypto-identity layer around server-based identity (`user@server`). This crypto-identity, corresponding to a keypair (think PGP), is bootstrapped in a [TOFU](https://en.wikipedia.org/wiki/Trust_on_first_use) manner (Trust On First Use) when federating with a remote user on a server that supports the ZOT protocol. The server will give back the information you requested, and let you know the nomadic identity keypair for the corresponding user. Then, you can fetch the corresponding ZOT profile from the server to discover other identities signed with this keypair.
For example, let's imagine for a second that [tildegit.org](https://tildegit.org/) and [framagit.org](https://framagit.org/) both supported the ZOT protocol and some form of federated forging. My ZOT tooling would generate a keypair that would advertise my accounts on both forges. When someone clones one of my projects, their ZOT-enabled client would save this identity mapping somewhere. This way, if one of the two servers ever closes, the client would immediately know to try and find my project on the other forge.
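Here is a hypothetical sketch of that client-side bookkeeping in Python (the data shapes and function names are mine for illustration, not ZOT's):

```python
# Client-side cache, pinned on first contact (TOFU): public key -> known homes.
identities = {}

def first_contact(pubkey, advertised_homes):
    """Trust On First Use: remember the mapping the first time we see it.
    Afterwards, only an update signed by `pubkey` should change it."""
    identities.setdefault(pubkey, list(advertised_homes))

def locate(pubkey, reachable):
    """Return the first advertised home that still answers, if any."""
    for home in identities.get(pubkey, []):
        if reachable(home):
            return home
    return None

# An illustrative keypair identifier advertising two forge accounts.
first_contact("ed25519:AbCd...", ["tildegit.org", "framagit.org"])
# tildegit.org goes down: the client transparently falls back to the mirror.
assert locate("ed25519:AbCd...", lambda home: home == "framagit.org") == "framagit.org"
```

The hard parts, naturally, are the signed updates to this mapping and what to do when every advertised home is unreachable, which is exactly the discovery problem discussed below.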
In practice, there would be a lot more subtlety to represent actual mapping between projects (mirrors), and to map additional keypairs on p2p networks (such as radicle) to a single identity. However, a nomadic identity system doesn't have to be much more complex than that.
The more interesting implementation concern is how to store, update and retrieve information about a nomadic identity. With the current ZOT implementations (to my knowledge), identities are stored as signed JSON blobs that you retrieve opportunistically from a remote server (TOFU). However, that means if all of your declared servers are offline (for instance, if there's only one of them), i cannot automatically discover your updated nomadic identity (with your new forge servers).
I believe a crypto-secure, decentralized naming system such as [GNS](https://gnunet.org/en/gns.html) or [IPNS](https://docs.ipfs.io/concepts/ipns/) would greatly benefit the nomadic identity experience. DNS could also be used here, but as explained before, DNS is highly vulnerable to determined attackers. Introducing DNS as a discovery mechanism for nomadic identities would weaken the whole system, and make it much harder to get rid of in the future (for backwards-compatibility).
With GNS/IPNS (or any other equivalent system), people would only need to advertise their public key on every forge, and the nomadic identity mapping would be fetched in a secure manner. Considering GNS is in fact a signed and encrypted peer-to-peer key-value store itself, we could use GNS itself to store nomadic identity information (using well-known keys). IPNS, on the other hand, only contains an updatable pointer to an IPFS content-addressed directory. In this case, we would use well-known files within the target directory.
So, migration and failover across forges should also be feasible, despite other challenges not presented here, such as how to ensure consistency across decentralized mirrors, and what to do in case of conflicts.
# Conclusion
TODO: social recovery and backup mechanism
TODO: forging over onions
Decentralized forging is in my view the top priority for free software in the coming decade. The Internet and free software have a symbiotic relationship where one cannot exist without the other. They are two facets of the same software supply chain, and any harm done to one side will have negative consequences for the other. Both are under relentless attack by tyrants (including pretend-democracies like France or the USA) and multinational corporations (like Microsoft and Google).
Developing decentralized forging tooling is the only way to save free software and the Internet as we know them, and may even lower the barrier to contribution for smaller community projects.
Of course, decentralized forging will not save us from the rise of fascism in the physical and digital world. People will have to stand up to their oppressors. Companies and State infrastructure designed to destroy nature and make people's lives miserable will have to burn. *But we are not afraid of ashes, because we carry a new world, right here in our hearts. And this world is growing, at this very minute.*
Finally, you may wonder how i envision my own contribution to the decentralized forging ecosystem. I may not be competent enough to contribute useful code to the projects i listed above, but i may articulate critical feedback from a user's perspective (as i did in this post). But to be honest with you, i have other plans.
In the past years, i've been struggling with shell scripts to articulate various repositories and trigger tasks from updates. From my painful experiences automatically deploying this website (in which the theme is a submodule), i've come up with what i think is a simple, coherent, and user-friendly Continuous Integration and Delivery platform: the [forgesuite](TODO), which contains two tools: forgebuild and forgehook.
On a high level, forgehook can receive update notifications from many forges (typically, webhooks) and expose a standard semantic (ForgeFed) representation of each notification, indicating whether it's a push event, a comment on a ticket, or a new/updated pull request. forgehook then manages local and federated subscriptions to those notifications, filters them according to your subscription settings, and transmits them to other parts of your infrastructure. For example, maybe your IRC bot would like to know about all events happening across your project in order to announce them on IRC, but a CI test suite may only be interested in push and pull-request events.
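The normalize-then-filter idea can be sketched as follows. This is only an illustration of the concept: the field names, event types and `dispatch` function are invented for the example, and are not forgehook's or ForgeFed's actual schema.

```python
# Illustrative sketch: normalize forge-specific webhooks into a common
# event type, then fan them out to subscribers according to their filters.

def normalize(forge, payload):
    """Map a forge-specific webhook payload to a common event shape."""
    if forge == "gitea" and "commits" in payload:
        return {"type": "push", "repo": payload["repository"]}
    if forge == "gitea" and "pull_request" in payload:
        return {"type": "pull_request", "repo": payload["repository"]}
    return {"type": "unknown", "repo": payload.get("repository")}

def dispatch(event, subscriptions):
    """Deliver the event to every subscriber whose filter matches."""
    return [name for name, wanted in subscriptions.items()
            if event["type"] in wanted]

subs = {
    "irc-bot": {"push", "pull_request", "issue_comment"},  # announces everything
    "ci":      {"push", "pull_request"},                   # only code changes
}
event = normalize("gitea", {"commits": [], "repository": "forgesuite"})
print(dispatch(event, subs))  # both subscribers want push events
```

The point of the common representation is that subscribers like the IRC bot or the CI suite never need to know which forge the notification came from.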
forgebuild, on the other side of the chain, fetches updates from many remote repositories, and applies local settings to decide whether to run specific tasks or not. For now, only git and mercurial are supported, but any other version control system can be implemented by following a simple interface. Automated submodule updates are a key feature of forgebuild, to let you update any submodule, and automatically trigger corresponding tasks if submodule updates are enabled. forgebuild follows a simple CLI interface, and as such your CI/CD tasks can be written in your favorite language.
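The fetch-then-maybe-run cycle forgebuild performs can be reduced to a small decision: did the remote move since last time, and are tasks enabled for this repository? The sketch below illustrates only that logic; the function names and state layout are invented, not forgebuild's real interface, and a real run would shell out to `git fetch` or `hg pull` instead of receiving the remote head as an argument.

```python
# Illustrative decision logic behind a fetch-and-run CI cycle:
# trigger the repo's task only when the remote moved and tasks are enabled.

def check_repo(name, remote_head, state, task_enabled):
    """Return True when the repo's task should run this cycle."""
    changed = state.get(name) != remote_head
    state[name] = remote_head          # remember what we last saw
    return changed and task_enabled

state = {}
print(check_repo("website", "abc123", state, True))   # first sight: run
print(check_repo("website", "abc123", state, True))   # unchanged: skip
print(check_repo("website", "def456", state, False))  # changed, but disabled
```

Because the interface boils down to "run this command when this repo changed", the tasks themselves can be written in any language, which is the design choice the paragraph above describes.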
While the forgesuite is still in its early stages, i believe it's already capable of empowering people. Less-experienced users who are somewhat familiar with the command line should find it very convenient for automating simple tasks, while power users should be able to integrate it with their existing tooling and infrastructure without concern. I know there is room for improvement, so if the forgesuite fails you somehow, i consider that a bug! Don't hesitate to report critical feedback.
I will write more about the forgesuite in another blogpost, so stay tuned. In the meantime, happy hacking!
⁽¹⁾ Microsoft has tried to buy themselves a free-software friendly public image in the past years. This obvious openwashing process has consisted in bringing free software to their closed platform (Windows Subsystem for Linux), and open-sourcing a few projects that they could not monetize (VSCode), while packaging them with spyware (telemetry) for free platforms. Furthermore, Microsoft has been known for decades to cooperate with intelligence services (PRISM, NSAKEY) and oppressive regimes.
⁽²⁾ Catalonia has a long history of repression by the spanish state. Microsoft is just the latest technological aid for that aim, just like Hitler and Mussolini in their time provided weapons to support Franco's coup, and crush the democratically-elected social front, then the social revolution.
⁽³⁾ Cheap electricity on the other side of the world is the only reason we have cheap hardware. It takes considerably more energy to produce a low-powered device like a Raspberry Pi than to keep an old computer running for decades, but the economic incentives are not aligned.
⁽⁴⁾ In many countries, sharing copyrighted material in a restricted circle is entirely legal as long as you obtained the material in a legal manner. For example, copying a book or DVD obtained from a library is legal. Throughout the french colonial empire, this is defined by the right to private copy ([copie privée](https://fr.wikipedia.org/wiki/Copie_priv%C3%A9e)).
