FOREWORD

NOTE 02/23/2022: Most of the project is being rearchitected/rewritten. It should be ready by the end of the month! If you still see this notice in March 2022, either I've continuously failed to deliver, or I have forgotten to remove this note. This README is up-to-date but there may be broken links for a few days.

Introduction

The forge suite (forgesuite project) is a collection of high-level tools to automate actions on remote repositories. It aims to replace all-in-one continuous integration/delivery suites (like Drone or Gitlab CI) with simple components that can be used independently as part of your existing infrastructure, or as part of a consistent whole:

  • forgebuild: check for remote/submodule updates (Git/Mercurial) and trigger tasks
  • forgehook: receive and validate webhooks sent by web forges (Gitea, Gitlab, Github), then trigger tasks
  • forgecheck: validate forge webhooks from the command line (CLI)
  • forgesub: (unimplemented) subscribe to updates from repositories you don't own locally (POSIX multi-user) or via federated webhooks
  • forgetest: (unimplemented) run structured/hierarchical tests and easily compare results

Each of these tools can be plugged into one another. But even if you integrate them in your existing environment, they should ease your life and not get in your way. If a program from the forge suite fails to meet your expectations of easiness, it's considered a bug and should be reported as soon as possible.

Architecture

In this section, we present an architectural overview of a complex forge suite setup ([diagram source]({{ url(path="schema.dot") }})):

![Diagram showing the forge sending a webhook to a forgehook endpoint, which validates it using a forgecheck CLI, then passes the update to forgebuild, which triggers tasks]({{ url(path="schema.dot.png") }})

The diagram was kept simple for educational purposes, but the different project READMEs will give you a lot more information about all the advanced features you can find. forgebuild in particular has many more configuration options and supported use-cases (one-off tasks, submodule tracking, per-host config).

Apart from the remote webforge (Gitea, Gitlab, or Github), we have three components running on our build server. forgehook will receive the webhooks produced by the webforge when code has been pushed, will extract the webhook payload as well as the claimed secret (either a signature or a token), then call forgecheck to validate against the actual secret.

The reason for calling a third-party program to perform the secret validation is detailed in the Security section. In short, it ensures that the web server cannot read the actual secrets. This is particularly useful in the case of shared pubnix/tilde servers where:

  • the server operator is trusted, but other users on the machine aren't
  • all web processes run as a system user (such as http or www-data)
  • the user has access to a shell account where they can set up programs and set the suid bit so that the web process can run forgecheck under their account
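
To make the validation step more concrete, here is a minimal sketch in Python of the kind of check forgehook delegates to forgecheck. It is not the forgecheck implementation: the function names are hypothetical, and it assumes the signature is a hex-encoded HMAC-SHA256 of the raw payload, while the token case is a plain shared string.

```python
import hashlib
import hmac


def check_signature(secret: bytes, payload: bytes, claimed_hex: str) -> bool:
    """Return True if claimed_hex is the HMAC-SHA256 of payload under secret.

    Assumption: the forge signs the raw request body and sends the digest
    as a hexadecimal string.
    """
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison, so timing does not reveal how much matched.
    return hmac.compare_digest(expected.encode(), claimed_hex.strip().lower().encode())


def check_token(secret: str, claimed: str) -> bool:
    """Return True if a plain shared token matches the stored secret."""
    return hmac.compare_digest(secret.encode(), claimed.encode())
```

Only the process running these functions needs to read the secret. The web process just passes along the payload and the claimed value, and gets back a yes or a no.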

Once validation is performed, forgebuild will pick up the tasks to run in one of two ways:

  • by being called directly by forgehook (only if both run under the same user account)
  • by reading a notification saved in a common inbox folder that forgehook can write to and forgebuild --inbox can read (when called from a cron job)
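
The inbox mechanism can be pictured with a short Python sketch. Everything here is an assumption made for illustration, not the actual on-disk format: each notification is an empty file named after the repository, the function names are hypothetical, and the directory permissions shared by the two accounts are left out.

```python
from pathlib import Path


def notify(inbox: Path, repo_name: str) -> Path:
    """Writer side: leave a notification for repo_name in the inbox.

    Assumption: repo_name is a plain file name (no slashes). The file is
    empty, its name carries the whole message.
    """
    inbox.mkdir(parents=True, exist_ok=True)
    notice = inbox / repo_name
    notice.touch()  # several pushes before the next run collapse into one notice
    return notice


def drain(inbox: Path) -> list[str]:
    """Reader side: return pending repository names and clear the inbox."""
    pending = []
    for entry in sorted(inbox.iterdir()):
        pending.append(entry.name)
        entry.unlink()
    return pending
```

With this layout the writer never needs to run anything as the reader's user: it only creates files, and the cron-driven reader decides when to act on them.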

Principles

The forgesuite project is driven by technical and political motivations. Our overall principles can be summarized in three points:

  • Security through simplicity: everything can be understood, and audited
  • Composability: the tools should not get in your way, under any circumstance
  • Specification: every tool is a standardized interface for which multiple implementations may exist

Don't ask maintainers for permission, just forge on!

All major CI/CD platforms assume that the repository itself should contain the tasks to be run, for example in a .gitlab-ci.yml file. This top-down deployment model is well suited to an organization controlling the whole of its software supply chain, but it severely restricts third-party involvement, which mostly hinders volunteer-run projects.

The forge suite adopts the opposite approach, where anyone can receive updates from remote repositories, and run the tasks they wish. This allows anyone within or without your projects to set up new test suites, benchmarks, and integrations. The tasks and configuration can also be shared (across your machines, or with everyone else) in a repository, as the secrets can reside anywhere else on the machine. The applications are endless and should benefit your projects in many ways.

Do you really need virtual machines or containers to run tests?

It appears these days the hype is all about cloud services, and auto-provisioning containers from within virtual machines. Surely if you need to scale 1000 times in an hour the Kubernetes meta-hypervisor is your friend. But do you really need that? Can you even afford it? I'm certainly not rich, but I've got a laptop, a shared account on a tilde server, and a VM, so why would I need to waste years mastering all the footguns provided by cloud companies when I've got all the computing power I need at hand?

With the forgesuite you can run specific tasks under different user accounts, or on specific machines (filtered by hostname). If you really need to use Docker/LXC to run that task, sure, just do it. Just write the magic virtualization incantations in your task files and be happy. You can even write a friendly abstraction in a helper script to do it semi-magically for all your tasks if that's what you really want. But if you don't need that, you'll be happy that your CI/CD does not require setting up an entire virtualized operating system every time it needs to run 10 lines of script.

Screw platform-specific configuration formats

Why does every CI/CD tool have to reinvent its own configuration format? Github Actions, Drone and Gitlab are three different flavors of YAML doing 99% of the very same thing. Are they incapable of writing a specification to standardize? Or are they just happy to capture worry-free sysadmins in their ecosystem, where you need every single browser to regenerate the same diagram over and over again, or where you can clone the repository from Tor only if you append ".git" to the HTTPS URL?

So wait, at this point you must realize I'm reinventing yet another standard, so why would we even do that? First, because why not, since what exists does not suit our needs? But mostly because the forgesuite specifications are not tied to an implementation and will not add/remove features just for the sake of breaking interoperability: the test suites are the only source of truth, and we gladly accept patches for new use-cases and test cases. It's still early in the life of the forge suite so things may still change at this point, but I have confidence we can version the interfaces before that ever becomes a problem.

Code Of Conduct

This project abides by the ~fr operating principles.

Security

The forge suite has not received a security audit

The forge suite does not (yet) aim to be a highly-secure and fully-reproducible CI/CD pipeline, though it may be used to build one. In particular, software supply chain attacks can be mitigated with PGP signatures and channel introductions, and reproducibility issues can be tackled by a reproducible distribution such as NixOS or GNU Guix.

The modular architecture intends to make security easier by standardizing interfaces (CLI or HTTP) and expected behavior. However, implementation-specific bugs may bite you, especially with implementations written in interpreted languages (scripts). Please don't use these tools for any sensitive projects.

You should be aware that setting up a public forgehook endpoint can be a vector for Denial of Service attacks, as the server has to compute a signature to check against each client-provided claim.

License

Everything is licensed under the AGPLv3 license, unless noted otherwise. The logo is an exception, as I have merely copied it from the Internet.

FAQ

Why are there no integration tests for the entire stack yet?

Let's write them!

My tests are long, how can they be started in the background?

Just start the long-running process in the background from your task script/program. The forgehook endpoint should return success/error instantly (based on the secret validation), although that has not been tested yet.
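
If your task is a Python program, starting the long-running part in the background could look like the sketch below. It only relies on the standard library. The function name and the log file are assumptions made for this example, and `start_new_session` only has an effect on POSIX systems.

```python
import subprocess


def start_in_background(command: list[str], log_path: str) -> int:
    """Launch command detached from the caller and return its PID at once."""
    with open(log_path, "ab") as log:
        process = subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,
            stdout=log,                # keep the output for later inspection
            stderr=subprocess.STDOUT,
            start_new_session=True,    # do not die with the caller's session
        )
    # The child holds its own copy of the log descriptor, so closing ours is safe.
    return process.pid
```

The task itself returns as soon as the child is started, so whatever triggered it does not have to wait for the tests to finish.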

What type of webhooks are supported?

So far, only "push" notifications are supported (code was updated on the repository). In the future, I would like to pass a standardized representation of the webhook (maybe forgefed format?) to the task's STDIN, so that it may take different actions based on the type of webhook received.

It should be possible to use many more webhook sources/types, even outside of the forging ecosystem. I have no use for that, but patches are welcome!

How can I add a build machine without reconfiguring webhooks on all my repos?

This is not possible yet, but will be implemented soon (?!) as forgesub. If you'd like to see what I had in mind previously, there was a bash implementation for POSIX multi-user subscription (using static sudo rules).