From: zamfofex
Subject: Re: Guidelines for pre-trained ML model weight binaries (Was re: Where should we put machine learning model parameters?)
Date: Mon, 29 May 2023 00:57:38 -0300 (BRT)
> To me, there is no doubt that neural networks are a threat to user
> autonomy: hard to train by yourself without very expensive hardware,
> next to impossible without proprietary software, plus you need that huge
> amount of data available to begin with.
>
> As a project, we don’t have guidelines about this though. I don’t know
> if we can come up with general guidelines or if we should, at least as a
> start, look at things on a case-by-case basis.
I feel it’s important to have a guideline for this, at least if the issue
keeps coming up frequently.
To me, a sensible *base criterion* is whether the user is able to practically
produce their own networks (either from scratch, or by using an existing
network) using free software alone. I feel this addresses the issue of user
autonomy being at risk.
By “practically produce”, I mean within reasonable time (somewhere between a
few minutes and a few weeks, depending on the scope) and using exclusively
hardware they physically own (assuming they own reasonably recent hardware
capable of running Guix, at least).
The effect is that the user shouldn’t be bound to the provided networks, and
should be able to train their own for their own purposes if they so choose,
even if using the existing networks during that training. (And in the context
of Guix, the neural network needs to be packaged for the user to be able to use
it that way.)
Regarding Lc0 specifically, that is already possible! The Lc0 project has a
training client that can use existing networks and a set of configurations to
train your own special‐purpose network. (And although this client supports
proprietary software, it is able to run using exclusively free software too.)
In fact, there are already community‐provided networks for Lc0[1], which
sometimes can play even more accurately than the official ones (or otherwise
play differently in various specialised ways).
Of course, this might seem as dissatisfying as providing binary seeds for
bootstrapping software is, in the sense that you need an existing network to
train further networks, rather than being able to start a network from
scratch. But I feel that (at least under my “base criterion”) the effects of
this on the user are not as significant, since the effects of the networks are
limited compared to those of actual programs.
That is, even though you might want to argue that “the network affects the
behavior of the program using it” in the same way as “a Python source file
affects the behavior of its interpreter”, the effect of the network file on
the program is limited compared to that of a Python program. It’s much more
like how an image affects the behavior of the program displaying it. More
concretely, there isn’t a trust issue to be solved, because the network
doesn’t have as many capabilities (theoretical or practical) as a program
does.
I say “practical capabilities” in the sense of being able to access user
resources and data for purposes the user doesn’t want. (E.g. by
accessing/modifying their files, sharing their data over the Internet without
their knowledge, etc.)
I say “theoretical capabilities” in the sense of doing things the user
doesn’t want or expect, i.e. thinking of computation as a tool for some
purpose. (E.g. even sandboxed/containerised programs can be harmful, because
the program could behave in a way the user doesn’t want without letting the
user do anything about it.)
The only autonomy‐disrespecting (or perhaps rather freedom‐disrespecting)
situation is when the user is stuck with the provided network and doesn’t have
any tools to (practically) change how the program behaves by creating a
different network that suits their needs. (Which is what my “base criterion”
tries to defend against.) This is not the case with Lc0, as I said.
Finally, I will also note that, in addition to the aforementioned[2] fact that
Stockfish (already packaged) uses pre‐trained neural networks too, the
latest versions of Stockfish (from 14 onward) use neural networks that have
themselves been indirectly trained using networks from the Lc0 project.[3]
[1]: See <https://lczero.org/play/networks/basics/#training-data>
[2]: It was mentioned in <https://issues.guix.gnu.org/63088>
[3]: See <https://stockfishchess.org/blog/2021/stockfish-14/>