Re: File search progress: database review and question on triggers
From: Pierre Neidhardt
Subject: Re: File search progress: database review and question on triggers
Date: Mon, 05 Oct 2020 20:53:01 +0200
Hi Ludo!
Ludovic Courtès <ludo@gnu.org> writes:
> Nice!
Thanks!
> Could you post a summary of what you have done, what’s left to do, and
> how you’d like to integrate it? (If you’ve already done it, my
> apologies, but you can resend a link. :-))
What I've done: mostly a database benchmark.
- Textual database: slow and no smaller than SQLite. Not worth it, I believe.
- SQLite without full-text search: fast, supports classic patterns
(e.g. "foo*bar") but does not support word permutations.
- SQLite with full-text search: fast, supports word permutations but
does not support suffix-matching (e.g. "bar" won't match "foobar").
Size is about the same as without full-text search.
- Include synopsis and descriptions. Maybe we should include all fields
that are searched by `guix search`. This incurs a cost on the
database size but it would fix the `guix search` speed issue. Size
increases by some 10 MiB.
I say we go with SQLite full-text search for now, with all package
details. Switching to plain SQLite without full-text search is only a
minor adjustment, which we can decide on later when merging the final
patch. Same if we decide not to include the description, synopsis, etc.
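
To make the trade-off between the two SQLite variants concrete, here is a
minimal sketch. It is not the actual file search schema: the table names,
columns and sample rows are made up for illustration, and it assumes
Python's sqlite3 module is linked against an SQLite built with FTS5.

```python
import sqlite3

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (package TEXT, path TEXT)")
conn.execute("CREATE VIRTUAL TABLE files_fts USING fts5(package, path)")

rows = [
    ("hello", "/bin/hello"),
    ("foobar", "/share/doc/foobar/README"),
    ("emacs", "/share/emacs/site-lisp/foo-bar.el"),
]
conn.executemany("INSERT INTO files VALUES (?, ?)", rows)
conn.executemany("INSERT INTO files_fts VALUES (?, ?)", rows)


def glob_search(pattern):
    """Plain table: classic GLOB patterns, word order matters."""
    cur = conn.execute(
        "SELECT path FROM files WHERE path GLOB ? ORDER BY path", (pattern,)
    )
    return [r[0] for r in cur]


def fts_search(query):
    """Full-text table: whole-word (token) matching, any word order."""
    cur = conn.execute(
        "SELECT path FROM files_fts WHERE files_fts MATCH ? ORDER BY path",
        (query,),
    )
    return [r[0] for r in cur]


# Classic pattern: matches both "foobar" and "foo-bar".
print(glob_search("*foo*bar*"))
# Permuted words: GLOB finds nothing.
print(glob_search("*bar*foo*"))
# Full-text search accepts the words in any order...
print(fts_search("bar foo"))
# ...but "bar" alone does not match the single token "foobar".
print(fts_search("bar"))
# Prefix queries are supported, suffix queries are not.
print(fts_search("foo*"))
```

With the default FTS5 tokenizer, "foo-bar.el" is split into the tokens
"foo", "bar" and "el", while "foobar" stays one token, which is why the
suffix query misses it.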
What's left to do:
- Populate the database on demand, either after a `guix build` or from a
`guix filesearch ...`. This is important so that `guix filesearch`
works on packages built locally. If we hook into `guix build`, I need
help to know where to plug it in.
- Adapt Cuirass so that it builds its file database.
I need pointers to get started here.
- Sync the databases from the substitute server to the client when
running `guix filesearch`. For this I suggest we send the compressed
database corresponding to a guix generation over the network (around
10 MiB). Not sure sending just the delta is worth it.
- Find a way to garbage-collect the database(s). My intuition is that
we should have 1 database per Guix checkout and when we `guix gc` a
Guix checkout we collect the corresponding database.
I would store the databases in /var/guix/...
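
As a rough illustration of the "1 database per Guix checkout" idea, here is
a sketch of the collection step. Everything in it is an assumption: the
`<checkout-id>.db` naming, the function name, and the way the set of live
checkouts is obtained are hypothetical, and a temporary directory stands in
for the real storage location.

```python
import tempfile
from pathlib import Path


def collect_stale_databases(db_dir, live_checkouts):
    """Delete every '<checkout-id>.db' whose checkout is no longer live.

    Hypothetical helper: the naming scheme and the notion of a
    'checkout id' are assumptions made for this sketch.
    """
    removed = []
    for db in sorted(Path(db_dir).glob("*.db")):
        if db.stem not in live_checkouts:
            db.unlink()
            removed.append(db.name)
    return removed


# Demo in a throwaway directory with made-up checkout ids.
with tempfile.TemporaryDirectory() as tmp:
    for checkout in ("aaa111", "bbb222", "ccc333"):
        (Path(tmp) / f"{checkout}.db").touch()

    # Pretend only checkout "bbb222" survived the garbage collection.
    removed = collect_stale_databases(tmp, {"bbb222"})
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print(removed)
print(remaining)
```

In a real integration the set of live checkouts would have to come from
whatever `guix gc` considers reachable; that part is left open here.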
Comments and help welcome! :)
--
Pierre Neidhardt
https://ambrevar.xyz/