Re: Adding a few more finder keywords

From: Stephen J. Turnbull
Subject: Re: Adding a few more finder keywords
Date: Tue, 09 Jun 2015 13:39:51 +0900

Stefan Monnier writes:

 > We could decide that the specific keywords are unwanted, tho.

An "unwanted" keyword doesn't exist, though.  Somebody wanted it, or it
wouldn't be in Keywords: in the first place.  And although every human is
unique, very few humans are so unique that they'll choose a keyword
that nobody else would use to look up packages.

So I think what you mean by "unwanted" is mostly "redundant (because a
synonym)".  It seems to me that

1.  There *should* be a list of "recommended keywords" which package
    maintainers can easily access for reference when choosing keywords
    to specify for their packages, and which users can consult to get
    an idea of the keywords maintainers are likely to use.

2.  There *should* be a database of synonyms of recommended keywords
    for use by maintainers to discover recommended keywords, and for
    *finder* to use in user searches for keywords.  Finder should
    probably divide its report into exact matches for the user's
    keyword and matches discovered via synonyms.

    The schema for this database is unclear to me.  Should there be a
    "similarity" measure to indicate how synonymous two keywords are?
    (Probably a YAGNI.)  Should the primary key of the database be
    restricted to recommended keywords, or perhaps just be the most
    frequently used of a synonym group?  (See point 3 below.)

3.  There should be a tool to walk the libraries producing a Pareto
    distribution of keywords.  Those at the top of the distribution
    would be excellent candidates for the "recommended" list (but
    beware, it's quite possible that two popular keywords could be
    synonyms!).  Those at the bottom would be candidates for addition
    to the database of synonyms and replacement with a recommended
    keyword.

    Probably this tool only needs to be run at release time, and the
    distribution database could be included in etc.

    There's no need to be fascist about keyword maintenance and
    pruning low-frequency keywords that have synonyms, either.  There
    is quite some incentive for maintainers to use user-discoverable
    (i.e., recommended) keywords, if you provide the tools so that they
    can find them easily.
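
To make points 2 and 3 concrete, here is a rough sketch in Python (not
Emacs Lisp, and not a proposal for how finder itself should be
implemented).  Everything in it is an assumption made for illustration:
the function names are hypothetical, libraries are passed in as a dict
mapping file name to source text, and the synonym "database" is just a
dict mapping each synonym to its recommended keyword, which is only one
of the possible answers to the schema question in point 2.

```python
import re
from collections import Counter

# Matches a library header line such as ";; Keywords: mail, news".
KEYWORDS_RE = re.compile(r"^;+[ \t]*Keywords:[ \t]*(.*)$",
                         re.IGNORECASE | re.MULTILINE)


def extract_keywords(source):
    """Return the lowercased keywords from a library's Keywords: header."""
    match = KEYWORDS_RE.search(source)
    if match is None:
        return []
    # Keywords may be separated by commas, whitespace, or both.
    return [k.lower() for k in re.split(r"[,\s]+", match.group(1)) if k]


def keyword_distribution(libraries):
    """Count how many libraries use each keyword, most frequent first."""
    counts = Counter()
    for source in libraries.values():
        # Use a set so a keyword repeated in one header counts once.
        counts.update(set(extract_keywords(source)))
    return counts.most_common()


def find_packages(keyword, libraries, synonyms):
    """Return (exact, via_synonym) lists of library names for KEYWORD.

    SYNONYMS maps a non-recommended keyword to its recommended keyword.
    """
    keyword = keyword.lower()
    canonical = synonyms.get(keyword, keyword)
    # All keywords in the query's synonym group, except the query itself.
    group = {canonical} | {k for k, v in synonyms.items() if v == canonical}
    group.discard(keyword)
    exact, via_synonym = [], []
    for name, source in sorted(libraries.items()):
        found = set(extract_keywords(source))
        if keyword in found:
            exact.append(name)
        elif found & group:
            via_synonym.append(name)
    return exact, via_synonym
```

The head of keyword_distribution's result gives candidates for the
recommended list, and its tail gives candidates for the synonym
database.  find_packages keeps exact matches and synonym matches in
separate lists so that a report can present them separately, as
suggested in point 2.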
