From: David Squire
Subject: [help-GIFT] Changing weightings of feature group in "separate normalization"
Date: Tue, 22 May 2001 15:40:20 +1000
Hi all,
I have now succeeded in hacking the feature extraction code so that it reads in
statistics (in .tff files) for .txt files associated with images and
incorporates them into the .fts files as a fifth feature group. I have also
modified gift-add-collection.pl to cope with this (it has to copy the .tff file
to /tmp/ as well, since gift-extract-features now looks for a .tff file with
the same root name as the image file, in the same directory [yes I know it's a
hack]).
I now want to change the query engine to support the following:
Wolfgang Müller (in Viper communication) wrote:
> David Squire wrote:
> > I want to be able to... vary the weighting assigned to each feature group
> > arbitrarily - this would allow us to test performance with various
> > combinations (e.g. visual features 50%, text 50%, or all five feature groups
> > equally weighted). It might also be possible to do relevance feedback or
> > learning at the level of feature groups.
> Most of this can be done with the current framework without full integration
> of the query engine. If I fix one bug you can do everything.
> The advantage of this is that you would get the results without digging in
> the deep.
I have been ploughing through the code trying to find the point at which I need
to make the necessary change(s), but without luck. My guess is that it is
somewhere in libGIFTQuInvertedFile/cc/CQInvertedFile.cc (or at least that
directory), but I can't find the appropriate point. Any clues on this would be
*greatly* appreciated.
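To make the request concrete, here is a minimal sketch of the weighting in
question, written in Python for brevity. It is not GIFT code: the function
name, the dictionary layout, and the assumption that each feature group has
already been normalised separately to scores in [0, 1] are all hypothetical.

```python
def combine_group_scores(group_scores, group_weights):
    """Weighted sum of per-group scores (hypothetical helper, not GIFT API).

    group_scores:  {group_name: {image_id: score}}, each group assumed to be
                   normalised separately before this function is called.
    group_weights: {group_name: non-negative weight}; groups without an
                   entry are treated as weight 0 (i.e. excluded).
    """
    # Rescale the weights so that they sum to 1 over the groups present.
    total = sum(group_weights.get(group, 0.0) for group in group_scores)
    if total <= 0:
        raise ValueError("at least one feature group needs a positive weight")

    combined = {}
    for group, scores in group_scores.items():
        weight = group_weights.get(group, 0.0) / total
        for image_id, score in scores.items():
            combined[image_id] = combined.get(image_id, 0.0) + weight * score
    return combined
```

With weights of 0.5 for the visual groups and 0.5 for text this gives the
"visual 50%, text 50%" case; equal weights for all five groups give the
equally weighted case.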
On a related point, does anyone know where the PHP interface gets its
knowledge of the available feature groups? I would love to be able to hack
that so that feature groups can be given a weight, not just included or
excluded.
Finally, where is the resolution from feature ID to feature group done? I would
like to get rid of the notion of a feature description file altogether and
assign ranges of IDs to feature groups. The current method is very inefficient
(yes, I'm responsible for it), and I can't predict a priori the total number of
possible text features. At present I have assigned them feature IDs from
100,000 upwards. Adding, say, 100,000 identical lines to the feature
description file doesn't seem like the best way to cope with this....
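A range-based lookup of the kind described above could be sketched as
follows (Python, illustrative only, not GIFT code). The only boundary taken
from this message is 100,000 for text features; the single "visual" range
and all names are placeholders, and the four visual groups would each need
their own sub-range.

```python
from bisect import bisect_right

# (first feature ID of the range, group name), sorted by first ID.
# Hypothetical table: only the 100,000 boundary comes from the message.
GROUP_RANGES = [
    (0, "visual"),
    (100_000, "text"),
]
_RANGE_STARTS = [start for start, _ in GROUP_RANGES]


def feature_group(feature_id):
    """Return the group whose ID range contains feature_id."""
    if feature_id < _RANGE_STARTS[0]:
        raise ValueError("feature ID below the first range: %d" % feature_id)
    # bisect_right counts the range starts that are <= feature_id, so the
    # range containing feature_id is the one just before that position.
    index = bisect_right(_RANGE_STARTS, feature_id) - 1
    return GROUP_RANGES[index][1]
```

The table has one entry per group rather than one line per feature, so the
last range is open-ended and the total number of text features need not be
known in advance.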
Cheers and thanks,
David
PS. Is there a file somewhere explaining the class/function/variable naming
conventions?
--
Dr. David McG. Squire
Computer Science and Software Engineering, Monash University, Australia
http://www.csse.monash.edu.au/~davids/ http://viper.unige.ch/
Do/Don't want HTML mail? Let me know.