KinoSearch::Search::Similarity - calculate how closely two items match
void
STORABLE_thaw(blank_obj, cloning, serialized)
    SV *blank_obj;
    SV *cloning;
    SV *serialized;
PPCODE:
{
    Similarity *sim = Kino_Sim_new();
    SV *deep_obj = SvRV(blank_obj);
    sv_setiv(deep_obj, PTR2IV(sim));
}

void
new(either_sv)
    SV *either_sv;
PREINIT:
    char       *class;
    Similarity *sim;
PPCODE:
    /* determine the class: if the invocant is an object, take the package
     * it is blessed into; otherwise treat the invocant as a class name */
    class = sv_isobject(either_sv)
          ? sv_reftype(SvRV(either_sv), 1)
          : SvPV_nolen(either_sv);

    /* build object */
    sim   = Kino_Sim_new();
    ST(0) = sv_newmortal();
    sv_setref_pv(ST(0), class, (void*)sim);
    XSRETURN(1);

Provide a normalization factor for a field based on the square-root of the number of terms in it.
Return a score factor based on the frequency of a term in a given document. The default implementation is sqrt(freq). Other implementations typically produce ascending scores with ascending freqs, since the more times a doc matches, the more relevant it is likely to be.
_float_to_byte and _byte_to_float convert between 32-bit IEEE floating point numbers and a single-byte float with a 5-bit exponent and a 3-bit mantissa. The single-byte encoding covers a range from about 7x10^9 down to 2x10^-9, with an accuracy of about one significant decimal digit.
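
A minimal C sketch of one way such a single-byte encoding can work, modeled on the bit-shifting scheme commonly used for norm bytes in search engines. It is an illustration under stated assumptions, not this module's actual code: the names sketch_float_to_byte and sketch_byte_to_float are hypothetical, the exponent bias of 15 is assumed, and a 32-bit IEEE float is required.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t),
              "sketch assumes 32-bit IEEE floats");

/* Hypothetical encoder: the high 5 bits of the result hold a rebased
 * exponent field (bits 24..30 of the IEEE word, shifted so that 1.0 lands
 * near the middle of the range); the low 3 bits hold bits 21..23. */
uint8_t
sketch_float_to_byte(float f)
{
    uint32_t bits;
    uint32_t mantissa;
    int32_t  exponent;

    if (!(f > 0.0f))            /* zero, negatives and NaN all map to 0 */
        return 0;

    memcpy(&bits, &f, sizeof bits);     /* well-defined type punning */
    mantissa = (bits & 0xffffff) >> 21;
    exponent = (int32_t)((bits >> 24) & 0x7f) - 63 + 15;

    if (exponent > 31) {        /* too large: clamp to the biggest byte */
        exponent = 31;
        mantissa = 7;
    }
    if (exponent < 0) {         /* too small: clamp to the smallest nonzero byte */
        exponent = 0;
        mantissa = 1;
    }
    return (uint8_t)(((uint32_t)exponent << 3) | mantissa);
}

/* Hypothetical decoder: reverses the bit layout used above. */
float
sketch_byte_to_float(uint8_t b)
{
    uint32_t bits;
    float    f;

    if (b == 0)
        return 0.0f;

    bits = (((uint32_t)(b >> 3) + (63 - 15)) << 24)
         | ((uint32_t)(b & 7) << 21);
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Under these assumptions, powers of two such as 0.5, 1.0 and 2.0 survive a round trip exactly, while other values are truncated to the nearest representable step below them, which is where the "one significant decimal digit" accuracy comes from.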
The norm_decoder caches the 256 possible byte => float pairs, obviating the need to call decode_norm over and over for a scoring implementation that knows how to use it.
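
The caching idea can be sketched in C as a 256-entry lookup table that is filled once and then indexed by the norm byte. This is an illustration only: sketch_decode_norm and sketch_fill_norm_decoder are hypothetical names, and the decoder shown is an assumed stand-in for the module's real byte-to-float routine (it also assumes 32-bit IEEE floats).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t),
              "sketch assumes 32-bit IEEE floats");

/* Hypothetical stand-in for the byte-to-float decoder: the high 5 bits
 * are a rebased exponent field, the low 3 bits the next bits down. */
float
sketch_decode_norm(uint8_t b)
{
    uint32_t bits;
    float    f;

    if (b == 0)
        return 0.0f;

    bits = (((uint32_t)(b >> 3) + (63 - 15)) << 24)
         | ((uint32_t)(b & 7) << 21);
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Fill a caller-supplied table with all 256 possible decoded values, so
 * that a scoring loop can replace each decode call with an array lookup. */
void
sketch_fill_norm_decoder(float table[256])
{
    int i;

    for (i = 0; i < 256; i++)
        table[i] = sketch_decode_norm((uint8_t)i);
}
```

A scorer that holds such a table would read table[norm_byte] for each document instead of decoding the byte again; since there are only 256 possible inputs, the table costs 1 KB and is built once.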