KinoSearch::Search::Similarity - calculate how closely two items match


    void
    STORABLE_thaw(blank_obj, cloning, serialized)
        SV *blank_obj;
        SV *cloning;
        SV *serialized;
    PPCODE:
    {
        Similarity *sim = Kino_Sim_new();
        SV *deep_obj = SvRV(blank_obj);
        sv_setiv(deep_obj, PTR2IV(sim));
    }

    void
    new(either_sv)
        SV *either_sv;
    PREINIT:
        char       *class;
        Similarity *sim;
    PPCODE:
        /* determine the class: for an object, ask the referent for its
         * package name (ob = 1); otherwise treat the argument as a class
         * name string */
        class = sv_isobject(either_sv)
              ? (char*)sv_reftype(SvRV(either_sv), 1)
              : SvPV_nolen(either_sv);

        /* build object */
        sim = Kino_Sim_new();
        ST(0)   = sv_newmortal();
        sv_setref_pv(ST(0), class, (void*)sim);
        XSRETURN(1);

Provide a normalization factor for a field, based on the square root of the number of terms it contains.

Return a score factor based on the frequency of a term in a given document. The default implementation is sqrt(freq). Other implementations typically produce ascending scores with ascending frequencies, since the more times a document matches, the more relevant it is likely to be.

_float_to_byte and _byte_to_float encode and decode between 32-bit IEEE floating point numbers and a 5-bit exponent, 3-bit mantissa float. The range covered by the single-byte encoding is 7x10^9 to 2x10^-9. The accuracy is about one significant decimal digit.
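The encoding described above can be illustrated with a minimal C sketch. The exact bit layout used by _float_to_byte and _byte_to_float is not shown on this page, so the layout below is an assumption: it follows the shift-and-mask scheme commonly used for single-byte norms, with the upper 5 bits of the byte holding the exponent and the lower 3 bits the mantissa. The function names float_to_byte and byte_to_float are hypothetical stand-ins, not the module's own symbols.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: pack a float into one byte
 * (upper 5 bits = exponent, lower 3 bits = mantissa).
 * The bit layout is an assumption, not taken from the module. */
uint8_t float_to_byte(float f) {
    uint32_t bits;
    int mantissa, exponent;

    if (!(f > 0.0f)) return 0;          /* zero, negatives and NaN map to 0 */

    memcpy(&bits, &f, sizeof bits);     /* reinterpret the IEEE 754 bits */
    mantissa = (int)((bits & 0xffffffu) >> 21);
    exponent = (int)((bits >> 24) & 0x7f) - 63 + 15;

    if (exponent > 31) { exponent = 31; mantissa = 7; }  /* clamp overflow */
    if (exponent < 0)  { exponent = 0;  mantissa = 1; }  /* clamp underflow */
    if (exponent == 0 && mantissa == 0) mantissa = 1;    /* keep positives nonzero */

    return (uint8_t)((exponent << 3) | mantissa);
}

/* Inverse of the sketch above: expand one byte back into a float. */
float byte_to_float(uint8_t b) {
    uint32_t bits;
    float f;

    if (b == 0) return 0.0f;            /* byte 0 is reserved for 0.0 */

    bits = (((uint32_t)((b >> 3) & 31) + (63 - 15)) << 24)
         | ((uint32_t)(b & 7) << 21);
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Under this layout, 1.0 encodes to byte 124 and decodes back to exactly 1.0, while values that fall between two representable steps are truncated toward the lower one, which is where the loss of precision comes from.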

The norm_decoder caches the 256 possible byte => float pairs, obviating the need to call decode_norm over and over for a scoring implementation that knows how to use it.
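The caching idea can be sketched in C as a 256-entry lookup table that is filled once and then indexed directly by the norm byte. This is only an illustration: build_norm_decoder and decode_norm_byte are hypothetical names, and the decoding layout (5-bit exponent in the upper bits, 3-bit mantissa in the lower bits) is an assumption rather than the module's confirmed implementation.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical single-byte decoder; the bit layout is an assumption. */
float decode_norm_byte(uint8_t b) {
    uint32_t bits;
    float f;

    if (b == 0) return 0.0f;
    bits = (((uint32_t)((b >> 3) & 31) + (63 - 15)) << 24)
         | ((uint32_t)(b & 7) << 21);
    memcpy(&f, &bits, sizeof f);        /* reinterpret bits as a float */
    return f;
}

/* Fill the table once; afterwards table[norm_byte] replaces a
 * function call per document during scoring. */
void build_norm_decoder(float table[256]) {
    int i;
    for (i = 0; i < 256; i++) {
        table[i] = decode_norm_byte((uint8_t)i);
    }
}
```

A scorer that holds such a table can read `table[norm_byte]` inside its inner loop, trading 1 KB of memory for the repeated decode calls.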
