bpindexer

bpindexer — create indices for Poliqarp binary corpus

Synopsis

bpindexer { -h | --help | -v | --version }

bpindexer [option...] base-name [index-base-name]

Description

Create inverted indices for Poliqarp binary corpora.

Options

-h, --help

Display help and exit.

-v, --version

Output version information and exit.

-q, --quiet

Be quiet: suppress progress information.

-m, --memory-usage=mem

Set an upper limit on memory usage to mem, in bytes. The number may be followed by one of the letters K (kilobytes), M (megabytes) or G (gigabytes) to select a larger unit.

The default is 30M.
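The suffix rule for mem can be illustrated with a small parser. This is only a sketch and not bpindexer's own code: the function name parse_memory_limit is hypothetical, and the use of binary multiples (K = 1024 bytes) and of uppercase-only suffixes are assumptions, since the text above does not specify them.

```python
# Hypothetical helper for illustration; not part of bpindexer.
# Assumption: K, M and G are binary multiples (1024-based), uppercase only.
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_memory_limit(text):
    """Convert a value such as '30M' or '4096' to a number of bytes."""
    text = text.strip()
    if text and text[-1] in _UNITS:
        number, multiplier = text[:-1], _UNITS[text[-1]]
    else:
        number, multiplier = text, 1
    # Accept only plain ASCII digits before the optional unit letter.
    if not (number.isascii() and number.isdigit()):
        raise ValueError(f"invalid memory limit: {text!r}")
    return int(number) * multiplier
```

Under these assumptions, the default of 30M would correspond to parse_memory_limit("30M"), that is 30 * 1024 * 1024 bytes.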

-g, --granularity=gran

Set the granularity of the resulting index to gran, an integer between 100 and 1000000.

The default is 1024.

-i, --indices=list

Create only the given indices. list is a string of one to three different letters, with the following meanings:

o

create an index of orthographic forms;

d

create an index of disambiguated interpretations;

a

create an index of ambiguous interpretations.

The default is to create all the indices.
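The constraints on list (one to three different letters, each of o, d or a) can be checked before invoking the tool. The following is a minimal sketch; parse_indices and INDEX_LETTERS are hypothetical names, and the error messages are not those printed by bpindexer.

```python
# Hypothetical helper for illustration; not part of bpindexer.
INDEX_LETTERS = {
    "o": "orthographic forms",
    "d": "disambiguated interpretations",
    "a": "ambiguous interpretations",
}


def parse_indices(spec):
    """Return descriptions of the indices selected by spec, in the given order."""
    if not 1 <= len(spec) <= 3:
        raise ValueError("expected one to three letters")
    if len(set(spec)) != len(spec):
        raise ValueError("letters must be different")
    unknown = [c for c in spec if c not in INDEX_LETTERS]
    if unknown:
        raise ValueError("unknown index letter(s): " + "".join(unknown))
    return [INDEX_LETTERS[c] for c in spec]
```

For example, parse_indices("od") selects the orthographic and disambiguated indices, while "oo" or "x" is rejected.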

Poliqarp format

As of bpindexer 1.3.1, the only supported binary format version is 2. Please use bpupgrade(1) to convert your corpora.

Binary files created by bpindexer

*.poliqarp.rindex.orth, *.poliqarp.rindex.orth.offset

index for orthographic forms

*.poliqarp.rindex.disamb, *.poliqarp.rindex.disamb.offset

index for disambiguated interpretations

*.poliqarp.rindex.amb, *.poliqarp.rindex.amb.offset

index for ambiguous interpretations
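The file name patterns above can be expanded for a given base name, for example to check that an indexing run produced everything expected. This sketch assumes that the * in each pattern stands for the index base name, and that the letters o, d and a of the -i option correspond to the orth, disamb and amb files respectively (inferred from the matching descriptions). The name expected_index_files is hypothetical.

```python
# Hypothetical helper for illustration; not part of bpindexer.
# Assumption: "*" in the patterns is the index base name, and the -i letters
# map to the file suffixes as below.
SUFFIXES = {"o": "orth", "d": "disamb", "a": "amb"}


def expected_index_files(base, indices="oda"):
    """List the index files expected for base and the selected indices."""
    files = []
    for letter in indices:
        stem = f"{base}.poliqarp.rindex.{SUFFIXES[letter]}"
        # Each index consists of the index file and its .offset companion.
        files.extend([stem, stem + ".offset"])
    return files
```

With all three indices selected, the function returns six file names, two per index.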

Notes

Up to Poliqarp 1.2, this tool was named simply indexer.