
How does Virtualpaper work?

Virtualpaper treats the document as its basic unit. A document contains the original uploaded file along with basic information, such as its name and date, and any user-configurable metadata.

[Image: document preview]

When a user uploads a document, the server first extracts the document’s contents and saves them as text alongside the file. Next, the server matches the document’s title and content against known metadata patterns. After metadata matching, fully customizable rules are executed on the document. Finally, all data collected from the document is saved to the search index so that the user can search for it.
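The four steps above can be sketched roughly as follows. This is only an illustration of the order of operations, not Virtualpaper's actual code: the `Document` class, the regex patterns, the rule format (a condition paired with an action), and the list standing in for the search index are all hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Document:
    name: str
    raw: bytes
    content: str = ""
    metadata: dict = field(default_factory=dict)


# Hypothetical metadata patterns: key -> regex whose first group is the value.
METADATA_PATTERNS = {
    "invoice_number": re.compile(r"invoice\s+#?(\d+)", re.IGNORECASE),
    "year": re.compile(r"\b(20\d{2})\b"),
}


def extract_text(doc):
    # Stand-in for real text extraction: decode the bytes as UTF-8.
    doc.content = doc.raw.decode("utf-8", errors="replace")


def match_metadata(doc):
    # Match both the title and the extracted content against every pattern.
    haystack = f"{doc.name}\n{doc.content}"
    for key, pattern in METADATA_PATTERNS.items():
        found = pattern.search(haystack)
        if found:
            doc.metadata[key] = found.group(1)


def apply_rules(doc, rules):
    # Each rule is a (condition, action) pair of callables (assumed format).
    for condition, action in rules:
        if condition(doc):
            action(doc)


def index_document(doc, index):
    # A plain list stands in for the search index here.
    index.append({"name": doc.name, "content": doc.content, **doc.metadata})


def process_upload(doc, rules, index):
    extract_text(doc)
    match_metadata(doc)
    apply_rules(doc, rules)
    index_document(doc, index)
    return doc


# Usage: file documents that carry an invoice number under "invoices".
rules = [
    (lambda d: "invoice_number" in d.metadata,
     lambda d: d.metadata.update(category="invoices")),
]
index = []
doc = process_upload(Document("scan-2023.txt", b"Invoice #4217 for services"), rules, index)
```

Running the rules after metadata matching lets a rule depend on whatever the patterns found, which is why the order of the steps matters.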

Known limits

Virtualpaper has not been optimized for scaling. It reserves a single index for each user, and the number of users is limited to 200, since that is the number of indices a single Meilisearch instance can process. The per-user index lets each user tailor search to their needs, from setting stop words to creating a personal dictionary of synonyms.
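As a rough sketch of what per-user customization could look like, the snippet below builds a settings update for one user's index. `stopWords` and `synonyms` are Meilisearch index settings; the `user-<id>` index naming, the helper functions, and the base URL are assumptions made for illustration, and the HTTP method for the settings route may differ between Meilisearch versions. The request is only assembled here, not sent.

```python
import json


def user_index_uid(user_id):
    # Hypothetical naming scheme: one index per user.
    return f"user-{user_id}"


def build_settings(stop_words, synonyms):
    # Deduplicate and sort stop words so the payload is stable.
    return {"stopWords": sorted(set(stop_words)), "synonyms": synonyms}


def settings_request(base_url, user_id, stop_words, synonyms):
    # Describe the request as plain data; sending it is left to the caller.
    uid = user_index_uid(user_id)
    return {
        "method": "PATCH",
        "url": f"{base_url}/indexes/{uid}/settings",
        "body": json.dumps(build_settings(stop_words, synonyms)),
    }


# Usage: user 42 ignores "the" and "a" and treats "invoice" and "bill" as synonyms.
request = settings_request(
    "http://localhost:7700",
    42,
    ["the", "a", "the"],
    {"invoice": ["bill"], "bill": ["invoice"]},
)
```

Because each user has a separate index, a change like this affects only that user's search results.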

Also, when many large documents are added at once, Meilisearch takes some time to index them. For instance, uploading thousands of research papers may require several hours of indexing, depending on the hardware. Searching is fast, but indexing is slow when a lot of data is inserted in a short period of time.

Other than that, Meilisearch can handle millions of documents, so document volume should not be a problem for the small number of users supported.