Virtualpaper stores all files in a single directory, which is configured with the setting processing.output_dir.
Original documents are stored in a three-level hierarchical tree where each document's path is derived from its id.
Virtualpaper uses UUIDv4 as document identifiers. The first two characters of an id each become a directory,
and the rest of the id becomes the file name.
For instance, a document with the id d669c348-463d-4025-a290-49bfe65c5287 is located in:
# original document
<data-dir>/documents/d/6/69c348-463d-4025-a290-49bfe65c5287
# thumbnail
<data-dir>/previews/d/6/69c348-463d-4025-a290-49bfe65c5287.png
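The mapping from id to path can be sketched in a few lines of Python. This is only an illustration of the layout described above, not Virtualpaper's own code: the function name `document_paths` and the `data_dir` argument are hypothetical, and the preview directory is assumed to be named `previews`.

```python
from pathlib import Path
import uuid


def document_paths(data_dir: str, doc_id: str) -> tuple[Path, Path]:
    """Return (original, preview) paths for a document id.

    Hypothetical helper: mirrors the layout described in this section,
    assuming the directories are named 'documents' and 'previews'.
    """
    # Normalise the id; raises ValueError if it is not a valid UUID.
    canonical = str(uuid.UUID(doc_id))

    # First character, second character, then the rest of the id.
    relative = Path(canonical[0]) / canonical[1] / canonical[2:]

    original = Path(data_dir) / "documents" / relative
    # Previews use the same relative path with a .png suffix appended.
    preview = Path(data_dir) / "previews" / relative.with_name(relative.name + ".png")
    return original, preview
```

Calling `document_paths("<data-dir>", "d669c348-463d-4025-a290-49bfe65c5287")` should give the two paths listed above.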
Aside from original documents and previews, the rest of the data is located in a PostgreSQL database. Meilisearch acts as a secondary storage containing the search index. Backing up the Meilisearch data is optional: it may make restoring from a backup faster, but Virtualpaper can always re-index the document data into a fresh Meilisearch instance from the Admin UI.
SQL script for listing all files and their paths:
SELECT
id,
name,
substring(id,1,1)||'/'||substring(id,2,1)||'/'||substring(id,3,100) AS path,
mimetype
FROM documents;
To list the files of a single user only, use:
SELECT
id,
name,
substring(id,1,1)||'/'||substring(id,2,1)||'/'||substring(id,3,100) AS path,
mimetype
FROM documents
WHERE user_id=<numerical-user-id>;
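The path column produced by these queries can be used to check that a backup or restore contains every file. The following is a minimal sketch under stated assumptions: the helper `missing_documents` is hypothetical, the relative paths are assumed to have been exported from the query beforehand, and the originals are assumed to live under a `documents` directory inside the data directory.

```python
from pathlib import Path


def missing_documents(data_dir, relative_paths):
    """Return the relative paths that have no file under <data_dir>/documents.

    Hypothetical helper: relative_paths are values of the 'path' column
    from the SQL query, e.g. 'd/6/69c348-463d-4025-a290-49bfe65c5287'.
    """
    root = Path(data_dir) / "documents"
    # Keep only the entries whose file is absent on disk.
    return [p for p in relative_paths if not (root / p).is_file()]
```

An empty result means every listed document was found on disk; any returned entry points at a file that is in the database but missing from the data directory.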