Processing rules

Processing rules are user-specific automations that help preprocess documents as they are uploaded. If, for instance, you find yourself repeatedly editing documents in the same or a similar fashion, a suitable processing rule may be able to automate the task.

Common tasks for processing rules include:

  • Extract a date from the document and set it as the document’s date. Without this, the document date will always be the date the document was uploaded, not its actual date.
  • Add metadata
  • Flag documents with tags or metadata (e.g. flag new documents with certain metadata)
  • Make sure that at least metadata keys A and B are always present
  • Classify documents
  • Modify the document’s title or description

Extracting dates from document content

Virtualpaper can extract dates directly from a document’s content and name. Often several dates are found, in which case Virtualpaper uses heuristics to select the best one. It prefers the most frequently occurring date, but also favors a date that is upcoming in the near future.

Date extraction is a special feature of Virtualpaper that uses a different mechanism than any other rule. If the rule condition finds and extracts a valid date, the date is temporarily stored separately from the document. If the rule has a ‘Set Date’ action, the extracted date is saved as the document’s date regardless of the action’s Value. The Value cannot be empty, but other than that its content does not affect processing in any way.

Configuration:

Sample

For date extraction to work properly, two things are needed:

  1. A working regular expression
  2. A corresponding time layout that follows Go’s time format conventions.

Hint: you can add multiple layouts and regular expressions by adding a condition for each layout. Just be sure to set the rule mode to ‘Match all’.

Cheat sheet for the format:

Name    Marker
Day     2, 02
Month   1, 01, Jan
Year    06, 2006

Examples of valid time layouts:

  • 1/2/2006
  • Jan 2 2006
  • 1.2.2006
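To check a layout before using it in a rule, it can help to try it with Go's time.Parse directly. The following is a minimal sketch; the sample values and the helper name parseWithLayout are made up for illustration. Note that in Go layouts 1 always stands for the month and 2 for the day, so 1.2.2006 reads as month.day.year, while a day-first date such as 14.3.2021 needs the layout 2.1.2006.

```go
package main

import (
	"fmt"
	"time"
)

// parseWithLayout is a thin, hypothetical wrapper around time.Parse,
// used here only to try out layouts against sample values.
func parseWithLayout(layout, value string) (time.Time, error) {
	return time.Parse(layout, value)
}

func main() {
	samples := []struct{ layout, value string }{
		{"1/2/2006", "3/14/2021"},     // month/day/year
		{"Jan 2 2006", "Mar 14 2021"}, // abbreviated month name
		{"1.2.2006", "3.14.2021"},     // month.day.year
		{"2.1.2006", "14.3.2021"},     // day.month.year
	}
	for _, s := range samples {
		t, err := parseWithLayout(s.layout, s.value)
		if err != nil {
			fmt.Println("parse failed:", err)
			continue
		}
		fmt.Println(s.layout, "->", t.Format("2006-01-02"))
	}
}
```

If a value does not fit the layout, for example a day-first date parsed with a month-first layout, time.Parse returns an error instead of a date.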

For more information on Go’s time layouts, see the documentation of Go’s time package.

Configuring processing rule

Conditions:

  1. Add condition
  2. Set Enabled, Case Insensitive and Regex to True
  3. Set condition type to ‘Date Is’
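  4. Set the filter to a suitable regular expression. For example, the regex (\d{1,2}\.\d{1,2}\.\d{4}) matches dates of the format 1.2.2006. The dots are escaped because an unescaped . matches any character.
  5. Set Date Format to the layout that corresponds to the regular expression, as described above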

Action:

  1. Add action
  2. Set Action type to Set Date
  3. Set Enabled and On Condition to True
  4. Set Value to now().