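- postgresql: create UNLOGGED tables for faster insertion. Unlogged tables skip the write-ahead log, so they are not crash-safe (PostgreSQL truncates them after a crash or unclean shutdown) and are not replicated to standbys. Since the data is presumably stored elsewhere (i.e. either the files you pass to `cache()` or the original systems you got your data from), this seems like a reasonable tradeoff.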
- markroot: mark entire trees; i.e. when a process references a file and another process (its parent), mark both the child process AND its binary_ref file as "roots". This is useful when trying to reconstruct the original observations via a SQL LEFT OUTER JOIN: when joining the `observed-data` table to the SCO tables, add the condition `AND x_root IS NOT NULL`. This, combined with the DISTINCT clause, should be enough to prevent any "duplicate" rows resulting from the join.
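
The join described in the markroot entry can be sketched roughly as follows. This is a minimal illustration using Python's built-in `sqlite3`, not the project's actual schema: the `links` table, its `source_ref`/`target_ref` columns, the placement of `x_root` on that table, and the helper names `build_demo_db` and `reconstruct` are all assumptions made for the example. Only the `observed-data` table name, the `x_root IS NOT NULL` condition, and the use of DISTINCT come from the notes above.

```python
import sqlite3


def build_demo_db():
    """Build a tiny in-memory database with a HYPOTHETICAL schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE "observed-data" (id TEXT PRIMARY KEY, first_observed TEXT);
        CREATE TABLE process (id TEXT PRIMARY KEY, name TEXT, parent_ref TEXT);
        -- Hypothetical linking table: which objects each observation contains.
        -- x_root is non-NULL only for objects marked as roots.
        CREATE TABLE links (source_ref TEXT, target_ref TEXT, x_root INTEGER);

        INSERT INTO "observed-data" VALUES ('od--1', '2021-01-01T00:00:00Z');
        INSERT INTO process VALUES ('process--parent', 'bash', NULL);
        INSERT INTO process VALUES ('process--child', 'curl', 'process--parent');

        -- The child is a root; the parent is only referenced by the child.
        INSERT INTO links VALUES ('od--1', 'process--child', 1);
        INSERT INTO links VALUES ('od--1', 'process--child', 1);  -- repeated link
        INSERT INTO links VALUES ('od--1', 'process--parent', NULL);
        """
    )
    return conn


def reconstruct(conn, roots_only=True):
    """Join observations to processes, optionally keeping only root objects."""
    root_filter = "AND c.x_root IS NOT NULL" if roots_only else ""
    query = f"""
        SELECT DISTINCT od.id, p.name
        FROM "observed-data" AS od
        LEFT OUTER JOIN links AS c
            ON c.source_ref = od.id {root_filter}
        LEFT OUTER JOIN process AS p
            ON p.id = c.target_ref
        ORDER BY p.name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(reconstruct(conn))                    # root objects only
    print(reconstruct(conn, roots_only=False))  # parent process leaks in
```

In this toy data, dropping the root condition lets the parent process appear as an extra row for the same observation, while DISTINCT collapses the repeated link to the child.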