A user application interacts with Hyppocampus as if all contents were stored in a single SQL table with 2^64 columns, one for each mapped metadata. This model is the easiest to manage and to understand: to look for a file named 'foo', ask for META_NAME = 'foo'; to look for files smaller than 1 KB, ask for META_SIZE < 1024; and so on. Extracting values works the same way: to get the creation time of an item, put META_CREATION in the SELECT clause.
Each row is an item; a set of rows (a result set) is a directory.
The query format differs slightly from canonical SQL: since there is only one table, the FROM clause is omitted.
In the backend database, contents are organized in a SQL table (named "data") with three columns: one for the file owning the metadata, one for the unique identifier of the metadata represented, and one for the value assigned to that metadata for that file. The first two fields are numeric (NUMERIC(20), enough to hold values up to 2^64); the value is always expressed as a string and stored in a binary field, so that it can be of arbitrary size.
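As an illustration of this layout, the following minimal sketch creates the "data" table in an in-memory SQLite database and stores a few metadata for two files. SQLite is used here only for convenience and is not necessarily the backend Hyppocampus uses. The file numbers and values are invented, and the numeric ID chosen for META_NAME is an assumption (only 4 for META_UID and 5 for META_GID are stated in this document).

```python
import sqlite3

# Assumed numeric IDs: only 4 (META_UID) and 5 (META_GID) come from the text.
META_NAME, META_UID, META_GID = 1, 4, 5

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data ("
    " file NUMERIC(20),"   # file owning the metadata
    " id NUMERIC(20),"     # numeric identifier of the metadata
    " value BLOB)"         # value, always a string, of arbitrary size
)

# Invented sample contents: one row per (file, metadata) pair.
rows = [
    (100, META_NAME, "report.txt"),
    (100, META_UID, "foo"),
    (100, META_GID, "bar"),
    (101, META_NAME, "notes.txt"),
    (101, META_UID, "foo"),
]
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", rows)

# Each file is spread over several rows, one for each metadata it owns.
per_file = conn.execute(
    "SELECT file, COUNT(*) FROM data GROUP BY file ORDER BY file"
).fetchall()
print(per_file)
```

The "2^64 columns" seen by the user application are therefore never materialized: a column exists for a file only if a row with that metadata ID is present.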
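Each user query is translated into a nested backend query: the internal query is built from the WHERE clause, the external query from the SELECT clause (and is extended if an ORDER BY is used).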
A simple example:
User query:
    SELECT META_NAME WHERE META_UID = 'foo'
Backend query:
    SELECT file, id, value FROM data
    WHERE id = META_NAME AND file IN (
        SELECT file FROM data WHERE id = META_UID AND value = 'foo'
    )
(Note that metadata IDs are stored in the database in numeric format; here they are shown with their mnemonic names for convenience.)
This is a simplified version of the query actually produced: in practice, table references in both the internal and external queries are aliased with AS, as described below.
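The simplified translation can be tried against a small in-memory SQLite database. This is only a sketch: SQLite stands in for the real backend, the sample rows are invented, and the numeric ID used for META_NAME is an assumption.

```python
import sqlite3

META_NAME, META_UID = 1, 4  # 1 is assumed; 4 is the ID given for META_UID

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (file NUMERIC(20), id NUMERIC(20), value BLOB)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", [
    (100, META_NAME, "report.txt"), (100, META_UID, "foo"),
    (101, META_NAME, "notes.txt"), (101, META_UID, "baz"),
])

# Backend form of the user query: SELECT META_NAME WHERE META_UID = 'foo'
query = """
SELECT file, id, value FROM data
WHERE id = :name AND file IN (
    SELECT file FROM data WHERE id = :uid AND value = 'foo'
)
"""
result = conn.execute(query, {"name": META_NAME, "uid": META_UID}).fetchall()
print(result)
```

Only file 100 is owned by 'foo', so the result holds a single (file, id, value) triple carrying its name.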
When more than one metadata must match, things become more complicated:
User query:
    SELECT META_NAME WHERE META_UID = 'foo' AND META_GID = 'bar'
Backend query:
    SELECT file, id, value FROM data
    WHERE id = META_NAME AND file IN (
        SELECT data_4.file FROM data AS data_4, data AS data_5
        WHERE data_4.file = data_5.file
        AND data_4.id = META_UID AND data_4.value = 'foo'
        AND data_5.id = META_GID AND data_5.value = 'bar'
    )
To retrieve files matching both conditions in the WHERE clause, the values matching the first condition and those matching the second must be checked in parallel, taking care that the "file" field always refers to the same file. For convenience, table aliases are named after the numeric ID of the metadata for which the alias is created: 4 is the numeric ID of META_UID, 5 is that of META_GID.
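The aliasing scheme can be sketched as a small query builder. The function build_internal_query below is hypothetical and is not taken from the Hyppocampus sources; it only reproduces the shape of the internal query shown above, with one alias per condition named after the metadata ID. The selected column is qualified with the first alias because SQLite rejects an unqualified column that exists in more than one joined table. The ID for META_NAME and the sample rows are assumptions.

```python
import sqlite3

META_NAME, META_UID, META_GID = 1, 4, 5  # 1 is an assumption

def build_internal_query(conditions):
    """Build the internal query for a list of (metadata_id, value) pairs."""
    aliases = [f"data_{meta_id}" for meta_id, _ in conditions]
    tables = ", ".join(f"data AS {alias}" for alias in aliases)
    first = aliases[0]
    # Every alias must refer to the same file as the first one.
    clauses = [f"{first}.file = {alias}.file" for alias in aliases[1:]]
    params = []
    for alias, (meta_id, value) in zip(aliases, conditions):
        clauses.append(f"{alias}.id = ? AND {alias}.value = ?")
        params.extend([meta_id, value])
    sql = f"SELECT {first}.file FROM {tables} WHERE " + " AND ".join(clauses)
    return sql, params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (file NUMERIC(20), id NUMERIC(20), value BLOB)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", [
    (100, META_NAME, "report.txt"), (100, META_UID, "foo"), (100, META_GID, "bar"),
    (101, META_NAME, "notes.txt"), (101, META_UID, "foo"), (101, META_GID, "staff"),
    (102, META_NAME, "todo.txt"), (102, META_UID, "baz"), (102, META_GID, "bar"),
])

internal_sql, params = build_internal_query([(META_UID, "foo"), (META_GID, "bar")])
query = f"SELECT file, id, value FROM data WHERE id = ? AND file IN ({internal_sql})"
result = conn.execute(query, [META_NAME] + params).fetchall()
print(internal_sql)
print(result)
```

This sketch would produce clashing aliases if the same metadata appeared in two conditions; how the real implementation handles that case is not covered here.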
The same is done when an ORDER BY clause is present, but applied to the external query, the one that produces the actual result set:
User query:
    SELECT META_NAME WHERE META_UID = 'foo' ORDER BY META_ACCESS
Backend query:
    SELECT data_1.file, data_1.id, data_1.value
    FROM data AS data_1, data AS data_6
    WHERE data_1.file = data_6.file AND data_1.id = META_NAME AND data_6.id = META_ACCESS
    AND data_1.file IN (
        SELECT file FROM data WHERE id = META_UID AND value = 'foo'
    )
    ORDER BY data_6.value
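A runnable sketch of the ORDER BY case, again on an in-memory SQLite database standing in for the real backend. The numeric IDs 1 (META_NAME) and 6 (META_ACCESS) are assumptions suggested by the alias names, and the access times are invented. The sorting metadata is compared through data_6.id, and the file column is qualified so that the query is unambiguous.

```python
import sqlite3

META_NAME, META_UID, META_ACCESS = 1, 4, 6  # 1 and 6 are assumptions

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (file NUMERIC(20), id NUMERIC(20), value BLOB)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", [
    (100, META_NAME, "a.txt"), (100, META_UID, "foo"), (100, META_ACCESS, "1700000300"),
    (101, META_NAME, "b.txt"), (101, META_UID, "foo"), (101, META_ACCESS, "1700000100"),
    (102, META_NAME, "c.txt"), (102, META_UID, "baz"), (102, META_ACCESS, "1700000200"),
])

# Backend form of: SELECT META_NAME WHERE META_UID = 'foo' ORDER BY META_ACCESS
query = f"""
SELECT data_1.file, data_1.id, data_1.value
FROM data AS data_1, data AS data_6
WHERE data_1.file = data_6.file
  AND data_1.id = {META_NAME} AND data_6.id = {META_ACCESS}
  AND data_1.file IN (
      SELECT file FROM data WHERE id = {META_UID} AND value = 'foo'
  )
ORDER BY data_6.value
"""
result = conn.execute(query).fetchall()
print(result)
```

Since values are stored as strings, the ordering in this sketch is a string comparison; the sample timestamps all have the same length so that it coincides with the numeric order.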
No particular transformation is applied in the case of a GROUP BY clause: the metadata named there are simply added to the final result set. The arrangement of items according to the grouping request is handled later, while formatting the result set for presentation, as explained in Management of virtual folders.
The result set, containing the three fundamental pieces of information about each metadata (owning file, metadata identifier and value), is formatted into a SQLResultSet structure, which is simply a list of SQLRow: each row collects the metadata of a single file (note that a result set from the Hyppocampus filesystem is *always* a list of files; more on this below), and the metadata are arranged in an array of HyppoVFSMeta, each element containing both the identifier and the assigned value. When the user application asks for the contents of the result set (with the readdir syscall), items are returned as stat structures whose name is the list of values assigned to the metadata requested in the SELECT clause, separated from one another by a MULTIPLE_VALUES_SEPARATOR sequence.
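The formatting step can be pictured with the following sketch. The names SQLRow, HyppoVFSMeta and MULTIPLE_VALUES_SEPARATOR come from this document, but the Python classes are only stand-ins for the real structures, whose actual fields are not described here. The helper functions, the separator value and the sample triples are assumptions made for illustration.

```python
from dataclasses import dataclass, field

MULTIPLE_VALUES_SEPARATOR = "|"  # placeholder; the real sequence is not given here

@dataclass
class HyppoVFSMeta:
    id: int      # metadata identifier
    value: str   # value assigned to the metadata

@dataclass
class SQLRow:
    file: int                                   # file owning the metadata
    metas: list = field(default_factory=list)   # list of HyppoVFSMeta

def format_result_set(triples):
    """Group (file, id, value) triples into one SQLRow per file."""
    rows = {}
    for file, meta_id, value in triples:
        row = rows.setdefault(file, SQLRow(file))
        row.metas.append(HyppoVFSMeta(meta_id, value))
    return list(rows.values())  # plays the role of the SQLResultSet

def entry_name(row, selected_ids):
    """Join the values of the selected metadata, in SELECT order."""
    values = [m.value for sid in selected_ids for m in row.metas if m.id == sid]
    return MULTIPLE_VALUES_SEPARATOR.join(values)

# Invented triples, as a backend query might return them (1 = name, 4 = uid).
raw = [(100, 1, "report.txt"), (100, 4, "foo"),
       (101, 1, "notes.txt"), (101, 4, "baz")]
result_set = format_result_set(raw)
names = [entry_name(row, [1, 4]) for row in result_set]
print(names)
```

With two metadata selected, each directory entry name is made of two values joined by the separator, one entry per file.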
Hyppocampus is a filesystem, so its job is to provide lists of files as requested by user applications: this means that a result set always concerns items, and aggregate functions cannot be used in the main SELECT statement (though they can be used in internal queries).