# The Binary (Termless) Format for Views #

**(Damien Katz -- July 19 2012)**

This document describes the format of the Key/Value pairs in leaf nodes and the reduction format in inner nodes.

The underlying b-tree format is the same as the one already used in CouchStore.

* All integers are in network byte order (big-endian).
* All strings are encoded as UTF-8.
* Adjacent fields are tightly packed with no padding.


## Primary Index Key Values ##

In leaf nodes, `KeyValues` have the following format:

* `Key`:
  * `EmittedJsonKeyLength` -- 16-bit integer -- Length of the following `EmittedJSONKey`
  * `EmittedJSONKey` -- JSON -- The key emitted by the map function
  * `UnquotedDocId` -- String -- The raw doc ID (occupies the remaining bytes)
* `Value`:
  * `PartitionId` -- 16-bit integer -- The partition ID (vbucket) that this document ID maps to.
  * One or more `JSONStringValue`s -- All the values that were emitted for this `EmittedJSONKey`.  
    Each `JSONStringValue` is of the form:
    * `ValueLength` -- 24-bit unsigned integer
    * `JSONValue` -- string that is `ValueLength` bytes long

(Parsing the `JSONStringValue`s is simply a matter of reading the first 24 bits to get the length of the following string, then extracting that string. If any of the value buffer remains, the process is repeated until none is left.)

When a key is emitted and it differs from all other keys emitted for that document, there is only one `JSONStringValue`.
When multiple identical keys are emitted, their values are coalesced into a list, and there are multiple `JSONStringValue`s.


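The leaf layout described above can be walked with a few slices. The following Python sketch is illustrative only: the names `parse_primary_key`, `parse_primary_value` and `_take` are hypothetical and not part of CouchStore, and it assumes the key and the value have already been extracted from the b-tree node as two separate byte strings.

```python
def _take(buf: bytes, pos: int, n: int) -> bytes:
    """Return buf[pos:pos+n], failing loudly if the buffer is too short."""
    if pos + n > len(buf):
        raise ValueError("truncated buffer")
    return buf[pos:pos + n]


def parse_primary_key(key: bytes):
    """Split a primary index key into (emitted JSON key text, doc ID)."""
    key_len = int.from_bytes(_take(key, 0, 2), "big")  # 16-bit length
    emitted = _take(key, 2, key_len).decode("utf-8")   # JSON text, left unparsed
    doc_id = key[2 + key_len:].decode("utf-8")         # all remaining bytes
    return emitted, doc_id


def parse_primary_value(value: bytes):
    """Split a primary index value into (partition ID, list of JSON texts)."""
    partition_id = int.from_bytes(_take(value, 0, 2), "big")  # 16-bit
    pos = 2
    json_values = []
    # Repeat "24-bit length, then that many bytes" until the buffer is used up.
    while pos < len(value):
        length = int.from_bytes(_take(value, pos, 3), "big")
        pos += 3
        json_values.append(_take(value, pos, length).decode("utf-8"))
        pos += length
    if not json_values:
        raise ValueError("expected at least one JSONStringValue")
    return partition_id, json_values
```

The JSON payloads are returned as text rather than decoded, so the sketch makes no assumption about how a caller wants to interpret them.
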
### Primary Index Inner Node Reductions (KeyPointerNodes and Root) ###

* `SubTreeCount` -- 40-bit integer -- Count of all values in the subtree.  
  NOTE: this may be greater than the `KeyId` count in the subtree, because a document can emit multiple identical keys, which are coalesced into a single `KeyId` whose value is the list of all emitted values.
* `SubTreePartitionBitmap` -- 1024 bits -- A bitfield of all partition keys in the subtree. The length is currently hardcoded at 1024 bits, but it may become variable in the future. Until then, it works with any number of vbuckets ≤ 1024.
* `JSONReductions` -- remaining bytes -- Zero or more `JSONReductions`, each consisting of:
  * `JSONLen` -- 16-bit integer
  * `JSON` -- the actual JSON string


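A reduction value with this layout could be split as in the Python sketch below. The function name `parse_primary_reduction` is hypothetical. Because this document does not state how partition numbers map to bit positions, the bitmap is returned as raw bytes rather than interpreted.

```python
COUNT_BYTES = 5             # 40-bit SubTreeCount
BITMAP_BYTES = 1024 // 8    # 1024-bit SubTreePartitionBitmap = 128 bytes


def parse_primary_reduction(buf: bytes):
    """Return (subtree count, raw bitmap bytes, list of JSON reduction texts)."""
    header = COUNT_BYTES + BITMAP_BYTES
    if len(buf) < header:
        raise ValueError("reduction shorter than the fixed header")
    count = int.from_bytes(buf[:COUNT_BYTES], "big")
    bitmap = buf[COUNT_BYTES:header]

    reductions = []
    pos = header
    # Zero or more "16-bit length, then JSON text" entries fill the rest.
    while pos < len(buf):
        if pos + 2 > len(buf):
            raise ValueError("truncated JSONLen")
        json_len = int.from_bytes(buf[pos:pos + 2], "big")
        pos += 2
        if pos + json_len > len(buf):
            raise ValueError("truncated JSON reduction")
        reductions.append(buf[pos:pos + json_len].decode("utf-8"))
        pos += json_len
    return count, bitmap, reductions
```

A buffer that contains only the 133-byte header yields an empty reduction list, which matches the "zero or more" wording above.
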
## Back Index ##

In leaf nodes, `KeyValues` have the following format:

* `Key` -- blob -- The raw doc ID, not quoted or JSONified in any way.
* `Value`:
  * `PartitionId` -- 16-bit integer -- The partition ID (vbucket) that this document ID maps to.
  * 1 to *n* `ViewKeysMappings`, where *n* ≤ the number of map functions defined in the design document.  
    A `ViewKeysMapping` is:
    * `ViewId` -- 8-bit integer -- The ordinal ID of the map view in the group that the following keys were emitted from
    * `NumKeys` -- 16-bit integer -- The number of `JSONKeys` that follow
    * `JSONKeys` -- a sequence of:
      * `KeyLen` -- 16-bit integer -- Length of the following `JSONKey`
      * `Key` -- JSON string -- Emitted JSON key


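The nested structure of a back index value can be decoded with two loops, one over `ViewKeysMappings` and one over the keys inside each mapping. This is a minimal Python sketch under the layout given above; `parse_back_index_value` and `_take` are hypothetical names, and the value is assumed to have been extracted from the node already.

```python
def _take(buf: bytes, pos: int, n: int) -> bytes:
    """Return buf[pos:pos+n], failing loudly if the buffer is too short."""
    if pos + n > len(buf):
        raise ValueError("truncated buffer")
    return buf[pos:pos + n]


def parse_back_index_value(value: bytes):
    """Return (partition ID, [(view ID, [JSON key text, ...]), ...])."""
    partition_id = int.from_bytes(_take(value, 0, 2), "big")  # 16-bit
    pos = 2
    mappings = []
    # Each ViewKeysMapping: 8-bit ViewId, 16-bit NumKeys, then NumKeys keys.
    while pos < len(value):
        view_id = _take(value, pos, 1)[0]
        num_keys = int.from_bytes(_take(value, pos + 1, 2), "big")
        pos += 3
        keys = []
        for _ in range(num_keys):
            key_len = int.from_bytes(_take(value, pos, 2), "big")
            pos += 2
            keys.append(_take(value, pos, key_len).decode("utf-8"))
            pos += key_len
        mappings.append((view_id, keys))
    return partition_id, mappings
```

The mappings are kept as a list of pairs rather than a dictionary so that the on-disk order is preserved.
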
### Back Index Inner Node Reductions (KeyPointerNodes and Root) ###

* `SubTreeCount` -- 40-bit integer -- Count of all keys in the subtree.
* `SubTreePartitionBitmap` -- 1024 bits -- A bitfield of all partition keys in the subtree. The length is currently hardcoded at 1024 bits, but it may become variable in the future. Until then, it works with any number of vbuckets ≤ 1024.

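To make the fixed 133-byte size of this reduction concrete, the Python sketch below builds and reads one. It rests on an assumption that this document does not confirm: the bitmap is treated as a single 1024-bit big-endian integer in which partition `p` is represented by bit `1 << p`. The function names are hypothetical.

```python
BITMAP_BITS = 1024
COUNT_BYTES = 5  # 40-bit SubTreeCount


def encode_back_index_reduction(key_count: int, partition_ids) -> bytes:
    """Pack a key count and a set of partition IDs into 5 + 128 bytes."""
    bitmap = 0
    for p in partition_ids:
        if not 0 <= p < BITMAP_BITS:
            raise ValueError(f"partition {p} does not fit in the bitmap")
        bitmap |= 1 << p  # ASSUMPTION: partition p maps to bit p
    # to_bytes raises OverflowError if key_count does not fit in 40 bits.
    return key_count.to_bytes(COUNT_BYTES, "big") + bitmap.to_bytes(BITMAP_BITS // 8, "big")


def decode_back_index_reduction(buf: bytes):
    """Return (key count, sorted list of partition IDs present)."""
    if len(buf) != COUNT_BYTES + BITMAP_BITS // 8:
        raise ValueError("back index reduction must be exactly 133 bytes")
    key_count = int.from_bytes(buf[:COUNT_BYTES], "big")
    bitmap = int.from_bytes(buf[COUNT_BYTES:], "big")
    partitions = [p for p in range(BITMAP_BITS) if (bitmap >> p) & 1]
    return key_count, partitions
```

Under the stated assumption, combining the reductions of two child nodes is a sum of the counts and a bitwise OR of the bitmaps.
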