# subjson - quickly manipulate JSON subfields

This is a high-performance string library for manipulating JSON documents.
It does so by performing simple string substitutions on _regions_ of the
document.

This library uses the fast [jsonsl](https://github.com/mnunberg/jsonsl) parser
to locate the regions of the document which should be replaced, and outputs a
small, fixed-size array of `iovec`-like structures (buffer-length regions)
which together make up the new document.

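
The region-based output described above can be sketched roughly as follows. This is a minimal illustration, not subjson's actual API: the `region` type and the `replace_span`, `regions_length`, and `regions_flatten` functions are hypothetical names, and the byte offsets of the span to replace are assumed to have been found already (in the real library that is the parser's job).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical iovec-like region: a pointer into some buffer plus a length.
 * Illustration only; this is not subjson's actual type. */
typedef struct {
    const char *at;
    size_t length;
} region;

/* Describe the result of replacing doc[start, end) with `value` as three
 * regions: the untouched prefix, the new value, and the untouched suffix.
 * Nothing is copied; the regions only point into the caller's buffers. */
static void replace_span(const char *doc, size_t doc_len,
                         size_t start, size_t end,
                         const char *value, size_t value_len,
                         region out[3])
{
    assert(start <= end && end <= doc_len);
    out[0].at = doc;        out[0].length = start;
    out[1].at = value;      out[1].length = value_len;
    out[2].at = doc + end;  out[2].length = doc_len - end;
}

/* Total length of the document described by the regions. */
static size_t regions_length(const region *regions, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += regions[i].length;
    }
    return total;
}

/* Copy the regions into one contiguous, NUL-terminated buffer.
 * `dst` must have room for regions_length(regions, count) + 1 bytes. */
static void regions_flatten(const region *regions, size_t count, char *dst)
{
    size_t pos = 0;
    for (size_t i = 0; i < count; i++) {
        memcpy(dst + pos, regions[i].at, regions[i].length);
        pos += regions[i].length;
    }
    dst[pos] = '\0';
}
```

In this sketch the new document only exists as three pointer-length pairs until `regions_flatten` is called. A caller that writes to a socket or file could skip the flattening step and pass equivalent `struct iovec` entries to a scatter/gather call such as POSIX `writev`.
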
## Performance Characteristics

Because the library does not actually build a JSON tree, its memory usage and
CPU overhead are constant regardless of the size of the JSON document being
operated upon. The only variable performance factor is the time the library
needs to seek to the location in the document to be modified.

On a single Xeon E5520 core, this library can process about 150-300 MB/s
of JSON. This figure includes the search logic as well as any replacement
logic.

The figure above is somewhat misleading, since operations are often quicker
in practice: the document is only parsed until the relevant match has been
found. This means that even for large inputs, only _n_ bytes of the data are
actually parsed, where _n_ is the position in the document where the match
itself ends.

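
To make the "only _n_ bytes are parsed" point concrete, here is a toy scanner that looks up a top-level key and stops as soon as the matching value ends. It is a sketch under simplifying assumptions and is not how jsonsl or subjson are implemented: the input is assumed to be well-formed JSON, only top-level object keys are searched, keys are compared as raw bytes without unescaping, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int is_ws(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

static size_t skip_ws(const char *doc, size_t len, size_t i)
{
    while (i < len && is_ws(doc[i])) i++;
    return i;
}

/* `i` is at an opening quote; return the index one past the closing quote. */
static size_t skip_string(const char *doc, size_t len, size_t i)
{
    i++;
    while (i < len && doc[i] != '"') {
        i += (doc[i] == '\\') ? 2 : 1;   /* step over escaped bytes such as \" */
    }
    return i < len ? i + 1 : len;
}

/* `i` is at the first byte of a value; return the index one past its end. */
static size_t skip_value(const char *doc, size_t len, size_t i)
{
    int depth = 0;                       /* nesting inside this value */
    while (i < len) {
        char c = doc[i];
        if (c == '"') {
            i = skip_string(doc, len, i);
            if (depth == 0) return i;    /* the value was a string */
            continue;
        }
        if (c == '{' || c == '[') {
            depth++;
        } else if (c == '}' || c == ']') {
            if (depth == 0) return i;    /* scalar ended at the parent's close */
            if (--depth == 0) return i + 1;
        } else if (depth == 0 && (c == ',' || is_ws(c))) {
            return i;                    /* scalar ended */
        }
        i++;
    }
    return len;
}

/* Find `key` among the top-level keys of the object in doc[0, len).
 * On success store the value's half-open byte range and return 1.
 * Bytes at or after *vend are never examined. */
static int find_top_level_value(const char *doc, size_t len, const char *key,
                                size_t *vstart, size_t *vend)
{
    assert(doc != NULL && key != NULL && vstart != NULL && vend != NULL);
    size_t key_len = strlen(key);
    size_t i = skip_ws(doc, len, 0);
    if (i >= len || doc[i] != '{') return 0;
    i++;
    for (;;) {
        i = skip_ws(doc, len, i);
        if (i >= len || doc[i] != '"') return 0;   /* '}' or malformed input */
        size_t kstart = i + 1;
        i = skip_string(doc, len, i);
        if (i >= len) return 0;
        size_t klen = i - 1 - kstart;
        i = skip_ws(doc, len, i);
        if (i >= len || doc[i] != ':') return 0;
        i = skip_ws(doc, len, i + 1);
        size_t start = i;
        i = skip_value(doc, len, i);
        if (klen == key_len && memcmp(doc + kstart, key, key_len) == 0) {
            *vstart = start;
            *vend = i;
            return 1;
        }
        i = skip_ws(doc, len, i);
        if (i >= len || doc[i] != ',') return 0;
        i++;
    }
}
```

Here `*vend` plays the role of _n_: if the matching key sits near the start of a large document, everything after that offset is left untouched, so the cost of the lookup depends on where the match is rather than on the total document size.
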
Performance may also depend on how deep and/or long the path is, since string
comparisons must occasionally be performed on the relevant path components.

## Building

    $ mkdir build
    $ cd build
    $ cmake .. -DCMAKE_BUILD_TYPE=RELEASE -DSUBJSON_GTEST=/path/to/gtest
    $ make
    $ make test
    $ ./bin/bench --help

Note that to run the tests you will need to have a copy of gtest. A minified
version may be found [here](https://github.com/couchbasedeps/gtest).

## Testing commands

The build will produce a `bench` program in the `$build/bin` directory,
where `$build` is the directory from which CMake was run.

The basic syntax of `bench` is:

    ./bin/bench -c <COMMAND> -f <JSON FILE> -p <PATH> [ -v <VALUE> ]

You can use `./bin/bench -c help` to show a list of commands.

For commands which perform mutations, the `-v` argument is required, and
must contain a string which will evaluate as valid JSON within the context
of the operation. In most cases this is just a simple JSON value; in the case
of list operations this may also be a series of JSON values separated by
commas.

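
The phrase "valid JSON within the context of the operation" can be illustrated with a small sketch. A series such as `2,"three"` is not a JSON document by itself, but it becomes valid once spliced into an existing list. The `array_append` function below is a hypothetical helper written for this illustration; it is not part of subjson or `bench`, and it does not validate its input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int is_ws(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

/* Append `values`, a comma-separated series of JSON values, to the JSON
 * array text `array` by splicing it in front of the closing bracket.
 * Writes a NUL-terminated result to dst and returns its length, or 0 if
 * `array` does not look like an array or dst is too small.
 * `values` is copied verbatim and is NOT validated. */
static size_t array_append(const char *array, const char *values,
                           char *dst, size_t dst_size)
{
    assert(array != NULL && values != NULL && dst != NULL);
    size_t len = strlen(array);
    size_t vlen = strlen(values);
    size_t open = 0, close = len;

    /* Trim surrounding whitespace and check for the enclosing brackets. */
    while (open < len && is_ws(array[open])) open++;
    while (close > open && is_ws(array[close - 1])) close--;
    if (close - open < 2 || array[open] != '[' || array[close - 1] != ']') {
        return 0;
    }
    close--;                             /* index of the closing ']' */

    /* The array is empty if only whitespace sits between '[' and ']'. */
    size_t last = close;
    while (last > open + 1 && is_ws(array[last - 1])) last--;
    int empty = (last == open + 1);

    size_t total = last + (empty ? 0 : 1) + vlen + 1;
    if (total + 1 > dst_size) return 0;

    memcpy(dst, array, last);            /* everything before the new values */
    size_t pos = last;
    if (!empty) dst[pos++] = ',';        /* separator after existing elements */
    memcpy(dst + pos, values, vlen);
    pos += vlen;
    dst[pos++] = ']';
    dst[pos] = '\0';
    return pos;
}
```

Because the values are spliced in verbatim, whatever is passed must already be written as JSON text, including any quoting.
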
Note that when inserting a string, the string must be specified with
surrounding quotes. For example:

    ./bin/bench -f ../jsondata/brewery_5k.json -v '"CENSORED DUE TO PROHIBITION"' -p description -c replace