History log of /6.6.0/kv_engine/ (Results 1 - 25 of 14892)
Revision Date Author Comments
Revision tags: v6.6.0
b08424f8 27-Jul-2020 Paolo Cocchi <paolo.cocchi@couchbase.com>

MB-40634: Update datatype when decompressing the payload at DelWithMeta

Validations may fail otherwise, as some code paths try to re-inflate the
payload if still marked as compressed.

Change-Id: Iae20e7029cf031d32b63880a780a7052441f067d
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/133272
Tested-by: Build Bot <build@couchbase.com>
Well-Formed: Build Bot <build@couchbase.com>
Reviewed-by: Trond Norbye <trond.norbye@couchbase.com>

8cec59e6 24-Jul-2020 Dave Rigby <daver@couchbase.com>

MB-40543: Merge branch '6.5.1' into mad-hatter

* 6.5.1:
MB-40543: Add dynamic blacklist FTS log config option

Change-Id: I4eba4a9650a255544607189a3873e2f2b2e2c8ca


166a75af 22-Jul-2020 Ben Huddleston <ben.huddleston@couchbase.com>

MB-40543: Add dynamic blacklist FTS log config option

Add a new config option:

* dcp_blacklist_fts_connection_logs - Blacklists FTS DCP logging by
default by setting the log level to critical and unregistering the
logger from log level verbosity changes.

If we un-blacklist the FTS connection logger then
we set the level to that of the global logger and re-register it so
that it can change log level along with the other connections.

Usage:

Set either via bucket config, or epctl on a per node / per bucket basis:

cbepctl <HOST> -u Administrator -p asdasd -b <BUCKET> set dcp_param dcp_blacklist_fts_connection_logs false

Change-Id: Ia77ca49d2b8470c0674f1d0e4fe9bde2e64f8f6a
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/133049
Well-Formed: Build Bot <build@couchbase.com>
Tested-by: Dave Rigby <daver@couchbase.com>
Reviewed-by: Paolo Cocchi <paolo.cocchi@couchbase.com>
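
The blacklist behaviour described in MB-40543 can be pictured with a minimal sketch. The `Level` enum, `ConnLogger` struct and both functions below are hypothetical names chosen for illustration; they are not the kv_engine logger API.

```cpp
#include <cassert>

// Hypothetical log levels, ordered from most to least verbose.
enum class Level { Trace, Debug, Info, Warning, Error, Critical };

// Hypothetical per-connection logger state (not the real kv_engine type).
struct ConnLogger {
    Level level;
    bool registered; // true if the logger follows global verbosity changes
};

// Blacklisting pins the logger at Critical and unregisters it;
// un-blacklisting adopts the global level and re-registers it.
void setBlacklisted(ConnLogger& logger, bool blacklisted, Level globalLevel) {
    if (blacklisted) {
        logger.level = Level::Critical;
        logger.registered = false;
    } else {
        logger.level = globalLevel;
        logger.registered = true;
    }
}

// A global verbosity change only reaches registered loggers.
void onVerbosityChange(ConnLogger& logger, Level newLevel) {
    if (logger.registered) {
        logger.level = newLevel;
    }
}
```

In this sketch a blacklisted logger ignores verbosity changes until it is un-blacklisted, which matches the behaviour the commit message describes.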

68420003 21-Jul-2020 Ben Huddleston <ben.huddleston@couchbase.com>

MB-40480: Compare seqno at VBucket::deletedOnDiskCbk

Currently at VBucket::deletedOnDiskCbk we check the revision seqno
of the item to determine if it is the latest version before removing
it from the HashTable post-persistence if it is also deleted. This
causes a race condition though when we have deleted prepares and aborts.
This race condition is as follows:
1) Create and flush a prepare
2) Create an abort for that prepare
3) Run the flusher to the point that it has acquired the mutations
from the CheckpointManager but not yet invoked the
PersistenceCallback
4) Create a deleted prepare with the same key
5) Finish running the PersistenceCallback

At step 5 we would then compare the deleted flags and revision seqno of
the abort with those of the new prepare we created at step 4. These
values will be the same and at this point we will remove the new
prepare from the HashTable. This causes errors/exceptions later on
as we attempt to complete this prepare as we expect that prepares are
always resident in the HashTable.

Correct this by changing the revision seqno check to an actual seqno
check which should be different for any non meta items.

Change-Id: Icd498725fab94a0339e6677b4d14c98c5a196b8e
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/132951
Well-Formed: Build Bot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>
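
The difference between the two checks in MB-40480 can be sketched as follows. The `ItemMeta` struct and both function names are assumptions made for this illustration, and the concrete seqno values are invented; the real comparison lives in VBucket::deletedOnDiskCbk.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified view of the fields compared after persistence.
struct ItemMeta {
    std::int64_t bySeqno;   // sequence number assigned to this mutation
    std::uint64_t revSeqno; // revision seqno; may coincide across items
    bool deleted;
};

// Old check: deleted flag plus revision seqno.
bool matchesByRevSeqno(const ItemMeta& persisted, const ItemMeta& inHashTable) {
    return persisted.deleted == inHashTable.deleted &&
           persisted.revSeqno == inHashTable.revSeqno;
}

// New check: deleted flag plus the actual seqno.
bool matchesBySeqno(const ItemMeta& persisted, const ItemMeta& inHashTable) {
    return persisted.deleted == inHashTable.deleted &&
           persisted.bySeqno == inHashTable.bySeqno;
}
```

With a persisted abort and a newer deleted prepare that share a revision seqno, the old check reports a match (so the prepare would be removed), while the seqno check does not.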

dd5b5773 17-Jul-2020 Paolo Cocchi <paolo.cocchi@couchbase.com>

MB-40467: Expiration removes everything from the value but SysXattrs

The VBucket::handlePreExpiry function may keep the body of an item
depending on whether the payload also contains user/sys xattrs.

Depending on whether a DCP connection enables IncludeDeletedUserXattrs,
that may result in validation failures that may crash the Producer or
return EINVAL at Consumer.

To fix, VBucket::handlePreExpiry ensures that the item's value is always
replaced with the new value returned by the pre_expiry hook.

Change-Id: I300e3b8fb26883090ea3bfffdfb5165eb04571d7
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/132632
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>
Well-Formed: Build Bot <build@couchbase.com>

45590d31 17-Jul-2020 Paolo Cocchi <paolo.cocchi@couchbase.com>

MB-40467: Don't use updateStoredValue in VBucket::handlePreExpiry

At "expiration by access" currently we replace the SV's value by calling
HT::unlocked_updateStoredValue and then we execute the soft-delete
logic.

The problem with updateStoredValue is that the function updates the
deleted-state of the StoredValue. While that is fine for the usual
write-path, in the case of expiration we essentially lose the "pre
expiration deleted-state", which leads to errors in the accounting of
BasicLinkedList::numDeletedItems for Ephemeral. See the code in
BasicLinkedList::updateNumDeletedItems for details.

The problem is currently hidden by the main issue caught in MB-40467,
and an ep_testsuite test fails as soon as we fix MB-40467.
So, this patch is a prerequisite for the actual main fix for MB-40467.
The follow-up patch contains the fix for MB-40467 and the test coverage
for it. Plus, the existing ep_testsuite covers what we fix in this
patch.

Change-Id: Id5821f13f0c9d239ec368e03b7887611246f9b14
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/132727
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>
Well-Formed: Build Bot <build@couchbase.com>

5c64d40a 13-Jul-2020 Paolo Cocchi <paolo.cocchi@couchbase.com>

MB-40370: Remove unused code in xattr/utils.cc

Change-Id: I410d79c62798cdb2e085b7bd5218aa227e263a79
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/132285
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
Well-Formed: Build Bot <build@couchbase.com>

e31f2734 13-Jul-2020 Paolo Cocchi <paolo.cocchi@couchbase.com>

MB-40370: Make cb::xattr::get_body_size resilient to compressed payloads

Any computation on body/xattr sizes must be done on uncompressed values,
the function may trigger or even silently return wrong results.

Change-Id: I6fde1db6ee4f229bf7625b5d43a2272278ab2be1
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/132361
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Well-Formed: Build Bot <build@couchbase.com>

3aa4532d 01-Jul-2020 Dave Rigby <daver@couchbase.com>

MB-40243: Simplify subdoc testapp_xattr

Make use of the simpler test helper functions added as part of the
testing for MB-40126.

Change-Id: I20d75caa92cb6f6c00209a743eda4e1d92aabf1d
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/131730
Well-Formed: Build Bot <build@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Paolo Cocchi <paolo.cocchi@couchbase.com>

08d5dc41 26-Jun-2020 Dave Rigby <daver@couchbase.com>

MB-40162: Expand TRACE logging for bucket_get & bucket_store

Change-Id: I2ae5d105155f1770f59a44100320f6684136e125
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/131461
Well-Formed: Build Bot <build@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Richard de Mellow <richard.demellow@couchbase.com>

20504bbb 02-Jul-2020 Dave Rigby <daver@couchbase.com>

MB-40262: Subdoc inserts of Alive empty docs should be '{}'

Prior to the fix for MB-40126 (c55541f9f), if subdoc created a new
document (via doc_flag::Add or MkDoc) and all of the paths were XATTR
paths then the resulting document body would be '{}' - i.e. an empty
JSON object, but crucially *not* an empty value.

As part of fixing MB-40126 (subdoc mutations creating a document in
the *deleted* state had a non-empty '{}' value), the '{}' value for
alive document was lost - i.e. patch c55541f9f made both Alive and
Deleted inserts with only XATTR paths have a null '' value.

While in the abstract this is probably fine, it breaks the previous
5.0.0 -> 6.5.0 behaviour, and applications may be relying on this
(SDK Transactions does).

As such, revert to the previous behaviour for Alive documents - they
will be created as '{}' when all paths are XATTR paths.

(The previous fix for Deleted documents is kept - they will be created
as '').

Change-Id: I7b96b89a656c7b745fcd3c19174c6859e5c957f2
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/131817
Reviewed-by: James Harrison <james.harrison@couchbase.com>
Well-Formed: Build Bot <build@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>
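
The resulting rule from MB-40262 and MB-40126, for a new document created when every path is an XATTR path, can be summarised in a small sketch. `DocState` and `bodyWhenOnlyXattrPaths` are hypothetical names for illustration only, not functions from the subdoc code.

```cpp
#include <cassert>
#include <string>

// Hypothetical document states for a newly created document.
enum class DocState { Alive, Deleted };

// Body of a document created by subdoc (Add or MkDoc) when all of the
// paths are XATTR paths: Alive documents get an empty JSON object,
// Deleted documents get an empty value.
std::string bodyWhenOnlyXattrPaths(DocState state) {
    return state == DocState::Alive ? "{}" : "";
}
```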

c55541f9 29-Jun-2020 Dave Rigby <daver@couchbase.com>

MB-40126: subdoc CreateAsDeleted: User value shouldn't be '{}'

When using the subdoc API to create documents in the deleted state
with user XAttrs, the user value _should_ be entirely empty (apart
from the XATTRs).

However, the user value can incorrectly end up as '{}' (i.e. an empty
JSON object, but _not_ an empty document) in the following situation:

1. If the document doesn't already exist (but the user specifies
doc_flag::MkDoc or doc_flag::Add).
2. If the subdoc mutation doesn't specify a path to the user value.

This occurs because if the document doesn't already exist (but MkDoc or
Add is specified), then an "empty" JSON object '{}' is assigned to be
the input document. If no further modification of the document occurs,
then the resulting user body is the empty JSON object '{}'.

The solution is to defer creating the empty template object until the
body phase of subdoc_update; only if there's actually one or more
value paths to be applied.

Change-Id: I466d38ef71bd0345e4c45905d85ebfbb8bda55b6
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/131577
Well-Formed: Build Bot <build@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Paolo Cocchi <paolo.cocchi@couchbase.com>

c4383c71 26-Jun-2020 Dave Rigby <daver@couchbase.com>

MB-40162: Allow Add+CreateAsDeleted if no tombstone found after bgfetch

When attempting to create a document directly in the Tombstoned state
(via CreateAsDeleted), this request would intermittently fail with
KEY_ENOENT.

This is caused by a bug in the subdoc_fetch code when allowing
execution to continue even if no existing doc is found (say when using
CreateAsDeleted), if the check to see if a tombstone is present needs
to go to disk (i.e. the Bloom filter failed to avoid the
bgFetch). This explains why this only occurs intermittently.

In that situation, the subdoc state machine sequence for a
multi-mutation with Add|AccessDeleted|CreateAsDeleted looks like:

1. subdoc_fetch -> bucket_get(AliveOrDeleted) ->
EventuallyPersistentEngine::get() -> not found in HT and bloom
filter cannot determine -> EWOULDBLOCK, schedule bgFetch.

2. subdoc_fetch returns back to runloop with EWOULDBLOCK, waiting for
notify_io_complete

3. bgFetch completes, returns KEY_ENOENT (no tombstone on disk) ->
notify_io_complete(fd, KEY_ENOENT).

4. subdoc_fetch called again, checks AsyncIO status code
KEY_ENOENT. CreateAsDeleted specified - Ok, setup empty document
and return true (to continue execution).

*** However, the status code (ret) is not modified, is still KEY_ENOENT. ***

5. subdoc_execute called, performs requested operations on
newly-formed empty doc.

6. subdoc_update called, checks ret. Upon finding non-successful
result returns early with that same status code (KEY_ENOENT).

Note the problem at step 4. Fix is to ensure that after subdoc_fetch()
succeeds (i.e. is happy with no tombstone being present), then ret is
set to SUCCESS.

Change-Id: I6edb5df9f4cdbf986971136c1a3aa4c493c83711
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/131460
Well-Formed: Build Bot <build@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Jim Walker <jim@couchbase.com>
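
Steps 4 to 6 of the MB-40162 sequence can be reduced to the following sketch. The `Status` enum, `SubdocContext`, the function names and the `applyFix` switch are all hypothetical simplifications, present only so that the behaviour before and after the fix can be compared; they are not the real subdoc state machine.

```cpp
#include <cassert>

enum class Status { Success, KeyEnoent };

// Hypothetical, much-reduced subdoc command context.
struct SubdocContext {
    bool createAsDeleted;
    bool hasDocument;
};

// Step 4: called again after the bgFetch reported KeyEnoent.
// Returns true if execution may continue.
bool subdocFetch(SubdocContext& ctx, Status& ret, bool applyFix) {
    if (ret == Status::KeyEnoent) {
        if (!ctx.createAsDeleted) {
            return false;
        }
        ctx.hasDocument = true; // set up an empty document
        if (applyFix) {
            ret = Status::Success; // the fix: clear the stale status
        }
    }
    return true;
}

// Step 6: returns early with any non-successful status.
Status subdocUpdate(Status ret) {
    if (ret != Status::Success) {
        return ret;
    }
    return Status::Success; // the real code would store the document here
}
```

Without the fix, the stale KeyEnoent survives a successful fetch and is returned by the update step; with it, the update proceeds.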

bcfde9c6 26-Jun-2020 Dave Rigby <daver@couchbase.com>

MB-37374: XattrNoDocTest: Make explicit when tests are not supported

Previously we reported Success for all these tests, without any
indication they were actually skipped. Instead explicitly flag them as
skipped.

Note: Once this is merged to master (where we have GoogleTest v1.10)
the test status changes to SKIPPED instead of PASS.

Change-Id: Ic1745f1f5d87ee383e6c428fee987fb9315617c3
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/131458
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: James Harrison <james.harrison@couchbase.com>
Well-Formed: Build Bot <build@couchbase.com>

97752cd1 23-Jun-2020 Paolo Cocchi <paolo.cocchi@couchbase.com>

MB-40096: Update DcpOpen doc for IncludeDeletedUserXattrs

The entry for NoValueWithUnderlyingDatatype is also missing, so we
add that here too.

Change-Id: Id69be43e805eed6fbc72b3c4d4227836b3251c52
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/131152
Well-Formed: Build Bot <build@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>

ee5212ee 22-Jun-2020 Trond Norbye <trond.norbye@gmail.com>

MB-40058: [BP]: Synchronize access to trace vector

The trace vector may be operated from multiple threads
(and reallocated). Make sure that we serialize this
access (and don't reallocate without letting other
threads know that the location isn't valid anymore)

Change-Id: I0addc4e4e75c3ff7ef87024f0cc8a5ccfd64ef01
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/131059
Well-Formed: Build Bot <build@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
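
One common way to serialize access to a vector that may reallocate, as MB-40058 requires, is to guard it with a mutex and hand out copies rather than references. The sketch below shows that general technique; `TraceLog` and its members are hypothetical, and the actual patch may synchronize differently.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical trace container; not the kv_engine tracer class.
class TraceLog {
public:
    // Appending may reallocate the vector, so it happens under the lock.
    void add(std::string span) {
        std::lock_guard<std::mutex> guard(mutex);
        spans.push_back(std::move(span));
    }

    // Readers receive a copy, so no reference into the vector's storage
    // outlives the lock and becomes invalid after a reallocation.
    std::vector<std::string> snapshot() const {
        std::lock_guard<std::mutex> guard(mutex);
        return spans;
    }

private:
    mutable std::mutex mutex;
    std::vector<std::string> spans;
};
```

Returning a copy trades some allocation cost for safety; it is one design choice among several.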

a4372adc 19-Jun-2020 Dave Rigby <daver@couchbase.com>

MB-40052 [3/3]: Account for Backfills in initializingQ on destruction

When Backfills are started (moved into the initializingQ) the
BackfillManager notifies the DcpConnMap so it can track how many
Backfills across the whole bucket are in progress, and if necessary
cap the number which can concurrently run.

When these backfills complete (either successfully or are cancelled),
then the BackfillManager needs to also notify DcpConnMap to decrement
the number in-progress.

However, when the additional initializingQ was added to
BackfillManager for MB-37680, Backfills in that queue were *not*
decremented from the number in progress.

The effect of this was that such Backfills effectively "leaked", meaning
that with a sufficient number of them, DCP backfilling for the entire
bucket could hang.

Fix by adding the missing accounting.

Add tests to check that active or snoozing backfills which are still
in existence when a BackfillManager is destructed, are correctly
accounted in the BackfillTracker.

Change-Id: I00215072e9558e7f5cdcd1672f800027d90124ac
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/130995
Tested-by: Build Bot <build@couchbase.com>
Well-Formed: Build Bot <build@couchbase.com>
Reviewed-by: James Harrison <james.harrison@couchbase.com>
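
The accounting fix in MB-40052 [3/3] can be sketched as a manager whose destructor decrements the bucket-wide count for every queue, including the initializing one. All class and member names below are hypothetical stand-ins, and backfills are reduced to integer ids; the real BackfillManager and tracking interface differ.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for the bucket-wide in-progress count.
struct BackfillTrackerSketch {
    std::size_t inProgress = 0;
    void onStart() { ++inProgress; }
    void onFinish() { --inProgress; }
};

class BackfillManagerSketch {
public:
    explicit BackfillManagerSketch(BackfillTrackerSketch& t) : tracker(t) {}

    // On destruction, account for backfills in every queue. Omitting
    // initializingQ here is the kind of leak the commit describes.
    ~BackfillManagerSketch() {
        const std::size_t remaining =
                initializingQ.size() + activeQ.size() + snoozingQ.size();
        for (std::size_t i = 0; i < remaining; ++i) {
            tracker.onFinish();
        }
    }

    // Starting a backfill places it in initializingQ and notifies the tracker.
    void schedule(int id) {
        initializingQ.push_back(id);
        tracker.onStart();
    }

    // Move one backfill from initializingQ to activeQ.
    void promoteOne() {
        if (!initializingQ.empty()) {
            activeQ.push_back(initializingQ.front());
            initializingQ.pop_front();
        }
    }

private:
    BackfillTrackerSketch& tracker;
    std::deque<int> initializingQ;
    std::deque<int> activeQ;
    std::deque<int> snoozingQ;
};
```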

da816fc0 19-Jun-2020 Dave Rigby <daver@couchbase.com>

MB-40052 [2/3]: Introduce BackfillTrackingIface

Further decouple BackfillManager from DcpConnMap, by introducing a
BackfillTrackingIface. This interface allows the implementer to track
the active Backfills, and is used by BackfillManager to know when to
place Backfills onto the pending list.

By adding this interface it simplifies BackfillManager unit tests - no
need to mock the entire DcpConnMap if behaviour of it needs to be changed.

Change-Id: I9999a2a2b9ea8e9a720bcf8d6c4d3b2c82fa15aa
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/130975
Tested-by: Build Bot <build@couchbase.com>
Well-Formed: Build Bot <build@couchbase.com>
Reviewed-by: James Harrison <james.harrison@couchbase.com>

148e1ac3 19-Jun-2020 Dave Rigby <daver@couchbase.com>

MB-40052 [1/3]: Decouple BackfillManager and EvpEngine

Modify BackfillManager so it doesn't rely directly on
EventuallyPersistentEngine, instead it is passed references to the two
sub-objects it cares about (KVBucket, DcpConnMap) and the 3 config
parameters.

This makes it significantly simpler to unit-test BackfillManager
functionality related to DcpConnMap (i.e. backfill tracking at the
bucket level) - for example a mock DcpConnMap can be injected directly
without having to somehow first create an entire
EventuallyPersistentEngine.

Change-Id: I332183984907492d3c7765fbae789f6124eabce1
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/130974
Tested-by: Build Bot <build@couchbase.com>
Well-Formed: Build Bot <build@couchbase.com>
Reviewed-by: James Harrison <james.harrison@couchbase.com>

3d701d2d 17-Jun-2020 Dave Rigby <daver@couchbase.com>

MB-39993: [Ephemeral] Handle delete_time < server startup time

There's a bug in how we handle replicating an item's deletion time
over DCP, if the item was deleted before the destination node started. The
effect of this is that the deletion time of the item on the replica is
set to memcached uptime + 1, and hence won't be subject to metadata
purging until the entire purge interval has passed, even if the actual
item was deleted a long time ago.

The consequence of this is that on Ephemeral buckets, deletions
earlier than server startup received by a replica vBucket will not be
purged at the correct time. Instead their purge is delayed until
server_startup + metadata purge interval.

This occurs because the delete_time in StoredValue is represented as
rel_time_t - i.e. seconds since server started. As such, it is not
possible to represent delete times earlier than the server startup
time. If such a rel_time_t is attempted to be created, then
mc_time_convert_to_real_time() returns a value of "1" (since server
startup):

    /* if item expiration is at/before the server started, give it an
       expiration time of 1 second after the server started.
       (because 0 means don't expire). without this, we'd
       underflow and wrap around to some large value way in the
       future, effectively making items expiring in the past
       really expiring never */
    if (t <= epoch) {
        rv = (rel_time_t)1;
    }

To address this, StoredValue::delete_time is changed from rel_time_t,
to a 32bit time_t (seconds since Unix epoch) - the same as is used for
expiry time. This allows delete times before the server startup to be
represented, and hence correctly purged.

Implementation Note:

Using rel_time_t and (uint32_t) time_t is error-prone, as they are
typically just typedefs to the same underlying type (uint32_t) and
hence the compiler won't help you if you directly assign one to the
other. When this is merged to master I plan to introduce
strongly-typed clocks for each which _will_ prevent any such
accidental conversion, however that would significantly increase the
scope of this fix so not doing that here.

Change-Id: I45b62286f422785462b52eabcf93fdcde73bfa9e
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/130788
Tested-by: Build Bot <build@couchbase.com>
Tested-by: Dave Rigby <daver@couchbase.com>
Reviewed-by: Ben Huddleston <ben.huddleston@couchbase.com>
Well-Formed: Build Bot <build@couchbase.com>
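
The effect of the clamping described in MB-39993 can be seen in a small sketch that compares a relative delete time with an absolute one. The type aliases, function names and all numeric values are invented for illustration; only the clamp-to-1 rule is taken from the snippet quoted in the message.

```cpp
#include <cassert>
#include <cstdint>

using RelTime = std::uint32_t; // seconds since server start (like rel_time_t)
using AbsTime = std::uint32_t; // seconds since the Unix epoch

// Mirrors the quoted clamping: times at or before server start become 1.
RelTime toRelative(AbsTime t, AbsTime serverStart) {
    if (t <= serverStart) {
        return 1;
    }
    return t - serverStart;
}

// Hypothetical purge test on relative times.
bool purgeableRelative(RelTime deleteTime, RelTime now, std::uint32_t purgeAge) {
    return now >= deleteTime && now - deleteTime >= purgeAge;
}

// The same test on absolute times, which can represent deletions that
// happened before the server started.
bool purgeableAbsolute(AbsTime deleteTime, AbsTime now, std::uint32_t purgeAge) {
    return now >= deleteTime && now - deleteTime >= purgeAge;
}
```

For an item deleted well before server startup, the relative form treats the deletion as having happened one second after startup, so the purge is delayed; the absolute form sees the true age.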

36134ec9 08-Jun-2020 Richard de Mellow <richard.demellow@couchbase.com>

MB-37009: DcpProducer::handleResponse, don't disconnect on KeyEexists

DcpProducer shouldn't disconnect on status code
cb::mcbp::Status::KeyEexists, as this is returned by
DcpConsumer::lookupStreamAndDispatchMessage() when the stream found
for the op's vbucket has an opaque that does not match the op's
opaque. This can occur during rebalance due to stream takeover, and
thus we don't want to tear down all the streams because it's likely a
vbucket is being moved to another node or has changed state.

Change-Id: I6ed38cd9b50b40b84b0a354b2a26f952bb380d80
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/130046
Well-Formed: Build Bot <build@couchbase.com>
Reviewed-by: James Harrison <james.harrison@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>

257ba402 20-May-2020 James Harrison <00jamesh@gmail.com>

MB-39333: Ignore unpersisted aborts during rollback

Rolling back an unpersisted abort erroneously tried to load an earlier
version of the document from disk. This instead loaded an older prepare
into memory - even though the _commit_ for that prepare was not rolled
back.

E.g.,

Seqno Operation

1 Prepare
2 Commit
3 Prepare
4 Abort

Rollback to seqno 2.

This should leave only the committed value at seqno 2 in the hashtable.
Instead, when processing the abort at seqno 4
EPBucket::rollbackUnpersistedItems loaded the prepare at seqno 1 back
into memory.

Aborts remove the prepared document from the hashtable, so
EPBucket::rollbackUnpersistedItems does not need to do anything to roll
an abort back.

Had the rollback been to seqno 3, EPBucket::loadPreparedSyncWrites
would handle reloading the prepare into the hashtable.

Change-Id: Ib73e81c52dd56f99c7390e1ec8ac47ce84a41e21
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/128625
Well-Formed: Build Bot <build@couchbase.com>
Reviewed-by: Ben Huddleston <ben.huddleston@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>

7a0ce255 05-Jun-2020 Dave Rigby <daver@couchbase.com>

MB-39618: Use coarse clock for CappedDurationVBucketVisitor

CappedDurationVBucketVisitor::pauseVisitor() determines if it should
pause by calling steady_clock::now() and comparing it to the start
time, to see if the visitor has run for too long.

For tasks which either (a) run frequently or (b) spend little time per
VBucket, the number of calls to steady_clock::now() can be
significant. In the case of DurabilityTimeoutTask, it runs every 25ms,
and on a cluster with zero SyncWrites currently in progress it will
have little work to do when visiting a VBucket. As such, both (a) and
(b) are true for it.

steady_clock::now() is implemented on Linux using
clock_gettime(CLOCK_MONOTONIC). This is _normally_ a very fast call,
using the VDSO to return a result without having to make a syscall in
the common case. As such, the large number of steady_clock::now()
calls are not an issue.

However, under certain environments clock_gettime(CLOCK_MONOTONIC) is
*not* fast - for example if the 'HPET' clocksource is in use - a full
syscall is required. This is the case in the original MB linked, where
Docker for Mac was using HPET in its embedded Linux VM. However HPET
could also be used in other environments - it has been seen in the
past on cloud-based virtualization platforms.

This results in a significant increase in the idle CPU of memcached
(specifically the NonIO threads servicing the DurabilityTimeoutTask) -
in the OP's environment, 3 idle, empty Buckets went from ~15% CPU to
150% CPU.

To address this, change the clock used by CappedDurationVBucketVisitor
to one backed by clock_gettime(CLOCK_MONOTONIC_COARSE) (on
Linux). This is an alternative clock which only gives 1 millisecond
resolution, *but* crucially can always be handled in userspace[1]
without having to make a syscall (even when HPET is in use).

This reduces the CPU overhead back to the original 15%.

[1]: https://elixir.bootlin.com/linux/v4.19.76/source/arch/x86/entry/vdso/vclock_gettime.c#L282

Change-Id: I111cd5f0703b0b2dd6902c4ab0584da4477e94ac
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/129936
Tested-by: Build Bot <build@couchbase.com>
Well-Formed: Build Bot <build@couchbase.com>
Reviewed-by: Trond Norbye <trond.norbye@couchbase.com>
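
A clock backed by clock_gettime(CLOCK_MONOTONIC_COARSE), as MB-39618 describes, could look roughly like the sketch below. `CoarseSteadyClock` and `shouldPause` are hypothetical names, the code assumes a POSIX system, and it falls back to CLOCK_MONOTONIC where the coarse clock id is not defined; the clock actually used by kv_engine may be implemented differently.

```cpp
#include <cassert>
#include <chrono>
#include <time.h>

// Hypothetical chrono-style clock reading the coarse monotonic clock.
struct CoarseSteadyClock {
    using duration = std::chrono::nanoseconds;
    using rep = duration::rep;
    using period = duration::period;
    using time_point = std::chrono::time_point<CoarseSteadyClock>;
    static constexpr bool is_steady = true;

    static time_point now() noexcept {
        timespec ts{};
#ifdef CLOCK_MONOTONIC_COARSE
        clock_gettime(CLOCK_MONOTONIC_COARSE, &ts);
#else
        clock_gettime(CLOCK_MONOTONIC, &ts); // fallback on non-Linux systems
#endif
        return time_point(std::chrono::seconds(ts.tv_sec) +
                          std::chrono::nanoseconds(ts.tv_nsec));
    }
};

// Pause check in the spirit of pauseVisitor(): has the visitor run for
// longer than its allowed duration?
bool shouldPause(CoarseSteadyClock::time_point start,
                 std::chrono::milliseconds maxDuration) {
    return CoarseSteadyClock::now() - start > maxDuration;
}
```

The coarse resolution is acceptable here because the check only needs to decide whether a budget of many milliseconds has been exceeded.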

e311e7ed 05-Jun-2020 Richard de Mellow <richard.demellow@couchbase.com>

MB-37009: DcpProducer::handleResponse, don't disconnect on KeyEnoent

DcpProducer shouldn't disconnect on status code
cb::mcbp::Status::KeyEnoent, as this is returned by
DcpConsumer::lookupStreamAndDispatchMessage() when it can't find a
stream for the vbucket the op is for. This can occur during rebalance,
and thus we don't want to tear down all the streams because it's likely a
vbucket is being moved to another node or has changed state. However, we
do need to disconnect for DCP durability ops (Prepare, Abort, Commit)
as KeyEnoent is used to indicate something more serious has
happened e.g. a DCP_COMMIT can't find the matching prepare.

Change-Id: I7fee92b72c8e99c8422c0315248d75b0b3891230
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/129940
Well-Formed: Build Bot <build@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
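
The KeyEnoent rule from this commit can be condensed into a short sketch. `DcpOp` and both function names are hypothetical; the real decision is made inside DcpProducer::handleResponse on the actual opcode and status types.

```cpp
#include <cassert>

// Hypothetical subset of DCP operations.
enum class DcpOp { Mutation, Deletion, Expiration, Prepare, Commit, Abort };

// Durability ops as listed in the commit message.
bool isDurabilityOp(DcpOp op) {
    return op == DcpOp::Prepare || op == DcpOp::Commit || op == DcpOp::Abort;
}

// Returns true if the producer should disconnect when the consumer
// answers this op with KeyEnoent.
bool disconnectOnKeyEnoent(DcpOp op) {
    return isDurabilityOp(op);
}
```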

e47bcf42 03-Jun-2020 Paolo Cocchi <paolo.cocchi@couchbase.com>

MB-37374: Log IncludeDeletedUserXattrs enabled at dcp_open_executor

I've missed that at http://review.couchbase.org/c/kv_engine/+/126116,
fixing.

Change-Id: I521cd342351de1f9d9c2f164be9efc0f5b6eb162
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/129702
Well-Formed: Build Bot <build@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
