History log of /6.6.0/kv_engine/engines/ep/tests/mock/mock_ephemeral_vb.h (Results 1 – 14 of 14)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v7.1.1, v7.0.4, v7.1.0, v6.6.5-MP1, v6.6.5, v6.6.4, v7.0.2, v7.0.1, v6.6.3, v7.0.0, v6.6.2, v6.5.2, v6.0.5, v6.6.1, v6.5.1-MP5, v6.6.0, v6.5.1-MP3, v6.5.1, v6.0.4
# 34688619 15-Jan-2020 James Harrison <james.harrison@couchbase.com>

MB-37529: Permit locking of multiple ranges of seqList

Backfilling from the seqList requires a range of items to be
"protected" from being updated and moved during the lifetime of the
backfill. This

MB-37529: Permit locking of multiple ranges of seqList

Backfilling from the seqList requires a range of items to be
"protected" from being updated and moved during the lifetime of the
backfill. This is to ensure the resulting set of items is consistent at
the end of the backfill.

To do this, backfills take a range read marking seqnos X to Y, and
front end ops will not relocate items from inside this range. Instead,
a new item will be appended, and the old one marked as stale, to be
cleaned up later after the backfill has completed.

However, previously only a single range read was allowed at a given
time. This means that an ephemeral vbucket can only serve a single
backfill at a time, all others will be delayed. This can have a
significant impact, given that a vbucket might need to backfill for 3
replicas, indexer nodes, the projector for XCDR etc. all at the same
time. This is made more likely under high memory usage, if cursor
dropping is triggered (as in-memory streams will be forced back to
backfill).

This patch allows concurrent read-only access to ranges of the seqList,
allowing concurrent backfills.

There is also specific support for exclusive write access to ranges of
the seqList; this is required for tombstone purging. Exclusive range
locks block shared range locks that would overlap the exclusive range.
Exclusive locks will not succeed if a shared range lock exists which
would overlap it.

As described above, range locks still do not block front end ops.

The "locking" is read-preferring; as a consequence the tombstone purger
will be prevented from locking the needed seqno range as long as a
single backfill holds a lock covering any of those seqnos. As constant
backfills are not normally expected read-preference is acceptable for
now. However, given the importance of purging tombstones/stale items in
terms of managing memory usage (as noted briefly in MB-25218)
write-preference in this situation may be worth assessing.

Change-Id: I17bd544f091415bc6ae336551d92ddb2104f51ca
Reviewed-on: http://review.couchbase.org/121952
Well-Formed: Build Bot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>

show more ...


Revision tags: v6.5.0, v6.0.3, v5.5.6
# f6ed4896 09-Aug-2019 Dave Rigby <daver@couchbase.com>

MB-35458 [SR]: Move SyncWrite completion to bg DurabilityCompletionTask

[[Re-apply after fixing error in DurabilityCompletionTask::run
(skipping last vBucket).]]

Change how SyncWrites which are

MB-35458 [SR]: Move SyncWrite completion to bg DurabilityCompletionTask

[[Re-apply after fixing error in DurabilityCompletionTask::run
(skipping last vBucket).]]

Change how SyncWrites which are Resolved and awaiting Completion are
handled, by moving the final VBucket::commit() / abort() into a
background task - DurabilityCompletionTask.

+Background+

There are two reasons for making this change:

a) Performance - specifically latency of front-end worker threads.

By moving completion into a background task, we reduce the amount of
work done on the thread which actually detected the SyncWrite was
resolved - typically the front-end DCP threads when a DCP_SEQNO_ACK
is processed.
Given that we SEQNO_ACK at the end of Snapshot, A single SEQNO_ACK
could result in committing multiple SyncWrites. Committing one
SyncWrite is similar to a normal front-end Set operation, so there is
potentially a non-trivial amount of work needed to be done when
completing SyncWrites, which could tie up the front-end thread (causing
other Connections to have to wait) for a noticable amount of time.

b) Simplification of lock management.

Doing completion in a background task simplifies lock management, for
example we avoid lock inversions with earlier locks acquired during
dcpSeqnoAck when attemping to later call notifySeqnoAvailable when this
was done on the original thread.

+Problem+

While (a) was the first reason identified for making this change
(see MB-33092), (b) is the reason this change is being made now. During
testing the following lock-order-inversion was seen:

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock)
Cycle in lock order graph:

Stream::streamMutex => StreamContainer::rwlock => Stream::streamMutex

The crux of the issue is the processing of DCP_SEQNO_ACKNOWLEDGED
messages by the DcpProducer - this acquires the Stream::streamMutex
before calling VBucket::seqnoAcknowledged(), however that function
currently results in VBucket::commit() being called to synchronously
complete the SyncWrite; which in turn must nodify all connected
replica that a new seqno is available, requiring
StreamContainer::rwlock to be acquired:

Mutex StreamContainer::rwlock acquired here while holding mutex Stream::streamMutex in thread T15:
...
#6 StreamContainer<std::shared_ptr<Stream> >::rlock()
#7 DcpProducer::notifySeqnoAvailable(Vbid, unsigned long)
...
#13 VBucket::commit(...)
#14 ActiveDurabilityMonitor::commit(...)
#15 ActiveDurabilityMonitor::processCompletedSyncWriteQueue()
#16 ActiveDurabilityMonitor::seqnoAckReceived(...)
#17 VBucket::seqnoAcknowledged(...)
#18 ActiveStream::seqnoAck(...)
#19 DcpProducer::seqno_acknowledged(...)
...

Mutex Stream::streamMutex previously acquired by the same thread here:
...
#3 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
#4 ActiveStream::seqnoAck(...)
#5 DcpProducer::seqno_acknowledged(...)
...

This conflicts with the ordering seen when sending items out on the
DCP connection - inside DcpProducer::step() where the
StreamContainer::rwlock is acquired first, then ActiveStream::mutex
acquired later:

Mutex Stream::streamMutex acquired here while holding mutex StreamContainer::rwlock in thread T15:
...
#3 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
#4 ActiveStream::next()
#5 DcpProducer::getNextItem()
#6 DcpProducer::step(dcp_message_producers*)
...

Mutex StreamContainer::rwlock previously acquired by the same thread here:
#0 pthread_rwlock_rdlock <null> (libtsan.so.0+0x00000002c98b)
...
#4 std::shared_lock<cb::RWLock>::shared_lock(cb::RWLock&)
#5 StreamContainer<>::ResumableIterationHandle::ResumableIterationHandle()
#6 StreamContainer<>::startResumable()
#7 DcpProducer::getNextItem()
#8 DcpProducer::step(dcp_message_producers*)
...

+Solution+

The processing of resolved SyncWrites moved into a new background task.
Instead of immediately processing them within
ActiveDM::seqnoAckReceived(), that function notifies the new NonIO
DurabilityCompletionTask that there are SyncWrites waiting for
completion.

DurabilityCompletionTask maintains a bool per vBucket indicating if
there are SyncWrites for that vBucket pending completion. When the
task is run, for each flag which is true it calls
VBucket::processResolvedSyncWrites() for the associated VBucket.

+Implementation Notes+

Currently there is just a single DurabilityCompletionTask (per Bucket),
this was chosen as 1 task per vBucket (i.e. 1024 per Bucket) would
be inefficient for our current background task scheduler (both in terms
of latency to schedule each task for only one vBucket's worth of work,
and in terms of managing that many tasks in the future queue).

However, that does _potentially_ mean there's fewer resources (threads)
available to complete SyncWrites on - previously that work could be
done concurrently on all frontend threads (~O(num_cpus). Now the same
work only has 1 thread available to run on (there's only a single
DurabilityCompletionTask).

_If_ this becomes a bottleneck we could look at increasing the number of
DurabilityCompletionTask - e.g. sharding all vBuckets across multiple
tasks like flusher / bgfetcher.

Change-Id: I33ecfa78b03b4d2120b5d05f54984b24ce038fd8
Reviewed-on: http://review.couchbase.org/113749
Reviewed-by: Ben Huddleston <ben.huddleston@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>

show more ...


# e39b54f7 21-Aug-2019 Dave Rigby <daver@couchbase.com>

MB-35629: Revert "MB-35458 [SR]: Move SyncWrite completion to bg DurabilityCompletionTask"

The introduction of the background commit has introduced a (severe)
performance regression - SyncWrites are

MB-35629: Revert "MB-35458 [SR]: Move SyncWrite completion to bg DurabilityCompletionTask"

The introduction of the background commit has introduced a (severe)
performance regression - SyncWrites are getting stuck and timing
out. Suspect an issue with the wakeup / notificaiotn of the
DurabilityCompletionTask; reverting change to restore performance
while investigating.

This reverts commit 7fdb98a995f6298678e3ac7b443f9454e21ffae1.

Change-Id: I8d08044d3f6ad3084f7724ead961b8d7530006f1
Reviewed-on: http://review.couchbase.org/113637
Reviewed-by: Ben Huddleston <ben.huddleston@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>

show more ...


# 7fdb98a9 09-Aug-2019 Dave Rigby <daver@couchbase.com>

MB-35458 [SR]: Move SyncWrite completion to bg DurabilityCompletionTask

Change how SyncWrites which are Resolved and awaiting Completion are
handled, by moving the final VBucket::commit() / abort()

MB-35458 [SR]: Move SyncWrite completion to bg DurabilityCompletionTask

Change how SyncWrites which are Resolved and awaiting Completion are
handled, by moving the final VBucket::commit() / abort() into a
background task - DurabilityCompletionTask.

+Background+

There are two reasons for making this change:

a) Performance - specifically latency of front-end worker threads.

By moving completion into a background task, we reduce the amount of
work done on the thread which actually detected the SyncWrite was
resolved - typically the front-end DCP threads when a DCP_SEQNO_ACK
is processed.
Given that we SEQNO_ACK at the end of Snapshot, A single SEQNO_ACK
could result in committing multiple SyncWrites. Committing one
SyncWrite is similar to a normal front-end Set operation, so there is
potentially a non-trivial amount of work needed to be done when
completing SyncWrites, which could tie up the front-end thread (causing
other Connections to have to wait) for a noticable amount of time.

b) Simplification of lock management.

Doing completion in a background task simplifies lock management, for
example we avoid lock inversions with earlier locks acquired during
dcpSeqnoAck when attemping to later call notifySeqnoAvailable when this
was done on the original thread.

+Problem+

While (a) was the first reason identified for making this change
(see MB-33092), (b) is the reason this change is being made now. During
testing the following lock-order-inversion was seen:

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock)
Cycle in lock order graph:

Stream::streamMutex => StreamContainer::rwlock => Stream::streamMutex

The crux of the issue is the processing of DCP_SEQNO_ACKNOWLEDGED
messages by the DcpProducer - this acquires the Stream::streamMutex
before calling VBucket::seqnoAcknowledged(), however that function
currently results in VBucket::commit() being called to synchronously
complete the SyncWrite; which in turn must nodify all connected
replica that a new seqno is available, requiring
StreamContainer::rwlock to be acquired:

Mutex StreamContainer::rwlock acquired here while holding mutex Stream::streamMutex in thread T15:
...
#6 StreamContainer<std::shared_ptr<Stream> >::rlock()
#7 DcpProducer::notifySeqnoAvailable(Vbid, unsigned long)
...
#13 VBucket::commit(...)
#14 ActiveDurabilityMonitor::commit(...)
#15 ActiveDurabilityMonitor::processCompletedSyncWriteQueue()
#16 ActiveDurabilityMonitor::seqnoAckReceived(...)
#17 VBucket::seqnoAcknowledged(...)
#18 ActiveStream::seqnoAck(...)
#19 DcpProducer::seqno_acknowledged(...)
...

Mutex Stream::streamMutex previously acquired by the same thread here:
...
#3 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
#4 ActiveStream::seqnoAck(...)
#5 DcpProducer::seqno_acknowledged(...)
...

This conflicts with the ordering seen when sending items out on the
DCP connection - inside DcpProducer::step() where the
StreamContainer::rwlock is acquired first, then ActiveStream::mutex
acquired later:

Mutex Stream::streamMutex acquired here while holding mutex StreamContainer::rwlock in thread T15:
...
#3 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
#4 ActiveStream::next()
#5 DcpProducer::getNextItem()
#6 DcpProducer::step(dcp_message_producers*)
...

Mutex StreamContainer::rwlock previously acquired by the same thread here:
#0 pthread_rwlock_rdlock <null> (libtsan.so.0+0x00000002c98b)
...
#4 std::shared_lock<cb::RWLock>::shared_lock(cb::RWLock&)
#5 StreamContainer<>::ResumableIterationHandle::ResumableIterationHandle()
#6 StreamContainer<>::startResumable()
#7 DcpProducer::getNextItem()
#8 DcpProducer::step(dcp_message_producers*)
...

+Solution+

The processing of resolved SyncWrites moved into a new background task.
Instead of immediately processing them within
ActiveDM::seqnoAckReceived(), that function notifies the new NonIO
DurabilityCompletionTask that there are SyncWrites waiting for
completion.

DurabilityCompletionTask maintains a bool per vBucket indicating if
there are SyncWrites for that vBucket pending completion. When the
task is run, for each flag which is true it calls
VBucket::processResolvedSyncWrites() for the associated VBucket.

+Implementaiton Notes+

Currently there is just a single DurabilityCompletionTask (per Bucket),
this was chosen as 1 task per vBucket (i.e. 1024 per Bucket) would
be inefficient for our current background task scheduler (both in terms
of latency to schedule each task for only one vBucket's worth of work,
and in terms of managing that many tasks in the future queue).

However, that does _potentially_ mean there's fewer resources (threads)
available to complete SyncWrites on - previously that work could be
done concurrently on all frontend threads (~O(num_cpus). Now the same
work only has 1 thread available to run on (there's only a single
DurabilityCompletionTask).

_If_ this becomes a bottleneck we could look at increasing the number of
DurabilityCompletionTask - e.g. sharding all vBuckets across multiple
tasks like flusher / bgfetcher.

Change-Id: I87897af1e3fd0a57e5abc2cb1ba9f795a9d3c63e
Reviewed-on: http://review.couchbase.org/113141
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Ben Huddleston <ben.huddleston@couchbase.com>

show more ...


Revision tags: v6.0.2, v5.5.5
# 5954243e 15-May-2019 Ben Huddleston <ben.huddleston@couchbase.com>

Use MockEphemeralVBucket in VBucketDurabilityTest

To add some commit and SyncDelete tests in
the future, we need to be able to manually check seqList stats and
correct the collections stats for Ephe

Use MockEphemeralVBucket in VBucketDurabilityTest

To add some commit and SyncDelete tests in
the future, we need to be able to manually check seqList stats and
correct the collections stats for Ephemeral to prevent counter
underflow assertions. The underflow occurs because we don't hit the
public functions in VBucketDurabilityTest (we would need an engine to
do so) we miss a call to update the collections stats on a set.

Change-Id: If72a0b97c9b56415d68caec1fadb22e143ac63aa
Reviewed-on: http://review.couchbase.org/109199
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>

show more ...


# bfdd494d 30-Apr-2019 Dave Rigby <daver@couchbase.com>

Convert item_eviction_policy_t to enum class

Change-Id: Id5aaf7db03703fb9ca1b2df733bcb2d44a1f3dab
Reviewed-on: http://review.couchbase.org/108449
Tested-by: Build Bot <build@couchbase.com>
Reviewed-

Convert item_eviction_policy_t to enum class

Change-Id: Id5aaf7db03703fb9ca1b2df733bcb2d44a1f3dab
Reviewed-on: http://review.couchbase.org/108449
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Daniel Owen <owend@couchbase.com>

show more ...


# bfe28015 23-Apr-2019 Paolo Cocchi <paolo.cocchi@couchbase.com>

MB-33860 [SR]: Introduce the SeqnoAckCallback

This is a pre-requirement for the PassiveDurabilityMonitor to call back
to VBucket for sending a SeqnoAck message on the PassiveStream.
Note that new ca

MB-33860 [SR]: Introduce the SeqnoAckCallback

This is a pre-requirement for the PassiveDurabilityMonitor to call back
to VBucket for sending a SeqnoAck message on the PassiveStream.
Note that new callback is not used yet, so there is no test in this
patch. The callback will be used and tested in follow-up patches where
the PassiveDM will actually trigger seqno-acks at high_prepared_seqno
updates.

Change-Id: I03b1030257e1a92d96cc173e9bd260a6df0b346b
Reviewed-on: http://review.couchbase.org/108149
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: Dave Rigby <daver@couchbase.com>

show more ...


Revision tags: v5.5.4
# 04c1c8ec 20-Mar-2019 Trond Norbye <trond.norbye@gmail.com>

Remove config.h

Change-Id: I79eb8c762971255db9d36a5f6461a8a6d0f29249
Reviewed-on: http://review.couchbase.org/106517
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchba

Remove config.h

Change-Id: I79eb8c762971255db9d36a5f6461a8a6d0f29249
Reviewed-on: http://review.couchbase.org/106517
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>

show more ...


Revision tags: v6.0.1
# ec5434bf 08-Jan-2019 Dave Rigby <daver@couchbase.com>

MB-32535 [SR]: Notify SyncWrite client when request complete

When a SyncWrite request is complete, notify the client which make the
request so the response can be sent back.

To do this we use notif

MB-32535 [SR]: Notify SyncWrite client when request complete

When a SyncWrite request is complete, notify the client which make the
request so the response can be sent back.

To do this we use notifyIOComplete() method on the engine; however
unlike other existing instances where the notification is done by a
background task; we are calling it from the DurabilityMonitor which
doesn't have direct access to the engine instance. Additionally unlike
existing EWOULDBLOCK notification for stores this isn't a idempotent
operation - if we performed the store() again then that would cause
/another/ mutation to occur. To solve this, we:

1. Define a new SyncWriteCompleted callback object, an instance of
which is passed into VBucket when it is created. This is typically
instantiated with a function / lambda from KVBucket which calls
engine.notifyIOComplete().

2. From VBucket::notifyClientOfCommit() this callback object is
invoked.

3. Inside EPEngine::storeIfInner() if this is a SyncWrite operation,
use engineSpecific to record that this is a SyncWrite which has
successfully been prepared, and that the next call to store()
should simply return the completed status & CAS of the
now-committed item.

Change-Id: I38a6dae3aeb98dcedd1ae1ec321db054ff1b7cd7
Reviewed-on: http://review.couchbase.org/103446
Reviewed-by: Paolo Cocchi <paolo.cocchi@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>

show more ...


Revision tags: v5.5.3, v5.1.3, v6.0.0
# 6cfd6a6c 14-Sep-2018 Dave Rigby <daver@couchbase.com>

Ensure headers 'include what they use'

As a follow-up from the previous commit; now we have the ability to
compile each header standalone, fix any missing #includes where header
didn't fully define

Ensure headers 'include what they use'

As a follow-up from the previous commit; now we have the ability to
compile each header standalone, fix any missing #includes where header
didn't fully define all headers it requires.

Change-Id: Ib3d7e8274b0e736a52c83e8332891bd70f23c729
Reviewed-on: http://review.couchbase.org/99723
Reviewed-by: Daniel Owen <owend@couchbase.com>
Reviewed-by: Trond Norbye <trond.norbye@gmail.com>
Tested-by: Build Bot <build@couchbase.com>

show more ...


Revision tags: v5.5.2
# 12561b64 03-Sep-2018 Christopher Farman <christopher.farman@couchbase.com>

MB-30552: Replace VBucket(Map)::id_type with Vbid type

Seeming as VBucket::id_type was changed from uint16_t to Vbid,
the use of VBucket::id_type can be clarified throughout the
codebase by changing

MB-30552: Replace VBucket(Map)::id_type with Vbid type

Seeming as VBucket::id_type was changed from uint16_t to Vbid,
the use of VBucket::id_type can be clarified throughout the
codebase by changing to the shorter, consistent type 'Vbid'.
Following suit, VBucketMap::id_type has been done in the same
patch to avoid potential confusion. Further, the addtion of
Vbid::id_type can be used if the implemented type of Vbid is
needed.

Change-Id: I2d41f75fd9e6684fb425b1fe7d17ae0e6a64e2e3
Reviewed-on: http://review.couchbase.org/99143
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>

show more ...


Revision tags: v5.5.1, v5.1.2, v5.5.0, v5.1.1, v5.1.0, v5.0.1, v5.0.0
# 2aeac7ff 14-Jun-2017 Dave Rigby <daver@couchbase.com>

MB-24765: Make EphTombstoneHTCleaner visit HashTable incrementally

Change the EphTombstoneHTCleaner task to visit the HashTable
incrementally, yielding back to the scheduler every
ephemeral_metadata

MB-24765: Make EphTombstoneHTCleaner visit HashTable incrementally

Change the EphTombstoneHTCleaner task to visit the HashTable
incrementally, yielding back to the scheduler every
ephemeral_metadata_purge_chunk_duration (default: 20ms). This
significantly reduces the impact this background task may have on the
scheduling of other NonIO tasks, given that prevously it was seen to
take a number of seconds to run.

This is implemented in a similar way to the DefragmenterTask - use
HashTable::pauseResumeVisit() instead of ::visit() to perform the
visitation of each HashTable (which allows the visitor to yield and
pause execution).

One difference with the DefragmenterTask is that EphTombstoneHTCleaner
does *not* snooze between runs if it hasn't yet completed - it simply
returns true, which will cause the scheduler to re-schedule it as soon
as possible. The reason for this is that tombstone purging can make a
material difference on the size of data stored - and hence we want to
complete a visitation of the whole bucket as quickly as possible.

Change-Id: If07eb6a0845dffa6e49670731fab01479f266c06
Reviewed-on: http://review.couchbase.org/79555
Reviewed-by: Manu Dhundi <manu@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>

show more ...


# 467c4742 31-May-2017 Manu Dhundi <manu@couchbase.com>

MB-24470: Do not add temp items to the sequence list

In certain cases of delete_with_meta() we add a short lived temp item
in the hashtable and then immediately generate a valid seqno to it,
hence m

MB-24470: Do not add temp items to the sequence list

In certain cases of delete_with_meta() we add a short lived temp item
in the hashtable and then immediately generate a valid seqno to it,
hence making it a non-temp item. In Ephemeral buckets, this item
while temp is put onto the sequence list with a negative seqno (temp
items have negative seqno in ep-engine). Having a negative seqno
item in the sequence list can cause problems for range (sequential)
readers like DCP, tombstone purge.

This commit ensures that we do not add to temp item on the
sequence list. Temp item is still added on the hash table of legacy
reasons. When the temp item is updated, that is made "non-temp", we
put it onto the sequence list under the "list write lock" and
generate a valid sequence number to it.

The commit also adds some module tests for the same.

Change-Id: I8a98cee401af11f59858d0a2f39cff3cbd540cbf
Reviewed-on: http://review.couchbase.org/78689
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>

show more ...


# ef22f9b0 25-May-2017 Dave Rigby <daver@couchbase.com>

Move ep-engine to engines/ep