History log of /6.6.0/kv_engine/engines/ep/tests/mock/mock_ephemeral_vb.cc (Results 1 - 11 of 11)
Revision Date Author Comments
Revision tags: v7.0.2, v6.6.3, v7.0.1, v7.0.0, v6.6.2, v6.5.2, v6.6.1, v6.0.5, v6.6.0, v6.5.1, v6.0.4, v6.5.0, v6.0.3
# f6ed4896 09-Aug-2019 Dave Rigby <daver@couchbase.com>

MB-35458 [SR]: Move SyncWrite completion to bg DurabilityCompletionTask

[[Re-apply after fixing error in DurabilityCompletionTask::run
(skipping last vBucket).]]

Change how SyncWrites which are Resolved and awaiting Completion are
handled, by moving the final VBucket::commit() / abort() into a
background task - DurabilityCompletionTask.

+Background+

There are two reasons for making this change:

a) Performance - specifically latency of front-end worker threads.

By moving completion into a background task, we reduce the amount of
work done on the thread which actually detected the SyncWrite was
resolved - typically the front-end DCP threads when a DCP_SEQNO_ACK
is processed.
Given that we SEQNO_ACK at the end of a Snapshot, a single SEQNO_ACK
could result in committing multiple SyncWrites. Committing one
SyncWrite is similar to a normal front-end Set operation, so there is
potentially a non-trivial amount of work to be done when completing
SyncWrites, which could tie up the front-end thread (causing other
Connections to have to wait) for a noticeable amount of time.

b) Simplification of lock management.

Doing completion in a background task simplifies lock management; for
example, we avoid lock inversions between locks acquired earlier during
dcpSeqnoAck and the later call to notifySeqnoAvailable, which occurred
when all of this was done on the original thread.

+Problem+

While (a) was the first reason identified for making this change
(see MB-33092), (b) is the reason this change is being made now. During
testing the following lock-order-inversion was seen:

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock)
Cycle in lock order graph:

Stream::streamMutex => StreamContainer::rwlock => Stream::streamMutex

The crux of the issue is the processing of DCP_SEQNO_ACKNOWLEDGED
messages by the DcpProducer - this acquires the Stream::streamMutex
before calling VBucket::seqnoAcknowledged(); however, that function
currently results in VBucket::commit() being called to synchronously
complete the SyncWrite, which in turn must notify all connected
replicas that a new seqno is available, requiring
StreamContainer::rwlock to be acquired:

Mutex StreamContainer::rwlock acquired here while holding mutex Stream::streamMutex in thread T15:
...
#6 StreamContainer<std::shared_ptr<Stream> >::rlock()
#7 DcpProducer::notifySeqnoAvailable(Vbid, unsigned long)
...
#13 VBucket::commit(...)
#14 ActiveDurabilityMonitor::commit(...)
#15 ActiveDurabilityMonitor::processCompletedSyncWriteQueue()
#16 ActiveDurabilityMonitor::seqnoAckReceived(...)
#17 VBucket::seqnoAcknowledged(...)
#18 ActiveStream::seqnoAck(...)
#19 DcpProducer::seqno_acknowledged(...)
...

Mutex Stream::streamMutex previously acquired by the same thread here:
...
#3 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
#4 ActiveStream::seqnoAck(...)
#5 DcpProducer::seqno_acknowledged(...)
...

This conflicts with the ordering seen when sending items out on the
DCP connection - inside DcpProducer::step() the
StreamContainer::rwlock is acquired first, then ActiveStream::mutex
is acquired later:

Mutex Stream::streamMutex acquired here while holding mutex StreamContainer::rwlock in thread T15:
...
#3 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
#4 ActiveStream::next()
#5 DcpProducer::getNextItem()
#6 DcpProducer::step(dcp_message_producers*)
...

Mutex StreamContainer::rwlock previously acquired by the same thread here:
#0 pthread_rwlock_rdlock <null> (libtsan.so.0+0x00000002c98b)
...
#4 std::shared_lock<cb::RWLock>::shared_lock(cb::RWLock&)
#5 StreamContainer<>::ResumableIterationHandle::ResumableIterationHandle()
#6 StreamContainer<>::startResumable()
#7 DcpProducer::getNextItem()
#8 DcpProducer::step(dcp_message_producers*)
...
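
As an illustration, a minimal self-contained sketch of the two
conflicting lock orders (hypothetical stand-ins, not the actual
kv_engine classes) is shown below; TSAN flags the pair as a potential
deadlock because neither order is globally consistent:

    #include <mutex>
    #include <shared_mutex>
    #include <thread>

    std::mutex streamMutex;            // stands in for Stream::streamMutex
    std::shared_mutex containerRwlock; // stands in for StreamContainer::rwlock

    // Seqno-ack path: streamMutex first, then the container lock
    // (ActiveStream::seqnoAck -> ... -> DcpProducer::notifySeqnoAvailable).
    void seqnoAckPath() {
        std::lock_guard<std::mutex> sg(streamMutex);
        std::shared_lock<std::shared_mutex> rl(containerRwlock);
    }

    // Producer step path: container lock first, then streamMutex
    // (DcpProducer::step -> getNextItem -> ActiveStream::next).
    void producerStepPath() {
        std::shared_lock<std::shared_mutex> rl(containerRwlock);
        std::lock_guard<std::mutex> sg(streamMutex);
    }

    int main() {
        std::thread t1(seqnoAckPath);
        std::thread t2(producerStepPath);
        t1.join();
        t2.join();
    }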

+Solution+

The processing of resolved SyncWrites is moved into a new background
task. Instead of immediately processing them within
ActiveDM::seqnoAckReceived(), that function notifies the new NonIO
DurabilityCompletionTask that there are SyncWrites waiting for
completion.

DurabilityCompletionTask maintains a bool per vBucket indicating if
there are SyncWrites for that vBucket pending completion. When the
task is run, for each flag which is true it calls
VBucket::processResolvedSyncWrites() for the associated VBucket.
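
A rough sketch of that flag-and-sweep pattern, assuming a per-vBucket
array of atomic bools and using illustrative names (this is not the
actual DurabilityCompletionTask interface):

    #include <array>
    #include <atomic>
    #include <cstddef>
    #include <cstdio>

    constexpr std::size_t kNumVBuckets = 1024;

    // Stand-in for the real VBucket; processResolvedSyncWrites() is where
    // the actual commit()/abort() of each resolved SyncWrite would happen.
    struct FakeVBucket {
        void processResolvedSyncWrites() {
            std::puts("completing resolved SyncWrites");
        }
    };

    class DurabilityCompletionTaskSketch {
    public:
        explicit DurabilityCompletionTaskSketch(
                std::array<FakeVBucket, kNumVBuckets>& vbs)
            : vbuckets(vbs) {}

        // Front-end thread (e.g. from ActiveDM::seqnoAckReceived()): just
        // set the flag; the real code would also wake the NonIO task here.
        void notifySyncWritesToComplete(std::size_t vbid) {
            pending[vbid].store(true, std::memory_order_release);
        }

        // Background (NonIO) thread: sweep the flags, completing per vBucket.
        void run() {
            for (std::size_t vbid = 0; vbid < kNumVBuckets; ++vbid) {
                if (pending[vbid].exchange(false, std::memory_order_acq_rel)) {
                    vbuckets[vbid].processResolvedSyncWrites();
                }
            }
        }

    private:
        std::array<FakeVBucket, kNumVBuckets>& vbuckets;
        std::array<std::atomic<bool>, kNumVBuckets> pending{};
    };

    int main() {
        std::array<FakeVBucket, kNumVBuckets> vbuckets{};
        DurabilityCompletionTaskSketch task(vbuckets);
        task.notifySyncWritesToComplete(5);
        task.run(); // completes only vb 5
    }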

+Implementation Notes+

Currently there is just a single DurabilityCompletionTask (per Bucket);
this was chosen because 1 task per vBucket (i.e. 1024 per Bucket) would
be inefficient for our current background task scheduler (both in terms
of latency to schedule each task for only one vBucket's worth of work,
and in terms of managing that many tasks in the future queue).

However, that does _potentially_ mean there are fewer resources (threads)
available to complete SyncWrites on - previously that work could be
done concurrently on all frontend threads (~O(num_cpus)). Now the same
work only has 1 thread available to run on (there's only a single
DurabilityCompletionTask).

_If_ this becomes a bottleneck we could look at increasing the number of
DurabilityCompletionTasks - e.g. sharding all vBuckets across multiple
tasks, as is done for the flusher / bgfetcher.

Change-Id: I33ecfa78b03b4d2120b5d05f54984b24ce038fd8
Reviewed-on: http://review.couchbase.org/113749
Reviewed-by: Ben Huddleston <ben.huddleston@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>



# e39b54f7 21-Aug-2019 Dave Rigby <daver@couchbase.com>

MB-35629: Revert "MB-35458 [SR]: Move SyncWrite completion to bg DurabilityCompletionTask"

The introduction of the background commit task has caused a (severe)
performance regression - SyncWrites are getting stuck and timing
out. Suspect an issue with the wakeup / notification of the
DurabilityCompletionTask; reverting the change to restore performance
while investigating.

This reverts commit 7fdb98a995f6298678e3ac7b443f9454e21ffae1.

Change-Id: I8d08044d3f6ad3084f7724ead961b8d7530006f1
Reviewed-on: http://review.couchbase.org/113637
Reviewed-by: Ben Huddleston <ben.huddleston@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>



# 7fdb98a9 09-Aug-2019 Dave Rigby <daver@couchbase.com>

MB-35458 [SR]: Move SyncWrite completion to bg DurabilityCompletionTask

Change how SyncWrites which are Resolved and awaiting Completion are
handled, by moving the final VBucket::commit() / abort() into a
background task - DurabilityCompletionTask.

+Background+

There are two reasons for making this change:

a) Performance - specifically latency of front-end worker threads.

By moving completion into a background task, we reduce the amount of
work done on the thread which actually detected the SyncWrite was
resolved - typically the front-end DCP threads when a DCP_SEQNO_ACK
is processed.
Given that we SEQNO_ACK at the end of a Snapshot, a single SEQNO_ACK
could result in committing multiple SyncWrites. Committing one
SyncWrite is similar to a normal front-end Set operation, so there is
potentially a non-trivial amount of work to be done when completing
SyncWrites, which could tie up the front-end thread (causing other
Connections to have to wait) for a noticeable amount of time.

b) Simplification of lock management.

Doing completion in a background task simplifies lock management; for
example, we avoid lock inversions between locks acquired earlier during
dcpSeqnoAck and the later call to notifySeqnoAvailable, which occurred
when all of this was done on the original thread.

+Problem+

While (a) was the first reason identified for making this change
(see MB-33092), (b) is the reason this change is being made now. During
testing the following lock-order-inversion was seen:

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock)
Cycle in lock order graph:

Stream::streamMutex => StreamContainer::rwlock => Stream::streamMutex

The crux of the issue is the processing of DCP_SEQNO_ACKNOWLEDGED
messages by the DcpProducer - this acquires the Stream::streamMutex
before calling VBucket::seqnoAcknowledged(); however, that function
currently results in VBucket::commit() being called to synchronously
complete the SyncWrite, which in turn must notify all connected
replicas that a new seqno is available, requiring
StreamContainer::rwlock to be acquired:

Mutex StreamContainer::rwlock acquired here while holding mutex Stream::streamMutex in thread T15:
...
#6 StreamContainer<std::shared_ptr<Stream> >::rlock()
#7 DcpProducer::notifySeqnoAvailable(Vbid, unsigned long)
...
#13 VBucket::commit(...)
#14 ActiveDurabilityMonitor::commit(...)
#15 ActiveDurabilityMonitor::processCompletedSyncWriteQueue()
#16 ActiveDurabilityMonitor::seqnoAckReceived(...)
#17 VBucket::seqnoAcknowledged(...)
#18 ActiveStream::seqnoAck(...)
#19 DcpProducer::seqno_acknowledged(...)
...

Mutex Stream::streamMutex previously acquired by the same thread here:
...
#3 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
#4 ActiveStream::seqnoAck(...)
#5 DcpProducer::seqno_acknowledged(...)
...

This conflicts with the ordering seen when sending items out on the
DCP connection - inside DcpProducer::step() the
StreamContainer::rwlock is acquired first, then ActiveStream::mutex
is acquired later:

Mutex Stream::streamMutex acquired here while holding mutex StreamContainer::rwlock in thread T15:
...
#3 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
#4 ActiveStream::next()
#5 DcpProducer::getNextItem()
#6 DcpProducer::step(dcp_message_producers*)
...

Mutex StreamContainer::rwlock previously acquired by the same thread here:
#0 pthread_rwlock_rdlock <null> (libtsan.so.0+0x00000002c98b)
...
#4 std::shared_lock<cb::RWLock>::shared_lock(cb::RWLock&)
#5 StreamContainer<>::ResumableIterationHandle::ResumableIterationHandle()
#6 StreamContainer<>::startResumable()
#7 DcpProducer::getNextItem()
#8 DcpProducer::step(dcp_message_producers*)
...

+Solution+

The processing of resolved SyncWrites is moved into a new background
task. Instead of immediately processing them within
ActiveDM::seqnoAckReceived(), that function notifies the new NonIO
DurabilityCompletionTask that there are SyncWrites waiting for
completion.

DurabilityCompletionTask maintains a bool per vBucket indicating if
there are SyncWrites for that vBucket pending completion. When the
task is run, for each flag which is true it calls
VBucket::processResolvedSyncWrites() for the associated VBucket.

+Implementation Notes+

Currently there is just a single DurabilityCompletionTask (per Bucket);
this was chosen because 1 task per vBucket (i.e. 1024 per Bucket) would
be inefficient for our current background task scheduler (both in terms
of latency to schedule each task for only one vBucket's worth of work,
and in terms of managing that many tasks in the future queue).

However, that does _potentially_ mean there are fewer resources (threads)
available to complete SyncWrites on - previously that work could be
done concurrently on all frontend threads (~O(num_cpus)). Now the same
work only has 1 thread available to run on (there's only a single
DurabilityCompletionTask).

_If_ this becomes a bottleneck we could look at increasing the number of
DurabilityCompletionTasks - e.g. sharding all vBuckets across multiple
tasks, as is done for the flusher / bgfetcher.

Change-Id: I87897af1e3fd0a57e5abc2cb1ba9f795a9d3c63e
Reviewed-on: http://review.couchbase.org/113141
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Ben Huddleston <ben.huddleston@couchbase.com>



# 5954243e 15-May-2019 Ben Huddleston <ben.huddleston@couchbase.com>

Use MockEphemeralVBucket in VBucketDurabilityTest

To add some commit and SyncDelete tests in
the future, we need to be able to manually check seqList stats and
correct the collections stats for Ephemeral to prevent counter
underflow assertions. The underflow occurs because we don't hit the
public functions in VBucketDurabilityTest (we would need an engine to
do so), and so we miss a call to update the collections stats on a set.

Change-Id: If72a0b97c9b56415d68caec1fadb22e143ac63aa
Reviewed-on: http://review.couchbase.org/109199
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>



# bfdd494d 30-Apr-2019 Dave Rigby <daver@couchbase.com>

Convert item_eviction_policy_t to enum class

Change-Id: Id5aaf7db03703fb9ca1b2df733bcb2d44a1f3dab
Reviewed-on: http://review.couchbase.org/108449
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Daniel Owen <owend@couchbase.com>



# bfe28015 23-Apr-2019 Paolo Cocchi <paolo.cocchi@couchbase.com>

MB-33860 [SR]: Introduce the SeqnoAckCallback

This is a prerequisite for the PassiveDurabilityMonitor to call back
to VBucket for sending a SeqnoAck message on the PassiveStream.
Note that the new callback is not used yet, so there is no test in this
patch. The callback will be used and tested in follow-up patches where
the PassiveDM will actually trigger seqno-acks at high_prepared_seqno
updates.

Change-Id: I03b1030257e1a92d96cc173e9bd260a6df0b346b
Reviewed-on: http://review.couchbase.org/108149
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: Dave Rigby <daver@couchbase.com>



Revision tags: v5.5.4, v5.5.5
# 04c1c8ec 20-Mar-2019 Trond Norbye <trond.norbye@gmail.com>

Remove config.h

Change-Id: I79eb8c762971255db9d36a5f6461a8a6d0f29249
Reviewed-on: http://review.couchbase.org/106517
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Dave Rigby <daver@couchbase.com>



Revision tags: v5.5.6
# aa69ca4f 04-Feb-2019 Jim Walker <jim@couchbase.com>

Refactor: Create warmed up VBuckets using dedicated method

Refactor VBucket creation so that the VB::Manifest is
constructed using a constructor relevant to the creation
path.

Change-Id: I474f57b83aa28fae4eead11a8b6936082af57d82
Reviewed-on: http://review.couchbase.org/104577
Reviewed-by: Dave Rigby <daver@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>



Revision tags: v6.0.1
# ec5434bf 08-Jan-2019 Dave Rigby <daver@couchbase.com>

MB-32535 [SR]: Notify SyncWrite client when request complete

When a SyncWrite request is complete, notify the client which made the
request so the response can be sent back.

To do this we use the notifyIOComplete() method on the engine; however,
unlike other existing instances where the notification is done by a
background task, we are calling it from the DurabilityMonitor, which
doesn't have direct access to the engine instance. Additionally, unlike
the existing EWOULDBLOCK notification for stores, this isn't an
idempotent operation - if we performed the store() again then that
would cause /another/ mutation to occur. To solve this, we do the
following (a rough sketch follows the list):

1. Define a new SyncWriteCompleted callback object, an instance of
which is passed into VBucket when it is created. This is typically
instantiated with a function / lambda from KVBucket which calls
engine.notifyIOComplete().

2. From VBucket::notifyClientOfCommit() this callback object is
invoked.

3. Inside EPEngine::storeIfInner() if this is a SyncWrite operation,
use engineSpecific to record that this is a SyncWrite which has
successfully been prepared, and that the next call to store()
should simply return the completed status & CAS of the
now-committed item.
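
A rough sketch of the callback wiring in steps 1 and 2 above, using
illustrative names and signatures (not the exact kv_engine API):

    #include <cstdio>
    #include <functional>
    #include <utility>

    // Illustrative stand-ins only; the names follow the commit message but
    // the real kv_engine types and signatures differ.
    struct Cookie {}; // identifies the client request
    enum class SyncWriteStatus { Success };

    using SyncWriteCompleteCallback =
            std::function<void(Cookie* cookie, SyncWriteStatus status)>;

    class VBucketSketch {
    public:
        explicit VBucketSketch(SyncWriteCompleteCallback cb)
            : syncWriteCompleteCb(std::move(cb)) {}

        // Step 2: invoked once the SyncWrite has been committed.
        void notifyClientOfCommit(Cookie* cookie) {
            syncWriteCompleteCb(cookie, SyncWriteStatus::Success);
        }

    private:
        SyncWriteCompleteCallback syncWriteCompleteCb;
    };

    int main() {
        // Step 1: KVBucket constructs the VBucket with a lambda that
        // forwards to engine.notifyIOComplete(cookie, status).
        VBucketSketch vb([](Cookie* cookie, SyncWriteStatus) {
            std::printf("notifyIOComplete(%p)\n", static_cast<void*>(cookie));
        });

        Cookie cookie;
        vb.notifyClientOfCommit(&cookie);
    }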

Change-Id: I38a6dae3aeb98dcedd1ae1ec321db054ff1b7cd7
Reviewed-on: http://review.couchbase.org/103446
Reviewed-by: Paolo Cocchi <paolo.cocchi@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>



Revision tags: v5.5.3, v6.0.0, v5.1.3
# 6cfd6a6c 14-Sep-2018 Dave Rigby <daver@couchbase.com>

Ensure headers 'include what they use'

As a follow-up to the previous commit: now that we have the ability to
compile each header standalone, fix any missing #includes where a
header didn't fully include all the headers it requires.

Change-Id: Ib3d7e8274b0e736a52c83e8332891bd70f23c729
Reviewed-on: http://review.couchbase.org/99723
Reviewed-by: Daniel Owen <owend@couchbase.com>
Reviewed-by: Trond Norbye <trond.norbye@gmail.com>
Tested-by: Build Bot <build@couchbase.com>



Revision tags: v5.5.2, v5.5.1, v5.1.2, v5.1.1, v5.0.1, v5.1.0, v5.0.0
# 2aeac7ff 14-Jun-2017 Dave Rigby <daver@couchbase.com>

MB-24765: Make EphTombstoneHTCleaner visit HashTable incrementally

Change the EphTombstoneHTCleaner task to visit the HashTable
incrementally, yielding back to the scheduler every
ephemeral_metadata_purge_chunk_duration (default: 20ms). This
significantly reduces the impact this background task may have on the
scheduling of other NonIO tasks, given that previously it was seen to
take a number of seconds to run.

This is implemented in a similar way to the DefragmenterTask - use
HashTable::pauseResumeVisit() instead of ::visit() to perform the
visitation of each HashTable (which allows the visitor to yield and
pause execution).

One difference from the DefragmenterTask is that EphTombstoneHTCleaner
does *not* snooze between runs if it hasn't yet completed - it simply
returns true, which will cause the scheduler to re-schedule it as soon
as possible. The reason for this is that tombstone purging can make a
material difference to the size of data stored - and hence we want to
complete a visitation of the whole bucket as quickly as possible.
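
A rough, generic sketch of the pause/resume visiting pattern (stand-in
names; not the real HashTable, visitor or task interfaces):

    #include <chrono>
    #include <cstddef>
    #include <vector>

    struct Item {
        bool isDeleted = false;
    };

    class PausableTombstoneVisitor {
    public:
        using Clock = std::chrono::steady_clock;

        void setDeadline(Clock::time_point d) { deadline = d; }

        // Returns false to ask the HashTable to pause the visitation.
        bool visit(Item& item) {
            if (item.isDeleted) {
                // purge the tombstone's metadata here
            }
            return Clock::now() < deadline;
        }

    private:
        Clock::time_point deadline;
    };

    class FakeHashTable {
    public:
        std::vector<Item> items = std::vector<Item>(100000);

        // Visit items from `position`; stop early if the visitor asks to
        // pause, returning the position to resume from (items.size() once
        // the whole table has been visited).
        std::size_t pauseResumeVisit(PausableTombstoneVisitor& v,
                                     std::size_t position) {
            for (; position < items.size(); ++position) {
                if (!v.visit(items[position])) {
                    return position + 1; // resume after this item next time
                }
            }
            return position;
        }
    };

    // Task-style driver: each "run" does roughly one chunk (~20ms) of work.
    // Returning true means "re-schedule me as soon as possible" rather than
    // snoozing, mirroring the behaviour described above.
    bool runOneChunk(FakeHashTable& ht,
                     PausableTombstoneVisitor& v,
                     std::size_t& position) {
        using namespace std::chrono;
        v.setDeadline(steady_clock::now() + milliseconds(20));
        position = ht.pauseResumeVisit(v, position);
        return position < ht.items.size();
    }

    int main() {
        FakeHashTable ht;
        PausableTombstoneVisitor visitor;
        std::size_t position = 0;
        while (runOneChunk(ht, visitor, position)) {
            // a real scheduler would run other NonIO tasks between chunks
        }
    }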

Change-Id: If07eb6a0845dffa6e49670731fab01479f266c06
Reviewed-on: http://review.couchbase.org/79555
Reviewed-by: Manu Dhundi <manu@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>
