BigFile .part* files are no longer deleted during a merge as a way to preserve disk space. Instead, MergeSpaceCoordinator coordinates access to a large (and possibly cheap) storage area with room for an entire resulting merge file.
When a merge file has been finished, reads from it are allowed and reads from the source files are disallowed; the source files are then deleted. The file is then renamed/moved from merge space to regular collection storage using the 2-phase commit feature of GbMoveFile.cpp, and finally reads are served from the finished file.
Details:
RdbBase: Use MergeSpaceCoordinator and merge space for temporary target merge file.
RdbBase: better cleanup of crashed merges
RdbBase: more mutex locking while manipulating m_fileInfo array
RdbBase: keep track of threads/jobs
RdbMerge: ditto
RdbMerge: Don't call file->chopHead()
Msg5/Msg3: no more "compensate for merge" flag
Msg3: Skip over RdbBase files that have reads disallowed
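
The hand-over described above can be sketched as follows. This is an in-memory illustration only, not the RdbBase code: FileSet, FileInfo, finishMerge(), relocate() and the file paths are hypothetical names, and the 2-phase commit done by GbMoveFile.cpp is reduced to a single path update.

```cpp
#include <algorithm>
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of the m_fileInfo array.
struct FileInfo {
    std::string path;
    bool allowReads;
};

class FileSet {
public:
    void add(const std::string &path, bool allowReads) {
        std::lock_guard<std::mutex> g(m_mtx);
        m_files.push_back(FileInfo{path, allowReads});
    }

    // The merge target is complete: allow reads from it, disallow reads
    // from the sources, then drop the sources (real code would unlink them).
    void finishMerge(const std::vector<std::string> &sources,
                     const std::string &mergeFile) {
        std::lock_guard<std::mutex> g(m_mtx);
        bool found = false;
        for (FileInfo &f : m_files) {
            if (f.path == mergeFile) { f.allowReads = true; found = true; }
            else if (isIn(sources, f.path)) f.allowReads = false;
        }
        assert(found);  // sanity check: the merge target must be tracked
        (void)found;
        m_files.erase(std::remove_if(m_files.begin(), m_files.end(),
                          [&](const FileInfo &f) { return isIn(sources, f.path); }),
                      m_files.end());
    }

    // Move the finished file from merge space to collection storage.
    // Only the bookkeeping is shown; no file is touched on disk.
    bool relocate(const std::string &from, const std::string &to) {
        std::lock_guard<std::mutex> g(m_mtx);
        for (FileInfo &f : m_files) {
            if (f.path == from) { f.path = to; return true; }
        }
        return false;
    }

    // Paths a reader (such as Msg3) may use; files with reads disallowed are skipped.
    std::vector<std::string> readable() const {
        std::lock_guard<std::mutex> g(m_mtx);
        std::vector<std::string> out;
        for (const FileInfo &f : m_files)
            if (f.allowReads) out.push_back(f.path);
        return out;
    }

private:
    static bool isIn(const std::vector<std::string> &v, const std::string &s) {
        return std::find(v.begin(), v.end(), s) != v.end();
    }

    mutable std::mutex m_mtx;  // guards every access to m_files
    std::vector<FileInfo> m_files;
};
```

Readers only ever see files with reads allowed, so at no point do they need to know whether a merge is in progress.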
The 'dir' parameter was only used for a sanity check. All callers specified g_hostdb.m_dir, and RdbBase et al. use g_hostdb.m_dir directly, so there wasn't much point in keeping the parameter.
Threads were being created and destroyed for every job, which can be expensive.
The thread-per-job model has been changed to a job scheduler that manages the
job queues and threads in pools. The submission of a job now specifies
start/finish routines, state, and precisely what kind of job it is. The job
scheduler then takes care of the rest. How many queues and pools there are is
hidden from the caller.
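
A minimal sketch of that submission model, assuming a single queue and a single pool. JobScheduler, JobType and submit() are hypothetical names, not the actual interface. As a simplification, the finish routine runs on the worker thread right after the start routine, and the job type is stored but not used for routing.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical job kinds; a fuller scheduler could route by kind to different queues.
enum class JobType { Cpu, FileRead, Merge };

struct Job {
    std::function<void(void *)> start;   // does the work
    std::function<void(void *)> finish;  // called when the work is done
    void *state;                         // opaque caller state passed to both
    JobType type;
};

class JobScheduler {
public:
    explicit JobScheduler(std::size_t numThreads) {
        // Threads are created once and reused for every job.
        for (std::size_t i = 0; i < numThreads; i++)
            m_workers.emplace_back([this] { workerLoop(); });
    }

    ~JobScheduler() {
        {
            std::lock_guard<std::mutex> g(m_mtx);
            m_stop = true;
        }
        m_cv.notify_all();
        for (std::thread &t : m_workers) t.join();  // queued jobs are drained first
    }

    void submit(std::function<void(void *)> start,
                std::function<void(void *)> finish,
                void *state, JobType type) {
        assert(start && finish);  // sanity check: both routines are required
        {
            std::lock_guard<std::mutex> g(m_mtx);
            m_queue.push(Job{std::move(start), std::move(finish), state, type});
        }
        m_cv.notify_one();
    }

private:
    void workerLoop() {
        for (;;) {
            Job job;
            {
                std::unique_lock<std::mutex> lk(m_mtx);
                m_cv.wait(lk, [this] { return m_stop || !m_queue.empty(); });
                if (m_queue.empty()) return;  // stop requested and nothing left to run
                job = std::move(m_queue.front());
                m_queue.pop();
            }
            job.start(job.state);   // run outside the lock
            job.finish(job.state);
        }
    }

    std::mutex m_mtx;
    std::condition_variable m_cv;
    std::queue<Job> m_queue;
    bool m_stop = false;
    std::vector<std::thread> m_workers;
};
```

The caller never sees a thread: it hands over the routines, the state and the job kind, and the scheduler decides where and when the job runs.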
disk space. added tagdb file cache for better performance,
fewer disk accesses. will help reduce disk load.
put file cache sizes in master controls and if they change
then update the cache size dynamically.
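
a rough sketch of a cache whose size limit can change at runtime. the LRU eviction policy and the names FileCache, put(), get() and setMaxBytes() are assumptions for illustration, not the actual tagdb file cache.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

class FileCache {
public:
    explicit FileCache(std::size_t maxBytes)
        : m_maxBytes(maxBytes), m_usedBytes(0) {}

    void put(const std::string &key, const std::string &data) {
        // Replace any existing entry for this key.
        auto it = m_index.find(key);
        if (it != m_index.end()) {
            m_usedBytes -= it->second->second.size();
            m_lru.erase(it->second);
            m_index.erase(it);
        }
        if (data.size() > m_maxBytes) return;  // too large to cache at all
        m_lru.emplace_front(key, data);
        m_index[key] = m_lru.begin();
        m_usedBytes += data.size();
        evict();
    }

    // Returns nullptr on a miss; a hit becomes the most recently used entry.
    const std::string *get(const std::string &key) {
        auto it = m_index.find(key);
        if (it == m_index.end()) return nullptr;
        m_lru.splice(m_lru.begin(), m_lru, it->second);
        return &it->second->second;
    }

    // Called when the configured size changes; shrinking evicts immediately.
    void setMaxBytes(std::size_t maxBytes) {
        m_maxBytes = maxBytes;
        evict();
    }

    std::size_t usedBytes() const { return m_usedBytes; }

private:
    using Entry = std::pair<std::string, std::string>;

    // Drop least recently used entries until the cache fits the limit.
    void evict() {
        while (m_usedBytes > m_maxBytes) {
            assert(!m_lru.empty());  // sanity check: accounting must match contents
            m_usedBytes -= m_lru.back().second.size();
            m_index.erase(m_lru.back().first);
            m_lru.pop_back();
        }
    }

    std::list<Entry> m_lru;  // front = most recently used
    std::unordered_map<std::string, std::list<Entry>::iterator> m_index;
    std::size_t m_maxBytes;
    std::size_t m_usedBytes;
};
```

growing the limit needs no work beyond storing the new value; shrinking it evicts right away, so a change in the controls takes effect without a restart.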