Commit Graph

  • 75d5a0b80e Removed PingInfo::m_currentSpiders Ivan Skytte Jørgensen 2017-05-01 12:32:26 +02:00
  • 08df3918ef Removed PingInfo::m_numCorruptDiskReads Ivan Skytte Jørgensen 2017-05-01 12:26:48 +02:00
  • b74c39cb04 Removed PingInfo::m_tcpSocketsInUse Ivan Skytte Jørgensen 2017-05-01 12:22:29 +02:00
  • f3ef115bb4 Removed PingInfo::m_udpSlotsInUseIncoming Ivan Skytte Jørgensen 2017-05-01 12:19:30 +02:00
  • 1dc77e8ec3 Removed PingInfo::m_socketsClosedFromHittingLimit Ivan Skytte Jørgensen 2017-05-01 12:07:44 +02:00
  • 26d3341d03 Detect inlinks with siteranks>15 (corrupt data in titledb?) Ivan Skytte Jørgensen 2017-05-01 11:54:30 +02:00
  • 6ab133910e Removed forgotten debug output Ivan Skytte Jørgensen 2017-04-28 16:12:58 +02:00
  • 8cd7d78998 Removed PingInfo::m_percentMemUsed and m_numOutOfMems Ivan Skytte Jørgensen 2017-05-01 11:33:13 +02:00
  • 6494ae8a08 Merge remote-tracking branch 'origin/master' into nomerge2 Brian Rasmusson 2017-04-30 21:09:52 +02:00
  • aadcceca90 Made termFreqWeight and frequency configurable and overridable. Made it possible to use other weights in a frontend UI and pass them to GB, and have them converted to internal values by prefixing the cgi param with fxui_. Synchronized cgi-parm names. Brian Rasmusson 2017-04-30 20:23:09 +02:00
  • 9274669218 removed unused code in setParm Brian Rasmusson 2017-04-30 10:56:58 +02:00
  • 238c36c019 split language match boost and unknown language boost into two different ranking weights Brian Rasmusson 2017-04-29 20:24:00 +02:00
  • 94bb71587c Merge remote-tracking branch 'origin/master' into nomerge2 Brian Rasmusson 2017-04-28 19:21:38 +02:00
  • 1e6451ed0c made page temp min/max weight configurable and overridable Brian Rasmusson 2017-04-28 16:43:14 +02:00
  • 018ef9df3a Removed PingInfo::m_recoveryLevel Ivan Skytte Jørgensen 2017-04-28 16:40:41 +02:00
  • 935c8fae54 Add scopelock while auto saving index & map. Don't auto-save while read is not allowed Ai Lin Chia 2017-04-28 16:35:56 +02:00
  • 1d223c0ae5 Removed fields from PingInfo: m_loadAvg, m_cpuUsage, m_diskUsage Ivan Skytte Jørgensen 2017-04-28 16:30:57 +02:00
  • aa990bf6a7 scale page temp logarithmically before scaling it linearly to even out top scorers that have massive page temp Brian Rasmusson 2017-04-28 16:23:19 +02:00
  • 48bd9cf930 Merge branch 'master' into dev-dumpthread Ai Lin Chia 2017-04-28 16:12:13 +02:00
  • b3c9f658bc Fix coverity warning about potential divide by zero Ai Lin Chia 2017-04-28 16:10:37 +02:00
  • a245e734b2 Untangle logic in DailyMerge::dailyMergeLoop() Ivan Skytte Jørgensen 2017-04-28 15:59:48 +02:00
  • 47ca7fa7ba Removed unused PingInfo::m_localHostTimeMS Ivan Skytte Jørgensen 2017-04-28 15:42:18 +02:00
  • db58a32d3e Merge branch 'master' into nomerge2 Ivan Skytte Jørgensen 2017-04-28 15:27:10 +02:00
  • 0e6a61d80e Added scale_logarithmically() Ivan Skytte Jørgensen 2017-04-28 15:26:40 +02:00
  • 69e5212d65 Merge branch 'master' into nomerge2 Ivan Skytte Jørgensen 2017-04-28 14:59:50 +02:00
  • b047656975 bugfix 9e28e36abe Ivan Skytte Jørgensen 2017-04-28 14:59:36 +02:00
  • 465d7c0896 Make sure that we're not inserting while dumping Ai Lin Chia 2017-04-28 13:17:54 +02:00
  • 249bbb6187 Enable processing of Msg4 incoming in a thread Ai Lin Chia 2017-04-28 13:16:59 +02:00
  • 1567da54eb Make dumpTree blocking and simplify logic of dumping per collection (if it's not blocking, a simple for loop will do) Ai Lin Chia 2017-04-28 13:04:53 +02:00
  • 9eb5adf398 Preparation for running RdbDump in a thread. Modify all Rdb::dumpTree calls to Rdb::submitDumpJob. Add GbThreadQueue for dump thread. Call initialization & finalize functions. Ai Lin Chia 2017-04-28 10:45:12 +02:00
  • 9f3c3ca616 We can get 0 positive records after merge if we start with 0 positive records before merge Ai Lin Chia 2017-04-27 17:19:40 +02:00
  • d6aded878b We're trying to add a new file that already exists. It's probably a logic error somewhere. Don't just ignore the error and continue. Ai Lin Chia 2017-04-27 17:16:40 +02:00
  • 6d9f0f2afb Remove commented out code Ai Lin Chia 2017-04-27 17:13:24 +02:00
  • 126d0630c0 Rdb::dumpCollLoop should return true on error. Returning false means it's blocked. We should not continue if we can't dump for a specific collection. We clear tree/bucket after completing dump (means we will lose data) Ai Lin Chia 2017-04-27 17:11:56 +02:00
  • 533a8f3d82 Remove unused Collectiondb::getFirstCollName Ai Lin Chia 2017-04-27 16:45:45 +02:00
  • 65e4d8aa9e Merge branch 'master' of github.com:privacore/open-source-search-engine Ivan Skytte Jørgensen 2017-04-27 15:43:59 +02:00
  • 50886d8cec Merge branch 'master' into nomerge2 Ai Lin Chia 2017-04-27 14:19:15 +02:00
  • 56fea5921a Fix spidercoll dedup cache logic. We were not filtering out spiderdb with the same hopcount. Ai Lin Chia 2017-04-27 14:16:58 +02:00
  • a2dc33921a Remove now unused RdbBase::isManipulatingFiles Ai Lin Chia 2017-04-27 11:52:28 +02:00
  • 1621a5bc00 Move content of some RdbBase functions to cpp file. Add locking and constness to some functions. Ai Lin Chia 2017-04-27 11:44:50 +02:00
  • 34e883b6c8 Merge branch 'master' into nomerge2 Ai Lin Chia 2017-04-26 17:16:12 +02:00
  • d7be0994bd Free replyMaxSize instead of replySize (we should free allocated size & not used size) Ai Lin Chia 2017-04-26 17:15:37 +02:00
  • 11d84c888b Make RdbDump accept NULL callback. Don't delegate jobs to a thread if we have no callback defined. Ai Lin Chia 2017-04-26 16:34:17 +02:00
  • f455c0b97d If we want to leave a file slot for merge file, we should check if we have space for a dump file, and a merge file (hence +2 instead of +1) Ai Lin Chia 2017-04-26 16:15:58 +02:00
  • 470319ec35 Code style changes Ai Lin Chia 2017-04-26 16:12:00 +02:00
  • 4a1e0cf1d6 Add Rdb::getTree that returns NULL when tree is not used. Use Rdb::getTree/Rdb::getBuckets where relevant Ai Lin Chia 2017-04-26 16:08:42 +02:00
  • a90049221b Move collExist logic out to a function Ai Lin Chia 2017-04-26 15:56:44 +02:00
  • b98f313415 We shouldn't need to wait for all unlink/rename to be done. RdbBase::buryFiles is only called after all unlink/rename is done. Means RdbBase::m_fileInfo array will have all information until everything is done. By that time, we can reuse the fileId of recently deleted files when adding a new file. Ai Lin Chia 2017-04-26 14:42:14 +02:00
  • aa3d979776 Remove unused RdbBase function Ai Lin Chia 2017-04-26 13:39:50 +02:00
  • 14998ad0e5 Add comment Ai Lin Chia 2017-04-26 13:21:00 +02:00
  • f9ccdea9da Don't suppress warning log. We probably still want to know if we can't dump tree to disk 10 times after it happens. Ai Lin Chia 2017-04-26 12:18:21 +02:00
  • d773a14447 Code style changes & move variable declaration closer to use Ai Lin Chia 2017-04-26 11:57:30 +02:00
  • 2801a65e0b Code style changes Ai Lin Chia 2017-04-26 11:56:54 +02:00
  • 4df2721d29 Do logic directly inside if statement. Ai Lin Chia 2017-04-26 11:55:23 +02:00
  • 503641e12e Remove use of Rdb::canAdd and use Rdb::isDumping directly Ai Lin Chia 2017-04-26 11:50:07 +02:00
  • 6afe216df7 Fix compilation error Ai Lin Chia 2017-04-26 11:39:36 +02:00
  • 737c9f9cda Move Msg4In related functions to Msg4In namespace Ai Lin Chia 2017-04-26 11:25:29 +02:00
  • 17b0c56aae Use KEYNEG instead of checking key bit directly Ai Lin Chia 2017-04-25 15:09:36 +02:00
  • 2847f38159 Make GbThreadQueue::m_stop atomic Ai Lin Chia 2017-04-25 11:22:36 +02:00
  • d587f6f22a There shouldn't be any problem with dumping & saving at the same time Ai Lin Chia 2017-04-25 11:01:37 +02:00
  • f6df4d44cc Use initialization list instead Ai Lin Chia 2017-04-25 09:45:11 +02:00
  • ad6eaa5260 Merge branch 'master' into nomerge2 Ai Lin Chia 2017-04-25 09:38:07 +02:00
  • 4d8435a858 Fix log line Ai Lin Chia 2017-04-25 09:37:07 +02:00
  • 0c91ff79d7 Merge branch 'master' into nomerge2 Ai Lin Chia 2017-04-24 17:32:45 +02:00
  • d317ff9dc5 Split max percentage of lost positives after merge to be per rdb Ai Lin Chia 2017-04-24 17:31:57 +02:00
  • cfdff69424 Merge branch 'master' into nomerge2 Ai Lin Chia 2017-04-24 12:07:22 +02:00
  • a7bfb39b9d Remove RdbTree/RdbBuckets/Rdb isWritable. We shouldn't need to disable writes just because we're saving. Ai Lin Chia 2017-04-24 10:35:55 +02:00
  • 948a796e06 Remove crawlinfo Ai Lin Chia 2017-04-21 16:43:22 +02:00
  • 1d6a479f3e show page temp and adjusted siterank in result score info Brian Rasmusson 2017-04-23 22:16:28 +02:00
  • 9b9a22563c Fix PageTemperatureRegistry to match format of .meta file Brian Rasmusson 2017-04-23 15:00:54 +02:00
  • f8f9af0094 show used ranking bits and values in result score info Brian Rasmusson 2017-04-22 23:05:51 +02:00
  • aceb4b06d1 Removed a bit of diffbot specific code and reformatted a bit in PageResults Brian Rasmusson 2017-04-22 20:08:24 +02:00
  • 3614a905ca allow override of rank adjustments by flags using cgi parms (bugfix) Brian Rasmusson 2017-04-22 17:53:24 +02:00
  • 9e28e36abe normalized binary search in RdbBucket::getNode() into something recognizable Ivan Skytte Jørgensen 2017-04-21 17:28:57 +02:00
  • 294987e148 fixed bug that caused not all returned docs to have score info created (intersect second loop) due to score info not handling intersectLists10_r being called once per posdb file Brian Rasmusson 2017-04-21 17:25:21 +02:00
  • ac6ea7daa1 added getFileNum method to DocumentIndexChecker Brian Rasmusson 2017-04-21 17:21:53 +02:00
  • a395847f50 Merge remote-tracking branch 'origin/master' into nomerge2 Brian Rasmusson 2017-04-21 17:17:26 +02:00
  • 72db061df4 added trace log to TopTree. Make sure TopTree does not exceed number of requested documents Brian Rasmusson 2017-04-21 17:12:42 +02:00
  • dfd361b551 Return value of RdbBucket::deleteNode() was not used; bool->void Ivan Skytte Jørgensen 2017-04-21 16:51:22 +02:00
  • 51e732f1fb Make logic clearer in RdbBuckets by adding RdbBucket::isEmpty() Ivan Skytte Jørgensen 2017-04-21 16:48:09 +02:00
  • 52fd2ed47e I am a fool Ivan Skytte Jørgensen 2017-04-21 15:03:51 +02:00
  • 4c81fc120c Merge branch 'master' into nomerge2 Ai Lin Chia 2017-04-21 12:45:32 +02:00
  • da5f2d3b29 Make Process::m_mode atomic Ai Lin Chia 2017-04-21 12:44:35 +02:00
  • e116ab0191 Merge branch 'master' into nomerge2 Ai Lin Chia 2017-04-21 12:43:44 +02:00
  • b5f547a4d1 Make m_gotNewDataForScanningIp & m_scanningIp atomic Ai Lin Chia 2017-04-21 12:41:54 +02:00
  • f35833f1b9 Load min/max/default page temperatures from page_temperatures.dat.meta Ivan Skytte Jørgensen 2017-04-21 12:38:46 +02:00
  • e89a859f50 Merge branch 'master' into nomerge2 Ai Lin Chia 2017-04-21 12:29:26 +02:00
  • c84408d7d0 Make sure m_isSaving & m_needsSave are protected by mutex for both RdbBuckets & RdbTree Ai Lin Chia 2017-04-21 12:28:35 +02:00
  • 135c5e5225 Merge branch 'master' into nomerge2 Ai Lin Chia 2017-04-21 12:08:48 +02:00
  • 08207eeb5d Fix unit test compilation error Ai Lin Chia 2017-04-21 11:59:00 +02:00
  • e3532f73b4 Make SpiderLoop::m_lockTable thread safe Ai Lin Chia 2017-04-21 11:50:21 +02:00
  • fd78be54a3 Set isSaving/needsSave variable right after saving (align with changes made in RdbBuckets) Ai Lin Chia 2017-04-21 11:47:45 +02:00
  • 760a372bd0 Code style changes Ai Lin Chia 2017-04-21 11:46:59 +02:00
  • 88ecd3153c Protect m_needWrite variable with a mutex to avoid RdbIndex not writing when it needs to (in preparation for moving Msg4In to a thread) Ai Lin Chia 2017-04-21 11:44:37 +02:00
  • 55a01e552f Protect RdbBase::getNumFiles with m_mtxFileInfo mutex Ai Lin Chia 2017-04-21 11:43:08 +02:00
  • 2fda591e46 Merge branch 'master' into nomerge2 Ai Lin Chia 2017-04-20 18:14:24 +02:00
  • 8e215eac6f Move RdbBuckets save to a thread Ai Lin Chia 2017-04-20 17:55:46 +02:00
  • 3ac9b74bf7 No need to use abbreviated variable name because we are not afraid of columns past 78 Ivan Skytte Jørgensen 2017-04-20 17:16:20 +02:00
  • 57d23341c5 Merge branch 'master' into nomerge2 Ai Lin Chia 2017-04-20 15:59:43 +02:00
  • e2e4783635 We don't need to save after tree/bucket reset. Align code between RdbTree & RdbBuckets Ai Lin Chia 2017-04-20 14:53:16 +02:00