Commit Graph

  • fde4d39f6a Move reloading of RobotsBlockedResultOverride to a separate thread Ai Lin Chia 2017-11-02 15:47:33 +01:00
  • 02779cdd34 Add virtual to LanguageResultOverride::getSummary & LanguageResultOverride::getTitle Ai Lin Chia 2017-11-02 15:02:42 +01:00
  • 40e5aeee25 Move reloading of UrlResultOverride to a separate thread Ai Lin Chia 2017-11-02 14:43:30 +01:00
  • 5b4323d1b3 Move reloading of RobotsCheckList to a separate thread Ai Lin Chia 2017-11-02 14:29:56 +01:00
  • 8335402a4a Move reloading of DnsBlockList to a separate thread Ai Lin Chia 2017-11-02 14:26:07 +01:00
  • 139296451d Move reloading of UrlMatchList to a separate thread Ai Lin Chia 2017-11-02 14:24:51 +01:00
  • 78e1410022 spiderdb-lookup: don't use msg5 for local lookups in spiderdb Ivan Skytte Jørgensen 2017-11-02 16:07:33 +01:00
  • f7f1359db4 Merge branch 'master' into sqlite Ivan Skytte Jørgensen 2017-11-02 15:46:01 +01:00
  • c4a6d56ecc Don't log members in msg0 after callback has been called because we may no longer exist Ivan Skytte Jørgensen 2017-11-02 15:45:06 +01:00
  • 4a5a473587 Handle convertspiderdb with no collection name more nicely Ivan Skytte Jørgensen 2017-11-02 15:27:07 +01:00
  • af6d855878 Slightly clearer log messages for sqlite conversion Ivan Skytte Jørgensen 2017-11-02 15:18:33 +01:00
  • 6cd7e64dce Support rebuild of spiderdb in sqlite Ivan Skytte Jørgensen 2017-11-02 14:24:12 +01:00
  • 6956ba00c2 Fix compilation error with isnan Ai Lin Chia 2017-11-02 14:00:17 +01:00
  • 2d8a237c3e Merge branch 'master' into sqlite Ivan Skytte Jørgensen 2017-11-02 12:25:00 +01:00
  • aca7284680 #include <math.h> in PageStats for use of isnan() macro (slight differences between glibc/gcc versions) Ivan Skytte Jørgensen 2017-10-31 16:52:04 +01:00
  • b46ae83888 Merge branch 'master' into sqlite Ivan Skytte Jørgensen 2017-10-31 16:17:32 +01:00
  • 1ec5c427b9 Support rebuilding spiderdb from titledb document URLs only Ivan Skytte Jørgensen 2017-10-31 15:48:49 +01:00
  • c865c0b5ff Fix min length check for gzip content, and continue loading content when http status is not 200 Ai Lin Chia 2017-10-31 15:27:30 +01:00
  • 039b1f9f0f Removed unnecessary EINTR goto-retry Ai Lin Chia 2017-10-30 15:14:22 +01:00
  • d9a228b69e Correct loop end condition in Repair.cpp Ivan Skytte Jørgensen 2017-10-31 12:11:08 +01:00
  • 38214778be Repair.cpp: Correct use of logTrace() (was outputting stuff twice) Ivan Skytte Jørgensen 2017-10-31 12:08:01 +01:00
  • 071907bf19 added option to disable spidering of adult content Brian Rasmusson 2017-10-30 20:00:13 +01:00
  • b04aea960d Repair.cpp: goto->for() Ivan Skytte Jørgensen 2017-10-30 14:46:11 +01:00
  • 7d88ec7db7 Repair.cpp: Use logTrace() instead of if...log(LOG_TRACE...) Ivan Skytte Jørgensen 2017-10-30 14:30:13 +01:00
  • 3757714d97 Use KEYNEG() macro instead of direct bit inspection Ivan Skytte Jørgensen 2017-10-30 14:21:08 +01:00
  • 9fbd6d7590 Added RdbBase::unlink() Ivan Skytte Jørgensen 2017-10-30 14:02:28 +01:00
  • 2d1482da2d Use debug version of gb Ai Lin Chia 2017-10-30 12:37:54 +01:00
  • 32d68c0322 Removed 'dataKeySize' parameter from RdbCache::init() Ivan Skytte Jørgensen 2017-10-27 16:35:18 +02:00
  • 0c81208e1c RdbCache: removed default values for 3 parameters to init() Ivan Skytte Jørgensen 2017-10-27 16:29:38 +02:00
  • dbd7f533a8 Removed more unused stuff from SpiderColl Ivan Skytte Jørgensen 2017-10-27 16:02:07 +02:00
  • 13c56a7a20 Merge branch 'master' into sqlite Ivan Skytte Jørgensen 2017-10-27 15:49:51 +02:00
  • 5508cce3d9 Moved local variables closer to first use Ivan Skytte Jørgensen 2017-10-27 15:47:23 +02:00
  • dec13a69b0 Removed unused local variable Ivan Skytte Jørgensen 2017-10-27 15:41:37 +02:00
  • a90b92b64a Merge branch 'master' into sqlite Ivan Skytte Jørgensen 2017-10-27 15:13:42 +02:00
  • 56bfb0c5e9 Made SpiderColl::m_lastPrintCount private Ivan Skytte Jørgensen 2017-10-27 15:13:26 +02:00
  • cdfa234d2a Removed unused SpiderColl::m_sendLocalCrawlInfoToHost Ivan Skytte Jørgensen 2017-10-27 14:51:11 +02:00
  • b5a500acef Spider/SpiderColl: Moved #defines from header to source Ivan Skytte Jørgensen 2017-10-27 14:47:42 +02:00
  • cf38bb447a SpiderColl: Moved #define from header to source Ivan Skytte Jørgensen 2017-10-27 14:33:24 +02:00
  • 9a8391ed7b optimized adult check. do not re-hash terms and phrases Brian Rasmusson 2017-10-27 14:32:27 +02:00
  • a3521a178c Deleted unwanted spiderrequests as we scan through them Ivan Skytte Jørgensen 2017-10-27 14:17:53 +02:00
  • 205980429a Don't close sqlite db when just a statement couldn't be prepared Ivan Skytte Jørgensen 2017-10-27 14:15:49 +02:00
  • 376555571d Merge branch 'master' into sqlite Ivan Skytte Jørgensen 2017-10-27 13:50:18 +02:00
  • eb7027cec3 skip spiderrequests for unwanted urls Ivan Skytte Jørgensen 2017-10-27 13:48:57 +02:00
  • e78abc742f zero terminate debug buffer before trying to log it Brian Rasmusson 2017-10-27 13:48:38 +02:00
  • d7d9a3e436 Merge branch 'master' into sqlite Ivan Skytte Jørgensen 2017-10-27 13:40:48 +02:00
  • 0bd67dfd71 Fix spidering .zip URLs (loop end condition was wrong) Ivan Skytte Jørgensen 2017-10-27 13:40:29 +02:00
  • d8db09e705 Merge branch 'sqlite' of github.com:privacore/open-source-search-engine into sqlite Ivan Skytte Jørgensen 2017-10-26 16:52:24 +02:00
  • 30c2606b14 sqlite: Made spiderdblookup work again Ivan Skytte Jørgensen 2017-10-26 16:52:12 +02:00
  • ae4b6b0233 Merge branch 'master' into sqlite Ivan Skytte Jørgensen 2017-10-26 16:10:03 +02:00
  • 0636bd2c92 commented out adultcheck sanity check that is no longer valid Brian Rasmusson 2017-10-26 15:52:25 +02:00
  • 07d29210e5 Merge branch 'master' into sqlite Ivan Skytte Jørgensen 2017-10-26 15:40:58 +02:00
  • 02f27b42d3 Msg3In: insert spiderrecords using batches Ivan Skytte Jørgensen 2017-10-26 15:38:44 +02:00
  • 045e98353b bit of code cleanup in AdultCheck. Added trace log option Brian Rasmusson 2017-10-26 15:13:50 +02:00
  • 127b80baa0 Merge branch 'master' into sqlite Ivan Skytte Jørgensen 2017-10-26 14:30:42 +02:00
  • e829e5f9c1 Removed no longer relevant/correct code for shard selection in Msg0 Ivan Skytte Jørgensen 2017-10-26 14:28:39 +02:00
  • 794c1b7845 Fix Msg0/Multicast to handle reads from spiderdb when some hosts have spidering disabled Ivan Skytte Jørgensen 2017-10-26 14:22:18 +02:00
  • 2758af694d Merge branch 'master' of github.com:privacore/open-source-search-engine Ivan Skytte Jørgensen 2017-10-26 13:51:41 +02:00
  • 4d6079b150 Renamed macro/constant RDBIDOFFSET to MSG0RDBIDOFFSET (which is what it is) Ivan Skytte Jørgensen 2017-10-26 13:51:32 +02:00
  • c5d07d92f0 Merge branch 'master' into sqlite Ivan Skytte Jørgensen 2017-10-26 13:43:33 +02:00
  • b0d8d7b581 converted file to utf8 Brian Rasmusson 2017-10-26 13:21:27 +02:00
  • 9ec9342312 new adult detection code Brian Rasmusson 2017-10-26 12:20:04 +02:00
  • de5362ddbd Add ResultOverride for exact url Ai Lin Chia 2017-10-25 16:12:02 +02:00
  • 891d25ca7e License update Brian Rasmusson 2017-10-26 10:31:55 +02:00
  • 3956eac447 Merge branch 'sqlite' of github.com:privacore/open-source-search-engine into sqlite Ivan Skytte Jørgensen 2017-10-24 15:42:48 +02:00
  • 75ce4aa11b Fixed statistics page so the stats for the caches (including the winnerlistcache) is shown correctly Ivan Skytte Jørgensen 2017-10-24 15:41:53 +02:00
  • b742754aea Merge branch 'master' into sqlite Ivan Skytte Jørgensen 2017-10-24 14:48:28 +02:00
  • e7f2087bd3 Treat specific case of titledb corruption as non-fatal Ivan Skytte Jørgensen 2017-10-24 14:47:31 +02:00
  • f56e0f6d7f Fix tld_hint for SearchInput Ai Lin Chia 2017-10-24 12:50:25 +02:00
  • c0de55d4c2 Provide exception guarantees in FxBlobCache<K>::insert() Ivan Skytte Jørgensen 2017-10-24 12:22:51 +02:00
  • e51830d220 Add ResultOverride unit test Ai Lin Chia 2017-10-24 11:03:15 +02:00
  • 6dd90e04c5 Add custom title/summary for results that are blocked by robots.txt Ai Lin Chia 2017-10-24 10:41:19 +02:00
  • a5e12821fb Changed spiderloop:winnerlistcache from RdbCache to FxBlobCache Ivan Skytte Jørgensen 2017-10-23 16:52:09 +02:00
  • 425fc35220 Browser accept language instead of browser language Ai Lin Chia 2017-10-23 16:47:22 +02:00
  • 175098a73a Store empty titlerec & index middomain url for root url disallowed by robots.txt Ai Lin Chia 2017-10-23 16:40:41 +02:00
  • 53678e151e Move some SearchInput query parameters into SearchInput object Ai Lin Chia 2017-10-23 16:39:35 +02:00
  • da674d4abe Remove unused SearchInput::m_gbcountry Ai Lin Chia 2017-10-23 15:53:06 +02:00
  • 6b30955d30 Merge branch 'master' into sqlite Ivan Skytte Jørgensen 2017-10-23 13:32:24 +02:00
  • cab0703aa8 Update sparsepp to use latest commit Ai Lin Chia 2017-10-23 12:13:31 +02:00
  • 57f5164628 Removed unused RdbCache:m_errno Ivan Skytte Jørgensen 2017-10-20 15:27:02 +02:00
  • 96c749d79b Removed unused 'useHalfKeys' parameter+member from RdbCache Ivan Skytte Jørgensen 2017-10-20 15:03:05 +02:00
  • 59b28df56e Removed unused 'supportLists' parameter+member from RdbCache Ivan Skytte Jørgensen 2017-10-20 14:41:16 +02:00
  • 8c3a77b454 Dropped wc 'shortcut'. It was not making code clearer Ivan Skytte Jørgensen 2017-10-20 14:32:11 +02:00
  • 328d12e807 Added SpiderLoop::nukeWinnerListCache() so Doledb doesn't have to poke directly inside spiderloop Ivan Skytte Jørgensen 2017-10-20 14:25:07 +02:00
  • 43c292faec spiderloop:urlcache: Changed cache timeout in GbCache from seconds to milliseconds Ivan Skytte Jørgensen 2017-10-20 14:13:24 +02:00
  • a2825b1f4a Merge branch 'master' into sqlite Ivan Skytte Jørgensen 2017-10-20 14:12:13 +02:00
  • 18f3f3159b Changed cache timeout in GbCache from seconds to milliseconds Ivan Skytte Jørgensen 2017-10-20 14:06:49 +02:00
  • db1c8bdff0 Minor changes to GbCache (constness, temporaries, encapsulation) Ivan Skytte Jørgensen 2017-10-20 14:01:03 +02:00
  • 3060b63dc4 Merge branch 'master' into sqlite Ivan Skytte Jørgensen 2017-10-20 13:51:06 +02:00
  • 26ec05efbe Removed unused g_cacheWritesEnabled Ivan Skytte Jørgensen 2017-10-20 12:35:08 +02:00
  • 05287f42b1 Removed unused list functionality from RdbCache Ivan Skytte Jørgensen 2017-10-20 12:34:18 +02:00
  • f392a374a0 Parse output*.xml Ai Lin Chia 2017-10-19 17:55:00 +02:00
  • 9063159b5f Move deadhost detection for admin/status further up so we don't return initializing when not all host is started Ai Lin Chia 2017-10-19 17:19:59 +02:00
  • 66c4f6cf75 Setup instances as separate step Ai Lin Chia 2017-10-19 16:52:12 +02:00
  • cae0927abd More fixes Ai Lin Chia 2017-10-19 15:52:05 +02:00
  • 62c74f6f24 Fix stage name Ai Lin Chia 2017-10-19 15:49:47 +02:00
  • 52e1e69287 Let's try running multi instance automated test Ai Lin Chia 2017-10-19 15:48:39 +02:00
  • b9d5bb7ba4 Change lang log to debug Ai Lin Chia 2017-10-19 15:48:16 +02:00
  • 5405bd4334 replyFlags, not requestFlags Ivan Skytte Jørgensen 2017-10-19 14:50:00 +02:00
  • e8b5d0c5b3 spiderdb:sqlite: ensure m_replyFlags are set Ivan Skytte Jørgensen 2017-10-19 14:22:18 +02:00
  • 6bf3586474 bugfix: preselect in spiderdbsqlite:addRequestRecord() Ivan Skytte Jørgensen 2017-10-19 14:03:28 +02:00