Commit Graph

  • 695ac6ddd2 More constness in RdbList.cpp Ivan Skytte Jørgensen 2016-04-12 12:35:25 +02:00
  • da6bf8fe89 Simplified merge loop in RdbList::posdbMerge_r() by changing gotos into structured loops Ivan Skytte Jørgensen 2016-04-12 12:28:08 +02:00
  • c269fc6993 Moved test programs and tools to subdirectory 'misc' Ivan Skytte Jørgensen 2016-04-12 11:53:45 +02:00
  • 94a3f2ef55 Removed local copy of libz Ivan Skytte Jørgensen 2016-04-12 11:34:25 +02:00
  • 85bab12116 Commented out thread UNsafe callback parameter where possible Brian Rasmusson 2016-04-11 20:53:03 +02:00
  • 6c83553a51 Help gcc with optimization Ivan Skytte Jørgensen 2016-04-11 17:28:32 +02:00
  • 2341c999dd Separate out ugly 6-byte (sub-) key comparisons Ivan Skytte Jørgensen 2016-04-11 16:35:23 +02:00
  • e9b69d3a77 More cleanup in posdbMerge_r() Ivan Skytte Jørgensen 2016-04-11 16:23:21 +02:00
  • 2230829cfb Removed more commented-out code from posdbMerge_r Ivan Skytte Jørgensen 2016-04-11 15:55:15 +02:00
  • b4ad3490bc Changed hiKeys/loKeys from pointer-array to array-of-array Ivan Skytte Jørgensen 2016-04-11 15:35:09 +02:00
  • 950dfae00e Made KEYxxx functions in types.h static inline instead of just inline Ivan Skytte Jørgensen 2016-04-11 14:57:24 +02:00
  • a1e758a555 Removed unused and out-commented methods from RdbList (mostly those taking key_t arguments) Ivan Skytte Jørgensen 2016-04-11 14:56:46 +02:00
  • 6e9a961249 Made Msg5::getList() take const keys Ivan Skytte Jørgensen 2016-04-11 11:48:30 +02:00
  • 61a73d93bb Merge branch 'master' of github.com:privacore/open-source-search-engine Ivan Skytte Jørgensen 2016-04-11 11:27:32 +02:00
  • 1ca9dde6e1 Removed unused cancel code from Threads.cpp Brian Rasmusson 2016-04-10 16:33:37 +02:00
  • fa26d9c377 Threads.cpp used a counter that was never incremented or decremented... Brian Rasmusson 2016-04-10 15:48:50 +02:00
  • 16d039ac60 More Threads cleanup Brian Rasmusson 2016-04-10 15:36:02 +02:00
  • ce5151f0f2 remove dead code from Threads implementation Brian Rasmusson 2016-04-09 17:15:04 +02:00
  • 38183dbaae Removed unused 'yieldPoint' in merge_r() Ivan Skytte Jørgensen 2016-04-08 17:41:17 +02:00
  • 5a77118235 Spring cleaning in RdbList::posdbMerge_r() Ivan Skytte Jørgensen 2016-04-08 16:28:31 +02:00
  • 9d17137b83 Removed unused local variable 'yieldPoint' Ivan Skytte Jørgensen 2016-04-08 15:37:23 +02:00
  • 5f2b8cb5cd Make max-threads configurable Ivan Skytte Jørgensen 2016-04-08 13:33:32 +02:00
  • 10aad5416d Use statvfs() instead of external 'df' to get disk usage Ivan Skytte Jørgensen 2016-04-08 12:13:13 +02:00
  • 6dc2cbec7a More cleanup in Process.* Ivan Skytte Jørgensen 2016-04-08 11:48:17 +02:00
  • 275d10e139 Removed unused temperature/fan-state from Process Ivan Skytte Jørgensen 2016-04-08 11:41:32 +02:00
  • 6cf7fad7bc Removed more unused struct members Ivan Skytte Jørgensen 2016-04-08 11:37:57 +02:00
  • f6fab65a6d Removed unused members of local structs Ivan Skytte Jørgensen 2016-04-08 11:20:50 +02:00
  • b8dac218fb Add constness to some Url methods Ai Lin Chia 2016-04-07 17:57:16 +02:00
  • 242c48a338 Remove LOG_TRACE from log in Threads. LOG_TRACE is always logged, so it needs to be surrounded by a configuration option at every call. Ai Lin Chia 2016-04-07 17:16:00 +02:00
  • 48cad55568 Wrap RdbMap trace log inside g_conf.m_logTraceRdbMap Ai Lin Chia 2016-04-07 17:13:30 +02:00
  • 8daa051974 Removed unused parameters tfns/tfndbList from RdbList::merge_r() Ivan Skytte Jørgensen 2016-04-07 18:17:53 +02:00
  • 105df94aea Removed unused doGroupMask/isRealMerge parameters from RdbList::merge_r() Ivan Skytte Jørgensen 2016-04-07 18:15:37 +02:00
  • fb62a1c3af Removed commented-out parameters to RdbList::merge_r() Ivan Skytte Jørgensen 2016-04-07 18:07:01 +02:00
  • d6be7284c7 Removed unused parameter 'filtered' to RdbList::merge_r() Ivan Skytte Jørgensen 2016-04-07 18:04:11 +02:00
  • 7035c3b2da Removed effectively unused Msg5::m_filtered Ivan Skytte Jørgensen 2016-04-07 18:01:13 +02:00
  • 5a3e9b022a Simplify the two KEYCMP functions Ivan Skytte Jørgensen 2016-04-07 17:06:30 +02:00
  • d93ee6bcbf Removed g_inMemcpy global from unittest Ai Lin Chia 2016-04-07 15:58:38 +02:00
  • 9cb9b7bf7a Removed g_inMemcpy global Ivan Skytte Jørgensen 2016-04-07 15:10:42 +02:00
  • 589b8ba5dd Increase optimization of RdbList+rdbMap from -O2 to -O3 Ivan Skytte Jørgensen 2016-04-07 13:43:17 +02:00
  • 036a298cd2 Remove more unused files Ai Lin Chia 2016-04-07 00:09:35 +02:00
  • d4cc685c08 Enhanced debug output for RdbList/Posdb merging. Removed unused code Brian Rasmusson 2016-04-06 16:42:31 +02:00
  • a1df6baa1d Fix bug in Url::getDisplayUrl where it does not handle xn-- in url path correctly with a non idn domain Ai Lin Chia 2016-04-06 15:50:09 +02:00
  • 4c3d6b4619 Fix international domain printing bug. Zak Betz 2016-03-29 12:41:34 -06:00
  • 3bfa17e8f5 Fix segfault when NULL is passed as robots.txt. Minor log changes. Ai Lin Chia 2016-04-05 23:08:18 +02:00
  • 98e17f143d Rename param from lt_robot to ltr Ai Lin Chia 2016-04-05 23:07:42 +02:00
  • 5f06305be0 Remove hack for pragma pack mess Ai Lin Chia 2016-04-05 16:57:25 +02:00
  • 09c15a8f88 More Url::set changes. Add more overloads for all false parameters. Ai Lin Chia 2016-04-04 23:54:33 +02:00
  • 34a4d08cd1 Add Url::set overload for all false boolean parameter Ai Lin Chia 2016-04-04 23:31:07 +02:00
  • 9b4ac7fe9e Change usage of Url::set to another overload Ai Lin Chia 2016-04-04 23:16:59 +02:00
  • 766dfbddb4 Remove unused Url::set overload Ai Lin Chia 2016-04-04 23:12:25 +02:00
  • 43775ffdc9 Remove always false parameter from Url::set Ai Lin Chia 2016-04-04 23:06:48 +02:00
  • b0b5cda487 Remove code for hijacked detection. We're always setting it to false in XmlDoc. The detection code doesn't look very safe. Ai Lin Chia 2016-04-04 12:31:46 +02:00
  • 13a8d2017f Extend unit test scope logging Ai Lin Chia 2016-04-04 10:04:21 +02:00
  • 3aa1dec175 Quick hack to prevent the pragma pack mess from crashing our code/unit test. Add -fno-stack-protector for now until pragma pack stuff is fixed Ai Lin Chia 2016-04-01 17:11:52 +02:00
  • 5ccfde3e2a Remove titleRecVersion from Url.h/Url.cpp. Checks there are versions from before gigablast was open-sourced Ai Lin Chia 2016-04-01 17:04:51 +02:00
  • 0ae20dbfee Add more unit test for RobotRule & unit test for real robots.txt found in the wild Ai Lin Chia 2016-04-01 14:05:17 +02:00
  • fa019cf8a8 Always use rules from user-agent when we have found it even when it's empty Ai Lin Chia 2016-04-01 13:59:17 +02:00
  • eccd13b5db Change rules from using std::list to std::vector Ai Lin Chia 2016-04-01 11:36:20 +02:00
  • cbc7209ce2 Fix wildcard matching. Add test case to cover it. Ai Lin Chia 2016-03-31 23:22:06 +02:00
  • bf17be704f Add timing logs for Robots.cpp Ai Lin Chia 2016-03-31 17:46:54 +02:00
  • c02cac54b2 Remove old implementation of robots.txt parser. Remove now unused Mime.cpp & Mime.h Ai Lin Chia 2016-03-31 15:54:58 +02:00
  • 5f5fb650ac Disable debug build log Ai Lin Chia 2016-03-31 15:46:32 +02:00
  • 9d73d42065 Replace old robots.txt parser implementation with new Ai Lin Chia 2016-03-31 15:46:12 +02:00
  • dceebd6053 Add debug log. Remove logging from unit tests Ai Lin Chia 2016-03-31 15:37:13 +02:00
  • 1b8f66902b Make Robots::print public Ai Lin Chia 2016-03-31 15:16:55 +02:00
  • a8b5aa6d24 Move all unit test to use new code Ai Lin Chia 2016-03-31 14:47:35 +02:00
  • 097c650de2 Remove commented out debug log lines Ai Lin Chia 2016-03-31 14:28:06 +02:00
  • 41d1418518 Add code for wildcard searching. Additional unit test for wildcard searching Ai Lin Chia 2016-03-31 00:03:49 +02:00
  • 82a76baab0 Refactored Robots. Move RobotRule into separate file. Additional unit test. Ai Lin Chia 2016-03-29 16:58:52 +02:00
  • 2336613db4 Fix multi-line consecutive user-agent Ai Lin Chia 2016-03-29 13:01:25 +02:00
  • 86a708cae5 Add code to handle crawl-delay & unit test for it Ai Lin Chia 2016-03-29 11:57:48 +02:00
  • 3e0ce1fc0b Add code to extract lines, field & value from robots.txt Ai Lin Chia 2016-03-25 17:45:18 +01:00
  • 210e15a4d5 Add m_logTraceRobots for trace Robots log Ai Lin Chia 2016-03-24 10:01:36 +01:00
  • 91df665033 Add more robots.txt unit test Ai Lin Chia 2016-03-22 22:50:30 +01:00
  • 65e668885c Remove unused Mime::getValue function Ai Lin Chia 2016-03-22 13:27:47 +01:00
  • f44e6053dd Removed unnecessary #includes from RdbList.cpp Ivan Skytte Jørgensen 2016-04-05 22:50:10 +02:00
  • e4323f397f Removed global #pragma pack(4) from <types.h> Ivan Skytte Jørgensen 2016-04-05 16:39:35 +02:00
  • 11f091a10b more valgrind suppressions Ivan Skytte Jørgensen 2016-04-05 15:01:41 +02:00
  • 3f8e094a25 Made padding in Inlink explicit Ivan Skytte Jørgensen 2016-04-05 14:57:33 +02:00
  • dcfe671519 Made local variable s_dummy2 static Ivan Skytte Jørgensen 2016-04-05 13:54:33 +02:00
  • 78a2e8f9a3 Call Msg39Reply::reset() so there are no uninitialized bytes Ivan Skytte Jørgensen 2016-04-04 16:09:08 +02:00
  • 9be2b2a931 Fix start script so that it runs on dash shell Ai Lin Chia 2016-04-04 14:44:22 +02:00
  • 1f6305087e Removed explicit m_buf[0] from InjectionRequest Ivan Skytte Jørgensen 2016-04-04 14:27:56 +02:00
  • ce5a6cf3a9 Removed explicit m_buf[0] from Msg25Request Ivan Skytte Jørgensen 2016-04-04 14:23:57 +02:00
  • ef3b8a343b Removed explicit m_buf[0] from Msg13Request Ivan Skytte Jørgensen 2016-04-04 13:43:09 +02:00
  • 2bb84f8bd9 Fixed debug log in Threads.cpp Ivan Skytte Jørgensen 2016-04-04 13:31:57 +02:00
  • e57fb2f72c Removed explicit m_buf[0] from Msg39Request/Msg39Reply Ivan Skytte Jørgensen 2016-04-04 12:36:32 +02:00
  • b82536583f Removed explicit m_buf [0] member from Msg20Request/Msg20Reply Ivan Skytte Jørgensen 2016-04-04 12:26:25 +02:00
  • f6efc87728 Use general serialize/deserialize functions in Msg20Request Ivan Skytte Jørgensen 2016-04-04 12:07:03 +02:00
  • 7f6b3e4f7e Use general serialization/deserialization functions in msg20 Ivan Skytte Jørgensen 2016-04-04 11:57:31 +02:00
  • 807218902e Removed unnecessary clearing in Msg20Request::reset() Ivan Skytte Jørgensen 2016-04-04 10:42:22 +02:00
  • fc5f02e592 Removed non-pthreads code Ivan Skytte Jørgensen 2016-03-31 15:07:14 +02:00
  • c52264b13a Allow more than 1 thread for posdb merge and intersection Ivan Skytte Jørgensen 2016-03-31 14:37:08 +02:00
  • f360546778 Use of #pragma pack breaks standard library functionality. Partial fix Brian Rasmusson 2016-03-30 15:54:28 +02:00
  • d14d07bfd0 Marked most Posdb member functions as static Ivan Skytte Jørgensen 2016-03-29 15:30:47 +02:00
  • 322c8d1e7a Made all (81) members of Posdb private Ivan Skytte Jørgensen 2016-03-29 15:09:43 +02:00
  • 19c27d6aac fix keysize==8 bug in keycmp. Manual merge of cab6d5c51954069d48ec2130ecdd46f55499ba9b Brian Rasmusson 2016-03-29 11:16:20 +02:00
  • 81ee2f196c gbufix memory leak in high-freq-term shortcut Ivan Skytte Jørgensen 2016-03-22 16:29:10 +01:00
  • d95b4624bd fix memory leak in summary cache Ivan Skytte Jørgensen 2016-03-22 15:51:57 +01:00
  • 66b6fb5093 Use memcpy/memmove instead of gbmemcpy in Mem.cpp Ivan Skytte Jørgensen 2016-03-22 14:14:30 +01:00