Commit Graph

  • 16db97b1c2 Ignore killall error Ai Lin Chia 2018-02-09 14:46:39 +01:00
  • 5f50a48dd4 Merge branch 'master' of github.com:privacore/open-source-search-engine Ivan Skytte Jørgensen 2018-02-09 14:34:10 +01:00
  • 1a03f99252 Fix missed out changes in commit a8716aec79 Ai Lin Chia 2018-02-09 14:32:23 +01:00
  • 27e90e164f Removed Words::getAsLong() and moved code to the one place it was used Ivan Skytte Jørgensen 2018-02-09 14:20:11 +01:00
  • cd100af354 Just use killall Ai Lin Chia 2018-02-09 14:18:47 +01:00
  • 8644a13cbc Remove write only variable SiteGetter::m_sitePathDepth Ai Lin Chia 2018-02-09 14:07:56 +01:00
  • cb1089a575 Merge branch 'master' of github.com:privacore/open-source-search-engine Ivan Skytte Jørgensen 2018-02-09 14:03:15 +01:00
  • a8716aec79 Removed always-true argument computeWordsIds to Words::set() and Words::addWords() Ivan Skytte Jørgensen 2018-02-09 14:03:08 +01:00
  • 773c5f1e93 Fix bug introduced on commit 7f96ef2510 Ai Lin Chia 2018-02-09 13:57:54 +01:00
  • 88fc9423ae Append parameter changes to eventlog (only on instance #0) Ai Lin Chia 2018-02-09 11:11:39 +01:00
  • 79537f9f09 We should dereference changed Ai Lin Chia 2018-02-08 17:26:54 +01:00
  • 32afe70ce7 Merge branch 'master' into sqlite Ai Lin Chia 2018-02-08 17:13:23 +01:00
  • 184de2baeb Merge branch 'master' into unicode Ivan Skytte Jørgensen 2018-02-08 16:40:47 +01:00
  • a860329472 Better unittests for Words class Ivan Skytte Jørgensen 2018-02-08 16:40:32 +01:00
  • 4921f47784 Delay reload of dns settings until there are no pending requests, and at the same time pause requests Ai Lin Chia 2018-02-08 16:32:55 +01:00
  • fbe1a97d8e Removed debug-printf in unittest Ivan Skytte Jørgensen 2018-02-08 16:15:44 +01:00
  • 2d3faa2b3e unicode: fix unittests Ivan Skytte Jørgensen 2018-02-08 16:13:18 +01:00
  • 3f43604cd1 unicode: bugfix Words::addWords() Ivan Skytte Jørgensen 2018-02-08 16:12:27 +01:00
  • 673c9c107f bugfix is_wordchar Ivan Skytte Jørgensen 2018-02-08 15:00:19 +01:00
  • 91f90723c5 unicode: extended wordchar unittest Ivan Skytte Jørgensen 2018-02-08 14:43:01 +01:00
  • f6417161e2 .gitignore: libunicode.a Ivan Skytte Jørgensen 2018-02-08 14:13:02 +01:00
  • 558dfcca05 clean unicode/ in clean target (and sto+word_variations) Ivan Skytte Jørgensen 2018-02-08 14:11:32 +01:00
  • e4cda54bc0 Avoid callback when it's not necessary Ai Lin Chia 2018-02-08 14:06:06 +01:00
  • 3245abf58c unicode: forgotten fix in word-variations Ivan Skytte Jørgensen 2018-02-08 13:49:18 +01:00
  • 9079428cbf unicode: optimized getUtf8CharSize() a bit Ivan Skytte Jørgensen 2018-02-08 13:02:13 +01:00
  • b7af061cb6 Added new unicode *.dat files to repository. Removed old files Ivan Skytte Jørgensen 2018-02-08 12:34:07 +01:00
  • 5ca7f7a45d unicode: don't generate .dat files by default Ivan Skytte Jørgensen 2018-02-08 12:31:39 +01:00
  • 005f0c503a Merge branch 'master' into unicode Ivan Skytte Jørgensen 2018-02-08 12:09:44 +01:00
  • f9c4593405 Only do reload when we have changes in parms Ai Lin Chia 2018-02-08 11:28:15 +01:00
  • bc854f4d5f Cater for nullptr that can be returned by getIp Ai Lin Chia 2018-02-08 10:55:12 +01:00
  • 2889358915 Remove SpiderdbHostDelete feature. We'll delete records from spiderdb externally instead of from gb directly. Ai Lin Chia 2018-02-07 13:30:26 +01:00
  • fb0f3ce55d Default crawlDelayMS to 0 instead of 1 Ai Lin Chia 2018-02-07 12:21:55 +01:00
  • 3988f8f52a Merge branch 'master' into sqlite Ai Lin Chia 2018-02-07 11:27:53 +01:00
  • d4cf5f6455 unicode: Preliminary commit, work-in-progress Ivan Skytte Jørgensen 2018-02-06 17:06:35 +01:00
  • a4ff12d635 Merge branch 'master' into unicode Ivan Skytte Jørgensen 2018-02-06 17:02:59 +01:00
  • 9df357a106 bugfix PageTemperatureRegistry: coredumped if not present Ivan Skytte Jørgensen 2018-02-06 17:02:26 +01:00
  • 723ce40a0f Merge branch 'master' into unicode Ivan Skytte Jørgensen 2018-02-06 15:27:08 +01:00
  • d658c53877 Removed accent / unicode-decomposition based synonyms, eg. "kål" <=> "kal" Ivan Skytte Jørgensen 2018-02-06 15:26:49 +01:00
  • 364dd030f5 unicode: tmp. compile fix Ivan Skytte Jørgensen 2018-02-06 15:05:53 +01:00
  • 842b80caa2 Added Unicode::recursive_canonical_decompose() Ivan Skytte Jørgensen 2018-02-06 14:54:01 +01:00
  • a84144b0b1 unicode: made table name more explicit (g_unicode_canonical_decomposition_map) Ivan Skytte Jørgensen 2018-02-06 13:54:58 +01:00
  • 620cd5bfa7 Merge branch 'master' into unicode Ivan Skytte Jørgensen 2018-02-06 13:40:04 +01:00
  • 337c71e629 pagetemp: optimized use in postdbtable Ivan Skytte Jørgensen 2018-02-06 13:39:41 +01:00
  • 75a1f7d7b6 unicode: generate and load unicode_is_uppercase.dat and unicode_is_lowercase.dat Ivan Skytte Jørgensen 2018-02-05 23:57:36 +01:00
  • 0f748aaac4 unicode: generate and load unicode_is_alphabetic.dat Ivan Skytte Jørgensen 2018-02-05 23:50:01 +01:00
  • 30af33695d Merge branch 'master' into unicode Ivan Skytte Jørgensen 2018-02-05 23:44:26 +01:00
  • b0ca7b70f9 Removed unused ucIsDigit() Ivan Skytte Jørgensen 2018-02-05 23:44:17 +01:00
  • e223cda3bf Merge branch 'master' into unicode Ivan Skytte Jørgensen 2018-02-05 23:24:18 +01:00
  • 5c5bef89b5 Merge branch 'master' of github.com:privacore/open-source-search-engine Ivan Skytte Jørgensen 2018-02-05 18:16:15 +01:00
  • 3be99ae9cd Moved ucToUtf8() to separate module Ivan Skytte Jørgensen 2018-02-05 18:16:05 +01:00
  • 50952d4e2c Merge branch 'master' into dev-ipblock Ai Lin Chia 2018-02-05 18:15:02 +01:00
  • 2040903102 Use fallback per-site page temperature from site_median_page_temperatures.dat if it exists Ivan Skytte Jørgensen 2018-02-05 17:36:00 +01:00
  • c6101798c5 Move loading of contenttypeallowed.txt to init function Ai Lin Chia 2018-02-05 16:38:04 +01:00
  • b16d39c07b Make more private in Unicode Ivan Skytte Jørgensen 2018-02-05 16:29:38 +01:00
  • b63609ab9a moved char/utf8 lookup tables from fctypes to utf8_fast Ivan Skytte Jørgensen 2018-02-05 15:44:36 +01:00
  • 636d98a68b Add check for ipblocklist while spidering Ai Lin Chia 2018-02-05 15:23:04 +01:00
  • 7bf01ea089 Fixed ancient bug in is_alnum_api_utf8_string() Ivan Skytte Jørgensen 2018-02-05 15:21:40 +01:00
  • c6aa9a47ec Revert "Don't insert into spiderdb when it's record not found; Set index code to ENOTFOUND as well" Ai Lin Chia 2018-02-05 15:13:53 +01:00
  • 82340e52da Merge branch 'master' of github.com:privacore/open-source-search-engine Ivan Skytte Jørgensen 2018-02-05 15:02:53 +01:00
  • 5ccad7408d Moved utf8 string manipulation functions from fctypes to separate file utf8_fast Ivan Skytte Jørgensen 2018-02-05 14:56:44 +01:00
  • c7dbd2c3cc Merge branch 'master' into dev-ipblock Ai Lin Chia 2018-02-05 14:54:56 +01:00
  • 5b0ae5a399 Fix unit test compilation from changes in 2eb2408c87 Ai Lin Chia 2018-02-05 14:54:18 +01:00
  • 7d6d17708a Removed unused functions from UnicodeProperties.* Ivan Skytte Jørgensen 2018-02-05 14:07:40 +01:00
  • a920f49928 Removed some unused types from UnicodeProperties.h Ivan Skytte Jørgensen 2018-02-05 13:57:20 +01:00
  • 2eb2408c87 Moved pure utf-8 functions to separate module Ivan Skytte Jørgensen 2018-02-05 13:53:12 +01:00
  • a764a6787d Fix compilation error Ai Lin Chia 2018-02-04 19:17:30 +01:00
  • bdc10cde92 Initial commit of IpBLockList Ai Lin Chia 2018-02-04 19:02:03 +01:00
  • 19c50e534f Add function to add to blocklist (can be overriden by child class) Ai Lin Chia 2018-02-04 17:06:02 +01:00
  • 0f99e4b03f Modify BlockList to template based Ai Lin Chia 2018-02-04 16:58:04 +01:00
  • b0529fb488 Fix unit test compilation error from commit 20fbe0a129 Ai Lin Chia 2018-02-04 10:59:31 +01:00
  • 2756717c20 Add unit test to make sure code doesn't segfault anymore Ai Lin Chia 2018-02-04 10:59:11 +01:00
  • bdcf170f6c Fix coredump when filtering doc with only dots & spaces Ai Lin Chia 2018-02-04 10:43:15 +01:00
  • 03e2241e98 Merge branch 'master' into unicode Ivan Skytte Jørgensen 2018-02-03 21:03:08 +01:00
  • 20fbe0a129 Moved isUtf8UnwantedSymbols() to separate header Ivan Skytte Jørgensen 2018-02-03 20:58:45 +01:00
  • 85fa403ca4 Merge branch 'master' into unicode Ivan Skytte Jørgensen 2018-02-02 18:03:16 +01:00
  • f3d9b27440 Preliminary commit on unicode update Ivan Skytte Jørgensen 2018-02-02 18:01:59 +01:00
  • 3ac7bd682c Fold getUtf8CharSize() variants into one and make some code reuse Ivan Skytte Jørgensen 2018-02-02 17:55:40 +01:00
  • b1c38aee9d Made gbiconv_open/gbiconv_close() private Ivan Skytte Jørgensen 2018-02-02 17:29:01 +01:00
  • 09521e69ed Made ucToUtf8() non-inline and ucToAny() private Ivan Skytte Jørgensen 2018-02-02 17:27:15 +01:00
  • 1cc7ea1441 Remove commented out code Ai Lin Chia 2018-02-02 17:18:20 +01:00
  • 064d025b7d Merge branch 'master' into sqlite Ai Lin Chia 2018-02-02 17:16:01 +01:00
  • 0e18088cfa Fix m_siteHash32 for SpiderRequest::setFromAddUrl (getHostFast returns host without port, getHost returns host with port) Ai Lin Chia 2018-02-02 17:15:07 +01:00
  • e59e310fc3 Remove now unused variable Ai Lin Chia 2018-02-02 16:57:05 +01:00
  • 139055164b Remove now unused variable/function Ai Lin Chia 2018-02-02 16:33:13 +01:00
  • cc0613481f Remove hop count (not stored in sqlite based spiderdb) Ai Lin Chia 2018-02-02 15:50:03 +01:00
  • 87f9ca70a1 Use HttpMime::reset instead of setting it separately to avoid forgetting to clear some variables in HttpMime::set function Ai Lin Chia 2018-02-02 15:06:22 +01:00
  • 87a10c311a Remove commented out code Ai Lin Chia 2018-02-02 12:52:57 +01:00
  • 3f6e3b888c Remove hopcount from preconfigured url filters Ai Lin Chia 2018-02-02 12:50:57 +01:00
  • d131be1b64 Merge branch 'master' into sqlite Ai Lin Chia 2018-02-02 10:51:11 +01:00
  • 8e5c6bf85d Merge branch 'master' of github.com:privacore/open-source-search-engine Ivan Skytte Jørgensen 2018-02-01 14:45:24 +01:00
  • 78adcea37d Introduced settings to allow/disallow requests between spiderhosts and query hosts. Ivan Skytte Jørgensen 2018-02-01 14:44:01 +01:00
  • 5d1e8451d9 More fixes to json Ai Lin Chia 2018-01-31 17:20:38 +01:00
  • 111831c420 Fix json formatting error Ai Lin Chia 2018-01-31 17:13:41 +01:00
  • df6f817e8c Return more fields when searching for spiderdb records in json format Ai Lin Chia 2018-01-31 16:33:12 +01:00
  • de5f3a9747 Remove config files for make cleantest Ai Lin Chia 2018-01-31 16:28:06 +01:00
  • 78b8e2f5a2 Fix Jenkinsfile so that it works again Ai Lin Chia 2018-01-31 12:12:50 +01:00
  • 8b25467168 Another attempt of making Jenkins to checkout into a subdir Ai Lin Chia 2018-01-31 12:07:27 +01:00
  • 3b31cb8876 Add BranchDiscoveryTrait Ai Lin Chia 2018-01-31 11:59:42 +01:00
  • e70f049db1 More tests for Jenkinsfile Ai Lin Chia 2018-01-31 11:55:36 +01:00
  • 80a159d286 Let's try again with Jenkinsfile Ai Lin Chia 2018-01-31 11:28:32 +01:00