Commit Graph

  • b98d24e87b goto -> for() Ivan Skytte Jørgensen 2018-02-22 18:09:05 +01:00
  • fe95e110d5 More cleanup in Sections.* Ivan Skytte Jørgensen 2018-02-22 18:05:46 +01:00
  • 825278b560 Split out logic used for waiting tree list into SpiderdbRdbSqliteBridge::getFirstIps and use SELECT DISTINCT instead. (Don't remove multi ip select this time) Ai Lin Chia 2018-02-22 17:37:22 +01:00
  • 0d6b2af281 Revert "Split out logic used for waiting tree list into SpiderdbRdbSqliteBridge::getFirstIps and use SELECT DISTINCT instead." Ai Lin Chia 2018-02-22 17:33:12 +01:00
  • 2af8585fc0 Split out logic used for waiting tree list into SpiderdbRdbSqliteBridge::getFirstIps and use SELECT DISTINCT instead. Ai Lin Chia 2018-02-22 17:22:49 +01:00
  • 8b1bb139d3 Sections: more const, better encapsulation, better #include setup Ivan Skytte Jørgensen 2018-02-22 17:15:18 +01:00
  • 2c8bf4d7a8 Don't get from spiderdb anymore when we're already at the end of list. Increment next key for spiderdb by uh48 instead of just incrementing by one Ai Lin Chia 2018-02-22 15:42:27 +01:00
  • 5c47917b71 Revert "Don't loop to select from spiderdb anymore. We always return all rows for the single ip now." Ai Lin Chia 2018-02-22 15:13:36 +01:00
  • 4bed872b77 Merge branch 'master' of github.com:privacore/open-source-search-engine Ivan Skytte Jørgensen 2018-02-22 15:08:10 +01:00
  • f50e1d04bd Send site-default-pagem-temperature from spidering to query-host Ivan Skytte Jørgensen 2018-02-22 15:07:08 +01:00
  • 12a549495c Don't loop to select from spiderdb anymore. We always return all rows for the single ip now. Ai Lin Chia 2018-02-22 15:02:58 +01:00
  • f081241d00 bugfix in site-default median temperature files. It was always opening and maintaining the secondary file Ivan Skytte Jørgensen 2018-02-22 15:02:42 +01:00
  • d643652a67 Add missing sqlite3_finalize calls Ai Lin Chia 2018-02-22 14:39:49 +01:00
  • ac015ee8d9 Prepared document indexing for looking up remote site-default page temperatures Ivan Skytte Jørgensen 2018-02-22 12:35:17 +01:00
  • 77d0dbabe4 Simplified command-line host-range parsing in main.cpp Ivan Skytte Jørgensen 2018-02-20 14:01:36 +01:00
  • a37ab2c057 Fix logging Ai Lin Chia 2018-02-20 13:56:24 +01:00
  • 73a226fc20 Add more logs Ai Lin Chia 2018-02-20 13:49:22 +01:00
  • 445c3faf48 Doesn't matter if m_siteNumInlinksValid is not valid when getting a rebuild spiderdb req Ai Lin Chia 2018-02-20 12:28:04 +01:00
  • a043c978af Remove unused variable passed to function Ai Lin Chia 2018-02-20 11:15:01 +01:00
  • 6541742797 Cater for scenario when we're only requesting SpiderRequest using Msg0::getList Ai Lin Chia 2018-02-19 17:51:03 +01:00
  • 49c9736146 Added cmd-line command for preparing and switching to new site-default-page-temp generation Ivan Skytte Jørgensen 2018-02-19 17:51:06 +01:00
  • 60bd1c65eb Typo in log: vlaue -> value Ivan Skytte Jørgensen 2018-02-19 17:15:37 +01:00
  • 163432be01 Found a bug in html entity handling Ivan Skytte Jørgensen 2018-02-19 16:32:28 +01:00
  • 84cab0e2d6 Made call to htmlDecode() in XmlDoc::getUtf8Content() a bit clearer Ivan Skytte Jørgensen 2018-02-19 16:17:21 +01:00
  • 7190880487 Renamed getEntity_a() to getHtmlEntity() Ivan Skytte Jørgensen 2018-02-19 15:46:51 +01:00
  • e6b3bebc4f Don't reset GB_PRE if it has been set Ai Lin Chia 2018-02-19 13:58:26 +01:00
  • 3a16fda28b Add trace log for SpiderdbRdbSqliteBridge Ai Lin Chia 2018-02-19 12:16:59 +01:00
  • e503fa3afb Remove unused Conf variables Ai Lin Chia 2018-02-19 11:23:34 +01:00
  • 2ab67bb799 Add sqlite transaction time threshold configuration Ai Lin Chia 2018-02-19 11:07:12 +01:00
  • 3c9ea2e7cf signal transaction layer upward in sqlite bridge Ivan Skytte Jørgensen 2018-02-16 18:07:47 +01:00
  • bb9e823131 Merge branch 'master' of github.com:privacore/open-source-search-engine Ivan Skytte Jørgensen 2018-02-16 18:03:38 +01:00
  • 9d924817cf rollback transaction if commit fails Ivan Skytte Jørgensen 2018-02-16 17:50:29 +01:00
  • 5e3a75d392 Set to force delete instead of removing it directly from spiderdb when blocked if it was indexed Ai Lin Chia 2018-02-16 17:27:13 +01:00
  • 68b1c4ec54 Reworked SiteMedianPageTemperatureRegistry to how it will work in the future Ivan Skytte Jørgensen 2018-02-16 17:09:18 +01:00
  • ae008abd9f Removed obsolete comments about XmlDoc::computeVector() Ivan Skytte Jørgensen 2018-02-16 15:47:40 +01:00
  • 2d0c1c9d7a Merge branch 'master' into sqlite Ai Lin Chia 2018-02-16 11:55:30 +01:00
  • 9921aa5120 Make sure we have spiderdb sqlite if we still have spiderdb rdb files Ai Lin Chia 2018-02-16 11:23:08 +01:00
  • b3911069e2 Log 100000 instead of 99999 Ai Lin Chia 2018-02-16 11:22:16 +01:00
  • 38e97757ab Readd ./gb dump s command Ai Lin Chia 2018-02-15 17:21:20 +01:00
  • b6dbaaa0ce Removed unused function getTagName() Ivan Skytte Jørgensen 2018-02-15 15:20:46 +01:00
  • 9bbd7a1dc7 Made getTagLen() a local function in XmlNode.cpp Ivan Skytte Jørgensen 2018-02-15 15:19:16 +01:00
  • b78cf045fe Made isTagStart() a local function in XmlNode.* Ivan Skytte Jørgensen 2018-02-15 15:09:55 +01:00
  • fbb0cdf048 Removed unused enum value Ivan Skytte Jørgensen 2018-02-15 15:02:09 +01:00
  • 862822f476 Fix wrong comment: xml nodeid is 16-bit - not 8-bit Ivan Skytte Jørgensen 2018-02-15 14:29:06 +01:00
  • 115b334ea2 Removed references to spiderrestore.dat (long gone) Ivan Skytte Jørgensen 2018-02-15 12:51:46 +01:00
  • 36128a3952 Update readme with changes to spiderdb Ai Lin Chia 2018-02-15 11:42:12 +01:00
  • 6e8c5b6495 Add code to remove pagereindex spiderdb row when we're done with it Ai Lin Chia 2018-02-14 15:24:58 +01:00
  • 414b3a2f3c Remove not relevant comments Ai Lin Chia 2018-02-14 14:53:35 +01:00
  • a96a6d7bfe Revert "Use SpiderReply delete flag to delete the row in spiderdb" Ai Lin Chia 2018-02-14 14:45:50 +01:00
  • 7b63927d7f Revert "Fix compilation errowarning" Ai Lin Chia 2018-02-14 14:45:44 +01:00
  • d535605278 Fix compilation issue on travis Ai Lin Chia 2018-02-14 12:40:55 +01:00
  • 97a1265c50 Fix compilation errowarning Ai Lin Chia 2018-02-14 12:33:30 +01:00
  • 0bd60de5b0 Add sqlite3 to travis packages Ai Lin Chia 2018-02-14 12:26:41 +01:00
  • ab6b8eb0a6 Merge branch 'master' into sqlite Ai Lin Chia 2018-02-14 11:51:17 +01:00
  • 534ee9924d Use SpiderReply delete flag to delete the row in spiderdb Ai Lin Chia 2018-02-14 11:50:13 +01:00
  • 11f3a964d7 Move isreindex urlfilter rule to below urlfilter error checks rules Ai Lin Chia 2018-02-14 10:05:17 +01:00
  • 01d7ced9be Remove commented out codes Ai Lin Chia 2018-02-14 09:57:49 +01:00
  • 0f3d5c3321 Reset m_isAddUrl, m_isInjecting and m_isPageParser after a successful reply Ai Lin Chia 2018-02-14 09:29:48 +01:00
  • 16b656b4f2 More const in Words (less non-const functions) Ivan Skytte Jørgensen 2018-02-13 16:55:53 +01:00
  • d3097c7717 Fix compilation warning Ai Lin Chia 2018-02-13 16:41:29 +01:00
  • 386a9e9949 Fix table header Ai Lin Chia 2018-02-13 16:33:48 +01:00
  • d2787baca3 Make sure we add fx_max function in ConvertSpiderdb as well Ai Lin Chia 2018-02-13 16:31:35 +01:00
  • 47eaf91ceb Remove errCount & sameErrCount from SpiderRequest Ai Lin Chia 2018-02-13 16:31:13 +01:00
  • 5836faa71b pagetemp/sitetemp: reload, and mutexlock loading Ivan Skytte Jørgensen 2018-02-13 16:19:14 +01:00
  • dd7a1cf235 unicode: more lowercase tests Ivan Skytte Jørgensen 2018-02-13 15:47:07 +01:00
  • 1956046a56 Periodically check if page_temperatures.dat needs to be reloaded Ivan Skytte Jørgensen 2018-02-13 15:41:32 +01:00
  • a1a32d1992 XmlDoc::getHeaderTagBuf(): goto -> for loop Ivan Skytte Jørgensen 2018-02-13 14:56:30 +01:00
  • 074435c8db Merge branch 'master' into sqlite Ai Lin Chia 2018-02-12 17:32:57 +01:00
  • 91a67da3d4 Make sure we use SiteGetter to calculate sitehash instead of just defaulting to host Ai Lin Chia 2018-02-12 17:25:13 +01:00
  • 0b8eed9f78 Remove unused timestamp variable from SiteGetter::getSite Ai Lin Chia 2018-02-12 17:18:47 +01:00
  • 1cd18f7223 Always update m_hasAuthorityInlink when it's valid, and not only when m_hasAuthorityInlink is true Ai Lin Chia 2018-02-12 15:53:56 +01:00
  • f07c6b79df Modify m_siteNumInlinks to allow null value. Use fx_max instead of max. Store null value when m_siteNumInlinks is not valid. Ai Lin Chia 2018-02-12 15:52:23 +01:00
  • 35cd7738b4 Use fx_max instead of max to make sure initial null value doesn't stay null. Set m_priority to -1 when sqlite m_priority is null Ai Lin Chia 2018-02-12 15:50:00 +01:00
  • 7450af595a Add fx_max sqlite function that treats null value as the lowest value (unlike max sqlite function that returns null immediately whenever null value is encountered) Ai Lin Chia 2018-02-12 15:46:13 +01:00
  • df926b24b2 More const in XmlDoc_indexing.cpp (after hashString() is finally const) Ivan Skytte Jørgensen 2018-02-12 15:26:26 +01:00
  • ad2486a448 Removed unused Msg22::getAvailDocIdOnly() Ivan Skytte Jørgensen 2018-02-12 15:08:51 +01:00
  • 5beb6ed949 Merge branch 'master' into sqlite Ai Lin Chia 2018-02-12 14:41:21 +01:00
  • d2336e427e Update json variable names Ai Lin Chia 2018-02-12 14:26:46 +01:00
  • 5ea4b5356b Merge branch 'master' into sqlite Ai Lin Chia 2018-02-12 14:10:46 +01:00
  • da87d65320 bugfix Words::countWords() Ivan Skytte Jørgensen 2018-02-12 14:03:27 +01:00
  • 6c829daa5d Refined .gitignore Ivan Skytte Jørgensen 2018-02-12 13:31:03 +01:00
  • 237916ce9d Merge branch 'master' into sqlite Ai Lin Chia 2018-02-12 10:10:46 +01:00
  • 6fb810cf7c Constness in Words::set() and Matches::* Ivan Skytte Jørgensen 2018-02-09 17:52:47 +01:00
  • 16752d5b90 Merge branch 'master' of github.com:privacore/open-source-search-engine Ivan Skytte Jørgensen 2018-02-09 17:47:33 +01:00
  • fc8b1fa9fa Changed Words::set() to not use the NUL-termination hack Ivan Skytte Jørgensen 2018-02-09 17:47:24 +01:00
  • 174ad1207a Merge branch 'master' into sqlite Ai Lin Chia 2018-02-09 17:45:19 +01:00
  • 7b3ef9370a Append events to eventlog Ai Lin Chia 2018-02-09 17:37:06 +01:00
  • 36349c02b6 Make sure we still run GbDns::reinitializeSettings when submit job fails Ai Lin Chia 2018-02-09 17:32:37 +01:00
  • 8f41e07f62 Fixed tmp-set-NUL-string hack in Words. Ivan Skytte Jørgensen 2018-02-09 17:11:50 +01:00
  • f7fa5858fc More words unittest Ivan Skytte Jørgensen 2018-02-09 16:56:00 +01:00
  • 85a4107cc7 Added Words unittest for buffer-limitation Ivan Skytte Jørgensen 2018-02-09 16:45:39 +01:00
  • 11a3e8ffef fix Words unittest (default arg false->0) Ivan Skytte Jørgensen 2018-02-09 16:12:53 +01:00
  • 1e22db6816 Make gcc 7.x happy Ivan Skytte Jørgensen 2018-02-09 16:01:47 +01:00
  • 4484ac5f47 Merge branch 'unicode' Ivan Skytte Jørgensen 2018-02-09 15:57:46 +01:00
  • 9ef5a590fa Merge branch 'master' of github.com:privacore/open-source-search-engine Ivan Skytte Jørgensen 2018-02-09 15:31:20 +01:00
  • 8c0b82ef40 Merge branch 'master' into unicode Ivan Skytte Jørgensen 2018-02-09 15:07:43 +01:00
  • 01952c7707 Removed Words::m_xml member Ivan Skytte Jørgensen 2018-02-09 15:00:59 +01:00
  • 0cf9296336 Merge branch 'master' into unicode Ivan Skytte Jørgensen 2018-02-09 14:56:02 +01:00
  • 6747770e6e Merge branch 'master' of github.com:privacore/open-source-search-engine Ivan Skytte Jørgensen 2018-02-09 14:54:19 +01:00
  • 22811a4658 Removed Words::getContent() and Words::getPreCount() Ivan Skytte Jørgensen 2018-02-09 14:54:12 +01:00