Commit Graph

  • 6b04e16164 Changed some needless if..continue constructs to plain if {} Ivan Skytte Jørgensen 2017-06-16 14:44:05 +02:00
  • 8d1871a393 Changed QueryTermInfo::m_matching* from separate arrays to array-of-structs Ivan Skytte Jørgensen 2017-06-16 13:55:01 +02:00
  • 017e5cbd1b Link c-ares to gb. Add c-ares as a dependency Ai Lin Chia 2017-06-29 15:15:37 +02:00
  • 47cb037531 Added dependency list for OpenSuse Ivan Skytte Jørgensen 2017-06-29 14:15:08 +02:00
  • 759e921300 Use default min/max bytes for cld3 unless contentLen is bigger than default max bytes Ai Lin Chia 2017-06-28 15:06:05 +02:00
  • e1d88617ae Don't pass in contentLen for CLD3, don't use best effort flag for CLD2 Ai Lin Chia 2017-06-28 15:03:25 +02:00
  • ea60aeb00b Merge branch 'master' into dev-encoding Ai Lin Chia 2017-06-28 14:42:08 +02:00
  • bab6c505ed Pick language based on specific criteria Ai Lin Chia 2017-06-28 14:18:07 +02:00
  • fafe712c1c Remove gb & cld3 summary lang detection Ai Lin Chia 2017-06-28 13:46:59 +02:00
  • a3636032ee Return langUnknown when contentLen is 0 Ai Lin Chia 2017-06-28 12:31:15 +02:00
  • 1724188ec9 Pass isPlainText boolean to CLD2 Ai Lin Chia 2017-06-28 12:22:51 +02:00
  • ea97639243 Move CLD related functions to GbLanguage. Add mapping between CLD to GB Ai Lin Chia 2017-06-28 11:52:46 +02:00
  • ecaca8aeea Add more languages Ai Lin Chia 2017-06-28 11:52:06 +02:00
  • 025bf99dca Don't set S(ynonym) bit in bigram posdb entries. Ivan Skytte Jørgensen 2017-06-27 16:34:30 +02:00
  • 9f072606d3 Removed "hack of confusion" in PosdbTable.cpp Ivan Skytte Jørgensen 2017-06-27 16:19:19 +02:00
  • af95ceab70 More trace info in PosdbTable.cpp Ivan Skytte Jørgensen 2017-06-20 16:49:24 +02:00
  • afccd9ca1d Keep track of where a qti matchingsublist came from (for later referencing the qt->* fields (weights etc.)) Ivan Skytte Jørgensen 2017-06-20 16:41:57 +02:00
  • e852cd90ac Added more strategic VALGRIND_MAKE_MEM_UNDEFINED calls Ivan Skytte Jørgensen 2017-06-20 14:07:19 +02:00
  • 0721d0fa94 Changed endless if-chains for qt->m_fieldCode into plain switch() Ivan Skytte Jørgensen 2017-06-20 13:35:52 +02:00
  • 5796dd3d2d Eliminated QueryTermInfo::m_totalSubListsSize Ivan Skytte Jørgensen 2017-06-20 12:51:47 +02:00
  • e7aa9012e9 More const in PosdbTable.cpp Ivan Skytte Jørgensen 2017-06-20 12:41:25 +02:00
  • 1116c270da Restructured QueryTermInfo Ivan Skytte Jørgensen 2017-06-20 12:28:50 +02:00
  • 1877d7a998 Changed quer:* m_fieldCode from char to an enum Ivan Skytte Jørgensen 2017-06-19 16:25:14 +02:00
  • b61b05d840 Actually find the first position match of a qti, and not the first position in the last list of a qti Ivan Skytte Jørgensen 2017-06-19 14:46:44 +02:00
  • 07fbc3433d Factor out common sublist iterator from PosdbTable::prefilterMaxPossibleScoreByDistance() to setRingbufFromQTI() Ivan Skytte Jørgensen 2017-06-19 14:45:14 +02:00
  • 641ae77211 Use Posdb::getWordPos() in PosdbTable.cpp Ivan Skytte Jørgensen 2017-06-19 14:27:52 +02:00
  • f70a6c6ff1 Move local variable declarations nearer to first use Ivan Skytte Jørgensen 2017-06-19 13:11:59 +02:00
  • 105e2df144 if...continue -> if {} Ivan Skytte Jørgensen 2017-06-16 15:53:15 +02:00
  • 0f8a73eade Moved local variable decl to first use Ivan Skytte Jørgensen 2017-06-16 14:57:35 +02:00
  • eb3b264b13 Changed some needless if..continue constructs to plain if {} Ivan Skytte Jørgensen 2017-06-16 14:44:05 +02:00
  • 0d4c2e521f Changed QueryTermInfo::m_matching* from separate arrays to array-of-structs Ivan Skytte Jørgensen 2017-06-16 13:55:01 +02:00
  • 86f5cbd2b9 Avoid testing on undefined values Ivan Skytte Jørgensen 2017-06-27 14:29:04 +02:00
  • 9a3def35cf merged from staging Brian Rasmusson 2017-06-27 11:51:34 +02:00
  • 50c2e025b6 if-loooong-else --> if-early-break Ivan Skytte Jørgensen 2017-06-26 16:45:58 +02:00
  • 2633d75612 Merge branch 'master' into dev-encoding Ai Lin Chia 2017-06-26 15:18:14 +02:00
  • 43a9f1257f Turn do-while into plain while in PosdbTable::mergeTermSubListsForDocId() Ivan Skytte Jørgensen 2017-06-26 15:01:25 +02:00
  • 8419c6b351 Moved decl of local 'ks' to where it was assigned and used Ivan Skytte Jørgensen 2017-06-26 14:44:39 +02:00
  • 914eda5e08 Always use CED detected encoding when it's reliable or when we can't detect charset from other methods Ai Lin Chia 2017-06-26 14:42:31 +02:00
  • bf32138203 Move getCharsetFast to GbEncoding Ai Lin Chia 2017-06-26 14:16:47 +02:00
  • 0bb43ce451 Remove now empty LanguageIdentifier.h and LanguageIdentifier.cpp Ai Lin Chia 2017-06-26 13:05:05 +02:00
  • 01a1450a05 Disable synonyms in query-reindex. And disable high-freq-term-cache Ivan Skytte Jørgensen 2017-06-26 12:36:05 +02:00
  • f8626a7058 Always run through libced Ai Lin Chia 2017-06-26 12:33:41 +02:00
  • ba0856caaa Make PosdbTable::allocateTopTree estimate correctly Ivan Skytte Jørgensen 2017-06-26 12:21:48 +02:00
  • 1ce009ea5a Don't use max length as it will not get any summary when getting from meta tags Ai Lin Chia 2017-06-26 11:18:52 +02:00
  • e090c6c473 let's return something other than "xx" if we can't get summary Ai Lin Chia 2017-06-23 15:30:47 +02:00
  • 3fcaf21a34 Move guessCountryTLD to CountryCode.h from LanguageIdentifier.h Ai Lin Chia 2017-06-23 15:13:50 +02:00
  • 4f78bd0218 Fixed reallocation of toptree Ivan Skytte Jørgensen 2017-06-23 14:20:58 +02:00
  • 8030c75e75 Get as much summary & title as possible Ai Lin Chia 2017-06-22 16:31:55 +02:00
  • cece00ace1 We should only get up to MAX_SUMMARY_LEN - 1 Ai Lin Chia 2017-06-22 15:55:34 +02:00
  • 7064117112 Various changes for language detection. Use max title/summary length in language detection. Pass in only text (without html tags) to CLD2/CLD3 Ai Lin Chia 2017-06-22 15:03:17 +02:00
  • efefc700dd Use -1 for max node instead of arbitrary 999999 Ai Lin Chia 2017-06-22 15:01:33 +02:00
  • 53f5257ac9 Add getLangIdSummary using gb method Ai Lin Chia 2017-06-21 15:31:48 +02:00
  • f1d9225e24 Try to check language for summary Ai Lin Chia 2017-06-21 15:22:48 +02:00
  • 4f3b27256d Add new libraries to g_files Ai Lin Chia 2017-06-21 15:01:49 +02:00
  • 57b38509f5 Add dependency for libprotobuf and protoc-compiler Ai Lin Chia 2017-06-21 14:50:47 +02:00
  • e79a0c7447 Merge branch 'master' into dev-encoding Ai Lin Chia 2017-06-21 14:48:07 +02:00
  • 551f0cb6ed Expand on dependency section Ai Lin Chia 2017-06-21 14:29:00 +02:00
  • 142b8a8c3b Fix error in Jenkinsfile Ai Lin Chia 2017-06-21 13:46:20 +02:00
  • 3f9311ad35 Fix error in Jenkinsfile Ai Lin Chia 2017-06-21 13:46:20 +02:00
  • cbf2fff3ac Fix unit test Ai Lin Chia 2017-06-21 13:26:50 +02:00
  • 3dae03da28 Add dirty flag to ced submodule Ai Lin Chia 2017-06-21 12:59:44 +02:00
  • 4b883ad60a Merge branch 'dev-language' into dev-encoding Ai Lin Chia 2017-06-21 12:59:18 +02:00
  • 6f6c69b6d8 Merge branch 'master' into dev-language Ai Lin Chia 2017-06-21 12:56:39 +02:00
  • a3aaf89837 Don't send slack 'Back to normal' message with first build of the branch Ai Lin Chia 2017-06-21 12:08:02 +02:00
  • 5f42cc3150 Don't send slack 'Back to normal' message with first build of the branch Ai Lin Chia 2017-06-21 12:08:02 +02:00
  • 5ee8ebede9 First use of libced to detect document encoding as last resort Ai Lin Chia 2017-06-21 12:04:36 +02:00
  • 88e6e4b039 Build & link ced to gb Ai Lin Chia 2017-06-21 11:56:11 +02:00
  • f7b14a07dd Merge branch 'master' into dev-encoding Ai Lin Chia 2017-06-21 11:49:33 +02:00
  • 387cf217ef Fix typo. Use JUnitType instead of JunitType Ai Lin Chia 2017-06-21 10:23:59 +02:00
  • add5759062 Process system test output xml file Ai Lin Chia 2017-06-21 10:20:30 +02:00
  • e27c695bfb Lets try to run system test Ai Lin Chia 2017-06-20 15:56:35 +02:00
  • 83772c9f1b Add cleantest target to clean between pywebtest test runs Ai Lin Chia 2017-06-19 16:27:10 +02:00
  • 6cc880882d Change pywebserver to pywebtest to reflect repo name Ai Lin Chia 2017-06-19 15:01:41 +02:00
  • cf87cf11a6 Add substring match to botname match Ai Lin Chia 2017-06-19 15:00:26 +02:00
  • f47a53820a Cater for json format when requesting 'admin/spiderdb' Ai Lin Chia 2017-06-19 14:29:14 +02:00
  • 69aa997443 Remove unused/noop variable/function from SpiderColl Ai Lin Chia 2017-06-19 14:28:02 +02:00
  • f319ce7834 Remove currentTimeUTC and add processStartTime for admin/status page in JSON Ai Lin Chia 2017-06-19 14:13:56 +02:00
  • d83191a22a Fix default score value if no single terms were scored, and all terms are not special (e.g. numbers). Would happen for documents matching bigrams only, and not single terms, which could happen when searching for "bridget jones" and a document has the text "bridgetjon es" as the only match (bigram). Brian Rasmusson 2017-06-18 23:46:46 +02:00
  • 4208772687 Fix default score value if no single terms were scored, and all terms are not special (e.g. numbers). Would happen for documents matching bigrams only, and not single terms, which could happen when searching for "bridget jones" and a document has the text "bridgetjon es" as the only match (bigram). Brian Rasmusson 2017-06-18 23:46:46 +02:00
  • 1c1e2a6984 revert changes from commit d2c1430da8 to make gbsortbyint work again Brian Rasmusson 2017-06-18 10:21:13 +02:00
  • d45b7eecbd revert changes from commit d2c1430da8 to make gbsortbyint work again Brian Rasmusson 2017-06-18 10:21:13 +02:00
  • 73a64b789f Remove unused spider status Ai Lin Chia 2017-06-16 16:58:39 +02:00
  • 2baadc1071 Fix abort when spider debug log is enabled Ai Lin Chia 2017-06-16 13:43:51 +02:00
  • df4a1e5792 Better encapsulation of PosdbTable Ivan Skytte Jørgensen 2017-06-16 15:25:49 +02:00
  • f520bf7a8e scoreMatrix is a m_numQueryTermInfos*m_numQueryTermInfos matrix and not a m_nqt*m_nqt matrix. Fix indexing Ivan Skytte Jørgensen 2017-06-15 17:04:43 +02:00
  • f7a5d47359 Add placeholder to run system test Ai Lin Chia 2017-06-15 15:19:03 +02:00
  • d3358b78ea Only send slack message for failure/unstable and when we return to successful. Don't send for every successful build Ai Lin Chia 2017-06-15 11:49:40 +02:00
  • a259028bb5 Let's checkout pywebserver as well Ai Lin Chia 2017-06-14 17:16:34 +02:00
  • 28b259d001 Failing test sets result as unstable Ai Lin Chia 2017-06-14 16:56:42 +02:00
  • 0d7fad169a Allow longer working directory Ai Lin Chia 2017-06-14 16:38:53 +02:00
  • 9e788fb80d Fixing error (hopefully) Ai Lin Chia 2017-06-14 16:23:07 +02:00
  • e13af3ec3b Use relativeTargetDir in preparation of multiple repo checkout Ai Lin Chia 2017-06-14 15:52:36 +02:00
  • 188c391db9 Add notification for changed block. Updates needed to notify changed status Ai Lin Chia 2017-06-14 14:32:06 +02:00
  • 0df469210f Empty steps not allowed in Jenkinsfile Ai Lin Chia 2017-06-14 14:19:41 +02:00
  • 06d6db90b8 First step of adding slack notification Ai Lin Chia 2017-06-14 14:18:24 +02:00
  • 39d187df02 Fix typo in Jenkinsfile Ai Lin Chia 2017-06-14 12:31:20 +02:00
  • 0d51df8e92 Add xunit publisher for unit test Ai Lin Chia 2017-06-14 12:23:01 +02:00
  • 8fd5baf49a Increase MAX_FILENAME_LEN to allow for deeper path (eg: for jenkins) Ai Lin Chia 2017-06-14 12:07:49 +02:00
  • 9105c58d22 Fix commit 3b75814945 where sometimes collection is not saved Ai Lin Chia 2017-06-14 11:58:58 +02:00
  • 7f1553b3b3 Don't do default checkout. We have a checkout step that initialize submodules Ai Lin Chia 2017-06-14 11:36:23 +02:00