255 Commits

Author SHA1 Message Date
0302452f25 Merge branch 'master' into sto 2017-12-11 15:01:07 +01:00
6ff1f91726 Moved ranking/scoring parameters into a single struct BaseScoringParameters
The ranking parameters/weights/flags/etc were spread over many variables and difficult to keep track of. Collected into a single struct for easier overview.
2017-12-11 14:44:58 +01:00
6e2fac79d5 Merge branch 'master' into sto 2017-12-08 12:56:16 +01:00
aa48008d11 Renamed members ..scoringWeights to derievedScoringWeights 2017-12-08 12:53:58 +01:00
6f32f30527 Removed commented-out members of Msg39Request 2017-12-08 11:38:38 +01:00
542bb564a6 Merge branch 'master' into sto 2017-12-08 11:31:47 +01:00
267daa0e96 Made siterankmultipler configurable 2017-12-07 15:34:11 +01:00
0720bafd26 Merge branch 'master' into sto 2017-12-01 18:56:59 +01:00
b47884890d Removed write-only QueryTerm::m_ks 2017-12-01 18:52:46 +01:00
2520a07fe5 Removed unused QueryWord::m_hardCount and QueryTerm::m_hardCount 2017-12-01 18:42:01 +01:00
603c724d9e Removed QueryWord/Term::m_implicitBits/m_matchesExplicitBits/m_explicitBit and MAX_EXPLICIT_BITS
The fields were set and some calculations were done on them, but the results led nowhere, so the fields were effectively unused.
2017-12-01 18:35:21 +01:00
3d5318c77a Word variations seem to work now 2017-11-27 16:51:50 +01:00
2c64859421 Made wordvariation stuff configurable. Made wordvariation-da handle indefinite->definite variations 2017-11-24 16:43:19 +01:00
36fd2c2615 Made lang-spec word variations configurable 2017-11-24 14:19:50 +01:00
f9ee20cae9 Renamed m_queryExpansion to m_wiktionaryWordVariations 2017-11-24 12:47:11 +01:00
cad25a18e2 Dropped unused member QueryTerm::m_numAlnumWordsInBase 2017-11-23 12:41:19 +01:00
56694dddc4 cleanup/simplify msg39 term debug log 2017-11-21 16:50:11 +01:00
128790084b Use lang_t enum more than just plain uint8_t 2017-11-21 16:18:45 +01:00
d36a6a4e39 disable site clustering when doing domain-like searches (configurable) 2017-11-14 14:50:30 +01:00
c69273219d Check allowHighFrequencyTermCache flag before using g_htfs 2017-10-05 15:56:18 +02:00
8a72b10b33 Made bigram weight configurable 2017-09-01 13:57:46 +02:00
3a16202418 Changed how and when QueryTermInfo::m_termFreqWeight is set 2017-07-07 17:20:15 +02:00
22d2a83d34 Rewrite API-like queries 2017-07-07 14:19:45 +02:00
b969a864e6 Made domain-like query rewrite configurable 2017-07-07 13:49:31 +02:00
c3013411c5 Query rewriting: domains
Queries of the form aaa.bbb.ccc or aaa.bbb are interpreted as follows: all terms and bigrams are required, and matches in URLs are given a boost.
2017-07-04 15:14:53 +02:00
ba0856caaa Make PosdbTable::allocateTopTree estimate correctly
The record count / toptree size was estimated based solely on the first file (e.g. posdb0001). Changed to make a correct estimate based on all files.
2017-06-26 12:21:48 +02:00
4f78bd0218 Fixed reallocation of toptree
The TopTree doesn't support resizing/reallocation, so don't do that, except when the toptree size is 0, in which case we should try to size the tree.
2017-06-23 14:20:58 +02:00
df4a1e5792 Better encapsulation of PosdbTable 2017-06-16 15:25:49 +02:00
490ee03383 Check for empty toptree in PosdbTable::intersectLists()
If allocateTopTree() resulted in a zero-sized toptree then intersectLists() should return early.
2017-06-12 16:56:46 +02:00
45e443bd1c Fix allocation of working-area in PosdbTable::intersectLists10_r()
The local arrays were allocated for num-query-terms but what was really needed was num-query-term-infos. Changed to plain std::vector<> arrays, and the call to intersectLists() is wrapped for catching std::bad_alloc. 'scoreMatrix' is still kept as a linearized version of a 2-dimensional matrix.

Not changed all the way through, so there are still some "&(vector[0])" expressions.
2017-06-12 16:14:28 +02:00
b8a340c55b Moved memory allocation from msg39 processing thread to intersection thread
toptree, scoringinfo, whitelisttable and queryterminfo were allocated in the msg39/coordinator thread due to threads not being able to allocate memory in the good old days.
Allocations moved to PosdbTable::intersectLists10_r(), which makes it more likely the memory is near that thread.
2017-06-09 16:48:45 +02:00
2ca828369b Made PosdbTable::m_topTree private 2017-06-09 13:55:36 +02:00
5574aba451 Removed unused PosdbTable::m_logstate 2017-06-09 13:37:53 +02:00
f5177a366a Whitespace changes 2017-05-15 17:39:17 +02:00
f012b7bfa2 Remove unused maxCacheAge & addToCache from Msg0 & Msg51 2017-05-08 16:01:19 +02:00
82e7aeda9a Work around disappearing collections in Msg39 (could have referenced a NULL ptr) 2017-05-08 16:00:04 +02:00
1e5aa38045 Remove commented out code 2017-05-08 13:10:37 +02:00
6494ae8a08 Merge remote-tracking branch 'origin/master' into nomerge2 2017-04-30 21:09:52 +02:00
aadcceca90 Made termFreqWeight and frequency configurable and overridable. Made it possible to use other weights in a frontend UI and pass them to GB, and have them converted to internal values by prefixing the cgi param with fxui_. Synchronized cgi-parm names. 2017-04-30 20:23:09 +02:00
238c36c019 split language match boost and unknown language boost into two different ranking weights 2017-04-29 20:24:00 +02:00
94bb71587c Merge remote-tracking branch 'origin/master' into nomerge2 2017-04-28 19:21:38 +02:00
1e6451ed0c made page temp min/max weight configurable and overridable 2017-04-28 16:43:14 +02:00
6c75ce7c49 Merge branch 'master' into nomerge2 2017-03-02 13:30:54 +01:00
5de8c9f29c Log with ERR when detecting query inconsistencies 2017-03-02 13:10:50 +01:00
64bc15158b Merge branch 'master' into nomerge2 2017-02-20 20:39:50 +01:00
f29c6aaf8b Bugfix: clustering allocating 0 records and expecting non-null pointer from mmalloc() 2017-02-20 20:39:33 +01:00
eab379f3b5 Merge branch 'master' into nomerge2 2017-02-20 16:09:13 +01:00
63bae5c001 Made QueryTerm::m_qword const 2017-02-20 15:12:17 +01:00
0ba68cedd5 Merge remote-tracking branch 'origin/master' into nomerge2 2017-02-18 21:12:37 +01:00
0cd90b5cad Encapsulated Query::m_orig 2017-02-15 22:18:18 +01:00