Commit Graph

35 Commits

Author SHA1 Message Date
5b92b5f6d5 now term freqs are almost exact for qatest123.
sometimes an off by 1 bug. we should really call
msg5 to get the list w/o thread and get a truly
exact term freq for qatest123 for consistency.
that would be in Posdb.cpp::getTermFreq()
2014-11-25 15:54:15 -07:00
ea67c688b9 fixed a couple really nasty mem leak bugs from new facet code 2014-11-25 11:00:27 -07:00
4e8a42e024 text replacements for bad int32_t substitutions 2014-11-17 18:24:38 -08:00
931a1c4bc6 good checkpoint. quite a few fixes. 2014-11-17 18:13:36 -08:00
4c19453ea9 working with -m32 for basic testing.
compiles for 64-bit.
2014-11-12 11:38:37 -08:00
96b8197ad3 now it compiles with -m32 2014-11-10 14:45:11 -08:00
e7dd8f7956 replace long long with int64_t 2014-10-30 13:36:39 -06:00
5c7fc3b083 fix OOM for large &n=1000000000 values when searching.
just alloc for the docids found, not the docids asked for.
2014-10-09 11:35:35 -07:00
a9e61b5aca facet text lookup fixes. 2014-07-29 19:32:27 -07:00
842d72b5db Merge branch 'testing' into diffbot-matt 2014-07-08 09:58:54 -07:00
d7cc290a1f added a few new search parms that can be used
to override collection defaults.
hide all clustered results.
max title len.
max summary excerpt/line width.
2014-07-08 07:01:51 -07:00
d9ae010371 shard gbfacetstr:gbxpathsitehash123456 terms by termid for speed.
got them working again multicasting a msg 0x39 to the appropriate shard.
set special msg39request flag for better performance for those guys.
2014-07-07 12:32:27 -07:00
7bd37dfaa2 facet updates 2014-06-28 10:26:08 -06:00
d731d17b3b fix core 2014-06-20 18:09:11 -07:00
b0e82edc93 new facet crap compiling now. 2014-06-20 12:28:50 -07:00
c314e61968 make sectiondb stats just a special case of facets 2014-06-17 16:39:02 -06:00
308f2d07f7 fixes for section info injection into squid proxied responses 2014-06-13 10:48:59 -07:00
20c4ac4205 got it marking up html now with sectiondb stats.
seems to work ok.
2014-06-12 14:42:08 -07:00
e4ce9bc9ac squidproxycache/floaters/sectiondbtagging all compiles.
need to do run-time debugging now.
2014-06-11 17:57:28 -07:00
5f16013a9e add support for stripping accent marks from greek letters. 2014-05-30 20:09:37 -07:00
2f331d55e5 widget updates 2014-05-06 10:47:57 -07:00
1d766826ae retry if too many docids deduped when &stream=1 2014-05-01 17:07:31 -07:00
e351d2a6f1 get searching on token working 2014-03-06 17:01:41 -08:00
27e8e810d2 use collnum instead of coll string.
more stable since resetting collections
keeps string the same but changes the collnum.
2014-03-06 15:48:11 -08:00
25cf0efdbf first compiled stab at multi collection searching. 2014-03-06 10:45:13 -08:00
2d4af1aefe index numbers as integers too, not just floats
so we can sort by spider date without losing
128 seconds of resolution.
2014-02-06 20:57:54 -08:00
8a49e87a61 got code with shard rebalancing compiling.
now we store a "sharded by termid" bit in posdb
key for checksums, etc keys that are not sharded
by docid. save having to do disk seeks on every
host in the cluster to do a dup check, etc.
2014-01-11 16:08:42 -08:00
44ae7c4de6 mem labelling fixes.
fixed bad alloc when generating gigabits.
2013-12-09 14:05:02 -07:00
5e4b5a112c Merge branch 'master' into diffbot
Conflicts:

	PageResults.cpp
	Threads.cpp
	XmlDoc.cpp
	XmlDoc.h
2013-12-07 11:34:26 -07:00
5da41cd113 fix a couple different cores. 2013-11-24 19:46:44 -07:00
fe97e08281 move from groups to shards. got rid of annoying
groupid bit mask thing.
2013-10-04 16:18:56 -07:00
107037c6a2 new &sites=xyz.com+abc.com+... functionality compiles ok. 2013-09-15 18:14:32 -06:00
b684414e16 almost done adding support for whitelists.
i.e. list of sites to restrict search results to,
for instance.
2013-09-15 15:15:56 -06:00
aaf333c46c try to get family filter (&ff=1) working again
to filter out adult search results.
2013-09-01 18:22:38 -06:00
f6e560c1f4 Initial file population. 2013-08-02 13:12:24 -07:00
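
Commit 2d4af1aefe above mentions losing 128 seconds of resolution when a spider date is indexed only as a float. The following is a minimal sketch of where that figure plausibly comes from. It assumes the float is a 32-bit IEEE 754 single and the date is a Unix timestamp in seconds; the commit message states neither. Timestamps from that period fall between 2^30 and 2^31, where consecutive 32-bit floats are 128 apart. The helper names and the sample timestamp are hypothetical and not taken from the repository.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Gap between x and the next representable 32-bit float above it.
// For x in [2^30, 2^31) the 24-bit significand leaves 2^(30-23) = 128.
inline float floatSpacingAt(float x) {
    assert(std::isfinite(x));
    return std::nextafter(x, std::numeric_limits<float>::infinity()) - x;
}

// Store a timestamp (seconds) in a 32-bit float and read it back.
// Hypothetical helper: it only illustrates the rounding, not how
// the indexer actually stores its keys.
inline int64_t roundTripThroughFloat(int64_t seconds) {
    return static_cast<int64_t>(static_cast<float>(seconds));
}
```

Under these assumptions, floatSpacingAt(1400000000.0f) is 128, and a timestamp such as 1400000001 reads back as 1400000000, so two pages spidered within the same 128-second window can become indistinguishable when sorted by a float key. Indexing the same value as an integer as well avoids that rounding.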