5b92b5f6d5
now term freqs are almost exact for qatest123.
sometimes an off by 1 bug. we should really call
msg5 to get the list w/o thread and get a truly
exact term freq for qatest123 for consistency.
that would be in Posdb.cpp::getTermFreq()
2014-11-25 15:54:15 -07:00
ea67c688b9
fixed a couple really nasty mem leak bugs from new facet code
2014-11-25 11:00:27 -07:00
4e8a42e024
text replacements for bad int32_t substitutions
2014-11-17 18:24:38 -08:00
931a1c4bc6
good checkpoint. quite a few fixes.
2014-11-17 18:13:36 -08:00
4c19453ea9
working with -m32 for basic testing.
compiles for 64-bit.
2014-11-12 11:38:37 -08:00
96b8197ad3
now it compiles with -m32
2014-11-10 14:45:11 -08:00
e7dd8f7956
replace long long with int64_t
2014-10-30 13:36:39 -06:00
5c7fc3b083
fix OOM for large &n=1000000000 values when searching.
just alloc for the docids found, not the docids asked for.
2014-10-09 11:35:35 -07:00
a9e61b5aca
facet text lookup fixes.
2014-07-29 19:32:27 -07:00
842d72b5db
Merge branch 'testing' into diffbot-matt
2014-07-08 09:58:54 -07:00
d7cc290a1f
added a few new search parms that can be used
to override collection defaults.
hide all clustered results.
max title len.
max summary excerpt/line width.
2014-07-08 07:01:51 -07:00
d9ae010371
shard gbfacetstr:gbxpathsitehash123456 terms by termid for speed.
got them working again multicasting a msg 0x39 to the appropriate shard.
set special msg39request flag for better performance for those guys.
2014-07-07 12:32:27 -07:00
7bd37dfaa2
facet updates
2014-06-28 10:26:08 -06:00
d731d17b3b
fix core
2014-06-20 18:09:11 -07:00
b0e82edc93
new facet crap compiling now.
2014-06-20 12:28:50 -07:00
c314e61968
make sectiondb stats just a special case of facets
2014-06-17 16:39:02 -06:00
308f2d07f7
fixes for section info injection into squid proxied responses
2014-06-13 10:48:59 -07:00
20c4ac4205
got it marking up html now with sectiondb stats.
seems to work ok.
2014-06-12 14:42:08 -07:00
e4ce9bc9ac
squidproxycache/floaters/sectiondbtagging all compiles.
need to do run-time debugging now.
2014-06-11 17:57:28 -07:00
5f16013a9e
add support for stripping accent marks from greek letters.
2014-05-30 20:09:37 -07:00
2f331d55e5
widget updates
2014-05-06 10:47:57 -07:00
1d766826ae
retry if too many docids deduped when &stream=1
2014-05-01 17:07:31 -07:00
e351d2a6f1
get searching on token working
2014-03-06 17:01:41 -08:00
27e8e810d2
use collnum instead of coll string.
more stable since resetting collections
keeps string the same but changes the collnum.
2014-03-06 15:48:11 -08:00
25cf0efdbf
first compiled stab at multi collection searching.
2014-03-06 10:45:13 -08:00
2d4af1aefe
index numbers as integers too, not just floats
so we can sort by spider date without losing
128 seconds of resolution.
2014-02-06 20:57:54 -08:00
8a49e87a61
got code with shard rebalancing compiling.
now we store a "sharded by termid" bit in posdb
key for checksums, etc keys that are not sharded
by docid. save having to do disk seeks on every
host in the cluster to do a dup check, etc.
2014-01-11 16:08:42 -08:00
44ae7c4de6
mem labelling fixes.
fixed bad alloc when generating gigabits.
2013-12-09 14:05:02 -07:00
5e4b5a112c
Merge branch 'master' into diffbot
Conflicts:
PageResults.cpp
Threads.cpp
XmlDoc.cpp
XmlDoc.h
2013-12-07 11:34:26 -07:00
5da41cd113
fix a couple different cores.
2013-11-24 19:46:44 -07:00
fe97e08281
move from groups to shards. got rid of annoying
groupid bit mask thing.
2013-10-04 16:18:56 -07:00
107037c6a2
new &sites=xyz.com+abc.com+... functionality compiles ok.
2013-09-15 18:14:32 -06:00
b684414e16
almost done adding support for whitelists.
i.e. list of sites to restrict search results to,
for instance.
2013-09-15 15:15:56 -06:00
aaf333c46c
try to get family filter (&ff=1) working again
to filter out adult search results.
2013-09-01 18:22:38 -06:00
f6e560c1f4
Initial file population.
2013-08-02 13:12:24 -07:00