2fd511f002
updates
Matt Wells
2014-12-16 17:09:25 -08:00
d4179634a1
crc fixes
Matt Wells
2014-12-16 16:38:54 -08:00
730b131bbf
added new indicators so we can make gb more stable. now hosts table reports # ooms, disk read corruptions, closed sockets from overloads, and the # of outstanding spiders. made ping request a class so we can easily add new indicators.
Matt
2014-12-16 16:22:50 -08:00
6c5ca9162c
quick fix for internal ip bug
Matt
2014-12-16 13:39:09 -08:00
27db9d57a1
added undeletable posdb key test to qainject1(). caught an undeletable rec and fixed that in xmldoc.cpp.
Matt
2014-12-16 13:29:04 -08:00
80a052a259
help doc update
Matt
2014-12-16 10:30:30 -08:00
f57b2e0ab5
always restrict to seed domains.
Matt
2014-12-15 17:46:13 -08:00
c8c56a24da
fixed query reindex for diffbot json docs. added recycle content checkbox to query reindex. fix gbsortbyint: at end of query core. only show 'all spiders paused' msg for active jobs. show error summaries if doc not found and &showerrors=1.
Matt Wells
2014-12-15 16:49:20 -08:00
07b0c7d29c
Merge branch 'diffbot-testing' into testing
Matt
2014-12-15 10:20:59 -08:00
14e104b68c
default ask for gzip to off for stability. no oom errors.
Matt
2014-12-14 16:48:29 -08:00
41fecf9424
deactivate rebuild tool and profiler (broken in 64-bit). query reindex can replace rebuild tool.
Matt
2014-12-14 16:44:27 -08:00
531ab85144
Merge branch 'diffbot' into diffbot-testing
Matt
2014-12-14 16:38:27 -08:00
15c802f0aa
Merge branch 'diffbot-testing' into testing
Matt
2014-12-14 16:38:03 -08:00
cb10eade5d
fix freezeup
Matt Wells
2014-12-13 13:34:47 -08:00
584d48edc7
be able to turn off getting of link info for faster rebuild of GI.
Matt Wells
2014-12-13 13:17:26 -08:00
eb249c1380
Merge branch 'diffbot' into diffbot-testing
Matt Wells
2014-12-12 06:52:02 -08:00
8e6ae1f202
quick fix to also accumulate the avg/min/max stats from every shard.
Matt Wells
2014-12-12 06:51:34 -08:00
578cde9d9d
fix sections.cpp to not set root title section to tagid TAG_TITLE.
Matt
2014-12-11 19:54:33 -08:00
690cc7cfc6
fix merge
Matt
2014-12-11 18:28:02 -08:00
bd926a69b5
Merge branch 'diffbot' into diffbot-testing
Matt
2014-12-11 18:26:30 -08:00
b89f071f7c
quite a few bug fixes from adding the new query syntax qa test.
Matt
2014-12-11 18:24:28 -08:00
c52aec9c3e
emergency fix
Matt Wells
2014-12-11 16:12:33 -08:00
9a3489773d
query syntax updates
Matt
2014-12-11 14:37:30 -08:00
c8f305468f
fix facet bug
Matt
2014-12-11 13:13:50 -08:00
a0e887467e
fix facet link bug for gbmin/gbmax range query
Matt Wells
2014-12-11 12:36:30 -08:00
daebad6e79
print avg,min,max in facet stats in xml/json output
Matt Wells
2014-12-11 12:28:04 -08:00
cd7ce466a1
facet work
Matt
2014-12-11 12:09:06 -08:00
d19ee6ceea
Merge branch 'diffbot' into diffbot-testing
Matt Wells
2014-12-11 08:40:55 -08:00
7d67f104fb
emergency fixes
Matt Wells
2014-12-11 08:39:26 -08:00
27df9a4276
link text extraction fixes
Matt
2014-12-11 06:52:14 -08:00
4f71a95da5
reinstate linkdb min files to merge parm.
mwells
2014-12-11 07:20:15 -07:00
19f118dddf
Merge branch 'diffbot-testing' into testing
Matt Wells
2014-12-10 14:08:10 -08:00
e43365fc70
more pthread_t pid_t fixes
Matt
2014-12-10 14:06:17 -08:00
08169a5562
makefile minor change
Matt
2014-12-10 13:29:59 -08:00
feed7d5b3c
pthread_t pid_t compatibility fixes
Matt
2014-12-10 13:15:26 -08:00
619e980a97
try to fix pthread_t pid_t issues on Threads.cpp
Matt
2014-12-10 12:13:25 -08:00
329f004e74
compiler updates
Matt
2014-12-10 12:09:04 -08:00
d8ba619df3
makefile updates
Matt
2014-12-10 11:53:46 -08:00
ecf486ffb8
Merge branch 'diffbot-testing' into testing
Matt
2014-12-10 11:27:47 -08:00
1de49af4e6
add query term info into json output as well
Matt
2014-12-10 11:27:30 -08:00
44eddd63e8
fix signed/unsigned bug
Matt
2014-12-10 11:04:37 -08:00
3ee6f7149f
Merge branch 'diffbot-testing' into testing
Matt
2014-12-10 11:03:12 -08:00
6b2e714964
try to fix core when generating statsdb graph
Matt Wells
2014-12-10 11:01:50 -08:00
c96b24f39d
try to fix core on #0 and #16 from empty query. if empty query or n<=0 and &stream=1 then fix bug that was not sending back the reply properly. basically, disable streaming if msg40 would not block. upped MAX_SHARDS from 128 to 1024. should not take up any more mem really or slow things down.
Matt Wells
2014-12-10 10:44:01 -08:00
febb1d4658
print pretty floats in the facets menu, whether printing a single float or a range of floats.
Matt Wells
2014-12-09 17:17:12 -08:00
720517c2f5
fix facet range lists
Matt Wells
2014-12-09 16:51:14 -08:00
d0bed16be5
fix typo in syntax.html page
Matt Wells
2014-12-09 14:15:00 -08:00
b218bc403d
fix atotime1() output on json "date": field to restrict to 32-bit min/max for time_t's that are beyond 32 bits. so we truncate to min/max. later: add another termlist to add more date coverage. would be useful for searching for big numeric ranges, too, more than 32-bits.
Matt Wells
2014-12-09 13:40:34 -08:00
82a2a7d18e
Merge branch 'diffbot-testing' into testing
Matt
2014-12-08 10:41:40 -08:00
dfce03eca8
fix printing of "next 10" link
Matt
2014-12-08 09:55:16 -08:00
0460335861
more permission system updates
Matt
2014-12-08 09:49:17 -08:00
2670cfd2f0
Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing
Matt
2014-12-08 09:48:51 -08:00
59c4db704a
gave pwd/ip security text areas fewer rows. send empty page in Parms.cpp if trying to access admin page and no pwd/ip and cloud user support not enabled.
Matt
2014-12-08 09:47:38 -08:00
40f80cd559
Merge branch 'diffbot-testing' into testing
Matt
2014-12-08 09:40:55 -08:00
5a844068fe
fix cores in top tree with last commit. this one speeds things up greatly. don't scan scoreinfo buf for every docid we add to top tree if scoreinfobuf has plenty of space. later we'll have to be more clever about removing things from scoreinfobuf if it comes down to that.
Matt Wells
2014-12-08 09:29:21 -08:00
4fbb2443b5
Revert "Revert "emergency fix so ppl can download large # objects in json""
Matt Wells
2014-12-08 07:40:44 -08:00
aaa5b34126
Revert "emergency fix so ppl can download large # objects in json"
Matt Wells
2014-12-08 07:36:31 -08:00
c692b54bfd
emergency fix so ppl can download large # objects in json
Matt Wells
2014-12-08 07:13:00 -08:00
2c5f6daca2
bring back posdb min files to merge again so we can set it high when quickly building index
mwells
2014-12-07 08:14:47 -07:00
f3195c7eda
make gb start do keepalive start again
Matt Wells
2014-12-06 13:26:45 -08:00
048fbfe60f
fix gb start cmd
Matt Wells
2014-12-06 13:19:13 -08:00
b51d19a88c
Merge branch 'diffbot-testing' into diffbot
Matt Wells
2014-12-06 13:09:55 -08:00
559ef067c5
fix core from langid too big in pageresults.cpp
Matt Wells
2014-12-06 13:09:30 -08:00
840ca9b091
Merge branch 'diffbot-testing' into diffbot
Matt Wells
2014-12-06 11:16:30 -08:00
845c9d8a13
version update in makefile
Matt Wells
2014-12-06 11:15:34 -07:00
b38cc19a40
dont print query term info unless header bit set
Matt Wells
2014-12-06 09:36:57 -08:00
41c8817bdb
fixed summary initialization error of the flags buffer. fixed term freq algo. use exact term freq for qatest123. made Summary.o -O3 again. fix gbsystem() to disable both timers.
Matt
2014-12-06 10:14:48 -07:00
01d61d5427
remove type long, replace with int32_t
Matt
2014-12-05 08:55:22 -07:00
76c32bb741
indentation cleanups
Matt
2014-12-05 08:54:27 -07:00
7c57283b88
fix tld lang url filter. was being reset.
mwells
2014-12-04 14:34:08 -07:00
090c18f59d
show how long it took in html serps
mwells
2014-12-04 14:22:40 -07:00
2ccb4f5c69
show spider req/repl sizes added to spiderdb in logIt(). make 'make' by itself work on 32-bit archs again.
mwells
2014-12-04 13:57:42 -07:00
2021919d8c
if it's a diffbot crawl/bulk job then do not use linkdb, to save disk space.
Matt Wells
2014-12-04 13:25:10 -07:00
d825e64d3b
on bad hint offset do not core, just return corrupt data errno.
Matt Wells
2014-12-04 12:15:58 -08:00
fd33997716
print query info in json, too, not just xml
Matt Wells
2014-12-04 13:03:18 -07:00
dc306858cc
nomenclature change
Matt Wells
2014-12-04 11:02:54 -07:00
0331363893
show language query synonym terms came from in the xml/json feed.
Matt Wells
2014-12-04 10:57:01 -07:00
832392887c
do not spam the logs with spider request corrupt count msgs. but store a count for them now in coll rec.
Matt Wells
2014-12-04 10:00:13 -07:00
4894bf51ce
fix core
Matt Wells
2014-12-03 14:08:18 -08:00
ca8194d9b0
when rebuilding posdb do not rebuild for spiderdb
mwells
2014-12-03 11:32:22 -07:00
12d1477135
fix another 64bit conversion bug for synonyms
Matt Wells
2014-12-03 07:45:27 -08:00
654084f557
fix 64bit conversion bug. realloc offset should have been 64bit not 32bit in Linkdb.cpp.
Matt Wells
2014-12-03 07:35:14 -08:00
18583234c4
do not show discussions, etc. tabs yet
mwells
2014-12-02 21:51:20 -07:00
144c488a4e
do not force -m64 on a 32-bit system
mwells
2014-12-02 21:45:52 -07:00
0f2ee3edcf
fix core from doing rebuild of posdb
mwells
2014-12-02 21:28:31 -07:00
251f7d2f22
fix core when removing row from url filters table. tail safebuf was not detaching buf. clear all of memtable on startup, use sizeof(char) not 4. fix m_memtablesize since it can't be based on m_maxMem because g_hostdb inits before g_conf.m_maxMem and calls Mem::addMem()
Matt Wells
2014-12-02 16:17:06 -08:00
3c92fd6916
fix Inlink accessor function core. don't use off_urlBuf, etc. any more just use size_urlBuf, etc. now for better backwards compatibility.
mwells
2014-12-02 10:26:10 -07:00
7508deb38f
Merge branch 'diffbot-testing' into testing
mwells
2014-12-02 09:47:11 -07:00
abfa9a500e
mem fix
Matt Wells
2014-12-02 07:09:57 -08:00
a1d673936f
fix some final issues with 64bit stuff
Matt Wells
2014-12-02 06:48:56 -08:00
ac685eb4b8
fix rdbcache init core
Matt Wells
2014-12-01 12:37:51 -08:00
1b347188a3
fix makefile to use -m64 by default.
mwells
2014-12-01 11:44:41 -07:00