d4179634a1crc fixes
Matt Wells
2014-12-16 16:38:54 -08:00
730b131bbfadded new indicators so we can make gb more stable. now hosts table reports # ooms, disk read corruptions, closed sockets from overloads, and # of outstanding spiders. made ping request a class so we can easily add new indicators.
Matt
2014-12-16 16:22:50 -08:00
6c5ca9162cquick fix for internal ip bug
Matt
2014-12-16 13:39:09 -08:00
27db9d57a1added undeletable posdb key test to qainject1(). caught an undeletable rec and fixed that in xmldoc.cpp.
Matt
2014-12-16 13:29:04 -08:00
80a052a259help doc update
Matt
2014-12-16 10:30:30 -08:00
f57b2e0ab5always restrict to seed domains.
Matt
2014-12-15 17:46:13 -08:00
c8c56a24dafixed query reindex for diffbot json docs. added recycle content checkbox to query reindex. fix gbsortbyint: at end of query core. only show 'all spiders paused' msg for active jobs. show error summaries if doc not found and &showerrors=1.
Matt Wells
2014-12-15 16:49:20 -08:00
07b0c7d29cMerge branch 'diffbot-testing' into testing
Matt
2014-12-15 10:20:59 -08:00
14e104b68cdefault ask for gzip to off for stability. no oom errors.
Matt
2014-12-14 16:48:29 -08:00
41fecf9424deactivate rebuild tool and profiler (broken in 64-bit). query reindex can replace rebuild tool.
Matt
2014-12-14 16:44:27 -08:00
531ab85144Merge branch 'diffbot' into diffbot-testing
Matt
2014-12-14 16:38:27 -08:00
15c802f0aaMerge branch 'diffbot-testing' into testing
Matt
2014-12-14 16:38:03 -08:00
cb10eade5dfix freezeup
Matt Wells
2014-12-13 13:34:47 -08:00
584d48edc7be able to turn off getting of link info for faster rebuild of GI.
Matt Wells
2014-12-13 13:17:26 -08:00
eb249c1380Merge branch 'diffbot' into diffbot-testing
Matt Wells
2014-12-12 06:52:02 -08:00
8e6ae1f202quick fix to also accumulate the avg/min/max stats from every shard.
Matt Wells
2014-12-12 06:51:34 -08:00
578cde9d9dfix sections.cpp to not set root title section to tagid TAG_TITLE.
Matt
2014-12-11 19:54:33 -08:00
690cc7cfc6fix merge
Matt
2014-12-11 18:28:02 -08:00
bd926a69b5Merge branch 'diffbot' into diffbot-testing
Matt
2014-12-11 18:26:30 -08:00
b89f071f7cquite a few bug fixes from adding the new query syntax qa test.
Matt
2014-12-11 18:24:28 -08:00
c52aec9c3eemergency fix
Matt Wells
2014-12-11 16:12:33 -08:00
9a3489773dquery syntax updates
Matt
2014-12-11 14:37:30 -08:00
c8f305468ffix facet bug
Matt
2014-12-11 13:13:50 -08:00
a0e887467efix facet link bug for gbmin/gbmax range query
Matt Wells
2014-12-11 12:36:30 -08:00
daebad6e79print avg,min,max in facet stats in xml/json output
Matt Wells
2014-12-11 12:28:04 -08:00
cd7ce466a1facet work
Matt
2014-12-11 12:09:06 -08:00
d19ee6ceeaMerge branch 'diffbot' into diffbot-testing
Matt Wells
2014-12-11 08:40:55 -08:00
7d67f104fbemergency fixes
Matt Wells
2014-12-11 08:39:26 -08:00
27df9a4276link text extraction fixes
Matt
2014-12-11 06:52:14 -08:00
4f71a95da5reinstate linkdb min files to merge parm.
mwells
2014-12-11 07:20:15 -07:00
19f118dddfMerge branch 'diffbot-testing' into testing
Matt Wells
2014-12-10 14:08:10 -08:00
e43365fc70more pthread_t pid_t fixes
Matt
2014-12-10 14:06:17 -08:00
08169a5562makefile minor change
Matt
2014-12-10 13:29:59 -08:00
feed7d5b3cpthread_t pid_t compatibility fixes
Matt
2014-12-10 13:15:26 -08:00
619e980a97try to fix pthread_t pid_t issues on Threads.cpp
Matt
2014-12-10 12:13:25 -08:00
329f004e74compiler updates
Matt
2014-12-10 12:09:04 -08:00
d8ba619df3makefile updates
Matt
2014-12-10 11:53:46 -08:00
ecf486ffb8Merge branch 'diffbot-testing' into testing
Matt
2014-12-10 11:27:47 -08:00
1de49af4e6add query term info into json output as well
Matt
2014-12-10 11:27:30 -08:00
44eddd63e8fix signed/unsigned bug
Matt
2014-12-10 11:04:37 -08:00
3ee6f7149fMerge branch 'diffbot-testing' into testing
Matt
2014-12-10 11:03:12 -08:00
6b2e714964try to fix core when generating statsdb graph
Matt Wells
2014-12-10 11:01:50 -08:00
c96b24f39dtry to fix core on #0 and #16 from empty query. if empty query or n<=0 and &stream=1 then fix bug that was not sending back the reply properly. basically, disable streaming if msg40 would not block. upped MAX_SHARDS from 128 to 1024. should not take up any more mem really or slow things down.
Matt Wells
2014-12-10 10:44:01 -08:00
febb1d4658print pretty floats in the facets menu, whether printing a single float or a range of floats.
Matt Wells
2014-12-09 17:17:12 -08:00
720517c2f5fix facet range lists
Matt Wells
2014-12-09 16:51:14 -08:00
d0bed16be5fix typo in syntax.html page
Matt Wells
2014-12-09 14:15:00 -08:00
b218bc403dfix atotime1() output on json "date": field to restrict to 32-bit min/max for time_t's that are beyond 32 bits. so we truncate to min/max. later: add another termlist to add more date coverage. would be useful for searching for big numeric ranges, too, more than 32-bits.
Matt Wells
2014-12-09 13:40:34 -08:00
82a2a7d18eMerge branch 'diffbot-testing' into testing
Matt
2014-12-08 10:41:40 -08:00
dfce03eca8fix printing of "next 10" link
Matt
2014-12-08 09:55:16 -08:00
0460335861more permission system updates
Matt
2014-12-08 09:49:17 -08:00
2670cfd2f0Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing
Matt
2014-12-08 09:48:51 -08:00
59c4db704agave pwd/ip security text areas fewer rows. send empty page in Parms.cpp if trying to access admin page and no pwd/ip and cloud user support not enabled.
Matt
2014-12-08 09:47:38 -08:00
40f80cd559Merge branch 'diffbot-testing' into testing
Matt
2014-12-08 09:40:55 -08:00
5a844068fefix cores in top tree with last commit. this one speeds things up greatly. don't scan scoreinfo buf for every docid we add to top tree if scoreinfobuf has plenty of space. later we'll have to be more clever about removing things from scoreinfobuf if it comes down to that.
Matt Wells
2014-12-08 09:29:21 -08:00
4fbb2443b5Revert "Revert "emergency fix so ppl can download large # objects in json""
Matt Wells
2014-12-08 07:40:44 -08:00
aaa5b34126Revert "emergency fix so ppl can download large # objects in json"
Matt Wells
2014-12-08 07:36:31 -08:00
c692b54bfdemergency fix so ppl can download large # objects in json
Matt Wells
2014-12-08 07:13:00 -08:00
2c5f6daca2bring back posdb min files to merge again so we can set high when quickly building index
mwells
2014-12-07 08:14:47 -07:00
f3195c7edamake gb start do keepalive start again
Matt Wells
2014-12-06 13:26:45 -08:00
048fbfe60ffix gb start cmd
Matt Wells
2014-12-06 13:19:13 -08:00
b51d19a88cMerge branch 'diffbot-testing' into diffbot
Matt Wells
2014-12-06 13:09:55 -08:00
559ef067c5fix core from langid too big in pageresults.cpp
Matt Wells
2014-12-06 13:09:30 -08:00
840ca9b091Merge branch 'diffbot-testing' into diffbot
Matt Wells
2014-12-06 11:16:30 -08:00
845c9d8a13version update in makefile
Matt Wells
2014-12-06 11:15:34 -07:00
b38cc19a40dont print query term info unless header bit set
Matt Wells
2014-12-06 09:36:57 -08:00
41c8817bdbfixed summary initialization error of the flags buffer. fixed term freq algo. use exact term freq for qatest123. made Summary.o -O3 again. fix gbsystem() to disable both timers.
Matt
2014-12-06 10:14:48 -07:00
01d61d5427remove type long, replace with int32_t
Matt
2014-12-05 08:55:22 -07:00
76c32bb741indentation cleanups
Matt
2014-12-05 08:54:27 -07:00
2f43eb828dMerge pull request #35 from emmanuelcharon/diffbot-testing
Gigablast
2014-12-05 08:53:22 -07:00
7c57283b88fix tld lang url filter. was being reset.
mwells
2014-12-04 14:34:08 -07:00
090c18f59dshow how long it took in html serps
mwells
2014-12-04 14:22:40 -07:00
2ccb4f5c69show spider req/repl sizes added to spiderdb in logIt(). make 'make' by itself work on 32-bit archs again.
mwells
2014-12-04 13:57:42 -07:00
2021919d8cif it's a diffbot crawl/bulk job then do not use linkdb to save disk space.
Matt Wells
2014-12-04 13:25:10 -07:00
d825e64d3bon bad hint offset do not core, just return corrupt data errno.
Matt Wells
2014-12-04 12:15:58 -08:00
fd33997716print query info in json, too, not just xml
Matt Wells
2014-12-04 13:03:18 -07:00
dc306858ccnomenclature change
Matt Wells
2014-12-04 11:02:54 -07:00
0331363893show language query synonym terms came from in the xml/json feed.
Matt Wells
2014-12-04 10:57:01 -07:00
832392887cdo not spam the logs with spider request corrupt count msgs. but store a count for them now in coll rec.
Matt Wells
2014-12-04 10:00:13 -07:00
4894bf51cefix core
Matt Wells
2014-12-03 14:08:18 -08:00
ca8194d9b0when rebuilding posdb do not rebuild for spiderdb
mwells
2014-12-03 11:32:22 -07:00
12d1477135fix another 64bit conversion bug for synonyms
Matt Wells
2014-12-03 07:45:27 -08:00
654084f557fix 64bit conversion bug. realloc offset should have been 64bit not 32bit in Linkdb.cpp.
Matt Wells
2014-12-03 07:35:14 -08:00
18583234c4do not show discussions,etc. tabs yet
mwells
2014-12-02 21:51:20 -07:00
144c488a4edo not force -m64 on a 32-bit system
mwells
2014-12-02 21:45:52 -07:00
0f2ee3edcffix core from doing rebuild of posdb
mwells
2014-12-02 21:28:31 -07:00
251f7d2f22fix core when removing row from url filters table. tail safebuf was not detaching buf. clear all of memtable on startup, use sizeof(char) not 4. fix m_memtablesize since it can't be based on m_maxMem because g_hostdb inits before g_conf.m_maxMem and calls Mem::addMem()
Matt Wells
2014-12-02 16:17:06 -08:00
3c92fd6916fix Inlink accessor function core. don't use off_urlBuf, etc. any more just use size_urlBuf, etc. now for better backwards compatibility.
mwells
2014-12-02 10:26:10 -07:00
7508deb38fMerge branch 'diffbot-testing' into testing
mwells
2014-12-02 09:47:11 -07:00
abfa9a500emem fix
Matt Wells
2014-12-02 07:09:57 -08:00
a1d673936ffix some final issues with 64bit stuff
Matt Wells
2014-12-02 06:48:56 -08:00
ac685eb4b8fix rdbcache init core
Matt Wells
2014-12-01 12:37:51 -08:00
1b347188a3fix makefile to use -m64 by default.
mwells
2014-12-01 11:44:41 -07:00
375069bd15fix url dup cache bug
Matt Wells
2014-12-01 07:45:59 -08:00
b61575a2c8modified hopcount computation for custom crawls
emmanuel.charon
2014-12-01 07:28:55 -08:00
75aafc6e87fix core from allocating too many nodes in top tree
Matt Wells
2014-12-01 07:19:54 -08:00
320eb66237fixed time_t in LinkInfo class for 64 bit conversion
Matt Wells
2014-11-27 20:55:18 -08:00
24544733f6fix bad http status err msg
Matt
2014-11-27 20:39:37 -07:00
3ae29dd438fix core in diskpagecache
Matt
2014-11-27 14:54:45 -07:00