Commit Graph

  • d4179634a1 crc fixes Matt Wells 2014-12-16 16:38:54 -08:00
  • 730b131bbf added new indicators so we can make gb more stable. now hosts table reports # ooms, disk read corruptions, closed sockets from overloads, and # of outstanding spiders. made ping request a class so we can easily add new indicators. Matt 2014-12-16 16:22:50 -08:00
  • 6c5ca9162c quick fix for internal ip bug Matt 2014-12-16 13:39:09 -08:00
  • 27db9d57a1 added undeletable posdb key test to qainject1(). caught an undeletable rec and fixed that in xmldoc.cpp. Matt 2014-12-16 13:29:04 -08:00
  • 80a052a259 help doc update Matt 2014-12-16 10:30:30 -08:00
  • f57b2e0ab5 always restrict to seed domains. Matt 2014-12-15 17:46:13 -08:00
  • c8c56a24da fixed query reindex for diffbot json docs. added recycle content checkbox to query reindex. fix gbsortbyint: at end of query core. only show 'all spiders paused' msg for active jobs. show error summaries if doc not found and &showerrors=1. Matt Wells 2014-12-15 16:49:20 -08:00
  • 07b0c7d29c Merge branch 'diffbot-testing' into testing Matt 2014-12-15 10:20:59 -08:00
  • 14e104b68c default ask for gzip to off for stability. no oom errors. Matt 2014-12-14 16:48:29 -08:00
  • 41fecf9424 deactivate rebuild tool and profiler (broken in 64-bit). query reindex can replace rebuild tool. Matt 2014-12-14 16:44:27 -08:00
  • 531ab85144 Merge branch 'diffbot' into diffbot-testing Matt 2014-12-14 16:38:27 -08:00
  • 15c802f0aa Merge branch 'diffbot-testing' into testing Matt 2014-12-14 16:38:03 -08:00
  • cb10eade5d fix freezeup Matt Wells 2014-12-13 13:34:47 -08:00
  • 584d48edc7 be able to turn off getting of link info for faster rebuild of GI. Matt Wells 2014-12-13 13:17:26 -08:00
  • eb249c1380 Merge branch 'diffbot' into diffbot-testing Matt Wells 2014-12-12 06:52:02 -08:00
  • 8e6ae1f202 quick fix to also accumulate the avg/min/max stats from every shard. Matt Wells 2014-12-12 06:51:34 -08:00
  • 578cde9d9d fix sections.cpp to not set root title section to tagid TAG_TITLE. Matt 2014-12-11 19:54:33 -08:00
  • 690cc7cfc6 fix merge Matt 2014-12-11 18:28:02 -08:00
  • bd926a69b5 Merge branch 'diffbot' into diffbot-testing Matt 2014-12-11 18:26:30 -08:00
  • b89f071f7c quite a few bug fixes from adding the new query syntax qa test. Matt 2014-12-11 18:24:28 -08:00
  • c52aec9c3e emergency fix Matt Wells 2014-12-11 16:12:33 -08:00
  • 9a3489773d query syntax updates Matt 2014-12-11 14:37:30 -08:00
  • c8f305468f fix facet bug Matt 2014-12-11 13:13:50 -08:00
  • a0e887467e fix facet link bug for gbmin/gbmax range query Matt Wells 2014-12-11 12:36:30 -08:00
  • daebad6e79 print avg,min,max in facet stats in xml/json output Matt Wells 2014-12-11 12:28:04 -08:00
  • cd7ce466a1 facet work Matt 2014-12-11 12:09:06 -08:00
  • d19ee6ceea Merge branch 'diffbot' into diffbot-testing Matt Wells 2014-12-11 08:40:55 -08:00
  • 7d67f104fb emergency fixes Matt Wells 2014-12-11 08:39:26 -08:00
  • 27df9a4276 link text extraction fixes Matt 2014-12-11 06:52:14 -08:00
  • 4f71a95da5 reinstantiate linkdb min files to merge parm. mwells 2014-12-11 07:20:15 -07:00
  • 19f118dddf Merge branch 'diffbot-testing' into testing Matt Wells 2014-12-10 14:08:10 -08:00
  • e43365fc70 more pthread_t pid_t fixes Matt 2014-12-10 14:06:17 -08:00
  • 08169a5562 makefile minor change Matt 2014-12-10 13:29:59 -08:00
  • feed7d5b3c pthread_t pid_t compatibility fixes Matt 2014-12-10 13:15:26 -08:00
  • 619e980a97 try to fix pthread_t pid_t issues on Threads.cpp Matt 2014-12-10 12:13:25 -08:00
  • 329f004e74 compiler updates Matt 2014-12-10 12:09:04 -08:00
  • d8ba619df3 makefile updates Matt 2014-12-10 11:53:46 -08:00
  • ecf486ffb8 Merge branch 'diffbot-testing' into testing Matt 2014-12-10 11:27:47 -08:00
  • 1de49af4e6 add query term info into json output as well Matt 2014-12-10 11:27:30 -08:00
  • 44eddd63e8 fix signed/unsigned bug Matt 2014-12-10 11:04:37 -08:00
  • 3ee6f7149f Merge branch 'diffbot-testing' into testing Matt 2014-12-10 11:03:12 -08:00
  • 6b2e714964 try to fix core when generating statsdb graph Matt Wells 2014-12-10 11:01:50 -08:00
  • c96b24f39d try to fix core on #0 and #16 from empty query. if empty query or n<=0 and &stream=1 then fix bug that was not sending back the reply properly. basically, disable streaming if msg40 would not block. upped MAX_SHARDS from 128 to 1024. should not take up any more mem really or slow things down. Matt Wells 2014-12-10 10:44:01 -08:00
  • febb1d4658 print pretty floats in the facets menu, whether printing a single float or a range of floats. Matt Wells 2014-12-09 17:17:12 -08:00
  • 720517c2f5 fix facet range lists Matt Wells 2014-12-09 16:51:14 -08:00
  • d0bed16be5 fix typo in syntax.html page Matt Wells 2014-12-09 14:15:00 -08:00
  • b218bc403d fix atotime1() output on json "date": field to restrict to 32-bit min/max for time_t's that are beyond 32 bits. so we truncate to min/max. later: add another termlist to add more date coverage. would be useful for searching for big numeric ranges, too, more than 32-bits. Matt Wells 2014-12-09 13:40:34 -08:00
  • 82a2a7d18e Merge branch 'diffbot-testing' into testing Matt 2014-12-08 10:41:40 -08:00
  • dfce03eca8 fix printing of "next 10" link Matt 2014-12-08 09:55:16 -08:00
  • 0460335861 more permission system updates Matt 2014-12-08 09:49:17 -08:00
  • 2670cfd2f0 Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt 2014-12-08 09:48:51 -08:00
  • 59c4db704a made pwd/ip security text areas less rows. send empty page in Parms.cpp if trying to access admin page and no pwd/ip and cloud user support not enabled. Matt 2014-12-08 09:47:38 -08:00
  • 40f80cd559 Merge branch 'diffbot-testing' into testing Matt 2014-12-08 09:40:55 -08:00
  • 5a844068fe fix cores in top tree with last commit. this one speeds things up greatly. don't scan scoreinfo buf for every docid we add to top tree if scoreinfobuf has plenty of space. later we'll have to be more clever about removing things from scoreinfobuf if it comes down to that. Matt Wells 2014-12-08 09:29:21 -08:00
  • 4fbb2443b5 Revert "Revert "emergency fix so ppl can download large # objects in json"" Matt Wells 2014-12-08 07:40:44 -08:00
  • aaa5b34126 Revert "emergency fix so ppl can download large # objects in json" Matt Wells 2014-12-08 07:36:31 -08:00
  • c692b54bfd emergency fix so ppl can download large # objects in json Matt Wells 2014-12-08 07:13:00 -08:00
  • 2c5f6daca2 bring back posdb min files to merge again so we can set high when quickly building index mwells 2014-12-07 08:14:47 -07:00
  • f3195c7eda make gb start do keepalive start again Matt Wells 2014-12-06 13:26:45 -08:00
  • 048fbfe60f fix gb start cmd Matt Wells 2014-12-06 13:19:13 -08:00
  • b51d19a88c Merge branch 'diffbot-testing' into diffbot Matt Wells 2014-12-06 13:09:55 -08:00
  • 559ef067c5 fix core from langid too big in pageresults.cpp Matt Wells 2014-12-06 13:09:30 -08:00
  • 840ca9b091 Merge branch 'diffbot-testing' into diffbot Matt Wells 2014-12-06 11:16:30 -08:00
  • 845c9d8a13 version update in makefile Matt Wells 2014-12-06 11:15:34 -07:00
  • b38cc19a40 dont print query term info unless header bit set Matt Wells 2014-12-06 09:36:57 -08:00
  • 41c8817bdb fixed summary initialization error of the flags buffer. fixed term freq algo. use exact term freq for qatest123. made Summary.o -O3 again. fix gbsystem() to disable both timers. Matt 2014-12-06 10:14:48 -07:00
  • 01d61d5427 remove type long, replace with int32_t Matt 2014-12-05 08:55:22 -07:00
  • 76c32bb741 indentation cleanups Matt 2014-12-05 08:54:27 -07:00
  • 2f43eb828d Merge pull request #35 from emmanuelcharon/diffbot-testing Gigablast 2014-12-05 08:53:22 -07:00
  • 7c57283b88 fix tld lang url filter. was being reset. mwells 2014-12-04 14:34:08 -07:00
  • 090c18f59d show how long it took in html serps mwells 2014-12-04 14:22:40 -07:00
  • 2ccb4f5c69 show spider req/repl sizes added to spiderdb in logIt(). make 'make' by itself work on 32-bit archs again. mwells 2014-12-04 13:57:42 -07:00
  • 2021919d8c if its a diffbot crawl/bulk job then do not use linkdb to save disk space. Matt Wells 2014-12-04 13:25:10 -07:00
  • d825e64d3b on bad hint offset do not core, just return corrupt data errno. Matt Wells 2014-12-04 12:15:58 -08:00
  • fd33997716 print query info in json, too, not just xml Matt Wells 2014-12-04 13:03:18 -07:00
  • dc306858cc nomenclature change Matt Wells 2014-12-04 11:02:54 -07:00
  • 0331363893 show language query synonym terms came from in the xml/json feed. Matt Wells 2014-12-04 10:57:01 -07:00
  • 832392887c do not spam the logs with spider request corrupt count msgs. but store a count for them now in coll rec. Matt Wells 2014-12-04 10:00:13 -07:00
  • a7462ed1f4 fix injection stuff mwells 2014-12-04 09:29:17 -07:00
  • 8157c5be14 added 'gb dstart' cmd mwells 2014-12-03 19:02:08 -07:00
  • 4894bf51ce fix core Matt Wells 2014-12-03 14:08:18 -08:00
  • ca8194d9b0 when rebuilding posdb do not rebuild for spiderdb mwells 2014-12-03 11:32:22 -07:00
  • 12d1477135 fix another 64bit conversion bug for synonyms Matt Wells 2014-12-03 07:45:27 -08:00
  • 654084f557 fix 64bit conversion bug. realloc offset should have been 64bit not 32bit in Linkdb.cpp. Matt Wells 2014-12-03 07:35:14 -08:00
  • 18583234c4 do not show discussions,etc. tabs yet mwells 2014-12-02 21:51:20 -07:00
  • 144c488a4e do not force -m64 on a 32-bit system mwells 2014-12-02 21:45:52 -07:00
  • 0f2ee3edcf fix core from doing rebuild of posdb mwells 2014-12-02 21:28:31 -07:00
  • 251f7d2f22 fix core when removing row from url filters table. tail safebuf was not detaching buf. clear all of memtable on startup, use sizeof(char) not 4. fix m_memtablesize since it can't be based on m_maxMem because g_hostdb inits before g_conf.m_maxMem and calls Mem::addMem() Matt Wells 2014-12-02 16:17:06 -08:00
  • 3c92fd6916 fix Inlink accessor function core. don't use off_urlBuf, etc. any more just use size_urlBuf, etc. now for better backwards compatibility. mwells 2014-12-02 10:26:10 -07:00
  • 7508deb38f Merge branch 'diffbot-testing' into testing mwells 2014-12-02 09:47:11 -07:00
  • abfa9a500e mem fix Matt Wells 2014-12-02 07:09:57 -08:00
  • a1d673936f fix some final issues with 64bit stuff Matt Wells 2014-12-02 06:48:56 -08:00
  • ac685eb4b8 fix rdbcache init core Matt Wells 2014-12-01 12:37:51 -08:00
  • 1b347188a3 fix makefile to use -m64 by default. mwells 2014-12-01 11:44:41 -07:00
  • 375069bd15 fix url dup cache bug Matt Wells 2014-12-01 07:45:59 -08:00
  • b61575a2c8 modified hopcount computation for custom crawls emmanuel.charon 2014-12-01 07:28:55 -08:00
  • 75aafc6e87 fix core from allocating too many nodes in top tree Matt Wells 2014-12-01 07:19:54 -08:00
  • 320eb66237 fixed time_t in LinkInfo class for 64 bit conversion Matt Wells 2014-11-27 20:55:18 -08:00
  • 24544733f6 fix bad http status err msg Matt 2014-11-27 20:39:37 -07:00
  • 3ae29dd438 fix core in diskpagecache Matt 2014-11-27 14:54:45 -07:00
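
Note on b218bc403d: that commit describes restricting the json "date": field to the 32-bit min/max when a time_t falls beyond 32 bits. A minimal sketch of that kind of clamping is shown here. The helper name `clampTimeTo32` is hypothetical and is not taken from the repository, and the actual change in atotime1() may differ.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper for illustration only; not a function from the
// Gigablast source. Clamps a 64-bit timestamp into the range that a
// signed 32-bit field can hold, truncating to min/max when out of range.
int32_t clampTimeTo32(int64_t t) {
    const int64_t lo = std::numeric_limits<int32_t>::min();
    const int64_t hi = std::numeric_limits<int32_t>::max();
    if (t < lo) return std::numeric_limits<int32_t>::min();
    if (t > hi) return std::numeric_limits<int32_t>::max();
    // In range, so the narrowing cast cannot lose information.
    return static_cast<int32_t>(t);
}
```

Saturating at the bounds, rather than letting the narrowing cast wrap, keeps out-of-range dates ordered at the ends of the range instead of landing at an arbitrary position.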