Commit Graph

  • c307cce330 update rebuild instructions Matt 2015-01-06 13:06:42 -0800
  • 24fd6a1a26 fix log rotation logic. Matt Wells 2015-01-06 12:50:41 -0800
  • b693fe1530 fix bugs related to restarting a cored shard during repair mode. need to be able to resume repair/rebuild scan. Matt Wells 2015-01-06 11:28:55 -0800
  • 19c92339b3 fix core from corrupted root lang in title rec Matt Wells 2015-01-05 17:52:08 -0800
  • e5b81cfb04 fix ping age being negative in hosts table bug. Matt Wells 2015-01-05 15:19:46 -0800
  • f488be4ede make new logfile when current logfile hits 1GB. this will save disk space so we can delete the old log files that can be many GBs in size. Matt 2015-01-05 11:29:49 -0800
  • c03ba31ec2 try to reduce log spam Matt 2015-01-05 11:03:49 -0800
  • b7cb2b56e1 try to debug the core Matt Wells 2014-12-23 18:53:11 -0700
  • e7dd98f54d fix print float pretty Matt Wells 2014-12-23 16:22:38 -0800
  • 6e09035f46 try to fix performance slam when compiling like 640k facet list from each of 32 shards. hashtable was not hashing well and destroying the complexity. Matt Wells 2014-12-22 13:28:03 -0700
  • 30ab1ec875 added a log statement to debug the ECTCPTIMEDOUT streaming core. make default qlang in searchinput.cpp parms.cpp be "" not NULL so it won't log qlang of "(null)" is NOT SUPPORTED. Matt Wells 2014-12-19 11:04:29 -0800
  • b2bb5b4a45 Merge branch 'diffbot-testing' into testing Matt 2014-12-17 16:32:21 -0800
  • e178c67f4b do not core on qa test fail Matt 2014-12-17 16:31:37 -0800
  • 0f9cb96b91 speed up large query reindexes by using fake firstips limited to 0-64k to avoid excessive doledb winner generations. fix bug when injecting a content-less url that has the canonical tag in it. force it to go through. Matt Wells 2014-12-17 16:19:04 -0800
  • d57f2264c4 more indicator fixes Matt Wells 2014-12-17 15:11:49 -0800
  • f52e163fb0 fix a couple bugs. added out of sync indicator. Matt Wells 2014-12-17 14:28:32 -0800
  • 465d30e0ee fix ping bug. Matt 2014-12-17 10:43:00 -0800
  • ca68ae022a fix punct at beginning of term bug. Matt 2014-12-17 10:29:26 -0800
  • 8c3f6a05c1 quick fix to prevent unnecessary re-INDEXING of diffbot replies. only reindex them when recycle diffbotreply is true AND doing query reindex with recycle set to true. Matt Wells 2014-12-17 06:45:43 -0800
  • cad1c7c930 typo Matt Wells 2014-12-17 06:35:38 -0800
  • 943beedf1b updated stats page to show # ooms, and took out # colls swapped out Matt Wells 2014-12-17 06:32:37 -0800
  • 74ea82384f emergency core fix in XmlDoc::redoJSONObjects() from mismatched json count with title hashes for old doc. Matt Wells 2014-12-17 06:26:07 -0800
  • 38c2671fd3 Merge branch 'diffbot' into testing Matt Wells 2014-12-16 19:22:35 -0800
  • 2fd511f002 updates Matt Wells 2014-12-16 17:09:25 -0800
  • d4179634a1 crc fixes Matt Wells 2014-12-16 16:38:54 -0800
  • 730b131bbf added new indicators so we can make gb more stable. now hosts table reports # ooms, disk read corruptions, closed sockets from overloads, and # of outstanding spiders. made ping request a class so we can easily add new indicators. Matt 2014-12-16 16:22:50 -0800
  • 6c5ca9162c quick fix for internal ip bug Matt 2014-12-16 13:39:09 -0800
  • 27db9d57a1 added undeletable posdb key test to qainject1(). caught an undeletable rec and fixed that in xmldoc.cpp. Matt 2014-12-16 13:29:04 -0800
  • 80a052a259 help doc update Matt 2014-12-16 10:30:30 -0800
  • f57b2e0ab5 always restrict to seed domains. Matt 2014-12-15 17:46:13 -0800
  • c8c56a24da fixed query reindex for diffbot json docs. added recycle content checkbox to query reindex. fix gbsortbyint: at end of query core. only show 'all spiders paused' msg for active jobs. show error summaries if doc not found and &showerrors=1. Matt Wells 2014-12-15 16:49:20 -0800
  • 07b0c7d29c Merge branch 'diffbot-testing' into testing Matt 2014-12-15 10:20:59 -0800
  • 14e104b68c default ask for gzip to off for stability. no oom errors. Matt 2014-12-14 16:48:29 -0800
  • 41fecf9424 deactivate rebuild tool and profiler (broken in 64-bit). query reindex can replace rebuild tool. Matt 2014-12-14 16:44:27 -0800
  • 531ab85144 Merge branch 'diffbot' into diffbot-testing Matt 2014-12-14 16:38:27 -0800
  • 15c802f0aa Merge branch 'diffbot-testing' into testing Matt 2014-12-14 16:38:03 -0800
  • cb10eade5d fix freezeup Matt Wells 2014-12-13 13:34:47 -0800
  • 584d48edc7 be able to turn off getting of link info for faster rebuild of GI. Matt Wells 2014-12-13 13:17:26 -0800
  • eb249c1380 Merge branch 'diffbot' into diffbot-testing Matt Wells 2014-12-12 06:52:02 -0800
  • 8e6ae1f202 quick fix to also accumulate the avg/min/max stats from every shard. Matt Wells 2014-12-12 06:51:34 -0800
  • 578cde9d9d fix sections.cpp to not set root title section to tagid TAG_TITLE. Matt 2014-12-11 19:54:33 -0800
  • 690cc7cfc6 fix merge Matt 2014-12-11 18:28:02 -0800
  • bd926a69b5 Merge branch 'diffbot' into diffbot-testing Matt 2014-12-11 18:26:30 -0800
  • b89f071f7c quite a few bug fixes from adding the new query syntax qa test. Matt 2014-12-11 18:24:28 -0800
  • c52aec9c3e emergency fix Matt Wells 2014-12-11 16:12:33 -0800
  • 9a3489773d query syntax updates Matt 2014-12-11 14:37:30 -0800
  • c8f305468f fix facet bug Matt 2014-12-11 13:13:50 -0800
  • a0e887467e fix facet link bug for gbmin/gbmax range query Matt Wells 2014-12-11 12:36:30 -0800
  • daebad6e79 print avg,min,max in facet stats in xml/json output Matt Wells 2014-12-11 12:28:04 -0800
  • cd7ce466a1 facet work Matt 2014-12-11 12:09:06 -0800
  • d19ee6ceea Merge branch 'diffbot' into diffbot-testing Matt Wells 2014-12-11 08:40:55 -0800
  • 7d67f104fb emergency fixes Matt Wells 2014-12-11 08:39:26 -0800
  • 27df9a4276 link text extraction fixes Matt 2014-12-11 06:52:14 -0800
  • 4f71a95da5 reinstantiate linkdb min files to merge parm. mwells 2014-12-11 07:20:15 -0700
  • 19f118dddf Merge branch 'diffbot-testing' into testing Matt Wells 2014-12-10 14:08:10 -0800
  • e43365fc70 more pthread_t pid_t fixes Matt 2014-12-10 14:06:17 -0800
  • 08169a5562 makefile minor change Matt 2014-12-10 13:29:59 -0800
  • feed7d5b3c pthread_t pid_t compatibility fixes Matt 2014-12-10 13:15:26 -0800
  • 619e980a97 try to fix pthread_t pid_t issues on Threads.cpp Matt 2014-12-10 12:13:25 -0800
  • 329f004e74 compiler updates Matt 2014-12-10 12:09:04 -0800
  • d8ba619df3 makefile updates Matt 2014-12-10 11:53:46 -0800
  • ecf486ffb8 Merge branch 'diffbot-testing' into testing Matt 2014-12-10 11:27:47 -0800
  • 1de49af4e6 add query term info into json output as well Matt 2014-12-10 11:27:30 -0800
  • 44eddd63e8 fix signed/unsigned bug Matt 2014-12-10 11:04:37 -0800
  • 3ee6f7149f Merge branch 'diffbot-testing' into testing Matt 2014-12-10 11:03:12 -0800
  • 6b2e714964 try to fix core when generating statsdb graph Matt Wells 2014-12-10 11:01:50 -0800
  • c96b24f39d try to fix core on #0 and #16 from empty query. if empty query or n<=0 and &stream=1 then fix bug that was not sending back the reply properly. basically, disable streaming if msg40 would not block. upped MAX_SHARDS from 128 to 1024. should not take up any more mem really or slow things down. Matt Wells 2014-12-10 10:44:01 -0800
  • febb1d4658 print pretty floats in the facets menu, whether printing a single float or a range of floats. Matt Wells 2014-12-09 17:17:12 -0800
  • 720517c2f5 fix facet range lists Matt Wells 2014-12-09 16:51:14 -0800
  • d0bed16be5 fix typo in syntax.html page Matt Wells 2014-12-09 14:15:00 -0800
  • b218bc403d fix atotime1() output on json "date": field to restrict to 32-bit min/max for time_t's that are beyond 32 bits. so we truncate to min/max. later: add another termlist to add more date coverage. would be useful for searching for big numeric ranges, too, more than 32-bits. Matt Wells 2014-12-09 13:40:34 -0800
  • 82a2a7d18e Merge branch 'diffbot-testing' into testing Matt 2014-12-08 10:41:40 -0800
  • dfce03eca8 fix printing of "next 10" link Matt 2014-12-08 09:55:16 -0800
  • 0460335861 more permission system updates Matt 2014-12-08 09:49:17 -0800
  • 2670cfd2f0 Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt 2014-12-08 09:48:51 -0800
  • 59c4db704a made pwd/ip security text areas less rows. send empty page in Parms.cpp if trying to access admin page and no pwd/ip and cloud user support not enabled. Matt 2014-12-08 09:47:38 -0800
  • 40f80cd559 Merge branch 'diffbot-testing' into testing Matt 2014-12-08 09:40:55 -0800
  • 5a844068fe fix cores in top tree with last commit. this one speeds things up greatly. don't scan scoreinfo buf for every docid we add to top tree if scoreinfobuf has plenty of space. later we'll have to be more clever about removing things from scoreinfobuf if it comes down to that. Matt Wells 2014-12-08 09:29:21 -0800
  • 4fbb2443b5 Revert "Revert "emergency fix so ppl can download large # objects in json"" Matt Wells 2014-12-08 07:40:44 -0800
  • aaa5b34126 Revert "emergency fix so ppl can download large # objects in json" Matt Wells 2014-12-08 07:36:31 -0800
  • c692b54bfd emergency fix so ppl can download large # objects in json Matt Wells 2014-12-08 07:13:00 -0800
  • 2c5f6daca2 bring back posdb min files to merge again so we can set high when quickly building index mwells 2014-12-07 08:14:47 -0700
  • f3195c7eda make gb start do keepalive start again Matt Wells 2014-12-06 13:26:45 -0800
  • 048fbfe60f fix gb start cmd Matt Wells 2014-12-06 13:19:13 -0800
  • b51d19a88c Merge branch 'diffbot-testing' into diffbot Matt Wells 2014-12-06 13:09:55 -0800
  • 559ef067c5 fix core from langid too big in pageresults.cpp Matt Wells 2014-12-06 13:09:30 -0800
  • 840ca9b091 Merge branch 'diffbot-testing' into diffbot Matt Wells 2014-12-06 11:16:30 -0800
  • 845c9d8a13 version update in makefile Matt Wells 2014-12-06 11:15:34 -0700
  • b38cc19a40 dont print query term info unless header bit set Matt Wells 2014-12-06 09:36:57 -0800
  • 41c8817bdb fixed summary initialization error of the flags buffer. fixed term freq algo. use exact term freq for qatest123. made Summary.o -O3 again. fix gbsystem() to disable both timers. Matt 2014-12-06 10:14:48 -0700
  • 01d61d5427 remove type long, replace with int32_t Matt 2014-12-05 08:55:22 -0700
  • 76c32bb741 indentation cleanups Matt 2014-12-05 08:54:27 -0700
  • 2f43eb828d Merge pull request #35 from emmanuelcharon/diffbot-testing Gigablast 2014-12-05 08:53:22 -0700
  • 7c57283b88 fix tld lang url filter. was being reset. mwells 2014-12-04 14:34:08 -0700
  • 090c18f59d show how long it took in html serps mwells 2014-12-04 14:22:40 -0700
  • 2ccb4f5c69 show spider req/repl sizes added to spiderdb in logIt(). make 'make' by itself work on 32-bit archs again. mwells 2014-12-04 13:57:42 -0700
  • 2021919d8c if its a diffbot crawl/bulk job then do not use linkdb to save disk space. Matt Wells 2014-12-04 13:25:10 -0700
  • d825e64d3b on bad hint offset do not core, just return corrupt data errno. Matt Wells 2014-12-04 12:15:58 -0800
  • fd33997716 print query info in json, too, not just xml Matt Wells 2014-12-04 13:03:18 -0700
  • dc306858cc nomenclature change Matt Wells 2014-12-04 11:02:54 -0700