Commit Graph

  • 00e111a182 make spider status msgs clickable to see the urls with that status. Matt Wells 2014-07-03 12:52:44 -0700
  • c5815829e5 Merge branch 'testing' into diffbot-testing Matt Wells 2014-07-03 12:31:13 -0700
  • 8ebda5ca51 little comment update Matt Wells 2014-07-03 12:26:02 -0700
  • 886063a3bd fixes for query reindex. Matt Wells 2014-07-03 12:24:14 -0700
  • 1586c2dcd7 minor parm change back to what it was mwells 2014-07-03 07:57:08 -0700
  • 6153cd8ee2 double slash cleanup mwells 2014-07-03 07:39:02 -0700
  • a327f9ceb0 Merge branch 'diffbot-testing' into testing mwells 2014-07-03 07:30:39 -0700
  • b0caf3eb00 get summary "ns" parm and collectionrec knobs for summary gen working. mwells 2014-07-03 07:29:44 -0700
  • 0701411bb1 fix page reindex bugs. Matt Wells 2014-07-02 17:13:37 -0700
  • 781b26b820 fix so query reindex does not delete the collection. Matt Wells 2014-07-02 16:03:16 -0700
  • b3b743d111 timezone fix for atotime1() et al Matt Wells 2014-07-02 14:06:43 -0700
  • 2db8edc527 fix core when doing federated search while streaming. Matt Wells 2014-07-02 12:51:36 -0700
  • 1361e5728c show actual diffbot error in urls.csv. do not stop indexing page and harvesting links on diffbot error. Matt Wells 2014-07-02 11:53:24 -0700
  • 5e8c2e4800 fix core from cr being null for page root and not letting searchinput set itself to defaults. Matt Wells 2014-07-02 10:46:58 -0700
  • af014abdcd title max len fixes. mwells 2014-07-02 08:03:33 -0700
  • aa331eb880 fix core from null nsr. Matt Wells 2014-07-01 21:12:18 -0700
  • a699432a99 Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-07-01 17:41:49 -0700
  • 9f9b33edc9 try to hack fix core when streaming html results back. Matt Wells 2014-07-01 17:18:02 -0700
  • 2b321efb1e debug log msgs mwells 2014-07-01 16:57:25 -0700
  • 2ddd7d7366 finally got http tunnel logic working. mwells 2014-07-01 16:28:15 -0600
  • 2f8b0694fd more http tunnel fixes mwells 2014-07-01 15:43:20 -0600
  • 5de927f385 some fixes for http proxy tunnel mwells 2014-07-01 15:18:18 -0600
  • fe8d41a3c3 Merge branch 'diffbot-testing' into diffbot-matt mwells 2014-07-01 14:18:54 -0600
  • 78c3dab6dc fix hanging when doing &stream=1 for a federated search. hack it so it thinks we got m_msg3a.m_numDocIds summaries after we've printed what was requested. so we don't waste time getting all the summaries. this popped up for federated search because we merged the msg3as into a single msg3a and had an overabundance of docids. Matt Wells 2014-07-01 12:12:01 -0700
  • d93d44250a fix debug print statements Matt Wells 2014-07-01 11:46:01 -0700
  • ea2a125a81 Merge branch 'diffbot-testing' into diffbot-matt mwells 2014-07-01 11:46:30 -0600
  • 69dfd60bf3 Merge branch 'testing' into diffbot-testing mwells 2014-07-01 11:43:22 -0600
  • 20e0f0eca5 fix buggy title:schmuck OR gbmin:offerPrice query. Matt Wells 2014-07-01 10:15:42 -0700
  • 92799ef393 add support for tunnelling https fetch through an http proxy using CONNECT directive. needs more debugging. mwells 2014-07-01 10:43:52 -0600
  • e54a250790 fix parm offset bug mwells 2014-07-01 10:13:57 -0600
  • f8da7c5b83 use post op not get for site list pages since the site list can be large. mwells 2014-07-01 08:56:28 -0600
  • d819e07c9a Merge branch 'diffbot-testing' into testing mwells 2014-07-01 08:46:04 -0600
  • fe8694d904 Merge branch 'master' of github.com:gigablast/open-source-search-engine mwells 2014-07-01 08:44:56 -0600
  • 8304a4aae2 package build fixes mwells 2014-07-01 08:44:44 -0600
  • bab2a4292a turn off stack smash detection so it will get a seg fault and save and dump core when stack gets smashed. Matt Wells 2014-07-01 06:43:05 -0700
  • 9249564191 now floaters are working pretty well mwells 2014-06-30 16:26:10 -0600
  • 859c5ee12f fix spider proxy core Matt Wells 2014-06-30 12:09:51 -0700
  • e6dd317664 Merge branch 'diffbot-testing' into diffbot-matt Matt Wells 2014-06-30 11:37:12 -0700
  • 5e39b7870d fix for bad crawl info stats Matt Wells 2014-06-30 10:53:11 -0700
  • 262e89f2bb api update mwells 2014-06-28 10:27:54 -0600
  • 7bd37dfaa2 facet updates mwells 2014-06-28 10:26:08 -0600
  • 222a454d67 sectiondb/facet updates mwells 2014-06-28 09:00:55 -0600
  • 98b317b421 Merge branch 'diffbot-testing' into diffbot-matt Matt Wells 2014-06-27 17:23:03 -0700
  • 2227d1fca7 Merge branch 'diffbot-matt' of github.com:gigablast/open-source-search-engine into diffbot-matt Matt Wells 2014-06-27 17:18:20 -0700
  • 2137e150e7 Merge branch 'testing' into diffbot-matt Matt Wells 2014-06-27 17:17:14 -0700
  • 3e1191bffd ask all hosts for ALL crawl info for ALL collections when we first startup. pass in 'f' flag. use 'i' normally to indicate 'incremental' to only send crawlinfos that changed. Matt Wells 2014-06-27 11:44:43 -0700
  • 7ec51567d4 Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-06-27 11:21:19 -0700
  • 0db5400065 fix stack smash core when title is huge. Matt Wells 2014-06-27 11:21:01 -0700
  • 3162c83473 add some debug msgs Matt Wells 2014-06-27 08:28:28 -0700
  • bb3efdb229 if no &token= or &c= then use default collnum for searching Matt Wells 2014-06-26 10:19:58 -0700
  • f61d67cd80 simple fix for &token=... searches Matt Wells 2014-06-26 10:06:34 -0700
  • 4af94ec08e Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-06-26 09:54:38 -0700
  • efe95e549b re-enable spidering if off because we missed a crawlinfo reply from udp timed out. Matt Wells 2014-06-26 09:54:15 -0700
  • 81511f9d9a Merge branch 'diffbot-testing' into diffbot Matt Wells 2014-06-26 06:04:58 -0700
  • dfd69e145b Merge branch 'testing' into diffbot-testing Matt Wells 2014-06-26 05:44:17 -0700
  • 5a63019dad Merge branch 'master' into testing Matt Wells 2014-06-26 05:43:37 -0700
  • e3aa263cdc fix infinite loop. why wasn't this pulled in from master? Matt Wells 2014-06-26 05:43:03 -0700
  • f26b964717 remove debug log comments Matt Wells 2014-06-25 19:37:36 -0700
  • e9ff8c48d8 try to remove the sluggishness from all hosts... should really reduce load. Matt Wells 2014-06-25 17:46:28 -0700
  • 39fbb5b5b6 update crawl info once per sec again now that we only send if localCrawlInfo has changed. Matt Wells 2014-06-25 12:55:10 -0700
  • 3cf31ed230 fix core Matt Wells 2014-06-25 12:26:33 -0700
  • 573dee87f9 Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-06-25 11:41:47 -0700
  • dd49e69672 fix type:json bug to not return non-diffbot reply docs. upped query max words for large client bool query. Matt Wells 2014-06-25 11:41:10 -0700
  • 651f0f27ac only send localcrawlinfo if it has been updated significantly since last time. should remove the sluggishness/missed heartbeats from host #0 on neo. mwells 2014-06-25 12:38:51 -0600
  • e9cf7b7dce Merge branch 'master' into testing Matt Wells 2014-06-25 10:21:01 -0700
  • 61a20da025 fix make master-deb mwells 2014-06-25 06:31:05 -0600
  • c86108da02 fix out of fds condition when indexing images. mwells 2014-06-25 06:25:02 -0600
  • 5941f5c524 infinite loop fix for injecting Matt Wells 2014-06-24 06:31:13 -0700
  • f640061d63 infinite loop fix for injecting Matt Wells 2014-06-24 06:30:33 -0700
  • 48a98df71d make &s=20000 search much faster by skipping generation of first 20000 summaries if deduping is off, site clustering is off and gigabit generation is off (&dr=0&sc=0&dsrt=0). turn gigabits off on load for all customcrawls(diffbot) Matt Wells 2014-06-23 14:44:21 -0700
  • 410fc9f014 Merge branch 'master' into testing Matt Wells 2014-06-23 13:06:25 -0700
  • ec7f569e99 do not spider inject pages links by default. Matt Wells 2014-06-23 07:43:50 -0600
  • f0b7d6ad1a added S99gb for loading at boot. do a 'sudo make install' to install to /var/gigablast/ dir Matt Wells 2014-06-23 07:32:38 -0600
  • 7b98dff979 doc updates Matt Wells 2014-06-22 21:59:37 -0700
  • a413ce7f2c nothing Matt Wells 2014-06-22 22:58:18 -0600
  • 2a810c7b3e bring back bits subdir for compiling on new ubuntu which broke gcc-multilib package. mwells 2014-06-22 22:48:26 -0600
  • 36f4a8dddf more html head/tail fixes mwells 2014-06-21 21:40:25 -0700
  • ed2a71605c some fixes for html head/tail stuff mwells 2014-06-21 21:18:25 -0700
  • 7ce45e8712 get html head and tail working again now. mwells 2014-06-21 21:07:38 -0700
  • 43d2e8b5b3 start supporting custom html head/tail in search controls again. mwells 2014-06-21 11:25:33 -0700
  • 6da972704b bring back custom home page html into search controls mwells 2014-06-21 09:57:51 -0700
  • e6ba1d123b minor msg update mwells 2014-06-21 07:50:35 -0700
  • ed19754dd9 compare.html updates mwells 2014-06-21 07:47:58 -0700
  • 22e0c636a7 mostly doc updates mwells 2014-06-21 07:27:45 -0700
  • f684695ed8 no gb.conf mwells 2014-06-20 18:12:02 -0700
  • d731d17b3b fix core mwells 2014-06-20 18:09:11 -0700
  • 0033ac3407 more sectiondb faceting updates mwells 2014-06-20 17:46:46 -0700
  • b0e82edc93 new facet crap compiling now. mwells 2014-06-20 12:28:50 -0700
  • a09d4cd723 Merge branch 'master' into diffbot-matt mwells 2014-06-20 09:35:39 -0700
  • 589b591fd4 fix make master-deb mwells 2014-06-20 10:20:13 -0600
  • 0c7e78c947 Merge branch 'master' into testing mwells 2014-06-20 10:13:12 -0600
  • 4cc526ee90 fix including hosts.conf and gb.conf in pkg install mwells 2014-06-20 09:11:48 -0700
  • 327c866327 widget scrolling more continuous mwells 2014-06-20 07:59:19 -0700
  • 89bab91647 nothing mwells 2014-06-20 08:35:15 -0600
  • 3d2d3f802d Merge branch 'testing' mwells 2014-06-20 08:32:27 -0600
  • c2cc2d4d15 try to fix some cores mwells 2014-06-20 07:30:24 -0700
  • b49ed5c0bc Merge branch 'diffbot-testing' into testing mwells 2014-06-19 21:51:44 -0700
  • 18b2271ac1 api page updates mwells 2014-06-19 20:54:44 -0700
  • afb4c96ff7 api updates mwells 2014-06-19 20:15:39 -0700
  • d3d5fa3cc8 api updates Matt Wells 2014-06-19 19:42:09 -0700