Commit Graph

  • 92799ef393 add support for tunnelling https fetch through an http proxy using CONNECT directive. needs more debugging. mwells 2014-07-01 10:43:52 -06:00
  • e54a250790 fix parm offset bug mwells 2014-07-01 10:13:57 -06:00
  • f8da7c5b83 use post op not get for site list pages since the site list can be large. mwells 2014-07-01 08:56:28 -06:00
  • d819e07c9a Merge branch 'diffbot-testing' into testing mwells 2014-07-01 08:46:04 -06:00
  • fe8694d904 Merge branch 'master' of github.com:gigablast/open-source-search-engine mwells 2014-07-01 08:44:56 -06:00
  • 8304a4aae2 package build fixes mwells 2014-07-01 08:44:44 -06:00
  • bab2a4292a turn off stack smash detection so it will get a seg fault and save and dump core when stack gets smashed. Matt Wells 2014-07-01 06:43:05 -07:00
  • 9249564191 now floaters are working pretty well mwells 2014-06-30 16:26:10 -06:00
  • 859c5ee12f fix spider proxy core Matt Wells 2014-06-30 12:09:51 -07:00
  • e6dd317664 Merge branch 'diffbot-testing' into diffbot-matt Matt Wells 2014-06-30 11:37:12 -07:00
  • 5e39b7870d fix for bad crawl info stats Matt Wells 2014-06-30 10:53:11 -07:00
  • 262e89f2bb api update mwells 2014-06-28 10:27:54 -06:00
  • 7bd37dfaa2 facet updates mwells 2014-06-28 10:26:08 -06:00
  • 222a454d67 sectiondb/facet updates mwells 2014-06-28 09:00:55 -06:00
  • 98b317b421 Merge branch 'diffbot-testing' into diffbot-matt Matt Wells 2014-06-27 17:23:03 -07:00
  • 2227d1fca7 Merge branch 'diffbot-matt' of github.com:gigablast/open-source-search-engine into diffbot-matt Matt Wells 2014-06-27 17:18:20 -07:00
  • 2137e150e7 Merge branch 'testing' into diffbot-matt Matt Wells 2014-06-27 17:17:14 -07:00
  • 3e1191bffd ask all hosts for ALL crawl info for ALL collections when we first startup. pass in 'f' flag. use 'i' normally to indicate 'incremental' to only send crawlinfos that changed. Matt Wells 2014-06-27 11:44:43 -07:00
  • 7ec51567d4 Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-06-27 11:21:19 -07:00
  • 0db5400065 fix stack smash core when title is huge. Matt Wells 2014-06-27 11:21:01 -07:00
  • 3162c83473 add some debug msgs Matt Wells 2014-06-27 08:28:28 -07:00
  • bb3efdb229 if no &token= or &c= then use default collnum for searching Matt Wells 2014-06-26 10:19:58 -07:00
  • f61d67cd80 simple fix for &token=... searches Matt Wells 2014-06-26 10:06:34 -07:00
  • 4af94ec08e Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-06-26 09:54:38 -07:00
  • efe95e549b re-enable spidering if off because we missed a crawlinfo reply from udp timed out. Matt Wells 2014-06-26 09:54:15 -07:00
  • 81511f9d9a Merge branch 'diffbot-testing' into diffbot Matt Wells 2014-06-26 06:04:58 -07:00
  • dfd69e145b Merge branch 'testing' into diffbot-testing Matt Wells 2014-06-26 05:44:17 -07:00
  • 5a63019dad Merge branch 'master' into testing Matt Wells 2014-06-26 05:43:37 -07:00
  • e3aa263cdc fix infinite loop. why wasn't this pulled in from master? Matt Wells 2014-06-26 05:43:03 -07:00
  • f26b964717 remove debug log comments Matt Wells 2014-06-25 19:37:36 -07:00
  • e9ff8c48d8 try to remove the sluggishness from all hosts... should really reduce load. Matt Wells 2014-06-25 17:46:28 -07:00
  • 39fbb5b5b6 update crawl info once per sec again now that we only send if localCrawlInfo has changed. Matt Wells 2014-06-25 12:55:10 -07:00
  • 3cf31ed230 fix core Matt Wells 2014-06-25 12:26:33 -07:00
  • 573dee87f9 Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-06-25 11:41:47 -07:00
  • dd49e69672 fix type:json bug to not return non-diffbot reply docs. upped query max words for large client bool query. Matt Wells 2014-06-25 11:41:10 -07:00
  • 651f0f27ac only send localcrawlinfo if it has been updated significantly since last time. should remove the sluggishness/missedhearbeats from host #0 on neo. mwells 2014-06-25 12:38:51 -06:00
  • e9cf7b7dce Merge branch 'master' into testing Matt Wells 2014-06-25 10:21:01 -07:00
  • 61a20da025 fix make master-deb mwells 2014-06-25 06:31:05 -06:00
  • c86108da02 fix out of fds condition when indexing images. mwells 2014-06-25 06:25:02 -06:00
  • 5941f5c524 infinite loop fix for injecting Matt Wells 2014-06-24 06:31:13 -07:00
  • f640061d63 infinite loop fix for injecting Matt Wells 2014-06-24 06:30:33 -07:00
  • 48a98df71d make &s=20000 search much faster by skipping generation of first 20000 summaries if deduping is off, site clustering is off and gigabit generation are off (&dr=0&sc=0&dsrt=0). turn gigabits off on load for all customcrawls(diffbot) Matt Wells 2014-06-23 14:44:21 -07:00
  • 410fc9f014 Merge branch 'master' into testing Matt Wells 2014-06-23 13:06:25 -07:00
  • ec7f569e99 do not spider inject pages links by default. Matt Wells 2014-06-23 07:43:50 -06:00
  • f0b7d6ad1a added S99gb for loading at boot. do a 'sudo make install' to install to /var/gigablast/ dir Matt Wells 2014-06-23 07:32:38 -06:00
  • 7b98dff979 doc updates Matt Wells 2014-06-22 21:59:37 -07:00
  • a413ce7f2c nothing Matt Wells 2014-06-22 22:58:18 -06:00
  • 2a810c7b3e bring back bits subdir for compiling on new ubuntu which broke gcc-multilib package. mwells 2014-06-22 22:48:26 -06:00
  • 36f4a8dddf more html head/tail fixes mwells 2014-06-21 21:40:25 -07:00
  • ed2a71605c some fixes for html head/tail stuff mwells 2014-06-21 21:18:25 -07:00
  • 7ce45e8712 get html head and tail working again now. mwells 2014-06-21 21:07:38 -07:00
  • 43d2e8b5b3 start supporting custom html head/tail in search controls again. mwells 2014-06-21 11:25:33 -07:00
  • 6da972704b bring back custom home page html into search controls mwells 2014-06-21 09:57:51 -07:00
  • e6ba1d123b minor msg update mwells 2014-06-21 07:50:35 -07:00
  • ed19754dd9 compare.html updates mwells 2014-06-21 07:47:58 -07:00
  • 22e0c636a7 mostly doc updates mwells 2014-06-21 07:27:45 -07:00
  • f684695ed8 no gb.conf mwells 2014-06-20 18:12:02 -07:00
  • d731d17b3b fix core mwells 2014-06-20 18:09:11 -07:00
  • 0033ac3407 more sectiondb faceting updates mwells 2014-06-20 17:46:46 -07:00
  • b0e82edc93 new facet crap compiling now. mwells 2014-06-20 12:28:50 -07:00
  • a09d4cd723 Merge branch 'master' into diffbot-matt mwells 2014-06-20 09:35:39 -07:00
  • 589b591fd4 fix make master-deb mwells 2014-06-20 10:20:13 -06:00
  • 0c7e78c947 Merge branch 'master' into testing mwells 2014-06-20 10:13:12 -06:00
  • 4cc526ee90 fix including hosts.conf and gb.conf in pkg install mwells 2014-06-20 09:11:48 -07:00
  • 327c866327 widget scrolling more continuous mwells 2014-06-20 07:59:19 -07:00
  • 89bab91647 nothing mwells 2014-06-20 08:35:15 -06:00
  • 3d2d3f802d Merge branch 'testing' mwells 2014-06-20 08:32:27 -06:00
  • c2cc2d4d15 try to fix some cores mwells 2014-06-20 07:30:24 -07:00
  • b49ed5c0bc Merge branch 'diffbot-testing' into testing mwells 2014-06-19 21:51:44 -07:00
  • 18b2271ac1 api page updates mwells 2014-06-19 20:54:44 -07:00
  • afb4c96ff7 api updates mwells 2014-06-19 20:15:39 -07:00
  • d3d5fa3cc8 api updates Matt Wells 2014-06-19 19:42:09 -07:00
  • 97f26f1ffa api updates Matt Wells 2014-06-19 17:56:44 -07:00
  • e151de4796 api page update Matt Wells 2014-06-19 16:10:43 -07:00
  • aaec46f612 added gbdocspiderdate and gbdocindexdate terms just for docs and not spider reply "documents". do not index plain terms for CT_STATUS spider reply docs. create gb.conf if does not exist, take out of repo. Matt Wells 2014-06-19 15:27:46 -07:00
  • 149e88efc8 fix bad parsing of objects array for double backslashes Matt Wells 2014-06-19 13:30:57 -07:00
  • a4e2ed8faf fix oops Matt Wells 2014-06-19 12:22:14 -07:00
  • 1ffeb41777 another urlisdocid fix Matt Wells 2014-06-19 11:52:47 -07:00
  • 4a40412e1b now &roundStart=1 increments the round # and sets the processed/crawled round counts back to 0 so maxToProcess/maxToCrawl will not hold us back. Matt Wells 2014-06-19 11:20:30 -07:00
  • 27193d444a minor image updates mwells 2014-06-19 06:49:19 -07:00
  • 39c19e67e9 allow for &xml=1 again mwells 2014-06-19 06:42:29 -07:00
  • bb788b7941 allow for &xml=1 again mwells 2014-06-19 06:41:56 -07:00
  • ed35451807 added support for images in the xml feed. mwells 2014-06-19 06:38:29 -07:00
  • 494c43d5dd fix gb execution in main.cpp::getcwd2() function. mwells 2014-06-19 06:03:11 -07:00
  • 8cc8e19d4d Merge branch 'testing' of github.com:gigablast/open-source-search-engine into testing mwells 2014-06-19 05:42:15 -07:00
  • d4202cb49d add donate button mwells 2014-06-19 05:41:52 -07:00
  • fdf9d51280 bring back nuggabits Matt Wells 2014-06-18 19:58:46 -07:00
  • 51a2d7d123 oops fix Matt Wells 2014-06-18 16:49:18 -07:00
  • 721cdec30c fix bug related to re-adding spider requests for diffbot object parent urls for a query reindex. url is no longer a docid. Matt Wells 2014-06-18 16:46:26 -07:00
  • 6b512f1379 Merge branch 'master' into testing Matt Wells 2014-06-18 09:17:13 -07:00
  • f1ec530eef critical bug fixes Matt Wells 2014-06-18 09:16:28 -07:00
  • d730f087c2 fix more injection bugs Matt Wells 2014-06-18 07:05:55 -07:00
  • ad42739e3e nothing mwells 2014-06-18 08:09:02 -06:00
  • 772bd02e8d Merge branch 'testing' of git@github.com:gigablast/open-source-search-engine into testing Matt Wells 2014-06-18 06:51:08 -07:00
  • 82be2ba28a fix injection bugs Matt Wells 2014-06-18 06:22:19 -07:00
  • 91df090d1d nothing mwells 2014-06-18 06:37:06 -06:00
  • 8bbdc2b48a fix another core Matt Wells 2014-06-18 05:23:48 -07:00
  • cd33553bf2 nothing mwells 2014-06-18 06:10:58 -06:00
  • 9e2b1532d9 quick fix Matt Wells 2014-06-18 05:06:23 -07:00
  • 1bef36c03c emergency bug fixes Matt Wells 2014-06-18 05:04:45 -07:00