Commit Graph

  • 61f919a19c website updates mwells 2014-08-30 21:25:14 -0700
  • aafacf5207 website updates mwells 2014-08-30 20:49:11 -0700
  • acd1672277 get crawlinfo every 3 seconds not 5. mwells 2014-08-29 16:18:46 -0700
  • 2fdea42e78 lower SPIDER_DONE_TIMER so we can be more zippy to complete jobs and also so we can pass smoke tests for crawlCreationCompletionTime etc. mwells 2014-08-29 13:50:25 -0700
  • 9d6437c2f8 remove html column from data csv output of json objects. mwells 2014-08-29 11:58:30 -0700
  • 9de4e4bf3d Merge branch 'testing' into diffbot-testing mwells 2014-08-29 11:23:13 -0700
  • aa966043ad added quickpolls. mwells 2014-08-28 19:45:25 -0700
  • fb161e0102 more minor bug fixes. mwells 2014-08-28 18:11:07 -0700
  • 4dbc6ed745 minor bug fix mwells 2014-08-28 18:09:23 -0700
  • 060e887f08 misc/various bug fixes. fix canonical redir url bug with iframes. mwells 2014-08-28 18:07:22 -0700
  • 3457245893 fix printf compiler warnings mwells 2014-08-28 13:23:46 -0700
  • caee238c46 fixes to make it easier to compile on mac os x. mwells 2014-08-28 12:55:02 -0700
  • 7f59ea4637 Merge c22a4a6e0c into 94e8b43cc9 Andy Chou 2014-08-28 19:27:28 +0000
  • c22a4a6e0c First attempt to port to Mac OSX Andy Chou 2014-08-28 12:23:56 -0700
  • 94e8b43cc9 fix bug of not running df -ka to get disk usage mwells 2014-08-28 09:49:47 -0700
  • c641666b45 Merge branch 'master' of github.com:gigablast/open-source-search-engine mwells 2014-08-28 08:52:48 -0600
  • adcdf672bd Merge branch 'testing' mwells 2014-08-28 07:46:35 -0700
  • 38cef7d52e fix # docs and recs bug. mwells 2014-08-28 07:45:43 -0700
  • e7aa933959 awesome updates to help.html page mwells 2014-08-27 22:21:30 -0700
  • 51baca8917 Merge branch 'testing' of github.com:gigablast/open-source-search-engine into testing mwells 2014-08-27 23:11:18 -0600
  • ad8168f214 updates for query help table mwells 2014-08-27 23:10:27 -0600
  • 8f6b82261f makefile updates mwells 2014-08-27 22:09:21 -0600
  • f7af7ea2af re-enable support for canonical url "redirects". mwells 2014-08-27 19:28:48 -0700
  • 2a34e6b2b8 Merge branch 'testing' mwells 2014-08-27 19:25:16 -0700
  • c4a0967b12 remove copyright all rights reserved from serps if custom html tail provided. mwells 2014-08-27 19:24:40 -0700
  • 3c980fe592 Merge branch 'testing' mwells 2014-08-27 20:00:49 -0600
  • 203754b78a makefile updates mwells 2014-08-27 19:59:57 -0600
  • 52e601e27a Merge branch 'testing' of git@github.com:gigablast/open-source-search-engine into testing Matt Wells 2014-08-27 18:30:01 -0700
  • ed6ed14196 fix rocket.jpg link Matt Wells 2014-08-27 18:29:51 -0700
  • 9be5027fb0 expose thread pool sizes again for spider/query time tasks. mwells 2014-08-27 17:27:00 -0700
  • d5ef8a36e7 fix crawldelay bug. we were ignoring it. mwells 2014-08-27 17:19:13 -0700
  • 76fffc3a81 Merge branch 'testing' of github.com:gigablast/open-source-search-engine into testing mwells 2014-08-27 16:46:41 -0700
  • 234b6ff831 use scp not rcp for 'gb installgb' etc so it works on redhat. mwells 2014-08-27 16:46:14 -0700
  • 839509e406 fix performance graph. Matt Wells 2014-08-27 15:28:30 -0700
  • 81006e4f74 updated dotgdbinit Matt Wells 2014-08-27 15:04:11 -0700
  • ec4d4a95ef added dotgdbinit file for gdb Matt Wells 2014-08-27 15:03:30 -0700
  • bdaf321c4c show process IDs on the stats page now. use scp not rcp for 'gb installgb' because it doesn't work on default redhat systems. Matt Wells 2014-08-27 14:10:50 -0700
  • 8772e7fffe overhauled the main loop (BIGLOOP) in Loop.cpp. sigtimedwait() was not cutting it: it was queueing up too many DUPLICATE signals and overflowing the rt signal queue. now gb has its own real-time signal queueing logic that just sets the bit of the FDs that need attention. i think threaded reads/writes are better now too but the performance graph is broken so i need to fix that first. the threads page looks good though. hopefully this overhaul is a massive and stable performance improvement. Matt Wells 2014-08-27 14:07:13 -0700
  • 317af88770 take out debug logs. mwells 2014-08-27 10:52:44 -0700
  • f73195870b hacked up to debug why we're not getting signals on redhat etc. mwells 2014-08-27 10:37:03 -0700
  • c79c2a59c4 updates mwells 2014-08-26 11:34:19 -0700
  • bdc72c9e8a doc admin.html updates mwells 2014-08-26 08:50:27 -0700
  • 042ec4b5cd show gigabits in xml/json feeds. update optimizing section in admin.html by adding a 'disable gigabits' section for making queries faster. mwells 2014-08-26 08:46:59 -0700
  • d0ccbdd455 gui and doc updates mwells 2014-08-26 00:05:01 -0700
  • 5c69d49176 fix html.html bug mwells 2014-08-25 21:49:32 -0700
  • 314c678538 html cleanups mwells 2014-08-25 21:38:52 -0700
  • b080a96301 fix widths of admin tool boxes mwells 2014-08-25 21:02:50 -0700
  • 8e6d2db194 put basic/advanced tabs on top. mwells 2014-08-25 19:58:26 -0700
  • 0c43fc82ea Merge branch 'diffbot-matt' into diffbot-testing Matt Wells 2014-08-25 17:16:31 -0700
  • c3699f0da5 fix bugs found from qa tests. mwells 2014-08-25 14:34:30 -0700
  • 6607cc2cbe added gbfieldmatch: operator for exactly matching full field names. case sensitive. uses gbfacetstr: values that were hashed at index time. example: gbfieldmatch:object.field:"Some Value" See help.html for more examples. mwells 2014-08-25 13:57:55 -0700
  • 2800ce0e04 fix a few bugs pertaining to tags.uri:"" fix a while back. mwells 2014-08-25 12:40:51 -0700
  • 15421908be Merge branch 'master' into testing mwells 2014-08-25 10:51:37 -0700
  • 4dcc0ef369 fix the core better from getSpiderReplyMetaList2() Matt Wells 2014-08-25 07:06:25 -0700
  • 1f9c230290 another bug fix for getSpiderReplyMetaList2() coring. mwells 2014-08-25 07:15:19 -0700
  • 80f73bf297 Merge branch 'master' into testing Matt Wells 2014-08-23 07:30:17 -0700
  • 425b2bb81b try to fix core dump that happens while spidering. mwells 2014-08-23 07:35:04 -0700
  • 5d3fd80063 make it so we can dump tagdb to a wget-able list of urls to re-add tags to another tagdb. Matt Wells 2014-08-23 07:29:40 -0700
  • bb7c6c29ce added image mwells 2014-08-22 03:53:57 -0700
  • 42e5d02f31 Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-08-18 16:24:31 -0700
  • 7befaac371 do not allow injects until clock is synced with host #0. Matt Wells 2014-08-18 16:23:57 -0700
  • 346e21aa0a fix site: query operator and others from fixing the tags.uri: bug. Matt Wells 2014-08-18 10:25:55 -0700
  • e45c0d32f6 Merge branch 'diffbot-testing' into testing mwells 2014-08-15 17:05:22 -0700
  • 1d9411152f fix so tags.uri:foo/bar/baz is treated like tags.uri:"foo/bar/baz". also fixed another quote-related bug. Matt Wells 2014-08-15 14:49:21 -0700
  • a62c971223 fix tags.uri:org/resource/Foo" query Matt Wells 2014-08-15 13:38:14 -0700
  • 2af299da2c various fixes. prioritize process only urls over crawl urls to get data faster. do not merge on high negative rec concentration. we need to fix that more. allow simplified redirs again for custom crawls to avoid too many dups. raise crawlinfo delay from 1 sec to 5 secs to reduce network usage for now. add back in injection enabled parm, but hidden. Matt Wells 2014-08-15 10:27:50 -0700
  • 786d73f481 help page updates mwells 2014-08-08 08:04:41 -0700
  • 5b99c95c76 awesome update to the help page mwells 2014-08-08 07:48:11 -0700
  • 50cc24e2be updating help table. still more work on query.cpp to do mwells 2014-08-07 22:02:55 -0700
  • fb0c8f0c4d fix redirect detection error mwells 2014-08-07 17:11:58 -0700
  • 31f8c13369 facet tests and fixes mwells 2014-08-07 16:35:36 -0700
  • ffa64ea540 fix gbequalfloat:1.23 term mwells 2014-08-07 15:33:37 -0700
  • 9b94ce2e40 Merge branch 'diffbot-testing' into testing mwells 2014-08-07 15:15:08 -0700
  • 68d0f40723 always count seeds towards page crawl count even if does not match crawl pattern. mwells 2014-08-07 14:33:13 -0700
  • 7a02fb3676 minor fix mwells 2014-08-07 11:43:01 -0700
  • 734973bb81 do not increment pagedownloadsuccesses if url does not match crawl pattern. mwells 2014-08-07 11:22:31 -0700
  • 5149f855af more nyt.com bug fixes mwells 2014-08-07 10:26:30 -0700
  • 4c81ddedf3 add showbanned menu mwells 2014-08-07 08:08:20 -0700
  • ab46517e13 two rows of filters mwells 2014-08-07 08:03:19 -0700
  • ff14341f53 javascript updates for search filter bar mwells 2014-08-07 07:38:00 -0700
  • 484cf91a60 more search filter bar updates mwells 2014-08-06 22:05:30 -0700
  • 55e27f4fa5 search filter bar updates mwells 2014-08-06 20:43:48 -0700
  • 177dbeb23d Merge branch 'testing' of github.com:gigablast/open-source-search-engine into testing mwells 2014-08-06 16:00:50 -0700
  • 6a28250e94 get qa test working after nyt bug fix mwells 2014-08-06 16:00:25 -0700
  • 470c487be4 get search filters actually working mwells 2014-08-06 08:12:05 -0700
  • f59b05eec1 gui updates mwells 2014-08-05 21:21:04 -0700
  • 947be58f10 Merge branch 'diffbot-testing' into testing mwells 2014-08-05 17:19:53 -0700
  • cc1ceaaac2 fix nyt.com cookie redir bug. fixed bug when POSTing injection request with multipart/form-data. mwells 2014-08-05 17:04:11 -0700
  • 90c7d4328b search filter bar updates mwells 2014-08-05 08:07:07 -0700
  • 2c7e14d4ca go button updates mwells 2014-08-04 20:27:51 -0700
  • 5bedf74e55 some more gui stuff mwells 2014-08-04 14:08:44 -0700
  • 6e167b94b2 gui updates mwells 2014-08-04 07:17:13 -0700
  • 90139509a6 gui updates mwells 2014-08-03 19:38:50 -0700
  • 93152633db gui updates. fixed gigabits mwells 2014-08-03 17:25:22 -0700
  • 0da51d595b gui fixes mwells 2014-08-03 13:19:32 -0700
  • 13743acd5a gui updates mwells 2014-08-03 10:42:45 -0700
  • 6eed25e27e added rocket.jpg mwells 2014-08-03 09:52:58 -0700
  • d56b3d43a3 serp gui updates mwells 2014-08-03 09:52:41 -0700
  • 429f50b3af great gui updates mwells 2014-08-02 22:19:15 -0700
  • 6b7c7f4086 some gui updates mwells 2014-08-02 17:58:11 -0700