Commit Graph

  • 958343957a nothing mwells 2014-07-29 10:44:23 -07:00
  • 58f5a2dd57 save conf files safely to disk so we don't lose them because the disk is full. mwells 2014-07-29 10:02:43 -07:00
  • 9f70d43a4b create qa subdir if does not exist mwells 2014-07-29 07:36:40 -07:00
  • af4797c991 version updates mwells 2014-07-29 08:32:05 -06:00
  • df34124a86 Merge branch 'testing' mwells 2014-07-29 07:16:10 -07:00
  • 811c9c3c4d show latest spider status msgs in page basic status. show coll name, not num, in spider queue. mwells 2014-07-29 07:14:54 -07:00
  • 7cc480a21f qa updates mwells 2014-07-28 21:29:11 -07:00
  • 3cc54b72cc qa updates mwells 2014-07-28 19:15:31 -07:00
  • 388806c299 fix file.org prepending www all the time issue mwells 2014-07-28 14:46:37 -07:00
  • 0409571262 Merge branch 'diffbot-testing' into testing mwells 2014-07-28 14:37:44 -07:00
  • 343f783592 another fix for &restartRound=1 Matt Wells 2014-07-28 13:58:36 -07:00
  • 3a32301f99 minor ui update Matt Wells 2014-07-28 13:26:17 -07:00
  • b62547c6ca fix some cores. reduce mem usage per coll. added debugging msgs. fixes for &roundStart=1 Matt Wells 2014-07-28 13:11:27 -07:00
  • baf712a638 qa test updates. mwells 2014-07-28 11:13:02 -07:00
  • 85b628cade qa updates mwells 2014-07-26 21:39:16 -07:00
  • f54298e1a2 qa updates mwells 2014-07-26 20:05:05 -07:00
  • 6a5f5cf3f7 qa updates mwells 2014-07-26 18:43:11 -07:00
  • c18128a083 qatest updates mwells 2014-07-26 08:12:42 -07:00
  • b984e227c3 qa page updates mwells 2014-07-26 06:59:05 -07:00
  • 2a094accff add qa page mwells 2014-07-25 17:39:29 -07:00
  • 0460c3cc45 Merge branch 'diffbot-testing' into testing mwells 2014-07-25 08:22:24 -07:00
  • beba7e43db fix core when host #0 has collnum that other host does not. Matt Wells 2014-07-25 08:14:38 -07:00
  • 564371b625 added cdataDecode() function. Matt Wells 2014-07-25 07:29:57 -07:00
  • 93fb2ab111 Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-07-25 06:52:01 -07:00
  • 71a3b425e9 fix issue when saving coll.conf with tons of seeds Matt Wells 2014-07-25 06:51:07 -07:00
  • 837b6cf465 api updates mwells 2014-07-23 08:47:48 -07:00
  • fa034c2b2c Merge branch 'diffbot-matt' of github.com:gigablast/open-source-search-engine into diffbot-matt Matt Wells 2014-07-23 07:30:53 -07:00
  • 066ccc831c fix core. lower 0x39 outstanding to 10 Matt Wells 2014-07-23 07:30:04 -07:00
  • 31494e4831 update api to include &sections=1 for /admin/inject Matt Wells 2014-07-22 15:32:49 -07:00
  • b00c776f73 fix bugs for injecting with &sections=1. follow simpler redirs if injecting. Matt Wells 2014-07-22 15:14:42 -07:00
  • 3d1dcb08c1 fix core when getting sections while injecting Matt Wells 2014-07-22 14:23:41 -07:00
  • 2128b5af37 add support for &sections=1 for /admin/inject api to return the posted content but with fresh sectiondb info inserted. Matt Wells 2014-07-22 13:11:21 -07:00
  • 5723c637b6 Merge branch 'testing' of github.com:gigablast/open-source-search-engine into testing mwells 2014-07-22 12:01:55 -06:00
  • 3f9067b7b0 query expansion bug fix mwells 2014-07-22 12:01:42 -06:00
  • d2b1196a85 Merge branch 'diffbot-testing' into testing Matt Wells 2014-07-22 10:47:33 -07:00
  • b265e8d027 change bad master link to admin link Matt Wells 2014-07-22 10:42:30 -07:00
  • 46ca3fceeb fix oops Matt Wells 2014-07-22 10:18:03 -07:00
  • c5d72e0e18 do not do any simplified redirects for custom crawls, just like bulk jobs. Matt Wells 2014-07-22 07:14:55 -07:00
  • 248b02ea9e fix another spiderdb corruption core Matt Wells 2014-07-22 06:34:34 -07:00
  • f043d983a3 fix sockets never closing bug when client deletes then queries a collection. Matt Wells 2014-07-22 06:25:15 -07:00
  • d0a34da75d fix core trying to roundstart=1 a coll that does not exist Matt Wells 2014-07-22 06:09:47 -07:00
  • 72883dd340 fix a core when deleting a coll while saving its doledb. fix it right actually. Matt Wells 2014-07-20 20:20:45 -07:00
  • 289626ae86 fix core from corrupt spider request Matt Wells 2014-07-20 08:15:40 -07:00
  • 47b46a202c computing link text is slow for some reason. needs to be looked at, but take out for now for custom crawls to speed things up. Matt Wells 2014-07-19 11:12:33 -07:00
  • a0addd4000 try to fix spiders not going. try to fix another core. Matt Wells 2014-07-17 13:48:43 -07:00
  • f53e23d76c fix not shutting down bug Matt Wells 2014-07-16 13:00:16 -07:00
  • d0bc187a77 more core fixes. more stability. Matt Wells 2014-07-16 12:52:51 -07:00
  • 6b797f5023 more core stability fixes. prevent core dumps Matt Wells 2014-07-16 12:07:39 -07:00
  • dc7a78687c fix long-standing core when getting linkinfo from a collection that got nuked. Matt Wells 2014-07-16 10:40:12 -07:00
  • 3ad667765a fix retrying error forever pointlessly when msg4 request is corrupt Matt Wells 2014-07-16 07:01:33 -07:00
  • 9347b1fc79 Merge branch 'diffbot-testing' into testing mwells 2014-07-15 19:30:34 -07:00
  • 3421befd3a Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-07-15 16:10:50 -07:00
  • c1c31c1364 fix for using more than 32k colls Matt Wells 2014-07-15 16:10:35 -07:00
  • cd48799030 try to fix core on neo mwells 2014-07-15 10:46:12 -07:00
  • e950790ae2 version update mwells 2014-07-15 11:22:24 -06:00
  • 6e345227a8 qa test fixes mwells 2014-07-15 10:06:33 -07:00
  • 038e2b441d include more dmoz info in xml json feeds mwells 2014-07-15 07:08:41 -07:00
  • 2a7f40fd43 fix core from adding site like http://xyz.com/?ddd mwells 2014-07-15 06:26:27 -07:00
  • 83e5f2e1b2 fix &dio=1 for json mwells 2014-07-14 18:31:14 -07:00
  • 15756ec94a Merge branch 'diffbot-testing' into testing mwells 2014-07-14 18:10:13 -07:00
  • a72c5dae51 fix <script> tags that immediately end in </script> or never end but hit another <script> or a </gbiframe> tag. mwells 2014-07-14 17:24:20 -07:00
  • 6078d36dcc qa test fixes mwells 2014-07-14 12:44:32 -07:00
  • a1e2395d27 ignore ENOCOLLREC msgs in handleRequest1() in Msg1.cpp. they happen when a collection gets deleted and adds were in transit. let's hope it's ok to do this. Matt Wells 2014-07-14 12:21:32 -07:00
  • c17074defa take query expansion default from collrec. mwells 2014-07-14 07:44:01 -07:00
  • 77cc75b523 fix clone so it works for url filters and other array parms mwells 2014-07-14 07:41:33 -07:00
  • 367f671bd5 fix oops mwells 2014-07-14 07:20:57 -07:00
  • 24aa79bc85 seed urls after tag: directives. mwells 2014-07-14 07:13:55 -07:00
  • 0745f65e4d api updates mwells 2014-07-14 07:02:05 -07:00
  • 0440f71da1 json rep for page stats. mwells 2014-07-14 06:42:23 -07:00
  • d0503b588a got some xml for page stats... still need more and need json mwells 2014-07-13 13:50:41 -07:00
  • fa3acbecc1 hosts page json fixes mwells 2014-07-13 11:16:45 -07:00
  • 44ec1c26ad page hosts now available in json/xml mwells 2014-07-13 11:03:25 -07:00
  • d5805733e5 more api updates mwells 2014-07-13 09:35:44 -07:00
  • e66e7e5d11 undid some log debug msg stuff mwells 2014-07-12 17:02:45 -07:00
  • 4adb57f98e prepare for release 1.2 mwells 2014-07-12 17:58:36 -06:00
  • 02f7c1050b qa test updates. made "sw" parm work. mwells 2014-07-12 16:47:32 -07:00
  • fe7fd9da50 get gb qainject working again mwells 2014-07-12 08:37:30 -07:00
  • 61afdc0fb9 fix core from adding/deleting collection mwells 2014-07-12 08:23:40 -07:00
  • 2f8207ccf7 qa fixes mwells 2014-07-11 19:07:49 -07:00
  • bf89eb5d5e facet printing updates mwells 2014-07-11 09:45:34 -07:00
  • 9be74debf5 qa fixes mwells 2014-07-11 08:35:20 -07:00
  • 5f26918910 lots of bug fixes. more qa fixes. mwells 2014-07-11 08:00:30 -07:00
  • 13e295b89d Merge branch 'diffbot-matt' into testing mwells 2014-07-11 05:41:22 -07:00
  • 2a570cc1ef got tag: support working in the sitelist and url filters mwells 2014-07-10 20:41:59 -07:00
  • 8384371926 squid/sections qa test mwells 2014-07-10 16:42:22 -07:00
  • 0ecc7933d6 qa test for squid/sections Matt Wells 2014-07-10 16:28:24 -07:00
  • 9969644d23 fix section stats display bugs Matt Wells 2014-07-10 15:55:18 -07:00
  • 3e7d72cc41 fix ECORRUPTHTTPGZIP bug Matt Wells 2014-07-10 14:40:09 -07:00
  • 073fdd7eca minor adjustments Matt Wells 2014-07-10 11:20:51 -07:00
  • 0177555361 qa setup for testing squid proxy returning sectiondb voting info. Matt Wells 2014-07-10 11:16:42 -07:00
  • 4d3353fc3c Merge branch 'diffbot-testing' into diffbot-matt Matt Wells 2014-07-10 10:07:11 -07:00
  • b393a1bbbe Merge branch 'testing' into diffbot-matt Matt Wells 2014-07-10 10:06:55 -07:00
  • bcb584c1fd Merge branch 'diffbot-matt' of github.com:gigablast/open-source-search-engine into diffbot-matt Matt Wells 2014-07-10 10:03:20 -07:00
  • e3532a9c5f fix core when getting facet values in xmldoc.cpp Matt Wells 2014-07-10 10:02:53 -07:00
  • 0da6063983 bring tags back in site list / url filters. mwells 2014-07-10 07:44:16 -07:00
  • 683abd3875 more api work mwells 2014-07-10 07:10:49 -07:00
  • 5bbdb8e172 got page add url and add url api working. mwells 2014-07-09 20:32:30 -07:00
  • 4c72e376fa fix core dump when no collrec Matt Wells 2014-07-09 17:36:23 -07:00
  • 950352d781 do not hash redundant xpaths that have the same inner sentence/alnum html as their children tags. waste of index space. mwells 2014-07-09 17:16:01 -07:00
  • b231bc8042 incorporate total # of docs with that xpathsitehash into the tag attr. so using the MxDy should be good enough to determine if something is chrome or not. mwells 2014-07-09 16:47:47 -07:00