Commit Graph

  • 8946517b7c minor admin.html update Matt Wells 2014-05-26 10:30:48 -0400
  • fe536cf31f minor updates to admin.html mwells 2014-05-26 07:14:32 -0700
  • 5ecd486f48 update admin.html Matt Wells 2014-05-25 22:28:05 -0400
  • b201333549 Merge branch 'master' into testing Matt Wells 2014-05-25 22:13:45 -0400
  • 8ad18d2cd3 make it so we don't need --nodeps with rpm -ivh (rpm install) to install pkg. Matt Wells 2014-05-25 22:08:46 -0400
  • 2e7f32b01a fix getcwd2() so it works on red hat. defaults to /var/gigablast/data0/gb if cmd is "gb" and the "gb" binary is not in the current working directory. Matt Wells 2014-05-25 20:53:49 -0400
  • d0df3da508 added dotemacs mwells 2014-05-25 07:54:30 -0700
  • b0f9227bbc path fixes for gb startup Matt Wells 2014-05-25 10:28:13 -0400
  • 98c2e7a8b6 redhat build updates on fedora Matt Wells 2014-05-25 09:58:07 -0400
  • 3fe1d3f184 updates to compile cleanly on redhat. Matt Wells 2014-05-24 23:58:12 -0400
  • b33959191b rpmbuild updates mwells 2014-05-24 07:16:17 -0700
  • 8234aaed23 put lastspidertimeutc back in because we need it for debugging. Matt Wells 2014-05-23 09:43:46 -0700
  • e3b6f6b74e a second fix for crawls saying they're done and then resuming. it seems to happen when we turn spiders off then back on again. so hack that. Matt Wells 2014-05-23 07:29:18 -0700
  • 562b3eafda more spec file fixes. use relative symlinks mwells 2014-05-22 21:57:46 -0700
  • 5c55517fe6 more rpm build fixes mwells 2014-05-22 21:01:30 -0700
  • ddec6353ed rpm updates mwells 2014-05-22 19:24:33 -0700
  • a783c9155b add spec file to build rpm. mwells 2014-05-22 19:06:09 -0700
  • b2e9cfcc1b minor make install changes mwells 2014-05-22 18:46:38 -0700
  • 1f4dc2df97 fix bug in spider scan of spiderdb for unique firstips Matt Wells 2014-05-22 13:08:01 -0700
  • 68fcffb2da speed up scan of spiderdb to repopulate waiting tree by jumping over last firstip. Matt Wells 2014-05-22 12:20:03 -0700
  • e9c4c9bb9a fix possible loss of data when doing reads on especially doledb. Matt Wells 2014-05-22 11:06:56 -0700
  • 1660805f66 more useful logging for debugging Matt Wells 2014-05-22 10:36:44 -0700
  • 32735677d2 wait 45 seconds before ending round, not 30 to try to fix some issues... Matt Wells 2014-05-22 08:32:19 -0700
  • 935cc72e19 Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-05-21 13:55:29 -0700
  • b8886c399c show start/end job times on pagecrawlbot. Matt Wells 2014-05-21 13:55:01 -0700
  • 61fc015014 fix potential diffbot injection bug Matt Wells 2014-05-21 12:21:29 -0700
  • b0c87b355c log update Matt Wells 2014-05-21 10:09:50 -0700
  • 45df139ccb update logging Matt Wells 2014-05-21 10:05:49 -0700
  • 7ad9058f77 when doing a query reindex on a json child url we need to add the spider request of the original parent url and make sure it does not get "EDOCUNCHANGED" error. then the possibly new json child objects won't get indexed. Matt Wells 2014-05-21 05:43:53 -0700
  • 34afc7c7cf Merge branch 'diffbot-dan' into diffbot-testing Matt Wells 2014-05-21 05:30:56 -0700
  • e39dffadcf use "expand" option when calling Diffbot Daniel Steinberg 2014-05-20 22:00:46 -0700
  • 4b587f168b fix bug of not including empty responses when &icc=1 Matt Wells 2014-05-20 21:07:21 -0700
  • c729b51ae5 fixed exact # search results hit count when using min/max/sort operators. Matt Wells 2014-05-20 13:45:00 -0700
  • 6664faa792 fix printing back-to-back commas when showing results in json with &icc=1. Matt Wells 2014-05-20 13:23:29 -0700
  • ffc4036840 update admin.html mwells 2014-05-19 06:22:34 -0700
  • cd3e11b6ee Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-05-16 18:48:06 -0700
  • d2cc117d82 fix oops Matt Wells 2014-05-16 18:47:52 -0700
  • 526be98ec8 fix core scenario when diffbot reply that was injected using &diffbotreply= contains the http mime. Matt Wells 2014-05-16 18:46:39 -0700
  • baf1ccb7d5 note updates Matt Wells 2014-05-16 09:52:41 -0700
  • eea5dff0f5 Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-05-16 09:38:42 -0700
  • a22396c344 quick doc update Matt Wells 2014-05-16 09:38:32 -0700
  • 2484147403 fix core Matt Wells 2014-05-16 09:30:46 -0700
  • 1af8ca846f Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-05-16 08:08:42 -0700
  • a81f2145bd fix sendmail ip to 127.0.0.1 Matt Wells 2014-05-16 08:08:20 -0700
  • 4684298965 minor doc update Matt Wells 2014-05-16 08:01:29 -0700
  • 2ce6ed266a fix another core from a 0 docid Matt Wells 2014-05-16 07:59:04 -0700
  • 6d9fdc975b fix core from not setting m_gotClusterRecs in Msg39.cpp Matt Wells 2014-05-16 06:32:51 -0700
  • 5c2cc973a8 Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-05-15 18:27:13 -0700
  • a303bda1f8 fix core Matt Wells 2014-05-15 15:10:57 -0700
  • b38f62c7dc nothing Matt Wells 2014-05-15 14:15:05 -0700
  • 72c6d032d8 fix query reindex on subdocuments (diffbot json blurbs) so that they just put in a spiderrequest to reindex the parent url. Added &diffbotreply= to the injection interface so dan can provide that along with the pageUrl he passes in with &u= Matt Wells 2014-05-15 14:11:12 -0700
  • fc5cfa2a62 move list of bulk urls to new directory earlier. May fix Defect if there is something that is causing the bulk job to restart before this function returns Daniel Steinberg 2014-05-15 13:35:32 -0700
  • 6afa3f2561 save spots to disk as space separated Daniel Steinberg 2014-05-14 14:40:46 -0700
  • 00b652581f fix boolean query containing quoted phrase Matt Wells 2014-05-14 11:22:07 -0700
  • 8ac7fdfa24 Msg39::controlLoop now works Matt Wells 2014-05-14 11:02:09 -0700
  • d95cbb42d6 Merge branch 'diffbot-testing' into diffbot-matt Matt Wells 2014-05-14 10:52:45 -0700
  • db543ddd9f nothing Matt Wells 2014-05-14 09:37:59 -0700
  • 40bca5d120 try to fix msg22 core some more Matt Wells 2014-05-14 08:16:47 -0700
  • 48df53e74f Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-05-14 07:48:23 -0700
  • 0242fe88ff try to fix msg22 based cores Matt Wells 2014-05-14 07:46:32 -0700
  • 88eb44827f fix avail docid logic some more for indexing spider replies Matt Wells 2014-05-13 21:27:05 -0700
  • 773c9ad8f6 Merge branch 'diffbot-testing' into diffbot-matt Matt Wells 2014-05-13 21:11:14 -0700
  • 015b9d4597 fix oopsy Matt Wells 2014-05-13 21:10:34 -0700
  • 4cba959529 revised msg39.cpp in order to fix boolean bug Matt Wells 2014-05-13 20:50:11 -0700
  • 0905fc48c1 fix bug in getAvailDocId() Matt Wells 2014-05-13 20:10:03 -0700
  • 3773b84f84 Merge branch 'diffbot-dan' into diffbot-testing Matt Wells 2014-05-13 17:49:05 -0700
  • 75642f44a3 don't need var i Daniel Steinberg 2014-05-13 17:25:42 -0700
  • ffee90f3bb Defect Daniel Steinberg 2014-05-13 17:23:07 -0700
  • 35f6652ceb make gb start and kstart not use hostid any more. it is now inferred from path of gb binary. Matt Wells 2014-05-12 21:24:41 -0700
  • 037067170c fix for symlinks in host paths in hosts.conf Matt Wells 2014-05-12 20:50:11 -0700
  • 32a95cca45 fix 'gb install' Matt Wells 2014-05-12 17:04:38 -0700
  • c5ae5ca4b5 v3 support for tokenized diffbot replies using the "objects" array in the json. Matt Wells 2014-05-12 16:13:24 -0700
  • 8d1c4e3097 Merge branch 'diffbot-testing' into diffbot-dan Matt Wells 2014-05-12 15:33:15 -0700
  • 4bb1f99296 Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-05-12 15:15:52 -0700
  • c58bd016a6 multiple content types for page parser content Matt Wells 2014-05-12 15:15:34 -0700
  • 5f7bbe7523 fix diffbot smoke tests. do not index spider replies for custom crawls. Matt Wells 2014-05-12 15:14:11 -0700
  • 78e2bd8171 start implementing handling for array of "objects" Daniel Steinberg 2014-05-12 15:04:36 -0700
  • 0a2523f361 Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-05-12 10:59:00 -0700
  • 8e00a9e7e1 Merge branch 'diffbot-testing' of github.com:gigablast/open-source-search-engine into diffbot-testing Matt Wells 2014-05-12 10:58:44 -0700
  • 7b840f1231 updated err msg Matt Wells 2014-05-12 10:58:36 -0700
  • 5041307c2a Merge branch 'testing' into diffbot-testing Matt Wells 2014-05-12 10:53:57 -0700
  • 7ca1e8e790 gb.conf update for new parm Matt Wells 2014-05-12 10:53:22 -0700
  • 6c72292e57 added mysynomyms.txt file to official list mwells 2014-05-12 10:36:54 -0700
  • 8f6d54f5a0 Merge branch 'master' into diffbot-testing Matt Wells 2014-05-12 10:09:36 -0700
  • deaaf69968 fix core from federated search and m_r being null Matt Wells 2014-05-12 10:05:38 -0700
  • 85818e1b98 doc update mwells 2014-05-12 07:56:22 -0700
  • 1717bd2ab1 Merge branch 'master' into testing mwells 2014-05-12 07:32:28 -0700
  • 1d2b234831 quick fix for core mwells 2014-05-12 07:32:05 -0700
  • 45b8bb3421 log msg cleanups mwells 2014-05-11 21:55:44 -0700
  • a9dc18c866 fix more bugs. mwells 2014-05-11 19:44:41 -0700
  • c3a1c674c3 now we run gb without a hostid. we use its path and the local ip to identify its hostid # in the hosts.conf. mwells 2014-05-11 19:36:24 -0700
  • 463dc2159f more make install updates mwells 2014-05-11 17:02:15 -0700
  • 5df28bb147 more pkg fixes mwells 2014-05-11 15:02:03 -0700
  • a7fbcfc188 make install updates mwells 2014-05-11 14:47:46 -0700
  • 7c30c6b970 make install fixes. getting ready for pkg build. mwells 2014-05-11 14:20:24 -0700
  • 70016ec3a3 work on make install. mwells 2014-05-11 12:48:56 -0700
  • aa76b36bf0 nothing mwells 2014-05-11 12:04:10 -0700
  • 467e70bd98 improvements for thumbnail generator. mwells 2014-05-11 08:44:38 -0700
  • 533a6caef7 image formatting fixes mwells 2014-05-11 07:06:35 -0700
  • 2d6bc12866 thumbnails gen off by default for now mwells 2014-05-10 17:24:48 -0700