Commit Graph

  • 4377bd2b70 fix for wrong crawlName construction Michael Peter Christen 2021-06-30 18:03:54 +02:00
  • e81b770f79 enabled crawl starts with very large sets of start urls, e.g. a 10 MB url list with approx. 0.5 million start points Michael Peter Christen 2021-06-30 10:45:58 +02:00
  • 4b73b3f9f2 docker has no latest-alpine frankenstein91 2021-06-20 22:27:50 +02:00
  • c623a3252e fix for jdk 14 bug Michael Peter Christen 2021-04-23 09:11:03 +02:00
  • dbd211a1ad removed/replaced reflection in memory tool Michael Peter Christen 2021-04-22 20:24:13 +02:00
  • 160f00e59e removed reconfigure script which is seven years old and may not be up to the standards of the current password implementation. See https://github.com/yacy/yacy_search_server/issues/409 as hint Michael Peter Christen 2021-04-15 20:41:01 +02:00
  • 1cdb21592b added hazelcast and some modifications to align legacy YaCy with YaCyGrid Michael Peter Christen 2021-04-15 20:39:22 +02:00
  • 42ea2a1c6f Merge pull request #405 from jfhs/jfhs/support-all-html-entities Michael Christen 2021-03-31 01:44:54 +02:00
  • b2af745dd6 Merge pull request #404 from lnceballosz/master Michael Christen 2021-03-30 23:48:21 +02:00
  • 10bddc2c2d Decode HTML entities in all property values by default jfhs 2021-03-30 21:30:52 +02:00
  • 2135d259e3 Replace hardcoded html/xml entities with a file, support decoding all defined HTML entities jfhs 2021-03-30 20:58:47 +02:00
  • 8f876a8c72 added concurrency to enhance indexing speed during json surrogate import Michael Peter Christen 2021-03-30 12:07:36 +02:00
  • f8cbaeef93 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git Michael Peter Christen 2021-03-29 18:46:53 +02:00
  • a857e3d3d5 fix for json importer Michael Peter Christen 2021-03-29 18:46:42 +02:00
  • 7fecd859e5 fixes showing metadata from Searchresult, by removing defType=edismax also removes defType=edismax from IndexBrowser, but still does not show dates sgaebel 2021-03-21 00:04:55 +01:00
  • 1546232c94 adds ranking for multi document queries only sgaebel 2021-03-20 17:10:21 +01:00
  • 93b353d22d does not boost or add fields for zero-row-queries (exists()) sgaebel 2021-03-20 16:29:16 +01:00
  • f16cd154f7 removes unused imports and variables sgaebel 2021-03-20 15:14:09 +01:00
  • c69c462a15 replaces an expensive getLoadTimeURL() by exists(); refactors urlExists to getHarvestProcess as that is what it does sgaebel 2021-03-20 14:46:38 +01:00
  • a5488ac8f5 uses edismax queries on query counts > 1 only sgaebel 2021-03-20 01:04:15 +01:00
  • 26223dc25a replaces getLoadTime() by exists() with a simpler query since solr-8.8.1 getLoadTime() causes a high cpu usage sgaebel 2021-03-20 00:35:30 +01:00
  • 8e4d014c06 removes useless SolrRequestInfo.clearRequestInfo(), avoids spamming the log sgaebel 2021-03-18 22:05:15 +01:00
  • 88c6bc8cd7 adds missing solr lib: opentracing 0.33.0 sgaebel 2021-03-18 19:36:04 +01:00
  • 139b5a4033 improving license info in README Lina Ceballos 2021-03-11 12:23:53 +01:00
  • a96752f5ab adding SPDX license and copyright headers Lina Ceballos 2021-03-11 12:17:11 +01:00
  • 221038f16d creating LICENSES directory Lina Ceballos 2021-03-11 12:16:37 +01:00
  • e18d0ef544 trying to set a higher priority to the process that is involved in index export Michael Peter Christen 2021-03-09 00:04:05 +01:00
  • c552a2845f added new commons library (missed in latest commit) Michael Peter Christen 2021-03-08 13:39:48 +01:00
  • 8b4394a6c5 fixes for solr 8.8.1 migration - replace new guava 30 with older 25 because that is the correct dependency for solr 8.8.1. The newer one did actually not work! - index will be created in a DATA/INDEX/freeworld/SEGMENTS/solr_8_8_1 subfolder. The older solr_6_6 index is not touched but also not migrated. The index starts with fresh (empty) content. - Older indexes must be migrated by hand (export/import) so far until a better solution is found. - Large schema adaptations for lucene 8.8.1 Michael Peter Christen 2021-03-08 13:39:27 +01:00
  • 3befaaf4f1 reformatting pom.xml to make it easier to update it with recent library versions Michael Peter Christen 2021-03-08 00:41:41 +01:00
  • dffe9e1c23 Merge pull request #402 from SebastianoPistore/junitUpdate Michael Christen 2021-03-06 13:45:11 +01:00
  • 7c86826db3 new version for solr 8 ATTENTION: old indexes from solr 6 CANNOT be migrated to solr 8 DO NOT use this version if you still have a solr 6 index. Michael Peter Christen 2021-03-06 13:37:06 +01:00
  • ed9789214e fixed seed initialization problem Michael Peter Christen 2021-03-06 13:35:46 +01:00
  • f4f3808d43 added missing new dependencies for migration to Solr 8 after pulling https://github.com/yacy/yacy_search_server/pull/403 Michael Peter Christen 2021-03-06 13:35:32 +01:00
  • ffe8786d69 Merge pull request #403 from alsutton/address_security_issues Michael Christen 2021-03-06 12:58:56 +01:00
  • f4dd6e6d41 Update Lucene to 8.8.1 Al Sutton 2021-03-04 17:42:59 +00:00
  • 721dd3e1ba Update Guava to match version pulled through from solr dependencies Al Sutton 2021-03-04 17:32:07 +00:00
  • b5203de923 Update ant build solr dependency to 8.8.1 Al Sutton 2021-03-04 16:48:10 +00:00
  • 8ade8b8775 Remove forced clear to match new behaviour in 2da71c2a40 Al Sutton 2021-03-04 16:37:56 +00:00
  • 09695fc6d3 Update exceptions to match updated API Al Sutton 2021-03-04 16:34:02 +00:00
  • 69014a701e Update API Usage Al Sutton 2021-03-04 16:14:56 +00:00
  • 9ba0fa1beb Update dependencies to address vulnerabilities. Al Sutton 2021-03-04 13:41:43 +00:00
  • 78bd82f8ef Workaround for CVE-2020-15250 Sebastiano Pistore 2021-02-22 20:53:24 +01:00
  • b46513f4a1 added stub of rc3assembly style a little bit late but whatever Michael Peter Christen 2021-02-09 20:30:10 +01:00
  • 3da7628117 use environment variables to overwrite configuration variables; you can e.g. do: export YACY_PORT=8092 && ./startYACY.sh Just prepend "YACY_" to the uppercase version of the configuration variable name and replace all "." with "_". Michael Peter Christen 2021-02-09 20:26:49 +01:00
  • 13a2e6dc6e Merge branch 'master' of https://github.com/yacy/yacy_search_server.git Michael Peter Christen 2021-01-25 11:49:32 +01:00
  • 0ae8ccf657 Make it possible to set an empty password, disabling the authentication protocol completely. If you now set an empty password, then the http server will not ask to authenticate. This is required for environments where we attach an outside authentication service like keycloak or similar using authentication in an ingress proxy. This change is part of the approach to run YaCy inside of a kubernetes cluster where we do not want individual authentication of peers and want to apply an ingress authentication. Michael Peter Christen 2021-01-25 11:49:21 +01:00
  • 96592a10cf added option to set yacy configuration values using environment variables. To use that feature, set an environment variable with prefix "yacy." and suffix identical to the yacy configuration attribute name. Additionally we implemented a way to set a peer name using the setting "network.unit.agent". This can therefore now be used to set a peer name with the java call parameter -Dyacy.network.unit.agent=anonymous The purpose of this feature is the ability to set peer names in mass-deployed kubernetes clusters to the same name, so that we do not flood peer name statistics with auto-deployment-generated names. Michael Peter Christen 2021-01-24 22:50:37 +01:00
  • 198826c362 added network scanner process to discover all YaCy peers in the intranet this will be used to wire YaCy peers in a kubernetes cluster Michael Peter Christen 2021-01-23 15:14:49 +01:00
  • d9602e8325 Implemented a new syntax in the template engine to simplify json APIs Added also an example for one of the existing APIs. The problem is the comma separator between objects which must not be there for the last entry in a sequence. The new syntax adds the separator symbol automatically. Michael Peter Christen 2021-01-18 00:01:08 +01:00
  • 5a7f12a9c1 allow network scans for non-standard http/https ports Michael Peter Christen 2021-01-11 00:28:24 +01:00
  • 022fb15670 fix for https://github.com/yacy/yacy_search_server/issues/385 Michael Peter Christen 2021-01-06 22:12:17 +01:00
  • 17672fcbb4 adding hint how to shrink the disk size after an index deletion. implements https://github.com/yacy/yacy_search_server/issues/360 Michael Peter Christen 2021-01-06 22:02:00 +01:00
  • b8d264f7ec fixes logging sgaebel 2021-01-04 20:46:21 +01:00
  • 13e42c2dd2 added dockerfiles for 32 and 64 bit ARM/Raspberry Pi Michael Peter Christen 2020-12-31 00:02:23 +01:00
  • 062111a003 improved dockerfiles They do not use git pull to get the latest YaCy code. Instead they copy from local file system. Michael Peter Christen 2020-12-29 21:01:35 +01:00
  • 4c920d05b5 removed superfluous lines Michael Peter Christen 2020-12-29 20:19:58 +01:00
  • 48dd87e1e1 added a dockerignore file Michael Peter Christen 2020-12-29 20:19:45 +01:00
  • ca10f0afca fixed optional default PW Michael Peter Christen 2020-12-29 20:19:07 +01:00
  • 907f121d0c do not overwrite PW with random PW Michael Peter Christen 2020-12-29 20:18:25 +01:00
  • 3e6a1e0a49 fixed surrogate process counter Michael Peter Christen 2020-12-28 18:26:22 +01:00
  • 88590db91e Merge branch 'master' of git@github.com:yacy/yacy_search_server.git Michael Peter Christen 2020-12-28 17:05:14 +01:00
  • d3526c52af fixed a problem in warc importer: do not fail if single WARC entries are faulty Michael Peter Christen 2020-12-28 17:05:06 +01:00
  • 256fa3d985 new limitation documentation just replaced two by four Michael Peter Christen 2020-12-22 16:33:12 +01:00
  • 3078b74e1d Merge branch 'master' of https://github.com/yacy/yacy_search_server.git Michael Peter Christen 2020-12-22 00:46:56 +01:00
  • 01cc32217f fixed apicall call method parameters and verification in transaction manager which did not have an exception for localhost/basic authentication Michael Peter Christen 2020-12-22 00:46:47 +01:00
  • 7997836506 fixed lock image Michael Peter Christen 2020-12-20 23:18:50 +01:00
  • 63f58e4785 enhanced strategy in host browser limit number of fresh hosts in round robin hashes Michael Peter Christen 2020-12-20 23:15:55 +01:00
  • 9be36800a4 increased redirect depth by one this makes sense if one redirect replaces http with https and another replaces www subdomain by without (and vice versa) Michael Peter Christen 2020-12-20 19:44:16 +01:00
  • d0abb0cedb enabling all crawl profiles in all network modes also: increased default internet crawl speed to 4 urls/s/host Michael Peter Christen 2020-12-19 01:00:51 +01:00
  • 32ca669bfb panic release for #googledown Michael Peter Christen 2020-12-14 13:20:28 +01:00
  • baad56d83d beautified default peer names Michael Peter Christen 2020-12-14 02:08:49 +01:00
  • a9befbba5f Merge branch 'master' of git@github.com:yacy/yacy_search_server.git Michael Peter Christen 2020-12-14 01:26:34 +01:00
  • fed8bd6325 automatically refresh css cache when switching skin and setting of default skin to current skin in selector Michael Peter Christen 2020-12-14 01:26:26 +01:00
  • 9a5694261a design update more space Michael Peter Christen 2020-12-12 14:17:45 +01:00
  • 4ec55289a8 using a lock symbol which looks also good in dark designs Michael Peter Christen 2020-12-12 03:02:40 +01:00
  • 43a9f4f574 updated solr 6.6.6 -> 7.7.3 dropped GSA support (GSA API is still in YaCy Grid) The 6.6.6 solr index works without migration also with 7.7.3 Michael Peter Christen 2020-12-12 02:06:43 +01:00
  • c0d9a3e9a7 turned HostBrowser into a admin-only page, now called IndexBrowser This was required because spiders and bots crawled through this page and created load on the peer without use for the user or the YaCy network. Michael Peter Christen 2020-12-11 00:50:52 +01:00
  • d359d521a1 fixed warc importer The importer tried to import gzipped files as plain warc. It will now check the file extension and unzip automatically on-the-fly. Michael Peter Christen 2020-12-10 11:19:25 +01:00
  • 39f87f7f28 added a hint to the default settings how to set a default password Michael Peter Christen 2020-12-09 02:42:05 +01:00
  • e54ab39958 Going back to basic authentication for console/shell commands This does not affect security because: - it is going to localhost only - only users who have already access to the pw hash can do this - no clear text pw is transmitted because that is not stored anywhere The switch to basic is required because these commands are required in the context of hosting on root servers and docker containers where a password change must be done. But the password shell command was not working without password which made the concept unusable. This deficit made it virtually impossible for root server operators to use YaCy because they had been unable to set up a proper password. Michael Peter Christen 2020-12-09 02:36:55 +01:00
  • 6271e9122c javadoc fix Michael Peter Christen 2020-12-09 02:22:47 +01:00
  • e0f4e3fd9a enhanced ability to debug the code Michael Peter Christen 2020-12-09 02:22:30 +01:00
  • eea2d71851 prevent creation of auth schema factories every time a servlet is called Michael Peter Christen 2020-12-06 01:49:34 +01:00
  • fcc9386ed3 enhanced the (already fast!) png exporter Michael Peter Christen 2020-12-03 12:18:07 +01:00
  • 4e9b425f98 missing fix for latest commit Michael Peter Christen 2020-12-03 00:40:51 +01:00
  • 3213d9db37 updated jetty from 9.4.17 to 9.4.35 and fixed a bug in ServerSideIncludes that appeared only in that recent version of jetty Michael Peter Christen 2020-12-03 00:21:15 +01:00
  • 787fec0658 reduced complexity - removed concurrency in sort Michael Peter Christen 2020-12-02 18:39:45 +01:00
  • cef5fde343 adding message to UI to make port change transparent Michael Peter Christen 2020-12-02 18:05:38 +01:00
  • 52228cb6be added a gc to cleanup process (once every 10 minutes) Michael Peter Christen 2020-12-02 00:13:00 +01:00
  • 22841ffbf1 creating a threaddump during every cleanup process to be able to find out what a peer did (not) last time before a crash Michael Peter Christen 2020-12-01 03:00:24 +01:00
  • 36e616271b do better documentation on how to set a default password Michael Peter Christen 2020-12-01 02:18:08 +01:00
  • df2bf9ef28 try to fix maven build error Michael Peter Christen 2020-11-29 14:24:33 +01:00
  • 264bab6700 trying to fight the UI unavailability; this patch addresses a possible issue with too many open connections to remote peers Michael Peter Christen 2020-11-29 14:15:34 +01:00
  • 4921771951 Merge pull request #390 from rpodgorny/patch-1 Michael Christen 2020-11-26 23:28:33 +01:00
  • 836953bd5b typo fix Radek Podgorny 2020-11-26 20:52:46 +01:00
  • 7947baeb49 removed all remaining deprecation warnings Michael Peter Christen 2020-11-23 00:03:18 +01:00
  • c0f6d6e11d removed one deprecation warning for jetty library initializing ssl server port Michael Peter Christen 2020-11-22 23:27:58 +01:00
  • 133440a7a6 some debug lines Michael Peter Christen 2020-11-22 23:12:04 +01:00
  • d7b2d82faa showing MB instead of KB in PerformanceMemory Michael Peter Christen 2020-11-22 23:02:49 +01:00
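
Note on commit 3da7628117: its message gives `export YACY_PORT=8092 && ./startYACY.sh` as an example and describes the naming rule as a "YACY_" prefix on the uppercase name with every "." replaced by "_". The following is a minimal sketch of that naming rule only, not YaCy's actual implementation. The function names `env_name_for` and `apply_env_overrides` are hypothetical, and the default values in the example dictionary are illustrative assumptions.

```python
from typing import Dict, Mapping


def env_name_for(config_key: str) -> str:
    """Hypothetical helper: derive the environment variable name for a
    configuration key, following the rule stated in the commit message
    ("YACY_" prefix, uppercase, "." replaced by "_")."""
    return "YACY_" + config_key.upper().replace(".", "_")


def apply_env_overrides(config: Mapping[str, str],
                        environ: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of config where every key that has a matching
    environment variable takes its value from the environment."""
    merged = dict(config)
    for key in config:
        name = env_name_for(key)
        if name in environ:
            merged[key] = environ[name]
    return merged


if __name__ == "__main__":
    import os

    # Illustrative defaults (assumed values, not read from a YaCy config file).
    defaults = {"port": "8090", "network.unit.agent": ""}
    print(apply_env_overrides(defaults, os.environ))
```

Under this rule the key `network.unit.agent` would map to `YACY_NETWORK_UNIT_AGENT`. Keys that have no matching environment variable keep their configured value.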