Commit Graph

  • 58b9834729 Added HTML microdata typed items parsing capability. luccioman 2018-02-02 09:31:40 +01:00
  • 80fb1026d0 Create recrawl requests with the relevant crawl profile. luccioman 2018-01-30 21:00:18 +01:00
  • 539925a275 Added a utility to generate/update XLIFF master file from lng files. luccioman 2018-01-29 18:34:47 +01:00
  • 41a6b052d9 Updated master and French translation for the IndexReIndexMonitor_p page luccioman 2018-01-29 16:51:00 +01:00
  • fa6d030b0b Moved dbtest to the test source folder. luccioman 2018-01-29 14:03:01 +01:00
  • 6cd3847d0a Fixed NullPointerException case on Table init with relative file path. luccioman 2018-01-29 14:00:43 +01:00
  • 28883d8a71 Shutdown daemon threads at the end of dbtest luccioman 2018-01-29 13:56:37 +01:00
  • 929e0d6eae Replaced improper ByteBuffer.equals() implementation by Arrays.equals() luccioman 2018-01-29 13:38:25 +01:00
  • 098ee63911 Added a manual performance test for the HostBalancer. luccioman 2018-01-28 12:41:56 +01:00
  • fefe2d1b6e Merge branch 'master' of https://github.com/yacy/yacy_search_server.git luccioman 2018-01-28 12:30:43 +01:00
  • 5aa4fb1144 upd to metadata-extractor-2.11.0.jar reger 2018-01-27 18:32:45 +01:00
  • 46b5249c20 Removed time condition on HostBalancer initialization in JUnit test. luccioman 2018-01-26 17:15:27 +01:00
  • 8b572b7337 Commit Solr index before simulating or starting recrawl job. luccioman 2018-01-26 10:31:13 +01:00
  • 5b943c07ab Merge pull request #155 from JeremyRand/readme-typo-fixes luccioman 2018-01-26 09:50:40 +01:00
  • dea856c854 Fix some typos in the README. JeremyRand 2018-01-26 04:34:31 +00:00
  • 733cacdbb8 Revised the RDFaParser main launcher for minimal proper operation. luccioman 2018-01-25 07:57:56 +01:00
  • 7baa99f26f Fixed stored URL in web cache when redirection(s) occurs. luccioman 2018-01-20 18:54:08 +01:00
  • 5e2812c060 Automatically refresh running recrawl report when JavaScript is enabled. luccioman 2018-01-19 11:58:52 +01:00
  • 19903a984f Merge pull request #154 from tangdou1/master luccioman 2018-01-19 10:18:35 +01:00
  • 49d103ad16 Merge pull request #1 from tangdou1/tangdou1-patch-1 tangdou1 2018-01-16 17:16:14 +08:00
  • dd4f93f049 Update zh.lng tangdou1 2018-01-16 17:11:07 +08:00
  • e585b4f597 Update zh.lng tangdou1 2018-01-16 15:35:54 +08:00
  • 0fce264ba4 Set reindex page to html5 and removed presentational only html tables. luccioman 2018-01-15 18:32:34 +01:00
  • 83df922afc Removed unused duplicated HTML id on header hidden field luccioman 2018-01-15 17:16:54 +01:00
  • 9ddf92d143 Removed unnecessary reflection usage for workflow tasks. luccioman 2018-01-15 10:05:49 +01:00
  • 897d3d30cc Added new recrawl job profile to the list of default crawl profiles luccioman 2018-01-15 08:30:37 +01:00
  • 9624516bf8 Refresh recrawl job profile threshold date like other default profiles luccioman 2018-01-15 08:06:28 +01:00
  • b712a0671e Added a specific default crawl profile for the recrawl job. luccioman 2018-01-13 15:46:04 +01:00
  • adf3fa493d Added comments about crawl profiles recrawl cycles luccioman 2018-01-13 12:13:04 +01:00
  • 3638e16c2e More comprehensive log on rejected recrawls caused by date constraint luccioman 2018-01-13 12:07:56 +01:00
  • d47afe6fab Use a constant for crawler reject reason prefix with specific processing luccioman 2018-01-13 10:45:00 +01:00
  • 4e03335625 Added more details to the recrawl job report luccioman 2018-01-12 11:47:13 +01:00
  • d95d393a0d Add a query link to local Solr to browse selected recrawl candidates luccioman 2018-01-12 10:23:26 +01:00
  • 59f7763af6 Display recrawl job report also when job is actively running luccioman 2018-01-11 09:53:27 +01:00
  • 6425963cee Fixed internal tables exact value match iterator luccioman 2018-01-10 18:38:42 +01:00
  • 0c9e0b3566 Record recrawl calls to make them schedulable luccioman 2018-01-10 17:05:53 +01:00
  • 433e241e4f Added a report info box about eventual last terminated recrawl job luccioman 2018-01-09 22:33:15 +01:00
  • b2af25b14f Added a stop condition to the Recrawl busy thread luccioman 2018-01-09 10:22:26 +01:00
  • 421728d25a Made possible to customize selection query before launching a recrawl luccioman 2018-01-08 21:20:46 +01:00
  • fab6e54fec Enforced controls (HTTP method, token) on ReIndex and ReCrawl operations luccioman 2018-01-07 15:25:16 +01:00
  • 36e9b1c5b3 Fixed SegmentTest test case time dependent occasional failures luccioman 2018-01-02 10:21:07 +01:00
  • 8a4ea1c11e Added UI switch to control content domain constraint per search request luccioman 2018-01-02 08:13:14 +01:00
  • 36a45b3905 Added UI setting for strictness of content-type checking on media search luccioman 2017-12-29 11:32:42 +01:00
  • cedb53be4e upd to commons-io-2.6 reger 2017-12-28 03:13:42 +01:00
  • f8071ac8ae Make TokenizedStringNavigator (used for keyword search facet) active check case insensitive. As keywords are compared lower case, make sure user input keyword:Key or keyword:key will be shown as active in facet entry key. reger 2017-12-28 02:51:52 +01:00
  • 270b77074e upd to httpclient-4.5.4 and httpmime-4.5.4 reger 2017-12-24 01:34:23 +01:00
  • 6db7f5525b upd to icu4j-60.2 reger 2017-12-24 01:02:18 +01:00
  • e6907fdab3 Added optional search parameter/setting to control content domain filter luccioman 2017-12-23 18:56:17 +01:00
  • f52217c939 Enable full size images preview for users with extended search rights luccioman 2017-12-22 11:39:30 +01:00
  • d42c1773c8 Added UI setting for optional encryption with https on p2p searches luccioman 2017-12-22 11:01:02 +01:00
  • 09c4ee56a7 Added optional https support for remote crawl and profile operations luccioman 2017-12-21 18:41:32 +01:00
  • 5db1c9155a Do locale independent case conversion on hosts, schemes, and file exts. luccioman 2017-12-19 13:52:05 +01:00
  • 1c4803e40a Enable optional https support for /yacy/transferURL API calls. luccioman 2017-12-19 12:30:49 +01:00
  • 79a2ba306a Updated links to Java Regular Expressions documentation to version 8 luccioman 2017-12-19 11:14:20 +01:00
  • c94bc82f6a upd to commons-compress-1.15 reger 2017-12-16 00:49:48 +01:00
  • c6e1befbca Restored peer URL host name stripping removed from previous commit. luccioman 2017-12-15 17:03:35 +01:00
  • 17e004599d Started implementing optional https preference for protocol operations luccioman 2017-12-15 11:28:46 +01:00
  • 2bc61f5657 Merge pull request #149 from Scre13/bugfix_default_settings luccioman 2017-12-13 07:38:04 +01:00
  • bb3d3fe074 fixed default loading default settings; load was populated with wrong value ScRe13 2017-12-12 23:25:56 +01:00
  • 20bba135fe Show hide or show public surftip button depending on current config status, to show the button to switch the status (hiding button of current status) reger 2017-12-10 01:25:20 +01:00
  • b907819cb4 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git Michael Peter Christen 2017-12-09 22:29:54 +01:00
  • 25573bd5ab added a crawl filter based on <div> tag class names When a crawl is started, a new field to exclude content from scraping is available. The field can be identified with the class name of div tags. All text contained in such a div tag where the configured class name(s) match are not indexed, while the remaining page is indexed. Michael Peter Christen 2017-12-09 22:29:35 +01:00
  • 640fed2a9c Removed Java 1.8 no more necessary version checking (fixes issue #147) luccioman 2017-12-08 15:26:46 +01:00
  • d95b288f19 Removed use of deprecated Jetty IPAccessHandler for client filtering. luccioman 2017-12-08 15:12:08 +01:00
  • cc7a93e6b6 remove deprecated jetty continuation class from urlproxyservlet (was a long time carry over, while not supporting async requests) reger 2017-12-08 01:01:07 +01:00
  • 607b39b427 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git Michael Peter Christen 2017-12-07 15:25:41 +01:00
  • 4355de0f3c (more!) evaluation of XRealIP from nginx reverse proxy Michael Peter Christen 2017-12-07 15:16:11 +01:00
  • e5b4799838 upd to Jetty-9.4.8.v20171121 reger 2017-12-07 00:24:33 +01:00
  • f9cba827c0 Made "tld:" modifier case insensitive and IDN compliant. luccioman 2017-12-04 19:13:16 +01:00
  • a4494d6e01 Improved support for internationalized domain names on "site:" modifier luccioman 2017-12-04 18:23:26 +01:00
  • d07006bac4 Do locale independent case conversion on "filetype:" query modifier. luccioman 2017-12-04 14:11:29 +01:00
  • 8fbf25d1ed Made "site:" query modifier case insensitive. luccioman 2017-12-04 14:08:34 +01:00
  • 867388e05b Refactored 'site:' query modifier parsing into a dedicated function. luccioman 2017-12-04 13:58:15 +01:00
  • c5c3cc1274 Use HTTP Post operation for resetting memory monitoring state. luccioman 2017-12-04 08:48:37 +01:00
  • 0704b1d644 upd to httpcore-4.4.8 reger 2017-12-04 01:12:50 +01:00
  • bfe753acea Merge pull request #144 from him2him2/_fic_HTTPS luccioman 2017-12-02 08:45:42 +01:00
  • c9d80b5b77 Prefer fine URL match over approximate URL mask regex on final filtering luccioman 2017-12-01 11:52:52 +01:00
  • 0a120787e3 Improved accuracy of URLs search filters : protocol, tld, host, file ext luccioman 2017-12-01 11:19:31 +01:00
  • d1c7dfd852 Fixed URL parsing with fragment and empty path luccioman 2017-12-01 09:48:42 +01:00
  • e07ef1b610 Apply tld query modifier on Solr host_s mandatory field. luccioman 2017-12-01 08:46:46 +01:00
  • 478e92deff Fixed url mask filter generated when protocol modifier is not null luccioman 2017-11-30 20:21:45 +01:00
  • 29de4a65d7 Refactored url mask filter build from query modifiers luccioman 2017-11-30 09:20:32 +01:00
  • a1879115dc upd to Jsoup-1.11.2 reger 2017-11-26 22:01:42 +01:00
  • d5a75537e4 remove redundant setting of timeout for remoteinstance and replace deprecated updatesolrclient instantiation with recommended builder reger 2017-11-26 02:53:51 +01:00
  • f01aac31fd Made possible to use https for remote search on peers with SSL enabled. luccioman 2017-11-24 14:10:41 +01:00
  • 97dff48abf Update HTTP -> HTTPS in README.md Ronald Eddy Jr 2017-11-23 00:54:36 -08:00
  • 01dca12d05 Upgraded apache POI dependency from 3.16 to 3.17 luccioman 2017-11-22 09:07:36 +01:00
  • e2f6427a63 Added a basic JUnit test for the Visio parser (vsdParser) luccioman 2017-11-22 09:06:16 +01:00
  • 1e9cdaabd4 Do locale neutral case conversion of HTML charset name. luccioman 2017-11-20 18:52:45 +01:00
  • d41ad7af6f Restore initial locale at the end of a JUnit test case which modify it. luccioman 2017-11-20 18:50:49 +01:00
  • 7206f1ed71 Do locale neutral case conversions on domain names. luccioman 2017-11-20 18:47:46 +01:00
  • 398c66f06c Do locale neutral case conversions in MultiProtocolURL luccioman 2017-11-20 15:23:33 +01:00
  • 9531b83598 Do locale neutral case conversions in Classification luccioman 2017-11-20 09:48:46 +01:00
  • bab5f0485f Added signing key to developer releases location. luccioman 2017-11-17 11:09:55 +01:00
  • d22fc0d0a2 Updated lists of known sponsored and country-code TLDs. luccioman 2017-11-16 09:50:55 +01:00
  • ac209cac2e Updated the generic top-level known domains list. luccioman 2017-11-14 09:42:09 +01:00
  • 938d8a9731 Added some JavaDoc luccioman 2017-11-14 09:24:13 +01:00
  • c32ac9c4c7 Updated log path in informative message of stop script. luccioman 2017-11-14 09:17:43 +01:00
  • 8f07df5f85 Upgraded com.twelvemonkeys.imageio dependencies from 3.3.1 to 3.3.2 luccioman 2017-11-09 09:30:20 +01:00
  • fcd57e2d0f Improved some JUnit tests isolation and resources release luccioman 2017-11-08 09:33:30 +01:00