Commit Graph

  • 807e3dc78a upd to httpclient-4.5 and httpmime-4.5 reger 2015-07-26 00:53:40 +02:00
  • 202620b4a2 upd to icu4j-55.1.jar reger 2015-07-25 00:50:41 +02:00
  • 149e41f25b upd to jsch-0.1.53.jar reger 2015-07-21 22:31:34 +02:00
  • ab22a32c09 Fixed CSS scrolling Kirill Fomchenko 2015-07-21 08:21:10 +03:00
  • 30135d8964 upd to lib/weupnp-0.1.3.jar reger 2015-07-20 03:45:23 +02:00
  • ec75959162 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git Michael Peter Christen 2015-07-16 23:42:51 +02:00
  • 785781253e added jsonp to suggest servlet Michael Peter Christen 2015-07-16 23:42:41 +02:00
  • 5cf988f224 upd NB classpath reger 2015-07-15 01:04:59 +02:00
  • 32a804b10c Merge branch 'master' of git@github.com:yacy/yacy_search_server.git Michael Peter Christen 2015-07-13 12:15:58 +02:00
  • 0e87a99ab8 more fixes for special windows paths Michael Peter Christen 2015-07-10 17:34:29 +02:00
  • e5b6424eed patch for bad windows file paths Michael Peter Christen 2015-07-10 17:14:14 +02:00
  • 0aa6fcf259 remove old vocabularies and synonyms before adding new Michael Peter Christen 2015-07-10 16:47:19 +02:00
  • e1cd9c0dba added another default network / commented out Michael Peter Christen 2015-07-09 16:25:11 +02:00
  • 289018b559 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git Michael Peter Christen 2015-07-08 17:37:03 +02:00
  • 7b412e8c07 added msg (text emails) format; should be handled by html parser. Michael Peter Christen 2015-07-08 17:36:37 +02:00
  • f91298d3b6 fix one implicit Integer/Long type conversion -> causes Java 1.8 compile error reger 2015-07-08 03:02:10 +02:00
  • 821262a179 add CommonPattern for multiple spaces to eliminate empty split words on following spaces reger 2015-07-04 22:49:01 +02:00
  • 59096935d0 Use language-detection library for increased accuracy Ryszard Goń 2015-07-02 18:41:13 +02:00
  • 90f75c8c3d added enrichment of synonyms and vocabularies for imported documents during surrogate reading: those attributes from the dump are removed during the import process and replaced by new detected attributes according to the setting of the YaCy peer. This may cause that all such attributes are removed if the importing peer has no synonyms and/or no vocabularies defined. Michael Peter Christen 2015-07-02 00:23:50 +02:00
  • 7829480b82 refactoring: separated condenser and tokenizer Michael Peter Christen 2015-07-01 18:28:18 +02:00
  • 00d2062813 Remove deprecated AdminHandlers in solrconfig.xml to avoid warning log: W org.apache.solr.handler.admin.AdminHandlers <requestHandler name="/admin/" class="solr.admin.AdminHandlers" /> is deprecated. It is not required anymore reger 2015-07-01 00:58:23 +02:00
  • f901e7d3cf fix for non-authorized view of IndexBrowser: show only the number of non-failure documents Michael Peter Christen 2015-06-30 11:12:36 +02:00
  • 593de05922 enhanced surrogate import process speed (dramatically!) Michael Peter Christen 2015-06-29 12:28:34 +02:00
  • 3c4c69adea fix for - bad regex computation for crawl start from file (limitation on domain did not work) - servlet error when starting crawl from a large list of urls Michael Peter Christen 2015-06-29 02:02:01 +02:00
  • 1fec7fb3c1 suppress access to solr when doing search suggestions in case that the index has more than two million documents. This protects the index from being flooded with search requests that cannot be resolved before the real search query has to be computed. Michael Peter Christen 2015-06-24 13:02:12 +02:00
  • 886fca2260 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git Michael Peter Christen 2015-06-24 01:59:46 +02:00
  • 694b22f165 migration to Solr 5.2: huge benefits - this is a lot faster! Michael Peter Christen 2015-06-24 01:55:51 +02:00
  • 6c2e6f1f37 remove redundant code Michael Peter Christen 2015-06-23 23:41:43 +02:00
  • e427efbe54 Next try for a fix for upload-connection staying in blocked state. This was caused by reading via GZIP from close-wait connection and caused high cpu- and system-loads. Instead of implementing handling of the RedListener, I have now found a time-limited 'get' "really" solving this problem. sixcooler 2015-06-14 22:56:26 +02:00
  • 0fab445b19 Resourceobserver log warning - deleting releases files - only on actual deletes instead of entering routine reger 2015-06-10 02:35:37 +02:00
  • ef6a64b2a4 Fix for upload-connection staying in blocked state. This was caused by reading via GZIP from close-wait connection and caused high cpu- and system-loads. Solved by implementing handling of the RedListener. sixcooler 2015-06-09 21:26:10 +02:00
  • c973f94936 add log entry on release file delete by ResourceObserver reger 2015-06-08 03:17:12 +02:00
  • 121972752c implement deleteOldDownloads in ResourceObserver on low diskspace - direct assign sb.observer (skip redundant InitThread) reger 2015-06-08 02:52:13 +02:00
  • 0d5ac6e527 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git Michael Peter Christen 2015-06-07 22:25:26 +02:00
  • 9c12555be5 added link to Snapshots in search results if the snapshot exists and option is set in ConfigSearchPage_p (this is a stub: we also need a visualization of pdf files!) Michael Peter Christen 2015-06-07 20:37:37 +02:00
  • 480e4a6a5c Update to Jetty-9.2.11 - a bugfix-release that did not solve my Problems, but does not harm anything sixcooler 2015-06-07 20:09:27 +02:00
  • 72f6a0b0b2 enhance recrawl job - allow to modify the query to select documents to process (after job has started) - allow to include failed urls (httpstatus <> 200) reger 2015-06-06 18:45:39 +02:00
  • e0a23c56c7 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git Michael Peter Christen 2015-06-05 08:32:55 +02:00
  • fb9e1dd3f5 servlet for latest commit Michael Peter Christen 2015-06-05 07:22:35 +02:00
  • 5183ad718d upd to poi-3.12.jar reger 2015-06-05 03:36:57 +02:00
  • 7478338a40 remove augmented parsing activation from frontend: experimental implementation not used and based on error-prone experimental rdfaparser reger 2015-06-05 00:51:00 +02:00
  • 11aa2edfe1 remove RDFa parser activation from frontend. Reason: experimental implementation of RDFa parser not executed (limited to special urls) but may cause error on normal html parsing due to an inputstream.reset reger 2015-06-05 00:15:16 +02:00
  • ff11ac89f7 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git Michael Peter Christen 2015-06-04 23:04:04 +02:00
  • 5e2d23b7a0 removed the new index export method from the IndexControlURLs_p.html servlet and moved it to a new /IndexExport_p.html servlet. This servlet is now more prominently linked in the main menu under Production -> Index Export/Import Michael Peter Christen 2015-06-04 23:03:46 +02:00
  • 64a7b0b140 Merge origin/master reger 2015-06-04 22:44:46 +02:00
  • 49b79987c9 remove obsolete searchfl work table was used to register urls with not complete words in snippet but is never accessed reger 2015-06-04 22:44:01 +02:00
  • 4533f392b0 correct the dark themes to show also a dark navbar on searchresults sixcooler 2015-06-04 22:15:38 +02:00
  • d0aff91f23 fix for index import Michael Peter Christen 2015-06-01 01:56:09 +02:00
  • 34de1e8cbc gzip compression will perform more efficiently and with better compression level Michael Peter Christen 2015-06-01 01:24:33 +02:00
  • 98be59ce9c full solr xml exports will now be automatically compressed during export. That makes it possible to export a solr xml dump even if disc space is low. Michael Peter Christen 2015-05-30 19:02:54 +02:00
  • a1a8edfc0a wrap HeaReader close() in a catch Throwable block to prevent that an exception during close blocks the whole shutdown process Michael Peter Christen 2015-05-30 17:54:02 +02:00
  • b43811d38c added surrogate import process for exported solr dumps. Just throw your solr dump file into DATA/SURROGATES/in/ and it will be imported! Michael Peter Christen 2015-05-30 13:19:59 +02:00
  • b77537294d prevent disc usage when showing tray animation Michael Peter Christen 2015-05-30 06:57:15 +02:00
  • eec78e1b0c added intensity option to graphics Michael Peter Christen 2015-05-30 06:31:08 +02:00
  • a5007f345e re-licensing some of my old visualization classes under LGPL 2.1 Michael Peter Christen 2015-05-30 06:12:08 +02:00
  • c99a665593 adding a 3-pixel font generator made some time ago.. Michael Peter Christen 2015-05-30 06:01:52 +02:00
  • c7576d6028 added a full solr export to the IndexControlURLs_p.html servlet. The export function is also now the default export option. The export file format for a full solr export is very similar to a solr search result xml, only the <lst name="responseHeader"> tag is missing. Michael Peter Christen 2015-05-29 15:05:52 +02:00
  • 47682bf467 fix for unresolved pattern Michael Peter Christen 2015-05-28 17:43:52 +02:00
  • 197f7449e5 All entities of crawl profiles are now editable in the crawl profile editor. Michael Peter Christen 2015-05-28 16:07:40 +02:00
  • 1d8e1e4bac - Image search expand box, adjust javascript hs padtominsize parameter, to make sure expand box doesn't shrink on small images - ensure ImageResult.imagetext has value for the link text (use filename if no alt text given) reger 2015-05-27 02:31:13 +02:00
  • 8b35656007 remove hard throw exception in makeResultEntry remove not used "share." peername.yacy url rewrite reger 2015-05-26 23:57:06 +02:00
  • af57fbefad use available mime (instead null) on imageresult from metadatanode reger 2015-05-26 23:54:04 +02:00
  • dd7782bac0 revert deletion of BinSearch (accident) reger 2015-05-26 04:26:26 +02:00
  • 000dde9511 Eliminate duplication of values for search ResultEntry by instantiation from URIMetadataNode, by eliminating differentiation of ResultEntry/URIMetadataNode. - moved remaining ResultEntry functionality to URIMetadataNode - for 1:1 functionality added a function makeResultEntry() - removed ResultEntry - refactored related code reger 2015-05-26 04:15:00 +02:00
  • 29c4aa3991 fix compiler notification of missing serialID from last commit reger 2015-05-25 21:51:32 +02:00
  • 3d53da8236 refactor ResultEntry to be based on MetadataNode/SolrDocument to share/reuse common access routines reger 2015-05-25 21:28:48 +02:00
  • d882991bc5 Implement sharing of ioDispatcher for term & citation index as proposed in ioDispatcher description reger 2015-05-25 19:46:26 +02:00
  • 17e820cfd7 use doctype() in ViewFile to choose display routines in preference of getfileExtension() reger 2015-05-25 00:08:38 +02:00
  • 370ba9da71 On imageSearch prefer mime to sort out non-image documents. Generalize the hack to prevent urls with just an img extension being returned reger 2015-05-24 21:48:58 +02:00
  • cd31633369 improve MultiprotocolURL.getFileExtension() prevent string OOB while querypart contains a dot (return just "") see log snippet in http://mantis.tokeek.de/view.php?id=533 reger 2015-05-24 19:38:04 +02:00
  • c60ccdfbcf Increase IODispatcher dumpQueue size to 2 to reduce risk of concurrent emergency dump, skip concurrent emergency merge dealing with/see http://mantis.tokeek.de/view.php?id=566 reger 2015-05-24 18:03:27 +02:00
  • 8a9622c31c fix string OoB on getImagelinks with long alttext in description calculation reger 2015-05-24 01:59:40 +02:00
  • aa83931765 Convert content charset for display via CacheResource_p. Cached resource charset encoding might not fit to internal handling (using utf-8), convert resource to utf-8, see http://mantis.tokeek.de/view.php?id=576 reger 2015-05-23 20:31:37 +02:00
  • 3e742d1e34 Init remote crawler on demand. If remote crawl option is not activated, skip init of remoteCrawlJob to save the resources of queue and idling thread. Deploy of the remoteCrawlJob deferred on activation of the option. reger 2015-05-23 02:06:39 +02:00
  • dbf9e3503d Merge branch 'master' of git@github.com:yacy/yacy_search_server.git Michael Peter Christen 2015-05-22 11:39:00 +02:00
  • 8b1a30be50 removed a -UNRESOLVED_PATTERN- Michael Peter Christen 2015-05-22 11:22:36 +02:00
  • 9938c81378 fix for division by zero Michael Peter Christen 2015-05-22 11:15:53 +02:00
  • 13f013f64a Limit extra sleep of BusyThread on LowMemCycle reger 2015-05-17 06:21:12 +02:00
  • cd7c0e0aae detail optimization of RecrawlThread reger 2015-05-17 00:13:00 +02:00
  • ace71a8877 Initial (experimental) implementation of index update/re-crawl job added to IndexReIndexMonitor_p.html. Selects existing documents from index and feeds them to the crawler. Currently only the field fresh_date_dt is used to determine documents for recrawl (fresh_date_dt:[* TO NOW-1DAY]). Documents are added in small chunks (200) to the crawler, only if no other crawl is running. reger 2015-05-16 01:23:08 +02:00
  • 141cd80456 correct log msg text reger 2015-05-16 00:01:54 +02:00
  • f3ce99bfb8 fix extract of inboundlinks_protocol_sxt url counter maybe > 999 reger 2015-05-14 00:03:09 +02:00
  • 2bc9cb5828 fix early return in addToCrawler check / handle all supplied urls after error url reger 2015-05-13 21:58:43 +02:00
  • f5f88272e4 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git Michael Peter Christen 2015-05-12 12:06:42 +02:00
  • 5c67c4d460 fix for latest commit, see f810915717 (commitcomment-11145880) Michael Peter Christen 2015-05-12 12:06:21 +02:00
  • c37dda8849 fix NPE on MultiProtocolURL on url with parameter value and '=' in getAttribute - added test case for it reger 2015-05-12 01:09:10 +02:00
  • f810915717 added crawl start from a clone with very, very large url: they are now encoded as post submit form inside a javascript creation function. Michael Peter Christen 2015-05-11 16:30:41 +02:00
  • 51de86c992 disabled debug thread dumps Michael Peter Christen 2015-05-11 14:46:09 +02:00
  • d524a9d77c Merge branch 'master' of git@github.com:yacy/yacy_search_server.git Michael Peter Christen 2015-05-11 14:42:40 +02:00
  • 0710648c31 enable api calls with very long urls Michael Peter Christen 2015-05-11 14:42:21 +02:00
  • 31346e873b upd library reference of missing jsch-0.1.21 in seeduploadscp.xml upd to jsch-0.1.52.jar reger 2015-05-11 01:35:12 +02:00
  • 609c52e987 refactor getBookmark to consistently check existence by != null (w/o throwing exception on not found) reger 2015-05-11 00:37:04 +02:00
  • 1481a8ab56 add opensearch rss results to dht collection (due to text = snippet) which is used to differentiate meta from full data - make sure check for dht is not dependent on number of collection entries reger 2015-05-10 18:52:33 +02:00
  • 5f4d35437e add bookmark.query to edit form reger 2015-05-10 15:30:21 +02:00
  • f134aa7f7f persist bookmark timestamp on setTimeStamp() reger 2015-05-10 15:29:23 +02:00
  • 752eec6697 fix NPE in addToIndex when used outside searchEvent reger 2015-05-10 05:18:23 +02:00
  • a6daddbeaa upd to commons-io-2.4.jar reger 2015-05-10 03:00:05 +02:00
  • 89124335c4 update bookmark autosearch description - add German translation reger 2015-05-10 02:29:08 +02:00
  • fbf85a1561 added temporary debug output in http client Michael Peter Christen 2015-05-08 15:31:01 +02:00
  • ff29b0e503 added option to re-index exported xml snapshot dumps to HTCACHE/snapshots by just placing them in the SURROGATES/in path Michael Peter Christen 2015-05-08 15:30:26 +02:00